Difference between revisions of "SPARQL"

Revision as of 16:42, 8 December 2021

Base

Fetch data using SPARQL

LinguaLibre data can be fetched with SPARQL from various programming languages such as Python, JavaScript, R and others, returning JSON or other formats.

  • For a code snippet in your language: open query.wikidata.org (Wikidata Query Service, aka WDQS), run your SPARQL query, then click "Code". A pop-up window appears with implementations in various languages.
  • To download data, click "Download".

JavaScript:
At least three methods exist (see the code snippets); for example:

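Query (SPARQL):
SELECT ?item WHERE { ?item prop:P2 entity:Q5 } LIMIT 10
Result's basic unit (one element of the JSON results.bindings array; itemLabel is present only when the query also selects ?itemLabel through the label service):
{ … },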
{
  "item": {
    "type": "uri",
    "value": "https://lingualibre.org/entity/Q12"
  },
  "itemLabel": {
    "xml:lang": "en",
    "type": "literal",
    "value": "beginner"
  }
},
{ … }
JavaScript:
var endpoint = 'https://lingualibre.org/sparql';
var sparql = 'SELECT ?item WHERE { ?item prop:P2 entity:Q5 } LIMIT 10';
$.getJSON(endpoint,
	{ query: sparql, format: 'json' },
	function(data){ console.log('JQuery: ',data)}
);
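
The same request can be prepared without jQuery. The following is a minimal sketch, not an official client: it assumes the endpoint accepts the `query` and `format` URL parameters exactly as in the jQuery call above, and the helper names `buildQueryUrl`, `simplifyBindings` and `qidFromUri` are hypothetical. It also shows one way to flatten the `results.bindings` array of the JSON result into plain objects.

```javascript
// Hypothetical helpers: these names are illustrative, not part of any Lingualibre API.

// Build the GET URL with the same `query` and `format` parameters as the jQuery example.
function buildQueryUrl(endpoint, sparql) {
  var params = new URLSearchParams({ query: sparql, format: 'json' });
  return endpoint + '?' + params.toString();
}

// Flatten SPARQL JSON results: keep only the `value` of each bound variable.
function simplifyBindings(data) {
  return data.results.bindings.map(function (binding) {
    var row = {};
    Object.keys(binding).forEach(function (name) {
      row[name] = binding[name].value;
    });
    return row;
  });
}

// Extract the Qid from an entity URI such as https://lingualibre.org/entity/Q12.
function qidFromUri(uri) {
  return uri.split('/').pop();
}

// Usage (needs network access, so it is left commented out):
// fetch(buildQueryUrl('https://lingualibre.org/sparql', 'SELECT ?item WHERE { ?item prop:P2 entity:Q5 } LIMIT 10'))
//   .then(function (response) { return response.json(); })
//   .then(function (data) { console.log(simplifyBindings(data)); });
```

Whether the endpoint can be called directly from a browser page depends on its cross-origin settings, which this sketch does not check.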

Lingualibre's ground

✅ Is Language (language/dialect (Q4)) → List existing languages

SELECT ?lang ?iso ?langLabel WHERE {
  ?lang prop:P2 entity:Q4 .
  ?lang prop:P13 ?iso .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅🇶 Is Speaker (speaker (Q3)) → List existing speakers

SELECT ?speaker ?speakerLabel
WHERE {
  ?speaker prop:P2 entity:Q3 .  # Condition 1, P2 'instance of' is Q3 'speaker'.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅ Is Language level (language level (Q5)) → List existing levels

SELECT ?item ?itemLabel
WHERE {
  ?item prop:P2 entity:Q5 .  # Condition 1, P2 'instance of' is Q5 'language level'.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅ Is Sex or Gender (sex or gender (Q7)) → List existing values

SELECT ?item ?itemLabel
WHERE {
  ?item prop:P2 entity:Q7 .  # Condition 1, P2 'instance of' is Q7 'sex or gender'.
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

Speaker

✅ Speaker name(s) → Speaker Qid(s)

SELECT ?speakerName ?speakerId
WHERE {
  VALUES ?speakerName { "Yug" "VIGNERON" } # One or multiple values
  BIND ( STRLANG(?speakerName, "en") AS ?speakerLabel )
  # P2: instance of; Q3: speaker.
  ?speakerId prop:P2 entity:Q3 ; rdfs:label ?speakerLabel .
}
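
When the list of names comes from a program rather than being typed by hand, the VALUES clause can be generated. The sketch below is only an illustration under assumptions: `sparqlString` and `speakerQidQuery` are hypothetical helper names, and the query text simply mirrors the one above. Escaping backslashes and double quotes keeps an unusual user name from breaking the query.

```javascript
// Hypothetical helpers: names are illustrative only.

// Escape a JavaScript string as a SPARQL double-quoted literal.
function sparqlString(text) {
  return '"' + text
    .replace(/\\/g, '\\\\')   // backslash first, so later escapes are not doubled
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r') + '"';
}

// Build the "speaker name(s) to speaker Qid(s)" query for a list of names.
function speakerQidQuery(names) {
  var values = names.map(sparqlString).join(' ');
  return [
    'SELECT ?speakerName ?speakerId',
    'WHERE {',
    '  VALUES ?speakerName { ' + values + ' } # One or multiple values',
    '  BIND ( STRLANG(?speakerName, "en") AS ?speakerLabel )',
    '  # P2: instance of; Q3: speaker.',
    '  ?speakerId prop:P2 entity:Q3 ; rdfs:label ?speakerLabel .',
    '}'
  ].join('\n');
}

console.log(speakerQidQuery(['Yug', 'VIGNERON']));
```

The same approach would apply to the ISO 639-3 lookup further down, by swapping the triple patterns.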

✅🇶 Speaker Qid (0x010C (Q42)) → Speaker data, all

# Get Q42 (User:0x010C)'s data
SELECT ?predicate ?object ?objectLabel
WHERE {
  entity:Q42 ?predicate ?object .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅🇶 Speaker Qid (0x010C (Q42)) → Speaker languages (P4)

SELECT ?languages ?languagesLabel
WHERE {
  entity:Q42 prop:P4 ?languages .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅ Speaker Qid + language → List associated audios

SELECT ?audio ?audioLabel
WHERE {
  ?audio prop:P5 entity:Q42 .   # Condition 1, P5 Speaker is Q42 User:0x010C
  ?audio prop:P4 entity:Q21 .   # Condition 2, P4 language is Q21 French
  # Labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

Languages

✅ Language LL Qid (Q21) → Count items

SELECT ?language (COUNT(?audio) AS ?nbAudio) WHERE {
  VALUES ?language { entity:Q21 }
  ?audio prop:P4 ?language .
}
GROUP BY ?language

✅ Language LL Qid (Q21) → Count records

SELECT ?language (COUNT(?audio) AS ?nbAudio) WHERE {
  VALUES ?language { entity:Q21 }
  ?audio prop:P2 entity:Q2 .  # P2 'instance of' is Q2 'record'
  ?audio prop:P4 ?language .  # P4 'language' is Q21 'French'
}
GROUP BY ?language

Language LL Qid (Q21) → Count unique words

✅ Language LL Qid (Q21) → Count speakers

SELECT ?language (COUNT(?speaker) AS ?nbSpeaker) WHERE {
  VALUES ?language { entity:Q21 }
  ?speaker prop:P2 entity:Q3 .  # P2 'instance of' is Q3 'speaker'
  ?speaker prop:P4 ?language .  # P4 'language' is Q21 'French'
}
GROUP BY ?language

✅ Language LL Qid (Q209) → List speakers

SELECT ?language ?speaker ?speakerLabel WHERE {
  VALUES ?language { entity:Q209 }
  ?speaker prop:P2 entity:Q3 .  # P2 'instance of' is Q3 'speaker'
  ?speaker prop:P4 ?language .  # P4 'language' is Q209 'Breton'
  # Labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}

✅ Language LL Qid (French (Q21)) + Speaker (0x010C (Q42)) → Count records

SELECT ?language ?speakerLabel (COUNT(?audio) AS ?nbAudio)
WHERE {
  VALUES ?language { entity:Q21 }
  VALUES ?speaker { entity:Q42 }
  ?audio prop:P4 ?language .  # P4 'language' is Q21 'French'
  ?audio prop:P2 entity:Q2 .  # P2 'instance of' is Q2 'record'
  ?audio prop:P5 ?speaker . # P5 'speaker' is Q42 '0x010C'
  # Labels
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  } 
}
GROUP BY ?language ?speakerLabel

Languages → Languages iso-639-3

SELECT * WHERE {
  ?lang prop:P13 ?code .
}

✅ Isolang → Language WD Qid

SELECT ?langIso ?langId
WHERE {
  VALUES ?langIso { "ban" "bre" } # One or multiple values
  # P2 'instance of'; Q4 'language'; P13 'ISO 639-3 code'
  ?langId prop:P2 entity:Q4 ; prop:P13 ?langIso .
}

✅ Language WD Qid → Language data, all

SELECT * WHERE {
  ?lang prop:P12 "Q12107" .  # P12 'Wikidata id' is Wikidata's "Q12107"
  ?lang ?predicate ?object . # Get all properties and values of that language
}

✅ Language LL Qid (Breton (Q209)) → Language data, all

Case: Get all data for language Q209 'Breton'.

SELECT * WHERE {
  # Given Q209 'Breton language', get all properties and values
  entity:Q209 ?predicate ?object .
}

✅ Language LL Qid (Breton (Q209)) → Language data, core

Case: Get all CORE data for language Q209 'Breton'.

SELECT * WHERE {
  # Given Q209 'Breton language', get only the properties typed as owl:DatatypeProperty, with their values
  entity:Q209 ?predicate ?object .
  ?predicate rdf:type owl:DatatypeProperty .
}

✅ Language (Breton (Q209)) + speaker (ThonyVezbe (Q584098)) + word (ni) → Audio's Qid

Case: Search in Breton language, with speaker 'ThonyVezbe', for the word 'ni'.

SELECT ?audio
WHERE {
  ?audio prop:P4 entity:Q209 .    # P4 'language' is Q209 'Breton'
  ?audio prop:P5 entity:Q584098 . # P5 'speaker' is Q584098 'ThonyVezbe'
  ?audio rdfs:label ?word .       # The audio's label holds the word
  FILTER ( STR(?word) = "ni" )    # word = 'ni'
}

Records

✅ Item name → Qid(s)

SELECT ?item ?itemLabel
WHERE { 
  ?item rdfs:label ?itemLabel. 
  FILTER(CONTAINS(LCASE(?itemLabel), "yug"@en)).  # Search term in lowercase, since the label is lowercased first
} LIMIT 10

Audio Qid → Audio data

Language + speaker + word → Audio's Commons URL

Heavy queries

❌ Is Language (speaker (Q3)) β†’ list all languages with number of unique words and speakers

Too large to run (not even on Lingualibre Query).

SELECT ?language (COUNT(?audio) AS ?nbAudio) (COUNT(?speaker) AS ?nbSpeaker) WHERE {
  ?language prop:P2 entity:Q4 .
  ?audio prop:P4 ?language .
  ?speaker prop:P4 ?language .
}
GROUP BY ?language

To do: split this into smaller sub-queries. For now, counting works only for one counter and one language at a time (see the single-language count queries in the Languages section).
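
One possible way to work within the "one counter and one language at a time" limit is to generate one small query per language and per counter, then run them in sequence. The sketch below is an assumption-laden illustration, not a tested solution to the heavy query: `COUNTERS`, `countQuery` and `countQueries` are hypothetical names, and the triple patterns are copied from the single-language count queries in the Languages section (Q2 'record', Q3 'speaker'). Running the queries and merging the counts is left to the caller.

```javascript
// Hypothetical helpers: names are illustrative only.

// Item classes taken from the count queries in the Languages section.
var COUNTERS = { records: 'Q2', speakers: 'Q3' };

// Build one light query: a single counter for a single language.
function countQuery(languageQid, counter) {
  if (!/^Q\d+$/.test(languageQid)) {
    throw new Error('Invalid Qid: ' + languageQid);
  }
  if (!Object.prototype.hasOwnProperty.call(COUNTERS, counter)) {
    throw new Error('Unknown counter: ' + counter);
  }
  return [
    'SELECT ?language (COUNT(?item) AS ?nb) WHERE {',
    '  VALUES ?language { entity:' + languageQid + ' }',
    '  ?item prop:P2 entity:' + COUNTERS[counter] + ' .  # P2 \'instance of\'',
    '  ?item prop:P4 ?language .  # P4 \'language\'',
    '}',
    'GROUP BY ?language'
  ].join('\n');
}

// Build every (language, counter) pair as a separate query.
function countQueries(languageQids) {
  var queries = [];
  languageQids.forEach(function (qid) {
    Object.keys(COUNTERS).forEach(function (counter) {
      queries.push({ language: qid, counter: counter, sparql: countQuery(qid, counter) });
    });
  });
  return queries;
}

console.log(countQueries(['Q21', 'Q209']).length + ' queries to run');
```

The list of language Qids itself can come from the "List existing languages" query in the Lingualibre's ground section.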

❌ Languages β†’ Name, Wikidata Qid, LLQid, Iso-639-3, and genders

Query:
SELECT ?languageQidLabel ?wdQid ?languageQid ?isoCode
(COUNT(DISTINCT(?record)) AS ?recordCount)
(COUNT(DISTINCT(?speakerLangM)) AS ?speakerM)
(COUNT(DISTINCT(?speakerLangF)) AS ?speakerF)
WHERE {
  ?record prop:P2 entity:Q2 .     # Filter: items where P2 'instance of' is Q2 'record'
  ?record prop:P4 ?languageQid .  # Assign value: P4 'language' into variable ?languageQid
  ?languageQid prop:P12 ?wdQid .  # Assign value: P12 'wikidata id' into variable ?wdQid
  ?languageQid prop:P13 ?isoCode. # Assign value: P13 'iso639-3' into variable ?isoCode

  # The male-speaker patterns are commented out, so ?speakerLangM is unbound and ?speakerM is 0.
  #?record prop:P5 ?speakerQidM .   # Assign value: P5 'speaker' into variable ?speakerQidM
  #?speakerQidM prop:P8 entity:Q16 .   # Filter: P8 'sex or gender' is Q16 'male'
  #?speakerQidM prop:P4 ?speakerLangM .  # Assign value: P4 'language' into variable ?speakerLangM

  ?record prop:P5 ?speakerQidF .   # Assign value: P5 'speaker' into variable ?speakerQidF
  ?speakerQidF prop:P8 entity:Q17 .   # Filter: P8 'sex or gender' is Q17 'female'
  ?speakerQidF prop:P4 ?speakerLangF .  # Assign value: P4 'language' into variable ?speakerLangF

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
GROUP BY ?languageQidLabel ?languageQid ?wdQid ?isoCode
ORDER BY DESC(?recordCount)
Result:
languageQidLabel	wdQid	languageQid	isoCode	recordCount	speakerM	speakerF
French	Q150	Q21	fra	16761	0	18
Marathi	Q1571	Q34	mar	13153	0	5
Polish	Q809	Q298	pol	11686	0	1
…

Tools

  • Special:ApiSandbox – API query generator for Lingualibre wiki pages and Wikibase content.