LinguaLibre

Difference between revisions of "Citations"

The Citations page gathers citations of LinguaLibre by external actors.

Line 7:
 
== Academic ==
 
 
=== Lingualibre ===
 
[[File:Hutin and Allasonniere-Tang, L'apport des données collaboratives à l'exploration linguistique.pdf|thumb|thumb|L'apport des données collaboratives à l'exploration linguistique]]
 
* https://www.researchgate.net/publication/361565674_Crowd-sourcing_for_Less-resourced_Languages_Lingua_Libre_for_Polish
 
 
** Mathilde Hutin, Marc Allassonnière-Tang (2022), Crowd-sourcing for Less-resourced Languages: Lingua Libre for Polish
 

Revision as of 21:27, 30 June 2022

Draft

This page is a work in progress.

Press

France

World

Wikimedia Newsrooms

Academic

Lingualibre

L'apport des données collaboratives à l'exploration linguistique

Word lists from Google / Unilex research

  • https://research.google/pubs/pub47206/ for mining wordlists (Unilex-style) from 2,000+ languages
    • Prasad, Manasa; Breiner, Theresa; Esch, Daan van (2018). "Mining Training Data for Language Modeling across the World's Languages" (PDF). Proceedings of the 6th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU 2018).
  • https://research.google/pubs/pub46952/ for cleaning them up
    • Chua, Mason; Esch, Daan van; Coccaro, Noah; Cho, Eunjoon; Bhandari, Sujeet; Jia, Libin (2018). "Text Normalization Infrastructure that Scales to Hundreds of Language Varieties". Proceedings of the 11th edition of the Language Resources and Evaluation Conference.
  • https://arxiv.org/abs/2103.15845 open-sourced
    • Zupon, Andrew; Crew, Evan; Ritchie, Sandy (2021-03-29). "Text Normalization for Low-Resource Languages of Africa". arXiv:2103.15845 [cs].
  • https://research.google/pubs/pub49814/ using these wordlists to find sentences using our web crawler
    • Caswell, Isaac; Breiner, Theresa; Esch, Daan van; Bapna, Ankur (2020). "Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus".
  • https://research.google/pubs/pub50211/ cleaning up web-crawled text
    • Kreutzer, Julia; Caswell, Isaac; Wang, Lisa; Wahab, Ahsan; Esch, Daan van; Ulzii-Orshikh, Nasanbayar; Tapo, Allahsera Auguste; Subramani, Nishant; Sokolov, Artem; Sikasote, Claytone; Setyawan, Monang (2022). "Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets". TACL.
  • https://arxiv.org/abs/2205.03983 building machine translation systems from them
    • Bapna, Ankur; Caswell, Isaac; Kreutzer, Julia; Firat, Orhan; Esch, Daan van; Siddhant, Aditya; Niu, Mengmeng; Baljekar, Pallavi; Garcia, Xavier; Macherey, Wolfgang; Breiner, Theresa (2022-05-16). "Building Machine Translation Systems for the Next Thousand Languages". arXiv:2205.03983 [cs].
  • https://ai.googleblog.com/2022/05/24-new-languages-google-translate.html blog post
    • "Unlocking Zero-Resource Machine Translation to Support New Languages in Google Translate". Google AI Blog. Retrieved 2022-06-30.

See also

Lingualibre:Help