Resources for conservation, development, and documentation of low resource (human) languages.
Updated Apr 2, 2025 - TeX
Speeding up the availability of language resources for endangered languages. Tools such as this can shift how we think about endangered languages: rather than perceiving them as antiquated, difficult to learn, and on the brink of vanishing, we can see them as modern and easily accessible for learning online in text and audio formats.
My thesis on "Open Source Code and Low Resource Languages" for an MSc in Language Science and Technology at Saarland University
A pipeline to isolate and transcribe one language in mixed-language speech
A Python module for retrieving script types of writing systems including alphabets, abjads, abugidas, syllabaries, logographs, featurals as well as Latin script codes
Weather app originally created by CodeExplained (https://github.com/CodeExplainedRepo/Weather-App-JavaScript) to which I have added translations of weather descriptions in Kouri-Vini (Louisiana Creole) and a location search bar. Please let me know if any of my translations aren't showing up correctly. Byin mèsi. :)
Repository of our paper Nesciun Lengaz Lascià Endò: Machine Translation for Fassa Ladin.
Scottish Gaelic Spellchecker - GOC (Gaelic Orthographic Convention)
Dictionary of Endangered Languages
Saving endangered Indian languages with open AI innovation
A Python script supporting Chamorro language preservation through the creation of a custom Chamorro-English dictionary for Kindle devices—making reading in Chamorro more accessible.
A Oneida (Canada) to English Dictionary
Scottish Gaelic Spellchecker (Universal)
Theme for chjam'è rispondi: a Python application written with Tkinter
A Python script to scrape, process and export Chamorro Bible text from different online sources, making the text accessible for analysis, research, and digital preservation. (WIP)
Digitised comparative Enggano word list from Oudemans (1889). This publication contains the unpublished Enggano word list by Francis (1870), compared with those by Boewang (1854), van de Straaten & Severijn (1855), and von Rosenberg (1855). View the data at https://github.com/engganolang/oudemans1889/blob/main/data/oudemans1889-long.csv
A project to scrape and process Chamorro language news articles for language preservation, analysis, and learning tools development. (WIP)
A project to scrape and process online Chamorro language dictionaries to support language analysis and revitalization efforts. (WIP)
This project compiles Chamorro-language text data from various sources into clean, structured datasets to support language preservation, analysis, and educational tool development. (WIP)