Tuesday, July 18, 2023

Jèrriais

I stumbled on a charming BBC article about a language whose speakers all live on a single island that measures just nine miles by five. As the article notes:

"Jersey is the largest of the Channel Islands, an archipelago of six inhabited islands (and many more uninhabited ones) scattered over the English Channel, just over 14 miles from the French coast. Despite its proximity to France, it's actually a British Crown Dependency, and as such, the island has two official languages: English and French. But it also has Jèrriais."

The story of the survival of Jèrriais got me thinking about language learning, language preservation and language translation. These are areas of personal and practical interest to me. I feel fortunate to be able to negotiate novels in Spanish, but my vocabulary is far from the level I would like, and I frequently resort to Google Translate to clarify passages that are not perfectly clear to me. I also occasionally use Google to quickly translate whole articles I want to share with English speakers. In the rather short time I have been doing that, I have seen a very noticeable improvement in the quality of such translations.

I know that Large Language Models like Google Bard have language capabilities, including translation, that far exceed what Google Translate can presently demonstrate, and to me that seems one of the most promising developments in the rapidly expanding sphere of artificial intelligence. To delve a bit further into that phenomenon, I decided to ask Bard for an overview of today's state-of-the-art AI language preservation capabilities, thus:

* * *

Discuss some ways Large Language Models might contribute to the preservation of endangered languages.

Large language models (LLMs) are a type of artificial intelligence that can be trained on massive datasets of text and code. This allows them to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.

LLMs can contribute to the preservation of endangered languages in a number of ways:

Documenting and preserving endangered languages: LLMs can be used to transcribe and translate endangered languages, creating a digital record of these languages that can be accessed by future generations. This is particularly important for languages that are not well-documented or that have no written form.

Creating educational resources: LLMs can be used to create educational resources for endangered languages, such as dictionaries, grammars, and textbooks. This can help to increase the number of people who learn these languages and can also help to revitalize them.

Developing new language technologies: LLMs can be used to develop new language technologies that can be used to support endangered languages. For example, LLMs could be used to develop translation tools, speech recognition tools, and text-to-speech tools. These tools could make it easier for people to learn and use endangered languages.

Promoting awareness of endangered languages: LLMs can be used to promote awareness of endangered languages and the importance of their preservation. This can be done by creating online resources, social media campaigns, and educational events.

Overall, LLMs have the potential to play a significant role in the preservation of endangered languages. By documenting, preserving, and promoting these languages, LLMs can help to ensure that they continue to be spoken and understood for generations to come.

Here are some specific examples of how LLMs have been used to preserve endangered languages:

The Endangered Languages Project is a website that uses LLMs to transcribe and translate endangered languages. The project has helped to document over 300 endangered languages and has made this information available to researchers and the public.

The Cherokee Nation has used LLMs to develop a new language learning app. The app uses LLMs to provide personalized instruction and feedback to learners.

The University of Hawaiʻi at Mānoa is using LLMs to develop a new speech recognition tool for the Hawaiian language. The tool will make it easier for people to use Hawaiian in everyday life.

These are just a few examples of how LLMs can be used to preserve endangered languages. As LLMs continue to develop, it is likely that we will see even more innovative ways to use these technologies to support endangered languages.

* * *

It seemed like the next logical step was to challenge Bard with a practical example, so:

* * *

Translate Lincoln's Gettysburg Address into Jèrriais.

Sure, here is the translation of Lincoln's Gettysburg Address into Jèrriais:

**Quarante-huit ans d'avan, nos pères ont foaît naitre, sus cot' continant, eune nouvelle nation, conceûte en libaïté, et dediée à la proposition que tout ches gens sont créés égaues.

Maintenant, nous sommes engâgés dans eune grande guerre civile, qui teste si cot' nation, ou eune nation conceûte et dediée coume ça, peut durer longtemps. Nous sommes r'trouvés sus eune grande bataille-ch'tie d'cette guerre. Nous avons v'nu dedicâer eune partie d'cette ch'tie, coume eune dernié r'posée pouor ches gens qui ont lyi donnî l'vies ichi que cot' nation puisse vîvre. Ch'est tout à fait approprié et juste que nous fassions ch'ci.

Mais, dans un sens pus grand, nous ne pouvons pas dedicâer, nous ne pouvons pas consâacrer - nous ne pouvons pas sanctifier - ch'te terre. Les braves gens, vix et morts, qui ont lutté ichi, l'ont sanctifiée bien au-dessus de nos pouôrs misérables d'ajouter ou de retrancher. Le monde n'priendra pas bien note, ni n'se souviendra pas longtemps de ce que nous disons ichi, mais il ne pourra jamais oublier ce qu'ils ont fait ichi. C'est pour nous, les vivants, plutôt, d'être dedicâés ichi à l'ouvrage inachevé que ches gens qui ont tchié ichi ont achteu tant noblement avanci.

Nous résolvons donc, ch't'an, qu'ces morts n'auront pas mouôrt en vain; que cot' nation, sous l'Dieu, aura eune nouvelle naissaunce de libaïté; et que l'gouvernement du peuple, par le peuple, pouor le peuple, ne périra pas d'sus la terre.**

I hope this is helpful!

* * *

Well, I don't speak Jèrriais, its Norman French ancestor, or much modern French, and I don't know any actual Jèrriais speakers, so I cannot say with any confidence that Bard correctly made the requested translation from Lincoln's English. Interestingly, when I submit a sentence from the Bard-generated Jèrriais text back to Bard and ask it to identify the language, Bard responds that the text is French and comments on the contents in modern French! When I take the same Bard-generated text to Google Translate, that program also initially identifies the language as French, but it offers a number of corrections to yield proper French. So, Bard may not yet be totally fluent in Jèrriais, but it certainly seems to have made a good effort with the prompts I supplied.
