In the bustling world of artificial intelligence, a visionary group led by María Grandury, founder of SomosNLP, is navigating the frontier of Spanish-language AI development.
Their mission? To create a ChatGPT-style model that understands and responds in Spanish as fluently as such models do in English.
The challenge begins with a task as simple as fetching a Peruvian recipe, which demands far more than its simplicity suggests.
Spanish, though vibrant and widely spoken, remains underrepresented in the AI realm.
The cornerstone of this ambitious project is the collection of vast amounts of text to train what is known as a foundation model.
This initiative has seen a boost from community contributions and significant projects like the Alia model, backed by the Spanish government.
Alia aims to revolutionize technology with the rich linguistic heritage of Castilian and other co-official Spanish languages.
Yet building a responsive AI model isn’t merely about amassing data; it also requires immense computational resources.
SomosNLP’s strategy includes crafting the largest open corpus of instructional content in Spanish to date.
As nations like France advance with AI innovations through companies like Mistral, the disparity in investment is stark.
While OpenAI garners billions, Mistral gathers millions and focuses primarily on English, with occasional nods to French.
This narrative emphasizes the crucial role of developing AI that respects and preserves linguistic diversity.