The rapid development of artificial intelligence is leaving low- and middle-income countries behind, in large part because the vast majority of the world’s languages have not been digitized.
While more than 7,000 languages are spoken globally, AI models, particularly large language models (LLMs), are trained predominantly on just two: English and Mandarin.
This disparity creates a new kind of digital divide, one defined not by infrastructure or internet access but by data.