How ASEAN’s first large language model will support businesses
By Yogesh Hirdaramani
GovInsider speaks to representatives from AI Singapore and Amazon Web Services to learn more about SEA-LION, Southeast Asia’s first multilingual large language model, and what it can offer businesses.
A new LLM focusing on Southeast Asian languages aims to support regional organisations in creating local experiences. Image: Canva
When users in Southeast Asia encounter a server error on the ride-hailing app Gojek, the message that pops up differs radically depending on where they are – striking the right balance between conversational and helpful, say UX writers at Gojek.
With the increasing adoption of large language models (LLMs) among businesses, it is critical that such models can similarly account for cultural nuances – particularly in a diverse region like Southeast Asia, home to over 1,200 languages.
This is why AI Singapore has built SEA-LION (Southeast Asian Languages In One Network), a family of open-source LLMs trained on 11 major regional languages – Indonesian, Thai, Vietnamese, Filipino, Burmese, Khmer, English, Chinese, Malay, Tamil, and Lao.
As of February 2024, AI Singapore has made SEA-LION available via Amazon SageMaker JumpStart, a platform that makes it easier for businesses to access, customise, and deploy pre-trained models.
Businesses can use SEA-LION to build AI tools like chatbots, coding assistants, and meeting summarisers, Leslie Teo, Senior Director for AI Products at AI Singapore, tells GovInsider.
Multinational businesses can also tap on SEA-LION as a translator to conduct conversations with stakeholders across multiple languages.
Already, Indonesian e-commerce platform Tokopedia has begun using the model to generate product descriptions in other Southeast Asian languages. Elsewhere, Singapore-headquartered multinational IT company NCS will be using the model to translate content from its legal compliance process into Bahasa Indonesia and Thai.
A culturally responsive set of models
“Culture does matter and culture across the ASEAN region is very diverse. Models like SEA-LION are able to present a better understanding of the different nuances in the region,” says Elsie Tan, Singapore Country Manager, Public Sector, Amazon Web Services (AWS).
Western models like Meta’s Llama 2 may not be aware that ojek is a general term for any motorcycle taxi, and may confuse it with Gojek, a ride-hailing service, says Teo.
“If I just want a simple answer for a simple problem, there is no issue. But it does matter when it comes to customer service or deeper issues relating to values,” he says.
According to one video comparison, SEA-LION translated the acronym of a popular stadium in Indonesia more accurately than Llama 2. SEA-LION was also able to provide an accurate response in Bahasa Indonesia to a sentence written in Thai.
“We see tremendous potential for governments and businesses to create local experiences which can enable them to reach a far wider audience,” says Tan.
Open-source models
SEA-LION’s open-source nature means that users can take a peek under the hood to verify the training data and build new applications on top of the infrastructure, explains Teo. This can also help the model improve over the long term.
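As a rough sketch of what building on top of an open-source model can look like, the snippet below composes a prompt for a SEA-LION checkpoint via the Hugging Face transformers library. The model ID and prompt template shown are illustrative assumptions rather than AI Singapore's documented interface, and the resource-heavy generation step is left commented out.

```python
# Illustrative sketch only: the model ID and prompt template below are
# assumptions for demonstration, not AI Singapore's documented interface.
MODEL_ID = "aisingapore/sea-lion-7b"  # hypothetical Hugging Face model ID


def build_prompt(task: str, text: str) -> str:
    """Compose a simple instruction-style prompt (format is illustrative)."""
    return f"### Instruction:\n{task}\n\n### Input:\n{text}\n\n### Response:\n"


# Loading the 7B checkpoint needs substantial disk and memory, so the
# generation step is shown but not executed here:
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
# model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
# prompt = build_prompt("Translate to Bahasa Indonesia:", "Where is the station?")
# inputs = tokenizer(prompt, return_tensors="pt")
# print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```

Because the weights are openly available, the same pattern extends to fine-tuning the model or serving it behind a chatbot or translation service.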
“If people take the data we have and build better models, it’s better for us. The advantage we have is that we’re not driven by profit and loss,” says Teo.
“That’s why we thought very hard about the name SEA-LION… We want Singapore to be there, but we don’t want Singapore to be the ‘owner’. This is everyone’s culture and language.”
Teo hopes AI Singapore will soon be able to make the datasets publicly available via SageMaker JumpStart as well, which could support others in building and training their own models.
How does it stack up?
But how can SEA-LION stand out among other LLMs released by the private sector?
First, Southeast Asian languages make up 13 per cent of SEA-LION’s training data, compared with around 0.5 per cent for Western models like Llama 2 – a 26-fold difference, Teo notes.
To date, only GPT-4 outranks SEA-LION, according to BHASA, a benchmark developed by AI Singapore that evaluates how competent LLMs are in Southeast Asian languages.
Where it differs from GPT-4 is its size. SEA-LION comes in two models, one with 3 billion parameters and the other with 7 billion – a far cry from GPT-4’s reported 1.7 trillion parameters.
“In real life, you don’t just use one model… When you have a small model, it can fit a specific need, it’s easier to use, and cheaper,” says Teo. Though it is less powerful than GPT-4, it may be more useful for specific applications.
What about models emerging from Southeast Asia that specialise in a single language such as Vietnamese or Thai? Many businesses will still want models with multilingual capabilities, depending on their use cases, he notes.
“It’s very much like how Amazon thinks about generative AI. We don’t believe there’s one model for the world,” says Tan. She points to Amazon Bedrock, a library of foundation models that users can mix and match for their own use cases.
Partnering with AWS to bring the model to life
AWS supported AI Singapore by providing the large number of GPUs needed to train the models, says Teo.
AWS’ High Performance Computing team also provided invaluable assistance in optimising and training the model in a cost-effective and timely manner, adds Tan.
AWS’ network of AI partners also proved critical. For one, the company connected AI Singapore to Hugging Face, a platform that helps users build and train machine learning models, which sped up the model training process.
“Until last year, I think it’s fair to say there were no Singapore-based teams that were pre-training models like we were pre-training,” says Teo. AWS also connected the team with Glair AI, an Indonesian technology company which supports the development of Indonesian-language content and translation.
GovInsider previously reported that the Singapore Government will be channeling US$52 million to further support the initiative.