This solution builds on open source tools including PyTorch, Hugging Face Transformers, and Liger Kernels. The authors would also like to thank Aiham Taleb, Arefeh Ghahvechi, Manav Choudhary, Rohit Thekkanal, Daz Akbarov, Jamila Jamilova, Ross Povelikin, Almas Moldakanov, Christelle Xu, and Ivan Khvostishkov for their contributions in making this project possible.
Azercell Telecom LLC, Azerbaijan’s leading telecommunications provider, wanted to build an Azerbaijani large language model (LLM) on Amazon SageMaker AI for telecom use cases and a customer-facing chatbot. The challenge: adapting foundation models (FMs) to a morphologically rich language with limited training data and no existing blueprint for efficient LLM training in Azerbaijani. In a six-week collaboration, Azercell worked with the AWS Generative AI Innovation Center to establish a production-ready framework on Amazon SageMaker AI that delivered 23% higher training throughput and 58% lower peak GPU memory usage through kernel-level optimizations on an ml.p5.48xlarge instance. A custom tokenizer also halved the number of tokens per word (a 2× improvement), effectively doubling the amount of Azerbaijani text that fits within the model’s context window. If you work with low-resource or morphologically complex languages, this post walks through the approach so you can evaluate similar techniques.
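The tokens-per-word figure (sometimes called tokenizer fertility) can be measured directly on a text sample. The following is a minimal sketch, not the authors' evaluation code: `tokens_per_word`, `words_that_fit`, and the toy `chunk_tokenizer` are hypothetical names, the sample text is a placeholder rather than real Azerbaijani data, and the 8,192-token context size is an arbitrary assumption. Any callable that maps a string to a list of tokens could be passed in place of the toy tokenizers.

```python
from typing import Callable, Iterable, List

Tokenizer = Callable[[str], List[str]]


def tokens_per_word(tokenize: Tokenizer, texts: Iterable[str]) -> float:
    """Average number of tokens produced per whitespace-delimited word."""
    total_tokens = 0
    total_words = 0
    for text in texts:
        total_tokens += len(tokenize(text))
        total_words += len(text.split())
    if total_words == 0:
        raise ValueError("texts contain no words")
    return total_tokens / total_words


def words_that_fit(context_tokens: int, fertility: float) -> int:
    """Approximate number of words that fit in a fixed context window."""
    return int(context_tokens / fertility)


def chunk_tokenizer(size: int) -> Tokenizer:
    """Toy stand-in for a real tokenizer: cuts each word into fixed-size pieces."""
    def tokenize(text: str) -> List[str]:
        return [
            word[i:i + size]
            for word in text.split()
            for i in range(0, len(word), size)
        ]
    return tokenize


# Placeholder sample: two 8-character "words".
sample = ["abcdefgh ijklmnop"]

# A tokenizer that fragments words heavily vs. one with longer pieces.
baseline = tokens_per_word(chunk_tokenizer(2), sample)
custom = tokens_per_word(chunk_tokenizer(4), sample)

print(f"baseline: {baseline:.1f} tokens/word, "
      f"{words_that_fit(8192, baseline)} words per 8192-token window")
print(f"custom:   {custom:.1f} tokens/word, "
      f"{words_that_fit(8192, custom)} words per 8192-token window")
```

Halving tokens per word doubles the words that fit in the same window, which is the relationship the 2× claim rests on.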
The framework implements three sequential stages, each producing artifacts that feed the next: a custom tokenizer, continued pre-training (CPT), and LoRA fine-tuning.
The training stages (CPT and LoRA fine-tuning) were run as Amazon SageMaker AI training jobs launched from Amazon SageMaker Unified Studio, each pointing to a custom training script. Each job provisions fresh Amazon Elastic Compute Cloud (Amazon EC2) instances and terminates after completion, so you pay only for actual compute time with no idle cluster cost.
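As a rough illustration of what launching one such job can look like, the sketch below uses the `PyTorch` estimator from the SageMaker Python SDK (v2-style API). It is not the authors' launcher: the script name, S3 URIs, hyperparameters, framework and Python versions, single-instance count, and the helper names `build_job_config` and `launch_job` are all assumptions chosen for illustration. Only the instance type is taken from the post.

```python
from typing import Any, Dict


def build_job_config(stage: str, entry_point: str,
                     train_s3_uri: str, output_s3_uri: str) -> Dict[str, Any]:
    """Collect the settings for one training job (all values are illustrative)."""
    return {
        "entry_point": entry_point,          # hypothetical custom training script
        "instance_type": "ml.p5.48xlarge",   # instance type named in the post
        "instance_count": 1,                 # assumption: single-instance job
        "framework_version": "2.3",          # assumption: PyTorch container version
        "py_version": "py311",               # assumption: matching Python version
        "output_path": output_s3_uri,        # where model artifacts are written
        "hyperparameters": {"stage": stage}, # passed to the script as CLI args
        "inputs": {"train": train_s3_uri},   # channel name -> S3 location
    }


def launch_job(config: Dict[str, Any], role_arn: str):
    """Start the job; requires the `sagemaker` package and AWS credentials."""
    # Imported lazily so the config can be built and inspected without the SDK.
    from sagemaker.pytorch import PyTorch

    estimator_args = {k: v for k, v in config.items() if k != "inputs"}
    estimator = PyTorch(role=role_arn, **estimator_args)
    # fit() provisions the instances, runs the script, and by default blocks
    # until the job finishes and the instances are released.
    estimator.fit(config["inputs"])
    return estimator


# Hypothetical CPT job definition; bucket and script names are placeholders.
cpt_config = build_job_config(
    stage="cpt",
    entry_point="train_cpt.py",
    train_s3_uri="s3://example-bucket/data/cpt/",
    output_s3_uri="s3://example-bucket/artifacts/cpt/",
)
# launch_job(cpt_config, role_arn="<your-sagemaker-execution-role-arn>")
```

Keeping the configuration separate from the launch call makes it easy to review or version the job settings before any compute is provisioned.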
The following diagram illustrates the modular architecture, where each stage can be optimized independently. Tokenizer improvements benefit every subsequent training stage, and CPT configurations transfer across fine-tuning tasks.
Figure 1. The training pipeline architecture. Operators launch training jobs from Amazon SageMaker AI Notebook Instances. Training data and model artifacts are stored in Amazon Simple Storage Service (Amazon S3). Training metrics are tracked with TensorBoard in Amazon SageMaker AI, and system metrics are captured through Amazon CloudWatch.
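The hand-off between stages can be pictured as each stage reading the previous stage's artifact location and writing its own. The sketch below is a hypothetical, storage-agnostic model of that chain in plain Python. The stage names are inferred from the tokenizer, CPT, and LoRA fine-tuning steps mentioned in the post, while `Stage`, `plan_pipeline`, and the S3 prefix are illustrative assumptions rather than part of the described framework.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    name: str
    input_uri: Optional[str]  # artifact from the previous stage, if any
    output_uri: str           # where this stage writes its own artifact


def plan_pipeline(base_uri: str, stage_names: List[str]) -> List[Stage]:
    """Chain stages so that each one consumes the previous stage's output."""
    stages: List[Stage] = []
    previous_output: Optional[str] = None
    for name in stage_names:
        output_uri = f"{base_uri.rstrip('/')}/{name}/"
        stages.append(Stage(name, previous_output, output_uri))
        previous_output = output_uri
    return stages


# Placeholder bucket and prefix.
plan = plan_pipeline("s3://example-bucket/az-llm", ["tokenizer", "cpt", "lora"])
for stage in plan:
    print(f"{stage.name}: {stage.input_uri} -> {stage.output_uri}")
```

Because each stage depends only on an artifact location, a stage can be rerun or retuned without touching the others, as long as its output path stays stable.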
