Context: Recent work in Artificial Intelligence (AI) has shifted from scaling up ever-larger models towards building smaller, more efficient ones.
About Small Language Models (SLMs)
- SLMs are AI models that can process, understand and generate human language, using far fewer parameters than Large Language Models (LLMs).
- Like LLMs, SLMs are built on a neural network architecture known as the transformer.
- SLMs are cheaper and faster to train than LLMs, and require fewer computational resources.
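The difference in computational resources can be made concrete with a rough calculation. The sketch below estimates only the memory needed to hold a model's weights, as the number of parameters multiplied by the bytes stored per parameter. The parameter counts (3 billion and 70 billion) and the precisions (16-bit and 4-bit) are illustrative assumptions, not figures for any specific model, and the estimate ignores activations and other runtime overhead.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed to hold model weights, in GB (10^9 bytes)."""
    bytes_total = num_params * bits_per_param / 8  # 8 bits per byte
    return bytes_total / 1e9


# Hypothetical model sizes, chosen for illustration only.
models = {
    "small model (3B parameters)": 3e9,
    "large model (70B parameters)": 70e9,
}

for name, params in models.items():
    for bits in (16, 4):  # assumed weight precisions
        size = weight_memory_gb(params, bits)
        print(f"{name} at {bits}-bit: {size:.1f} GB")
```

Under these assumptions, the smaller model needs about 6 GB at 16-bit and 1.5 GB at 4-bit, while the larger one needs about 140 GB and 35 GB. This is the kind of gap that makes smaller models the natural candidates for phones and laptops.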
Advantages of SLMs
- Accuracy: On the narrow tasks they are built or fine-tuned for, SLMs can deliver results as accurate as, or more accurate than, those of general-purpose LLMs.
- Privacy: SLMs can perform tasks such as text prediction, voice commands and translation locally, without sending data to the cloud.
- Specialization: SLMs are designed for specific use cases rather than the broad general intelligence goals of LLMs.
- Sustainability: SLMs consume less energy to train and run, which reduces their environmental footprint.
- On-Device Functionality: SLMs can run directly on consumer devices. For example, Apple Intelligence integrates SLMs into iPhones and iPads, balancing performance with device constraints.
Limitations of SLMs
- Inferior performance on complex tasks: SLMs struggle on benchmarks for coding, logical reasoning and intricate problem-solving, and cannot yet match LLMs on challenges that require broad general intelligence.
- Factual accuracy: With fewer parameters, SLMs can store less factual knowledge, which can lead to factual inaccuracies.
Industry Adoption
- Google DeepMind released the Gemini family, which includes the lightweight Nano and Flash models alongside the larger Ultra.
- OpenAI launched GPT-4o mini, and Meta introduced the Llama 3 models.
- Amazon-backed Anthropic released the Claude 3 family, whose smallest model is Claude 3 Haiku.
- Indian initiatives include Visvam (IIIT Hyderabad) and Sarvam AI.