Over the past few years, we’ve witnessed an arms race in AI to build ever-larger language models. Models have grown steadily from hundreds of millions of parameters to hundreds of billions. However, an opposing trend is emerging that favors smaller, more specialized AI models over these behemoths.
In our recent whitepaper, The Bright Future of AI, Hum explores both trajectories and predicts where models of different scales will deliver the most value in the years ahead.
The Rise of Massive Models
In the race to push AI capabilities forward, model scale has proven to be a key lever. As models grow in size and are trained on more data and compute, their capabilities improve in broadly predictable ways. Leading AI experts predict that frontier models will continue to expand at a breakneck pace over the next few years.
Anthropic, the startup behind conversational AI model Claude, believes that in 18 months they can build a model 10x more powerful than today’s largest models. This “Claude-Next” will likely require over a billion dollars to train and run. Meanwhile, DeepMind co-founder Mustafa Suleyman forecasts that model sizes will grow 1000x in the next three years alone.
Bigger models paired with more data and compute tend to become more generally intelligent. As model capacity increases into the trillions of parameters, we can expect them to take on more complex, human-like capabilities. These massive models will likely play an essential role where diverse data and open-ended tasks are involved - think creative applications like writing novels or conducting scientific research. They have the sheer “brain size” to take on these frontier challenges.
The Promise of Smaller Models
Despite the appeal of scale, smaller models are proving they can match or even beat far larger models on focused tasks. And they do so at a fraction of the financial and computational cost.
Over the next few years, companies will realize that cheaper, specialized models make more sense than gigantic general models for most real-world AI applications. Today, the true cost of large language models is often masked by tech giants subsidizing training and deployment to drive platform usage. Once those market incentives realign, the capabilities and efficiency of right-sized models will become more apparent.
Smaller models can compress similar intelligence into far fewer parameters by training on cleaner, less noisy data. For instance, Microsoft recently unveiled Phi-2, a model with just 2.7 billion parameters that matches or outperforms models up to 25x larger on complex language tasks. By carefully curating the model’s training data, Microsoft achieved state-of-the-art performance in a tiny package.
For straightforward applications like search, chatbots, and content summarization, smaller models are likely sufficient and most cost-effective.
Choosing the Right Model
To summarize, while model scale will continue trending upward rapidly, smaller models are also getting smarter. Massive models and small models will both own a slice of the future AI market.
Large models with trillions of parameters will push new frontiers in general intelligence and take on open-domain challenges. High-value use cases will justify the substantial cost of accessing these models.
At the other end of the spectrum, specialized models with millions to billions of parameters will handle focused tasks for most companies. Continued progress in model quality, not just quantity, will ensure that impressive capabilities don’t require astronomical scale.
Rather than framing this as a “battle of the models,” it’s best to see large and small models as complementary tools serving different needs. The companies that learn to navigate this dichotomy - deploying the right models at the right scale for their business - will have an AI advantage in the years ahead.
To explore how publishers are already combining models for more advanced use cases and to learn how you can prepare for an AI future, download a copy of The Bright Future of AI.