Watercooler
November 11, 2023
5 min read

The Best Large Language Models on The Market

Krissy Davis

Large language models are sophisticated programs that enable machines to comprehend and generate human-like text. They have been the foundation of natural language processing for almost a decade. Although generative AI has only recently gained popularity, modern large language models started to emerge in 2014 after the publication of a research paper titled "Neural Machine Translation by Jointly Learning to Align and Translate."

Since then, there has been a significant increase in research and development of large language models, with many models being released. Some are open-source, while others belong to large companies such as Google and Microsoft.

This article will explore some of the best large language models currently available and discuss why they've made the list. 

The best large language models

In no particular order, here are the large language models we consider most relevant in today's market.

1. GPT-4

Key points: Increased accuracy and precision compared to other OpenAI models. However, it's a closed-source model, which makes it difficult to audit or modify. 

GPT-4 is the latest and most advanced language model in the GPT series by OpenAI. It is reported to be a mixture of eight expert models with roughly 220 billion parameters each, for a combined total of around 1.76 trillion parameters. As the largest model in the series, GPT-4 is capable of complex reasoning and performs strongly across a wide range of academic tasks, with some commentators suggesting it comes close to artificial general intelligence (AGI). It can also accept images as input alongside text, a significant upgrade from previous models.

GPT-4 powers Microsoft Bing search, is currently available to ChatGPT Plus subscribers, and is expected to be integrated into Microsoft Office products in the future. A noteworthy feature of GPT-4 is the system message, which lets users specify the tone of voice and the task they want the model to perform.
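To illustrate, here's a minimal sketch of how a system message can be supplied through the OpenAI Python SDK. The prompts are our own examples, and it assumes an OPENAI_API_KEY environment variable is set.

```python
# Minimal sketch of GPT-4's system message, using the OpenAI Python SDK
# (v1-style client). Assumes the OPENAI_API_KEY environment variable is set.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        # The system message sets the tone of voice and the task.
        {"role": "system", "content": "You are a terse technical editor. Answer in bullet points."},
        {"role": "user", "content": "Summarise the trade-offs between GPT-4 and GPT-3.5."},
    ],
)
print(response.choices[0].message.content)
```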

2. GPT-3.5 

Key points: A good large language model option for individuals and businesses; as a cloud-based service, it scales to meet the needs of users of all sizes. It's not as accurate as GPT-4.

GPT-3.5 has a faster response time than its successor, GPT-4. However, due to its smaller size, it's less accurate and has less expertise in specific domains. On the HumanEval coding benchmark, for example, GPT-3.5 scored 48.1% while GPT-4 scored a much higher 67%.

GPT-3.5 was fine-tuned through reinforcement learning from human feedback, and it is the version of GPT that powers ChatGPT. OpenAI describes GPT-3.5 Turbo as the most capable and cost-effective model in the GPT-3.5 family. Its training data only extends up to September 2021, however, so limited knowledge of recent events is a drawback of this large language model.

3. PaLM 2

Key points: Best large language model for quick responses and relevant, up-to-date data. Successor to PaLM, one of the largest language models at 540 billion parameters (Google hasn't disclosed PaLM 2's exact size). It's a closed-source model, so its code isn't publicly available.

PaLM 2 is a powerful transformer-based model developed by Google that powers its AI chatbot Bard. It succeeds the original 540-billion-parameter PaLM and was trained across multiple TPU v4 Pods, Google's custom hardware designed for machine learning.

PaLM 2's strengths are reasoning, formal logic, mathematics, and advanced coding in multiple programming languages. It has a quick response time, and Google reports that it outperforms GPT-4 on some reasoning evaluations. It's also excellent at understanding idioms, riddles, and nuanced text in multiple languages, can offer three response options at a time, and is particularly good at breaking down complex tasks into simpler subtasks.

PaLM 2 has several fine-tuned versions, including Med-PaLM 2, designed for life sciences and medical information, and Sec-PaLM, used in cybersecurity deployments to speed up threat analysis.

4. Claude 

Key points: Best large language model for safe, secure, and reliable outputs. Less accurate than GPT-4. 

Claude is a powerful large language model created by Anthropic and backed by Google. Anthropic's primary focus is building AI systems that are safe, fair, and reliable, and Claude is guided by a set of written principles (an approach Anthropic calls constitutional AI) to keep its output helpful, accurate, and harmless. Claude powers Anthropic's two main product offerings: Claude Instant and Claude 2. According to Anthropic, Claude 2 excels at complex reasoning; it performs similarly to GPT-4 and offers a long context window of up to 100,000 tokens. On accuracy benchmarks, Claude performs better than PaLM 2 and ranks just below GPT-4.

5. Cohere

Key points: Best large language model for businesses due to its high accuracy and customisation. It's more expensive than the OpenAI models.

Cohere is an enterprise large language model that can be customised and fine-tuned to a company's specific use cases. The company behind it was co-founded by one of the authors of the research paper "Attention Is All You Need", which introduced the transformer architecture in 2017. Cohere has a unique advantage over other providers in that it isn't restricted to a single cloud platform, unlike OpenAI, which is tied to Microsoft Azure. Cohere is known for its high accuracy and robustness, but it's relatively more expensive than OpenAI's models.
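As a rough illustration of what working with Cohere looks like, here's a minimal generation sketch using Cohere's Python SDK; the model name, prompt, and API-key placeholder are illustrative assumptions rather than details from this article.

```python
# Minimal sketch: text generation with Cohere's Python SDK (v4-style client).
# Assumes `pip install cohere` and a valid API key; "command" is Cohere's
# general-purpose generation model.
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder; use your own key

response = co.generate(
    model="command",
    prompt="Write a one-paragraph product description for a password manager.",
    max_tokens=200,
)
print(response.generations[0].text)
```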

6. Falcon 

Key point: Best open-source large language model on the market. 

Falcon is an open-source language model from the Technology Innovation Institute (TII) with three main variants: Falcon 40B (40 billion parameters), Falcon 7B (7 billion parameters), and Falcon 1B (1 billion parameters). It is a causal decoder-only model based on the transformer architecture, trained on data in multiple languages, and released under the permissive Apache 2.0 license. It outperforms other open-source models such as LLaMA, StableLM and MPT. Amazon has made Falcon 40B available on Amazon SageMaker, and the model weights can also be downloaded for free from the Hugging Face Hub.
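Because the weights are openly licensed, Falcon can be run locally. Below is a minimal sketch using the Hugging Face transformers library with the smaller Falcon-7B checkpoint; it assumes transformers, torch and accelerate are installed and that you have enough GPU (or CPU) memory.

```python
# Minimal sketch: run Falcon-7B locally with Hugging Face transformers.
# Assumes `pip install transformers torch accelerate` and sufficient memory;
# older transformers releases may additionally need trust_remote_code=True.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "tiiuae/falcon-7b"  # Apache 2.0 licensed weights on the Hugging Face Hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to cut memory use
    device_map="auto",           # place layers on available devices automatically
)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generator("Large language models are", max_new_tokens=50)[0]["generated_text"])
```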

7. Large Language Model Meta AI | Llama

Key point: Predecessor to many open-source large language models.

Meta has developed a language model called Llama, released in several sizes ranging from 7 billion up to 65 billion parameters. According to Meta, the 13B version outperforms GPT-3 on most benchmarks. Many developers have used Llama's smaller versions as the starting point for open-source models. Still, the original release is limited to research purposes only and can't be used, as Falcon can, to build commercial projects or products.

8. Guanaco-65B 

Key points: Best open-source large language model after Falcon. Derived from Meta's Llama. 

Guanaco is an open-source language model derived from Llama that performs well on the MMLU benchmark. Guanaco is more efficient than larger decoder-only models such as GPT-3 and GPT-4, generating text more quickly and with fewer computational resources. Guanaco-65B is the largest version, with 65 billion parameters; smaller 7B, 13B and 33B versions are also available. All of the models have been fine-tuned on the OASST1 dataset.

9. Vicuna 33B

Key points: It's smaller than other large language models but performs exceptionally well. 

Vicuna is another open-source large language model derived from Meta's Llama. It was fine-tuned with supervised instruction tuning on conversations collected from sharegpt.com, a platform where users share their ChatGPT conversations. Despite being smaller and less capable than GPT-4, it performs remarkably well for a model of its size.

10. MPT-30B

Key point: Outperforms the original GPT-3. A smaller model that can run locally on your system.

MPT-30B is another open-source model, developed by MosaicML, with a context length of 8,000 tokens. Its chat variant is fine-tuned on datasets from ShareGPT, Camel-AI, GPTeacher and Baize. MPT-30B also outperforms the original GPT-3, so if you're after a smaller large language model that can run locally on your system, it's a great choice.
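As a rough sketch of what running MPT-30B locally involves, the snippet below loads the mosaicml/mpt-30b checkpoint with transformers. MPT ships a custom architecture, so trust_remote_code=True is required; the prompt is our own example, and hardware requirements are still substantial for a 30B model.

```python
# Minimal sketch: load MPT-30B locally with Hugging Face transformers.
# MPT uses custom modelling code, so trust_remote_code=True is required.
# Assumes `pip install transformers torch` and a machine with enough memory.
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

name = "mosaicml/mpt-30b"

config = AutoConfig.from_pretrained(name, trust_remote_code=True)
config.max_seq_len = 8192  # use the full 8k-token context window

# MPT uses the GPT-NeoX-20B tokenizer
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

inputs = tokenizer("The advantages of running a model locally are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```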

11. Orca

Key points: Small enough to run on a laptop. A large language model developed by Microsoft. 

With 13 billion parameters, Orca is small enough to run on a laptop. Developed by Microsoft and built on top of the 13-billion-parameter LLaMA model, it aims to improve on other open-source models by learning to imitate the step-by-step reasoning traces produced by larger models such as GPT-4. Despite its smaller size, it performs on par with GPT-3.5 for many tasks.

