Open-Source LLMs vs Closed: Unbiased 2024 Guide for Innovative Companies

When it comes to deciding between using open-source LLMs or closed ones, it’s not a matter of which is better but which is better for you.

One company or team may benefit from the freedom of open-source where they can tweak and customize the model while another will benefit from the structure and support of one that’s closed.

To know what you should go with, you need to understand the basics of LLMs and learn what differentiates the two models most. We’ve broken it down for you in this article.

Understanding LLMs: A Primer

Below, we give a brief definition of LLMs and explain how they’ve evolved.

Definition and Significance of LLMs

Large Language Models (LLMs) are advanced AI systems that can understand and generate content, primarily text and code, with multimodal variants extending to images, video, and audio.

How is this possible? Because these models contain billions of parameters, the numerical weights they learn while training on vast amounts of data. That’s what makes them so ‘large’. They use these parameters to capture language patterns and respond appropriately.

LLMs can be applied to a range of natural language processing tasks, including:

  • text generation (sentences as well as code and mathematical equations)
  • language translation
  • sentiment analysis
  • data analysis
  • question answering
  • text summarization

 


Evolution of LLMs

But LLMs, like anything in tech, took years of iteration and innovation to reach the capabilities they have today.

In the early days, rule-based systems were used, which relied on manually created rules to process language.

Rules, such as: “If a sentence starts with ‘What time’ or ‘When’ and ends with a question mark, classify it as a time-related question.”
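To make that concrete, here is a minimal sketch of the rule quoted above written as code. The function name and the sample sentences are our own illustrations, not taken from any historical system.

```python
def is_time_question(sentence: str) -> bool:
    """Hand-written rule: starts with 'What time' or 'When' and ends with '?'."""
    text = sentence.strip()
    starts_like_time = text.lower().startswith(("what time", "when"))
    return starts_like_time and text.endswith("?")


print(is_time_question("When does the store open?"))   # True
print(is_time_question("What time is the meeting?"))   # True
print(is_time_question("At what hour do you close?"))  # False: same intent, but no rule matches
print(is_time_question("Whenever you are ready?"))     # True: the rule fires on a non-time question
```

The last two calls show the weakness: every new phrasing needs another hand-written rule, and loose rules misfire.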

This was incredibly manual, inflexible, and limiting. When machine learning approaches became widespread, and deep learning took off in the early 2010s, however, there was a shift.

Suddenly, models could be trained to recognize patterns in language data, allowing them to understand and generate language with much greater complexity and nuance, far beyond the rigid confines of rule-based systems.

1966: ELIZA
One of the first chatbots, created by Joseph Weizenbaum, simulating a psychotherapist in conversation.

2013: word2vec
A groundbreaking tool developed by a team led by Tomas Mikolov at Google, introducing efficient methods for learning word embeddings from raw text.

2018: GPT and BERT
  • GPT (Generative Pretrained Transformer): OpenAI introduced GPT, showcasing a powerful model for understanding and generating human-like text.
  • BERT (Bidirectional Encoder Representations from Transformers): Developed by Google, BERT significantly advanced the state of the art in natural language understanding tasks.

2020: GPT-3
OpenAI released GPT-3, a model with 175 billion parameters, achieving unprecedented levels of language understanding and generation capabilities.

2022: Midjourney and Other Innovations
The launch of Midjourney, along with other models and platforms, reflected the growing diversity and application of AI in creative processes, design, and beyond, indicating a broader trend towards multimodal and specialized AI systems.

Late 2022: Introduction of ChatGPT
OpenAI introduced ChatGPT, a conversational agent based on the GPT-3.5 model, designed to provide more engaging and natural dialogue experiences. ChatGPT showcased the potential of GPT models in interactive applications.

2023: GPT-4
OpenAI released GPT-4, an even more powerful and versatile model than its predecessors, with improvements in understanding, reasoning, and generating text across a broader range of contexts and languages.

Statistical models, such as n-grams, were developed to learn patterns from text data. Later, neural networks, including recurrent neural networks (RNNs) and long short-term memory (LSTM) models, were used for sequence processing.
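As a rough illustration of the n-gram idea, the sketch below counts which word follows which in a tiny made-up corpus and predicts the most frequent follower. The function names and the corpus are assumptions chosen for the example. Real n-gram systems used far larger corpora and smoothing techniques.

```python
from collections import Counter, defaultdict


def train_bigrams(text: str) -> dict:
    """Count, for every word, how often each other word directly follows it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def most_likely_next(counts: dict, word: str):
    """Return the most frequent follower of `word`, or None if the word was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)
print(most_likely_next(model, "the"))  # cat
print(most_likely_next(model, "dog"))  # None
```

Unlike a hand-written rule, the pattern here is learned from data, but the model only ever looks one word back.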

Then came the transformer architecture in 2017, which underpins models like BERT and GPT (both released in 2018). A transformer can process an entire sequence of text simultaneously, unlike its predecessors, which processed text sequentially.

Diagram comparing RNN and Transformer models in NLP tasks.
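The difference between sequential and simultaneous processing can be sketched in a few lines. This is a deliberately simplified toy, under stated assumptions: the recurrent pass is a plain running average rather than a real RNN cell, and the attention function omits the learned query, key, and value projections that actual transformers use. All names and the three toy token vectors are illustrative.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def recurrent_pass(vectors, decay=0.5):
    """RNN-style: a single state is updated token by token, strictly in order."""
    state = [0.0] * len(vectors[0])
    states = []
    for vec in vectors:  # step t cannot start until step t-1 has finished
        state = [decay * s + (1 - decay) * v for s, v in zip(state, vec)]
        states.append(state)
    return states


def self_attention(vectors):
    """Transformer-style: every position looks at every other position directly."""
    scale = math.sqrt(len(vectors[0]))
    outputs = []
    for query in vectors:  # iterations are independent, so they can run in parallel
        weights = softmax([dot(query, key) / scale for key in vectors])
        outputs.append([
            sum(w * value[i] for w, value in zip(weights, vectors))
            for i in range(len(query))
        ])
    return outputs


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy token embeddings
print(recurrent_pass(tokens))
print(self_attention(tokens))
```

In the recurrent version each step depends on the previous one, while each attention output depends only on the inputs, which is why production implementations compute all positions as one batched matrix operation.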

This makes LLMs more adaptable and applicable to new use cases such as sentiment analysis, question answering, and text classification with relatively small amounts of task-specific data.

It’s these latest developments that have captured the most attention and interest, paving the way for democratized access to scalable LLMs we didn’t have before.

Open Source LLMs vs Closed: The Core Debate

Open Source LLMs
Characterized by their public accessibility, open-source LLMs encourage a collaborative approach to development.

Anyone can inspect the code, identify flaws, suggest improvements, or adapt the model for specific purposes, fostering a community-driven advancement.

Closed Source LLMs
These models are proprietary, developed, and maintained within a closed environment.

Access to the underlying code and training data is restricted, often making these models black boxes to external users.

As LLMs have evolved, two types have emerged: open-source models and closed models.

Open-source models are publicly available, allowing anyone to use, modify, and distribute the software.

Closed-source models, on the other hand, are proprietary, with their code and model weights accessible only to the organization that developed them. Everyone else typically gets access through a paid API or license.

The emphasis is on protecting intellectual property and monetizing the technology.

Examples of Open Source Models
  • GPT-2 and GPT-Neo
  • BERT (Bidirectional Encoder Representations from Transformers)
  • ELECTRA
  • T5 (Text-to-Text Transfer Transformer)
  • Llama 3
  • Grok
  • Mixtral
  • DBRX

Examples of Closed Models
  • GPT-3 & 4 by OpenAI
  • Amazon Alexa’s Language Model
  • IBM Watson
  • Microsoft Visual Studio
  • Tableau Software
  • JetBrains IntelliJ IDEA
  • Microsoft Azure Cognitive Services
  • Google Cloud AI and Natural Language

Different models shine for different reasons. Below you can see how several models perform in terms of quality, speed, and price.
Comparison of AI models on quality, speed, and price. GPT-4 Turbo leads in quality, Gemini 1.5 in speed, and Llama 3 (8B) in cost.

Source: https://artificialanalysis.ai/models

When it comes to quality (as of May 2024), GPT-4o leads the pack, but it only performs at mid level for speed. The fastest of the sampled models is Gemini 1.5.

Personally, I find quality depends on the use case. For example, as a writer I find ChatGPT to have stronger logical performance but I prefer the style of writing that comes out of Claude.

So as a developer, you may find one performs better than another for coding. Our team seems to prefer GPT-4o based on experience and available data, while the best open-source model for coding is Llama-3-70B.

Bar graph comparing coding performance of AI models using HumanEval. GPT-4 leads with 90.2.

Why the Choice Matters

If both are LLMs, does it matter whether it’s open or closed?

Yes, because the open-door and closed-door approaches to LLMs influence three key factors:

  • Speed of Innovation & Customization
  • Accessibility & Cost
  • Data Security

 

How each approach plays out across those factors determines which model your business should use.

Let’s look at each one in turn and how they play out under each model.

Speed of Innovation & Customization

HatchWorks’ verdict: Open-source large language models allow for more customization and have a higher potential for faster innovation.
Open Source LLMs
Offer companies the flexibility to customize and fine-tune the model to fit their specific needs.

Given their accessibility, these models enable businesses to innovate rapidly, adapting the technology to new challenges or integrating it with other systems without waiting for vendor updates.

This can be particularly beneficial for companies with the technical expertise to modify and maintain the models.

Open source models also benefit from developers sharing their successes in advancing the model. This crowdsourcing of information further advances innovation of the LLM.
Closed Source LLMs
While potentially limiting in terms of customization, closed-source models offer sophisticated solutions that have been developed with significant resources.

Companies might prefer these for cutting-edge performance or specific functionalities that are not yet available in open-source alternatives.

However, reliance on the vendor for updates can slow down innovation.

Accessibility & Cost

HatchWorks’ verdict: Within both open-source and closed-source models, there’s a range of costs and accessibility.

For example, GPT-4 Turbo, a closed-source model, costs about $10 per million input tokens and $30 per million output tokens, while Llama-3-70B, an open-source model, costs about 60 cents per million input tokens and 70 cents per million output tokens. That’s roughly 17x cheaper on input and more than 40x cheaper on output, with very little performance difference between them.

Thus, on cost alone, it makes sense to go with an open-source model like Llama-3-70B.
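To see what those per-token prices mean for a budget, here is a small sketch that applies them to a workload. The prices are the ones quoted above. The monthly token volumes, the dictionary keys, and the function name are hypothetical, so substitute your own figures.

```python
# Prices in US dollars per million tokens, as quoted in this article.
PRICES_PER_MILLION = {
    "closed_gpt4": {"input": 10.00, "output": 30.00},
    "open_llama3_70b": {"input": 0.60, "output": 0.70},
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the token bill in dollars for one month of usage."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 200M input tokens and 50M output tokens per month.
closed_cost = monthly_cost("closed_gpt4", 200_000_000, 50_000_000)
open_cost = monthly_cost("open_llama3_70b", 200_000_000, 50_000_000)

print(f"Closed model: ${closed_cost:,.2f}")             # Closed model: $3,500.00
print(f"Open model:   ${open_cost:,.2f}")               # Open model:   $155.00
print(f"Ratio:        {closed_cost / open_cost:.1f}x")  # Ratio:        22.6x
```

The overall ratio depends on your mix of input and output tokens, and this sketch covers token pricing only. It leaves out the engineering and maintenance costs discussed below.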

Open Source LLMs
Open-source models like Llama-3-70B offer significantly lower costs for developers, with pricing as low as 60 cents per million input tokens and 70 cents per million output tokens.

This affordability enables a wider range of developers to leverage advanced AI capabilities without substantial financial investment.

However, developers should be prepared to invest in customization and maintenance to maximize the model's utility and security.
Closed Source LLMs
Typically, a closed LLM will come with licensing fees, which might include ongoing costs for updates and support.

For developers without the expertise to maintain and update an LLM, these ongoing support services can justify the cost, ensuring the model remains effective and secure.

Security

HatchWorks’ verdict: If security is paramount in your organization and your LLM use cases put your data at risk, it’s better to use an open-source LLM. Or you could build your own closed source model under your own infrastructure where you alone control the security and compliance.

Open Source LLMs
When deployed on a company's private cloud, open-source models offer enhanced control over security measures and data privacy.

This setup allows organizations to implement tailored security protocols, ensuring that sensitive data is protected according to their specific standards.

The transparency of open-source software enables thorough audits and continuous improvement of security features. Companies can also benefit from the collective expertise of the open-source community in identifying and addressing vulnerabilities.
Closed Source LLMs
The security is managed by the vendor, which can offer peace of mind to companies without a large IT department.

Plus, vendors might provide compliance certifications necessary for certain industries, simplifying regulatory adherence.

But, the company has less visibility into potential vulnerabilities and has to rely on the vendor's diligence for security updates.

Which Model is Winning the Hearts of Enterprise Companies?

Reports have emerged showing that open-source LLMs are becoming more popular.

a16z.com, for example, shows 41% of interviewed enterprises will increase their use of open-source models in their business in place of closed models.

A further 41% will switch from closed to open if the open-source model matches the performance of closed models, while a measly 18% don’t plan to increase use of open-source LLMs at all.

If the projections of the report come true, then this will mark a significant shift in company behavior. Where in 2023 the market share was 80%–90% closed source, we’ll start to see an even divide in use of open and closed models.

Donut chart on open-source usage expectations in 2024: 41% will increase usage, 41% switch when performance matches, 18% no plans to increase.
What does this prove? Companies have begun to value the customization, control, and cost-effectiveness of open-source models.
Bar graph showing why enterprises care about open-source: Control (37%), Customizability (37%), Cost (26%). Primary reason: Control (60%).

But just because there’s a movement toward open-source models doesn’t mean they’re right for you or for every use case you have in mind. The same report expects businesses will use both, aiming for a 50-50 split.

Next, we’ll look at the advantages of both so you can decide which one is best suited to your company’s AI needs or if you’ll implement both in different ways.

Advantages of Open Source LLMs

Open-source LLMs are great for offering a community of users and providing transparency in the development of the model.

Community and Collaboration

An open-source LLM thrives on community support. Researchers and developers from around the world can contribute to its development, propose improvements, and fix bugs.

This collaborative approach accelerates innovation, as the community works together to refine and enhance the model.

Open source projects also often have dedicated support channels, such as forums and mailing lists, where users can seek assistance, share knowledge, and engage in discussions.

Transparency and Trust

The transparency afforded by open-source LLMs is critical for building trust among users and developers alike.

Accessible code allows for thorough security audits and ensures that any potential ethical issues can be identified and addressed promptly. This level of openness is vital for ethical AI development, as it enables the community to oversee and guide the model’s evolution, ensuring it adheres to high standards of fairness and unbiased behavior.

Such transparency strengthens the integrity of AI applications, making open-source LLMs a trusted choice for developers and businesses focused on ethical AI solutions.

The Case for Closed Source LLMs

While open-source LLMs have community, collaboration, and transparency on their side, closed-source LLMs offer unique proprietary advancements and security.

Proprietary Advancements

With closed-source LLMs, your business gains access to proprietary advancements shielded by intellectual property rights.

This is crucial in sectors like finance or healthcare where having bespoke AI tools can significantly differentiate your offerings from the competition. When these models are optimized with specific training data, they can solve intricate, industry-specific challenges, giving your company a competitive edge.

Support and Security

Sure, you won’t have the community and collaboration of an open-sourced model but you will have dedicated support from the closed LLM provider. This is useful for companies who don’t have AI experts in-house to guide application of the LLM.

And because the model is closed, access to it and to its training data is tightly controlled by the vendor, and the data you feed into it is typically covered by the vendor’s security and contractual safeguards. That helps prevent unauthorized access and gives you peace of mind while handling sensitive information.

Evaluating the Impact of Both on Businesses

Innovation, accessibility, and security aren’t the only factors you should consider. You also need to look at what best fits your needs in terms of scalability, cost, integration, and customization.

Below, we’ve created skimmable tables to help you understand how each model can impact your business across these factors.

Scalability and Cost

💡 HatchWorks tip: Think about short term and long term use of the LLM. Will you outgrow the use cases a closed model brings? Can you afford the costs of scaling with an open-source model?

Scalability
  • Open Source LLMs: Highly flexible; can be scaled according to specific needs.
  • Closed Source LLMs: Often comes with scalable solutions tailored for enterprise needs.

Initial Cost
  • Open Source LLMs: Generally low or no cost for acquisition.
  • Closed Source LLMs: Typically involves initial purchasing or licensing fees.

Total Cost of Ownership
  • Open Source LLMs: May increase due to the need for specialized talent for maintenance and scaling.
  • Closed Source LLMs: Predictable costs including support and maintenance packages, easing budgeting.

Maintenance
  • Open Source LLMs: Requires in-house expertise or external consultants to manage updates and scaling.
  • Closed Source LLMs: Vendor provides maintenance and updates, reducing the need for specialized in-house skills.

Integration and Customization

💡 HatchWorks tip: Evaluate your team’s technical capabilities and the criticality of tailored solutions to your operations. Does your workflow demand bespoke AI features that open-source models can uniquely provide, or do you value a streamlined, ready-to-use solution that minimizes technical overhead?

Integration Ease
  • Open Source LLMs: Can be complex; requires technical expertise to integrate with existing systems.
  • Closed Source LLMs: Generally easier to integrate, especially if part of a larger suite of business applications.

Technical Requirements
  • Open Source LLMs: Requires significant technical skills for effective customization and integration.
  • Closed Source LLMs: Lower technical demands as vendors often support integration processes.

Flexibility
  • Open Source LLMs: Highly flexible, allowing for adjustments and enhancements as needed.
  • Closed Source LLMs: Less flexible; dependent on vendors for changes and updates.

Navigating Legal Considerations

Nobody wants to embroil themselves in legal troubles. It’s expensive, it’s stressful, and it’s a lot of work to get out of.

Here’s how to address the most common issue we see with LLM use:

Intellectual Property Issues

Closed LLM usage does put you at greater risk of intellectual property (IP) issues because providers place firmer restrictions on how you can use the model.

To avoid stepping on any legal landmines, you must comply with the provider’s licensing agreements at all times. Find they’re too rigid? You can try to negotiate with the LLM provider before you sign a contract and work to create terms that align more closely with your business needs. But there’s no guarantee they’ll change.

Or you can use an open-source model—they typically have fewer IP barriers.

Scenario: Licensing Violation

A software company licenses a closed-source LLM from a major AI technology provider for use in their customer service chatbot.

The license specifies that the LLM can only be used within certain geographical boundaries and for specific applications.

However, the company decides to expand the chatbot’s functionality into other areas not covered by the license or starts offering the LLM-powered service in regions that are outside the licensed territories.

The company’s actions violate the licensing agreement, leading to potential legal action from the LLM provider. This could include cease and desist orders, demands for financial compensation, or a lawsuit claiming infringement of intellectual property rights.
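One way to reduce the risk in this scenario is to encode the license terms next to the code that calls the model and check them before a new deployment ships. The sketch below is illustrative only: the class, the field names, and the example regions and applications are hypothetical, and a check like this supplements legal review rather than replacing it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseTerms:
    """Hypothetical summary of what a closed-model license permits."""
    allowed_regions: frozenset
    allowed_applications: frozenset


def check_usage(terms: LicenseTerms, region: str, application: str) -> list:
    """Return a list of violations; an empty list means the usage is within the license."""
    violations = []
    if region not in terms.allowed_regions:
        violations.append(f"region '{region}' is outside the licensed territories")
    if application not in terms.allowed_applications:
        violations.append(f"application '{application}' is not covered by the license")
    return violations


# Example terms: a customer service chatbot licensed for two regions.
terms = LicenseTerms(
    allowed_regions=frozenset({"US", "CA"}),
    allowed_applications=frozenset({"customer-service-chatbot"}),
)

print(check_usage(terms, "US", "customer-service-chatbot"))  # []
print(check_usage(terms, "DE", "sales-outreach"))            # two violations listed
```

Running a check like this during release reviews makes an expansion into a new region or use case visible before it becomes a breach.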

Finding the Path Forward

When to Choose Open Source LLMs
  • If you seek flexibility and rapid innovation
  • If you have a limited budget to spend on AI (especially in smaller businesses and startups) but still want advanced AI capabilities.
  • If you have highly skilled AI users in-house who can adapt the open-source models to unique use cases
  • If you value ethical AI and ability to audit and adapt the AI tools you use.

When to Choose Closed Source LLMs
  • If you want ready-to-use AI capabilities with comprehensive vendor support.
  • If you don’t have AI expertise in-house and want to rely on vendor expertise for AI integration and maintenance.
  • If security, privacy, and compliance are important in your industry.
  • If you want guaranteed performance and scalability from your AI tool.

