ML Model License Guide — Understanding Open Source AI Licenses
Comprehensive comparison of ML model licenses with a permissions matrix. Covers Apache 2.0, MIT, Llama Community, Gemma, CC-BY-NC, and 10+ other license types used in open-source AI.
By Michael Lip · Updated April 2026
Methodology
License terms are sourced from the official license text of each model as published on Hugging Face, GitHub, or the provider's website. Permissions are interpreted based on the plain text of each license. This is not legal advice — consult a lawyer for production use. Model examples are current as of April 2026. Stack Overflow data was fetched via the public API on April 10, 2026.
License Permissions Matrix
| License | Commercial Use | Fine-tuning | Redistribution | Patent Grant | Attribution Req. | Example Models |
|---|---|---|---|---|---|---|
| Apache 2.0 | Yes | Yes | Yes | Yes | Yes | Mistral 7B, Falcon 40B, Grok-1, Arctic, Qwen 2.5 (most sizes), OLMo |
| MIT | Yes | Yes | Yes | No | Yes | Phi-3, Phi-4 |
| Llama 3.1 Community | Conditional | Yes | Conditional | Yes | Yes | Llama 3, Llama 3.1, Llama 3.3 |
| Gemma License | Conditional | Yes | Conditional | Yes | Yes | Gemma 2 (2B, 9B, 27B) |
| CC-BY-4.0 | Yes | Yes | Yes | No | Yes | OpenAssistant, Dolly 2.0 (data) |
| CC-BY-NC-4.0 | No | Yes | Non-commercial | No | Yes | Command R+ (original), some Stability models |
| OpenRAIL-M | Conditional | Yes | Conditional | No | Yes | BLOOM, Stable Diffusion 2.x |
| Stability Community | Conditional | Yes | Conditional | No | Yes | Stable Diffusion 3, SDXL Turbo |
| DeepSeek License | Conditional | Yes | Conditional | No | Yes | DeepSeek V3 (original weights release); note that DeepSeek R1 is MIT-licensed |
| Mistral Research | No | Yes | No | No | Yes | Mistral Large (weights-only) |
| GPL-3.0 | Copyleft | Yes | Must share alike | Yes | Yes | GPT-J training code, some tools |
| Yi License | Conditional | Yes | Conditional | No | Yes | Yi-Large, Yi-34B |
| Cohere C4AI | Conditional | Yes | Conditional | No | Yes | Command R, Command R+ (v2) |
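The matrix above lends itself to a simple lookup table. The following is a minimal Python sketch, not a compliance tool: `LicenseTerms`, `LICENSES`, and `licenses_where` are hypothetical names introduced for illustration, and the values are copied from a subset of the rows above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseTerms:
    """One row of the permissions matrix, stored as the table's own strings."""
    commercial_use: str
    fine_tuning: str
    redistribution: str
    patent_grant: str
    attribution: str


# Values copied from the matrix above (a subset of rows, for brevity).
# They are plain-text interpretations, not legal determinations.
LICENSES = {
    "Apache 2.0": LicenseTerms("Yes", "Yes", "Yes", "Yes", "Yes"),
    "MIT": LicenseTerms("Yes", "Yes", "Yes", "No", "Yes"),
    "Llama 3.1 Community": LicenseTerms("Conditional", "Yes", "Conditional", "Yes", "Yes"),
    "Gemma License": LicenseTerms("Conditional", "Yes", "Conditional", "Yes", "Yes"),
    "CC-BY-4.0": LicenseTerms("Yes", "Yes", "Yes", "No", "Yes"),
    "CC-BY-NC-4.0": LicenseTerms("No", "Yes", "Non-commercial", "No", "Yes"),
    "Mistral Research": LicenseTerms("No", "Yes", "No", "No", "Yes"),
}


def licenses_where(field: str, accepted: set[str]) -> list[str]:
    """Return license names whose value in `field` is one of `accepted`, sorted by name."""
    return sorted(
        name for name, terms in LICENSES.items()
        if getattr(terms, field) in accepted
    )


if __name__ == "__main__":
    # Licenses the matrix marks as unconditionally usable commercially.
    print(licenses_where("commercial_use", {"Yes"}))
    # Licenses the matrix marks as carrying a patent grant.
    print(licenses_where("patent_grant", {"Yes"}))
```

Keeping "Conditional" as its own value, rather than folding it into yes or no, means a filter never silently treats a conditional license as unrestricted; the conditions still have to be read in the license text.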
License Conditions Detail
| License | Key Restriction | Scale Threshold | Derivative Naming |
|---|---|---|---|
| Llama 3.1 Community | Separate license from Meta required above 700M MAU | 700M MAU | Name must begin with "Llama" |
| Gemma License | Prohibited harmful uses; redistribution requires same license | None specified | No requirement |
| OpenRAIL-M | Behavioral use restrictions (no harm, surveillance, discrimination) | None | No requirement |
| Stability Community | Revenue threshold; no competing products | $1M annual revenue | No requirement |
| DeepSeek License | No competing model training; usage limits | Varies | No requirement |
| CC-BY-NC-4.0 | No commercial use of any kind | N/A | No requirement |
Frequently Asked Questions
Can I use Llama 3 commercially?
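Yes, with conditions. The Llama 3 and Llama 3.1 Community Licenses allow commercial use unless your products and services had more than 700 million monthly active users in the calendar month before the model version's release date. Above that threshold, you need a separate license from Meta. You must also prominently display the attribution notice ("Built with Meta Llama 3" under the Llama 3 license, "Built with Llama" under Llama 3.1) and comply with the Acceptable Use Policy, which prohibits certain harmful use cases.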
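The threshold rule described above can be sketched as a small check. This is an illustration under stated assumptions, not legal guidance: `needs_separate_meta_license` and `LLAMA_MAU_THRESHOLD` are hypothetical names, and the code assumes the boundary is exclusive, so only more than 700 million monthly active users triggers the separate license.

```python
# Threshold taken from the text above: 700 million monthly active users (MAU).
LLAMA_MAU_THRESHOLD = 700_000_000


def needs_separate_meta_license(monthly_active_users: int) -> bool:
    """Return True if the MAU count exceeds the community-license threshold.

    Assumption: the boundary is exclusive (exactly 700M does not trigger it).
    This checks the user threshold only; the attribution notice and the
    Acceptable Use Policy apply regardless of the result.
    """
    if monthly_active_users < 0:
        raise ValueError("monthly_active_users must be non-negative")
    return monthly_active_users > LLAMA_MAU_THRESHOLD


if __name__ == "__main__":
    for mau in (5_000_000, 700_000_000, 900_000_000):
        print(mau, needs_separate_meta_license(mau))
```

A result of `False` does not by itself mean a deployment is compliant; it only means the user threshold is not the blocking condition.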
What is the most permissive ML model license?
Among the licenses compared here, Apache 2.0 and MIT are the most permissive. Apache 2.0 includes an express patent grant and allows commercial use, modification, and redistribution, provided you keep the license text and attribution notices. MIT is shorter: it requires only that the copyright and permission notice be retained, but it contains no express patent grant. Models using Apache 2.0 include Mistral 7B, Falcon 40B, Grok-1, and Arctic.
What is the difference between open-weights and open-source for AI?
Open-weights means the model weights are downloadable and usable, but training data, training code, and evaluation code may not be available. True open-source means all components are available: weights, training data, training code, and documentation. Most "open-source" LLMs like Llama and Mistral are technically open-weights. OLMo from AI2 is a truly open-source model.
Can I fine-tune and redistribute a model with a custom license?
It depends on the upstream model's license. Apache 2.0 and MIT allow fine-tuning and redistribution as long as you keep the required notices. The Llama Community License allows both, but derivative model names must begin with "Llama" and a copy of the license must accompany the model. The Gemma License allows fine-tuning, but redistributed derivatives must carry the same license terms. CC-BY-NC licenses prohibit any commercial use of fine-tuned derivatives.
Which license should I choose for my own ML model?
If you want maximum adoption, use Apache 2.0: it provides patent protection and is well understood by legal teams. If you want simplicity, use MIT. If you want very large companies to negotiate a separate agreement before using your model, consider a license with a user threshold (like Llama's 700M MAU limit). If you want attribution for research, use CC-BY-4.0. Avoid creating custom licenses unless you have legal counsel.
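The guidance above amounts to a mapping from a publishing goal to a suggested license. A minimal sketch follows, assuming a single primary goal; the goal keys and the `suggest_license` function are hypothetical labels introduced here, not an established taxonomy.

```python
# Goal keys are illustrative labels for the cases named in the answer above.
SUGGESTIONS = {
    "max_adoption": "Apache 2.0",
    "simplicity": "MIT",
    "gate_large_companies": "license with a user threshold (like Llama's 700M MAU limit)",
    "research_attribution": "CC-BY-4.0",
}

# Fallback mirrors the answer's last sentence about custom licenses.
FALLBACK = "no standard match; get legal counsel before writing a custom license"


def suggest_license(goal: str) -> str:
    """Return the license suggested in the text for `goal`, or the fallback."""
    return SUGGESTIONS.get(goal, FALLBACK)


if __name__ == "__main__":
    for goal in ("max_adoption", "simplicity", "something_else"):
        print(f"{goal}: {suggest_license(goal)}")
```

Real decisions usually weigh several goals at once, so this lookup is best read as a summary of the recommendations rather than a selection procedure.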