Compare 30+ open-source models side-by-side. Real parameters, benchmarks, and licensing info. Filter, sort, and find the right model for your use case.
| Model | Org | Params | License | Released | MMLU | Tags |
|---|---|---|---|---|---|---|
LockML is an interactive comparison table for open-source machine learning models. It provides a single, filterable view of 30+ models with their parameter counts, benchmark scores (MMLU), release dates, licensing information, and use-case tags. Instead of visiting dozens of Hugging Face model cards and reading individual papers, you can compare all the major open-source models in one place and narrow the list to models that match your specific requirements.
The comparison table supports real-time filtering across multiple dimensions. The search box filters by model name or organization. The parameter slider lets you set a maximum model size based on your hardware constraints — if you have a single A100-80GB, slide it down to 70B to see only models you can run. The license dropdown filters by permissiveness: Apache 2.0, MIT, or restrictive/custom licenses. Use-case tag chips let you filter by task type: general chat, code generation, RAG, multilingual, or edge deployment. All filters work together, so you can quickly find something like "Apache 2.0 licensed coding models under 15B parameters."
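Combining filters this way amounts to ANDing one predicate per filter dimension. The sketch below illustrates the idea with a hypothetical data shape and function names (the actual LockML source may differ):

```javascript
// Minimal sketch of combined filtering. MODELS and the field names
// (params in billions, license, tags) are illustrative assumptions.
const MODELS = [
  { name: "Mistral 7B", org: "Mistral AI", params: 7, license: "Apache 2.0", tags: ["general-chat"] },
  { name: "Phi-3 Mini", org: "Microsoft", params: 3.8, license: "MIT", tags: ["edge"] },
  { name: "Llama 3.1 70B", org: "Meta", params: 70, license: "Llama Community", tags: ["general-chat", "code"] },
];

function filterModels(models, { query = "", maxParams = Infinity, license = null, tag = null }) {
  const q = query.toLowerCase();
  return models.filter((m) =>
    (m.name.toLowerCase().includes(q) || m.org.toLowerCase().includes(q)) && // search box
    m.params <= maxParams &&                                                 // parameter slider
    (license === null || m.license === license) &&                           // license dropdown
    (tag === null || m.tags.includes(tag))                                   // use-case tag chips
  );
}

// "Apache 2.0 licensed models under 15B parameters" → just Mistral 7B here
const shortlist = filterModels(MODELS, { maxParams: 15, license: "Apache 2.0" });
```

Because inactive filters default to pass-through values (`Infinity`, `null`, empty string), any subset of filters can be active at once.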
Every column in the table is sortable. Click any header to sort ascending or descending. Sort by MMLU to find the highest-scoring models. Sort by parameters to find the most efficient models. Sort by release date to find the newest options. The data is compiled from official model announcements, Hugging Face model cards, and published benchmark papers, and is updated regularly as new models are released.
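Click-to-sort over mixed numeric and string columns can be sketched as a single comparator that toggles direction; this is an assumed implementation, not LockML's actual code:

```javascript
// Hypothetical sketch: sort a copy of the rows by any column,
// ascending or descending (a second click would flip the direction).
function sortBy(models, key, direction = "desc") {
  const sign = direction === "desc" ? -1 : 1;
  return [...models].sort((a, b) => {
    const [x, y] = [a[key], b[key]];
    if (typeof x === "string") return sign * x.localeCompare(y); // name, org, license
    return sign * (x - y);                                       // params, MMLU, dates
  });
}

// Params are from the model announcements cited in the text.
const MODELS = [
  { name: "Llama 3.1 405B", params: 405 },
  { name: "Phi-3 Mini", params: 3.8 },
];
const smallestFirst = sortBy(MODELS, "params", "asc"); // Phi-3 Mini first
```

Sorting a copy (`[...models]`) keeps the unsorted, filtered list intact so filters and sorts compose in any order.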
LockML includes real-time search that filters the model table as you type, matching against model names and organization names. The parameter range slider provides instant filtering with a visual display of the current threshold. The license filter categorizes all models into three tiers: fully permissive (Apache 2.0, MIT), permissive with conditions (Llama Community, Gemma), and restrictive or non-commercial licenses. Use-case tag chips provide one-click filtering by task type. A results counter shows how many models match your current filter combination. The license comparison section provides detailed breakdowns of what each license allows and restricts for commercial use.
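The three-tier license categorization described above can be modeled as a simple lookup table; the tier assignments follow the text, but the structure and names here are assumptions, not LockML's source:

```javascript
// Hypothetical mapping from license name to permissiveness tier,
// following the three tiers described in the text.
const LICENSE_TIERS = {
  "Apache 2.0": "fully-permissive",
  "MIT": "fully-permissive",
  "Llama Community": "permissive-with-conditions",
  "Gemma": "permissive-with-conditions",
  "CC-BY-NC": "restrictive",
};

function licenseTier(license) {
  // Unknown or custom licenses fall back to the most cautious tier.
  return LICENSE_TIERS[license] ?? "restrictive";
}
```

Defaulting unknown licenses to "restrictive" is the conservative choice for a tool people use to check commercial viability.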
LockML is built for ML engineers, data scientists, CTOs, and anyone evaluating open-source models for production use. Engineering teams use it to create shortlists of models that meet their licensing and infrastructure requirements before running evaluations. Startup CTOs use it to compare licensing terms across models to ensure commercial viability. Researchers use it to track the state of the art and identify new models worth evaluating. Infrastructure teams use it to scope hardware requirements based on model parameter counts. If you are working with closed-source AI APIs like Claude or GPT-4, you might find ClaudKit useful for building and testing API requests. For teams that need prompt templates optimized for specific tasks, ClaudHQ provides a library of 30+ tested templates that work across both open and closed models.
Everything runs locally in your browser. LockML does not send any data to any server, does not use cookies for tracking, does not require authentication, and does not store any information about you or your model searches. The application is a static HTML, CSS, and JavaScript site hosted on GitHub Pages. The complete source code is open source on GitHub.
**What is the best open-source model?** It depends on your use case. Llama 3.1 405B leads on benchmarks (87.3 MMLU) and is competitive with GPT-4. For cost efficiency, Llama 3.1 70B and Qwen 2 72B offer excellent performance-per-parameter. For edge deployment, Phi-3 Mini 3.8B punches above its weight class.
**Can I use Llama commercially?** Yes, with conditions. The Llama 3/3.1 Community License allows commercial use for organizations with under 700 million monthly active users. Above that threshold, you need a separate license from Meta. You must also include attribution and comply with the Acceptable Use Policy.
**Which licenses are the most permissive?** Apache 2.0 and MIT are the most permissive, allowing unrestricted commercial use. Models like Mistral 7B, Falcon 40B, Grok-1, and Arctic use Apache 2.0. Microsoft's Phi-3 uses MIT. Llama and Gemma have custom licenses with some restrictions. CC-BY-NC models like Command R+ prohibit commercial use entirely.
**What does "open source" actually mean for LLMs?** In the LLM space, "open source" often means open weights — the model weights are downloadable, but the training data and code may not be. True open source (like OLMo from AI2) includes data, weights, code, and training logs. Always check the specific license for your use case.
**Which open-source model is best for coding?** DeepSeek Coder V2 leads on coding benchmarks, beating GPT-4 Turbo. Codestral 22B supports 80+ languages with fill-in-the-middle. StarCoder2 15B covers 600+ languages. For general coding, Llama 3.1 70B and Mixtral 8x22B are strong all-rounders.
**Is LockML free to use?** Yes, LockML is completely free. Compare 30+ open-source models, check license compatibility, and find the right model for your project. No sign-up required. Everything runs in your browser.
**How often is the data updated?** LockML is updated regularly as new open-source models are released. Benchmark data is sourced from official papers and the Open LLM Leaderboard. Check the last-updated date on the page for the most recent data refresh.
**Which benchmarks does LockML show?** LockML currently displays MMLU scores as the primary benchmark, along with parameter counts, release dates, licensing information, and use-case tags. MMLU is the most widely reported benchmark across open-source models, making it the most useful for cross-model comparison.