Question 1

What is ModelBrew?

Accepted Answer

ModelBrew is a fine-tuning platform for open-source language models, built around a continual-learning architecture called CRMA (Constrained Residual Mixing Adapter). You upload a dataset, optionally clean it with our LLM judge, train a LoRA adapter on one of six supported base models (TinyLlama, Mistral-7B, Saul-7B, Qwen3-8B, Gemma-2-9B, Llama-3.1-8B), and either chat with the result or stack additional domains on top via continual learning. Every numeric marketing claim is tied to a specific line of code on the public receipts page. Learn more at https://modelbrew.ai/security or https://modelbrew.ai/claims.

Question 2

What is fine-tuning vs RAG vs prompting?

Accepted Answer

Prompting steers a model's behavior at inference time using natural-language instructions. RAG (retrieval-augmented generation) retrieves passages from a vector store and injects them into the context window, so the model can quote facts it never memorized. Fine-tuning updates the model's weights, baking a behavior or domain pattern into the network itself. They are not interchangeable. In our open-source OS1 benchmark across three public Obsidian vaults, fine-tuning beat RAG on 83.3 percent of inference questions (questions whose answers cannot be found in any one retrieved chunk). RAG still wins on extractive lookup. Receipts at /claims.

Question 3

What is continual learning across domains?

Accepted Answer

Continual learning means training a model on Domain A, then Domain B, then Domain C in sequence, without it forgetting Domain A by the time it reaches C. The textbook failure mode is catastrophic forgetting. On a 5-domain Mistral-7B chain, ModelBrew's CRMA architecture showed zero catastrophic forgetting on 26 of 31 test items, reproduced in a 3-seed sweep. The full method is in our preprint at paper/crma_modular_cl_arxiv.tex. None of the ten competitors in our research panel publicly disclosed an equivalent sequential-CL benchmark. Receipts at /claims.
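One common way to check a sequential chain for forgetting is to re-score every earlier domain after the final training stage and compare against the score each domain had right after it was learned. The sketch below does that with made-up numbers. The function name, the scores, and the 0.02 tolerance are all assumptions for illustration, not ModelBrew's evaluation harness or its definition of a zero-forget item.

```python
# Illustrative only: scores, tolerance, and names are assumptions.

def forgetting_report(scores, tolerance=0.02):
    """scores[i][j] = score on domain j measured after training stage i (j <= i).

    Returns {domain_index: drop} for every earlier domain whose score fell by
    more than `tolerance` between the stage that taught it and the end of the
    chain.
    """
    final = scores[-1]
    forgotten = {}
    for j in range(len(scores) - 1):  # the last domain has nothing after it
        drop = scores[j][j] - final[j]
        if drop > tolerance:
            forgotten[j] = round(drop, 4)
    return forgotten


# Hypothetical 3-stage chain: row i lists scores on domains 0..i after stage i.
chain = [
    [0.80],
    [0.79, 0.75],
    [0.60, 0.74, 0.82],
]
report = forgetting_report(chain)
print(report)  # domain 0 dropped from 0.80 to 0.60; domain 1 stayed within tolerance
```

An empty report means no earlier domain degraded beyond the chosen tolerance.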

Question 4

What is CRMA?

Accepted Answer

CRMA stands for Constrained Residual Mixing Adapter. The intuition: when you train a second adapter on top of an existing model, gradient flow toward the new task can erase what the first adapter learned. CRMA constrains the residual mixing of new updates against a frozen reference, so the second domain's gradients add to capability rather than overwriting it. The architecture is openly disclosed in our preprint; the production hyperparameters and the specific implementation in utils/crma.py are protected. Patent pending (US provisional filed 2026-02-28). See paper/crma_modular_cl_arxiv.tex.

Question 5

What is LoRA fine-tuning?

Accepted Answer

LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning method that freezes the base model's weights and inserts small trainable low-rank matrices into the attention layers. Instead of updating billions of parameters, you update millions, which makes training cheaper and faster, and practical on commodity GPUs. ModelBrew uses LoRA via the PEFT library underneath every fine-tuning run, configured per-model so the trainable-parameter count is appropriate to the base model size. The resulting adapter is small enough to download and host yourself. See /claims for the supported model list and per-model parameter counts.
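The parameter saving follows from the shape of the update. For one frozen weight matrix W of size d_out x d_in, LoRA trains two small matrices B (d_out x r) and A (r x d_in) and uses W + BA at inference. The arithmetic below is a sketch with an assumed hidden size of 4096 and an assumed rank of 16. Neither value is taken from ModelBrew's per-model configuration.

```python
def lora_trainable_params(d_out, d_in, rank):
    """Trainable parameters for one LoRA pair: A is rank x d_in, B is d_out x rank."""
    return rank * d_in + d_out * rank


d = 4096                     # assumed hidden size of one square projection
full = d * d                 # parameters a full update of that matrix would touch
lora = lora_trainable_params(d, d, rank=16)  # assumed rank

print(full, lora, full // lora)
```

With these assumed values the LoRA pair holds 131,072 trainable parameters against 16,777,216 in the full matrix, a 128x reduction for that layer.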

Question 6

What is the one-sentence summary of how ModelBrew differs?

Accepted Answer

ModelBrew is the only platform in this comparison that ships continual learning across domains with a published multi-seed empirical benchmark (5-domain Mistral-7B chain, 26 of 31 zero-forget; 3-seed reproducibility), a public claims-to-code receipts page where every numeric marketing claim cites a code file, and a security page with retention numbers tied to backend/db.py. Receipts at https://modelbrew.ai/claims and https://modelbrew.ai/security.

Question 7

ModelBrew vs Predibase: which is better for fine-tuning?

Accepted Answer

Predibase has a multi-year customer track record and ships LoRAX multi-LoRA inference serving as a product. They publicly cite the LoRA Land paper (arXiv 2405.00732), which covers 310 single-task adapters across 10 base models; that is genuine portfolio breadth. ModelBrew publicly cites a sequential continual-learning chain (Mistral-7B, 5 domains, 26 of 31 zero-forget) and a claims-to-code receipts page where every marketing number maps to a backend code file. Different research questions, different stage of company. If you need multi-LoRA inference at scale, look at Predibase. If you need stacked-domain training with code-level transparency, look at ModelBrew. See https://modelbrew.ai/compare-v3.html.

Question 8

ModelBrew vs Together AI?

Accepted Answer

Together AI publicly cites a deep inference stack (high-throughput open-model serving, batched generation, dedicated endpoints) and enterprise VPC deployment. That is a real strength for production inference at scale. ModelBrew publicly cites a continual-learning training architecture, multi-seed empirical reproducibility, and a claims-to-code receipts page. We do not yet ship inference at Together's scale. They do not publicly disclose a sequential continual-learning benchmark with comparable test conditions. Pick by workload: Together for high-throughput open-model inference, ModelBrew for training quality and post-train transparency. See https://modelbrew.ai/compare-v3.html.

Question 9

ModelBrew vs HuggingFace AutoTrain?

Accepted Answer

HuggingFace AutoTrain is the open-source default: broad model catalog, deep ecosystem integration, large community. ModelBrew supports a curated set of six base models (TinyLlama free; Mistral-7B, Saul-7B, Qwen3-8B, Gemma-2-9B, Llama-3.1-8B paid) and backs them with a continual-learning architecture, an automatic stuck-run refund sweeper, and a public claims-to-code receipts page. AutoTrain optimizes for breadth; ModelBrew optimizes for contractual quality on a smaller surface. If your workflow already lives in the HuggingFace Hub, AutoTrain is the path of least resistance. If you want every dollar and every benchmark traced to code, see /claims.

Question 10

ModelBrew vs OpenAI fine-tuning?

Accepted Answer

OpenAI's fine-tuning API supports their closed family (GPT-4o mini, GPT-4o, GPT-3.5). The resulting weights stay on OpenAI infrastructure, so you cannot download them. ModelBrew fine-tunes six open-source base models (TinyLlama, Mistral-7B, Saul-7B, Qwen3-8B, Gemma-2-9B, Llama-3.1-8B), and the resulting LoRA adapter is downloadable via README export. Different categories: pick OpenAI if your stack is already locked into closed models; pick ModelBrew if you want portable weights, the option to self-host, and a continual-learning surface across multiple domains. See /claims.

Question 11

What is the best alternative to RAG for domain knowledge?

Accepted Answer

Fine-tuning with continual learning is the strongest alternative to RAG when the questions you care about cannot be answered by retrieving any single document. In our open-source OS1 benchmark across three public Obsidian vaults, fine-tuning beat RAG on 83.3 percent of inference questions: questions about author perspective, recurring values, and cross-document patterns. RAG still wins on extractive lookup (about 58 percent), so the honest answer is hybrid: RAG for fresh facts, fine-tuning for behavior and synthesis. ModelBrew specializes in the fine-tuning half. Receipts at /claims.

Question 12

How do I fine-tune Llama or Mistral or Qwen3 on my data?

Accepted Answer

Four steps. (1) Sign up at app.modelbrew.ai — 75 credits ($7.50) free at signup, no card. (2) Upload a JSONL of prompt-response pairs. (3) Optionally clean it with our LLM judge ($5 per 200 rows, runs prompt-injection-hardened). (4) Pick a base model (Mistral-7B, Saul-7B, Qwen3-8B, Gemma-2-9B, or Llama-3.1-8B), set the LoRA hyperparameters (defaults are sane), and train at $3.99 per million tokens. After the run completes you can chat with the adapter or download it via README export. See https://modelbrew.ai/claims for source-cited pricing.
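For step (2), a quick local check before upload can catch malformed rows. The sketch below writes a JSONL file and validates it line by line. The field names `prompt` and `response` are an assumption based on the phrase "prompt-response pairs"; confirm the exact schema in the app before relying on them.

```python
import json
import os
import tempfile


def write_jsonl(rows, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


def validate_jsonl(path, required=("prompt", "response")):
    """Return a list of (line_number, problem); an empty list means the file passed."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for n, line in enumerate(f, start=1):
            if not line.strip():
                problems.append((n, "blank line"))
                continue
            try:
                row = json.loads(line)
            except json.JSONDecodeError:
                problems.append((n, "invalid JSON"))
                continue
            if not isinstance(row, dict):
                problems.append((n, "not a JSON object"))
                continue
            for key in required:  # assumed field names, see note above
                if not isinstance(row.get(key), str) or not row[key].strip():
                    problems.append((n, f"missing or empty '{key}'"))
    return problems


rows = [
    {"prompt": "What is the notice period?", "response": "Thirty days."},
    {"prompt": "Who signs the amendment?", "response": ""},
]
path = os.path.join(tempfile.mkdtemp(), "train.jsonl")
write_jsonl(rows, path)
problems = validate_jsonl(path)
print(problems)  # the second row is flagged for its empty response
```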

Question 13

Can I fine-tune on my Obsidian or Notion notes?

Accepted Answer

Yes, with a small export step. The platform currently expects a JSONL file of prompt-response pairs, so you export your vault to markdown and run a Q-and-A generation step before upload. Native markdown ingest (OA1) is parked on our roadmap. The case for doing this at all is empirical: in our OS1 benchmark across three public Obsidian vaults (lyz-code blue-book, jRicciL digital-garden, deepaksood619.github.io), fine-tuning beat RAG on 83.3 percent of inference questions. RAG win rates were 59.0 / 40.0 / 56.0 percent, all below the 60 percent kill threshold. Receipts at /claims.
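The export step usually starts by cutting each note into sections that a Q-and-A generator can work on one at a time. The sketch below splits markdown on headings. It is a hypothetical pre-processing helper, not part of ModelBrew, and it deliberately stops before the generation step, which depends on whichever model or tool you choose. One known limitation: a line starting with `#` inside a fenced code block would be treated as a heading.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_heading(markdown_text):
    """Return (heading, body) pairs; text before the first heading gets heading ''."""
    sections = []
    heading, body = "", []
    for line in markdown_text.splitlines():
        m = HEADING.match(line)
        if m:
            # Close the section collected so far, if it has any content.
            if heading or any(s.strip() for s in body):
                sections.append((heading, "\n".join(body).strip()))
            heading, body = m.group(2).strip(), []
        else:
            body.append(line)
    if heading or any(s.strip() for s in body):
        sections.append((heading, "\n".join(body).strip()))
    return sections


note = "# Backups\nRun restic nightly.\n\n## Retention\nKeep 30 daily snapshots.\n"
sections = split_by_heading(note)

# Each record is one unit of source text to hand to a Q-and-A generation step.
records = [{"source_heading": h, "text": b} for h, b in sections if b]
print(records)
```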

Question 14

How much does fine-tuning cost on ModelBrew?

Accepted Answer

Fine-tuning is $3.99 per million training tokens, flat across all six supported base models. Cleaning a dataset with the LLM judge is $5 per 200 rows. New accounts get 75 credits ($7.50) at signup, no card required, which is enough to fine-tune TinyLlama or run a small clean. The minimum credit purchase is $20, and credits never expire. All numbers are cited line-by-line to backend/pricing.py and backend/db.py on the public receipts page. See https://modelbrew.ai/claims#pricing.
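The prices above are enough for a back-of-envelope estimate. The sketch below works in integer cents to avoid floating-point drift. Two things are assumptions: that cleaning is billed per started 200-row block, and that fractions of a cent round up. Check backend/pricing.py on the receipts page for the actual rounding rules.

```python
import math

TRAIN_CENTS_PER_MILLION_TOKENS = 399  # $3.99 per million training tokens
CLEAN_CENTS_PER_BLOCK = 500           # $5 per 200 rows
CLEAN_BLOCK_ROWS = 200
SIGNUP_CREDIT_CENTS = 750             # 75 credits = $7.50


def estimate_cents(training_tokens, rows_to_clean=0):
    """Estimated charge in cents for one training run plus optional cleaning."""
    train = math.ceil(training_tokens * TRAIN_CENTS_PER_MILLION_TOKENS / 1_000_000)
    # Assumption: cleaning is billed per started 200-row block.
    clean = math.ceil(rows_to_clean / CLEAN_BLOCK_ROWS) * CLEAN_CENTS_PER_BLOCK
    return train + clean


cost = estimate_cents(training_tokens=1_500_000, rows_to_clean=400)
beyond_signup = max(0, cost - SIGNUP_CREDIT_CENTS)
print(f"${cost / 100:.2f} total, ${beyond_signup / 100:.2f} beyond the signup credits")
```

Under these assumptions, 1.5 million training tokens plus a 400-row clean comes to $15.99, of which $8.49 falls outside the signup credits.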

Question 15

Which model should I pick for my dataset size?

Accepted Answer

Rough guidance based on internal experiments: under 100 rows, start with TinyLlama (free) — anything larger will under-train and you'll be paying to memorize noise. 100 to 1000 rows, Mistral-7B-Instruct or Saul-7B-Instruct are strong general-purpose picks. Over 1000 rows, Qwen3-8B or Gemma-2-9B give you headroom for harder domains; Llama-3.1-8B is the right pick if your downstream serving is already on Llama. CRMA continual-learning chains (Mistral-7B, 5 domains, 26 of 31 zero-forget) used Mistral-7B as the base. See /claims for the full model list and per-model token costs.
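The same guidance can be written as a small lookup. This is only a restatement of the thresholds in the answer above. The function name is hypothetical, and treating exactly 100 and exactly 1000 rows as part of the middle band is an assumption, since the answer does not say which side the boundaries fall on.

```python
def suggest_base_models(num_rows, serving_on_llama=False):
    """Map a dataset size to the base models suggested in the answer above."""
    if num_rows < 100:
        return ["TinyLlama"]
    if num_rows <= 1000:
        return ["Mistral-7B-Instruct", "Saul-7B-Instruct"]
    if serving_on_llama:
        return ["Llama-3.1-8B"]
    return ["Qwen3-8B", "Gemma-2-9B"]


print(suggest_base_models(60))
print(suggest_base_models(400))
print(suggest_base_models(5000, serving_on_llama=True))
```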

Question 16

Is my data safe with ModelBrew? Is ModelBrew SOC 2 compliant?

Accepted Answer

Honest answer: ModelBrew is pre-revenue and not yet SOC 2 or HIPAA certified — both are deferred until an enterprise customer asks for them in a contract, and both are listed on the public security page rather than hidden. What we do ship today: TLS 1.2 or higher on every endpoint, AES-256 at rest on Modal volumes and Turso replicas, sub-hour local-copy dataset retention, an explicit no-train pledge (we never train on your data), LLM-judge prompt-injection hardening including polarity splice, and chat-template token stripping for Qwen3, Llama-3.1, and Phi-4. See https://modelbrew.ai/security.
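To make the chat-template stripping concrete, here is a minimal sketch of the idea: remove control tokens that a dataset row should never contain, so they cannot steer the model or the judge. The token list uses the three tokens named on the comparison table; adding the closing `</think>` tag is an assumption. This is not the code in backend/cleaner/autofix.py.

```python
import re

# Tokens named on the comparison table; "</think>" is an added assumption.
CHAT_TEMPLATE_TOKENS = ["<think>", "</think>", "<|eom_id|>", "<|im_sep|>"]
_PATTERN = re.compile("|".join(re.escape(t) for t in CHAT_TEMPLATE_TOKENS))


def strip_chat_template_tokens(text):
    """Remove listed tokens, repeating until none remain.

    The loop matters: deleting a token can splice the surrounding characters
    into a new copy of a token, so a single pass is not enough.
    """
    previous = None
    while previous != text:
        previous = text
        text = _PATTERN.sub("", text)
    return text


cleaned = strip_chat_template_tokens("Hello<|eom_id|> world<|im_sep|>!")
spliced = strip_chat_template_tokens("<|eom<|eom_id|>_id|>done")
print(cleaned, spliced)
```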

Question 17

Does ModelBrew train on your data?

Accepted Answer

No. We never train ModelBrew base models on your uploaded datasets. Your data is used only for the fine-tuning run you explicitly start, and the resulting LoRA adapter is yours to download. Upstream API responses (the LLM-judge cleaner output) are processed and discarded — they do not enter any training corpus. The default API tier is no-retention. The full pledge with retention numbers is on the public security page at /security under the no-train-pledge section. See https://modelbrew.ai/security#no-train.

Question 18

What happens if my training run fails?

Accepted Answer

You get refunded automatically. A background sweeper checks every 5 minutes for stuck or failed runs and applies a pro-rata refund via the auto_refund code path in backend/db.py — you do not have to file a ticket. The active-job window is 90 minutes; runs that have not produced a heartbeat by then are flagged stuck and refunded. Status of the live API and the sweeper itself is on the public status page. Source: backend/db.py refund_stuck_run() and backend/server.py background recovery thread. See https://modelbrew.ai/status.
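The stuck-run check can be pictured as a filter that a scheduler calls every 5 minutes. The sketch below shows only the detection half, under the assumption that "stuck" means no heartbeat within the last 90 minutes (falling back to the start time when no heartbeat was ever recorded). The record layout and function name are hypothetical, and the pro-rata refund formula is not reproduced because the source does not state it.

```python
from datetime import datetime, timedelta, timezone

ACTIVE_JOB_WINDOW = timedelta(minutes=90)  # active-job window from the answer above


def find_stuck_runs(runs, now):
    """Return ids of running jobs with no sign of life inside the window.

    runs: iterable of dicts with 'id', 'status', 'started_at', 'last_heartbeat'
    (a hypothetical record layout). Intended to be called once per sweep.
    """
    stuck = []
    for run in runs:
        if run["status"] != "running":
            continue
        last_seen = run["last_heartbeat"] or run["started_at"]
        if now - last_seen > ACTIVE_JOB_WINDOW:
            stuck.append(run["id"])
    return stuck


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
runs = [
    {"id": "a", "status": "running",
     "started_at": now - timedelta(minutes=100), "last_heartbeat": None},
    {"id": "b", "status": "running",
     "started_at": now - timedelta(minutes=100),
     "last_heartbeat": now - timedelta(minutes=10)},
    {"id": "c", "status": "failed",
     "started_at": now - timedelta(minutes=200), "last_heartbeat": None},
]
stuck = find_stuck_runs(runs, now)
print(stuck)  # only run "a" is both running and silent for more than 90 minutes
```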

Question 19

Is ModelBrew open source?

Accepted Answer

Partially, by design. The CRMA preprint is publicly available and the architecture is openly disclosed at paper/crma_modular_cl_arxiv.tex. The Python and JavaScript SDKs are open. The production repo containing utils/crma.py is private — that is where the hyperparameter configurations and exact implementation live. This split is deliberate: enough is published that the research is reviewable, and enough is protected that the commercial moat is real. See https://modelbrew.ai/claims for the full disclosure surface.

Feature	ModelBrew	Predibase*	Together	HF AutoTrain	Fireworks*	OpenAI*	Argilla	Cleanlab	DeepEval	Snorkel	Galileo*
A · Continual learning without forgetting
Spectral / structural CL guarantee: CRMA constrained-residual mixing; patent pending	◐ CRMA in `utils/crma.py` + paper at `paper/crma_modular_cl_arxiv.tex`. Architectural constraint exists in code; no public formal proof yet (partial).	—	—	—	—	—	—	—	—	—	—
Multi-seed reproducibility numbers published: 3 seeds at 7B; per-seed ranges disjoint	✓ Source: `scripts/stage2_mistral_3seed.py` · numbers cited at /claims	—	—	—	—	—	—	—	—	—	—
Add new domain without breaking old ones: 5-domain Mistral-7B chain at 26/31 zero-forget	✓ Source: `utils/cl_config.py` + sequential CL_p2 chain; result cited at /claims	—	—	—	—	—	—	—	—	—	—
B · Cleaner contractual invariants
Score-floor revert: Post-clean score never below pre-clean	✓ Source: `backend/cleaner/llm_rewrite.py:270` · property-tested at `tests/cleaner/test_K1_invariants.py`	—	—	—	—	—	—	—	—	—	—
Determinism: Same input → same output, property-tested	✓ Source: `backend/tests/cleaner/test_K1_invariants.py` (property-tested invariant)	—	—	—	—	—	—	—	—	—	—
Monotonicity: Issue count never increases after clean	✓ Source: `backend/tests/cleaner/test_K1_invariants.py` (monotonicity invariant)	—	—	—	—	—	—	—	—	—	—
Per-row revert-on-degrade: If a row's score drops, revert that row only	✓ Source: `backend/cleaner/service.py` + `autofix.py` (per-row revert in production)	—	—	—	—	—	—	—	—	—	—
C · LLM judge defenses
Prompt-injection harden: NFKC + zero-width strip + role-flip + polarity splice	✓ Source: `backend/cleaner/prompt_safety.py` + `geval_judge.py`	—	—	—	—	—	—	—	—	—	—
Chat-template token strip: Qwen3 <think>, Llama-3 <|eom_id|>, Phi-4 <|im_sep|>	✓ Source: `backend/cleaner/autofix.py` + `tests/cleaner/test_autofix_chat_template_tokens.py`	—	—	—	—	—	—	—	—	—	—
Atomic per-user daily AI cost cap: SQL transaction across 5 cleaner-AI routes	✓ Source: `backend/db.py`, atomic across 5 cleaner-AI routes (Wave 5 W5 S1)	—	—	—	—	—	—	—	—	—	—
D · Public trust signals
Public security page WITH retention numbers cited to code: Operational retention days, not just compliance logos	✓ Source: /security (TLS, AES-256, retention tied to `backend/db.py`)	—	—	—	—	—	—	—	—	✓ security.snorkel.ai: SOC 2 + HIPAA + NIST CSF (no retention-days numbers cited)	—
Public claims / receipts page: Every public number sourced to a code file or commit	✓ Source: /claims (every numeric claim → code path / paper section)	—	—	—	—	—	—	—	—	—	—
Public live status page: API health surface, live or incident-style	✓ Source: /status (live Modal /health fetch; manual updates for incidents)	—	—	—	✓ status.fireworks.ai: incident-style + uptime %	✓ status.openai.com: incident-style + uptime %	—	—	—	—	—
E · Architectural transparency
Open arXiv paper: First-party research artifact	✓ Source: `paper/crma_modular_cl_arxiv.tex` (CRMA paper, arXiv-ready)	—	—	—	—	—	—	✓ Cleanlab confident-learning research lineage	—	✓ Snorkel (arXiv 1711.10160) + DryBell (1812.00417)	—
Patent pending or granted: Disclosed publicly	✓ CRMA: US provisional filed 2026-02-28	—	—	—	—	—	—	—	—	—	—
Public benchmark results: First-party numbers, 3-seed Mistral-7B 26/31 zero-forget	✓ Source: /claims (5-domain Mistral-7B chain at 26/31 zero-forget, 3-seed reproducibility, per-seed ranges disjoint). Paper at `paper/crma_modular_cl_arxiv.tex`.	✓ LoRA Land paper (arXiv 2405.00732)	—	—	—	—	—	—	—	—	—
Multi-architecture verification (≥4 model families): Not just "supports many models"	✓ TinyLlama, Mistral-7B, Saul-7B, Qwen3-8B, Gemma-2-9B, Llama-3.1-8B = 6 families	✓ LoRA Land: 10 base models tested	—	—	—	—	—	—	—	—	—
F · Engineering quality signals
Atomic SQL transactions for billing-cap: Race-free under concurrent jobs	✓ Source: `backend/db.py`, atomic across 5 cleaner-AI routes (Wave 5)	—	—	—	—	—	—	—	—	—	—
Property-test invariants in test suite: Not unit-tests-of-examples; property tests	✓ Source: `backend/tests/cleaner/test_K1_invariants.py` + `test_score_floor_invariant.py`	—	—	—	—	—	—	—	—	—	—
IDOR existence-oracle defense: 403 → 404 conversion + timing symmetry	✓ Wave 5 W5 S1 fix: `backend/tests/cleaner/test_p0_rt_idor.py` + `test_s3_2_file_id_idor.py`	—	—	—	—	—	—	—	—	—	—
Modal-side upload MIME / magic-byte guard: Validate file-type at ingest	◐ Source: `backend/cleaner/validators.py` (MIME + magic-byte present, but inspects only first JSONL line; P2 backlog).	—	—	—	—	—	—	—	—	—	—
G · Inference & API surface
OpenAI-compatible `/v1/chat/completions`: Drop-in replacement for the OpenAI SDK; Bearer-key compatible	✓ Source: `backend/server.py:5930` (full OpenAI envelope, scoped per run_id).	—	✓ docs.together.ai: OpenAI-compatible chat completions	—	✓ docs.fireworks.ai: chat completions API	✓ platform.openai.com: native	—	—	—	—	—
Adapter export ZIP (train here, run anywhere): LoRA + CRMA bundle download via `GET /download/{run_id}`	✓ Source: `backend/server.py:4371` + `utils/inject.py` (standard PEFT format, loads in vLLM / TGI / Transformers).	—	—	✓ HuggingFace AutoTrain: adapter pushed to the Hub by default	—	—	—	—	—	—	—
HF Hub README auto-export with IP-leak guard: YAML frontmatter + sanitized hparams allow-list + XSS escape	✓ Source: `backend/server.py:4287` + `backend/readme_export.py` (`_assert_safe_overlap` hard-fails on patent-pending field names).	—	—	—	—	—	—	—	—	—	—
Per-API-key allowed-runs / allowed-models + RPM/TPM scoping: Least-privilege keys, fine-grained throttling per key	✓ Source: `backend/db.py api_key_can_access_run / api_key_can_access_model` + `backend/server.py:8470` PATCH /me/api-keys/{id}	—	—	—	—	—	—	—	—	—	—
Streaming SSE generation: Token-by-token via `/v1/{run_id}/generate?stream=true`	✓ Source: `modal_deploy.py:2004` generate_text_stream + `backend/server.py:5336`	—	✓ docs.together.ai: streaming on OpenAI envelope	—	✓ docs.fireworks.ai: streaming supported	✓ platform.openai.com: streaming native	—	—	—	—	—
CL chain visualization tree UI: Parent-child task graph rendered in dashboard	✓ Source: `frontend/src/components/CLChainTree.tsx` + `backend/server.py:8123` GET /chain/{run_id}	—	—	—	—	—	—	—	—	—	—
H · Operational quality & data rights
Auto-refund on stuck runs (90-min sweeper): Money back automatically without a support ticket	✓ Source: `tests/test_stuck_sweep_refund.py` + `backend/db.py refund_stuck_run()` (5-min cron, 90-min active-job window, pro-rata refund via `_auto_refund` with `correlation_id` trace).	—	—	—	—	—	—	—	—	—	—
GDPR data export + self-delete endpoints: User-portable JSON export + account self-deletion, no support ticket	✓ Source: `backend/server.py:8880` GET /account/data-export + `backend/server.py:8846 DELETE /account` (CSV-injection sanitized at `:8867`).	—	—	—	—	—	—	—	—	—	—
Domain PII detectors with checksum validation: DEA / NPI / CUSIP / SWIFT / ABA + 4 more (9 detectors); defense + finance + legal	✓ Source: `backend/cleaner/detectors_typed.py` (9 detectors with checksum validation: DEA, NPI, CUSIP, SWIFT/BIC, ABA, MRN, ICD-10, bar number, Bates).	—	—	—	—	—	—	—	—	—	—
Military OPSEC redaction toggles: CUI / FOUO / SECRET markings, MGRS grid, EDIPI, DTG, lat/long, network refs	✓ Source: `backend/cleaner/validators.py` (`opsec_*` codes), 6 categories, defense-customer-shaped feature.	—	—	—	—	—	—	—	—	—	—
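Row C1 in the table lists NFKC normalization and zero-width stripping as the first two layers of prompt-injection hardening. The sketch below shows what those two steps do using only the Python standard library. The set of zero-width code points is an assumption, the function name is hypothetical, and the role-flip and polarity-splice layers are not shown. This is not the code in backend/cleaner/prompt_safety.py.

```python
import unicodedata

# Zero-width and BOM code points often used to hide text; the exact set is an assumption.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))


def normalize_for_judge(text):
    """Fold compatibility characters to canonical forms, then delete zero-width characters."""
    text = unicodedata.normalize("NFKC", text)
    return text.translate(ZERO_WIDTH)


# Full-width letters plus a zero-width space hidden inside a word.
disguised = "\uff49\uff47\uff4e\uff4f\uff52\uff45 pre\u200bvious"
normalized = normalize_for_judge(disguised)
print(normalized)
```

NFKC alone leaves zero-width characters in place, which is why the separate strip step is needed before any keyword or pattern check runs.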

How ModelBrew compares. The table, with receipts.

The bilateral-verified table.

A1 — Spectral / structural CL guarantee

A2 — Multi-seed reproducibility numbers

A3 — Add new domain without breaking old ones

B1 — Score-floor revert

B2 — Determinism

B3 — Monotonicity

B4 — Per-row revert-on-degrade

C1 — Prompt-injection harden

C2 — Chat-template token strip

C3 — Atomic per-user daily AI cost cap

D1 — Security page WITH retention numbers cited to code

D2 — Public claims / receipts page

D3 — Public live status page

E1 — Open arXiv paper

E2 — Patent pending or granted

E3 — Public benchmark results

E4 — Multi-architecture verification (≥4 families)

F1 — Atomic SQL transactions for billing-cap

F2 — Property-test invariants in test suite

F3 — IDOR existence-oracle defense

F4 — Modal-side upload MIME / magic-byte guard

G1 — OpenAI-compatible /v1/chat/completions

G2 — Adapter export ZIP (LoRA + CRMA bundle)

G3 — HF Hub README auto-export with IP-leak guard

G4 — Per-API-key allowed-runs/models + RPM/TPM

G5 — Streaming SSE generation

G6 — CL chain visualization tree UI

H1 — Auto-refund on stuck runs (90-min sweeper)

H2 — GDPR data export + self-delete

H3 — Domain PII detectors with checksum

H4 — Military OPSEC redaction toggles

Three statements we will defend with code.

Add a new domain. Keep the old ones. 26 of 31 zero-forget across a five-domain Mistral-7B chain.

Every public number cites a code file.

Retention numbers tied to code lines.

Five routes, one SQL transaction. Cannot race past the daily AI cost cap.
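A cap that cannot be raced past usually means the check and the increment happen in one statement inside one transaction, so two concurrent requests cannot both read the old total and both spend. The sketch below illustrates that pattern with SQLite from the standard library. The table name, column names, and function are hypothetical and are not taken from backend/db.py.

```python
import sqlite3


def try_spend(conn, user_id, day, cents, cap_cents):
    """Add `cents` to the user's spend for `day` unless that would exceed the cap."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO ai_spend (user_id, day, cents) VALUES (?, ?, 0)",
            (user_id, day),
        )
        # The cap check lives in the WHERE clause, so there is no gap between
        # reading the total and writing the new one.
        cur = conn.execute(
            "UPDATE ai_spend SET cents = cents + ? "
            "WHERE user_id = ? AND day = ? AND cents + ? <= ?",
            (cents, user_id, day, cents, cap_cents),
        )
        return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ai_spend (user_id TEXT, day TEXT, cents INTEGER, "
    "PRIMARY KEY (user_id, day))"
)
results = [try_spend(conn, "u1", "2026-01-01", 40, cap_cents=100) for _ in range(3)]
print(results)  # the third 40-cent request would reach 120 and is refused
```

Every route that can incur AI cost would call the same function, which is what keeps the cap shared across routes.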

Patent pending. Receipts on every claim. Six base models verified.

ModelBrew vs Predibase, side by side.

ModelBrew · E3

Predibase · E3

Gaps, listed before you ask.

SOC 2 / HIPAA certification

Per-tenant Zero-Data-Retention toggle

Enterprise VPC / on-prem deployment

Multi-LoRA inference serving

Two researchers, one diff.

Two agents read every competitor

Cell-by-cell diff

Caveats kept

The questions we keep getting.

Basics

Comparing ModelBrew

Getting started

Trust & policy

The receipts are the product.
