| A · Continual learning without forgetting |
| Spectral / structural CL guarantee: CRMA constrained-residual mixing; patent pending |
◐CRMA in utils/crma.py + paper at paper/crma_modular_cl_arxiv.tex. Architectural constraint exists in code; no public formal proof yet — partial. |
— | — | — | — | — |
— | — | — | — | — |
| Multi-seed reproducibility numbers published: 3 seeds at 7B; per-seed ranges disjoint |
✓Source: scripts/stage2_mistral_3seed.py · numbers cited at /claims |
— | — | — | — | — |
— | — | — | — | — |
| Add new domain without breaking old ones: 5-domain Mistral-7B chain at 26/31 zero-forget |
✓Source: utils/cl_config.py + sequential CL_p2 chain; result cited at /claims |
— | — | — | — | — |
— | — | — | — | — |
| B · Cleaner contractual invariants |
| Score-floor revert: Post-clean score never below pre-clean |
✓Source: backend/cleaner/llm_rewrite.py:270 · property-tested at tests/cleaner/test_K1_invariants.py |
— | — | — | — | — |
— | — | — | — | — |
| Determinism: Same input → same output, property-tested |
✓Source: backend/tests/cleaner/test_K1_invariants.py — property-tested invariant |
— | — | — | — | — |
— | — | — | — | — |
| Monotonicity: Issue count never increases after clean |
✓Source: backend/tests/cleaner/test_K1_invariants.py — monotonicity invariant |
— | — | — | — | — |
— | — | — | — | — |
| Per-row revert-on-degrade: If a row's score drops, revert that row only |
✓Source: backend/cleaner/service.py + autofix.py — per-row revert in production |
— | — | — | — | — |
— | — | — | — | — |
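The score-floor and per-row revert invariants in this section can be illustrated with a minimal Python sketch. This is not the code in backend/cleaner/service.py; `clean_with_revert`, `toy_clean`, and `toy_score` are hypothetical names, and the scorer and cleaner are toys chosen only to exercise the revert path.

```python
from typing import Callable, List, Tuple


def clean_with_revert(
    rows: List[str],
    clean_fn: Callable[[str], str],
    score_fn: Callable[[str], float],
) -> Tuple[List[str], int]:
    """Clean each row, keeping the original wherever the score would drop.

    Returns the output rows and the number of rows that were reverted.
    """
    out: List[str] = []
    reverted = 0
    for row in rows:
        candidate = clean_fn(row)
        if score_fn(candidate) < score_fn(row):
            # Degraded: revert this row only, leave the others cleaned.
            out.append(row)
            reverted += 1
        else:
            out.append(candidate)
    return out, reverted


def toy_score(text: str) -> float:
    # Toy scorer (assumption): more non-whitespace-trimmed text scores higher.
    return float(len(text.strip()))


def toy_clean(text: str) -> str:
    # Toy cleaner (assumption): trims whitespace but wrongly empties rows containing "keep".
    return "" if "keep" in text else text.strip()


rows = ["  hello  ", "keep me", "ok"]
cleaned, n_reverted = clean_with_revert(rows, toy_clean, toy_score)
```

Because every output row is either the original or a candidate that scored at least as high, the per-row score floor holds by construction, which is the kind of property a property test can assert over random inputs.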
| C · LLM judge defenses |
| Prompt-injection hardening: NFKC + zero-width strip + role-flip + polarity splice |
✓Source: backend/cleaner/prompt_safety.py + geval_judge.py |
— | — | — | — | — |
— | — | — | — | — |
| Chat-template token strip: Qwen3 <think>, Llama-3 <|eom_id|>, Phi-4 <|im_sep|> |
✓Source: backend/cleaner/autofix.py + tests/cleaner/test_autofix_chat_template_tokens.py |
— | — | — | — | — |
— | — | — | — | — |
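A rough sketch of the NFKC, zero-width, and chat-template stripping steps named in the two rows above, using only the Python standard library. The function name `sanitize_for_judge` and the token list are assumptions for illustration; the role-flip and polarity-splice defenses are not shown, and the real logic in backend/cleaner/prompt_safety.py may differ.

```python
import re
import unicodedata

# Zero-width and invisible formatting characters that can hide injected text.
_ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")

# Chat-template control tokens named in the table (illustrative, not exhaustive).
_TEMPLATE_TOKENS = re.compile(r"</?think>|<\|eom_id\|>|<\|im_sep\|>")


def sanitize_for_judge(text: str) -> str:
    """NFKC-normalize, drop zero-width characters, then strip template tokens."""
    # NFKC folds look-alikes such as fullwidth '<' and '|' into their ASCII forms.
    text = unicodedata.normalize("NFKC", text)
    text = _ZERO_WIDTH.sub("", text)
    # Repeat until stable so a token reassembled by an earlier removal is also caught.
    while True:
        stripped = _TEMPLATE_TOKENS.sub("", text)
        if stripped == text:
            return text
        text = stripped
```

The order matters in this sketch: normalizing and removing invisible characters first means a token split by a zero-width space, or written with fullwidth brackets, is already in its plain form when the token pattern runs.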
| Atomic per-user daily AI cost cap: SQL transaction across 5 cleaner-AI routes |
✓Source: backend/db.py — atomic across 5 cleaner-AI routes (Wave 5 W5 S1) |
— | — | — | — | — |
— | — | — | — | — |
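One way an atomic per-user daily cap can be enforced is a conditional UPDATE inside a single transaction, so the check and the increment cannot be interleaved by a concurrent request. The sketch below uses SQLite from the Python standard library as a stand-in; the table name `ai_spend`, the function `try_charge`, and the cap value are hypothetical, and the database engine behind backend/db.py is not stated in the source.

```python
import sqlite3

DAILY_CAP_CENTS = 500  # hypothetical cap, for illustration only


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ai_spend ("
        " user_id TEXT, day TEXT, cents INTEGER NOT NULL,"
        " PRIMARY KEY (user_id, day))"
    )
    return conn


def try_charge(conn: sqlite3.Connection, user_id: str, day: str, cost_cents: int) -> bool:
    """Reserve cost_cents against the user's daily cap; True if it fit."""
    with conn:  # one transaction: commits on success, rolls back on exception
        conn.execute(
            "INSERT OR IGNORE INTO ai_spend (user_id, day, cents) VALUES (?, ?, 0)",
            (user_id, day),
        )
        # The cap check lives in the WHERE clause, so check and increment are one statement.
        cur = conn.execute(
            "UPDATE ai_spend SET cents = cents + ? "
            "WHERE user_id = ? AND day = ? AND cents + ? <= ?",
            (cost_cents, user_id, day, cost_cents, DAILY_CAP_CENTS),
        )
        return cur.rowcount == 1


conn = make_db()
results = [try_charge(conn, "u1", "2025-01-01", c) for c in (300, 300, 200)]
# results == [True, False, True]: the second charge would exceed the cap and is refused.
```

Every route that spends against the cap would call the same function before doing work, which is what makes the cap hold across routes rather than per route.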
| D · Public trust signals |
| Public security page WITH retention numbers cited to code: Operational retention days, not just compliance logos |
✓Source: /security — TLS, AES-256, retention tied to backend/db.py |
— | — | — | — | — |
— | — | — |
✓security.snorkel.ai — SOC 2 + HIPAA + NIST CSF (no retention-days numbers cited) |
— |
| Public claims / receipts page: Every public number sourced to a code file or commit |
✓Source: /claims — every numeric claim → code path / paper section |
— | — | — | — | — |
— | — | — | — | — |
| Public live status page: API health surface, live or incident-style |
✓Source: /status — live Modal /health fetch (manual updates for incidents) |
— | — | — |
✓status.fireworks.ai — incident-style + uptime % |
✓status.openai.com — incident-style + uptime % |
— | — | — | — | — |
| E · Architectural transparency |
| Open arXiv paper: First-party research artifact |
✓Source: paper/crma_modular_cl_arxiv.tex — CRMA paper, arXiv-ready |
— | — | — | — | — |
— |
✓Cleanlab confident-learning research lineage |
— |
✓Snorkel (arXiv 1711.10160) + DryBell (1812.00417) |
— |
| Patent pending or granted: Disclosed publicly |
✓CRMA — US provisional filed 2026-02-28 |
— | — | — | — | — |
— | — | — | — | — |
| Public benchmark results: First-party numbers, 3-seed Mistral-7B 26/31 zero-forget |
✓Source: /claims — 5-domain Mistral-7B chain at 26/31 zero-forget, 3-seed reproducibility, per-seed ranges disjoint. Paper at paper/crma_modular_cl_arxiv.tex. |
✓LoRA Land paper (arXiv 2405.00732) |
— | — | — | — |
— | — | — | — | — |
| Multi-architecture verification (≥4 model families): Not just "supports many models" |
✓TinyLlama, Mistral-7B, Saul-7B, Qwen3-8B, Gemma-2-9B, Llama-3.1-8B = 6 families |
✓LoRA Land — 10 base models tested |
— | — | — | — |
— | — | — | — | — |
| F · Engineering quality signals |
| Atomic SQL transactions for billing-cap: Race-free under concurrent jobs |
✓Source: backend/db.py — atomic across 5 cleaner-AI routes (Wave 5) |
— | — | — | — | — |
— | — | — | — | — |
| Property-test invariants in test suite: Not unit-tests-of-examples; property tests |
✓Source: backend/tests/cleaner/test_K1_invariants.py + test_score_floor_invariant.py |
— | — | — | — | — |
— | — | — | — | — |
| IDOR existence-oracle defense: 403 → 404 conversion + timing symmetry |
✓Wave 5 W5 S1 fix — backend/tests/cleaner/test_p0_rt_idor.py + test_s3_2_file_id_idor.py |
— | — | — | — | — |
— | — | — | — | — |
| Modal-side upload MIME / magic-byte guard: Validate file type at ingest |
◐Source: backend/cleaner/validators.py — MIME + magic-byte present, but inspects only first JSONL line (P2 backlog). |
— | — | — | — | — |
— | — | — | — | — |
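The IDOR existence-oracle defense listed in this section (403 → 404 conversion) can be sketched as a lookup that raises the same error whether a record is missing or belongs to someone else. `FileRecord`, `NotFound`, and `get_file_for_user` are hypothetical names; the sketch shows only response symmetry, and real timing symmetry also depends on the database and network path, which is not modeled here.

```python
import hmac
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FileRecord:
    file_id: str
    owner_id: str
    name: str


class NotFound(Exception):
    """Would map to HTTP 404 in a web framework."""

    status_code = 404


def get_file_for_user(store: Dict[str, FileRecord], user_id: str, file_id: str) -> FileRecord:
    record: Optional[FileRecord] = store.get(file_id)
    # Compare against a placeholder owner when the record is missing, so both
    # branches run the same comparison before deciding.
    owner = record.owner_id if record is not None else ""
    is_owner = hmac.compare_digest(owner.encode(), user_id.encode())
    if record is None or not is_owner:
        # Same exception and message for "does not exist" and "not yours":
        # the caller cannot tell which file IDs are real.
        raise NotFound("file not found")
    return record
```

Returning 403 for someone else's file and 404 for a missing one would let an attacker enumerate valid IDs; collapsing both into one response removes that signal.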
| G · Inference & API surface |
| OpenAI-compatible /v1/chat/completions: Drop-in replacement for the OpenAI SDK; Bearer-key compatible |
✓Source: backend/server.py:5930 — full OpenAI envelope, scoped per run_id. |
— |
✓docs.together.ai — OpenAI-compatible chat completions |
— |
✓docs.fireworks.ai — chat completions API |
✓platform.openai.com — native |
— | — | — | — | — |
| Adapter export ZIP (train here, run anywhere): LoRA + CRMA bundle download via GET /download/{run_id} |
✓Source: backend/server.py:4371 + utils/inject.py — standard PEFT format, loads in vLLM / TGI / Transformers. |
— | — |
✓HuggingFace AutoTrain — adapter pushed to the Hub by default |
— |
— |
— | — | — | — | — |
| HF Hub README auto-export with IP-leak guard: YAML frontmatter + sanitized hparams allow-list + XSS escape |
✓Source: backend/server.py:4287 + backend/readme_export.py — _assert_safe_overlap hard-fails on patent-pending field names. |
— | — | — | — | — |
— | — | — | — | — |
| Per-API-key allowed-runs / allowed-models + RPM/TPM scoping: Least-privilege keys, fine-grained throttling per key |
✓Source: backend/db.py api_key_can_access_run / api_key_can_access_model + backend/server.py:8470 PATCH /me/api-keys/{id} |
— | — | — | — | — |
— | — | — | — | — |
| Streaming SSE generation: Token-by-token via /v1/{run_id}/generate?stream=true |
✓Source: modal_deploy.py:2004 generate_text_stream + backend/server.py:5336 |
— |
✓docs.together.ai — streaming on OpenAI envelope |
— |
✓docs.fireworks.ai — streaming supported |
✓platform.openai.com — streaming native |
— | — | — | — | — |
| CL chain visualization tree UI: Parent-child task graph rendered in dashboard |
✓Source: frontend/src/components/CLChainTree.tsx + backend/server.py:8123 GET /chain/{run_id} |
— | — | — | — | — |
— | — | — | — | — |
| H · Operational quality & data rights |
| Auto-refund on stuck runs (90-min sweeper): Money back automatically without a support ticket |
✓Source: tests/test_stuck_sweep_refund.py + backend/db.py refund_stuck_run() — 5-min cron, 90-min active-job window, pro-rata refund via _auto_refund with correlation_id trace. |
— | — | — | — | — |
— | — | — | — | — |
| GDPR data export + self-delete endpoints: User-portable JSON export + account self-deletion, no support ticket |
✓Source: backend/server.py:8880 GET /account/data-export + backend/server.py:8846 DELETE /account — CSV-injection sanitized at :8867. |
— | — | — | — | — |
— | — | — | — | — |
| Domain PII detectors with checksum validation: DEA / NPI / CUSIP / SWIFT / ABA + 4 more (9 detectors; defense + finance + legal) |
✓Source: backend/cleaner/detectors_typed.py — 9 detectors with checksum validation (DEA, NPI, CUSIP, SWIFT/BIC, ABA, MRN, ICD-10, bar number, Bates). |
— | — | — | — | — |
— | — | — | — | — |
| Military OPSEC redaction toggles: CUI / FOUO / SECRET markings, MGRS grid, EDIPI, DTG, lat/long, network refs |
✓Source: backend/cleaner/validators.py (opsec_* codes) — 6 categories, defense-customer-shaped feature. |
— | — | — | — | — |
— | — | — | — | — |
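For the checksum-validated PII detectors listed in this section, two of the public checksum rules can be written in a few lines: the NPI check digit (Luhn over the number prefixed with 80840) and the ABA routing-number check (weights 3, 7, 1 repeated, sum divisible by 10). The function names are hypothetical and this is not the code in backend/cleaner/detectors_typed.py; a real detector would also need pattern matching and context rules around the checksum.

```python
def luhn_ok(digits: str) -> bool:
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def _is_ascii_digits(candidate: str, length: int) -> bool:
    return len(candidate) == length and candidate.isascii() and candidate.isdigit()


def is_valid_npi(candidate: str) -> bool:
    """A 10-digit NPI passes if Luhn holds over '80840' + NPI."""
    if not _is_ascii_digits(candidate, 10):
        return False
    return luhn_ok("80840" + candidate)


def is_valid_aba(candidate: str) -> bool:
    """A 9-digit ABA routing number passes if the 3-7-1 weighted sum is 0 mod 10."""
    if not _is_ascii_digits(candidate, 9):
        return False
    weights = (3, 7, 1) * 3
    return sum(w * int(c) for w, c in zip(weights, candidate)) % 10 == 0
```

A checksum gate of this kind cuts false positives: a random 9- or 10-digit number passes only about one time in ten, so most order numbers and IDs that merely look like an NPI or routing number are not flagged.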