Describe a task.
Get training data.

Upload your reference files, choose SFT or RL, and let an agent swarm generate, validate, and package a training-ready dataset — typically in hours, not weeks.

See examples
▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒

01 / The gap

Foundation models have never seen your internal code.

Your proprietary frameworks, DSLs, internal toolchains, and domain-specific languages don't exist in any training set. AI coding tools hallucinate on your stack because they were never trained on it.

02 / The bottleneck

Fine-tuning fixes this — but you need data first.

Building high-quality datasets by hand takes weeks. Prompting an LLM in a loop gets you volume, not quality — no validation, no deduplication, no contamination checks, no reward logic, no understanding of how software actually works.

03 / Wormulon

Agents generate it. Validated. Deduped. Training-ready.

The flow

How it works

STEP 01

Describe

Write a dataset brief: subject matter, use case, category distribution. Pick SFT or RL. Name the toolchains — Terraform, Bazel, Kdb+/Q, your internal compiler.

STEP 02

Upload

Attach reference files: CSV, JSON, PDF, code samples, toolchain binaries. If your toolchain is niche or internal, upload a Linux-compatible version. More context = better output.

STEP 03

Configure

Set dataset size, effort level (fastest → maximum QA depth), and train/eval split. Generate a quote and see the price before you commit. No call required.

STEP 04

Download

Validated, merged, packaged. Dataset in your chosen format (JSONL, Parquet, HF), with reward and grading logic, documentation, dependencies, and a contamination report.

Most jobs complete in hours. Large or maximum-effort datasets may take longer. You'll see an estimate before you confirm.

Recent jobs

What completed datasets look like

Generate RL dataset for clinical coding

RL Complete
Type
Reinforcement Learning
Size
500 examples
Effort
Level 2/5
Format
Harbour

Given a simulated clinical note (discharge summary, operative report), the agent must assign the correct combination of ICD-10 codes from the ~70,000 code taxonomy. Reward is based on coding accuracy, specificity, and compliance with CMS bundling/unbundling rules.

Sample row

{
  "clinical_note": "72 y/o male, CABG x3, discharged POD 5...",
  "expected_codes": ["Z95.1", "I25.10", "Z87.74"],
  "reward_criteria": {
    "primary_accuracy": 0.85,
    "specificity_bonus": 0.10,
    "bundling_compliance": true
  }
}

Generate SFT dataset for internal Kdb+/Q codebase

SFT Complete
Type
Supervised FT
Size
1,200 examples
Effort
Level 3/5
Format
JSONL

Instruction/response pairs for a proprietary Kdb+/Q research library used by a quantitative trading desk. Covers query optimisation, tick data manipulation, real-time aggregation patterns, and internal utility functions.

Sample row

{
  "instruction": "Write a q function that computes 30-second VWAP
                  for a given sym from the trades table,
                  handling overnight boundaries.",
  "response": "vwap30:{[s] select vwap:wavg[size;price]
               by 30 xbar time.second from trades
               where sym=s, time within (prev_close;next_open)}",
  "metadata": {"category": "tick_data", "complexity": "intermediate"}
}

Generate SFT dataset for Terraform + internal cloud abstraction

SFT Complete
Type
Supervised FT
Size
800 examples
Effort
Level 4/5
Format
Parquet

Instruction/response pairs covering Terraform modules, custom provider configurations, and an internal cloud abstraction DSL used by a platform engineering team.

Sample row

{
  "instruction": "Create a Terraform module that provisions an internal
                  load balancer using our cloud_abstract provider
                  with the standard tagging policy applied.",
  "response": "module \"internal_lb\" {\n  source = \"../modules/cloud_abstract_lb\"\n
                 providers = { cloud_abstract = cloud_abstract.prod }\n
                 tags = merge(local.standard_tags, { service = var.service_name })\n}",
  "metadata": {"category": "infrastructure", "complexity": "advanced"}
}
░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░

Who it's for

Built for teams whose code lives outside the training set.

▸ Audience 01

Quantitative finance

Your quants use AI coding tools daily but your internal code — Kdb+/Q, proprietary C++, internal Python research libraries — has never been in any training set. Generate a test dataset on open-source code over a weekend. See the quality difference. Then bring it to your internal stack.

▸ Audience 02

Proprietary stacks

Your codebase uses languages and frameworks no foundation model has seen. Internal DSLs, custom compilers, niche toolchains. AI tools underperform exactly where it matters most. Wormulon generates training data that teaches models your stack.

▸ Audience 03

ML teams already fine-tuning

You're running training jobs. The bottleneck isn't compute — it's sourcing quality data for your specific task. You've tried prompting models in a loop and the output isn't good enough. Wormulon generates validated, structured datasets from a task description.

▸ Audience 04

Regulated industries

Healthcare, defence, government. Specific compliance requirements, internal taxonomies, proprietary workflows. Wormulon generates task-specific datasets without requiring you to expose production data — upload sanitised reference files and describe what you need.

Your data

Your data stays yours.

Your task descriptions, reference files, and generated datasets are not used to train any model. Uploaded files are available only during your generation run. Generated datasets are stored for download for 30 days, then permanently deleted. Wormulon does not retain, aggregate, or learn from customer inputs or outputs.

For regulated environments requiring on-premise deployment: Wormulon is built by Cosine, a team with sovereign AI clearance that has delivered classified workloads for the UK government.

Built by the team behind UK's first sovereign AI model

Stop sourcing data. Start generating it.

No call required. See a quote in minutes.