Describe a task.
Get training data.

Upload your reference files, choose SFT or RL, and let an agent swarm generate, validate, and package a training-ready dataset — typically in hours, not weeks.


dataset_brief.txt

Generate an SFT dataset for our proprietary Kdb+/Q research library used by a quantitative trading desk. Cover query optimisation, tick data manipulation, real-time aggregation patterns, and internal utility functions.

Reference files: trading_lib.q · schema_docs.pdf · internal_utils.q
Target: 1,200 examples · JSONL · 90/10 split

job_4a2f91c Complete

Type: Supervised FT
Size: 1,200 examples
Effort: Level 3/5
Format: JSONL
Duration: 4h 12m
Contamination: 0 matches

01 / The gap

Foundation models have never seen your internal code.

Your proprietary frameworks, internal toolchains, and domain-specific languages (DSLs) don't exist in any training set. AI coding tools hallucinate on your stack because they were never trained on it.

02 / The bottleneck

Fine-tuning fixes this — but you need data first.

Building high-quality datasets by hand takes weeks. Prompting an LLM in a loop gets you volume, not quality — no validation, no deduplication, no contamination checks, no reward logic, no understanding of how software actually works.

03 / Wormulon

Agents generate it. Validated. Deduped. Training-ready.

The flow

How it works

STEP 01

Describe

Write a dataset brief: subject matter, use case, category distribution. Pick SFT or RL. Name the toolchains — Terraform, Bazel, Kdb+/Q, your internal compiler.

STEP 02

Upload

Attach reference files: CSV, JSON, PDF, code samples, toolchain binaries. If your toolchain is niche or internal, upload a Linux-compatible version. More context = better output.

STEP 03

Configure

Set dataset size, effort level (fastest → maximum QA depth), and train/eval split. Generate a quote and see the price before you commit. No call required.

STEP 04

Download

Validated, merged, packaged. Dataset in your chosen format (JSONL, Parquet, HF), with reward and grading logic, documentation, dependencies, and a contamination report.

Most jobs complete in hours. Large or maximum-effort datasets may take longer. You'll see an estimate before you confirm.
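
The deduplication and contamination checks named in the steps above can be pictured with a small sketch. This page does not describe how Wormulon implements them, so everything here is an assumption for illustration: the function names, the choice of the `instruction` field as the comparison key, and the 8-token n-gram window are all hypothetical.

```python
import hashlib
import re


def normalise(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies compare equal."""
    return re.sub(r"\s+", " ", text.strip().lower())


def dedupe(rows: list[dict], key: str = "instruction") -> list[dict]:
    """Keep the first row for each distinct normalised value of `key` (hypothetical helper)."""
    seen: set[str] = set()
    unique = []
    for row in rows:
        digest = hashlib.sha256(normalise(row[key]).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(row)
    return unique


def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    """Return the set of n-token windows in the normalised text (empty if the text is shorter than n)."""
    tokens = normalise(text).split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def contamination_matches(
    rows: list[dict],
    benchmark_texts: list[str],
    key: str = "instruction",
    n: int = 8,
) -> list[int]:
    """Return indices of rows that share at least one n-gram with any benchmark text."""
    benchmark_grams: set[tuple[str, ...]] = set()
    for text in benchmark_texts:
        benchmark_grams |= ngrams(text, n)
    return [i for i, row in enumerate(rows) if ngrams(row[key], n) & benchmark_grams]
```

Under these assumptions, a report line such as "Contamination: 0 matches" would correspond to `contamination_matches` returning an empty list. Exact hashing only catches near-verbatim copies; a production pipeline would likely need fuzzier matching.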

Recent jobs

What completed datasets look like

Generate RL dataset for clinical coding

RL Complete

Type: Reinforcement Learning
Size: 500 examples
Effort: Level 2/5
Format: Harbour

Given a simulated clinical note (for example, a discharge summary or operative report), the agent must assign the correct combination of ICD-10 codes from the ~70,000-code taxonomy. Reward is based on coding accuracy, specificity, and compliance with CMS bundling/unbundling rules.

Sample row

{
  "clinical_note": "72 y/o male, CABG x3, discharged POD 5...",
  "expected_codes": ["Z95.1", "I25.10", "Z87.74"],
  "reward_criteria": {
    "primary_accuracy": 0.85,
    "specificity_bonus": 0.10,
    "bundling_compliance": true
  }
}
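
The sample row does not say how its `reward_criteria` fields combine into a single score. One plausible reading, sketched below purely as an assumption, treats `primary_accuracy` and `specificity_bonus` as weights and `bundling_compliance` as a hard requirement. The function names and the `bundling_ok` flag (standing in for a separate rules check that is not shown) are hypothetical.

```python
def category(code: str) -> str:
    """Return the three-character ICD-10 category, e.g. 'I25' for 'I25.10'."""
    return code.split(".")[0]


def reward(
    predicted: list[str],
    expected: list[str],
    criteria: dict,
    bundling_ok: bool,
) -> float:
    """Score a prediction under the assumed weighting (not the documented method)."""
    if not expected:
        raise ValueError("expected must contain at least one code")

    # Assumption: a failed bundling check zeroes the reward when compliance is required.
    if criteria["bundling_compliance"] and not bundling_ok:
        return 0.0

    predicted_set = set(predicted)
    predicted_categories = {category(code) for code in predicted_set}

    # Accuracy: share of expected codes whose category was predicted at all.
    category_hits = sum(category(code) in predicted_categories for code in expected)
    # Specificity: share of expected codes predicted at full precision.
    exact_hits = sum(code in predicted_set for code in expected)

    score = criteria["primary_accuracy"] * category_hits / len(expected)
    score += criteria["specificity_bonus"] * exact_hits / len(expected)
    return score
```

With the sample row's criteria, predicting `I25.1` instead of `I25.10` would still earn the accuracy weight for that code but lose its share of the specificity bonus. This sketch does not penalise extra codes, which a real grader would probably need to do.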

Generate SFT dataset for internal Kdb+/Q codebase

SFT Complete

Type: Supervised FT
Size: 1,200 examples
Effort: Level 3/5
Format: JSONL

Instruction/response pairs for a proprietary Kdb+/Q research library used by a quantitative trading desk. Covers query optimisation, tick data manipulation, real-time aggregation patterns, and internal utility functions.

Sample row

{
  "instruction": "Write a q function that computes 30-second VWAP for a given sym from the trades table, handling overnight boundaries.",
  "response": "vwap30:{[s] select vwap:wavg[size;price] by 30 xbar time.second from trades where sym=s, time within (prev_close;next_open)}",
  "metadata": {"category": "tick_data", "complexity": "intermediate"}
}
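
A downloaded JSONL file with rows shaped like the sample above could be sanity-checked before training with something like the following. It is a minimal sketch: the required keys are taken from the sample row, while the function names, the fixed seed, and the shuffle-then-slice split are assumptions, not a description of how Wormulon packages its own train/eval split.

```python
import json
import random

# Keys taken from the sample row; a real dataset may define more.
REQUIRED_KEYS = {"instruction", "response", "metadata"}


def parse_jsonl(text: str) -> list[dict]:
    """Parse JSONL text, skipping blank lines and rejecting rows with missing keys."""
    rows = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        row = json.loads(line)
        missing = REQUIRED_KEYS - set(row)
        if missing:
            raise ValueError(f"line {line_number}: missing keys {sorted(missing)}")
        rows.append(row)
    return rows


def split_rows(
    rows: list[dict],
    eval_fraction: float = 0.1,
    seed: int = 0,
) -> tuple[list[dict], list[dict]]:
    """Shuffle deterministically and return (train, eval) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    eval_count = round(len(shuffled) * eval_fraction)
    return shuffled[eval_count:], shuffled[:eval_count]
```

For a 1,200-example dataset with `eval_fraction=0.1`, this yields 1,080 training rows and 120 evaluation rows.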

Generate SFT dataset for Terraform + internal cloud abstraction

SFT Complete

Type: Supervised FT
Size: 800 examples
Effort: Level 4/5
Format: Parquet

Instruction/response pairs covering Terraform modules, custom provider configurations, and an internal cloud abstraction DSL used by a platform engineering team.

Sample row

{
  "instruction": "Create a Terraform module that provisions an internal load balancer using our cloud_abstract provider with the standard tagging policy applied.",
  "response": "module \"internal_lb\" {\n  source = \"../modules/cloud_abstract_lb\"\n  providers = { cloud_abstract = cloud_abstract.prod }\n  tags = merge(local.standard_tags, { service = var.service_name })\n}",
  "metadata": {"category": "infrastructure", "complexity": "advanced"}
}

Who it's for

Built for teams whose code lives outside the training set.

▸ Audience 01

Quantitative finance

Your quants use AI coding tools daily but your internal code — Kdb+/Q, proprietary C++, internal Python research libraries — has never been in any training set. Generate a test dataset on open-source code over a weekend. See the quality difference. Then bring it to your internal stack.

▸ Audience 02

Proprietary stacks

Your codebase uses languages and frameworks no foundation model has seen. Internal DSLs, custom compilers, niche toolchains. AI tools underperform exactly where it matters most. Wormulon generates training data that teaches models your stack.

▸ Audience 03

ML teams already fine-tuning

You're running training jobs. The bottleneck isn't compute — it's sourcing quality data for your specific task. You've tried prompting models in a loop and the output isn't good enough. Wormulon generates validated, structured datasets from a task description.

▸ Audience 04

Regulated industries

Healthcare, defence, government. Specific compliance requirements, internal taxonomies, proprietary workflows. Wormulon generates task-specific datasets without requiring you to expose production data — upload sanitised reference files and describe what you need.

Your data

Your data stays yours.

Your task descriptions, reference files, and generated datasets are not used to train any model. Uploaded files are available only during your generation run. Generated datasets are stored for download for 30 days, then permanently deleted. Wormulon does not retain, aggregate, or learn from customer inputs or outputs.

For regulated environments requiring on-premises deployment: Wormulon is built by Cosine, a team with sovereign AI clearance that has delivered classified workloads for the UK government.


Stop sourcing data. Start generating it.

Describe a task.
Get training data.