Alexroblesr

I built a REST API that parses job descriptions into structured JSON using Claude Haiku, here's how...

Problem Statement

Every ATS integration I've worked on hits the same wall.
Job descriptions are written for humans. Salary buried in paragraph 4. Skills scattered across three sections. "Competitive compensation DOE" instead of a number. No two companies format them the same way.
I kept writing regex and spaCy pipelines to extract structured data from JDs, and they kept breaking on edge cases. Eventually I stopped fighting it.

The Solution: Let the LLM Handle It

I built JD Parser Pro — a REST API that takes raw job description text and returns clean, structured JSON in under 3 seconds.

You can try it right now with no signup:

```bash
curl -s -X POST https://jd-parser-api.onrender.com/v1/parse \
  -H "Content-Type: text/plain" \
  -d 'Senior Data Engineer at Stripe, San Francisco, CA. Salary: $160,000 - $220,000/year. Full-time. 5+ years Python, Spark, dbt required. Kafka and Airflow preferred. Remote-friendly. Bachelor degree in CS required. Benefits: Equity, 401k, medical, dental, vision, unlimited PTO.'
```

What comes back:

```json
{
  "title": "Senior Data Engineer",
  "company": "Stripe",
  "location": {
    "city": "San Francisco",
    "state": "CA",
    "country": "United States",
    "remote_policy": "remote"
  },
  "salary": {
    "min": 160000,
    "max": 220000,
    "currency": "USD",
    "period": "annual"
  },
  "employment_type": "full_time",
  "seniority_level": "senior",
  "experience_years": { "min": 5, "max": null },
  "required_skills": ["Python", "Spark", "dbt", "Distributed systems", "Data modeling"],
  "nice_to_have_skills": ["Kafka", "Airflow", "Real-time streaming architectures"],
  "education": {
    "degree": "Bachelor",
    "field": "Computer Science, Engineering, or related field",
    "required": true
  },
  "responsibilities": [
    "Design and maintain high-throughput ETL pipelines processing billions of events daily",
    "Collaborate with analytics and machine learning teams to deliver reliable data products",
    "Own and improve data quality monitoring and alerting systems",
    "Mentor junior engineers and contribute to architecture decisions"
  ],
  "benefits": ["Equity", "401k", "Medical", "Dental", "Vision", "Unlimited PTO", "Home office stipend"],
  "ats_keywords": ["Data Engineer", "ETL", "Python", "Spark", "dbt", "Kafka", "Airflow"],
  "industry": "Fintech",
  "department": "Data Infrastructure"
}
```
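
If you'd rather call it from Python than curl, here's roughly what the client side looks like (stdlib only; the helper names are mine, not part of the API):

```python
import json
from urllib import request

API_URL = "https://jd-parser-api.onrender.com/v1/parse"

def parse_jd(text: str, timeout: float = 10.0) -> dict:
    """POST raw JD text as text/plain and return the structured JSON."""
    req = request.Request(
        API_URL,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)

def salary_midpoint(parsed: dict):
    """Convenience helper: midpoint of the extracted salary range, if present."""
    salary = parsed.get("salary") or {}
    if salary.get("min") is not None and salary.get("max") is not None:
        return (salary["min"] + salary["max"]) / 2
    return None
```

With the response shown above, `salary_midpoint` would give you 190000.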

The Stack

  • FastAPI + Python 3.12 — fast, async, great developer experience
  • Claude Haiku 4.5 — cheap, fast, handles ambiguous text well
  • Prompt caching — cuts cost to ~$0.001/call on repeated system prompts
  • Upstash Redis — response caching for identical JD inputs
  • Slowapi — rate limiting (30 req/min per IP)
  • Docker on Render — simple deployment, free tier for now
  • GitHub Actions — CI/CD on every push
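
The Redis layer is the simplest win in the stack: identical JD text should never hit Claude twice. A minimal sketch of the cache-key logic (a dict stands in for Upstash Redis here, and the function names are illustrative, not my production code):

```python
import hashlib
import json

def cache_key(jd_text: str) -> str:
    # Normalize whitespace and case so trivially different copies
    # of the same JD still hit the cache.
    normalized = " ".join(jd_text.split()).lower()
    return "jd:" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()

_cache: dict = {}  # stand-in for Redis GET/SET (with a TTL in production)

def parse_with_cache(jd_text: str, parse_fn) -> dict:
    """Return a cached response, or call the expensive LLM parse and store it."""
    key = cache_key(jd_text)
    if key in _cache:
        return json.loads(_cache[key])
    result = parse_fn(jd_text)  # the actual Claude call
    _cache[key] = json.dumps(result)
    return result
```

Hashing a normalized copy of the text means re-posted JDs with different spacing or capitalization still count as cache hits.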

Why Claude Haiku instead of spaCy or regex?

Salary ranges alone come in dozens of formats:

  • $160K–$220K
  • 160,000 to 220,000 annually
  • competitive compensation DOE
  • up to $220K depending on experience
A rules-based approach breaks on every new variation. Claude handles the ambiguity naturally and is fast enough (under 3 seconds) and cheap enough (~$0.001/call with caching) to run at API scale.
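
To make that concrete, here's the shape of the rule-based extractors I kept patching (an illustrative sketch, not my actual pipeline). It handles the first format above and silently returns nothing for the rest:

```python
import re

# Matches "$160K–$220K"-style ranges only.
SALARY_RE = re.compile(r"\$?(\d{2,3})[kK]\s*[–—-]\s*\$?(\d{2,3})[kK]")

def extract_salary(text: str):
    m = SALARY_RE.search(text)
    if not m:
        return None  # silent failure on every format the regex hasn't seen
    return int(m.group(1)) * 1000, int(m.group(2)) * 1000
```

"competitive compensation DOE" and "160,000 to 220,000 annually" both fall straight through, and each new JD format means another patch.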
The tradeoff: LLMs can hallucinate. So I added a deterministic post-processing layer on top of the model output. For example, if the JD contains a $ sign but the model returns "currency": null, I force it to "USD". Belt and suspenders on the fields that matter most.
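
A minimal sketch of that guard (the function name and the inverted-range check are mine; the field names follow the JSON schema above):

```python
def postprocess(parsed: dict, raw_jd: str) -> dict:
    """Deterministic fixes layered on top of the model's JSON output."""
    salary = parsed.get("salary")
    if salary:
        # Model returned null currency but the JD clearly uses dollar signs.
        if salary.get("currency") is None and "$" in raw_jd:
            salary["currency"] = "USD"
        # Guard against an inverted min/max range.
        lo, hi = salary.get("min"), salary.get("max")
        if lo is not None and hi is not None and lo > hi:
            salary["min"], salary["max"] = hi, lo
    return parsed
```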

Pricing & Access

Listed on RapidAPI with four tiers:

Plan  | Price  | Requests/day
------|--------|-------------
Basic | Free   | 10
Pro   | $19/mo | 50
Ultra | $49/mo | 250
Mega  | $99/mo | 1,000

