Ritin Wadekar, AI Engineer

01 SELECTED WORK

Three companies. Five roles. One thread:

“I always left the system smarter than I found it.”

01 · JAN 2026 TO JUN 2026 · SAN FRANCISCO, CA

AI Engineer

at Onpoint Insights

Researched, architected, and shipped a government-compliant automated medical coding pipeline for a healthcare client.

7.66%

Coding error rate, well below the 10–20% industry standard

3×

More accurate than a standard baseline

$8.4B

Market we are targeting in medical coding

KEY OUTCOMES

01 Delivered an automated medical coding pipeline in 2 months, owning the project end to end.
02 Identified a flawed LLM baseline sitting below 70% accuracy before it became the foundation of the entire system.
03 Built a retrieval system grounded in government sources, achieving a 7.66% error rate vs a 10–20% industry baseline.
04 Designed a two-stage AI pipeline that tripled prediction accuracy over a standard LLM approach.
05 Embedded Medicaid and Medicare compliance rules directly into the pipeline to handle government payer requirements.

Python · RAG · Vector Databases · LLMs · ICD-10 · CPT · Retrieval Architecture · +3

02 · JAN 2025 TO DEC 2025 · ANDOVER, MA

Data Scientist

at Onpoint Insights

Built and shipped AI pipelines that automated data workflows and improved decision making across pharma and retail clients.

40%

Reduction in analyst effort by automating data workflows

17%

More accurate answers over a standard search baseline

4.2%

Better return on campaign spend through causal modeling

KEY OUTCOMES

01 Built a multi-agent LLM pipeline to query 20K+ records, cutting analyst effort by 40%.
02 Designed a RAG assistant over 20+ documents, outperforming standard search question answering by 17%.
03 Ran causal regression models for a top-3 pharma client, delivering a 4.2% ROI lift.
04 Deployed an OCR workflow to extract data from PDF invoices, achieving 92% accuracy.

Multi-Agent Systems · OCR · Causal Modeling · NLP · Azure · Statistical Analysis · Power Automate · +2

03 · MAY 2024 TO AUG 2024 · ANDOVER, MA

Data Science Intern

at Onpoint Insights

Built data and ML systems to improve product recommendations, data consistency, and sales forecasting across retail and supply chain operations.

10M+

Transaction records extracted to build a hybrid recommendation system

6.7%

Lift in average order size from the hybrid recommendation system

12%

Error rate on sales forecasting, well within industry acceptable range

KEY OUTCOMES

01 Built a hybrid recommendation system over 10M+ records, driving a 6.7% lift in average order size.
02 Built an NLP-based entity resolution system to reconcile partial company names, reducing reporting inconsistencies.
03 Built sales forecasting models achieving a 12% error rate to support inventory and production planning.
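
To make the entity resolution outcome above concrete, here is a minimal sketch of reconciling partial company names against a canonical list. It assumes simple token containment after stripping legal suffixes, which is a deliberate simplification and not the NLP system that was shipped. The company names, `SUFFIXES`, `normalize`, and `resolve` are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical legal suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp", "corporation", "co", "ltd", "llc"}


def normalize(name):
    """Lowercase a name, split it into tokens, and drop legal suffixes."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return frozenset(t for t in tokens if t not in SUFFIXES)


def resolve(partial, canonical_names, threshold=0.5):
    """Return the canonical name that covers most of the partial name's tokens.

    The score is the fraction of the partial name's tokens found in the
    canonical name. Returns None when no candidate reaches the threshold.
    """
    query = normalize(partial)
    if not query:
        return None
    best_name, best_score = None, 0.0
    for name in canonical_names:
        score = len(query & normalize(name)) / len(query)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Fictional canonical records, for illustration only.
CANONICAL = ["Acme Foods Inc.", "Northwind Traders LLC"]

print(resolve("ACME foods", CANONICAL))
print(resolve("Northwind", CANONICAL))
print(resolve("Globex", CANONICAL))
```

A production system would likely add fuzzy matching or embeddings to handle typos and abbreviations. Token containment only shows the basic idea of mapping many partial spellings onto one canonical record.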

Python · NLP · Recommendation Systems · Time Series Forecasting · Entity Resolution · Power BI · Vector Similarity · +1

04 · JAN 2024 TO MAY 2024 · DALLAS, TX

Student Consultant, Data Scientist

at Conagra Brands

Built analytical frameworks across pricing, promotion, and consumer demand to drive growth strategy and optimize spend for Conagra's Meat Substitutes category.

7%

Projected sales uplift in Conagra's Meat Substitutes from optimized strategy

$80K

Cost savings unlocked through promotional spend reallocation

100+

Product attributes analyzed across 4 years of regional sales data

KEY OUTCOMES

01 Analyzed Meat Substitutes sales across U.S. regions, projecting a 7% sales uplift from region-specific pricing and product strategy.
02 Built Clout and Vulnerability Maps to benchmark plant-based brands, driving promotion reallocation and cost savings.
03 Analyzed supermarket scanner data, addressing statistical challenges to enable reliable causal inference.

Python · SAS · Pricing Strategy · Promotion Optimization · Demand Modeling · Causal Modeling · Multivariate Analysis · +2

05 · APR 2023 TO JUL 2023 · PUNE, INDIA

Data Science Intern

at Creative Galileo

Built ML and telemetry systems to drive user retention and platform performance for a Series A EdTech startup with 10M+ app downloads.

10M+

App downloads at the Series A EdTech platform

10%

Reduction in app load times from targeted performance optimizations

12%

Reduction in payment page churn from funnel bottleneck fixes

KEY OUTCOMES

01 Built churn prediction models on user behavioral data for a Series A EdTech startup with 10M+ app downloads, generating risk scores to drive retention initiatives.
02 Analyzed user telemetry stored on AWS S3 to surface latency bottlenecks, cutting app load times 10% and payment page churn 12%.

Python · scikit-learn · AWS SageMaker · AWS S3 · Churn Modeling · Telemetry Analysis · Funnel Optimization · +2

02 FLAGSHIP · AUTOMATED MEDICAL CODING PIPELINE

Medical coding, automated and compliant.

Every hospital visit ends in a bill, and every bill depends on someone translating the doctor's notes into official medical codes. Done by hand, 10–20% of those codes come out wrong, and the industry loses over $1B a year to the fallout. For a healthcare client, I researched, architected, and shipped an AI pipeline that does the translation itself: grounded in official government sources, compliant with Medicaid and Medicare rules, and wrong just 7.66% of the time. Built end to end in two months. Here is how it works, stage by stage.

1

RETRIEVE

Ground in government truth

2

CLASSIFY

Narrow, then decide

3

COMPLY

Compliance in the architecture

The pipeline reads the doctor's notes and retrieves candidate codes from three official CMS coding sources. Every prediction starts from clinically and legally accepted standards, not from whatever the model happens to remember.

→ 3 authoritative CMS sources in the retrieval layer
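
As a rough illustration of the retrieval stage described above, here is a minimal sketch that assumes a tiny in-memory corpus and plain token overlap in place of whatever index the production system uses. The `CORPUS` rows, `tokenize`, and `retrieve` are hypothetical names invented for this example. The point is only that every candidate carries the official source it came from.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus of (code, description, source) rows. A real corpus
# would be loaded from the official sources; these rows are illustrative only.
CORPUS = [
    ("E11.9", "type 2 diabetes mellitus without complications", "ICD-10-CM index"),
    ("I10", "essential primary hypertension", "ICD-10-CM index"),
    ("J45.909", "unspecified asthma uncomplicated", "ICD-10-CM index"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(note, corpus=CORPUS, k=2):
    """Return up to k (code, source) candidates ranked by token overlap.

    Candidates with no overlap are dropped, so the model never sees a code
    that is not supported by some entry in the corpus.
    """
    note_tokens = Counter(tokenize(note))
    scored = []
    for code, description, source in corpus:
        desc_tokens = Counter(tokenize(description))
        overlap = sum((note_tokens & desc_tokens).values())
        if overlap == 0:
            continue
        # Normalize by length so long descriptions are not favored.
        norm = math.sqrt(sum(note_tokens.values()) * sum(desc_tokens.values()))
        scored.append((overlap / norm, code, source))
    scored.sort(key=lambda row: (-row[0], row[1]))
    return [(code, source) for _, code, source in scored[:k]]


print(retrieve("Patient with type 2 diabetes and hypertension"))
```

A production retrieval layer would more plausibly use embeddings and a vector database, as the tech stack above suggests. Token overlap is used here only to keep the sketch dependency-free.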

A two-stage design narrows roughly 70,000 possible codes to a shortlist, then picks the one right code, mirroring how expert human coders actually work. It beats a standard LLM approach, which plateaued below 70% accuracy, by over 3×.

→ Two-stage narrowing → 3× accuracy vs stock LLM
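
The narrow-then-decide idea above can be sketched as follows. This is a minimal sketch under stated assumptions: a cheap word-overlap count stands in for stage 1, and a Jaccard similarity stands in for the stage 2 classifier, whose real form is not described here. `CODES`, `coarse_score`, `fine_score`, and `two_stage_predict` are hypothetical names.

```python
import re

# Illustrative rows only; the real code set has roughly 70,000 entries.
CODES = {
    "E11.9": "type 2 diabetes mellitus without complications",
    "E11.65": "type 2 diabetes mellitus with hyperglycemia",
    "E10.9": "type 1 diabetes mellitus without complications",
    "I10": "essential primary hypertension",
}


def words(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def coarse_score(note, description):
    """Stage 1 stand-in: cheap count of shared words."""
    return len(words(note) & words(description))


def fine_score(note, description):
    """Stage 2 stand-in: Jaccard similarity, which penalizes extra terms."""
    a, b = words(note), words(description)
    return len(a & b) / len(a | b) if a | b else 0.0


def two_stage_predict(note, codes=CODES, shortlist_size=3):
    """Narrow the full code set to a shortlist, then pick one final code."""
    ranked = sorted(codes, key=lambda c: coarse_score(note, codes[c]), reverse=True)
    shortlist = ranked[:shortlist_size]
    best = max(shortlist, key=lambda c: fine_score(note, codes[c]))
    return best, shortlist


best, shortlist = two_stage_predict("type 2 diabetes with hyperglycemia")
print(best, shortlist)
```

The design choice worth noting is that the expensive decision only ever runs over the shortlist, so stage 2 can afford a slower, more careful comparison than would be practical across the full code set.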

Medicaid and Medicare payer rules are built directly into the pipeline rather than bolted on afterwards. That targets the $1B+ the healthcare industry loses every year to coding errors and non-compliant claims.

→ Government payer rules enforced by design
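
One way to picture rules enforced by design is a gate that every candidate must pass before it can be returned. The sketch below assumes rules can be expressed as small predicates per payer. The rule names, the `claim` fields, and the `CODE_A` / `CODE_B` identifiers are placeholders invented for illustration and do not represent actual Medicaid or Medicare rules.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    payer: str
    check: Callable[[str, dict], bool]  # True means the candidate passes


# Placeholder rules only; real payer rules are far more detailed.
RULES = [
    Rule("placeholder_allowed_list", "medicare",
         lambda code, claim: code in claim["allowed_codes"]),
    Rule("placeholder_documented", "medicaid",
         lambda code, claim: code in claim["documented_codes"]),
]


def compliance_gate(candidates, claim, rules=RULES):
    """Split candidates into accepted codes and rejected codes with reasons.

    Only rules for the claim's payer are applied. A candidate is rejected
    as soon as it fails at least one applicable rule.
    """
    accepted, rejected = [], {}
    for code in candidates:
        failed = [
            rule.name
            for rule in rules
            if rule.payer == claim["payer"] and not rule.check(code, claim)
        ]
        if failed:
            rejected[code] = failed
        else:
            accepted.append(code)
    return accepted, rejected


claim = {
    "payer": "medicare",
    "allowed_codes": {"CODE_A"},
    "documented_codes": set(),
}
print(compliance_gate(["CODE_A", "CODE_B"], claim))
```

Returning the failed rule names alongside each rejected code keeps the rejection explainable, which matters when a decision has to be defended later.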

01 · RETRIEVE

Clinical note

unstructured patient documentation

GROUNDED IN 3 CMS SOURCES

ICD-10-CM index

CMS coding guidelines

Payer references

Retrieval layer

every candidate comes from official sources

02 · CLASSIFY

NARROWING FUNNEL

~70,000 codes

dozens of candidates

1 final code

Stage 1: retrieve candidates

shortlist grounded in the CMS sources

Stage 2: classify

picks the one right code · 3× a stock LLM (<70% acc.)

03 · COMPLY

COMPLIANCE GATE

Medicaid rules · Medicare rules

non-compliant candidates are rejected here, not in an audit later

ICD-10 code

clinically and legally defensible

7.66% error vs 10–20% industry · 3× vs stock LLM

7.66%

error rate, vs the 10–20% industry standard

3×

more accurate than a standard LLM baseline

$8.4B

market projected by 2033

2 mo

from research to production, owned end to end

03 PROOF

Numbers from systems that actually shipped.

Every figure on this wall comes from production, not slide decks.

Medical Coding Error Rate

7.66%

error rate from the CMS-grounded medical coding pipeline I shipped

Ten systems. One pattern: find the bottleneck, rebuild it.

A stack built for production, not slide decks.

From Pune to San Francisco, and back with the playbook.

Let's build something that creates an impact.