© 2026 OpenMedicare. Independent data journalism. Built by TheDataProject.ai

Investigation

Your Doctor vs. The Algorithm: What AI Found That Humans Missed

Published February 2026 · 15 min read

Key Finding

We trained a machine learning model on 2,198 confirmed fraud cases and scored 1.7M Medicare providers. The model achieved an AUC of 0.8298, meaning that, shown one randomly chosen fraudulent provider and one randomly chosen legitimate provider, it ranks the fraudulent one as riskier about 83% of the time.

What Is an AUC Score? (And Why 0.83 Matters)

Let's start with the basics. AUC stands for "Area Under the Curve" — it's how data scientists measure whether a model can tell the difference between two groups. In our case: fraudulent providers vs. legitimate ones.

An AUC of 0.5 means the model is guessing randomly — a coin flip. An AUC of 1.0 means perfect detection. Our score of 0.8298 means the model is significantly better than chance. It's not perfect, but it's powerful enough to surface patterns that human auditors would take years to find manually.

[AUC scale: 0.50 = random guess · 0.8298 = our model · 1.00 = perfect detection]
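The AUC definition above can be checked by hand: it is the fraction of (fraudulent, legitimate) pairs in which the model scores the fraudulent provider higher, counting ties as half. A minimal sketch in Python; the scores below are invented for illustration, not OpenMedicare's data:

```python
def auc(pos_scores, neg_scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    wins = ties = 0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1
            elif p == n:
                ties += 1
    return (wins + 0.5 * ties) / (len(pos_scores) * len(neg_scores))

fraud = [0.91, 0.72, 0.95, 0.60]  # toy scores for confirmed-fraud providers
legit = [0.10, 0.35, 0.05, 0.80]  # toy scores for legitimate providers

print(auc(fraud, legit))  # → 0.875
```

An AUC of 0.875 here means 14 of the 16 possible fraud-vs-legit pairs are ranked correctly, which is exactly the "about 83% of the time" reading of the 0.8298 score.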

What the Algorithm Looks For

This is the part most people are curious about. What actually makes a provider look "suspicious" to a machine? The answer isn't any single thing — it's a combination of factors, weighted by how predictive each one is.

Here are the top 10 features our model uses, ranked by importance:

Top 10 Features — What the Model Weighs Most

Feature importance scores from our gradient-boosted classifier

  1. Years Active: 16.3%
     How long has this provider been billing Medicare? Fraudsters tend to be newer: they set up shop, bill aggressively, and disappear.

  2. Services Per Patient: 11.9%
     How many services does this provider bill per patient? Most doctors see a patient a few times a year. Fraud mills bill dozens of services per person.

  3. Markup Ratio: 8.0%
     How much does this provider charge vs. what Medicare actually pays? Extreme markups can signal inflated billing.

  4. Payment Per Service: 6.7%
     The average Medicare payment per service. Unusually high values suggest upcoding: billing for more expensive procedures than were performed.

  5. Markup vs. Median: 6.5%
     How does this provider's markup compare to the median for their specialty? Outliers stand out.

  6. Payment Per Patient: 5.9%
     Total payments divided by number of patients. A high ratio means heavy billing per person.

  7. Total Payments: 5.5%
     Raw dollar volume. Not suspicious alone, but combined with other factors, large totals amplify risk.

  8. Total Services: 5.3%
     Total number of services billed. At extreme levels, it becomes physically impossible for one provider to deliver this many.

  9. Patients vs. Median: 5.2%
     How does this provider's patient count compare to peers? Both extremes, too many and too few, can signal problems.

  10. Payments vs. Median: 5.2%
      Total payments relative to specialty peers. Consistent outliers attract algorithmic attention.
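To make the feature list concrete, here is a minimal sketch of how ratios like these could be derived from a provider's aggregate Part B totals. The field names and formulas are our illustrative assumptions, not OpenMedicare's actual pipeline:

```python
from statistics import median

def provider_features(p, specialty_markup_ratios):
    """Derive model-style inputs from one provider's aggregate billing totals.

    All field names here are hypothetical; the real feature definitions are
    described in the site's methodology page.
    """
    feats = {
        "years_active": p["last_year"] - p["first_year"] + 1,
        "services_per_patient": p["total_services"] / p["total_patients"],
        "markup_ratio": p["total_charges"] / p["total_payments"],
        "payment_per_service": p["total_payments"] / p["total_services"],
        "payment_per_patient": p["total_payments"] / p["total_patients"],
    }
    # Compare this provider's markup to the median markup in its specialty.
    feats["markup_vs_median"] = feats["markup_ratio"] / median(specialty_markup_ratios)
    return feats

# A provider billing 6,000 services to only 120 patients over 3 years:
example = {"first_year": 2021, "last_year": 2023,
           "total_services": 6000, "total_patients": 120,
           "total_charges": 900_000.0, "total_payments": 300_000.0}
print(provider_features(example, [1.2, 1.5, 1.8]))
```

The toy provider above lands at 50 services per patient and a 3.0x markup, well inside the "flagged" ranges discussed below.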

The #1 Signal: How Long You've Been Around

The single most important feature — at 16.3% importance — is years active. This might seem counterintuitive. Why would how long you've been practicing matter?

The answer tells you everything about how Medicare fraud works. Fraudsters tend to be new. They register a fresh NPI, bill as aggressively as possible for 1-3 years, then close up shop — often before auditors catch on. Legitimate providers, by contrast, build practices over decades.

So when the algorithm sees a provider who's only been active for 2 years but is already billing millions, that's a red flag. Not proof of fraud — but a pattern worth investigating.

The #2 Signal: Services Per Patient

At 11.9% importance, services per beneficiary is the second-strongest signal. This one is intuitive: a normal doctor might bill 5-10 services per patient per year. A fraud mill might bill 50, 100, or even 200+.

In the confirmed fraud cases our model trained on, the average services-per-beneficiary ratio was dramatically higher than the national median. Many were physically impossible — no doctor can perform that many procedures on one person in a year.

Normal vs. Suspicious: A Side-by-Side

To make this concrete, here's what a typical provider looks like next to a flagged one:

✅ Typical Provider

  • Years Active: 15-20 years
  • Services/Patient: 3-8
  • Markup Ratio: 1.2-1.8x
  • Total Payments: $100K-$500K
  • Fraud Score: 0.01-0.10

🚩 Flagged Provider

  • Years Active: 1-3 years
  • Services/Patient: 15-50+
  • Markup Ratio: 2.0-3.0x+
  • Total Payments: $500K-$2M+
  • Fraud Score: 0.90-0.96

The difference is stark. Flagged providers aren't just a little unusual — they're statistical outliers across multiple dimensions simultaneously.

Where the Algorithm Finds Risk

When we look at the providers the model flagged as highest-risk, they cluster in specific specialties and states:

Top Flagged Specialties

  • Internal Medicine: 263
  • Family Practice: 135
  • Physical Medicine and Rehabilitation: 18
  • Infectious Disease: 9
  • Hematology-Oncology: 6
  • Podiatry: 6
  • Emergency Medicine: 6
  • Endocrinology: 5
  • Geriatric Medicine: 5
  • Neurology: 5

Top Flagged States

  • CA: 56
  • FL: 56
  • NY: 39
  • TX: 36
  • NJ: 33
  • OH: 31
  • IL: 30
  • MI: 30
  • PA: 17
  • TN: 14

Internal Medicine dominates with 263 flagged providers — nearly twice the next specialty (Family Practice at 135). This makes sense: Internal Medicine is one of the broadest billing categories in Medicare, making it easier to blend fraudulent billing with legitimate-looking services.

Geographically, California and Florida tie at 56 flagged providers each, followed by New York (39), Texas (36), and New Jersey (33). These are the same states that appear repeatedly in our fraud database.
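Tallies like the ones above are a straightforward aggregation over the flagged set. A sketch using Python's standard library; the records are invented for illustration (the real counts come from the scored CMS data):

```python
from collections import Counter

# Hypothetical flagged-provider records, for illustration only.
flagged = [
    {"specialty": "Internal Medicine", "state": "CA"},
    {"specialty": "Internal Medicine", "state": "FL"},
    {"specialty": "Family Practice",   "state": "FL"},
    {"specialty": "Podiatry",          "state": "NY"},
]

# Count flagged providers per specialty and per state.
by_specialty = Counter(p["specialty"] for p in flagged)
by_state = Counter(p["state"] for p in flagged)

print(by_specialty.most_common(1))  # → [('Internal Medicine', 2)]
print(by_state["FL"])               # → 2
```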

The 500 Providers "Still Out There"

⚠️ Important Context

All data on this page comes from publicly available CMS Medicare payment records. Unusual billing patterns may reflect legitimate medical practices (such as high-volume drug administration where each unit is counted as a separate service), data reporting differences, or group practice billing. Inclusion on this page does not constitute an accusation of fraud or wrongdoing. Only law enforcement and regulatory agencies can determine whether billing patterns represent fraud. Providers flagged by our statistical model have billing patterns similar to previously convicted providers, but many may have perfectly legitimate explanations.

Our model identified 500 providers with fraud probability scores above 0.90 — representing $399.4M in total Medicare payments. These are providers who look statistically similar to confirmed fraudsters but haven't been charged.

To be clear: a high fraud score doesn't mean a provider is committing fraud. It means their billing patterns match those of providers who were later convicted. Some may have legitimate explanations. Others may warrant investigation.

Explore the full list and methodology at /fraud/still-out-there, or read about how we built the model.
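The arithmetic behind those headline numbers is a threshold filter over the scored provider list: keep everyone above 0.90, then sum their payments. A minimal sketch with toy records (field names are our assumptions; the real list covers 1.7M providers):

```python
# Toy scored providers, for illustration only.
providers = [
    {"npi": "1000000001", "fraud_score": 0.95, "total_payments": 1_200_000},
    {"npi": "1000000002", "fraud_score": 0.42, "total_payments": 250_000},
    {"npi": "1000000003", "fraud_score": 0.91, "total_payments": 800_000},
]

# Keep providers whose model score exceeds the 0.90 threshold.
still_out_there = [p for p in providers if p["fraud_score"] > 0.90]
total_payments = sum(p["total_payments"] for p in still_out_there)

print(len(still_out_there), total_payments)  # → 2 2000000
```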

What the Algorithm Can't Do

It's important to be honest about limitations:

⚠️ Limitations

  • Not a diagnosis. The model flags patterns; it can't prove intent or distinguish billing errors from deliberate fraud.
  • Specialty bias. Some specialties naturally have higher volumes or markups. The model accounts for this, but imperfectly.
  • Training data limits. We trained on 2,198 confirmed cases, a real but small sample. Some fraud types may be underrepresented.
  • Public data only. We use publicly available Medicare Part B data, which doesn't include Part A (hospital), Part D (drugs), or private claims.
  • No patient context. A provider in an underserved area may have unusual patterns for legitimate reasons.

Why This Matters

Medicare loses an estimated $60 billion per year to fraud. The federal government has about 1,500 investigators at HHS-OIG responsible for overseeing a program that pays 1.7 million providers. The math doesn't work.

Machine learning doesn't replace investigators — it helps them focus. Instead of auditing providers at random, algorithms can surface the most suspicious patterns for human review. Think of it as a spotlight, not a judge.

Read our full methodology for technical details on the model architecture, training process, and validation.

Methodology

Our model (v2.0) is a gradient-boosted decision tree (XGBoost) trained on 2,198 confirmed fraud cases from the HHS-OIG LEIE database, matched against Medicare Part B billing features. The model was validated using 5-fold cross-validation with stratified sampling. Features are computed from the most recent available Medicare Part B Public Use File. AUC is reported on held-out test data.
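The validation setup described above (gradient-boosted trees, 5-fold stratified cross-validation, AUC reported on held-out folds) can be sketched as follows. This uses scikit-learn's GradientBoostingClassifier in place of XGBoost, and the synthetic data and every hyperparameter are our assumptions, not the published v2.0 model:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)

# Synthetic stand-in for billing features: "fraud" rows (y=1) are shifted
# upward, loosely mimicking elevated services-per-patient and markup ratios.
n = 2000
y = (rng.random(n) < 0.1).astype(int)           # ~10% positive labels
X = rng.normal(size=(n, 5)) + y[:, None] * 0.8  # positives skew higher

# 5-fold stratified CV: each fold keeps the ~10% fraud rate intact.
aucs = []
for train, test in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
    model = GradientBoostingClassifier(random_state=0).fit(X[train], y[train])
    aucs.append(roc_auc_score(y[test], model.predict_proba(X[test])[:, 1]))

print(f"mean held-out AUC: {np.mean(aucs):.3f}")
```

Stratification matters here because fraud labels are rare: without it, a fold could end up with almost no positive cases, making its AUC estimate meaningless.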

Data Sources

  • Centers for Medicare & Medicaid Services (CMS)
  • Medicare Provider Utilization and Payment Data (2014-2023)
  • CMS National Health Expenditure Data

Note: All data is from publicly available Medicare records. OpenMedicare is an independent journalism project not affiliated with CMS.