Step 2 CS, Part Four: The Future of USMLE Step 2 CS

On May 26, 2020, the USMLE announced the suspension of the Step 2 CS exam.

Not long after that, I started thinking about a scene from the movie Troy. Even if you haven’t seen the film, you’ll know the story I’m thinking of.

The Greek army had laid siege to Troy for 10 years – but try as they might, they couldn’t pierce the gates. Then one day, the Trojans awakened to find that the Greeks had just… disappeared.

Their ships and soldiers were gone. In their place was a giant wooden horse.

“It’s an offering to Poseidon!” the Trojans say, as they pull the horse into the city amongst the jubilant citizenry.

And of course, the rest is history.

The Greek army hadn’t given up. They sailed back later that night and sacked the city, met at the gates by the soldiers who had been stashed away inside the wooden horse.

On social media, at least, I’ve seen a lot of celebration about the end of Step 2 CS.

Folks, Step 2 CS is just suspended. This is not an offering to Poseidon.

Step 2 CS is coming back.

Last week, the USMLE announced Revitalize Clinical Skills Assessment, a new initiative to “identify the optimal approach to assess clinical skills for licensure.”

Soon after, my Twitter mentions and DMs started to fill with questions and comments. While many students are likely still unaware of Revitalize CS, there’s already a vocal minority who are – and they’re not happy about it.

But I want anyone who cares about Step 2 CS to think hard about what I’m about to say.

Fair warning – many of you won’t like it.

A cold, hard reality

USMLE Step 2 CS will come back.

There, I said it.

The NBME fought too hard, for too long, to get the CS exam in place. To bring it back after COVID-19, they’ll face an uphill battle, sure. They can – and should – be forced to articulate why future students should be required to take Step 2 CS now that a group of graduates will be licensed without it.

However, the sharply-worded petitions and angry open letters and toothless AMA resolutions demanding that the NBME permanently get rid of Step 2 CS will not work. They’ve overcome this kind of organized resistance to the exam multiple times in the past – and they’ll do it again.

Why? Because the NBME has a powerful ally: the state medical boards.

Remember, the USMLE is a licensing examination. And whatever its failings, the Step 2 CS exam provides at least two valuable things to state boards.

First, a clinical skills exam has face validity.

From the standpoint of the general public, it’s logical to expect that physicians would be licensed using a test of clinical skills, not just computer-based tests. (You wouldn’t expect your pilot to be licensed after passing only a multiple-choice question test about how planes work, would you?)

Even if the exam itself has no predictive validity, requiring it makes the boards appear to be doing their due diligence, and serves a useful role in maintaining physician self-regulation.

Second, there’s the issue of international medical graduates. While 95% of first-time test takers from U.S. schools pass Step 2 CS, only 73% of IMGs do. On the extremes, those who are screened out by a clinical skills exam probably do have significant deficiencies that could cause problems in practice. This is the kind of thing that keeps board members up at night.

(And forget about the idea of state boards requiring only IMGs to take the Step 2 CS exam. The whole reason we have the USMLE and not the old NBME/FLEX system is that having a more difficult pathway to licensure for IMGs was viewed as discriminatory and unfair.)

From the 1987 NBME Annual Report. The NBME’s exams could only be taken by U.S. medical graduates – and U.S. citizen IMGs successfully advocated for laws to write it out of existence unless that policy changed.

My point is this:

USMLE Step 2 CS will come back – regardless of what students, faculty, and practicing physicians think.

Medical students may pay for CS, but the NBME’s real customer is the state boards. And they want a clinical skills exam.

If you really want to END Step 2 CS…

There are really only three ways that I can conceive that the Step 2 CS exam could be eliminated.

From @EndStep2CS.

1. Competition

Right now, the USMLE has a monopoly on licensure testing for allopathic physicians. But what if a competitor exam emerged? If state boards recognized a different licensure examination – one that did not require a $1300 doctor cosplay – then the market could determine the necessity of the USMLE’s Step 2 CS.

Of course, this is not likely.

The NBME/FSMB are well-acquainted with the individual state medical boards, and would lobby aggressively to contest the validity of any proposed competitor product.

Another barrier is medical schools. Before COVID-19, almost all U.S. medical schools required their students to pass Step 2 CS for graduation. (The schools meant well: making the USMLE a graduation requirement means that the $1300 registration fee can be included in the school’s cost of attendance, giving students access to student loans to cover it.) Unfortunately, a competitor exam would have to convince students that it was worth taking a new exam on top of the expensive exams that they’re required to take. That’s a tough sell.

2. Legal challenge

Over the past year, I’ve heard separately from several students who were considering bringing a lawsuit against the USMLE over Step 2 CS. To protect their anonymity, I won’t get into the specifics of their situations or the potential grounds for litigation.

Obviously, I’m not an attorney, or in a position to opine about the merits of anyone’s suit. But generally speaking, a lawsuit that would overturn Step 2 CS would face a very difficult battle. The courts give broad leeway to states and medical boards in establishing standards for professional licensure. And even a successful suit would almost certainly take years to resolve.

3. Legislative action

State medical boards are powerful bodies – but they have to follow state laws.

If a state legislature could be convinced to pass a law precluding the board from requiring an unvalidated clinical skills exam in making licensure decisions, then Step 2 CS would effectively disappear.

This seems like the most likely path to success – see the NBME/FLEX example above – but the odds still don’t look good.

There are certainly some well-connected medical students out there… but it’s gonna be hard to convince politicians to spend their hard-earned political capital to get votes on an issue that impacts, at most, a handful of their constituents.

So where do we go from here?

If you want to end Step 2 CS, the three possibilities above are your only realistic options. And none seems very likely to occur.

So to those of you DM-ing me to champion the #EndStep2CS cause: after careful consideration, I respectfully decline.

Go ahead and circulate that petition if you want.

Sign an open letter if it makes you feel better.

Write an editorial; punch a pillow; or just go outside and scream at the top of your lungs.

But if you’re asking the NBME to voluntarily stop offering Step 2 CS, just be aware that your sincere and valid concerns about the exam – no matter how loudly or clearly they are articulated – will likely result in nothing more than some serious-faced head-nodding and beard stroking in a future town hall meeting with USMLE executives. Meanwhile, the plans to re-start Step 2 CS will march on.

Another path

The more I thought about this, the more I started looking at the Step 2 CS problem a different way. I started to think about it through the lens of harm reduction.

For anyone in medicine, harm reduction will be a familiar concept. Some problems aren’t going to go away – and rather than wasting energy in an impossible task, we can actually make people’s lives better by limiting the associated harms that the problem causes.

For instance: heroin exists.

Maybe the world would be better if it didn’t. But it does. And providing access to naloxone, methadone, needle exchanges, and rehab will help a lot more people than begging drug dealers to stop selling it.

Not everyone agrees with harm reduction. Many people with good intentions still advocate for zero-tolerance drug laws and abstinence-only sexual education. And if you want to continue advocacy aimed at ending the USMLE Step 2 CS exam, have at it.

Otherwise, in the absence of a likely path to end Step 2 CS, I hope you’ll join me in advocating to reduce its harms.

Building a better (or at least less harmful) clinical skills exam

So how do we make Step 2 CS better? Let’s start by thinking about all of the reasons that students hate CS – and decide how we could make each one better.

1. Cost

Paying $1300 for the privilege of playing dress-up doctor was never going to be a popular idea. The first thing we’ve gotta do is fix that.

Remember, as we learned in Part 1, the original plan for a clinical skills exam called for costs to be limited to $200.

In defending the CS exam, the NBME is fond of noting how much value the exam provides to stakeholders like medical boards, residency program directors, and the general public. (Usually, these claims are met with a rejoinder about how the exam has no proven value.)

But this time, instead of arguing, let’s take them at their word.

If the CS exam provides such value to these groups, then why should the costs of the exam be borne exclusively on the shoulders of trainees living off of student loans? Let the groups who derive value subsidize some of the costs, and lower the price for students. Seems only fair to me.

Hallway at one of the USMLE’s five Clinical Skills Evaluation Centers.

2. Inconvenience

Taking USMLE Step 2 CS is easy if you live in Philadelphia, Chicago, Houston, Los Angeles, or Atlanta. If you live elsewhere, you’re out of luck – and will need to tack on some travel expenses to your already-four-figure CS budget.

Rather than demanding an end to Step 2 CS, a harm mitigation strategy would be to demand new testing centers to lessen the burdens of scheduling and travel.

It’s time to re-evaluate the possibility of offering the CS exam at individual medical schools (which, of course, was the original plan for where the CS exam would occur).

(And yes, I’ve heard the NBME’s argument that more test centers would be “financially prohibitive.” I just find it hard to believe. In the year 2020, almost all U.S. medical schools already have their own standardized patient programs. The Step 2 CS exam would just be carving time out of that existing infrastructure.)

3. Lack of validity

Does Step 2 CS really prevent bad doctors from practicing? Or might it actually encourage bad doctoring by incentivizing the wrong approach to clinical medicine?

Because the history taking was originally graded by considering how many items from a checklist the examinee elicited, students came up with inscrutable mnemonics like PAM HITS FOSS, LIQORAAAP, TIASHOE, and DOC PA FAA to ensure they checked all the boxes. Obviously, this is not how real hypothesis-based clinical reasoning is done.

Moreover, examinees know that these are fake patients. To pass cardiac auscultation, for example, you just have to hold your stethoscope in the right spot for a few seconds, not actually interpret anything you heard.

The original Step 2 CS exam was foisted upon students with little evidence that the exam does what it is intended to do. And that is simply unacceptable. It’s not enough to have a test that is precise and reliable if it doesn’t measure things we really care about.

Thing is, I think everyone except for the most extreme voices in this debate is willing to support a validated standardized clinical skills exam. So why has the NBME resisted giving us one for the past 15 years?

4. An arbitrary fail rate

Lemme show you some data.

These are the pass rates for first-time test-takers on various exams.

  • USMLE Step 1: 96%
  • USMLE Step 2 CS: 95%
  • COMLEX Level 1: 94%
  • COMLEX Level 2-PE: 92%
  • American Board of Anesthesiology (Basic): 91%
  • American Board of Anesthesiology (Advanced): 95%
  • American Board of Internal Medicine: 91%
  • American Board of Internal Medicine (Cardiology): 92%
  • American Board of Obstetrics and Gynecology (Qualifying): 91%
  • American Board of Obstetrics and Gynecology (Certifying): 91%
  • American Board of Orthopedic Surgery – Part I: 91%
  • American Board of Orthopedic Surgery – Part II: 96%
  • American Board of Pediatrics: 87%
  • American Board of Surgery (Qualifying): 96%
  • American Board of Surgery (Certifying): 85%

I could go on, but I think I’ve made my point: name the high-stakes exam, and the pass rate is gonna be between around 85% and 96%.

Weird, huh?

Is it really the case that 4-15% of physicians at every level of the training pyramid are incompetent? Of course not. But a 4-15% fail rate is what examinees expect – and what the market will bear.
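To check that range against the list above, here is a minimal Python sketch. It assumes nothing beyond the pass rates already quoted in this post; the variable names are mine, chosen only for illustration.

```python
# First-time pass rates (percent), copied from the list above.
pass_rates = {
    "USMLE Step 1": 96,
    "USMLE Step 2 CS": 95,
    "COMLEX Level 1": 94,
    "COMLEX Level 2-PE": 92,
    "American Board of Anesthesiology (Basic)": 91,
    "American Board of Anesthesiology (Advanced)": 95,
    "American Board of Internal Medicine": 91,
    "American Board of Internal Medicine (Cardiology)": 92,
    "American Board of Obstetrics and Gynecology (Qualifying)": 91,
    "American Board of Obstetrics and Gynecology (Certifying)": 91,
    "American Board of Orthopedic Surgery - Part I": 91,
    "American Board of Orthopedic Surgery - Part II": 96,
    "American Board of Pediatrics": 87,
    "American Board of Surgery (Qualifying)": 96,
    "American Board of Surgery (Certifying)": 85,
}

# A fail rate is simply the complement of the pass rate.
fail_rates = {exam: 100 - rate for exam, rate in pass_rates.items()}

lowest_pass = min(pass_rates.values())
highest_pass = max(pass_rates.values())
lowest_fail = min(fail_rates.values())
highest_fail = max(fail_rates.values())

print(f"Pass rates span {lowest_pass}% to {highest_pass}%")
print(f"Fail rates span {lowest_fail}% to {highest_fail}%")
```

Fifteen exams, covering everything from basic science to surgical oral boards, and the whole spread fits inside an 11-point window.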

Make a high-stakes exam that fails more than around 15% of its test-takers, and you’d better prepare to take heavy fire. On the other hand, if you make a test that too many examinees pass, it won’t take too long for them to start questioning why they have to take it in the first place.

In part, these exams all adjust their passing standard by using survey data.

For instance, every few years, the USMLE asks a broad swath of stakeholders what they believe the ideal fail rate for its exams should be, and whether the current passing rate is “too high,” “too low,” or “about right.”

An example of survey data used to set passing standards for high stakes exams – here, the old NBME Part I, II, and III exams, the immediate predecessors of the USMLE. (Source)

I’ve written before about how off-putting the premise is: an examination should not need to fail any particular number of examinees to be perceived as valid.

So how are we gonna improve Step 2 CS? Let’s take some lessons from a different test of clinical skills with which most physicians will be familiar: the American Heart Association’s Advanced Cardiovascular Life Support (ACLS) course exam.

The American Heart Association doesn’t fret about what the “ideal fail rate” for the ACLS exam might be. Instead, they set a passing standard that is simple and straightforward. You either performed appropriate CPR, or you didn’t. You either recognized V-tach, or you didn’t. If everyone in the class successfully runs the megacode, everyone in the class passes – and it’s obvious what specific competencies they each possess.

This is the kind of compact we need to have with the new USMLE Step 2 CS exam. Make the passing standard valid and unambiguous. Let everyone know what they have to do to pass the test – and when they do it, you can celebrate instead of rushing to reconfigure the grading so more will fail next time.
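For the programmers in the audience, that kind of pass/fail rule can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the competency names and the three examinees are hypothetical, and this is not how the AHA or the USMLE actually score anything.

```python
# Hypothetical competency checklist, loosely based on the ACLS examples above.
REQUIRED_COMPETENCIES = frozenset(
    {"appropriate_cpr", "recognized_vtach", "ran_megacode"}
)


def passes(demonstrated):
    """Pass only if every required competency was demonstrated."""
    return REQUIRED_COMPETENCIES <= set(demonstrated)


def missing(demonstrated):
    """List the required competencies that were not demonstrated."""
    return sorted(REQUIRED_COMPETENCIES - set(demonstrated))


# Hypothetical class of three examinees.
class_results = {
    "examinee_a": {"appropriate_cpr", "recognized_vtach", "ran_megacode"},
    "examinee_b": {"appropriate_cpr", "recognized_vtach", "ran_megacode"},
    "examinee_c": {"appropriate_cpr", "ran_megacode"},
}

# Each outcome depends only on the checklist, never on classmates' results.
outcomes = {name: passes(done) for name, done in class_results.items()}

for name, passed in outcomes.items():
    status = "pass" if passed else f"fail, missing {missing(class_results[name])}"
    print(f"{name}: {status}")
```

Notice what is absent: there is no target fail rate anywhere in the logic. If examinee_c goes back and learns to recognize V-tach, the class pass rate becomes 100% and nothing needs to be recalibrated.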

5. Low educational value

After students had the gall to point out that the Step 2 CS pass rate was so high that it would require a seven-figure investment to find even a single student who fails twice, the USMLE authorities seized the opportunity to move the goalposts.

Oh, you sweet summer child, we don’t have to fail anyone for our exam to have value! they said. The real value of our exam is that it has improved medical school curricula.

There is, of course, a nugget of truth amidst this gaslighting. Students will focus their learning on what is tested.

However, the educational value of a test is negated when students don’t know what the test will cover, and when test results come out of a black box.

How is USMLE Step 2 CS graded? Beats me. The official description is light on specifics.

So if you want to make Step 2 CS better, tell students exactly what you want them to learn and how they’ll be evaluated. Let them rise to the bar you set.

In-flight medical emergencies occur on 1 in 604 flights. Be prepared.

Finally, cases should be chosen carefully to maximize the educational value of preparing for the exam. Instead of just pulling cases from the USMLE’s grab bag of generic CS cases, imagine what would happen if we changed one case to – and I’m just spitballing here – an in-flight emergency.

Is there a doctor on board?

Notice, the captain doesn’t ask for an emergency physician. They ask for a doctor, operating from the premise that even an ophthalmologist or pathologist should be able to offer assistance to a patient in need. Shouldn’t it be a condition of licensure that all physicians have demonstrated the ability to respond to a certain number of common, potentially-life-threatening situations that may occur in public? I think so.

And if an in-flight emergency is too specific for you, then let’s go back to our ACLS example above, and use their mega-code cases.

Who cares if examinees know ahead of time what they’re going to be tested on? (Once you stop worrying about maintaining an arbitrary fail rate, you can stop worrying so much about test security, too.)

High-stakes tests have incredible power to motivate learning. We should think about how to use that power for maximum benefit. Do it right, and somewhere, on some airplane or soccer field or grocery store, someone’s life will be saved by us incentivizing students to learn how to respond to an emergency.


For students, there’s no way to win playing USMLE Step 2 CS.

6. A stick… with no carrot

From a student’s standpoint, the Step 2 CS exam is a no-win proposition.

If you pass the exam, great. You’re supposed to pass it.

But if you fail the exam? Not only is your wallet $1300 lighter, but your entire career trajectory may be permanently altered.

Compare that to the USMLE’s Step 1 and Step 2 CK exams. Why don’t students complain about them the same way they do about Step 2 CS? (After all, Step 1 and Step 2 CS have roughly the same fail rate.)

I think students tolerate these tests because they at least have the chance to benefit by scoring highly. If Step 1 is like playing Monopoly, Step 2 CS is like playing Russian roulette.

So how can we give students a chance to benefit from the exam – a carrot to go along with the stick?

One way would be to evaluate skills that program directors care about.

You’re interested in anesthesia? Okay, then maybe instead of wasting your time doing a contrived 15 minute office encounter for a standardized patient with low back pain, let’s have you work through an airway case.

You want to do radiology? Cool. Let’s have you start working through the films in the queue and see what your discrepancy rate is. Maybe we’ll even have you answer a phone call from a standardized clinician while you’re at it.

You want to do internal medicine? We’re gonna give you a real whodunit of a case and have you work through it Clinical Problem Solvers style so program directors can see how you think.

You want to be a surgeon? Then you get to suture up some bowel. We’ll even grade it objectively by pumping water into your anastomosis at a specific water pressure to see if it’s watertight.

(Does that last example sound far-fetched? Well, this very thing was actually a component of the first NBME exam… in 1916. Remember, also, that the original, comprehensive, single-step NBME exam cost only $5 – the equivalent of around $64 today. It’s okay to expect a little more from our benevolent overlords.)

And if you think these examples I gave are too specialized, fine. They’re here for argument’s sake, and don’t necessarily reflect skills that program directors would realistically expect interns to have on July 1.

My point isn’t the specifics of the cases. It’s the general idea that if we’re going to keep Step 2 CS, we need to give students a chance to win, not just avoid losing.

I’ve argued before that program directors need more “game film” on the candidates that they’re evaluating. Instead of relying on some third-hand description of an applicant’s skills from a random letter of recommendation writer, let interested PDs view a short clip of the applicant in action in a standardized encounter and judge for themselves. If you want a level playing field for all applicants, this is as level as it gets.

Will all PDs want to watch this? Of course not. But as long as some do, students will benefit.

Right now, most students who pass CS are left with a $1300 hole in their checking account and no more additional skills than they would have obtained anyhow. Turn one of the CS encounters into a ‘showcase,’ though, and students will focus on improving their skills in whatever that showcase measures. As long as we choose to measure real-world skills, then students will aspire to perform those skills well – and both the student and their future patients will benefit.

The way forward

Look, I don’t like Step 2 CS – at least, not the high cost, low value, unvalidated version of it that we’ve been force-fed since 2004. But right now, there is not a realistic path to end the exam altogether (and I’ve thought about it a lot). However, there may be an opportunity to reduce the harms and increase the value of the test moving forward. This is where we should advocate – forcefully – while we have the chance.

And honestly? Give me a Step 2 CS exam that costs less, is taken locally, transparently tests useful physician skills, and has been validated for outcomes that matter – and you won’t hear me agitating about ending it.

