Posted by: Chris Cole | September 13, 2013

I tawt I taw a pulmonary embolism!! The murky waters of hunting for PEs in the Emergency Dept.

This post is long enough without any further preamble, but basically this is a cut & paste from a response I made to a discussion on Dr Casey Parker’s exceptionally awesome medical education website & blog at Broome Docs. This follows some days and weeks (and months really) of acute-on-chronic back and forth between several luminaries of the resuscitative care world (like Casey, Scott Weingart, Minh Le Cong, Anand Senthi, Ryan Radecki, Seth Trueger, etc.) and interested spectators (like me) on the topic of how extensively we need to investigate patients presenting to ED with symptoms and signs that might represent a pulmonary embolism (a blood clot jammed in the arteries in your lungs).


The short short version is that PE is relatively common, the test for finding one is harmful in itself (but we don’t know how harmful, exactly), PE can kill you or leave you crippled (but little ones don’t), we’ve never proven that treating all of them (especially the little ones) actually saves lives, and we do indeed treat them all (with blood thinning medications), even the little ones. We use various scoring systems (based on things like heart rate, oxygen levels, whether you have cancer, etc.) to guesstimate the chances that you might have a PE, and the idea is that if the chance you will be harmed by the PE you might have is greater than the chance you will be harmed by the test we use to look for it, then we crack on and do the test. Some of the numbers used to estimate those chances are pretty flexible and poorly defined, though, so there is a lot of angst surrounding just how we should go about rationing out the tests and the treatment in order to avoid doing more harm than good. Oh, and one more thing… the term “clinical gestalt” basically means a gut feeling on the part of the doctor. An informed, educated gut feeling, but nonetheless it still amounts to essentially deciding something based on feeling a disturbance in The Force.


So, we continue on into the slightly disjointed responses I made to a few points that have come up:

1.  Delineate / separate the two distinct questions of (a) current PE vs. (b) risk of next PE

I’ve just listened to Scott’s response, and he made this point beautifully. I could not agree more. When we see a patient in ED with possible PE, we should break our inquiry into two questions. Does the patient have a PE right now? If so, is it causing enough trouble for me to care about any imminent threat to their well-being? (<– I’m claiming artistic license here and calling that one question, by the way; given my sample size of 3 things, 1 = 2 is well within the boundaries of a 95% CI). And secondly, even if I’m happy the PE I think is in their lungs is not a major problem today, if there is a PE there, what is the risk of the _next_ PE being haemodynamically or mortally significant, and what should I do about it?

Frustratingly for the “Avoid CTPA” camp (and in the interests of full disclosure, I’m pitching my own tent firmly on their patch of turf), the answer to the first question does somewhat inform the answer to the second one. What I would love to know is just how much one’s risk of a fatal/crippling/“bad” PE in a given time period (say, the next 12 months) increases if one has a haemodynamically irrelevant or subsegmental PE today. Even answering the posterior, converse question would be handy: of those patients who have a “bad” PE, how many had a sentinel/warning smaller PE that was symptomatic enough to bring them to hospital?

If the answer to both of those sub-questions is “not very much”, then our quest to answer the first big question can afford to be more relaxed. If I know that missing a subsegmental PE in a patient who is low-risk according to clinical decision rules will only result in, say, a 0.1% absolute risk increase for “bad” PE in the next 12 months, I will be much happier not ordering that CTPA. If, on the other hand, I know that 10% of “bad” PEs are preceded by an ED-presentation-inducing sentinel subsegmental PE, or that there’s an annual increased absolute risk of “bad” PE of, say, 5%, then there is a more pressing need to find or exclude the current possible PE.
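To make that trade-off concrete, here is a rough sketch of the comparison in Python. Every number in it is a hypothetical placeholder (the pre-test probability for a "low-risk" patient and the harm of the CTPA pathway are both assumptions, since nobody has pinned them down), and it naively assumes that finding and treating the PE removes the extra risk entirely. The function names are mine, purely for illustration.

```python
# All numbers here are hypothetical placeholders, not clinical estimates.

def expected_harm_of_not_testing(p_pe, risk_increase_if_missed):
    """Chance of a later "bad" PE attributable to a PE we never looked for."""
    return p_pe * risk_increase_if_missed


def should_order_ctpa(p_pe, risk_increase_if_missed, harm_of_test):
    """True when skipping the scan is expected to hurt more than doing it."""
    return expected_harm_of_not_testing(p_pe, risk_increase_if_missed) > harm_of_test


P_PE_LOW_RISK = 0.05  # assumed pre-test probability in a "low-risk" patient
HARM_OF_CTPA = 0.001  # assumed net harm of CTPA plus downstream treatment

# The two scenarios from the paragraph above: 0.1% vs 5% absolute risk increase.
for label, risk_increase in [("0.1% scenario", 0.001), ("5% scenario", 0.05)]:
    decision = should_order_ctpa(P_PE_LOW_RISK, risk_increase, HARM_OF_CTPA)
    print(label, "-> order CTPA" if decision else "-> skip CTPA")
```

With these made-up inputs the 0.1% scenario falls below the harm threshold and the 5% scenario rises above it, which is the whole argument in two lines of arithmetic. Change the placeholders and the answer can flip, which is rather the point.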

As we all know, it’s bloody easy to confidently exclude a clinically significant current PE using nothing more exotic than vital signs, ultrasound, ECG and Dr Weingart’s patented Looks Like Shit ™ or LLS score. (I’d love to do a prospective trial of the LLS score for massive/submassive/lysable PE, by the way). Unfortunately, until we get a better handle on how a non-haemodynamically significant PE affects the attributable risk of future “bad” PE, we just don’t have enough information to make a fully informed decision regarding how hard to look for that small current PE.

It being nearly 3am, and running as I am on dark mint chocolate and tea, I hereby invoke the conceptual model of “Serial Schrödinger’s Cats”:

– The existence of the current small PE is represented by the mortal state of the cat in box #1. The probability of the atom decaying, the poison being released and killing the cat is ~20% (i.e. roughly the incidence of PE in those presenting to ED with what we think is ?PE)

– Doing the CTPA opens the box, collapses the probability waveform, and tells us if there’s a PE (i.e. if the cat is now, to paraphrase Monty Python, an ex-cat).

– The occurrence of a “bad” PE in the next 12 months is represented by the state of the cat in box #2.

– The discovery of a dead cat in box #1 cascades to alter the probability waveform of box #2. (We presume that the probability of a dead cat #2 was close to zero to begin with, but the reality of a dead cat #1 puts his counterpart at greater risk.)

– We NEVER OPEN the second box, unless the patient presents with a lethal PE some time after we missed the initial one, or they have a massive PE whilst already anticoagulated. This is a consequence of _everybody_ being anticoagulated immediately if we find a dead cat in box #1.

– All of our efforts and discussions to this point therefore revolve around finding a way to AVOID OPENING BOX #1.

It’s a bit like Gene Hackman’s advice to Tom Cruise in “The Firm” when handing him a sealed envelope containing his new job offer and salary details: “A good attorney wouldn’t have to ask what’s in the envelope”. We strive to divine as accurately as possible the state of health of cat #1 without cheating and peeking in the box. But the _only reason_ we care about cat #1 is the effect his or her demise has on cat #2. However, anticoagulation destroys the quantum entanglement that links the two cats, and prevents us from determining the conditional probabilities upon which to base our estimates of how cat #1 is doing, without looking in the box.

The point is: We need to start opening box #2, to confidently determine when it’s safe not to open box #1.

*(There is a more complex model, involving a 3rd cat representing the harms of CTPA + anticoagulation, but… we’re just not going there tonight… )
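For those who prefer arithmetic to cats, the two-box model is just the law of total probability. The sketch below is a minimal illustration, not an estimate: the ~20% figure for box #1 comes from the model above, the default of zero for a dead cat #2 when cat #1 is alive reflects the "close to zero" presumption, and the conditional risk that links the boxes is exactly the number we don't have, so it is swept over hypothetical values. The function name is my own.

```python
P_DEAD_CAT_1 = 0.20  # from the model above: ~20% of ?PE presentations have a PE


def p_dead_cat_2(p_cat2_given_cat1_dead, p_cat2_given_cat1_alive=0.0):
    """Law of total probability for box #2 if box #1 stays shut
    and nobody is anticoagulated."""
    return (P_DEAD_CAT_1 * p_cat2_given_cat1_dead
            + (1 - P_DEAD_CAT_1) * p_cat2_given_cat1_alive)


# The conditional risk is the unknown; these values are hypothetical.
for conditional_risk in (0.001, 0.01, 0.05):
    overall = p_dead_cat_2(conditional_risk)
    print(f"P(cat #2 dead | cat #1 dead) = {conditional_risk:.3f} "
          f"-> P(cat #2 dead) = {overall:.4f}")
```

Because everybody with a dead cat #1 gets anticoagulated, the first argument to that function is never observed in practice, which is why box #2 needs opening.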

2. All gestalts are not created equal.

Yes, gestalt (or a combination of gestalt + clinical decision rule (CDR)) outperforms CDRs alone. But I can’t help but think that we should simply abandon the idea of clinical gestalt being a separate entity. For what is clinical gestalt? It is a set of factors or criteria, present or notably absent in our patient’s history and examination findings, which alters our notional pre-test probability of the patient having the pathology of interest, in this case a PE. What is a CDR or risk-stratification tool (such as Wells or Geneva)? I submit that they are precisely the same thing, with one notable qualitative difference: Wells, Geneva, etc. are an explicitly defined and consistent subset of all of the myriad variables we might consider when estimating that pre-test probability. Clinical gestalt is much more of a moveable feast; it is a highly user-dependent, non-uniform and undefined collection of some of those same, and some different, variables found in the CDRs.

This is important because, even though gestalt might come out looking pretty good in the wash when averaged across many physicians and many patients, there is likely to be much higher physician-to-physician and patient-to-patient variability in the accuracy of the assigned pre-test probability than there is with consistent application of a rigidly defined set of criteria. Thus, by employing gestalt, there will be a larger number of “outlying” patients who are either under- or over-investigated and/or treated, which by definition causes greater overall harm.
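A crude way to see why variability matters even when the average looks fine: treat each physician's estimate as the true pre-test probability plus some noise, and ask how often it lands on the wrong side of a testing threshold. The sketch below assumes Gaussian noise (a simplification, since a Gaussian can stray outside 0 to 1), and the threshold and the two standard deviations are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def p_wrong_side(true_prob, threshold, estimate_sd):
    """Chance that a noisy estimate of true_prob falls on the opposite side
    of the threshold, modelling estimation error as Gaussian (an assumption)."""
    estimate = NormalDist(mu=true_prob, sigma=estimate_sd)
    p_below = estimate.cdf(threshold)
    return p_below if true_prob >= threshold else 1 - p_below


TRUE_PROB = 0.10   # hypothetical true pre-test probability
THRESHOLD = 0.15   # hypothetical cut-off above which we would scan
SD_RULE = 0.02     # assumed spread for a rigidly applied rule
SD_GESTALT = 0.08  # assumed spread for gestalt

print("rule:   ", round(p_wrong_side(TRUE_PROB, THRESHOLD, SD_RULE), 3))
print("gestalt:", round(p_wrong_side(TRUE_PROB, THRESHOLD, SD_GESTALT), 3))
```

Both estimators are centred on the same true value, yet the noisier one sends far more of these patients to the scanner. Whether real gestalt is actually noisier by this much is an empirical question the sketch cannot answer.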

This is one of the reasons I favour Geneva over Wells (when used in this context), as it seems a bit recursive and self-referential to include, essentially, “Do I reckon this is probably a PE?” as one of the highest scoring components of a CDR which I am employing specifically to help me estimate the probability that this might be a PE. Geneva, at least, is more objective and consequently more reproducible across different observers/scorers.

The point is that all of these estimations use criteria that significantly overlap. So, I fret that the logic in saying things like:

“Gestalt has superior predictive power to CDR 1 or CDR 2, and gestalt + CDR 1 is even better!”

…when that essentially means, for discrete criteria {a,b,c…n} :

(a+b+c+d+e+f+x+y+z) is better than  (a+b+c+d)  or  (a+c+d+f+g+h)
(a+b+c+d+e+f+x+y+z) + (a+b+c+d)   is better than  (a+c+d+f+g+h)

…is a wee bit flawed, and we can probably do better. I admit that real world effects of synergism and the fact that the addition of risk factors is not a zero-sum game mandate that we must somewhat laboriously conduct large prospective trials of each subset of criteria that takes our fancy, rather than just applying Boolean logic to simplify the problem (though the idea does have some merit), but I suggest that there must be some way of more rigidly quantifying what it is that makes up our individual clinical gestalt guesstimations, other than waving our hands vaguely and ascribing it to “using the force”.
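The overlap in those lettered criteria is easy to check with plain set operations. This is only the toy lettering from the comparison above, treated as sets; as already admitted, it ignores synergy and weighting between criteria, so it illustrates the logical worry rather than replacing a trial.

```python
# Criteria sets, using the letters from the comparison above.
gestalt = set("abcdefxyz")
cdr_1 = set("abcd")
cdr_2 = set("acdfgh")

# Is every CDR 1 criterion already part of gestalt?
print("CDR 1 is a subset of gestalt:", cdr_1 <= gestalt)

# What does "gestalt + CDR 1" consider that gestalt alone does not?
print("new criteria from adding CDR 1:", sorted((gestalt | cdr_1) - gestalt))

# Where do gestalt and CDR 2 agree, and what does CDR 2 add?
print("shared by gestalt and CDR 2:", sorted(gestalt & cdr_2))
print("unique to CDR 2:", sorted(cdr_2 - gestalt))
```

In this lettering, CDR 1 sits entirely inside gestalt, so "gestalt + CDR 1" adds no new criteria at all; any improvement would have to come from double-counting or reweighting the same ones.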

If and when we can get a handle on a well-defined set of variables that in reality are what we actually use (even if not explicitly or “out loud”) to tip us over that critical “Hey, you know what, I reckon this just might be a PE… aww, crap!” point, we will be one step closer to a more rational evidence-based approach to estimating pre-test probability in a less volatile and noisy manner, so that practice will be more consistent and fewer patients will be unnecessarily harmed by “gestalt variation”.

3.  Let’s do an RCT.  No… really.

Sick of a lack of high quality evidence undermining your lovingly crafted prognostic algorithms? Tired of the unquestioned acceptance of dogma from a bygone era bursting your bubble of hope for a more objective predictive model? Take heart, for you are not alone!

Is anyone interested in seriously looking into the possibility of setting up a prospective RCT for this? Ethics approval is clearly a major issue given the ingrained nature of universal anticoagulation, but I suspect it is not unthinkable that a strong enough case can be made, on solid evidential grounds (or predominantly the lack of evidence, really), to crack on and actually learn something that will probably change practice.

