4.2 Simpson’s Paradox

As in the Two-Way Tables tutorial, suppose a drug company wants to evaluate two new drugs in development, New Drug 1 (D1) and New Drug 2 (D2), for alleviating the symptoms of Disease Y. They want to test both against the current standard drug (ST). They enroll 1000 people in a large clinical trial and find that:

  1. out of the 400 people put on D1, 200 found their health status improve,
  2. out of the 200 people put on D2, 150 found their health status improve, and
  3. out of the 400 people put on ST, 240 found their health status improve.

You can create the table directly in R:

drug <- matrix(c(200, 200, 150, 50, 240, 160), ncol = 2, byrow = TRUE)
colnames(drug) <- c("Improved", "NotImproved")
rownames(drug) <- c("D1", "D2", "ST")
drug
##    Improved NotImproved
## D1      200         200
## D2      150          50
## ST      240         160

Recall that we decided to drop the D1 drug treatment from consideration because it appeared to be less effective than the standard treatment. Here we will take another look at that result.

Let’s create the 2x2 table for D1 and ST:

drug1 <- drug[c(1,3),]
drug1
##    Improved NotImproved
## D1      200         200
## ST      240         160

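What happens if we break this down by sex? We enter the table for females (drug1.f) directly and obtain the table for males (drug1.m) by subtracting it from the combined table: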

drug1.f <- matrix(c(80, 20, 210, 90), ncol = 2, byrow=TRUE)
colnames(drug1.f) <- c("Improved", "NotImproved")
rownames(drug1.f) <- c("D1", "ST")
drug1.f
##    Improved NotImproved
## D1       80          20
## ST      210          90
drug1.m <- drug1 - drug1.f
drug1.m
##    Improved NotImproved
## D1      120         180
## ST       30          70
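
The two sex-specific tables can also be stacked into a single three-way array, which makes it easy to confirm that collapsing over sex recovers the combined table. The following is a minimal sketch: the name drug3 and the dimension labels are assumptions introduced here for illustration, and the matrices are re-entered so the block runs on its own.

```r
# Re-enter the D1/ST tables so this block is self-contained
dn <- list(c("D1", "ST"), c("Improved", "NotImproved"))
drug1   <- matrix(c(200, 200, 240, 160), ncol = 2, byrow = TRUE, dimnames = dn)
drug1.f <- matrix(c(80, 20, 210, 90),    ncol = 2, byrow = TRUE, dimnames = dn)
drug1.m <- drug1 - drug1.f

# Stack the female and male tables along a third (sex) dimension;
# drug3 is an illustrative name, not part of the original tutorial
drug3 <- array(c(drug1.f, drug1.m), dim = c(2, 2, 2),
               dimnames = list(Drug    = c("D1", "ST"),
                               Outcome = c("Improved", "NotImproved"),
                               Sex     = c("Female", "Male")))

# Summing over the sex dimension should give back the combined table
collapsed <- apply(drug3, c(1, 2), sum)
all(collapsed == drug1)

# Proportions within each drug-by-sex combination
prop.table(drug3, margin = c(1, 3))
```

Keeping the counts in one array means the sex-specific and combined views always come from the same numbers, rather than from separately typed matrices.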

We calculate the row conditional proportions (the proportion improved within each treatment) for the female, male, and combined tables:

prop.table(drug1.f, margin = 1)
##    Improved NotImproved
## D1      0.8         0.2
## ST      0.7         0.3
prop.table(drug1.m, margin = 1)
##    Improved NotImproved
## D1      0.4         0.6
## ST      0.3         0.7
prop.table(drug1, margin = 1)
##    Improved NotImproved
## D1      0.5         0.5
## ST      0.6         0.4

Thus 80% (80/100) of females improved on the new drug in comparison with 70% (210/300) for the standard drug. For males, 40% (120/300) improved on the new drug and 30% (30/100) on the standard drug. So both sexes did better on the new drug.

But when we combine the data and ignore the sex variable, 50% (200/400) improved on the new drug in comparison with 60% (240/400) on the standard drug. In contrast to the sex-specific results, the overall result is that the patients did WORSE on the new drug.
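
To see the reversal at a glance, the improvement rates can be placed side by side and the D1 minus ST difference computed for each column. This is only a sketch: the object names rates and diff.D1.ST are illustrative assumptions, and the tables are re-entered so the block is self-contained.

```r
# Re-enter the D1/ST tables so this block is self-contained
dn <- list(c("D1", "ST"), c("Improved", "NotImproved"))
drug1   <- matrix(c(200, 200, 240, 160), ncol = 2, byrow = TRUE, dimnames = dn)
drug1.f <- matrix(c(80, 20, 210, 90),    ncol = 2, byrow = TRUE, dimnames = dn)
drug1.m <- drug1 - drug1.f

# Proportion improved in each treatment group, by sex and combined
rates <- cbind(Female   = prop.table(drug1.f, margin = 1)[, "Improved"],
               Male     = prop.table(drug1.m, margin = 1)[, "Improved"],
               Combined = prop.table(drug1,   margin = 1)[, "Improved"])
rates

# Difference in improvement rate, D1 minus ST:
# positive within each sex, negative once the sexes are pooled
diff.D1.ST <- rates["D1", ] - rates["ST", ]
diff.D1.ST
```

With the counts above, the differences are +0.1 for females, +0.1 for males, and -0.1 for the combined table.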

The seemingly conflicting results might be summarized for a lay person by saying that females did better on the new drug, that males also did better on the new drug, but that humans as a whole did worse. Thus, this is an example of Simpson’s Paradox.

Two factors produce the paradox here:

  1. the condition appeared to be harder to treat in males, regardless of drug (improvement rates of 40% and 30% for males versus 80% and 70% for females); and
  2. relatively more men than women were given the new drug (300 men versus 100 women), while the reverse holds for the standard drug (100 men versus 300 women). The men’s lower improvement rate therefore dominates the combined result for the new drug, and the women’s higher improvement rate dominates the combined result for the standard drug.
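
Both factors can be checked numerically: the combined improvement rate for each drug is a weighted average of the sex-specific rates, with weights given by the share of each sex in that treatment group. The sketch below re-enters the tables so it runs on its own. Names such as w.f and p.equal are illustrative, and the equal 50/50 sex mix in the last step is a hypothetical scenario, not trial data.

```r
# Re-enter the D1/ST tables so this block is self-contained
dn <- list(c("D1", "ST"), c("Improved", "NotImproved"))
drug1   <- matrix(c(200, 200, 240, 160), ncol = 2, byrow = TRUE, dimnames = dn)
drug1.f <- matrix(c(80, 20, 210, 90),    ncol = 2, byrow = TRUE, dimnames = dn)
drug1.m <- drug1 - drug1.f

# Number of females and males in each treatment group
n.f <- rowSums(drug1.f)             # D1: 100, ST: 300
n.m <- rowSums(drug1.m)             # D1: 300, ST: 100

# Share of females in each group (the male share is 1 - w.f)
w.f <- n.f / (n.f + n.m)            # D1: 0.25, ST: 0.75

# Sex-specific improvement rates
p.f <- drug1.f[, "Improved"] / n.f  # D1: 0.8, ST: 0.7
p.m <- drug1.m[, "Improved"] / n.m  # D1: 0.4, ST: 0.3

# The weighted average reproduces the combined rates (D1: 0.5, ST: 0.6)
p.combined <- w.f * p.f + (1 - w.f) * p.m
p.combined

# Hypothetical: if both groups had been half female and half male,
# D1 would come out ahead (D1: 0.6, ST: 0.5)
p.equal <- 0.5 * p.f + 0.5 * p.m
p.equal
```

The sex-specific rates are identical in both calculations; only the weights change, which is what flips the direction of the combined comparison.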

Finally, Simpson’s paradox can also appear in continuous data. For an illustration, see this example on the median wage: Median wage Simpson’s paradox example