
Reflections on Sir David Cox

Published on Apr 27, 2023
This Pub is a Commentary on Remembering David Cox

While I did not have the pleasure of meeting David Cox personally, I have heard much about his intellect and generosity. I’ll share one story that stuck with me.

A friend of mine gave a talk at Oxford University, two months into his PhD studies. A few days later, David Cox sent him an email praising some aspects of the talk and suggesting something to try, which helped my friend realize he had made a mistake. The email was phrased in an encouraging and gentle way, without directly telling my friend he was wrong. It is clear that he cared about the graduate students around him and valued their opinions.

On the technical side, his contributions were immense. I was inspired by his famous example in Cox (1958), which discusses whether one should condition for inference or not. In David Cox’s words: “Suppose that we are interested in the mean $\theta$ of a normal population and that, by an objective randomization device, we draw either (i) with probability $\frac{1}{2}$, one observation, $x$, from a normal population of mean $\theta$ and variance $\sigma_1^2$ or (ii) with probability $\frac{1}{2}$, one observation, $x$, from a normal population of mean $\theta$ and variance $\sigma_2^2$, where $\sigma_1^2$, $\sigma_2^2$ are known, $\sigma_1^2 \gg \sigma_2^2$, and where we know in any particular instance which population has been sampled.” He then presents the unconditional Neyman-Pearson test for this problem and discusses how it differs from a test that conditions on the population that has been sampled. In his words: “If the object of the analysis is to make statements by a rule with certain specified long-run properties, the unconditional test just given is in order (...). If, however, our object is to say ‘what we can learn from the data that we have’, the unconditional test is surely no good.”
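To make the contrast concrete, here is a minimal simulation sketch. It is not from Cox’s paper: the fixed-width interval below is just one stand-in for a procedure calibrated only on its long-run average behavior, and all numerical choices ($\sigma_1 = 10$, $\sigma_2 = 1$, the constant $c$) are mine for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.0
sigma1, sigma2 = 10.0, 1.0   # known standard deviations, sigma1^2 >> sigma2^2
n_reps = 200_000

# An objective randomization device: a fair coin picks the population,
# and we draw a single observation x from it.
coin = rng.integers(0, 2, size=n_reps)    # 0 -> variance sigma1^2, 1 -> sigma2^2
sigma = np.where(coin == 0, sigma1, sigma2)
x = rng.normal(theta, sigma)

# Conditional 95% interval: its width uses the variance of the population
# we know was sampled.
cond = np.abs(x - theta) <= 1.96 * sigma

# A fixed-width interval calibrated only for 95% *average* coverage:
# 0.5 * P(|Z| <= c/sigma1) + 0.5 * P(|Z| <= c/sigma2) = 0.95.
# With sigma1 = 10 and sigma2 = 1 the second term is essentially 1,
# so c/sigma1 must satisfy P(|Z| <= c/sigma1) = 0.90, i.e., c/sigma1 = 1.645.
c = 1.645 * sigma1
uncond = np.abs(x - theta) <= c

for name, cover in [("conditional", cond), ("unconditional", uncond)]:
    print(f"{name:>13}: overall {cover.mean():.3f}, "
          f"given sigma1 {cover[coin == 0].mean():.3f}, "
          f"given sigma2 {cover[coin == 1].mean():.3f}")
```

Both procedures cover 95% of the time on average, but the fixed-width interval covers only about 90% of the time when the imprecise population was sampled and essentially always otherwise. Conditioning on which population was sampled restores 95% coverage in each case, which is what matters if we ask what we can learn from the data we have.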

The question ‘What is the relevant source of variation?’ has stuck with me ever since. One can ask whether there are sources of uncertainty that standard statistical approaches miss. For example, in the big data era, the sample size might be on the order of $10^6$, leading to confidence intervals that are tiny. Is this a realistic way to quantify uncertainty? Probably not! Due to batch effects, distribution shifts, or other types of distributional errors, there might be considerable discrepancies between the distribution of interest and the distribution we sample from. This error can be of a larger order than the statistical uncertainty. Should we construct confidence intervals that account not only for sampling uncertainty but also for some form of distributional uncertainty? How can we estimate distributional uncertainty? These are problems I have worked on recently.
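A toy simulation illustrates the point (my own sketch; the shift of 0.01 is an arbitrary assumed batch effect, not an estimate from any data). With $n = 10^6$, the sampling-based 95% confidence interval for the mean has half-width around 0.002, so even a small discrepancy between the sampled and target distributions makes it miss in essentially every run.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10**6          # "big data" sample size
mu_target = 0.0    # mean of the distribution we actually care about
shift = 0.01       # assumed batch effect: we sample from a shifted distribution

covered = 0
n_runs = 100
for _ in range(n_runs):
    x = rng.normal(mu_target + shift, 1.0, size=n)
    half = 1.96 * x.std(ddof=1) / np.sqrt(n)   # half-width of about 0.002
    covered += (x.mean() - half) <= mu_target <= (x.mean() + half)

print(f"nominal 95% CI covered the target mean in {covered}/{n_runs} runs")
# With a shift roughly five times the half-width, coverage is essentially zero.
```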

It is clear that David Cox’s contributions will shape statistical thinking for decades to come. Beyond that, I was particularly impressed to hear many stories about how humble and generous he was to the people around him. In an increasingly fast-paced and noisy world, these are qualities to be cherished and admired.


Disclosure Statement

Dominik Rothenhäusler has no financial or non-financial disclosures to share for this article.


References

Cox, D. R. (1958). Some problems connected with statistical inference. The Annals of Mathematical Statistics, 29(2), 357–372.


©2023 Dominik Rothenhäusler. This article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article.
