I would like to thank the editors of this issue for inviting me to describe how my interactions with Sir David Cox have shaped my thinking in statistics and data science and my career. What follows reflects my own perceptions and recollections.
I met David when I was a student at Oxford in 2009. He had been ‘retired’ for a number of years and worked at Nuffield College. I was incredibly lucky to have him as a supervisor for my doctoral studies and to continue working with him alongside my applied work in biostatistics and epidemiology. He was probably the best supervisor one could have and the nicest person I have ever met. He had a unique way of transferring his enthusiasm for scientific work and making everything feel more exciting and beautiful. I used to think of work as something that one has to do; he showed me how pleasurable work can be. He worked until January 2022.
Even though David had invented a lot of the fundamental statistical methods commonly used and had made several wide-ranging contributions to the field, he was always modest, kind, and encouraging, and made one feel at ease, despite saying “the job of a doctoral student is to make their supervisor learn something new.” He genuinely valued people and appreciated their strengths. Besides his influence on the more practical side of things (Cox proportional hazards models and logistic regression are at the core of biostatistics, and clinical trial results benefiting many of us are often based on estimating a hazard ratio), most of my work since has been influenced, directly or indirectly, by everything I learned from him and from several of his colleagues whom I was fortunate to meet and work with. My present for finishing my DPhil thesis was the book on case-control studies (Keogh & Cox, 2014), which inspired me to design and analyze several case-subcohort studies when I was tasked with planning studies to assay biomarkers among a subset of participants of a large cohort study.
I learned in different ways: from discussing a seminar that David and I had just attended, from discussions with other people we collaborated with, from casual conversations, and from exchanging notes with formulae (‘presents’). He probably never said ‘you should do x’; he had a unique way of teaching. He encouraged me to think critically, with skepticism but also with an open mind and a willingness to consider what other people say, regardless of status, focusing on the available evidence and the arguments presented. He encouraged people to rely on their judgment, valued other people’s time, and often thought that people who were just starting their career had valuable contributions to make, reflecting his egalitarian views. He believed that mentoring is a two-way process in which both the mentor and the mentee learn from each other. I learned to appreciate the varying quality of the published literature and to scrutinize what is considered standard. From our collaborations on applications, I learned to think carefully before running an analysis in statistical software, and to stop and think about what the output means before doing more analyses, despite the ease and speed with which many analyses can nowadays be done. These were points that he emphasized.
When faced with a research problem I did not know how best to approach, I often asked David for advice. The questions he asked for clarification and the suggestions he made were always full of insight and provided a glimpse into his process of tackling a scientific question. For many of my questions, there was usually a paper or a book that he had written, perhaps many years before, that was relevant for the current problem. He had a unique way of ‘thinking outside the box,’ not for the sake of inventing something ‘novel,’ but to answer a question that matters; he encouraged and valued original thinking, and taught me to question the question. This related to the emphasis he put on the importance and difficulty of problem formulation (Cox et al., 2020), and on the idea that it is better to arrive at an approximate answer to an important question than to obtain a precise answer to the ‘wrong’ question (‘ask a silly question and you’ll get a silly answer’). He also emphasized study design and interpretation of findings.
Statistics is often taught in a compartmentalized way, split into modules. From David I instead gained an appreciation of the subject as a whole, as a continuum of theory, methods, and applications that runs across the subfields of statistics. He regarded statistics as a set of scientific principles, a component of science tightly interrelated with the subject-matter considerations (Cox, 2017) and “not a ritualistic procedure to be done at the end.” I realized that a good balance is to do both statistical and applied work; the two should not be separated. He valued original thinking, the transferability of ideas across fields, and breadth of knowledge over too narrow a focus on a very specialized issue.
He had a unique ability to distill what was important out of anything, whether a paper, a talk, a scientific problem, or any other interaction. Instead of focusing strictly on assumptions, I learned to consider whether a particular assumption is important and what the implications are if it is not plausible. His papers were concise and focused on the essence rather than on the details, although the details were included in an eloquent way. He often said “everything takes longer than you expect” and pointed out that scientific work never finishes in one sense but that one has to decide when enough has been done on a particular piece of work.
I learned the importance of approaching a problem from first principles and of distinguishing between the different objectives of statistical procedures. The main distinction is between estimation and prediction, with a preference for estimation and explanation but with caution toward strong claims of causality. I also learned that time itself cannot be a cause of something. I came to value parametric procedures and their interpretability more, and to appreciate that nonparametric procedures, albeit free of distributional assumptions, are not free of other, often implicit, assumptions.
Among the points that David raised in recent years was the issue of unrealistically small estimated standard errors, in particular in the analysis of large data sets, in which the assumption of independence of individuals is often not plausible (Cox, 2015). Some of his recent contributions with direct implications for applied science were related to the testing of several hypotheses, partly inspired by his work at CERN (Cox, 2011) and recent controversies on the interpretation of p-values (Cox, 2016, 2020).
I think David influenced not only my statistical thinking but also my general way of thinking. I hope to have learned from him how to supervise students, how to focus on what matters most (“golden rule of administration: no discussion meeting should last more than an hour”), and how to be more thoughtful and considerate of others. And perhaps most importantly, as he sometimes said when one cannot figure out something in research, to never give up. His persistence in working through any difficulty, over more than 50 years of employment and more than 25 years of pseudo-retirement, has been an inspiration to me to try to get through any difficulties I encounter.
I genuinely enjoyed every minute of working with him and I am very grateful for everything. I can only hope to pay even a small proportion of it forward.
Christiana Kartsonaki has no financial or non-financial disclosures to share for this article.
Cox, D. R. (2015). Big data and precision. Biometrika, 102(3), 712–716. https://doi.org/10.1093/biomet/asv033
Cox, D. R. (2011). Discovery: A statistical perspective. In H. B. Prosper & L. Lyons (Eds.), PHYSTAT2011 Workshop (pp. 12–16). CERN. https://cds.cern.ch/record/2203237/files/1087459_12-16.pdf
Cox, D. R. (2017). Statistical science: A grammar for research. European Journal of Epidemiology, 32, 465–471. https://doi.org/10.1007/s10654-017-0288-1
Cox, D. R. (2020). Statistical significance. Annual Review of Statistics and Its Application, 7, 1–10. https://doi.org/10.1146/annurev-statistics-031219-041051
Cox, D. R. (2016). Statistical significance tests. Diagnostic Histopathology, 22(7), 243–245. https://www.nuffield.ox.ac.uk/users/cox/cox142%20(1).pdf
Cox, D. R., Kartsonaki, C., & Keogh, R. H. (2020). Statistical science: Some current challenges. Harvard Data Science Review, 2(3). https://doi.org/10.1162/99608f92.a6699bda
Keogh, R. H., & Cox, D. R. (2014). Case-control studies. Cambridge University Press. https://doi.org/10.1017/CBO9781139094757
©2023 Christiana Kartsonaki. This article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article.