
Early Interactions With David Cox

Published on Apr 27, 2023

When I first mentioned to my undergraduate advisor, Professor Henry Daniels, that I might like to do a PhD in statistics, his immediate response was “Good. You should work with David Cox. He has lots of good ideas.” Up to that point, my meetings with Daniels had been mostly the usual sort of interaction between a quiet undergraduate and a senior academic: awkward and polite, but with little to converse about. In my case, Henry’s penchant for the provocative remark had led to at least one memorable and volatile moment. Nevertheless, he insisted on contacting David by phone to ask if there were any PhD openings at Imperial College. To my surprise, David responded promptly, and within a few days I was off to London for an interview.

As an undergraduate, my exposure to statistical research was limited, but not zero. Experimental design comprised a substantial part of the curriculum, so I was familiar with split-plot and fractional factorial designs, Yates’s algorithm, partial confounding, and so on. There was also a course on stochastic processes, so I was reasonably familiar with the book by Cox and Miller (1965), and Henry had encouraged me to study some of the classical techniques developed by N. T. J. Bailey in the 1950s for extracting information on limit distributions for epidemiological processes. Apart from that, Henry had assigned as an undergraduate term project a trio of papers on time-series analysis: a JRSS read paper by Coen, Gomme, and Kendall (1969) on forecasting, a follow-up paper by Box and Newbold (1971), and a much earlier paper by Bachelier (1900). As I saw it in my report, Kendall’s proposal contained a serious flaw, one that would have been recognized by Bachelier. Box’s remark on “the innate and insidious capacity [of Kendall’s method] to mislead” (p. 238) left no room for ambiguity. My impression from that assignment was that Henry must have had a low opinion of Maurice Kendall as a statistician, so the exposure of Kendall’s error must have given him a degree of intellectual satisfaction that he could not resist sharing with his advisees.

The interview with Cox must have gone reasonably smoothly for I do not recall much of it. As the interview came to a close, I asked if he could give me any assessment of the prospects. My recollection is that he replied something along the lines: “I’d put the probability somewhere around 80%.” But then he hesitated and corrected himself: “About 80% confidence level.” So, with that fine distinction to contemplate, I made my way back to Euston station and on to Birmingham.

My project as a graduate student was to develop statistical methods for epidemiological studies in various industries, mostly asbestos, granite, coal-mining, and the potteries. It was sponsored by the Health and Safety Executive. At that time, the health of workers in these industries was assessed radiologically, the X-ray images being categorized on the 10–12-point ILO radiological scale. Typical samples were cross-sectional, each worker being categorized by exposure class and radiological assessment. Other studies were longitudinal with short series observed for a large number of workers.

Looking back, David could have given more specific directions, but he did not. Instead, I was encouraged to explore all sorts of useful but not directly relevant topics such as response transformation in Box and Cox (1964), interesting dead ends such as nonmetric multidimensional scaling along the lines of Kruskal (1964), and scoring methods along the lines of section 49.2 of Fisher (1950). A prepublication typescript copy of Tukey’s Exploratory Data Analysis appeared on my desk sometime around 1976, presumably sent by Tukey to Cox for comments. The style was idiosyncratic and the material relevant in places, so I digested carefully the parts on froots and flogs. Despite the current upsurge in computation and data science, little of the content and none of the style of Tukey’s pioneering text survives in today’s curriculum.

In the mid-1970s, there was also a good deal of more conventional activity on contingency tables and log-linear models, with a rather challenging book by Haberman (1974) and a more leisurely text by Bishop et al. (1975). But I also read David’s 1958 paper and his 1970 book Binary Data rather carefully, and I could not fail to notice the drastic difference in emphasis between that and the log-linear framework. The Chicago-Harvard schools tended to look at a multiway table as a problem in multivariate analysis, with all variables on an equal footing. In the 1970s at least, Cox’s approach emphasized the dependence of a single response variable on multiple explanatory factors, which was more to my taste.

During the early 1980s, I had regular discussions with David on all sorts of topics: Biometrika submissions, survival models, cumulants, Edgeworth series, and so on. On the way to lunch one day, I asked him how one could have an uncorrelated sequence of zero-mean unit-variance random variables for which the central limit theorem or the associated Edgeworth expansion for the sample mean might break down. He loved that sort of specific challenge, so I was disappointed by his noncommittal reply “Mmm...,” which indicated little interest. The topic was not mentioned over lunch, so I decided that it was probably a silly question. On the way back after lunch, David took me aside. “In answer to your question, couldn’t you take a single standard Gaussian variable $\epsilon$ and transform it by the normalized Hermite polynomials $X_r = h_r(\epsilon)$? That would give you an uncorrelated zero-mean, unit-variance sequence, wouldn’t it? Probably the sequence of sample means wouldn’t have an Edgeworth expansion, would it?”

So, he had taken the question seriously, and I think his reply in the form of a question was genuinely because he couldn’t figure out the answer over lunch. Would any version of the central limit theorem apply? A related version of this construction takes $X_r = h_r(\epsilon_r)$, with independent $\epsilon_r$ so that the X-sequence has independent components. Forty years later, I’m still not sure that I understand the behavior of normalized sample averages in either setting, either the behavior of $n^{1/2}\bar{X}_n$ or $n^{1/2}\bar{X}_n/s_n$, which appear not to be equivalent. Certainly, the marginal density of $h_r(\epsilon)$ is highly irregular with locally asymmetric singularities in the density at certain critical points. So, although the variables are uncorrelated, I’m inclined to think that there is no central-limit theorem or Edgeworth series for the distribution of the sample mean.
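
For readers who would like to experiment with the construction, here is a minimal Python sketch. It assumes that "normalized Hermite polynomials" means the probabilists' polynomials scaled to unit variance, $h_r = He_r/\sqrt{r!}$, and the function names are hypothetical, chosen only for illustration. The first part checks by Gauss-Hermite quadrature that the $h_r(\epsilon)$ are uncorrelated with unit variance; the second simulates $n^{1/2}\bar{X}_n$ for both the shared-$\epsilon$ and the independent-$\epsilon_r$ versions. It does not settle the limiting behavior; it only makes the two sequences easy to look at.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def normalized_hermite_sequence(eps, n):
    """Return h_0(eps), ..., h_n(eps) stacked along axis 0.

    Assumes h_r = He_r / sqrt(r!), computed with the three-term recurrence
    h_{r+1}(x) = (x * h_r(x) - sqrt(r) * h_{r-1}(x)) / sqrt(r + 1).
    """
    eps = np.asarray(eps, dtype=float)
    h = np.empty((n + 1,) + eps.shape)
    h[0] = 1.0
    if n >= 1:
        h[1] = eps
    for r in range(1, n):
        h[r + 1] = (eps * h[r] - np.sqrt(r) * h[r - 1]) / np.sqrt(r + 1)
    return h


def gaussian_moment_matrix(max_order, quad_points=30):
    """E[h_r(eps) h_s(eps)] for 0 <= r, s <= max_order, eps ~ N(0, 1)."""
    nodes, weights = hermegauss(quad_points)
    weights = weights / weights.sum()  # rescale to N(0, 1) probability weights
    h = normalized_hermite_sequence(nodes, max_order)
    return (h * weights) @ h.T


def scaled_sample_means(n, replicates, seed=0, shared_eps=True):
    """Simulate sqrt(n) * mean(X_1, ..., X_n), one value per replicate.

    shared_eps=True  : X_r = h_r(eps), one Gaussian draw per replicate.
    shared_eps=False : X_r = h_r(eps_r), independent draws for each r.
    """
    rng = np.random.default_rng(seed)
    if shared_eps:
        eps = rng.standard_normal(replicates)
        x = normalized_hermite_sequence(eps, n)[1:]  # rows r = 1..n
    else:
        eps = rng.standard_normal((n, replicates))
        h = normalized_hermite_sequence(eps, n)  # shape (n + 1, n, replicates)
        x = h[np.arange(1, n + 1), np.arange(n)]  # row r-1 holds h_r(eps_r)
    return np.sqrt(n) * x.mean(axis=0)


if __name__ == "__main__":
    # Orders 1..6 should give (numerically) the identity matrix.
    print(np.round(gaussian_moment_matrix(6)[1:, 1:], 6))
    for shared in (True, False):
        z = scaled_sample_means(n=50, replicates=2000, shared_eps=shared)
        print("shared" if shared else "independent", np.quantile(z, [0.05, 0.5, 0.95]))
```

Comparing histograms or quantiles of the two simulated samples against a standard normal, for increasing `n`, is one informal way to explore the question raised above. The independent version stores all orders for all draws, so its memory use grows like `n * n * replicates`.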

Disclosure Statement

Peter McCullagh has no financial or non-financial disclosures to share for this article.

References


Bachelier, L. (1900). Théorie de la spéculation. Annales Scientifiques de l’École Normale Supérieure, 17, 21–86.

Bishop, Y. M. M., Fienberg, S. E., & Holland, P. W. (1975). Discrete multivariate analysis. MIT Press.

Box, G. E. P., & Cox, D. R. (1964). An analysis of transformations (with discussion). Journal of the Royal Statistical Society: Series B, 26(2), 211–252.

Box, G. E. P., & Newbold, P. (1971). Some comments on a paper of Coen, Gomme and Kendall. Journal of the Royal Statistical Society: Series A, 134(2), 229–240.

Coen, P. G., Gomme, E. D., & Kendall, M. G. (1969). Lagged relationships in economic forecasting. Journal of the Royal Statistical Society: Series A, 132(2), 133–152.

Cox, D. R. (1958). The regression analysis of binary sequences (with discussion). Journal of the Royal Statistical Society: Series B, 20(2), 215–242.

Cox, D. R., & Miller, H. D. (1965). Theory of stochastic processes. Chapman & Hall.

Cox, D. R. (1970). Binary data. Chapman & Hall.

Fisher, R. A. (1950). Statistical methods for research workers. Oliver & Boyd.

Haberman, S. J. (1974). The analysis of frequency data. University of Chicago Press.

Kruskal, J. B. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29, 1–27.

Tukey, J. W. (1977). Exploratory data analysis. Addison-Wesley.

©2023 Peter McCullagh. This article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article.
