
Reproducibility and Replication of Experimental Particle Physics Results

Published on Dec 21, 2020


Abstract

Recently, much attention has been focused on the replicability of scientific results, causing scientists, statisticians, and journal editors to examine closely their methodologies and publishing criteria. Experimental particle physicists have been aware of the precursors of non-replicable research for many decades and have many safeguards to ensure that the published results are as reliable as possible. The experiments require large investments of time and effort to design, construct, and operate. Large collaborations produce and check the results, and many papers are signed by more than three thousand authors. This paper gives an introduction to what experimental particle physics is and to some of the tools that are used to analyze the data. It describes the procedures used to ensure that results can be computationally reproduced, both by collaborators and by non-collaborators. It describes the status of publicly available data sets and analysis tools that aid in reproduction and recasting of experimental results. It also describes methods particle physicists use to maximize the reliability of the results, which increases the probability that they can be replicated by other collaborations or even the same collaborations with more data and new personnel. Examples of results that were later found to be false are given, both with failed replication attempts and one with alarmingly successful replications. While some of the characteristics of particle physics experiments are unique, many of the procedures and techniques can be and are used in other fields.


Comments
Thomas Junk:

A response to Robert Matthews's comment:

Thank you for your interest in our article! We had a great time writing it.

We do not believe the NHST method itself is inherently flawed; properly executed and in the absence of confounding factors, the error rates ought to be the stated ones. Physicists like the reassurance of knowing what the error rate is, and would like to minimize the impact of subjective judgments. The main source of failures of the method is systematic biases that are not yet understood, or, as statisticians say, model misspecification. As mentioned in the article, requiring five sigma effectively rules out "statistical fluctuation" as an explanation for an unusual result, focusing the attention on systematic effects and true discoveries. One can discover systematic biases, of course, and some are more "interesting" than others.
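To make the five-sigma convention concrete, here is a minimal sketch (illustrative, not part of the article or this comment) of the conversion between a significance in sigma and the corresponding one-sided p-value under the background-only hypothesis, using SciPy:

```python
# Minimal sketch (illustrative): one-sided tail probability of a standard
# normal corresponding to a given significance in units of sigma.
from scipy.stats import norm

for n_sigma in (3, 5):
    p = norm.sf(n_sigma)  # survival function = one-sided p-value
    print(f"{n_sigma} sigma -> one-sided p-value ~ {p:.2e}")

# Output:
# 3 sigma -> one-sided p-value ~ 1.35e-03  (conventional "evidence" threshold)
# 5 sigma -> one-sided p-value ~ 2.87e-07  (conventional "discovery" threshold)
```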

In the example of the superluminal neutrinos, the experimenters were initially unaware of a loose cable in their apparatus. There are two solutions to that -- repair the cable and take more data, or correct for the estimated biases due to the loose cable in the data analysis. The second solution also requires estimating a systematic uncertainty on the remaining mismodeling of the effect of the loose cable. Both solutions bring the model closer to Nature. Even with a repaired cable, experimenters may still have to assess a systematic uncertainty, however small, to account for other cables and any residual timing delays in the system incurred even with a tight cable.

In fact, even if no loose cables are known, one can still assign a systematic uncertainty to the possibility that they are present. Only by checking all the timing delays in all of the relevant cables and other equipment can one constrain the size of this uncertainty.
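As a rough sketch of how such a residual timing systematic might enter an error budget (all numbers below are hypothetical and chosen purely for illustration, not the actual OPERA values), independent statistical and systematic components are commonly combined in quadrature:

```python
import math

# Hypothetical numbers for illustration only: a measured time-of-flight offset
# with a statistical uncertainty, plus a systematic uncertainty assigned to a
# possible residual cable/timing delay.
offset_ns = 60.0      # measured early-arrival offset, in nanoseconds
stat_ns = 7.0         # statistical uncertainty
sys_timing_ns = 8.0   # systematic uncertainty on residual timing delays
sys_other_ns = 3.0    # other systematic contributions

# Independent contributions are commonly combined in quadrature.
sys_total_ns = math.hypot(sys_timing_ns, sys_other_ns)
total_ns = math.hypot(stat_ns, sys_total_ns)

print(f"offset = {offset_ns:.1f} +/- {stat_ns:.1f} (stat) "
      f"+/- {sys_total_ns:.1f} (sys) ns")
print(f"total uncertainty = {total_ns:.1f} ns, "
      f"significance ~ {offset_ns / total_ns:.1f} sigma")
```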

Unfortunately, no statistical method, Bayesian or frequentist, will rectify a misspecified model. Attempts have been made to inflate uncertainties, either based on whether measurements are outliers [1] or simply to make the assessed uncertainties larger [2]. In the case of the superluminal neutrinos, one would not want the statistical method to cover discrepancies with a prior belief that nothing travels faster than the speed of light in a vacuum, because one would then never be able to discover something. Inflating uncertainties based on outliers requires comparison with other measurements, which is an important part of the scientific method. The OPERA neutrino speed measurement was the most precise terrestrial measurement at the time, although SN1987A placed a much more stringent limit that was inconsistent with the OPERA measurement [3]. OPERA's result touched off a series of measurements by other terrestrial experiments that subsequently made the original measurement an outlier.
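A simplified illustration of the first kind of inflation, in the spirit of the Particle Data Group scale-factor prescription rather than the specific methods of [1] or [2], might look like the following, with entirely hypothetical input measurements:

```python
import numpy as np

# Illustrative only: a weighted average of hypothetical measurements of the
# same quantity, with the combined uncertainty inflated by a scale factor
# S = sqrt(chi2 / (N - 1)) when the inputs are mutually inconsistent.
values = np.array([0.5, 0.7, 2.1])  # hypothetical measurements
errors = np.array([0.2, 0.3, 0.3])  # their quoted uncertainties

weights = 1.0 / errors**2
mean = np.sum(weights * values) / np.sum(weights)
err = np.sqrt(1.0 / np.sum(weights))

chi2 = np.sum(((values - mean) / errors) ** 2)
scale = max(1.0, np.sqrt(chi2 / (len(values) - 1)))

print(f"weighted mean = {mean:.2f} +/- {err:.2f}")
print(f"chi2/ndf = {chi2 / (len(values) - 1):.1f}, scale factor S = {scale:.2f}")
print(f"inflated:      {mean:.2f} +/- {err * scale:.2f}")
```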

Bayesian methods are used fairly commonly in experimental particle physics, usually in the construction of credible intervals and posterior probability distributions for parameters of interest.
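As a minimal sketch of the sort of Bayesian construction referred to here (an illustrative example, not taken from the article): a credible upper limit on a Poisson signal rate with a known expected background and a flat prior on the signal.

```python
from scipy import integrate, optimize
from scipy.stats import poisson

# Minimal sketch (illustrative): a Bayesian 90% credible upper limit on a
# Poisson signal rate s, given n observed events, a known expected
# background b, and a flat prior on s >= 0.
n_obs, b = 3, 1.2

def posterior_unnorm(s):
    return poisson.pmf(n_obs, s + b)  # likelihood times flat prior

norm_const, _ = integrate.quad(posterior_unnorm, 0, 50)

def posterior_cdf(s_up):
    val, _ = integrate.quad(posterior_unnorm, 0, s_up)
    return val / norm_const

# Solve posterior_cdf(s_up) = 0.90 for the upper limit.
s_up = optimize.brentq(lambda s: posterior_cdf(s) - 0.90, 0, 50)
print(f"90% credible upper limit on the signal rate: s < {s_up:.2f}")
```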

References:

[1] G. Cowan, "Statistical Models with Uncertain Error Parameters," Eur. Phys. J. C 79, 133 (2019).

[2] X.-L. Meng, "Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw Your Kidstrogram," N. Engl. J. Stat. Data Sci. 1(1), 4-23 (2022), DOI: 10.51387/22-NEJSDS6.

[3] D. Fargion and D. D'Armiento, J. Phys. G: Nucl. Part. Phys. 39, 085002 (2012).

Robert Matthews:

Great read; have long wondered about the methodology used in particle physics.

Would be interested in knowing more about why the widely criticised NHST approach is still used so extensively, with the main protection against false positives largely being the 5-sigma rule (which didn’t help prevent the superluminal neutrinos debacle…!).

Why are Bayesian methods not used more extensively? Is it the old problem of setting priors in the absence of objective prior evidence?