# Evaluating Personalized (N-of-1) Trials in Rare Diseases: How Much Experimentation Is Enough?

# Abstract

# 1. Introduction

# 2. An Evaluation Framework for Personalized (N-of-1) Trials

## 2.1. The Anatomy of a Personalized (N-of-1) Trial

## 2.2. Standard of Care

## 2.3. Design Parameters

# 3. An Analytical Model for N-of-1 Trials

# 4. How Much Experimentation Is Enough?

## 4.1. Patient-Centric Criteria and Length of Experimentation Phase

**Proposition 1.** Suppose $\hat \beta_i \sim N( \beta_i, \tau_i^2)$ under a balanced experimentation phase (3.1). Then for $0 < m \leq T$ ,

#### where $W,U$ are independent standard normal variables. Furthermore, if $\mu_B = 0$ , then

#### where $G$ is the cumulative distribution function of $W/|U|$ , which is a pivotal distribution.

**Proposition 2.** Under the same condition as in Proposition 1, for $0 < m \leq T$ ,

#### where $\Phi$ and $\phi$ respectively denote the standard normal distribution function and density.

**Main Result 1**. *The optimal experimentation length $m^*$ is less than one-third of the total N-of-1 trial duration from a patient’s perspective, that is, $m^* \lesssim T/3$.*

## 4.2. Sample Size

**Main Result 2**. *All else being equal, the power to demonstrate quality improvement due to N-of-1 trials (vs SOC) increases as heterogeneity of treatment effects $\sigma_B^2$ increases.*

# 5. Numerical Illustrations: Application to ALS Patients

## 5.1. Optimal Length of Experimentation

## 5.2. Sample Size and Effect Size

## 5.3. Power for Comparing to Fully Informed SOC

#### Table 1. Quality improvement $\Delta$ and power for comparing to a fully informed SOC (standard of care) with $p_1 = \Phi(\mu_B/\sigma_B)$, $n=34$, $m=4$, $T=18$, $\sigma_A = 4.8$, $\sigma_B = 4.8$, $\sigma=1.6$, and $\rho=0$.

# 6. Discussion

# Disclosure Statement

# References

# Appendices

## Appendix A. Theoretical Results Concerning $E(z_i)$

### A.1. Proof of Proposition 1.

## Appendix B. Theoretical Results Concerning $E(\bar y_i)$

### B.1. Lemma 1

**Lemma 1.** Let $V \sim N(\mu_V, \sigma_V^2)$ . Then,

#### where $\Phi$ and $\phi$ respectively denote the standard normal distribution function and density function.

### B.2. Proof of Proposition 2

### B.3. Derivation of optimal experimentation length $m^*$ and Main Result 1

## Appendix C. Theoretical Results Concerning Power

Published on Sep 08, 2022


For rare diseases, conducting large, randomized trials of new treatments can be infeasible due to limited sample size, and it may answer the wrong scientific questions due to heterogeneity of treatment effects. Personalized (N-of-1) trials are multi-period crossover studies that aim to estimate individual treatment effects, thereby identifying the optimal treatments for individuals. This article examines the statistical design issues of evaluating a personalized (N-of-1) treatment program in people with amyotrophic lateral sclerosis (ALS). We propose an evaluation framework based on an analytical model for longitudinal data observed in a personalized trial. Under this framework, we address two design parameters: length of experimentation in each trial and number of trials needed. For the former, we consider patient-centric design criteria that aim to maximize the benefits of enrolled patients. Using theoretical investigation and numerical studies, we demonstrate that, from a patient’s perspective, the duration of an experimentation period should be no longer than one-third of the entire follow-up period of the trial. For the latter, we provide analytical formulae to calculate the power for testing quality improvement due to personalized trials in a randomized evaluation program and hence determine the required number of trials needed for the program. We apply our theoretical results to design an evaluation program for ALS treatments informed by pilot data and show that the length of experimentation has a small impact on power relative to other factors such as the degree of heterogeneity of treatment effects.

**Keywords:** ALS, heterogeneity of treatment effects (HTE), minimally clinically important heterogeneity, patient-centered research, rare diseases, sample size formulae

When managing chronic diseases and conditions, patients commonly try different treatments over time before finding the right treatments. The practice of N-of-1 trials operationalizes this type of patient-centered experimentation by randomizing treatments to single patients in multiple crossover periods, often in a balanced fashion. N-of-1 trials can be used to identify the optimal personalized treatment for single patients in situations involving evidence for heterogeneity of treatment effects (HTE) or the lack of a cure (Davidson et al., 2021). As such, these trials are sometimes called single-patient trials or personalized trials. First introduced by Hogben and Sim (1953), N-of-1 trials have recently been applied to treat rare diseases (Roustit et al., 2018), as well as common chronic conditions such as hypertension (Kronish et al., 2019; Samuel et al., 2019). The use of personalized (N-of-1) trials in treating rare diseases is particularly appealing because demonstrating comparative effectiveness of treatments at the population level via parallel-group randomized trials is often infeasible.

In this article, we consider personalized (N-of-1) trials of treatments for people with amyotrophic lateral sclerosis (ALS). ALS is a rare neurodegenerative disease that affects motor neurons in the brain and spinal cord. Despite the fact that two modestly effective disease-modifying medications have been approved for the treatment of ALS (Edaravone [MCI-186] ALS 19 Study Group, 2017), the disease has no cure, and thus, symptomatic treatments remain an important strategy to improve the quality of life in people with ALS (Mitsumoto et al., 2014). In particular, muscle cramps are disabling symptoms affecting over 90% of ALS patients, with demonstrated between-patient variability and yet stable manifestation of symptoms in a patient (Caress et al., 2016). Several treatments targeting muscle cramps have been evaluated and have shown mixed results, suggesting the presence of HTE or inadequate statistical power for definitive conclusions (Baldinger et al., 2012). Furthermore, ALS itself has been considered markedly heterogeneous in its pathogeneses, disease manifestations, and disease progression (Al-Chalabi & Hardiman, 2013; van den Berg et al., 2019). These are the clinical situations in which personalized (N-of-1) trials can help patients identify the best treatments for themselves (Kravitz et al., 2014).

Despite renewed interest in N-of-1 trials and numerous recent applications, the literature has offered little discussion on the evaluation of the usefulness of N-of-1 trials. As N-of-1 trials typically require active physician involvement, intense monitoring, and frequent data collection compared with usual care, these additional costs and resources warrant careful evaluation of effectiveness before said trials are used in practice as regular clinical service. The primary evaluation question is “Does the practice of N-of-1 trials in clinical care improve outcomes on the standard of care?” However, presuming the quality of treatment decisions based on N-of-1 trials is higher than what standard of care would prescribe, reports of N-of-1 trials often describe only the applications and results of the trials without plans to address the evaluation question. An exception is Kravitz et al. (2018) who compare N-of-1 intervention against the usual care for patients with musculoskeletal pain in a randomized fashion using data collected after experimentation ends and find no evidence of superior outcomes among participants undergoing N-of-1 trials. However, when planning the study, the authors had not considered the underlying model that accounts for variability and correlation in the longitudinal observations and the assumptions on the effect size, which would in turn drive the appropriate sample size of an evaluation program for N-of-1 trials. A design issue related to sample size determination is the duration of experimentation in N-of-1 trials. In this article, we propose a framework to evaluate the quality and effectiveness of N-of-1 trials and develop specific guidance to address these design issues. We will introduce the evaluation framework in Section 2 and define the basic analytical model for analyzing N-of-1 trials in Section 3. 
The main findings on the experimentation duration and sample size are derived and described in Section 4 and applied to the ALS treatment program in Section 5. The article ends with a discussion in Section 6. All technical details are provided in the Appendices.

We consider an evaluation program comparing the effectiveness of personalized (N-of-1) trials in treating muscle cramps in people with ALS relative to the institutional standard of care. Under the program, people with ALS will be randomized to receive personalized (N-of-1) trials that compare two standard drugs prescribed for muscle cramps, mexiletine and baclofen. In each trial, a patient will be given the two drugs sequentially over

During the treatment periods, the Columbia Muscle Cramp Scale (MCS) will be collected weekly to result in two MCS measurements for each period: one at the end of week 1 and one at the end of week 2. The MCS is a validated, composite score summarizing the frequency, severity, and clinical relevance of cramps in people with ALS (Mitsumoto et al., 2019). While the study does not include washout periods between treatments, only the measurement at the end of each two-week period will be used in the primary analysis in order to avoid carryover effects of the drugs.

Sandwiched between the two treatment phases is a feedback period where the MCS data in the experimentation phase are reviewed with the treating physician and the patient. The feedback period enables data-driven treatment decisions by providing the stakeholders with data visualization as well as numerical comparison (Davidson et al., 2021).

In this article, we focus on a randomized controlled evaluation program where patients are randomized between an N-of-1 trial and standard of care (SOC). As depicted in Figure 1, a patient under SOC will be given either mexiletine or baclofen for 36 weeks, corresponding to the 18 two-week treatment periods in the N-of-1 trials, and will have the same follow-up schedule as the N-of-1 trial patients. Treatments in the ‘experimentation phase’ will be determined by the treating physicians. The ‘feedback period’ in the SOC arm may be viewed as a sham intervention and be conducted as a regular clinic visit before the patient continues into the ‘validation phase’ with the same drug in the remaining

Let

While the study duration (or the number of treatment periods

A second design parameter is the specification of an analytical plan used to guide treatment selection during the feedback period. Principled statistical or data science methods should be employed to ensure the analysis is rigorous, while a prespecified plan entails preprogrammed algorithms that in turn facilitate quick feedback to the stakeholders.

Finally, as in conventional randomized controlled trials, the number of patients randomized in an evaluation program will need to be determined to ensure adequate statistical power for the primary evaluation question on whether N-of-1 trials improve outcomes.

To summarize, the design parameters that need to be prespecified at the planning stage of an evaluation program are the primary analysis plan used in the feedback period, the experimentation length (

Let

$\sum_{t =1}^m x_{it} = 0.\ \ \ \ \ \ \ \ \text{(3.1)}$

Consider the outcome model

$y_{it} = \alpha_i + \beta_i x_{it} + \epsilon_{it} \ \ \ \ \ \ \ \ \text{(3.2)}$

where

Under model (3.2), the optimal treatment for patient

$x_i^* = 2 I( \hat \beta_i > 0) - 1.
\ \ \ \ \ \ \ \ \text{(3.3)}$
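As a minimal sketch of how rule (3.3) could be applied during the feedback period, the snippet below assumes the two treatments are coded as $x_{it} \in \{-1, +1\}$ and that the experimentation phase is balanced as in (3.1). Under those assumptions the least squares slope of model (3.2) reduces to $\sum_t x_{it} y_{it} / m$. The function names and the example data are hypothetical and are not taken from the study.

```python
def ls_treatment_effect(x, y):
    """Least squares slope of y on x for a balanced design coded -1/+1."""
    m = len(x)
    if m == 0 or len(y) != m:
        raise ValueError("x and y must be non-empty and of equal length")
    if any(v not in (-1, 1) for v in x) or sum(x) != 0:
        raise ValueError("x must be a balanced sequence of -1/+1 codes")
    # With sum(x) = 0 and x_t^2 = 1, the slope reduces to sum(x_t * y_t) / m.
    return sum(xt * yt for xt, yt in zip(x, y)) / m


def select_treatment(beta_hat):
    """Rule (3.3): x* = 2 * I(beta_hat > 0) - 1."""
    return 2 * int(beta_hat > 0) - 1


x = [1, -1, -1, 1]            # hypothetical treatment sequence, m = 4
y = [12.0, 9.0, 10.0, 13.0]   # hypothetical outcomes, not study data
beta_hat = ls_treatment_effect(x, y)
print(beta_hat, select_treatment(beta_hat))
```

A tie ($\hat \beta_i = 0$) is resolved toward $x_i^* = -1$ here, which is what the indicator in (3.3) implies.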

Subsequently, in the event of perfect adherence to analysis result, the patient will receive the estimated optimal treatment (3.3) in the validation phase, that is,

Some practical notes on the choice of

$\tau_i^2 = \text{var}(\hat \beta_i^{LS} | \alpha_i, \beta_i) = \lambda_i \sigma^2 / m \text{ where }
\lambda_i = 1 + \sum_{s \neq t} x_{is} x_{it} \rho_{st}/m.
\ \ \ \ \ \ \ \ \text{(3.4)}$
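The variance formula (3.4) can be evaluated directly for a given treatment sequence and error correlation matrix. The sketch below is illustrative only: it assumes treatments coded as $\pm 1$, a correlation matrix supplied as a nested list, and hypothetical function names.

```python
def variance_inflation(x, rho):
    """lambda_i = 1 + sum_{s != t} x_s x_t rho_st / m, as in (3.4)."""
    m = len(x)
    off_diagonal = sum(
        x[s] * x[t] * rho[s][t]
        for s in range(m)
        for t in range(m)
        if s != t
    )
    return 1.0 + off_diagonal / m


def tau_squared(x, rho, sigma):
    """tau_i^2 = lambda_i * sigma^2 / m, as in (3.4)."""
    return variance_inflation(x, rho) * sigma ** 2 / len(x)


x = [1, -1, -1, 1]  # hypothetical balanced sequence
independent = [[1.0 if s == t else 0.0 for t in range(4)] for s in range(4)]
exchangeable = [[1.0 if s == t else 0.5 for t in range(4)] for s in range(4)]

print(variance_inflation(x, independent), tau_squared(x, independent, 1.6))
print(variance_inflation(x, exchangeable))
```

With uncorrelated errors the factor $\lambda_i$ equals 1, so $\tau_i^2 = \sigma^2/m$. With an exchangeable correlation $r$ and a balanced sequence, the off-diagonal sum equals $-rm$, so $\lambda_i = 1 - r$.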

Note that the conditional variance (3.4) is free of the patient-specific parameters

In this subsection, we discuss the choice of the experimentation length

The first criterion is defined as the expected number of periods where a patient receives the optimal treatment. Mathematically, this criterion is denoted as

$E(z_i) = \frac{m}{2} + (T-m) \, \text{Pr} \left( W \leq \frac{ \left| \mu_B + \sigma_B U \right|}{\tau_i} \right)$

$E(z_i) = \frac{m}{2} + (T-m) \, G \left( \sigma_B / \tau_i \right) \ \ \ \ \ \ \ \ \text{(4.1)}$
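The two expressions for $E(z_i)$ can be checked numerically. The sketch below evaluates the probability in the general expression as $E\{\Phi(|\mu_B + \sigma_B U|/\tau_i)\}$ with a simple trapezoid rule over $U$. For the null case it uses the fact that the ratio $W/|U|$ of independent standard normals follows a standard Cauchy distribution, so that $G(x) = 1/2 + \arctan(x)/\pi$. The function names, grid size, and integration limits are arbitrary choices made for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def expected_optimal_periods(m, T, mu_B, sigma_B, tau, n_grid=20001, u_max=8.0):
    """E(z_i) = m/2 + (T - m) * Pr(W <= |mu_B + sigma_B U| / tau)."""
    # The probability equals E[Phi(|mu_B + sigma_B U| / tau)] over U ~ N(0, 1);
    # approximate the expectation with the trapezoid rule on [-u_max, u_max].
    step = 2.0 * u_max / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        u = -u_max + k * step
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += (
            weight
            * STD_NORMAL.cdf(abs(mu_B + sigma_B * u) / tau)
            * STD_NORMAL.pdf(u)
        )
    return m / 2.0 + (T - m) * total * step


def expected_optimal_periods_null(m, T, sigma_B, tau):
    """Null case (4.1), using the standard Cauchy cdf for G."""
    G = 0.5 + math.atan(sigma_B / tau) / math.pi
    return m / 2.0 + (T - m) * G


# Table 1 settings with uncorrelated errors: tau = sigma / sqrt(m) = 1.6 / 2.
print(expected_optimal_periods(4, 18, 0.0, 4.8, 0.8))
print(expected_optimal_periods_null(4, 18, 4.8, 0.8))
```

The two printed values should agree up to integration error, which gives a quick consistency check between the general expression and (4.1).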

The second patient-centric criterion is defined as the expected average outcome of a patient during an N-of-1 trial. This criterion is denoted as

$E(\bar y_i) =
\mu_A + \left(1 - \frac{m}{T} \right)
\left[
\mu_B \left\{ 2 \Phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right) - 1 \right\} +
\frac{2 \sigma_B^2}{\sqrt{ \sigma_B^2 + \tau_i^2}} \phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right) \right]$
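The expression for $E(\bar y_i)$ in Proposition 2 is straightforward to compute. The following sketch is one possible implementation; the function name and argument order are hypothetical, and `tau` is expected to be supplied by the user (for example, $\sigma/\sqrt{m}$ under uncorrelated errors).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def expected_average_outcome(m, T, mu_A, mu_B, sigma_B, tau):
    """E(y-bar_i) from Proposition 2 for experimentation length m out of T."""
    s = math.sqrt(sigma_B ** 2 + tau ** 2)
    z = mu_B / s
    # Expected gain per validation period from following rule (3.3).
    gain = mu_B * (2.0 * STD_NORMAL.cdf(z) - 1.0) + (
        2.0 * sigma_B ** 2 / s
    ) * STD_NORMAL.pdf(z)
    return mu_A + (1.0 - m / T) * gain


# Table 1 settings, null case mu_B = 0, with mu_A set to 0 for illustration.
print(expected_average_outcome(4, 18, 0.0, 0.0, 4.8, 0.8))
```

The bracketed gain is an even function of $\mu_B$, so the sign of the average treatment effect does not change the value of the criterion.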

We can derive a few practical principles from Proposition 1 and Proposition 2. First, conducting an N-of-1 trial with an experimentation length

Second, we can derive from the propositions that

Third, and importantly, considering the null case where

$m^* = \frac{2 T}{ \sqrt{ 9 + 8 \xi_i T} + 3}
\ \ \ \ \ \ \ \ \text{(4.2)}$

where
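Formula (4.2) can be checked against a direct grid search over the objective $h(m)$ given in Appendix B.3. The sketch below assumes $\xi_i = \sigma_B^2 / (\lambda_i \sigma^2)$, which is what the Appendix B.3 expressions suggest, and treats $m$ as continuous. The function names and the grid resolution are illustrative choices.

```python
import math


def optimal_experimentation_length(T, xi):
    """m* = 2T / (sqrt(9 + 8 * xi * T) + 3), as in (4.2)."""
    if T <= 0 or xi < 0:
        raise ValueError("T must be positive and xi non-negative")
    return 2.0 * T / (math.sqrt(9.0 + 8.0 * xi * T) + 3.0)


def h(m, T, xi):
    """Objective from Appendix B.3: (1 - m/T) / sqrt(xi + 1/m)."""
    return (1.0 - m / T) / math.sqrt(xi + 1.0 / m)


T = 18
xi = 4.8 ** 2 / 1.6 ** 2  # assumed xi with lambda = 1 and the Table 1 settings
m_star = optimal_experimentation_length(T, xi)

# Brute-force maximizer of h on a fine grid over (0, T).
grid = [k / 1000.0 for k in range(1, T * 1000)]
m_grid = max(grid, key=lambda m: h(m, T, xi))

print(m_star, m_grid, T / 3)
```

As $\xi_i$ approaches zero the formula returns $T/3$, and any positive $\xi_i$ gives a smaller value, which is consistent with Main Result 1.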

In this subsection, we discuss how much experimentation is adequate in terms of the sample size enrolled to the evaluation program. We first define the quality of an N-of-1 trial as the expected health outcome under the estimated optimal treatment

$H_0: \Delta := E(y_i^*) - E(y_i') \leq 0 \text{ versus} \ H_1: \Delta > 0
\ \ \ \ \ \ \ \ \text{(4.3)}$

where

$Z = \frac{\sqrt{n} ( \bar{y}^* - \bar{y}' )}{ \sqrt{v^* + v'}}
\ \ \ \ \ \ \ \ \text{(4.4)}$

where

$\text{Pr} (Z > c_{\alpha} | \Delta) \approx \Phi \left( \frac{\sqrt{n} \Delta}
{ \sqrt{ \text{var}(y_i^*) + \text{var}(y_i')}} - c_{\alpha} \right)
\ \ \ \ \ \ \ \ \text{(4.5)}$

where

$\Delta = 2 \mu_B \left\{ \Phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right) - p_1 \right\}
+
\frac{2 \sigma_B^2}{\sqrt{ \sigma_B^2 + \tau_i^2}} \phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right),
\ \ \ \ \ \ \ \ \text{(4.6)}$

$\text{var}(y_i^*) = \sigma_A^2 + \sigma_B^2 + \mu_B^2 - \{ \Delta + \mu_B (2p_1-1) \}^2 + \frac{\sigma^2}{T-m},
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{(4.7)}$

and

$\text{var}(y_i') =
\sigma_A^2 + \sigma_B^2 + \mu_B^2 - \mu_B^2 (2 p_1 - 1)^2 + \frac{\sigma^2}{T-m}.
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{(4.8)}$
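The power approximation (4.5) together with (4.6) to (4.8) can be put into a short function. The sketch below makes several assumptions that should be checked against the full text: $n$ is read as the number of patients per arm, $c_\alpha$ is taken to be the upper-$\alpha$ standard normal quantile with a default of $\alpha = 0.05$, and $\tau_i^2 = \lambda_i \sigma^2/m$ as in (3.4). Function and argument names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def quality_improvement(mu_B, sigma_B, tau, p1):
    """Delta from (4.6)."""
    s = math.sqrt(sigma_B ** 2 + tau ** 2)
    z = mu_B / s
    return 2.0 * mu_B * (STD_NORMAL.cdf(z) - p1) + (
        2.0 * sigma_B ** 2 / s
    ) * STD_NORMAL.pdf(z)


def evaluation_power(n, m, T, mu_B, sigma_A, sigma_B, sigma, p1,
                     lam=1.0, alpha=0.05):
    """Approximate power (4.5) with variances (4.7) and (4.8)."""
    tau = math.sqrt(lam * sigma ** 2 / m)            # from (3.4)
    delta = quality_improvement(mu_B, sigma_B, tau, p1)
    soc_mean_shift = mu_B * (2.0 * p1 - 1.0)
    common = sigma_A ** 2 + sigma_B ** 2 + mu_B ** 2 + sigma ** 2 / (T - m)
    var_n_of_1 = common - (delta + soc_mean_shift) ** 2   # (4.7)
    var_soc = common - soc_mean_shift ** 2                # (4.8)
    c_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)             # assumed critical value
    return STD_NORMAL.cdf(
        math.sqrt(n) * delta / math.sqrt(var_n_of_1 + var_soc) - c_alpha
    )


# Table 1 settings, null case: mu_B = 0 and p1 = 0.5.
print(evaluation_power(n=34, m=4, T=18, mu_B=0.0, sigma_A=4.8,
                       sigma_B=4.8, sigma=1.6, p1=0.5))
```

Under these assumptions the null-case call gives a power close to the 80% reported in Table 1, and setting $\mu_B = 4.8$ with $p_1 = \Phi(1)$ gives a value close to the tabulated 39%.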

The above expressions account for population-level information about the treatments through the program parameter

We use the MCS natural history data in Mitsumoto et al. (2019) to inform the design of the evaluation program for people with ALS. Specifically, we fitted a random effects model to the data and obtained an estimate of

Main Result 2 implies that

To determine if a specific value of

The calculations in the previous subsection assume the null case

Table 1 shows that as the average treatment effect *provided that* the standard of care is fully informed. On the other hand, even with a large average treatment effect

| $\mu_B$ | $p_1$ | $\Delta$ | Power |
|---|---|---|---|
| 0 | 0.50 | 3.8 | 80% |
| 1.2 | 0.60 | 3.7 | 77% |
| 1.6 | 0.63 | 3.6 | 75% |
| 2.4 | 0.69 | 3.3 | 68% |
| 4.8 | 0.84 | 2.3 | 39% |

The numerical results in this article, and power and the patient-centric criteria in general, can be computed using tools available at: https://roadmap2health.io/hdsr/n1power/.

N-of-1 trials have been increasingly used as a design tool to bridge practice and science in rare diseases (Müller et al., 2021; Stunnenberg et al., 2018). However, the literature is missing concrete guidelines on N-of-1 designs as to how much experimentation is appropriate. A fundamental issue is the articulation of a framework that will facilitate the evaluation of the usefulness of N-of-1 trials. In this article, we introduce an evaluation framework and outline the basic elements in an evaluation program for N-of-1 trials—namely, an experimentation phase, a feedback period, and a validation phase. In the literature, the reporting of N-of-1 trials mostly focuses only on the results of the experimentation phase, where patients explore the different treatments sequentially under a rigorous clinical protocol such as randomization, blinding, and scheduled follow-up. The feedback period and the validation phase are the critical elements in the planning and the conduct of N-of-1 trials but are, unfortunately, often omitted in the description of the design and the analytical plan.

Specifically, the length of the validation phase, relative to that of the experimentation phase, should be given careful consideration. We have demonstrated theoretically and numerically that the optimal length of experimentation from the patient’s perspective should be no greater than one-third of the entire study duration. This implies a relatively long validation phase, suggesting the importance of reproducing the quality of the decisions due to N-of-1 trials with additional follow-up. Our theoretical results also provide guidance on how many patients are needed in order to adequately power for testing quality improvement. Importantly, the relative length of experimentation and validation has minimal impact on the power. In other words, little conflict exists between the goal of maximizing patient benefits and maximizing power.

The feedback period facilitates evidence-based treatment decisions using data measured in the experimentation phase. Summarizing the relative benefits of the treatments via a single numerical statistic is a pragmatic way to present such evidence, because the information can be objectively presented and quickly digested by stakeholders. We have developed design calculus based on the model-based least squares estimation, which is quick to compute and produces unbiased estimates of patient-specific treatment effects under a broad range of scenarios. Other more sophisticated model-based methods may be used to deal with the more complex situations. For example, when we observe high volume of outcome measures via wearable devices, we could extend model (3.2) to an autoregressive model with multiple observations per treatment period (Kronish et al., 2019). In practice, treatment decisions are likely determined based on the totality of evidence. For example, in situations where a treatment that apparently benefits a patient may have side effects, a possibly less effective treatment may be preferred if it is more easily tolerated. Considerations of multiple outcomes in the analysis during the feedback period will likely increase adherence and will warrant further empirical, domain-specific research. Overall, as the feedback period potentially changes the treatment decisions—and hence, the outcomes—in the validation phase, it can be viewed as an integral part of the intervention component. We may thus experiment in a randomized fashion different elements in the feedback period for different individuals: we may consider presenting different endpoints (e.g., muscle cramp or safety), using a single endpoint, a composite outcome, or multivariate endpoints, using different types of analyses (e.g., intent-to-treat vs per-protocol), and asking patients for their satisfaction and preference (Cheung et al., 2020).

Some considerations, assumptions, and limitations for power calculation in conventional randomized controlled trials also apply for N-of-1 trials. First, power calculation involves the inputs of a number of nuisance model parameters (e.g.,

Second, our derivations assume that patients in both arms comply with their treatments in the following sense: patients in the N-of-1 trials adhere to the estimated optimal treatments based on the experimentation phase data, and patients in the SOC continue with the same treatment as in the experimentation phase. If there is prior information about noncompliance rate, power expressions can be derived accordingly under the proposed framework. However, from the viewpoint that the feedback period is part of the N-of-1 trial intervention, it should be designed to maximize adherence by choosing the outcomes and analyses that most reflect patient preference as discussed in the previous paragraph. Third, approaches to deal with missing data should be prespecified and implemented during the feedback period. An advantage of using model-based estimation is that the model can also serve as the basis for multiple imputations. That being said, no statistical approach can replace a well-conducted trial that is characterized by good compliance to treatment and minimal missing data.

This work was supported by grants R01LM012836 from the NIH/NLM, P30AG063786 from the NIH/NIA, UL1TR001873 from NIH/NCATS, and R01MH109496 from NIH/NIMH. Dr. Mitsumoto’s work was also supported by ALS Association, MDA Wings Over Wall Street, Spastic Paraplegia Foundation, Mitsubishi-Tanabe, and Tsumura. The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication. The views expressed in this paper are those of the authors and do not represent the views of the National Institutes of Health, the U.S. Department of Health and Human Services, or any other government entity.

Al-Chalabi, A., & Hardiman, O. (2013). The epidemiology of ALS: A conspiracy of genes, environment and time. *Nature Reviews Neurology*, *9*(11), 617–628. https://doi.org/10.1038/nrneurol.2013.203

Baldinger, R., Katzberg, H. D., & Weber, M. (2012). Treatment for cramps in amyotrophic lateral sclerosis/motor neuron disease. *Cochrane Database of Systematic Reviews*, Article CD004157. https://doi.org/10.1002/14651858.CD004157.pub2

Caress, J. B., Ciarlone, S. L., Sullivan, E. A., Griffin, L. P., & Cartwright, M. S. (2016). Natural history of muscle cramps in amyotrophic lateral sclerosis. *Muscle & Nerve*, *53*(4), 513–517. https://doi.org/10.1002/mus.24892

Cheung, K., Wood, D., Zhang, K., Ridenour, T. A., Derby, L., St Onge, T., Duan, N., Duer-Hefele, J., Davidson, K. W., Kronish, I. M., & Moise, N. (2020). Personal preferences for personalized trials among patients with chronic experience: An empirical Bayesian analysis of a conjoint survey. *BMJ Open*, *10*(6), Article e036056. https://doi.org/10.1136/bmjopen-2019-036056

Davidson, K. W., Silverstein, M., Cheung, K., Paluch, R. A., & Epstein, L. H. (2021). Experimental designs to optimize treatments for individuals: Personalized N-of-1 trials. *JAMA Pediatrics*, *175*(4), 404–409. https://doi.org/10.1001/jamapediatrics.2020.5801

Edaravone [MCI-186] ALS 19 Study Group. (2017). Safety and efficacy of edaravone in well defined patients with amyotrophic lateral sclerosis: A randomised, double-blind, placebo-controlled trial. *Lancet Neurology*, *16*(7), 505–512. https://doi.org/10.1016/s1474-4422(17)30115-1

Hogben, L., & Sim, M. (1953). The self-controlled and self-recorded clinical trial for low-grade morbidity. *British Journal of Preventive and Social Medicine*, *7*(4), 163–179. https://doi.org/10.1136/jech.7.4.163

Kravitz, R. L., Duan, N. (Eds), and the DEcIDE Methods Center N-of-1 Guidance Panel (Duan, N., Eslick, I., Gabler, N. B., Kaplan, H. C., Kravitz, R. L., Larson, E. B., Pace, W. D., Schmid, C. H., Sim, I., & Vohra, S.) (2014). *Design and implementation of N-of-1 trials: A user’s guide.* Agency for Healthcare Research and Quality. https://effectivehealthcare.ahrq.gov/products/n-1-trials/research-2014-5

Kravitz, R. L., Schmid, C. H., Marois, M., Wilsey, B., Ward, D., Hays, R. D., Duan, N., Wang, Y., MacDonald, S., Jerant, A., Servadio, J. L., Haddad, D., & Sim, I. (2018). Effect of mobile device-supported single-patient multi-crossover trials on treatment of chronic musculoskeletal pain: A randomized clinical trial. *JAMA Internal Medicine*, *178*(10), 1368–1378. https://doi.org/10.1001/jamainternmed.2018.3981

Kronish, I. M., Cheung, Y. K., Shimbo, D., Julian, J., Gallagher, B., Parsons, F., & Davidson, K. W. (2019). Increasing the precision of hypertension treatment through personalized trials: A pilot study. *Journal of General Internal Medicine*, *34*(6), 839–845. https://doi.org/10.1007/s11606-019-04831-z

Mitsumoto, H., Brooks, B. R., & Silani, V. (2014). Clinical trials in amyotrophic lateral sclerosis: Why so many negative trials and how can trials be improved? *Lancet Neurology*, *13*(11), 1127–1138. https://doi.org/10.1016/s1474-4422(14)70129-2

Mitsumoto, H., Chiuzan, C., Gilmore, M., Zhang, Y., Ibagon, C., McHale, B., Hupf, J., & Oskarsson, B. (2019). A novel muscle cramp scale (MCS) in amyotrophic lateral sclerosis (ALS). *Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration*, *20*(5–6), 328–335. https://doi.org/10.1080/21678421.2019.1603310

Müller, A. R., Brands, M. M. M. G., van de Ven, P. M., Roes, K. C. B., Cornel, M. C., van Karnebeek, C. D. M., Wijburg, F. A., Daams, J. G., Boot, E., & van Eeghen, A. M. (2021). The power of 1: Systematic review of N-of-1 studies in rare genetic neurodevelopmental disorders. *Neurology*, *96*(11), 529–540. https://doi.org/10.1212/WNL.0000000000011597

Roustit, M., Giai, J., Gaget, O., Khouri, C., Mouhib, M., Lotito, A., Blaise, S., Seinturier, C., Subtil, F., Paris, A., Cracowski, C., Imbert, B., Carpentier, P., Vohra, S., & Cracowski, J.-L. (2018). On-demand sildenafil as a treatment for Raynaud phenomenon: A series of N-of-1 trials. *Annals of Internal Medicine*, *169*(10), 694–703. https://doi.org/10.7326/m18-0517

Samuel, J. P., Tyson, J. E., Green, C., Bell, C. S., Pedroza, C., Molony, D., & Samuels, J. (2019). Treating hypertension in children with n-of-1 trials. *Pediatrics*, *143*(4), Article e20181818. https://doi.org/10.1542/peds.2018-1818

Stunnenberg, B., Raaphorst, J., Groenewoud, H., Statland, J., Griggs, R., Woertman, W., Stegeman, D., Timmermans, J., Trivedi, J., Matthews, E., Saris, C., Schouwenberg, B., Drost, G., van Engelen, B., & van der Wilt, G. (2018). A series of aggregated randomized-controlled N-of-1 trials with mexiletine in non-dystrophic myotonia: Clinical trial results and validation of rare disease design (p3.440) [70th Annual Meeting of the American-Academy-of-Neurology (AAN) ; Conference date: 21-04-2018 through 27-04-2018]. *Neurology*, *90*(15 Suppl). https://n.neurology.org/content/90/15_Supplement/P3.440

U.S. Food & Drug Administration. (2019). *Adaptive designs for clinical trials of drugs and biologics: Guidance for Industry*. https://www.fda.gov/media/78495/download

van den Berg, L. H., Sorenson, E., Gronseth, G., Macklin, E. A., Andrews, J., Baloh, R. H., Benatar, M., Berry, J. D., Chio, A., Corcia, P., Genge, A., Gubitz, A. K., Lomen-Hoerth, C., McDermott, C. J., Pioro, E. P., Rosenfeld, J., Silani, V., Turner, M. R., Weber, M., . . . Mitsumoto, H. (2019). Revised Airlie House consensus guidelines for design and implementation of ALS clinical trials. *Neurology*, *92*(14), e1610–e1623. https://doi.org/10.1212/wnl.0000000000007242

First, consider the case

$z_i = \frac{m}{2} + (T-m) I (x_i^* = 1) = \frac{m}{2} + (T-m) I (\hat \beta_i > 0 ).
\ \ \ \ \ \ \ \ \text{(A.1)}$

The first term on the right-hand side of (A.1) is the number of optimal treatment periods received in the experimentation phase, and the second term is the number in the validation phase. Since

$E \left\{ I( \hat \beta_i >0) | \alpha_i, \beta_i \right\} = \text{Pr} \left( \hat \beta_i >0 | \alpha_i, \beta_i \right) =
\Phi \left( \beta_i/\tau_i \right),$

and therefore,

$E (z_i | \alpha_i, \beta_i) = \frac{m}{2} + (T-m) \Phi (\beta_i / \tau_i) \text{ when } \beta_i > 0.
\ \ \ \ \ \ \ \ \text{(A.2)}$

Next, under the case

$E(z_i | \alpha_i, \beta_i) = \frac{m}{2} + (T-m) \Phi (-\beta_i / \tau_i ) \text{ when } \beta_i < 0.
\ \ \ \ \ \ \ \ \text{(A.3)}$

Combining (A.2) and (A.3) gives

$E(z_i | \alpha_i, \beta_i) = E(z_i | \beta_i) = \frac{m}{2} + (T-m) \Phi ( | \beta_i | / \tau_i ),
\ \ \ \ \ \ \ \ \text{(A.4)}$

which is free of

$\begin{aligned}
E \left\{ \Phi ( |\beta_i |/ \tau_i ) \right\} &=
\int_{-\infty}^{\infty} \int_{-\infty}^{ \frac{|b|}{\tau_i}} \frac{1}{\sigma_B} \phi(w) \phi \left( \frac{b-\mu_B}{\sigma_B} \right) dw \, db \\
&=
\int_{-\infty}^{\infty} \int_{-\infty}^{ \frac{|\mu_B + \sigma_B u |}{\tau_i}} \phi(w) \phi (u) \, dw \, du \\
&=
\text{Pr} \left( W \leq \frac{ | \mu_B + \sigma_B U | }{\tau_i} \right). \end{aligned} \ \ \ \text{(A.5)}$

The proof is completed by substituting (A.5) into (A.4).

Derivations of

$E\left\{ V \Phi(V) \right\} = \mu_V \Phi \left( \frac{\mu_V}{ \sqrt{ \sigma_V^2 + 1}} \right) + \frac{\sigma_V^2}{\sqrt{\sigma_V^2 +1 }}
\phi \left( \frac{\mu_V}{ \sqrt{ \sigma_V^2 + 1}} \right)$
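A quick numerical check of the identity in Lemma 1 can be done by comparing the closed form with a direct integration of $E\{V \Phi(V)\}$, writing $V = \mu_V + \sigma_V W$ with $W$ standard normal. The sketch below uses a trapezoid rule; the function names, grid size, and integration limits are illustrative assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def lemma1_closed_form(mu_V, sigma_V):
    """Right-hand side of Lemma 1."""
    s = math.sqrt(sigma_V ** 2 + 1.0)
    return mu_V * STD_NORMAL.cdf(mu_V / s) + (
        sigma_V ** 2 / s
    ) * STD_NORMAL.pdf(mu_V / s)


def lemma1_numeric(mu_V, sigma_V, n_grid=40001, w_max=10.0):
    """Trapezoid approximation of E{V Phi(V)} with V = mu_V + sigma_V * W."""
    step = 2.0 * w_max / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        w = -w_max + k * step
        v = mu_V + sigma_V * w
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * v * STD_NORMAL.cdf(v) * STD_NORMAL.pdf(w)
    return total * step


for mu_V, sigma_V in [(0.5, 2.0), (-1.0, 0.7), (0.0, 6.0)]:
    print(mu_V, sigma_V, lemma1_closed_form(mu_V, sigma_V),
          lemma1_numeric(mu_V, sigma_V))
```

The closed form and the numerical integral should agree to several decimal places for each parameter pair.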

*Proof of Lemma 1:* Using the definition of expectation, we derive

$\begin{aligned}
E \{ V \Phi(V) \} &= \int_{-\infty}^{\infty} \int_{-\infty}^v v \phi(u) \frac{1}{\sigma_V} \phi ( \frac{v-\mu_V}{\sigma_V} ) du dv \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\mu_V + \sigma_V w} (\mu_V + \sigma_V w) \phi(u) \phi (w) du dw \\
&= \mu_V \text{Pr}(U < \mu_V + \sigma_V W) + \sigma_V \int_{-\infty}^{\infty} w \Phi( \mu_V + \sigma_V w ) \phi (w) dw \\
&= \mu_V \Phi ( \frac{\mu_V}{\sqrt{1 + \sigma_V^2}} ) + \sigma_V^2 \int_{-\infty}^{\infty} \phi(w) \phi ( \mu_V + \sigma_V w ) dw \end{aligned} \ \ \ \ \ \text{(B.1)}$

where

$\mu_V \text{Pr}(U < \mu_V + \sigma_V W) = \mu_V \Phi \left( \frac{\mu_V}{\sqrt{1 + \sigma_V^2}} \right). \ \ \ \ \ \ \ \ \ \ \ \text{(B.2)}$

Next, the single integral in the second term of (B.1) can be evaluated using integration by parts:

$\begin{aligned}
\int_{-\infty}^{\infty} w \Phi\left( \mu_V + \sigma_V w \right) \phi (w) dw &= \sigma_V \int_{-\infty}^{\infty} \phi\left( \mu_V + \sigma_V w \right) \phi (w) dw \\
&= \sigma_V \frac{1}{\sqrt{\sigma_V^2+1} } \phi \left( \frac{\mu_V}{\sqrt{\sigma_V^2 +1}}\right). \end{aligned} \ \ \ \ \text{(B.3)}$

The second equality in (B.3) follows from a direct calculation. The proof of Lemma 1 is thus completed by plugging (B.2) and (B.3) into (B.1).

Recall that

$\begin{aligned}
E(\bar y_i) &= \frac{1}{T} \sum_{t=1}^T E(\alpha_i + \beta_i x_{it} + \epsilon_{it} )
= \mu_A + \frac{1}{T} \sum_{t=1}^T E(\beta_i x_{it}) \\
&= \mu_A + \left(1 - \frac{m}{T} \right) E(\beta_i x_i^*). \end{aligned} \ \ \ \ \ \text{(B.4)}$

Equation (B.4) holds because of the balanced design

$\begin{aligned}
E( \beta_i x_i^* ) &= E \left\{ \beta_i E( x_i^* | \beta_i) \right\}
= E \left[ \beta_i E \left\{ \left. 2 I(\hat \beta_i >0) - 1 \right| \beta_i \right\} \right]
= E \left\{ 2 \beta_i \Phi \left( \frac{\beta_i}{\tau_i} \right) - \beta_i \right\} \\
&= 2 E \left\{ \beta_i \Phi \left( \frac{\sqrt{m} \beta_i}{\sigma} \right) \right\} - \mu_B \\
&= 2 \tau_i E \left\{ \frac{ \beta_i}{\tau_i} \Phi \left( \frac{\beta_i}{\tau_i} \right) \right\} - \mu_B \\
&=\mu_B \left\{ 2 \Phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right) - 1 \right\} +
\frac{2 \sigma_B^2}{\sqrt{ \sigma_B^2 + \tau_i^2}} \phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right). \end{aligned} \ \ \ \ \ \text{(B.5)}$

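Expression (B.5) is obtained by applying Lemma 1 with $V = \beta_i/\tau_i$, that is, $\mu_V = \mu_B/\tau_i$ and $\sigma_V = \sigma_B/\tau_i$. Plugging (B.5) into (B.4) gives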

$E(\bar y_i) = \mu_A + \left(1 - \frac{m}{T} \right) \left[
\mu_B \left\{ 2 \Phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right) - 1 \right\} +
\frac{2 \sigma_B^2}{\sqrt{ \sigma_B^2 + \tau_i^2}} \phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right)
\right]
\ \ \ \ \ \text{(B.6)}$

thus completing the proof of Proposition 2.
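The closed forms (B.5) and (B.6) can be checked numerically. The sketch below is a minimal illustration in Python: it evaluates (B.5) and (B.6) directly and compares (B.5) with a trapezoidal approximation of $E[\beta_i \{2\Phi(\beta_i/\tau_i) - 1\}]$ for $\beta_i \sim N(\mu_B, \sigma_B^2)$. All function names and the integration settings are assumptions made for illustration.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_gain(mu_b, sigma_b, tau):
    """E(beta_i x_i^*) as given in (B.5)."""
    s = math.sqrt(sigma_b ** 2 + tau ** 2)
    return mu_b * (2.0 * Phi(mu_b / s) - 1.0) + 2.0 * sigma_b ** 2 / s * phi(mu_b / s)


def expected_mean_outcome(mu_a, mu_b, sigma_b, tau, m, T):
    """E(ybar_i) as given in (B.6)."""
    return mu_a + (1.0 - m / T) * expected_gain(mu_b, sigma_b, tau)


def expected_gain_numeric(mu_b, sigma_b, tau, half_width=10.0, n=20000):
    """Trapezoidal approximation of E[beta {2 Phi(beta / tau) - 1}]."""
    step = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        w = -half_width + k * step
        beta = mu_b + sigma_b * w
        weight = 0.5 if k in (0, n) else 1.0  # end points get half weight
        total += weight * beta * (2.0 * Phi(beta / tau) - 1.0) * phi(w)
    return total * step
```

Setting `m = 0` in `expected_mean_outcome` removes the factor $(1 - m/T)$, which makes the cost of a longer experimentation phase easy to see by comparison.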

For the least squares estimator with $\tau_i^2 = \lambda_i \sigma^2/m$ and $\mu_B = 0$, (B.6) reduces to

$E ( \bar y_i) = \mu_A + \left(1 - \frac{m}{T} \right) \frac{2 \sigma_B^2}{\sqrt{\sigma_B^2 + \lambda_i \sigma^2/m}} \phi (0).$

Hence, maximizing $E(\bar y_i)$ with respect to $m$ is equivalent to maximizing

$h(m) = \left(1 - \frac{m}{T} \right) \frac{1}{ \sqrt{\xi_i + 1/m}}$

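where $\xi_i = \sigma_B^2 / (\lambda_i \sigma^2)$. Setting $h'(m) = 0$ gives the quadratic equation $2 \xi_i m^2 + 3m - T = 0$, whose positive root is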

$m^* = \frac{\sqrt{ 9 + 8 \xi_i T} - 3}{4 \xi_i }.
\ \ \ \ \ \ \ \ \ \ \text{(B.7)}$

Rationalizing the numerator of (B.7) gives the equivalent expression

$m^* = \frac{ 2T } { \sqrt{9 + 8 \xi_i T} + 3}.$

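Now, since $\sqrt{9 + 8 \xi_i T} > 3$ whenever $\xi_i T > 0$, we have $m^* < T/3$, which establishes Main Result 1.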
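The optimal experimentation length can also be examined numerically. The Python sketch below evaluates (B.7) and its rationalized form, and locates the maximizer of $h(m)$ by a simple grid search over $0 < m < T$. The function names and the grid resolution are illustrative assumptions. The search treats $m$ as continuous, so the result would still need to be rounded to a feasible number of periods in practice.

```python
import math


def m_star(xi, T):
    """Optimal experimentation length from (B.7)."""
    return (math.sqrt(9.0 + 8.0 * xi * T) - 3.0) / (4.0 * xi)


def m_star_alt(xi, T):
    """Rationalized form of (B.7); the denominator exceeds 6 when xi * T > 0."""
    return 2.0 * T / (math.sqrt(9.0 + 8.0 * xi * T) + 3.0)


def h(m, xi, T):
    """Objective h(m) = (1 - m/T) / sqrt(xi + 1/m)."""
    return (1.0 - m / T) / math.sqrt(xi + 1.0 / m)


def grid_argmax(xi, T, steps=20000):
    """Grid search for the maximizer of h over the open interval (0, T)."""
    best_m, best_h = None, -math.inf
    for k in range(1, steps):
        m = T * k / steps
        value = h(m, xi, T)
        if value > best_h:
            best_m, best_h = m, value
    return best_m
```

For any positive `xi` and `T`, `m_star(xi, T)` and `m_star_alt(xi, T)` should coincide up to rounding error, the grid maximizer should fall within a grid step of them, and all three should lie below `T / 3`.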

In this section, we derive the expressions involved in the power of the

Recall that

$\begin{aligned}
E(y_i^*) &= \frac{1}{T-m} \sum_{t=m+1}^T E(\alpha_i + \beta_i x_{it} + \epsilon_{it} ) \\
&= \frac{1}{T-m} \sum_{t=m+1}^T E(\alpha_i + \beta_i x_i^* + \epsilon_{it} ) \\
&= \mu_A + E(\beta_i x_i^*), \end{aligned}$

and analogously $E(y_i') = \mu_A + \mu_B (2 p_1 - 1)$, so that

$\begin{aligned}
\Delta &= E( \beta_i x_i^*) - \mu_B (2 p_1 - 1) \\
&= \mu_B \left\{ 2 \Phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right) - 1 \right\} +
\frac{2 \sigma_B^2}{\sqrt{ \sigma_B^2 + \tau_i^2}} \phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right)
- \mu_B (2 p_1 - 1) \\
&= 2 \mu_B \left\{ \Phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right) - p_1 \right\}
+
\frac{2 \sigma_B^2}{\sqrt{ \sigma_B^2 + \tau_i^2}} \phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right) \end{aligned} \ \ \ \ \text{(C.1)}$

where

Next,

$\begin{aligned}
\text{var}(y_i^*) &= \text{var}\left\{ \alpha_i + \beta_i x_{i}^* + \sum_{t=m+1}^T \epsilon_{it} / (T-m) \right\} \\
&= \sigma_A^2 + \text{var}( \beta_i x_i^*) + \frac{\sigma^2}{T-m} \\
&= \sigma_A^2 + \sigma_B^2 + \mu_B^2 - \{ E(\beta_i x_{i}^*) \}^2 + \frac{\sigma^2}{T-m} \\
&= \sigma_A^2 + \sigma_B^2 + \mu_B^2 - \{ \Delta + \mu_B (2p_1-1) \}^2 + \frac{\sigma^2}{T-m}
:= \sigma^{*2}. \end{aligned}$

The last equality is a result of (C.1). Similarly, we can show

$\begin{aligned}
\text{var}(y_i') &= \sigma_A^2 + \sigma_B^2 + \mu_B^2 - \mu_B^2 (Ex_i')^2 + \frac{\sigma^2}{T-m} \\ &=
\sigma_A^2 + \sigma_B^2 + \mu_B^2 - \mu_B^2 (2 p_1 - 1)^2 + \frac{\sigma^2}{T-m}.\end{aligned}$

Finally, under the null $\mu_B = 0$, the standardized effect size is

$\frac{\Delta}
{ \sqrt{ \text{var}(y_i^*) + \text{var}(y_i')}} =
\frac{2 \sigma_B^2 \phi(0)}{\sqrt{ \sigma_B^2 + \tau_i^2} \sqrt{2 \sigma_A^2 + 2 \sigma_B^2 - \frac{4 \sigma_B^4 \phi^2(0)}{ \sigma_B^2 + \tau_i^2}+ 2 \sigma^2 / (T-m)} }.$

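Main Result 2 is proved by dividing both the numerator and the denominator of this expression by $\sigma_B^2$ and noting that the resulting denominator is decreasing in $\sigma_B^2$.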
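The monotonicity behind Main Result 2 can be illustrated numerically. The Python sketch below evaluates the standardized effect size under $\mu_B = 0$ for several values of $\sigma_B$. The values $\sigma_A = 4.8$, $\sigma = 1.6$, $m = 4$, and $T = 18$ are taken from the caption of Table 1. The choice $\tau_i = \sigma/\sqrt{m}$ (that is, $\lambda_i = 1$) and the grid of $\sigma_B$ values are assumptions made only for this illustration.

```python
import math

PHI0 = 1.0 / math.sqrt(2.0 * math.pi)  # standard normal density at zero


def effect_size(sigma_a, sigma_b, sigma, tau, m, T):
    """Delta / sqrt(var(y*) + var(y')) under mu_B = 0."""
    s2 = sigma_b ** 2 + tau ** 2
    delta = 2.0 * sigma_b ** 2 * PHI0 / math.sqrt(s2)
    var_sum = (
        2.0 * sigma_a ** 2
        + 2.0 * sigma_b ** 2
        - delta ** 2  # equals 4 sigma_B^4 phi(0)^2 / (sigma_B^2 + tau^2)
        + 2.0 * sigma ** 2 / (T - m)
    )
    return delta / math.sqrt(var_sum)


def effect_sizes_over_grid(sigma_b_values, sigma_a=4.8, sigma=1.6, m=4, T=18):
    """Effect sizes for a grid of sigma_B, assuming tau = sigma / sqrt(m)."""
    tau = sigma / math.sqrt(m)  # assumption: lambda_i = 1
    return [effect_size(sigma_a, sb, sigma, tau, m, T) for sb in sigma_b_values]
```

Under these assumptions, `effect_sizes_over_grid([0.5, 1.0, 2.0, 4.8, 10.0])` should return an increasing sequence, in line with the statement that power increases with the heterogeneity $\sigma_B^2$.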

For the situations where the physicians have patient-specific knowledge to inform treatments under the SOC, we may postulate that

$x_i' = \left\{ \begin{array}{ll}
2 I(\beta_i >0) - 1 & \text{with probability $\theta_C$} \\
1 & \text{with probability $(1-\theta_C)p_1$} \\
-1 & \text{with probability $(1-\theta_C)(1-p_1)$}.
\end{array}
\right. \ \ \ \ \ \text{(C.2)}$

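The parameter $\theta_C \in [0, 1]$ is the probability that the SOC assigns the treatment that is truly better for patient $i$. Under (C.2),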

$E(\beta_i x_i') = E \left\{ \beta_i E(x_i' | \beta_i) \right\} =
2 \theta_C E \left\{\beta_i I(\beta_i >0) \right\} - \theta_C \mu_B
+ (1-\theta_C)\mu_B (2p_1-1)
\ \ \ \ \ \text{(C.3)}$

where

$E \{\beta_i I(\beta_i >0) \} = \sigma_B \phi( \mu_B/\sigma_B
) + \mu_B \Phi( \mu_B/\sigma_B).
\ \ \ \ \ \text{(C.4)}$

Using (B.5), (C.3), and (C.4), after some algebra, we have

$\begin{aligned}
\Delta &= E( \beta_i x_i^*) - E( \beta_i x_i') \\
&= 2 (1-\theta_C) \mu_B \left\{ \Phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right) - p_1 \right\}
+
\frac{2 (1 - \theta_C) \sigma_B^2}{\sqrt{ \sigma_B^2 + \tau_i^2}} \phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right) \\
&\quad + 2 \theta_C \mu_B \left\{ \Phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right) - \Phi\left( \frac{\mu_B}{\sigma_B} \right)\right\} +
2 \theta_C \left[
\frac{\sigma_B^2}{\sqrt{ \sigma_B^2 + \tau_i^2}} \phi \left( \frac{\mu_B}{\sqrt{\sigma_B^2 + \tau_i^2}} \right) - \sigma_B \phi \left(\frac{\mu_B}{\sigma_B}\right)
\right]. \end{aligned}$
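The "some algebra" leading to this expression can be verified numerically. The Python sketch below computes $\Delta$ in two ways: from the final closed form, and directly as $E(\beta_i x_i^*) - E(\beta_i x_i')$ using (B.5), (C.3), and (C.4). The function names and the parameter values used to exercise them are illustrative assumptions.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def delta_closed_form(mu_b, sigma_b, tau, p1, theta_c):
    """Quality improvement Delta from the final displayed expression."""
    s = math.sqrt(sigma_b ** 2 + tau ** 2)
    Phi_s, phi_s = Phi(mu_b / s), phi(mu_b / s)
    uninformed = 2.0 * (1.0 - theta_c) * (
        mu_b * (Phi_s - p1) + sigma_b ** 2 / s * phi_s
    )
    informed = 2.0 * theta_c * (
        mu_b * (Phi_s - Phi(mu_b / sigma_b))
        + sigma_b ** 2 / s * phi_s
        - sigma_b * phi(mu_b / sigma_b)
    )
    return uninformed + informed


def delta_direct(mu_b, sigma_b, tau, p1, theta_c):
    """Delta computed as E(beta x*) - E(beta x') from (B.5), (C.3), (C.4)."""
    s = math.sqrt(sigma_b ** 2 + tau ** 2)
    gain_n_of_1 = mu_b * (2.0 * Phi(mu_b / s) - 1.0) + 2.0 * sigma_b ** 2 / s * phi(mu_b / s)
    positive_part = sigma_b * phi(mu_b / sigma_b) + mu_b * Phi(mu_b / sigma_b)  # (C.4)
    gain_soc = (
        2.0 * theta_c * positive_part
        - theta_c * mu_b
        + (1.0 - theta_c) * mu_b * (2.0 * p1 - 1.0)
    )  # (C.3)
    return gain_n_of_1 - gain_soc
```

The two functions should agree up to rounding error for any admissible inputs. With `theta_c = 0` the closed form reduces to (C.1), and with `theta_c = 1` and `tau > 0` it is negative, since an SOC that always picks the truly better treatment cannot be improved upon by a noisy estimate.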