A dose finding design for seizure reduction in neonates
Summary
Clinical trials in vulnerable populations are extremely difficult to conduct. A sequential phase I–II trial aimed at finding the appropriate dose of levetiracetam for treating neonatal seizures was planned with a maximum sample size of 50 newborns. Three primary outcomes are considered: efficacy and two types of toxicity that occur at the same time but are measured at different time points. In the case of failure, physicians could add a second agent as a rescue medication. The primary outcomes were modelled via a logistic model for efficacy and a weighted likelihood with pseudo-outcomes for the two toxicities taking into account the dependences under Bayesian inference. Simulations were conducted to assess the design properties.
1 Introduction
The aim of early phase dose finding trials is to obtain reliable information on a drug's safety, tolerability, pharmacokinetics, mechanism of action and trends regarding efficacy. Usually, these trials are performed on healthy adult volunteers, except when the drug is very toxic, as in oncology. In paediatric clinical trials, the practice of including healthy infants in phase I studies only for safety assessment is generally considered unethical. Drugs or procedures are therefore often evaluated directly for efficacy in clinical trials (Gill, 2004), with safety stopping rules to protect infants from toxic drugs or procedures. Such trials are often known as phase I–II trials (Yuan et al., 2016), where efficacy and toxicity are studied simultaneously. Many dose finding designs have been proposed for adults in the oncology setting (Zohar and O’Quigley, 2006; Yuan et al., 2016), but only a few were specifically developed for paediatrics or for indications other than oncology. Thall et al. (2014) proposed a dose finding method for neonates with respiratory distress syndrome based on three clinical outcomes.
Conducting early phase clinical trials in neonates is challenging. Correct dosing is hampered by the fast physiological changes that occur in neonates at this stage of development (Coppini et al., 2016). Neonates are not simply very small adults or ‘young’ children: their metabolism is completely different from that of adults and older children. Furthermore, there is no direct relationship, as a function of body surface or allometry, that links the pharmacokinetic and pharmacodynamic variables, such as the clearance or the absorption constant of the drug (Petit et al., 2016a, b). As a result, the definitions of efficacy and toxicity end points for neonates are often substantially different from those for adults or children (2 years old or more). In addition, selecting proper efficacy and toxicity end points and measuring them in neonates are more difficult and subjective (Denne, 2012; Thall et al., 2014; Coppini et al., 2016). For example, because neurological damage cannot be measured before 1 or 2 years after birth, surrogate end points, such as anaphylactic shock or long duration apnoea, must be used as a measure of neurological damage in neonates. In our motivating trial, one potential adverse event (AE) that is caused by the treatment is hearing loss. Such an AE is easy to capture in real time for adults but difficult to measure in neonates: a specific hearing test must be scheduled and performed to diagnose it. Because of those difficulties, coupled with the many ethical challenges, dose finding in neonates has largely been done in an ad hoc way without formal statistical modelling and considerations.
Dose finding in neonates therefore raises several challenges, including
- the definition of multiple types of toxicity, which can be observed or measured at different times after the treatment and can be correlated,
- the addition of another rescue drug or treatment, after which it may be unclear whether the resulting toxicity is due to the test treatment or to the additional one, and
- the small target probability of an AE that is accepted for the treatment.
These characteristics are not limited to clinical trials in neonates or paediatrics; they also arise, for example, in rare diseases in adults. In what follows, we address these challenges in the context of a motivating trial in newborns.
In this paper, we propose a Bayesian phase I–II design for the ‘Levetiracetam treatment of neonatal seizures: safety and efficacy phase II study’ (called the ‘LEVNEONAT’ trial; registration number NCT 02229123 at www.ClinicalTrials.gov) to find the optimal dose of levetiracetam for treating seizures in neonates. As detailed later, this trial has some challenges that are associated with treating neonates. For example, hearing loss cannot be measured in real time and is only ascertained at day 30, and a new drug may be added during the course of the treatment if clinicians believe that levetiracetam alone is not adequately effective in reducing seizures. To handle these challenges, we model three end points (one efficacy and two toxicity end points) and use a pseudolikelihood approach for inference. On the basis of accumulating data, we continuously update the model estimates and adaptively assign doses to new patients.
The remainder of this paper is organized as follows. In Section 2, we describe our motivating clinical trial and some challenges. In Section 3, we propose the new design, including statistical models and the dose assignment rule. The simulation settings and results are presented in Section 4. Finally, a discussion is given in Section 5.
The programs that were used to analyse the data can be obtained from
2 Motivating trial
The aim of this paper is to propose a dose finding design for the LEVNEONAT clinical trial based on the experiences from the ‘Neonatal seizures with medication off-patent’ trial (called the ‘NEMO’ trial; NCT01434225 in www.ClinicalTrials.gov) (Pressler et al., 2015). The NEMO trial is an open label phase I–II dose finding trial conducted between 2011 and 2013. The objective of the trial was to find the optimal dose of bumetanide that achieved the maximum seizure reduction with an acceptable safety profile out of four study doses (i.e. 0.05, 0.1, 0.2 and 0.3 mg kg^{−1}). The primary efficacy end point was defined as the reduction of the electrographic seizure burden by 80% or more within hours 3 and 4 after the first bumetanide administration compared with the baseline. The safety end point was binary and defined as the occurrence of a list of AEs within 48 h after the first dose. The lowest acceptable efficacy response rate was 50%, and the maximum tolerable toxicity rate was 10%. A phase I–II dose finding design with dual binary efficacy and safety end points was used (Zohar and O'Quigley, 2006). 14 evaluable neonates were included in the trial. Four neonates were included at a dose of 0.05 mg kg^{−1}, three neonates at a dose of 0.1 mg kg^{−1}, six neonates at a dose of 0.2 mg kg^{−1} and one at a dose of 0.3 mg kg^{−1}. During the trial, no major AE was observed according to the definition that was specified in the protocol. However, after 14 neonates had been accrued, an unexpected AE was observed: three neonates experienced hearing loss at different doses. These AEs might have occurred during the treatment phase but could only be measured later by using a specific test as babies could not express this AE earlier. Fig. 1 shows the estimated dose–efficacy and the dose–toxicity relationships with or without including hearing loss as an AE after the accrual of 14 neonates. 
After including hearing loss as an AE, the model fitted indicated that all doses were unsafe, and thus the trial was terminated early following a recommendation by the Data and Safety Monitoring Board.
Based on these results, a second trial, LEVNEONAT (registration number NCT 02229123 at www.ClinicalTrials.gov), was planned for the same indication but with a different drug. The aim of this new trial is to find the optimal dose of levetiracetam out of the four doses 30, 40, 50 and 60 mg kg^{−1}. Fig. 2 shows the dosing schedule and end point measurement scheme. The loading dose is given at time 0, and after 4 h the efficacy end point is evaluated. Between hours 6 and 64, up to eight maintenance doses, each defined as a quarter of the loading dose, are administered. After 6 days, the first toxicity end point, which is referred to as the ‘short-term’ toxicity, is measured. The second toxicity end point (i.e. hearing loss), which is referred to as the ‘long-term’ toxicity, is assessed after 30 days or when the neonate is released from the hospital, whichever occurs first. During the treatment, the investigators have the option to add a second agent A2 as a rescue medication when they believe that levetiracetam is not effective. The type of agent to be added is at the discretion of the investigator, with the possibility of reducing the maintenance dose. The investigators wanted the dose finding method to reflect clinical practice as closely as possible; in particular, it needed
- to account for not only efficacy and short-term toxicity, as in the NEMO trial, but also long-term toxicity (i.e. hearing loss) that cannot be measured earlier,
- to consider not only the loading dose but also the number (or quantity) of maintenance doses of levetiracetam and
- to account for the fact that the second agent A2 might be added during the course of treatment, and thus toxicity might be caused by levetiracetam, A2, or both.
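As a rough illustration of the dosing schedule described above, the following Python sketch computes the maintenance dose and the cumulative exposure for each loading dose. The function name `cumulative_dose` is hypothetical, and the printed totals assume that all eight maintenance doses are given, which the protocol allows but does not require.

```python
# Loading doses studied in LEVNEONAT (mg/kg), as stated in the text.
LOADING_DOSES = (30, 40, 50, 60)
MAX_MAINTENANCE = 8  # up to eight maintenance doses between hours 6 and 64


def cumulative_dose(loading, n_maintenance):
    """Total levetiracetam (mg/kg): loading dose plus n quarter-doses."""
    if not 0 <= n_maintenance <= MAX_MAINTENANCE:
        raise ValueError("n_maintenance must be between 0 and 8")
    return loading + n_maintenance * loading / 4


for d in LOADING_DOSES:
    # loading dose, single maintenance dose, maximum cumulative dose
    print(d, d / 4, cumulative_dose(d, MAX_MAINTENANCE))
```

Because each maintenance dose is a quarter of the loading dose, the cumulative exposure can reach three times the loading dose, which is one reason why the number of maintenance doses matters for the toxicity models.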
We considered two toxicity end points rather than one combined end point for clinical and logistical, rather than statistical, reasons. The AE definitions differ for short-term and long-term toxicities, so from a medical viewpoint the end points cannot be merged, and we are interested in estimating each one separately. Moreover, monitoring babies for toxicity during the first 6 days of life is already difficult. If the second outcome cannot be observed between day 6 and day 30, there is no reason to require medical staff to undertake close monitoring over that period.
The model that was proposed by Thall et al. (2014), which uses elicited numerical utilities for the possible composite outcomes arising from two efficacy outcomes and one safety outcome, cannot be adapted to this setting, since it does not take into account the timing of assessment of the different outcomes. Another three-outcome model was presented by Zhong et al. (2012), who proposed a trivariate continual reassessment method (CRM) for a toxicity, an efficacy and a surrogate efficacy end point. However, even if the surrogate efficacy end point were replaced by a surrogate toxicity end point, the assumption of a surrogate end point is not suitable in this trial: the short-term toxicity is not a surrogate of the long-term toxicity. Moreover, Thall et al. (2014) and Zhong et al. (2012) did not consider adding a second agent during the course of treatment. Here, we model three end points and propose the use of a pseudolikelihood approach for inference.
3 Methods
In this section, we present three statistical models that describe the relationships between the dose and efficacy, short-term toxicity (denoted as T1) and long-term toxicity (denoted as T2) respectively. These models will be used to guide the dose allocation and selection. The correlation between efficacy and toxicity was not taken into account since in previous studies it was negligible. Let d_{k}, k=1,…,K, be the loading doses and d_{[i]} be the dose that is administered to the ith subject. Let y_{E,i} be a binary efficacy indicator that takes a value of 1 if the ith subject experiences efficacy and 0 otherwise, y_{T1,i} be a binary short-term toxicity indicator that takes a value of 1 if the ith subject experiences short-term toxicity T1 and 0 otherwise, and y_{T2,i} be the corresponding binary indicator for long-term toxicity T2.
3.1 Dose–efficacy model
3.2 Short-term toxicity model
The short-term toxicity T1 is assessed within 6 days from the initiation of the treatment. As shown in Fig. 2, one challenge here is that, when clinicians believe that levetiracetam is not adequately effective in reducing seizures, they may reduce or stop the maintenance dose and add a new agent A2 to boost the treatment effect. This makes the modelling of T1 more complicated than in standard dose finding trials. The evaluation of the toxicity of levetiracetam is confounded by the possible addition of A2 and affected by the number of maintenance doses that a baby actually received. In other words, when toxicity is observed after adding A2, we do not know whether that toxicity comes from levetiracetam, A2 or both. The second challenge is that, although the assessment period for T1 is short (i.e. 6 days), new babies could arrive in hospitals at any time and require immediate treatment. Thus, the so-called ‘late onset outcome’ problem may occur: when a new baby arrives, some enrolled babies may not have completed the 6-day toxicity evaluation, which hinders the adaptive decision of dose assignment for the new baby. As noted by Liu et al. (2013) and Jin et al. (2014), whether there is a late onset outcome problem depends not only on the length of the assessment period but also on the accrual rate. In the LEVNEONAT trial, the assessment period (i.e. 6 days) is shorter than in most trials but, as the accrual rate is fast, we may still face the late onset outcome problem. We handle these two challenges in a unified framework using a weighted pseudolikelihood approach.
As the sample size is small and the number of toxicities observed in the trial is even smaller, it is critical to choose an appropriate prior for ζ to avoid an extremely noisy estimate. We elicit the prior distribution of ζ from clinicians as follows. We provide several different distributions of time to toxicity to clinicians and ask them to pick the most likely one. Fig. 3 shows the distributions that we showed to our clinical collaborators. Distribution (b) was picked as the most likely. We then assign ζ a gamma prior distribution with mean matched to that of the distribution picked. For the LEVNEONAT trial, we set ζ∼Ga(5,1) since the prior mean of distribution (b) was 5. Fig. 3 also shows how the parameterized beta distribution can capture various shapes in which toxicity is expected to occur at the beginning of the period. However, if the posterior estimate of ζ is less than 1, this shape is reflected and toxicity is more likely to occur at the end of the period. For the LEVNEONAT clinical trial ζ was taken to be the same for all doses, to limit model complexity. Nevertheless, under the monotonicity assumption, the higher the dose, the earlier toxicity occurs, and ζ could then depend on the dose by setting , where λ<0 and z_{k} is a transformed value of d_{k} constrained to the interval [0,1] (Braun, 2006).
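The role of ζ can be illustrated with a small Python sketch. It assumes, purely for illustration, that the scaled time to toxicity u in [0,1] follows a Beta(1, ζ) distribution, whose distribution function is 1 − (1 − u)^ζ; the exact parameterization used in the trial is not reproduced here. The function names are hypothetical.

```python
import random


def scaled_time_cdf(u, zeta):
    """CDF at scaled time u of a Beta(1, zeta) law (assumed form, see lead-in)."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    return 1.0 - (1.0 - u) ** zeta


def prior_draws(n, shape=5.0, rate=1.0, seed=1):
    """Draws of zeta from the Ga(5, 1) prior stated in the text."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    return [rng.gammavariate(shape, 1.0 / rate) for _ in range(n)]


# zeta > 1 puts most toxicities early, zeta = 1 is uniform, zeta < 1 late.
for zeta in (5.0, 1.0, 0.5):
    print(zeta, scaled_time_cdf(0.2, zeta))

draws = prior_draws(20000)
print(sum(draws) / len(draws))  # Monte Carlo estimate of the prior mean
```

Under this assumed form, ζ=5 places roughly two-thirds of toxicities in the first fifth of the assessment window, whereas ζ<1 pushes them towards the end, in line with the reflected shape described above.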
We use the term pseudolikelihood in a general sense: the likelihood that is yielded by equation 3 is not necessarily the true likelihood because we attach an empirical weight to the toxicity probability p_{T1}. In the special case that the time to toxicity follows a uniform distribution, equation 3 leads to the true likelihood. Without the weight, y^{*} follows the quasi-Bernoulli likelihood (Gourieroux et al., 1984; McCullagh and Nelder, 1989). Because of the weights, pseudolikelihood is the more appropriate term.
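To make the idea of a weighted quasi-Bernoulli likelihood concrete, the following Python sketch evaluates a generic log-pseudolikelihood in which each subject has a pseudo-outcome y* in [0,1] and a weight w in (0,1] that multiplies the toxicity probability. This is an assumed, generic form in the spirit of time-to-event weighting, not a reproduction of equation 3; the function name is hypothetical.

```python
import math


def weighted_pseudo_loglik(p, y_star, weights=None):
    """Sum over subjects of y*·log(w·p) + (1 − y*)·log(1 − w·p).

    p: toxicity probability at the dose, 0 < p < 1.
    y_star: pseudo-outcomes in [0, 1] (binary outcomes are a special case).
    weights: weights in (0, 1]; defaults to 1 (plain quasi-Bernoulli).
    """
    if weights is None:
        weights = [1.0] * len(y_star)
    total = 0.0
    for y, w in zip(y_star, weights):
        q = w * p  # weighted toxicity probability
        if not 0.0 < q < 1.0:
            raise ValueError("w * p must lie strictly between 0 and 1")
        total += y * math.log(q) + (1.0 - y) * math.log(1.0 - q)
    return total


# With binary outcomes and unit weights this is the ordinary Bernoulli case.
print(weighted_pseudo_loglik(0.1, [1, 0, 0]))
# A subject followed for only part of the window contributes less evidence
# of safety than a subject who completed follow-up without toxicity.
print(weighted_pseudo_loglik(0.1, [0], [0.5]), weighted_pseudo_loglik(0.1, [0], [1.0]))
```

With all weights equal to 1 and binary outcomes, the expression reduces to the usual Bernoulli log-likelihood, which is the sense in which the weights turn a likelihood into a pseudolikelihood.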
3.3 Long-term toxicity model
3.4 Avoiding stickiness
3.5 Dose allocation rule
To ensure that the trial is ethically acceptable, constraints on both safety and efficacy were imposed. At the inclusion of each new cohort, the aim is to assign to the patient(s) the most effective dose that is also sufficiently safe; if all the doses are too toxic or not sufficiently efficacious, the trial must be stopped.
- P(p_{T1}>τ_{T1}+ε_{1})<g(n),
- and
where ε_{1}, ε_{2} and ε_{E} are specified constants as discussed below.
The errors ε_{E}, ε_{1} and ε_{2} were set equal to 0.02, based on a sensitivity analysis, and in the LEVNEONAT clinical trial τ_{T1}=τ_{T2}=0.1 and τ_{E}=0.6. If no dose is eligible, because the minimum effective dose is higher than the maximum tolerated dose, the trial is stopped. Furthermore, the trial is stopped if , or , i.e. if the first dose is too toxic or the last dose is not sufficiently efficacious, similarly to what is proposed in Thall and Cook (2004). A no-skipping rule is applied, i.e. a dose level can be assigned only if at least one patient has been allocated to each lower dose.
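The allocation logic above can be sketched in Python from posterior draws. Only the short-term toxicity constraint P(p_{T1}>τ_{T1}+ε_{1})<g(n) is taken from the text; the analogous constraints for T2 and for efficacy, the choice of the admissible dose with the highest posterior mean efficacy, and the names `tail_prob` and `choose_dose` are assumptions made for illustration. The value of g(n) is passed in as `g_n` because its form is not given here.

```python
def tail_prob(draws, threshold, upper=True):
    """Monte Carlo estimate of P(p > threshold), or P(p < threshold)."""
    hits = sum((x > threshold) if upper else (x < threshold) for x in draws)
    return hits / len(draws)


def choose_dose(post_t1, post_t2, post_e, n_treated, g_n,
                tau_t1=0.1, tau_t2=0.1, tau_e=0.6, eps=0.02):
    """Return the index of the dose to assign, or None to stop the trial.

    post_t1, post_t2, post_e: one list of posterior draws per dose.
    n_treated: number of patients already treated at each dose.
    """
    best, best_eff = None, -1.0
    for k in range(len(post_e)):
        # No-skipping rule: every lower dose must already have been used.
        if any(n == 0 for n in n_treated[:k]):
            break
        safe = (tail_prob(post_t1[k], tau_t1 + eps) < g_n
                and tail_prob(post_t2[k], tau_t2 + eps) < g_n)
        # Assumed efficacy constraint: P(p_E < tau_E - eps) < g(n).
        effective = tail_prob(post_e[k], tau_e - eps, upper=False) < g_n
        mean_eff = sum(post_e[k]) / len(post_e[k])
        if safe and effective and mean_eff > best_eff:
            best, best_eff = k, mean_eff
    return best
```

Returning `None` corresponds to the case of no eligible dose, in which the trial is stopped; in practice the draws would come from the Hamiltonian Monte Carlo output of the fitted models.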
4 Evaluation of the method proposed
4.1 Simulation setting
The performance of the proposed trial design was evaluated through six scenarios (additional scenarios are given in the Web appendix A). For each scenario, 1000 phase I–II trials were simulated. Cohorts of two newborns and sample sizes of 30, 40 and 50 neonates were used, assuming an accrual rate of one newborn per 15 days. The skeletons were elicited from LEVNEONAT investigators, and were p_{E}=(0.5,0.6,0.7,0.8), p_{T1}=(0.005,0.05,0.1,0.2) and p_{T2}=(0.001,0.01,0.05,0.1) for efficacy, short-term toxicity and long-term toxicity respectively. The investigators were asked to give their estimates of those probabilities and then to reach a consensus. These skeletons are therefore the consensus results, and we used them in all simulations. We did not change them since they reflect clinical judgement; however, we tested them in several scenarios, i.e. we changed the position of the true dose to be selected. The time to toxicity was simulated from an exponential distribution with rate 1/40 h^{−1}, and the number of maintenance doses follows a beta–binomial distribution with a=7 and b=6 to be close to the total number of maintenance doses. This choice reflects the physicians' behaviour of trying to administer all maintenance doses. For simplicity, A2, if added, was considered to be added after the efficacy evaluation. The target probabilities that were chosen for simulations were those specified in the LEVNEONAT protocol, i.e. τ_{T1}=τ_{T2}=0.1 and τ_{E}=0.6.
- for the probability of T2 without T1,
- for the probability of T2 along with T1 and
- , for the probability of T2 when A2 is added.
For simplicity, only marginal probabilities p_{T2,true} were reported, but all values can be found in Web Table 1 in the Web appendix A.
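The random components of the simulation setting described above (time to toxicity and number of maintenance doses) can be sketched with the Python standard library. The rate 1/40 h^{−1} and the beta–binomial parameters a=7 and b=6 come from the text; taking eight as the number of binomial trials is an assumption based on the maximum number of maintenance doses, and the function names are hypothetical.

```python
import random

RATE_PER_HOUR = 1 / 40   # exponential rate for time to toxicity (from the text)
MAX_MAINTENANCE = 8      # assumed number of binomial trials (maximum doses)
A, B = 7, 6              # beta-binomial parameters (from the text)


def draw_time_to_toxicity(rng):
    """Time to toxicity in hours, exponential with rate 1/40 per hour."""
    return rng.expovariate(RATE_PER_HOUR)


def draw_maintenance_count(rng):
    """Beta-binomial draw: p ~ Beta(A, B), then count ~ Binomial(8, p)."""
    p = rng.betavariate(A, B)
    return sum(rng.random() < p for _ in range(MAX_MAINTENANCE))


rng = random.Random(2017)
times = [draw_time_to_toxicity(rng) for _ in range(10000)]
counts = [draw_maintenance_count(rng) for _ in range(10000)]
print(sum(times) / len(times))    # sample mean, expected value 40 h
print(sum(counts) / len(counts))  # sample mean, expected value 8 * 7 / 13
```

Drawing the success probability from a beta distribution before the binomial step gives more dispersion in the number of maintenance doses than a plain binomial with the same mean.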
| | Dose 1 | Dose 2 | Dose 3 | Dose 4 | PCS, n=30 | PCS, n=40 | PCS, n=50 |
|---|---|---|---|---|---|---|---|
| Scenario 1 (recommended dose 3) | | | | | | | |
| p_{T1,true} | 0.001 | 0.01 | 0.1 | 0.2 | M_{1} 0.673 | M_{1} 0.737 | M_{1} 0.798 |
| p_{T2,true} | 0.001 | 0.01 | 0.1 | 0.2 | M_{2} 0.582 | M_{2} 0.685 | M_{2} 0.766 |
| p_{E,true} | 0.6 | 0.7 | 0.8 | 0.9 | | | |
| p_{a} | 0 | 0 | 0 | 0 | | | |
| Scenario 2 (recommended dose 3) | | | | | | | |
| p_{T1,true} | 0.001 | 0.01 | 0.1 | 0.2 | M_{1} 0.641 | M_{1} 0.742 | M_{1} 0.788 |
| p_{T2,true} | 0.001 | 0.01 | 0.1 | 0.2 | M_{2} 0.53 | M_{2} 0.657 | M_{2} 0.717 |
| p_{E,true} | 0.6 | 0.7 | 0.8 | 0.9 | | | |
| p_{a} | 0.5 | 0.5 | 0.5 | 0.5 | | | |
| | 0.005 | 0.05 | 0.15 | 0.25 | | | |
| | 0.005 | 0.05 | 0.15 | 0.25 | | | |
| Scenario 3 (recommended dose 4) | | | | | | | |
| p_{T1,true} | 0.001 | 0.001 | 0.01 | 0.1 | M_{1} 0.8 | M_{1} 0.839 | M_{1} 0.871 |
| p_{T2,true} | 0.001 | 0.006 | 0.026 | 0.09 | M_{2} 0.698 | M_{2} 0.742 | M_{2} 0.781 |
| p_{E,true} | 0.5 | 0.6 | 0.7 | 0.8 | | | |
| p_{a} | 0.5 | 0.5 | 0.5 | 0.5 | | | |
| | 0.005 | 0.005 | 0.05 | 0.15 | | | |
| | 0.005 | 0.005 | 0.05 | 0.15 | | | |
^{a} In the second to fifth columns, the values of p_{T1}, p_{T2} and p_{E}, along with p_{a} and the other probabilities used in simulations, are summarized for each dose. In the sixth to eighth columns, the percentages of correct selection, PCS, are given.
All the scenarios were simulated with (M_{1}) or without (M_{2}) the relevance weights associated with the pseudolikelihood scheme. The percentage of correct dose selection, PCS, at the end of the trial, the number of neonates that experienced toxicities, n_{tox}, and the dose allocation percentages were compared to evaluate the performance of the proposed design. The posterior quantities were computed by Hamiltonian Monte Carlo sampling, using RStan version 2.6.0 (Stan Development Team, 2016).
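When comparing PCS values between M_{1} and M_{2}, it helps to keep the simulation error in mind. The following Python sketch computes a PCS from a list of selected doses and the binomial Monte Carlo standard error for 1000 simulated trials, assuming that the simulated trials are independent. The function names are hypothetical.

```python
import math


def pcs(selected, correct):
    """Proportion of simulated trials that selected the correct dose.

    Trials that stopped early can be coded as None and count as incorrect.
    """
    return sum(s == correct for s in selected) / len(selected)


def mc_standard_error(p, n_trials=1000):
    """Binomial standard error of an estimated proportion p."""
    return math.sqrt(p * (1 - p) / n_trials)


print(pcs([3, 3, 2, 3], correct=3))
print(mc_standard_error(0.673))
```

For a PCS of 0.673 estimated from 1000 trials, the standard error is about 0.015, so differences of less than a few percentage points between two PCS values should be interpreted with caution.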
4.2 Results
Results are shown in Tables 1 and 2. More results in terms of the number of newborns that showed toxicity, n_{tox}, and dose allocation over the entire trial are given in appendix A of the Web-based supporting materials. Scenario 1, where A2 was not added, is a simple setting that evaluates the influence of the relevance weights, i.e. M_{1} versus M_{2}. In this scenario, M_{1} had a higher PCS than M_{2} for sample sizes of 30 patients and more, with values above 67%. Scenario 2 was similar to scenario 1 but with the administration of A2 associated with p_{a}=0.5. In this setting, the PCSs were higher than in scenario 1, as A2 allowed a better estimation of T1 and T2, keeping a similar amount of observed n_{tox} across trials (Table 1 in the Web appendix A). Again, the PCS obtained by using M_{1} exceeds that obtained by using M_{2}. In scenario 3, the optimal dose under toxicity restrictions was the last of the panel, and A2 was added; the PCS obtained was at least 80% by using M_{1}. A larger difference in PCS between M_{1} and M_{2} was observed, compared with scenarios 1 and 2.
| | Dose 1 | Dose 2 | Dose 3 | Dose 4 | PCS, n=30 | PCS, n=40 | PCS, n=50 |
|---|---|---|---|---|---|---|---|
| Scenario 4 (recommended dose 4) | | | | | | | |
| p_{T1,true} | 0.001 | 0.005 | 0.01 | 0.05 | M_{1} 0.841 | M_{1} 0.821 | M_{1} 0.804 |
| p_{T2,true} | 0.001 | 0.007 | 0.015 | 0.05 | M_{2} 0.766 | M_{2} 0.746 | M_{2} 0.722 |
| p_{E,true} | 0.3 | 0.4 | 0.5 | 0.6 | | | |
| p_{a} | 0.5 | 0.5 | 0.5 | 0.5 | | | |
| | 0.005 | 0.009 | 0.012 | 0.06 | | | |
| | 0.005 | 0.009 | 0.012 | 0.06 | | | |
| Scenario 5 (recommended dose 2) | | | | | | | |
| p_{T1,true} | 0.01 | 0.1 | 0.25 | 0.35 | M_{1} 0.619 | M_{1} 0.706 | M_{1} 0.768 |
| p_{T2,true} | 0.009 | 0.1 | 0.18 | 0.26 | M_{2} 0.623 | M_{2} 0.647 | M_{2} 0.68 |
| p_{E,true} | 0.6 | 0.7 | 0.8 | 0.9 | | | |
| p_{a} | 0.5 | 0.5 | 0.5 | 0.5 | | | |
| | 0.01 | 0.1 | 0.25 | 0.35 | | | |
| | 0.01 | 0.1 | 0.25 | 0.35 | | | |
| Scenario 6 (recommended dose 2) | | | | | | | |
| p_{T1,true} | 0.001 | 0.01 | 0.1 | 0.2 | M_{1} 0.623 | M_{1} 0.682 | M_{1} 0.713 |
| p_{T2,true} | 0.01 | 0.1 | 0.2 | 0.3 | M_{2} 0.623 | M_{2} 0.663 | M_{2} 0.689 |
| p_{E,true} | 0.6 | 0.7 | 0.8 | 0.9 | | | |
| p_{a} | 0.5 | 0.5 | 0.5 | 0.5 | | | |
| | 0.005 | 0.05 | 0.15 | 0.25 | | | |
| | 0.005 | 0.05 | 0.15 | 0.25 | | | |
^{a} In the second to fifth columns, the values of p_{T1}, p_{T2} and p_{E}, along with p_{a} and the other probabilities used in simulations, are summarized for each dose. In the sixth to eighth columns, the percentages of correct selection, PCS, are given.
In scenario 4, all doses were safe but only the last was considered efficacious with respect to the target of 60%. In this case, the PCSs were above 71% for all sample sizes under both M_{1} and M_{2}. Scenario 5 was selected to evaluate a situation where the probabilities of T1 remain the same whereas those of T2 increase when A2 is added. The PCS obtained was higher for M_{2} for a sample size of 30. In scenario 6, T1 and T2 were simulated independently of each other. The observed PCS, in this case, ranged from 62% to 71% across sample sizes under both M_{1} and M_{2}.
In the Web appendix A, two additional scenarios are given (scenario 7, in which all doses are too toxic, and scenario 8, in which no dose is efficacious) that evaluate the efficiency of our proposed stopping rules. In these cases, stopping was recommended in 90% of cases on average when all doses were too toxic and in 94% of cases on average when no dose was sufficiently efficacious.
In the Web appendix B, we compared the performance of a modification of the TITE CRM in which the two toxicities are combined in a single variable, Y_{T}. We ran simulations in six scenarios, chosen to highlight differences between our method and the modified TITE CRM method (referred to as M_{titecrm}). This simpler method tends to overdose patients and its PCSs are lower, especially for small sample sizes.
5 Discussion
The objective of our work was to propose a dose finding method for trials in paediatrics and, more specifically, in neonate populations when delayed toxicities are observed, as in the LEVNEONAT trial. To date, such approaches have been rare in the literature. Indeed, there are fewer clinical trials in neonates, and therefore only a few methods have been proposed or adapted for this vulnerable population. Recently, the European Medicines Agency and the Food and Drug Administration have proposed a modification of the ‘Guidance for Industry: E11 clinical investigation of medicinal products in the paediatric population’ in which the need for better designs and methods for paediatrics was pointed out. In this work, our models specifically take into account the practical issues that prevent us from using other methods that have been proposed for adults. More generally, this design could also be used to evaluate other drugs for treating seizures in neonates, or in other diseases where toxicities are correlated and a rescue agent is used when one treatment does not work. The models presented are very flexible and can easily be adapted to other situations. For example, the scaled time-to-toxicity part, which here was parameterized as a beta distribution, could also be included in the long-term toxicity model to take care of late onset toxicity. The weights that are used for creating the pseudo-observations and the relevance weights can be customized according to prior knowledge on the toxicity and efficacy of the drug. Moreover, since in our proposed method the two toxicities are estimated in a joint likelihood, it is very easy to add a new constraint on the probability that at least one of the two types of toxicity is lower than the unacceptable threshold. Indeed, the formula can be written as
In the LEVNEONAT trial, the dose allocation scheme and the efficacy and toxicity outcomes were more complex than in usual dose finding studies. The resulting proposed method was based on the modelling of efficacy, short-term and long-term toxicities, taking into account the number of maintenance doses and a second agent that was highly correlated with a failure outcome. The model was built in collaboration with investigators and other collaborators who were involved in this trial, to develop the best model to answer the clinical question within the practical constraints. We modelled T2 conditionally on T1, following the physicians’ knowledge and experience. We tested this hypothesis by adding scenarios where T1 is not predictive of T2, and we found that the model could still achieve proper estimates. A beta distribution was used for the TITE part in the T1 model, again after discussion with the investigators. We did not test the case where toxicities appear more at the end of the observational window, but our parameter ζ is free to take values for which the beta shape is inverted. A richer and more complicated model could have been proposed; however, the small sample size, the small toxicity targets and the constraints on data acquisition led us to simplify some of its aspects. For example, the model does not take into account the correlation between efficacy and toxicity. We decided not to complicate the model since in previous studies the correlation was negligible. Moreover, because we work with marginal distributions, we do not expect that adding correlation to the model would change the results much (Cunanan and Koopmeiners, 2014). Nevertheless, our proposal was sufficiently rich to reflect the complexity of this dose finding clinical trial. When modelling, there should be a balance between simplicity and a faithful representation of clinical considerations, and information that is available should be introduced in the design.
Then, this method has the advantage of being easily customized, depending on the application, and this is the reason for the ad hoc choices.
In general, the simulation study showed that the proposed model could offer a good trade-off between a high PCS and a reasonable number of observed short- and long-term toxicities under small sample constraints. The clinical relevance weights made it possible to avoid becoming stuck during the dose allocation process. Moreover, the model was shown to be robust, i.e. the PCS was less sensitive to sample size. Scenarios were selected to test several possible situations. The probability of adding A2 was set at 50% and not more since we believe that it is useless to perform a clinical trial in which most of the neonates receive other competing drugs. All fixed parameters were chosen by first taking the investigators’ advice and then testing them in a sensitivity analysis. After the NEMO experience, we took particular care in modelling T2, and the physicians were more careful in selecting the new drug. The first inclusion in the LEVNEONAT trial took place in September 2017.
In conclusion, this design was the result of a successful and close collaboration between statisticians, physicians and other trial collaborators. In the last 20 years, many dose finding designs have been proposed in the oncology setting and almost none for paediatrics. There is a crucial need for efficient designs in this population, and this paper is an example of what can be done and how. Outcomes that cannot be measured in real time, such as hearing loss, and the addition of rescue medications are very common features of paediatric trials, and this design can easily be customized for them.
Acknowledgements
We should like to show our gratitude to Estelle Boivin, Bruno Giraudeau, Julie Leger, Elie Saliba and Elsa Tavernier, who are involved in the LEVNEONAT clinical trial, for sharing their opinions and being ready to help and give information to adapt the model for this trial. We also thank two reviewers for their suggestions.
This work was conducted as part of the ‘Innovative methodology for small populations research’ project funded by the European Union's seventh framework programme for research, technological development and demonstration under grant agreement FP HEALTH 2013-602144.