This function calculates the required sample size to achieve a target power in studies with multiple endpoints and treatment arms. The function leverages modified root-finding algorithms to estimate sample size while considering correlation structures, variance assumptions, and equivalence bounds across endpoints. It is especially useful for bioequivalence trials or multi-arm trials with complex endpoint structures.

sampleSize(
  mu_list,
  varcov_list = NA,
  sigma_list = NA,
  cor_mat = NA,
  sigmaB = NA,
  Eper,
  Eco,
  rho = 0,
  TAR = NULL,
  arm_names = NA,
  ynames_list = NA,
  type_y = NA,
  list_comparator = NA,
  list_y_comparator = NA,
  power = 0.8,
  alpha = 0.05,
  lequi.tol = NA,
  uequi.tol = NA,
  list_lequi.tol = NA,
  list_uequi.tol = NA,
  dtype = "parallel",
  ctype = "ROM",
  vareq = TRUE,
  lognorm = TRUE,
  k = NA,
  adjust = "no",
  dropout = NA,
  nsim = 5000,
  seed = 1234,
  ncores = NA,
  optimization_method = "fast",
  lower = 2,
  upper = 500,
  step.power = 6,
  step.up = TRUE,
  pos.side = FALSE,
  maxiter = 1000,
  verbose = FALSE
)

Arguments

mu_list

Named list of arithmetic means per treatment arm. Each element contains a vector (i.e., one per treatment arm) with the expected outcomes for all endpoints of interest.

varcov_list

List of variance-covariance matrices. Each element corresponds to a treatment arm and holds a square matrix whose dimension equals the number of endpoints in that arm.

sigma_list

List of standard deviation (sigma) vectors. Each element corresponds to a treatment arm and holds one value per endpoint.

cor_mat

Matrix specifying the correlation between endpoints. It is used together with sigma_list to calculate varcov_list when the latter is not provided.

sigmaB

Numeric. Between-subject variance; used only for the 2x2 design.

Eper

Optional. Vector of length 2 specifying the period effect in a dtype = "2x2" design, applied to c(Period 0, Period 1). Defaults to c(0, 0) if not provided. Ignored for dtype = "parallel".

Eco

Optional. Vector of length 2 specifying the carry-over effect for each arm in a dtype = "2x2" design, applied to c(Reference, Treatment). Defaults to c(0, 0) if not provided. Ignored for dtype = "parallel".

rho

Correlation parameter applied uniformly across all endpoint pairs, used with sigma_list to calculate varcov if cor_mat or varcov_list are not provided.
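As an illustration of how standard deviations and a common correlation combine into a variance-covariance matrix, here is a minimal base R sketch. It is not the package's internal code; the value of rho is assumed for the example, and the standard deviations are those of arm SB2 in the Examples section.

```r
# Standard deviations for three endpoints in one arm (from the SB2 example)
sigma <- c(AUCinf = 11114, AUClast = 9133, Cmax = 16.9)
rho <- 0.5  # assumed common correlation, for illustration only

# Compound-symmetry correlation matrix: 1 on the diagonal, rho elsewhere
m <- length(sigma)
cor_mat <- matrix(rho, nrow = m, ncol = m,
                  dimnames = list(names(sigma), names(sigma)))
diag(cor_mat) <- 1

# Covariance between endpoints i and j is sigma_i * sigma_j * cor_ij
varcov <- outer(sigma, sigma) * cor_mat
varcov
```

The diagonal of varcov holds the variances (sigma squared), so taking sqrt(diag(varcov)) recovers the original standard deviations.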

TAR

Numeric vector. Treatment allocation rates for each arm, where the order of values corresponds to the order of arm_names. The length of TAR must match the number of arms. If not provided, a default equal allocation rate is assigned across all arms.

arm_names

Optional vector with the treatment names. If not supplied, it will be derived from mu_list.

ynames_list

Optional list of vectors with Endpoint names on each arm. When not all endpoint names are provided for each arm, arbitrary names (assigned by vector order) are used.

type_y

Vector with the type of each endpoint: 1 for a primary endpoint, 2 otherwise.

list_comparator

List of comparators. Each comparator is a character vector of length 2 giving the names of the two treatment arms to compare.

list_y_comparator

list of endpoints to be considered in each comparator. Each element of the list is a vector containing the names of the endpoints to compare. When it is not provided, all endpoints present in both compared arms are used.

power

target power (default = 0.8)

alpha

alpha level (default = 0.05)

lequi.tol

Lower equivalence bounds (e.g., -0.5), expressed in raw scale units (e.g., scale points) of the endpoint and applied to all endpoints and comparators.

uequi.tol

Upper equivalence bounds (e.g., 0.5), expressed in raw scale units (e.g., scale points) of the endpoint and applied to all endpoints and comparators.

list_lequi.tol

List of lower equivalence bounds (e.g., -0.5), expressed in raw scale units (e.g., scale points) of each endpoint, with one element per comparator.

list_uequi.tol

List of upper equivalence bounds (e.g., 0.5), expressed in raw scale units (e.g., scale points) of each endpoint, with one element per comparator.
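To make the role of the bounds concrete, the following base R sketch applies the generic two one-sided tests (TOST) confidence-interval rule to a ratio of geometric means with bounds 0.80 and 1.25. The data are simulated and hypothetical, and this is only one common way to assess equivalence; it is not claimed to be how sampleSize evaluates each simulated study.

```r
set.seed(1234)
alpha <- 0.05
lequi.tol <- 0.80
uequi.tol <- 1.25

# Hypothetical log-normal observations for a test and a reference arm
test <- rlnorm(40, meanlog = log(100), sdlog = 0.25)
ref  <- rlnorm(40, meanlog = log(100), sdlog = 0.25)

# On the log scale the bounds are symmetric around zero
log(c(lequi.tol, uequi.tol))

# 100(1 - 2 * alpha)% confidence interval for the difference of log means
ci_log <- t.test(log(test), log(ref), var.equal = FALSE,
                 conf.level = 1 - 2 * alpha)$conf.int

# Back-transform to a confidence interval for the ratio of geometric means
ci_ratio <- exp(as.numeric(ci_log))

# Equivalence is concluded when the whole interval lies inside the bounds
equivalent <- ci_ratio[1] > lequi.tol && ci_ratio[2] < uequi.tol
```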

dtype

Character. Design type for the trial: "parallel" (default) for parallel group design or "2x2" for crossover design (applicable only for trials with 2 arms).

ctype

Character. Specifies the type of hypothesis test for comparison: "DOM" for Difference of Means or "ROM" for Ratio of Means.

vareq

Logical indicating whether variances are assumed equal across arms (default = TRUE).

lognorm

Logical. Whether the data are log-normally distributed (default = TRUE).

k

Vector with the number of endpoints that must be successful (integer) for global bioequivalence for each comparator. If no k vector is provided, it will be set to the total number of endpoints on each comparator.
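The rule described for k can be sketched in a few lines of base R. The per-endpoint results below are hypothetical, and the variable names are chosen for this illustration only.

```r
# Hypothetical per-endpoint equivalence results for one comparator
endpoint_pass <- c(AUCinf = TRUE, AUClast = FALSE, Cmax = TRUE)

# Require at least k successful endpoints for global equivalence
k <- 2
global_pass <- sum(endpoint_pass) >= k

# When k is not supplied, every endpoint must succeed
k_default <- length(endpoint_pass)
global_pass_default <- sum(endpoint_pass) >= k_default
```

Here two of three endpoints succeed, so the comparator passes with k = 2 but fails under the default, which requires all three.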

adjust

Character. Method for alpha adjustment: "k" (K-fold), "bon" (Bonferroni), "sid" (Sidak), "no" (no adjustment, default), or "seq" (sequential adjustment).
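For reference, the standard Bonferroni and Sidak formulas can be computed as below. This is a sketch of the textbook definitions for m tests, where m is assumed here to be the number of endpoints; the "k" and "seq" options are not covered, and the package's exact implementation may differ.

```r
alpha <- 0.05
m <- 3  # assumed number of endpoints tested

# Bonferroni: split alpha evenly across the m tests
alpha_bon <- alpha / m

# Sidak: per-test level that gives a familywise level of alpha
# when the m tests are independent
alpha_sid <- 1 - (1 - alpha)^(1 / m)

c(bonferroni = alpha_bon, sidak = alpha_sid)
```

The Sidak level is always slightly larger than the Bonferroni level, so it is marginally less conservative.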

dropout

Vector with the proportion of the total population expected to drop out in each arm.
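A common way to account for dropout is to inflate the number of evaluable subjects so that enough remain after attrition. The sketch below shows that generic calculation with hypothetical numbers; it is an assumption for illustration, and sampleSize may apply dropout differently.

```r
# Hypothetical evaluable sample size per arm and expected dropout proportions
n_evaluable <- c(SB2 = 46, EUREF = 46, USREF = 46)
dropout     <- c(SB2 = 0.10, EUREF = 0.10, USREF = 0.20)

# Enrol enough subjects so that n_evaluable are expected to complete
n_enrolled <- ceiling(n_evaluable / (1 - dropout))

# Extra subjects needed in total because of dropout
n_extra <- sum(n_enrolled) - sum(n_evaluable)
```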

nsim

number of simulated studies (default=5000)

seed

Main random seed used for the simulations (default = 1234).

ncores

Integer. Number of processing cores to use for parallel computation. Defaults to one less than the total number of detected cores.

optimization_method

Character. Method for determining the required sample size: "fast" (using modified root-finding algorithms) or "step-by-step". Defaults to "fast".

lower

Integer. Initial value of N for the search range. Defaults to 2.

upper

Integer. Maximum value of N for the search range. Defaults to 500.

step.power

Numeric. The initial step size for the sample size search, defined as 2^step.power. Relevant when optimization_method is "fast".

step.up

Logical. If TRUE (default), the sample size search increments upward from the lower limit; if FALSE, it decrements downward from the upper limit. Used only when optimization_method is "fast".

pos.side

Logical. If TRUE, finds the smallest integer, i, closest to the root such that f(i) > 0. Used only when optimization_method is "fast".

maxiter

Integer. Maximum number of iterations allowed for finding the sample size. Defaults to 1000. Used only when optimization_method is "fast".
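The search arguments above can be pictured with a simplified upward search that starts at lower, advances in steps of 2^step.power, and halves the step whenever it overshoots. This is a sketch only, not the package's algorithm: find_min_n and power_fun are hypothetical names, the power curve is a deterministic normal approximation rather than a simulation, and maxiter, step.up, and pos.side are not modelled.

```r
# Smallest integer n in [lower, upper] with power_fun(n) >= target,
# assuming power_fun is non-decreasing in n. Returns NA if none exists.
find_min_n <- function(power_fun, target, lower = 2, upper = 500,
                       step.power = 6) {
  if (power_fun(lower) >= target) return(lower)
  n <- lower            # invariant: power_fun(n) < target
  step <- 2^step.power
  best <- NA            # smallest n seen so far that reaches the target
  while (step >= 1) {
    cand <- n + step
    if (cand > upper || (!is.na(best) && cand >= best)) {
      step <- step / 2  # outside the range or no better than best: shrink
    } else if (power_fun(cand) >= target) {
      best <- cand      # overshoot: remember it and refine with a smaller step
      step <- step / 2
    } else {
      n <- cand         # still below target: move up
    }
  }
  best
}

# Hypothetical power curve for a one-sided test with effect size 0.3
power_fun <- function(n) pnorm(0.3 * sqrt(n) - qnorm(0.95))

find_min_n(power_fun, target = 0.9)              # 96
find_min_n(power_fun, target = 0.9, upper = 50)  # NA: target not reachable
```

With simulated power the curve is noisy rather than strictly monotone, which is one reason a larger nsim gives a more stable search.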

verbose

Logical. If TRUE, the function displays progress and informational messages during execution. Defaults to FALSE.

Value

A simss object that contains the following elements:

"response"

Array with the sample size for each arm and the approximate achieved power with its confidence interval.

"table.iter"

data frame with the estimated sample size for each arm and power calculated at each searching iteration

"table.test"

data frame that collects the total information of the simulation at each iteration

"param.u"

parameters provided by the user

"param"

Parameters used for the sample size calculation. These are the user-supplied parameters (param.u) after they have been checked and, where information was inconsistent or missing, modified.

"param.d"

Design parameters.

References

Mielke, J., Jones, B., Jilma, B., & König, F. (2018). Sample size for multiple hypothesis testing in biosimilar development. Statistics in Biopharmaceutical Research, 10(1), 39-49.

Berger, R. L., & Hsu, J. C. (1996). Bioequivalence trials, intersection-union tests and equivalence confidence sets. Statistical Science, 11(4), 283-302.

Author

Johanna Muñoz johanna.munoz@fromdatatowisdom.com

Examples


mu_list <- list(SB2 = c(AUCinf = 38703, AUClast = 36862, Cmax = 127.0),
                EUREF = c(AUCinf = 39360, AUClast = 37022, Cmax = 126.2),
                USREF = c(AUCinf = 39270, AUClast = 37368, Cmax = 129.2))

sigma_list <- list(SB2 = c(AUCinf = 11114, AUClast = 9133, Cmax = 16.9),
                   EUREF = c(AUCinf = 12332, AUClast = 9398, Cmax = 17.9),
                   USREF = c(AUCinf = 10064, AUClast = 8332, Cmax = 18.8))

# Equivalence boundaries
lequi.tol <- c(AUCinf = 0.8, AUClast = 0.8, Cmax = 0.8)
uequi.tol <- c(AUCinf = 1.25, AUClast = 1.25, Cmax = 1.25)

# Arms to be compared
list_comparator <- list(EMA = c("SB2", "EUREF"),
                        FDA = c("SB2", "USREF"))

# Endpoints to be compared
list_y_comparator <- list(EMA = c("AUCinf", "Cmax"),
                          FDA = c("AUClast", "Cmax"))

# Equivalence boundaries for each comparison
lequi_lower <- c(AUCinf = 0.80, AUClast = 0.80, Cmax = 0.80)
lequi_upper <- c(AUCinf = 1.25, AUClast = 1.25, Cmax = 1.25)

# Run the simulation
sampleSize(power = 0.9, alpha = 0.05, mu_list = mu_list,
           sigma_list = sigma_list, list_comparator = list_comparator,
           list_y_comparator = list_y_comparator,
           list_lequi.tol = list("EMA" = lequi_lower, "FDA" = lequi_lower),
           list_uequi.tol = list("EMA" = lequi_upper, "FDA" = lequi_upper),
           adjust = "no", dtype = "parallel", ctype = "ROM", vareq = FALSE,
           lognorm = TRUE, ncores = 1, nsim = 50, seed = 1234)
#> Given a 90%  target power with 100(1-2*0.05)% confidence level.
#> 
#> The total required sample size to achieve 90% power is 138 sample units.
#> 
#>  n_drop n_SB2 n_EUREF n_USREF n_total power power_LCI power_UCI
#>   <num> <num>   <num>   <num>   <num> <num>     <num>     <num>
#>       0    46      46      46     138   0.9 0.7740882 0.9625954