Confidence limits of the probability of success in animal experiments and clinical studies: A Bayesian approach

Received 19 September 2007; received in revised form 4 January 2008; accepted 7 January 2008; published online 18 February 2008.

Abstract

Purpose: To determine, from the number of trials, n, and the number of observed successes, k, the most probable value, the variance and the confidence limits of the probability of success, p, in animal experiments and clinical studies subject to binomial statistics.

Method: In such experiments the probability of success is an unknown parameter. A Bayesian approach to the problem is advocated, based on a constructed distribution of the probability of success.

Results: A simple Matlab code for the calculation of the confidence limits according to the proposed method is provided. The most probable value, the mean, the variance and the confidence limits are calculated by applying the usual definitions of these characteristics.

Introduction

Clinical studies and animal experiments in which every outcome of a treatment can be classified as either positive (success) or negative (failure) are subject to binomial statistics. In such studies, the unknown parameter is the probability of the positive/negative outcome, p. Determining its most probable value, together with its mean, variance and confidence limits, is of primary interest in these studies. The importance of reporting confidence intervals for this type of study was emphasized by Shakespeare and Holecek [1].

The ideas of confidence limits and of fiducial intervals were introduced in the early 1930s by several prominent statisticians [2], [3], [4], [5], [6]. The first work concerning the “confidence or fiducial limits” in the binomial case was published by Clopper and Pearson [3]. An approach based on the idea of fiducial limits is discussed in Ref. [7]. It is worth mentioning that the fiducial approach was criticized by Neyman [5], [6].
Indeed, this approach to the problem, as described in Collett [8], runs into several difficulties. For instance, if k is the number of positive/negative events and n is the number of trials, the standard formula [8], [9] for the variance (uncertainty) of p is $\sigma^2 = \hat{p}(1-\hat{p})/n$, where $\hat{p} = k/n$ is the estimated (most probable) value of p. These formulae should not be used in marginal cases, namely when $n\hat{p}$ and $n(1-\hat{p})$ are smaller than about 5, i.e. when the number of trials is rather small and/or the number of successes or the number of failures is close to zero. Because the random variable p is confined to the closed interval [0, 1], its probability distribution is likely to be asymmetric. Reporting confidence intervals, rather than the variance, is therefore more appropriate in this case.

According to the traditional definition of the lower and upper limits of the 100(1 − α)% confidence interval for p given in the orthodox statistical books (see Refs. [8], [9] for example), the following equations must be solved:
\[
\sum_{j=k}^{n} \binom{n}{j} p_L^{\,j} (1-p_L)^{n-j} = \frac{\alpha}{2}, \qquad
\sum_{j=0}^{k} \binom{n}{j} p_U^{\,j} (1-p_U)^{n-j} = \frac{\alpha}{2}, \tag{1}
\]
for the lower limit $p_L$ and for the upper limit $p_U$, respectively. These equations follow from the requirement that $\sum_{j \in X(j)} P(j \mid p) = 1 - \alpha$, where the summation is over the set, X(j), of acceptable values of j for a given p. Neyman [5] has proved that the form $\sum_{j \in X(j)} P(j \mid p)$ cannot be set equal to a constant value (say, 1 − α) independent of the value of p if P is the binomial distribution with parameters n and p. For instance, in the case of k = 0 the left-hand side of the equation for $p_L$ is equal to 1, independent of the value of $p_L$; therefore, it cannot be satisfied for any value of $p_L$. The same reasoning applies to the equation for $p_U$ in the case of k = n, when its left-hand side is equal to 1, independent of the value of $p_U$, and it therefore cannot be satisfied for any value of $p_U$. It can also be shown that the values of $p_L$ and $p_U$ calculated according to Eq.
(1) do not converge to the values of $p_L$ and $p_U$ calculated according to the normal approximation of the binomial distribution. Therefore, we advocate here a different, Bayesian approach to the problem (Bayes [10], Laplace [11]). This approach has been reintroduced (Jaynes [12], [14]; Jeffreys [13]) and applied to the treatment of astrophysical and physical data [15], [16]. In the described experiments the known parameter is the observed number of successes/failures and the unknown stochastic variable is the probability of success, p.

Method and results

If the probability distribution of p were known, the upper and lower limits of a given confidence interval could be defined and determined in the usual way in which confidence limits of a random variable with a known distribution are found. The problem can thus be formulated more precisely as that of constructing the probability distribution of p as a function of the observed number of successes, k, and the number of trials, n. What Laplace did more than 200 years ago (Laplace [11], see also Jaynes' Probability theory: the logic of science [14]) was to propose the following probability density distribution of p:
\[
f(p \mid k, n) = \frac{p^{k}(1-p)^{n-k}}{\int_0^1 p^{k}(1-p)^{n-k}\,dp} = (n+1)\binom{n}{k}\,p^{k}(1-p)^{n-k}, \tag{2}
\]
which differs from the binomial distribution by only a constant. The denominator is the normalization factor. It happens to be the Euler integral of the first kind, also known as the complete β function: $B(k+1, n-k+1) = 1/\bigl[(n+1)\binom{n}{k}\bigr]$, where $\binom{n}{k}$ are the binomial coefficients. Eq. (2) takes into account that p is a continuous variable, which can take any value between 0 and 1. By differentiating the distribution function (Eq. (2)) with respect to p one finds the most probable value of p, $\hat{p}$, that will give the observed number of successes k in n trials:
\[
\hat{p} = \frac{k}{n}. \tag{3}
\]
Based on Eq.
(2) the mean value of p, $\bar{p}$, and its variance, $\sigma^2$, can be calculated (Laplace [11]), giving:
\[
\bar{p} = \frac{k+1}{n+2}, \qquad \sigma^2 = \frac{(k+1)(n-k+1)}{(n+2)^2(n+3)}. \tag{4}
\]
The lower limit $p_L$ of the 100(1 − α)% confidence interval can be determined by solving the following integral equation for $p_L$, where $f(p \mid k, n)$ is the probability density of Eq. (2):
\[
\int_0^{p_L} f(p \mid k, n)\,dp = A_1. \tag{5}
\]
The upper limit $p_U$ of the 100(1 − α)% confidence interval is determined by solving a similar equation for $p_U$:
\[
\int_0^{p_U} f(p \mid k, n)\,dp = A_2. \tag{6}
\]
The constants $A_1$ and $A_2$ should satisfy the relation $A_2 - A_1 = 1 - \alpha$, so that combining Eqs. (5) and (6) one gets
\[
\int_{p_L}^{p_U} f(p \mid k, n)\,dp = 1 - \alpha. \tag{7}
\]
The meaning of this equation is that the probability that the true value of p lies between $p_L$ and $p_U$, $P(p_L \le p \le p_U)$, is 1 − α. There exist different ways of constructing a second equation for $A_1$ and $A_2$. The following two are the most commonly used approaches.

(A) For symmetry reasons one may choose $A_1 = \alpha/2$ and $A_2 = 1 - \alpha/2$.

(B) Alternatively, one may impose the requirement that $f(p_L \mid k, n) = f(p_U \mid k, n)$.

Definition (B) determines the shortest possible 100(1 − α)% confidence interval. The equations for the upper and lower confidence limits can be solved more easily by recognizing that the integrals are incomplete β functions. MATLAB codes for calculating the confidence limits according to the above definitions are presented in Fig. 1a,b.

However, none of the above ways of determining $p_L$ and $p_U$ can be applied in the case of k = 0 or k = n. Indeed, in the case of k = 0, when the most probable value is $\hat{p} = 0$, $p_L$ should be set to 0 because otherwise the most probable values of p would be excluded from the 100(1 − α)% confidence interval. The upper limit is then determined by solving the equation
\[
\int_0^{p_U} (n+1)(1-p)^{n}\,dp = 1 - (1-p_U)^{n+1} = 1 - \alpha,
\]
wherefrom $p_U = 1 - \alpha^{1/(n+1)}$. Thus, in the case of k = 0 the true value of p lies between zero and $1 - \alpha^{1/(n+1)}$ with 100(1 − α)% probability. Analogously, in the case of k = n, $p_U$ is chosen equal to 1 and $p_L = \alpha^{1/(n+1)}$, so that the true value of p is between $\alpha^{1/(n+1)}$ and 1 with 100(1 − α)% probability.

Discussion and conclusion

It is worth mentioning that, owing to the complexity of solving the defining equation (Eq.
(1)) for $p_L$ and $p_U$, there exist a number of papers [17], [18], [19] in which different approximations to the confidence limits are discussed, and others [20], [21] that include just tables of values of confidence limits, as well as internet addresses (see Table 1) for their calculation. By contrast, the proposed method provides a simple and exact procedure for calculating the confidence limits for different confidence intervals and different values of n and k, as shown in the Matlab codes provided here. These codes could easily be incorporated in any Matlab code developed by the researcher for the treatment of data from clinical or animal experiments.

In conclusion, the main advantage of the proposed method is its internal consistency and logic. It can be shown that the values of $p_L$ and $p_U$ calculated according to this method coincide almost perfectly with those calculated according to the normal approximation to the binomial distribution in the range of its validity, that is, for large n with $n\hat{p}$ and $n(1-\hat{p})$ both larger than about 5, where $\hat{p} = k/n$. Moreover, the method provides exact solutions for the confidence limits for any value of n (large or small) and for all possible values of k, including the extreme values k = 0 and k = n, thus avoiding the limitations of the normal approximation. It therefore allows the treatment of studies in which the available statistics are small (small n) and/or the events of success or failure are rare (k = 0 or k = n), cases in which the normal approximation is inapplicable. Since small statistics and rare events of a specific outcome are typical of animal experiments and clinical trials, the advocated method should be regarded as a helpful tool for the medical researcher.

Acknowledgments

We would like to thank K Thoren for helpful discussions.

References

[1] Shakespeare TP, Holecek MJ. Should we be using confidence intervals when reporting results of oncology studies? Int J Radiat Oncol Biol Phys. 1998;41(4):971–972.
[2] Fisher RA. Inverse probability. Proc Camb Phil Soc. 1930;26:528.
[3] Clopper CJ, Pearson ES. The use of confidence or fiducial limits illustrated in the case of the binomial distribution. Biometrika. 1934;26(4):404–413.
[4] Fisher RA. The fiducial argument in statistical inferences. Ann Eugen (London). 1935;6:391–398.
[5] Neyman J. On the problem of confidence intervals. Ann Math Statist. 1935;6(23):111–116.
[6] Neyman J. Fiducial argument and the theory of confidence intervals. Biometrika. 1941;32(2):128–150.
[7] Stevens WL. Fiducial limits of the parameter of a discontinuous distribution. Biometrika. 1950;37(1/2):117–129.
[8] Collett D. Modelling binary data. London: Chapman and Hall; 1994.
[9] Armitage P, Berry G. Statistical methods in medical research. Oxford: Blackwell Scientific; 1994.
[10] Bayes T. An essay towards solving a problem in the doctrine of chances. MD Comput. 1991;8(3):157–171.
[11] Laplace PS. Memoire sur la probabilite des causes par les evenemens; 1774.
[12] Jaynes ET. Confidence intervals vs Bayesian intervals. Dordrecht, Holland: Reidel Publishing Company; 1976. http://bayes.wustl.edu/etj/articles/confidence.pdf
[13] Jeffreys H. Scientific inference. Cambridge University Press; 1957.
[14] Jaynes ET. Probability theory: the logic of science. Cambridge, UK: Cambridge University Press; 2003.
[15] Loredo TJ. From Laplace to Supernova SN 1987A: Bayesian inference in astrophysics. Dordrecht, Holland: Kluwer Academic Publishers; 1990.
[16] Gull SF. Bayesian inductive inference and maximum entropy. Dordrecht, Holland: Kluwer Academic Publishers; 1988.
[17] Anderson TW, Burstein H. Approximating the upper binomial confidence limit. J Am Stat Assoc. 1967;62(319):857–861.
[18] Anderson TW, Burstein H. Approximating the lower binomial confidence limit. J Am Stat Assoc. 1968;63(324):1413–1415.
[19] Jovanovic BD, Zalenski RJ. Safety evaluation and confidence intervals when the number of observed events is small or zero. Ann Emerg Med. 1997;30(3):301–306.
[20] Pearson ES, Hartley HO. Tables for statisticians. Biometrika. 1972;1.
[21] Blyth CR, Hutchinson DW. Table of Neyman-shortest unbiased confidence intervals for the binomial parameter. Biometrika. 1960;47(3/4):381–391.
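As an illustration of the procedure described in Method and results, the calculation may be sketched in Python. This is not the MATLAB code of Fig. 1, which is not reproduced here; it is a minimal sketch under stated assumptions. All function names are hypothetical. The sketch assumes integer k and n, for which the incomplete β function of Eqs. (5) and (6) reduces to a finite binomial sum, and it solves the integral equations by simple bisection, which is one possible root-finding choice among many. The shortest interval of definition (B) is found by searching for the lower limit at which the density takes equal values at both ends.

```python
from math import comb


def posterior_pdf(x, k, n):
    """Density of Eq. (2): (n+1) * C(n,k) * x^k * (1-x)^(n-k)."""
    return (n + 1) * comb(n, k) * x**k * (1.0 - x) ** (n - k)


def posterior_cdf(x, k, n):
    """Integral of the density of Eq. (2) from 0 to x.

    For integer k and n the regularized incomplete beta function
    I_x(k+1, n-k+1) equals the upper tail of a Binomial(n+1, x) sum.
    """
    m = n + 1
    return sum(comb(m, j) * x**j * (1.0 - x) ** (m - j) for j in range(k + 1, m + 1))


def posterior_quantile(q, k, n, tol=1e-12):
    """Solve posterior_cdf(x) = q for x by bisection (illustrative choice)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if posterior_cdf(mid, k, n) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def posterior_summary(k, n):
    """Most probable value (Eq. (3)), mean and variance (Eq. (4))."""
    mode = k / n
    mean = (k + 1) / (n + 2)
    var = (k + 1) * (n - k + 1) / ((n + 2) ** 2 * (n + 3))
    return mode, mean, var


def confidence_limits(k, n, alpha=0.05):
    """Equal-tail limits, definition (A); closed forms for k = 0 and k = n."""
    if k == 0:
        return 0.0, 1.0 - alpha ** (1.0 / (n + 1))
    if k == n:
        return alpha ** (1.0 / (n + 1)), 1.0
    return (posterior_quantile(alpha / 2, k, n),
            posterior_quantile(1.0 - alpha / 2, k, n))


def _upper_for(p_low, k, n, alpha):
    """Upper limit that gives coverage 1 - alpha for a given lower limit."""
    target = min(posterior_cdf(p_low, k, n) + 1.0 - alpha, 1.0)
    return posterior_quantile(target, k, n)


def shortest_limits(k, n, alpha=0.05, tol=1e-10):
    """Shortest interval, definition (B): equal density at both limits."""
    if k == 0 or k == n:
        return confidence_limits(k, n, alpha)
    lo, hi = 0.0, posterior_quantile(alpha, k, n)
    while hi - lo > tol:
        p_low = 0.5 * (lo + hi)
        p_up = _upper_for(p_low, k, n, alpha)
        # Density too low at the lower limit: shift the interval to the right.
        if posterior_pdf(p_low, k, n) < posterior_pdf(p_up, k, n):
            lo = p_low
        else:
            hi = p_low
    p_low = 0.5 * (lo + hi)
    return p_low, _upper_for(p_low, k, n, alpha)
```

For example, `confidence_limits(0, 9)` returns the closed-form result for k = 0, with lower limit 0 and upper limit $1 - 0.05^{1/10}$ (about 0.259), while `shortest_limits(2, 10)` returns an interval no wider than the equal-tail interval from `confidence_limits(2, 10)`.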
Department of Medical Physics, Cross Cancer Institute, 11560 University Avenue, Edmonton, Alberta T6G1Z2, Canada Corresponding author.
PII: S1120-1797(08)00006-9 doi:10.1016/j.ejmp.2008.01.001 Crown Copyright © 2009. Published by Elsevier Inc. All rights reserved.