# ASTM E2935-17 14.05

Designation: E2935 − 17

An American National Standard

Standard Practice for Conducting Equivalence Testing in Laboratory Applications¹

This standard is issued under the fixed designation E2935; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

## 1. Scope

1.1 This practice provides statistical methodology for conducting equivalence testing on numerical data from two sources to determine if their true means or variances differ by no more than predetermined limits.

1.2 Applications include (1) equivalence testing for bias against an accepted reference value, (2) determining means equivalence of two test methods, test apparatus, instruments, reagent sources, or operators within a laboratory or equivalence of two laboratories in a method transfer, and (3) determining non-inferiority of a modified test procedure versus a current test procedure with respect to a performance characteristic.

1.3 The guidance in this standard applies to experiments conducted on a single material at a given level of the test result or on multiple materials covering a range of selected test results.

1.4 Guidance is given for determining the amount of data required for an equivalence trial. The control of risks associated with the equivalence decision is discussed.

1.5 The values stated in SI units are to be regarded as standard. No other units of measurement are included in this standard.

1.6 This standard does not purport to address all of the safety concerns, if any, associated with its use.
It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.

1.7 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

## 2. Referenced Documents

2.1 ASTM Standards:²

- E177 Practice for Use of the Terms Precision and Bias in ASTM Test Methods
- E456 Terminology Relating to Quality and Statistics
- E2282 Guide for Defining the Test Result of a Test Method
- E2586 Practice for Calculating and Using Basic Statistics
- E3080 Practice for Regression Analysis

2.2 USP Standard:³

- USP Validation of Alternative Microbiological Methods

## 3. Terminology

3.1 Definitions—See Terminology E456 for a more extensive listing of statistical terms.

3.1.1 accepted reference value, n—a value that serves as an agreed-upon reference for comparison, and which is derived as: (1) a theoretical or established value, based on scientific principles, (2) an assigned or certified value, based on experimental work of some national or international organization, or (3) a consensus or certified value, based on collaborative experimental work under the auspices of a scientific or engineering group. E177

3.1.2 bias, n—the difference between the expectation of the test results and an accepted reference value. E177

3.1.3 confidence interval, n—an interval estimate $[L, U]$ with the statistics $L$ and $U$ as limits for the parameter $\theta$ and with confidence level $1 - \alpha$, where $\Pr(L \le \theta \le U) \ge 1 - \alpha$. E2586

3.1.3.1 Discussion—The confidence level, $1 - \alpha$, reflects the proportion of cases that the confidence interval $[L, U]$ would contain or cover the true parameter value in a series of repeated random samples under identical conditions.
Once $L$ and $U$ are given values, the resulting confidence interval either does or does not contain it. In this sense “confidence” applies not to the particular interval but only to the long run proportion of cases when repeating the procedure many times.

¹ This test method is under the jurisdiction of ASTM Committee E11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.20 on Test Method Evaluation and Quality Control. Current edition approved Oct. 1, 2017. Published November 2017. Originally approved in 2013. Last previous edition approved in 2016 as E2935 – 16. DOI: 10.1520/E2935-17.

² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.

³ Available from U.S. Pharmacopeial Convention (USP), 12601 Twinbrook Pkwy., Rockville, MD 20852-1790, http://www.usp.org.

Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States

3.1.4 confidence level, n—the value, $1 - \alpha$, of the probability associated with a confidence interval, often expressed as a percentage. E2586

3.1.4.1 Discussion—$\alpha$ is generally a small number. Confidence level is often 95 % or 99 %.

3.1.5 confidence limit, n—each of the limits, $L$ and $U$, of a confidence interval, or the limit of a one-sided confidence interval. E2586

3.1.6 degrees of freedom, n—the number of independent data points minus the number of parameters that have to be estimated before calculating the variance.
E2586

3.1.7 equivalence, n—condition that two population parameters differ by no more than predetermined limits.

3.1.8 intermediate precision conditions, n—conditions under which test results are obtained with the same test method using test units or test specimens taken at random from a single quantity of material that is as nearly homogeneous as possible, and with changing conditions such as operator, measuring equipment, location within the laboratory, and time. E177

3.1.9 mean, n—of a population, $\mu$, average or expected value of a characteristic in a population; of a sample, $\bar{X}$, sum of the observed values in the sample divided by the sample size. E2586

3.1.10 percentile, n—quantile of a sample or a population, for which the fraction less than or equal to the value is expressed as a percentage. E2586

3.1.11 population, n—the totality of items or units of material under consideration. E2586

3.1.12 population parameter, n—summary measure of the values of some characteristic of a population. E2586

3.1.13 precision, n—the closeness of agreement between independent test results obtained under stipulated conditions. E177

3.1.14 quantile, n—value such that a fraction $f$ of the sample or population is less than or equal to that value. E2586

3.1.15 repeatability, n—precision under repeatability conditions. E177

3.1.16 repeatability conditions, n—conditions where independent test results are obtained with the same method on identical test items in the same laboratory by the same operator using the same equipment within short intervals of time. E177

3.1.17 repeatability standard deviation ($s_r$), n—the standard deviation of test results obtained under repeatability conditions. E177

3.1.18 sample, n—a group of observations or test results, taken from a larger collection of observations or test results, which serves to provide information that may be used as a basis for making a decision concerning the larger collection. E2586

3.1.19 sample size, $n$, n—number of observed values in the sample.
E2586

3.1.20 sample statistic, n—summary measure of the observed values of a sample. E2586

3.1.21 standard deviation—of a population, $\sigma$, the square root of the average or expected value of the squared deviation of a variable from its mean; of a sample, $s$, the square root of the sum of the squared deviations of the observed values in the sample from their mean divided by the sample size minus 1. E2586

3.1.22 test result, n—the value of a characteristic obtained by carrying out a specified test method. E2282

3.1.23 test unit, n—the total quantity of material (containing one or more test specimens) needed to obtain a test result as specified in the test method. See test result. E2282

3.1.24 variance, $\sigma^2$, $s^2$, n—square of the standard deviation of the population or sample. E2586

3.2 Definitions of Terms Specific to This Standard:

3.2.1 bias equivalence, n—equivalence of a population mean with an accepted reference value.

3.2.2 equivalence limit, $E$, n—in equivalence testing, a limit on the difference between two population parameters.

3.2.2.1 Discussion—In certain applications, this may be termed practical limit or practical difference.

3.2.3 equivalence test, n—a statistical test conducted within predetermined risks to confirm equivalence of two population parameters.

3.2.4 means equivalence, n—equivalence of two population means.

3.2.5 non-inferiority, n—condition that the difference in means or variances of test results between a modified testing process and a current testing process with respect to a performance characteristic is no greater than a predetermined limit in the direction of inferiority of the modified process to the current process.

3.2.5.1 Discussion—Other terms used for non-inferior are “equivalent or better” or “at least equivalent as.”

3.2.6 paired samples design, n—in means equivalence testing, single samples are taken from the two populations at a number of sampling points.

3.2.6.1 Discussion—This design is termed a randomized block design for a general number of populations sampled,
and each group of data within a sampling point is termed a block.

3.2.7 power, n—in equivalence testing, the probability of accepting equivalence, given the true difference between two population means.

3.2.7.1 Discussion—In the case of testing for bias equivalence the power is the probability of accepting equivalence, given the true difference between a population mean and an accepted reference value.

3.2.8 range equivalence, n—equivalence of two population means over a range of test result values.

3.2.9 slope equivalence, n—equivalence of the slope of a linear statistical relationship with the value one (1).

3.2.10 two independent samples design, n—in means equivalence testing, replicate test results are determined independently from two populations at a single sampling time for each population.

3.2.10.1 Discussion—This design is termed a completely randomized design for a general number of sampled populations.

3.2.11 two one-sided tests (TOST) procedure, n—a statistical procedure used for testing the equivalence of the parameters from two distributions (see equivalence).

3.3 Symbols:

- $a$ = intercept estimate (8.1.3)
- $B$ = bias (7.1.1)
- $b$ = slope estimate (8.1.3)
- $d_j$ = difference between a pair of test results at sampling point $j$ (7.1.1)
- $\bar{d}$ = average difference (7.1.1)
- $D$ = difference in sample means (6.1.2) (X1.1.2)
- $E$ = equivalence limit (5.2)
- $E_1$ = lower equivalence limit (5.2.1)
- $E_2$ = upper equivalence limit (5.2.1)
- $e_i$ = residual estimate (8.3.3)
- $f$ = degrees of freedom for $s$ (9.1.1) (X1.1.2)
- $F_{1-\alpha}$ = $(1-\alpha)$th percentile of the $F$ distribution (10.3.1)
- $f_i$ = degrees of freedom for $s_i$ (6.1.1)
- $f_p$ = degrees of freedom for $s_p$ (6.1.2)
- $\mathcal{F}(\cdot)$ = the cumulative $F$ distribution function (X1.6.3)
- $H_0$: = null hypothesis (X1.1.1)
- $H_a$: = alternate hypothesis (X1.1.1)
- $n$ = sample size (number of test results) from a population (5.4) (6.1.3) (7.1.1) (9.1.1)
- $n_i$ = sample size from $i$th population (6.1.1)
- $n_1$ = sample size from population 1 (6.1.2)
- $n_2$ = sample size from population 2 (6.1.2)
- $R$ = ratio of two sample variances (5.5.2.1)
- $r$ = sample
correlation coefficient (8.3.2)
- $\mathcal{R}$ = ratio of two population variances (X1.6.3)
- $S_{XX}$ = sum of squared deviations of $X$ from their mean (8.1.3.2)
- $S_{XY}$ = sum of products of deviations of $X$ and $Y$ from their means (8.1.3.2)
- $S_{YY}$ = sum of squared deviations of $Y$ from their mean (8.1.3.2)
- $s$ = sample standard deviation (9.1.1)
- $s_B$ = sample standard deviation for bias (9.1.2)
- $s_d$ = standard deviation of the difference between two test results (7.1.1)
- $s_D$ = sample standard deviation for mean difference (6.1.3) (X1.1.2)
- $s_i$ = sample standard deviation for $i$th population (6.1.1)
- $s_i^2$ = sample variance for $i$th population (6.1.1)
- $s_1^2$ = sample variance for population 1 (6.1.2)
- $s_1^2$ = variance of test results from the current process (10.3.1)
- $s_2^2$ = sample variance for population 2 (6.1.2)
- $s_2^2$ = variance of test results from the modified process (10.3.1)
- $s_p$ = pooled sample standard deviation (6.1.2)
- $s_r$ = repeatability sample standard deviation (6.2)
- $t$ = Student’s $t$ statistic (6.1.4) (7.1.3) (9.1.3)
- $t_{1-\alpha,f}$ = $(1-\alpha)$th percentile of the Student’s $t$ distribution with $f$ degrees of freedom (X1.1.2)
- $X_{ij}$ = $j$th test result from the $i$th population (6.1)
- $UCL_R$ = upper confidence limit for $\mathcal{R}$ (10.3.1)
- $\bar{X}$ = test result average (9.1.1)
- $\bar{X}_i$ = test result average for the $i$th population (6.1.1)
- $\bar{X}_1$ = test result average for population 1 (6.1.3)
- $\bar{X}_2$ = test result average for population 2 (6.1.3)
- $Z_{1-\alpha}$ = $(1-\alpha)$th percentile of the standard normal distribution (X1.6.1)
- $\alpha$ = (alpha) intercept parameter (8.1.1)
- $\alpha$ = consumer’s risk (5.2.2) (6.2) (7.2)
- $\beta$ = (beta) slope parameter (8.1.1)
- $\beta$ = producer’s risk (5.4.1)
- $\Delta$ = true mean difference between populations (5.4.1)
- $\delta$ = (delta) measurement error of $X$ (X3.1.1)
- $\varepsilon$ = (epsilon) measurement error of $Y$ (X3.1.1)
- $\eta$ = (eta) true mean of $Y$ (X3.1.1)
- $\theta$ = (theta) angle of the straight line to the horizontal axis (8.1.4.1)
- $\hat{\theta}$ = estimate of $\theta$ (8.1.4.1)
- $\kappa^2$ = (kappa squared) information size (X3.3)
- $\lambda$ = (lambda) ratio of measurement error variances of $Y$ over $X$ (8.1.1.1)
- $\mu$ = population mean (X1.4.1)
- $\mu_i$ = $i$th population mean (X1.1.1)
- $\nu$ = (nu) probability associated with
informative confidence interval (X3.3.2)
- $\nu$ = approximate degrees of freedom for $s_D$ (X1.1.4)
- $\xi$ = (xi) true mean of $X$ (X3.1.1)
- $\sigma$ = standard deviation of the test method (5.2)
- $\sigma_d$ = standard deviation of the true difference between two populations (7.2)
- $\sigma_\varepsilon^2$ = measurement error variances of $Y$ (8.1.1)
- $\sigma_\delta^2$ = measurement error variances of $X$ (8.1.1)
- $\tau$ = (tau) perpendicular distance from line to origin (X3.1.3)
- $\Phi(\cdot)$ = standard normal cumulative distribution function (X1.6.1)
- $\varphi$ = (phi) half width of confidence interval for $\theta$ (8.1.4.2)
- $\omega$ = (omega) width of the equivalence interval for $\theta$ (X3.2)

3.4 Acronyms:

3.4.1 ARV, n—accepted reference value (5.5.1.1) (9.1) (X1.4)

3.4.2 CRM, n—certified reference material (5.5.1.1) (9.1)

3.4.3 ILS, n—interlaboratory study (6.2)

3.4.4 LCL, n—lower confidence limit (6.2.5) (7.2.3)

3.4.5 TOST, n—two one-sided tests (5.5.1) (Section 6) (Section 7) (Section 9) (Appendix X1)

3.4.6 UCL, n—upper confidence limit (6.2.5) (7.2.3)

## 4. Significance and Use

4.1 Laboratories conducting routine testing have a continuing need to make improvements in their testing processes. In these situations it must be demonstrated that any changes will neither cause an undesirable shift in the test results from the current testing process nor substantially affect a performance characteristic of the test method. This standard provides guidance on experiments and statistical methods needed to demonstrate that the test results from a modified test