# ASTM E1402-13

Designation: E1402 − 13

An American National Standard

Standard Guide for Sampling Design¹

This standard is issued under the fixed designation E1402; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

## 1. Scope

1.1 This guide defines terms and introduces basic methods for probability sampling of discrete populations, areas, and bulk materials. It provides an overview of common probability sampling methods employed by users of ASTM standards.

1.2 Sampling may be done for the purpose of estimation, of comparison between parts of a sampled population, or for acceptance of lots. Sampling is also used for the purpose of auditing information obtained from complete enumeration of the population.

1.3 No system of units is specified in this standard.

1.4 This standard does not purport to address all of the safety concerns, if any, associated with its use.

## 2. Referenced Documents

2.1 ASTM Standards:²

- D7430 Practice for Mechanical Sampling of Coal
- E105 Practice for Probability Sampling of Materials
- E122 Practice for Calculating Sample Size to Estimate, With Specified Precision, the Average for a Characteristic of a Lot or Process
- E141 Practice for Acceptance of Evidence Based on the Results of Probability Sampling
- E456 Terminology Relating to Quality and Statistics

## 3. Terminology

3.1 Definitions: For a more extensive list of statistical terms, refer to Terminology E456.

3.1.1 area sampling, n: probability sampling in which a map, rather than a tabulation of sampling units, serves as the sampling frame.

3.1.1.1 Discussion: Area sampling units are segments of land area and are listed by addresses on the frame prior to their actual delineation on the ground so that only the randomly selected ones need to be exactly identified.

3.1.2 bulk sampling, n: sampling to prepare a portion of a mass of material that is representative of the whole.

3.1.3 cluster sampling, n: sampling in which the sampling unit consists of a group of subunits, all of which are measured for sampled clusters.

3.1.4 frame, n: a list, compiled for sampling purposes, which designates all of the sampling units (items or groups) of a population or universe to be considered in a specific study.

3.1.5 multi-stage sampling, n: sampling in which the sample is selected by stages, the sampling units at each stage being selected from subunits of the larger sampling units chosen at the previous stage.

3.1.5.1 Discussion: The sampling unit for the first stage is the primary sampling unit. In multi-stage sampling, this unit is further subdivided. The second stage unit is called the secondary sampling unit. A third stage unit is called a tertiary sampling unit. The final sample is the set of all last stage sampling units that are obtained.
As an example of sampling a lot of packaged product, the cartons of a lot could be the primary units, packages within the carton could be secondary units, and items within the packages could be the third-stage units.

3.1.6 nested sampling, n: same as multi-stage sampling.

3.1.7 primary sampling unit, PSU, n: the item, element, increment, segment or cluster selected at the first stage of the selection procedure from a population or universe.

3.1.8 probability proportional to size sampling, PPS, n: probability sampling in which the probabilities of selection of sampling units are proportional, or nearly proportional, to a quantity (the "size") that is known for all sampling units.

3.1.9 probability sample, n: a sample in which the sampling units are selected by a chance process such that a specified probability of selection can be attached to each possible sample that can be selected.

3.1.10 proportional sampling, n: a method of selection in stratified sampling such that the proportions of the sampling units (usually, PSUs) selected for the sample from each stratum are equal.

3.1.11 quota sampling, n: a method of selection similar to stratified sampling in which the number of units to be selected from each stratum is specified and the selection is done by trained enumerators but is not a probability sample.

3.1.12 sampling fraction, f, n: the ratio of the number of sampling units selected for the sample to the number of sampling units available.

3.1.13 sampling unit, n: an item, group of items, or segment of material that can be selected as part of a probability sampling plan.

3.1.13.1 Discussion: The full collection of sampling units listed on a frame serves to describe the sampled population of a probability sampling plan.

3.1.14 sampling with replacement, n: probability sampling in which a selected unit is replaced after any step in selection so that this sampling unit is available for selection again at the next step of selection, or at any other succeeding step of the sample selection procedure.

3.1.15 sampling without replacement, n: probability sampling in which a selected sampling unit is set aside and cannot be selected at a later step of selection.

3.1.15.1 Discussion: Most samplings, including simple random sampling and stratified random sampling, are conducted by sampling without replacement.

3.1.16 simple random sample, n: (without replacement) probability sample of $n$ sampling units from a population of $N$ units selected in such a way that each of the $N!/(n!(N-n)!)$ subsets of $n$ units is equally probable; (with replacement) a probability sample of $n$ sampling units from a population of $N$ units selected in such a way that, in order of selection, each of the $N^n$ ordered sequences of units from the population is equally probable.

3.1.17 stratified sampling, n: sampling in which the population to be sampled is first divided into mutually exclusive subsets or strata, and independent samples taken within each stratum.

3.1.18 systematic sampling, n: a sampling procedure in which evenly spaced sampling units are selected.

3.2 Definitions of Terms Specific to This Standard:

3.2.1 address, n: (sampling) a unique label or instructions attached to a sampling unit by which it can be located and measured.

3.2.2 area segment, n: (area sampling) final sampling unit for area sampling, the delimited area from which a characteristic can be measured.

3.2.3 composite sample, n: (bulk sampling) sample prepared by aggregating increments of sampled material.

3.2.4 increment, n: (bulk sampling) individual portion of material collected by a single operation of a sampling device.

3.3 Symbols:

- $N$ = number of units in the population to be sampled.
- $n$ = number of units in the sample.
- $Y_i$ = quantity value for the i-th unit in the population.
- $y_i$ = quantity observed for the i-th sampling unit.
- $\bar{Y}$ = average quantity for the population.
- $\bar{y}$ = average of the observations in the sample.
- $X_i$ = value of an auxiliary variable for the i-th unit in the population.
- $x_i$ = value of an auxiliary variable for the i-th sampling unit.
- $P$ = population proportion of units having an attribute of interest.
- $p$ = sample proportion.
- $f$ = sampling fraction.
- $s$ = sample standard deviation of the observations in the sample.
- $s^2$ = sample variance of the observations in the sample.
- $SE(\bar{y})$ = standard error of an estimated mean $\bar{y}$.

¹ This guide is under the jurisdiction of ASTM Committee E11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.10 on Sampling / Statistics. Current edition approved Aug. 1, 2013. Published August 2013. Originally approved in 2008. Last previous edition approved in 2008 as E1402 – 08ε1. DOI: 10.1520/E1402-13.

² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard's Document Summary page on the ASTM website.

Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States

## 4. Significance and Use

4.1 This guide describes the principal types of sampling designs and provides formulas for estimating population means and standard errors of the estimates. Practice E105 provides principles for designing probability sampling plans in relation to the objectives of study, costs, and practical constraints. Practice E122 aids in specifying the required sample size. Practice E141 describes conditions to ensure validity of the results of sampling. Further description of the designs and formulas in this guide, and beyond it, can be found in textbooks (**1-10**).³

4.2 Sampling, both discrete and bulk, is a clerical and physical operation. It generally involves training enumerators and technicians to use maps, directories and stop watches so as to locate designated sampling units. Once a sampling unit is located at its address, discrete sampling and area sampling enumeration proceeds to a measurement.
For bulk sampling, material is extracted into a composite.

4.3 A sampling plan consists of instructions telling how to list addresses and how to select the addresses to be measured or extracted. A frame is a listing of addresses each of which is indexed by a single integer or by an n-tuple (several integer) number. The sampled population consists of all addresses in the frame that can actually be selected and measured. It is sometimes different from a targeted population that the user would have preferred to be covered.

4.4 A selection scheme designates which indexes constitute the sample. If certified random numbers completely control the selection scheme the sample is called a probability sample. Certified random numbers are those generated either from a table (for example, Ref (**11**)) that has been tested for equal digit frequencies and for serial independence, from a computer program that was checked to have a long cycle length, or from a random physical method such as tossing of a coin or a casino-quality spinner.

4.5 The objective of sampling is often to estimate the mean of the population for some variable of interest by the corresponding sample mean. By adopting probability sampling, selection bias can be essentially eliminated, so the primary goal of sample design in discrete sampling becomes reducing sampling variance.

³ The boldface numbers in parentheses refer to a list of references at the end of this standard.

## 5. Simple Random Sampling (SRS) of a Finite Population

5.1 Sampling is without replacement. The selection scheme must allocate equal chance to every combination of $n$ indexes from the $N$ on the frame.

5.1.1 Make successive equal-probability draws from the integers 1 to $N$ and discard duplicates until $n$ distinct indexes have been selected.

5.1.2 If the $N$ indexed addresses or labels are in a computer file, generate a random number for each index and sort the file by those numbers.
The first $n$ items in the sorted file constitute a simple random sample (SRS) of size $n$ from the $N$.

5.1.3 A method that requires only one pass through the population is used, for example, to sample a production process. For each item, generate a random number in the range 0 to 1 and select the i-th item when the random number is less than $(n - a_i)/(N - i + 1)$, where $a_i$ is the number of selections already made up to the i-th item. For example, the first item ($i = 1$ and $a_1 = 0$) is selected with probability $n/N$.

5.2 The quantities observed on the variable of interest at the selected sampling units will be denoted $y_1, y_2, \ldots, y_n$. The estimate of the mean of the sampled population is

$$\bar{y} = \sum y_i / n \qquad (1)$$

The standard error of the mean of a finite population using simple random sampling without replacement is:

$$SE(\bar{y}) = s \sqrt{(1 - f)/n} \qquad (2)$$

where $f = n/N$ is the sampling fraction and $s^2$ is the sample variance ($s$, its square root, is the sample standard deviation).

$$s^2 = \sum (y_i - \bar{y})^2 / (n - 1) \qquad (3)$$

The population mean that $\bar{y}$ estimates is:

$$\bar{Y} = \sum_{i=1}^{N} Y_i / N \qquad (4)$$

The expected value of $s^2$ is the finite population variance defined as:

$$S^2 = \sum_{i=1}^{N} (Y_i - \bar{Y})^2 / (N - 1) \qquad (5)$$

5.3 Finite Population Correction: The factor $(1 - f)$ in Eq 2 is the finite population correction. In conventional statistical theory, the standard error of the average of independent, identically distributed random variables does not include this factor. Conventional statistical theory applies for random sampling with replacement. In sampling without replacement from a finite population, the observations are not independent. The finite population correction factor depends on (a) the population of interest being finite, (b) sampling being without errors and measurements for any sampled item being assumed completely well defined for that item.
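As an illustration of the material above, the three selection procedures of 5.1.1 to 5.1.3 and the estimates of Eq 1 to Eq 3 might be sketched in Python as follows. This is a minimal sketch, not part of the standard: the function names are hypothetical, and Python's built-in pseudo-random generator is only assumed to stand in for the certified random numbers described in 4.4.

```python
import math
import random


def srs_by_rejection(N, n, rng):
    """5.1.1: equal-probability draws from 1..N, discarding duplicates."""
    chosen = set()
    while len(chosen) < n:
        chosen.add(rng.randint(1, N))  # randint includes both endpoints
    return sorted(chosen)


def srs_by_sorting(N, n, rng):
    """5.1.2: attach a random number to each index, sort, keep the first n."""
    keyed = sorted((rng.random(), i) for i in range(1, N + 1))
    return sorted(i for _, i in keyed[:n])


def srs_one_pass(N, n, rng):
    """5.1.3: select item i when a uniform draw is below (n - a_i)/(N - i + 1)."""
    chosen = []
    for i in range(1, N + 1):
        a_i = len(chosen)  # selections already made before item i
        if rng.random() < (n - a_i) / (N - i + 1):
            chosen.append(i)
    return chosen


def srs_mean_and_se(y, N):
    """Eq 1 to Eq 3: sample mean and its standard error with the (1 - f) factor."""
    n = len(y)
    y_bar = sum(y) / n                                # Eq 1
    s2 = sum((v - y_bar) ** 2 for v in y) / (n - 1)   # Eq 3
    f = n / N                                         # sampling fraction
    return y_bar, math.sqrt(s2 * (1 - f) / n)         # Eq 2
```

Each selection function returns exactly `n` distinct indexes when called as, for example, `srs_one_pass(100, 10, random.Random(1))`. In the one-pass method the selection probability reaches 1 once the remaining items are all needed and drops to 0 once `n` items have been taken, which is why a single pass suffices.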
When the purpose of sampling is to understand differences between parts of a population (analytic as opposed to enumerative, as described by Deming (**4**)), actual population values are viewed as themselves sampled from a parent random process and the finite population correction should not be used in making such comparisons.

5.4 Sample Size: The sample size required for a sampling study depends on the variability of the population and the required precision of the estimate. Refer to Practice E122 for further detail on determining sample size. Eq 2 can be developed to find required sample size. First, the user must have a reasonable prior estimate $s_0$ of the population standard deviation, either from previous experience or a pilot study. Solving for $n$ in Eq 2, where now $SE(\bar{y})$ is the required standard error, gives:

$$n = \frac{n_0}{1 + n_0/N} \qquad (6)$$

where:

$$n_0 = s_0^2 / SE(\bar{y})^2$$

5.5 Estimating a Proportion: Formulas 1 through 5 serve for proportions as well as means. For an indicator variable $Y_i$ which equals 1 if the i-th unit has the attribute and 0 if not, the population proportion $P = \bar{Y}$ can be recognized as the average of ones and zeros. The sample estimate is the sample proportion $p = \bar{y}$ and the sample variance is $s^2 = np(1-p)/(n-1)$.

5.6 Ratio Estimates: An auxiliary variable may be used to improve the estimate from an SRS. Values of this variable for each item on the frame will be denoted $X_i$. Specific knowledge of each and every $X_i$ is not necessary for ratio estimation but knowing the population average $\bar{X}$ is. The observed values $x_i$ are needed along with the $y_i$, where the index $i$ goes from $i = 1$ to $i = n$, the sample size. The estimated ratio is $\hat{R} = \bar{y}/\bar{x}$ and the improved ratio estimate of $\bar{Y}$ is $\bar{X}\bar{y}/\bar{x}$. The estimated standard error of the ratio estimate of $\bar{Y}$ is:

$$SE(\bar{X}\hat{R}) = \sqrt{\frac{1 - f}{n} \sum (y_i - \hat{R} x_i)^2 / (n - 1)} \qquad (7)$$

5.6.1 The ratio estimator works best when the relation of X-values to Y-values is approximately linear through the origin with the variance of Y for given X approximately proportional to X.
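The sample size formula (Eq 6), the indicator-variable variance of 5.5, and the ratio estimate with its standard error (Eq 7) might be computed as in the following Python sketch. The function names are hypothetical, and rounding the sample size up to the next whole unit is an assumption made here rather than something this guide states.

```python
import math


def required_sample_size(s0, se_target, N):
    """Eq 6: n = n0 / (1 + n0/N), with n0 = s0**2 / se_target**2."""
    n0 = s0 ** 2 / se_target ** 2
    # Assumption: round up so the required standard error is not exceeded.
    return math.ceil(n0 / (1 + n0 / N))


def proportion_variance(p, n):
    """5.5: sample variance of a 0/1 indicator, s^2 = n p (1 - p) / (n - 1)."""
    return n * p * (1 - p) / (n - 1)


def ratio_estimate(y, x, X_bar, N):
    """5.6 and Eq 7: ratio estimate of the population mean and its standard error."""
    n = len(y)
    f = n / N                                  # sampling fraction
    r_hat = sum(y) / sum(x)                    # equals y_bar / x_bar
    resid_ss = sum((yi - r_hat * xi) ** 2 for yi, xi in zip(y, x))
    se = math.sqrt((1 - f) / n * resid_ss / (n - 1))
    return X_bar * r_hat, se
```

For instance, with a prior standard deviation of 10, a required standard error of 1, and a population of 1000 units, $n_0 = 100$ and Eq 6 gives $100/1.1 \approx 90.9$, which `required_sample_size(10, 1, 1000)` rounds up to 91.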
Other estimates using the auxiliary variable include regression estimators and difference estimators (**2**). The best form of estimate depends on the relation of X to Y values and on how the variance of Y for given X relates to X.

## 6. Systematic Selection (SYS)

6.1 For systematic selection of a sample of $n$ from a list of $N$ sampling units when $N/n = k$ is an integer, a random integer between 1 and $k$ should be selected for the start, and every k-th unit thereafter is then selected. When $N/n$ is not an integer, a random integer between 1 and $N$ should be selected for the start and the nearest integer to $N/n$ added successively, subtracting $N$ when exceeded, to get selected units. Multip