Designation: D7372-17
An American National Standard

Standard Guide for
Analysis and Interpretation of Proficiency Test Program Results¹

This standard is issued under the fixed designation D7372; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon indicates an editorial change since the last revision or reapproval.

1. Scope*
1.1 This guide covers the evaluation and interpretation of proficiency test program (PTP) results. For proficiency test program participants, this guide describes procedures for assessing participants' results relative to the collective PT program results and potentially improving the laboratory's testing performance based on the assessment of findings and insights. For the committees responsible for the test methods included in PT programs, this guide describes procedures for assessing industry's ability to perform test methods and for potentially identifying opportunities for improvements.
1.2 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.

2. Referenced Documents
2.1 ASTM Standards:²
D6259 Practice for Determination of a Pooled Limit of Quantitation for a Test Method
D6299 Practice for Applying Statistical Quality Assurance and Control Charting Techniques to Evaluate Analytical Measurement System Performance
D6617 Practice for Laboratory Bias Detection Using Single Test Result from Standard Material
D6792 Practice for Quality Management Systems in Petroleum Products, Liquid Fuels, and Lubricants Testing Laboratories
E177 Practice for Use of the Terms Precision and Bias in ASTM Test Methods
E456 Terminology Relating to Quality and Statistics
E2655 Guide for Reporting Uncertainty of Test Results and Use of the Term Measurement Uncertainty in ASTM Test Methods
2.2 ASTM standards used only in Appendix X3 are also listed in X3.1.

3. Terminology
3.1 Definitions:
3.1.1 accuracy, n: closeness of agreement between an observed value and an accepted reference value. E177, E456
3.1.2 analytical measurement system, n: a collection of one or more components or subsystems, such as sample handling and preparation, test equipment, instrumentation, display devices, data handlers, printouts or output transmitters, that are used to determine a quantitative value of a specific property for an unknown sample in accordance with a standard test method.
3.1.3 assignable cause, n: factor that contributes to variation and that is feasible to detect and identify. E456
3.1.4 bias, n: systematic error that contributes to the difference between a population mean of the measurements or test results and an accepted reference or true value.
E177, E456
3.1.5 control limits, n: limits on a control chart that are used as criteria for signaling the need for action or for judging whether a set of data does or does not indicate a state of statistical control. E456
3.1.6 in-statistical-control, adj: process, analytical measurement system, or function that exhibits variations that can only be attributable to common cause. D6299
3.1.7 out-of-statistical-control, adj: a process, analytical measurement system, or function that exhibits variations in addition to those that can be attributable to common cause and the magnitude of these additional variations exceeds specified limits. D6299

¹ This guide is under the jurisdiction of ASTM Committee D02 on Petroleum Products, Liquid Fuels, and Lubricants and is the direct responsibility of Subcommittee D02.94 on Coordinating Subcommittee on Quality Assurance and Statistics.
Current edition approved Oct. 1, 2017. Published October 2017. Originally approved in 2007. Last previous edition approved in 2012 as D7372-12. DOI: 10.1520/D7372-17.
² For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org.
This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

3.1.8 proficiency testing, n: determination of a laboratory's testing capability by participation in an interlaboratory proficiency test program. D6299
3.1.9 proficiency test program (PTP), n: statistical quality assurance activities that enable laboratories to assess their performance in conducting test methods within their own laboratory when their data are compared against other laboratories that participate in the same program cycle using the same test method.
3.1.9.1 Discussion: Proficiency test programs are also known as crosscheck programs and check schemes. The term Interlaboratory Crosscheck Program (ILCP) was previously used by ASTM for its PTP with Committee D02.
3.1.10 test performance index, industry (TPI_IND), n: an approximate measure of a PT program's testing capability for a specific test method, defined as the ratio of the ASTM reproducibility (R_ASTM) to the these data reproducibility (R_these data).
3.1.11 uncertainty, n: an indication of the magnitude of error associated with a value that takes into account both systematic errors and random errors associated with the measurement or test process. E2655
3.1.12 Z-score, n: standardized and dimensionless measure of the difference between an individual result in a data set and the arithmetic mean of the dataset, re-expressed in units of standard deviation of the dataset by dividing the actual difference from the mean by the standard deviation for the data set. D6299
3.1.12.1 Discussion: The Z-score term described here is equivalent to Eq. A1.3 in Practice D6299.
3.1.13 Z′-score, n: measure similar to the Z-score except that the PT program standard deviation is replaced with one that takes into account the site precision of the laboratory.
Z′ is a valid approach when the laboratory's site precision standard deviation is less than that for the PT program (that is, these data standard deviation), or stated otherwise, when the TPI > 1.

$$ Z' = \frac{X_i - \bar{X}}{\sqrt{s'^2 + \left( \dfrac{s_{\text{these data}}^2}{n} \right)}} $$

where:
Z′ = site precision adjusted Z-score,
X_i = laboratory's result,
X̄ = PT average value,
s′ = site precision standard deviation estimate,
s_these data = PT program standard deviation estimate, and
n = number of non-outlier data.
3.1.13.1 Discussion: The Z′-score described here is equivalent to Eq. 2 in Practice D6299 for pre-treated results, when the “standard error of ARV” is expressed as “standard deviation of ARV/√n.”
3.2 Definitions of Terms Specific to This Standard:
3.2.1 common (chance, random) cause, n: for quality assurance programs, one of generally numerous factors, individually of relatively small importance, that contributes to variation, and that is not feasible to detect or control. D6299
3.2.2 site precision (R′), n: value below which the absolute difference between two individual test results obtained under site precision conditions may be expected to occur with a probability of approximately 0.95 (95 %). It is calculated as 2.77 times the standard deviation of results obtained under site precision conditions. D6299
3.2.3 site precision conditions, n: conditions under which test results are obtained by one or more operators in a single site location practicing the same test method on a single measurement system, which may comprise multiple instruments, using test specimens taken at random from the same sample of material, over an extended period of time spanning at least a 15 day interval.
D6299
3.2.4 these data, n: term used by the ASTM International D02 PT program to identify statistical results calculated from the data submitted by program participants.
3.3 Symbols:
3.3.1 I: individual observation (as in I-chart).
3.3.2 PTP or PT program: proficiency test program.
3.3.3 QC: quality control.
3.3.4 R′: site precision.
3.3.5 R_these data: reproducibility determined in PT program.
3.3.6 r_these data: repeatability determined in PT program.
3.3.7 R_ASTM: published ASTM reproducibility.

4. Summary of Guide
4.1 Petroleum product, liquid fuel, and lubricant samples are regularly analyzed by specified standard test methods as part of a proficiency test program. This guide provides a laboratory with the tools and procedures for evaluating their results from a PT program. Techniques are presented to screen, plot, and interpret test results in accordance with industry-accepted practices.

5. Significance and Use
5.1 This guide can be used to evaluate the performance of a laboratory or group of laboratories participating in a proficiency test (PT) program involving petroleum and petroleum products.
5.2 Data accrued, using the techniques included in this guide, provide the ability to monitor analytical measurement system precision and bias. These data are useful for updating standard test methods, as well as for indicating areas of potential measurement system improvement for action by the laboratory.
This guide serves both the individual participating laboratory and the responsible standards development group as follows:
5.2.1 Tools and Approaches for Participating Laboratories:
Administrative Reviews
Flagged Data and Investigations
Data Normality Checks
QQ Plots
Histograms
Bias (Deviation from Mean)
Z-Scores, Z′-Scores Trends
Precision Performance
TPI_IND, F-test
Comparison of PTP and Individual Laboratory Site Precision
5.2.2 Tools and Approaches for Responsible Standards Development Groups:
TPI and precision trends
Bias and precision comparisons via box and ...

... includes the mean and the 1st and 99th percentile limits on the histogram for data sets with n 100. These limits are based on “median ± 2.33 Standard Deviation,” where ±2.33 are respectively the first and 99th percentiles of the standard normal distribution.
6.5.2 PT program participants should review histograms when available and note unusual data distributions. Participants should locate where their result falls within the histogram bins. Depending on the histogram, the location of data in certain bins could indicate a potential issue such as bias. Consider reviewing the histogram in parallel with corresponding statistics such as the Z-score, AD statistic, TPI Industry, and the normal probability or deviate plot. See X3.2 for examples.
6.6 Single Laboratory Bias (Deviation from Mean):
6.6.1 As mentioned in Practice D6299, subsection 7.6, it is appropriate to evaluate proficiency test results by plotting the signed deviations from the mean for each result for each test cycle. Practice D6299 suggests plotting the signed deviations on control charts. Laboratories would then apply the strategies outlined in that standard to identify outliers and other issues such as long-term biases. The recommended control chart is a chart of individual observations (called an I-Chart) with an exponentially weighted moving average (EWMA) overlaid on the data.
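As an informal illustration of the I-chart data described in 6.6.1, the following sketch computes the signed deviation from the PT mean for each cycle and an EWMA of those deviations. The function names, the smoothing weight `lam = 0.4`, the choice to seed the EWMA with the first deviation, and the sample values are all assumptions made for this example; control limits and run rules are not shown and should be taken from Practice D6299.

```python
from typing import List, Sequence


def signed_deviations(lab_results: Sequence[float],
                      pt_means: Sequence[float]) -> List[float]:
    """Signed deviation (lab result minus PT mean) for each test cycle."""
    if len(lab_results) != len(pt_means):
        raise ValueError("one PT mean is needed per lab result")
    return [x - m for x, m in zip(lab_results, pt_means)]


def ewma(values: Sequence[float], lam: float = 0.4) -> List[float]:
    """Exponentially weighted moving average.

    The series is seeded with the first value (an assumption of this sketch),
    so the first EWMA point equals the first observation.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must be in (0, 1]")
    smoothed: List[float] = []
    for v in values:
        prev = smoothed[-1] if smoothed else v
        smoothed.append(lam * v + (1.0 - lam) * prev)
    return smoothed


# Hypothetical data: five PT cycles for one test method.
lab = [10.2, 10.5, 9.9, 10.4, 10.6]
pt_mean = [10.0, 10.1, 10.0, 10.0, 10.1]

dev = signed_deviations(lab, pt_mean)
trend = ewma(dev)
for cycle, (d, t) in enumerate(zip(dev, trend), start=1):
    print(f"cycle {cycle}: deviation {d:+.2f}, EWMA {t:+.2f}")
```

A run of deviations with the same sign, or an EWMA that drifts steadily away from zero, is the kind of pattern that would prompt a check for long-term bias.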
See X3.3 for examples.
6.6.2 Another graphical approach for monitoring bias involves use of box and whisker graphs. As is the case for reviewing histograms, laboratories should use the box and whisker graphs to observe where their particular result lies in the graph relative to the general distribution of results for the test method they used. Consider investigating any data outside the whisker end, if those data were not flagged already for other causes. A review of the apparent distribution of results for each test method measuring the same parameter may provide valuable insight regarding overall biases between methods. See 7.2 for more information on box and whisker plots.
6.6.3 Another statistical approach for evaluating bias is described in Practice D6617. That practice estimates whether or not a single test result is biased compared to the consensus value from the PT program.
6.7 Z-score, Z′-score Trends: The Z-score or Z′-score, or both, calculated for each datum submitted by the laboratory should be reviewed with respect to the following:
6.7.1 Sign and Magnitude of Z-score: The sign (that is, “+” or “−”) of the statistic reflects the relative bias of the individual result versus the mean of the sample group, standardized to the standard deviation of that data set. Z-score values falling in the ranges of plus or minus 0 to 1, 1 to 2, 2 to 3, and >3 can be compared to control chart values falling in the ranges between the mean and 1-sigma, 1 to 2-sigma, 2 to 3-sigma, and >3-sigma. For normally distributed data, there is an expectation that about 68 % of the data will lie in the −1 sigma to +1 sigma range, about 95 % in the −2 sigma to +2 sigma range, and over 99 % in the −3 to +3 sigma range. The further a laboratory's Z-score is from zero, the greater the relative bias and the lower the probability that the data is considered within statistical control.
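The calculations behind 6.7.1 can be sketched as follows. This is a minimal illustration, assuming the Z′ formula of 3.1.13 is read as the deviation from the PT mean divided by the square root of (s′² + s_these data²/n). The function names, the treatment of values that fall exactly on a band boundary, and the sample numbers are assumptions introduced here, not part of the guide.

```python
import math


def z_score(x_i: float, pt_mean: float, s_these_data: float) -> float:
    """Deviation from the PT mean in units of the PT program standard deviation."""
    if s_these_data <= 0:
        raise ValueError("s_these_data must be positive")
    return (x_i - pt_mean) / s_these_data


def z_prime_score(x_i: float, pt_mean: float, s_site: float,
                  s_these_data: float, n: int) -> float:
    """Site precision adjusted score, following 3.1.13 as read in this sketch."""
    if n < 1 or s_site <= 0 or s_these_data <= 0:
        raise ValueError("n must be >= 1 and standard deviations positive")
    if s_site >= s_these_data:
        # 3.1.13 treats Z' as valid only when the site precision standard
        # deviation is less than the PT program standard deviation.
        raise ValueError("Z' assumes s_site < s_these_data")
    return (x_i - pt_mean) / math.sqrt(s_site ** 2 + s_these_data ** 2 / n)


def sigma_band(score: float) -> str:
    """Map a score to the magnitude bands named in 6.7.1.

    Assigning a boundary value to the higher band is an assumption.
    """
    magnitude = abs(score)
    if magnitude < 1:
        return "0 to 1"
    if magnitude < 2:
        return "1 to 2"
    if magnitude < 3:
        return "2 to 3"
    return "> 3"


# Hypothetical PT cycle: lab result 10.6, PT mean 10.0, 25 non-outlier results.
z = z_score(10.6, 10.0, 0.4)
zp = z_prime_score(10.6, 10.0, s_site=0.25, s_these_data=0.4, n=25)
print(f"Z = {z:+.2f} (band {sigma_band(z)}), Z' = {zp:+.2f} (band {sigma_band(zp)})")
```

The sign of each score carries the direction of the apparent bias, and the band gives a quick reading of its magnitude against the control chart zones.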
Conduct investigations to determine the cause of any perceived bias as needed.
6.7.2 Z-scores and/or Z′-score Trends Using Data from Multiple PTP Cycles: Collect the Z-scores or Z′-scores values for each test method parameter for successive PT program cycles on a control chart to show the trend over time. Plotting Z-scores or Z′-scores is more practical than plotting the signed deviations from the mean (as in 6.2.1), especially when the magnitude of means can vary considerably from PT cycle to cycle. It is recommended to use the run rules promulgated in Practice D6299 to evaluate any observed trends. Conduct investigations to determine causes as needed. According to Practice D6299, Z-score and Z′-score data for a PT program cycle and test method parameter are acceptable for trend analysis via control charts when two conditions are met: first, there are at least 16 non-outlier data for the parameter; and second, the PT cycle standard deviation is not statistically greater than the reproducibility standard deviation for the test method (see F-test).
6.7.3 Average Z-score and Average Z′-score: Calculate the average Z-score or Z′-score for a series over a selected time period. The sign and magnitude of this result is an indication of the long-term relative bias. Conduct investigations to determine the cause of any perceived bias as needed.
6.8 Precision Performance:
6.8.1 TPI Industry: Assess the general capability of a test method using TPI_IND alone or along with other tool