Understanding PT performance assessment
Technical notes | 2024 | Eurachem
Summary
Significance of the topic
Proficiency testing (PT) is a cornerstone of laboratory quality assurance and accreditation. Quantitative PT schemes enable laboratories to benchmark measurement accuracy, evaluate uncertainty estimates, demonstrate metrological traceability and identify systematic bias or methodological problems. Clear understanding of PT performance metrics and how they are derived is essential for laboratories to interpret results appropriately, improve measurement processes, and satisfy regulatory or accreditation requirements.
Objectives and overview of the leaflet
This leaflet aims to explain how PT providers assess participant performance in quantitative PT schemes, following ISO 13528 guidance. It outlines the concepts of assigned value and its uncertainty, the choice and role of the standard deviation for proficiency assessment (spt), the principal performance scores used in PT (percent difference, z, z', zeta, En) and the conventional interpretation of those scores. The document also clarifies the responsibilities of PT providers when defining values and uncertainties and summarizes practical implications for participating laboratories.
Methodology and assessment framework
The PT assessment process compares each participant result (xi) to an assigned value (xpt) chosen by the PT provider. ISO 13528 describes five alternative strategies to obtain xpt; these range from using a certified or reference value independent of participant results to deriving a consensus value from participant data. Each strategy carries different implications for metrological traceability and for estimating the uncertainty of the assigned value u(xpt).
The uncertainty of the assigned value may be estimated by at least five corresponding approaches. It is reported either as a standard uncertainty u(xpt) or as an expanded uncertainty U(xpt) = k·u(xpt), where k is an appropriate coverage factor (commonly k = 2, giving approximately 95% coverage). The magnitude of u(xpt) relative to spt determines whether z or the modified z' score should be used.
Selection of the standard deviation for proficiency assessment spt must align with scheme objectives. ISO 13528 offers five possible ways to set spt (for example, a fixed percentage of xpt, a value based on historical interlaboratory data, or robust statistics derived from participant results). The PT provider must justify the chosen spt with respect to the scheme purpose (e.g., suitability for performance evaluation vs. strict metrological comparison).
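As an illustration of the consensus route, the following minimal Python sketch derives an assigned value, a robust standard deviation and u(xpt) from participant results. It assumes the median as xpt, the scaled median absolute deviation (1.483·MAD) as the robust standard deviation s*, and the approximation u(xpt) = 1.25·s*/sqrt(p) for p participants. The function name and the example data are hypothetical, and a real scheme may well use a different estimator (for example Algorithm A of ISO 13528).

```python
import math
import statistics


def consensus_summary(results):
    """Return (x_pt, s_star, u_x_pt) from a list of participant results.

    Assumptions (one possible choice, not the only one):
      - x_pt is the median of the participant results
      - s_star is the scaled median absolute deviation, 1.483 * MAD
      - u(x_pt) is approximated as 1.25 * s_star / sqrt(p)
    """
    p = len(results)
    x_pt = statistics.median(results)
    abs_deviations = [abs(x - x_pt) for x in results]
    s_star = 1.483 * statistics.median(abs_deviations)
    u_x_pt = 1.25 * s_star / math.sqrt(p)
    return x_pt, s_star, u_x_pt


# Hypothetical round with seven participants, one of them an obvious outlier.
example_results = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 14.0]
x_pt, s_star, u_x_pt = consensus_summary(example_results)
print(f"x_pt = {x_pt:.3f}, s* = {s_star:.3f}, u(x_pt) = {u_x_pt:.3f}")
```

Because the median and MAD are insensitive to the single extreme result, the outlier barely affects xpt or s* in this example, which is the reason robust estimators are favoured for consensus values.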
Performance scores are all based on the difference xi - xpt normalized by a scaling factor:
- Percent difference (D%): the relative difference expressed as a percentage of xpt, D% = 100·(xi - xpt) / xpt. The PT provider may specify a maximum permissible error dE, expressed in relative terms as dE% = 100·dE / xpt.
- z score: unitless, z = (xi - xpt) / spt, comparing deviation to the proficiency standard deviation. If u(xpt) is non-negligible (recommendation: u(xpt) > 0.3·spt), use z' to include u(xpt) in the denominator.
- z' score: unitless, z' = (xi - xpt) / sqrt[spt^2 + u(xpt)^2], a modified z that adds u(xpt) to the denominator to account for uncertainty in the assigned value.
- zeta (ζ) score: unitless, ζ = (xi - xpt) / sqrt[u(xpt)^2 + u(xi)^2], which evaluates agreement of the laboratory result with the assigned value within the combined standard uncertainties. ζ can therefore reveal either bias in xi or an under- or overestimated u(xi).
- En score: unitless, En = (xi - xpt) / sqrt[U(xpt)^2 + U(xi)^2], which uses expanded uncertainties (approximately 95% coverage) in the denominator and is commonly applied in metrological interlaboratory comparisons among calibration laboratories.
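The score definitions above can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: the function names and the numerical example are hypothetical, and the expanded uncertainties in the example are assumed to be obtained with k = 2.

```python
import math


def percent_difference(x_i, x_pt):
    """D%: difference relative to the assigned value, in percent."""
    return 100.0 * (x_i - x_pt) / x_pt


def z_score(x_i, x_pt, s_pt):
    """z: difference scaled by the standard deviation for proficiency assessment."""
    return (x_i - x_pt) / s_pt


def z_prime_score(x_i, x_pt, s_pt, u_x_pt):
    """z': like z, but u(x_pt) is added in quadrature to s_pt."""
    return (x_i - x_pt) / math.sqrt(s_pt**2 + u_x_pt**2)


def zeta_score(x_i, x_pt, u_x_i, u_x_pt):
    """zeta: difference scaled by the combined standard uncertainties."""
    return (x_i - x_pt) / math.sqrt(u_x_i**2 + u_x_pt**2)


def en_score(x_i, x_pt, U_x_i, U_x_pt):
    """En: difference scaled by the combined expanded uncertainties."""
    return (x_i - x_pt) / math.sqrt(U_x_i**2 + U_x_pt**2)


# Hypothetical participant result and scheme parameters.
x_i, u_x_i = 10.5, 0.20          # result and its standard uncertainty
x_pt, u_x_pt = 10.0, 0.12        # assigned value and its standard uncertainty
s_pt = 0.30                      # standard deviation for proficiency assessment
k = 2                            # assumed coverage factor for expanded uncertainties

print("D%   =", round(percent_difference(x_i, x_pt), 2))
print("z    =", round(z_score(x_i, x_pt, s_pt), 2))
print("z'   =", round(z_prime_score(x_i, x_pt, s_pt, u_x_pt), 2))
print("zeta =", round(zeta_score(x_i, x_pt, u_x_i, u_x_pt), 2))
print("En   =", round(en_score(x_i, x_pt, k * u_x_i, k * u_x_pt), 2))
```

The same difference xi - xpt yields quite different scores depending on the scaling factor, which is why the choice of score has to match the scheme objective.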
Main results and discussion
Key interpretive rules (ISO 13528:2022 clause 9.4.2):
- |z|, |z'| or |ζ| ≤ 2.0 — acceptable (satisfactory) performance.
- 2.0 < |score| < 3.0 — warning (questionable performance), merits investigation.
- |score| ≥ 3.0 — unacceptable (action signal), indicates significant disagreement.
- |En| < 1.0 and |D%| < dE% — considered successful/acceptable performance for those metrics.
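The decision limits listed above translate directly into a small classifier. The sketch below is only an illustration: the function names and the returned labels ("acceptable", "warning", "action") are hypothetical shorthand for the categories in the list, not terminology prescribed by the standard.

```python
def classify_score(score):
    """Classify a z, z' or zeta score using the conventional limits 2.0 and 3.0."""
    magnitude = abs(score)
    if magnitude <= 2.0:
        return "acceptable"
    if magnitude < 3.0:
        return "warning"
    return "action"


def en_is_acceptable(en):
    """En is judged against a single limit: |En| < 1.0."""
    return abs(en) < 1.0


def percent_difference_is_acceptable(d_percent, d_e_percent):
    """D% is judged against the provider's maximum permissible relative error."""
    return abs(d_percent) < d_e_percent


# Hypothetical scores from one PT round.
for label, value in [("lab A", 1.4), ("lab B", -2.6), ("lab C", 3.2)]:
    print(label, classify_score(value))
```

Taking the absolute value first means positive and negative deviations are treated symmetrically; the sign of the score is still worth recording, because a run of same-signed scores across rounds can point to bias.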
Table-based scoring: The leaflet’s table summarises which normalisation and decision limits apply to each score type; it emphasizes choosing the score appropriate for the scheme objective and for how uncertainties are to be treated (standard vs expanded). Figures referenced in the leaflet illustrate the five assignment strategies, the corresponding methods to estimate u(xpt) and the options for setting spt; these underline that different choices change the meaning and sensitivity of performance scores.
Practical implications include:
- When the assigned value and its uncertainty are derived from participant results (consensus approaches), robustness of the estimator and outlier treatment critically affect xpt and u(xpt).
- If u(xpt) is large relative to spt, ignoring it will understate the true dispersion and may misclassify performance, which is why z' is recommended in that case.
- ζ and En scores explicitly incorporate participant-reported uncertainties, revealing whether a laboratory’s uncertainty claims are consistent with interlaboratory variability.
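The effect of a non-negligible u(xpt) on the classification can be seen in a short numerical sketch. It assumes the 0.3·spt criterion mentioned earlier for switching from z to z'; the function name, the `ratio_limit` parameter and all numbers are hypothetical.

```python
import math


def z_or_z_prime(x_i, x_pt, s_pt, u_x_pt, ratio_limit=0.3):
    """Return (score_name, value), using z' when u(x_pt) > ratio_limit * s_pt."""
    difference = x_i - x_pt
    if u_x_pt > ratio_limit * s_pt:
        return "z'", difference / math.sqrt(s_pt**2 + u_x_pt**2)
    return "z", difference / s_pt


# Same participant result, two hypothetical uncertainties of the assigned value.
x_i, x_pt, s_pt = 10.66, 10.0, 0.30

name_small, score_small = z_or_z_prime(x_i, x_pt, s_pt, u_x_pt=0.05)
name_large, score_large = z_or_z_prime(x_i, x_pt, s_pt, u_x_pt=0.15)

print(f"u(x_pt) = 0.05: {name_small} = {score_small:.2f}")
print(f"u(x_pt) = 0.15: {name_large} = {score_large:.2f}")
```

With the smaller u(xpt) the plain z score exceeds 2 and would raise a warning signal; with the larger u(xpt) the widened denominator brings z' back below 2, so the same result is no longer flagged once the uncertainty of the assigned value is taken into account.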
Benefits and practical applications
Understanding PT scoring enhances a laboratory’s capability to:
- Validate and improve measurement methods through identification of bias and systematic errors.
- Assess the realism and adequacy of reported measurement uncertainties.
- Satisfy accreditation evidence requirements by demonstrating ongoing participation and monitoring of interlaboratory performance.
- Inform corrective actions, quality improvements and staff training based on objective performance indicators.
For PT providers, transparent selection and documentation of xpt, u(xpt) and spt strengthens credibility of schemes and supports participants in correct interpretation.
Future trends and possibilities for use
Anticipated developments and opportunities include:
- Broader adoption of robust statistical estimators and explicit outlier-handling rules to stabilise consensus-based assigned values.
- Greater emphasis on rigorous evaluation of u(xpt) and formal propagation of its effect into performance metrics (routine use of z' and ζ where appropriate).
- Enhanced digital reporting and automated analytics to detect patterns across PT rounds, enabling earlier identification of systemic issues.
- Integration of PT data with laboratory information management systems and accreditation workflows to streamline corrective action and documentation.
- Application of machine learning methods to classify atypical performance and to support dynamic selection of spt or decision limits based on accumulated data.
Conclusion
Quantitative PT performance assessment depends critically on the assigned value, its uncertainty, and the chosen standard deviation for proficiency assessment. Selecting appropriate assignment and uncertainty-estimation strategies, transparently reporting these choices, and applying the correct performance scores (and their interpretation) are essential for meaningful interlaboratory comparisons. Laboratories gain most from PT when they interpret scores in the context of method performance, uncertainty budgets and scheme objectives, and when PT findings feed into continuous improvement and accreditation evidence.
References
- ISO 13528:2022. Statistical methods for use in proficiency testing by interlaboratory comparison.
- Brookman, B. & Mann, I. (eds.). Eurachem Guide: Selection, Use and Interpretation of Proficiency Testing (PT) Schemes, 3rd ed., 2021.
- Eurachem leaflet. How can proficiency testing help my laboratory. Eurachem Proficiency Testing Working Group, 2024.
- Eurachem leaflet. Understanding PT statistics. Eurachem Proficiency Testing Working Group, 2024.
Content was automatically generated from an original PDF document using AI and may contain inaccuracies.