Understanding PT performance assessment

Technical notes | 2024 | Eurachem

Summary

Significance of the topic

Proficiency testing (PT) is a cornerstone of laboratory quality assurance and accreditation. Quantitative PT schemes enable laboratories to benchmark measurement accuracy, evaluate uncertainty estimates, demonstrate metrological traceability and identify systematic bias or methodological problems. Clear understanding of PT performance metrics and how they are derived is essential for laboratories to interpret results appropriately, improve measurement processes, and satisfy regulatory or accreditation requirements.

Objectives and overview of the leaflet

This leaflet aims to explain how PT providers assess participant performance in quantitative PT schemes, following ISO 13528 guidance. It outlines the concepts of assigned value and its uncertainty, the choice and role of the standard deviation for proficiency assessment (spt), the principal performance scores used in PT (percent difference, z, z', zeta, En) and the conventional interpretation of those scores. The document also clarifies the responsibilities of PT providers when defining values and uncertainties and summarizes practical implications for participating laboratories.

Methodology and assessment framework

The PT assessment process compares each participant result (xi) to an assigned value (xpt) chosen by the PT provider. ISO 13528 describes five alternative strategies to obtain xpt; these range from using a certified or reference value independent of participant results to deriving a consensus value from participant data. Each strategy carries different implications for metrological traceability and for estimating the uncertainty of the assigned value u(xpt).

The uncertainty of the assigned value can be estimated by an approach corresponding to each of the five strategies. It is reported either as a standard uncertainty u(xpt) or as an expanded uncertainty U(xpt) = k·u(xpt), where k is an appropriate coverage factor (k = 2 gives approximately 95% coverage for a normal distribution). The magnitude of u(xpt) relative to spt determines whether the z score or the modified z' score should be used.
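As a minimal sketch of this check, the Python snippet below converts an expanded uncertainty back to a standard uncertainty and tests whether u(xpt) is negligible relative to spt. The function names and the numeric inputs are illustrative assumptions; the 0.3·spt limit is the recommendation quoted for the z score in this summary.

```python
def standard_uncertainty(U_xpt: float, k: float = 2.0) -> float:
    """Recover u(xpt) from an expanded uncertainty U(xpt) = k * u(xpt)."""
    if k <= 0:
        raise ValueError("coverage factor k must be positive")
    return U_xpt / k


def needs_z_prime(u_xpt: float, s_pt: float, threshold: float = 0.3) -> bool:
    """Return True when u(xpt) is not negligible relative to spt."""
    return u_xpt > threshold * s_pt


# Hypothetical scheme data: U(xpt) = 0.8 with k = 2, and spt = 1.0.
u_xpt = standard_uncertainty(U_xpt=0.8, k=2.0)
use_z_prime = needs_z_prime(u_xpt, s_pt=1.0)
print(u_xpt, use_z_prime)
```

With these assumed numbers u(xpt) is 0.4, which exceeds 0.3·spt, so the sketch points to z' rather than z.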

Selection of the standard deviation for proficiency assessment spt must align with scheme objectives. ISO 13528 offers five possible ways to set spt (for example, a fixed percentage of xpt, a value based on historical interlaboratory data, or robust statistics derived from participant results). The PT provider must justify the chosen spt with respect to the scheme purpose (e.g., suitability for performance evaluation vs. strict metrological comparison).
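The effect of this choice can be seen in a small Python sketch. The same participant result is scored against two spt values: one set as a fixed percentage of xpt and one taken from historical data. All numbers, including the 5% figure and the historical value, are hypothetical and chosen only for illustration.

```python
def z_score(x_i: float, x_pt: float, s_pt: float) -> float:
    """z = (xi - xpt) / spt."""
    return (x_i - x_pt) / s_pt


x_pt = 50.0   # assigned value (hypothetical)
x_i = 54.0    # participant result (hypothetical)

s_pt_fixed = 0.05 * x_pt   # spt set as 5% of xpt (assumed percentage)
s_pt_history = 1.2         # spt taken from earlier rounds (assumed value)

z_fixed = z_score(x_i, x_pt, s_pt_fixed)
z_history = z_score(x_i, x_pt, s_pt_history)
print(round(z_fixed, 2), round(z_history, 2))
```

The deviation of 4.0 units gives a z of about 1.6 with the wider spt and about 3.3 with the tighter one, so the same result would be judged very differently depending on how spt was set.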

Performance scores are all based on the difference xi - xpt normalized by a scaling factor:

Percent difference (D%): the relative difference expressed as a percentage of the assigned value, D% = 100·(xi - xpt) / xpt. The PT provider may specify a maximum permissible relative error dE% = 100·dE / xpt, where dE is the maximum permissible error.
z score: unitless, z = (xi - xpt) / spt, comparing the deviation to the standard deviation for proficiency assessment. If u(xpt) is non-negligible (recommendation: u(xpt) > 0.3·spt), use z' instead.
z' score: unitless, z' = (xi - xpt) / sqrt[spt^2 + u(xpt)^2], a modified z that adds u(xpt) to the denominator to account for uncertainty in the assigned value.
zeta (ζ) score: unitless, ζ = (xi - xpt) / sqrt[u(xi)^2 + u(xpt)^2], evaluating agreement of the laboratory result with the assigned value within the combined standard uncertainties. A large |ζ| points either to bias in xi or to an underestimated u(xi); consistently small |ζ| values can indicate an overestimated u(xi).
En score: unitless, En = (xi - xpt) / sqrt[U(xi)^2 + U(xpt)^2], using the combined expanded uncertainties (approximately 95% coverage) in the denominator. It is commonly applied in metrological interlaboratory comparisons among calibration laboratories.
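The five scores can be sketched in Python as follows. This is an illustrative implementation of the definitions listed above, not code from the leaflet; the function names and the sample values are assumptions, and the expanded uncertainties are taken as k = 2 times the standard uncertainties.

```python
import math


def percent_difference(x_i, x_pt):
    """D% = 100 * (xi - xpt) / xpt."""
    return 100.0 * (x_i - x_pt) / x_pt


def z_score(x_i, x_pt, s_pt):
    """z = (xi - xpt) / spt."""
    return (x_i - x_pt) / s_pt


def z_prime_score(x_i, x_pt, s_pt, u_xpt):
    """z' = (xi - xpt) / sqrt(spt^2 + u(xpt)^2)."""
    return (x_i - x_pt) / math.sqrt(s_pt**2 + u_xpt**2)


def zeta_score(x_i, x_pt, u_xi, u_xpt):
    """zeta = (xi - xpt) / sqrt(u(xi)^2 + u(xpt)^2), standard uncertainties."""
    return (x_i - x_pt) / math.sqrt(u_xi**2 + u_xpt**2)


def en_score(x_i, x_pt, U_xi, U_xpt):
    """En = (xi - xpt) / sqrt(U(xi)^2 + U(xpt)^2), expanded uncertainties."""
    return (x_i - x_pt) / math.sqrt(U_xi**2 + U_xpt**2)


# Hypothetical round: all values below are made up for illustration.
x_i, x_pt, s_pt = 10.6, 10.0, 0.5
u_xi, u_xpt = 0.25, 0.2

scores = {
    "D%": percent_difference(x_i, x_pt),
    "z": z_score(x_i, x_pt, s_pt),
    "z'": z_prime_score(x_i, x_pt, s_pt, u_xpt),
    "zeta": zeta_score(x_i, x_pt, u_xi, u_xpt),
    "En": en_score(x_i, x_pt, 2 * u_xi, 2 * u_xpt),  # U = 2u assumed
}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Because u(xpt) is 0.4·spt in this made-up round, z' is noticeably smaller in magnitude than z, which is the situation the 0.3·spt recommendation is meant to catch.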

Main results and discussion

Key interpretive rules (ISO 13528:2022 clause 9.4.2):

|z|, |z'| or |ζ| ≤ 2.0: acceptable (satisfactory) performance.
2.0 < |score| < 3.0: warning signal (questionable performance), merits investigation.
|score| ≥ 3.0: action signal (unacceptable performance), indicates significant disagreement.
|En| < 1.0: acceptable performance for the En score. Likewise, |D%| < dE% is acceptable performance for the percent difference.
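The decision limits listed above translate directly into a few comparisons. The Python sketch below is one way to encode them; the function names and return labels are illustrative assumptions.

```python
def classify_z_like(score: float) -> str:
    """Classify a z, z' or zeta score using the 2.0 and 3.0 limits."""
    magnitude = abs(score)
    if magnitude <= 2.0:
        return "satisfactory"
    if magnitude < 3.0:
        return "questionable"   # warning signal
    return "unsatisfactory"     # action signal


def en_acceptable(en: float) -> bool:
    """En is acceptable when |En| < 1.0."""
    return abs(en) < 1.0


def percent_difference_acceptable(d_pct: float, dE_pct: float) -> bool:
    """D% is acceptable when |D%| < dE% (limit set by the PT provider)."""
    return abs(d_pct) < dE_pct


for s in (1.2, -2.5, 3.0):
    print(s, classify_z_like(s))
```

Note that the boundaries are asymmetric: a score of exactly 2.0 is still satisfactory, while a score of exactly 3.0 already triggers an action signal.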

Table-based scoring: The leaflet’s table summarises which normalisation and decision limits apply to each score type; it emphasizes choosing the score appropriate for the scheme objective and for how uncertainties are to be treated (standard vs expanded). Figures referenced in the leaflet illustrate the five assignment strategies, the corresponding methods to estimate u(xpt) and the options for setting spt; these underline that different choices change the meaning and sensitivity of performance scores.

Practical implications include:

When the assigned value and its uncertainty are derived from participant results (consensus approaches), robustness of the estimator and outlier treatment critically affect xpt and u(xpt).
If u(xpt) is large relative to spt, ignoring it understates the dispersion expected for xi - xpt, so |z| is inflated and laboratories may be misclassified; hence the recommendation to use z'.
ζ and En scores explicitly incorporate participant-reported uncertainties, revealing whether a laboratory's uncertainty claims are consistent with its observed deviation from the assigned value.
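The first point above, on consensus values, can be illustrated with a short Python sketch. It assumes the median as the robust location estimate, the scaled median absolute deviation (1.483 × MAD) as the robust standard deviation s*, and u(xpt) = 1.25·s*/sqrt(p) for p participants. These are simple, commonly used choices, but the summary does not say which estimator a given scheme applies, so treat them as assumptions. The data set is invented.

```python
import math
import statistics


def consensus_assigned_value(results):
    """Return (xpt, s_star, u_xpt) from participant results.

    Assumed estimators: median for xpt, 1.483 * MAD for the robust
    standard deviation, and 1.25 * s_star / sqrt(p) for u(xpt).
    """
    p = len(results)
    if p < 2:
        raise ValueError("need at least two participant results")
    x_pt = statistics.median(results)
    mad = statistics.median([abs(x - x_pt) for x in results])
    s_star = 1.483 * mad
    u_xpt = 1.25 * s_star / math.sqrt(p)
    return x_pt, s_star, u_xpt


# Invented round with one gross outlier (14.0).
results = [9.8, 10.1, 10.0, 10.3, 9.9, 10.2, 14.0]
x_pt, s_star, u_xpt = consensus_assigned_value(results)
print(f"median = {x_pt}, mean = {statistics.mean(results):.3f}")
print(f"s* = {s_star:.4f}, u(xpt) = {u_xpt:.4f}")
```

In this invented data set the single outlier pulls the arithmetic mean to about 10.6, while the median stays at 10.1, which is why the choice of estimator and the treatment of outliers matter for both xpt and u(xpt).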
