Evaluation of inter-observer variability in the grading of oral dysplasia using two different grading systems

Pathology and Laboratory Medicine


Objectives: To compare the inter-observer variability in grading of oral dysplasia between two consultant pathologists specialising in head and neck cancer, and one non-specialist, using both the WHO classification 2005 and the Binary Grading system (proposed by Kujan et al. 2005).
Methods: Eighty archived oral biopsy slides, comprising cases with different grades of dysplasia that were treated and followed up in a tertiary regional centre, were reviewed by three pathologists blinded to the initial diagnosis and the clinical outcome. The H&E slides were graded according to the criteria of both systems. Inter-observer reliability and variation were computed with kappa coefficient analysis.
Results: The overall inter-observer agreement was κ = 0.21 (95% CI: 0.10–0.34) for the WHO grading system and κ = 0.55 (95% CI: 0.42–0.71) for the binary system. Agreement in grading of the lesions was closer between the two experts (WHO κ = 0.40, 95% CI: 0.28–0.53; binary κ = 0.59, 95% CI: 0.34–0.80) than between an expert and the non-expert (WHO κ = 0.18, 95% CI: 0.05–0.30; binary κ = 0.32, 95% CI: 0.15–0.50). Kappa agreement among the three observers on individual architectural and cytological features of the binary system varied widely, being highest for increased mitotic figures (κ = 0.49, 95% CI: 0.25–0.72) and lowest for atypical mitotic figures (κ = 0.15, 95% CI: 0.01–0.32).
Conclusions: Agreement among all three pathologists was closer with the binary system than with the WHO system, but both systems showed at best only moderate agreement. Improving agreement on the scoring of individual features of the binary system should improve the overall agreement of this system.
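
Illustrative note: the kappa statistic reported above corrects observed agreement for the agreement expected by chance, κ = (p_o − p_e) / (1 − p_e). The abstract does not state which kappa variant was used (pairwise Cohen's kappa, a multi-rater form such as Fleiss' kappa, or a weighted form), so the following is only a minimal sketch of unweighted Cohen's kappa for one pair of observers. The function name cohen_kappa and the ten binary grades are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same items.

    Hypothetical helper for illustration; not the study's analysis code.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must grade the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same grade by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over grades.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical grade throughout; kappa is undefined,
        # and returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up binary grades ("low" / "high" risk) for ten slides; NOT study data.
grades_a = ["low", "low", "high", "high", "low", "high", "low", "low", "high", "low"]
grades_b = ["low", "high", "high", "high", "low", "low", "low", "low", "high", "low"]

kappa = cohen_kappa(grades_a, grades_b)
print(f"observed agreement = 0.80, kappa = {kappa:.2f}")
```

In this made-up example the two raters agree on 8 of 10 slides (p_o = 0.80), chance agreement is p_e = 0.52, and kappa is therefore about 0.58, which shows how a high raw agreement can correspond to only moderate chance-corrected agreement.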

Publication (Name of Journal)

Clinical Otolaryngology