Classification and treatment of proximal humerus fractures: inter-observer reliability and agreement across imaging modalities and experience
1 Department of Orthopaedic Surgery and Sports Medicine, Temple University School of Medicine, 3401 N. Broad Street, Philadelphia, PA 1914, USA
2 Department of Physiology, Temple University School of Medicine, 3500 N. Broad Street, Philadelphia, PA 19141, USA
3 Rothman Institute, Thomas Jefferson University Hospital, 925 Chestnut Street, Philadelphia, PA 19107, USA
Journal of Orthopaedic Surgery and Research 2011, 6:38 doi:10.1186/1749-799X-6-38
Published: 29 July 2011
Proximal humerus fractures (PHF) are common injuries, but previous studies have documented poor inter-observer reliability in fracture classification. This disparity has been attributed to multiple variables including poor imaging studies and inadequate surgeon experience. The purpose of this study is to evaluate whether inter-observer agreement can be improved with the application of multiple imaging modalities including X-ray, CT, and 3D CT reconstructions, stratified by physician experience, for both classification and treatment of PHFs.
Inter-observer agreement was measured for classification and treatment of PHFs. Sixteen fractures were imaged by plain X-ray (scapular AP and lateral), CT scan, and 3D CT reconstruction, yielding 48 randomized image sets. The observers were 16 orthopaedic surgeons (4 upper extremity specialists, 4 general orthopedists, 4 senior residents, and 4 junior residents), who were asked to classify each image set using the Neer system and to recommend treatment from four pre-selected choices. The results were evaluated by kappa reliability coefficients for inter-observer agreement across all imaging modalities, and were sub-divided by fracture type and observer experience.
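The abstract does not state which kappa statistic was computed. As an illustration only, the following minimal sketch assumes Fleiss' kappa, a common choice when more than two observers rate the same items. The function name `fleiss_kappa`, the category labels, and the toy ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for multi-rater agreement (assumed variant, see lead-in).

    ratings: list of per-item lists of labels; every item must be rated
             by the same number of observers.
    categories: list of all possible labels.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(row) != n_raters for row in ratings):
        raise ValueError("every item needs the same number of ratings")

    # counts[i][j] = number of observers assigning item i to category j
    counts = []
    for row in ratings:
        tally = Counter(row)
        counts.append([tally[c] for c in categories])

    # Observed agreement: mean proportion of agreeing observer pairs per item
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of each category
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_chance = sum(s * s for s in shares)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical toy data: 3 image sets, 4 observers each (not study data)
neer_labels = ["2-part", "3-part", "4-part"]
toy_ratings = [
    ["2-part", "2-part", "3-part", "2-part"],
    ["4-part", "4-part", "4-part", "3-part"],
    ["3-part", "2-part", "3-part", "4-part"],
]
print(round(fleiss_kappa(toy_ratings, neer_labels), 3))
```

In this sketch, one kappa would be computed per imaging modality, and again within each experience group or fracture type, to mirror the sub-divisions described above.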
All kappa values ranged from "slight" to "moderate" agreement (κ = 0.03 to 0.57). For overall classification and treatment, no advanced imaging modality had significantly higher scores than X-ray. However, when sub-divided by experience, 3D reconstruction and CT scan both had significantly higher agreement on classification than X-ray among upper extremity specialists. Agreement on treatment among upper extremity specialists was best with CT scan. No other experience sub-division had significantly different kappa scores. When sub-divided by fracture type, CT scan and 3D reconstruction had higher scores than X-ray for classification only in 4-part fractures. Agreement on treatment of 4-part fractures was best with CT scan. No other fracture type sub-division had significantly different kappa scores.
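The labels "slight" and "moderate" match the widely used Landis and Koch (1977) bands for interpreting kappa. The abstract does not name the scale, so the mapping below is an assumption, and the helper name `interpret_kappa` is hypothetical.

```python
def interpret_kappa(kappa):
    """Map a kappa value to a Landis and Koch (1977) agreement label.

    Assumed scale; the abstract does not say which bands were used.
    """
    if kappa < 0:
        return "poor"
    # (upper bound, label) pairs in ascending order
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# The extremes reported in the abstract
print(interpret_kappa(0.03), interpret_kappa(0.57))
```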
Although 3D reconstruction showed a slight improvement in inter-observer agreement for fracture classification among specialized upper extremity surgeons, all imaging modalities overall continue to yield low inter-observer agreement for both classification and treatment, regardless of fracture type or physician experience.