BACKGROUND: When Cox regression models are used to analyse time-to-event data, the proportional hazards assumption (PHA) must be verified to obtain valid results. Transparent reporting of the statistical methods used is therefore essential for interpreting research findings. This study aimed to assess the quality of statistical reporting and of PHA testing in subgroup analyses of surgical randomised controlled trials (RCTs).
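For orientation, the PHA can be stated with the standard textbook form of the Cox model. This is a general sketch rather than the specific model fitted in any of the reviewed trials, and the symbols ($h_0$, $\beta$, $x$) are introduced here for illustration only.

```latex
% Cox model: hazard at time t for a subject with covariate vector x
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)

% Hazard ratio between two subjects with covariates x_1 and x_2:
% the baseline hazard h_0(t) cancels, so the ratio does not depend on t
\frac{h(t \mid x_1)}{h(t \mid x_2)} = \exp\!\left(\beta^{\top}(x_1 - x_2)\right)
```

The PHA is the requirement that this ratio stays constant over follow-up; if it changes with time, a single reported hazard ratio no longer summarises the treatment effect adequately.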
METHODS: In a literature review, all articles published from 2019 to 2021 in the top quartile (25%) of surgical journals, ranked by the Clarivate™ Journal Citation Reports impact factor, were screened (see appendix 1). Subgroup analyses of surgical RCT data that used Cox models were identified. The primary endpoint was statistical reporting quality, rated using a previously established 12-item PHA Reporting Score. Methodological reporting quality was rated according to the CONSORT statement. For original surgical publications, the PHA was formally tested on time-to-event data reconstructed from published Kaplan-Meier estimators. This digitisation was only possible in studies that published a Kaplan-Meier estimator including the numbers at risk per time interval. All results from the subgroup analyses were compared with primary surgical RCT reports and with benchmark RCTs using Cox models published in the New England Journal of Medicine and The Lancet.
RESULTS: Thirty-two studies reporting secondary subgroup analyses of surgical RCT data using Cox models were identified. Statistical reporting in surgical subgroup publications was significantly inferior to that in original benchmark publications: median PHA Reporting Score 50% (interquartile range [IQR]: 39 to 58) vs 58% (IQR: 42 to 67), p <0.001. Subgroup publications did not differ significantly from primary surgical RCT reports: median PHA Reporting Score 50% (IQR: 39 to 58) vs 42% (IQR: 33 to 58), p = 0.286. Adherence to the CONSORT reporting standards differed significantly between subgroup studies and benchmark publications (p <0.001) as well as between subgroup studies and primary surgical RCT reports: 13 (IQR: 12.5 to 14) vs 13 (IQR: 11 to 13), p = 0.042.
CONCLUSION: Statistical methodological reporting of secondary subgroup analyses from surgical RCTs was inferior to benchmark publications but not worse than primary surgical RCT reports. A comprehensive statistical review process and statistical reporting guidelines might help improve the reporting quality.