
Systematic review

Vol. 155 No. 2 (2025)

The key role of base rates: systematic review and meta-analysis of the predictive value of four risk assessment instruments

DOI: https://doi.org/10.57187/s.3517
Cite this as: Swiss Med Wkly. 2025;155:3517
Published: 14.02.2025

Summary

AIMS OF THE STUDY: Many countries have seen a decline in recidivism rates over the past decades. These base rates are pertinent information for assessing the recidivism risk of offenders. They provide a foundation for clinical assessment and an empirical basis for risk assessment instrument norms, which inform expected recidivism rates. The present study explored the extent to which base rates influence the validity of risk assessment instruments.

METHODS: We systematically reviewed the available evidence on the discrimination ability of four well-established risk assessment instruments used to estimate the probability of recidivism for general (Level of Service Inventory-Revised [LSI-R]), violent (Violence Risk Appraisal Guide [VRAG]), sexual (Static-99R), and intimate partner violent offences (Ontario Domestic Assault Risk Assessment [ODARA]). We conducted a bivariate logit-normal random effects meta-analysis of sensitivity and false positive rates and modelled the positive and negative predictive values. We used base rates as reported in (a) the construction samples of each risk assessment instrument and (b) recent official statistics and peer-reviewed articles for different offence categories and countries. To assess the risk of bias, we used the Joanna Briggs Institute Critical Appraisal Checklist for Diagnostic Test Accuracy Studies.

RESULTS: We screened 644 studies and subsequently analysed 102, of which 96 were included in the systematic review and 24 in the meta-analyses. Discrimination was comparable for all four instruments (median area under the curve = 0.68–0.71). The information needed to calculate summary statistics of sensitivity and false positive rate was often not reported, and a risk of bias may be present in up to half of the studies. The largest summary sensitivity and false positive rate were estimated for the ODARA, followed by the LSI-R, the VRAG, and the Static-99R. If base rates are low, positive predictive values tend to be relatively low, while negative predictive values are higher: positive predictive value = 0.032–0.133 and negative predictive value = 0.985–0.989 for sexual offences; positive predictive value = 0.188–0.281 and negative predictive value = 0.884–0.964 for intimate partner violence; positive predictive value = 0.218–0.241 and negative predictive value = 0.907–0.942 for violent offences; positive predictive value = 0.335–0.377 and negative predictive value = 0.809–0.810 for general offences.
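
The dependence of predictive values on the base rate follows from Bayes' theorem and can be illustrated with a minimal Python sketch. The function name `predictive_values` and the sensitivity and false positive rate used below are hypothetical and chosen only for illustration; they are not the summary estimates of this study, and the sketch is not the authors' modelling code.

```python
def predictive_values(sensitivity, false_positive_rate, base_rate):
    """Return (PPV, NPV) for a dichotomised instrument via Bayes' theorem.

    All three arguments are proportions strictly between 0 and 1.
    """
    specificity = 1.0 - false_positive_rate

    # Expected proportions of the assessed population in each cell
    true_pos = sensitivity * base_rate
    false_pos = false_positive_rate * (1.0 - base_rate)
    true_neg = specificity * (1.0 - base_rate)
    false_neg = (1.0 - sensitivity) * base_rate

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical accuracy values (assumptions, not results from the study)
SENSITIVITY = 0.60
FALSE_POSITIVE_RATE = 0.30

for base_rate in (0.05, 0.20, 0.40):
    ppv, npv = predictive_values(SENSITIVITY, FALSE_POSITIVE_RATE, base_rate)
    print(f"base rate {base_rate:.2f}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With sensitivity and false positive rate held fixed, lowering the base rate lowers the positive predictive value and raises the negative predictive value, which is the pattern described above for the four offence categories.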

CONCLUSIONS: When interpreting the results of individual risk assessments, it is not sufficient to provide the discrimination of the instrument; the risk statement must also address the positive predictive value and discuss its implications for the specific case. As recidivism rates are neither stable over time nor uniform across countries or samples, the primary interpretation of risk assessment instruments should rely on the percentile rank. Expected recidivism rates should be interpreted with caution. However, our results are drawn from a limited database, as studies not reporting sufficient information were excluded from analyses and it was only possible to identify current base rates for modelling positive and negative predictive values for certain countries. International standards for consistently collecting and reporting base rates are important to better identify crime trends. Future research on the validity of risk assessment instruments should follow rigorous reporting standards.

