Background
A Measurement Tool to Assess Systematic Reviews (AMSTAR2) is a 16-item scale for the critical appraisal of systematic reviews (SRs) of healthcare interventions (Shea et al., 2017). AMSTAR2 classifies the overall confidence in an SR as high, moderate, low or critically low according to a combination of scores on seven critical and nine non-critical items. Recent studies show that AMSTAR2 assigns mostly low to critically low ratings to SRs of various healthcare interventions (De Santis & Kaplan, 2020; Lorenz et al., 2020; Matthias et al., 2020), although the methods used to derive the overall ratings are often not adequately reported (Pieper et al., 2021). One possible reason is that AMSTAR2 users apply other algorithms to derive the overall confidence ratings in an attempt to improve the discriminatory power of the scale.
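The combination of critical and non-critical items described above can be sketched in code. The example below is a minimal illustration of the rating scheme proposed by Shea et al. (2017), not the procedure used in any particular overview. The function name, the input format (a list of item numbers judged to be flaws or weaknesses) and the decision to ignore 'partial yes' responses are assumptions made for illustration only.

```python
# Minimal sketch of the overall confidence scheme in Shea et al. (2017).
# Assumption: each SR is described by the AMSTAR2 item numbers (1-16)
# that the appraiser judged to be a flaw or weakness.

# Item numbers treated as critical domains by Shea et al. (2017).
CRITICAL_ITEMS = frozenset({2, 4, 7, 9, 11, 13, 15})
ALL_ITEMS = frozenset(range(1, 17))


def overall_confidence(flawed_items):
    """Return 'high', 'moderate', 'low' or 'critically low' for one SR."""
    flawed = set(flawed_items)
    if not flawed <= ALL_ITEMS:
        raise ValueError("AMSTAR2 item numbers must be between 1 and 16")

    critical_flaws = len(flawed & CRITICAL_ITEMS)
    non_critical_weaknesses = len(flawed - CRITICAL_ITEMS)

    if critical_flaws > 1:
        return "critically low"
    if critical_flaws == 1:
        return "low"
    if non_critical_weaknesses > 1:
        return "moderate"
    return "high"  # no or one non-critical weakness


if __name__ == "__main__":
    # Hypothetical SR with flaws on two critical items and one non-critical item.
    print(overall_confidence([2, 7, 10]))
```

Under this scheme a single critical flaw is enough to cap the rating at low, which is one way to see why the scale assigns so few moderate or high ratings.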
Objective
Our objective was to assess the application of AMSTAR2 in studies with homogeneous designs (overviews of SRs) and aims (appraisal of SRs) in one clinical field (interventions for mental and behavioural disorders, MBD).
Research question
How is AMSTAR2 applied in overviews of SRs of interventions for MBD?
Method/Approach
We registered the study protocol (https://osf.io/g5437/) and adhered to the STROBE guidelines (von Elm et al., 2007).
Study design/setting. We used a cross-sectional design to descriptively assess how AMSTAR2 was applied in one group of studies published in peer-reviewed journals.
Data source. Two authors independently selected the studies from a title/abstract search for ‘AMSTAR2’ in Medline, Epistemonikos and CINAHL up to September 2020. The inclusion criteria were:
- Study design: overview of SRs,
- PICO: a) Population with MBD, b) Intervention: any pharmacological and/or non-pharmacological (complementary or alternative), c) Control: any or none, d) Outcome: any clinical,
- AMSTAR2: at least one appraisal conducted.
Variables/Measurement. Contrary to the protocol, two authors (rather than one) independently coded all data using a self-developed coding sheet in Excel and reached consensus through discussion. The variables included study characteristics (citation, intervention), data (SR type and age) and outcomes (AMSTAR2 appraisal methods and results).
Data analysis. All data were analysed using descriptive statistics.
Results
Sample size. Of the 267 studies from the electronic literature search, 14 overviews met our eligibility criteria.
Study characteristics. All 14 overviews used AMSTAR2 to appraise the confidence in SRs of predominantly non-pharmacological interventions for MBD. The overviews had seven authors on average, were published in 2019 or 2020 and originated from Asia (9/14), Europe (3/14), Australia and North America (1/14 each).
Study data. The 14 overviews appraised 286 SRs (4 to 64 SRs per overview) of randomised-controlled trials (RCTs) or non-RCTs. Most SRs were non-Cochrane (94%, 270/286) and 49% (145/286) were published within the five years preceding the respective overview.
Study outcomes. Although six overviews (6/14) had a protocol, only two of these (2/6) applied the planned appraisal methods. On average, 20 AMSTAR2 appraisals were conducted per overview, either independently by at least two authors (11/14) or using other methods (3/14). The overall confidence ratings were based on the seven critical items (10/14), based on other unspecified methods (2/14) or not derived (2/14). In the 10 overviews that based the overall confidence ratings on the seven critical items, the 187 appraised SRs were rated critically low (65%, 121/187), low (25%, 46/187), moderate (1%, 2/187) or high (10%, 18/187). Most overviews (12/14) discussed the strengths or weaknesses of the appraised SRs based on the individual AMSTAR2 items and half applied the GRADE approach for SR appraisal.
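The distribution of overall confidence ratings can be recomputed from the reported counts. The short sketch below is an illustration only: the variable and function names are hypothetical, and the counts are copied from the paragraph above rather than taken from the original coding sheet.

```python
# Counts of overall confidence ratings for the 187 SRs appraised in the
# 10 overviews that used the seven critical items (taken from the text).
rating_counts = {
    "critically low": 121,
    "low": 46,
    "moderate": 2,
    "high": 18,
}


def rating_percentages(counts):
    """Return each rating's share of all appraised SRs, rounded to whole percent."""
    total = sum(counts.values())
    return {rating: round(100 * n / total) for rating, n in counts.items()}


if __name__ == "__main__":
    total = sum(rating_counts.values())
    for rating, pct in rating_percentages(rating_counts).items():
        print(f"{rating}: {pct}% ({rating_counts[rating]}/{total})")
```

Because each share is rounded separately, the four percentages add up to 101% rather than 100%.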
Conclusions and implications
Most overviews in our study applied AMSTAR2 appropriately in terms of study selection (SRs of interventions for MBD), appraisal conduct (independently by two authors) and reporting of individual item scores per SR. However, our results are based on a small sample of overviews in one clinical field and may not reflect how researchers conduct AMSTAR2 appraisals in other fields. Although most overviews reported the overall confidence ratings, the methods used to derive these ratings were incompletely reported, similar to findings in other meta-research studies (Pieper et al., 2021). In accordance with other studies (De Santis & Kaplan, 2020; Lorenz et al., 2020; Matthias et al., 2020), we show that AMSTAR2 continues to assign mostly critically low ratings to SRs of healthcare interventions. The low confidence ratings could be due to inadequate reporting of the information required for scoring AMSTAR2 items, especially considering that about half of all appraised SRs in our study were older than AMSTAR2 (published before 2017). It can be speculated that the standards for SRs have changed since 2017 because some journals now require preregistration and explicit adherence to reporting guidelines and offer online space for additional materials. Rather than focusing entirely on the poor overall confidence ratings, most overviews addressed the reasons for the weaknesses in the appraised SRs based on the individual AMSTAR2 items. Thus, AMSTAR2 could guide the preparation of future SRs of healthcare interventions (De Santis & Kaplan, 2020). The overall confidence ratings on AMSTAR2 may become more useful for discriminating among SRs if authors adhere to the available reporting guidelines and if peer reviewers and journal editors provide sufficient advice before SRs are published.
References
De Santis, K., & Kaplan, I. (2020). Assessing the quality of systematic reviews in healthcare using AMSTAR and AMSTAR2: A comparison of scores on both scales. Zeitschrift für Psychologie, 228(1), 36-42.
Lorenz, R. C., et al. (2020). AMSTAR 2 overall confidence rating: lacking discriminating capacity or requirement of high methodological quality? Journal of Clinical Epidemiology, 119, 142-144.
Matthias, K., et al. (2020). The methodological quality of systematic reviews on the treatment of adult major depression needs improvement according to AMSTAR 2: A cross-sectional study. Heliyon, 6(9), e04776.
Pieper, D., et al. (2021). Authors should clearly report how they derived the overall rating when applying AMSTAR 2: a cross-sectional study. Journal of Clinical Epidemiology, 129, 97-103.
Shea, B. J., et al. (2017). AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. British Medical Journal, 358, j4008.
von Elm, E., et al. (2007). The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement: Guidelines for Reporting Observational Studies. PLOS Medicine, 4(10), e296.