Grading the Quality of Information and Synthesis of mHealth Evidence
MPH Capstone Project
Dr. Jaime Lee
Abstract
Background: Although the mHealth evidence base is growing, it comprises literature of varying methodological rigor, owing to the rapid pace of technological change and the multi-disciplinary nature of mHealth. A grading tool to assess the quality of information would help researchers improve the completeness, rigor and transparency of their research reports on mHealth interventions for the purpose of guidance development. Objective: To propose a grading tool for rating the quality of information in mHealth research and for synthesizing the available high-quality information about a particular mHealth intervention. Methods: We performed a comprehensive search in Medline for published checklists used to assess quantitative and qualitative studies and evaluations of complex interventions, and reviewed author or reviewer guidelines of major medical journals, including those specific to mHealth or eHealth. 85 items from 7 checklists were compiled into a comprehensive list, and we recorded the frequency of each item across all checklists. Duplicate and ambiguous items were removed. The grading tool was then subjected to an extensive iterative process of feedback and revision. A preliminary validation study to assess inter-rater reliability and the clarity of item descriptions was conducted: 8 graduate students tested the tool on 2 papers, a peer-reviewed article and a grey literature article. Results: The items most frequently included in the checklists were identified. All items were grouped into two domains: 1) Reporting and methodology and 2) Essential mHealth criteria. Preliminary testing of the mHealth grading tool showed moderate agreement between raters for the scoring of items, with an overall kappa statistic of 0.48 for the grey literature piece and 0.43 for the peer-reviewed article. Conclusions: The mHealth grading tool was developed to improve the quality of information of mHealth studies.
Dr. Jaime Lee MPH Candidate, 2013
Acknowledgments
I would firstly like to thank my advisor Dr. Alain Labrique for his constant support and guidance, and for his kindness in giving me the opportunity to work on this World Health Organization mHealth Technical Advisory Group project. It has been a fantastic experience, and without him this capstone project would not have been possible. To Dr. Smisha Agarwal and Dr. Amnesty Lefevre, it has truly been wonderful to work with such a brilliant team – thank you, my friends. I must also thank the other members of the Johns Hopkins WHO mTAG team who gave invaluable advice and reviewed the multiple drafts: Dr. Larissa Jennings and Michelle Colder Carras. To my friends, Estefania, Hamish, Madelyn, Mariam, Melissa, Sam, Shaymaa and Sneha, who tested the mHealth grading tool, thank you for taking the time out of your busy schedules to help me complete this capstone project. Finally, it is important to acknowledge that the WHO mTAG Quality of Information Task Force has given multiple rounds of feedback that have been critical to the development of the grading tool.
Disclosure Statement
This Capstone is based on work that I am currently doing as a part of the Johns Hopkins WHO mTAG team.
Table of Contents

Abstract
Acknowledgments
Disclosure Statement
1. Background and current mHealth evidence base
   1.1 Current tools for grading quality of information
2. Objectives
3. A new grading tool for mHealth research
   3.1 Grading Quality of Information
      3.1.1 Methodology: Development of grading criteria
   3.2 Using the mHealth grading tool to assess quality of Information
   3.3 Calculation of Quality Score and Quality of Information rating
   3.4 Synthesis of evidence
   3.5 Convene an expert review panel
4. Preliminary validation of the mHealth grading tool and inter-rater reliability
   4.1 Objectives
   4.2 Methodology
   4.3 Results
5. Discussion
6. References
7. Appendices
8. Reflection on the Capstone Project
1. Background and current mHealth evidence base

Mobile health, or mHealth, is defined by the World Health Organization (WHO) as “medical and public health practice supported by mobile devices, such as mobile phones, patient monitoring devices, personal digital assistants (PDAs), and other wireless devices. mHealth involves the use and capitalization on a mobile phone’s core utility of voice and short messaging service (SMS) as well as more complex functionalities and applications including general packet radio service (GPRS), third and fourth generation mobile telecommunications (3G and 4G systems), global positioning system (GPS), and Bluetooth technology” (1).
Mobile technologies offer an effective means of delivering healthcare services to underserved populations. With the overall improvements in telecommunications, there has been increasing enthusiasm for the use of mobile technologies for health from multiple sectors, such as health, computer science, engineering, and telecommunications, to capitalize on the rapid uptake of mobile communication technologies. Whilst mHealth is still a nascent field, there are indications that it has shown promise to revolutionize health systems through (2):
1) Increased access to healthcare and health-related information, particularly for hard-to-reach populations
2) Improved ability to diagnose and track diseases
3) Timely, more actionable public health information
4) Increased access to ongoing medical education and training for health workers
5) Increased efficiency and lower cost of service delivery
There has been a growing body of literature documenting mHealth studies and initiatives. However, a number of literature reviews have noted the lack of rigorous, high quality evidence in the mHealth domain (3-6). The varying levels of rigor found in the current
mHealth evidence base are attributable to two major factors: first, the multi-disciplinary nature of mHealth, which combines the health and technology worlds, and second, the rapid pace of development of technology.
The first factor refers to the fact that the health industry and the technology industry use different methodologies to assess an intervention, and disseminate findings at different speeds and through different channels. In the technology space, prototypes are usually assessed through proof-of-concept or demonstration studies with fast turn-around times for modification. These results are then generally disseminated quickly in the grey literature, including white papers, conference papers, presentations and blogs. In contrast, the health field moves at a slower pace: in general, more emphasis is placed on methodological rigor, and the timeframe for a study may be longer than in the technology industry. The majority of results in health and public health are disseminated through peer-reviewed journals and conference papers, and a smaller proportion in the grey literature.
This leads to the second factor. A study of high methodological rigor may take over a year, even up to a few years, to be completed, published in a peer-reviewed journal, and disseminated. With the rapid pace at which technology changes, however, a newer model or new technology may become available in a significantly shorter timeframe, and the study results may therefore be less relevant to the mHealth field by the time they appear.
Consequently, the current mHealth evidence base varies in the quality of information that is disseminated, in multiple forms ranging from peer-reviewed literature to white papers, theses, reports, presentations and blogs. The World Bank reported that there were more than 500 mHealth
studies in 2011 (7). For the purpose of guidance development, the varying quality of information will offer different levels of value to stakeholders. As such, the present mHealth evidence base is not sufficient to inform governments and industry partners to invest resources in nationally scaled mHealth interventions (3, 6).
The Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach is one method used to develop guidelines, and is being increasingly used by international organizations such as the World Health Organization (WHO) and the Cochrane Collaboration (8). The GRADE system has brought greater transparency and a systematic approach to rating the quality of evidence and grading the strength of recommendations (8). It is a tool that was developed for systematic reviews of evidence on effectiveness. Other tools have also been developed to assess the reporting of systematic reviews and meta-analyses, such as the PRISMA checklist (9), and to assess their methodological quality or reliability, such as the SUPPORT tools (10) and AMSTAR (11).
However, synthesis of the mHealth evidence base, as to what works and what does not work, has yet to be rigorously assessed and established (3). Such information would provide a valuable contribution to guidance development. There have been a number of efforts to review and synthesize the mHealth evidence base using the grading tools previously mentioned (12-15). Free et al. conducted two recent systematic reviews (13, 14). In one systematic review of 42 controlled trials, the authors concluded that mobile technology interventions showed modest benefits for supporting diagnosis and patient management outcomes (14). They also reported in this study that none of the trials were of high quality and the majority of studies were conducted in high-income countries. In the other systematic review of 59 trials that investigated the use of mHealth interventions to improve disease
management, and 26 trials that examined their use to change behavior, Free et al. found that there was mixed evidence regarding the benefits of the interventions (13). Text messaging interventions were shown to increase adherence to anti-retroviral medication in a low-income setting and to increase smoking cessation in high-income settings (13). In other areas, the evidence suggested potential benefits of mHealth interventions.
The use of the results of these systematic reviews for guidance development is limited by the lack of high-quality trials with adequate power. Hence there has been a call for high-quality evidence and for a set of standards that identify the optimal strategies for delivering mHealth interventions and informing their scale-up (16).
1.1 Current tools for grading quality of information

Systematic and transparent approaches to grading the quality of mHealth information are particularly important given the complexity of mHealth interventions and the need for adequate integration with the existing health system of a particular country. One challenge in grading public health interventions is that randomized controlled trials (RCTs) have typically been held up as providing the highest quality of evidence that a particular strategy can yield a specific outcome or result, and non-randomized designs are often perceived as less useful to the synthesis of evidence. This may particularly affect the grading of emerging and complex public health interventions, where RCTs may be infeasible or otherwise inappropriate (17). Evidence reporting and synthesis approaches such as MOOSE (18), TREND (19) and STROBE (20) provide suggestions for improving the quality of reporting in observational studies, but do not provide a framework for grading the strength of evidence when data sources are varied and depend on mixed methods.
The current tools used to evaluate the quality of quantitative or qualitative studies are very specific to the study design. For example, CONSORT is specific to RCTs (21), STROBE is specific to observational studies (20), TREND is for non-randomized studies, particularly for behavioral and public health interventions (19), and COREQ (22) is for qualitative studies. There is no grading tool that assesses the quality of information of mHealth research against specific mHealth criteria that can help develop recommendations.
2. Objectives

The objective of this capstone is to propose a grading tool to rate the quality of information in mHealth, as part of a two-stage process: first, to identify the highest quality information generated by a range of methodologies (from qualitative to quantitative), reported according to the best standards for that methodology, and second, to provide the raw materials for a synthesis of the available, high quality information about a particular mHealth strategy.
3. A new grading tool for mHealth research

The grading tool that we propose has been developed because there is a need to examine the Quality of Information (QoI) in mHealth studies. The majority of publications lack clarity, transparency and rigor in the conduct of the research, and there is a tendency, given the rapid pace of this emerging field, to report formative and even research findings in the non-peer-reviewed literature (4, 5, 23). Hence, it is important to develop a grading tool to help researchers improve the completeness, rigor and transparency of their research reports and to facilitate the more efficient use of research findings by those seeking to select and implement
mHealth interventions, potential funders of evaluation studies, and policymakers (23). Figure 1 shows an overview of mHealth guidance development. The evaluation of mHealth research requires a unique approach, using a combination of quantitative and qualitative evaluation methods that take into account the context and setting of the mHealth intervention. This proposed approach is specific to mHealth research but has been designed to be easily understood and applied by anyone interested in assessing evidence to strengthen health systems.
3.1 Grading Quality of Information

3.1.1 Methodology: Development of grading criteria

The mHealth grading tool development process was designed to produce consensus among a broad constituency of experts and users on both the content and format of guideline items. We first performed a comprehensive search for published checklists used to assess the methodology of quantitative and qualitative studies, the evaluation of complex interventions, guidelines for reporting quantitative and qualitative studies in Medline, and author or reviewer guidelines of major medical journals including those specific for mHealth or eHealth.
We extracted all criteria for assessing quantitative or qualitative studies from each of the included publications. Duplicated items were excluded. 85 items from 7 checklists (19-22, 24-26) were compiled into a comprehensive list for reporting and methodology criteria. We recorded the frequency of each item across all the publications (see Appendix 1). For the mHealth criteria, we generated additional items through literature searches. We subjected consecutive drafts to an extensive iterative process of consultation.
We grouped all items into two domains: 1) Reporting and methodology criteria, and 2) Essential mHealth criteria (Box 1). Within each domain we reviewed all relevant items and simplified the criteria by rephrasing items for clarity and removing duplicates and items with ambiguous definitions.
We drafted a provisional list of items deemed important to include in the checklist. This draft checklist was used to facilitate discussion at a 3-day WHO mHealth Technical Advisory Group meeting of 18 global representatives in December 2012, where it was subjected to intensive analysis, comment and recommendations for change.
Furthermore, five members of the Quality of Information Task Force reviewed the first draft in depth, applied the checklist to various pieces of literature, and then provided additional feedback.
Box 1. Overview of mHealth Grading tool
Domain 1: Reporting and Methodology Criteria
   A. Essential criteria for ALL studies
   B. Essential criteria based on type of study (choose at least 1 of the following):
      i. Quantitative
      ii. Qualitative
      iii. Economic evaluation
Domain 2: Essential mHealth Criteria
After the meeting, we revised the checklist. During this process, the coordinating group (i.e., the authors of the present paper) met on six occasions and held several telephone conferences to revise the checklist. We obtained further feedback on the checklist from more than 10 people.
Figure 1: Overview of proposed mHealth guidance development

1. Articulation of the mHealth intervention for review
2. Systematic access of relevant information in databases
3. Use the mHealth grading tool to rate the quality of information for each study, based on Methodology and Reporting criteria (Domain 1) and mHealth criteria (Domain 2)
4. Synthesis of evidence: summarize the quality of information for every study across both Domains
5. Convene an expert panel to assess the overall quality of information, develop recommendations and identify evidence gaps
6. Consensus statement on the mHealth intervention based on the quality of information, and the direction, consistency and magnitude of evidence
3.2 Using the mHealth grading tool to assess quality of Information

The mHealth grading tool for assessing the QoI is a flexible approach that allows the grading of reporting and methodology for varied study designs (Table 1). As indicated in Box 1, all evidence under consideration must be scored against the “Essential criteria”. After that, the evidence can be classified as qualitative, quantitative or economic evaluation based on the
methodology employed for the study. A detailed description of the steps in grading quality of information is presented in Box 2, and an example of using the mHealth tool to grade an article is shown in Appendix 2.
Box 2: How to use the mHealth grading tool
Step 1: In Domain 1 Part A, apply the criteria to all studies.
Step 2: For Domain 1 Part B, apply 1 or more of the following sets of criteria, as appropriate to the mHealth study: i. Quantitative, ii. Qualitative, iii. Economic evaluation.
Step 3: In Domain 2, apply all essential mHealth criteria to all studies.
Step 4: Record the scores in the Scoring Summary Grid (Table 2).
Step 5: Calculate the Quality Score for Domains 1 and 2 (Quality Score = # points / maximum score* × 100%).
Step 6: Based on the calculated Quality Score, determine the Quality of Information for each domain separately as Weak (<50%), Moderate (50% to 75%) or Strong (>75%).
Step 7: Repeat Steps 1 to 6 for every study identified for a particular mHealth intervention.
* The maximum score for Domain 1 depends on which set/s of criteria were applied in Part B, i.e. if it is a quantitative study, the maximum score for Domain 1 is 38, but if it is a study with quantitative and qualitative methods, then the maximum score for Domain 1 is 41.
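The scoring arithmetic in Steps 4 to 6 can be sketched in code. This is only an illustrative sketch: the constants mirror the subtotals in Table 1, but the function names and the sample study scores are hypothetical, not part of the grading tool itself.

```python
# Illustrative sketch of Steps 4-6 in Box 2 (function names and the
# example study scores are hypothetical, not part of the grading tool).

# Maximum points per set of criteria, taken from the subtotals in Table 1.
MAX_POINTS = {
    "essential": 33,      # Domain 1, Part A (all studies)
    "quantitative": 5,    # Domain 1, Part B(i)
    "qualitative": 3,     # Domain 1, Part B(ii)
    "economic": 14,       # Domain 1, Part B(iii)
    "mhealth": 14,        # Domain 2
}

def quality_score(points_earned, criteria_sets):
    """Step 5: Quality Score = points earned / maximum score x 100%."""
    max_score = sum(MAX_POINTS[s] for s in criteria_sets)
    return 100.0 * points_earned / max_score

def rating(score):
    """Step 6: Weak (<50%), Moderate (50% to 75%) or Strong (>75%)."""
    if score > 75:
        return "Strong"
    if score >= 50:
        return "Moderate"
    return "Weak"

# Hypothetical quantitative study: 30 of 38 Domain 1 points,
# 9 of 14 Domain 2 points; each domain is rated separately.
d1 = quality_score(30, ["essential", "quantitative"])
d2 = quality_score(9, ["mhealth"])
print(f"Domain 1: {d1:.0f}% ({rating(d1)}), Domain 2: {d2:.0f}% ({rating(d2)})")
```

Note that, as in Step 6, the two domains are never pooled into a single score: each receives its own Quality of Information rating.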
Table 1. Grading criteria for assessing Quality of Information from mHealth studies
(Score each item as: 1 – Found/Met, 0 – Not found/Not met)

Domain 1: Reporting and Methodology Criteria

A. Essential criteria for all studies

Introduction
   1. Rationale/scientific background – Scientific background and explanation of rationale
   2. Objectives/hypotheses – Specific objectives or hypotheses
   3. Intervention model and theoretical considerations – Description of development and piloting of the intervention and any theoretical support used to design the intervention (how the intervention is intended to bring about change in the intended outcomes)

Methodology
   4. Study design – Clear description and justification of the chosen study design, especially if the design has been chosen based on a compromise between internal validity and the complexity and constraints of the research setting or research question
   5. Study design – Clearly defined primary and secondary outcome measures to meet study objectives
   6. Study design – Description of data collection methods, including training and level of data collection staff
   7. Participants – Eligibility criteria for participants
   8. Participants – Method of participant recruitment (e.g. referral, self-selection), including the sampling method if a systematic sampling plan was implemented
   9. Participants – Method of participant allocation is clearly described
   10. Sampling – Information presents a clear account of the sampling strategy
   11. Sampling – Justification for sample size is reported
   12. Setting and locations – Settings and locations where the data were collected
   13. Comparator – Describes use of a comparison group from a similar population with regard to socio-demographics, or adjusts for confounding
   14. Data sources/measurement – Describes the source of data for each variable of interest and detailed measurement criteria for the data

Results
   15. Participants – Enrollment: the numbers of participants screened for eligibility, found to be eligible or not eligible, declined to be enrolled, and enrolled in the study
   16. Participants – Assignment: the number of participants assigned to a study condition and the number of participants who received each intervention
   17. Participants – Analysis: the number of participants included in, or excluded from, the main analysis, by study condition
   18. Recruitment – Dates defining the periods of recruitment and follow-up
   19. Baseline data – Baseline demographic and clinical characteristics of participants in each study cohort
   20. Fidelity – Degree to which the intervention was implemented as planned, with a description of adherence, exposure, quality of delivery, participant responsiveness and program differentiation
   21. Context – Description of the organizational, social, economic and political context in which the intervention is developed and operated
   22. Attribution – The link between the intervention and the outcome is reported
   23. Bias – The risk of biases is reported
   24. Bias – The risk of confounding is reported
   25. Ethical considerations – Ethical and distributional issues are discussed

Discussion
   26. Summary of evidence – General interpretation of the results in the context of current evidence and current theory
   27. Limitations – Discussion of study limitations, addressing sources of potential bias, imprecision, and, if relevant, multiplicity of analyses
   28. Generalizability – Generalizability (external validity) of the study findings, taking into account the study population, the characteristics of the intervention, length of follow-up, incentives, compliance rates, specific settings involved in the study, and other contextual issues
   29. Conclusions/interpretation – Interpretation of the results, taking into account study hypotheses, sources of potential bias, imprecision of measures, and other limitations or weaknesses of the study
   30. Conclusions/interpretation – Discussion of the success of, and barriers to, scaling up the intervention
   31. Conclusions/interpretation – Discussion of research, programmatic or policy implications

Other
   32. Funding – Sources of funding and role of funders
   33. Competing interests – Relation of the study team to the intervention being evaluated, i.e. developers/sponsors of the intervention

Subtotal of Quality Points for Essential criteria for all studies (out of 33):
B. Essential criteria based on type of study – choose at least 1 of the following sets of criteria to apply, as appropriate: i. Quantitative, ii. Qualitative, iii. Economic evaluation

i. Quantitative
   34. Statistical methods – Statistical methods used to compare groups for primary and secondary outcomes
   35. Statistical methods – Methods for additional analyses, such as subgroup analyses and adjusted analyses
   36. Statistical methods – Methods of imputing or dealing with missing data
   37. Outcomes and estimation – For each primary and secondary outcome, study findings are presented for each study cohort, with the estimated effect size and confidence interval to indicate the precision
   38. Outcomes and estimation – Estimates of random data variability and outliers are clearly stated

Subtotal for Quantitative study design (out of 5):

ii. Qualitative
   39. Analytical methods – Analytical methods clearly described (in-depth description of the analysis process, how categories/themes were derived)
   40. Use of verification methods to demonstrate credibility – Discusses use of triangulation, member checking (respondent validation), search for negative cases, or other procedures
   41. Reflexivity of account provided – Relationship of researcher/study participant has been discussed, examining the researcher’s role, bias, or potential influence

Subtotal for Qualitative study design (out of 3):

iii. Economic evaluation
   42. Competing alternatives clearly described (e.g. cost-effectiveness of zinc and ORS for treatment of diarrhea versus standard treatment with ORS alone)
   43. The chosen analytic time horizon is reported
   44. The perspective/viewpoints (e.g. societal, program, provider, user, etc.) of the analysis is clearly described
   45. The alternatives being compared are clearly described
   46. The sources of effectiveness estimates are clearly stated
   47. Details of the design and results of the effectiveness study and/or methods for effect estimation are clearly stated
   48. Methods for estimation of quantities and unit costs are described
   49. Details of currency of price adjustments for inflation or currency conversion are given
   50. Currency and price data are recorded
   51. The choice of model used and the key parameters on which it is based are reported
   52. The discount rate(s) are reported
   53. Sensitivity analyses are reported
   54. Incremental analyses are reported
   55. Major outcomes are presented in disaggregated as well as aggregated form

Subtotal for Economic Evaluation (out of 14):
Domain 2: Essential mHealth Criteria for all studies

   56. Infrastructure – Clearly presents the availability or kind of infrastructure to support technology operations (e.g. electricity, access to power, connectivity)
   57. Technology architecture – Describes the technology architecture, including the software and hardware
   58. Intervention – mHealth intervention is clearly described, with frequency and mode of delivery of the intervention (i.e. SMS, face-to-face, interactive voice response) for replication
   59. Intervention – Details of the content of the intervention are clearly described, or a link is presented and the content is publicly available
   60. Usability – Clearly describes the ability of different user groups to successfully use the technology in a given context, e.g. literacy, computer/Internet literacy, ability to use the device
   61. User feedback – Describes user feedback about the intervention
   62. Identifies constraints – mHealth solution states one or more constraints in the delivery of the current service, intervention, process or product
   63. Access and affordability – Presents data on the access and affordability of the mHealth solution from varying user perspectives
   64. Cost assessment – Presents a basic cost assessment of the mHealth intervention from varying perspectives
   65. Training inputs – Clearly describes the training inputs for the adoption of the mHealth solution
   66. Strengths and limitations – Clearly presents mHealth solution considerations, both strengths and limitations, for delivery at scale
   67. Language adaptability – Describes the adaptation, or not, of the solution to the local language
   68. Replicability – Clearly presents the source code/screenshots/flowcharts of the algorithms/examples of messages to ensure replicability
   69. Data security – Describes the data security procedures/confidentiality protocols

Subtotal for mHealth criteria (out of 14):
3.3 Calculation of Quality Score and Quality of Information rating

After using the grading tool to assess an mHealth study, record the scores in the Scoring Summary Grid (Table 2) to calculate the Quality Scores for Domains 1 and 2.
The quality of information is defined under two domains:
1) Domain 1: Reporting and Methodology – indicative of the quality of methodological rigor employed by the studies under consideration, as well as the reporting standards that have been adhered to.
2) Domain 2: Essential mHealth criteria – classifies the studies under consideration based on the quality of information presented about the mHealth intervention.
The Quality Score for each domain is calculated using the formula:

Quality Score = (number of quality points earned / maximum score for the domain) × 100%
For Domain 1, the maximum score will depend on which set/s of criteria were applied in Part B. That is, if it is a quantitative study, the maximum score for Domain 1 is 38; but if it is a study with quantitative and qualitative methods, then the maximum score for Domain 1 is 41, and the quality score is calculated accordingly. Domain 2 is more straightforward, as the maximum score is set at 14 quality points.
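As a worked illustration of the Domain 1 maximum-score rule above (the study earning 28 points is a hypothetical example, not a graded study from the paper):

```python
# Illustration of the Domain 1 maximum-score rule; the "28 points"
# study below is hypothetical.
ESSENTIAL_MAX = 33                                   # Part A, all studies
PART_B_MAX = {"quantitative": 5, "qualitative": 3, "economic": 14}

def domain1_max(study_types):
    """Maximum Domain 1 score = Part A maximum + each Part B set applied."""
    return ESSENTIAL_MAX + sum(PART_B_MAX[t] for t in study_types)

print(domain1_max(["quantitative"]))                 # 33 + 5 = 38
print(domain1_max(["quantitative", "qualitative"]))  # 33 + 5 + 3 = 41

# A mixed-methods study earning 28 of 41 quality points:
score = 28 / domain1_max(["quantitative", "qualitative"]) * 100
print(f"Quality Score: {score:.0f}%")                # 68%
```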
Then, based on the Quality Score, you can determine the Quality of Information rating for each domain as Strong (>75%), Moderate (50% to 75%) or Weak (<50%).