Abstrackr: Open-source, Web-based Software for (Semi-automated?!) Abstract-screening
AHRQ's 2012 Annual Conference Slide Presentation
Slide 1
abstrackr
open-source, web-based software for (semi-automated?!) abstract-screening
This project is currently supported by funding from AHRQ, grant number R01HS018494.
Slide 2
Abstract screening
an unpleasant necessity
Image: A photograph shows a reviewer sitting down to screen a large pile of abstracts stacked on the table in front of him. He is visibly unhappy with having to undertake this task.
Slide 3
abstrackr makes life easier!
- Upload citations:
- PMIDs, RefMan XML, RIS, tab-delimited files (a hypothetical RIS-parsing sketch follows this list).
- Invite collaborators.
- Screen:
- abstrackr handles allocation and prioritizes screening citations likely to be relevant.
- Export screening decisions.
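RIS, mentioned in the upload list above, is a plain-text citation format in which each record is a sequence of two-letter tagged lines (e.g., TI for the title, AB for the abstract) terminated by an 'ER' line. The slides do not show abstrackr's import code, so the following is only a minimal, hypothetical sketch of how such a file could be split into records before upload; the parse_ris name and the way repeated tags are merged are illustrative choices, not part of abstrackr.

```python
def parse_ris(path):
    """Minimal, illustrative RIS reader: yields one dict per citation record.

    This is NOT abstrackr's importer; it is a sketch that assumes the standard
    'TAG  - value' line layout with an 'ER' tag marking the end of each record.
    """
    record = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if len(line) >= 5 and line[2:5] == "  -":
                tag, value = line[:2], line[6:].strip()
                if tag == "ER":                      # end of record
                    if record:
                        yield record
                    record = {}
                else:
                    # concatenate repeated tags (e.g., multiple AU author lines)
                    record[tag] = (record.get(tag, "") + " " + value).strip()
    if record:
        yield record

# Example: collect titles and abstracts for screening
# citations = [(r.get("TI", ""), r.get("AB", "")) for r in parse_ris("search_results.ris")]
```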
Slide 4
But wait, there's more!
- Single- or double-screening:
- All labels are exported; you can see who screened which abstracts.
- Resolve conflicts (if screeners disagree; a hypothetical sketch of finding disagreements in exported labels follows this list).
- See how many unscreened abstracts are likely to be relevant.
- Add notes and/or tags to studies.
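The slides do not specify the export format for screening decisions, so purely as a hypothetical illustration of double-screening conflict resolution (the CSV layout and column names below are invented), disagreements could be located like this:

```python
import csv
from collections import defaultdict

def find_conflicts(export_path):
    """Group exported labels by citation and report disagreements.

    Assumes a hypothetical CSV export with 'citation_id', 'screener', and
    'decision' columns; abstrackr's real export layout may differ.
    """
    labels = defaultdict(dict)              # citation_id -> {screener: decision}
    with open(export_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            labels[row["citation_id"]][row["screener"]] = row["decision"]

    # A conflict is any citation with more than one distinct decision.
    return {cid: who for cid, who in labels.items() if len(set(who.values())) > 1}

# Example output entry: '12345': {'reviewer_a': 'include', 'reviewer_b': 'exclude'}
```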
Slide 5
Image: A screenshot shows the abstrackr Web application. An abstract (which happens to be about vampires) is centered in the browser window. The 'abstract panel' comprises the text of the abstract and some meta-data such as keywords and authors associated with the current citation. There are controls surrounding the abstract that allow the user to annotate it. Specifically, to the left of the panel displaying the abstract there is a 'tags' box, which includes one tag that has already been entered (it reads 'vampires'). There are two buttons beneath this tag: a 'tag study' button and an 'edit tags' button. Beneath the abstract panel, there are three buttons that allow the user to mark an abstract as an accept, a reject, or a 'borderline' (maybe) case. Beneath these buttons there is a text box for the user to enter relevant 'terms' that suggest a document should be included (or excluded). These terms will be highlighted in any abstracts in which they occur; terms indicative of inclusion will be colored green, while those designated as indicative of exclusion will be red. Finally, there are two buttons above the abstract that allow the user to review the labels she has provided so far or to review the terms that she has provided.
Slide 6
Image: A screenshot shows the abstrackr Web application as it appears in Slide 5; the screen is now annotated with arrows to point out the functionality features described above: tags, words of interest, and how to make screening decisions.
Slide 7
Image: A screenshot shows the user interface for 'note-taking'. A "pop-up" note is placed in front of the abstract panel; this comprises several text boxes, including one labeled 'general notes' and one for each of the PICO (Population, Intervention, Comparator, Outcome) elements. Users may take unstructured notes (general) or structured notes (for the PICO elements).
Slide 8
A few words on semi-automating screening via machine learning
Slide 9
Image: An illustration of the workflow in the abstrackr program is shown. The figure includes a cartoonish image of a contemplative person (the reviewer) deciding whether an abstract is relevant. This reviewer began screening by issuing an initial query to a database, as indicated by an arrow pointing from the user to a database icon representing PubMed®. An arrow then points from the database to a pile of documents, which in turn points to a block representing the abstrackr program: this indicates that the documents have been exported from PubMed into abstrackr. An iterative process between the reviewer and abstrackr is then depicted; the reviewer makes screening decisions which are fed into the system, and the system then selects the next citation he should label.
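The iterative loop shown here, together with the claim on Slide 3 that abstrackr "prioritizes screening citations likely to be relevant", suggests a simple selection step: re-score the unscreened citations with the current model and surface the one predicted most likely to be included. The slide does not spell out the actual rule, so the sketch below is only an assumption; the pick_next name and the probability dictionary are hypothetical, and how such probabilities might be produced is sketched under Slide 10.

```python
def pick_next(pred_include_prob, screened_ids):
    """Illustrative 'which abstract should the reviewer see next?' step.

    pred_include_prob: dict mapping citation id -> model-predicted probability
        of inclusion (see the supervised-learning sketch under Slide 10).
    screened_ids: set of citation ids the reviewer has already labeled.
    Returns the unscreened citation the model currently ranks as most relevant;
    abstrackr's real prioritization strategy is not specified on the slide.
    """
    candidates = {cid: p for cid, p in pred_include_prob.items()
                  if cid not in screened_ids}
    return max(candidates, key=candidates.get)
```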
Slide 10
Semi-Automating Abstract Screening via Supervised Machine Learning
Image: The supervised machine learning paradigm is depicted. The process begins with an unlabeled set of data (in our case, biomedical abstracts); the (human) expert labels a sample of this data by screening, and whatever has been screened so far is used to induce a (predictive) classification model.
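At a high level, the paradigm is: label a sample, induce a model, score the rest. A minimal sketch of that cycle, assuming a TF-IDF bag-of-words representation and a logistic regression classifier from scikit-learn (the slide does not say which features or learner abstrackr actually uses), might look like the following.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def induce_and_score(abstracts, labels):
    """Fit a classifier on the screened abstracts and score the rest.

    abstracts: list of abstract texts, one per citation in the review
    labels: dict {index: 1 for include, 0 for exclude} covering the citations
        screened so far (it must contain at least one include and one exclude).
    Returns (unscreened_indices, predicted_inclusion_probabilities).
    A generic supervised-learning sketch, not abstrackr's internals.
    """
    X = TfidfVectorizer(stop_words="english").fit_transform(abstracts)

    screened = sorted(labels)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X[screened], [labels[i] for i in screened])

    unscreened = [i for i in range(len(abstracts)) if i not in labels]
    return unscreened, clf.predict_proba(X[unscreened])[:, 1]
```

The probabilities returned here are the kind of quantity the prioritization loop on Slide 9 and the histogram on Slide 12 would consume.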
Slide 11
Results (Updating Reviews)
Image: A graph plots results achieved using machine learning to semi-automate screening for updating an existing systematic review. The plot is in ROC space, so the x-axis is specificity and the y-axis is sensitivity. There are four points on the plot, representing the performance of the classification system on four datasets (AlzGene, PDGene, SZGene, and CEARegistry). The points representing the first three of these are at 100% sensitivity (very top) and ~90% specificity (very near the left); the other point (representing CEARegistry) is at 99% sensitivity and ~75% specificity.
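For reference, the two axes of that ROC-space plot have the standard definitions, where TP, FN, TN, and FP count the screening decisions that are true positives, false negatives, true negatives, and false positives, respectively:

```latex
\text{sensitivity (recall)} = \frac{TP}{TP + FN},
\qquad
\text{specificity} = \frac{TN}{TN + FP}
```

Read this way, 100% sensitivity with roughly 90% specificity means no relevant citation was missed, while about 90% of the irrelevant citations could be excluded without manual screening.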
Slide 12
Image: A screenshot in abstrackr demonstrates the integration of the machine learning component with the system as a whole. Specifically, the screenshot shows a page, available for each review, that plots a histogram of the predicted likelihoods of the remaining abstracts being relevant, according to the machine learning model. The y-axis is the (predicted) probability of inclusion and the x-axis comprises the 'bins'. The density is centered around 0.2-0.3, indicating that most of the remaining studies in this review are probably going to be excluded.
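This page is part of abstrackr itself; purely as an illustration of the underlying idea, a histogram like it could be drawn from predicted inclusion probabilities such as those returned by the hypothetical induce_and_score sketch under Slide 10 (the axes below follow the usual histogram convention and do not reproduce the screenshot's exact layout).

```python
import matplotlib.pyplot as plt

def plot_prediction_histogram(p_include, bins=10):
    """Histogram of predicted relevance for the not-yet-screened citations.

    p_include: iterable of predicted probabilities of inclusion, e.g. the
    second value returned by the hypothetical induce_and_score() sketch above.
    """
    plt.hist(p_include, bins=bins, range=(0.0, 1.0), edgecolor="black")
    plt.xlabel("predicted probability of inclusion")
    plt.ylabel("number of unscreened abstracts")
    plt.title("Predicted relevance of remaining citations")
    plt.show()
```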
Slide 13
And now a brief demo
- http://abstrackr.tuftscaes.org¹
- byron_wallace@brown.edu.
¹ The URL will change to Brown soon, but we'll redirect.
