CORONA Project

CORONA (COvid19 Registry of Off-label & New Agents) Project

Director/Lead Investigator: David Fajgenbaum, MD, MBA, MSc

About the Project

The CDCN launched the CORONA Project in March 2020 to identify and track all treatments reported to be used for COVID-19 in an open-source data repository. The CORONA team has reviewed 29,353 papers and extracted 2,399 papers on 590 treatments administered to 437,936 patients with COVID-19, and it continues to help researchers prioritize treatments for clinical trials and inform patient care.

All data are made available via our Tableau-based CORONA Viewer, which has recorded over 23,000 views to date and counts staff at Google Health, HHS, FDA, and NIH among its regular users. You can visit the CORONA database viewer, built and managed by Matt Chadsey, owner of Nonlinear Ventures, below.

We have also written some high-level articles about the project and its findings on our CORONA INSIGHTS blog.

 

Goals

The overarching goal of the CORONA Project is to advance effective treatments for COVID-19 by highlighting the most promising treatments to pursue, informing optimal clinical trial design (sample size, target subpopulations), and determining whether a drug should move forward to widespread clinical use. While there have been several notable failures in drug repurposing for COVID-19, a handful of drugs, including dexamethasone, heparin, and baricitinib, have likely helped save thousands of lives, and more treatments are urgently needed for newly diagnosed and soon-to-be-diagnosed COVID-19 patients while vaccinations are underway and new SARS-CoV-2 variants emerge. We’ve identified ~10 additional promising medications that, in collaboration with researchers at the FDA, NIH, and elsewhere, we’re hoping to move into randomized controlled trials.

To further pursue this goal, by the end of 2021 we will be expanding our work to also integrate pre-clinical and randomized controlled trial data. We believe that this expanded focus will allow us to more quickly advance promising treatments, supported by broader research findings, into clinical care.

Key Milestones

FAQ

What data sources are used?

  • Data regarding clinical trial registrations come from international trial registration websites, such as clinicaltrials.gov. Information from these websites is aggregated by COVID-NMA, and the aggregated data are exported for use in the CORONA Project.
    • COVID-NMA: Thu Van Nguyen, Gabriel Ferrand, Sarah Cohen-Boulakia, Ruben Martinez, Philipp Kapp, Emmanuel Coquery, … for the COVID-NMA consortium. (2020). RCT studies on preventive measures and treatments for COVID-19 [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4266528 
  • Information about treatment guidelines, such as those issued by the National Institutes of Health, is found on the websites of the institutions providing the guidelines.
  • Data from published papers come directly from the publications, as extracted by the CORONA Project’s data coordinators. Data are extracted as provided by the studies’ authors; however, errors may occur. Please report any errors or concerns to tsikora@pennmedicine.upenn.edu

What is the Research Prioritization grade?

The Research Prioritization (RP) grade is based on evidence from published randomized controlled trials. An evidence synthesis is performed for each drug or combination of drugs. If a drug is used in both the inpatient and outpatient settings, separate evidence syntheses are performed for each clinical setting. Each drug-setting combination receives an overall RP grade consisting of 1) a treatment effect assessment (likely beneficial, benefit unknown, not likely beneficial) and 2) a certainty of evidence assessment (high certainty, moderate/low certainty). These grades are presented as letter grades for simplified comparison. The purpose of the RP grade is to identify promising drugs that require further evaluation.

What does the Research Prioritization grade indicate?

The purpose of the Research Prioritization grade is to identify promising drugs that require further evaluation. “A” treatments are likely beneficial with high certainty. They are generally well-studied and have a known likelihood of benefit. These treatments rarely need further research. “B” treatments are likely beneficial with moderate to low certainty. They have indicators of benefit, but this determination may be based on insufficient data. These treatments are well-positioned for further research to better characterize efficacy and should generally be prioritized for additional clinical trials. “B/C” treatments may have a trend towards benefit, but the data synthesis indicates that the benefit is not yet clear. “C” treatments are not likely beneficial with moderate to low certainty. They do not appear beneficial based on the available data, but there may be insufficient data to make a robust determination. These treatments are generally not prioritized for further research. “D” treatments are not likely beneficial with high certainty, based on a substantial amount of previous research. In general, “D” treatments do not require further research.

Letter Grade | Definition
A | Likely beneficial with high certainty
B | Likely beneficial with moderate to low certainty
B/C | Grade uncertain
C | Not likely beneficial with moderate to low certainty
D | Not likely beneficial with high certainty
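For readers who prefer pseudocode, the minimal Python sketch below restates the definitions above as a simple lookup from the two assessments to a letter grade. The function name and string labels are our own illustrative assumptions; the CORONA team’s actual grading pipeline is not published here and may differ.

```python
# Illustrative sketch only: maps the published RP definitions onto letter grades.
# Labels and function name are assumptions, not the project's internal code.

def rp_letter_grade(effect: str, certainty: str) -> str:
    """effect: 'likely beneficial', 'benefit unknown', or 'not likely beneficial'
    certainty: 'high' or 'moderate/low'"""
    if effect == "benefit unknown":
        return "B/C"  # trend toward benefit not yet clear
    if effect == "likely beneficial":
        return "A" if certainty == "high" else "B"
    if effect == "not likely beneficial":
        return "D" if certainty == "high" else "C"
    raise ValueError(f"unrecognized assessment: {effect!r}, {certainty!r}")


if __name__ == "__main__":
    print(rp_letter_grade("likely beneficial", "moderate/low"))  # -> "B"
```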

What is the Treatment Efficacy grade?

The Treatment Efficacy (TE) grade is based on evidence from high-quality randomized controlled trials. By limiting data to well-run, peer-reviewed studies, this grade is intended to provide an assessment of the available evidence for use in a clinical setting. In contrast, the Research Prioritization grade includes pre-prints, small studies, etc. that may provide valuable signals about a treatment’s efficacy but are unlikely to be robust enough to guide treatment recommendations with high certainty. In addition to this algorithmic grade, information on international treatment guidelines, such as those from the NIH and IDSA, is provided alongside for further context. Note that some treatments are recommended for particular subgroups, such as those on mechanical ventilation. If treatment guidelines recommend that a drug is best used in a particular setting, this will be noted in the CORONA viewer along with the treatment efficacy letter grade. The purpose of the TE grade is to serve as an indicator of whether sufficient data exist in support of using a treatment in clinical practice at this time for COVID-19.

How are the Research Prioritization and Treatment Efficacy grades formulated?

The grades are both based on two components, a likelihood of benefit assessment and a certainty of evidence assessment, as described below. 

Likelihood of benefit assessment 

The treatment effect portion of the overall grade is currently calculated using two methods:

  • Fisher’s method p-value: combining the p-values of the primary endpoints of each study for a given treatment (as described in https://training.cochrane.org/handbook/current/chapter-10). We analyze p-values across all studies using Fisher’s method, where P_i is the one-sided p-value for study i. The test statistic, X² = -2 Σ ln(P_i), follows a chi-squared distribution with 2k degrees of freedom (k = number of studies), which can be used to calculate a p-value for the null hypothesis that there is no effect in any of the studies. A p-value of <0.05 indicates that there is evidence of an effect in at least one study (see the sketch after this list). Note that if a study provides a one-sided p-value, it is assumed that the hypothesis is in favor of the treatment under investigation.
  • Proportion of studies with beneficial effect direction: summarizing the effect direction of the primary endpoints of each study. For example, if mortality is lower in the treatment group, the treatment is noted as having a “beneficial” effect for that endpoint. (“Beneficial” is used only to indicate the direction of effect, not its magnitude or significance.)
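As a concrete illustration of these two summaries, here is a minimal Python sketch using SciPy’s combine_pvalues for Fisher’s method. The p-values and effect directions are invented for demonstration only and are not drawn from the CORONA data.

```python
# Minimal sketch of the two treatment-effect summaries described above,
# assuming one one-sided primary-endpoint p-value and one effect direction
# per study. Input values are illustrative, not real CORONA data.
from scipy.stats import combine_pvalues

# One-sided p-values for the primary endpoint of each study of a hypothetical drug
p_values = [0.04, 0.30, 0.12, 0.51]
# Effect direction of each study's primary endpoint (True = favored treatment)
beneficial = [True, True, False, True]

# Fisher's method: X^2 = -2 * sum(ln(P_i)) ~ chi-squared with 2k degrees of freedom
statistic, combined_p = combine_pvalues(p_values, method="fisher")
print(f"Fisher's combined p-value: {combined_p:.4f}")
print(f"Evidence of an effect in at least one study: {combined_p < 0.05}")

# Proportion of studies whose primary endpoint moved in a beneficial direction
proportion_beneficial = sum(beneficial) / len(beneficial)
print(f"Proportion beneficial: {proportion_beneficial:.2f}")
```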

We are considering including other methods of synthesizing treatment effect, such as a true meta-analysis or an analysis focused on summarizing a select number of effect estimates. However, both methods are best suited to comparing identical endpoints (e.g., mortality) across multiple studies. We are extracting multiple data points from all publications, so it will be possible to explore these methods in the medium term.

Certainty of evidence assessment 

The certainty of evidence for a given treatment is currently determined by looking at three metrics:

  • Precision: Sample size across all peer-reviewed trials listed in PubMed included in the evidence synthesis (≥500 total patients treated is required for a high certainty of evidence). When calculating total sample size, we consider only articles that were peer-reviewed (to minimize the impact of erroneous results from pre-prints) and listed in PubMed (to minimize the impact of results from unreputable journals). However, pre-prints and articles not listed in PubMed are still used for the drug effect estimate analysis.
  • Directness of evidence: Number of trials included in the evidence synthesis compared to placebo or standard of care. ≥50% of trials included in the evidence synthesis must be compared to placebo or standard of care for a high certainty of evidence. 
  • Publication quality: Number of trials included in the evidence synthesis published in a journal that is indexed in PubMed. ≥75% of trials included in the evidence synthesis must be published in a journal that is indexed in PubMed for a high certainty of evidence. 

The certainty of evidence for a given treatment is determined by incorporating each of the factors above. A “high” certainty of evidence is assigned to treatments that meet the thresholds for all of the factors above, a “moderate” certainty of evidence is reserved for treatments that meet some but not all of the thresholds, and a “low” certainty of evidence is assigned to treatments that meet none, as sketched below.
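Below is a minimal Python sketch of this assessment using the three thresholds stated above (≥500 patients, ≥50% comparative trials, ≥75% PubMed-indexed trials). The function name, inputs, and example values are illustrative assumptions rather than the project’s actual implementation.

```python
# Illustrative sketch of the certainty-of-evidence assessment using the
# thresholds stated above. Names and example numbers are assumptions.

def certainty_of_evidence(total_patients: int,
                          n_trials: int,
                          n_comparative: int,
                          n_pubmed_indexed: int) -> str:
    met = [
        total_patients >= 500,                # precision
        n_comparative / n_trials >= 0.50,     # directness of evidence
        n_pubmed_indexed / n_trials >= 0.75,  # publication quality
    ]
    if all(met):
        return "high"
    if any(met):
        return "moderate"
    return "low"


if __name__ == "__main__":
    # Hypothetical drug: 620 patients across 4 trials, 3 vs. placebo/SOC, 3 PubMed-indexed
    print(certainty_of_evidence(620, 4, 3, 3))  # -> "high"
```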

What does a trial ID starting with XXX indicate?

Trials that have a trial ID beginning with XXX were not known to have a trial registration number. This may be because the published paper did not indicate a trial registration number and data coordinators could not link it to a known trial. It may also occur if a trial was run without being registered. Errors may occur during the data extraction process. Please report any errors or concerns to tsikora@pennmedicine.upenn.edu

How can the data be used?

Data may be used freely and no permission is required. However, we do request attribution when using the manually extracted data (including study endpoint values, p-values, effect estimates, etc.).

Who should I contact with questions or concerns?

The data in this project are extracted manually from published papers by data coordinators. Errors may occur. Please report any errors or concerns to tsikora@pennmedicine.upenn.edu

We’re hiring:

  • Managing Director – more information here.
  • Bioinformatician/Computational Biologist – more information here.
  • Research Coordinator – more information here.

Acknowledgments

We are grateful to our current and past team members who have taken on this urgent need:

Dr. David Fajgenbaum, CSTL Director

Johnson Khor, Former CSTL Coordinator and COVID19 Project Lead

Sheila Pierson, CSTL Clinical Research Director

Dr. Ruth-Anne Langan, Former Penn Immunology PhD Student and COVID19 Data Analyst

Alek Gorzewski, ACCELERATE Data Analyst and COVID19 Data Analyst

Avery Tamakloe, ACCELERATE Data Analyst and COVID19 Data Analyst

Victoria Powers, ACCELERATE Data Analyst and COVID19 Data Analyst

Rozena Rasheed, Former BioBank Coordinator

Mileva Repasky, CDCN Chief Patient and Development Officer and COVID19 Data Analyst

Joseph Kakkis, UPenn BA Candidate and COVID19 Data Analyst

Alexis Phillips, Former CDCN Biomedical Leadership Fellow, UPenn Med student

Ania Korsunska, CDCN Biomedical Leadership Fellow, Syracuse University PhD Candidate

Our phase I team of data analysts:

Alex Beschloss, Madison McCarthy, Anne Taylor, Erin Napier, Lia Keyser, Dr. Duncan Mackay, Anna Wing, Ashwin Amurthur, Beatrice Go, Casey Kim, James Germi, Joanna Jiang, Laura Miyarnes, Michael Mayer, Philip Angelides, Sarah Frankl, Vivek Nimgaonkar, Steve Bambury, Bruna Martins, Megan Fisher, Karen Gunderson, Cornelia Keyser, and Nick Goodyear.

And our Phase II team members who have continued this important work:

Abhishek Goel, Aura Enache, Ariana Weiss, Akash Hemanth, Alana Rush, Alex Beschloss, Allie Liebmann, Ambika Menon, Anna Sloan, Biliana Veleva Rotse, Anna Butkiewicz, Anne Taylor, Annika Balakrishnan, Abhi Rao, Ariel Gordon, Ayelet Rubenstein, Bruna Martins, Brent Flanders, David Bright, Casey Alsumairi, Cheng Cheng, Christopher Gaeta, Christopher Reynolds, Claire Exaus, Josh Deffenbaugh, Dave Shanmugasundaram, Debra Kuykendall, Denise Wynett, Derek Ansel, Dorothy Kenny, Ekaanth Sravan, Elena Lazarus, Evaggelia Nassis, Florence Porterfield, Francisco Delgado, Gary Gravina, Hannah Miller, Dr. Noel Rodriguez, Hasan Saifee, Kate Chang, Helen Eisenach, Heather Farley, Henry Reinstein, Hadis Williams, Jayson Wisk, Jenna Pacheco, Jiwoo Kim, Josh Lefkowitz, Josh Qian, Karen Gunderson, Karin Kent, Kevin Freiert, Karen Garzon, KiBoem Kwon, Kim Firn, Kristen Stegeland, Kyra Taub, Lindsey Muratore, Lisa Stewart, Lou Scarmoutzos, Maggie Reilly, Shan Mukujina, Monica Mayorga, Meggie Goodridge, Meryt Hanna, Mi Zhang, Miriam Gracia, Mithuun Kanapathipillai, Matt Pepper, Margaret Sorenson, Morgan Wiese, My Ngyuen, Nancy Bates, Paul Niklewski, Nicole Hinson, Omar El Sayed, Philip Rybcznski, Pearl Joslyn, Prithvi Parthasarathy, Rebecca Morse, Raquel Gindi, Sarah Bates, Debra D, Edward Shadiack, Shenqin Yao, Katelyn Stebbins, Suhina Kanapathipillai, Sarah VanFleet, Sarah Weinshel, Tarra Shingler, Timothy Adesanya, Ty Demint, Trisha Parayil, and Theresa Ruscitti.

Our Partners:

Doctor Evidence’s (DRE) platform enables researchers to identify and discover evidence for analysis to generate actionable data insights far beyond human capabilities. The recent coronavirus pandemic has shown the world the ever-increasing need to process voluminous amounts of data in real time. DRE is happy to support Dr. Fajgenbaum and his team’s vision for finding a COVID-19 cure through a systematic literature review. To learn more about this collaboration, check out this press release in Business Wire.

 

Nonlinear Ventures helps communities and organizations leverage best practices in systems thinking, portfolio management, economic valuation, and data visualization to make better investments in communities and the environment. Matt Chadsey, the founder of Nonlinear Ventures, is excited to support the work of CDCN and help make its innovative research accessible worldwide!

 

Manifold is a digital health discovery and development partner for leading life science companies, academic institutions, and non-profit organizations. Their partnerships are grounded in multi-disciplinary collaboration between biomedical scientists and computer scientists & engineers to realize patient impact. Manifold appreciates the opportunity to collaborate with Dr. Fajgenbaum and his team to identify medication candidates which could be repurposed to treat COVID-19 and to prioritize them for clinical trials.

 

Please email tracey.sikora@pennmedicine.upenn.edu if you are interested in contributing to or collaborating on this project.
