15/09/2020

COVID-19 Clinical Data Explosion: The Need for Intelligence in Clinical Trials


The volume and speed of data generation during the COVID-19 pandemic are unprecedented, and this mountain of data is growing daily. The massive COVID-19 Open Research Dataset (CORD-19) has been assembled by a coalition that includes Microsoft, the Allen Institute for Artificial Intelligence, and Georgetown University's Center for Security and Emerging Technology. The dataset houses over 12,000 full-text articles in “machine-readable form” that can be ingested programmatically and analyzed using machine learning techniques such as natural language processing (NLP). In addition, CovidSurg is a global registry, with over 100 countries registered, that tracks outcomes in surgical patients infected with COVID-19. This registry represents a unique opportunity to evaluate variation in patient characteristics, perioperative management, and surgical outcomes. On top of this, more than 2,000 clinical trials are currently under way, generating valuable data that is not yet shared across the industry.
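To make "ingested programmatically" concrete, here is a minimal sketch in Python. It assumes each full-text article is a JSON record with a `body_text` list of paragraphs, each carrying a `text` field; the sample record, the stopword list, and the function names are hypothetical and for illustration only. Simple term counting stands in for a real NLP pipeline.

```python
import json
import re
from collections import Counter

# Hypothetical record in the assumed full-text JSON layout (not a real article).
SAMPLE = json.loads("""
{
  "paper_id": "example-0001",
  "metadata": {"title": "Illustrative record"},
  "body_text": [
    {"section": "Methods", "text": "Patients received perioperative screening."},
    {"section": "Results", "text": "Screening identified infected patients before surgery."}
  ]
}
""")

# Tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "to", "before"}


def full_text(record):
    """Join the paragraph texts of one article into a single string."""
    return " ".join(p["text"] for p in record.get("body_text", []))


def term_counts(record):
    """Count lowercase alphabetic tokens, ignoring stopwords."""
    tokens = re.findall(r"[a-z]+", full_text(record).lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


counts = term_counts(SAMPLE)
print(counts.most_common(3))
```

Running the same function over every article file in a dataset gives a crude first view of which terms dominate the literature, which more capable NLP models can then refine.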

Each of these datasets represents a rich repository for the application of data mining, machine learning (ML), and artificial intelligence (AI) algorithms. Data scientists and experts in AI and ML are the gatekeepers to making this happen. Tools like CTi will play a key role in the collaboration and analytics efforts that support this paradigm shift. Over the last decade there has been tremendous progress in these areas thanks to greater computing power and key advances in ML such as deep learning (DL), a framework for working with complex, unstructured data, including language and text. DL has pushed the state of the art in NLP, allowing computers to provide insight into complex language patterns, extract topics, understand context, and identify relationships of interest.

While data scientists can help apply technology toward organizing data more efficiently, it is paramount that this information is managed and organized by emerging tools such as ClinTex CTi. Even preeminent organizations that are presumably playing by the book, including pharmaceutical companies, government organizations, and hospitals, are constrained by self-interest. Self-interest has some advantages, pushing Big Pharma in a race to develop new vaccines and therapeutic interventions quickly. In parallel, one hopes that data gathered from many sources is being distributed appropriately and that organizations are acting together to share ideas, technology, and study designs. ClinTex CTi will support this collaboration while protecting sensitive intellectual property.

The reality is that there is insufficient data sharing on COVID-19. As “facts” continue to emerge, each organization analyzes this information in a decentralized fashion, leaving room for varying interpretations and inconsistent real-life application. In summary, large-scale data collection, organization, scrutiny, and dissemination efforts during the COVID-19 era are key for strategic decision-making and sharing best practices. ClinTex believes this is the future of medicines development and is uniquely placed to support the use of data to deliver new medicines faster, safer, and cheaper.