Pharma knows there is a need for clean, reusable, FAIR data to fuel effective machine learning-driven R&D, but effective information and knowledge management takes a long time to establish. It needs a step change in which individuals learn their roles in data stewardship and put them into practice in a way that doesn’t constrain or impede their work.
In our recent webinar, Pharma Data as an Asset: Moving from an application-centric to an information-centric organization, Dr. Martin Romacker, Senior Principal Scientist in Scientific Solution Delivery and Architecture at Roche, talked about how pharma considers data an asset but historically has not treated it as such. He used the oft-cited ‘data is the new oil’ metaphor to illustrate how much value is lost through poorly managed data, comparing it to the economic waste of natural gas flared off during oil production.
To embed good data management practices, pharma needs to start accounting for unplanned or hidden costs such as ETL (extract, transform, load) processes, data cleansing and semantic data integration. It also needs to recognize that any created or acquired data that do not comply with the FAIR (Findable, Accessible, Interoperable, Reusable) Principles during the production process immediately lose value, adversely affecting innovation and output right across the drug discovery and development process.
During the session, Dr. Romacker demonstrated where Roche is implementing FAIR Data in core areas, including:
* Implementing a comprehensive data value chain: fully integrating semantic data management so that all incoming data and metadata are FAIR, enabling retrospective or secondary data use far into the future. This is underpinned by change management around data citizenship and sharing, including a FAIR Data Playbook.
* The Roche Data Commons (RDC) and the layers required to ensure not only FAIR data but also quality data, emphasizing that effective output needs both, and that the infrastructure and services need to be FAIR too, not just the data.
All of this is particularly important in light of one of Dr. Romacker’s key points: technologies and applications will continuously come and go, but the data will always be there, so it is much better to make them reusable as early as possible in their lifecycle.
Central to this whole process is the researcher. A FAIR data implementation must not impede or burden researchers. Instead, they should feel empowered to own the FAIRification of their data, supported by internal infrastructure and technologies that make this easy.
Driving this change requires internal change management and a willingness on all sides to engage on this topic, but it also extends beyond any single organization to the entire scientific community. Pharma, partners and academia all need to work together on an open public/private infrastructure supporting reusable FAIR Data.
Overall, the investment and effort now will pay off downstream, with a more cost-effective R&D process where FAIR Data is central, and the scientific community can react faster, with greater insight, to deliver what patients need.
To get involved or get started, download the Pistoia Alliance FAIR Toolkit.