Introduction to OMOP-CDM

What is the OMOP CDM?

The OMOP Common Data Model (CDM) is an open, community-driven standard designed to harmonize the structure and content of observational data. It offers a unified structure for diverse observational health data, including patient demographics, conditions, procedures, drug exposures, and clinical measurements.
In Sweden, interest in OMOP is steadily growing. Several universities and healthcare organizations (including Karolinska Institutet) are exploring its potential to support national and international collaborations in medical research.

What does OMOP mean? A bit of history

The Observational Medical Outcomes Partnership (OMOP) was launched in 2008 in the United States as a public–private initiative focused on improving drug safety surveillance and the analysis of observational healthcare data. To enable this work, OMOP introduced a common format for structuring disparate datasets. Over time, the OMOP CDM grew beyond its initial focus on drug development, becoming a widely adopted standard with an independent life of its own.
Today, the OMOP CDM is maintained by the Observational Health Data Sciences and Informatics (OHDSI) community, a global, open-science collaborative. Beyond the data model itself, OHDSI also develops and maintains standardized vocabularies, open-source software tools, and methodological frameworks that together support the full lifecycle of observational research using OMOP.
As of 2024, 544 data sources from 54 countries have been standardized to the OMOP CDM. These sources include electronic health records, administrative claims, registries, hospital systems, genomics data, and biobanks. Together, they conservatively cover more than 974 million unique patient records (approximately 12% of the world’s population; see the 2024 OHDSI annual report).

Why should I use the OMOP CDM?

Health data is characterized by a wide variety of formats, database systems, information models, vocabularies, and tools for data management and analysis. This diversity stems from the complexity of the healthcare ecosystem, where numerous organizations collect data for different purposes and under varying regulatory constraints.
While an individual data model may serve the needs of a specific organization, the sheer diversity of data sources and models creates major challenges for researchers seeking to combine and analyze information across multiple data sources.
In this context, the OMOP CDM provides a unified structure that enables researchers to share and analyze data from diverse sources, regardless of their underlying data models or vocabularies.
Another important aspect of data standardization, especially considering the most recent advances in artificial intelligence (AI), is ensuring that data from different sources can be reliably combined, enabling the development of robust machine learning models and fair evaluations across populations.
Among the available approaches to data standardization, the OMOP CDM stands out for its proven effectiveness in supporting federated analysis (data remain local, analyses run across sites) and for its open and collaborative nature.
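To make the federated idea concrete, here is a minimal sketch in Python. It assumes each site holds its own OMOP-style condition records as (person_id, condition_concept_id) pairs and shares only aggregate counts. The function names and the concept IDs are hypothetical placeholders chosen for illustration; they are not part of any OHDSI tool.

```python
from collections import Counter

def local_condition_counts(condition_rows):
    """Runs inside one site: count distinct persons per condition concept.

    condition_rows is an iterable of (person_id, condition_concept_id) pairs.
    Only the resulting counts leave the site, never the patient-level rows.
    """
    persons_by_concept = {}
    for person_id, concept_id in condition_rows:
        persons_by_concept.setdefault(concept_id, set()).add(person_id)
    return Counter({c: len(p) for c, p in persons_by_concept.items()})

def pool_counts(site_results):
    """Runs at the coordinating centre: sum the per-site aggregate counts."""
    total = Counter()
    for counts in site_results:
        total.update(counts)
    return total

# Made-up placeholder concept IDs (not real OMOP concept IDs).
CONCEPT_A = 1000001
CONCEPT_B = 1000002

# Patient-level data that stays at each site.
site_a_rows = [(1, CONCEPT_A), (2, CONCEPT_A), (2, CONCEPT_B), (1, CONCEPT_A)]
site_b_rows = [(10, CONCEPT_A), (11, CONCEPT_B), (12, CONCEPT_B)]

pooled = pool_counts([
    local_condition_counts(site_a_rows),
    local_condition_counts(site_b_rows),
])
print(dict(pooled))
```

Because both sites use the same table layout and the same concept identifiers, the same counting function can run unchanged at every site, and the results can be summed directly. Real federated studies typically add further safeguards, such as suppressing very small counts.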

How does the OMOP CDM work?

The OMOP CDM is, at its core, a standardized way to represent the information in a dataset, making it easier to compare and analyze data across different sources. Moving data into this format requires a process known as mapping, in which local codes and structures are translated into the OMOP standardized vocabularies and tables. To support this transition, the OHDSI community provides a range of open-source tools for mapping, data transformation, and downstream analysis.
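As an illustration of what mapping involves, the sketch below translates local diagnosis codes into rows shaped like the OMOP condition_occurrence table. The field names (person_id, condition_concept_id, condition_start_date, condition_source_value) follow the CDM, and concept ID 0 is the OMOP convention for a source code with no standard match. The local codes, the lookup table, and the non-zero concept IDs are made-up placeholders; in a real project the mapping would come from the OHDSI standardized vocabularies.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical lookup from local codes to standard concept IDs.
# The IDs are made-up placeholders, not real OMOP concept IDs.
LOCAL_TO_CONCEPT = {
    "DIAB2": 1000001,
    "HYPT": 1000002,
}

# OMOP convention: concept ID 0 marks a source code with no standard concept.
UNMAPPED_CONCEPT_ID = 0

@dataclass
class ConditionOccurrence:
    """A simplified subset of the condition_occurrence table's fields."""
    person_id: int
    condition_concept_id: int
    condition_start_date: date
    condition_source_value: str  # the original local code, kept for traceability

def map_condition(person_id, local_code, start_date):
    """Translate one local diagnosis record into an OMOP-style row."""
    concept_id = LOCAL_TO_CONCEPT.get(local_code, UNMAPPED_CONCEPT_ID)
    return ConditionOccurrence(
        person_id=person_id,
        condition_concept_id=concept_id,
        condition_start_date=start_date,
        condition_source_value=local_code,
    )

# Example source records: (person_id, local_code, start_date).
source_rows = [
    (1, "DIAB2", date(2021, 3, 14)),
    (2, "HYPT", date(2022, 7, 1)),
    (3, "XYZ99", date(2023, 1, 20)),  # no entry in the lookup table
]

mapped = [map_condition(*row) for row in source_rows]
for row in mapped:
    print(row)
```

Keeping the original code in condition_source_value means no information is lost when a local code cannot be mapped, and unmapped rows are easy to find later by filtering on concept ID 0.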
It is important to keep in mind that the OMOP CDM does not replace local data models and vocabularies. Instead, it provides a common framework for data integration and analysis across multiple sources. In other words, an OMOP CDM instance acts as an interface that allows different source data to be combined and analyzed, regardless of their underlying data models or vocabularies.
In Sweden, SciLifeLab is emerging as a service provider for OMOP, offering expertise and infrastructure to help researchers adopt the model and connect with the international OHDSI network.

OMOP at SciLifeLab

Sweden is undertaking a national initiative (OMOP 4 Sweden) to create the conditions needed to accelerate the adoption of the OMOP standard. The project brings together a multi-helix consortium for long-term collaboration, aiming to build a clear OMOP case for Sweden while aligning international expertise with local needs and regulations.
SciLifeLab is contributing actively by driving pilot projects, such as the harmonization of the PREDDLUNG dataset. Through this work, SciLifeLab is helping to define national use cases, demonstrate OMOP’s value for the Swedish health data ecosystem, and capture best practices that will support wider implementation in the years ahead.

Links and Resources

Last updated on 01-10-2025