Introduction to the OMOP CDM
What is the OMOP CDM?
The OMOP Common Data Model (CDM) is an open, community-driven standard designed to harmonize the structure and content of observational data. In general, a common data model defines a shared way of organizing data so that information from different sources can be represented in the same format. This makes it easier to compare, analyze, and reuse data across institutions and studies. The OMOP CDM applies these principles through a standardized structure and shared vocabularies specifically designed for observational data, including patient demographics, conditions, procedures, drug exposures, and clinical measurements.


Why use the OMOP CDM?
Health data comes in many formats and systems. While an individual data model may serve the needs of a specific organization, the sheer diversity of data sources and models creates major challenges for researchers seeking to combine and analyze information across multiple data sources. Furthermore, AI applications require large, diverse, and high-quality datasets to perform well, and the heterogeneity of health data can hinder the development of robust and generalizable models.
In this context, the OMOP CDM provides a unified structure that enables researchers to share and analyze data from diverse sources, regardless of their underlying data models or vocabularies. Advantages of using the OMOP CDM include:
- Standardization Across Sources
OMOP CDM provides a unified structure and vocabulary, making it easier to integrate data from multiple healthcare systems, EHRs, and claims databases.
- Interoperability for Research
Enables consistent analysis across institutions and countries, supporting collaborative studies and large-scale observational research.
- Scalability for Big Data Analytics
Designed to handle large, complex healthcare datasets efficiently, making it suitable for population-level studies and predictive modeling.
- Rich Vocabulary Mapping
Includes standardized vocabularies (SNOMED, RxNorm, LOINC, etc.), ensuring semantic consistency and reducing ambiguity in clinical concepts.
- Access to OHDSI Tools & Community
Using OMOP CDM unlocks a suite of open-source tools (e.g., ATHENA, Usagi, Achilles) and a global research network, accelerating analytics and methodological development.
How does the OMOP CDM work?
The OMOP CDM is, at its core, simply a standardized way to represent the information in a dataset. It is important to bear in mind that the OMOP CDM will not replace the need for local data models and vocabularies. Indeed, preparing a dataset for use with the OMOP CDM primarily involves translating the source data into the standardized OMOP structure, creating a clear and reproducible mapping between local representations and the OMOP conventions.
In practice, this typically involves:
- reviewing the structure and content of the source dataset
- mapping local tables and fields to the corresponding OMOP CDM tables
- translating local fields/codes into standardized OMOP vocabularies
- checking the transformed data using established validation tools
Once transformed in this way, the resulting OMOP CDM instance can be combined and analyzed with other databases that have also been standardized to OMOP.
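The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real ETL pipeline: the source field names (patient_no, sex, birth_date, dx_code, dx_date), the lookup dictionaries, and the helper functions are all hypothetical, and only a subset of the PERSON and CONDITION_OCCURRENCE fields is shown. The gender concept IDs 8507 and 8532 and the use of 0 for unmapped values follow OMOP conventions; the diagnosis concept ID is illustrative and should be verified in ATHENA. The final check is a toy stand-in for established validation tools.

```python
from datetime import date

# Hypothetical local-to-OMOP lookups. Real projects derive these from the
# OHDSI vocabularies (for example via ATHENA and Usagi), not hard-coded dicts.
GENDER_TO_CONCEPT = {"M": 8507, "F": 8532}   # OMOP gender concepts: MALE, FEMALE
DIAGNOSIS_TO_CONCEPT = {"E11": 201826}       # illustrative only; verify in ATHENA
NO_MATCHING_CONCEPT = 0                      # OMOP convention for unmapped values


def to_person(row):
    """Map one hypothetical source patient row to a subset of PERSON fields."""
    return {
        "person_id": row["patient_no"],
        "gender_concept_id": GENDER_TO_CONCEPT.get(row["sex"], NO_MATCHING_CONCEPT),
        "year_of_birth": row["birth_date"].year,
        "gender_source_value": row["sex"],   # keep the original local value
    }


def to_condition_occurrence(row):
    """Map one hypothetical diagnosis row to a subset of CONDITION_OCCURRENCE fields."""
    return {
        "person_id": row["patient_no"],
        "condition_concept_id": DIAGNOSIS_TO_CONCEPT.get(row["dx_code"], NO_MATCHING_CONCEPT),
        "condition_start_date": row["dx_date"],
        "condition_source_value": row["dx_code"],
    }


def mapped_fraction(rows, concept_field):
    """Toy validation check: share of rows whose concept field was mapped."""
    if not rows:
        return 0.0
    mapped = sum(1 for r in rows if r[concept_field] != NO_MATCHING_CONCEPT)
    return mapped / len(rows)


# Made-up source data, in a made-up local structure.
source_patients = [
    {"patient_no": 1, "sex": "F", "birth_date": date(1980, 5, 17)},
    {"patient_no": 2, "sex": "X", "birth_date": date(1975, 1, 2)},  # unmapped local code
]
source_diagnoses = [
    {"patient_no": 1, "dx_code": "E11", "dx_date": date(2020, 3, 1)},
]

persons = [to_person(r) for r in source_patients]
conditions = [to_condition_occurrence(r) for r in source_diagnoses]

print(persons)
print(conditions)
print(mapped_fraction(persons, "gender_concept_id"))
```

Keeping the original local value next to the standard concept ID means the mapping stays traceable and can be revised later without going back to the source system.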
Researchers are not expected to approach this process alone. The OHDSI community provides a rich ecosystem of open-source tools, documentation, and shared best practices that support each step of the transformation. Moreover, the widespread adoption of the OMOP CDM across institutions and countries means that many real-world examples, mappings, and workflows already exist, making it easier to learn from and build upon prior experience.


What’s the history of OMOP?
The Observational Medical Outcomes Partnership (OMOP) was launched in 2008 in the United States as a public–private initiative focused on improving drug safety surveillance and the analysis of observational healthcare data. To enable this work, OMOP introduced a common format for structuring disparate datasets. Over time, the OMOP CDM grew beyond its initial focus on drug safety surveillance, becoming a widely adopted standard with an independent life of its own.
Today, the OMOP CDM is maintained by the Observational Health Data Sciences and Informatics (OHDSI) community, a global, open-science collaborative. Beyond the data model itself, OHDSI also develops and maintains standardized vocabularies, open-source software tools, and methodological frameworks that together support the full lifecycle of observational research using OMOP.
As of 2024, 544 data sources from 54 different countries have been standardized to the OMOP CDM. These data include electronic health records, administrative claims, registries, hospital systems, genomics, and biobanks. Together, these data sources conservatively cover more than 974 million unique patient records, approximately 12% of the world’s population. See the 2024 OHDSI annual report for more details.
How is OMOP being used in SciLifeLab and Sweden?
In Sweden, interest in OMOP is steadily growing. Several universities and healthcare organizations, including Karolinska Institutet and Karolinska University Hospital, are exploring its potential to support national and international collaborations in medical research.
A national initiative, OMOP 4 Sweden, is being carried out to create the conditions needed to accelerate the adoption of the OMOP standard. This project brings together a multi-helix consortium for long-term collaboration, aiming to build a clear OMOP case for Sweden while aligning international expertise with local needs and regulations.
SciLifeLab is emerging as a service provider for OMOP, offering expertise and infrastructure to help researchers adopt the model and connect with the international OHDSI network, which provides a range of open-source tools for mapping, data transformation, and downstream analysis. SciLifeLab is also contributing actively by driving pilot projects, such as the harmonization of the PreDDlung dataset. Through this work, SciLifeLab is helping to define national use cases, demonstrate OMOP’s value for the Swedish health data ecosystem, and capture best practices that will support wider implementation in the years ahead.

Links and Resources
- OMOP 4 Sweden, the preparation project page for the OMOP initiative in Sweden
- OMOP common data model webpage
- OHDSI webpage
- EHDEN Academy, training and development programmes developed by the OHDSI and EHDEN community
