15 min Boris Delange · 2026/03/12

Variables, concept sets, and anchors

Create concept sets, define temporal anchors, and configure variables with their collection time windows.

Summary

After defining the study population, you need to specify what data to collect and when. This article covers three sections of the Study Designer: concept sets (grouping medical codes), temporal anchors (defining reference dates), and variables (configuring each data point with its time window and aggregation function).

Concept sets

The Concept sets section lets you create reusable sets of medical codes. A concept set groups codes from standardized terminologies (ICD-10, SNOMED CT, LOINC, ATC…) that describe the same clinical concept — for example, all codes for a diagnosis of “sepsis,” or all antibiotic prescription codes. These terminologies were introduced in the article on terminologies.

These sets are then used in the selection criteria (previous article) and in the definition of variables (below).

Interoperability and multicenter extension

Defining your variables and criteria using standardized terminologies from the study design stage ensures your protocol’s interoperability. If you later consider a multicenter extension, this groundwork will save considerable time: each center can use the same concept sets without having to recreate the definitions.

Creating a concept set

Three methods are available to create a concept set:

The INDICATE Data Dictionary

The catalog currently includes the INDICATE Data Dictionary, a dictionary of standardized concept sets developed as part of the European INDICATE project. This dictionary provides a level of abstraction above raw terminologies: rather than handling hundreds of individual LOINC or SNOMED CT codes on ATHENA or ATLAS, you work directly with clinical variables such as “Heart rate”, “Creatinine”, or “Type 2 diabetes”, each associated with expert-curated concept sets. This level of abstraction — the clinical variable, not the individual code — is what research protocols use to define which data to collect. Additional dictionaries could be integrated over time, created by learned societies or specialized working groups (oncology, genetics, cardiology…).

Managing concepts

Once a concept set is created, click on it to open it. You can then:

Each concept set also lets you define a unit (UCUM) and the minimum and maximum values to retain, so that outliers are excluded.

What are concept sets for?

The same concept set can be reused in multiple places: in a selection criterion (e.g., “patients with a sepsis diagnosis”) and in a variable (e.g., “highest creatinine value”). Centralizing codes in a concept set avoids duplication and makes maintenance easier.
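As a rough illustration of this reuse, a concept set can be pictured as a named group of codes that a criterion and a variable both point to. The sketch below is a hypothetical in-memory model, not the Study Designer's actual data format: the `ConceptSet` class, the dictionary keys, and the two ICD-10 codes are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptSet:
    """Hypothetical model: a named group of (terminology, code) pairs."""
    name: str
    codes: set = field(default_factory=set)

    def contains(self, terminology: str, code: str) -> bool:
        return (terminology, code) in self.codes


# Illustrative content only; a real set would usually hold many more codes.
sepsis = ConceptSet("Sepsis", {("ICD-10", "A41.9")})

# The criterion and the variable hold a reference to the same object,
# so the codes are maintained in a single place.
inclusion_criterion = {"label": "patients with a sepsis diagnosis", "concept_set": sepsis}
variable = {"label": "sepsis diagnosis present", "concept_set": sepsis}

# Adding a code once makes it visible everywhere the set is used.
sepsis.codes.add(("ICD-10", "A40.9"))

print(inclusion_criterion["concept_set"].contains("ICD-10", "A40.9"))
print(variable["concept_set"].contains("ICD-10", "A40.9"))
```

Because both entries share one set, there is no second copy of the code list that could drift out of sync.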

Temporal anchors

Temporal anchors are the reference dates around which variables are collected. For example, if you want to measure creatinine “within 24 hours of admission,” the admission date is the temporal anchor.

Anchors are defined in the Anchors tab of the Variables section.

Anchor types

Several anchor types are available, such as admission, discharge, or a clinical event.

Each anchor has a name (e.g., “ICU admission”, “Sepsis diagnosis”) and optional details.

Why define anchors?

In a health data study, when a measurement is collected is just as important as the measurement itself. Temporal anchors formalize this timing and apply it consistently throughout the protocol. They are also used to generate the study scripts automatically.

Variables

The Variables tab in the same section lets you define each data point to extract. A variable corresponds to a measurement, result, or characteristic that you want to obtain for each individual in your cohort. The notions of concept, temporal anchor, collection window, and aggregation function were introduced in detail in the article on defining variables.

Creating a variable

Click Add variable to open the creation form. You fill in:

Variable source

Two types of variables are available: variables based on a concept set, and computed variables (for example, age).

Temporal anchor and collection window

Each variable is linked to a temporal anchor. This is the reference date from which the collection window is calculated.

The collection window specifies the interval around the anchor during which data is searched:

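For example, for “the highest creatinine within 24 hours of admission”, the anchor is the admission date, the window runs from 0 to 24 hours after it, and the aggregation function is the maximum.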
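The window logic can be sketched in a few lines. This is a minimal illustration and not the script the tool generates: the `in_window` helper, the inclusive bounds, the use of `None` for an open bound, and the sample timestamps and values are all assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional


def in_window(timestamp: datetime, anchor: datetime,
              start_hours: Optional[float], end_hours: Optional[float]) -> bool:
    """True if timestamp lies in [anchor + start, anchor + end].

    A bound of None means "unbounded" on that side (assumed convention).
    """
    if start_hours is not None and timestamp < anchor + timedelta(hours=start_hours):
        return False
    if end_hours is not None and timestamp > anchor + timedelta(hours=end_hours):
        return False
    return True


admission = datetime(2024, 1, 10, 8, 0)

# Made-up creatinine measurements: (timestamp, value).
measurements = [
    (datetime(2024, 1, 10, 6, 0), 1.1),   # 2 h before admission: outside
    (datetime(2024, 1, 10, 9, 30), 1.8),  # 1.5 h after admission: inside
    (datetime(2024, 1, 11, 7, 0), 2.4),   # 23 h after admission: inside
    (datetime(2024, 1, 11, 12, 0), 3.0),  # 28 h after admission: outside
]

kept = [value for ts, value in measurements if in_window(ts, admission, 0, 24)]
highest = max(kept)
print(kept, highest)
```

Only the two measurements taken between admission and admission + 24 hours are kept, and the maximum is taken over those.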

Aggregation function

When multiple values exist within the collection window, you choose how to summarize them:
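A small sketch of how such a summary could be computed, limited to the three aggregation names that appear in this article's example (“Maximum”, “First”, “Presence”). The `aggregate` function, its signature, and the sample data are hypothetical; the tool may offer other functions or behave differently on empty windows.

```python
from datetime import datetime


def aggregate(values, how):
    """Summarize (timestamp, value) pairs found inside a collection window."""
    if how == "Presence":
        # True as soon as at least one record exists in the window.
        return len(values) > 0
    if not values:
        # Assumed behavior: no data in the window yields a missing value.
        return None
    if how == "Maximum":
        return max(value for _, value in values)
    if how == "First":
        # Earliest record by timestamp, regardless of input order.
        return min(values, key=lambda pair: pair[0])[1]
    raise ValueError(f"unknown aggregation: {how}")


# Made-up lactate values inside a window, deliberately not in time order.
window_values = [
    (datetime(2024, 1, 10, 14, 0), 4.2),
    (datetime(2024, 1, 10, 9, 0), 2.9),
    (datetime(2024, 1, 10, 20, 0), 3.1),
]

print(aggregate(window_values, "Maximum"))
print(aggregate(window_values, "First"))
print(aggregate(window_values, "Presence"))
```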

Concrete example

For a study on sepsis, you might define the following variables:

  • Maximum lactate at H24 — concept set “Lactate”, anchor “ICU admission”, window 0–24 hours, aggregation “Maximum”
  • Creatinine at admission — concept set “Creatinine”, anchor “Admission”, window −6 to +6 hours, aggregation “First”
  • History of type 2 diabetes — concept set “Type 2 diabetes”, anchor “Admission”, window null–0 (from all time up to admission), aggregation “Presence”
  • Age — computed variable “Age”
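Written as data, the four variables above might look like the sketch below. The `VariableSpec` class, its field names, and the `problems` check are hypothetical and only mirror the fields named in this article; they do not reflect the Study Designer's export format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariableSpec:
    name: str
    concept_set: Optional[str] = None    # None for computed variables
    anchor: Optional[str] = None
    start_hours: Optional[float] = None  # None = no lower bound
    end_hours: Optional[float] = None    # None = no upper bound
    aggregation: Optional[str] = None
    computed: Optional[str] = None       # set only for computed variables


variables = [
    VariableSpec("Maximum lactate at H24", "Lactate", "ICU admission", 0, 24, "Maximum"),
    VariableSpec("Creatinine at admission", "Creatinine", "Admission", -6, 6, "First"),
    VariableSpec("History of type 2 diabetes", "Type 2 diabetes", "Admission", None, 0, "Presence"),
    VariableSpec("Age", computed="Age"),
]


def problems(spec: VariableSpec) -> list:
    """Return human-readable issues found in one specification."""
    issues = []
    if spec.computed is None:
        if not (spec.concept_set and spec.anchor and spec.aggregation):
            issues.append("concept set, anchor, and aggregation are required")
        if (spec.start_hours is not None and spec.end_hours is not None
                and spec.start_hours > spec.end_hours):
            issues.append("window start is after window end")
    return issues


for spec in variables:
    print(spec.name, problems(spec) or "ok")
```

A declarative form like this makes it easy to spot an inverted window or a missing anchor before any data is extracted.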

Table and timeline

Defined variables are displayed in a summary table that shows for each variable: the name, unit, temporal anchor, collection window, and aggregation function. You can edit or delete each variable from this table.

A timeline view is also available. It visually represents the temporal anchors and collection windows of each variable as horizontal bars. This is a good way to check at a glance that all windows are consistent.
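To give an idea of what such a view conveys, the sketch below draws the windows from the sepsis example as text bars. It is a simplified illustration, not the tool's rendering: the `render_timeline` function, the axis from -12 to +24 hours, and the 2-hour cell size are assumptions, and each bar is drawn relative to its own variable's anchor.

```python
def render_timeline(windows, lo=-12, hi=24, step=2):
    """Draw each (label, start_hours, end_hours) window as a bar of '#' cells.

    Each cell covers `step` hours between `lo` and `hi`, relative to the anchor.
    A start or end of None is treated as unbounded and clipped to the axis.
    """
    lines = []
    for label, start, end in windows:
        s = lo if start is None else start
        e = hi if end is None else end
        cells = ""
        for h in range(lo, hi, step):
            # The cell [h, h + step) is filled if it overlaps the window.
            overlaps = s < h + step and e > h
            cells += "#" if overlaps else "."
        lines.append(f"{label:<28}|{cells}|")
    return lines


windows = [
    ("Maximum lactate at H24", 0, 24),
    ("Creatinine at admission", -6, 6),
    ("History of type 2 diabetes", None, 0),
]

lines = render_timeline(windows)
for line in lines:
    print(line)
```

In this layout the anchor sits at the seventh cell, so a bar that starts or ends in an unexpected place stands out immediately.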

Key takeaways

  • Concept sets group medical codes (ICD-10, LOINC, ATC…) into reusable sets for criteria and variables, with a unit (UCUM) and retained min/max values to exclude outliers.
  • Temporal anchors define reference dates (admission, discharge, clinical event…) around which variables are collected.
  • Each variable is linked to an anchor, a collection window, and an aggregation function that specify when and how to extract the data.
  • The timeline view shows all variable collection windows at a glance, which helps you check that they are consistent.