Overview

An overview of how LinkR works

Introduction

Here is a global view of LinkR:


  • LinkR is built around projects. Each project loads datasets that contain data in OMOP format and use standard terminologies.
  • This data can be quality-controlled through the use and sharing of data cleaning scripts.
  • In these projects, users can visualize and analyze data using widgets, which use configured plugins.
  • It is also possible to access an R and Python development environment via the development console.

Let's now look at these different elements one by one.

Datasets

LinkR works with OMOP, an international common data model for health data.

You can import data from different sources: a database, Parquet files, or CSV files.

The same dataset can be used in multiple projects.

For now, the data import script must be written in R. A graphical interface is planned for a future version.
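Before importing a CSV file, it can help to check that it exposes the columns OMOP expects. The sketch below is only an illustration in Python, not LinkR's import mechanism (which relies on an R script): it checks a CSV header against the required columns of the OMOP `person` table. The function name `missing_person_columns` and the sample data are hypothetical.

```python
import csv
import io

# Required (non-nullable) columns of the OMOP CDM v5.x `person` table.
REQUIRED_PERSON_COLUMNS = {
    "person_id",
    "gender_concept_id",
    "year_of_birth",
    "race_concept_id",
    "ethnicity_concept_id",
}


def missing_person_columns(csv_text: str) -> set[str]:
    """Return the required `person` columns that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip().lower() for name in header}
    return REQUIRED_PERSON_COLUMNS - present


# Hypothetical extract used only for illustration.
sample = "person_id,gender_concept_id,year_of_birth\n1,8507,1980\n"
print(sorted(missing_person_columns(sample)))
```

A check of this kind catches structural problems early, before the data reaches a project.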

See how to import data.

Integration of data in formats other than OMOP

It is currently not possible to import data in formats other than OMOP.

Future versions are planned to support:

  • Data in FHIR format
  • Data in a custom format (typically a data collection in CSV or Excel format)

Terminologies

The OMOP data model is based on standard terminologies, such as:

  • LOINC for laboratory data
  • SNOMED for diagnoses
  • RxNorm for medications
  • etc

Terminologies are often imported along with the data.

All these terminologies are available on Athena.

See terminologies.

Data Cleaning Scripts

Data imported in OMOP format often needs to be quality-controlled using data cleaning scripts.

A classic example is weight and height data. Because of how care software is designed, these fields often contain aberrant values, for example when weight and height have been entered in each other's field.

Scripts that exclude this type of data are frequently written. LinkR makes it easy to share them, and because they rely on the common OMOP data model, they are likely to work on the different datasets imported into LinkR.
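As a minimal sketch of this kind of check, the Python example below labels a weight/height pair as plausible, probably swapped, or aberrant. It is not a LinkR script: the record layout (`Anthropometry`), the function names, and the plausibility ranges are assumptions chosen for illustration, not clinical thresholds.

```python
from dataclasses import dataclass

# Illustrative plausibility ranges for adults (assumptions, not clinical thresholds).
WEIGHT_KG_RANGE = (20.0, 300.0)
HEIGHT_CM_RANGE = (100.0, 250.0)


@dataclass
class Anthropometry:
    """Hypothetical simplified record; real OMOP data lives in the measurement table."""
    person_id: int
    weight_kg: float
    height_cm: float


def in_range(value: float, bounds: tuple[float, float]) -> bool:
    low, high = bounds
    return low <= value <= high


def classify(record: Anthropometry) -> str:
    """Label a record as 'ok', 'swapped' or 'aberrant'."""
    if in_range(record.weight_kg, WEIGHT_KG_RANGE) and in_range(record.height_cm, HEIGHT_CM_RANGE):
        return "ok"
    # The values only make sense once exchanged: the fields were probably inverted.
    if in_range(record.height_cm, WEIGHT_KG_RANGE) and in_range(record.weight_kg, HEIGHT_CM_RANGE):
        return "swapped"
    return "aberrant"


records = [
    Anthropometry(1, weight_kg=70, height_cm=175),
    Anthropometry(2, weight_kg=175, height_cm=70),
    Anthropometry(3, weight_kg=700, height_cm=175),
]
print([classify(r) for r in records])
```

Range checks like this cannot resolve pairs that are plausible either way (for example 120 and 150), so such records are left as "ok" here.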

A few other examples of scripts:

  • calculation of scores, such as IGS-2 or SOFA score
  • calculation of diuresis, by summing different parameters (urinary catheter, nephrostomy, etc.)
  • etc
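The diuresis example can be sketched as a sum of urine output per patient and per day across sources. This Python snippet is only an illustration under stated assumptions: the tuple layout, the source labels, and the grouping by calendar day (rather than by a care-unit-specific 24-hour window) are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical urine output rows: (person_id, source, timestamp, volume in mL).
measurements = [
    (1, "urinary_catheter", "2024-01-01 08:00", 300.0),
    (1, "nephrostomy", "2024-01-01 12:00", 150.0),
    (1, "urinary_catheter", "2024-01-02 09:00", 400.0),
    (2, "urinary_catheter", "2024-01-01 10:00", 500.0),
]


def daily_diuresis(rows):
    """Sum volumes from all sources per (person_id, ISO calendar day)."""
    totals = defaultdict(float)
    for person_id, _source, timestamp, volume_ml in rows:
        day = datetime.strptime(timestamp, "%Y-%m-%d %H:%M").date().isoformat()
        totals[(person_id, day)] += volume_ml
    return dict(totals)


for key, total in sorted(daily_diuresis(measurements).items()):
    print(key, total)
```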

See how to create a data cleaning script.

Projects

A project is an R and Python environment where data will be analyzed.

A project can correspond to a study (for example a study on mortality prediction), but also to data analysis outside of studies, such as creating dashboards (a dashboard allowing visualization of a hospital department’s activity, for example).

When creating a project, the user chooses the data to use from the datasets loaded in the application.

The project will be built around two main pages:

  • Patient-level data page: here, the user can recreate the equivalent of a clinical record, by creating tabs where they will configure widgets, for example:

    • a “Hemodynamics” tab where we will create widgets to visualize heart rate, blood pressure, and antihypertensive treatments received by the patient

    • a “Notes” tab where we will display all textual documents concerning the patient (hospital reports, daily clinical notes, etc.)

    • an “Infectious diseases” tab where we will display all data concerning infectious diseases (bacteriological samples, antibiotics received, etc.)

    • etc

  • Aggregated data page: here, the user creates tabs and configures widgets in the same way, but the analyses apply to a group of patients, for example:

    • a “Demographic data” tab where the user will display demographic data of the patient group (age, sex, length of stay, mortality, etc.)

    • an “Aberrant data” tab where the distribution of different parameters will be displayed and aberrant data will be excluded

    • a “Survival analysis” tab where a widget will be configured to perform survival analysis of the selected population

    • etc

The low-code interface, which combines a code interface and a graphical interface, makes collaboration between data scientists, statisticians, and clinicians easier.

See how to create a project.

Plugins

Plugins are pieces of R and Python code that allow adding functionalities to LinkR.

As we saw in the previous section, projects are organized into tabs.

These tabs contain widgets, which are plugins applied to data.

For example, if I choose the “Timeline continuous var.” plugin to be applied to the “Heart rate” parameter, the resulting widget will be a timeline graph that displays the heart rate of the selected patient.

There are patient-level data plugins, which are the elements that allow recreating a medical record, for example:

  • Document reader: this plugin allows displaying textual documents (hospital reports, clinical notes) and filtering them (with keyword search or title filter, for example)
  • Timeline continuous var.: as mentioned above, to display temporal data as a timeline
  • Data table: allows displaying data as a data table, such as displaying a patient’s laboratory results by collection time
  • etc…

We also have aggregated data plugins, which will serve to visualize and analyze aggregated data, for example:

  • Survival analysis: allows performing survival analyses
  • Machine learning: to train and evaluate machine learning models, with R or Python libraries
  • etc…

See how to create a plugin.

Widgets

Widgets correspond to plugins applied to data.

After creating a tab, I can add several widgets to it.

These widgets can be resized and moved on the page.
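The relationship "widget = plugin applied to data" can be pictured with a small conceptual model. The Python sketch below does not reflect LinkR's internal implementation: the `Plugin` and `Widget` classes and the `summarize` function are hypothetical names used only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Plugin:
    """Reusable piece of code: knows how to render data, but holds no data itself."""
    name: str
    render: Callable[[str, Sequence[float]], str]


@dataclass(frozen=True)
class Widget:
    """A plugin bound to a specific parameter and its values."""
    plugin: Plugin
    concept: str
    values: Sequence[float]

    def display(self) -> str:
        return self.plugin.render(self.concept, self.values)


def summarize(concept: str, values: Sequence[float]) -> str:
    return f"{concept}: n={len(values)}, min={min(values)}, max={max(values)}"


# The same plugin can back several widgets, each applied to different data.
summary_plugin = Plugin(name="Summary (hypothetical)", render=summarize)
heart_rate_widget = Widget(summary_plugin, "Heart rate", [72, 85, 90])
print(heart_rate_widget.display())
```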

Subsets

At the project level, a dataset can be divided into several subsets.

A subset is a portion of the global dataset, obtained by applying filters that select patients.

Here are examples of subsets that one could imagine based on the MIMIC database, which includes stays of more than 50,000 patients in intensive care, adult and pediatric:

  • Patients over 18 years old admitted to medical intensive care for COVID-19
  • Patients with an ICD-10 code for infectious pneumonia and treated with Amoxicillin
  • Excluded patients: it can be useful to create a subset with only patients excluded from analyses
  • etc…

For now, the scripts that create subsets must be written in R. A graphical interface is planned for a future version.
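The filtering logic behind a subset can be sketched as follows. This is a Python illustration of the idea only, since actual LinkR subset scripts are written in R. The `Stay` record, its fields, and `select_subset` are hypothetical and far simpler than the OMOP tables.

```python
from dataclasses import dataclass


@dataclass
class Stay:
    """Hypothetical flattened view of one intensive care stay."""
    person_id: int
    age_at_admission: int
    care_unit: str
    diagnosis: str


def select_subset(stays, min_age, care_unit, diagnosis):
    """Return sorted person_ids with at least one stay matching all filters."""
    return sorted({
        s.person_id
        for s in stays
        if s.age_at_admission > min_age
        and s.care_unit == care_unit
        and s.diagnosis == diagnosis
    })


stays = [
    Stay(1, 54, "medical ICU", "COVID-19"),
    Stay(2, 16, "medical ICU", "COVID-19"),
    Stay(3, 67, "surgical ICU", "COVID-19"),
]

included = select_subset(stays, min_age=18, care_unit="medical ICU", diagnosis="COVID-19")
# The "excluded patients" subset is simply the complement of the included one.
excluded = sorted({s.person_id for s in stays} - set(included))
print(included, excluded)
```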

See how to create a subset.

Conclusion

We have seen the different elements that make up LinkR.


We will now see: