1 - Installation

How to install LinkR, in R or from a Docker image

Introduction

Several installation methods are available:


  • The simplest (no programming knowledge required): use Docker Desktop
  • Command line: you can run the Docker image from a terminal or install the package via an R console.
  • For developers: Clone the Git repository to be able to edit the source code.

Using Docker Desktop is the simplest method and requires no programming knowledge.

Install Docker on Windows or macOS

  1. Download Docker Desktop
  2. Install and launch Docker Desktop

In the left menu, click on Docker Hub. This lets you search for and install the LinkR image.

Search for interhop/linkr in the search bar.

Click on this image.

On the right side of the screen, select the latest image (latest), and click on Pull to install it.

Return to the Images page on the left of the screen. You will see the list of locally installed images.

On the interhop/linkr image row, click on the Play icon in the Actions column.

In the dialog that opens, expand the Optional settings menu and set the following fields:

  • Host port: enter 3838
  • Host path: this is the local folder where the files for the application’s operation will be installed
  • Container path: you must enter /root

Then click on Run, open your browser, and go to the following URL: localhost:3838.

Command line installation

Via Docker

Install Docker Desktop as described above.

Verify that Docker works by opening a terminal (PowerShell or CMD on Windows) and running:

docker --version

Pull (download) the Docker image from Docker Hub.

docker pull interhop/linkr:latest

Launch a container from this image.

docker run -p 3838:3838 interhop/linkr:latest

You can now access LinkR via the address localhost:3838.

You can also launch LinkR with different arguments for the run_app function (the arguments are listed below).

docker run \
    -p 3838:3838 \
    interhop/linkr:latest \
    R -e "linkr::run_app(language = 'en', app_folder = '/root')"

Here are the arguments for run_app:

  • language (character): Application language. Can be "fr" for French or "en" for English.
  • app_folder (character): Folder where application files will be stored in the container.
  • authentication (logical): Enable or disable user authentication (TRUE or FALSE).
  • username (character): Username to use for automatic login when authentication = FALSE. Ignored if authentication is enabled.
  • local (logical): If TRUE, the application runs in local mode without loading external files (e.g., from GitHub).
  • log_level (character vector): Log levels to display. Can include "info", "error", "event", or "none" to disable logs.
  • log_target (character): Log destination: "console" or "app".
  • port (integer): Port used to run the Shiny application.
  • host (character): Host address to run the application (default "0.0.0.0").
  • loading_options (list): List of startup options. Can include named elements such as page, project_id, load_data_page, subset_id, person_id.


To allow the container to access a specific folder on your host system (for example, /my_personal_folder/linkr), you can mount this folder in the container. This is done with the -v option when launching the container.

docker run \
    -p 3838:3838 \
    -v /my_personal_folder/linkr:/root \
    interhop/linkr:latest \
    R -e "linkr::run_app(language = 'en', app_folder = '/root')"

Here the app_folder argument of run_app is set to /root, so the application files are saved in the container's /root folder, which is in fact the folder on your system that you mounted with the -v option.

Via RStudio / R Console

The remotes package is required to install LinkR. You can install it with this command:

install.packages("remotes")

Stable version

Install the latest stable version with this command:

remotes::install_gitlab("interhop/linkr/linkr", host = "framagit.org")

Development version

To install the latest development version, add @dev at the end of the git repository link.

remotes::install_gitlab("interhop/linkr/linkr@dev", host = "framagit.org")

Important - shiny.fluent version

You must also install version 0.3.0 of shiny.fluent. By default, version 0.4.0 is installed, but it has unresolved bugs.

remotes::install_github('Appsilon/shiny.fluent', ref = 'dd1c956')
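To check which version ended up in your library, a small test such as the following can help. This is only a sketch: the helper name has_version is made up for illustration, and it assumes the commit installed above reports itself as version 0.3.0.

```r
# Hypothetical helper (not part of LinkR): returns TRUE if `pkg` is installed
# at exactly `expected`, FALSE if another version is installed, NA if missing.
has_version <- function(pkg, expected) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NA)
  }
  utils::packageVersion(pkg) == expected
}

status <- has_version("shiny.fluent", "0.3.0")
if (is.na(status)) {
  message("shiny.fluent is not installed")
} else if (!status) {
  message("Installed shiny.fluent version: ",
          as.character(utils::packageVersion("shiny.fluent")))
}
```

If the message reports another version, run the install_github command again before starting LinkR.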

Start LinkR

To launch LinkR from RStudio or from an R console, run the run_app function.

linkr::run_app()

See above for the arguments that the function can take.
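As an illustration, some of the arguments listed earlier can be collected in a list and passed to run_app. The values below (folder location, log settings) are assumptions chosen for the example, not recommended defaults.

```r
# Example values only; adjust them to your setup.
run_args <- list(
  language = "en",
  app_folder = file.path(path.expand("~"), "linkr"),  # hypothetical folder
  log_level = c("info", "error"),
  log_target = "console",
  port = 3838L
)

# Launch only from an interactive session where the linkr package is installed.
if (interactive() && requireNamespace("linkr", quietly = TRUE)) {
  do.call(linkr::run_app, run_args)
}
```

Keeping the arguments in a list makes it easy to reuse the same configuration between sessions.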

Installation for developers

Clone the official LinkR Git repository from Framagit:

git clone git@framagit.org:interhop/linkr/linkr.git
cd linkr

Make sure you have configured an SSH key to access Framagit. Otherwise, you can also use the HTTPS URL:
https://framagit.org/interhop/linkr/linkr.git

Switch to the dev branch:

git checkout dev

Open the project folder in RStudio or Visual Studio Code (with the R extension enabled).

Load the package with devtools, by running this in the R console:

install.packages("devtools")  # If not already done
devtools::load_all(".")

Launch the application by running this:

linkr::run_app()

Conclusion

We have installed and launched LinkR.


Let's now see how to launch our first project.

2 - Getting Started

A tutorial for launching your first project and manipulating data

Introduction

In this tutorial, we will see how to:


  • Launch a project that's already installed
  • Visualize data and concepts from a dataset
  • Organize data by creating tabs and widgets

At the end of this tutorial, you will have a dashboard allowing you to visualize data from 100 patients.

You must have previously installed and launched LinkR.

Project Launch

When installing LinkR, the following were installed:

  • A project (LinkR Demo), allowing you to visualize patient records and obtain a dashboard of aggregated data
  • A dataset (MIMIC-IV Demo), with data from 100 patients admitted to intensive care

To launch the project, click on it directly from the home page (LinkR Demo Project).

Two things then happen:

  • The data from the 100 patients are loaded
  • The patient-level data pages (patient records) and aggregated pages (statistics or visualizations on groups of patients) are displayed as configured in the project

You arrive at the project’s home page, which shows information about:

  • the project (author, creation date…)
  • the data contained in the project: here we see that we have data from 100 patients, corresponding to 852 stays
  • a description of the project when available

To access patient records, click on the “Patient-level Data” icon.

You will see dropdown menus on the left side of the screen: Subset, Patient, and Stay.

Select the subset “All patients”, and any patient.

You will then see the patient’s stays. Here we see an emergency room visit, followed by an intensive care stay and a transfer to a medical unit.

Note the tabs at the top right of the screen, which allow you to organize the medical record.

Click on the “Haemodynamics” tab to access data concerning the patient’s hemodynamics.

Here we have three widgets:

  • A widget that displays hemodynamic data as a timeline
  • A widget that displays this same data as a table
  • A widget that displays the treatments received by the patient

You can click on a part of the first figure to zoom in on a period. This will update the other widgets to display data for the period selected in the first widget.

If you change patients, the data will also be updated.

Note at the top of the page, next to the loaded project name, three icons:

  • The first (person icon) allows you to display the patient-level data page
  • The second (multiple people icon) displays the aggregated data page
  • The third (list icon) allows you to explore the concepts used in the dataset

Click on the second icon to go to the aggregated data page, where we will see a dashboard with information about the selected patients.

As the last step of this overview, click on the third icon (the concepts one).

In the “Terminology” dropdown menu, select “All terminologies”.

You will then see all the concepts available in the dataset chosen for this project.

If you click on a concept, you will see its details, including its distribution.

Let's summarize:


  • A project allows you to organize data in a certain way
  • The patient-level data page allows you to visualize patient data record by record
  • The aggregated data page allows you to visualize data from a group of patients
  • The concepts page allows you to explore the concepts used in a dataset

We will now see how to create a tab and a widget ourselves!

Creating a Tab and a Widget

Let’s go back to the patient-level data page (remember, with the person icon at the top of the screen).

We’re going to create a Respiratory tab, in which we’ll create a widget to display vital signs related to the patient’s ventilation.

Click on the “+ Tab” icon at the top left of the screen.

An “Add a tab” menu appears. Name it “Respiratory” and validate.

Once on this tab, click on the + Widget icon.

We will have three things to do:

  • Choose a name for the widget
  • Choose a plugin to select how to display the data (as we’ve seen: timeline, table, or other)
  • Choose the concepts to display

When you click on “Select a plugin”, you will get the list of available plugins.

To display a plugin’s description, click on the “Information” icon.

You will then get a description of the plugin’s features, which allows you to know if this is the plugin you need to display data as you wish.

If this plugin suits you, close the description and click on the plugin.

Finally, let’s choose the concepts to display by clicking on “Select concepts”.

For the example, we’re going to choose the LOINC terminology in the dropdown menu.

Choose the concepts:

  • Respiratory rate
  • Oxygen saturation in Arterial blood by Pulse oximetry

Click on “Validate” to confirm the concept selection, then on “Add” to create the widget.

Choose the concepts to display in the dropdown menu, then click on “Show figure” to the left of the widget.

By clicking on “Edit page” on the left side of the screen, you can resize the widget.

Conclusion

We have therefore:


  • Created a tab "Respiratory" to visualize the patient's respiratory data
  • Created a widget with the "Timeline continuous var." plugin to display the patient's respiratory rate and saturation

For more information on creating widgets, go to this page.

To understand LinkR's structure in more detail, go to the next page of the documentation.

3 - Overview

An overview of how LinkR works

Introduction

Here is a global view of LinkR:


  • LinkR is built around projects, in which datasets are loaded, containing data in OMOP format and using standard terminologies.
  • This data can be quality-controlled through the use and sharing of data cleaning scripts.
  • In these projects, users can visualize and analyze data using widgets, which use configured plugins.
  • It is also possible to access an R and Python development environment via the development console.

Let's now look at these different elements one by one.

Datasets

LinkR works with OMOP, an international common data model for health data.

You can import data from different sources: a database, Parquet files, or CSV files.

The same dataset can be used in multiple projects.

For now, it is necessary to code the data import script in R. A graphical interface will be coded in a future version.

See how to import data.

Integration of data in formats other than OMOP

It is currently not possible to import data in formats other than OMOP.

Future versions are planned to support:

  • Data in FHIR format
  • Data in a custom format (typically a data collection in CSV or Excel format)

Terminologies

The OMOP data model is based on standard terminologies, such as:

  • LOINC for laboratory data
  • SNOMED for diagnoses
  • RxNorm for medications
  • etc

Terminologies are often imported along with the data.

All these terminologies are available on Athena.

See terminologies.

Data Cleaning Scripts

Data imported in OMOP format often needs to be quality-controlled using data cleaning scripts.

A classic example is weight and height data, which, because of how care software is designed, often contains aberrant values, for example when the weight and height fields are swapped.

Scripts are often created to exclude this type of data. LinkR makes such scripts easy to share and, because they rely on the common OMOP data model, they are likely to work on the different datasets imported into LinkR.

A few other examples of scripts:

  • calculation of scores, such as the SAPS II (IGS II) or SOFA score
  • calculation of diuresis, by summing different parameters (urinary catheter, nephrostomy, etc.)
  • etc

See how to create a data cleaning script.
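To make the weight and height example concrete, here is a minimal sketch in base R. It is not LinkR's own cleaning API: the data frame only mimics a few columns of the OMOP MEASUREMENT table, the concept IDs are assumed to be those of body weight and body height (check them in Athena), and the plausibility ranges are illustrative.

```r
# Toy rows shaped like the OMOP MEASUREMENT table.
measurement <- data.frame(
  person_id = c(1, 1, 2, 2, 3, 3),
  measurement_concept_id = c(3025315, 3036277, 3025315, 3036277, 3025315, 3036277),
  value_as_number = c(72, 175, 168, 64, 80, 1.8)
)

# Assumed concept IDs (3025315 = body weight in kg, 3036277 = body height in cm)
# and illustrative plausibility ranges for adult patients.
ranges <- data.frame(
  measurement_concept_id = c(3025315, 3036277),
  min_value = c(20, 100),
  max_value = c(300, 250)
)

# Flag values outside the range configured for their concept.
flag_aberrant <- function(measurement, ranges) {
  idx <- match(measurement$measurement_concept_id, ranges$measurement_concept_id)
  out_of_range <- measurement$value_as_number < ranges$min_value[idx] |
    measurement$value_as_number > ranges$max_value[idx]
  # Concepts without a configured range are never flagged.
  measurement$aberrant <- !is.na(out_of_range) & out_of_range
  measurement
}

flagged <- flag_aberrant(measurement, ranges)
cleaned <- flagged[!flagged$aberrant, ]
```

Note that a range check catches only part of a swap: for patient 2, the height of 64 is flagged, but the weight of 168 stays within the range and would need an additional rule.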

Projects

A project is an R and Python environment where data will be analyzed.

A project can correspond to a study (for example a study on mortality prediction), but also to data analysis outside of studies, such as creating dashboards (a dashboard allowing visualization of a hospital department’s activity, for example).

When creating a project, the user chooses the data to use from the datasets loaded in the application.

The project will be built around two main pages:

  • Patient-level data page: here, the user can recreate the equivalent of a clinical record, by creating tabs where they will configure widgets, for example:

    • a “Hemodynamics” tab where we will create widgets to visualize heart rate, blood pressure, and antihypertensive treatments received by the patient

    • a “Notes” tab where we will display all textual documents concerning the patient (hospital reports, daily clinical notes, etc.)

    • an “Infectious diseases” tab where we will display all data concerning infectious diseases (bacteriological samples, antibiotics received, etc.)

    • etc

  • Aggregated data page: here, the user will create tabs in the same way where they will configure widgets. These will be analyses on a group of patients, for example:

    • a “Demographic data” tab where the user will display demographic data of the patient group (age, sex, length of stay, mortality, etc.)

    • an “Aberrant data” tab where the distribution of different parameters will be displayed and aberrant data will be excluded

    • a “Survival analysis” tab where a widget will be configured to perform survival analysis of the selected population

    • etc

Using the low-code interface (which combines a code interface and a graphical interface), collaborative work between data scientists, statisticians, and clinicians becomes easier.

See how to create a project.

Plugins

Plugins are pieces of R and Python code that allow adding functionalities to LinkR.

As we saw in the previous paragraph, projects are organized into tabs.

These tabs contain widgets, which are plugins applied to data.

For example, if I choose the “Timeline continuous var.” plugin to be applied to the “Heart rate” parameter, the resulting widget will be a timeline graph that displays the heart rate of the selected patient.

There are patient-level data plugins, which are the elements that allow recreating a medical record, for example:

  • Document reader: this plugin allows displaying textual documents (hospital reports, clinical notes) and filtering them (with keyword search or title filter, for example)
  • Timeline continuous var.: as mentioned above, to display temporal data as a timeline
  • Data table: allows displaying data as a data table, such as displaying a patient’s laboratory results by collection time
  • etc…

We also have aggregated data plugins, which will serve to visualize and analyze aggregated data, for example:

  • Survival analysis: allows performing survival analyses
  • Machine learning: to train and evaluate machine learning models, with R or Python libraries
  • etc…

See how to create a plugin.

Widgets

Widgets correspond to plugins applied to data.

After creating a tab, I can add several widgets to it.

These widgets can be resized and moved on the page.

Subsets

At the project level, a dataset can be divided into several subsets.

A subset is a group of patients selected from the full dataset by applying filters.

Here are examples of subsets that one could imagine based on the MIMIC database, which includes stays of more than 50,000 patients in intensive care, adult and pediatric:

  • Patients over 18 years old admitted to medical intensive care for COVID-19
  • Patients with an ICD-10 code for infectious pneumonia and treated with Amoxicillin
  • Excluded patients: it can be useful to create a subset with only patients excluded from analyses
  • etc…

For now, it is necessary to code the scripts in R to create subsets. A graphical interface will be coded in a future version.

See how to create a subset.
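The filtering logic behind a subset can be sketched in base R as follows. This example only computes the identifiers of the patients to keep; the LinkR function that registers them in a subset is not shown here. The tables are toy versions of the OMOP PERSON and VISIT_OCCURRENCE tables, and age is approximated as the year of the visit minus the year of birth.

```r
# Toy versions of two OMOP tables.
person <- data.frame(
  person_id = 1:4,
  year_of_birth = c(1950, 2010, 1985, 2005)
)
visit_occurrence <- data.frame(
  person_id = 1:4,
  visit_start_date = as.Date(c("2020-03-01", "2020-04-15", "2021-01-10", "2024-06-01"))
)

# Return the IDs of patients older than `min_age` at one of their visits.
select_adults <- function(person, visit_occurrence, min_age = 18) {
  visits <- merge(visit_occurrence, person, by = "person_id")
  visit_year <- as.integer(format(visits$visit_start_date, "%Y"))
  visits$age <- visit_year - visits$year_of_birth  # approximate age in years
  sort(unique(visits$person_id[visits$age > min_age]))
}

adult_ids <- select_adults(person, visit_occurrence)
```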

Conclusion

We have seen the different elements that make up LinkR.


We will now look at these elements in more detail, starting with importing data.

4 - Importing Data

How to import data from different sources: databases, Parquet, CSV…

Introduction

It is possible in LinkR to manage multiple data sources (in OMOP format).

These data sources can each be reused in multiple projects.


It is possible to import data from different sources:


  • A relational database (DuckDB, PostgreSQL...)
  • Parquet files
  • CSV files

It will soon be possible to import data in a custom format (data collection) and in FHIR format.

Create a Dataset

To import data, go to the Datasets page from the menu at the top of the screen.

Then click on the Plus (+) icon on the left side of the screen to create a new dataset.

Choose a name. For the example, we will import the MIMIC-III dataset.

For more information about the MIMIC database, go here.

Once the dataset is created, click on the widget corresponding to it and go to the Code tab on the right side of the screen.

You will see that R code has been automatically generated.

This code gives you two examples of using the import_dataset function, which we will see in detail.

import_dataset Function

To import data into LinkR, we use the import_dataset function.

Here are the function arguments, which we will use depending on the type of data to import (file or database connection):

  • omop_version: This is the OMOP version of the data you are going to import (“5.3” or “5.4”)
  • data_folder: In the case of importing data from a folder, this is the folder containing the data
  • con: In the case of importing data from a database, this is the database connection object
  • tables_to_load: By default, all OMOP tables will be loaded from the indicated source. If you only want to load some of these tables, specify here the tables to import. For example tables_to_load = c('person', 'visit_occurrence', 'visit_detail')

Database Connection

You can import data from a connection to a database.

First configure the connection object con with the DBI library, then use the import_dataset function.

The con argument will take our con object as value.

# Connection object. We see this in detail below.
con <- DBI::dbConnect(...)

# Function to load data when loading the project
import_dataset(omop_version = "5.4", con = con)

This code will establish a connection to the database when loading a project using this dataset.

Here is an example with a connection to a local PostgreSQL database.

# Connection to local PostgreSQL database
con <- DBI::dbConnect(
    RPostgres::Postgres(),
    host = "localhost",
    port = 5432,
    dbname = "mimic-iv-demo",
    user = "postgres",
    password = "postgres"
)

# Loading data when launching the project
import_dataset(omop_version = "5.4", con = con)
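The same pattern should apply to DuckDB, which is listed among the possible sources. The sketch below assumes that the duckdb R package is installed and that the OMOP tables are stored in a DuckDB file; the file path is hypothetical. Since import_dataset is provided by LinkR when the dataset code runs, the call is guarded so that the snippet does nothing outside LinkR.

```r
# Hypothetical location of a DuckDB file containing the OMOP tables.
db_file <- "/data/mimic-iv-demo/mimic-iv-demo.duckdb"

if (requireNamespace("duckdb", quietly = TRUE) && file.exists(db_file)) {
  # Connection to the local DuckDB file
  con <- DBI::dbConnect(duckdb::duckdb(), dbdir = db_file)

  # List the tables found, to check that the OMOP tables are present
  print(DBI::dbListTables(con))

  # Loading data when launching the project (only available inside LinkR)
  if (exists("import_dataset")) {
    import_dataset(omop_version = "5.4", con = con)
  }
}
```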

Import Files

You can also import files without going through a database connection.

For this, specify the file location in the data_folder argument.

For example, let’s say my database files are in the /data/mimic-iv-demo/ folder:

/data/mimic-iv-demo/
--- person.parquet
--- visit_occurrence.parquet
--- visit_detail.parquet
--- measurement.parquet

I load them like this.

import_dataset(omop_version = "5.4", data_folder = "/data/mimic-iv-demo/")
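Before importing, it can be useful to check that the folder contains a file for each table you expect. The helper below is a hypothetical sketch in base R (it is not part of LinkR) and assumes one file per table, named after the table, as in the listing above.

```r
# Hypothetical helper: which expected OMOP tables have a file in the folder?
find_table_files <- function(data_folder, tables, ext = "parquet") {
  files <- list.files(data_folder, pattern = paste0("\\.", ext, "$"))
  found <- tools::file_path_sans_ext(files)
  list(
    present = intersect(tables, found),
    missing = setdiff(tables, found)
  )
}

expected <- c("person", "visit_occurrence", "visit_detail", "measurement")
check <- find_table_files("/data/mimic-iv-demo/", expected)
if (length(check$missing) > 0) {
  message("No file found for: ", paste(check$missing, collapse = ", "))
}
```

The present element can then be passed to the tables_to_load argument of import_dataset if you only want to load the tables that are actually available.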

Function Execution

Once your script is configured, you can execute the code with the “Execute” button on the left side of the screen (Play icon), or using the CTRL/CMD + SHIFT + ENTER shortcut.

If the data is imported correctly, the row count per table is displayed.

If an error occurs, the error message is displayed in the output field on the right side of the screen.

This script runs every time you load a project that uses this dataset.

You can create more complex scripts, which will for example download data from an external source, transform it and save it locally. This is for example what we do with the script to download the MIMIC-IV demo database.
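A common building block for such scripts is a download that runs only once. The following is a minimal sketch of that idea, not the actual MIMIC-IV demo script: the helper name, the URL and the folder are placeholders.

```r
# Hypothetical helper: download `url` to `dest` only if the file is not there yet.
# Returns TRUE if a download was performed, FALSE if the local copy was reused.
download_if_missing <- function(url, dest) {
  if (file.exists(dest)) {
    return(FALSE)
  }
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
  utils::download.file(url, dest, mode = "wb")
  TRUE
}

# Placeholder URL and folder; replace them with the real source.
# download_if_missing("https://example.org/omop/person.parquet",
#                     "/data/my-dataset/person.parquet")
# import_dataset(omop_version = "5.4", data_folder = "/data/my-dataset/")
```

Because the script runs each time the project is loaded, skipping the download when the file already exists avoids fetching the same data repeatedly.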

Load Specific Tables

You can choose to import only certain tables from the database, with the tables_to_load argument.

You just need to specify in a character vector the tables to import, like this:

# Loading only person, visit_occurrence, visit_detail and measurement tables
tables <- c("person", "visit_occurrence", "visit_detail", "measurement")

# Adding the tables_to_load argument in import_dataset
import_dataset(omop_version = "5.4", con = con, tables_to_load = tables)

From the Content Catalog

You can also install a dataset from the content catalog.

This downloads the code used to load the data, and only the code: data is never downloaded from the content catalog.

Find the tutorial here.

Conclusion

We have seen how to import data, from a database or from files.


Let's now see how to use this data within a project.

5 - Creating a Project

How to create and configure a project

Introduction

A project can correspond to a study (for example a study on mortality prediction), but also to data analyses outside of studies, such as creating dashboards (a dashboard to visualize the activity of a hospital department for example).


LinkR is a low-code data science application, which means you can:

  • Visualize and analyze your data via a graphical interface, without using a programming language, by creating tabs and widgets (this is the no-code functionality)
  • Analyze your data via programming scripts in R or Python, with all the libraries and functionalities available in each of these programming languages (this is the code functionality)

In this chapter, we will see:

  • How to create and configure a project
  • How to explore the concepts present in a dataset
  • How to organize this data using tabs, widgets and scripts
  • And finally how to share this project with the community

Creating a Project

To start, go to the projects page, from the menu at the top of the screen or from the home page.

Then click on “Create a project”.

Choose a name for your project as well as the dataset to use, then click on “Add”.

You can always change the dataset to use for this project later, in the project configuration.

Click on the project to open it.

You arrive on the home page of your project.

You see here that we have data from 100 patients, given that we chose the “MIMIC-IV demo” dataset when creating the project.

The project home page is divided into several tabs (at the top right of the screen). Let’s look at them one by one:

  • Summary: the page we just saw. It displays the main information about the project: the author(s), the project description, and a quick view of the loaded data
  • Data: details of the data loaded in this project: number of patients, number of stays, and number of rows per OMOP table, with some figures to visualize demographic data
  • Data cleaning: where you configure the data cleaning scripts that are applied to the data when the project is loaded
  • Share: lets you share your project with the rest of the community (by downloading it in Zip format or by updating a Git repository)

Note that the project name appears at the top of the screen. Clicking on it from any other page of the project (Individual data, for example) brings you back to the project home page.

To the right of the project name, several buttons have appeared:

  • Individual data (icon with a single person): to go to the page where you can configure the data to create a patient record
  • Aggregated data (icon with several people): this is the page where you can visualize and analyze cohort data
  • Concepts (icon with a list): you can search for concepts among those present in the imported dataset
  • Subsets (available by clicking on the three dots): you can create subsets of patients by filtering them according to criteria
  • Project files (also available by clicking on the three dots): you can manage scripts (R or Python) and data files created as part of the project

Configuring the Project

Now that our project is created, we will configure it.

We will go through the project tabs one by one (Summary, Data and Data cleaning).

Summary

This tab is mainly used to edit the information describing your project, which makes the project easier to share.

To do this, click on the “Edit information” button on the left of the screen (from the Summary tab).

You can then modify the following information:

  • Name: this is the project name
  • Authors: the authors of the project. You can separate names with a semicolon (Jane Doe; John Doe)
  • Version: this will allow the community to know the project version, in order to update it in case of modifications
  • Short description: a description of the project in one sentence
  • Give access: this will define who will have access to the project within your LinkR instance

The first four fields therefore concern the project description (this information will be useful particularly in case of project sharing), while the last element concerns access to the project only within your LinkR instance.

Note the dropdown menu at the top right, “English”: you can modify the name and short description of the project in different languages, which will facilitate its sharing.

We have so far given a brief description of the project, but users will need more information to understand your project.

For this, you can modify the long description of the project.

Click on the “Edit description” button on the right of the screen.

A Markdown text editor then appears.

To view the rendering (in HTML), click on the execute button at the top right of the editor (Play icon), or use the shortcut CTRL/CMD + SHIFT + ENTER.

To save the modifications, click on the “Save modifications” icon (check icon), or use the shortcut CTRL/CMD + S.

Just like the short description, you can create one description per language.

Remember to save the modifications of your project information.

Data

Now go to the “Data” tab of our project.

If a dataset is selected, you will see different information concerning this dataset:

  • Number of healthcare establishments from which the patients come
  • Number of patients
  • Number of hospital stays
  • Number of rows per OMOP table
  • Visualization of some data distributions (by clicking on “Patients” or “Stays”): age, sex, hospital departments…

You can at any time change the dataset loaded in your project, by modifying the value of the dropdown menu.

For the new dataset to be loaded immediately, click on the “Play” icon next to the dropdown menu.

Data cleaning scripts

This part is still under development. Be patient!

It will allow you to apply data preprocessing scripts, for example:

  • Apply a script to exclude aberrant weight and height data
  • Apply a script to calculate the SOFA score daily

Exploring Concepts

Go to the concepts page of the project via the icon to the right of the project name, at the top of the screen.

On this page, select a terminology in the dropdown menu to load its concepts.

The table then lists the concepts of this terminology that are used in the dataset loaded for your project.

The “Patients” column shows the number of patients with at least one occurrence of the concept, and the “Rows” column shows the number of rows associated with the concept across all tables.

When you click on a concept in the table, the information related to this concept will appear on the right of the screen.

You can notably retrieve the concept ID, which will be useful when you query the OMOP tables. You can also see the distribution of concept values in the loaded dataset.
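As a sketch of how a concept ID is used in a query, the example below filters a toy data frame shaped like the OMOP MEASUREMENT table. The IDs are assumptions (3027018 for heart rate, 3024171 for respiratory rate); use the ID shown in the concept details for your own dataset.

```r
# Toy rows shaped like the OMOP MEASUREMENT table.
measurement <- data.frame(
  person_id = c(1L, 1L, 2L, 2L, 3L),
  measurement_concept_id = c(3027018L, 3024171L, 3027018L, 3027018L, 3024171L),
  value_as_number = c(82, 18, 110, 95, 22)
)

# Assumed concept ID for "Heart rate"; read the actual ID on the concepts page.
heart_rate_id <- 3027018L

# Keep only the rows coded with this concept.
heart_rate <- measurement[measurement$measurement_concept_id == heart_rate_id, ]

# Counts comparable to the "Patients" and "Rows" columns of the concepts table.
n_patients <- length(unique(heart_rate$person_id))
n_rows <- nrow(heart_rate)

# Distribution of the values for this concept.
summary(heart_rate$value_as_number)
```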

You can filter concepts by their name, with the menu at the top of the “Concept name” column. You can also choose which table columns to display. These are the columns of the OMOP CONCEPT table.

All these terminology names can be overwhelming!

To untangle all this, we will quickly see what the OMOP model is and what the ETL process is.

OMOP Model


The OMOP model is a standard data model for health data.

It's a way to organize health data, in the form of a database with tables, each storing a particular type of data.


For example:

  • The PERSON table stores data about individuals (mainly patients)
  • The VISIT_OCCURRENCE table stores data about hospital stays
  • The CONDITION_OCCURRENCE table stores information about patient diagnoses

Each piece of information in the OMOP database is coded using a concept, belonging to a terminology.

Each concept has a unique identifier that you can find via the ATHENA query tool.

Concept identifiers are stored in the columns of the OMOP tables whose names end in _concept_id.


ETL Process


To obtain data in OMOP format from medical software, it is necessary to perform an ETL (Extract, Transform and Load) process.


During this process, data is transformed to fit the OMOP data model (each software stores its data differently).

The different local concepts are aligned to the standard OMOP concepts. For example, a hospital's heart rate code will be aligned to the standard concept "Heart rate" from the standard LOINC terminology.


This concept alignment process is long and complicated, given that there are thousands of codes to align, often manually.

This is why most OMOP datasets have only some of their concepts aligned.

It is also why the dropdown menu above lists both standard terminologies (LOINC, SNOMED) and local, non-standard ones (prefixed by mimiciv).
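The alignment step can be pictured as a lookup in a mapping table. The sketch below uses made-up local codes and assumed standard concept IDs; the only convention taken from OMOP is that a code without a standard equivalent receives the concept ID 0.

```r
# Hypothetical local codes found in the source software.
source_codes <- c("HR_MONITOR", "RESP_RATE", "LOCAL_SCORE_X")

# Mapping table built during the ETL (only part of the codes are aligned).
mapping <- data.frame(
  source_code = c("HR_MONITOR", "RESP_RATE"),
  standard_concept_id = c(3027018L, 3024171L)  # assumed IDs, check in Athena
)

# Look up each local code; unmapped codes get concept ID 0.
map_to_standard <- function(codes, mapping) {
  ids <- mapping$standard_concept_id[match(codes, mapping$source_code)]
  ids[is.na(ids)] <- 0L
  ids
}

mapped <- map_to_standard(source_codes, mapping)

# Share of local codes that were aligned to a standard concept.
share_mapped <- mean(mapped != 0L)
```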


For more information on terminologies, go to the dedicated page of the documentation.

To reload the concept count, you can click on the “Reload count” icon at the top left of the screen.

Creating Tabs and Widgets

Now that we have loaded a dataset and explored the concepts composing it, we will be able to visualize and analyze this data, using widgets.

For this, go to the Individual data page, either from the project summary tab, or from the icon at the top of the screen, to the right of the project title (the one with a single individual).

You will arrive on the Individual data page, where you will recreate a patient record according to the needs of your project.

The menu on the left of the screen allows you to:

  • add tabs: tabs allow organizing the different widgets
  • add widgets: as we will see, widgets are the basic building blocks of a project. They allow visualizing and analyzing data using plugins
  • edit the page: once widgets are created, you can change their layout on the page. You can also modify or delete tabs.
  • select patients: each subset contains several patients, each patient has one or more stays (hospital stay or consultation)

It’s up to you to choose how to organize your project.

For the individual data page, it is usual to create one tab per theme, with for example a “Hemodynamics” tab gathering data related to a patient’s hemodynamic state, or an “Infectious diseases” tab to display elements related to infectious issues: antibiotic treatments, microbiological samples, etc.

Let’s create a first “Hemodynamics” tab. To do this, click on the “+ Tab” button on the left of the screen, then choose a name.

You will have a new empty tab. Tabs are displayed on the right of the screen.

We will now be able to add different widgets to this tab. Click on the “+ Widget” button on the left of the screen.

You will need to:

  • choose a name
  • choose a plugin
  • choose concepts

A plugin is a script written in R and/or Python that adds functionality to the application.

There are plugins specific to individual data, others to aggregated data, and others mixed.

Each plugin has a main functionality.

Some plugins serve to visualize a type of data, for example the plugin to visualize prescription data as a timeline, or the plugin to display structured data as a table.

Others serve to analyze data, for example the plugin to create a logistic regression model, or the one to train machine learning models.

Each step of a data science project can be transformed into a plugin, to save time and improve quality in data analysis. LinkR aims to offer more and more plugins, thanks to the work of its community.

For the example, we want to display patients’ hemodynamic parameters as a timeline.

We will therefore click on “Select a plugin”.

To display the description of a plugin, click on the “Information” icon.

You will then have a description of the plugin’s functionalities, which allows you to know if it’s the plugin you need to display data as you wish.

Click on the “Timeline continuous var.” plugin to select it.

Now let’s select which concepts to display, by clicking on “Select concepts”.

For the example, we selected the concepts of heart rate and systolic, diastolic and mean arterial pressures with the LOINC terminology.

Let’s choose a name, for example “Hemodynamic timeline” and click on “Add”. Our widget will appear on the page.

A widget will often appear in the same form, with three or four icons at the top of the widget, two buttons on the left and the save file name.

Let’s start with the menu at the top of the widget.

The icons are, from left to right:

  • Figure: allows displaying the figure or more globally the result that the plugin is supposed to display
  • Figure parameters: allows configuring the figure using a graphical interface
  • Figure code: allows editing the R or Python code that displays the figure
  • General parameters: these are the general parameters of the widget, allowing for example to show or hide certain elements

Each widget works the same way: a graphical interface allows configuring the figure. When parameters are modified, the corresponding R or Python code can be generated. Once this code is generated, it can be modified directly with the code editor, which allows going beyond what the graphical interface alone offers.

Widgets work with save files, allowing saving both figure parameters and figure code. This allows creating several configurations for the same widget.

To choose a save file, click on the file name (here “No save file selected”), then select the file in the dropdown menu.

To create a save file, click on the “+” icon on this same page, choose a name and create the file. For this first example, we choose the name “Hemodynamic set 1”.

Once the file is created, the parameters saved in the “Figure parameters” and “Figure code” pages will be saved in this file.

Before configuring our figure, let’s look at the “General parameters” of the widget.

In the “Display” section, we can choose to show or hide the selected save file.

We can also choose to display parameters and the editor side by side with the figure. This will divide the widget screen into two parts, with the figure on the left and the parameters or figure code on the right, which is useful to quickly see the result of our parameters.

In the “Code execution” section, we can choose to execute the code when a save file is loaded: when a project loads, for example, the last selected save file is loaded, which initializes all widgets at once. You can also choose not to run a widget at load time if it is slow to execute and not needed as soon as the project opens.

The “Execute code when updating data” option updates the figure when the data changes, for example when another patient is selected, if the widget uses patient-by-patient data.

We will therefore choose to hide the save file, display parameters or editor side by side with the figure, and execute code both when loading the save file and when updating data.

We see the save file name disappear, and also the figure icon: indeed, the figure will be displayed in the “Figure parameters” and “Figure code” tabs.

Don’t forget to save your general parameters with the icon on the left of the widget. The widget’s general parameters depend on the widget, and not on a save file.

Before displaying our data, let’s adjust one last detail: let’s enlarge the widget.

To do this, click on “Edit page” on the left of the screen. You will then see new icons appear at the top right of the widget:

  • an icon to put the widget in full screen, which is useful in the widget configuration phase
  • an icon to modify the widget, if you want to modify the name, or add or remove concepts
  • an icon to delete the widget

There are also icons at the four corners, which allow defining the widget size.

Let’s make the widget take the full width of the screen and a third of its height.

Then put it in full screen mode. Click on “Validate modifications” on the left of the screen to exit “Edit” mode.

Let’s go to the “Figure parameters” section to configure our figure.

For this plugin, we have three options:

  • Data to display: do we want to display the selected patient’s data, or only the selected stay?
  • Concepts: which concepts to display? We see here appear the concepts we selected when creating the widget. We can choose to display only some of them.
  • Synchronize timelines: this can be useful to synchronize different widgets with each other.

Select “Patient data” in “Data to display”, then “Heart rate” in the concepts dropdown menu.

Then click on the “Save” icon on the left of the widget, then on the “Display figure” icon (Play icon).

You will be asked to select a patient: indeed, we had not yet chosen a patient.

Start by selecting “All patients” in the “Subset” dropdown menu, then any patient.

Since we chose to execute the code when data is updated, you should see the selected patient’s heart rate as a timeline.

Click again on “Edit page”, then exit full screen mode. Your widget should resume the dimensions you had assigned: a third of the page height and full width, which is suitable for this timeline.

You can zoom on the figure, and change the selected time interval.

Your turn!

Now try to:

  • Create a new save file for the current widget, "Hemodynamic set 2" for example
  • Configure the widget to display heart rate and systolic, diastolic and mean arterial pressures
  • Create a new widget with the "Data table" plugin, where you will display the same concepts
  • Synchronize the timelines of the two widgets


You should get something like this:

We have seen how to create tabs and widgets to create a patient record, on the “Individual data” page.

The principle is the same for the “Aggregated data” page, except that tabs generally correspond to steps of a research project, with for example a widget to create the study outcome, a widget to exclude aberrant data or a widget to train machine learning models.

Scripts and Files

Coming soon…

Sharing the Project

Once your project is configured, you can share it by integrating it into your Git repositories, directly from the application.

Go to the “Share” tab from the main project page (by clicking on the project name, in blue, at the top of the page).

The tutorial for sharing content is available here.

Conclusion

We have seen how to create and configure a project, in order to visualize and analyze data thanks to LinkR's low-code interface.


We will see in the rest of the documentation:

6 - Creating a Patient Group

Analyze population subsets using patient groups

Introduction

During a project, it's often necessary to work on a subset of the patient population from a dataset: this is what patient groups enable.


We'll see how to:

  • Create a patient group
  • Add patients to this group
  • Remove patients

Creating a Patient Group

To create a patient group (or subset), go to the Patient Groups page. For this, you need to have loaded a project. Then click on the “Patient Groups” icon from the menu at the top of the page, to the right of the loaded project name.

You’ll arrive at the project’s patient groups page.

A patient group is a subset of a dataset, but it belongs to a project. If two projects use the same dataset, they won’t share the same patient groups.

An “All patients” group is created by default when creating a project.

To create a patient group, click on the “+” icon on the left side of the screen.

Choose a name, then click “Add”. For this example, we’ll create a group containing patients over 50 years old.

Click on the group you just created: you’ll arrive at the selected group’s page.

On the right side of the screen, you have two tabs:

  • Summary: presents the patient group information, which can be modified (particularly the description)
  • Code: this tab allows you to modify the code and add or remove patients from a group, which we’ll see in the following paragraphs

Adding Patients to a Group

To add patients to a subset, we use the add_patients_to_subset function.

This function takes these two arguments:

  • patients: a numeric vector containing the IDs of patients to add
  • subset_id: the ID of the group to which patients will be added (by default the ID of the selected group, modifying this argument is useful in plugins particularly)

When a patient group is created, code that adds all patients to the group is generated automatically.

This code will be executed if the user presses the button to execute the code, or if the group is selected from the project (and if it doesn’t already contain patients).

We’ll modify this code to add patients over 50 years old.

Let’s use the code to create a column containing patient ages, from the tutorial on usual OMOP queries.

d$visit_occurrence %>%
    dplyr::left_join(
        d$person %>% dplyr::select(person_id, birth_datetime),
        by = "person_id"
    ) %>%
    dplyr::collect() %>%
    dplyr::mutate(
        age = round(as.numeric(difftime(visit_start_datetime, birth_datetime, units = "days")) / 365.25, 1)
    )

The code editor of the selected group allows testing the code. We’ll extract the IDs of patients with an age greater than 50 years. For now, we’ll comment out the add_patients_to_subset function.

d$visit_occurrence %>%
    dplyr::left_join(
        d$person %>% dplyr::select(person_id, birth_datetime),
        by = "person_id"
    ) %>%
    dplyr::collect() %>%
    dplyr::mutate(
        age = round(as.numeric(difftime(visit_start_datetime, birth_datetime, units = "days")) / 365.25, 1)
    ) %>%
    dplyr::filter(age > 50) %>%
    dplyr::distinct(person_id) %>%
    dplyr::pull()

Our code works, so we can store these IDs in a variable and then integrate it into the add_patients_to_subset function.

patients <-
    d$visit_occurrence %>%
    dplyr::left_join(
        d$person %>% dplyr::select(person_id, birth_datetime),
        by = "person_id"
    ) %>%
    dplyr::collect() %>%
    dplyr::mutate(
        age = round(as.numeric(difftime(visit_start_datetime, birth_datetime, units = "days")) / 365.25, 1)
    ) %>%
    dplyr::filter(age > 50) %>%
    dplyr::distinct(person_id) %>%
    dplyr::pull()

add_patients_to_subset(patients = patients)

A message indicates that the patients have been successfully added to the group.

Note that here we added patients to the group, but it’s also possible to filter:

  • On the hospitalization stay, by adding the visit_occurrence column to the patients variable
  • On stays in departments, by adding the visit_detail column to the patients variable

Graphical Interface

There is no graphical interface yet, which would be very useful for filtering patients on certain characteristics (age, sex, length of stay, hospitalization dates, presence of concepts such as diagnoses or treatments).

This graphical interface will be developed in the next version.

Removing Patients from a Group

To remove patients from a group, you need to use the remove_patients_from_subset function, which works like add_patients_to_subset, with the same arguments: patients and subset_id.

We could, for example, after adding all patients to the group, remove those with an age less than or equal to 50 years.
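As a minimal sketch of this approach, the code below computes each patient's maximum age across visits and removes those who never exceed 50, which mirrors the "age > 50 on at least one visit" rule used when adding patients. The toy d list and the stand-in definition of remove_patients_from_subset are assumptions added only so the sketch runs outside LinkR; in the group's code editor, d and the real function already exist and both blocks should be dropped.

```r
library(dplyr)

# Toy stand-in for the OMOP tables (assumption: in LinkR, `d` is already provided)
d <- list(
    person = dplyr::tibble(
        person_id = c(1L, 2L, 3L),
        birth_datetime = as.POSIXct(c("1950-01-01", "1990-06-15", "1974-03-10"), tz = "UTC")
    ),
    visit_occurrence = dplyr::tibble(
        person_id = c(1L, 2L, 3L, 3L),
        visit_start_datetime = as.POSIXct(
            c("2020-01-01", "2020-01-01", "2020-01-01", "2021-01-01"), tz = "UTC"
        )
    )
)

# Stand-in used only outside LinkR (assumption: same arguments as the real function)
if (!exists("remove_patients_from_subset")) {
    remove_patients_from_subset <- function(patients, subset_id = NULL) {
        message("Would remove patients: ", paste(patients, collapse = ", "))
    }
}

# Patients whose age never exceeds 50 on any visit
patients_to_remove <-
    d$visit_occurrence %>%
    dplyr::left_join(
        d$person %>% dplyr::select(person_id, birth_datetime),
        by = "person_id"
    ) %>%
    dplyr::collect() %>%
    dplyr::mutate(
        age = round(as.numeric(difftime(visit_start_datetime, birth_datetime, units = "days")) / 365.25, 1)
    ) %>%
    dplyr::group_by(person_id) %>%
    dplyr::summarize(max_age = max(age), .groups = "drop") %>%
    dplyr::filter(max_age <= 50) %>%
    dplyr::pull(person_id)

remove_patients_from_subset(patients = patients_to_remove)
```

Filtering on the maximum age rather than on each visit avoids removing a patient who has one visit before 50 and another after.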

Integration in Plugins

It would be interesting to create an Individual Data plugin allowing patient exclusion with one or more exclusion criteria, created by the user.

We could imagine that this plugin removes patients from the “Included patients” group, and adds them to the “Excluded patients” group.

To do this, it would be enough to use the functions add_patients_to_subset and remove_patients_from_subset from the plugin.

To retrieve the IDs of patient groups, use the m$subsets variable.

All that’s left is to create the plugin!

Conclusion

We've seen how to create patient groups, and how to add and remove patients from a group.

A graphical interface will soon be developed to facilitate adding and removing patients.


We'll subsequently focus on more technical LinkR functionalities, namely:

If you don't wish to explore these functionalities, you only need to read the chapters related to content sharing:

7 - Using the Console

Using the R or Python console in LinkR

Introduction

The LinkR console allows you to execute R, Python, or shell (Bash) code.


We will see how to:

  • Use the console in LinkR
  • Query OMOP data
  • Query other variables in the application

Using the Console

The Console page is accessible from any page in the application by clicking the corresponding link at the top of the screen.

You arrive at a page with:

  • The choice of programming language and output
  • A code editor
  • A block where the result of code execution will be displayed

Start by choosing the programming language and output to use.

We’ll begin by looking at the different outputs available in R:

  • Console: the displayed result will be what would be shown in the R console
  • Figure: corresponds to a plotOutput from Shiny
  • Table: corresponds to a tableOutput from Shiny
  • DataTable: corresponds to a DT::DTOutput from DT and Shiny
  • UI (HTML): the result will be displayed in a uiOutput from Shiny
  • RMarkdown: corresponds to the HTML output from converting the RMarkdown file presented in the console

We’ll look at the outputs with examples. For this, load the LinkR Demo project to load the MIMIC-IV data.

Shortcuts


You can use these shortcuts when your cursor is in the code editor:

  • CTRL/CMD + ENTER without selection: executes the line (or the code block to which the line belongs)
  • CTRL/CMD + ENTER with selected text: executes only the selected text
  • CTRL/CMD + SHIFT + ENTER: executes all code in the editor

R - Console

Select the “Console” output, write d$person in the text editor and execute the code (with the Execute button or with the shortcut).

R - Figure

Here’s an example for creating a histogram showing patient ages from the dataset associated with the project.

Select the “Figure” output and execute this code. Note that you can configure the dimensions and resolution of the image.

d$visit_occurrence %>%
    dplyr::left_join(
        d$person %>% dplyr::select(person_id, birth_datetime),
        by = "person_id"
    ) %>%
    dplyr::collect() %>%
    dplyr::mutate(
        age = round(
            as.numeric(
                difftime(visit_start_datetime, birth_datetime, units = "days")
            ) / 365.25, 1
        )
    ) %>%
    ggplot2::ggplot(ggplot2::aes(x = age)) +
    ggplot2::geom_histogram(binwidth = 5, fill = "#0084D8", color = "white") +
    ggplot2::labs(
        x = "Age (years)",
        y = "Frequency"
      ) +
    ggplot2::theme_minimal() +
    ggplot2::theme(
        plot.title = ggplot2::element_text(size = 16, face = "bold"),
        axis.title = ggplot2::element_text(size = 14)
    )

R - Table

The “Table” output displays the entire dataframe using Shiny’s tableOutput function.

R - DataTable

The “DataTable” output uses the DT library to display results, which provides better display than “Table” with a pagination system.

Note that data must be in memory to be displayed.

You’ll need to write:

d$person %>% dplyr::collect()

R - RMarkdown

The “RMarkdown” output interprets the code as a .Rmd file.

The Markdown will be converted to HTML and the R code will be interpreted.

Python - Console

A simple example of using the console in Python:

import numpy as np

# Generate random data
data = np.random.normal(loc=50, scale=10, size=1000)

print(data)

Python - Matplotlib

It’s also possible to use the Matplotlib library to create figures:

import matplotlib.pyplot as plt
import numpy as np

# Generate random data
data = np.random.normal(loc=50, scale=10, size=1000)

# Create the histogram
plt.hist(data, bins=20, color="#0084D8", edgecolor="white")

# Add labels and title
plt.title("Value Distribution")
plt.xlabel("Values")
plt.ylabel("Frequency")

# Display the plot
plt

Querying OMOP Data

Once data is loaded (by loading a project associated with a dataset or by loading a dataset directly), the data becomes accessible from the console through two methods:

  • Through R variables prefixed with d$ (d for data)
  • Through SQL queries using the R function get_query

All OMOP tables are available through both methods.

R Variables

We’ve seen in the examples above the use of variables prefixed with d$ (d$person, d$measurement, etc).

This data is loaded in lazy format, which means it’s not loaded into memory (allowing quick display of the variable even if it contains billions of rows). Once filtering operations are performed, it’s possible to collect this data into memory with the dplyr::collect() function.

In the following example, we filter data from the Measurement table on person_id 2562658 and on measurement_concept_id 3027018 (corresponding to the LOINC concept - Heart rate).

d$measurement %>%
    dplyr::filter(
        person_id == 2562658,
        measurement_concept_id == 3027018
    ) %>%
    dplyr::collect()
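To illustrate the difference between a lazy table and collected data, here is a minimal sketch outside LinkR. It assumes the d$ tables behave like dbplyr lazy tables backed by a database connection; the in-memory DuckDB database, the toy values and the variable names are made up for the example, and the DBI, duckdb and dbplyr packages must be installed.

```r
library(dplyr)

# In-memory DuckDB database with a toy measurement table (all values are made up)
con <- DBI::dbConnect(duckdb::duckdb())
DBI::dbWriteTable(con, "measurement", data.frame(
    person_id = c(1L, 1L, 2L),
    measurement_concept_id = c(3027018L, 3027018L, 3027018L),
    value_as_number = c(72, 80, 65)
))

# Lazy table: the filter is only translated to SQL, no rows are read yet
lazy_hr <- dplyr::tbl(con, "measurement") %>%
    dplyr::filter(person_id == 1L, measurement_concept_id == 3027018L)

is_lazy <- inherits(lazy_hr, "tbl_lazy")  # still a query, not data
dplyr::show_query(lazy_hr)                # prints the generated SQL

# collect() runs the query and returns an in-memory tibble
hr <- dplyr::collect(lazy_hr)
print(hr)

DBI::dbDisconnect(con, shutdown = TRUE)
```

Filtering before calling collect() keeps the work in the database, so only the matching rows are brought into memory.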

Note that these tables are also available for the selected patient group (or subset), with the d$data_subset list.

This will display the list of patients in the selected group:

d$data_subset$person

SQL Queries

For interoperability purposes, it’s necessary to be able to query OMOP tables in SQL.

When you import data into LinkR, it’s always a database connection (even when you import SQL or Parquet files, which are read as a DuckDB connection).

It’s possible to query data in SQL via the R function get_query.

This code displays all data from the person table:

get_query("SELECT * FROM person")

You can also use SQL queries from the tutorial usual OMOP queries.

This query, extracted from this tutorial, allows you to get patient ages:

get_query("
    SELECT 
        v.visit_occurrence_id,
        v.person_id,
        ROUND(
            EXTRACT(EPOCH FROM (
                CAST(v.visit_start_datetime AS TIMESTAMP) - 
                CAST(p.birth_datetime AS TIMESTAMP)
            )) / (365.25 * 86400), 
            1
        ) AS age
    FROM 
        visit_occurrence v
    LEFT JOIN 
        (SELECT person_id, birth_datetime FROM person) p
    ON 
        v.person_id = p.person_id;
")

Querying Other Variables

Other variables, prefixed with m$, can be queried from the console. These variables are intended to be used in plugins.

Here’s the list:

  • m$selected_subset: displays the ID of the patient group selected via the dropdown menu
  • m$selected_person: ID of the patient selected via the dropdown menu
  • m$selected_visit_detail: ID of the hospital stay selected via the dropdown menu
  • m$subsets: list of patient groups available for the open project
  • m$subset_persons: list of patients belonging to the selected patient group (subset)

Conclusion

We've seen how the LinkR console allows us to execute R or Python queries, choosing the output format (console, figure, data table, etc.).


We've also seen how to handle data from a loaded project, which will be useful for creating our first plugin in the next section.

8 - Creating a Plugin

How to create plugins to add functionality to LinkR

Plugins are what enable you to visualize and analyze data using a low-code interface.


They are scripts written in R and/or Python, which use the Shiny library to create the graphical interface.

They allow you to create any visualization or perform any analysis on the data, as long as it's possible in R or Python.


You saw in the demo the use of several plugins, notably the "Timeline continuous var." plugin which displays continuous data in timeline format.


LinkR continuously evolves thanks to plugins created by its user community.


We will see:

  • How to create a simple plugin
  • How to create more complex plugins using the development template

8.1 - Creating a Simple Plugin

Creating a simple plugin to display data as a histogram

Introduction

Let's start by creating a simple plugin.


We will:

  • Define the plugin specifications: what do we want our plugin to offer users?
  • Create the plugin and understand its file structure
  • Create the user interface
  • Develop the server-side logic
  • Create translations to enable its use in multiple languages
  • See how to share the plugin with the user community

Plugin Specifications

We will create a graphical interface that will allow us to visualize the distribution of a variable in the form of a histogram.

We need to make an initial choice: is this a plugin for individual data (patient by patient) or aggregated data (on a group of patients)?

It is more common to want to visualize the distribution of a variable on a group of patients rather than on a single patient, so we will create an aggregated data plugin.

Next, what should our graphical interface look like?

We will split the screen in two: on the left we will visualize our histogram, and on the right we will be able to adjust the figure parameters, with a dropdown menu to choose the variable and a field to choose the number of bars in our histogram.

Server side now.

A histogram is not suitable for visualizing all types of data: we can visualize the distribution of numerical data, and categorical data provided that the number of categories is not too large.

To simplify, we will only allow the display of numerical data. We will therefore restrict the display to the OMOP measurement table.

When we change the number of bars in the numeric field, the change must only be applied after validation, to avoid unnecessary calculations. We will also need to bound the possible values.

Let’s summarize the specifications of our plugin:

  • User interface (UI) side:
    • Histogram visualization on the left side of the screen
    • Parameters on the right side of the screen
      • Variable to display
      • Number of bars composing the histogram, with lower and upper bounds
      • Validation of modifications
  • Server side:
    • Only allow data from the measurement table
    • Modify the number of bars in the histogram according to the value entered in the numeric field
    • Launch the figure code once the validation button is clicked
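The bounds requirement can be sketched in plain R before touching Shiny. This is only an illustration: clamp_bins is a hypothetical helper, the bounds of 10 and 100 are example values, and the random data stands in for measurement values.

```r
# Example bounds for the number of bars (illustrative values)
min_bins <- 10
max_bins <- 100

# Hypothetical helper: keep the requested number of bars within the bounds
clamp_bins <- function(n) {
    max(min_bins, min(max_bins, as.integer(n)))
}

# Toy numeric data standing in for measurement values
set.seed(42)
values <- rnorm(1000, mean = 80, sd = 10)

# An out-of-range request is brought back to the upper bound
num_bins <- clamp_bins(250)

# Equal-width breaks give exactly num_bins bars
breaks <- seq(min(values), max(values), length.out = num_bins + 1)
h <- hist(values, breaks = breaks, plot = FALSE)

print(length(h$counts))
```

Passing explicit breaks rather than a bare number matters here, because hist() treats a single number of breaks as a suggestion only.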

Creating the Plugin

Go to the plugins page from the menu at the top of the screen.

To create a plugin, click on the “+” icon on the left side of the screen.

Choose a name, for example “Histogram”.

Choose the type of data concerned: the plugin can be used on individual data (patient by patient), aggregated data (a group of patients), or both. For our example, we will choose “Aggregated data”.

It is also possible to copy an existing plugin: we will see this in the next section, when we create a plugin from the development template.

Once the plugin is created, select it. You will arrive at the plugin summary page.

You can see in the top right that a plugin is divided into four tabs:

  • Summary: displays the general information and description of your plugin. We detail this in the last paragraph: “Sharing the plugin”.
  • Code: this is where we will edit the scripts to create the frontend (user interface) and backend (server-side logic) of our plugin (see the next three paragraphs).
  • Test: this tab allows you to test the plugin code with data
  • Share: this is where you can add this plugin to your Git repository to share it with the rest of the community

Plugin Structure

Go to the Code tab.

A plugin is by default composed of these three files:

  • ui.R: contains the Shiny code for the user interface, which we will detail in the next paragraph
  • server.R: contains the backend of the application, which we will detail in the “Server / backend” paragraph
  • translations.csv: contains the translations for the frontend and backend

UI - User Interface / Frontend

As we saw in the diagram above, we want to split the plugin screen in two, with the figure on the left and the figure parameters on the right.

Start by clicking on the ui.R file on the left side of the screen.

All our user interface code must be within a tagList function, which allows you to put HTML tags together using the R Shiny library.

For two div elements to be side by side, they must themselves be in a div with the attribute style = "display: flex;".

tagList(
    div(
        div(
            # Each id is in an ns function, and includes a %widget_id% tag
            id = ns("split_layout_left_%widget_id%"),
            style = "margin:10px 5px; width:50%; border:dashed 1px;"
        ),
        div(
            id = ns("split_layout_right_%widget_id%"),
            style = "margin:10px 5px; width:50%; border:dashed 1px;"
        ),
        style = "display: flex; height: 100%;" # Allows displaying the two divs above side by side
    )
)

Note that whenever we assign an ID to an HTML element, it must include a %widget_id% tag, which is replaced by the widget ID so that every ID is unique. Without it, launching the same plugin in two different widgets would produce duplicate IDs, and when IDs are duplicated the HTML page does not display.
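The effect of the %widget_id% tag can be illustrated with a plain string substitution. How LinkR performs the replacement internally is not described here; fill_widget_id is a hypothetical helper and the widget IDs 12 and 13 are made up.

```r
# Hypothetical helper: substitute the %widget_id% tag in a piece of plugin code
fill_widget_id <- function(code, widget_id) {
    gsub("%widget_id%", as.character(widget_id), code, fixed = TRUE)
}

ui_code <- 'div(id = ns("split_layout_left_%widget_id%"))'

# The same plugin code used by two different widgets
code_widget_12 <- fill_widget_id(ui_code, 12)
code_widget_13 <- fill_widget_id(ui_code, 13)

print(code_widget_12)
print(code_widget_13)
```

Each widget ends up with its own element ID, so two instances of the same plugin can coexist on one page.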

Moreover, each ID is encapsulated in an ns function (see the chapter on Shiny modules in the Mastering Shiny book for more information).

We have added borders to our blocks (div) with border:dashed 1px; to visualize our blocks, which are currently empty. We will remove these attributes later.

Click on the “Run plugin” icon on the left side of the screen (or use the shortcut CMD/CTRL + SHIFT + ENTER).

You will be automatically switched to the “Test” tab and you should get this result.

We can clearly see the two div blocks side by side, with a dotted border.

Now let’s add our histogram.

We use the plotOutput function for this, which we will modify with renderPlot on the server side to display our plot.

div(
    id = ns("split_layout_left_%widget_id%"),
    plotOutput(ns("plot_%widget_id%")), # Always put IDs in ns() with a %widget_id% attribute
    style = "margin:10px 5px; width:50%; border:dashed 1px;"
)

Now let’s create the configuration for our figure, in the right block.

We said above that we wanted three elements:

  • a dropdown menu to choose the variable to display
  • a numeric field to choose the number of bars in the histogram to display
  • a button to display the figure with the chosen parameters

We will use the shiny.fluent library, which is built on Fluent UI and is used for the entire LinkR user interface.

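Here are the functions to use for our three elements:

  • shiny.fluent::Dropdown.shinyInput(): the dropdown menu
  • shiny.fluent::SpinButton.shinyInput(): the numeric field
  • shiny.fluent::PrimaryButton.shinyInput(): the button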

In plugins, you must prefix all functions with the library name. For example: shiny.fluent::Dropdown.shinyInput().

Let’s create the code to display the configuration elements for the figure.

div(
    # id with ns and %widget_id%
    id = ns("split_layout_right_%widget_id%"),

    # div containing the title, in bold (strong), with 10px space between title and dropdown
    div(strong(i18np$t("concept")), style = "margin-bottom:10px;"),
    # Dropdown menu with concepts
    div(shiny.fluent::Dropdown.shinyInput(ns("concept_%widget_id%")), style = "width:300px;"), br(),

    # Numeric field to choose the number of bars in the histogram
    # With a value of 50, minimum of 10 and maximum of 100
    div(strong(i18np$t("num_bins")), style = "margin-bottom:10px;"),
    div(shiny.fluent::SpinButton.shinyInput(ns("num_bins_%widget_id%"), value = 50, min = 10, max = 100), style = "width:300px;"), br(),

    # Button to display the figure
    shiny.fluent::PrimaryButton.shinyInput(ns("show_plot_%widget_id%"), i18np$t("show_plot")),
    style = "margin: 10px 5px; width:50%;"
)

Click “Run plugin” again, you should get this.

You can notice that the input titles have been put in an i18np$t function. This allows translating elements from the translation file (translations.csv), which we will see in the next paragraph.

Here is the complete code for the user interface:

tagList(
    div(
        div(
            # Each id is in an ns function, and includes a %widget_id% tag
            id = ns("split_layout_left_%widget_id%"),
            plotOutput(ns("plot_%widget_id%")),
            style = "margin:10px 5px; width:50%; border:dashed 1px;"
        ),
        div(
            # id with ns and %widget_id%
            id = ns("split_layout_right_%widget_id%"),
        
            # div containing the title, in bold (strong), with 10px space between title and dropdown
            div(strong(i18np$t("concept")), style = "margin-bottom:10px;"),
            # Dropdown menu with concepts
            div(shiny.fluent::Dropdown.shinyInput(ns("concept_%widget_id%")), style = "width:300px;"), br(),
        
            # Numeric field to choose the number of bars in the histogram
            # With a value of 50, minimum of 10 and maximum of 100
            div(strong(i18np$t("num_bins")), style = "margin-bottom:10px;"),
            div(shiny.fluent::SpinButton.shinyInput(ns("num_bins_%widget_id%"), value = 50, min = 10, max = 100), style = "width:300px;"), br(),
        
            # Button to display the figure
            shiny.fluent::PrimaryButton.shinyInput(ns("show_plot_%widget_id%"), i18np$t("show_plot")),
            style = "margin: 10px 5px; width:50%;"
        ),
        style = "display: flex; height: 100%;" # Allows displaying the two divs above side by side
    )
)

Translations

Translations are to be inserted in the CSV file translations.csv.

It includes the following columns:

  • base: this is the keyword that you will insert in your code and that will be translated according to the selected language
  • en: this is the translation of the word in English
  • fr: French translation. For now, only English and French have been developed. It will be possible to add other languages in the future.

Click on the translations.csv file, then complete it with the following translations.

base,en,fr
concept,Concept to show,Concept à afficher
num_bins,Number of bins,Nombre de barres
show_plot,Show plot,Afficher la figure

Rerun the code. You should get this.

The keywords have been replaced by their French translation.
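To make the role of the base column concrete, here is a minimal sketch of a keyword lookup in plain R. The translate function is a hypothetical stand-in for what i18np$t() does in a plugin, not its actual implementation, and falling back to the keyword itself is an assumption.

```r
# Same content as translations.csv, inlined so the sketch is self-contained
csv_text <- "base,en,fr
concept,Concept to show,Concept à afficher
num_bins,Number of bins,Nombre de barres
show_plot,Show plot,Afficher la figure"

translations <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Hypothetical lookup: find the row for the keyword, return the chosen language
translate <- function(key, language = "en") {
    row <- translations[translations$base == key, ]
    if (nrow(row) == 0) {
        return(key)  # assumption: unknown keywords are returned unchanged
    }
    row[[language]]
}

print(translate("num_bins", "en"))
print(translate("show_plot", "fr"))
```

The keyword in the base column is the only thing that appears in the plugin code, so adding a language means adding a column, not editing ui.R or server.R.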

We will now make all this dynamic by coding the backend!

Server-side logic / backend

Without the backend, the graphical interface is static: nothing happens when you click on the buttons.

As we saw in the documentation page for creating widgets, when we create a widget, we select the plugin to use as well as the concepts.

The selected concepts will be found in the selected_concepts variable, which includes the following columns:

  • concept_id: the concept ID, either a standard one (searchable on ATHENA) or a non-standard, locally defined one (in which case it is greater than 2,000,000,000, i.e. 2 billion)
  • concept_name: the concept name
  • domain_id: the name of the OMOP Domain, which often corresponds to the OMOP table (‘Measurement’ domain for the measurement table)
  • vocabulary_id: the name of the terminology corresponding to the concept
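
As an illustration of this structure, the sketch below builds a mock of `selected_concepts` in base R. A plain data.frame stands in for the tibble LinkR provides, all values are sample data, and the second row is a made-up local concept that exists only for this example.

```r
# Mock of selected_concepts (sample values, for illustration only)
selected_concepts <- data.frame(
  concept_id = c(3027018L, 2000000001L),
  concept_name = c("Heart rate", "Local heart rate code"),  # second name is hypothetical
  domain_id = c("Measurement", "Measurement"),
  vocabulary_id = c("LOINC", "Local vocabulary"),           # second vocabulary is hypothetical
  stringsAsFactors = FALSE
)

# Flag the concepts whose ID is above 2 billion (the locally defined ones)
selected_concepts$is_custom <- selected_concepts$concept_id > 2000000000

selected_concepts[, c("concept_id", "concept_name", "is_custom")]
```

Such a mock is handy for trying out server code in a plain R console before running it inside LinkR.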

To be able to test a plugin and make the backend work, you need to load a project containing data. Launch for example the project used for setup (LinkR Demo).

Then, to simulate widget creation, we will select concepts to test our plugin.

You have a “Select concepts” button on the left side of the screen.

This will open the same menu as when you choose concepts when creating a widget.

Select for example the Heart rate concept from the LOINC terminology, then click “Validate”.

Let’s test: open the server.R file, and copy this code:

print(selected_concepts)

Rerun the plugin code. You should get this.

You see the backend output appear at the bottom of the screen. This is only the case with plugin testing, which makes debugging easier. This output is hidden when plugins are used in projects.

So we can clearly see the Heart rate concept appear with its concept_id.

We will now write the code to update the concepts dropdown menu.

# Adding a row with values 0 / "none"
concepts <-
    tibble::tibble(concept_id = 0L, concept_name = i18np$t("none")) %>%
    dplyr::bind_rows(selected_concepts %>% dplyr::select(concept_id, concept_name))

# We convert the concepts to list format
concepts <- convert_tibble_to_list(concepts, key_col = "concept_id", text_col = "concept_name")

# We establish a delay, so that the dropdown updates after being created
shinyjs::delay(500, shiny.fluent::updateDropdown.shinyInput(session, "concept_%widget_id%", options = concepts, value = 0L))

Update translations.csv to add the translation for none.

base,en,fr
concept,Concept to show,Concept à afficher
num_bins,Number of bins,Nombre de barres
show_plot,Show plot,Afficher la figure
none,None,Aucun

Several things to note.

We add a row with an empty concept, ’none’, which will help avoid errors if the dropdown menu is empty.

We use the convert_tibble_to_list function to convert the tibble to a list, which is the format a shiny.fluent input expects for its options. Its arguments are key_col, the column containing the concept code (‘concept_id’), and text_col, the column containing the displayed text (‘concept_name’).
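
To show the kind of structure a Fluent UI dropdown expects, here is a minimal sketch of such a conversion. The `tibble_to_options()` function is a hypothetical stand-in written for this example; the real convert_tibble_to_list may differ in its details.

```r
# Hypothetical stand-in for convert_tibble_to_list():
# builds one list(key = ..., text = ...) element per row
tibble_to_options <- function(data, key_col, text_col) {
  lapply(seq_len(nrow(data)), function(i) {
    list(key = data[[key_col]][[i]], text = data[[text_col]][[i]])
  })
}

# Sample concepts, including the "none" row with key 0
concepts <- data.frame(
  concept_id = c(0L, 3027018L),
  concept_name = c("None", "Heart rate"),
  stringsAsFactors = FALSE
)

dropdown_options <- tibble_to_options(
  concepts,
  key_col = "concept_id",
  text_col = "concept_name"
)

# Inspect the nested list: two elements, each with a key and a text
str(dropdown_options)
```

The `value = 0L` passed to the update function then matches the `key` of the first element, so the dropdown starts on "None".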

We delay the update by 500 ms with shinyjs::delay(). This ensures that the dropdown has been created in the UI before it is updated.

Execute this code. You should now have a dropdown menu with the concepts we selected (in this case Heart rate).

We just need to display our figure.

We will use the observe_event function, a modified version of Shiny's observeEvent function. Unlike the original, it prevents the application from crashing when the observer code raises an error, and it adds the error message to the application log.

This function will trigger the code after detecting an event.

observe_event(input$show_plot_%widget_id%, {

    # The code in this function will be executed
    # each time I click on the button with id 'show_plot_%widget_id%'
    # (so the "Show plot" button)
})
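
The error handling described above can be approximated with `tryCatch()`. The sketch below is not LinkR's implementation of observe_event: `run_safely()` and `error_log` are hypothetical names, and the handler is called directly rather than by a Shiny event, so the code runs in a plain R console.

```r
# Hypothetical in-memory log standing in for the application log
error_log <- character(0)

# Run a handler; on error, record the message instead of stopping the program
run_safely <- function(handler) {
  tryCatch(
    {
      handler()
      TRUE
    },
    error = function(e) {
      error_log <<- c(error_log, conditionMessage(e))
      FALSE
    }
  )
}

ok <- run_safely(function() 1 + 1)                         # TRUE, nothing logged
failed <- run_safely(function() stop("concept not found")) # FALSE, message logged

error_log # "concept not found"
```

The point of the pattern is that a bug in one observer only produces a log entry, while the rest of the application keeps running.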

In the case of plugin editing, each time you click “Run plugin”, previously created observers will be invalidated, which avoids conflicts.

Here are the steps of our code:

    1. Retrieve the selected concept from the dropdown menu
    2. Ensure that the concept belongs to a domain that can be displayed as a histogram. For simplicity, we will select only the ‘Measurement’ domain.
    3. Ensure that the data tibble, filtered with the selected concept, is not empty
    4. Create the code for our histogram with ggplot
    5. Update our output

observe_event(input$show_plot_%widget_id%, {

    # 1) Retrieve the selected concept from the dropdown menu
    selected_concept <-
        selected_concepts %>%
        dplyr::filter(concept_id == input$concept_%widget_id%)

    no_data_available <- TRUE

    # 2) Is there a concept selected and is the domain_id equal to 'Measurement'?
    if (nrow(selected_concept) > 0 && selected_concept$domain_id == "Measurement"){

        # 3) Ensure that the data tibble filtered on this concept is not empty
        data <-
            d$measurement %>%
            dplyr::filter(measurement_concept_id == selected_concept$concept_id)

        if(data %>% dplyr::count() %>% dplyr::pull() > 0){

            # 4) Create the histogram code
            plot <-
                data %>%
                ggplot2::ggplot(ggplot2::aes(x = value_as_number)) +
                # We take into account the number of bars from our variable input$num_bins_%widget_id%
                ggplot2::geom_histogram(colour = "white", fill = "#377EB8", bins = input$num_bins_%widget_id%) +
                ggplot2::theme_minimal() +
                # We modify the X and Y axis titles
                ggplot2::labs(x = selected_concept$concept_name, y = i18np$t("occurrences"))

            no_data_available <- FALSE
        }
    }

    # Empty graph if no data available
    if (no_data_available){
            plot <-
                ggplot2::ggplot() +
                ggplot2::theme_void() +
                ggplot2::labs(title = i18np$t("no_data_available"))
    }
    
    # 5) Output update
    output$plot_%widget_id% <- renderPlot(plot)
})

Update the translations.

base,en,fr
concept,Concept to show,Concept à afficher
num_bins,Number of bins,Nombre de barres
show_plot,Show plot,Afficher la figure
none,None,Aucun
occurrences,Occurrences,Occurrences
no_data_available,No data available,Pas de données disponibles

You should have this.

We visualize the distribution of heart rate among all patients, from the d$measurement variable.
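
Steps 1 to 3 of the observer can also be checked outside LinkR with mock data. The sketch below uses base R; `has_plottable_data()` is a hypothetical helper, and the two data frames are sample stand-ins for `selected_concepts` and `d$measurement`.

```r
# Mock inputs (sample values, for illustration only)
selected_concepts <- data.frame(
  concept_id = 3027018L,
  concept_name = "Heart rate",
  domain_id = "Measurement",
  vocabulary_id = "LOINC",
  stringsAsFactors = FALSE
)
measurement <- data.frame(
  measurement_concept_id = c(3027018L, 3027018L, 3027018L),
  value_as_number = c(72, 88, 65)
)

# Hypothetical helper mirroring steps 1 to 3 of the observer
has_plottable_data <- function(dropdown_value, selected_concepts, measurement) {
  # 1) Retrieve the concept matching the dropdown value
  concept <- selected_concepts[selected_concepts$concept_id == dropdown_value, ]
  # 2) A concept must be selected and belong to the 'Measurement' domain
  if (nrow(concept) == 0 || concept$domain_id[1] != "Measurement") return(FALSE)
  # 3) At least one measurement row must exist for this concept
  any(measurement$measurement_concept_id == concept$concept_id[1])
}

has_plottable_data(3027018L, selected_concepts, measurement) # TRUE
has_plottable_data(0L, selected_concepts, measurement)       # FALSE: the "none" entry
```

When the helper returns FALSE, the observer falls back to the empty plot titled with the no_data_available translation.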

Here are the three complete files.

ui.R:

tagList(
    div(
        div(
            # Each id is in an ns function, and includes a %widget_id% tag
            id = ns("split_layout_left_%widget_id%"),
            plotOutput(ns("plot_%widget_id%")),
            style = "margin:10px 5px; width:50%; border:dashed 1px;"
        ),
        div(
            # id with ns and %widget_id%
            id = ns("split_layout_right_%widget_id%"),
        
            # div containing the title, in bold (strong), with 10px space between title and dropdown
            div(strong(i18np$t("concept")), style = "margin-bottom:10px;"),
            # Dropdown menu with concepts
            div(shiny.fluent::Dropdown.shinyInput(ns("concept_%widget_id%")), style = "width:300px;"), br(),
        
            # Numeric field to choose the number of bars in the histogram
            # With a value of 50, minimum of 10 and maximum of 100
            div(strong(i18np$t("num_bins")), style = "margin-bottom:10px;"),
            div(shiny.fluent::SpinButton.shinyInput(ns("num_bins_%widget_id%"), value = 50, min = 10, max = 100), style = "width:300px;"), br(),
        
            # Button to display the figure
            shiny.fluent::PrimaryButton.shinyInput(ns("show_plot_%widget_id%"), i18np$t("show_plot")),
            style = "margin: 10px 5px; width:50%;"
        ),
        style = "display: flex; height: 100%;" # Allows displaying the two divs above side by side
    )
)

server.R:

# Adding a row with values 0 / "none"
concepts <-
    tibble::tibble(concept_id = 0L, concept_name = i18np$t("none")) %>%
    dplyr::bind_rows(selected_concepts %>% dplyr::select(concept_id, concept_name))

# We convert the concepts to list format
concepts <- convert_tibble_to_list(concepts, key_col = "concept_id", text_col = "concept_name")

# We establish a delay, so that the dropdown updates after being created
shinyjs::delay(500, shiny.fluent::updateDropdown.shinyInput(session, "concept_%widget_id%", options = concepts, value = 0L))

# Code that will be executed when the user presses the input show_plot_%widget_id%
observe_event(input$show_plot_%widget_id%, {

    # 1) Retrieve the selected concept from the dropdown menu
    selected_concept <-
        selected_concepts %>%
        dplyr::filter(concept_id == input$concept_%widget_id%)

    no_data_available <- TRUE

    # 2) Is there a concept selected and is the domain_id equal to 'Measurement'?
    if (nrow(selected_concept) > 0 && selected_concept$domain_id == "Measurement"){

        # 3) Ensure that the data tibble filtered on this concept is not empty
        data <-
            d$measurement %>%
            dplyr::filter(measurement_concept_id == selected_concept$concept_id)

        if(data %>% dplyr::count() %>% dplyr::pull() > 0){

            # 4) Create the histogram code
            plot <-
                data %>%
                ggplot2::ggplot(ggplot2::aes(x = value_as_number)) +
                # We take into account the number of bars from our variable input$num_bins_%widget_id%
                ggplot2::geom_histogram(colour = "white", fill = "#377EB8", bins = input$num_bins_%widget_id%) +
                ggplot2::theme_minimal() +
                # We modify the X and Y axis titles
                ggplot2::labs(x = selected_concept$concept_name, y = i18np$t("occurrences"))

            no_data_available <- FALSE
        }
    }

    # Empty graph if no data available
    if (no_data_available){
            plot <-
                ggplot2::ggplot() +
                ggplot2::theme_void() +
                ggplot2::labs(title = i18np$t("no_data_available"))
    }
    
    # 5) Output update
    output$plot_%widget_id% <- renderPlot(plot)
})

translations.csv:

base,en,fr
concept,Concept to show,Concept à afficher
num_bins,Number of bins,Nombre de barres
show_plot,Show plot,Afficher la figure
none,None,Aucun
occurrences,Occurrences,Occurrences
no_data_available,No data available,Pas de données disponibles

Writing this code is not trivial and requires a working knowledge of the Shiny library. To learn more about Shiny, we recommend the excellent book Mastering Shiny.

Sharing the plugin

Before sharing the plugin, it is necessary to document it, so that users know what it is for and how to use it.

For this, go to the plugin Summary page. You see that the “Short description” fields on the left and “Description” on the right are empty.

Click on the “Edit information” button on the left side of the screen.

You can then modify the information related to the plugin, including the authors who helped in its design and a short description, which will be displayed on the plugins page.

We could for example give this short description to our plugin: “A plugin allowing visualization of structured data as a histogram”.

You can also edit the Complete description by clicking on the icon at the top right of the screen.

This will open an editor where you can write the description in Markdown format.

Once the description modifications are validated, click on the “Save modifications” icon on the right side of the screen.

To validate the plugin information modifications, click on the “Save” icon on the left side of the screen.

Now that your plugin information is well documented, you will be able to share it via the “Share” tab, at the top right of the screen, by following this tutorial.

Conclusion

You have just created your first plugin! You can now use it in a project and, above all, keep improving it.


The advantage of plugins is that anything that can be done in R or Python can be integrated into LinkR as a plugin!


Plugins quickly become complex, which is why we created a development template to have a solid and common base for coding more complex plugins. This is what we will see in the next chapter.

8.2 - Advanced

9 - Creating Preprocessing Scripts

How to create and apply preprocessing scripts to ensure data quality

This feature will be available in version 0.4.

10 - Vocabularies

How to import vocabularies, from Athena to LinkR

Vocabularies in OMOP

Two types of vocabularies are used in the OMOP common data model:

  • Standard vocabularies, which are international reference vocabularies. These include:

    • LOINC for laboratory data and vital signs
    • SNOMED for diagnoses and procedures
    • RxNorm for prescriptions

  • Non-standard vocabularies, which are often, but not always, international. They are included because they are widely used, even though they are non-standard. Examples include:

    • ICD-10 for diagnoses
    • CCAM, a French terminology for medical procedures

Both standard and non-standard vocabularies can be used in the OMOP data model. Standard concepts will be found in the _concept_id columns, while non-standard concepts will appear in the _source_concept_id columns. You should aim to use standard concepts as much as possible during the ETL process.

ATHENA

ATHENA is a vocabulary querying platform provided by OHDSI.

It allows you to search for concepts across all OMOP vocabularies using filters.

By clicking on the Download tab at the top of the page, you can download the vocabularies of your choice.

Start by deselecting all vocabularies by clicking on the checkbox at the top left of the screen, then select the vocabularies you wish to download.

For example, we will download the LOINC vocabulary.

Check the LOINC vocabulary box, then click on Download vocabularies at the top right.

Note that some vocabularies are not public and require a license to download.

Choose a name for the bundle, then click on Download. The site will indicate that the bundle is being created, which can take a few minutes as the server processes the SQL query to generate the CSV files and ZIP file.

Next, click on “Show history” and then “Download” to retrieve your bundle.

You will download a ZIP file containing one CSV file per vocabulary table (“VOCABULARY.csv,” “CONCEPT.csv,” etc.).
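
If you want to inspect a bundle in R before importing it, the sketch below may help. It assumes the files are tab-separated despite their .csv extension, which is typically the case for ATHENA bundles; check your own files. So that the example runs on its own, it first writes a one-row sample file to a temporary folder; the row is sample data written for illustration.

```r
# Write a tiny tab-separated sample file (stand-in for a real CONCEPT.csv)
path <- file.path(tempdir(), "CONCEPT.csv")
columns <- c(
  "concept_id", "concept_name", "domain_id", "vocabulary_id",
  "concept_class_id", "standard_concept", "concept_code",
  "valid_start_date", "valid_end_date", "invalid_reason"
)
sample_row <- c(
  "3027018", "Heart rate", "Measurement", "LOINC",
  "Clinical Observation", "S", "8867-4",
  "19700101", "20991231", ""
)
writeLines(
  c(paste(columns, collapse = "\t"), paste(sample_row, collapse = "\t")),
  path
)

# Read it back: tab separator, and quoting disabled so that quote
# characters inside concept names are kept as plain text
concept <- read.delim(path, quote = "", stringsAsFactors = FALSE)

concept[, c("concept_id", "concept_name", "vocabulary_id", "concept_code")]
```

With a real bundle, replace `path` with the location of the extracted CONCEPT.csv file and skip the `writeLines()` step.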

Importing vocabularies into LinkR

Now, all that remains is to import the vocabulary into LinkR.

To do so, go to the “Vocabularies” page, accessible from the homepage or via the link at the top of the page.

Then click on the “Import concepts or vocabularies” button in the sidebar.

Select either the ZIP file or the individual CSV files.

Click on “Import.” Done! We have successfully imported LOINC into LinkR.

Querying vocabularies in LinkR

Navigate to the application’s database page via the tab at the top right of the screen.

Go to the “Query the database” tab at the top right of the screen and select the option.

At the top right of the screen, select “Public DB.”

You can query the concept tables using SQL:
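
As an example, the sketch below looks up heart rate concepts in the LOINC vocabulary. Within LinkR you would only type the SQL statement into the query editor. Here it is wrapped in R with an in-memory SQLite database so that it can run on its own; this assumes the DBI and RSQLite packages are installed, that the table is named `concept` as in the OMOP vocabulary tables, and the two rows are sample data.

```r
# In-memory database standing in for LinkR's public database
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Sample rows, for illustration only
DBI::dbWriteTable(con, "concept", data.frame(
  concept_id = c(3027018L, 3020891L),
  concept_name = c("Heart rate", "Body temperature"),
  domain_id = c("Measurement", "Measurement"),
  vocabulary_id = c("LOINC", "LOINC"),
  stringsAsFactors = FALSE
))

# The SQL statement itself: filter on vocabulary and concept name
query <- "
  SELECT concept_id, concept_name, domain_id, vocabulary_id
  FROM concept
  WHERE vocabulary_id = 'LOINC'
    AND LOWER(concept_name) LIKE '%heart rate%'
"

result <- DBI::dbGetQuery(con, query)
DBI::dbDisconnect(con)

result
```

The `LOWER()` call keeps the name filter case-insensitive regardless of the database engine.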

11 - Content Catalog

Access shared content from other teams

Installing an item

To get started, go to the “Content catalog” page, accessible from the homepage or the top menu.

By selecting a point on the map, you will see its description (corresponding to the README.md file of the Git repository).

Click the “View content” button to access the shared content provided by this team.

You can choose the category of content from the tabs at the top-right of the screen, including:

  • Projects
  • Plugins
  • Data cleaning scripts
  • Datasets

Clicking on a widget will take you to the description of that content.

You can install or update the item simply by clicking the “Install” or “Update” button.

Once the item is installed, you can access it locally from the corresponding page (Projects page, Datasets page, etc.).

To return to the map, click on “Git repositories” at the top of the screen.

12 - Sharing Content

Share your work to contribute to open science

13 - Roadmap

The next steps in LinkR’s development

Version 0.4 (during 2025):

  • Improve content sharing (#122, #123)
  • Create a centralized content catalog page on the website (#125)
  • “Project Files” page: a page to create, edit, and delete files within a project (#128)
  • Integrate data cleaning scripts into projects: choose which scripts to execute and in what order (#129)
  • “Workflow” page: a page to configure the workflow of a project: which files, which scripts, which widgets to execute, and in what order (#130)
  • Integrate LLMs into LinkR: a page to configure the code to load each local LLM (local-only, to avoid data leaks on non-HDS servers) (#8, #127)
  • Create a graphical interface for data import (#97)
  • Create a graphical interface for subset creation (criteria selection, logical operators, etc.) (#126)
  • Background task management / asynchronous programming (#70)

Priority Plugins to Develop (end of 2024 - beginning of 2025):

  • Visualization of procedures (individual data) (#21)
  • Visualization of diagnoses (individual data) (#22)

Priority Data Cleaning Scripts to Develop (mid-2025):

  • SOFA (#1)
  • IGS-2 (#3)
  • Weight and height outliers (#5)