Graphical Interface
There is not yet a graphical interface for filtering patients based on certain characteristics (age, gender, length of stay, hospitalization dates, or the presence of concepts such as diagnoses or treatments), although one would be very useful.
This graphical interface will be developed in the next version.
Introduction
During a project, it is often necessary to work on a subset of the patient population within a dataset.
For example, it might be useful to create a subset “Included patients” containing only the patients ultimately included in the final analyses of a study.
Similarly, one could imagine creating a subset of patients with a certain diagnosis, included within a specific period, or exposed to a particular treatment.
All of this is possible through the creation of subsets.
Creating a Subset
Subsets are created from the Subsets page. To get there, first load a project, then click on the “Subsets” icon in the top menu, to the right of the loaded project name.
You will arrive on the project subsets page.
A subset is a selection of patients from a dataset, but it belongs to a project. If two projects use the same dataset, they will not share the same subsets.
A subset “All Patients” is created by default when a project is created.
To create a subset, click the “+” icon on the left side of the screen.
Choose a name, then click “Add.” For this example, we will create a subset containing patients aged over 50 years.
Click on the subset you just created: you will be taken to the page of the selected subset.
On the right side of the screen, there are two tabs:
- Summary: displays the subset’s information, which can be modified (including the subset’s description).
- Code: this tab allows you to modify the code and add or remove patients from a subset, which we will explore in the following sections.
Adding patients to a subset
To add patients to a subset, use the add_patients_to_subset function.
This function takes the following arguments:
- patients: A numeric vector containing the IDs of the patients to add.
- subset_id: The ID of the subset to which the patients will be added. In the subset code, it is written as the placeholder %subset_id%, which is then replaced with the ID of the selected subset.
- output, r, m, i18n, and ns: Arguments needed for data manipulation and error message display.
When a subset is created, code is automatically generated to add all patients to the subset.
This code runs when the user clicks the button to run the code, or when the subset is selected from the project and does not already contain patients.
We will modify this code to add patients aged over 50.
Let’s first write the code that adds a column with each patient’s age at the start of each visit.
d$visit_occurrence %>%
    dplyr::left_join(
        d$person %>% dplyr::select(person_id, birth_datetime),
        by = "person_id"
    ) %>%
    dplyr::collect() %>%
    dplyr::mutate(
        age = round(as.numeric(difftime(visit_start_datetime, birth_datetime, units = "days")) / 365.25, 1)
    )
The code editor of the selected subset allows you to test the code. We will extract the IDs of patients aged over 50. For now, we will comment out the call to the add_patients_to_subset function.
d$visit_occurrence %>%
    dplyr::left_join(
        d$person %>% dplyr::select(person_id, birth_datetime),
        by = "person_id"
    ) %>%
    dplyr::collect() %>%
    dplyr::mutate(
        age = round(as.numeric(difftime(visit_start_datetime, birth_datetime, units = "days")) / 365.25, 1)
    ) %>%
    dplyr::filter(age > 50) %>%
    dplyr::distinct(person_id) %>%
    dplyr::pull()
Our code works, so we can store these IDs in a variable and then pass them to the add_patients_to_subset function.
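A minimal sketch of this step, written so that it also runs outside the application: the toy `d` list, the NULL placeholders, the fixed `subset_id`, and the stand-in definition of add_patients_to_subset are all assumptions made for illustration. Inside the subset code editor, that setup is dropped, the real function is used, and the %subset_id% placeholder takes the place of the hard-coded ID.

```r
library(dplyr)

# --- Toy setup (assumption): the application normally provides these objects ---
d <- list(
  person = tibble(
    person_id = c(1, 2, 3),
    birth_datetime = as.POSIXct(c("1950-01-01", "1990-06-15", "1960-03-20"), tz = "UTC")
  ),
  visit_occurrence = tibble(
    person_id = c(1, 2, 3),
    visit_start_datetime = as.POSIXct(rep("2020-01-01", 3), tz = "UTC")
  )
)
output <- r <- m <- i18n <- ns <- NULL
subset_id <- 1  # hypothetical ID; the subset code uses the %subset_id% placeholder instead

# Stand-in (assumption) with the argument names listed above; it only reports the call.
add_patients_to_subset <- function(patients, subset_id, output, r, m, i18n, ns) {
  message(length(patients), " patient(s) would be added to subset ", subset_id)
  invisible(patients)
}

# --- The step itself: store the IDs in a variable, then pass them to the function ---
patients <-
  d$visit_occurrence %>%
  dplyr::left_join(
    d$person %>% dplyr::select(person_id, birth_datetime),
    by = "person_id"
  ) %>%
  dplyr::collect() %>%
  dplyr::mutate(
    age = round(as.numeric(difftime(visit_start_datetime, birth_datetime, units = "days")) / 365.25, 1)
  ) %>%
  dplyr::filter(age > 50) %>%
  dplyr::distinct(person_id) %>%
  dplyr::pull(person_id)

add_patients_to_subset(
  patients = patients, subset_id = subset_id,
  output = output, r = r, m = m, i18n = i18n, ns = ns
)
```

Note that a patient is selected as soon as one of their visits started after their 50th birthday, because age is computed per visit.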
A message confirms that the patients have been successfully added to the subset.
Removing patients from a subset
To remove patients from a subset, use the remove_patients_from_subset function, which works like add_patients_to_subset and takes the same arguments, in particular patients and subset_id.
For example, after adding all patients to the subset, you could remove those aged 50 or younger.
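The removal could be sketched as follows. As before, the toy `d` list, the NULL placeholders, the fixed `subset_id`, and the stand-in for remove_patients_from_subset are assumptions that let the example run on its own. One design choice deserves attention: because age is computed per visit, filtering on `age <= 50` would also catch patients who were 50 or younger at an early visit and older at a later one. The sketch therefore removes the patients who never appear in the "over 50" set, which also removes patients without any recorded visit.

```r
library(dplyr)

# --- Toy setup (assumption): patient 3 has one visit before and one after turning 50 ---
d <- list(
  person = tibble(
    person_id = c(1, 2, 3),
    birth_datetime = as.POSIXct(c("1950-01-01", "1990-06-15", "1969-06-01"), tz = "UTC")
  ),
  visit_occurrence = tibble(
    person_id = c(1, 2, 3, 3),
    visit_start_datetime = as.POSIXct(
      c("2020-01-01", "2020-01-01", "2019-01-01", "2020-01-01"), tz = "UTC"
    )
  )
)
output <- r <- m <- i18n <- ns <- NULL
subset_id <- 1  # hypothetical ID; the subset code uses the %subset_id% placeholder instead

# Stand-in (assumption) with the same argument names; it only reports the call.
remove_patients_from_subset <- function(patients, subset_id, output, r, m, i18n, ns) {
  message(length(patients), " patient(s) would be removed from subset ", subset_id)
  invisible(patients)
}

# Patients with at least one visit that started after age 50
over_50 <-
  d$visit_occurrence %>%
  dplyr::left_join(
    d$person %>% dplyr::select(person_id, birth_datetime),
    by = "person_id"
  ) %>%
  dplyr::collect() %>%
  dplyr::mutate(
    age = round(as.numeric(difftime(visit_start_datetime, birth_datetime, units = "days")) / 365.25, 1)
  ) %>%
  dplyr::filter(age > 50) %>%
  dplyr::distinct(person_id) %>%
  dplyr::pull(person_id)

# Everyone else is removed: the complement of the "over 50" set
all_patients <- d$person %>% dplyr::distinct(person_id) %>% dplyr::collect() %>% dplyr::pull(person_id)
to_remove <- setdiff(all_patients, over_50)

remove_patients_from_subset(
  patients = to_remove, subset_id = subset_id,
  output = output, r = r, m = m, i18n = i18n, ns = ns
)
```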
Integration into Plugins
It would be useful to create an Individual Data plugin for excluding patients based on one or more exclusion criteria defined by the user.
For instance, this plugin could remove patients from the “Included Patients” subset and add them to the “Excluded Patients” subset.
To achieve this, you would simply use the add_patients_to_subset and remove_patients_from_subset functions.
How can you retrieve the IDs of subsets? By using the m$subsets variable.
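As a rough illustration of that lookup, the sketch below assumes that m$subsets is a data frame with one row per subset and with `id` and `name` columns. That structure is an assumption, not something stated here, so inspect m$subsets in your own project before relying on it. The `get_subset_id` helper is hypothetical as well.

```r
library(dplyr)

# Toy stand-in for m$subsets (assumption: columns `id` and `name`)
m <- list(
  subsets = tibble(
    id = c(1, 2, 3),
    name = c("All Patients", "Included Patients", "Excluded Patients")
  )
)

# Hypothetical helper: return the ID of the subset with the given name,
# failing loudly if the name is missing or ambiguous.
get_subset_id <- function(subsets, subset_name) {
  ids <- subsets %>%
    dplyr::filter(.data$name == subset_name) %>%
    dplyr::pull(id)
  if (length(ids) != 1) {
    stop("Expected exactly one subset named '", subset_name, "', found ", length(ids))
  }
  ids
}

included_id <- get_subset_id(m$subsets, "Included Patients")
excluded_id <- get_subset_id(m$subsets, "Excluded Patients")
```

With these two IDs, the plugin could call remove_patients_from_subset with `subset_id = included_id` and then add_patients_to_subset with `subset_id = excluded_id`, passing the same vector of patient IDs to both.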
Now all that’s left is to create the plugin!