beadplexr with CBA and MACSPlex assays

Introduction

This document demonstrates the potential usefulness of beadplexr with data generated with the CBA assay system from BD, and the MACSPlex assay system from Miltenyi Biotec. The package has been tested with assay data from the two systems. Unfortunately, I cannot share the data - for demonstration purposes I have simulated data for the two systems. The simulated data is deliberately noisy, and do not reflect the true quality of the respective assays. The sole purpose of the simulated data is to mimic the two dimensional bead identification.

Load and view the data

library(beadplexr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
data(simplex)

The MACSPlex-like data

The data from the CBA system is similar to the data from the MACSPlex system: there is a single population in the forward-side scatter and 10 populations separated by brightness of FITC and PE. The concentration of the analytes are given in the APC channel.

mplex_data <- simplex[["mplex"]]

mplex_data |> 
  facs_plot(.x = "FSC", .y = "SSC", .type = "hex")

mplex_data |> 
  facs_plot(.x = "FITC", .y = "PE", .type = "hex")

The CBA-like data

There is a single bead population in the forward-side scatter and up to 30 bead populations separated by the brightness of APC and APC-Cy7. The concentration of the analytes are given in the PS channel.

cba_data <- simplex[["cba"]]

cba_data |> 
  facs_plot(.x = "FSC", .y = "SSC", .type = "hex")

cba_data |> 
  facs_plot(.x = "APC", .y = "APC-Cy7", .type = "hex")

Analyte identification

Since there is just a single population in the forward-side scatter we can jump straight to identification of the analytes. It is of course possible to decrease spurious analyte signals by focusing on true bead events in the forward-side scatter, and it might be necessary in a true experiment.

Analyte identification

MACSPlex-like

We use the identify_analyte() to annotate the correct analytes in the MACSPlex-like data. The function requires the parameter .analyte_id, which – in a true experiment – correspond to some identifier given by Miltenyi Biotec. Here we just assign a number.

The function also allows for removing events that are far from the center of the analyte clusters. This is done by the parameter .trim. Here we set the value to 0.1 which means that 10% of the events are removed from the clusters. Excluded events are given the analyte ID NA and can easily be excluded from further processing.

mplex_analyte <- mplex_data |> 
  identify_analyte(.parameter = c("FITC", "PE"), .analyte_id = as.character(c(1:10)))

mplex_analyte |> 
  facs_plot(.x = "FITC", .y = "PE", .beads = "analyte")

mplex_analyte <- mplex_data |> 
  identify_analyte(.parameter = c("FITC", "PE"), .analyte_id = as.character(c(1:10)), .trim = 0.1)

mplex_analyte |> 
  facs_plot(.x = "FITC", .y = "PE", .beads = "analyte")

We can now look at the intensity of the APC channel, to find the relative amount of analyte bound to each bead. In this case it makes very little sense, but we’ll do it for the purpose of completeness

mplex_analyte |> 
  filter(!is.na(analyte)) |> 
  mutate(analyte = factor(analyte, levels = c(1:10))) |> 
  ggplot() +
  aes(x = APC, fill = analyte) +
  geom_density()+
  facet_wrap(~ analyte)+
  theme(legend.position = "none")

The trimming can also be done on the identified analytes. Here we remove 5% of the events.

mplex_analyte |> 
  # filter(!is.na(analyte)) |> 
  group_by(analyte) |> 
  trim_population(.parameter = "APC", .column_name = "analyte", .trim = 0.05) |> 
  ungroup() |> 
  mutate(analyte = factor(analyte, levels = c(1:10))) |>
  ggplot() +
  aes(x = APC, fill = analyte) +
  geom_density() +
  facet_wrap(~ analyte) +
  theme(legend.position = "none")

CBA-like

For the CBA assay system, the approach is quite similar to the MACSPlex:

cba_analyte <- cba_data |> 
  identify_analyte(.parameter = c("APC", "APC-Cy7"), .analyte_id = as.character(c(1:30)), .trim = 0.1)

cba_analyte |> 
  facs_plot(.x = "APC", .y = "APC-Cy7", .beads = "analyte")

ID assignment

The analyte ID is assigned to each cluster, based on the order of the center of the cluster. For analytes where the bead is identified by two fluorescent parameters it could be a little tricky, because the centers are first sorted by the first parameter given, and then the second. The order of parameters and analyte IDs matter a lot! I urge you to double check that the beads are identified as expected. Once you have everything sorted out, it should remain stable for your cytometer.

For the simulated MACSPlex-like data the IDs are assigned as below.

mplex_analyte |> 
  filter(!is.na(analyte)) |> 
  group_by(analyte) |> 
  summarise(`FITC mean` = mean(FITC), `PE mean` = mean(PE)) |> 
  arrange(`FITC mean`, `PE mean`) |> 
  knitr::kable()
analyte FITC mean PE mean
1 5.253017 5.343757
2 5.342780 12.427284
3 6.408917 7.362167
4 7.709976 10.075022
5 8.494268 14.271564
6 8.827282 7.689722
7 12.166583 6.347947
8 12.292569 9.710517
9 14.361117 9.792471
10 14.880174 14.453440

Noise reduction

For the CBA-like data, some clusters are still a bit too wide, while the trimming of 0.1 made others look quite fine. The simulated data contains a bit of noise hat might interfere the the cluster identification. This noise might not be present in the a true experiment, but is included to demonstrate the removal of lonely, noisy events by the despeckel() functionality of beadplexr.

cba_analyte <- cba_data |> 
  despeckle(.parameter = c("APC", "APC-Cy7"), .neighbours = 2) |> 
  identify_analyte(.parameter = c("APC", "APC-Cy7"), .analyte_id = as.character(c(1:30)), .trim = 0.01)

cba_analyte |> 
  facs_plot(.x = "APC", .y = "APC-Cy7", .beads = "analyte")

In this particular example, the despeckle() is a little too rough on the population in the lower left corner, and it is probably more prudent to accept a little noise on one analyte than decimation of another.

Session info

## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggplot2_3.5.1       dplyr_1.1.4         beadplexr_0.5.0.999
## [4] rmarkdown_2.29     
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.6      jsonlite_1.8.9    compiler_4.4.2    tidyselect_1.2.1 
##  [5] tidyr_1.3.1       cluster_2.1.6     jquerylib_0.1.4   scales_1.3.0     
##  [9] yaml_2.3.10       fastmap_1.2.0     lattice_0.22-6    hexbin_1.28.4    
## [13] R6_2.5.1          labeling_0.4.3    generics_0.1.3    knitr_1.49       
## [17] tibble_3.2.1      maketools_1.3.1   munsell_0.5.1     bslib_0.8.0      
## [21] pillar_1.9.0      rlang_1.1.4       utf8_1.2.4        cachem_1.1.0     
## [25] xfun_0.49         sass_0.4.9        sys_3.4.3         cli_3.6.3        
## [29] withr_3.0.2       magrittr_2.0.3    digest_0.6.37     grid_4.4.2       
## [33] mclust_6.1.1      lifecycle_1.0.4   vctrs_0.6.5       evaluate_1.0.1   
## [37] glue_1.8.0        farver_2.1.2      buildtools_1.0.0  colorspace_2.1-1 
## [41] fansi_1.0.6       purrr_1.0.2       tools_4.4.2       pkgconfig_2.0.3  
## [45] htmltools_0.5.8.1