iModMix is a novel network-based approach that integrates multi-omics data, including metabolomics, proteomics, and transcriptomics data, into a unified network to unveil associations across various layers of omics data and reveal significant associations with phenotypes of interest.
iModMix can incorporate both identified and unidentified metabolites, addressing a key limitation of existing methodologies.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("iModMix")Note: to install development version:
iModMix includes an interactive Shiny application that allows users to explore multi-omics integration results in a user-friendly interface. The app can be launched locally from R or accessed online.
To launch the app from your R session, simply run:
This will open the app in your default web browser. Make sure the package is installed and loaded before running the command.
You can also explore the app without installing anything by visiting the hosted version: https://imodmix.moffitt.org/ This online version provides access to the same interactive features, including module visualization, enrichment analysis, and metadata exploration.
Below is an example showing the step-by-step analysis of two datasets through iModMix package: We used 24 normal and 52 tumor clear cell renal cell carcinoma (ccRCC, RC20 dataset) samples (Golkaram et al. 2022; Tang et al. 2023; Benedetti et al. 2023) as the example case study. (https://doi.org/10.5281/zenodo.13988161).
Input data:
# Load the package
library(iModMix)
# Get the path to the expression data file
pathMetabExp <- system.file("Example_data/ccRCC4_Data/Metab_exp.rds", package = "iModMix")
# Load the expression data
dataExp1 <- readRDS(pathMetabExp)
# Check the expression data
dataExp1[1:5, 1:5]##   Feature_ID RC20.MSKC.00573 RC20.MSKC.00574 RC20.MSKC.00575 RC20.MSKC.00576
## 1   Metab373        19.03183        19.15170        18.94049        18.77380
## 2   Metab807        13.34715        13.34715        13.34715        13.34715
## 3   Metab806        13.21118        13.21118        13.21118        13.21118
## 4   Metab808        14.32570        14.32570        14.32570        14.32570
## 5   Metab197        21.29544        21.30204        20.00029        21.41036# Get the path to the Metadata file
pathMetadata <- system.file("Example_data/ccRCC4_Data/Metadata.rds", package = "iModMix")
# Load the Metadata
metadata <- readRDS(pathMetadata)
# Check the Metadata
head(metadata)##            Sample     TN  Histology
## 1 RC20.MSKC.00573  Tumor Clear Cell
## 2 RC20.MSKC.00574  Tumor Clear Cell
## 3 RC20.MSKC.00575  Tumor Clear Cell
## 4 RC20.MSKC.00576  Tumor Clear Cell
## 5 RC20.MSKC.00577  Tumor Clear Cell
## 6 RC20.MSKC.00578 Normal Clear CellThe iModMix downstream pipeline requires an input matrix of RNA-seq counts. The load_data function performs the following steps: * Transposes the input data so that rows represent samples and columns represent features * Applies feature selection (SD below the 25th percentile) * Handles missing values (KNN) This step is optional. If the input matrix already has samples as rows, features as columns, and contains no missing values, users may skip it.
##                   Metab373   Metab807   Metab806   Metab808   Metab197
## RC20.MSKC.00573 0.47187064 -0.8291436 -0.9792751 -0.7464603  0.1386495
## RC20.MSKC.00574 0.65414303 -0.8291436 -0.9792751 -0.7464603  0.1433431
## RC20.MSKC.00575 0.33297295 -0.8291436 -0.9792751 -0.7464603 -0.7821402
## RC20.MSKC.00576 0.07949406 -0.8291436 -0.9792751 -0.7464603  0.2203568
## RC20.MSKC.00577 0.17085992 -0.8291436 -0.9792751 -0.7464603 -0.5085522Calculations for partial correlation is usually the slowest step. Partial correlation is a method of analyzing the relationship between two variables when other variables are present. Graphical Lasso (Glasso) is used to estimate the partial correlation and captures only direct associations. Below is a preview of the sparse partial correlation of the first five features.
# Perform Partial correlation
parcorData1 <- fctPartialCors(loadData1, rho = 0.25)
parcorData1[1:5, 1:5]##             Metab373   Metab807     Metab806   Metab808 Metab197
## Metab373          NA  0.0000000  0.005422781  0.0000000        0
## Metab807 0.000000000         NA -0.319434636 -0.2664755        0
## Metab806 0.005422781 -0.3194346           NA -0.2225263        0
## Metab808 0.000000000 -0.2664755 -0.222526343         NA        0
## Metab197 0.000000000  0.0000000  0.000000000  0.0000000       NAHierarchical clustering is used to identify common neighbors between the features. Calculations are determined using the topographical overlap matrix (TOM) and are based on the sparse partial correlations. Hierarchical clustering is visualized as a dendrogram.
The dendrogram plot is publication-ready and displays a dendrogram of features and modules. Axes: The vertical axis (y-axis) represents the dissimilarity between features, while the horizontal axis (x-axis) shows the modules. Branches: Each line in the dendrogram represents a feature. Features that are closer in the hierarchy (i.e., joined at a lower height in the dendrogram) have more similar expression profiles.
# Perform hierarchical clustering
hcData1 <- fctHierarchicalCluster(parcorMat = parcorData1, tom = TRUE, minModuleSize = 10)## Allowing parallel execution with up to 71 working processes.
## ..connectivity..
## ..matrix multiplication (system BLAS)..
## ..normalization..
## ..done.hcClu <- hcData1$hclustTree
hcMod <- as.matrix(hcData1$dynamicMods_numeric)
WGCNA::plotDendroAndColors(
  dendro = hcClu,
  colors = hcMod,
  dendroLabels = FALSE,
  hang = 0.03,
  addGuide = TRUE,
  guideHang = 0.05,
  groupLabels = "Modules",
  main = "Feature dendrogram and module assignments"
)Let’s see the cluster assignments. Hierarchical clustering generates multiple modules (clusters) to which each feature is assigned. The table below details the following columns:
# Perform cluster assignments
clusterAssigData1 <- fctClusterAssignments(as.data.frame(hcData1$cluster_assignments))
head(clusterAssigData1)##    feature        cluster     col feature_name
## 1 Metab373 cluster_000026 #AA5831     Metab373
## 2 Metab807 cluster_000030 #F481BD     Metab807
## 3 Metab806 cluster_000030 #F481BD     Metab806
## 4 Metab808 cluster_000030 #F481BD     Metab808
## 5 Metab197 cluster_000019 #FFAD12     Metab197
## 6 Metab445 cluster_000017 #F27913     Metab445The 904 features (identified metabolites) were assigned into 34 modules. The first principal component (PC1) is calculated for each module, referred to as an eigenfeature. Eigenfeatures are useful for
# Obtain Eigenfeatures
eigenData1 <- fctEigengenes(loadData1, hcData1$cluster_assignments[, 3])
eigengenesData1 <- eigenData1$module_eigenmetab_Me
eigengenesData1[1:5, 1:5]##                   ME#3B88A0   ME#3C7AB3   ME#419486 ME#46A06B    ME#4BAC50
## RC20.MSKC.00573 -0.09452106  0.04054216 -0.09858921 0.1633222  0.067529213
## RC20.MSKC.00574 -0.05317106 -0.02626292 -0.04779067 0.1090938  0.019922717
## RC20.MSKC.00575 -0.17118698 -0.06062323 -0.12805564 0.1473363 -0.074034973
## RC20.MSKC.00576 -0.08161202 -0.04465726 -0.06401626 0.1345405 -0.001244864
## RC20.MSKC.00577 -0.15085973 -0.05007704 -0.10988318 0.1474912 -0.069908716Since this is purely an example of how to run the entire analysis from start to finish, we are going to limit our analysis to the first 1000 features.
# Get the path to the expression data file
pathRNAexp <- system.file("Example_data/ccRCC4_Data/RNA_exp.rds", package = "iModMix")
# Load the expression data
dataExp2 <- readRDS(pathRNAexp)
dataExp2 <- dataExp2[1:1000, ]
# Check the expression data
dataExp2[1:5, 1:5]##   Feature_ID RC20.MSKC.00573 RC20.MSKC.00574 RC20.MSKC.00575 RC20.MSKC.00576
## 1       A1BG             304             372             277             344
## 2       NAT2               7              10               6              12
## 3        ADA             787            1039             644            1041
## 4       CDH2           11991            7172           14457            9964
## 5       AKT3           13076           13876           10976           12730Calculate correlated matrix
##      A1BG NAT2 ADA       CDH2       AKT3
## A1BG   NA    0   0  0.0000000  0.0000000
## NAT2    0   NA   0  0.0000000  0.0000000
## ADA     0    0  NA  0.0000000  0.0000000
## CDH2    0    0   0         NA -0.2301523
## AKT3    0    0   0 -0.2301523         NACalculate hierarchical clustering
# Perform hierarchical clustering
hcData2 <- fctHierarchicalCluster(parcorMat = parcorData2, tom = TRUE, minModuleSize = 10)## Allowing parallel execution with up to 71 working processes.
## ..connectivity..
## ..matrix multiplication (system BLAS)..
## ..normalization..
## ..done.# Access the hierarchical clustering tree and module assignments
hcClu2 <- hcData2$hclustTree
hcMod2 <- as.matrix(hcData2$dynamicMods_numeric)
# Plot the dendrogram
WGCNA::plotDendroAndColors(
  dendro = hcClu2,
  colors = hcMod2,
  dendroLabels = FALSE,
  hang = 0.03,
  addGuide = TRUE,
  guideHang = 0.05,
  groupLabels = "Modules",
  main = "Feature dendrogram and module assignments"
)Perform network construction and module eigengene calculation
# Perform cluster assignment
clusterAssigData2 <- fctClusterAssignments(as.data.frame(hcData2$cluster_assignments))
eigenData2 <- fctEigengenes(loadData2, hcData2$cluster_assignments[, 3])
eigengenesData2 <- eigenData2$module_eigenmetab_Me
eigengenesData2[1:5, 1:5]##                  ME#3982AE   ME#3D8B9A ME#419486  ME#459D71  ME#4674A9
## RC20.MSKC.00573 0.10017350 -0.02712216 0.1993203 0.10061396 0.04785642
## RC20.MSKC.00574 0.03155250 -0.03207484 0.1931456 0.05334562 0.04834352
## RC20.MSKC.00575 0.11538115 -0.04139072 0.2011653 0.11052656 0.04044095
## RC20.MSKC.00576 0.04221625 -0.04664415 0.1922938 0.05054639 0.01798802
## RC20.MSKC.00577 0.09979154 -0.04398240 0.1641929 0.12994397 0.02970695Statistical analysis by Students t-test compares phenotypes chosen as variable of interest and the user can specify a significance threshold for the p-value, the default p-value is set to 0.05. The eigenfeatures of each module, determined previously, are used as predictors. A data frame is generated with the following columns
Here, TN variable (Tumor, Normal) is selected as the phenotype variable of interest
Boxplots are automatically generated at the bottom for significant eigenfeatures, with outliers identified as individual dots and a legend is provided to describe the compared phenotypes.
# Perform Classification
ClassificationData <- fctPerformClassification(
  eigengeneData = eigengenesData1,
  metadata = metadata,
  phenotypeVariable = "TN",
  significanceThreshold = 0.09
)
ClassificationData$result[1:10, ]##                    Variable           Class     Result_t Result_pValue
## Comparison.t...2  ME#3C7AB3 Tumor vs Normal  6.68590e+00      0.00e+00
## Comparison.t...5  ME#4BAC50 Tumor vs Normal  5.53910e+00      0.00e+00
## Comparison.t...10 ME#904A67 Tumor vs Normal  9.61810e+00      0.00e+00
## Comparison.t...11 ME#91569A Tumor vs Normal  1.32229e+01      0.00e+00
## Comparison.t...12 ME#999999 Tumor vs Normal  7.37100e+00      0.00e+00
## Comparison.t...13 ME#A7558A Tumor vs Normal -4.45020e+00      0.00e+00
## Comparison.t...14 ME#AA5831 Tumor vs Normal -7.81820e+00      0.00e+00
## Comparison.t...15 ME#AF93A2 Tumor vs Normal  1.02016e+01      0.00e+00
## Comparison.t...16 ME#B6742A Tumor vs Normal -5.01910e+00      0.00e+00
## Comparison.t...19 ME#C06162 Tumor vs Normal  1.03488e+01      0.00e+00
##                   Adjusted_pValue
## Comparison.t...2         0.00e+00
## Comparison.t...5         0.00e+00
## Comparison.t...10        0.00e+00
## Comparison.t...11        0.00e+00
## Comparison.t...12        0.00e+00
## Comparison.t...13        0.00e+00
## Comparison.t...14        0.00e+00
## Comparison.t...15        0.00e+00
## Comparison.t...16        0.00e+00
## Comparison.t...19        0.00e+00# Plot BoxPlot
selectedVariable <- "TN"
levels <- unique(metadata[[selectedVariable]])
classLabel <- paste(levels, collapse = " vs ")
plot <- ClassificationData$plots[[1]]
plot <- plot +
  ggplot2::labs(
    title = classLabel, fill = as.factor(levels),
    x = "Variables",
    y = "Class"
  ) +
  ggplot2::theme(
    axis.text.x = ggplot2::element_text(angle = 90, hjust = 1)
  )
plotThis step perform the integration of the datasets previously analyzed using the eigenfeatures obtained from the steps above.
# Summarize into list the different eigenfeatures
eigengenesList <- list(eigengenesData1, eigengenesData2)
clusterList <- list(clusterAssigData1, clusterAssigData2)
IntegrationData1Data2 <- fctModulesCorrelation(eigengenesList, clusterList, threshold = 0.6)
TopCorrelations <- IntegrationData1Data2[["Top_cor_Prot_metab"]]
head(TopCorrelations)##                           from        to   value
## cor_Data1_Data2.1345 D1#C06162 D2#FF9609 -0.6106
## cor_Data1_Data2.1351 D1#E1C62F D2#FF9609 -0.6104
## cor_Data1_Data2.1209 D1#C06162 D2#EE83BB -0.6089
## cor_Data1_Data2.1132 D1#904A67 D2#E6D030 -0.6081
## cor_Data1_Data2.1134 D1#999999 D2#E6D030 -0.6033# Plot correlation
CorrelationPlot <- IntegrationData1Data2$Correlation_Plot
hist(CorrelationPlot[[1]], main = "Correlation: Data 1 / Data 2 ")# Plot Network
nodes <- as.data.frame(IntegrationData1Data2$nodes)
edges <- as.data.frame(IntegrationData1Data2$edges)
n <- IntegrationData1Data2$n
shapes <- c("diamond", "triangle", "dot")
colors <- c("orange", "darkgreen", "darkblue")
network <- visNetwork::visNetwork(nodes = nodes, edges = edges, width = "100%", height = "800px")
network <- visNetwork::visLegend(network,
  useGroups = FALSE,
  addNodes = data.frame(
    label = paste0("Data", 1:n, " Modules"),
    shape = shapes[1:n], color = colors[1:n]
  ),
  addEdges = data.frame(label = "Correlation", shape = "line", length = 200, color = "darkgreen")
)
network <- visNetwork::visInteraction(network, navigationButtons = TRUE)
networkEnrichment analysis can be performed if annotation data for proteomics or transcriptomics is uploaded with the column Symbol available. The drop-down menu displays available libraries for pathway analysis. Choose a library to automatically amend the dataset cluster descriptions on table below. For example, selecting “GO_Biological_Process_2023” will annotate cluster descriptions with enriched biological processes.
To perform enrichment analysis programmatically, you can use the function fctAssignmentGenesEnrichr(). Under column enriched_Term the most highly correlated pathway is displayed and in the following columns, along with enriched_Genes, and p-values as determined by Enrichr. The following code shows how to run the enrichment analysis using transcriptomics data (data2). This chunk is not evaluated to avoid long loading times during vignette rendering.
# Perform enrichment analysis per module
selectedDatabase <- "GO_Biological_Process_2023"
clusterAssignmentsData2Enrich <- fctAssignmentGenesEnrichr(clusterAssignmentsProtGenes = clusterAssigData2, database = selectedDatabase)we show a representative subset of the enrichment results:
Enrichment Analysis Results[1:5,]
| feature | cluster | col | feature_name | enriched_Term | enriched_Overlap | enriched_Genes | enriched_P.value | enriched_Adjusted.P.value | 
|---|---|---|---|---|---|---|---|---|
| A1BG | cluster_000005 | #66628D | A1BG | Cilium Disassembly | 1/5 | HDAC6 | 0.008968455 | 0.09813736 | 
| NAT2 | cluster_000041 | #DD87B4 | NAT2 | NA | NA | NA | NA | NA | 
| ADA | cluster_000017 | #91569A | ADA | Purine Ribonucleoside Metabolic Process | 1/6 | ADA | 0.007477447 | 0.08081181 | 
| CDH2 | cluster_000019 | #B45B76 | CDH2 | Detection Of Muscle Stretch | 1/6 | CDH2 | 0.007179245 | 0.11473611 | 
| AKT3 | cluster_000019 | #B45B76 | AKT3 | Detection Of Muscle Stretch | 1/6 | CDH2 | 0.007179245 | 0.11473611 | 
This table summarizes enriched biological processes associated with selected modules. The full enrichment analysis can be reproduced by uncommenting the function call above.
To ensure compatibility with Bioconductor standards, iModMix supports input data structured as SummarizedExperiment objects. This allows users to integrate metabolomics or other omics data using standardized containers that facilitate downstream analysis and interoperability with other Bioconductor packages.
Below is an example of how to convert your expression matrix and metadata into a SummarizedExperiment object:
# Load required library
library(SummarizedExperiment)
# Assuming:
# - 'dataExp1' has features in rows and samples in columns
# - 'Feature_ID' is the first column of dataExp1
# - 'metadata' has sample information with 'Sample' column matching column names of dataExp1 (except Feature_ID)
# Create SummarizedExperiment object
se <- fctiModMix2SE(dataExp1, metadata)
# Check structure
se## class: SummarizedExperiment 
## dim: 904 76 
## metadata(0):
## assays(1): counts
## rownames(904): Metab373 Metab807 ... Metab744 Metab429
## rowData names(0):
## colnames(76): RC20.MSKC.00573 RC20.MSKC.00574 ... RC20.MSKC.00653
##   RC20.MSKC.00659
## colData names(3): Sample TN Histology## R version 4.5.1 Patched (2025-08-23 r88802)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.3 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] SummarizedExperiment_1.40.0 Biobase_2.70.0             
##  [3] GenomicRanges_1.62.0        Seqinfo_1.0.0              
##  [5] IRanges_2.44.0              S4Vectors_0.48.0           
##  [7] BiocGenerics_0.56.0         generics_0.1.4             
##  [9] MatrixGenerics_1.22.0       matrixStats_1.5.0          
## [11] iModMix_1.0.0               knitr_1.50                 
## 
## loaded via a namespace (and not attached):
##   [1] DBI_1.2.3             gridExtra_2.3         attempt_0.3.1        
##   [4] rlang_1.1.6           magrittr_2.0.4        otel_0.2.0           
##   [7] compiler_4.5.1        RSQLite_2.4.3         png_0.1-8            
##  [10] vctrs_0.6.5           reshape2_1.4.4        stringr_1.5.2        
##  [13] pkgconfig_2.0.3       crayon_1.5.3          fastmap_1.2.0        
##  [16] backports_1.5.0       XVector_0.50.0        labeling_0.4.3       
##  [19] promises_1.4.0        rmarkdown_2.30        preprocessCore_1.72.0
##  [22] purrr_1.1.0           bit_4.6.0             xfun_0.53            
##  [25] cachem_1.1.0          jsonlite_2.0.0        blob_1.2.4           
##  [28] later_1.4.4           DelayedArray_0.36.0   parallel_4.5.1       
##  [31] cluster_2.1.8.1       R6_2.6.1              golem_0.5.1          
##  [34] bslib_0.9.0           stringi_1.8.7         RColorBrewer_1.1-3   
##  [37] rpart_4.1.24          jquerylib_0.1.4       Rcpp_1.1.0           
##  [40] iterators_1.0.14      WGCNA_1.73            base64enc_0.1-3      
##  [43] httpuv_1.6.16         Matrix_1.7-4          splines_4.5.1        
##  [46] nnet_7.3-20           tidyselect_1.2.1      abind_1.4-8          
##  [49] rstudioapi_0.17.1     dichromat_2.0-0.1     yaml_2.3.10          
##  [52] doParallel_1.0.17     codetools_0.2-20      plyr_1.8.9           
##  [55] lattice_0.22-7        tibble_3.3.0          shiny_1.11.1         
##  [58] withr_3.0.2           KEGGREST_1.50.0       S7_0.2.0             
##  [61] glassoFast_1.0.1      evaluate_1.0.5        foreign_0.8-90       
##  [64] survival_3.8-3        Biostrings_2.78.0     pillar_1.11.1        
##  [67] checkmate_2.3.3       foreach_1.5.2         ggplot2_4.0.0        
##  [70] scales_1.4.0          xtable_1.8-4          glue_1.8.0           
##  [73] Hmisc_5.2-4           tools_4.5.1           data.table_1.17.8    
##  [76] visNetwork_2.1.4      fastcluster_1.3.0     grid_4.5.1           
##  [79] impute_1.84.0         tidyr_1.3.1           AnnotationDbi_1.72.0 
##  [82] colorspace_2.1-2      htmlTable_2.4.3       Formula_1.2-5        
##  [85] cli_3.6.5             config_0.3.2          S4Arrays_1.10.0      
##  [88] dplyr_1.1.4           gtable_0.3.6          dynamicTreeCut_1.63-1
##  [91] sass_0.4.10           digest_0.6.37         SparseArray_1.10.0   
##  [94] htmlwidgets_1.6.4     farver_2.1.2          memoise_2.0.1        
##  [97] htmltools_0.5.8.1     lifecycle_1.0.4       httr_1.4.7           
## [100] GO.db_3.22.0          mime_0.13             bit64_4.6.0-1