Pathway Analysis

With DPKS you can perform pathway analysis on proteins selected by differential abundance analysis or by explainable machine learning. DPKS uses the methods available in the awesome GSEAPY [1] package to run overrepresentation tests, based on the hypergeometric distribution, on the proteins in your QuantMatrix.
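To make the idea of a hypergeometric overrepresentation test concrete, here is a minimal standard-library sketch. It is not the DPKS or GSEAPY implementation; the function name and the example numbers are hypothetical and chosen only for illustration. It computes the probability of seeing at least the observed number of pathway members among the selected proteins if the selection were random.

```python
from math import comb


def overrepresentation_pvalue(population, pathway_size, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population   -- number of proteins in the background
    pathway_size -- background proteins annotated to the pathway
    selected     -- number of proteins selected for the test
    overlap      -- selected proteins that belong to the pathway
    """
    total = comb(population, selected)
    max_overlap = min(pathway_size, selected)
    tail = sum(
        comb(pathway_size, i) * comb(population - pathway_size, selected - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 2000 background proteins, a pathway with 40 members,
# 100 selected proteins, 8 of which fall in the pathway.
p = overrepresentation_pvalue(2000, 40, 100, 8)
print(f"p-value: {p:.3g}")
```

A small p-value here means the pathway contains more of the selected proteins than random selection would be expected to produce.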

Note

This is one of the few QuantMatrix methods that does not return the QuantMatrix instance, so it cannot be followed by further chained calls. Instead, the results of the overrepresentation tests from GSEAPY are returned.

Info

We are working on adding support for GSEA analysis directly from a QuantMatrix.

Basic usage of the enrich() method is shown in the examples below. By default, the GO_Biological_Process_2023 library is searched if none is specified with the libraries parameter.

For filtering at the FDR level after differential abundance analysis:

enr = qm.enrich(
    method="overreptest",
    filter_pvalue=True,
    pvalue_column="CorrectedPValue2-1",
    pvalue_cutoff=0.1
)

For filtering SHAP values that have some contribution to prediction:

enr = qm.enrich(
    method="overreptest",
    filter_shap=True,
    shap_column="MeanSHAP2-1",
    shap_cutoff=0.0
)
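The two filters above can be pictured as simple row selections. The sketch below is an illustration only and does not reproduce DPKS internals: the protein names and values are made up, the helper functions are hypothetical, and the comparison operators (at or below the p-value cutoff, strictly above the SHAP cutoff) are assumptions. Only the column names come from the calls shown above.

```python
# Hypothetical per-protein statistics; the column names match the
# pvalue_column and shap_column arguments used in the examples above.
rows = [
    {"Protein": "PROT_A", "CorrectedPValue2-1": 0.02, "MeanSHAP2-1": 0.31},
    {"Protein": "PROT_B", "CorrectedPValue2-1": 0.40, "MeanSHAP2-1": 0.00},
    {"Protein": "PROT_C", "CorrectedPValue2-1": 0.09, "MeanSHAP2-1": 0.05},
]


def select_by_pvalue(rows, column, cutoff):
    """Keep proteins whose corrected p-value is at or below the cutoff (assumed)."""
    return [row["Protein"] for row in rows if row[column] <= cutoff]


def select_by_shap(rows, column, cutoff):
    """Keep proteins whose mean SHAP value is strictly above the cutoff (assumed)."""
    return [row["Protein"] for row in rows if row[column] > cutoff]


significant = select_by_pvalue(rows, "CorrectedPValue2-1", 0.1)
contributing = select_by_shap(rows, "MeanSHAP2-1", 0.0)
print(significant)
print(contributing)
```

Under these assumptions, a SHAP cutoff of 0.0 drops only the proteins that contribute nothing to the prediction.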

You can also search multiple libraries and subset the databases to only consider pathways that contain the proteins in your QuantMatrix. All pathways available in GSEAPY can be searched.

Tip

If you are interested in only a small number of proteins, subsetting the library with the subset_library parameter can be very helpful. Restricting the search space means fewer pathways are tested, which helps with FDR control.

enr = quantified_data.enrich(
    method="enrichr_overreptest",
    libraries=['GO_Biological_Process_2023', 'KEGG_2021_Human', 'Reactome_2022'],
    organism="human",
    filter_pvalue=True,
    subset_library=True
)
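The effect of the search space on FDR control, mentioned in the tip above, can be illustrated with the Benjamini-Hochberg procedure. This is a generic sketch with made-up p-values, under the assumption that a correction of this kind is applied; it is not taken from DPKS or GSEAPY. The same raw p-value receives a much larger adjusted p-value when many more pathways are tested.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values: one interesting pathway and many uninformative ones.
small_library = [0.001] + [0.5] * 9      # 10 pathways tested
large_library = [0.001] + [0.5] * 999    # 1000 pathways tested

adj_small = benjamini_hochberg(small_library)[0]
adj_large = benjamini_hochberg(large_library)[0]
print(f"adjusted p-value with 10 pathways:   {adj_small:.3g}")
print(f"adjusted p-value with 1000 pathways: {adj_large:.3g}")
```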

The structure of the enr results returned above is described in the GSEAPY documentation. The results can be used to make pathway plots and perform network analysis.

[Figure: pathway scatter plot]

Example

There is a Jupyter notebook with detailed examples of how to use this functionality, including some possible plots.

Pathway Enrichment: Demonstrates how to perform pathway enrichment analysis.


  1. Zhuoqing Fang, Xinyuan Liu, Gary Peltz. GSEApy: a comprehensive package for performing gene set enrichment analysis in Python. Bioinformatics, 2022, btac757. https://doi.org/10.1093/bioinformatics/btac757