X-Digest: A Web Platform for Exploring How Popular Bioinformatic Tools and Databases are Used by the Community
Abinanda Prabhakaran1, Avi Ma'ayan1
1Department of Pharmacological Sciences, Windreich Department of Artificial Intelligence and Human Health, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029 USA
Abstract
Every year, thousands of bioinformatics tools and databases are developed and published. Yet only a fraction of these resources survive and become highly cited, and little is known about how they are used by the community. X-Digest was developed to automatically generate and update websites that summarize the usage patterns of widely cited bioinformatics tools and databases. X-Digest extracts the full text and figures from each article that cites the tool or database, and uses this content to build the sections of the site for each resource. Sections include: 1) a figures page with a collection of figures in which the tool or database is mentioned; 2) a UMAP visualization of all figures citing the tool or database, arranged by content similarity; 3) recent citing articles; and 4) a statistical analysis section containing plots such as citations over time, most citing journals, topics of citing papers, most used assays in citing articles, and most frequent co-mentions with other tools and databases in the citing articles. X-Digest also automatically generates a review article about the resource using the Google Gemini (PaLM) LLM node in the n8n workflow builder. The review article describes how the tool or database has been applied to assist with data analysis in the most cited articles that cite it. A detailed explanation can be viewed in the appendix section of the site. To demonstrate X-Digest, we applied it to produce a website for Enrichr, a functional gene set enrichment analysis platform developed by the Ma’ayan Lab. The X-Digest framework can be applied to many other top bioinformatics tools and databases, offering a scalable way to explore how widely cited resources are utilized by the community.
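As an illustration of the kind of summaries shown in the statistical analysis section, the following minimal sketch tallies citations per year and the most citing journals from a list of citing-article records. The record fields (`id`, `year`, `journal`), the placeholder values, and the function names are assumptions made for this example; they are not taken from the X-Digest implementation.

```python
from collections import Counter

# Hypothetical citing-article records; the field names and values are
# placeholders for illustration, not real X-Digest data.
citing_articles = [
    {"id": "article-1", "year": 2021, "journal": "Journal A"},
    {"id": "article-2", "year": 2021, "journal": "Journal B"},
    {"id": "article-3", "year": 2022, "journal": "Journal A"},
    {"id": "article-4", "year": 2023, "journal": "Journal A"},
    {"id": "article-5", "year": 2023, "journal": "Journal C"},
]


def citations_per_year(articles):
    """Count citing articles per publication year, sorted by year."""
    counts = Counter(article["year"] for article in articles)
    return dict(sorted(counts.items()))


def top_journals(articles, n=3):
    """Return the n journals with the most citing articles."""
    return Counter(article["journal"] for article in articles).most_common(n)


if __name__ == "__main__":
    print(citations_per_year(citing_articles))
    print(top_journals(citing_articles, n=1))
```

Counts of this form could feed directly into a citations-over-time plot or a bar chart of the most citing journals.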
Enrichr
Enrichr is a widely used web-based tool that allows researchers to perform functional enrichment analysis on gene sets, helping them interpret their experimental results in the context of biological pathways and processes drawn from a curated collection of gene set libraries. Enrichr has 447,453 gene sets organized into 218 gene set libraries. To learn more about using Enrichr or citing Enrichr in your research, use the following links:
- Enrichr Site: https://maayanlab.cloud/Enrichr/
- Enrichr-KG Site: https://maayanlab.cloud/enrichr-kg
Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, Koplev S, Jenkins SL, Jagodnik KM, Lachmann A, McDermott MG, Monteiro CD, Gundersen GW, Ma'ayan A. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016 Jul 8;44(W1):W90-7. doi: 10.1093/nar/gkw377. PMID: 27141961; PMCID: PMC4987924.
Chen EY, Tan CM, Kou Y, Duan Q, Wang Z, Meirelles GV, Clark NR, Ma'ayan A. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics. 2013 Apr 15;14:128. doi: 10.1186/1471-2105-14-128. PMID: 23586463; PMCID: PMC3637064.
Xie Z, Bailey A, Kuleshov MV, Clarke DJB, Evangelista JE, Jenkins SL, Lachmann A, Wojciechowicz ML, Kropiwnicki E, Jagodnik KM, Jeon M, Ma'ayan A. Gene Set Knowledge Discovery with Enrichr. Curr Protoc. 2021 Mar;1(3):e90. doi: 10.1002/cpz1.90. PMID: 33780170; PMCID: PMC8152575.
Evangelista JE, Xie Z, Marino GB, Nguyen N, Clarke DJB, Ma'ayan A. Enrichr-KG: bridging enrichment analysis across multiple libraries. Nucleic Acids Res. 2023 Jul 5;51(W1):W168-W179. doi: 10.1093/nar/gkad393. PMID: 37166973; PMCID: PMC10320098.
Kuleshov MV, Diaz JEL, Flamholz ZN, Keenan AB, Lachmann A, Wojciechowicz ML, Cagan RL, Ma'ayan A. modEnrichr: a suite of gene set enrichment analysis tools for model organisms. Nucleic Acids Res. 2019 Jul 2;47(W1):W183-W190. doi: 10.1093/nar/gkz347. PMID: 31069376; PMCID: PMC6602483.
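To make the idea of functional enrichment analysis described for Enrichr more concrete, the sketch below computes an over-representation p-value for the overlap between a query gene list and a single library gene set, using a one-sided hypergeometric test. This is a standard textbook formulation chosen for illustration; the text here does not specify how Enrichr scores enrichment, and the gene identifiers, background size, and function name are hypothetical.

```python
from math import comb


def overlap_p_value(query_genes, library_gene_set, background_size):
    """One-sided hypergeometric p-value for the overlap of two gene sets.

    Returns P(X >= k), where k is the observed overlap, when drawing
    len(query_genes) genes without replacement from a background of
    background_size genes that contains len(library_gene_set) set members.
    """
    query = set(query_genes)
    library = set(library_gene_set)
    k = len(query & library)   # observed overlap
    n = len(query)             # size of the query list
    K = len(library)           # size of the library gene set
    N = background_size        # assumed number of genes in the background

    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical identifiers, used only to exercise the function.
    query = ["GENE_A", "GENE_B", "GENE_C"]
    pathway = ["GENE_A", "GENE_B", "GENE_D", "GENE_E"]
    print(overlap_p_value(query, pathway, background_size=10))
```

A smaller p-value indicates that the overlap is less likely to arise by chance under this model. In practice the calculation would be repeated for every gene set in a library and corrected for multiple testing.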