Metapub Citations
Research citing metapub in PubMed journals, posters, and beyond.
Advancing pharmacogenetics research in Africa: the “Project Africa GRADIENT” initiative
Authors: Carene Anne Alene Ndong Sima, Houcemeddine Othman, Marlo Möller, The Project Africa GRADIENT Consortium
DOI: 10.1016/j.drudis.2024.103939
Bibliometric data were extracted using an in-house Python script using the metapub library, which searches for articles with African country affiliations that match the terms '…
Emerging trends in multiple sclerosis research
Authors: Gloria Dalla Costa, Giancarlo Comi
DOI: 10.1016/j.msard.2022.104124
Methods
The list of Pubmed ID (PMID) of articles published from 2000 onwards on the research topic ‘Multiple Sclerosis’ has been downloaded from Pubmed, and for each PMID the abstract was retrieved using the metapub library for Python.
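As an illustration only (not the authors' actual script), retrieving abstracts by PMID with metapub might look like the sketch below. `PubMedFetcher.article_by_pmid` and the `abstract` attribute come from metapub. The helper names `make_pubmed_fetcher` and `abstracts_for_pmids` are hypothetical, and live use assumes metapub is installed and NCBI is reachable.

```python
from typing import Dict, Iterable, Optional


def make_pubmed_fetcher():
    """Build a metapub fetcher (assumes `pip install metapub` and network access)."""
    # Imported lazily so the helper below can be exercised offline with a stand-in fetcher.
    from metapub import PubMedFetcher
    return PubMedFetcher()


def abstracts_for_pmids(pmids: Iterable, fetcher) -> Dict[str, Optional[str]]:
    """Map each PMID to its abstract text (None when the record has no abstract)."""
    abstracts = {}
    for pmid in pmids:
        article = fetcher.article_by_pmid(pmid)
        abstracts[str(pmid)] = article.abstract
    return abstracts
```

With a live connection, the call would be `abstracts_for_pmids(pmid_list, make_pubmed_fetcher())`, where `pmid_list` is the list of PMIDs downloaded from PubMed.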
Uremic toxicity: gaining novel insights through AI-driven literature review
Authors: Hanjie Zhang, Peter Kotanko
Journal: Nephrology Dialysis Transplantation
Year: 2024
DOI: 10.1093/ndt/gfae069.657
Method
First, we collected on PubMed all abstracts related to the topic of “uremic toxins” through Metapub, a Python library designed to facilitate fetching metadata from PubMed. Second, we set up a RAG system that comprises 2 steps. In a retrieval step, the questions on topic (“uremic toxins”) and the documents (=all collected abstracts and manuscripts) are encoded into vectors (i.e., high-dimensional numerical representations). Similarity measures are used to find the best matches between documents and the questions on topic. Second, in the augmented generation step, the LLM (e.g., ChatGPT) uses these best matches of documents to generate a coherent and informed response.
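The retrieval step described above can be sketched in a few lines. This is a toy stand-in, not the authors' system: it assumes sparse term-frequency vectors in place of learned embeddings and uses cosine similarity as the similarity measure. The names `embed`, `cosine_similarity`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy 'embedding': a sparse term-frequency vector over lowercase word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, documents: List[str], top_k: int = 2) -> List[Tuple[float, str]]:
    """Return the top_k (score, document) pairs most similar to the question."""
    query_vector = embed(question)
    scored = [(cosine_similarity(query_vector, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

In a full pipeline, the documents returned by `retrieve` would be placed in the LLM prompt for the augmented generation step. A production system would swap `embed` for a dense embedding model.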
Loop Catalog: a comprehensive HiChIP database of human and mouse samples
Authors: J Reyna, K Fetter, R Ignacio et al.
DOI: 10.1101/2024.04.26.591349
Curating HiChIP and ChIP-seq Samples
To identify a comprehensive list of publicly-released HiChIP datasets, we developed a pipeline that scans NCBI’s Gene Expression Omnibus (GEO) database (Barrett et al., 2013) for studies performing HiChIP experiments. To extract information on these studies the BioPython.Entrez (Buchmann & Holmes, 2019) and metapub.convert (https://pypi.org/project/metapub/) packages were used. Raw sequencing data associated to these studies was then identified from the SRA database using the pysradb Python package (https://github.com/saketkc/pysradb) and the results were manually examined to extract HiChIP samples. ChIP-seq samples corresponding to these studies were also extracted if there was a record of them within the same GEO ID as the HiChIP sample.
Adera2.0: A Drug Repurposing Workflow for Neuroimmunological Investigations Using Neural Networks
Authors: Marzena Lazarczyk et al.
DOI: 10.3390/molecules27196453
The first phase of the workflow (phase I) covers the aim of building a database of the JSON format containing parsed PDFs (Figure 1). This phase consists of five steps. The first step’s objective is to fetch the PubMed IDs related to the search query. This is accomplished by using the PubMed fetcher function available through the Metapub python library. This step uses the input query to search for recent PubMed articles that match the query terms. After that, in the second step, the workflow fetches the abstracts and keywords of the retrieved PubMed IDs. This is achieved through the use of the python library Keybert. The third step in this phase involves downloading the identified PDFs; this is done using the fetch_PDFs library.
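A minimal sketch of the first step (query to PubMed IDs) followed by a metadata lookup, assuming metapub's `PubMedFetcher` with its `pmids_for_query` and `article_by_pmid` methods. The function name `search_pubmed` and the record layout are hypothetical, and this is not the Adera2.0 code itself.

```python
from typing import Dict, List, Optional


def search_pubmed(query: str, max_results: int = 20, fetcher=None) -> List[Dict[str, Optional[str]]]:
    """Search PubMed for `query` and return one small record per matching article."""
    if fetcher is None:
        # Assumes `pip install metapub` and network access to NCBI.
        from metapub import PubMedFetcher
        fetcher = PubMedFetcher()

    records = []
    # Step 1: the search query yields a list of PubMed IDs.
    for pmid in fetcher.pmids_for_query(query, retmax=max_results):
        # Step 2: each PubMed ID is resolved to article metadata.
        article = fetcher.article_by_pmid(pmid)
        records.append({"pmid": str(pmid), "title": article.title, "abstract": article.abstract})
    return records
```

Passing a `fetcher` explicitly makes the function easy to test with a stand-in object. Leaving it as `None` builds a real `PubMedFetcher` on first use.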
Navigating the Multiverse: A Hitchhiker’s Guide to Selecting Harmonisation Methods for Multimodal Biomedical Data
Authors: Murali Aadhitya Magateshvaren Saras, Mithun K. Mitra, Sonika Tyagi
DOI: 10.1101/2024.03.21.24304655
Based on existing literature reviews published over the past decade, a general outline was followed to select articles that mentioned multimodal learning techniques. An extensive search of various ML methods focused on biomedical data was initially gathered using the metapub (https://github.com/metapub/metapub) python module based on the following keywords...
Depression, anxiety, and burnout in academia: topic modeling of PubMed abstracts
Authors: Olga Lezhnina
DOI: 10.3389/frma.2023.1271385
3.2 The data
Web scraping from PubMed database was conducted with Python (version 3.9.7, Metapub library). It resulted in the sample of 2,846 abstracts of papers published in 903 journals in years 1975–2023. Prior to the analysis, the data was checked for quality (that is, no missing entries and unique DOIs).
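A quality check of the kind described (no missing entries, unique DOIs) could be sketched as follows. The record fields `pmid`, `doi`, and `abstract` and the function name `quality_report` are assumptions made for illustration, not details from the paper.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence

Record = Dict[str, Optional[str]]


def quality_report(records: List[Record],
                   required: Sequence[str] = ("pmid", "doi", "abstract")) -> Dict[str, list]:
    """Return indices of incomplete records and DOIs that occur more than once."""
    # A field counts as missing when it is absent, None, or only whitespace.
    incomplete = [
        index for index, record in enumerate(records)
        if any(not (record.get(field) or "").strip() for field in required)
    ]
    # DOIs are case-insensitive, so normalize before counting.
    doi_counts = Counter(
        record["doi"].strip().lower()
        for record in records
        if (record.get("doi") or "").strip()
    )
    duplicate_dois = sorted(doi for doi, count in doi_counts.items() if count > 1)
    return {"incomplete": incomplete, "duplicate_dois": duplicate_dois}
```

A dataset passes the check when both lists in the returned report are empty.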
TREASURE: Text Mining Algorithm Based on Affinity Analysis and Set Intersection to Find the Action of Tuberculosis Drugs against Other Pathogens
Authors: Pradeepa Sampath et al.
Journal: Appl. Sci.
Year: 2021
DOI: 10.3390/app11156834
4.1. Data Preprocessing
Around eight drugs are analyzed with this model. For this purpose, abstracts from each document are collected, as they provide the accurate and necessary information about the paper. The PubMed abstracts have a unique ID called PMID. The metapub library in python gets these IDs as input and extracts their corresponding abstracts. The number of document abstracts collected for each drug from PubMed is given in Table 1.
Trends in Technology Usage for Parkinson’s Disease Assessment: A Systematic Review
Authors: Ranadeep Deb, Ganapati Bhat, Sizhe An, Holly Shill, Umit Y. Ogras
Year: 2021
DOI: 10.1101/2021.02.01.21250939
DATA COLLECTION
The methodologies used for downloading the data from the four online databases are also different. The documents were exported in comma-separated values (CSV) format from IEEE Xplore, tab-delimited format from MDPI and BibTeX format from Science Direct. For PubMed Central, by contrast, we used a Python-based API, Metapub [104], for an automated search. The information extracted from all of the databases was accumulated and stored together in a .CSV file.
Citation given as:
N. Most. (1999) metapub. PyPI. https://pypi.org/project/metapub/, accessed March 7, 2019.
Promoting Fairness in Classification of Quality of Medical Evidence
Authors: Simon Šuster, Timothy Baldwin, Karin Verspoor
DOI: 10.18653/v1/2023.bionlp-1.39
Data
We collect a large dataset of clinical trial abstracts from studies for which manual RoB annotations exist in CDSR, similarly to Marshall et al. (2015a). Starting with the PubMed identifiers for the studies included in CDSR, we then searched for abstracts using the metapub package (5), obtaining a total of around 24,000 abstracts.
Automatic Citation Extraction and Analysis
Authors: Tyler J. Conn, Samuel Hill, Ahmed Mustafa, Dr. Andrea Tartaro
DOI: 10.3115/1699750.1699764
After obtaining the DOI, it is entered into the MetaPub module to extract information on both the original article and all of its citations [1].