Martin Krallinger

Head of Natural Language Processing for Biomedical Information Analysis Unit

Barcelona, Catalonia, Spain

About

The Natural Language Processing for Biomedical Information Analysis (NLP4BIA) research group, led by Dr. Martin Krallinger at BSC, is a multidisciplinary team of engineers, computational linguists, healthcare experts, and software developers dedicated to the development, application and evaluation of Text Mining, Natural Language Processing (NLP) and Language Technology systems for a range of health and biomedical user scenarios. The NLP4BIA team focuses on creating publicly accessible, high-quality biomedical NLP resources to unlock key information and improve the analysis of a variety of unstructured data sources, including clinical reports, biomedical literature, clinical trials, patents, and social media content written in different languages, mainly Spanish and English (but also other languages such as Catalan). The HPC-empowered AI and deep-learning-based text mining resources developed by the group form the basis of more sophisticated semantic search and information retrieval technologies, as well as large-scale semantic annotation and document indexing strategies that generate structured data from clinical and biomedical texts. This in turn enables enhanced mining and data analytics approaches that can be exploited for predictive modeling of healthcare data. In this line, the NLP4BIA group has developed and released a range of corpora that have helped foster the development of new, cutting-edge deep learning, Transformer and language-model-based solutions by a global NLP research community through high-impact open benchmark shared tasks (e.g. BioCreative, IberEVAL, IberLEF, BioASQ CLEF, eHealth CLEF, Biomedical WMT, BioNLP-OST or SMM4H).
Through international and national research collaborations, the NLP4BIA group's research aims to unlock information from unstructured health data, which is critical to empowering AI-based medical data analytics tools that benefit both research and public healthcare systems (professionals, patients, industry), by integrating technology into the healthcare value chain for clinical applications and use cases. Among the practical exploitation scenarios for biomedical NLP, the group is working on use cases related to cardiology and cardiovascular diseases (e.g. heart failure), occupational health, biomaterial and chemical entity text mining, rheumatology and rare diseases, COVID-19, and cancer, as well as the generation of knowledge graphs from text (e.g. extraction of gene regulatory networks and drug-target interactions).

Experience

  • Head of Natural Language Processing for Biomedical Information Analysis Unit at Barcelona Supercomputing Center

    I am currently the head of the Natural Language Processing for Biomedical Information Analysis (NLP4BIA) research unit, a multidisciplinary team of engineers, computational linguists, healthcare experts, and software developers dedicated to the development, application and evaluation of Text Mining, Natural Language Processing (NLP) and Language Technology systems for a range of health and biomedical user scenarios. The NLP4BIA team focuses on creating publicly accessible, high-quality biomedical NLP resources to unlock key information and improve the analysis of a variety of unstructured data sources, including clinical reports, biomedical literature, clinical trials, patents, and social media content written in different languages, mainly Spanish and English (but also other languages such as Catalan, Italian, Swedish or Romanian). The HPC-empowered AI and deep-learning-based text mining resources developed by the group form the basis of more sophisticated semantic search and information retrieval technologies, as well as large-scale semantic annotation and document indexing strategies that generate structured data from clinical and biomedical texts. This in turn enables enhanced mining and data analytics approaches that can be exploited for predictive modeling of healthcare data. The NLP4BIA group has developed and released a range of corpora that have helped foster the development of new, cutting-edge deep learning, Transformer and language-model-based solutions by a global NLP research community through high-impact open benchmark shared tasks (e.g. BioCreative, IberEVAL, IberLEF, BioASQ CLEF, eHealth CLEF, Biomedical WMT, BioNLP-OST or SMM4H). Regarding practical user scenarios, my group is working on NLP applied to cardiology, cardiovascular diseases, occupational health, biomaterials/chemical entity text mining, rheumatology and rare diseases, COVID-19, cancer, gene regulatory networks, and drug-target interactions.

  • Head of Text Mining Unit at Barcelona Supercomputing Center

  • Head of Biological Text Mining Unit at CNIO - Spanish National Cancer Research Centre

    Head of the text mining unit at the Spanish National Cancer Research Centre (CNIO).

  • Honorary/Invited Professor at Universitat Pompeu Fabra

  • Technical research position at CNIO - Spanish National Cancer Research Centre

    * Development of biomedical text mining applications and integration of information extraction data for various EU projects (OpenMinTed, DIAMONDS, ENFIN, MICROME, eTOX).
    * Research on text mining infrastructure for several topics, including: supervised machine learning text classifiers (for cancer types, e.g. melanoma, pancreas cancer, cell cycle, mitotic spindle), named entity recognition and normalization (genes/proteins, chemical compounds and mutations), and implementation of relation extraction systems (protein subcellular localization, protein-protein interactions, adverse effects of chemical compounds for toxicological endpoints, enzymatic reactions).
    * Evaluation of text mining and named entity recognition systems in biology and chemistry: BioCreative community challenge tasks (PPI task, CHEMDNER).
    * Teaching a text mining lecture in the Bioinformatics master's program.
    * Supervision of summer students and visiting researchers.

  • Predoctoral research position at CNIO - Spanish National Cancer Research Centre

  • Predoctoral research position at Centro Nacional de Biotecnología (CNB/CSIC)