Zürich Metropolitan Area
Portfolio: katwre.github.io
彡 PhD in computational molecular biology and 9+ years of experience in industry and academia.
彡 Highly skilled in programming (mostly Python and R), in applying and developing machine learning methods for analysing and integrating large-scale genomic data, and in data visualisation and creative thinking.
彡 Thrives in dynamic international environments with close collaboration between teams.
彡 Interests:
• Machine learning/AI applications on genomic data
• Gene regulation and regulatory networks
• Disease (epi)genomics
- B2B with companies and start-ups (01.02.2025-), involving:
  - Endometriosis classification, biomarker panel development, and immune cell dynamics using targeted RNA-seq and scRNA-seq from menstrual effluent samples (scanpy, scvi-tools, scikit-learn, xgboost, PyTorch, seaborn, matplotlib)
  - Building a VAE+LSTM-based model to detect spikes of chemicals in laser particle counter data, and contributing to a Kafka-based streaming pipeline (PyTorch, pandas, Kafka, ZooKeeper, MinIO, PostgreSQL)
- B2B contract with Ardigen, a digital Contract Research Organization (CRO) specializing in AI-driven drug discovery and bioinformatics https://ardigen.com/digital-cro/ (01.02.2023 - 31.01.2025)
- B2B contract with Selvita S.A., a Contract Research Organization (CRO) specializing in drug discovery and development https://selvita.com/ (01.02.2024 - 31.01.2025)
Joint position with Prof. Dr. Ori Bar-Nur (40%, https://rmb.ethz.ch/) and Prof. Dr. Ferdinand von Meyenn (40%, https://epigenetics.ethz.ch/) at ETH Zurich, and Dr. Hubert Rehrauer (20%, https://www.sib.swiss/hubert-rehrauer-group) at the Functional Genomics Center Zurich.
Collaborated with pharmaceutical companies and a start-up on projects involving:
* Biomarker discovery based on publicly available RNA-seq, ATAC-seq, Bisulfite-seq, and histone modification ChIP-seq data in the context of Alzheimer's disease. Used Nextflow nf-core pipelines, applied machine learning and deep learning algorithms to find target genes and enhancers, and discussed wet-lab experiments for validating the results (R/BioC/Nextflow)
* Adding custom improvements to the visualisation and integration of in-house and publicly available (epi)genomic sequencing data in the IGV app browser, in collaboration with software engineers and a UX/UI designer (Python/JavaScript)
* Analysis and visualisation of cancer-related gene expression and mutation (SNPs/CNVs) datasets to find novel targets for patients with limited therapy options (R/BioC)
* Analysis of single-cell RNA sequencing samples (R/BioC)
* Identifying novel anticancer targets using machine learning (Python/scikit-learn, pandas, numpy, seaborn, matplotlib) and deep learning methods (PyTorch)
Ph.D. student in the group of Dr. Altuna Akalin in the Bioinformatics & Omics Data Science Platform at the Max Delbrück Center (the Berlin Institute for Medical Systems Biology). My Ph.D. work provided novel insights into how DNA methylation dynamics affect gene regulation and transcription factor binding in cancerous and non-cancerous cells.
• Performed exploratory analysis of the high-risk paediatric cancer neuroblastoma using clinical data derived from solid tumor samples and cell-free DNA, and worked collaboratively with medical doctors from the Charité hospital. Identified DNA methylation biomarkers and gene regulation abnormalities for early diagnosis and improved therapy choice. Applied machine learning methods such as multivariate logistic regression, elastic net / ridge regression, random forests, and XGBoost. Analysed patient-derived Bisulfite-seq, RNA-seq, and ChIP-seq data.
• Published a peer-reviewed paper in which I explored the phenomenon of inconsistent ChIP-seq results and its common factor, poor antibody performance, and computationally and statistically analysed segments of the genome with an unusually high number of transcription factor binding sites derived from false-positive ChIP-seq peaks, caused by antibodies binding to RNA-DNA secondary structures called R-loops.
• Demonstrated knowledge of Python and a workflow management system by building a Snakemake pipeline for Bisulfite-seq analysis and data visualisation, PiGx Bisulfite-seq, as part of PiGx: Pipelines in Genomics.
• Maintained and developed the Bioconductor R package genomation (github.com/BIMSBbioinfo/genomation) for summary, annotation, and visualisation of genomic data; created the motifActivity R package for computational reconstruction of transcription factor networks from sequencing data (github.com/katwre/motifActivity); and provided the computational analyses included in my publications as collections of R scripts.
Recipient of a 7000€ scholarship from the Institute of Computer Science, Polish Academy of Sciences, co-financed by EU funds.
• Worked on the development and maintenance of the Bioconductor R package genomation (github.com/BIMSBbioinfo/genomation), created for summary, annotation, and visualisation of genomic data, and on a project on genomic regions with an unusually high number of transcription factor binding sites. Project supervised by Dr. Altuna Akalin.
• Collaborated with the IPI PAN Institute and delivered monthly progress presentations and reports.