Prof Dr Olga Kalinina
Bioinformatics is instrumental in all areas of molecular biology, from the analysis of genome sequences to the prediction of three-dimensional structures of drug-target complexes. We apply cutting-edge bioinformatics and computer science techniques to discover novel resistance mechanisms and to predict the mode of action of bioactive compounds.
Our research and approach
One particular focus of our group is the development of machine learning tools for predicting the functional consequences of genetic variants that can be associated with a particular disease or resistance phenotype. In doing so, we aim to predict not only the direction and magnitude of the effect, i.e. whether a certain variant is likely to be pathogenic or to cause resistance to a drug, but also the exact molecular mechanism responsible for it. We do so by combining phylogenetic methods with approaches from structural bioinformatics, namely computational modelling of the three-dimensional structure of proteins, their interactions, and their dynamics, united in a robust machine learning framework. A particular emphasis of this line of work is the discovery of novel resistance mechanisms. Another focus of the research group is the investigation of protein-drug interactions and drug-binding pockets with data-mining approaches based on graph theory. We aim to describe protein functional motifs and the drug-binding patterns in them, and eventually to develop a novel machine-learning tool for predicting drug affinity from structural descriptors of protein-drug interactions.
By analyzing the spatial distribution of genetic variants in the three-dimensional structures of the proteins harboring them and of their homologs, we can produce hypotheses about the functional consequences of these variants. For example, if a mutation caused by a single-nucleotide polymorphism lies on an interaction interface with another protein or in a ligand-binding pocket, it may affect the corresponding binding affinity, whereas mutations lying in the protein core can be detrimental to its stability. We develop methods that can annotate very large datasets in this way, providing insight into the relation between the annotated pathogenic or functional effect of mutations and their location in the three-dimensional structures of proteins and their complexes.
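The kind of structural annotation described above can be illustrated with a minimal sketch. It is not the group's actual method: the function names, the 5 Å contact cutoff, and the neighbour-count threshold used as a crude proxy for burial are all assumptions chosen for illustration.

```python
import math

def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between two lists of (x, y, z) points."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)

def classify_location(residue_atoms, ligand_atoms, partner_atoms,
                      n_neighbours, contact_cutoff=5.0, core_neighbours=20):
    """Assign a mutated residue to a coarse structural class.

    contact_cutoff (in angstroms) and core_neighbours are illustrative
    assumptions, not values taken from the text.
    """
    # Close to a bound ligand: the mutation may change ligand affinity.
    if ligand_atoms and min_distance(residue_atoms, ligand_atoms) <= contact_cutoff:
        return "ligand-binding pocket"
    # Close to another protein chain: the mutation may change protein binding.
    if partner_atoms and min_distance(residue_atoms, partner_atoms) <= contact_cutoff:
        return "protein interaction interface"
    # Many surrounding residues: treated here as buried in the core.
    if n_neighbours >= core_neighbours:
        return "core"
    return "surface"

# Hypothetical residue 3 angstroms away from a ligand atom.
print(classify_location([(0.0, 0.0, 0.0)], [(3.0, 0.0, 0.0)], [], n_neighbours=5))
```

Checking the ligand first and the partner chain second is an arbitrary ordering; a real annotation pipeline would more likely report all applicable classes and use a proper solvent-accessibility estimate instead of a neighbour count.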
We build machine-learning methods for predicting the impact of mutations using a variety of features related to protein three-dimensional structures, interactions, and evolution. The methods can be trained to predict the impact of mutations on protein function, as well as their pathogenicity, which correlates with the impact on protein function. Additionally, we explore the possibility of training such methods to predict the impact on more specific phenotypes, such as resistance to antibacterial compounds.
In this BMBF-funded project in cooperation with the Hamburg University and University Hospital Greifswald, we apply our expertise in structural modelling and annotation to investigate novel mechanisms of pathogenesis in cardiac and renal diseases, focussing on the alterations of protein sequences caused by disease-specific alternative splicing events.
Drug side effects are widespread, yet their mechanisms are poorly understood. In this BMBF-funded project, we aim to address the molecular basis and population importance of such mechanisms in cooperation with the Hamburg University and DESY. We combine systems biology, structural bioinformatics, cheminformatics and machine learning to investigate the effects of off-target drug binding and the influence of genetic variation on it.
Antimicrobial resistance (AMR) is perhaps the most urgent threat to human health. While individual resistance mutations are well researched, knowing which new mutations can cause antimicrobial resistance is key to developing drugs that reliably sidestep microbial defenses. In a HelmholtzAI-funded project in cooperation with CISPA, we investigate AMR via explainable artificial intelligence, by developing and applying novel methods for discovering easily interpretable local patterns that are significant with regard to one or multiple classes of resistance. We learn a small set of easily interpretable models that together explain the resistance mechanisms in the data, using statistically robust methods for discovering significant subgroups, as well as information-theoretic approaches to discovering succinct sets of noise-robust rules.
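As a rough illustration of what "significant with regard to a class of resistance" can mean for a single candidate pattern, the sketch below computes a one-sided Fisher's exact test on a 2x2 table of isolates. This is a standard textbook test, not the project's method; the function name and the example counts are hypothetical, and a real subgroup-discovery procedure would also have to correct for testing many patterns.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: isolates with the pattern that are resistant
    b: isolates with the pattern that are susceptible
    c: isolates without the pattern that are resistant
    d: isolates without the pattern that are susceptible

    Returns the probability of observing at least `a` resistant isolates
    among those carrying the pattern if pattern and resistance were
    independent (hypergeometric tail).
    """
    with_pattern = a + b
    resistant = a + c
    total = a + b + c + d
    denominator = comb(total, resistant)
    p_value = 0.0
    for k in range(a, min(with_pattern, resistant) + 1):
        p_value += (comb(with_pattern, k)
                    * comb(total - with_pattern, resistant - k)) / denominator
    return p_value

# Hypothetical counts: the pattern occurs in 3 isolates, all resistant,
# and is absent from 3 isolates, all susceptible.
print(fisher_exact_greater(3, 0, 0, 3))
```

A small p-value suggests that the pattern is enriched among resistant isolates more strongly than expected by chance.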
Recent developments in the field of protein structure prediction have led to a drastic increase in the number of available protein three-dimensional structures. This creates a challenge, and at the same time an opportunity, to find suitable approaches for using such new datasets in various machine learning settings. In parallel, large language models (LLMs) trained on protein sequences, called protein large language models (PLLMs), have proven useful in various bioinformatics problem settings. In our work, we are interested in applying PLLMs and extending them by considering protein three-dimensional structure. We use sequence-based pre-trained PLLMs and our own structure-based representations, separately and combined, and try to interpret their behaviour on our data. We have developed a self-supervised learning approach for protein three-dimensional structures, based on convolutional graph neural networks and graph transformers, that creates meaningful embeddings of protein structures. We have demonstrated its utility in a variety of downstream tasks, including the prediction of drug-target interactions and of the products of biosynthetic gene clusters.
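Graph neural networks operate on a graph representation of the protein structure. One common way to obtain such a graph, shown here only as a minimal sketch, is to connect residues whose C-alpha atoms lie within a distance cutoff. The function name and the 8 Å cutoff are assumptions made for illustration and are not taken from the group's own pipeline.

```python
import math

def contact_graph(ca_coords, cutoff=8.0):
    """Build a residue contact graph from C-alpha coordinates.

    ca_coords: list of (x, y, z) tuples, one per residue, in angstroms.
    cutoff: distance threshold in angstroms (an illustrative assumption).
    Returns an adjacency list {residue_index: [neighbour indices]}.
    """
    n = len(ca_coords)
    adjacency = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                adjacency[i].append(j)
                adjacency[j].append(i)
    return adjacency

# Hypothetical three-residue fragment: two adjacent residues and a distant one.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(contact_graph(coords))
```

In a learning setting, each node would additionally carry residue-level features, for example amino acid type or a sequence-based embedding, before the graph is passed to a network.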
We employ the latest developments in protein-based pre-trained models to create efficient deep learning models for predicting protein-drug interactions. In these models, we use representations not only of drugs but also of protein targets, in particular of their three-dimensional structures. This makes our models unique in the field. To make these models even more useful for biomedical research, in cooperation with the NextAID project at Saarland University, we explore explainability approaches to identify the amino acids in the target proteins that contribute most to interactions. A similar approach is taken in the HelmholtzAI-funded project XAI-Graph with UFZ, where we explore explainable graph-based drug-target interaction models applied to toxicology research.
An extended catalogue of tandem alternative splice sites in human tissue transcriptomes
Mironov A, Denisov S, Gress A, Kalinina O, Pervouchine D (2020)
The bottromycin epimerase BotH defines a group of atypical α/β-hydrolase-fold enzymes
Sikandar A, Franz L, Adam S, Santos-Aberturas J, Horbal L, Luzhetskyy A, Truman A, Kalinina O, Koehnke J (2020)
Nat Chem Biol 16 (9): 1013-1018. DOI: 10.1038/s41589-020-0569-y
DIGGER: exploring the functional role of alternative splicing in protein interactions
Louadi Z, Yuan K, Gress A, Tsoy O, Kalinina O, Baumbach J, Kacprowski T, List M (2020)
Nucleic Acids Res. DOI: 10.1093/nar/gkaa768
Frequent subgraph mining for biologically meaningful structural motifs
Keller S, Miettinen P, Kalinina O (2020)
SphereCon-a method for precise estimation of residue relative solvent accessible area from limited structural information
Gress A, Kalinina O (2020)
Bioinformatics (Oxford, England) 36 (11): 3372-3378. DOI: 10.1093/bioinformatics/btaa159
Resistance-associated substitutions in patients with chronic hepatitis C virus genotype 4 infection
Dietz J, Kalinina O, Vermehren J, Peiffer K, Matschenz K, Buggisch P, Niederau C, Schattenberg J, Müllhaupt B, Yerly S, …, Welsch C, Sarrazin C (2020)
J. Viral Hepat. DOI: 10.1111/jvh.13322
Non-active site mutants of HIV-1 protease influence resistance and sensitisation towards protease inhibitors
Bastys T, Gapsys V, Walter H, Heger E, Doncheva N, Kaiser R, Groot B, Kalinina O (2020)
Retrovirology 17 (1). DOI: 10.1186/s12977-020-00520-6
A shift of dynamic equilibrium between the KIT active and inactive states causes drug resistance
Srikakulam S, Bastys T, Kalinina O (2020)
Relative Principal Components Analysis: Application to Analyzing Biomolecular Conformational Changes
Ahmad M, Helms V, Kalinina O, Lengauer T (2019)
Journal of chemical theory and computation 15 (4): 2166-2178. DOI: 10.1021/acs.jctc.8b01074
Adenosine-to-Inosine RNA Editing in Mouse and Human Brain Proteomes
Levitsky L, Kliuchnikova A, Kuznetsova K, Karpov D, Ivanov M, Pyatnitskiy M, Kalinina O, Gorshkov M, Moshkovskii S (2019)
Proteomics 19 (23). DOI: 10.1002/pmic.201900195
Targeting actin inhibits repair of doxorubicin-induced DNA damage: a novel therapeutic approach for combination therapy
Pfitzer L, Moser C, Gegenfurtner F, Arner A, Foerster F, Atzberger C, Zisis T, Kubisch-Dohmen R, Busse J, Smith R, …, Vollmar A, Zahler S (2019)
Cell Death Dis. 10 (4). DOI: 10.1038/s41419-019-1546-9
Epistatic Interactions in NS5A of Hepatitis C Virus Suggest Drug Resistance Mechanisms
Knops E, Sierra S, Kalaghatgi P, Heger E, Kaiser R, Kalinina O (2018)
Genes 9 (7). DOI: 10.3390/genes9070343
Consistent Prediction of Mutation Effect on Drug Binding in HIV-1 Protease Using Alchemical Calculations
Bastys T, Gapsys V, Doncheva N, Kaiser R, Groot B, Kalinina O (2018)
Journal of chemical theory and computation 14 (7): 3397-3408. DOI: 10.1021/acs.jctc.7b01109
Patterns of amino acid conservation in human and animal immunodeficiency viruses
Voitenko O, Dhroso A, Feldmann A, Korkin D, Kalinina O (2016)
Bioinformatics (Oxford, England) 32 (17): 685- DOI: 10.1093/bioinformatics/btw441
StructMAn: annotation of single-nucleotide polymorphisms in the structural context
Gress A, Ramensky V, Büch J, Keller A, Kalinina O (2016)
Nucleic Acids Res 44 (W1): 463-8. DOI: 10.1093/nar/gkw364