Drug Bioinformatics

Prof Dr Olga Kalinina

Bioinformatics is instrumental in all areas of molecular biology, from the analysis of genome sequences to the prediction of three-dimensional structures of drug-target complexes. We apply cutting-edge bioinformatics and computer science techniques to discover novel resistance mechanisms and to predict the mode of action of bioactive compounds.

Our research and approach

One particular focus of our group is the development of machine learning tools for predicting the functional consequences of genetic variants that can be associated with a particular disease or resistance phenotype. In doing so, we aim to predict not only the direction and magnitude of the effect, i.e. whether a certain variant is likely to be pathogenic or to cause resistance to a drug, but also the exact molecular mechanism responsible for it. We do so by combining phylogenetic methods with approaches from structural bioinformatics, namely computational modelling of the three-dimensional structures of proteins, their interactions, and their dynamics, united in a robust machine learning framework. A particular emphasis of this line of work is the discovery of novel resistance mechanisms.

Another focus of the research group is the investigation of protein-drug interactions and drug-binding pockets with data-mining and graph-theory-based approaches. We aim to describe protein functional motifs and the drug-binding patterns in them, and eventually to develop a novel machine-learning tool for predicting drug affinity from structural descriptors of protein-drug interactions.

Team members

Group Leader

Assistant

Postdoc

Postdoc

PhD Student

PhD Student

PhD Student

PhD Student

PhD Student

PhD Student

PhD Student

Research projects

Structural annotation of genetic variants

By analyzing the spatial distribution of genetic variants in the three-dimensional structures of the proteins harboring them and of their homologs, we can produce hypotheses about the functional consequences of these variants. For example, if a mutation caused by a single-nucleotide polymorphism lies on an interaction interface with another protein or in a ligand-binding pocket, it may affect the corresponding binding affinity, whereas mutations lying in the protein core can be detrimental to its stability. We develop methods that can annotate very large datasets in this way, providing insight into the relation between the annotated pathogenic or functional effect of mutations and their location in the three-dimensional structures of proteins and their complexes.
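The reasoning above (interface, ligand-binding pocket, or core) can be illustrated with a minimal sketch. It is not the group's annotation method: the function names, the 5 Å contact cutoff, and the 0.2 relative solvent accessibility (RSA) threshold are all assumptions chosen for illustration, and the coordinates are expected to come from a structure file parsed elsewhere.

```python
import math

# Hypothetical thresholds, chosen for illustration only.
CONTACT_CUTOFF = 5.0   # angstroms; maximum atom-atom distance counted as a contact
CORE_RSA_CUTOFF = 0.2  # residues with lower relative solvent accessibility count as core


def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between two collections of 3D coordinates."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def classify_residue(residue_atoms, rsa, ligand_atoms=(), partner_atoms=()):
    """Assign a coarse structural class to the residue that carries a variant.

    residue_atoms: coordinates of the atoms of the mutated residue
    rsa:           relative solvent accessibility of that residue (0 to 1)
    ligand_atoms:  coordinates of a bound ligand, if any
    partner_atoms: coordinates of an interacting protein chain, if any
    """
    if ligand_atoms and min_distance(residue_atoms, ligand_atoms) <= CONTACT_CUTOFF:
        return "ligand-binding pocket"
    if partner_atoms and min_distance(residue_atoms, partner_atoms) <= CONTACT_CUTOFF:
        return "protein-protein interface"
    if rsa < CORE_RSA_CUTOFF:
        return "core"
    return "surface"
```

For instance, `classify_residue([(0.0, 0.0, 0.0)], rsa=0.5, ligand_atoms=[(3.0, 0.0, 0.0)])` places the residue in a ligand-binding pocket, which would suggest a possible effect on binding affinity rather than on stability.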

Prediction of functional effect of mutations

We build machine-learning methods for predicting the impact of mutations using a variety of features related to protein three-dimensional structures, interactions, and evolution. The methods can be trained to predict the impact of mutations on protein function, as well as the pathogenicity of the mutations, which correlates with their effect on protein function. Additionally, we explore the possibility of training such methods to predict the impact on more specific phenotypes, such as resistance to antibacterial compounds.

Systems medicine investigation of alternative splicing in cardiac and renal diseases (Sys_CARE)

In this BMBF-funded project, carried out in cooperation with the University of Hamburg and University Hospital Greifswald, we apply our expertise in structural modelling and annotation to investigate novel mechanisms of pathogenesis in cardiac and renal diseases, focussing on the alterations of protein sequences caused by disease-specific alternative splicing events.

AI-based methods for synergistic exploration of disease symptoms and drug side effects (DRUGSIDERAI)

Drug side effects are widespread, yet their mechanisms are poorly understood. In this BMBF-funded project, we aim to address the molecular basis and population importance of such mechanisms in cooperation with the University of Hamburg and DESY. We combine systems biology, structural bioinformatics, cheminformatics and machine learning to investigate the effects of off-target drug binding and the influence of genetic variation on it.

Identification of novel resistance mechanisms with AI (AMR-XAI)

Antimicrobial resistance (AMR) is perhaps the most urgent threat to human health. While individual resistance mutations are well researched, knowing which new mutations can cause antimicrobial resistance is key to developing drugs that reliably sidestep microbial defenses. In a HelmholtzAI-funded project in cooperation with CISPA, we investigate AMR via explainable artificial intelligence, developing and applying novel methods for discovering easily interpretable local patterns that are significant with regard to one or multiple classes of resistance. We learn a small set of easily interpretable models that together explain the resistance mechanisms in the data, using statistically robust methods for discovering significant subgroups, as well as information-theoretic approaches to discovering succinct sets of noise-robust rules.
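To make the idea of a pattern that is "significant with regard to a class of resistance" concrete, the sketch below tests single mutations for enrichment among resistant isolates. It deliberately swaps in a much simpler technique than the subgroup-discovery and rule-mining methods described above: a one-sided Fisher's exact test with Bonferroni correction. The function names and the input layout (a list of mutation sets with a resistance flag) are assumptions made for illustration.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test p-value for a 2x2 table.

    a: resistant isolates with the pattern    b: resistant without it
    c: susceptible isolates with the pattern  d: susceptible without it
    Returns P(X >= a) under the hypergeometric null distribution.
    """
    n_resistant = a + b
    n_pattern = a + c
    n_total = a + b + c + d
    upper = min(n_resistant, n_pattern)
    favourable = sum(
        comb(n_resistant, k) * comb(n_total - n_resistant, n_pattern - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(n_total, n_pattern)


def significant_mutations(isolates, alpha=0.05):
    """Return {mutation: p-value} for mutations enriched in resistant isolates.

    isolates: list of (set_of_mutations, is_resistant) pairs (hypothetical layout).
    """
    mutations = set().union(*(muts for muts, _ in isolates))
    threshold = alpha / max(len(mutations), 1)  # Bonferroni correction
    hits = {}
    for mut in sorted(mutations):
        a = sum(1 for muts, res in isolates if res and mut in muts)
        b = sum(1 for muts, res in isolates if res and mut not in muts)
        c = sum(1 for muts, res in isolates if not res and mut in muts)
        d = sum(1 for muts, res in isolates if not res and mut not in muts)
        p = fisher_enrichment_p(a, b, c, d)
        if p <= threshold:
            hits[mut] = p
    return hits
```

A per-mutation test like this only finds single-feature associations; combinations of mutations and noise-robust rule sets need the richer pattern languages that the project targets.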

Protein structure-based representation models

Recent developments in the protein structure prediction field have led to a drastic increase in the number of available protein three-dimensional structures. This creates both a challenge and an opportunity: finding suitable approaches to utilise these new datasets in various machine learning settings. At the same time, large language models (LLMs) trained on protein sequences, called protein large language models (PLLMs), have proven useful in various bioinformatics problem settings. In our work, we are interested in applying PLLMs and extending them by taking protein three-dimensional structure into account. We use sequence-based pre-trained PLLMs and our own structure-based representations, separately and combined, and try to interpret their behaviour on our data. We have developed a self-supervised learning approach for protein three-dimensional structures, based on convolutional graph neural networks and graph transformers, that creates meaningful embeddings of protein structures. We have demonstrated its utility in a variety of downstream tasks, including the prediction of drug-target interactions and the prediction of products of biosynthetic gene clusters.
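Graph neural networks need a structure expressed as a graph. One common way to do this, shown here only as a sketch, is a residue contact graph in which nodes are residues and edges connect residues whose C-alpha atoms lie within a distance cutoff. The C-alpha-only representation, the 8 Å cutoff, and the function name are assumptions, not a description of the representation used in the group's models.

```python
import math


def contact_graph(ca_coords, cutoff=8.0):
    """Build an undirected residue contact graph from C-alpha coordinates.

    ca_coords: list of (x, y, z) tuples in angstroms, one per residue.
    cutoff:    maximum C-alpha distance for two residues to share an edge
               (8.0 is an illustrative default).
    Returns an adjacency list {residue_index: [neighbour indices in ascending order]}.
    """
    n = len(ca_coords)
    graph = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                graph[i].append(j)
                graph[j].append(i)
    return graph
```

The resulting adjacency list, together with per-residue features such as amino acid type or sequence-model embeddings, is the kind of input a graph convolution or graph transformer layer would consume.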

Explainable Deep Learning for Drug-Target interactions (NEXTAID, XAI-GRAPH)

We employ the latest developments in protein-based pre-trained models to create efficient deep learning models for predicting protein-drug interactions. In these models, we use representations not only of drugs but also of protein targets, in particular of their three-dimensional structures. This makes our models unique in the field. To make these models even more useful for biomedical research, in cooperation with the NextAID project at Saarland University, we explore explainability approaches to identify the amino acids in the target proteins that contribute most to interactions. A similar approach is taken in the HelmholtzAI-funded project XAI-Graph with UFZ, where we explore explainable graph-based drug-target interaction models applied to toxicology research.

Publications

2020

An extended catalogue of tandem alternative splice sites in human tissue transcriptomes

Mironov A, Denisov S, Gress A, Kalinina O, Pervouchine D (2020)

DOI: 10.1101/2020.09.11.292722

The bottromycin epimerase BotH defines a group of atypical α/β-hydrolase-fold enzymes

Sikandar A, Franz L, Adam S, Santos-Aberturas J, Horbal L, Luzhetskyy A, Truman A, Kalinina O, Koehnke J (2020)

Nat Chem Biol 16 (9): 1013-1018. DOI: 10.1038/s41589-020-0569-y

DIGGER: exploring the functional role of alternative splicing in protein interactions

Louadi Z, Yuan K, Gress A, Tsoy O, Kalinina O, Baumbach J, Kacprowski T, List M (2020)

Nucleic Acids Res. DOI: 10.1093/nar/gkaa768

Frequent subgraph mining for biologically meaningful structural motifs

Keller S, Miettinen P, Kalinina O (2020)

DOI: 10.1101/2020.05.14.095695

SphereCon-a method for precise estimation of residue relative solvent accessible area from limited structural information

Gress A, Kalinina O (2020)

Bioinformatics (Oxford, England) 36 (11): 3372-3378. DOI: 10.1093/bioinformatics/btaa159

Resistance-associated substitutions in patients with chronic hepatitis C virus genotype 4 infection

Dietz J, Kalinina O, Vermehren J, Peiffer K, Matschenz K, Buggisch P, Niederau C, Schattenberg J, Müllhaupt B, Yerly S, …, Welsch C, Sarrazin C (2020)

J. Viral Hepat. DOI: 10.1111/jvh.13322

Non-active site mutants of HIV-1 protease influence resistance and sensitisation towards protease inhibitors

Bastys T, Gapsys V, Walter H, Heger E, Doncheva N, Kaiser R, Groot B, Kalinina O (2020)

Retrovirology 17 (1). DOI: 10.1186/s12977-020-00520-6

A shift of dynamic equilibrium between the KIT active and inactive states causes drug resistance

Srikakulam S, Bastys T, Kalinina O (2020)

Proteins. DOI: 10.1002/prot.25963

2019

Relative Principal Components Analysis: Application to Analyzing Biomolecular Conformational Changes

Ahmad M, Helms V, Kalinina O, Lengauer T (2019)

Journal of chemical theory and computation 15 (4): 2166-2178. DOI: 10.1021/acs.jctc.8b01074

Adenosine-to-Inosine RNA Editing in Mouse and Human Brain Proteomes

Levitsky L, Kliuchnikova A, Kuznetsova K, Karpov D, Ivanov M, Pyatnitskiy M, Kalinina O, Gorshkov M, Moshkovskii S (2019)

Proteomics 19 (23). DOI: 10.1002/pmic.201900195

Targeting actin inhibits repair of doxorubicin-induced DNA damage: a novel therapeutic approach for combination therapy

Pfitzer L, Moser C, Gegenfurtner F, Arner A, Foerster F, Atzberger C, Zisis T, Kubisch-Dohmen R, Busse J, Smith R, …, Vollmar A, Zahler S (2019)

Cell Death Dis. 10 (4). DOI: 10.1038/s41419-019-1546-9

2018

Epistatic Interactions in NS5A of Hepatitis C Virus Suggest Drug Resistance Mechanisms

Knops E, Sierra S, Kalaghatgi P, Heger E, Kaiser R, Kalinina O (2018)

Genes 9 (7). DOI: 10.3390/genes9070343

Consistent Prediction of Mutation Effect on Drug Binding in HIV-1 Protease Using Alchemical Calculations

Bastys T, Gapsys V, Doncheva N, Kaiser R, Groot B, Kalinina O (2018)

Journal of chemical theory and computation 14 (7): 3397-3408. DOI: 10.1021/acs.jctc.7b01109

Exploring the conformational landscapes of HIV protease structural ensembles using principal component analysis

Hassan S, Srikakulam S, Chandramohan Y, Thangam M, Muthukumar S, Gayathri Devi P, Hanna L (2018)

Proteins 86 (9): 990-1000. DOI: 10.1002/prot.25534

Adaptation of a Bacterial Multidrug Resistance System Revealed by the Structure and Function of AlbA

Sikandar A, Cirnski K, Testolin G, Volz C, Brönstrup M, Kalinina O, Müller R, Koehnke J (2018)

J. Am. Chem. Soc. 140 (48): 16641-16649. DOI: 10.1021/jacs.8b08895

2017

Spatial distribution of disease-associated variants in three-dimensional structures of protein complexes

Gress A, Ramensky V, Kalinina O (2017)

Oncogenesis 6 (9). DOI: 10.1038/oncsis.2017.79

2016

Patterns of amino acid conservation in human and animal immunodeficiency viruses

Voitenko O, Dhroso A, Feldmann A, Korkin D, Kalinina O (2016)

Bioinformatics (Oxford, England) 32 (17): 685-. DOI: 10.1093/bioinformatics/btw441

StructMAn: annotation of single-nucleotide polymorphisms in the structural context

Gress A, Ramensky V, Büch J, Keller A, Kalinina O (2016)

Nucleic Acids Res 44 (W1): 463-8. DOI: 10.1093/nar/gkw364

BALL-SNPgp-from genetic variants toward computational diagnostics

Mueller S, Backes C, Gress A, Baumgarten N, Kalinina O, Moll A, Kohlbacher O, Meese E, Keller A (2016)

Bioinformatics (Oxford, England) 32 (12): 1888-90. DOI: 10.1093/bioinformatics/btw084