Biological code of knots - identification of knotted patterns in biomolecules via AI approach

The goal of this project is to understand knotting phenomena in proteins using an artificial intelligence (AI) approach, and to develop means to predict de novo knotted protein structures. The project belongs to the area of fundamental research in structural biology. It focuses on the relations between the sequence, structure, and function of proteins, and on one of the most difficult aspects in this context: the role of non-trivial topology. The project is interdisciplinary and involves knot theory (a branch of mathematics), machine learning, computer simulations, structural biology, and in vivo and in silico studies.


Project duration: 01.01.2023 - 31.12.2025


Project coordinator: Faculty of Mechanical Engineering, UL


Consortium partners: 

Lead Agency: University of Warsaw, Poland

Centre of New Technologies, University of Warsaw
Stefana Banacha 2c, 02-097 Warszawa, Poland
https://cent.uw.edu.pl/en/
Leader of the Polish part of the project: doc. dr. Joanna I. Sulkowska
Funder: NCN - National Science Centre (Poland)


Participating organisations:

Masaryk University, Czech Republic
Mezirka 8, 60200 Brno, Czech Republic
https://www.muni.cz/en
Leader of the Czech part of the project: Dr. Petr Simecek
Funder: GAČR - Czech Science Foundation

University of Ljubljana, Slovenia
Faculty of Mechanical Engineering
Aškerčeva cesta 6, 1000 Ljubljana, Slovenia
https://www.fs.uni-lj.si/
Leader of the Slovenian part of the project: doc. dr. Boštjan Gabrovšek
Funder: ARIS - Public Agency for Scientific Research and Innovation of the Republic of Slovenia


Participating organisations within the Slovenian part of the project:

- University of Ljubljana, Faculty of Mechanical Engineering, Slovenia (applicant)

- University of Ljubljana, Faculty of Education, Slovenia (participant)

- Rudolfovo - Science and Technology Centre Novo mesto (participant from 1.10.2024 onwards)


Project leader at Rudolfovo: doc. dr. Boštjan Gabrovšek

Contact: bostjan.gabrovsek@rudolfovo.eu


Project website: https://www.fs.uni-lj.si/project/bioloska-voda-vozlov-identifikacija-vzorcev-vzlanja-v-biomolekulah-z-uporabo-umetne-inteligence/

 

Financial information:

Total project:

Year   Hours   Salary     Contributions   Reimbursements   Costs      Depreciation   Total
2023   1202    37.213 €   5.984 €         3.774 €          20.842 €   9.928 €        77.741 €
2024   1202    37.213 €   5.984 €         3.774 €          20.842 €   9.928 €        77.741 €
2025   1202    37.213 €   5.984 €         3.774 €          20.842 €   9.928 €        77.741 €

 

Rudolfovo:

Year   Hours   Salary     Contributions   Reimbursements   Costs      Depreciation   Total
2024   215     6.656 €    1.070 €         675 €            3.728 €    1.776 €        13.905 €
2025   739     22.879 €   3.679 €         2.320 €          12.814 €   6.104 €        47.796 €

Project team composition with links to SICRIS data:

FS UL (applicant):

- Boštjan Gabrovšek https://cris.cobiss.net/ecris/si/sl/researcher/32245

- Aleš Vavpetič https://cris.cobiss.net/ecris/si/sl/researcher/10946

PEF UL (participating RO):

- Boštjan Gabrovšek https://cris.cobiss.net/ecris/si/sl/researcher/32245

- Eva Horvat https://cris.cobiss.net/ecris/si/sl/researcher/31566

- Dušan Repovš https://cris.cobiss.net/ecris/si/sl/researcher/5995

Rudolfovo (participating RO from 1.10.2024 onwards):

- Boštjan Gabrovšek https://cris.cobiss.net/ecris/si/sl/researcher/32245


Project phases and description of their realisation, by work package (DS):

DS1: Design, specification and implementation of the library

DS1.1: Implementation of the PlanarDiagram class

The library is publicly available in a GitHub repository [1, COBISS.SI-ID 191752451]. The PlanarDiagram class is fully implemented. The data structure used to encode an arbitrary diagram is related to the EM code: the PlanarDiagram class contains a dictionary whose keys are the nodes (vertices or crossings, each an arbitrary hashable object) and whose values are lists of the adjacent nodes in counterclockwise order. Each node, arc (a connection between a pair of nodes) or face of the diagram can be assigned attributes such as colour, weight, etc.
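The dictionary-based encoding described above can be illustrated with a minimal sketch. This is not the KnotPy API: the class TinyDiagram and its methods are hypothetical stand-ins, and the sketch assumes a connected diagram without loops or parallel arcs (real knot diagrams often have parallel arcs, so a full encoding also has to record which endpoint an arc attaches to). It stores counterclockwise neighbour lists for a planar drawing of the tetrahedron graph and traces its faces.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, List


@dataclass
class TinyDiagram:
    """Hypothetical stand-in: node -> counterclockwise list of adjacent nodes."""
    rotation: Dict[Hashable, List[Hashable]] = field(default_factory=dict)
    node_attr: Dict[Hashable, dict] = field(default_factory=dict)

    def add_node(self, node, neighbours, **attr):
        # neighbours must be given in counterclockwise order around the node
        self.rotation[node] = list(neighbours)
        self.node_attr[node] = dict(attr)

    def arcs(self):
        # each arc is an unordered pair of adjacent nodes
        return {frozenset((u, v)) for u, nbrs in self.rotation.items() for v in nbrs}

    def faces(self):
        # Trace faces: after arriving at v from u, leave towards the
        # neighbour that follows u in v's counterclockwise list.
        unused = {(u, v) for u, nbrs in self.rotation.items() for v in nbrs}
        faces = []
        while unused:
            start = unused.pop()
            face = [start[0]]
            u, v = start
            while True:
                nbrs = self.rotation[v]
                w = nbrs[(nbrs.index(u) + 1) % len(nbrs)]
                u, v = v, w
                if (u, v) == start:
                    break
                unused.remove((u, v))
                face.append(u)
            faces.append(face)
        return faces


# Tetrahedron graph: outer triangle a, b, c with d in the middle.
k4 = TinyDiagram()
k4.add_node("a", ["b", "d", "c"], colour="red")
k4.add_node("b", ["c", "d", "a"])
k4.add_node("c", ["a", "d", "b"])
k4.add_node("d", ["a", "b", "c"])

faces = k4.faces()
# Euler's formula for a connected planar diagram: V - E + F = 2
print(len(k4.rotation) - len(k4.arcs()) + len(faces))
```

Because the neighbour lists carry the cyclic order, the faces follow from the dictionary alone, which is what makes it possible to attach attributes to faces as well as to nodes and arcs.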

DS1.2: Structure manipulation

Most structure manipulation tools are implemented [1]:

- Adding, removing and modifying diagram components (arcs, crossings, vertices, etc.).

- Reidemeister moves (types I, II, III, IV and V), which are key to identifying knot types.

- Orienting an unoriented structure (canonical orientation, or all possible orientations for diagrams containing several components).

- A canonical planar diagram format, which is important for constructing tables and classifications and allows fast computation.
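To illustrate why a canonical format helps with tables and classifications, the sketch below shows one common way of canonising a dictionary of counterclockwise neighbour lists: relabel the nodes by breadth-first search from every possible starting pair (node, neighbour) and keep the lexicographically smallest encoding. This is an assumption-laden illustration, not the algorithm used in KnotPy; the function names are hypothetical, and the diagram is assumed to be connected and free of loops and parallel arcs.

```python
from collections import deque


def _encode_from(rotation, start, first):
    """Relabel nodes 0, 1, 2, ... by BFS, reading each list from its entry arc."""
    label = {start: 0}
    entry = {start: first}      # neighbour from which each node's list is read
    queue = deque([start])
    order = []                  # nodes in order of their new labels
    while queue:
        node = queue.popleft()
        order.append(node)
        nbrs = rotation[node]
        k = nbrs.index(entry[node])
        for nb in nbrs[k:] + nbrs[:k]:
            if nb not in label:
                label[nb] = len(label)
                entry[nb] = node
                queue.append(nb)
    code = []
    for node in order:
        nbrs = rotation[node]
        k = nbrs.index(entry[node])
        code.append(tuple(label[nb] for nb in nbrs[k:] + nbrs[:k]))
    return tuple(code)


def canonical_code(rotation):
    """Smallest encoding over all starting pairs (node, neighbour)."""
    return min(
        _encode_from(rotation, start, first)
        for start, nbrs in rotation.items()
        for first in nbrs
    )


k4 = {"a": ["b", "d", "c"], "b": ["c", "d", "a"],
      "c": ["a", "d", "b"], "d": ["a", "b", "c"]}
# Same diagram with renamed nodes and cyclically shifted lists.
k4_relabelled = {1: [4, 3, 2], 2: [1, 3, 4], 3: [1, 4, 2], 4: [2, 3, 1]}
triangle = {0: [1, 2], 1: [2, 0], 2: [0, 1]}

print(canonical_code(k4) == canonical_code(k4_relabelled))
print(canonical_code(k4) == canonical_code(triangle))
```

Two diagrams that differ only in node names or in where each cyclic list starts receive the same code, so the code can serve as a dictionary key when building a table.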

DS1.3: Invariants

The following invariants have been implemented in the library [1]:

- Yamada polynomial (an invariant of spatial graphs) and other related polynomials [6].

- Kauffman bracket (a knot invariant) [7, 8].

- The "unplugging" invariant (an invariant of spatial graphs) [7].

- Bondles invariant (a quandle-type invariant of bonded knots) [9].
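As an illustration of the kind of computation behind the Kauffman bracket, here is a self-contained state-sum sketch. It does not use KnotPy or its diagram format. It assumes a planar diagram (PD) code in which each crossing is a 4-tuple of arc labels listed counterclockwise starting from the incoming under-strand, and all function names are hypothetical. Which of the two smoothings is weighted by A is a convention; swapping them gives the bracket of the mirror image.

```python
from itertools import product


def count_loops(pd_code, state):
    """Number of closed curves after smoothing every crossing as given by state."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    for (a, b, c, d), s in zip(pd_code, state):
        if s == 0:              # smoothing weighted by A (convention assumed here)
            union(a, b)
            union(c, d)
        else:                   # smoothing weighted by A^-1
            union(a, d)
            union(b, c)
    return len({find(x) for x in list(parent)})


def poly_mul(p, q):
    """Multiply Laurent polynomials stored as {exponent: coefficient}."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return out


def kauffman_bracket(pd_code):
    """Sum over all states of A^(#A - #B) * (-A^2 - A^-2)^(loops - 1)."""
    delta = {2: -1, -2: -1}
    n = len(pd_code)
    total = {}
    for state in product((0, 1), repeat=n):
        term = {n - 2 * sum(state): 1}
        for _ in range(count_loops(pd_code, state) - 1):
            term = poly_mul(term, delta)
        for e, c in term.items():
            total[e] = total.get(e, 0) + c
    return {e: c for e, c in total.items() if c != 0}


# PD code of a trefoil diagram with three crossings.
trefoil = [(1, 4, 2, 5), (3, 6, 4, 1), (5, 2, 6, 3)]
bracket = kauffman_bracket(trefoil)
print(bracket)  # encodes A^7 - A^3 - A^(-5)
```

The sum runs over all 2^n states, so this direct approach is only practical for small diagrams.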

DS1.4: Visualisation

In the KnotPy library [1], we have implemented plotting of the structures described in DS1.1. Plotting is done via the layout and plot_layout functions, which take an instance of a planar diagram as input and return an image of the diagram (knot, spatial graph, bonded knot, etc.). The image can also be saved in raster (PNG) or vector (PDF) format.

The drawing method relies on Andreev's theorem on circle packing, which states that any planar graph can be realised, almost uniquely, as a graph of tangent circles drawn inside a unit disc. This means that each node can be assigned a disc inside the unit disc so that the interiors of these discs are pairwise disjoint, and two discs are tangent exactly when the corresponding nodes are connected by an edge. Knowing the exact coordinates of all points of tangency allows us to draw each arc of the diagram as a few smooth curve segments [10] that join smoothly.
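The geometric step can be sketched in a few lines. The sketch below is only an illustration under stated assumptions, not the KnotPy drawing routine: it computes the point where two externally tangent discs touch, and then routes a curve through one disc from one tangency point to another using a quadratic Bézier segment with the disc centre as the middle control point. Both function names and the choice of control point are hypothetical.

```python
import math


def tangency_point(c1, r1, c2, r2):
    """Point where two externally tangent discs touch."""
    (x1, y1), (x2, y2) = c1, c2
    dist = math.hypot(x2 - x1, y2 - y1)
    if not math.isclose(dist, r1 + r2, rel_tol=1e-9):
        raise ValueError("discs are not externally tangent")
    t = r1 / dist               # fraction of the way from c1 to c2
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))


def quadratic_bezier(p0, p1, p2, t):
    """Point at parameter t on the quadratic Bezier curve with control points p0, p1, p2."""
    s = 1.0 - t
    return tuple(s * s * a + 2 * s * t * b + t * t * c for a, b, c in zip(p0, p1, p2))


# A disc of radius 1 at the origin, tangent to two neighbouring discs.
centre, radius = (0.0, 0.0), 1.0
p_in = tangency_point(centre, radius, (3.0, 0.0), 2.0)    # touches at (1, 0)
p_out = tangency_point(centre, radius, (0.0, 2.0), 1.0)   # touches at (0, 1)

# Curve inside the disc joining the two tangency points.
midpoint = quadratic_bezier(p_in, centre, p_out, 0.5)
print(p_in, p_out, midpoint)
```

With this choice of control point the segment leaves each tangency point along the line through the two disc centres, so segments in neighbouring discs meet without a corner.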

DS4: Documentation and website

We use the Sphinx tool to build the library documentation, which is regenerated in full on every push to the repository. The documentation, which is still under construction, is available at [11]. Sphinx allows developers to write documentation in plain text using reStructuredText markup and then convert it to output formats such as HTML, PDF and ePub. It offers cross-referencing, automatic indexing, customisation options, integration with documentation services such as Read the Docs, support for internationalisation, and scalability for large projects.
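For orientation, a minimal Sphinx configuration file (conf.py) might look like the sketch below. The values are illustrative assumptions and are not taken from the KnotPy repository; only the option names and the listed extensions are standard Sphinx features.

```python
# conf.py: hypothetical minimal Sphinx configuration (values are assumptions)
project = "KnotPy"

extensions = [
    "sphinx.ext.autodoc",    # pull documentation from docstrings
    "sphinx.ext.napoleon",   # understand Google/NumPy style docstrings
    "sphinx.ext.viewcode",   # link documented objects to highlighted source
]

html_theme = "alabaster"     # Sphinx's default HTML theme
```

With such a file in place, running sphinx-build in HTML mode on the documentation source directory produces the website.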


Bibliographic references arising directly from the implementation of the project:

[1] Gabrovšek, B. (2024). KnotPy. B. Gabrovšek. https://github.com/bgabrovsek/knotpy, COBISS.SI-ID 191752451.

[2] Horvat, E. (2023). Nonsmooth manifold decompositions. Journal of geometry and physics, 194, COBISS.SI-ID 140475139.

[3] P. Cavicchioli, R. Cavicchioli, B. Gabrovšek, J. C. Hu, Optical recognition of protein structures using artificial intelligence, in preparation.

[4] P. Cavicchioli, B. Gren & Ž. Virk, Topological data analysis of protein structures: decoding lassos and other motifs, in preparation.

[5] E. Horvat, B. Gabrovšek, On Bondles, article in preparation.

[6] Brezovnik, S., & Tratnik, N. (2023). Generalized cut method for computing Szeged-like polynomials with applications to polyphenyls and carbon nanocones. Match, 90(2), 401–427, COBISS.SI-ID 150208771.

[7] Gabrovšek, B., & Gügümcü, N. (2023). Invariants of Multi-linkoids. Mediterranean journal of mathematics, 20(3), COBISS.SI-ID 145856259.

[8] B. Gabrovšek, M. Simonič, The bracket polynomial of bonded knots and applications to entangled proteins, preprint, https://arxiv.org/abs/2502.18999.

[9] E. Horvat, B. Gabrovšek, On Bondles, article in preparation.

[10] Vavpetič, A., & Žagar, E. (2023). Optimal approximation of spherical squares by tensor product quadratic Bézier patches. Applied mathematics and computation, 457(128196), 12, COBISS.SI-ID 161773059.

[11] B. Gabrovšek, et al., Documentation of the KnotPy software package, https://bgabrovsek.github.io/knotpy/ (2024).
