Lasso detection

Identification of lasso topology relies on the detection of piercings through a minimal surface spanned on a covalent loop. For each loop, the detection of piercings consists of four steps, which are described in detail in [1-2] and in the extensive supplement to [2]:



Detection of the appropriate closed loop

The covalent loops considered are built on the backbone of the chain and are closed by a bridge. All database records have loops closed by disulfide bridges, with S-S bonds no longer than 3.0 Å. This is a commonly used threshold, applied e.g. by the PDB for cysteine bridge detection. Since the probability of threading is negligible for small loops, we require the loop to be at least 5 residues long. Proteins with smaller loops were moved into the artifacts category. The server allows a deeper analysis of non-trivial loops in protein structures. The user can analyze various types of closed loops (not only those closed by S-S bonds) with independently chosen distance thresholds. These functions allow 1) analyzing so-called non-covalent lassos [3], and 2) analyzing the topology after introducing mutations [4]. The server offers three ways to define the placement of bridges:

  • By entering the indices of the residues between which a bridge must be formed.
  • By specifying the amino acids forming a bridge (e.g. LYS-GLU), together with the maximal and minimal CA-CA distance.
  • Default: the same as in the database (CYS-CYS bonds), but both the maximal and minimal S-S distance can be modified.
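The default selection rule (S-S distance window plus a minimum loop length) can be sketched in a few lines of Python. This is only an illustration, not the server's code: the function name `find_bridged_loops`, the input format (a dictionary of cysteine SG coordinates keyed by residue index) and the convention that the loop length counts both bridge residues are all assumptions made for this example.

```python
from itertools import combinations
from math import dist


def find_bridged_loops(sg_coords, max_dist=3.0, min_dist=0.0, min_loop_len=5):
    """Split candidate S-S bridges into accepted loops and too-small 'artifacts'.

    sg_coords: {residue_index: (x, y, z)} for cysteine SG atoms, in Å
               (hypothetical input format chosen for this sketch).
    """
    loops, artifacts = [], []
    # Sorting by residue index guarantees i < j for every pair.
    for (i, a), (j, b) in combinations(sorted(sg_coords.items()), 2):
        if not (min_dist <= dist(a, b) <= max_dist):
            continue  # atoms too far apart (or too close) to form a bridge
        # Assumption: the loop length counts both bridge residues.
        length = j - i + 1
        (loops if length >= min_loop_len else artifacts).append((i, j))
    return loops, artifacts
```

Raising `max_dist` or changing `min_dist` mimics the adjustable distance window of the default mode; the other two modes would differ only in which atom pairs are offered as candidates.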

Polypeptide with a very small loop

Figure 1. Cartoon representation of the A0A2I3TWL7 polypeptide with a minimal surface spanned on a lasso loop composed of only 4 amino acids.


Spanning a surface on the loop

After selecting the appropriate covalent loop, the next step in lasso detection is spanning a surface on the closed loop. There is, however, an infinite number of surfaces that can be spanned on a given loop. This could hinder our analysis, as the piercing pattern would depend on the choice of surface. Therefore, to make our results stable, we model the minimal surface on a given closed loop. Minimal surfaces are the surfaces of soap bubbles spanned on a closed loop. They are (local) energetic minima (minima of the Dirichlet energy functional), and hence they are stable solutions to the problem of finding a surface spanned on a given loop. We should point out, however, that the locality of the energetic minimum can introduce some ambiguity: there could be a surface with a lower area (and therefore a lower surface energy of the soap bubble), while we find only a local minimum. It is therefore possible that the result still depends on the choice of one out of a few minimal surfaces. However, this choice is expected to have only a small impact compared with the larger uncertainty that comes from protein structure prediction.

Spanning a minimal surface on a given covalent loop requires an efficient construction algorithm. In most cases we can only approximate the minimal surface by its triangulation, i.e. create a mesh of triangles whose mean "distance" from the actual minimal surface is small enough (Fig. 2).

Minimal surface spanning

Figure 2. Example simple polygon in three-dimensional space (left panel) and a triangulated minimal surface spanned on it (right panel).

There are several algorithms, used in particular in computer graphics, that determine such triangulations. In our work we implemented a slightly modified version of the algorithm discussed in [1]. The input to this algorithm consists of the coordinates of the n vertices of the covalent loop and the number of triangles in the triangulation to be constructed. This number allows one to adjust the level of detail of the resulting mesh: the larger the number, the more accurately the surface is approximated. Once an initial mesh has been specified, we iteratively adjust it by performing three operations that minimize the (local) area and the Dirichlet energy: Area Minimizing, Laplacian Fairing and Edge Swapping. We stop iterating when the modification of the triangulation in a given step no longer changes the surface area sufficiently. A detailed description of the algorithm is given in the supplement to [2].
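The iterate-until-the-area-stops-changing loop can be illustrated with a heavily simplified sketch. It is not the algorithm of [1]: only a uniform-weight Laplacian fairing step is shown (Area Minimizing and Edge Swapping are omitted), and the initial mesh of concentric rings shrunk towards the loop centroid is an assumption made here for brevity. All function names are hypothetical.

```python
from math import sqrt


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def mesh_area(verts, tris):
    """Total area of a triangle mesh (half the norm of each edge cross product)."""
    total = 0.0
    for i, j, k in tris:
        n = cross(sub(verts[j], verts[i]), sub(verts[k], verts[i]))
        total += 0.5 * sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    return total


def build_ring_mesh(loop, rings=3):
    """Initial mesh: copies of the loop shrunk towards its centroid (assumed scheme)."""
    n = len(loop)
    c = tuple(sum(p[d] for p in loop) / n for d in range(3))
    verts = [tuple(p) for p in loop]              # indices 0..n-1: fixed boundary
    for r in range(1, rings + 1):
        t = 1.0 - r / (rings + 1)
        verts += [tuple(c[d] + t * (p[d] - c[d]) for d in range(3)) for p in loop]
    verts.append(c)                               # last vertex: the centroid
    tris = []
    for r in range(rings):                        # two triangles per quad between rings
        a, b = r * n, (r + 1) * n
        for i in range(n):
            j = (i + 1) % n
            tris += [(a + i, a + j, b + i), (a + j, b + j, b + i)]
    last, center = rings * n, (rings + 1) * n
    tris += [(last + i, last + (i + 1) % n, center) for i in range(n)]
    return verts, tris


def laplacian_fairing(verts, tris, n_boundary, tol=1e-10, max_iter=10000):
    """Move each interior vertex to the mean of its neighbours until the area settles."""
    verts = list(verts)
    neighbours = {v: set() for v in range(len(verts))}
    for tri in tris:
        for u in tri:
            neighbours[u].update(v for v in tri if v != u)
    area = mesh_area(verts, tris)
    for _ in range(max_iter):
        for v in range(n_boundary, len(verts)):   # boundary vertices stay fixed
            nb = neighbours[v]
            verts[v] = tuple(sum(verts[u][d] for u in nb) / len(nb) for d in range(3))
        new_area = mesh_area(verts, tris)
        if abs(area - new_area) < tol:            # stopping rule: area change is negligible
            return verts, new_area
        area = new_area
    return verts, area
```

Uniform-weight fairing lowers a discrete Dirichlet energy rather than the area itself, which is one reason the full method combines it with explicit area minimization and edge swapping.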

Moreover, as the protein chain is oriented from the N- to the C-terminus, our analysis enables us to orient the surface unequivocally (using the method introduced in [5]). The orientation is propagated across the whole triangulated surface, starting from the triangle closest to the bridge. To orient this triangle, we first form two imaginary vectors beginning at the “opening” cysteine. One points towards the “closing” cysteine, the other towards the next atom in the chain (according to its index). The vector product of these two (in this exact order) gives a vector which we arbitrarily call positively directed (Fig. 3).
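The orientation rule amounts to a single cross product. The sketch below, with hypothetical function names, computes the positively directed vector from three atom positions and flips the winding of the seed triangle if needed so that its normal agrees with it; propagating the orientation to the remaining triangles is not shown.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def positive_direction(opening, closing, next_atom):
    """(closing - opening) x (next_atom - opening), in this exact order."""
    return cross(sub(closing, opening), sub(next_atom, opening))


def orient_seed_triangle(tri, positive):
    """Return the triangle with a winding whose normal agrees with `positive`."""
    a, b, c = tri
    normal = cross(sub(b, a), sub(c, a))
    return (a, b, c) if dot(normal, positive) >= 0 else (a, c, b)
```

Swapping the two vectors reverses the sign of the result, which is why the order of the operands matters.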

Orientation of the surface

Figure 3. Schematic depiction of the definition of the surface orientation; N and C correspond to the N- and C-terminus, respectively (based on [5]).


Detection of surface piercing

Once the triangulation of the minimal surface is determined, we can verify which segments of the lasso protein tail (or tails) pierce the surface. This is done by checking whether the vector joining two consecutive Cα atoms of the tail pierces any of the triangles forming the surface. If it does, the index of the atom at the end of the piercing vector is reported along with the direction of crossing, i.e. the negated sign of the scalar product between the "positive direction" vector of the surface and the piercing vector (Fig. 4). The direction of piercing can be determined only if the surface is orientable, which was always the case in our work. In the depiction of the triangulated surface we denote the direction of piercing by drawing pierced triangles in different colors (e.g. in Fig. 5 the blue and green triangles are pierced from opposite directions), and we label the segments of a tail that pierce the surface with plus or minus signs accordingly (e.g. the tail segments denoted -10 and +289 in Fig. 4 pierce the surface from opposite directions).
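A piercing test of this kind can be sketched with a standard segment-triangle intersection routine (the Möller-Trumbore test, used here as a stand-in; the source does not say which intersection test is implemented). The sketch assumes that every triangle is given as three points whose winding already encodes the positive direction, and that the tail is a list of (residue index, Cα coordinates) pairs; both conventions and all names are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def segment_hits_triangle(p, q, tri, eps=1e-12):
    """Moller-Trumbore test restricted to the segment p -> q."""
    a, b, c = tri
    d = sub(q, p)
    e1, e2 = sub(b, a), sub(c, a)
    h = cross(d, e2)
    det = dot(e1, h)
    if abs(det) < eps:
        return False                  # segment is parallel to the triangle plane
    s = sub(p, a)
    u = dot(s, h) / det
    if u < 0.0 or u > 1.0:
        return False
    w = cross(s, e1)
    v = dot(d, w) / det
    if v < 0.0 or u + v > 1.0:
        return False
    t = dot(e2, w) / det              # position of the hit along the segment
    return 0.0 <= t <= 1.0


def find_piercings(tail, triangles):
    """Return labels such as '-11' or '+12' for every pierced triangle."""
    crossings = []
    for (_, p), (j, q) in zip(tail, tail[1:]):
        step = sub(q, p)
        for a, b, c in triangles:
            if segment_hits_triangle(p, q, (a, b, c)):
                normal = cross(sub(b, a), sub(c, a))   # positive direction
                # Reported sign is the negated sign of the scalar product.
                sign = "-" if dot(normal, step) > 0 else "+"
                crossings.append(f"{sign}{j}")         # index of the end atom
    return crossings
```

A segment that passes exactly through an edge shared by two triangles would be reported twice by this sketch; a production implementation would need to deduplicate such hits.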

Note that some proteins have a complicated backbone configuration, giving rise to complicated, self-intersecting surfaces, as discussed in detail in the supplement to [2]. In such cases it is convenient to present the triangulated surface as a planar barycentric graph, in which each interior vertex of the triangulation is placed at the average of the vertices it is connected to. By a theorem of Tutte, such a representation can be uniquely determined purely from the connectivity structure of the triangulated surface. We use a well-known algorithm by Tutte [6] to determine this barycentric representation. However, the original algorithm forces the triangles to be most densely packed in the vicinity of the limiting circle. Therefore, we add a hyperbolic transformation that shifts the vertices of the triangles towards the center, which improves the presentation. As an example, such a planar barycentric graph for the triangulated minimal surface spanned on a covalent loop in the Glutamate receptor with PDB code 3OM0 is shown in Fig. 5.
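The barycentric layout can be sketched as follows: the loop vertices are pinned to a unit circle and every other vertex is repeatedly moved to the average of its neighbours until nothing moves. The even spacing on the circle, the Gauss-Seidel style iteration (instead of a direct linear solve) and the function name are assumptions of this sketch, and the hyperbolic transformation is not included because its exact form is not given here.

```python
from math import cos, pi, sin


def tutte_embedding(triangles, boundary, max_sweeps=10000, tol=1e-12):
    """Planar barycentric layout: {vertex: (x, y)}.

    triangles: iterable of vertex-index triples (only connectivity is used)
    boundary:  loop vertices in order; they are pinned to the unit circle
    """
    neighbours = {}
    for tri in triangles:
        for u in tri:
            neighbours.setdefault(u, set()).update(v for v in tri if v != u)

    pos = {v: (0.0, 0.0) for v in neighbours}
    for k, v in enumerate(boundary):
        angle = 2.0 * pi * k / len(boundary)
        pos[v] = (cos(angle), sin(angle))

    fixed = set(boundary)
    interior = [v for v in neighbours if v not in fixed]
    for _ in range(max_sweeps):
        largest_move = 0.0
        for v in interior:
            nb = neighbours[v]
            x = sum(pos[u][0] for u in nb) / len(nb)
            y = sum(pos[u][1] for u in nb) / len(nb)
            largest_move = max(largest_move, abs(x - pos[v][0]), abs(y - pos[v][1]))
            pos[v] = (x, y)                      # updated in place
        if largest_move < tol:
            break
    return pos
```

Because only the connectivity enters the computation, a self-intersecting surface in three dimensions still yields a flat drawing in which pierced triangles can be marked.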

Figure 4. Schematic representation of the method used to detect which segment of the protein tail pierces the minimal surface. Here the pierced surface is represented by just one triangle.

Figure 5. Glutamate receptor with PDB code 3OM0 in its cartoon representation, along with the minimal surface determined for it and a barycentric graph. In the barycentric graph, the two pierced triangles (one blue and one green) indicate the L2 topology of the loop.


Reduction of artificial piercings

In our analysis we try not to include proteins whose lasso structure could be changed by thermal fluctuations of the chain. Therefore, we impose the condition that consecutive piercings (from opposite directions) must be separated by at least 10 amino acids, i.e. a piece of a tail piercing a surface must be sufficiently “deep”. There is one exception to this rule. As shown in Fig. 6, one may find a complex protein structure in which the minimal surface spanned on a covalent loop has two distinct pieces located close to each other. In such a case a tail may pierce both pieces of the surface with fewer than 10 amino acids between the two crossings, but we nonetheless include such structures in our analysis. To detect such configurations automatically, we compute (using Dijkstra's algorithm) the shortest distance, along segments of the triangulation of the minimal surface, between the two triangles pierced by the tail. If this distance is long enough (larger than 10 segments of the mesh), we include the structure in our classification.
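The exception can be sketched as a shortest-path query on the mesh edges. The version below assumes that the distance is counted in mesh segments (unit edge weights) and measured between the closest vertices of the two pierced triangles; the function names and the way the residue gap is passed in are hypothetical.

```python
import heapq


def mesh_graph(triangles):
    """Adjacency sets built from the edges of all triangles."""
    graph = {}
    for tri in triangles:
        for u in tri:
            graph.setdefault(u, set()).update(v for v in tri if v != u)
    return graph


def mesh_distance(triangles, tri_a, tri_b, weight=lambda u, v: 1):
    """Dijkstra's algorithm from the vertices of triangle tri_a to those of tri_b.

    tri_a, tri_b: positions of the two pierced triangles in `triangles`.
    With the default weight the result is a number of mesh segments.
    """
    graph = mesh_graph(triangles)
    targets = set(triangles[tri_b])
    best = {v: 0 for v in triangles[tri_a]}
    heap = [(0, v) for v in triangles[tri_a]]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, float("inf")):
            continue                      # stale queue entry
        if u in targets:
            return d
        for v in graph[u]:
            nd = d + weight(u, v)
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")                   # triangles are not connected


def keep_consecutive_crossings(residue_gap, segments_apart,
                               min_gap=10, min_segments=10):
    """Keep the pair if it is deep enough, or if it hits distant surface pieces."""
    return residue_gap >= min_gap or segments_apart > min_segments
```

Passing a weight function that returns Euclidean edge lengths would turn the same routine into a geodesic-like distance, should a length in Å be preferred over a segment count.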

Figure 6. Example of a unique configuration in which a segment of fewer than 10 residues between consecutive crossings is accepted by our method, based on the protein with PDB code 4P1E (cartoon representation of the structure in the left panel, the spanned surface in the right panel). The figures show a tail segment shorter than 10 amino acids piercing two spatially separated parts of the self-intersecting surface.

We also require the segment between the cysteine bridge and the first piercing to include at least 3 amino acids (see Fig. 7), as crossings located in the close vicinity of the loop can be an effect of random movement of the protein chain. Furthermore, we require the crossing to be located at least 3 residues from the closest terminus so that, once again, the crossing is sufficiently “deep”.
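These two proximity rules can be written as a simple predicate. The counting conventions are assumptions of this sketch: the bridge rule counts the residues lying strictly between the bridge residue and the crossing, and the terminus rule counts the index difference to the nearest chain end. The function name and parameters are hypothetical.

```python
def is_artificial(crossing, loop_start, loop_end, chain_start, chain_end,
                  min_from_bridge=3, min_from_terminus=3):
    """True if a tail crossing is too close to the bridge or to a terminus.

    All arguments are residue indices; the loop spans loop_start..loop_end.
    """
    if crossing > loop_end:                       # crossing on the C-terminal tail
        between = crossing - loop_end - 1
        to_terminus = chain_end - crossing
    elif crossing < loop_start:                   # crossing on the N-terminal tail
        between = loop_start - crossing - 1
        to_terminus = crossing - chain_start
    else:
        raise ValueError("crossing index lies inside the loop")
    return between < min_from_bridge or to_terminus < min_from_terminus
```

Exposing the two thresholds as parameters mirrors the idea that users may define their own conditions for what counts as an artificial piercing.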

Piercings that are considered artificial are reduced: they are not shown in the default crossing list and do not determine the covalent loop class. However, to keep the process transparent, the reduced crossings can be viewed for all closed loops (see the Single protein chain data presentation section for more about protein presentation). Moreover, via the Advanced options, users can define their own conditions for distinguishing between artificial and correct piercings.

Figure 7. Example of an artificial piercing: the piercing (indicated by an arrow) is located too close to the bridge, based on [1] (cartoon representation in the left panel, triangulated surface in the right panel).

[1] Chen W, Cai Y, Zheng J (2008) Constructing triangular meshes of minimal area. Computer Aided Design and Applications 5:508-518.
[2] Niemyska W, Dabrowski-Tumanski P, Kadlof M, Haglund E, Sułkowski P, Sulkowska JI (2016) Complex lasso: new entangled motifs in proteins.
[3] Rana V, Sitarik I, Petucci J, Jiang Y, Song H, O'Brien EP (2024) Non-covalent Lasso Entanglements in Folded Proteins: Prevalence, Functional Implications, and Evolutionary Significance. Journal of Molecular Biology 436(6).
[4] Jiang Y, Neti SS, Sitarik I, Pradhan P, To P, Xia Y, Fried SD, Booker SJ, O'Brien EP (2022) How synonymous mutations alter enzyme structure and function over long timescales. Nature Chemistry 15(3):308-318.
[5] Dabrowski-Tumanski P, Sulkowska JI. Unique properties of lasso proteins. Under review.
[6] Tutte WT (1963) How to draw a graph. Proc. London Math. Society 13(52):743-768.