Source
MaximumContactMapOverlap
Source model issue: #1043
Target
ILP/bool
Motivation
This rule gives MaximumContactMapOverlap immediate solver connectivity through the existing ILP hub.
The formulation follows the standard contact-map-overlap idea: choose an order-preserving residue alignment and maximize the number of contact pairs that are superimposed by the alignment.
Reference
- Rumen Andonov, Noel Malod-Dognin, and Nicola Yanev, "Maximum Contact Map Overlap Revisited," Journal of Computational Biology 18(1):27-41, 2011. https://doi.org/10.1089/cmb.2009.0196
- Wei Xie and Nikolaos V. Sahinidis, "A reduction-based exact algorithm for the contact map overlap problem," Journal of Computational Biology 14(5):637-654, 2007. https://doi.org/10.1089/cmb.2007.R007
Reduction Algorithm
Let the source instance contain ordered contact maps
G_1 = (V_1,E_1) and G_2 = (V_2,E_2), with
V_1 = {0,...,n_1-1} and V_2 = {0,...,n_2-1}.
Construct a binary ILP as follows.
-
For each possible residue match (i,j) in V_1 x V_2, create a binary variable x_{i,j}.
It means vertex i of G_1 is aligned to vertex j of G_2.
-
For each pair of contacts {i,k} in E_1 and {j,l} in E_2 with i < k and j < l, create a binary variable y_{i,k,j,l}.
It means contact {i,k} is preserved by aligning i to j and k to l.
-
Add row constraints for every i in V_1:
sum_j x_{i,j} <= 1.
Each vertex of G_1 is matched at most once.
-
Add column constraints for every j in V_2:
sum_i x_{i,j} <= 1.
Each vertex of G_2 is matched at most once.
-
Add order-preservation constraints. For every i < k in V_1 and every j >= l in V_2, add:
x_{i,j} + x_{k,l} <= 1.
This forbids crossings and equal-image matches, so all matched images are strictly increasing.
-
Link each preserved-contact variable to its endpoint-match variables:
y_{i,k,j,l} <= x_{i,j}
y_{i,k,j,l} <= x_{k,l}.
Because every y has positive objective coefficient and the objective is maximization, an optimal solution sets y_{i,k,j,l} = 1 exactly when the two endpoint matches are selected.
-
Maximize:
sum y_{i,k,j,l}
over all contact pairs from E_1 x E_2 that respect endpoint order.
-
Extract the CMO alignment by mapping i to the unique j with x_{i,j} = 1, and leaving i unmatched if all x_{i,j} are zero.
Correctness Argument
The row and column constraints make the selected x variables a partial injective matching between the two ordered vertex sets. The crossing constraints ensure that if i < k in the first contact map and both vertices are matched, then their images occur in the same order in the second contact map. Therefore every feasible ILP assignment projects to a feasible order-preserving partial alignment.
Conversely, any feasible CMO alignment defines a feasible ILP assignment by setting x_{i,f(i)} = 1 for matched vertices and all other x variables to zero. The alignment is injective and order-preserving, so all row, column, and crossing constraints are satisfied.
For every contact {i,k} in E_1 and contact {j,l} in E_2, the corresponding y variable can be one only when both endpoint matches are selected. Since all y variables have positive objective coefficient and no negative side effect, an optimal ILP sets exactly the preserved-contact y variables to one.
Thus the ILP objective equals the number of contacts preserved by the alignment, and maximizing the ILP objective is exactly maximizing contact-map overlap.
Size Overhead
Let:
n_1 = num_vertices_1,
n_2 = num_vertices_2,
m_1 = num_contacts_1,
m_2 = num_contacts_2.
A direct formulation has:
| Target metric |
Formula |
num_vars |
n_1 * n_2 + m_1 * m_2 |
num_constraints |
n_1 + n_2 + n_1^2 * n_2^2 + 2 * m_1 * m_2 |
The crossing-constraint count is a conservative quadratic-pair bound. Implementations may generate only the invalid ordered pairs i < k and j >= l, but the asymptotic overhead remains O(n_1^2 n_2^2 + m_1 m_2).
Validation Method
Use closed-loop validation on small contact maps:
- enumerate all order-preserving partial alignments directly,
- compute the maximum contact overlap by brute force,
- solve the ILP instance,
- extract the alignment from the
x variables,
- verify that the extracted alignment is order-preserving and preserves exactly the ILP objective number of contacts.
Include cases where:
- the best alignment leaves some residues unmatched,
- a crossing alignment would preserve more contacts but must be rejected,
- two alignments tie for optimum,
- no contact is preserved,
- all contacts of the smaller map can be preserved.
Example
Use the source instance from #1043.
First contact map:
E_1 = {{0,2}, {1,3}}.
Second contact map:
E_2 = {{0,3}, {1,4}, {0,2}}.
The optimal alignment is:
Vertex in G_1 |
Image in G_2 |
0 |
0 |
1 |
1 |
2 |
3 |
3 |
4 |
Set the corresponding match variables to one:
x_{0,0} = 1,
x_{1,1} = 1,
x_{2,3} = 1,
x_{3,4} = 1.
Then the preserved-contact variables include:
y_{0,2,0,3} = 1, since {0,2} maps to {0,3},
y_{1,3,1,4} = 1, since {1,3} maps to {1,4}.
The ILP objective value is therefore 2. Since G_1 has only two contacts, this is optimal and matches the source CMO objective.
BibTeX
@article{AndonovMalodDogninYanev2011CMO,
author = {Rumen Andonov and Noel Malod-Dognin and Nicola Yanev},
title = {Maximum Contact Map Overlap Revisited},
journal = {Journal of Computational Biology},
volume = {18},
number = {1},
pages = {27--41},
year = {2011},
doi = {10.1089/cmb.2009.0196},
url = {https://doi.org/10.1089/cmb.2009.0196}
}
@article{XieSahinidis2007CMO,
author = {Wei Xie and Nikolaos V. Sahinidis},
title = {A Reduction-Based Exact Algorithm for the Contact Map Overlap Problem},
journal = {Journal of Computational Biology},
volume = {14},
number = {5},
pages = {637--654},
year = {2007},
doi = {10.1089/cmb.2007.R007},
url = {https://doi.org/10.1089/cmb.2007.R007}
}
Source
MaximumContactMapOverlap
Source model issue: #1043
Target
ILP/bool
Motivation
This rule gives
MaximumContactMapOverlapimmediate solver connectivity through the existing ILP hub.The formulation follows the standard contact-map-overlap idea: choose an order-preserving residue alignment and maximize the number of contact pairs that are superimposed by the alignment.
Reference
Reduction Algorithm
Let the source instance contain ordered contact maps
G_1 = (V_1,E_1)andG_2 = (V_2,E_2), withV_1 = {0,...,n_1-1}andV_2 = {0,...,n_2-1}.Construct a binary ILP as follows.
For each possible residue match
(i,j) in V_1 x V_2, create a binary variablex_{i,j}.It means vertex
iofG_1is aligned to vertexjofG_2.For each pair of contacts
{i,k} in E_1and{j,l} in E_2withi < kandj < l, create a binary variabley_{i,k,j,l}.It means contact
{i,k}is preserved by aligningitojandktol.Add row constraints for every
i in V_1:sum_j x_{i,j} <= 1.Each vertex of
G_1is matched at most once.Add column constraints for every
j in V_2:sum_i x_{i,j} <= 1.Each vertex of
G_2is matched at most once.Add order-preservation constraints. For every
i < kinV_1and everyj >= linV_2, add:x_{i,j} + x_{k,l} <= 1.This forbids crossings and equal-image matches, so all matched images are strictly increasing.
Link each preserved-contact variable to its endpoint-match variables:
y_{i,k,j,l} <= x_{i,j}y_{i,k,j,l} <= x_{k,l}.Because every
yhas positive objective coefficient and the objective is maximization, an optimal solution setsy_{i,k,j,l} = 1exactly when the two endpoint matches are selected.Maximize:
sum y_{i,k,j,l}over all contact pairs from
E_1 x E_2that respect endpoint order.Extract the CMO alignment by mapping
ito the uniquejwithx_{i,j} = 1, and leavingiunmatched if allx_{i,j}are zero.Correctness Argument
The row and column constraints make the selected
xvariables a partial injective matching between the two ordered vertex sets. The crossing constraints ensure that ifi < kin the first contact map and both vertices are matched, then their images occur in the same order in the second contact map. Therefore every feasible ILP assignment projects to a feasible order-preserving partial alignment.Conversely, any feasible CMO alignment defines a feasible ILP assignment by setting
x_{i,f(i)} = 1for matched vertices and all otherxvariables to zero. The alignment is injective and order-preserving, so all row, column, and crossing constraints are satisfied.For every contact
{i,k} in E_1and contact{j,l} in E_2, the correspondingyvariable can be one only when both endpoint matches are selected. Since allyvariables have positive objective coefficient and no negative side effect, an optimal ILP sets exactly the preserved-contactyvariables to one.Thus the ILP objective equals the number of contacts preserved by the alignment, and maximizing the ILP objective is exactly maximizing contact-map overlap.
Size Overhead
Let:
n_1 = num_vertices_1,n_2 = num_vertices_2,m_1 = num_contacts_1,m_2 = num_contacts_2.A direct formulation has:
num_varsn_1 * n_2 + m_1 * m_2num_constraintsn_1 + n_2 + n_1^2 * n_2^2 + 2 * m_1 * m_2The crossing-constraint count is a conservative quadratic-pair bound. Implementations may generate only the invalid ordered pairs
i < kandj >= l, but the asymptotic overhead remainsO(n_1^2 n_2^2 + m_1 m_2).Validation Method
Use closed-loop validation on small contact maps:
xvariables,Include cases where:
Example
Use the source instance from #1043.
First contact map:
E_1 = {{0,2}, {1,3}}.Second contact map:
E_2 = {{0,3}, {1,4}, {0,2}}.The optimal alignment is:
G_1G_200112334Set the corresponding match variables to one:
x_{0,0} = 1,x_{1,1} = 1,x_{2,3} = 1,x_{3,4} = 1.Then the preserved-contact variables include:
y_{0,2,0,3} = 1, since{0,2}maps to{0,3},y_{1,3,1,4} = 1, since{1,3}maps to{1,4}.The ILP objective value is therefore
2. SinceG_1has only two contacts, this is optimal and matches the source CMO objective.BibTeX