Motivation
MaximumContactMapOverlap is a standard protein-structure comparison problem that is not currently in the catalog.
It is useful because it models contact-map alignment directly: given two ordered residue contact graphs, find an order-preserving alignment that maximizes the number of superimposed contacts. This is a core combinatorial formulation behind flexible protein-structure comparison, motif comparison, and contact-map based clustering.
Associated rule:
Definition
Name: MaximumContactMapOverlap
Aliases: CMO, MaxCMO
Reference:
- Rumen Andonov, Noel Malod-Dognin, and Nicola Yanev, "Maximum Contact Map Overlap Revisited," Journal of Computational Biology 18(1):27-41, 2011. https://doi.org/10.1089/cmb.2009.0196
- Wei Xie and Nikolaos V. Sahinidis, "A reduction-based exact algorithm for the contact map overlap problem," Journal of Computational Biology 14(5):637-654, 2007. https://doi.org/10.1089/cmb.2007.R007
Let
G_1 = (V_1, E_1) and G_2 = (V_2, E_2)
be two finite ordered contact maps. Each V_r is ordered as residue positions
0, 1, ..., n_r - 1, and each E_r is a simple undirected contact set on those vertices.
A solution is an order-preserving partial injective alignment
f : V_1 -> V_2 union {unmatched}.
The alignment is feasible iff:
- each vertex of
V_1 is either unmatched or mapped to one vertex of V_2,
- no two vertices of
V_1 are mapped to the same vertex of V_2,
- if
i < k in V_1 and both are matched, then f(i) < f(k) in V_2.
The objective is to maximize the number of preserved contacts:
|{ {i,k} in E_1 : i and k are both matched and {f(i), f(k)} in E_2 }|.
So this is a maximization problem.
Variables
Let n_1 = |V_1| and n_2 = |V_2|.
- Count:
n_1 variables
- Per-variable domain:
{0, 1, ..., n_2}
- Meaning: value
0 means unmatched; value j + 1 means the vertex is matched to vertex j in V_2
A configuration is feasible iff its nonzero matched values are strictly increasing along the order of V_1. The objective value is the number of contacts of E_1 whose endpoints are mapped to endpoints of a contact in E_2.
Schema (data type)
Type name: MaximumContactMapOverlap
| Field |
Mathematical type |
Description |
num_vertices_1 |
integer |
Number of ordered residues/vertices in first contact map |
contacts_1 |
set of unordered vertex pairs |
Scored contacts in first contact map |
num_vertices_2 |
integer |
Number of ordered residues/vertices in second contact map |
contacts_2 |
set of unordered vertex pairs |
Scored contacts in second contact map |
Suggested size fields:
num_vertices_1
num_vertices_2
num_contacts_1
num_contacts_2
Input validation should require:
- all contact endpoints are valid vertex indices,
- no self-loops,
- no parallel duplicate contacts after normalizing unordered pairs,
- vertices are ordered implicitly by their indices.
Complexity
The direct brute-force baseline enumerates every map from V_1 to V_2 union {unmatched} and rejects those that are not order-preserving and injective.
This gives the conservative worst-case expression:
(num_vertices_2 + 1) ^ num_vertices_1.
This is a baseline enumeration bound, not a best-known exact algorithm bound. The CMO literature contains substantially stronger exact algorithms and integer-programming approaches.
Extra Remark
For biological contact maps, consecutive-residue contacts are often excluded before scoring because they are trivial backbone contacts. The clean implementation convention is that the input edge sets E_1 and E_2 already contain exactly the contacts intended to be scored. The objective should not contain an additional special case for adjacent residues.
The mathematical model is symmetric in the two contact maps, but the internal configuration is intentionally asymmetric:
graph 1 -> graph 2 or unmatched.
This keeps evaluation simple while representing every order-preserving alignment.
Reduction Rule Crossref
How to solve
Example Instance
Let the first contact map have vertices 0,1,2,3 and contacts:
E_1 = {{0,2}, {1,3}}.
Let the second contact map have vertices 0,1,2,3,4 and contacts:
E_2 = {{0,3}, {1,4}, {0,2}}.
Expected Outcome
One optimal alignment is:
Vertex in G_1 |
Image in G_2 |
0 |
0 |
1 |
1 |
2 |
3 |
3 |
4 |
This alignment is order-preserving: 0 < 1 < 3 < 4.
It preserves both contacts of G_1:
{0,2} maps to {0,3}, which is in E_2,
{1,3} maps to {1,4}, which is in E_2.
So the objective value is 2. Since G_1 has only two contacts, no alignment can preserve more than two contacts, so this is optimal.
BibTeX
@article{AndonovMalodDogninYanev2011CMO,
author = {Rumen Andonov and Noel Malod-Dognin and Nicola Yanev},
title = {Maximum Contact Map Overlap Revisited},
journal = {Journal of Computational Biology},
volume = {18},
number = {1},
pages = {27--41},
year = {2011},
doi = {10.1089/cmb.2009.0196},
url = {https://doi.org/10.1089/cmb.2009.0196}
}
@article{XieSahinidis2007CMO,
author = {Wei Xie and Nikolaos V. Sahinidis},
title = {A Reduction-Based Exact Algorithm for the Contact Map Overlap Problem},
journal = {Journal of Computational Biology},
volume = {14},
number = {5},
pages = {637--654},
year = {2007},
doi = {10.1089/cmb.2007.R007},
url = {https://doi.org/10.1089/cmb.2007.R007}
}
Motivation
MaximumContactMapOverlapis a standard protein-structure comparison problem that is not currently in the catalog.It is useful because it models contact-map alignment directly: given two ordered residue contact graphs, find an order-preserving alignment that maximizes the number of superimposed contacts. This is a core combinatorial formulation behind flexible protein-structure comparison, motif comparison, and contact-map based clustering.
Associated rule:
Definition
Name:
MaximumContactMapOverlapAliases:
CMO,MaxCMOReference:
Let
G_1 = (V_1, E_1)andG_2 = (V_2, E_2)be two finite ordered contact maps. Each
V_ris ordered as residue positions0, 1, ..., n_r - 1, and eachE_ris a simple undirected contact set on those vertices.A solution is an order-preserving partial injective alignment
f : V_1 -> V_2 union {unmatched}.The alignment is feasible iff:
V_1is either unmatched or mapped to one vertex ofV_2,V_1are mapped to the same vertex ofV_2,i < kinV_1and both are matched, thenf(i) < f(k)inV_2.The objective is to maximize the number of preserved contacts:
|{ {i,k} in E_1 : i and k are both matched and {f(i), f(k)} in E_2 }|.So this is a maximization problem.
Variables
Let
n_1 = |V_1|andn_2 = |V_2|.n_1variables{0, 1, ..., n_2}0means unmatched; valuej + 1means the vertex is matched to vertexjinV_2A configuration is feasible iff its nonzero matched values are strictly increasing along the order of
V_1. The objective value is the number of contacts ofE_1whose endpoints are mapped to endpoints of a contact inE_2.Schema (data type)
Type name:
MaximumContactMapOverlapnum_vertices_1contacts_1num_vertices_2contacts_2Suggested size fields:
num_vertices_1num_vertices_2num_contacts_1num_contacts_2Input validation should require:
Complexity
The direct brute-force baseline enumerates every map from
V_1toV_2 union {unmatched}and rejects those that are not order-preserving and injective.This gives the conservative worst-case expression:
(num_vertices_2 + 1) ^ num_vertices_1.This is a baseline enumeration bound, not a best-known exact algorithm bound. The CMO literature contains substantially stronger exact algorithms and integer-programming approaches.
Extra Remark
For biological contact maps, consecutive-residue contacts are often excluded before scoring because they are trivial backbone contacts. The clean implementation convention is that the input edge sets
E_1andE_2already contain exactly the contacts intended to be scored. The objective should not contain an additional special case for adjacent residues.The mathematical model is symmetric in the two contact maps, but the internal configuration is intentionally asymmetric:
graph 1 -> graph 2 or unmatched.This keeps evaluation simple while representing every order-preserving alignment.
Reduction Rule Crossref
How to solve
ILP/boolvia [Rule] MaximumContactMapOverlap to ILP #1044.Example Instance
Let the first contact map have vertices
0,1,2,3and contacts:E_1 = {{0,2}, {1,3}}.Let the second contact map have vertices
0,1,2,3,4and contacts:E_2 = {{0,3}, {1,4}, {0,2}}.Expected Outcome
One optimal alignment is:
G_1G_200112334This alignment is order-preserving:
0 < 1 < 3 < 4.It preserves both contacts of
G_1:{0,2}maps to{0,3}, which is inE_2,{1,3}maps to{1,4}, which is inE_2.So the objective value is
2. SinceG_1has only two contacts, no alignment can preserve more than two contacts, so this is optimal.BibTeX