[Model] MaximumContactMapOverlap

## Motivation

`MaximumContactMapOverlap` is a standard protein-structure comparison problem that is not currently in the catalog.

It is useful because it models contact-map alignment directly: given two ordered residue contact graphs, find an order-preserving alignment that maximizes the number of superimposed contacts. This is a core combinatorial formulation behind flexible protein-structure comparison, motif comparison, and contact-map based clustering.

**Associated rule:**
- [ ] #1044 

## Definition

**Name:** `MaximumContactMapOverlap`  
**Aliases:** `CMO`, `MaxCMO`  
**Reference:**
- Rumen Andonov, Noel Malod-Dognin, and Nicola Yanev, "Maximum Contact Map Overlap Revisited," *Journal of Computational Biology* 18(1):27-41, 2011. https://doi.org/10.1089/cmb.2009.0196
- Wei Xie and Nikolaos V. Sahinidis, "A reduction-based exact algorithm for the contact map overlap problem," *Journal of Computational Biology* 14(5):637-654, 2007. https://doi.org/10.1089/cmb.2007.R007

Let
`G_1 = (V_1, E_1)` and `G_2 = (V_2, E_2)`
be two finite ordered contact maps. Each `V_r` is ordered as residue positions
`0, 1, ..., n_r - 1`, and each `E_r` is a simple undirected contact set on those vertices.

A solution is an order-preserving partial injective alignment
`f : V_1 -> V_2 union {unmatched}`.

The alignment is feasible iff:
- each vertex of `V_1` is either unmatched or mapped to one vertex of `V_2`,
- no two vertices of `V_1` are mapped to the same vertex of `V_2`,
- if `i < k` in `V_1` and both are matched, then `f(i) < f(k)` in `V_2`.

The objective is to maximize the number of preserved contacts:

`|{ {i,k} in E_1 : i and k are both matched and {f(i), f(k)} in E_2 }|`.

So this is a maximization problem.

## Variables

Let `n_1 = |V_1|` and `n_2 = |V_2|`.

- **Count:** `n_1` variables
- **Per-variable domain:** `{0, 1, ..., n_2}`
- **Meaning:** value `0` means unmatched; value `j + 1` means the vertex is matched to vertex `j` in `V_2`

A configuration is feasible iff its nonzero matched values are strictly increasing along the order of `V_1`. The objective value is the number of contacts of `E_1` whose endpoints are mapped to endpoints of a contact in `E_2`.

## Schema (data type)

**Type name:** `MaximumContactMapOverlap`

| Field | Mathematical type | Description |
|-------|-------------------|-------------|
| `num_vertices_1` | integer | Number of ordered residues/vertices in first contact map |
| `contacts_1` | set of unordered vertex pairs | Scored contacts in first contact map |
| `num_vertices_2` | integer | Number of ordered residues/vertices in second contact map |
| `contacts_2` | set of unordered vertex pairs | Scored contacts in second contact map |

Suggested size fields:
- `num_vertices_1`
- `num_vertices_2`
- `num_contacts_1`
- `num_contacts_2`

Input validation should require:
- all contact endpoints are valid vertex indices,
- no self-loops,
- no parallel duplicate contacts after normalizing unordered pairs,
- vertices are ordered implicitly by their indices.

## Complexity

The direct brute-force baseline enumerates every map from `V_1` to `V_2 union {unmatched}` and rejects those that are not order-preserving and injective.

This gives the conservative worst-case expression:

`(num_vertices_2 + 1) ^ num_vertices_1`.

This is a baseline enumeration bound, not a best-known exact algorithm bound. The CMO literature contains substantially stronger exact algorithms and integer-programming approaches.

## Extra Remark

For biological contact maps, consecutive-residue contacts are often excluded before scoring because they are trivial backbone contacts. The clean implementation convention is that the input edge sets `E_1` and `E_2` already contain exactly the contacts intended to be scored. The objective should not contain an additional special case for adjacent residues.

The mathematical model is symmetric in the two contact maps, but the internal configuration is intentionally asymmetric:

`graph 1 -> graph 2 or unmatched`.

This keeps evaluation simple while representing every order-preserving alignment.

## Reduction Rule Crossref

- [ ] #1044 

## How to solve

- [x] It can be solved by brute force over order-preserving partial alignments.
- [x] It can be reduced exactly to `ILP/bool` via #1044.
- [x] It also has specialized exact algorithms in the CMO literature.

## Example Instance

Let the first contact map have vertices `0,1,2,3` and contacts:

`E_1 = {{0,2}, {1,3}}`.

Let the second contact map have vertices `0,1,2,3,4` and contacts:

`E_2 = {{0,3}, {1,4}, {0,2}}`.

## Expected Outcome

One optimal alignment is:

| Vertex in `G_1` | Image in `G_2` |
|-----------------|----------------|
| `0` | `0` |
| `1` | `1` |
| `2` | `3` |
| `3` | `4` |

This alignment is order-preserving: `0 < 1 < 3 < 4`.

It preserves both contacts of `G_1`:
- `{0,2}` maps to `{0,3}`, which is in `E_2`,
- `{1,3}` maps to `{1,4}`, which is in `E_2`.

So the objective value is `2`. Since `G_1` has only two contacts, no alignment can preserve more than two contacts, so this is optimal.

## BibTeX

```bibtex
@article{AndonovMalodDogninYanev2011CMO,
  author  = {Rumen Andonov and Noel Malod-Dognin and Nicola Yanev},
  title   = {Maximum Contact Map Overlap Revisited},
  journal = {Journal of Computational Biology},
  volume  = {18},
  number  = {1},
  pages   = {27--41},
  year    = {2011},
  doi     = {10.1089/cmb.2009.0196},
  url     = {https://doi.org/10.1089/cmb.2009.0196}
}

@article{XieSahinidis2007CMO,
  author  = {Wei Xie and Nikolaos V. Sahinidis},
  title   = {A Reduction-Based Exact Algorithm for the Contact Map Overlap Problem},
  journal = {Journal of Computational Biology},
  volume  = {14},
  number  = {5},
  pages   = {637--654},
  year    = {2007},
  doi     = {10.1089/cmb.2007.R007},
  url     = {https://doi.org/10.1089/cmb.2007.R007}
}
```

Vertex in `G_1`	Image in `G_2`
`0`	`0`
`1`	`1`
`2`	`3`
`3`	`4`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Model] MaximumContactMapOverlap #1043

Motivation

Definition

Variables

Schema (data type)

Complexity

Extra Remark

Reduction Rule Crossref

How to solve

Example Instance

Expected Outcome

BibTeX

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Field	Mathematical type	Description
`num_vertices_1`	integer	Number of ordered residues/vertices in first contact map
`contacts_1`	set of unordered vertex pairs	Scored contacts in first contact map
`num_vertices_2`	integer	Number of ordered residues/vertices in second contact map
`contacts_2`	set of unordered vertex pairs	Scored contacts in second contact map

[Model] MaximumContactMapOverlap #1043

Description

Motivation

Definition

Variables

Schema (data type)

Complexity

Extra Remark

Reduction Rule Crossref

How to solve

Example Instance

Expected Outcome

BibTeX

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions