[Rule] MaximumContactMapOverlap to ILP

## Source

MaximumContactMapOverlap

Source model issue: #1043

## Target

ILP/bool

## Motivation

This rule gives `MaximumContactMapOverlap` immediate solver connectivity through the existing ILP hub.

The formulation follows the standard contact-map-overlap idea: choose an order-preserving residue alignment and maximize the number of contact pairs that are superimposed by the alignment.

## Reference

- Rumen Andonov, Noel Malod-Dognin, and Nicola Yanev, "Maximum Contact Map Overlap Revisited," *Journal of Computational Biology* 18(1):27-41, 2011. https://doi.org/10.1089/cmb.2009.0196
- Wei Xie and Nikolaos V. Sahinidis, "A reduction-based exact algorithm for the contact map overlap problem," *Journal of Computational Biology* 14(5):637-654, 2007. https://doi.org/10.1089/cmb.2007.R007

## Reduction Algorithm

Let the source instance contain ordered contact maps
`G_1 = (V_1,E_1)` and `G_2 = (V_2,E_2)`, with
`V_1 = {0,...,n_1-1}` and `V_2 = {0,...,n_2-1}`.

Construct a binary ILP as follows.

1. For each possible residue match `(i,j) in V_1 x V_2`, create a binary variable `x_{i,j}`.
   It means vertex `i` of `G_1` is aligned to vertex `j` of `G_2`.

2. For each pair of contacts `{i,k} in E_1` and `{j,l} in E_2` with `i < k` and `j < l`, create a binary variable `y_{i,k,j,l}`.
   It means contact `{i,k}` is preserved by aligning `i` to `j` and `k` to `l`.

3. Add row constraints for every `i in V_1`:

   `sum_j x_{i,j} <= 1`.

   Each vertex of `G_1` is matched at most once.

4. Add column constraints for every `j in V_2`:

   `sum_i x_{i,j} <= 1`.

   Each vertex of `G_2` is matched at most once.

5. Add order-preservation constraints. For every `i < k` in `V_1` and every `j >= l` in `V_2`, add:

   `x_{i,j} + x_{k,l} <= 1`.

   This forbids crossings and equal-image matches, so all matched images are strictly increasing.

6. Link each preserved-contact variable to its endpoint-match variables:

   `y_{i,k,j,l} <= x_{i,j}`

   `y_{i,k,j,l} <= x_{k,l}`.

   Because every `y` has positive objective coefficient and the objective is maximization, an optimal solution sets `y_{i,k,j,l} = 1` exactly when the two endpoint matches are selected.

7. Maximize:

   `sum y_{i,k,j,l}`

   over all contact pairs from `E_1 x E_2` that respect endpoint order.

8. Extract the CMO alignment by mapping `i` to the unique `j` with `x_{i,j} = 1`, and leaving `i` unmatched if all `x_{i,j}` are zero.

## Correctness Argument

The row and column constraints make the selected `x` variables a partial injective matching between the two ordered vertex sets. The crossing constraints ensure that if `i < k` in the first contact map and both vertices are matched, then their images occur in the same order in the second contact map. Therefore every feasible ILP assignment projects to a feasible order-preserving partial alignment.

Conversely, any feasible CMO alignment defines a feasible ILP assignment by setting `x_{i,f(i)} = 1` for matched vertices and all other `x` variables to zero. The alignment is injective and order-preserving, so all row, column, and crossing constraints are satisfied.

For every contact `{i,k} in E_1` and contact `{j,l} in E_2`, the corresponding `y` variable can be one only when both endpoint matches are selected. Since all `y` variables have positive objective coefficient and no negative side effect, an optimal ILP sets exactly the preserved-contact `y` variables to one.

Thus the ILP objective equals the number of contacts preserved by the alignment, and maximizing the ILP objective is exactly maximizing contact-map overlap.

## Size Overhead

Let:
- `n_1 = num_vertices_1`,
- `n_2 = num_vertices_2`,
- `m_1 = num_contacts_1`,
- `m_2 = num_contacts_2`.

A direct formulation has:

| Target metric | Formula |
|---------------|---------|
| `num_vars` | `n_1 * n_2 + m_1 * m_2` |
| `num_constraints` | `n_1 + n_2 + n_1^2 * n_2^2 + 2 * m_1 * m_2` |

The crossing-constraint count is a conservative quadratic-pair bound. Implementations may generate only the invalid ordered pairs `i < k` and `j >= l`, but the asymptotic overhead remains `O(n_1^2 n_2^2 + m_1 m_2)`.

## Validation Method

Use closed-loop validation on small contact maps:
- enumerate all order-preserving partial alignments directly,
- compute the maximum contact overlap by brute force,
- solve the ILP instance,
- extract the alignment from the `x` variables,
- verify that the extracted alignment is order-preserving and preserves exactly the ILP objective number of contacts.

Include cases where:
- the best alignment leaves some residues unmatched,
- a crossing alignment would preserve more contacts but must be rejected,
- two alignments tie for optimum,
- no contact is preserved,
- all contacts of the smaller map can be preserved.

## Example

Use the source instance from #1043.

First contact map:

`E_1 = {{0,2}, {1,3}}`.

Second contact map:

`E_2 = {{0,3}, {1,4}, {0,2}}`.

The optimal alignment is:

| Vertex in `G_1` | Image in `G_2` |
|-----------------|----------------|
| `0` | `0` |
| `1` | `1` |
| `2` | `3` |
| `3` | `4` |

Set the corresponding match variables to one:
- `x_{0,0} = 1`,
- `x_{1,1} = 1`,
- `x_{2,3} = 1`,
- `x_{3,4} = 1`.

Then the preserved-contact variables include:
- `y_{0,2,0,3} = 1`, since `{0,2}` maps to `{0,3}`,
- `y_{1,3,1,4} = 1`, since `{1,3}` maps to `{1,4}`.

The ILP objective value is therefore `2`. Since `G_1` has only two contacts, this is optimal and matches the source CMO objective.

## BibTeX

```bibtex
@article{AndonovMalodDogninYanev2011CMO,
  author  = {Rumen Andonov and Noel Malod-Dognin and Nicola Yanev},
  title   = {Maximum Contact Map Overlap Revisited},
  journal = {Journal of Computational Biology},
  volume  = {18},
  number  = {1},
  pages   = {27--41},
  year    = {2011},
  doi     = {10.1089/cmb.2009.0196},
  url     = {https://doi.org/10.1089/cmb.2009.0196}
}

@article{XieSahinidis2007CMO,
  author  = {Wei Xie and Nikolaos V. Sahinidis},
  title   = {A Reduction-Based Exact Algorithm for the Contact Map Overlap Problem},
  journal = {Journal of Computational Biology},
  volume  = {14},
  number  = {5},
  pages   = {637--654},
  year    = {2007},
  doi     = {10.1089/cmb.2007.R007},
  url     = {https://doi.org/10.1089/cmb.2007.R007}
}
```


Target metric	Formula
`num_vars`	`n_1 * n_2 + m_1 * m_2`
`num_constraints`	`n_1 + n_2 + n_1^2 * n_2^2 + 2 * m_1 * m_2`

Vertex in `G_1`	Image in `G_2`
`0`	`0`
`1`	`1`
`2`	`3`
`3`	`4`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Rule] MaximumContactMapOverlap to ILP #1044

Source

Target

Motivation

Reference

Reduction Algorithm

Correctness Argument

Size Overhead

Validation Method

Example

BibTeX

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Rule] MaximumContactMapOverlap to ILP #1044

Description

Source

Target

Motivation

Reference

Reduction Algorithm

Correctness Argument

Size Overhead

Validation Method

Example

BibTeX

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions