Companion rule for the model proposal in #1046.
Source / Target
Motivation
A direct integer-linear-programming formulation for MAAF gives an immediate solver path via the existing ILP solver backend, independent of the structural MAAF → MinimumFeedbackVertexSet route. The formulation is compact and natural: cut-indicator variables per tree edge plus block-ancestor indicator variables plus a topological-rank constraint enforcing acyclicity.
Reference
Wu, "Close lower and upper bounds for the minimum reticulate network of multiple phylogenetic trees", Bioinformatics 26(12), 2010. https://doi.org/10.1093/bioinformatics/btq198
See also Chen, Wang, "Algorithms for reticulate networks of multiple phylogenetic trees", IEEE/ACM TCBB 9(2), 2012 for a related ILP.
Reduction Algorithm
Let $T_1, T_2$ be the input rooted binary phylogenetic trees on leaf set $X$, $n = |X|$. Let $E_t$ denote the non-root edges of $T_t$ ($|E_t| = 2n - 2$).
-
Cut-indicator variables. For each non-root edge $e \in E_1 \cup E_2$, introduce a binary variable $y_e \in {0,1}$: $y_e = 1$ iff $e$ is cut (removed from $T_t$).
-
Block-ancestor variables. Let $B = {1,\dots,n}$ be candidate block indices. For each ordered pair $(i, j) \in B \times B$ with $i \neq j$, introduce a binary variable $z_{ij} \in {0,1}$ indicating the arc $i \to j$ in the block ancestor digraph.
-
Rank variables. For each $i \in B$, introduce an integer variable $r_i \in {0, 1, \dots, n-1}$.
-
Partition-agreement constraints. For each pair of leaves $(\ell, m) \in X \times X$, let $P_t(\ell, m)$ be the unique edge path from $\ell$ to $m$ in $T_t$. The constraint
$$\sum_{e \in P_1(\ell,m)} y_e ;=; 0 ;;\Longleftrightarrow;; \sum_{e \in P_2(\ell,m)} y_e ;=; 0$$
(linearized via auxiliary indicators $s_{\ell,m}^{(t)} \in {0,1}$ with $s_{\ell,m}^{(t)} \leq \sum_{e \in P_t(\ell,m)} y_e$ and $s_{\ell,m}^{(t)} \geq y_e$ for every $e \in P_t(\ell,m)$, plus $s_{\ell,m}^{(1)} = s_{\ell,m}^{(2)}$) enforces that $\ell$ and $m$ lie in the same block of $F$ iff neither tree cuts their connecting path.
-
Acyclicity constraints. For each $(i, j) \in B \times B$, $i \neq j$: $r_i - r_j + n \cdot z_{ij} \leq n - 1$ (standard Miller–Tucker–Zemlin-style rank constraint, infeasible precisely when the $z$-digraph contains a cycle).
-
Objective.
$$\text{minimize} ;; 1 + \sum_{e \in E_1} y_e.$$
(Each cut in $T_1$ produces one additional block; by the partition-agreement constraints the number of $T_1$-cuts equals the number of $T_2$-cuts and equals $|F| - 1$.)
Return the resulting ILP as the target instance.
Size Overhead
Target size fields from pred show ILP → size_fields = [num_constraints, num_variables].
| Field |
Formula |
num_variables |
4 * num_leaves + num_leaves ^ 2 |
num_constraints |
num_leaves ^ 2 |
(The $4n$ term counts the $y_e$ and $s^{(t)}{\ell,m}$ variables up to lower-order terms; $n^2$ bounds the $z{ij}$ plus partition-agreement equalities.)
Validation Method
Closed-loop test: construct the source instance, apply the reduction, solve the target ILP (brute-force or ILP solver), lift back to a partition $F$, and verify $|F| - 1$ matches the ground-truth hybridization number.
Example (fully worked)
Source. $X = {a,b,c,d}$, $T_1 = ((a,b),(c,d))$, $T_2 = ((a,c),(b,d))$.
Construction. $T_1$ and $T_2$ each have 6 non-root edges. The ILP has 12 $y$ variables, 12 $z$ variables, 4 $r$ variables, 32 $s$ variables (2 per ordered leaf pair per tree), with 16 partition-agreement equalities and 12 acyclicity constraints.
Target solved. Optimal objective = $1 + 3 = 4$; optimal $y$ assignment cuts 3 edges in $T_1$ (all three internal edges incident to the root on either side), producing the partition ${{a},{b},{c},{d}}$.
Lifted solution. $F = {{a},{b},{c},{d}}$, $|F| = 4$, hybridization number = 3.
BibTeX
@article{Wu2010,
author = {Wu, Yufeng},
title = {Close lower and upper bounds for the minimum reticulate network of multiple phylogenetic trees},
journal = {Bioinformatics},
volume = {26},
number = {12},
pages = {i140--i148},
year = {2010},
doi = {10.1093/bioinformatics/btq198},
}
Companion rule for the model proposal in #1046.
Source / Target
MaximumAcyclicAgreementForest(default variant) — see [Model] MaximumAcyclicAgreementForest #1046ILP/i32Motivation
A direct integer-linear-programming formulation for MAAF gives an immediate solver path via the existing
ILPsolver backend, independent of the structuralMAAF → MinimumFeedbackVertexSetroute. The formulation is compact and natural: cut-indicator variables per tree edge plus block-ancestor indicator variables plus a topological-rank constraint enforcing acyclicity.Reference
Wu, "Close lower and upper bounds for the minimum reticulate network of multiple phylogenetic trees", Bioinformatics 26(12), 2010. https://doi.org/10.1093/bioinformatics/btq198
See also Chen, Wang, "Algorithms for reticulate networks of multiple phylogenetic trees", IEEE/ACM TCBB 9(2), 2012 for a related ILP.
Reduction Algorithm
Let$T_1, T_2$ be the input rooted binary phylogenetic trees on leaf set $X$ , $n = |X|$ . Let $E_t$ denote the non-root edges of $T_t$ ($|E_t| = 2n - 2$ ).
(linearized via auxiliary indicators
(Each cut in
Return the resulting ILP as the target instance.
Size Overhead
Target size fields from
pred show ILP→size_fields = [num_constraints, num_variables].num_variables4 * num_leaves + num_leaves ^ 2num_constraintsnum_leaves ^ 2(The$4n$ term counts the $y_e$ and $s^{(t)}{\ell,m}$ variables up to lower-order terms; $n^2$ bounds the $z{ij}$ plus partition-agreement equalities.)
Validation Method
Closed-loop test: construct the source instance, apply the reduction, solve the target ILP (brute-force or ILP solver), lift back to a partition$F$ , and verify $|F| - 1$ matches the ground-truth hybridization number.
Example (fully worked)
Source.$X = {a,b,c,d}$ , $T_1 = ((a,b),(c,d))$ , $T_2 = ((a,c),(b,d))$ .
Construction.$T_1$ and $T_2$ each have 6 non-root edges. The ILP has 12 $y$ variables, 12 $z$ variables, 4 $r$ variables, 32 $s$ variables (2 per ordered leaf pair per tree), with 16 partition-agreement equalities and 12 acyclicity constraints.
Target solved. Optimal objective =$1 + 3 = 4$ ; optimal $y$ assignment cuts 3 edges in $T_1$ (all three internal edges incident to the root on either side), producing the partition ${{a},{b},{c},{d}}$ .
Lifted solution.$F = {{a},{b},{c},{d}}$ , $|F| = 4$ , hybridization number = 3.
BibTeX