BOSPERRUS: BOundary effects in SPatial graphs: ERRor modeling and Untangling Strategies

Tutorial — Getting Started with `BosporusFlow`

BOSPORUS (BOundary effects in Spatial graphs: error modeling and Untangling Strategies) detects and corrects boundary effects in spatial graphs. Nodes close to the border of a tissue sample or point cloud tend to have systematically lower centrality scores than nodes in the center — not because they are biologically different, but because they have fewer neighbours. BOSPORUS models this distance-dependent bias and optionally corrects for it.

Installation

Within a conda environment, run

pip install git+https://github.com/bionetslab/bosporus-package.git

Pip dependencies (scikit-learn, scipy, numpy, pandas) are installed automatically. Within the environment, you also need to

conda install graph-tool -c conda-forge

Key concepts

Entry points / input types:

Like its namesake, the Flow class in the Bosporus package enables multiple entry points and flexible branching:
🟣 Given coordinates, a distance function, a graph operator (Delaunay, k-nearest neighbor, or radius nearest neighbor, and functions for node observations (degree, closeness, clustering coefficients, …).
🩷 Given coordinates, a distance function, an edge list, and functions for node observations (degree, closeness, clustering coefficients, …).
🟡 Given coordinates, a distance function, and observations.
☘️ Given distances and observations.

Distance functions:

Quick-start: one-liner with `run_all`

import numpy as np
from bosporus import BosporusFlow

rng = np.random.default_rng(42)
coords = rng.uniform(0, 100, size=(500, 2))   # 500 random 2-D points

flow = BosporusFlow(coords)

# run_all builds the graph, computes centralities, fits models, and corrects
best_fits = flow.run_all(
    graph_type="delaunay",
    params=None,
    distance_function=flow.compute_distance_to_convex_hull,
)

# Inspect results
print(flow.fit_quality[["measure", "best_fit_type", "observed_effect_strength"]])
print(flow.df.head())   # contains raw + corrected centrality columns

run_all returns a dictionary of the best Fit object for each centrality measure.

Step-by-step walkthrough

Step 1 – Create a toy dataset

We simulate a disc-shaped point cloud to produce an obvious boundary effect: nodes at the perimeter will always have fewer neighbours than nodes at the centre.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from bosporus import BosporusFlow

rng = np.random.default_rng(0)

# Sample uniformly inside a disc of radius 50
n = 600
angles = rng.uniform(0, 2 * np.pi, n)
radii  = 50 * np.sqrt(rng.uniform(0, 1, n))   # sqrt for uniform area sampling
coords = np.stack([radii * np.cos(angles),
                   radii * np.sin(angles)], axis=1)

Step 2 – Initialise `BosporusFlow`

flow = BosporusFlow(coords)

BosporusFlow takes a single argument: an (N, 2) or (N, 3) NumPy array of spatial coordinates. Internally it creates an empty DataFrame flow.df indexed by node ID.

Step 3 – Build the spatial graph

Choose one of three graph construction methods:

# Option A – Delaunay triangulation (recommended default, no hyperparameter)
flow.construct_graph("delaunay", params=None)

# Option B – k-nearest-neighbour graph (directed)
# flow.construct_graph("knn", params={"k": 6})

# Option C – radius-nearest-neighbour graph (undirected)
# flow.construct_graph("rnn", params={"r": 15.0})

The edge list is stored in flow.edge_list.

Step 4 – Compute centrality measures

measures = ["degree", "closeness", "betweenness", "harmonic", "clustering", "pagerank"]
flow.compute_centralities(measures=measures)

print(flow.df.head())

All six measures are now columns in flow.df.

Step 5 – Compute a boundary distance

BOSPORUS needs to know how far each node is from the boundary. The method returns the column key that was added to flow.df and that will be used for fitting.

# Distance to the convex hull of the point cloud
distance_key = flow.compute_distance_to_convex_hull()

# Alternatives (use at most one):
# distance_key = flow.compute_distance_to_rectangular_border()
# distance_key = flow.compute_distance_to_pointset(pointset=boundary_coords)
# distance_key = flow.compute_distance_to_mask(mask=binary_2d_array)

print(f"Distance column: '{distance_key}'")
print(flow.df[[distance_key]].describe())

Step 6 – Fit models and correct

from bosporus.fit import ConstantFit, PiecewiseLinearFit, ExponentialSaturationFit, MichaelisMentenFit

flow.fit_models(
    measures=measures,
    distance_key=distance_key,
    fits=[ConstantFit, PiecewiseLinearFit, ExponentialSaturationFit, MichaelisMentenFit],
    calculate_rel_ll_to_baseline=ConstantFit,   # ConstantFit = "no boundary effect" baseline
)

After this step:

flow.fit_quality — a DataFrame with one row per measure summarising the best model, AIC-based statistics, effect strength, and half-life.
flow.df — now also contains columns named "BOSPORUS corrected <measure>" for each centrality.

Reading the results

# Overview table
print(flow.fit_quality[[
    "measure",
    "best_fit_type",
    "observed_effect_strength",   # signed relative drop from border to centre
    "observed_half_life",         # distance at which ~50% of the effect has decayed
    "affected samples",           # fraction of nodes still in the boundary-affected zone
]].to_string(index=False))

observed_effect_strength is defined as (c_border − c_centre) / (c_border + c_centre). A value close to 0 means negligible boundary bias; a value near ±1 means strong bias.

best_fit_type tells you which model won the AIC comparison:

Model	Interpretation
`Constant Fit`	No detectable boundary effect
`Piecewise Linear Fit`	Linear drop up to a breakpoint, then flat plateau
`Exponential Saturation Fit`	Smooth exponential approach to a plateau
`Michaelis-Menten Fit`	Hyperbolic saturation (analogous to enzyme kinetics)

To access the corrected values:

# All corrected columns at once
corrected_cols = [c for c in flow.df.columns if c.startswith("BOSPORUS corrected")]
print(flow.df[corrected_cols].head())

# Single measure
flow.df[["degree", "BOSPORUS corrected degree"]].head()

To access fit parameters directly:

degree_fit = flow.best_fits["degree"]
print(degree_fit.name)      # e.g. "Exponential Saturation Fit"
print(degree_fit.params)    # fitted parameter dict
print(degree_fit.AIC)

Choosing a graph type

Graph type	Key parameter	When to use
`"delaunay"`	none	Default choice; produces a natural triangulation without isolated nodes
`"knn"`	`k` (int)	When you want a fixed connectivity per node
`"rnn"`	`r` (float, same units as coords)	When biological interaction radius is known

Choosing a distance function

Method	Use case
`compute_distance_to_convex_hull()`	General default; works for any convex or near-convex tissue shape
`compute_distance_to_rectangular_border()`	Rectangular imaging window / biopsy
`compute_distance_to_pointset(pointset)`	Border defined by a set of explicit landmark coordinates
`compute_distance_to_mask(mask)`	Binary image mask defining the tissue boundary

All four methods accept an optional distance_key string argument to rename the distance column in flow.df.

Visualising the correction (optional)

import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

sc = axes[0].scatter(coords[:, 0], coords[:, 1],
                     c=flow.df["degree"], cmap="viridis", s=8)
axes[0].set_title("Raw degree centrality")
plt.colorbar(sc, ax=axes[0])

sc2 = axes[1].scatter(coords[:, 0], coords[:, 1],
                      c=flow.df["BOSPORUS corrected degree"], cmap="viridis", s=8)
axes[1].set_title("BOSPORUS-corrected degree centrality")
plt.colorbar(sc2, ax=axes[1])

plt.tight_layout()
plt.savefig("bosporus_correction.png", dpi=150)
plt.show()

Nodes near the border should appear more uniform after correction.

API reference summary

`BosporusFlow(coordinates)`

Method	Signature	Description
`run_all`	`(graph_type, params, distance_function, ...)`	Single-call convenience wrapper
`construct_graph`	`(graph_type, params)`	Build edge list
`compute_centralities`	`(measures)`	Populate `flow.df` with centrality columns
`compute_distance_to_convex_hull`	`(distance_key=None)`	Add convex-hull distance column
`compute_distance_to_rectangular_border`	`(distance_key=None)`	Add rectangular-border distance column
`compute_distance_to_pointset`	`(pointset, distance_key=None)`	Add distance-to-pointset column
`compute_distance_to_mask`	`(mask, distance_key=None)`	Add distance-to-mask column
`fit_models`	`(measures, distance_key, fits, ...)`	Fit all models, select best by AIC, correct centralities

Key attributes after a full run

Attribute	Type	Content
`flow.df`	`pd.DataFrame`	All node-level data: coordinates index, centralities, distances, corrected values
`flow.fit_quality`	`pd.DataFrame`	Per-measure model selection summary
`flow.best_fits`	`dict[str, Fit]`	Best `Fit` object per measure
`flow.edge_list`	`set`	Set of edges (frozensets or tuples)

Name		Name	Last commit message	Last commit date
Latest commit History 38 Commits
.github/workflows		.github/workflows
bosporus		bosporus
plots_readme		plots_readme
tests		tests
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

BOSPERRUS: BOundary effects in SPatial graphs: ERRor modeling and Untangling Strategies

Tutorial — Getting Started with `BosporusFlow`

Table of Contents

Installation

Key concepts

Quick-start: one-liner with `run_all`

Step-by-step walkthrough

Step 1 – Create a toy dataset

Step 2 – Initialise `BosporusFlow`

Step 3 – Build the spatial graph

Step 4 – Compute centrality measures

Step 5 – Compute a boundary distance

Step 6 – Fit models and correct

Reading the results

Choosing a graph type

Choosing a distance function

Visualising the correction (optional)

API reference summary

`BosporusFlow(coordinates)`

Key attributes after a full run

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

BOSPERRUS: BOundary effects in SPatial graphs: ERRor modeling and Untangling Strategies

Tutorial — Getting Started with BosporusFlow

Table of Contents

Installation

Key concepts

Quick-start: one-liner with run_all

Step-by-step walkthrough

Step 1 – Create a toy dataset

Step 2 – Initialise BosporusFlow

Step 3 – Build the spatial graph

Step 4 – Compute centrality measures

Step 5 – Compute a boundary distance

Step 6 – Fit models and correct

Reading the results

Choosing a graph type

Choosing a distance function

Visualising the correction (optional)

API reference summary

BosporusFlow(coordinates)

Key attributes after a full run

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Tutorial — Getting Started with `BosporusFlow`

Quick-start: one-liner with `run_all`

Step 2 – Initialise `BosporusFlow`

`BosporusFlow(coordinates)`

Packages