privacynet

Privacy-preserving network telemetry — IP anonymization, flow aggregation, and differential privacy for sharing infrastructure data safely.

The Problem

Network engineers and security teams need to share telemetry (NetFlow, sFlow, IPFIX) for collaborative analysis, benchmarking, and research. Raw flow data, however, is sensitive: internal IP addresses reveal topology, individual flows expose user behavior, and exact counts can be used to fingerprint an organization.

privacynet provides a composable toolkit to sanitize network telemetry before sharing, with configurable privacy guarantees ranging from simple anonymization to formal differential privacy.

Installation

pip install privacynet

Or from source:

git clone https://github.com/cwccie/privacynet.git
cd privacynet
pip install -e ".[dev]"

Quick Start

import pandas as pd
from privacynet import PrivacyPipeline

# Load your NetFlow data
df = pd.read_csv("flows.csv")

# Apply medium privacy (anonymize IPs + aggregate by time window)
pipeline = PrivacyPipeline(level="medium", key="your-secret-key")
safe_df = pipeline.process(df)

# Share safe_df with collaborators
safe_df.to_csv("safe_flows.csv", index=False)

Privacy Levels

| Level | Anonymization | Aggregation | DP Noise | Use Case |
|-------|---------------|-------------|----------|----------|
| `low` | Prefix-preserving IPs | No | No | Internal sharing, subnet analysis |
| `medium` | Prefix-preserving IPs | 5-min window | No | Cross-team sharing, trend analysis |
| `high` | Prefix-preserving IPs | 5-min window | Laplace | External sharing, published research |

Components

IP Anonymization

Three methods for different privacy/utility tradeoffs:

from privacynet import IPAnonymizer

anon = IPAnonymizer(method="prefix", key="secret")

# Prefix-preserving: maintains subnet relationships
anon.anonymize_ip("10.0.1.100")           # → deterministic mapping

# Subnet truncation: simple but effective
anon.anonymize_ip("10.0.1.100", method="truncate")  # → "10.0.1.0"

# Random mapping: maximum privacy, consistent within session
anon.anonymize_ip("10.0.1.100", method="random")    # → random but consistent

# Also supports MAC and hostname anonymization
anon.anonymize_mac("aa:bb:cc:dd:ee:ff")   # → "aa:bb:cc:<hashed>"
anon.anonymize_hostname("db-primary")      # → "host-a3f1b2c9"
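
To make the difference between the prefix-preserving and truncation methods concrete, here is a minimal standalone sketch. It is not privacynet's implementation: the function names are hypothetical, and the prefix-preserving part only loosely follows the bit-by-bit construction used by schemes such as Crypto-PAn, with HMAC-SHA256 swapped in as the keyed pseudorandom function.

```python
import hashlib
import hmac
import ipaddress


def prefix_preserving_anonymize(ip: str, key: bytes) -> str:
    """Map an IPv4 address so that addresses sharing a prefix still share one.

    Hypothetical helper for illustration; not part of the privacynet API.
    """
    bits = format(int(ipaddress.IPv4Address(ip)), "032b")
    out = []
    for i in range(32):
        # The flip decision for bit i depends only on the i bits before it,
        # so inputs that agree on their first n bits produce outputs that
        # agree on their first n bits.
        digest = hmac.new(key, f"{i}:{bits[:i]}".encode(), hashlib.sha256).digest()
        flip = digest[0] & 1
        out.append(str(int(bits[i]) ^ flip))
    return str(ipaddress.IPv4Address(int("".join(out), 2)))


def truncate_to_subnet(ip: str, prefix_len: int = 24) -> str:
    """Zero the host bits, keeping only the network address."""
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network.network_address)
```

With the same key, two hosts in the same /24 map to two distinct addresses that still fall in a common /24, which is what keeps subnet-level analysis possible. Truncation instead collapses every host in the subnet to one address.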

Flow Aggregation

Group individual flows to hide specific connections while preserving statistical value:

from privacynet import FlowAggregator

agg = FlowAggregator(min_group_size=5)  # k-anonymity: k=5

# Aggregate by time window
result = agg.temporal_aggregate(df, window="5min")

# Aggregate by source subnet
result = agg.subnet_aggregate(df, prefix_len=24)

# Aggregate by protocol
result = agg.protocol_aggregate(df)
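
The idea behind temporal aggregation with a minimum group size can be sketched in plain Python. This is an illustration under assumptions, not the library's code: the record fields (`timestamp` in epoch seconds, `bytes`, `packets`) and the choice to drop undersized windows entirely are hypothetical.

```python
from collections import defaultdict


def temporal_aggregate(flows, window_seconds=300, min_group_size=5):
    """Sum bytes and packets per time window and suppress small windows.

    Hypothetical stand-in for illustration; `flows` is an iterable of dicts
    with assumed keys "timestamp" (epoch seconds), "bytes", and "packets".
    """
    buckets = defaultdict(lambda: {"flows": 0, "bytes": 0, "packets": 0})
    for flow in flows:
        ts = int(flow["timestamp"])
        # Round the timestamp down to the start of its window.
        window_start = ts - (ts % window_seconds)
        bucket = buckets[window_start]
        bucket["flows"] += 1
        bucket["bytes"] += flow["bytes"]
        bucket["packets"] += flow["packets"]
    # Keep only windows that contain at least `min_group_size` flows, so no
    # released row describes fewer than that many connections.
    return {
        start: totals
        for start, totals in sorted(buckets.items())
        if totals["flows"] >= min_group_size
    }
```

A window holding a single flow would otherwise pass that flow's exact byte count straight through to the output, which is why the size check runs after grouping rather than before.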

Differential Privacy

Add calibrated noise with formal privacy guarantees:

from privacynet import DPMechanism

dp = DPMechanism(epsilon=1.0)

# Laplace mechanism for numeric values
noisy_bytes = dp.add_laplace_noise(value=150000, sensitivity=1000)

# Private count queries
noisy_count = dp.private_count(true_count=42)

# Track privacy budget
print(f"Budget spent: {dp.budget_spent}")
print(f"Remaining: {dp.remaining_budget(total_budget=10.0)}")
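
For readers new to the Laplace mechanism, the sketch below shows how noise is calibrated and how a budget can be tracked. It assumes the standard formulation: noise scale equals sensitivity divided by epsilon, and each query adds its epsilon to the total under sequential composition. The class name is hypothetical, the method names simply mirror the example above, and Python's `random` module is used for readability only.

```python
import random


class LaplaceSketch:
    """Illustrative Laplace mechanism; not the privacynet DPMechanism class."""

    def __init__(self, epsilon, seed=None):
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        self.epsilon = epsilon
        self.budget_spent = 0.0
        self._rng = random.Random(seed)

    def add_laplace_noise(self, value, sensitivity):
        if sensitivity <= 0:
            raise ValueError("sensitivity must be positive")
        # Smaller epsilon or larger sensitivity means a wider noise scale.
        scale = sensitivity / self.epsilon
        # The difference of two independent exponentials with mean `scale`
        # follows a Laplace distribution centred on zero with that scale.
        noise = self._rng.expovariate(1 / scale) - self._rng.expovariate(1 / scale)
        # Sequential composition: every answered query costs epsilon.
        self.budget_spent += self.epsilon
        return value + noise

    def remaining_budget(self, total_budget):
        return total_budget - self.budget_spent
```

With `epsilon=1.0` and `sensitivity=1000`, the noise scale is 1000, so a true value of 150000 typically comes back within a few thousand of the original.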

Validation

Verify that anonymized data retains analytical utility:

from privacynet import PrivacyValidator

validator = PrivacyValidator()
report = validator.validate(original_df, anonymized_df, value_cols=["bytes", "packets"])

print(f"Distribution similarity (bytes): {report['dist_similarity_bytes']:.2f}")
print(f"Correlation preservation: {report['correlation_diff']:.4f}")
print(f"Mean relative error (bytes): {report['mre_bytes']:.4f}")
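
As a rough guide to reading the last metric, mean relative error is commonly defined as the average of |released - original| / |original| over paired values. The helper below is a hypothetical sketch of that definition. How `PrivacyValidator` pairs rows and handles zeros is not documented here, so skipping zero-valued originals is an assumption.

```python
def mean_relative_error(original, released):
    """Average relative deviation of released values from the originals.

    Hypothetical helper; pairs with a zero original value are skipped
    because the relative error is undefined for them.
    """
    pairs = [(o, r) for o, r in zip(original, released) if o != 0]
    if not pairs:
        raise ValueError("no non-zero original values to compare")
    return sum(abs(r - o) / abs(o) for o, r in pairs) / len(pairs)
```

A value of 0.10 under this definition means the released figures deviate from the originals by 10% on average.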

CLI

# Anonymize IPs in a CSV
privacynet anonymize flows.csv --method prefix --key my-secret

# Aggregate by time window
privacynet aggregate flows.csv --window 5min --min-group 5

# Run the full pipeline
privacynet pipeline flows.csv --level high --key my-secret

Development

pip install -e ".[dev]"
pytest --cov=privacynet
ruff check src/ tests/
