# privacynet

Privacy-preserving network telemetry: IP anonymization, flow aggregation, and differential privacy for sharing infrastructure data safely.
Network engineers and security teams need to share telemetry (NetFlow, sFlow, IPFIX) for collaborative analysis, benchmarking, and research. But raw flow data contains sensitive information: internal IP addresses reveal topology, individual flows expose user behavior, and exact counts can be used to fingerprint organizations.
privacynet provides a composable toolkit to sanitize network telemetry before sharing, with configurable privacy guarantees ranging from simple anonymization to formal differential privacy.
## Installation

```bash
pip install privacynet
```

Or from source:

```bash
git clone https://github.com/cwccie/privacynet.git
cd privacynet
pip install -e ".[dev]"
```

## Quick Start

```python
import pandas as pd
from privacynet import PrivacyPipeline

# Load your NetFlow data
df = pd.read_csv("flows.csv")

# Apply medium privacy (anonymize IPs + aggregate by time window)
pipeline = PrivacyPipeline(level="medium", key="your-secret-key")
safe_df = pipeline.process(df)

# Share safe_df with collaborators
safe_df.to_csv("safe_flows.csv", index=False)
```

## Privacy Levels

| Level | Anonymization | Aggregation | DP Noise | Use Case |
|---|---|---|---|---|
| `low` | Prefix-preserving IPs | No | No | Internal sharing, subnet analysis |
| `medium` | Prefix-preserving IPs | 5-min window | No | Cross-team sharing, trend analysis |
| `high` | Prefix-preserving IPs | 5-min window | Laplace | External sharing, published research |
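For intuition, a medium-level pass is roughly "pseudonymize IPs, then collapse flows into 5-minute windows." Here is a plain-pandas sketch of that idea; the column names `timestamp`, `src_ip`, `dst_ip`, and `bytes` are assumptions, and this is not privacynet's implementation.

```python
import hashlib

import pandas as pd

def medium_pass(df: pd.DataFrame, key: str) -> pd.DataFrame:
    """Toy stand-in for a medium-level pass: keyed IP pseudonyms plus
    5-minute aggregation (illustrative only, not privacynet's code)."""
    out = df.copy()

    def pseudo(ip: str) -> str:
        # Keyed hash: the same IP always maps to the same pseudonym per key.
        return hashlib.sha256((key + ip).encode()).hexdigest()[:8]

    out["src_ip"] = out["src_ip"].map(pseudo)
    out["dst_ip"] = out["dst_ip"].map(pseudo)

    # Collapse individual flows into 5-minute windows per source pseudonym.
    out["timestamp"] = pd.to_datetime(out["timestamp"])
    grouped = out.groupby([pd.Grouper(key="timestamp", freq="5min"), "src_ip"])
    return grouped["bytes"].sum().reset_index()

flows = pd.DataFrame({
    "timestamp": ["2026-01-01 00:00:30", "2026-01-01 00:02:10", "2026-01-01 00:07:00"],
    "src_ip": ["10.0.1.100", "10.0.1.100", "10.0.2.5"],
    "dst_ip": ["8.8.8.8", "8.8.4.4", "8.8.8.8"],
    "bytes": [1000, 2000, 500],
})
print(medium_pass(flows, "secret"))
```

The first two flows land in the same 5-minute window and collapse into one row, which is exactly the behavioral detail the aggregation step is meant to hide.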
## IP Anonymization

Three methods offer different privacy/utility tradeoffs.
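For background, "prefix-preserving" means that if two addresses share an n-bit prefix, their anonymized forms share an n-bit prefix too, so subnet structure survives anonymization. A toy Crypto-PAn-style sketch of that property (illustrative only; privacynet's actual scheme may differ):

```python
import hashlib
import hmac

def toy_prefix_preserving(ip: str, key: bytes) -> str:
    """Toy prefix-preserving IPv4 anonymizer (illustrative only).

    Output bit i is the input bit XORed with a keyed PRF of the i-bit
    prefix before it, so addresses sharing an n-bit prefix also share
    an n-bit anonymized prefix."""
    bits = "".join(f"{int(octet):08b}" for octet in ip.split("."))
    out_bits = ""
    for i, bit in enumerate(bits):
        prf = hmac.new(key, bits[:i].encode(), hashlib.sha256).digest()
        out_bits += str(int(bit) ^ (prf[0] & 1))
    return ".".join(str(int(out_bits[j:j + 8], 2)) for j in range(0, 32, 8))

a = toy_prefix_preserving("10.0.1.100", b"secret")
b = toy_prefix_preserving("10.0.1.200", b"secret")
# Both inputs share a /24, so the first three anonymized octets match.
print(a, b)
```

Because the PRF is keyed and deterministic, the same input always maps to the same output for a given key, which is what makes cross-dataset correlation possible for the key holder.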
```python
from privacynet import IPAnonymizer

anon = IPAnonymizer(method="prefix", key="secret")

# Prefix-preserving: maintains subnet relationships
anon.anonymize_ip("10.0.1.100")  # → deterministic mapping

# Subnet truncation: simple but effective
anon.anonymize_ip("10.0.1.100", method="truncate")  # → "10.0.1.0"

# Random mapping: maximum privacy, consistent within session
anon.anonymize_ip("10.0.1.100", method="random")  # → random but consistent

# Also supports MAC and hostname anonymization
anon.anonymize_mac("aa:bb:cc:dd:ee:ff")  # → "aa:bb:cc:<hashed>"
anon.anonymize_hostname("db-primary")  # → "host-a3f1b2c9"
```

## Flow Aggregation

Group individual flows to hide specific connections while preserving statistical value.
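As a sketch of what window aggregation with a minimum group size buys you: rows describing fewer than k flows are suppressed entirely, so no published row can single out a small set of connections. This is illustrative only, not privacynet's implementation, and the columns `timestamp`, `src_ip`, and `bytes` are assumptions.

```python
import pandas as pd

def aggregate_with_floor(df: pd.DataFrame, window: str = "5min",
                         min_group_size: int = 5) -> pd.DataFrame:
    """Sum flows per (time window, source /24) and drop any group with
    fewer than min_group_size flows. Illustrative sketch only."""
    g = df.copy()
    g["timestamp"] = pd.to_datetime(g["timestamp"])
    # Derive the /24 by dropping the last octet.
    g["src_net"] = g["src_ip"].str.rsplit(".", n=1).str[0] + ".0/24"
    grouped = g.groupby([pd.Grouper(key="timestamp", freq=window), "src_net"]).agg(
        flows=("bytes", "size"), total_bytes=("bytes", "sum"))
    # k-anonymity floor: suppress small groups instead of publishing them.
    return grouped[grouped["flows"] >= min_group_size].reset_index()

flows = pd.DataFrame({
    "timestamp": [f"2026-01-01 00:00:{s:02d}" for s in range(6)]
                 + ["2026-01-01 00:01:00", "2026-01-01 00:01:30"],
    "src_ip": [f"10.0.1.{h}" for h in range(1, 7)] + ["10.0.9.1", "10.0.9.2"],
    "bytes": [100] * 6 + [50, 50],
})
# The 10.0.9.0/24 group has only 2 flows, so it is suppressed.
print(aggregate_with_floor(flows))
```

Suppression trades completeness for privacy: small subnets vanish from the output rather than being published with identifiable precision.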
```python
from privacynet import FlowAggregator

agg = FlowAggregator(min_group_size=5)  # k-anonymity: k=5

# Aggregate by time window
result = agg.temporal_aggregate(df, window="5min")

# Aggregate by source subnet
result = agg.subnet_aggregate(df, prefix_len=24)

# Aggregate by protocol
result = agg.protocol_aggregate(df)
```

## Differential Privacy

Add calibrated noise with formal privacy guarantees.
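The standard Laplace mechanism draws noise from Lap(0, sensitivity/ε): smaller ε means stronger privacy and a wider noise distribution. A minimal numpy sketch of the mechanism itself (not privacynet's internals):

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Return value + Laplace(0, sensitivity / epsilon) noise.

    `sensitivity` is how much one individual's data can change the true
    answer (1 for a simple count); smaller `epsilon` gives a stronger
    privacy guarantee at the cost of a larger noise scale."""
    rng = rng if rng is not None else np.random.default_rng()
    return value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Adding or removing one flow changes a count by at most 1, so sensitivity=1.
rng = np.random.default_rng(0)
noisy_count = laplace_mechanism(42, sensitivity=1.0, epsilon=1.0, rng=rng)
print(noisy_count)
```

The noise is unbiased, so averages over many noisy answers converge to the truth, which is why aggregate analysis stays useful while individual records stay protected.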
```python
from privacynet import DPMechanism

dp = DPMechanism(epsilon=1.0)

# Laplace mechanism for numeric values
noisy_bytes = dp.add_laplace_noise(value=150000, sensitivity=1000)

# Private count queries
noisy_count = dp.private_count(true_count=42)

# Track privacy budget
print(f"Budget spent: {dp.budget_spent}")
print(f"Remaining: {dp.remaining_budget(total_budget=10.0)}")
```

## Utility Validation

Verify that anonymized data retains analytical utility.
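Two common utility metrics are mean relative error on aggregate values and overlap of the value distributions. A small numpy sketch of such checks; privacynet's exact metric names and formulas may differ.

```python
import numpy as np

def mean_relative_error(original, anonymized):
    """Average |orig - anon| / |orig|; 0.0 means values are unchanged."""
    o = np.asarray(original, dtype=float)
    a = np.asarray(anonymized, dtype=float)
    return float(np.mean(np.abs(o - a) / np.abs(o)))

def histogram_overlap(original, anonymized, bins=20):
    """Overlap of normalized histograms over a shared range; ~1.0 means
    the two distributions are nearly identical."""
    lo = min(np.min(original), np.min(anonymized))
    hi = max(np.max(original), np.max(anonymized))
    h1, _ = np.histogram(original, bins=bins, range=(lo, hi))
    h2, _ = np.histogram(anonymized, bins=bins, range=(lo, hi))
    p1 = h1 / h1.sum()
    p2 = h2 / h2.sum()
    return float(np.minimum(p1, p2).sum())

orig = [100.0, 200.0, 300.0]
print(mean_relative_error(orig, orig))  # 0.0 when nothing changed
print(histogram_overlap(orig, orig))    # ~1.0 for identical data
```

Low relative error says individual aggregates are still trustworthy; high distribution overlap says population-level statistics survived the sanitization.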
```python
from privacynet import PrivacyValidator

validator = PrivacyValidator()
report = validator.validate(original_df, anonymized_df, value_cols=["bytes", "packets"])

print(f"Distribution similarity (bytes): {report['dist_similarity_bytes']:.2f}")
print(f"Correlation preservation: {report['correlation_diff']:.4f}")
print(f"Mean relative error (bytes): {report['mre_bytes']:.4f}")
```

## Command-Line Interface

```bash
# Anonymize IPs in a CSV
privacynet anonymize flows.csv --method prefix --key my-secret

# Aggregate by time window
privacynet aggregate flows.csv --window 5min --min-group 5

# Run the full pipeline
privacynet pipeline flows.csv --level high --key my-secret
```

## Development

```bash
pip install -e ".[dev]"
pytest --cov=privacynet
ruff check src/ tests/
```

## License

MIT License — Copyright (c) 2026 Corey Wade