Context Compressor

Shrink LLM context windows by removing noise, redundancy, and long-tail detail without losing the signal. Typically 40–80% fewer tokens on noisy, repetitive input, and about 16% on the least repetitive datasets in the benchmarks (see Benchmarks).

🧩 Part of the Agent Loop Toolkit — three small, zero-dependency, framework-agnostic libraries you bolt onto any agent loop. Each works standalone; together they cover context → gate → steer.

Where it plugs in	Library	What it does
The context going in ← you are here	context-compressor	Shrink the LLM context window 40–80%: drop noise, redundancy, long-tail detail
The plan, before a step runs	precheck-guardian	Preview the plan, see per-step risk, approve / reject / edit
The run, while it's live	something-else	Interject, pause, or guard a live loop without restarting

Long agent loops, verbose tool output, and 10,000-line security scans blow past even a 200K context window — and every redundant token you send costs latency and money. Context Compressor is a small, fast, rule-based pipeline that strips the fat out of any text before it reaches the model.

from context_compressor import ContextCompressor

compressor = ContextCompressor()
result = compressor.compress(noisy_log_text)

print(result.stats.summary())   # 136 -> 65 tokens (52.2% smaller, backend=tiktoken)
print(result.compressed)        # the cleaned text, ready to send to your LLM

Why

Problem	What happens	What this does
Context overflow	Multi-turn agents accumulate history until the window overflows and the run breaks.	Collapse repeated turns and boilerplate phrasing.
Verbose tool output	A vuln scan or `SELECT *` dumps thousands of near-identical lines, 90% noise.	Drop noise, dedupe rows, trim long-tail detail.
Token cost	Every wasted token is latency + dollars on every call.	40–80% token reduction on noisy input, measured with `tiktoken`.

It works on anything text: chat transcripts, application logs, JSON blobs, SQL result dumps, and security scanner output.

Highlights

Zero required dependencies. Pure Python standard library. tiktoken is optional — without it a built-in heuristic counter is used automatically.
Lossless-leaning by default. Removals are high-precision; counts are preserved (port 22 open [x3]) rather than silently dropped.
Composable pipeline. Toggle each stage, tune thresholds, or add your own noise patterns via plain dataclass config.
Measured, not guessed. Every run returns before/after token counts and a per-stage breakdown.
Security-aware. A dedicated summarizer turns raw scanner output into a severity-ranked brief.
CLI included. cat scan.log | context-compress --stats.
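
The optional-tiktoken behaviour mentioned in the highlights can be pictured with the sketch below. It is an illustration only, not the library's `TokenCounter`: the function names, the `cl100k_base` encoding, and the four-characters-per-token rule of thumb are all assumptions made for the example.

```python
import math


def heuristic_token_count(text: str) -> int:
    """Rough estimate: about 4 characters per token (an assumed rule of thumb)."""
    return math.ceil(len(text) / 4) if text else 0


def count_tokens(text: str) -> tuple[int, str]:
    """Return (token_count, backend), preferring tiktoken when it is installed."""
    try:
        import tiktoken  # optional dependency
    except ImportError:
        # No tiktoken available: fall back to the character-based estimate.
        return heuristic_token_count(text), "heuristic"
    encoding = tiktoken.get_encoding("cl100k_base")  # assumed encoding choice
    return len(encoding.encode(text)), "tiktoken"
```

Calling `count_tokens(text)` returns the count together with the backend that produced it, which is the same kind of information the `backend=...` field in the stats summary reports.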

How it compares

	Context Compressor	LLM-based compressors (e.g. LLMLingua)	Manual truncation
Approach	Rule-based pipeline	A model scores/keeps tokens	Cut to last N tokens
Dependencies	Zero (tiktoken optional)	A language model + GPU	None
Speed	~milliseconds, CPU	Model inference latency	Instant
Deterministic	Yes	No	Yes
Preserves structure	Yes (counts, severity, JSON shape)	Partly	No — drops whole tail
Best at	Noisy/repetitive logs, scans, transcripts	Dense natural-language prose	Quick-and-dirty

They're complementary: use this to cheaply strip the obvious 40–80% of noise on CPU, and reach for an LLM-based compressor only when you need to squeeze dense prose further.

Install

pip install llm-context-compressor                 # zero dependencies
pip install "llm-context-compressor[tiktoken]"     # exact token counts for OpenAI encodings (approximate for other models)

The install name is llm-context-compressor; the import name is context_compressor (like scikit-learn → import sklearn).

Or from source:

git clone https://github.com/uninhibited-scholar/context-compressor
cd context-compressor
pip install -e ".[dev]"
pytest

How it works

The pipeline runs the cheap, high-precision rule-based stages first. Extractive summarization runs last, and only when a target_ratio is set and the rule-based stages alone did not reach it:

raw text
   │
   ▼  1. NoiseFilter        drop timestamps, progress bars, status chatter, separators
   ▼  2. DetailTrimmer      shorten long strings, cap JSON depth, collapse table runs, strip log metadata
   ▼  3. PatternRemover     collapse repeated agent phrasing ("as I mentioned…")
   ▼  4. RedundancyFilter   dedupe identical/near-identical lines, keep counts
   ▼  5. ExtractiveSummarizer   (optional) TextRank-style, only if still over target
   │
   ▼
compressed text  +  full token/stage metrics
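
To make the stage ordering concrete, here is a minimal sketch of two such stages chained together with a per-stage report. It is not the library's code: the regexes, the function names (`drop_noise`, `dedupe_with_counts`, `run_pipeline`), and the sample log are hypothetical, chosen only to show the idea of dropping noise first and then deduplicating while keeping counts.

```python
import re
from collections import Counter

# Hypothetical noise rules: leading ISO-like timestamps and separator-only lines.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}\S*\s*")
SEPARATOR = re.compile(r"^[-=*_]{3,}$")


def drop_noise(text: str) -> str:
    """Strip leading timestamps and drop separator-only lines."""
    kept = []
    for line in text.splitlines():
        line = TIMESTAMP.sub("", line)
        if SEPARATOR.match(line.strip()):
            continue
        kept.append(line)
    return "\n".join(kept)


def dedupe_with_counts(text: str) -> str:
    """Keep the first occurrence of each non-blank line and append its repeat count."""
    lines = text.splitlines()
    counts = Counter(lines)
    seen = set()
    out = []
    for line in lines:
        if not line.strip():
            out.append(line)  # leave blank lines untouched
            continue
        if line in seen:
            continue
        seen.add(line)
        out.append(f"{line} [x{counts[line]}]" if counts[line] > 1 else line)
    return "\n".join(out)


def run_pipeline(text: str, stages) -> tuple[str, list[tuple[str, int]]]:
    """Apply stages in order and record the characters removed by each one."""
    report = []
    for stage in stages:
        before = len(text)
        text = stage(text)
        report.append((stage.__name__, before - len(text)))
    return text, report


raw = "\n".join([
    "2024-05-01 12:00:01 port 22 open",
    "2024-05-01 12:00:02 port 22 open",
    "==========",
    "2024-05-01 12:00:03 port 22 open",
    "2024-05-01 12:00:04 scan finished",
])
compressed, report = run_pipeline(raw, [drop_noise, dedupe_with_counts])
print(compressed)
# port 22 open [x3]
# scan finished
```

Running the noise stage first is what lets the three "port 22 open" lines collapse: with their differing timestamps still attached they would not be identical.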

Usage

Presets

from context_compressor import ContextCompressor, CompressionConfig

ContextCompressor(CompressionConfig.conservative())  # safe, lossless-ish
ContextCompressor()                                  # balanced default
ContextCompressor(CompressionConfig.aggressive())    # smallest output

Hit a target size

cfg = CompressionConfig(target_ratio=0.3)   # aim for 30% of the original
result = ContextCompressor(cfg).compress(long_transcript)
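
For intuition about what a TextRank-style fallback does when a target ratio is set, the sketch below ranks sentences by word overlap and keeps the top-ranked ones until a size budget is used up. It is a generic illustration, not the library's `ExtractiveSummarizer`: the Jaccard similarity, the character budget used as a stand-in for tokens, and every function name here are assumptions.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [p for p in re.split(r"(?<=[.!?])\s+", text.strip()) if p]


def jaccard(a: set[str], b: set[str]) -> float:
    """Word-overlap similarity between two sentences."""
    return len(a & b) / len(a | b) if a and b else 0.0


def textrank_summary(text: str, target_ratio: float,
                     damping: float = 0.85, iterations: int = 30) -> str:
    sentences = split_sentences(text)
    n = len(sentences)
    if n == 0:
        return ""
    words = [set(re.findall(r"\w+", s.lower())) for s in sentences]
    # Weighted similarity graph between sentences (no self-loops).
    weights = [[jaccard(words[i], words[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):  # PageRank-style power iteration
        scores = [
            (1 - damping) / n + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    # Character budget as a rough proxy for the token target (an assumption).
    budget = max(1, int(len(text) * target_ratio))
    chosen, used = set(), 0
    for idx in sorted(range(n), key=lambda k: scores[k], reverse=True):
        if chosen and used + len(sentences[idx]) > budget:
            break
        chosen.add(idx)
        used += len(sentences[idx])
    # Emit the kept sentences in their original order.
    return " ".join(sentences[i] for i in sorted(chosen))


sample = (
    "The scanner found an open port on the host. "
    "The scanner found a weak cipher on the host. "
    "The host runs an outdated web server. "
    "Lunch was served at noon. "
    "The web server on the host exposes an admin page."
)
print(textrank_summary(sample, target_ratio=0.3))
```

The sketch always keeps at least one sentence, so a very small ratio can still overshoot the budget on short inputs.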

Tune any stage

from context_compressor import CompressionConfig

cfg = CompressionConfig()
cfg.noise.drop_log_levels = True
cfg.redundancy.near_duplicate = True
cfg.trim.max_list_items = 10
cfg.noise.extra_patterns.append((r"^TRACE:.*$", "trace_line"))  # your own rule
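
Before adding a rule of your own, it can help to try the regex on a sample in isolation. The helper below is a hypothetical stand-alone check, not part of the library: `apply_line_rules` is an invented name, and it assumes each rule is a `(pattern, label)` pair matched against one line at a time, as the `extra_patterns` entry above suggests.

```python
import re
from collections import Counter

# Same (pattern, label) shape as the extra_patterns entry above.
extra_patterns = [(r"^TRACE:.*$", "trace_line")]


def apply_line_rules(text, patterns):
    """Drop lines matching any pattern and count removals per label."""
    compiled = [(re.compile(pattern), label) for pattern, label in patterns]
    removed = Counter()
    kept = []
    for line in text.splitlines():
        label = next((lbl for rx, lbl in compiled if rx.match(line)), None)
        if label is None:
            kept.append(line)
        else:
            removed[label] += 1
    return "\n".join(kept), dict(removed)


log = "TRACE: entering handler\nERROR: db timeout\nTRACE: leaving handler"
cleaned, removed = apply_line_rules(log, extra_patterns)
print(cleaned)   # ERROR: db timeout
print(removed)   # {'trace_line': 2}
```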

Security scan brief

brief = ContextCompressor().compress_security_scan(nessus_output, examples_per_type=3)
print(brief)

【Security Scan Summary】

Detected 11 findings across 6 categories.

🔴 SQL Injection [CRITICAL]: 2  (CVE-2024-2117)
    1. https://shop.local/search?q=test
    2. https://shop.local/item?id=42
🟠 Cross-Site Scripting (XSS) [HIGH]: 2
    ...
🟢 Open Port / Service [LOW]: 3
    1. 10.0.0.5
    … and 2 more
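
The shape of that brief (group by finding type, rank by severity, show a few examples, summarize the rest) can be sketched in a few lines. This is an illustration under stated assumptions, not the `SecuritySummarizer` itself: the `(category, severity, location)` tuple format, the `severity_brief` name, and the extra 10.0.0.x addresses are made up for the example.

```python
from collections import defaultdict

SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def severity_brief(findings, examples_per_type=3):
    """Build a severity-ranked text brief from (category, severity, location) tuples."""
    grouped = defaultdict(list)
    for category, severity, location in findings:
        grouped[(category, severity)].append(location)
    # Most severe first; unknown severities sort last, ties broken by category name.
    ordered = sorted(
        grouped.items(),
        key=lambda kv: (SEVERITY_ORDER.get(kv[0][1], len(SEVERITY_ORDER)), kv[0][0]),
    )
    total = sum(len(locations) for locations in grouped.values())
    lines = [f"Detected {total} findings across {len(grouped)} categories.", ""]
    for (category, severity), locations in ordered:
        lines.append(f"{category} [{severity}]: {len(locations)}")
        for i, location in enumerate(locations[:examples_per_type], 1):
            lines.append(f"    {i}. {location}")
        extra = len(locations) - examples_per_type
        if extra > 0:
            lines.append(f"    ... and {extra} more")
    return "\n".join(lines)


findings = [
    ("Open Port / Service", "LOW", "10.0.0.5"),
    ("SQL Injection", "CRITICAL", "https://shop.local/search?q=test"),
    ("Open Port / Service", "LOW", "10.0.0.6"),   # hypothetical host
    ("SQL Injection", "CRITICAL", "https://shop.local/item?id=42"),
    ("Open Port / Service", "LOW", "10.0.0.7"),   # hypothetical host
]
print(severity_brief(findings, examples_per_type=1))
```

Keeping the per-type count next to a capped list of examples is what lets the brief stay short without hiding how many findings there were.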

Command line

context-compress scan.log --stats
cat transcript.txt | context-compress --preset aggressive
context-compress nessus.txt --security
context-compress big.log --target 0.3 > small.log

Compressed text goes to stdout; metrics go to stderr, so pipes stay clean.

Reading the metrics

result = compressor.compress(text)
s = result.stats

s.original_tokens      # 136
s.compressed_tokens    # 65
s.reduction_pct        # 52.2
s.token_backend        # "tiktoken" or "heuristic"
for stage in s.stages:
    print(stage.name, stage.chars_removed, stage.details)

Benchmarks

Reproducible with python benchmarks/benchmark.py (token counts via tiktoken):

Dataset	Tokens before	Tokens after	Reduction	Time
Application log	13,479	7,864	41.7%	30 ms
Security scan	4,625	3,863	16.5%	11 ms
Agent transcript	3,172	2,665	16.0%	8 ms
JSON result dump	1,124	220	80.4%	1 ms

Reduction scales with how repetitive the input is — heavily duplicated logs and scan output compress much further than already-unique prose. See benchmarks/BENCHMARKS.md for the chart.
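
The reduction column can be re-derived from the two token columns. The snippet below recomputes it from the figures in the table above and also pools the four datasets; `reduction_pct` is a helper defined here for illustration, not the library's attribute of the same name.

```python
# (dataset, tokens_before, tokens_after) copied from the benchmark table above.
rows = [
    ("Application log", 13_479, 7_864),
    ("Security scan", 4_625, 3_863),
    ("Agent transcript", 3_172, 2_665),
    ("JSON result dump", 1_124, 220),
]


def reduction_pct(before: int, after: int) -> float:
    """Percentage of tokens removed, rounded to one decimal place."""
    return round(100 * (before - after) / before, 1)


for name, before, after in rows:
    print(f"{name}: {reduction_pct(before, after)}%")

total_before = sum(before for _, before, _ in rows)
total_after = sum(after for _, _, after in rows)
print(f"All datasets pooled: {reduction_pct(total_before, total_after)}%")
# All datasets pooled: 34.8%
```

The pooled figure is dominated by the application log because it contributes most of the tokens, so it says more about this particular mix of inputs than about any single workload.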

JSON blobs

result = compressor.compress_json(huge_json_string)   # caps depth, lists, long strings
print(result.compressed)
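
As a rough picture of what capping depth, lists, and long strings means, here is a small recursive trimmer that keeps the output valid JSON. It is a sketch only: `trim_json`, its parameter names, its default limits, and the placeholder strings are assumptions, not the behaviour of `compress_json`.

```python
import json


def trim_json(value, max_depth=3, max_items=5, max_str=40, _depth=0):
    """Recursively cap nesting depth, list length, and string length."""
    if isinstance(value, str):
        return value if len(value) <= max_str else value[:max_str] + "..."
    if isinstance(value, dict):
        if _depth >= max_depth:
            return f"<dict with {len(value)} keys>"
        return {
            key: trim_json(item, max_depth, max_items, max_str, _depth + 1)
            for key, item in value.items()
        }
    if isinstance(value, list):
        if _depth >= max_depth:
            return f"<list with {len(value)} items>"
        kept = [
            trim_json(item, max_depth, max_items, max_str, _depth + 1)
            for item in value[:max_items]
        ]
        if len(value) > max_items:
            # Record how many items were dropped instead of losing them silently.
            kept.append(f"... {len(value) - max_items} more items")
        return kept
    return value  # numbers, booleans, and null pass through unchanged


raw = json.dumps({"rows": [{"id": i, "note": "x" * 100} for i in range(50)]})
small = json.dumps(trim_json(json.loads(raw), max_items=2))
print(len(raw), "->", len(small), "characters")
```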

RAG: LangChain & LlamaIndex

Drop-in adapters compress retrieved chunks before they reach the model. They are dependency-free (they duck-type the document objects), so installing this package never pulls in either framework.

# LangChain — implements the BaseDocumentTransformer interface
from context_compressor.integrations import CompressorDocumentTransformer

transformer = CompressorDocumentTransformer()
smaller_docs = transformer.transform_documents(retrieved_docs)
# each doc.metadata["compression"] now records the token savings

# LlamaIndex — works on nodes / Documents
from context_compressor.integrations import compress_nodes

nodes = compress_nodes(retriever.retrieve("my query"))
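
Duck-typing here means the adapter only relies on the attributes a document exposes, not on its class. The sketch below shows the idea with a stand-in document type. It is not the adapters' implementation: `FakeDocument`, `compress_documents`, and the toy compressor are hypothetical, and it records character counts where the real adapters are described as recording token savings.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class FakeDocument:
    """Stand-in exposing the two attributes LangChain documents carry."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def compress_documents(docs: Iterable, compress: Callable[[str], str]) -> list:
    """Compress each document's text in place and record the savings."""
    out = []
    for doc in docs:
        original = doc.page_content
        doc.page_content = compress(original)
        doc.metadata["compression"] = {
            "original_chars": len(original),
            "compressed_chars": len(doc.page_content),
        }
        out.append(doc)
    return out


def collapse_blank_lines(text: str) -> str:
    """Toy compressor used only for this demo."""
    return "\n".join(line for line in text.splitlines() if line.strip())


docs = [FakeDocument("alpha\n\n\nbeta\n")]
compressed_docs = compress_documents(docs, collapse_blank_lines)
print(compressed_docs[0].metadata["compression"])
# {'original_chars': 13, 'compressed_chars': 10}
```

Because nothing in the function imports a framework, any object with `page_content` and `metadata` works, which is why such an adapter does not need LangChain installed.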

Integrating with an agent loop

from context_compressor import ContextCompressor, CompressionConfig

compressor = ContextCompressor(CompressionConfig(target_ratio=0.4))

def before_model_call(history: str) -> str:
    # Compress accumulated context before each turn to stay under the window.
    return compressor.compress(history).compressed

API at a glance

Object	Purpose
`ContextCompressor`	The pipeline. `.compress(text) -> CompressionResult`.
`CompressionConfig`	All knobs; `.aggressive()` / `.conservative()` presets.
`CompressionResult`	`.compressed` text + `.stats`.
`NoiseFilter`, `RedundancyFilter`, `DetailTrimmer`, `PatternRemover`	Stages, usable standalone.
`ExtractiveSummarizer`	Dependency-free TextRank-style summarizer.
`SecuritySummarizer`	Scanner output → severity-ranked brief.
`TokenCounter`	tiktoken-backed counter with heuristic fallback.

Development

pip install -e ".[dev]"
pytest --cov=context_compressor      # 24 tests, ~94% coverage

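Summary (translated from the Chinese)

Context Compressor is a zero-dependency Python library that compresses context before text is sent to a large language model. It removes noise (timestamps, progress bars, status messages), merges duplicate lines, trims long strings and deeply nested JSON, collapses repetitive agent phrasing, and optionally applies extractive summarization. It typically cuts tokens by 40–80%, and with tiktoken it reports exact token counts for OpenAI encodings. A built-in summarizer dedicated to security scan results condenses scan logs of more than ten thousand lines into a brief ranked by risk level.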

License

MIT
