🍋 citrx

Local-first Apache & Nginx access-log analysis, in your terminal

Stream huge access logs, detect attacks and abuse with deterministic local rules, explore everything in an interactive TUI — and only ask AI when you decide to.


# A single file, a folder, compressed or plain — citrx figures it out
npx @javipm/citrx /var/log/nginx/access.log
npx @javipm/citrx /var/log/nginx/          # a whole folder of logs
npx @javipm/citrx access.log.gz logs.zip   # .gz .br .zip .tar.gz .tgz
cat access.log | npx @javipm/citrx -        # stdin

That command streams the input, validates it, runs ~30 detection rules, and opens a full-screen TUI. No account, no upload, no telemetry.

Screenshot: citrx TUI summary screen with incident tabs and the global access-log table

🖼️ Screenshots

Summary: incident tabs + indexed access-log table
Incident: evidence + related access-log rows
Top values: top IPs, paths, UAs, params
Filter: query language across the log
Terminal report: --no-interactive
HTML report: self-contained, offline

🤔 Why citrx

Access logs quietly hide expensive crawlers, scanner noise, fake bots, SQLi/XSS payloads, POST abuse, and traffic spikes. citrx is built for DevOps, security engineers, and backend developers who need fast answers to:

What happened?
Which paths, IPs, methods, user-agents, and query params are involved?
Which requests should I actually inspect?
Which WAF / rate-limit rule would reduce the impact?

The workflow is deliberately offline-first:

1. Run deterministic local analysis   →  no network, bounded memory
2. Explore incidents + raw requests    →  interactive TUI
3. Filter, sort, inspect, select rows  →  small query language
4. Ask AI — only when you press `a`     →  compact, redacted context

✨ Features


🌊 Streaming	Bounded-memory, line-by-line parsing. Multi-GB logs never load fully into RAM.
🧭 Format auto-detect	Samples each input, picks `apache_common` / `apache_combined` / `nginx_combined`, fails early on non-access-log input.
🧩 Custom formats	Declarative JSON config with one regex + named fields, validated with `zod`.
🛡️ ~30 detection rules	SQLi/XSS/LFI/SSRF/cmd-injection, recon, fake bots, scanners, DDoS bursts, AI crawlers, POST hotspots, error storms.
🖥️ Full TUI	Incident tabs, indexed access-log table, on-demand row loading, top values, request detail, exports.
🔎 Query language	`AND`/`OR`/`NOT`, parentheses, field operators, status families, wildcards, per-param filters.
📤 Reports	Terminal, JSON, Markdown, and self-contained offline HTML.
🤖 Opt-in AI	OpenAI is never called during analysis — only on explicit `a`, with redacted context.
📦 Compressed inputs	`.gz`, `.br`, `.zip`, `.tar.gz`, `.tgz`, folders, and stdin.
🔒 Local-first	No telemetry, secrets redacted, temp index deleted on exit.

🚀 Quick start

Run without installing

# npm
npx @javipm/citrx /var/log/nginx/access.log

# pnpm
pnpx @javipm/citrx /var/log/nginx/access.log

# yarn
yarn dlx @javipm/citrx /var/log/nginx/access.log

# bun
bunx @javipm/citrx /var/log/nginx/access.log

Install globally

npm i -g @javipm/citrx
citrx /var/log/nginx/access.log

Common invocations

# Analyze many paths, folders, and compressed files at once
citrx ./logs access.log.gz archive.zip

# Read from stdin
cat access.log | citrx -

# Non-interactive terminal report (CI, pipes, cron)
citrx access.log --no-interactive

# Structured reports
citrx access.log --json
citrx access.log --markdown --out report.md
citrx access.log --html     --out report.html

# Restrict a date range
citrx access.log --since 2026-05-25T00:00:00Z --until 2026-05-25T23:59:59Z

# Force a parser
citrx access.log --format apache_combined

Requirements: Node.js >=22 (developed and tested on 24.15). npx/pnpx handle the rest.

Exit codes make citrx CI-friendly:

Code	Meaning
`0`	Success, no high/critical incidents
`1`	Execution / configuration error
`2`	High or critical incidents found
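
As a rough illustration of how a CI script might consume these exit codes, here is a minimal TypeScript sketch. The helper names `classifyExit` and `runCitrx` are hypothetical and not part of citrx; only the exit codes and the `--no-interactive` flag come from this README, and the sketch assumes `citrx` is installed globally and on the `PATH`.

```typescript
import { spawnSync } from "node:child_process";

type CitrxOutcome = "clean" | "error" | "incidents" | "unknown";

// Map the documented exit codes to a label; anything else is treated as unknown.
function classifyExit(code: number | null): CitrxOutcome {
  switch (code) {
    case 0:
      return "clean"; // success, no high/critical incidents
    case 1:
      return "error"; // execution / configuration error
    case 2:
      return "incidents"; // high or critical incidents found
    default:
      return "unknown"; // e.g. killed by a signal, or the binary was not found
  }
}

// Hypothetical helper: run the non-interactive report and classify the result.
function runCitrx(logPath: string): CitrxOutcome {
  const result = spawnSync("citrx", [logPath, "--no-interactive"], {
    stdio: "inherit",
  });
  return classifyExit(result.status);
}
```

A pipeline step could then call `runCitrx("access.log")` and decide whether `"incidents"` should fail the build or only raise a warning.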

📟 What the output looks like

Running the non-interactive report on a small synthetic log (citrx demo_access.log --no-interactive):

citrx access log analysis

Files: 1
Lines: 72/72
Invalid: 0
Bytes served: 86972
Time range: 2026-05-25T10:00:01.000Z to 2026-05-25T10:05:59.000Z
Peak global RPS: 3 at 2026-05-25T10:03:00.000Z
Formats: apache_combined

Top IPs
      60  8.8.4.4
       4  198.51.100.23
       3  45.83.66.12
       2  192.0.2.55
...

Known AI bots
       3  GPTBot ips=1 paths=1 robots=no

Security incidents (attacks)
  critical 100  SQL injection payload count=1
       ip: 198.51.100.23
       /index.php
       sample: /index.php?id=1+AND+SLEEP(5)
  critical  95  Known scanner user-agent count=4
       ip: 198.51.100.23
  critical  90  Sensitive file probe count=2 2XX_HIT
       ip: 198.51.100.23
       /.env
       /.git/config
  high      85  Known scanner user-agent count=2
       ip: 192.0.2.55

2XX_HIT means the payload or probe received at least one 2xx response — a possible successful reply worth inspecting, not proven compromise.

🧰 CLI reference

Usage: citrx [options] <paths...>

Options:
  --json                  Write machine-readable JSON output.
  --markdown              Write Markdown output.
  --html                  Write a self-contained HTML report.
  --out <path>            Write report output to a file.
  --no-interactive        Print the terminal report instead of opening the TUI.
  --format <format>       auto, apache_common, apache_combined,
                          nginx_combined, or custom:<name>.   (default: auto)
  --format-config <path>  JSON file with custom access-log formats.
  --top <n>               Limit top lists.                    (default: 20)
  --since <date>          Include entries at or after this date.
  --until <date>          Include entries at or before this date.
  --include <glob>        Include paths matching this glob.
  --exclude <glob>        Exclude paths matching this glob.
  --no-color              Disable colored terminal output.
  --debug                 Print debug details on failure.
  -v, --version           Display the current version.
  -h, --help              Display help for command.

Environment:

NO_COLOR=1 — disable color.
CITRX_QUIET=1 — silence startup/progress noise for terminal output.

If stdout/stdin are TTYs and no report format is requested, citrx opens the TUI by default. --no-interactive prints the terminal report instead.

📥 Inputs & formats

Supported inputs

Individual files · folders · stdin (-) · .gz · .br · .zip · .tar.gz · .tgz

ZIP/TAR archives are scanned for candidate log files (access.log, .log, .txt, extensionless logs, .gz, .br). Everything is streamed — full logs are never read into memory. The TUI builds a temporary access-log index under the OS temp dir and removes it on exit.

Built-in formats

apache_common · apache_combined · nginx_combined

Default is --format auto: citrx samples each input, chooses the best parser, and fails early when the sample doesn't look like an Apache/Nginx access log.

Custom formats

One declarative JSON config, one regex with named groups, validated by zod:

{
  "formats": [
    {
      "name": "pipe",
      "pattern": "^(?<ip>\\S+)\\|(?<timestamp>[^|]+)\\|(?<method>\\S+)\\|(?<target>\\S+)\\|(?<protocol>HTTP/[^|]+)\\|(?<status>\\d{3})\\|(?<bytes>\\S+)\\|(?<userAgent>.*)$",
      "fields": {
        "ip": "ip", "timestamp": "timestamp", "method": "method",
        "target": "target", "protocol": "protocol", "status": "status",
        "bytes": "bytes", "userAgent": "userAgent"
      }
    }
  ]
}

citrx access.log --format custom:pipe --format-config ./formats.json

Required fields: ip, timestamp, method, target, protocol, status. Optional: bytes, referer, userAgent, host, requestTime, upstreamTime, forwardedFor.
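
Before pointing citrx at a custom format, it can help to check the pattern against one line of your own log. The sketch below assumes the pattern behaves like a JavaScript regular expression with named groups; the sample line, its timestamp layout, and the `missingFields` helper are made up for illustration.

```typescript
// Same pattern as the "pipe" format above, written as a RegExp literal.
const pipePattern =
  /^(?<ip>\S+)\|(?<timestamp>[^|]+)\|(?<method>\S+)\|(?<target>\S+)\|(?<protocol>HTTP\/[^|]+)\|(?<status>\d{3})\|(?<bytes>\S+)\|(?<userAgent>.*)$/;

// Hypothetical sample line; the timestamp layout is an assumption.
const sampleLine =
  "203.0.113.7|2026-05-25T10:00:01Z|GET|/index.php?id=1|HTTP/1.1|200|1200|Mozilla/5.0";

// The required fields listed in this README.
const requiredFields = ["ip", "timestamp", "method", "target", "protocol", "status"];

// Return the required fields that the pattern failed to capture for a line.
function missingFields(line: string): string[] {
  const groups = pipePattern.exec(line)?.groups;
  if (!groups) return requiredFields; // the line did not match at all
  return requiredFields.filter((name) => !groups[name]);
}

console.log(missingFields(sampleLine)); // empty when every required field is captured
```

An empty result only shows that the regex captures the required groups; citrx's own `zod` validation may apply further checks.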

🖥️ Interactive TUI

When stdout/stdin are TTYs and no report format is requested, citrx opens a full-screen terminal UI. It's the core product surface, not a debug view.

┌─ citrx ────────────────────────────────────────────────────────────────────┐
│  [ access log ] [ SATURATION ] [ SECURITY ] [ OTHER ]          Tab to switch │
├──────────────────────────────────────────────────────────────────────────────┤
│  #     IP              TIME      MTH  ST   BYTES  PATH                        │
│  3     198.51.100.23   10:01:11  GET  500      0  /index.php?id=1+AND+SLEEP.. │
│  5     198.51.100.23   10:01:12  GET  200   1200  /.env                       │
│  7     192.0.2.55      10:02:00  GET  404      0  /wp-admin/                  │
│ ...                                                                           │
├──────────────────────────────────────────────────────────────────────────────┤
│  f filter   s sort   t top   Enter detail   a ask AI   e export   h help     │
└──────────────────────────────────────────────────────────────────────────────┘

Summary screen

The incident area has three tabs. Tab cycles focus in this order: access log → SATURATION → SECURITY → OTHER → access log.

Tab	Contents
🌊 SATURATION (default)	Rate bursts, DDoS, AI crawlers, abusive bots — traffic/resource abuse
🛡️ SECURITY	SQLi/XSS/LFI payloads, recon, fake bots, scanner UAs — compromise attempts
🗂️ OTHER	Low-signal / noise incidents filtered from the main panels

Tab              switch focus between access log and incident panels
↑/↓              move row            PgUp/PgDn   page through rows
Enter / d        open incident or request detail
f or /           filter access-log rows
s or S           open sort menu      t           global top values
Space            select current row  A           select visible rows
a                ask AI about current view/selection
e                open export menu (CSV, JSON, TSV)
r                reset filter, sort, and row selection
h                contextual help overlay (keys + filter syntax)
q                quit (asks for confirmation first)

Incident screen

Evidence + every related access-log line. Rows load on demand by fixed-size buckets, so even huge incidents are responsive immediately. Filtering or sorting a large incident shows background progress in the status bar — press Esc to cancel and revert.

↑/↓ · PgUp/PgDn  navigate            Enter / d   open request detail
t                top values for this incident (computed from full row set)
f · s/S          filter · sort       Space · A   select row · visible page
a · e            ask AI · export     r           reset filter + selection
b                back to summary

Top values · request detail · export

Top values (t): top IPs, paths, user-agents, query params, and param values. Respects the active filter. Enter applies a filter from a value.
Request detail (Enter/d): full source, timestamp, IP, method, status, bytes, path, target, user-agent, and raw line with wrapping.
Export (e): CSV / JSON / TSV. Summary exports the selected rows or the full filtered result; incident export streams all filtered rows to a temp file and atomically renames it when done. Esc aborts a running export.

Long-running filter/sort/top/export operations always show a loading state — the app never looks frozen — and Esc consistently cancels the active operation before navigating.
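
The write-to-temp-then-rename pattern mentioned for incident exports can be sketched as follows. This is not citrx's code: the function name, the `.partial` suffix, and the synchronous writes are assumptions chosen to keep the example short. The underlying idea is that a rename within one filesystem replaces the destination in a single step, so readers never see a half-written export.

```typescript
import { closeSync, openSync, readFileSync, renameSync, writeSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write rows to a sibling temp file, then rename it over the final path.
// If a write throws, the rename never runs and the final path stays untouched.
function exportRowsAtomically(rows: Iterable<string>, finalPath: string): void {
  const tempPath = `${finalPath}.partial`; // hypothetical naming scheme
  const fd = openSync(tempPath, "w");
  try {
    for (const row of rows) writeSync(fd, row + "\n");
  } finally {
    closeSync(fd);
  }
  renameSync(tempPath, finalPath);
}

// Usage with two made-up TSV rows.
const out = join(tmpdir(), `citrx-export-sketch-${process.pid}.tsv`);
exportRowsAtomically(["ip\tstatus", "198.51.100.23\t500"], out);
const written = readFileSync(out, "utf8");
console.log(written);
```

A cancellable export would additionally delete the `.partial` file when aborted; that cleanup is left out here.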

🔎 Filtering

Filters work on the global access log, incident rows, and top-value drill-downs. Case-insensitive, with a small query language:

plain text searches across IP, time, method, path, target, status, bytes, UA, raw line
adjacent terms mean AND; explicit AND, OR, |, parentheses, and !/NOT
`:` means contains, `=` means exact match, `!=` means negated match
>, >=, <, <= for status, bytes, line
status families: status:2xx, status:3xx, status:4xx, status:5xx
anchored wildcards: ip:66.249.*
quoted values for spaces/symbols: ua:"Googlebot/2.1"
URL-encoded values are decoded before matching

method:POST status:200 url:*admin*
(method:POST OR method:PUT) status:2xx
(status:403 | status:404) !ua:*Googlebot*
ip:66.249.* bytes>50000
status:5xx path:/checkout
method!=GET status>=400
param:q                # any request with a q parameter
param:q=*select*       # q value contains "select"
param:*=*sleep*        # any param value contains "sleep"
raw:"union select"
source:access.log line>=10000 line<20000

Fields: ip, method, status, path, target, url, ua, bytes, param, query, source, line, time, raw

Aliases: url→target, timestamp→time, userAgent→ua, st→status, ln→line, src→source, qs→query, mth→method, params→param

Bare text is great for quick hunting — googlebot checkout requires both words somewhere in the searchable line.
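
To make the wildcard and status-family semantics concrete, here is a small TypeScript sketch of how such matching could work. It is an illustration of the documented behavior (anchored, case-insensitive wildcards and `Nxx` families), not the parser citrx actually uses; both function names are hypothetical.

```typescript
// Turn a wildcard value such as "66.249.*" into an anchored, case-insensitive RegExp.
// Every character except "*" is matched literally.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`, "i");
}

// Status families: "2xx" covers 200-299, "4xx" covers 400-499, and so on.
function matchesStatusFamily(status: number, family: string): boolean {
  const m = /^([1-5])xx$/i.exec(family);
  return m !== null && Math.floor(status / 100) === Number(m[1]);
}

console.log(wildcardToRegExp("66.249.*").test("66.249.66.1"));
console.log(matchesStatusFamily(404, "4xx"));
```

Because the pattern is anchored at both ends, `ip:66.249.*` would not match `166.249.1.1`, while `url:*admin*` matches the substring anywhere in the target.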

🤖 AI mode (opt-in)

OpenAI is never called during analysis — only when you press a in the TUI and OPENAI_API_KEY is set.

export OPENAI_API_KEY="sk-proj-..."

# optional
export CITRX_OPENAI_MODEL="gpt-5.4-mini"
export CITRX_AI_MAX_LINES="200"
export CITRX_AI_MAX_CHARS="60000"

It receives compact, redacted context only:

report summary + time stats
top IPs / paths / methods / statuses + behavior stats
selected incident evidence
selected rows, or visible filtered rows when nothing is selected
user-agent references instead of repeating long UAs

The answer renders in a dedicated scrollable screen with lightweight Markdown. Access logs contain no ASN data, so the model is instructed never to invent ASN/organization details.

📊 Reports

Format	Flag	Notes
Terminal	`--no-interactive` (or non-TTY)	Colored summary + incidents
JSON	`--json`	Machine-readable, typed report model
Markdown	`--markdown`	Great for tickets / PRs
HTML	`--html`	Self-contained, offline, no external resources

Use --out <path> to write to disk. HTML reports embed CSS/JS, escape all data, ship sortable/filterable tables, and are print/PDF friendly.

🛡️ Detection rules

Every incident carries a kind that drives its TUI panel:

Kind	Panel	Examples
`compromise`	🛡️ SECURITY	SQLi/XSS/LFI payloads, recon, fake bots, scanner tools
`saturation`	🌊 SATURATION	DDoS bursts, AI crawlers, abusive crawlers, POST hotspots
`noise`	🗂️ OTHER	Low-signal patterns unlikely to need immediate action

Payload & recon rules

ID prefix	Category	Kind	Meaning
`sqli:`	`sql_injection`	compromise	`union select`, sleep/benchmark, encoded SQL
`xss:`	`xss`	compromise	script/browser execution indicators
`lfi_rfi:`	`path_traversal`	compromise	traversal, LFI/RFI, `php://filter`, sensitive paths
`ssrf:`	`ssrf`	compromise	localhost, metadata IPs/hosts, callback-like params
`command_injection:`	`command_injection`	compromise	shell metacharacters + command indicators
`recon_sensitive_file:`	`recon`	compromise	probes for `.env`, `.git`, backups, dumps
`rare_method:`	`http_anomaly`	noise	uncommon methods (`CONNECT`, `TRACE`, `OPTIONS`)

Payload incidents are grouped by attacker IP (one incident per IP). Scoring by response outcome:

any 2xx → SECURITY, critical/100 + 2XX_HIT (payload landed)
any 5xx → SECURITY, critical/90
only blocked/redirected → OTHER noise (context, not proven impact)

recon_sensitive_file needs ≥2 successful responses or a 10% success ratio to avoid flagging ordinary 404 scanners.
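
The three response-outcome rules for payload incidents can be read as a small decision function. The sketch below assumes that a 2xx takes precedence when an IP received both 2xx and 5xx responses (the README lists it first but does not say so explicitly), and it ignores the extra `recon_sensitive_file` threshold. The type and function names are hypothetical.

```typescript
interface PayloadOutcome {
  panel: "SECURITY" | "OTHER";
  score: number | null; // null: the README gives no score for noise incidents
  twoXxHit: boolean; // whether the 2XX_HIT flag would be shown
}

// Classify one IP's payload incident from the status codes its requests received.
function classifyPayloadIncident(statuses: number[]): PayloadOutcome {
  const any2xx = statuses.some((s) => s >= 200 && s < 300);
  const any5xx = statuses.some((s) => s >= 500 && s < 600);
  if (any2xx) return { panel: "SECURITY", score: 100, twoXxHit: true };
  if (any5xx) return { panel: "SECURITY", score: 90, twoXxHit: false };
  // Only blocked or redirected responses: context, not proven impact.
  return { panel: "OTHER", score: null, twoXxHit: false };
}

console.log(classifyPayloadIncident([403, 403, 200]));
```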

Aggregate path, rate / DDoS, error-storm rules

ID prefix	Category	Kind	Meaning
`abusive_crawl:`	`abusive_crawling`	saturation/noise	served path pressure or distributed crawling on a non-entrypoint path
`query_explosion:`	`abusive_crawling`	noise	one path with many query variants
`post_hotspot:`	`post_hotspot`	noise	endpoint with unusually many POSTs
`ddos_rps_burst_single_ip:`	`ddos`	saturation	one IP exceeds per-second RPS for consecutive seconds
`ddos_global_rps_spike`	`ddos`	saturation	global RPS over baseline for consecutive seconds
`http_head_flood:`	`ddos`	saturation	one IP with high ratio + peak of HEAD requests
`ddos_distributed_subnet:`	`ddos`	saturation	IPv4 `/24` or IPv6 `/48` over RPS + unique-IP thresholds
`http_4xx_storm:`	`http_anomaly`	noise	one IP, many 4xx in adjacent minute buckets
`http_5xx_storm:`	`http_anomaly`	saturation	one IP, many 5xx in adjacent minute buckets

Bot & scanner rules

ID prefix	Category	Kind	Meaning
`ai_scraper_known:`	`ai_scraper`	saturation/noise	known AI crawler/assistant UA, grouped by bot
`scanner_ua_known:`	`scanner`	compromise	known scanner/offensive tooling UA
`scanner_signature_paths:`	`scanner`	compromise	one IP touches many fingerprint paths
`single_ip_path_explosion:`	`abusive_crawling`	saturation	one IP > 10 unique paths/minute sustained
`ua_rotation_same_ip:`	`http_anomaly`	noise	one IP, many UAs and peak RPS ≥ 5
`fake_bot_googlebot:`	`fake_bot`	compromise	claims Googlebot but IP outside published ranges
`fake_bot_bingbot:`	`fake_bot`	compromise	claims bingbot but IP outside published Bing ranges

Notes: single_ip_path_explosion needs pathsPerMinute ≥ 10 (asset-heavy page loads don't trigger it). abusive_crawl enters SATURATION only with real served volume + a served-per-minute peak. fake_bot_* needs ≥10 requests. Verified Googlebot/Bingbot IPs are excluded from all bot/scanner detections.

Refresh the bundled Googlebot/Bingbot IP-range snapshots with:

pnpm run update-bot-ranges

🎯 Scoring

Each incident has kind, severity, score (0–100), typed evidence, redacted samples, and successful?.

Score	Severity
0–24	`info`
25–49	`low`
50–74	`medium`
75–89	`high`
90–100	`critical`

Post-processing adjustments (additive, in score points):

+10 when the same evidence.ip appears in ≥2 incidents (correlated attacker)
+15 when a pattern persists ≥30 min (persistence bonus)
−10 for moderate AI crawlers that requested robots.txt

Persistence bonus does not apply to ai_scraper_known:* — AI crawlers run for weeks, so duration alone isn't a signal. Panels sort by kind weight (compromise → saturation → noise), then by score descending.
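
The severity bands and adjustments above can be summarized in a few lines of TypeScript. This is a sketch under stated assumptions: the field names are hypothetical, and clamping the adjusted score to 0–100 is an assumption that the README does not confirm.

```typescript
type Severity = "info" | "low" | "medium" | "high" | "critical";

// Map a 0-100 score to the severity bands in the table above.
function severityFor(score: number): Severity {
  if (score >= 90) return "critical";
  if (score >= 75) return "high";
  if (score >= 50) return "medium";
  if (score >= 25) return "low";
  return "info";
}

interface Adjustments {
  correlatedIp: boolean; // same evidence.ip appears in >= 2 incidents
  persistedThirtyMinutes: boolean; // pattern persisted for >= 30 minutes
  politeAiCrawler: boolean; // moderate AI crawler that requested robots.txt
  isAiScraper: boolean; // incident id starts with ai_scraper_known:
}

function adjustScore(base: number, a: Adjustments): number {
  let score = base;
  if (a.correlatedIp) score += 10;
  // The persistence bonus is skipped for known AI crawlers.
  if (a.persistedThirtyMinutes && !a.isAiScraper) score += 15;
  if (a.politeAiCrawler) score -= 10;
  return Math.min(100, Math.max(0, score)); // clamping is an assumption
}

const adjusted = adjustScore(85, {
  correlatedIp: true,
  persistedThirtyMinutes: false,
  politeAiCrawler: false,
  isAiScraper: false,
});
console.log(adjusted, severityFor(adjusted));
```

In this example a `high` incident at 85 gains the correlation bonus and lands at 95, which falls in the `critical` band.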

🔒 Security & privacy

Local analysis first — no network call during analysis.
No telemetry, ever. (If ever added, strict opt-in only.)
AI only on explicit a, with redacted context.
Secrets redacted in URL/query values: token, _token, sid, session, password, passwd, key, secret, jwt, auth, authorization
HTML output escaped; log content is never executed.
Temp TUI index files are deleted on exit.

Treat logs, exported JSON, paths, IPs, and route names as sensitive customer data — keep them out of public commits.
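
For readers who post-process exports themselves, the query-value redaction described above could look roughly like the sketch below. Only the key list comes from this README; the `[REDACTED]` placeholder, the exact-name (case-insensitive) matching, and the function name are assumptions, and citrx's own redaction may behave differently.

```typescript
// Parameter names listed in this README as redacted.
const SENSITIVE_KEYS = new Set([
  "token", "_token", "sid", "session", "password", "passwd",
  "key", "secret", "jwt", "auth", "authorization",
]);

// Replace the values of sensitive query parameters in a request target.
// The rest of the target is left byte-for-byte unchanged.
function redactTarget(target: string): string {
  const q = target.indexOf("?");
  if (q === -1) return target; // no query string
  const path = target.slice(0, q);
  const pairs = target
    .slice(q + 1)
    .split("&")
    .map((pair) => {
      const eq = pair.indexOf("=");
      if (eq === -1) return pair; // parameter without a value
      const name = pair.slice(0, eq);
      return SENSITIVE_KEYS.has(name.toLowerCase()) ? `${name}=[REDACTED]` : pair;
    });
  return `${path}?${pairs.join("&")}`;
}

console.log(redactTarget("/login?user=bob&password=hunter2"));
```

Splitting on `&` by hand, rather than round-tripping through `URLSearchParams`, keeps the original encoding of non-sensitive values intact.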

🛠️ Development

pnpm install
pnpm run typecheck
pnpm test
pnpm run build

# run from source against a fixture
pnpm run dev -- examples/your.log
pnpm run dev -- examples/your.log --json

Project layout:

input/    path discovery, stdin, compressed/archive readers
parser/   format detection, parser registry, built-in + custom parsers
analysis/ streaming aggregation, behavior tracking, incident match sets
rules/    deterministic request/path rules and scoring
run/      temporary run workspace and access-log index
tui/      Ink screens, hooks, filters, tables, overlays
ai/       compact redacted context builder + OpenAI client
report/   terminal, JSON, Markdown, HTML renderers

Stack: TypeScript (ESM) · commander · ink + React · zod · picocolors · official openai SDK · Vitest.

📄 License

MIT © javipm

Built for people who read their access logs. 🍋
