Python · Defensive Security

BlueStack — SIEM-in-a-Box for the B0bTheSkull Blue-Team Toolkit

A pre-wired ELK stack that ingests JSON from four custom blue-team tools, normalizes severity, tags events to MITRE ATT&CK techniques, and surfaces everything in Kibana — one command to stand up

elk elasticsearch logstash kibana docker siem mitre-attack blue-team defensive-security


BlueStack — SIEM-in-a-Box

A Docker Compose stack that turns raw JSON output from four home-built detection tools into a searchable, cross-correlated SIEM without standing up anything manually.

The Goal

My other blue-team tools — LogHound, NetSentinel, HoneyNet, and ThreatPulse — each emit structured JSON. The problem was reading it: four terminal windows and grep only go so far. I wanted a single place to query across all of them, pivot from an IP to the events it generated across every source, and actually see MITRE ATT&CK coverage laid out visually.

BlueStack is that glue layer. A single `docker compose up` brings up Elasticsearch, Logstash, and Kibana sized for a laptop, wires each tool to its own dedicated pipeline, and leaves you in Kibana ready to query.

How It Works

Each source tool has its own Logstash TCP port and pipeline config. That isolation matters: a parse error or malformed payload from one tool can't stall ingestion from the others.

The pipelines don't just forward JSON — they enrich every event before it hits Elasticsearch:

Severity normalization. Every source uses a severity field but with inconsistent casing. The pipeline uppercases it and enforces CRITICAL / HIGH / MEDIUM / LOW / INFO across the board.
source_tool tagging. Every document gets source_tool: loghound | netsentinel | honeynet | threatpulse. A single Kibana data view (bluestack-*) can then split or filter by origin without separate dashboards per tool.
MITRE ATT&CK enrichment. Known event_type values are mapped to technique IDs directly in the pipeline filter. NetSentinel's port_scan becomes T1046 / discovery. arp_spoof becomes T1557.002 / credential-access. HoneyNet's credential_attempt maps to T1110. And so on. LogHound passes its own mitre_technique field straight through.
ECS-aligned field aliases. source.ip, destination.address, mitre.technique_id, and mitre.tactic are written as ECS-style nested fields, so cross-source dashboards pick up the ip and keyword types declared in the shared index template instead of whatever dynamic mapping would guess for a string.
LogHound flattening. LogHound emits a wrapped report with a findings[] array. The pipeline splits it so each finding becomes its own document — otherwise a single scan would appear as one giant blob in Discover.
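The enrichment steps above can be modeled in a few lines of Python. This is only a sketch of the logic, not the Logstash filter config itself. The fallback for unrecognized severities, the flat `source_ip` input name, the way wrapper fields are merged into each finding, and the tactic paired with T1110 (taken from ATT&CK, not from the pipeline) are all assumptions.

```python
from copy import deepcopy

ALLOWED_SEVERITIES = {"CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO"}
# Assumption: the write-up does not say what happens to unknown severities.
FALLBACK_SEVERITY = "INFO"

# event_type -> (technique ID, tactic). Only the three mappings named in the
# text; the tactic for T1110 is ATT&CK's own and is assumed here.
MITRE_MAP = {
    "port_scan": ("T1046", "discovery"),
    "arp_spoof": ("T1557.002", "credential-access"),
    "credential_attempt": ("T1110", "credential-access"),
}


def split_findings(report):
    """Turn one wrapped LogHound report into one event per finding.

    Assumption: wrapper fields other than findings[] are copied onto each event.
    """
    wrapper = {k: v for k, v in report.items() if k != "findings"}
    return [
        {**deepcopy(wrapper), **deepcopy(finding)}
        for finding in report.get("findings", [])
    ]


def enrich(event, source_tool):
    """Return an enriched copy of the event; the input is left untouched."""
    out = deepcopy(event)
    out["source_tool"] = source_tool

    # Severity normalization: uppercase, then enforce the five allowed values.
    severity = str(out.get("severity", "")).upper()
    out["severity"] = severity if severity in ALLOWED_SEVERITIES else FALLBACK_SEVERITY

    # MITRE enrichment for known event types. Anything else passes through,
    # so a LogHound mitre_technique field survives unchanged.
    mapping = MITRE_MAP.get(out.get("event_type"))
    if mapping:
        technique_id, tactic = mapping
        out["mitre"] = {"technique_id": technique_id, "tactic": tactic}

    # ECS-style nesting: source_ip -> source.ip (flat input name is assumed).
    if "source_ip" in out:
        out["source"] = {"ip": out.pop("source_ip")}
    return out
```

For a LogHound report, the two functions would chain as `[enrich(e, "loghound") for e in split_findings(report)]`.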

A setup script handles the full stand-up idempotently: compose up, install the index template, create Kibana data views. A separate sample-data script ships one payload per source through Logstash so you can verify the whole pipeline in under a minute.
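Shipping a payload to one of the pipelines amounts to writing a newline-terminated JSON document to its TCP port. The sketch below is not the bundled sample-data script; it only illustrates the wire format a `json_lines` input expects. The tool-to-port assignment is hypothetical, since only the range 5001-5004 is stated.

```python
import json
import socket

# Assumption: which tool owns which port is not documented; this mapping is
# hypothetical and exists only for illustration.
HYPOTHETICAL_PORTS = {
    "loghound": 5001,
    "netsentinel": 5002,
    "honeynet": 5003,
    "threatpulse": 5004,
}


def to_json_line(event):
    """Serialize one event as a single newline-terminated JSON line.

    json.dumps escapes newlines inside strings, so the only raw newline in
    the result is the terminator that marks the end of the event.
    """
    return (json.dumps(event, separators=(",", ":")) + "\n").encode("utf-8")


def ship(event, port, host="127.0.0.1", timeout=5.0):
    """Send one event to a Logstash TCP input bound to loopback."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(to_json_line(event))
```

With the stack running, `ship({"event_type": "port_scan", "severity": "high"}, HYPOTHETICAL_PORTS["netsentinel"])` would send one event. Whether port 5002 really belongs to NetSentinel depends on the actual `.env` values.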

Capabilities

Four independent Logstash pipelines — one per tool, isolated TCP ports (5001–5004), no shared parse state
Shared index template — enforces ip, date, and keyword field types across all four indices so aggregations and IP range queries work correctly
MITRE ATT&CK heatmap-ready — mitre.tactic × mitre.technique_id fields on every enriched event
Auto-created Kibana data views — open Discover and start querying immediately, no manual index pattern setup
Laptop-sized defaults — single-node Elasticsearch with 1 GB JVM heap; all ports bound to loopback only, nothing exposed to the LAN
Filebeat sidecar path — for continuous ingestion from live log files instead of piping with nc
Sample payloads — one realistic event per source bundled in examples/, enough to verify every pipeline stage
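As a rough illustration of what "heatmap-ready" means, a nested terms aggregation over mitre.tactic and mitre.technique_id yields one count per tactic/technique cell. The sketch below builds such a search body and reshapes a response into a nested dict. The function names are hypothetical, and it assumes both fields are mapped as keyword, as the shared template is described as doing.

```python
def coverage_query(size=50):
    """Search body: bucket by tactic, then by technique inside each tactic."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "aggs": {
            "tactics": {
                "terms": {"field": "mitre.tactic", "size": size},
                "aggs": {
                    "techniques": {
                        "terms": {"field": "mitre.technique_id", "size": size}
                    }
                },
            }
        },
    }


def coverage_matrix(response):
    """Reshape an aggregation response into {tactic: {technique_id: count}}."""
    matrix = {}
    for tactic in response["aggregations"]["tactics"]["buckets"]:
        matrix[tactic["key"]] = {
            tech["key"]: tech["doc_count"]
            for tech in tactic["techniques"]["buckets"]
        }
    return matrix
```

The body could be sent to the `_search` endpoint of the BlueStack indices; the resulting dict maps directly onto the rows and columns of a heatmap.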

Tech & Tools

Layer	What's used
Ingestion	Logstash 8.x, TCP input, `json_lines` codec
Storage	Elasticsearch 8.x, single-node, custom index template
Visualization	Kibana 8.x, data views API
Orchestration	Docker Compose, `.env`-driven port and heap config
Optional sidecar	Filebeat `inputs.d` for file-based ingestion
Enrichment model	MITRE ATT&CK (technique IDs baked into pipeline filters)
ECS alignment	Elastic Common Schema-style field aliases for IP and address fields

What I Learned

On pipeline design. Giving each source its own Logstash pipeline and port, rather than routing everything through one input, was the right call from the start. It made debugging parse failures straightforward: I could ship a bad payload to port 5003 and watch only that pipeline report errors while the others kept ingesting. The tradeoff is four config files to maintain, but the isolation is worth it.

On index templates vs. dynamic mapping. Elasticsearch's dynamic mapping would have guessed source_ip as a text field the first time it saw one, which breaks IP range queries and geolocation entirely. Writing a shared index template that pre-declares ip fields as ip type before any data arrives saved a lot of re-indexing headaches.
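A minimal sketch of what such a template body might look like, expressed as a Python dict in the shape Elasticsearch's composable index template API accepts. The field list is inferred from fields mentioned in this write-up. The `bluestack-*` index pattern, the `@timestamp` field, and the function name are assumptions, not the shipped template.

```python
import json


def build_index_template(pattern="bluestack-*"):
    """Build a composable index template body that pre-declares field types.

    Assumption: the indices match the same bluestack-* pattern as the Kibana
    data view, and events carry Logstash's default @timestamp field.
    """
    keyword = {"type": "keyword"}
    return {
        "index_patterns": [pattern],
        "template": {
            "mappings": {
                "properties": {
                    "@timestamp": {"type": "date"},
                    "severity": keyword,
                    "source_tool": keyword,
                    "event_type": keyword,
                    # Declared as ip so range and CIDR queries work.
                    "source": {"properties": {"ip": {"type": "ip"}}},
                    "destination": {"properties": {"address": keyword}},
                    "mitre": {
                        "properties": {
                            "technique_id": keyword,
                            "tactic": keyword,
                        }
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the body that would be PUT to /_index_template/<name>.
    print(json.dumps(build_index_template(), indent=2))
```

Installing it before the first event arrives is what matters: once dynamic mapping has typed a field, changing it means reindexing.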

On MITRE enrichment in the pipeline vs. the app. I initially considered having each tool emit its own technique IDs. Centralizing the mapping in Logstash filter configs turned out to be cleaner: the detection tools stay focused on detection, and the ATT&CK taxonomy lives in one place that's easy to update without touching four separate codebases.

On ECS alignment. Using ECS-style nested fields (source.ip instead of source_ip) paid off immediately in Kibana. Several built-in Lens visualizations and the Discover IP reputation lookup work on ECS fields by default, so I didn't have to customize anything to get them to recognize the address fields.

On lab vs. production scope. BlueStack ships with auth disabled and TLS off intentionally — it's a lab kit, not a hardened deployment. Documenting that boundary clearly (loopback-only bindings, a HARDENING.md) was more useful than half-implementing security controls that would have made the lab harder to run without actually being production-ready.