Automated Malware Triage Workbench

An analyst-focused triage environment that reduces the time required to classify suspicious binaries and scripts.

Impact

  • Consolidated repetitive first-pass enrichment into a single analyst-facing workflow.
  • Improved consistency in how suspicious files are documented before escalation.
  • Established a reusable base for deeper sandbox and memory-analysis extensions.

Deliverables

  • Upload and triage pipeline for binaries, scripts, and archives.
  • Normalized evidence outputs for hashes, YARA hits, strings, entropy, and heuristics.
  • Analyst review interface with scoring cues and investigation notes.
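
To make "normalized evidence outputs" concrete, here is a minimal sketch of what one evidence record could look like. The field names, the `build_evidence` helper, and the sample bytes are illustrative assumptions, not the workbench's actual schema; YARA hits are passed in as plain rule names so the sketch stays standard-library only.

```python
import hashlib
import json
import math
import re
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input, at most 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Runs of printable ASCII at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def build_evidence(name: str, data: bytes, yara_hits: list[str]) -> dict:
    """Assemble one normalized record; every field name here is hypothetical."""
    return {
        "schema_version": 1,
        "sample": {"name": name, "size": len(data)},
        "hashes": {
            "sha1": hashlib.sha1(data).hexdigest(),
            "sha256": hashlib.sha256(data).hexdigest(),
        },
        "entropy": round(shannon_entropy(data), 3),
        "strings": printable_strings(data),
        "yara_hits": sorted(yara_hits),
    }


# Made-up sample bytes and rule name, for illustration only.
record = build_evidence("dropper.bin", b"MZ\x00\x00cmd.exe /c whoami\x00", ["suspicious_cmd"])
print(json.dumps(record, indent=2))
```

Keeping a `schema_version` field in every record is one way to let later sandbox or memory-analysis stages add fields without breaking existing consumers.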

Artifacts

  • Triage workflow diagram (planned; not yet available).

Problem

Security teams often receive large volumes of suspicious files with little context. Analysts then spend much of their time on repetitive enrichment before they can decide which samples deserve deeper reverse engineering.

Approach

  • Build a structured upload and triage pipeline that normalizes binaries, scripts, and archives into a consistent analysis workflow.
  • Surface hash enrichment, YARA scanning, strings extraction, entropy checks, and unpacking heuristics in a single analyst-facing interface.
  • Prioritize suspicious samples with a weighted scoring model that routes the right artifacts to deeper manual analysis.
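
The weighted scoring model in the last step could be sketched as follows. The signal names, weights, threshold, and queue labels are all hypothetical placeholders; the source does not state the real values, and a deployed model would need tuning against labeled samples.

```python
from dataclasses import dataclass

# Hypothetical weights that sum to 1.0; not the workbench's actual values.
WEIGHTS = {
    "yara_hit": 0.40,
    "high_entropy": 0.25,
    "packer_heuristic": 0.20,
    "known_bad_hash": 0.15,
}
ESCALATE_AT = 0.5  # assumed threshold for routing to manual analysis


@dataclass
class Signals:
    yara_hit: bool = False
    high_entropy: bool = False
    packer_heuristic: bool = False
    known_bad_hash: bool = False


def score(signals: Signals) -> float:
    """Sum the weights of the signals that fired; the result lies in [0, 1]."""
    return round(sum(w for name, w in WEIGHTS.items() if getattr(signals, name)), 2)


def route(signals: Signals) -> str:
    """Send high-scoring samples to manual analysis, the rest to a low-priority queue."""
    return "manual_analysis" if score(signals) >= ESCALATE_AT else "low_priority"


print(route(Signals(yara_hit=True, high_entropy=True)))
```

A simple additive model like this is easy for analysts to audit because each fired signal contributes a fixed, visible amount to the total.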

Architecture / Workflow

  • Ingestion service validates submissions, fingerprints artifacts, and expands supported archive types.
  • Processing workers execute static checks, run signature rules, and generate normalized JSON evidence.
  • A triage UI presents verdict cues, extracted indicators, and analyst notes while preserving the evidence chain.
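
As an illustration of the ingestion stage, the sketch below validates a submission, fingerprints it with SHA-256, and expands ZIP archives in memory. The size limits, the `ingest` function, and the artifact fields are assumptions for this example; only ZIP is handled here, and the set of archive types the real service supports is not stated in the source.

```python
import hashlib
import io
import zipfile

MAX_UPLOAD_BYTES = 50 * 1024 * 1024     # assumed submission size limit
MAX_EXPANDED_BYTES = 200 * 1024 * 1024  # assumed cap on declared expanded size
MAX_MEMBERS = 500                       # assumed cap on archive entries


def fingerprint(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def ingest(name: str, data: bytes) -> list[dict]:
    """Validate one upload and return a fingerprinted artifact per file.

    ZIP members are read in memory and never written to disk, so member
    names cannot be used for path traversal.
    """
    if not data:
        raise ValueError("empty submission")
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("submission exceeds size limit")

    artifacts = [{"name": name, "sha256": fingerprint(data), "parent": None}]
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return artifacts

    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        members = [m for m in archive.infolist() if not m.is_dir()]
        if len(members) > MAX_MEMBERS:
            raise ValueError("archive has too many members")
        # Uses the sizes declared in the archive headers.
        if sum(m.file_size for m in members) > MAX_EXPANDED_BYTES:
            raise ValueError("archive expands beyond size limit")
        for member in members:
            artifacts.append({
                "name": member.filename,
                "sha256": fingerprint(archive.read(member)),
                "parent": artifacts[0]["sha256"],  # links child back to the upload
            })
    return artifacts
```

Recording the parent hash on each extracted member is one way to preserve the evidence chain from the original upload down to individual files. Password-protected members are not handled in this sketch.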

Tools and Technologies Used

Python, YARA, Ghidra, Flask, Docker

Results / Impact

  • Consolidated repeated first-pass analysis steps into a single workflow.
  • Improved consistency in how suspicious samples are documented before escalation.
  • Created a reusable foundation for future sandbox and memory-analysis integrations.

Key Technical Takeaways

  • Fast triage only matters when its outputs remain useful to the analysts doing deeper investigation.
  • Normalized evidence formats make automation easier to extend.
  • Scoring needs analyst override paths to avoid false confidence.
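
One possible shape for an analyst override path is sketched below. The `Verdict` class and its fields are hypothetical; the idea shown is that an override is stored next to the model output rather than replacing it, so the original score stays available for review.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Verdict:
    sha256: str
    model_score: float  # automated score, never overwritten by an override
    model_label: str
    overrides: list[dict] = field(default_factory=list)

    def override(self, analyst: str, label: str, reason: str) -> None:
        """Record an analyst decision alongside the model output."""
        if not reason.strip():
            raise ValueError("an override needs a written justification")
        self.overrides.append({
            "analyst": analyst,
            "label": label,
            "reason": reason,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    @property
    def effective_label(self) -> str:
        """Latest analyst label wins; otherwise fall back to the model label."""
        return self.overrides[-1]["label"] if self.overrides else self.model_label


# Illustrative values only.
verdict = Verdict(sha256="ab" * 32, model_score=0.35, model_label="low_priority")
verdict.override("jdoe", "malicious", "Manual unpacking revealed a loader stub")
print(verdict.effective_label, verdict.model_label)
```

Requiring a written reason keeps each override reviewable later, and retaining the model label makes disagreements between model and analyst easy to measure.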