stable

cleave

Cleave decomposes software into capabilities. Source or binary, 20+ languages, six binary formats, one pass. Output is labeled with MITRE ATT&CK techniques and Malware Behavior Catalog behaviors — the taxonomy analysts already use. Run it standalone for triage, or feed its output to litmus.

Capabilities

  • 20+ languages — Python, JavaScript, TypeScript, Go, Rust, C/C++, Java, C#, Swift, Objective-C, Ruby, PHP, Perl, Lua, Shell, PowerShell, Groovy, Scala, Zig, Elixir
  • Binary formats — Mach-O, ELF, PE, Java .class, Python .pyc, compiled AppleScript
  • Archive formats — ZIP, TAR, 7z, RAR, plus JAR/WAR, deb, rpm, apk, gem, crate, whl, nupkg, phar, vsix, xpi, crx, ipa, epub
  • Document & data formats — RTF, LNK, PNG steganography, PDF, plist, VBScript, Batch, manifests, GitHub Actions workflows
  • AST traversal via Tree-sitter for source-level analysis
  • Binary reverse engineering via Rizin, header parsing via Goblin
  • Signature matching via YARA-X
  • String extraction via stng for payload decoding

AI-derived rulesets

Rulesets combine hand-written and AI-derived rules, 50,000+ in total. The hand-written ones are precise for patterns humans understand well. The AI-derived ones come from a training corpus of about a million samples and cover the long tail across 20+ languages, territory no team is going to keep up with by hand.

The AI runs at build time, not at runtime. The cleave you ship is deterministic: same input, same output, every run. No model weights loaded, no GPU, no network, no telemetry. If you don't trust that, the rules are in the source tree.
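
The determinism claim can be spot-checked locally. The sketch below is a minimal illustration, not part of cleave: `is_deterministic` is a hypothetical helper that runs any command twice and compares the two outputs byte for byte. It assumes cleave is on your PATH and writes its report to stdout, and `suspect.bin` is a placeholder sample name.

```shell
# Run a command twice and succeed only if both runs print byte-identical stdout.
# Usage: is_deterministic <command> [args...]
is_deterministic() {
  run_a=$(mktemp) || return 2
  run_b=$(mktemp) || { rm -f "$run_a"; return 2; }
  "$@" >"$run_a" 2>/dev/null
  "$@" >"$run_b" 2>/dev/null
  cmp -s "$run_a" "$run_b"
  status=$?
  rm -f "$run_a" "$run_b"
  return "$status"
}

# Assumption: cleave is on PATH and writes its report to stdout;
# suspect.bin is a placeholder sample name.
if command -v cleave >/dev/null 2>&1 && [ -e suspect.bin ]; then
  if is_deterministic cleave suspect.bin; then
    echo "identical output across two runs"
  else
    echo "output differed between runs"
  fi
fi
```

Two matching runs do not prove determinism in general, but a mismatch would be a clear counterexample worth reporting.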

Install

Homebrew (macOS or Linux):

brew tap atomdrift/tap https://codeberg.org/atomdrift/homebrew-tap.git
brew install atomdrift/tap/cleave

From source:

git clone --depth 1 https://codeberg.org/atomdrift/cleave.git
cd cleave
make install

Usage

$ cleave suspect.bin
$ cleave /tmp/box-o-malware   # recursive, unpacks archives

For more thorough results, install Rizin (binary reverse engineering) and UPX (unpacker).
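
To see whether those optional helpers are already present, something like the following may be enough. It is a sketch under assumptions: `check_optional_tools` is a hypothetical helper, and it assumes the tools are installed under their usual executable names, `rizin` and `upx`. How cleave itself locates them is not covered here.

```shell
# Report which of the named tools are on PATH.
# Returns the number of missing tools as the exit status (0 = all found).
check_optional_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}

# Assumption: the executables are named rizin and upx.
check_optional_tools rizin upx || echo "some optional tools are missing; results may be less thorough"
```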