Single-split boundary analysis for ONNX graphs. The tool enumerates all topological boundaries and ranks them by communication volume, compute balance, and (optionally) a simple link/latency model.
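The ranking idea can be sketched as follows. The scoring function here (a weighted sum of normalized cut bytes and FLOP imbalance) is a hypothetical stand-in for the tool's actual metrics in `metrics.py`, and the `Boundary` dataclass is illustrative only:

```python
from dataclasses import dataclass

@dataclass
class Boundary:
    """One topological cut after node index `index` (simplified model)."""
    index: int
    comm_bytes: int     # bytes of live tensors crossing the cut
    flops_left: float   # cumulative FLOPs before the cut
    flops_right: float  # cumulative FLOPs after the cut

def rank_boundaries(boundaries, alpha=0.5):
    """Rank cuts by a weighted sum of normalized communication volume
    and compute imbalance; lower score is better."""
    max_comm = max(b.comm_bytes for b in boundaries) or 1
    def score(b):
        total = b.flops_left + b.flops_right
        imbalance = abs(b.flops_left - b.flops_right) / total  # 0 = balanced
        return alpha * (b.comm_bytes / max_comm) + (1 - alpha) * imbalance
    return sorted(boundaries, key=score)

candidates = [
    Boundary(0, comm_bytes=4096, flops_left=10.0, flops_right=90.0),
    Boundary(1, comm_bytes=1024, flops_left=55.0, flops_right=45.0),
    Boundary(2, comm_bytes=8192, flops_left=50.0, flops_right=50.0),
]
best = rank_boundaries(candidates)[0]  # the small-cut, well-balanced boundary
```

`alpha` trades communication volume against compute balance; a link/latency model would replace the normalized byte count with an estimated transfer time.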
- `onnx_splitpoint_tool/` – Python package (core code)
  - `api.py` – public API / compatibility re-exports
  - `cli.py` – CLI implementation
  - `gui_app.py` – Tkinter GUI
  - `onnx_utils.py` – parsing + graph utilities
  - `metrics.py` – cost/FLOPs/memory estimation + ranking
  - `pruning.py` – skip-/block-aware pruning
  - `split_export.py` – split export + runner skeleton + context diagrams
- `analyse_and_split.py` – thin wrapper entrypoint (kept for backwards compatibility)
- `analyse_and_split_gui.py` – thin wrapper entrypoint (kept for backwards compatibility)
- `download_model_zoo_examples.py` – helper to download ONNX model zoo examples
- `model_zoo_manifest.json` – curated model list for the downloader
```bash
pip install onnx numpy matplotlib pillow
# optional (validation/runner):
pip install onnxruntime
```

The GUI can optionally run a Hailo feasibility check (parse-only) during ranking. This is useful for automatically pruning split candidates whose partitions cannot be translated by the Hailo toolchain.
- Install the Hailo Dataflow Compiler (DFC) Python wheel into the same Python environment as this tool.
- In the GUI enable Hailo check and set Backend = local (or keep auto).
The Hailo DFC wheel is Linux-only. Recommended setup:
- Install WSL2 + an Ubuntu distro.
- Create a WSL virtualenv (default expected path: `~/hailo_dfc_venv`) and install the Hailo DFC wheel there. A helper script is included:
  `./scripts/setup_hailo_dfc_wsl.sh /path/to/hailo_dataflow_compiler-*.whl`
- Run the GUI on Windows, enable Hailo check, set Backend = wsl (or auto), and point it to:
  - WSL venv: `~/hailo_dfc_venv/bin/activate`
  - WSL distro: optional (leave empty to use the default WSL distro)
Notes:
- The tool calls `wsl.exe` and runs a helper script inside WSL; you do not need to activate the venv manually.
- If your model files are on `C:`/`D:`, WSL can access them via `/mnt/c/...` and `/mnt/d/...`.
- A small helper for interactive shells is included: `source ./scripts/activate_hailo_dfc_venv.sh`
Practical tips:
- The GUI provides a Test backend button in the Hailo section to verify that the selected backend (local or WSL) can import the Hailo SDK.
- Hailo parse-only results are cached across runs (keyed by sub-model hash) to avoid re-running DFC translation during ranking. You can clear the cache from the GUI (Clear cache) or delete `~/.onnx_splitpoint_tool/hailo_parse_cache.json`.
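A minimal sketch of such a hash-keyed JSON cache; the function names and cache schema here are illustrative, not the tool's actual implementation:

```python
import hashlib
import json
from pathlib import Path

def model_hash(model_bytes: bytes) -> str:
    """Stable cache key for a serialized sub-model
    (e.g. the bytes from an onnx ModelProto.SerializeToString())."""
    return hashlib.sha256(model_bytes).hexdigest()

def cached_parse_result(model_bytes: bytes, run_parse, cache_path: Path):
    """Return a cached parse verdict for this sub-model, invoking the
    (expensive) `run_parse` callback only on a cache miss."""
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    key = model_hash(model_bytes)
    if key not in cache:
        cache[key] = run_parse(model_bytes)  # e.g. True/False feasibility
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_text(json.dumps(cache))
    return cache[key]
```

Hashing the serialized bytes means any change to the sub-model (graph, weights, opset) invalidates its entry automatically.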
Run the GUI:

```bash
python analyse_and_split_gui.py
```

Run the CLI analysis:

```bash
python analyse_and_split.py path/to/model.onnx --topk 10
```

- Export split-context diagrams (full + cut-flow) with configurable context hops (0..3).
- Suggest split boundaries based on activation-communication + compute balance.
- Skip-/Block-aware candidate pruning (heuristic: avoids splitting inside long skip/residual blocks).
- Link-model plugin for latency/energy with optional constraints (bandwidth/latency/energy).
- Pareto export + clean System / Workload separation (`system_config.json` + `workload_profile.json`).
- Peak activation memory (approximate, from value spans):
  - per boundary: live-set bytes (same basis as `Comm(b)`)
  - per partition: `peak_left[b] = max_{i<=b} live(i)`, `peak_right[b] = max_{i>=b} live(i)`
  - optional constraints: max activation memory left/right (fits SRAM/VRAM)
- Split directly from the GUI (export part1/part2 ONNX).
- Strict boundary option (rejects splits where part2 needs additional intermediate activations beyond the cut tensors; original graph inputs are allowed).
- Optional onnxruntime validation (`full(x) ~= part2(part1(x))`).
- Runner skeleton generator (`run_split_onnxruntime.py`):
  - generic ORT benchmark runner for: full / part1 / part2 / composed
  - supports input feeds via NPZ (`--inputs-npz`) and saving generated inputs (`--save-inputs-npz`)
  - supports dumping standalone split interfaces as NPZ (`--dump-interface {right,left,min,either}`) with metadata
  - supports CPU/CUDA/TensorRT (engine cache + fast-build preset) and writes `validation_report.json`
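The per-partition peak formulas above are cheap to compute: one forward pass gives the prefix maxima and one backward pass the suffix maxima. A minimal sketch (illustrative, not the tool's code):

```python
def partition_peaks(live_bytes):
    """Given the live-set size live(i) at each boundary i, compute
    peak_left[b] = max_{i<=b} live(i) and peak_right[b] = max_{i>=b} live(i)
    in a single forward and a single backward pass (O(n))."""
    n = len(live_bytes)
    peak_left, peak_right = [0] * n, [0] * n
    running = 0
    for i in range(n):                 # prefix maxima -> left partition
        running = max(running, live_bytes[i])
        peak_left[i] = running
    running = 0
    for i in range(n - 1, -1, -1):     # suffix maxima -> right partition
        running = max(running, live_bytes[i])
        peak_right[i] = running
    return peak_left, peak_right

peak_left, peak_right = partition_peaks([4, 16, 8, 32, 2])
# peak_left  == [4, 16, 16, 32, 32]
# peak_right == [32, 32, 32, 32, 2]
```

A cut at boundary `b` then fits the targets iff `peak_left[b]` and `peak_right[b]` are below the respective SRAM/VRAM limits.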
Example (dump interface for Part 2 / “right” side):

```bash
./run_split_onnxruntime.sh --provider tensorrt --dump-interface right --dump-interface-out results/interface
# outputs: results/interface_right.npz (and metadata in __meta__)
```

Example (LLM-style shapes via overrides):

```bash
./run_split_onnxruntime.sh --provider cuda --shape-override "input_ids=1x128 attention_mask=1x128"
```

Example (reproducible inputs to NPZ):

```bash
./run_split_onnxruntime.sh --provider cpu --seed 0 --save-inputs-npz results/inputs_full.npz
./run_split_onnxruntime.sh --provider cpu --inputs-npz results/inputs_full.npz
```

- Benchmark set generator (GUI button: “Benchmark set…”)
  - exports one subfolder per selected split (models + runner)
  - writes `benchmark_suite.py` to run all cases and aggregate results/plots
  - also exports paper assets into the benchmark folder root:
    - `split_candidates.tex`
    - plots (`analysis_*.pdf` / `analysis_*.svg`)
    - `system_config.json`, `workload_profile.json`
    - `pareto_export.csv`, `candidate_pruning.json`
- If Graphviz `dot` is installed, `.dot` files are rendered to SVG/PDF automatically; otherwise a matplotlib fallback diagram is created.
- Windows: do not run `.bat` files with Python. Double-click them, or run them from PowerShell directly.
Large models exported with ONNX external data are supported.
When exporting splits/benchmark sets, the tool tries to make the output folder usable by:
- creating a hardlink to the referenced `*.data` file (fast, no extra disk use; requires the same filesystem),
- or falling back to a symlink/copy,
- and if neither is possible, rewriting the ONNX external-data `location` to an absolute path (works locally, but is not portable).
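The first two fallback steps can be sketched as follows (a simplified stand-in for the tool's export logic; the final `location`-rewrite fallback is omitted):

```python
import os
import shutil
from pathlib import Path

def link_external_data(src: Path, dst: Path) -> str:
    """Make the referenced *.data file available next to the exported model,
    preferring cheap links over copies. Returns the strategy that succeeded."""
    try:
        os.link(src, dst)        # hardlink: no extra disk use, same filesystem only
        return "hardlink"
    except OSError:
        pass
    try:
        os.symlink(src, dst)     # symlink: may require privileges on Windows
        return "symlink"
    except OSError:
        pass
    shutil.copy2(src, dst)       # last resort: full copy
    return "copy"
```

`os.link` raises `OSError` when source and destination live on different filesystems, which is exactly the case where the fallback chain continues.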
The GUI writes logs to both:
- `~/.onnx_splitpoint_tool/gui.log`
- `./gui.log` (current working directory)