[DOCS] Add tutorial for exporting and loading back Relax executables #18404
tqchen merged 7 commits into apache:main
Conversation
Summary of Changes

Hello @tlopex, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces a new, detailed tutorial that clarifies how to export and load compiled Relax executables. It provides a practical guide for converting AI models from frameworks such as PyTorch or ONNX into deployable shared libraries, covering the entire workflow from compilation to execution within the TVM runtime. The tutorial addresses common deployment challenges and enables efficient model inference across various target devices, including remote systems.
Code Review
This pull request adds a new tutorial on exporting and loading Relax executables. The tutorial is comprehensive and well-structured. My review focuses on improving the clarity and correctness of some code snippets, particularly for the alternative parameter handling flow and the remote execution example, to ensure users can follow them without issues.
```python
# mod = from_exported_program(exported_program, keep_params_as_input=False)
# # Parameters are now embedded as constants in the module
# executable = relax.build(built_mod, target=TARGET)
```
The variable built_mod here refers to the module built from mod with keep_params_as_input=True. However, in this alternative flow, mod is redefined with keep_params_as_input=False. Using the old built_mod is incorrect as it will not have the parameters embedded.
To fix this, you should re-apply the compilation pipeline to the new mod before building. For example:
```python
with TARGET:
    built_mod_embedded = pipeline(mod)
executable = relax.build(built_mod_embedded, target=TARGET)
```

```python
# executable.export_library("mlp_arm.so")
#
# # Step 2: Connect to remote device RPC server
# remote = rpc.connect("192.168.1.100", 9090)  # Device IP and RPC port
#
# # Step 3: Upload the compiled library and parameters
# remote.upload("mlp_arm.so")
# remote.upload("model_params.npz")
```
The paths used for exporting and uploading artifacts in this RPC example are inconsistent with the rest of the tutorial, which uses ARTIFACT_DIR. For example, executable.export_library("mlp_arm.so") saves to the current directory, and remote.upload("model_params.npz") assumes the file is in the current directory, while it was actually saved to relax_export_artifacts/.
For consistency, it's better to use the ARTIFACT_DIR and params_path variables defined earlier in the tutorial.
```python
arm_lib_path = ARTIFACT_DIR / "mlp_arm.so"
executable.export_library(str(arm_lib_path))

# Step 2: Connect to remote device RPC server
remote = rpc.connect("192.168.1.100", 9090)  # Device IP and RPC port

# Step 3: Upload the compiled library and parameters
remote.upload(str(arm_lib_path))
remote.upload(str(params_path))
```

```python
# # Step 4: Load and run on remote device
# lib = remote.load_module("mlp_arm.so")
# vm = relax.VirtualMachine(lib, remote.cpu())
# # ... prepare input and params, then run inference
```
The comment `... prepare input and params, then run inference` is too brief. It would be more helpful for users to see a concrete example of how to prepare data on the remote device and execute the model, since this involves creating remote tensors from local data.
```python
# ... prepare input and params on remote, then run inference
dev = remote.cpu()
params_npz = np.load(str(params_path))
vm_params = [tvm.runtime.tensor(params_npz[f"p_{i}"], dev) for i in range(len(params_npz.files))]
data = np.random.randn(1, 1, 28, 28).astype("float32")
vm_input = tvm.runtime.tensor(data, dev)
output = vm["main"](vm_input, *vm_params)
print(f"Remote execution finished. Output shape: {output[0].asnumpy().shape}")
```
cc @tqchen
This PR adds a comprehensive tutorial demonstrating how to export compiled Relax modules to shared libraries (`.so` files) and load them back into the TVM runtime. This answers the question about exporting PyTorch/ONNX models to executable files.
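As background for the parameter handling discussed in the review comments above, here is a minimal, self-contained sketch of the save/load side of that workflow using NumPy's `.npz` format. The indexed `p_{i}` key naming mirrors the loading snippet suggested above; the parameter shapes and the temporary file location are illustrative assumptions, not part of the tutorial itself.

```python
import os
import tempfile

import numpy as np

# Hypothetical parameters, standing in for the model weights the tutorial
# exports alongside the compiled .so library.
params = [
    np.random.randn(128, 784).astype("float32"),
    np.random.randn(128).astype("float32"),
]

# Save with indexed keys (p_0, p_1, ...) so they can be reloaded in order.
params_path = os.path.join(tempfile.mkdtemp(), "model_params.npz")
np.savez(params_path, **{f"p_{i}": p for i, p in enumerate(params)})

# Load them back in the same order before feeding them to the VM.
loaded = np.load(params_path)
restored = [loaded[f"p_{i}"] for i in range(len(loaded.files))]
for a, b in zip(params, restored):
    assert np.array_equal(a, b)
print("round-trip OK:", [p.shape for p in restored])
```

The same ordered-key convention lets the remote-execution snippet rebuild the positional argument list for `vm["main"]` without needing parameter names.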