
[DOCS] Add tutorial for exporting and loading back Relax executables #18404

Merged
tqchen merged 7 commits into apache:main from tlopex:tut1
Nov 1, 2025

Conversation


@tlopex tlopex commented Oct 30, 2025

This PR adds a comprehensive tutorial demonstrating how to export compiled Relax modules to shared libraries (.so files) and load them back into the TVM runtime. This answers the question about exporting PyTorch/ONNX models to executable files.

@gemini-code-assist
Contributor

Summary of Changes

Hello @tlopex, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new, detailed tutorial aimed at clarifying the process of exporting and loading compiled Relax executables. It provides a practical guide for users to convert their AI models, originating from frameworks like PyTorch or ONNX, into deployable shared libraries. The tutorial covers the entire workflow from compilation to execution within the TVM runtime, addressing common deployment challenges and enabling efficient model inference across various target devices, including remote systems.

Highlights

  • New Tutorial Added: A comprehensive tutorial has been added that demonstrates how to export compiled Relax modules to shared libraries (.so files).
  • Loading and Execution: The tutorial details the process of loading these exported shared libraries back into the TVM runtime and executing them, covering both interactive and standalone script usage.
  • Model Deployment Workflow: It illustrates how to transform Relax programs, including those imported from PyTorch or ONNX models, into deployable artifacts using tvm.relax APIs.
  • Parameter Management: Guidance is provided on how to separate and save model parameters for flexible deployment, with an alternative method for embedding parameters directly into the shared library.
  • Advanced Deployment Scenarios: The tutorial includes instructions for running models on GPU devices and deploying to remote targets, such as ARM Linux devices, utilizing TVM's RPC mechanism.
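The parameter-management highlight above reduces to saving the detached weights under indexable keys and reloading them in order. A numpy-only sketch of that convention (the `p_{i}` key naming matches the review snippets later in this thread; the weight shapes are made-up placeholders, not the tutorial's model):

```python
import os
import tempfile

import numpy as np

# Made-up placeholder weights standing in for the detached model parameters.
params = [
    np.random.randn(128, 784).astype("float32"),
    np.random.randn(128).astype("float32"),
]

# Save each parameter under an indexable p_{i} key so load order is deterministic.
params_path = os.path.join(tempfile.mkdtemp(), "model_params.npz")
np.savez(params_path, **{f"p_{i}": p for i, p in enumerate(params)})

# Load them back in the same order, ready to feed to the VM entry point.
loaded = np.load(params_path)
restored = [loaded[f"p_{i}"] for i in range(len(loaded.files))]
```

Keeping parameters in a separate `.npz` like this lets the same `.so` be shipped with different weight files, which is why the tutorial treats embedding them in the library as the alternative rather than the default.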

@gemini-code-assist bot left a comment

Code Review

This pull request adds a new tutorial on exporting and loading Relax executables. The tutorial is comprehensive and well-structured. My review focuses on improving the clarity and correctness of some code snippets, particularly for the alternative parameter handling flow and the remote execution example, to ensure users can follow them without issues.

#
# mod = from_exported_program(exported_program, keep_params_as_input=False)
# # Parameters are now embedded as constants in the module
# executable = relax.build(built_mod, target=TARGET)
@gemini-code-assist (severity: medium)

The variable built_mod here refers to the module built from mod with keep_params_as_input=True. However, in this alternative flow, mod is redefined with keep_params_as_input=False. Using the old built_mod is incorrect as it will not have the parameters embedded.

To fix this, you should re-apply the compilation pipeline to the new mod before building. For example:

with TARGET:
    built_mod_embedded = pipeline(mod)
executable = relax.build(built_mod_embedded, target=TARGET)

Comment on lines +336 to +343
# executable.export_library("mlp_arm.so")
#
# # Step 2: Connect to remote device RPC server
# remote = rpc.connect("192.168.1.100", 9090) # Device IP and RPC port
#
# # Step 3: Upload the compiled library and parameters
# remote.upload("mlp_arm.so")
# remote.upload("model_params.npz")
@gemini-code-assist (severity: medium)

The paths used for exporting and uploading artifacts in this RPC example are inconsistent with the rest of the tutorial, which uses ARTIFACT_DIR. For example, executable.export_library("mlp_arm.so") saves to the current directory, and remote.upload("model_params.npz") assumes the file is in the current directory, while it was actually saved to relax_export_artifacts/.

For consistency, it's better to use the ARTIFACT_DIR and params_path variables defined earlier in the tutorial.

   arm_lib_path = ARTIFACT_DIR / "mlp_arm.so"
   executable.export_library(str(arm_lib_path))

   # Step 2: Connect to remote device RPC server
   remote = rpc.connect("192.168.1.100", 9090)  # Device IP and RPC port

   # Step 3: Upload the compiled library and parameters
   remote.upload(str(arm_lib_path))
   remote.upload(str(params_path))

# # Step 4: Load and run on remote device
# lib = remote.load_module("mlp_arm.so")
# vm = relax.VirtualMachine(lib, remote.cpu())
# # ... prepare input and params, then run inference
@gemini-code-assist (severity: medium)

The comment `... prepare input and params, then run inference` is too brief. It would be more helpful for users to see a concrete example of how to prepare data on the remote device and execute the model, since it involves creating remote tensors from local data.

   # ... prepare input and params on remote, then run inference
   dev = remote.cpu()
   params_npz = np.load(str(params_path))
   vm_params = [tvm.runtime.tensor(params_npz[f"p_{i}"], dev) for i in range(len(params_npz.files))]
   data = np.random.randn(1, 1, 28, 28).astype("float32")
   vm_input = tvm.runtime.tensor(data, dev)
   output = vm["main"](vm_input, *vm_params)
   print(f"Remote execution finished. Output shape: {output[0].asnumpy().shape}")

Clarify tutorial description for exporting and loading Relax modules.
@tlopex
Member Author

tlopex commented Oct 30, 2025

cc @tqchen

@MasterJH5574 (Contributor) left a comment:
LGTM.

@tqchen tqchen merged commit 36a4696 into apache:main Nov 1, 2025
13 checks passed