[CI] Make Graviton3 default AArch64 job runner node#15352
To support SVE testing, this PR migrates the current default AArch64 nodes to Graviton3-based nodes. It uses r7g.large instances, which meet the memory requirements of the TVM workloads.
  {% call m.invoke_build(
    name='BUILD: arm',
-   node='ARM-SMALL',
+   node='ARM-GRAVITON3',
Thanks, it would be useful to have an analysis of the cost and of the way we structure the tests.
As of now, running the UTs directly through e2e compilation can take up a lot of CI time. A lot of that comes from tests that are in fact integration tests.
It would be great to isolate a limited set of integration tests (for example, under tests/arm_sve/) and run only that limited set on these nodes, like our require_cuda tag, while the majority of tests do not have to go through the specific nodes.
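As a rough illustration of the tagging idea in the comment above, here is a minimal sketch of a skip decorator that lets SVE-only tests run just on hosts that report SVE. The names `features_include_sve`, `host_has_sve`, `requires_sve`, and the sample test are hypothetical and are not taken from the TVM code base. The sketch assumes a Linux host that lists CPU features on a `Features` line in `/proc/cpuinfo`. It uses the standard-library `unittest.skipUnless`, which pytest also honours.

```python
import platform
import unittest


def features_include_sve(cpuinfo_text):
    """Return True if a 'Features' line in cpuinfo text lists the 'sve' flag."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            # Exact token match, so 'sve2' alone does not count as 'sve'.
            return "sve" in value.split()
    return False


def host_has_sve(path="/proc/cpuinfo"):
    """Best-effort check (assumption: Linux AArch64 exposes flags in cpuinfo)."""
    if platform.machine().lower() not in ("aarch64", "arm64"):
        return False
    try:
        with open(path) as f:
            return features_include_sve(f.read())
    except OSError:
        return False


# Hypothetical tag, analogous in spirit to a CUDA-only test tag.
requires_sve = unittest.skipUnless(
    host_has_sve(), "needs an SVE-capable AArch64 host"
)


@requires_sve
def test_sve_integration_placeholder():
    # An SVE-specific integration test body would go here.
    pass
```

With a tag like this, only the decorated tests depend on the SVE-capable node; everything else can run unchanged on any runner.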
Thanks, it would be useful to have an analysis of the cost of the new instance. As of now, running the UTs directly through e2e compilation can take up a lot of CI time. A lot of that comes from tests that likely do not need SVE. My understanding is that we will need SVE for some of the integration tests. Ideally we should isolate a limited set of integration tests (e.g. via ...). Most of the remaining tests can be structured as UTs and likely do not need SVE.
@tqchen the new instance type is slightly more expensive on paper, as detailed below:
However, the new generation of instances has been shown to improve performance (see: re:Invent presentation), which indicates this is an improvement for CI costs. If you look at the diff, this replaces the ...
OK, got it. This seems good.