[Bugfix][TOPI] Fix a bug in arm_cpu int8 conv2d i8mm schedule #15484
Merged
ekalda merged 1 commit into apache:main on Aug 7, 2023
`topi.arm_cpu.schedule_conv2d_NHWC_quantized_interleaved` was failing compilation with the `+i8mm` extension enabled (as done in #14888) whenever the output height and output width were both equal to 1, such that OH x OW = 1.

Padding was being removed during the `tir.BufferShapeLegalize` pass, causing an error in the `tir.BufferBindUnwrapper` pass. Some of the removed padding was necessary for tensorize (using the `gemm_acc_2x2_int8_int8_int32` intrinsic), which expects 2x2 output tiles. However, because of the padding removal described above, the output tensor `C_interleaved` was reduced to having 1x2 tiles instead.

E.g. for A = [1x1x1x8], W = [1x1x8x24], C = [1x1x1x24]:
- Before fix: `C_interleaved = T.Buffer((1, 1, 2, 1, 6, 1, 2), "int32")`
- After fix: `C_interleaved = T.Buffer((1, 1, 2, 1, 6, 2, 2), "int32")`

To make sure the required padding is left untouched, while the rest of it is still removed, a dummy reference to the needed axis is declared. Finally, the leftover padding is still disregarded when computing the final output tensor `C`.
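The two `C_interleaved` shapes quoted above can be reproduced with a small pure-Python model. This is only an illustrative sketch, not the TVM implementation: the helper `c_interleaved_shape`, its `keep_2x2_tile` flag, and the tile sizes of 8 rows and 12 columns are assumptions chosen to match the example, and padding removal is modelled simply as shrinking the row axes to the rows that hold real data.

```python
def ceil_div(a, b):
    """Integer division rounding up."""
    return -(-a // b)


def c_interleaved_shape(m, n, keep_2x2_tile, tile_rows_a=8, tile_rows_b=12):
    """Toy model of the compacted `C_interleaved` shape for batch size 1.

    m = OH * OW (GEMM rows), n = output channels (GEMM columns).
    tile_rows_a and tile_rows_b are ASSUMED tile sizes, not taken from the PR text.
    """
    # Layout: (batch, row tiles, column tiles,
    #          2-row groups per tile, 2-column groups per tile, 2, 2)
    row_tiles = ceil_div(m, tile_rows_a)
    col_tiles = ceil_div(n, tile_rows_b)
    col_groups = tile_rows_b // 2

    # Modelled padding removal: row axes shrink to the rows with real data.
    valid_rows = min(m, tile_rows_a)
    row_groups = ceil_div(valid_rows, 2)
    rows_in_group = min(valid_rows, 2)

    # Modelled fix: keep the innermost row axis at 2 so that a 2x2 output
    # tile still exists for the tensorized intrinsic to write into.
    if keep_2x2_tile:
        rows_in_group = 2

    return (1, row_tiles, col_tiles, row_groups, col_groups, rows_in_group, 2)


# A = [1x1x1x8], W = [1x1x8x24]  ->  m = OH * OW = 1, n = 24
before = c_interleaved_shape(1, 24, keep_2x2_tile=False)
after = c_interleaved_shape(1, 24, keep_2x2_tile=True)
print(before)
print(after)
```

In this model only the innermost row axis differs between the two results, which mirrors the description: the padding needed for the 2x2 tile is kept, while the rest of the row padding is still dropped.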
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.
Generated by tvm-bot
ekalda approved these changes on Aug 4, 2023
Thanks @Anndrey24, LGTM, great work!
Thanks @Anndrey24!