Enable QC8/QS8 GEMM/IGEMM for Wasm relaxed integer dot product instruction on x64 #6454

fanchenkong1 (Contributor) opened this issue on May 22, 2024

V8 now supports AVX-VNNI instructions. The i32x4.dot_i8x16_i7x16_adds instruction can be compiled to vpdpbusd on x64 devices, which increases the speed of applications that use this opcode.

XNNPACK already has QC8/QS8 GEMM/IGEMM microkernels that use the relaxed SIMD dot product, but they are limited to a particular implementation of i32x4.dot_i8x16_i7x16_adds (CheckWAsmSDOT). We would also need microkernels for the VNNI-style i32x4.dot_i8x16_i7x16_adds. Our performance test of vpdpbusd on end2end_bench with a PoC shows large improvements in the following cases (a sketch of the dot-product inner loop is given after the table).

| d8/end2end_bench | Reduction in execution time |
| --- | --- |
| QC8MobileNetV1/T:1/real_time | -45.60% |
| QC8MobileNetV2/T:1/real_time | -30.50% |
| QS8MobileNetV1/T:1/real_time | -45.40% |
| QS8MobileNetV2/T:1/real_time | -30.30% |
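
For context, here is a minimal sketch of what the inner loop of such a dot-product microkernel looks like with clang's relaxed-SIMD intrinsics. This is an illustrative example, not the actual XNNPACK microkernel: the function name, the "c4" weight packing, and the assumption that the second operand fits the 7-bit range are ours.

```c
// Minimal sketch (not the actual XNNPACK microkernel): accumulate 4 output
// channels of a QS8 GEMM row using the Wasm relaxed integer dot product.
// Build with clang -msimd128 -mrelaxed-simd targeting wasm32.
#include <stddef.h>
#include <stdint.h>
#include <wasm_simd128.h>

// c[0..3] holds 4 int32 accumulators (bias preloaded). Weights are packed so
// that each 16-byte chunk contains 4 channels x 4 consecutive k-values.
static void qs8_dot_row(size_t kc,        // input channels, multiple of 4
                        const int8_t* a,  // activations, kc bytes
                        const int8_t* w,  // packed weights, kc * 4 bytes
                        int32_t* c) {
  v128_t vacc = wasm_v128_load(c);
  for (size_t k = 0; k < kc; k += 4) {
    // Broadcast 4 consecutive activation bytes into every 32-bit lane.
    const v128_t va = wasm_v128_load32_splat(a + k);
    // 4 channels x 4 k-values of weights.
    const v128_t vb = wasm_v128_load(w + k * 4);
    // vacc[i] += dot(va[4i..4i+3], vb[4i..4i+3]).
    // NOTE: the second operand is nominally 7-bit (i7x16); handling
    // full-range signed weights, or the unsigned-by-signed asymmetry of
    // vpdpbusd, requires the extra fixups that distinguish the VNNI-style
    // kernels requested in this issue.
    vacc = wasm_i32x4_relaxed_dot_i8x16_i7x16_add(va, vb, vacc);
  }
  wasm_v128_store(c, vacc);
}
```

Whether V8 then lowers this single relaxed dot instruction to vpdpbusd or to a multi-instruction fallback is what drives the end2end_bench differences above.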

Does XNNPACK plan to add new microkernels for the VNNI-style implementation of the Wasm relaxed integer dot product? We can provide a patch if needed.
