@geesun commented Nov 7, 2025

Add an article, "How to Benchmark a Single KleidiAI Micro-kernel in ExecuTorch".

It covers how to:

  • Cross-compile ExecuTorch for the ARM64 platform, enabling XNNPACK and KleidiAI with SME2 support.

  • Create ExecuTorch models that can be accelerated by SME2 through KleidiAI.

  • Use the executor_runner tool to generate ETDump profiling data.

  • Analyze the contents of ETRecord and ETDump using the ExecuTorch Inspector API.

  • I have reviewed Create a Learning Path

Please do not include any confidential information in your contribution. This includes confidential microarchitecture details and unannounced product information.

  • I have checked my contribution for confidential information

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the Creative Commons Attribution 4.0 International License.

@geesun force-pushed the kai-performance branch 2 times, most recently from 19acc01 to 674d462, Nov 10, 2025 02:36
@geesun changed the title from "Add How to Measure Kleidai Kernel Performance in ExecuTorch" to "Add How to Benchmark a Single KleidiAI Micro-kernel in ExecuTorch", Nov 10, 2025
@geesun force-pushed the kai-performance branch 4 times, most recently from 0080950 to 9e4902d, Nov 12, 2025 02:15