Mar 28, 2023 | Read time 8 min

How to Accurately Time CUDA Kernels in PyTorch

In a world of increasingly costly machine learning model deployments, ensuring accurate GPU operation timing is key to resource optimization. In this blog post, we explore best practices to achieve this in PyTorch.
Lawrence Atkins, Machine Learning Engineer
David MacLeod, Machine Learning Architect
Authors: Lawrence Atkins & David MacLeod
Acknowledgements: Caroline Dockes, Ed Rees, Ellena Reid & Markus Hennerbichler