Auto-sync is disabled for ready-for-review pull requests in this repository. Workflows must be run manually.
/ok to test
    elif isinstance(ptr, (_driver["CUdeviceptr"])):
        self._cptr = <void*><void_ptr>int(ptr)
Yep -- it is now redundant with calling int(ptr).
Sorry, I mean it's handled by this line, with the implicit conversion to int:
return <void *><void_ptr>ptr
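For illustration, a minimal standalone sketch of why the extra int() call is redundant (the void_ptr typedef name is an assumption, not taken from the repository):

    ctypedef unsigned long long void_ptr   # assumed: an address-sized integer alias

    cdef void * _as_void_ptr(ptr) except? NULL:
        # Cython converts the Python object to the C integer type itself
        # (via its integer-conversion protocol), so wrapping the argument
        # in int(ptr) before the cast does the same work twice.
        return <void *><void_ptr>ptr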
    cdef void * _helper_input_void_ptr(ptr, _HelperInputVoidPtrStruct *buffer)

    cdef inline void * _helper_input_void_ptr_free(_HelperInputVoidPtrStruct *helper):
        if helper[0]._pybuffer.buf != NULL:
Q: Should we check first if helper is NULL?
We could, but since this internal code is only ever called from generated code, I think it's safe to skip it. Since the helper is stack-allocated, there is no malloc failure to check for.
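To make the point concrete, a hedged sketch of the calling pattern (names and signatures are assumed, not the generated code itself): the struct lives in the caller's frame, so the pointer passed to the free helper can never be NULL and only the acquired buffer needs checking.

    from cpython.buffer cimport PyBuffer_Release

    cdef struct _HelperInputVoidPtrStruct:
        Py_buffer _pybuffer

    cdef inline void _helper_input_void_ptr_free(_HelperInputVoidPtrStruct *helper):
        # `helper` always points at stack memory in the caller, so a NULL
        # check on the pointer itself would never fire; only release the
        # buffer if one was actually acquired.
        if helper[0]._pybuffer.buf != NULL:
            PyBuffer_Release(&helper[0]._pybuffer)

    cdef void _example_caller(obj):
        cdef _HelperInputVoidPtrStruct helper   # stack allocation: no malloc, nothing to fail
        helper._pybuffer.buf = NULL             # marks "no buffer acquired yet"
        # ... convert `obj` and use the resulting void * ...
        _helper_input_void_ptr_free(&helper)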
        self._cptr = NULL
    elif isinstance(ptr, (int)):
        # Easy run, user gave us an already configured void** address
        try:
It seems we can avoid code duplication by replacing the try-except block with a call to the new helper, like this?

    self._cptr = _helper_input_void_ptr(ptr, <_HelperInputVoidPtrStruct*><PyObject*>self)

(I'm not so sure about the self casting; I think it's correct because they share the same layout.)
_HelperInputVoidPtr is a PyObject *, but _HelperInputVoidPtrStruct is not, so they actually do have quite different layouts. The struct, since allocated on the stack, doesn't need a PyObject header for reference counting and type checking etc., which is partly why the performance hack works. I think I could probably reduce this duplication another way, however, by making _HelperInputVoidPtrStruct a member of _HelperInputVoidPtr and then passing a reference to that.
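A hedged sketch of what that restructuring could look like (the member names and the simplified conversion are assumptions, not the actual patch):

    from cpython.buffer cimport PyBuffer_Release

    cdef struct _HelperInputVoidPtrStruct:
        Py_buffer _pybuffer

    cdef void * _helper_input_void_ptr(ptr, _HelperInputVoidPtrStruct *helper) except? NULL:
        # Simplified: the real helper also handles None, CUdeviceptr and
        # buffer-providing objects (filling helper[0]._pybuffer).
        return <void *><unsigned long long>ptr

    cdef class _HelperInputVoidPtr:
        cdef _HelperInputVoidPtrStruct _struct   # embedded member, not a separate allocation
        cdef void *_cptr

        def __cinit__(self, ptr):
            self._struct._pybuffer.buf = NULL
            # Pass a reference to the embedded struct so the class path and
            # the stack path share one conversion routine.
            self._cptr = _helper_input_void_ptr(ptr, &self._struct)

        def __dealloc__(self):
            if self._struct._pybuffer.buf != NULL:
                PyBuffer_Release(&self._struct._pybuffer)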
I've updated this to reduce the code duplication.
/ok to test

/ok to test 467f108
Backport failed for 12.9.x. Please cherry-pick the changes locally and resolve any conflicts:

    git fetch origin 12.9.x
    git worktree add -d .worktree/backport-1616-to-12.9.x origin/12.9.x
    cd .worktree/backport-1616-to-12.9.x
    git switch --create backport-1616-to-12.9.x
    git cherry-pick -x d40517a26045d3763ba41e627e3340b9bb392874
We currently accept an int, CUdeviceptr, or a buffer-providing object as convertible to a void *. This is currently handled with a class _HelperInputVoidPtr, which mainly exists to manage the lifetime when the input exposes a buffer. This object (like all PyObjects) is allocated on the heap and gets freed implicitly by Cython at the end of the function. Since it only exists to manage lifetimes when the input exposes a buffer, we pay this heap-allocation penalty even in the common case where the input is a simple integer.
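For context, a minimal sketch of that heap-allocated pattern (member names and the simplified type checks are assumptions; CUdeviceptr handling is omitted):

    from cpython.buffer cimport PyObject_CheckBuffer, PyObject_GetBuffer, PyBuffer_Release, PyBUF_SIMPLE

    cdef class _HelperInputVoidPtr:
        cdef void *_cptr
        cdef Py_buffer _pybuffer
        cdef bint _have_buffer

        def __cinit__(self, ptr):
            self._have_buffer = False
            if ptr is None:
                self._cptr = NULL
            elif isinstance(ptr, int):
                # Even this trivial path pays for the heap allocation of `self`.
                self._cptr = <void *><unsigned long long>ptr
            elif PyObject_CheckBuffer(ptr):
                PyObject_GetBuffer(ptr, &self._pybuffer, PyBUF_SIMPLE)
                self._cptr = self._pybuffer.buf
                self._have_buffer = True
            else:
                raise TypeError("expected an int, CUdeviceptr or buffer-providing object")

        def __dealloc__(self):
            # The reason the class exists: release the buffer when the
            # helper object goes away.
            if self._have_buffer:
                PyBuffer_Release(&self._pybuffer)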
This changes the code to statically allocate the Py_buffer on the stack, and so is faster for reasons similar to #1545. This means we are trading some stack space (88 bytes) for speed, but given that CUDA Python API calls can't recursively call themselves, I'm not concerned. This improves the overhead time in the benchmark in #659 from 2.97 us/call to 2.67 us/call.
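A sketch of the stack-based approach under the same assumptions: the Py_buffer sits inside a plain C struct in the caller's frame, so the common integer path involves no heap allocation at all.

    from cpython.buffer cimport PyObject_CheckBuffer, PyObject_GetBuffer, PyBUF_SIMPLE

    cdef struct _HelperInputVoidPtrStruct:
        Py_buffer _pybuffer          # this is the stack space being traded for speed

    cdef void * _helper_input_void_ptr(ptr, _HelperInputVoidPtrStruct *helper) except? NULL:
        if ptr is None:
            return NULL
        if isinstance(ptr, int):
            # Common case: a plain integer address, no allocation of any kind.
            return <void *><unsigned long long>ptr
        if PyObject_CheckBuffer(ptr):
            # Buffer-providing object: acquire into the caller's stack struct;
            # the matching free helper releases it at the end of the call.
            PyObject_GetBuffer(ptr, &helper[0]._pybuffer, PyBUF_SIMPLE)
            return helper[0]._pybuffer.buf
        raise TypeError("expected an int, CUdeviceptr or buffer-providing object")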
The old _HelperInputVoidPtr class stays around here because it is still useful when the input is a list of void *-convertible things and we can't statically determine how much space to allocate.
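For lists, something along these lines is presumably still needed (a hedged sketch reusing the _HelperInputVoidPtr class sketched above; the function name is made up): the element count is only known at run time, so the void ** array and the per-element helpers can't be placed on the stack ahead of time.

    from libc.stdlib cimport malloc, free   # the caller frees the returned array

    cdef void ** _list_to_void_ptrs(objs, list keepalive) except NULL:
        cdef size_t n = len(objs)
        cdef size_t i
        cdef _HelperInputVoidPtr helper
        if n == 0:
            raise ValueError("expected a non-empty list")
        cdef void **ptrs = <void **>malloc(n * sizeof(void *))
        if ptrs == NULL:
            raise MemoryError()
        for i in range(n):
            # Keep each helper alive so any acquired buffers stay valid
            # until the caller is done with the pointer array.
            helper = _HelperInputVoidPtr(objs[i])
            keepalive.append(helper)
            ptrs[i] = helper._cptr
        return ptrs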