Add CUDA-Graphics (OpenGL) interop support to cuda.core #1608

Open
rparolin wants to merge 9 commits into NVIDIA:main from rparolin:rparolin/graphics_cuda_interop
rparolin (Collaborator) commented Feb 12, 2026

Summary

Adds CUDA-Graphics (OpenGL) interoperability support to cuda.core, enabling zero-copy sharing of GPU memory between CUDA compute kernels and OpenGL graphics renderers.

  • GraphicsResource: Cython extension class for registering and managing OpenGL buffers/images with CUDA. Supports from_gl_buffer() and from_gl_image() factory methods, map()/unmap() with context manager support, and automatic resource cleanup via close().
  • GraphicsRegisterFlags: IntEnum for controlling registration behavior (NONE, READ_ONLY, WRITE_DISCARD, SURFACE_LOAD_STORE, TEXTURE_GATHER).
  • C++ handle layer: GraphicsResourceHandle (shared_ptr with custom deleter) integrated into the existing resource handle infrastructure.
  • Example: gl_interop_plasma.py, a real-time plasma effect rendered by a CUDA kernel directly into an OpenGL PBO and displayed via pyglet.
  • Tests: Comprehensive test suite covering registration, map/unmap, context manager, error handling, idempotent close, and GC behavior.
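
For orientation, the five flag names listed above line up with the CUDA driver's CU_GRAPHICS_REGISTER_FLAGS_* constants. Below is a minimal sketch of such an enum, assuming the PR's GraphicsRegisterFlags simply mirrors the driver values. The class name RegisterFlagsSketch is hypothetical, and the real class may differ.

```python
from enum import IntEnum


class RegisterFlagsSketch(IntEnum):
    """Hypothetical stand-in; values mirror CU_GRAPHICS_REGISTER_FLAGS_* in the CUDA driver API."""

    NONE = 0x00                # no hint about how CUDA will use the resource
    READ_ONLY = 0x01           # CUDA will only read from the resource
    WRITE_DISCARD = 0x02       # CUDA overwrites the whole resource; old contents need not be kept
    SURFACE_LOAD_STORE = 0x04  # driver spelling: CU_GRAPHICS_REGISTER_FLAGS_SURFACE_LDST
    TEXTURE_GATHER = 0x08      # resource will be used for texture gather operations


# An IntEnum member can be passed wherever the driver expects a plain integer flag.
flags = RegisterFlagsSketch.WRITE_DISCARD
print(int(flags))
```

WRITE_DISCARD suits the plasma example because the kernel rewrites every pixel each frame, so nothing from the previous frame has to survive the map.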

New Public API

import ctypes
from OpenGL import GL as gl
from cuda.core import GraphicsResource, GraphicsRegisterFlags

# Create an OpenGL Pixel Buffer Object (PBO) — GPU memory owned by OpenGL
pbo = ctypes.c_uint(0)
gl.glGenBuffers(1, ctypes.byref(pbo))
gl.glBindBuffer(gl.GL_PIXEL_UNPACK_BUFFER, pbo.value)
nbytes = width * height * 4  # RGBA, 1 byte per channel
gl.glBufferData(gl.GL_PIXEL_UNPACK_BUFFER, nbytes, None, gl.GL_DYNAMIC_DRAW)
pbo_id = pbo.value

# Register the PBO with CUDA for zero-copy access
resource = GraphicsResource.from_gl_buffer(pbo_id, flags=GraphicsRegisterFlags.WRITE_DISCARD)

# Map for CUDA kernel access (context manager auto-unmaps)
with resource.map(stream=stream) as buf:
    launch(stream, config, kernel, buf.handle, width, height, time)

# Cleanup
resource.close()
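
The snippet above leaves width, height, and config undefined. The sketch below shows one way the buffer size and a 2D launch grid might be derived, assuming one thread per pixel and a 16x16 thread block. Both choices are assumptions, and the helper names pbo_nbytes and grid_for are hypothetical rather than part of the PR.

```python
def pbo_nbytes(width: int, height: int, channels: int = 4, bytes_per_channel: int = 1) -> int:
    """Size in bytes of a tightly packed image buffer (RGBA8 by default)."""
    return width * height * channels * bytes_per_channel


def grid_for(width: int, height: int, block: tuple = (16, 16)) -> tuple:
    """Number of blocks per axis needed to cover every pixel (ceiling division)."""
    bx, by = block
    return ((width + bx - 1) // bx, (height + by - 1) // by)


width, height = 640, 480
block = (16, 16)
grid = grid_for(width, height, block)
print(pbo_nbytes(width, height), grid)
```

With cuda.core, tuples like these would typically be passed to LaunchConfig(grid=grid, block=block). When the image dimensions are not multiples of the block size, the kernel needs a bounds check so the extra threads do not write past the end of the PBO.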

Files Changed

| File | Description |
| --- | --- |
| _graphics.pyx / _graphics.pxd | New Cython module with GraphicsResource and GraphicsRegisterFlags |
| _cpp/resource_handles.cpp / .hpp | GraphicsResourceHandle creation + cuGraphicsUnregisterResource function pointer |
| _resource_handles.pxd / .pyx | Type alias and function pointer initialization for graphics handles |
| __init__.py | Export GraphicsResource and GraphicsRegisterFlags |
| tests/test_graphics.py | Test suite (~350 lines) |
| examples/gl_interop_plasma.py | Full working example (~410 lines) |
| pixi.toml / pixi.lock | Added pyglet test dependency |

Screenshot

image

copy-pr-bot bot (Contributor) commented Feb 12, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rparolin rparolin self-assigned this Feb 12, 2026
Comment on lines +243 to +245
HANDLE_RETURN(
cydriver.cuGraphicsMapResources(1, &raw, cy_stream)
)
Andy-Jost (Contributor) commented Feb 12, 2026

optional: you could make the unmap more reliable (and avoid the duplicated calls to cuGraphicsUnmapResources) by making this into a resource, similar to the handle.
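
One way to read this suggestion: let a small object own the mapping, so that the unmap call lives in exactly one place and runs once, whether the caller leaves a with block, calls unmap() explicitly, or lets the object be garbage collected. Below is a pure-Python sketch of that pattern using weakref.finalize. The class name MappedResourceSketch and the unmap_fn callback are hypothetical stand-ins for the Cython code and cuGraphicsUnmapResources, not the PR's implementation.

```python
import weakref


class MappedResourceSketch:
    """Hypothetical owner of a single mapping; releases it exactly once."""

    def __init__(self, raw_handle, unmap_fn):
        self.handle = raw_handle
        # weakref.finalize calls unmap_fn(raw_handle) at most once: on an explicit
        # call, when this object is garbage collected, or at interpreter exit.
        # The callback must not hold a reference back to self.
        self._finalizer = weakref.finalize(self, unmap_fn, raw_handle)

    @property
    def is_mapped(self):
        return self._finalizer.alive

    def unmap(self):
        self._finalizer()  # no-op if the mapping was already released

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.unmap()
        return False  # never swallow exceptions raised inside the with block


unmapped = []  # records every "unmap" so the single-call behavior is visible
with MappedResourceSketch(raw_handle=42, unmap_fn=unmapped.append) as m:
    assert m.is_mapped
m.unmap()  # second call does nothing
print(unmapped)  # [42]
```

A real version would also need to carry the stream used for the map call, since the driver's unmap takes a stream argument as well.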

@rparolin rparolin changed the title from "initial commit, plasma effect is rendering" to "Add CUDA-Graphics (OpenGL) interop support to cuda.core" Feb 18, 2026
@rparolin rparolin marked this pull request as ready for review February 18, 2026 22:37
copy-pr-bot bot (Contributor) commented Feb 18, 2026

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

- Remove unused GraphicsResourceHandle cimport (cython-lint)
- Split long f-string to stay within 120-char line limit (ruff E501)
- Rename unused variable buf to _buf (ruff F841)
- Add noqa: S110 to intentional try-except-pass in test cleanup
- Apply ruff auto-fixes (import sorting, formatting)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@rparolin rparolin requested a review from Andy-Jost February 18, 2026 22:49
@rparolin rparolin added this to the cuda.core v0.7.0 milestone Feb 18, 2026
@rparolin rparolin added the labels cuda.core (Everything related to the cuda.core module) and example (Improvements or additions to code examples) Feb 18, 2026
rparolin (Collaborator, Author) commented

/ok to test

leofang (Member) commented Feb 18, 2026

@jakirkham as the original requester of #241, would it be possible for you to review this and give us your feedback here? 🙂

@rparolin rparolin enabled auto-merge (squash) February 19, 2026 00:01