Shader programming poses many hard problems. The range of target platforms is vast, from CPU path tracers to mobile GPUs, and it is served by a zoo of incompatible languages: from GLSL to HLSL, from OSL to WGSL.
Common challenges include portability, managing specializations, and a lack of abstraction mechanisms. Existing solutions include the archaic C preprocessor, templates/generics, visual graph frameworks, transpilers and, finally, embedded domain-specific languages (EDSLs).
Python is an ideal host for EDSLs. Warp, Taichi, Numba, and Triton evolved to target GPU compute, and all of them share common architectural decisions: they capture the program's logic by inspecting the Python source code, generate an intermediate representation (IR), and compile that IR to the target format.
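To make the introspection-based architecture concrete, here is a deliberately tiny sketch using only the standard `ast` module. It is not how any of the libraries named above is implemented; `ToyEmitter`, the `saturate` function, and the assumption that every value is a `float` are hypothetical and exist only for illustration. The function is never executed: its source text is parsed and the tree is translated.

```python
import ast

# The "kernel" exists only as source text; it is parsed, never run.
SOURCE = "def saturate(x):\n    return min(max(x, 0.0), 1.0)\n"


class ToyEmitter(ast.NodeVisitor):
    """Hypothetical translator that understands a tiny subset of Python."""

    def visit_Name(self, node):
        return node.id

    def visit_Constant(self, node):
        return repr(node.value)

    def visit_Call(self, node):
        args = ", ".join(self.visit(arg) for arg in node.args)
        return f"{self.visit(node.func)}({args})"

    def generic_visit(self, node):
        # Anything outside the supported subset is rejected.
        raise NotImplementedError(type(node).__name__)


func = ast.parse(SOURCE).body[0]      # the FunctionDef node
param = func.args.args[0].arg         # "x"
expr = ToyEmitter().visit(func.body[0].value)

# Assumption for this sketch: all parameters and results are floats.
emitted = f"float {func.name}(float {param}) {{ return {expr}; }}"

# A construct the toy translator does not know about is an error.
try:
    ToyEmitter().visit(ast.parse("a if b else c", mode="eval").body)
except NotImplementedError as exc:
    unsupported = str(exc)
```

Even at this scale, every Python construct needs its own translation rule, which is one reason such systems support only a subset of the language.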
The above approach comes with significant disadvantages. Only a subset of Python is supported, debugging with standard tools is impossible, integration with external Python code is limited, metaprogramming requires special syntax, and heavy compiler infrastructure needs to be implemented in a language like C++.
This talk proposes an alternative architecture. Instead of introspection, we capture the program's logic by tracing execution with proxy objects at Python runtime, similar to JAX and PyTorch. Instead of building an IR, we emit target code eagerly, line-by-line, similar to how PyTorch Eager Mode launches computations. And because we don't implement a compiler, the implementation remains 100% Python.
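A minimal sketch of the tracing idea, under stated assumptions: the `Emitter` and `Float` classes, the `v0`, `v1`, ... temporary naming scheme, and the C-like output syntax are all hypothetical and are not taken from the talk's actual implementation. The point is only that an ordinary Python function runs unmodified, while each overloaded operator appends one line of target code instead of computing a value.

```python
import itertools


class Emitter:
    """Hypothetical sink for generated lines; no IR is kept."""

    def __init__(self):
        self.lines = []
        self._ids = itertools.count()

    def fresh(self):
        return f"v{next(self._ids)}"

    def emit(self, line):
        self.lines.append(line)


class Float:
    """Proxy for a target-side float: operators emit code eagerly."""

    def __init__(self, em, expr):
        self.em = em
        self.expr = expr

    def _binary(self, op, other, swap=False):
        lhs = self.expr
        rhs = other.expr if isinstance(other, Float) else repr(float(other))
        if swap:  # reflected operator: the Python constant was on the left
            lhs, rhs = rhs, lhs
        name = self.em.fresh()
        self.em.emit(f"float {name} = {lhs} {op} {rhs};")
        return Float(self.em, name)

    def __add__(self, other):
        return self._binary("+", other)

    def __radd__(self, other):
        return self._binary("+", other, swap=True)

    def __sub__(self, other):
        return self._binary("-", other)

    def __rsub__(self, other):
        return self._binary("-", other, swap=True)

    def __mul__(self, other):
        return self._binary("*", other)

    def __rmul__(self, other):
        return self._binary("*", other, swap=True)


def lerp(a, b, t):
    # Plain Python: can be stepped through with a normal debugger.
    return a * (1.0 - t) + b * t


em = Emitter()
result = lerp(Float(em, "a"), Float(em, "b"), Float(em, "t"))
# em.lines now holds:
#   float v0 = 1.0 - t;
#   float v1 = a * v0;
#   float v2 = b * t;
#   float v3 = v1 + v2;
```

Because `lerp` is executed rather than parsed, it can freely call other Python code, use loops or closures for metaprogramming, and be inspected with standard tools; only the proxy operations leave a trace in the output.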
We discuss in detail how core elements of Python syntax can be overloaded to implement such an architecture:
__setattr__/__getattr__ to capture variable names.
Attendees will leave with a toolbox of Python metaprogramming patterns that empower them to write a code generator in Python without having to implement a compiler.
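As a rough illustration of the `__setattr__`/`__getattr__` item above, the sketch below shows one way attribute assignment can reveal a variable's name to the code generator. The `Scope` class is hypothetical, expressions are represented as plain strings, and every variable is assumed to be a `float`; a real implementation would presumably combine this with typed proxy objects.

```python
class Scope:
    """Hypothetical namespace: attribute assignment emits a named variable."""

    def __init__(self):
        # Bypass our own __setattr__ for internal bookkeeping.
        object.__setattr__(self, "lines", [])
        object.__setattr__(self, "declared", set())

    def __setattr__(self, name, value):
        if name in self.declared:
            # Re-assignment: no second declaration.
            self.lines.append(f"{name} = {value};")
        else:
            self.declared.add(name)
            self.lines.append(f"float {name} = {value};")

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for captured variables.
        if name not in self.declared:
            raise AttributeError(name)
        return name  # the expression for a variable is simply its name


s = Scope()
s.radius = "0.5"
s.area = f"3.14159 * {s.radius} * {s.radius}"
s.radius = "2.0"
# s.lines now holds:
#   float radius = 0.5;
#   float area = 3.14159 * radius * radius;
#   radius = 2.0;
```

A bare assignment such as `radius = ...` cannot be intercepted in Python, which is why the name has to travel through an attribute (or a similar hook) for the generated code to keep readable identifiers.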