Halide 18.0.0
Halide compiler and libraries
A code generator that emits GPU code from a given Halide stmt.
#include <CodeGen_GPU_Dev.h>
Public Types | |
enum | MemoryFenceType { None = 0 , Device = 1 , Shared = 2 } |
A mask describing which type of memory fence to use for the gpu_thread_barrier() intrinsic. | |
Public Member Functions | |
virtual | ~CodeGen_GPU_Dev () |
virtual void | add_kernel (Stmt stmt, const std::string &name, const std::vector< DeviceArgument > &args)=0 |
Compile a GPU kernel into the module. | |
virtual void | init_module ()=0 |
(Re)initialize the GPU kernel module. | |
virtual std::vector< char > | compile_to_src ()=0 |
virtual std::string | get_current_kernel_name ()=0 |
virtual void | dump ()=0 |
virtual std::string | api_unique_name ()=0 |
This routine returns the GPU API name that is combined into runtime routine names to ensure each GPU API has a unique name. | |
virtual std::string | print_gpu_name (const std::string &name)=0 |
Returns the specified name transformed by the variable naming rules for the GPU language backend. | |
virtual bool | kernel_run_takes_types () const |
Allows the GPU device-specific code to request halide_type_t values to be passed to the kernel_run routine rather than just argument type sizes. | |
Static Public Member Functions | |
static bool | is_block_uniform (const Expr &expr) |
Checks if expr is block uniform, i.e. does not depend on a thread var. | |
static bool | is_buffer_constant (const Stmt &kernel, const std::string &buffer) |
Checks if the buffer is a candidate for constant storage. | |
static Stmt | scalarize_predicated_loads_stores (Stmt &s) |
Modifies predicated loads and stores to be non-predicated, since most GPU backends do not support predication. | |
A code generator that emits GPU code from a given Halide stmt.
Definition at line 18 of file CodeGen_GPU_Dev.h.
A mask describing which type of memory fence to use for the gpu_thread_barrier() intrinsic.
Not all GPU APIs support all types.
Enumerator | |
---|---|
None | |
Device | |
Shared | |
Definition at line 75 of file CodeGen_GPU_Dev.h.
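Since the enumerator values are 0, 1 and 2, the non-zero fence types occupy separate bits and can be combined and tested as a bit mask. The following is a minimal sketch of that idea, not code from Halide: the enum is a local copy of the values listed above, and `combine_fences` and `has_fence` are hypothetical helper names introduced only for illustration.

```cpp
#include <cassert>

// Local copy of the enumerator values documented above. This sketch does not
// include CodeGen_GPU_Dev.h.
enum MemoryFenceType { None = 0, Device = 1, Shared = 2 };

// Hypothetical helper: combine two fence types into one integer mask.
inline int combine_fences(MemoryFenceType a, MemoryFenceType b) {
    return static_cast<int>(a) | static_cast<int>(b);
}

// Hypothetical helper: report whether a mask requests the given fence type.
inline bool has_fence(int mask, MemoryFenceType f) {
    // Only the Device and Shared bits are defined in this sketch.
    assert((mask & ~(Device | Shared)) == 0);
    return (mask & static_cast<int>(f)) != 0;
}
```

With these helpers, `combine_fences(Device, Shared)` yields 3, and `has_fence` reports both bits as set on that mask. How a given backend interprets the mask passed to gpu_thread_barrier() is up to that backend.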
|
virtual |
|
pure virtual |
Compile a GPU kernel into the module.
This may be called many times with different kernels, which will all be accumulated into a single source module shared by a given Halide pipeline.
|
pure virtual |
(Re)initialize the GPU kernel module.
This is separate from compile, since a GPU device module will often have many kernels compiled into it for a single pipeline.
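The split between init_module and add_kernel can be pictured with a toy module that accumulates kernels until it is reset. This is only a sketch under stated assumptions: `ToyKernelModule` is a hypothetical stand-in, not a CodeGen_GPU_Dev subclass; kernel bodies are plain strings rather than a Stmt with DeviceArgument lists; and the emitted text format and the `kernel_count` accessor are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPU backend module. It mirrors the method names
// documented above but is not derived from CodeGen_GPU_Dev.
class ToyKernelModule {
public:
    // (Re)initialize the module: drop every kernel accumulated so far.
    void init_module() {
        src_.clear();
        names_.clear();
    }

    // Append one kernel to the shared module. The real add_kernel takes a
    // Stmt and a DeviceArgument list; a string body is used here instead.
    void add_kernel(const std::string &body, const std::string &name) {
        assert(!name.empty());
        src_ += "kernel " + name + " { " + body + " }\n";
        names_.push_back(name);
    }

    // Return the accumulated source as bytes, using the same return type as
    // the documented compile_to_src signature.
    std::vector<char> compile_to_src() const {
        return std::vector<char>(src_.begin(), src_.end());
    }

    // Illustration-only accessor: number of kernels added since the last reset.
    std::size_t kernel_count() const { return names_.size(); }

private:
    std::string src_;
    std::vector<std::string> names_;
};
```

Calling `init_module()` once and then `add_kernel()` several times leaves all kernels in one source buffer, which is the accumulation behavior described above. A second `init_module()` call starts an empty module again.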
|
pure virtual |
|
pure virtual |
|
pure virtual |
This routine returns the GPU API name that is combined into runtime routine names to ensure each GPU API has a unique name.
|
pure virtual |
Returns the specified name transformed by the variable naming rules for the GPU language backend.
Used to determine the name of a parameter during host codegen.
Allows the GPU device-specific code to request halide_type_t values to be passed to the kernel_run routine rather than just argument type sizes.
Definition at line 54 of file CodeGen_GPU_Dev.h.
Checks if expr is block uniform, i.e. does not depend on a thread var.
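The notion of block uniformity can be illustrated on a toy expression tree: an expression is block uniform when no variable it references is a thread variable. This sketch does not use Halide's Expr type or its visitors. `ToyExpr`, `toy_var`, `toy_op` and `toy_is_block_uniform` are hypothetical names, and the set of thread variable names is passed in explicitly as an assumption rather than taken from Halide's naming scheme.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy expression node: either a variable leaf (var_name non-empty) or an
// operator node with operands. This is not Halide's Expr.
struct ToyExpr {
    std::string var_name;
    std::vector<std::shared_ptr<ToyExpr>> operands;
};
using ToyExprPtr = std::shared_ptr<ToyExpr>;

// Build a variable leaf.
inline ToyExprPtr toy_var(const std::string &name) {
    auto e = std::make_shared<ToyExpr>();
    e->var_name = name;
    return e;
}

// Build a binary operator node; the operator itself is irrelevant here.
inline ToyExprPtr toy_op(ToyExprPtr a, ToyExprPtr b) {
    auto e = std::make_shared<ToyExpr>();
    e->operands = {std::move(a), std::move(b)};
    return e;
}

// True if the expression references none of the given thread variables.
inline bool toy_is_block_uniform(const ToyExprPtr &e,
                                 const std::set<std::string> &thread_vars) {
    assert(e != nullptr);
    if (!e->var_name.empty() && thread_vars.count(e->var_name) > 0) {
        return false;  // depends directly on a thread variable
    }
    for (const auto &child : e->operands) {
        if (!toy_is_block_uniform(child, thread_vars)) {
            return false;  // depends on a thread variable through an operand
        }
    }
    return true;
}
```

With `tx` as the only thread variable, an expression built from `bx` and `n` is uniform, while any expression containing `tx` at any depth is not.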
|
static |
Checks if the buffer is a candidate for constant storage.
Most GPU APIs support a constant memory storage class that cannot be written to and performs well for block-uniform accesses. A buffer is a candidate for constant storage if it is never written to and all loads from it are uniform within the workgroup.
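The two conditions above (never written, and only uniform loads) can be sketched as a scan over a flat list of memory accesses. This is an illustration only, not how Halide inspects a kernel: `ToyAccess` and `toy_is_buffer_constant` are hypothetical, and the sketch assumes each access already records whether its index is block uniform.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy record of one memory access inside a kernel. Not a Halide type.
struct ToyAccess {
    std::string buffer;        // buffer being accessed
    bool is_store;             // true for a store, false for a load
    bool index_block_uniform;  // assumed precomputed for this sketch
};

// True if `buffer` is never stored to and every load from it is block
// uniform. A buffer with no accesses at all passes vacuously.
inline bool toy_is_buffer_constant(const std::vector<ToyAccess> &kernel,
                                   const std::string &buffer) {
    assert(!buffer.empty());
    for (const auto &access : kernel) {
        if (access.buffer != buffer) {
            continue;  // access to some other buffer
        }
        if (access.is_store) {
            return false;  // written to: cannot live in constant storage
        }
        if (!access.index_block_uniform) {
            return false;  // per-thread load: not a good fit
        }
    }
    return true;
}
```

In this model a lookup table read at a block-uniform index qualifies, while an output buffer or an input indexed per thread does not.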
Modifies predicated loads and stores to be non-predicated, since most GPU backends do not support predication.
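One way to picture removing predication from a vector store is to expand it into per-lane scalar stores, each guarded by an ordinary conditional. The sketch below shows that idea in plain C++ on arrays. It is an assumption about the general technique, not a description of the IR this function produces, and `scalarized_predicated_store` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Emulate a predicated N-lane vector store as N guarded scalar stores.
// Lanes whose predicate is false leave the destination untouched.
template <std::size_t N>
void scalarized_predicated_store(int *dst,
                                 const std::array<int, N> &value,
                                 const std::array<bool, N> &pred) {
    assert(dst != nullptr);
    for (std::size_t lane = 0; lane < N; ++lane) {
        if (pred[lane]) {
            dst[lane] = value[lane];  // plain, non-predicated scalar store
        }
    }
}
```

The per-lane `if` is something any backend with ordinary control flow can express, which is why this form suits targets that lack predicated memory operations.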