GPU Programming Basics
25 flashcards
GPU
A Graphics Processing Unit (GPU) is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display.
CUDA
Compute Unified Device Architecture (CUDA) is a parallel computing platform and application programming interface (API) model created by Nvidia that allows software developers to use a CUDA-enabled graphics processing unit (GPU) for general purpose processing.
GPGPU
General-Purpose computing on Graphics Processing Units (GPGPU) is the use of a GPU to handle computation typically handled by the Central Processing Unit (CPU).
Shader
A shader is a type of computer program originally used for shading (producing appropriate levels of light, darkness, and color within an image) but now used for a variety of specialized functions across computer graphics and visual effects.
Stream Processor
A stream processor is a type of computing unit found inside a GPU that is optimized for processing multiple data elements simultaneously (data parallelism).
Thread Block
In GPU programming, a thread block is a group of threads that runs on a single multiprocessor, executes independently of other blocks, and can share data through a fast shared memory space provided by the GPU.
Warp
A warp is the basic unit of scheduling and parallel execution on Nvidia GPUs, consisting of 32 threads that execute the same instruction in lockstep (the SIMT model).
Compute Kernel
A compute kernel is a function written to run on the parallel architecture of a GPU and typically executed across many parallel threads.
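A minimal sketch of a compute kernel and its launch, assuming CUDA and unified memory; the function and variable names (`vecAdd`, `a`, `b`, `c`) are hypothetical. Each thread computes one element of the output.

```cuda
#include <cstdio>

// Compute kernel: each thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) c[i] = a[i] + b[i];                  // guard against overrun
}

int main() {
    const int n = 1 << 20;
    float *a, *b, *c;
    // Unified memory keeps the sketch short; explicit
    // cudaMalloc/cudaMemcpy would also work.
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();                   // wait for the kernel to finish

    printf("c[0] = %f\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```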
Memory Hierarchy
In GPU architecture, the memory hierarchy consists of registers, shared memory, constant memory, local memory, and global memory, arranged in a hierarchy based on speed and size.
OpenCL
Open Computing Language (OpenCL) is an open standard for cross-platform, parallel programming of diverse processors found in personal computers, servers, mobile devices, and embedded platforms.
Kernel Grid
A kernel grid is a multidimensional grid of thread blocks that defines the number of thread blocks and the number of threads within each block for a kernel launch in GPU programming.
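The grid and block dimensions are passed at launch via CUDA's `dim3` type. A sketch for a hypothetical 2D image kernel (the names `invert`, `img`, `launch` are illustrative):

```cuda
// Each thread handles one pixel of a width x height image.
__global__ void invert(unsigned char *img, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height)
        img[y * width + x] = 255 - img[y * width + x];
}

void launch(unsigned char *img, int width, int height) {
    dim3 block(16, 16);                           // 256 threads per block
    dim3 grid((width  + block.x - 1) / block.x,   // round up so the grid
              (height + block.y - 1) / block.y);  // covers the whole image
    invert<<<grid, block>>>(img, width, height);
}
```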
Global Memory
Global memory is the main device memory (DRAM) of a GPU. It is accessible by all threads and has the largest capacity, but higher latency than on-chip memories such as shared memory and registers.
Shared Memory
Shared memory in GPU architecture is a small amount of memory accessible by threads within the same thread block that allows for fast, cooperative data sharing.
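A common use of shared memory is a block-level reduction, sketched below under the assumption of a 256-thread (power-of-two) block; the `__syncthreads()` barriers keep the cooperating threads in step.

```cuda
// Block-level sum reduction in shared memory.
// Assumes the kernel is launched with 256 threads per block.
__global__ void blockSum(const float *in, float *out, int n) {
    __shared__ float tile[256];          // one slot per thread in the block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();                     // all loads done before anyone reads

    // Tree reduction within the block.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s)
            tile[threadIdx.x] += tile[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0)
        out[blockIdx.x] = tile[0];       // one partial sum per block
}
```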
Data Parallelism
Data parallelism is a form of parallelization where each processor performs the same task on different pieces of distributed data in parallel.
Task Parallelism
Task parallelism is a form of parallelization where different processors perform different tasks on the same or different sets of data.
Synchronization
Synchronization in GPU programming coordinates the execution of threads, ensuring that operations happen in the correct order and that shared resources are not accessed concurrently by multiple threads in conflicting ways.
Occupancy
Occupancy refers to the ratio of the number of active warps per multiprocessor to the maximum number of possible active warps, and is a measure of how well the GPU resources are being utilized.
SIMD
Single Instruction, Multiple Data (SIMD) is a parallel computing architecture where multiple processing elements perform the same operation on multiple data points simultaneously.
Graphics Pipeline
The graphics pipeline is a conceptual model in computer graphics that describes the sequence of steps used to render a 3D scene onto a 2D screen.
Thread Divergence
Thread divergence occurs when threads of the same warp follow different execution paths due to conditional branching, which can lead to serialization of the divergent branches and reduced parallel efficiency.
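A sketch contrasting a divergent branch with a warp-uniform one (kernel names are hypothetical); since a warp is 32 threads, branching on `threadIdx.x / 32` lets whole warps agree on the condition.

```cuda
// Divergent: even and odd lanes of the same warp take different
// branches, so the warp executes both paths serially.
__global__ void divergent(float *x) {
    int i = threadIdx.x;
    if (i % 2 == 0) x[i] = x[i] * 2.0f;  // even lanes
    else            x[i] = x[i] + 1.0f;  // odd lanes
}

// Warp-uniform: all 32 threads of a warp see the same condition,
// so no path is serialized within a warp.
__global__ void uniform(float *x) {
    int i = threadIdx.x;
    if ((i / 32) % 2 == 0) x[i] = x[i] * 2.0f;
    else                   x[i] = x[i] + 1.0f;
}
```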
Texture Memory
Texture memory is a region of global memory accessed through a dedicated read-only cache optimized for texture mapping; in general-purpose computing it can efficiently serve read-only access patterns with spatial locality.
Constant Memory
Constant memory on a GPU is a small region of device memory that is read-only from kernels and backed by a fast on-chip cache; it is ideal for values that do not change during a kernel execution, especially when many threads read the same address (the read is broadcast).
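A sketch of constant memory in CUDA, assuming a hypothetical filter with coefficients `coeffs`; every thread reads the same `coeffs[j]` in the same iteration, which is the broadcast-friendly pattern constant memory is built for.

```cuda
// __constant__ variables live in constant memory on the device.
__constant__ float coeffs[16];   // hypothetical filter coefficients

__global__ void applyFilter(const float *in, float *out, int n, int k) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i + k > n) return;               // stay inside the input
    float acc = 0.0f;
    for (int j = 0; j < k; ++j)          // every thread reads coeffs[j]
        acc += coeffs[j] * in[i + j];    // at once: a broadcast read
    out[i] = acc;
}

// Host side: copy data into the constant symbol before launching, e.g.
// cudaMemcpyToSymbol(coeffs, hostCoeffs, k * sizeof(float));
```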
Register
In GPU architecture, a register is a small storage location that is very fast and located on the processor itself, used to hold variables for individual threads.
API
An Application Programming Interface (API) is a set of protocols, routines, and tools for building software and applications which specify how software components should interact.
Multiprocessor
In the context of GPU architecture, a multiprocessor (a streaming multiprocessor, or SM, on Nvidia hardware) is a processing unit capable of executing many threads in parallel; a GPU is made up of multiple such units.