GPU Programming Basics
25 Flashcards
Stream Processor
A stream processor is a type of computing unit found inside a GPU that is optimized for processing multiple data elements simultaneously (data parallelism).
Thread Block
In GPU programming, a thread block is a group of threads that execute together on the same multiprocessor and can share data through a fast shared memory space; different blocks are scheduled independently of one another.
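A minimal CUDA sketch of block-level cooperation (kernel and variable names are illustrative, and the grid is assumed to cover the array exactly): each block reverses its own tile of an array through shared memory, with no communication between blocks.

    #include <cuda_runtime.h>

    #define BLOCK 256

    // Each thread block reverses its own 256-element tile in place.
    // Threads within a block cooperate through __shared__ memory;
    // different blocks never need to communicate.
    __global__ void reverse_tiles(float *data)
    {
        __shared__ float tile[BLOCK];      // visible to this block only
        int i = blockIdx.x * blockDim.x + threadIdx.x;

        tile[threadIdx.x] = data[i];       // stage the tile in shared memory
        __syncthreads();                   // wait until the whole tile is loaded

        data[i] = tile[BLOCK - 1 - threadIdx.x];  // write back reversed
    }

    // Launch example: one block per 256-element tile.
    // reverse_tiles<<<n / BLOCK, BLOCK>>>(d_data);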
Global Memory
Global memory refers to the main memory on a GPU, accessible by all threads; it has the largest capacity but higher latency than the other kinds of GPU memory.
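A host-side sketch, assuming the standard CUDA runtime API: global memory is allocated with cudaMalloc and moved to and from the CPU with cudaMemcpy.

    #include <cuda_runtime.h>
    #include <stdlib.h>

    int main(void)
    {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);

        float *h_buf = (float *)malloc(bytes);  // ordinary CPU memory
        float *d_buf;
        cudaMalloc(&d_buf, bytes);              // global (device) memory

        cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);  // CPU -> GPU
        /* ... launch kernels that read and write d_buf here ... */
        cudaMemcpy(h_buf, d_buf, bytes, cudaMemcpyDeviceToHost);  // GPU -> CPU

        cudaFree(d_buf);
        free(h_buf);
        return 0;
    }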
Texture Memory
Texture memory is a read-only cached region of GPU memory optimized for texture-mapping operations; its cache is designed for 2D spatial locality, which can also benefit certain access patterns in general-purpose computing.
Data Parallelism
Data parallelism is a form of parallelization where each processor performs the same task on different pieces of distributed data in parallel.
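A sketch of data parallelism in CUDA, using the classic SAXPY operation (names illustrative): every thread runs the same statement on a different element.

    #include <cuda_runtime.h>

    // Same task, different data: thread i handles element i.
    __global__ void saxpy(int n, float a, const float *x, float *y)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)                  // guard: the grid may be larger than n
            y[i] = a * x[i] + y[i];
    }

    // Launch with enough threads to cover all n elements:
    // saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, d_x, d_y);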
Synchronization
Synchronization in GPU programming coordinates the execution of threads, ensuring that operations happen in the correct order and that shared resources are not accessed concurrently in ways that cause conflicts.
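A sketch of two common CUDA synchronization tools (the histogram itself is illustrative, and blockDim.x is assumed to be 256): __syncthreads() is a barrier that orders threads within a block, and atomicAdd() serializes conflicting updates to the same location.

    #include <cuda_runtime.h>

    __global__ void histogram(const unsigned char *in, int n, unsigned int *bins)
    {
        __shared__ unsigned int local[256];
        local[threadIdx.x] = 0;            // assumes blockDim.x == 256
        __syncthreads();                   // barrier: all bins zeroed before counting

        for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
             i += gridDim.x * blockDim.x)
            atomicAdd(&local[in[i]], 1u);  // atomic: no lost updates within the block

        __syncthreads();                   // barrier: counting done before merging
        atomicAdd(&bins[threadIdx.x], local[threadIdx.x]);  // merge into global bins
    }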
Shader
A shader is a type of computer program originally used for shading (producing appropriate levels of light, darkness, and color in an image) but now used for a variety of specialized functions across computer graphics and special effects.
Compute Kernel
A compute kernel is a function written to run on the parallel architecture of a GPU and typically executed across many parallel threads.
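A minimal compute kernel in CUDA (names illustrative): __global__ marks the function that runs on the GPU, and the <<<blocks, threads>>> launch syntax spawns the parallel threads.

    #include <cuda_runtime.h>

    __global__ void scale(float *data, float factor)   // runs on the GPU
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x; // unique thread index
        data[i] *= factor;
    }

    int main(void)
    {
        float *d;
        cudaMalloc(&d, 1024 * sizeof(float));
        scale<<<4, 256>>>(d, 0.5f);   // 4 blocks x 256 threads = 1024 parallel threads
        cudaDeviceSynchronize();      // wait for the kernel to finish
        cudaFree(d);
        return 0;
    }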
Memory Hierarchy
In GPU architecture, the memory hierarchy consists of registers, shared memory, constant memory, local memory, and global memory, arranged by speed and size; registers are the fastest and smallest, global memory the largest and slowest.
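A sketch of where variables land in that hierarchy, assuming CUDA (names illustrative):

    #include <cuda_runtime.h>

    __constant__ float coeff[4];   // constant memory: cached, read-only in kernels

    __global__ void hierarchy_demo(const float *in, float *out)  // in/out: global memory
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // register: fastest, per-thread
        __shared__ float tile[256];                     // shared memory: fast, per-block
        tile[threadIdx.x] = in[i];                      // global memory: largest, slowest
        __syncthreads();
        out[i] = tile[threadIdx.x] * coeff[i % 4];      // constant cache: broadcast reads
        // (per-thread arrays too large for registers spill to off-chip local memory)
    }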
GPGPU
General-Purpose computing on Graphics Processing Units (GPGPU) is the use of a GPU to handle computation typically handled by the Central Processing Unit (CPU).
OpenCL
Open Computing Language (OpenCL) is an open standard for cross-platform, parallel programming of diverse processors found in personal computers, servers, mobile devices, and embedded platforms.
Multiprocessor
In the context of GPU architecture, a multiprocessor is a processing unit capable of executing multiple threads in parallel, and multiple such units make up a GPU.
GPU
A Graphics Processing Unit (GPU) is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display.
CUDA
Compute Unified Device Architecture (CUDA) is a parallel computing platform and application programming interface (API) model created by Nvidia that allows software developers to use a CUDA-enabled graphics processing unit (GPU) for general-purpose processing.
Occupancy
Occupancy refers to the ratio of the number of active warps per multiprocessor to the maximum number of possible active warps, and is a measure of how well the GPU resources are being utilized.
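The CUDA runtime can estimate occupancy at run time; a sketch using cudaOccupancyMaxActiveBlocksPerMultiprocessor (the kernel is a placeholder):

    #include <cuda_runtime.h>
    #include <stdio.h>

    __global__ void my_kernel(float *data) { data[threadIdx.x] *= 2.0f; }

    int main(void)
    {
        int blocks_per_sm = 0, block_size = 256;
        cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocks_per_sm, my_kernel,
                                                      block_size, 0 /* dyn. shared mem */);

        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);
        int active_warps = blocks_per_sm * block_size / prop.warpSize;
        int max_warps = prop.maxThreadsPerMultiProcessor / prop.warpSize;
        printf("occupancy: %.0f%%\n", 100.0 * active_warps / max_warps);
        return 0;
    }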
Shared Memory
Shared memory in GPU architecture is a small amount of memory accessible by threads within the same thread block that allows for fast, cooperative data sharing.
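A sketch of a block-wide sum in CUDA: the shared buffer is sized at launch time through the third <<<>>> parameter (names illustrative; block size assumed to be a power of two).

    #include <cuda_runtime.h>

    // Shared memory sized at launch time via the third <<<>>> parameter.
    __global__ void sum_block(const float *in, float *out)
    {
        extern __shared__ float buf[];        // one dynamic shared array per block
        int t = threadIdx.x;
        buf[t] = in[blockIdx.x * blockDim.x + t];
        __syncthreads();

        for (int s = blockDim.x / 2; s > 0; s >>= 1) {  // tree reduction in shared mem
            if (t < s) buf[t] += buf[t + s];
            __syncthreads();
        }
        if (t == 0) out[blockIdx.x] = buf[0]; // one partial sum per block
    }

    // Launch: sum_block<<<blocks, 256, 256 * sizeof(float)>>>(d_in, d_out);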
Task Parallelism
Task parallelism is a form of parallelization where different processors perform different tasks on the same or different sets of data.
Graphics Pipeline
The graphics pipeline is a conceptual model in computer graphics that describes the sequence of steps used to render a 3D scene onto a 2D screen.
API
An Application Programming Interface (API) is a set of protocols, routines, and tools for building software and applications which specify how software components should interact.
Warp
A warp is the basic unit of parallel execution on Nvidia GPUs: a group of 32 threads that execute the same instruction in lockstep.
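Warps also have their own primitives; a sketch of a warp-level sum using __shfl_down_sync, which lets the 32 threads of a warp exchange register values directly (helper name illustrative):

    #include <cuda_runtime.h>

    // Threads in one warp exchange register values directly; no shared memory needed.
    __device__ float warp_sum(float v)
    {
        for (int offset = 16; offset > 0; offset >>= 1)     // 32 values -> 1 value
            v += __shfl_down_sync(0xffffffffu, v, offset);  // read from lane + offset
        return v;                                           // lane 0 holds the total
    }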
SIMD
Single Instruction, Multiple Data (SIMD) is a parallel computing architecture where multiple processing elements perform the same operation on multiple data points simultaneously.
Thread Divergence
Thread divergence occurs when threads of the same warp follow different execution paths due to conditional branching, which can lead to serialization of the divergent branches and reduced parallel efficiency.
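A sketch of divergence in CUDA (names illustrative): adjacent threads branch differently, so the warp executes both paths one after the other.

    #include <cuda_runtime.h>

    __global__ void divergent(float *data)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        // Threads in the same warp take different paths, so the hardware runs
        // the two branches one after the other (serialization):
        if (i % 2 == 0)
            data[i] *= 2.0f;   // even lanes run while odd lanes idle
        else
            data[i] += 1.0f;   // odd lanes run while even lanes idle

        // Branching on a per-warp or per-block value avoids divergence, e.g.:
        // if (blockIdx.x % 2 == 0) { ... }  // whole warp takes the same path
    }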
Kernel Grid
A kernel grid is the multidimensional arrangement of thread blocks launched for a kernel; together with the block dimensions, it determines how many blocks run and how many threads each block contains.
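A sketch of a 2D grid configuration in CUDA for an image-sized problem (dimensions and names illustrative):

    #include <cuda_runtime.h>

    __global__ void fill(float *img, int w, int h)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;  // column within the grid
        int y = blockIdx.y * blockDim.y + threadIdx.y;  // row within the grid
        if (x < w && y < h)
            img[y * w + x] = 1.0f;
    }

    // Hypothetical launch for a 1920x1080 image:
    // dim3 block(16, 16);                              // threads per block
    // dim3 grid((1920 + 15) / 16, (1080 + 15) / 16);   // blocks per grid (ceiling division)
    // fill<<<grid, block>>>(d_img, 1920, 1080);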
Constant Memory
Constant memory on a GPU is a small region of device memory that is read-only from kernels and served through a fast on-chip cache, making it ideal for values that do not change over the course of a kernel execution.
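A sketch, assuming CUDA: __constant__ declares the symbol, kernels read it through the constant cache, and the host fills it with cudaMemcpyToSymbol (names illustrative; bounds checks omitted).

    #include <cuda_runtime.h>

    __constant__ float filter[9];   // lives in constant memory, read-only in kernels

    __global__ void apply(const float *in, float *out)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        float acc = 0.0f;
        for (int k = 0; k < 9; ++k)         // all threads read the same filter[k]:
            acc += filter[k] * in[i + k];   // served by the constant cache (broadcast)
        out[i] = acc;
    }

    // Host side: copy coefficients into the constant symbol before launching.
    // float h_filter[9] = { ... };
    // cudaMemcpyToSymbol(filter, h_filter, sizeof(h_filter));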
Register
In GPU architecture, a register is a small storage location that is very fast and located on the processor itself, used to hold variables for individual threads.