Benchmarking Parallel Code
Scalability
Scalability refers to a system's ability to maintain or improve performance as more computing resources are added. It is essential for evaluating the long-term utility of parallel computing architectures.
Communication Overhead
Communication overhead refers to the time spent on data exchange between processors rather than on actual computation. This often becomes the bottleneck in improving parallel code performance.
MPI (Message Passing Interface)
MPI is a standardized and portable message-passing system designed to function on a wide variety of parallel computing architectures. It is commonly used in benchmarking parallel code for communications between nodes.
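A minimal sketch of how such message passing is often timed, using the mpi4py Python bindings (an assumed environment; the same pattern exists in C). Two ranks bounce a buffer back and forth, which also makes the communication overhead described above directly measurable. Run with, for example, mpirun -n 2 python pingpong.py.

```python
# Ping-pong timing sketch with mpi4py (assumes exactly two ranks).
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

msg = np.zeros(1_000_000, dtype=np.float64)   # ~8 MB payload
reps = 10

comm.Barrier()
t0 = MPI.Wtime()
for _ in range(reps):
    if rank == 0:
        comm.Send(msg, dest=1, tag=0)
        comm.Recv(msg, source=1, tag=1)
    elif rank == 1:
        comm.Recv(msg, source=0, tag=0)
        comm.Send(msg, dest=0, tag=1)
elapsed = MPI.Wtime() - t0

if rank == 0:
    print(f"average round trip: {elapsed / reps * 1e3:.3f} ms")
```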
Benchmark Suites
Benchmark suites like SPEC MPI and the NAS Parallel Benchmarks are collections of programs used to evaluate the performance and scalability of parallel computers. They provide standardized tests for various types of computations.
Amdahl's Law
Amdahl's Law predicts the potential speedup of a program's execution as more processors are added, considering the proportion of code that is serial versus parallelizable. As a formula, S(N) = 1 / ((1 - P) + P/N), with P being the parallelizable portion and N the number of processors.
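A small worked example of plugging numbers into this formula, written as a Python sketch; the function name and sample values are illustrative only.

```python
def amdahl_speedup(p, n):
    # Predicted speedup with parallel fraction p on n processors:
    # S(N) = 1 / ((1 - P) + P / N)
    return 1.0 / ((1.0 - p) + p / n)

# Example: 90% parallelizable code on 8 processors
print(amdahl_speedup(0.9, 8))        # ~4.7
# Even with an enormous processor count the speedup is capped at 1 / (1 - P) = 10
print(amdahl_speedup(0.9, 10**9))
```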
HPL (High Performance Linpack)
The HPL benchmark measures the floating-point computing power of a system by solving a dense system of linear equations. It is commonly used to rank supercomputers in the TOP500 list.
FLOPS (Floating Point Operations Per Second)
FLOPS quantifies the rate at which a system performs floating-point operations for numerical calculations. It is an essential performance metric in high-performance computing and helps compare different computer architectures.
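A rough, HPL-flavored illustration of the idea in Python: time a dense linear solve with NumPy and divide the nominal operation count (about 2/3 * n^3 for the factorization) by the elapsed time. This is only a sketch, not the official benchmark.

```python
import time
import numpy as np

n = 2000
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

t0 = time.perf_counter()
x = np.linalg.solve(A, b)             # dense solve, like HPL's core workload
dt = time.perf_counter() - t0

flops = (2.0 / 3.0) * n**3            # nominal operation count for the LU factorization
print(f"~{flops / dt / 1e9:.1f} GFLOPS in {dt:.3f} s")
```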
Race Condition
A race condition occurs when multiple threads or processes access shared data and try to change it at the same time. It can lead to unpredictable results and is crucial to consider during parallel code benchmarking.
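A short Python sketch of the hazard: several threads perform an unsynchronized read-modify-write on a shared counter, so updates can be lost. How often the loss shows up varies by interpreter and timing.

```python
import threading

counter = 0

def worker(iterations):
    global counter
    for _ in range(iterations):
        tmp = counter      # read
        tmp += 1           # modify
        counter = tmp      # write: another thread may have updated counter in between

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()

# Expected 400000; lost updates often make the result smaller, though runs vary.
print(counter)
```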
Lock Contention
Lock contention happens when multiple threads or processes wait for a lock on a shared resource, which serializes parts of the parallel code. Minimizing lock contention is essential for improving parallel code performance.
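A small Python sketch of the effect: eight threads do the same simulated work with and without one shared lock, and the locked version takes roughly eight times longer because the lock serializes them. The sleep is a stand-in for work done while holding the lock; the counts are illustrative.

```python
import threading, time

lock = threading.Lock()

def with_lock():
    with lock:
        time.sleep(0.05)   # simulated work done while holding the shared lock

def without_lock():
    time.sleep(0.05)       # the same work with no shared lock

def run(fn):
    threads = [threading.Thread(target=fn) for _ in range(8)]
    t0 = time.perf_counter()
    for t in threads: t.start()
    for t in threads: t.join()
    return time.perf_counter() - t0

print(f"contended lock: {run(with_lock):.2f}s")    # ~0.40s: the lock serializes the threads
print(f"no shared lock: {run(without_lock):.2f}s") # ~0.05s: the threads overlap freely
```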
Parallel Efficiency
Parallel efficiency is the speedup divided by the number of processors used, showing how well a parallelization scales as resources increase. It essentially indicates how much additional benefit is gained from adding more processors.
Thread Safety
Thread safety refers to the property of an algorithm or data structure where it functions correctly during concurrent execution by multiple threads. This is vital for accurate benchmarking of parallel applications.
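A sketch of making the racy counter from the race-condition example thread-safe by guarding the read-modify-write with a lock; the names are illustrative.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def safe_worker(iterations):
    global counter
    for _ in range(iterations):
        with counter_lock:     # the read-modify-write is now atomic with respect to other threads
            counter += 1

threads = [threading.Thread(target=safe_worker, args=(100_000,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)                 # always 400000
```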
Load Balancing
Load balancing is the practice of distributing work evenly across processors to avoid any idling and ensure maximum utilization. It is a critical aspect of optimizing parallel code performance.
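A Python sketch contrasting static and dynamic distribution with multiprocessing.Pool: when a few expensive tasks are clustered together, large fixed chunks can strand them all on one worker, while small chunks let idle workers keep pulling work. The task costs and pool size are illustrative.

```python
import multiprocessing as mp
import time

def task(seconds):
    time.sleep(seconds)          # stand-in for work of uneven cost
    return seconds

def run(costs, chunk):
    with mp.Pool(4) as pool:
        t0 = time.perf_counter()
        pool.map(task, costs, chunksize=chunk)
        return time.perf_counter() - t0

if __name__ == "__main__":
    # A few expensive tasks clustered at the front plus many cheap ones.
    costs = [0.2] * 4 + [0.01] * 116

    # Large static chunks hand all the expensive tasks to one worker...
    print(f"static  (chunksize=30): {run(costs, 30):.2f}s")
    # ...while small chunks let idle workers keep pulling remaining tasks.
    print(f"dynamic (chunksize=1):  {run(costs, 1):.2f}s")
```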
Gustafson's Law
Gustafson's Law provides a more optimistic view compared to Amdahl's Law, suggesting that scaled speedup is linear if the workload increases with the number of processors. The formula is S(N) = N - (1 - P)(N - 1), where P is the parallelizable fraction and N is the number of processors.
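A small worked example of the scaled-speedup formula as a Python sketch; the values are illustrative.

```python
def gustafson_speedup(p, n):
    # Scaled speedup S(N) = N - (1 - P)(N - 1),
    # where p is the parallel fraction of the scaled workload
    return n - (1.0 - p) * (n - 1)

# Example: a 90% parallel workload scaled onto 8 processors
print(gustafson_speedup(0.9, 8))   # 7.3
```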
Non-uniform Memory Access (NUMA)
NUMA is a computer memory design where the time taken to access memory depends on the memory's location relative to a processor. In parallel computing, optimally managing NUMA is key for maintaining high performance.
Efficiency
Efficiency evaluates how effectively parallel computing resources are used, giving insight into the overhead caused by parallelization. It is the speedup divided by the number of processors, expressed as E = S(N) / N.
Granularity
Granularity indicates the size of tasks into which a parallel computation is divided. Fine granularity means smaller tasks with more frequent communication, while coarse granularity means larger tasks with less frequent communication.
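A Python sketch of the trade-off using multiprocessing.Pool's chunksize: very fine-grained tasks (chunksize=1) spend most of their time on inter-process communication, while coarser chunks amortize that overhead. The sizes are illustrative.

```python
import multiprocessing as mp
import time

def tiny(x):
    return x * x                 # a very cheap, fine-grained task

if __name__ == "__main__":
    data = list(range(100_000))
    with mp.Pool(4) as pool:
        for chunk in (1, 5_000):
            t0 = time.perf_counter()
            pool.map(tiny, data, chunksize=chunk)   # chunk size sets the effective granularity
            dt = time.perf_counter() - t0
            print(f"chunksize={chunk}: {dt:.2f}s")
```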
Weak Scaling
Weak scaling measures how the solution time changes with the number of processors while increasing the problem size proportionally. The ideal weak scaling keeps the time constant as processors and problem size grow.
Strong Scaling
Strong scaling measures the reduction in solution time with the addition of processors without changing the total problem size. Ideal strong scaling would linearly reduce the time with each additional processor.
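A Python sketch that measures both regimes with a CPU-bound kernel: for strong scaling a fixed total problem is split across more workers, while for weak scaling each worker keeps the same share as the total problem grows. The kernel and problem sizes are illustrative.

```python
import multiprocessing as mp
import time

def work(n):
    s = 0
    for i in range(n):           # CPU-bound kernel: sum of squares
        s += i * i
    return s

def timed_run(nworkers, per_worker):
    with mp.Pool(nworkers) as pool:
        t0 = time.perf_counter()
        pool.map(work, [per_worker] * nworkers)
        return time.perf_counter() - t0

if __name__ == "__main__":
    base = 4_000_000
    for p in (1, 2, 4):
        strong = timed_run(p, base // p)   # fixed total problem, split across p workers
        weak = timed_run(p, base)          # problem grows with p; per-worker share stays fixed
        print(f"p={p}: strong {strong:.2f}s   weak {weak:.2f}s")
```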
Speedup
Speedup measures the performance improvement of a parallel algorithm compared to a serial algorithm. It's calculated as S(N) = T(1) / T(N), where T(1) is the run time of the serial algorithm and T(N) is the run time using N processors.
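A tiny Python sketch that turns measured run times into speedup and parallel efficiency using the formulas above; the timings are made up for illustration.

```python
def speedup(t_serial, t_parallel):
    # S(N) = T(1) / T(N)
    return t_serial / t_parallel

def parallel_efficiency(t_serial, t_parallel, n_procs):
    # E = S(N) / N; 1.0 would mean perfect scaling
    return speedup(t_serial, t_parallel) / n_procs

# Example: a run that took 100 s serially and 30 s on 4 processors
print(speedup(100.0, 30.0))                 # ~3.33
print(parallel_efficiency(100.0, 30.0, 4))  # ~0.83
```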
Parallel Overhead
Parallel overhead is the additional work introduced by parallelization, such as synchronization, communication, and non-uniform memory access (NUMA) effects, which reduces the net speedup achieved.