Introduction to High-Performance Computing (HPC)
Cluster Computing
Cluster computing refers to the use of multiple computers, typically commodity hardware, working together closely as a single system. These clusters are commonly used for HPC tasks.
Message Passing Interface (MPI)
MPI is a standardized and portable message-passing system designed to function on a wide variety of parallel computing architectures. Widely used in HPC, it lets processes with separate memory spaces exchange data, enabling complex communication patterns across distributed-memory systems.
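To make this concrete, here is a minimal MPI program in C in which each process reports its rank; it is a sketch that assumes an MPI implementation such as Open MPI or MPICH is installed, with the usual mpicc/mpirun toolchain.

```c
/* Minimal MPI sketch: every process prints its rank.
   Assumes an installed MPI implementation (e.g., Open MPI or MPICH). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);               /* start the MPI runtime */

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's ID */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total process count */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                       /* shut down the MPI runtime */
    return 0;
}
```

Compiled with mpicc hello.c -o hello and launched with mpirun -np 4 ./hello, four independent processes run the same program and identify themselves by rank.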
Parallel Computing
Parallel computing is a type of computation where many calculations or processes are carried out simultaneously. It's an essential aspect of HPC as it leverages multiple computing resources to solve computational problems efficiently.
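As a small sketch of the idea using shared-memory threads (the chunk sizes and names are illustrative; compile with -pthread), the following C program splits an array sum across four threads that compute simultaneously:

```c
/* Parallel sum with POSIX threads: each thread sums one chunk. */
#include <pthread.h>
#include <stdio.h>

#define N 1000000
#define NTHREADS 4

static double data[N];
static double partial[NTHREADS];

static void *sum_chunk(void *arg) {
    long t = (long)arg;                       /* thread index 0..3 */
    long lo = t * (N / NTHREADS), hi = lo + N / NTHREADS;
    double s = 0.0;
    for (long i = lo; i < hi; i++) s += data[i];
    partial[t] = s;                           /* publish partial result */
    return NULL;
}

int main(void) {
    for (long i = 0; i < N; i++) data[i] = 1.0;

    pthread_t threads[NTHREADS];
    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&threads[t], NULL, sum_chunk, (void *)t);

    double total = 0.0;
    for (long t = 0; t < NTHREADS; t++) {
        pthread_join(threads[t], NULL);       /* wait, then combine */
        total += partial[t];
    }
    printf("sum = %f\n", total);              /* expect 1000000.0 */
    return 0;
}
```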
Distributed Computing
Distributed computing involves a collection of separate, possibly heterogeneous systems that work together to solve a computational problem. It's important for HPC as it enables the collaborative use of geographically dispersed resources.
Supercomputers
Supercomputers are high-performance computing systems at the forefront of processing capability, particularly speed of calculation. They are essential for intensive tasks such as climate modeling, nuclear simulations, and other large-scale computations.
Performance Tuning in HPC
Performance tuning in HPC is the process of optimizing the configuration of HPC systems and applications to achieve the best possible performance. It requires an understanding of both the hardware and software aspects of the system.
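One classic software-side illustration is loop ordering for cache locality; in this sketch (the array size is arbitrary) both loop nests compute the same sum, but the first walks memory contiguously in C's row-major layout and typically runs several times faster:

```c
/* Loop-order tuning sketch: same arithmetic, different memory access. */
#include <stdio.h>

#define N 2048
static double a[N][N];

int main(void) {
    double s = 0.0;

    /* Cache-friendly: the inner loop touches contiguous memory. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];

    /* Cache-hostile: each access strides a full row apart. */
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];

    printf("%f\n", s);
    return 0;
}
```

Timing the two nests separately (e.g., with clock_gettime or a profiler) makes the cache effect visible; real tuning also covers compiler flags, vectorization, communication, and I/O.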
Scalability
Scalability in HPC refers to a system's ability to handle a growing amount of work by adding resources to the system. It is crucial for HPC systems to efficiently utilize more resources as problem sizes and demands increase.
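A standard way to quantify this is Amdahl's law: if a fraction p of a program can be parallelized across N processors, the best possible speedup is

    S(N) = 1 / ((1 - p) + p / N)

For example, with p = 0.95 and N = 100, S(100) = 1 / (0.05 + 0.0095) ≈ 16.8, so even a 5% serial fraction caps the speedup far below 100.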
What is HPC?
High-Performance Computing (HPC) refers to the practice of aggregating computing power in a way that delivers much greater performance than one could get out of a typical desktop computer or workstation, in order to solve large problems in science, engineering, or business.
CPU vs. GPU in HPC
In HPC, CPUs (Central Processing Units) are used for tasks requiring complex decision-making and flexibility, while GPUs (Graphics Processing Units) are specialized for highly parallel processing tasks. GPUs often provide significant speedups in HPC applications due to their many cores.
Cloud Computing and HPC
Cloud computing can provide HPC resources as a scalable, virtualized infrastructure that users access on demand. This is particularly significant for organizations that need occasional access to high-performance computing without the capital investment in hardware.
OpenMP
OpenMP is an open, directive-based standard for parallel programming in C, C++, and Fortran, enabling the development of parallel applications that are portable across multiple platforms. It is widely used in the HPC community, especially on shared-memory architectures.
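A minimal sketch of the programming model (compile with an OpenMP-capable compiler, e.g. gcc -fopenmp; the array and its contents are illustrative):

```c
/* OpenMP sketch: one pragma parallelizes the loop across cores. */
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    static double x[N];
    double sum = 0.0;

    /* Iterations are split among threads; the reduction clause merges
       each thread's private sum safely at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        x[i] = 1.0;
        sum += x[i];
    }

    printf("max threads: %d, sum = %f\n", omp_get_max_threads(), sum);
    return 0;
}
```

Setting the environment variable OMP_NUM_THREADS before running controls how many threads the loop uses.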
HPC Interconnects
Interconnects in HPC are the high-speed communication networks, such as InfiniBand, that link together the components of an HPC system. They allow rapid data transfer between nodes, which is essential for performance in parallel processing.
Job Scheduling in HPC
Job scheduling in HPC involves managing and dispatching compute jobs across the available resources. It's a critical component for ensuring optimal usage of HPC resources and maintaining system efficiency.
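As a concrete illustration, many HPC sites schedule jobs with Slurm; a minimal batch script might look like the following (the partition name and resource counts are placeholder assumptions, since every site configures these differently):

```bash
#!/bin/bash
#SBATCH --job-name=demo          # name shown in the queue
#SBATCH --nodes=2                # compute nodes requested
#SBATCH --ntasks-per-node=16     # MPI ranks per node
#SBATCH --time=00:30:00          # wall-clock limit (HH:MM:SS)
#SBATCH --partition=compute      # hypothetical partition name

srun ./hello                     # launch the program on all ranks
```

Submitted with sbatch, the script waits in the queue until the scheduler can allocate the requested resources.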
Energy Efficiency in HPC
Energy efficiency in HPC involves designing and operating HPC systems to consume the least amount of power for a given level of performance. It's important due to the high power requirements and environmental impacts of HPC systems.
Fault Tolerance in HPC
Fault tolerance in HPC refers to the ability of a system to continue operating properly in the event of the failure of some of its components. This is key for HPC systems as they often run critical applications where downtime is very costly.
Data Intensive Computing
Data-intensive computing focuses on the ability to handle, process, and manage large data sets efficiently, which is critical for data-driven applications in high-performance computing.
Checkpoint/Restart in HPC
Checkpoint/Restart is a technique used in HPC to save the current state of a running job so it can be resumed later. This is critical for long-running computations in case of system failures.
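A minimal application-level sketch of the technique in C (the file name and state layout are hypothetical; production HPC codes usually checkpoint through libraries or parallel I/O):

```c
/* Checkpoint/restart sketch: persist loop progress so a long run can
   resume after a failure instead of starting over. */
#include <stdio.h>

#define CKPT_FILE "state.ckpt"   /* hypothetical checkpoint file */

static void checkpoint(long next_step, double result) {
    FILE *f = fopen(CKPT_FILE, "wb");
    if (!f) return;
    fwrite(&next_step, sizeof next_step, 1, f);
    fwrite(&result, sizeof result, 1, f);
    fclose(f);
}

static int restore(long *next_step, double *result) {
    FILE *f = fopen(CKPT_FILE, "rb");
    if (!f) return 0;            /* no checkpoint: start fresh */
    int ok = fread(next_step, sizeof *next_step, 1, f) == 1 &&
             fread(result, sizeof *result, 1, f) == 1;
    fclose(f);
    return ok;
}

int main(void) {
    long step = 0;
    double result = 0.0;
    restore(&step, &result);                 /* resume if possible */

    for (; step < 1000000; step++) {
        result += 1e-6;                      /* stand-in for real work */
        if (step % 100000 == 0)
            checkpoint(step + 1, result);    /* save the next step to run */
    }
    printf("result = %f\n", result);
    return 0;
}
```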
HPC Storage Systems
Storage systems in HPC are designed to handle massive amounts of data and provide rapid access to it. They often include parallel file systems, such as Lustre or GPFS, that serve many compute nodes at high speed simultaneously.
Grid Computing
Grid Computing is a distributed system that combines computer resources from multiple administrative domains to reach a common goal. It's important in HPC for tackling large-scale tasks that cannot be handled by a single institution or domain.
Exascale Computing
Exascale computing refers to computing systems capable of at least one exaFLOP, or a billion billion (10^18) calculations per second. Reaching exascale is a major goal for the HPC community, as it allows the solving of complex and previously intractable problems.