12 Mar 2018

OpenMP && MPI introduction

Introduction to parallel programming

[ OpenMP   ]


Therefore there can be three layers of parallelism in a single program:

Single thread processing multiple data;

multiple threads running simultaneously;

and multiple devices running same program simultaneously.

Team: A team is the group of threads executing the program, To create a new team of threads, you need to specify the parallel keyword

SIMD parallelism (Single-Instruction, Multiple-Data).

specify the thread used: num_threads(2)

Therefore, lastprivate cannot be used to e.g. fetch the value of a flag assigned randomly during a loop. Use reduction for that, instead.

The declare reduction directive generalizes the reductions to include user-defined reductions.


Offloading means that parts of the program can be executed not only on the CPU of the computer itself, but also in other hardware attached to it, such as on the graphics card.

You can acquire device numbers by using the library functions, such as omp_set_default_device, omp_get_default_device, omp_get_num_devices, and omp_is_initial_device.


The flush directive can be used to ensure that the value observed in one thread is also the value observed by other threads.

Thread synchronization

The barrier directive causes threads encountering the barrier to wait until all the other threads in the same team have encountered the barrier.

Coding guide

#ifdef _OPENMP






A Node is a worker machine in Kubernetes and may be either a virtual or a physical machine, depending on the cluster


A Pod is a Kubernetes abstraction that represents a group of one or more application containers (such as Docker or rkt), and some shared resources for those containers.

CaaS (Container as a Service)