Balanced Affinity Type

24 Jun

https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-8FCD3720-6F73-429C-AE65-7144ED0B991A.htm

performance:

https://www.pugetsystems.com/labs/articles/Haswell-Floating-Point-Performance-493/

https://www.microway.com/hpc-tech-tips/intel-xeon-e5-2600-v3-haswell-processor-review/

http://www.hpcwire.com/2014/09/08/intel-haswell-xeon-e5s-aimed-squarely-hpc/

This topic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).

The affinity type balanced is particularly useful on the Intel® MIC Architecture. It is not supported for the CPU.

Under this setting, the OpenMP* runtime places threads on separate cores until all cores have at least one thread, similar to the scatter type. However, when the runtime must use multiple hardware thread contexts on the same core, the balanced type ensures that the OpenMP* thread numbers are close to each other, which scatter does not do.

The following diagrams illustrate the allocation of six OpenMP* threads across a 3-core system with the compact, scatter, and balanced affinity types.

Figure: Allocation with the compact affinity type

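On the Intel® MIC Architecture, it is normally beneficial to use every core before placing a second thread on any core. The compact affinity type is therefore unlikely to yield the best results, because it fills the hardware thread contexts of one core before moving to the next and can leave cores unused.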
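As a rough illustration of the compact placement, the following Python sketch assigns thread IDs to cores by filling one core completely before moving to the next. It is a simplified model, not the OpenMP* runtime's implementation: the function name compact_allocation is hypothetical, and the value of four hardware thread contexts per core is an assumption made for the example.

```python
def compact_allocation(num_threads, num_cores, contexts_per_core):
    """Simplified model of compact placement: fill each core's hardware
    thread contexts before using the next core."""
    if num_threads > num_cores * contexts_per_core:
        raise ValueError("more threads than hardware thread contexts")
    cores = [[] for _ in range(num_cores)]
    for tid in range(num_threads):
        # Thread IDs 0..contexts_per_core-1 land on core 0, and so on.
        cores[tid // contexts_per_core].append(tid)
    return cores


# Six threads on three cores, assuming four contexts per core.
print(compact_allocation(6, 3, 4))  # [[0, 1, 2, 3], [4, 5], []]
```

In this model the third core receives no threads at all, which is the drawback described above.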

Figure: Allocation with the scatter affinity type

The thread allocation under scatter is likely to be better than under compact, because it uses cores before threads. However, scatter assigns thread IDs so that threads with numerically close IDs are on different cores and therefore do not share caches. Because threads with neighboring IDs often operate on closely related data, placing them on different cores is unlikely to be the best allocation.
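The scatter placement can be sketched in the same simplified way: thread IDs are dealt out to the cores round-robin. The function name scatter_allocation is hypothetical and the model ignores everything except the core a thread lands on.

```python
def scatter_allocation(num_threads, num_cores):
    """Simplified model of scatter placement: deal thread IDs out to the
    cores one at a time, wrapping around after the last core."""
    cores = [[] for _ in range(num_cores)]
    for tid in range(num_threads):
        cores[tid % num_cores].append(tid)
    return cores


# Six threads on three cores.
print(scatter_allocation(6, 3))  # [[0, 3], [1, 4], [2, 5]]
```

Every core is used, but threads 0 and 1 sit on different cores, so neighboring thread IDs do not share a core's caches.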

Figure: Allocation with the balanced affinity type

Under balanced, the threads are spread evenly over the cores, and the threads allocated to a core have neighboring IDs. Cache utilization should therefore be efficient if those threads access data that is close together in memory.

Figure: Allocation with the balanced affinity type for 9 OpenMP* threads
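The balanced placement for six and for nine threads on three cores can be modeled as contiguous blocks of thread IDs, one block per core. This is a sketch under stated assumptions rather than the runtime's algorithm: the function name balanced_allocation is hypothetical, and the way a remainder is handed to the first cores when the thread count is not a multiple of the core count is an assumption of this model.

```python
def balanced_allocation(num_threads, num_cores):
    """Simplified model of balanced placement: give each core a contiguous
    block of thread IDs, with block sizes differing by at most one."""
    base, extra = divmod(num_threads, num_cores)
    cores = []
    next_tid = 0
    for core in range(num_cores):
        # Assumption: the first `extra` cores take one additional thread.
        size = base + (1 if core < extra else 0)
        cores.append(list(range(next_tid, next_tid + size)))
        next_tid += size
    return cores


print(balanced_allocation(6, 3))  # [[0, 1], [2, 3], [4, 5]]
print(balanced_allocation(9, 3))  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```

Compared with the round-robin scatter placement, every core is still used, but threads with adjacent IDs now end up on the same core.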

Tuning affinity is a complicated and machine-specific process. Using the balanced affinity type is a reasonable starting point on the Intel® MIC Architecture.

The balanced affinity type is supported and recognized by the OpenMP* runtime on the Intel® MIC Architecture only. It is not supported and is ignored for the CPU.

When MIC_ENV_PREFIX is not set, setting the environment variable KMP_AFFINITY to balanced propagates the setting to the coprocessor. However, it also generates the following runtime warning on the CPU:

coprocessor OMP: Warning #58: KMP_AFFINITY: parameter invalid, ignoring "balanced"

To set the balanced affinity type for the Intel® MIC Architecture environment only, assign a prefix with MIC_ENV_PREFIX=prefix and then set prefix_KMP_AFFINITY to balanced.
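One way to apply the prefixed setting from a launcher script is to build the environment in Python before starting the program. This is a minimal sketch: the helper name offload_environment, the prefix PHI, and the executable name in the usage comment are all hypothetical, and only the variable names MIC_ENV_PREFIX and prefix_KMP_AFFINITY come from the text above.

```python
import os


def offload_environment(prefix, affinity="balanced", base_env=None):
    """Return a copy of the environment with the prefixed affinity setting.

    MIC_ENV_PREFIX names the prefix, and <prefix>_KMP_AFFINITY carries the
    value intended for the coprocessor only.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MIC_ENV_PREFIX"] = prefix
    env[f"{prefix}_KMP_AFFINITY"] = affinity
    return env


# "PHI" is an arbitrary example prefix, not a required value.
env = offload_environment("PHI", base_env={})
print(env)  # {'MIC_ENV_PREFIX': 'PHI', 'PHI_KMP_AFFINITY': 'balanced'}

# Hypothetical usage, assuming an offload executable named ./my_offload_app:
#   import subprocess
#   subprocess.run(["./my_offload_app"], env=offload_environment("PHI"))
```

Because the unprefixed KMP_AFFINITY is left untouched, the CPU runtime never sees the balanced value and has no reason to issue the warning.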
