I have been using the PSTL implementation in the Intel 18 Parallel Studio XE beta, which works with both libc++ and libstdc++ and is built on TBB. I have tested for_each and transform, but nothing else.
Update: Intel PSTL is open source (https://github.com/intel/parallelstl) and works with GCC and Clang.
Since PSTL support is limited, I make the code portable through the preprocessor:
    #if defined(USE_PSTL) && defined(__INTEL_COMPILER) && (__INTEL_COMPILER >= 1800)
      std::for_each( pstl::execution::par, std::begin(range), std::end(range), [&] (int i) {
        std::for_each( pstl::execution::par_unseq, std::begin(range), std::end(range), [&] (int j) {
    #elif defined(USE_PSTL) && defined(__GNUC__) && defined(__GNUC_MINOR__) \
          && ( (__GNUC__ == 8) || (__GNUC__ == 7) && (__GNUC_MINOR__ >= 2) )
      __gnu_parallel::for_each( std::begin(range), std::end(range), [&] (int i) {
        __gnu_parallel::for_each( std::begin(range), std::end(range), [&] (int j) {
    #else
    #warning Parallel STL is NOT being used!
      std::for_each( std::begin(range), std::end(range), [&] (int i) {
        std::for_each( std::begin(range), std::end(range), [&] (int j) {
    #endif
          B[i*order+j] += A[j*order+i];
          A[j*order+i] += 1.0;
        });
      });
    }
You can see that this code links against libc++ on Mac, while the PSTL part itself comes from Intel's headers, which in turn use TBB as the runtime.
    $ otool -L transpose-vector-pstl
    transpose-vector-pstl:
        @rpath/libtbb.dylib (compatibility version 0.0.0, current version 0.0.0)
        /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 307.5.0)
        @rpath/libiomp5.dylib (compatibility version 5.0.0, current version 5.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.60.2)
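The preprocessor switch shown earlier can also be confined to a single helper so that the loop body is written only once. The following is a minimal sketch, not the code I actually run: it assumes a C++17 standard library that ships `<execution>` (with `std::execution::par`) instead of the `pstl::execution` namespace from the Intel headers, and `USE_STD_PAR`, `HAVE_STD_PAR`, `for_each_maybe_par` and `transpose_add` are hypothetical names introduced only for illustration. Without `USE_STD_PAR` it falls back to the serial `std::for_each`.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// USE_STD_PAR is a hypothetical opt-in macro for this sketch.
// The parallel path is only enabled if <execution> is actually available.
#if defined(USE_STD_PAR) && defined(__has_include)
#  if __has_include(<execution>)
#    include <execution>
#    define HAVE_STD_PAR 1
#  endif
#endif

// Single place where the parallel/serial decision is made.
template <class It, class F>
void for_each_maybe_par(It first, It last, F f) {
#if defined(HAVE_STD_PAR)
    std::for_each(std::execution::par, first, last, f);
#else
    std::for_each(first, last, f);
#endif
}

// B += transpose(A); A += 1.0, for square matrices of size order x order.
// Every (i, j) pair touches a distinct element of A and of B,
// so the iterations are independent of each other.
void transpose_add(int order, std::vector<double>& A, std::vector<double>& B) {
    std::vector<int> range(order);
    std::iota(range.begin(), range.end(), 0);  // 0, 1, ..., order-1

    for_each_maybe_par(range.begin(), range.end(), [&](int i) {
        for_each_maybe_par(range.begin(), range.end(), [&](int j) {
            B[i * order + j] += A[j * order + i];
            A[j * order + i] += 1.0;
        });
    });
}
```

Depending on the standard library, enabling the parallel path may also require linking against TBB, just as the otool output above shows for the Intel headers.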
Full disclosure: I work for Intel and discuss this implementation with its developers, although I am not responsible for it in any way.