Intel Threading Building Blocks (Intel TBB)
Widely used C++ template library for task parallelism
- Rich set of components to efficiently implement higher-level, task-based parallelism
- Future-proof applications to tap multicore and many-core power
- Compatible with multiple compilers and portable to various operating systems
Simplify Parallelism with a Scalable Parallel Model
Intel® Threading Building Blocks (Intel® TBB) 4.2 is a widely used, award-winning C++ template library for creating high-performance, scalable parallel applications.
- Enhance Productivity and Reliability – Rich set of components to efficiently implement higher-level, task-based parallelism
- Gain Performance Advantage Today and Tomorrow – Future-proof applications to tap multicore and many-core power
- Fits Within Your Environment – Advanced threading library, compatible with multiple compilers and portable to various operating systems
“Intel® TBB provided us with optimized code that we did not have to develop or maintain for critical system services. I could assign my developers to code what we bring to the software table—crowd simulation software.”
Michaël Rouillé, CTO, Golaem
The flow graph feature provides a flexible and convenient API for expressing static and dynamic dependencies between computations. It is customizable for a wide variety of problems. It also extends the applicability of Intel® TBB to event-driven/reactive programming models.
Intel® TBB delivers high-performing, reliable code with less effort than hand-written threading. Pre-tested algorithms, concurrent containers, synchronization primitives, and a scalable memory allocator simplify parallel application development.
Dynamic Task Scheduler
By expressing work as abstract tasks rather than threads, applications can automatically gain performance as processor core counts increase. The Intel® TBB task scheduler dynamically maps tasks to threads to balance the load among available cores, preserve cache locality, and maximize parallel performance. The implementation supports C++ exceptions, task and task group priorities, and cancellation, all of which are essential for large and interactive parallel C++ applications.
The dynamic task scheduler and the parallel algorithms support nested and recursive parallelism, as well as running parallel constructs side by side. This is useful for introducing parallelism gradually, and it lets different components of an application implement parallelism independently.
Cross Platform Support and Composability
Organizations that require cross-platform support today or anticipate needing it in the future should consider Intel® TBB. It is validated and commercially supported on Windows*, Linux*, and OS X* platforms, using multiple compilers. It is also available on FreeBSD*, IA-based Solaris*, and PowerPC*-based systems via the open source community. Intel® TBB is optimized for multicore architectures and the Intel® Xeon Phi™ coprocessor.
Intel® TBB is designed to co-exist with other threading packages and technologies. Different components of Intel® TBB can be used independently and mixed with other threading technologies.
Organizations can expand their customer base by using a production-ready, open solution for parallelism that is available on a broad range of platforms.
Top Community Support
Broad support from an involved community gives developers access to additional platforms and operating systems. Intel® Premier Support services and Intel® Support Forums provide confidential support, technical notes, application notes, and the latest documentation.
A complete documentation package and code samples are readily available both as part of the Intel® TBB installation and online at http://threadingbuildingblocks.org. The User Guide provides an introduction to Intel® TBB. The Design Patterns chapter in the User Guide covers common parallel programming patterns and how to implement them using Intel® TBB. The Reference Manual contains formal descriptions of all classes and functions implemented in Intel® TBB.
|Support for Latest Intel Architectures||Take advantage of the newest features in Intel’s latest processors including Transactional Synchronization Extensions (TSX). Adds support for Intel® Xeon Phi™ coprocessor for Windows and Intel® Xeon™ Processor (Ivy Bridge-EP). Selecting the best models for your application today will set a path for you to take full advantage of multicore and many-core performance without re-writing your code. Start today by implementing parallelism for today’s architecture and be ready for future architectures.|
|Lower memory overhead||Improved heuristics in the memory allocator reduce memory overhead by intelligently releasing unused or stale memory.|
|Improved handling of large memory requests||Improved handling of large memory requests (larger than 8 KB, up to 128 MB) gives better performance for applications that frequently allocate large blocks. Use of big memory pages can now be explicitly enabled via a function call or environment variable.|
|Better Fork Support||Fork safety through a user-enabled API that ensures Intel® TBB worker threads have finished before a fork is executed.|
|PPL* Compatibility||Improved compatibility with the Parallel Patterns Library (PPL) by adding the concurrent_unordered_multimap and concurrent_unordered_multiset APIs.|
|Windows* Store||Customers that use Intel® TBB in their applications can now submit and sell their apps through the Windows Store.|
|Android* OS support||The Android OS is now supported as a target operating system for improved application performance and power efficiency. See Beacon Mountain for more Android developer tool details.|
Intel® TBB 4.2 Pre-Tested Capabilities
|Generic implementation of common parallel performance patterns||Generic implementations of parallel patterns such as parallel loops, flow graphs, and pipelines can be an easy way to achieve a scalable parallel implementation without developing a custom solution from scratch.|
|Generic implementation of common idioms for concurrent access||Intel® TBB 4.2 concurrent containers are a concurrency-friendly alternative to serial data containers. Serial data structures (such as C++ STL containers) often require a global lock to protect them from concurrent access and modification; Intel® TBB concurrent containers allow multiple threads to concurrently access and update items in the container, increasing concurrency and improving an application’s scalability.|
|Exception-safe locks, condition variables, and atomic operations||Intel® TBB 4.2 provides a comprehensive set of synchronization primitives with different qualities that are applicable to common synchronization strategies. The exception-safe implementation of locks helps to avoid deadlock in programs that use C++ exceptions. Using Intel® TBB atomic variables instead of the C-style atomic API minimizes potential data races.|
|Scalable Memory Allocators: scalable memory manager and false-sharing-free memory allocator||The scalable memory allocator avoids scalability bottlenecks by minimizing access to a shared memory heap via per-thread memory pool management. Special management of large (>= 8 KB) blocks allows more efficient resource usage, while still offering scalability and competitive performance. The cache-aligned memory allocator avoids false sharing by not allowing allocated memory blocks to split a cache line.|
|Create arbitrary task trees||When an algorithm cannot be expressed with high-level Intel® TBB 4.2 constructs, the user can choose to create arbitrary task trees. Tasks can be spawned for better locality and performance or enqueued to maintain FIFO-like order and ensure starvation-resistant execution.|
|Conditional Numerical Reproducibility||Ensure deterministic associativity for floating-point arithmetic results with the new Intel® TBB template function ‘parallel_deterministic_reduce’.|
|C++11 Support||Intel® TBB can be used with C++11 compilers and supports lambda expressions. For developers using parallel algorithms, lambda expressions reduce the time and code needed by removing the requirement for separate function objects or classes.|
Scalability with Future-proofing
- Intel® TBB provides a simple and rapid way of developing robust parallel applications that abstracts platform details and threading mechanisms for performance that scales with increasing core counts
- Intel® Threading Building Blocks yields linear scaling in these example applications
Select the right Intel® TBB license
- Commercial Binary Distribution for customers who may require commercial support services. Attractive pricing available for academic, student and classroom usage.
- Open Source Distribution can be used under GPLv2 with the runtime exception allowing usage in proprietary applications. Allows support for additional OSs and hardware platforms. Both source and binary forms are available for download from http://threadingbuildingblocks.org.
- Custom license available if you require the ability to modify or distribute the commercial source code of Intel® TBB. Contact your Intel representative for more information.