Intel tools harness the true promise of multicore chips

analysis
Sep 16, 2010

Intel Parallel Building Blocks make it easier for C and C++ developers to take advantage of multicore processors

The gigahertz wars are long over. Now and for the foreseeable future, CPU performance gains will come not from increasing clock speeds but from packing ever more processor cores onto chip dies. Today’s fast six-core processors are nothing compared to the megamulticore designs chipmakers have in the pipeline, and dual-core chips have even begun making their way into phones and other mobile devices.

But there’s a problem. Multicore chips achieve performance gains through parallelism, but parallelism in software doesn’t come for free. Before an application can take advantage of today’s multiprocessor architectures, developers must build support for parallel task and data management at the lowest levels. Unfortunately, many of today’s developers learned their craft at a time when this level of multiprocessing was limited to the rarefied world of supercomputing. They simply lack the skills necessary to build reliable and effective parallel software.


Little wonder, then, that Intel has made tools and support for parallel software development such a priority in recent years. At the Intel Developer Forum 2010 conference, which took place this week in San Francisco, the chipmaker unveiled not one but three new technologies aimed at allowing developers to take better advantage of multiprocessing on Intel architecture. Collectively, the three tools are known as Intel Parallel Building Blocks, and they’re available today as part of Intel Parallel Studio 2011, which shipped earlier this month. But will Intel’s efforts really be a boon to developers, or will they serve mainly to widen the gap between Intel and its rivals, including AMD, Via, and ARM?

Three paths to parallelism

Intel isn’t alone in tackling parallel software development, but different companies have approached the problem in different ways. Some, including Google and Sun Microsystems, have taken the route of building entire new programming languages around parallelism. This allows developers to enter this new world with a clean slate, but it also increases their learning curve. All of Intel’s tools, on the other hand, work with plain old C and C++.

Well, almost. The first of Intel’s three new tools, Cilk Plus (pronounced “silk”), makes it easier to add fundamental parallel features to programs by introducing new keywords to the C++ language itself. For example, the new “cilk_for” keyword generates loops that are automatically parallelized and managed by a task scheduler, while a new array notation marks data as a better target for SIMD instruction sets (the MMX and SSE technologies on Intel processors). Because the syntax of the new keywords is essentially the same as traditional C++ syntax, developers can easily bring parallelism to their programs and remove it again simply by swapping out a few keywords, which makes debugging much less painful. All of the low-level code required to enable parallelism is generated by the compiler.
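As a rough sketch of that keyword swap, the loop below builds serially on an ordinary compiler and in parallel on one that supports Cilk Plus. The function name and the fallback guard are our own illustration, not Intel sample code:

```cpp
#include <cstddef>

// A Cilk Plus-enabled compiler defines __cilk, so pull in the real
// keyword there; elsewhere, fall back to a plain serial loop so the
// sketch still builds. The fallback mirrors the approach of Intel's
// cilk_stub header.
#ifdef __cilk
#include <cilk/cilk.h>
#else
#define cilk_for for
#endif

// Multiply every element of a[] by k. Changing the single keyword
// "for" to "cilk_for" hands the iterations to the task scheduler;
// changing it back restores serial execution for debugging.
void scale(float *a, std::size_t n, float k) {
    cilk_for (std::size_t i = 0; i < n; ++i)
        a[i] *= k;
}
```

Either way the loop body is identical, which is exactly the debugging story the keyword approach is meant to enable.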

The second tool, known as Threading Building Blocks (TBB), takes a more traditional approach. Its syntax will still feel familiar to C++ programmers, but rather than adding keywords to the language, it provides parallel capabilities in the form of a C++ template library. To enable parallelism, developers need only swap out Standard Template Library (STL) data types for the corresponding types from TBB. TBB also includes a task scheduler and a scalable memory allocator designed to support parallelism.
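A minimal sketch of the template-library style follows. The function name is ours, and the __has_include guard lets the same code build serially where TBB’s headers aren’t installed:

```cpp
#include <cstddef>
#include <vector>

// If TBB's headers are present, use its parallel_for; otherwise run
// the same loop body serially. Only the loop wrapper changes, which
// is the point of the template-library approach.
#if defined(__has_include)
#  if __has_include(<tbb/parallel_for.h>)
#    include <tbb/blocked_range.h>
#    include <tbb/parallel_for.h>
#    define HAVE_TBB 1
#  endif
#endif

// Square every element of v in place.
void square_all(std::vector<int> &v) {
#ifdef HAVE_TBB
    // TBB splits the index range into chunks and lets its task
    // scheduler distribute them across cores.
    tbb::parallel_for(tbb::blocked_range<std::size_t>(0, v.size()),
        [&](const tbb::blocked_range<std::size_t> &r) {
            for (std::size_t i = r.begin(); i != r.end(); ++i)
                v[i] *= v[i];
        });
#else
    for (std::size_t i = 0; i < v.size(); ++i)
        v[i] *= v[i];
#endif
}
```

Note that the caller is untouched either way; only the loop machinery changes, just as swapping an STL container for a TBB one leaves the surrounding code alone.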

The third tool is called Array Building Blocks (ArBB), and it’s perhaps the most ambitious of the three. ArBB defines a new, domain-specific language that can be embedded in C/C++ code. At runtime, ArBB instructions are compiled by a JIT (just-in-time) compiler that automatically generates the appropriate machine instructions for the underlying hardware, including its SIMD processing units. The result is parallel software that not only adapts to the capabilities of each processor but is also “future proof” against new chip designs and instruction sets. ArBB is technically just entering beta as of this month; it’s based on Intel’s Ct research combined with technology the chipmaker acquired from RapidMind last August.

Are Parallel Building Blocks for everyone?

So why three technologies instead of just one? Intel Parallel Studio actually includes more tools than just these, but the short answer is that while all three Intel Parallel Building Blocks can be used together in the same program, each might be more appropriate for different circumstances. TBB might be the easiest way for game developers to add parallelism to their existing code, for example, while ArBB’s domain-specific language might be better suited to programmers who are comfortable implementing algorithms in a more mathematical way, as in the finance industry.

Whatever your poison, the goal of all three Building Blocks is to make it easier to write reliable, scalable, bug-free parallel programs with as few lines of code as possible. Intel’s aim is to make writing parallel code as easy as writing traditional software. And that’s a good thing, because the trend toward parallelism is only going to increase, particularly as Intel moves toward its upcoming Many Integrated Core (MIC) architecture.

If there’s one troubling aspect to Intel’s approach, however, it’s the conspicuous absence of any talk of support for competing chips and architectures, including those from AMD, Via, and ARM. Intel has been accused in the past of using its dominant position in the PC chip industry to stifle competitors. Not long ago, it was claimed that Intel’s C/C++ compilers produced inferior code for AMD and Via chips (and you’ll need Intel’s compilers to take advantage of Cilk Plus).

If Intel hopes to leverage its Parallel Studio tools to gain further advantage over its competitors, that will be a shame. Competition in the microprocessor industry is a good thing, and developers are likely to want to continue to target multiple architectures for the foreseeable future — particularly ARM, which remains the dominant architecture for mobile devices. Let’s hope Intel doesn’t squander the potential of these tools and techniques — and further limit the growth of parallel programming — by limiting them to a walled garden of its own products.

This article, “Intel tools harness the true promise of multicore chips,” originally appeared at InfoWorld.com. Read more of Neil McAllister’s Fatal Exception blog and follow the latest news in programming at InfoWorld.com.