AMD's dual-core server/workstation CPU passes our SPEC tests

At Opteron’s two-year anniversary gathering in New York on April 21, 2005, AMD rolled out its first dual-core Opteron CPU. Not only do the new chips turn dual-processor workstations and servers into four-processor workhorses, but the upgrade path for customers with existing Opteron systems redefines “painless.” You merely pop the dual-core parts into your current Opteron server, and they will run the same software you’re running now. Scale-up, a cost-effective internal upgrade path previously reserved for larger systems, is now within reach of entry-level server buyers.

“Dual core” refers to the placement of a second CPU core on the same physical chip. The two cores are full-fledged Opterons sitting side by side on a chip that’s exactly the same size as the single-core Opteron. Dual-core systems will be most attractive to those who have an eight-cylinder appetite but only a four-cylinder budget. But dual core is no bargain if it shortchanges customers on performance. AMD told us that the second core delivers a 70 percent to 90 percent performance improvement to multiprocessor applications. If AMD’s claims prove accurate, then dual core will be a good investment even for those who can afford quad-processor machines.

Never one to take a vendor’s word for anything, I built a reference system from the new Opteron components and began running it through a battery of standardized performance tests. AMD supplied me with a pair of production dual-core Opteron 875 2.2GHz CPUs and an Opteron workstation motherboard from Tyan, the Thunder K8WE model S2895. My testing continues as this article goes to press, but enough results are in to provide a snapshot of the performance of the dual-core Opteron relative to its single-core cousin.

The nitty-gritty

For benchmark tests I chose an old favorite: the CPU2000 suite from Standard Performance Evaluation Corporation (SPEC).
The benchmarks were compiled using Intel’s compilers for EM64T (Extended Memory 64 Technology) and were run on Windows Server 2003 Enterprise x64 Edition. The SPEC software is mature and well-organized, and it produces consistent results across platforms. Past test results submitted by vendors are open to public scrutiny at spec.org, and to SPEC’s credit, full disclosure of the testing conditions is required.

CPU2000’s two components, SPECint2000 and SPECfp2000, measure integer and floating-point performance, respectively. Integer tests exercise system calls, application performance, memory management, and OS scheduling efficiency more than they show off how fast your machine multiplies whole numbers. In contrast, the floating-point benchmarks are all about pushing your CPUs to the edge of their performance and environmental (power, cooling, and noise) limits.

The tests referenced in the table are subsets of SPEC’s SPECint_rate2000 and SPECfp_rate2000 tests. Rate tests launch multiple simultaneous processes (ideally, and in this case, one benchmark process per core) to see how smoothly a system scales to handle a rising workload. If a system scaled perfectly, doubling the number of processes (or cores) would make it capable of handling twice as much work with no degradation in performance, producing CPU2000 rate test results roughly double those of the original configuration. Although incomplete, the numbers (57.9 for SPECint_rate2000 and 62.8 for SPECfp_rate2000) speak for themselves, showing improvements of 85 percent and 64 percent, respectively, over the results for single core. In short, AMD is on the level with its projected performance.

On the integer (general computing) side, dual core comes closer to ideal scale-up than I imagined possible. I’m not surprised by the lower floating-point boost, given that floating-point-intensive applications respond comparatively poorly to automatic optimization unless they’re tuned by hand.
The floating-point results are nevertheless impressive, and both sets of results show a gain in computing capacity I expect is unique for the money. I don’t, however, settle for expectations. I’m already working on getting the rest of the CPU2000 suite built and tuned.

How do these numbers compare with results for other processors? AMD’s best submitted SPEC results to date came from a 2.6GHz single-core dual-processor system. For SPECint_rate2000, the older Opteron system scored 40.5; for SPECfp_rate2000 its score was 38.6. For both of these tests the OSes were 32-bit versions of Windows XP Professional or Windows Server 2003. All used Intel’s C++ and Fortran compilers.

Intel hasn’t submitted tests for any 64-bit Xeon server CPUs, so I looked for the best showing among recently published results for the 32-bit Xeon. The closest match I found was a Dell dual-processor, single-core, 32-bit Xeon system running at 3.6GHz. The Xeon CPUs in this system had 2MB of Level 2 cache each, compared with Opteron’s 1MB, and the disparity in clock speed (3.6GHz vs. 2.2GHz) should also be noted. Both factors lead to higher system pricing relative to Opteron, and to higher power consumption, heat, and noise. Those factors aside, the 3.6GHz Xeon’s SPECint_rate2000 score was 40.5, and its SPECfp_rate2000 score was 31.7.

Take that, Intel

Complete test results are still to come, but it’s not too soon to know the impact dual-core Opteron should have on customers who have placed their bets on Intel’s Xeon. To be blunt, Intel’s claim of technological parity with AMD is an easily penetrated smoke screen.

If all other technical differences were set aside, AMD’s on-chip memory controllers, dedicated memory banks for each processor, and independent I/O channels among CPUs would decide the battle with Intel’s EM64T.
All these features are unique to AMD’s x64 system implementation, and we’ll see them exploited by major commercial OSes and development tools, including Windows Server 2003 x64 Editions and Visual Studio 2005 (currently in beta).

Until Intel delivers its dual-core Xeon, its competitive technology remains Hyper-Threading. Hyper-Threading is an ingenious stopgap that Intel developed to boost the performance of a narrow range of applications, namely those that do their background processing via lightweight threads instead of the coarse processes that dominate software design. Neither the CPU nor the OS prevents threads from attempting to access the same resource, which makes the sloppy use of threads one of the leading enemies of software stability. But even with applications that use many threads, Hyper-Threading delivers, even by Intel estimates, a maximum of about 30 percent improvement to an application’s performance. And these benefits are limited to heavily threaded applications; Hyper-Threading does not speed up the entire system. In fact, with most systems running a mix of threads and processes, Hyper-Threading can harm performance; a scan of Intel’s SPEC benchmarks reveals that Xeon system vendors often disable Hyper-Threading to improve Xeon’s results.

In contrast to Xeon’s Hyper-Threading, dual-core Opteron is optimized to accelerate the performance of all applications on a system in a fairly uniform fashion. Dual core is an efficient alternative to the common practice of scale-up, which provides a total speed improvement by enabling the server to divide its workload across what amounts to several tightly connected computers. A dual-processor, dual-core Opteron system allows an operating system to distribute its total workload across the logical equivalent of four discrete physical CPUs.

CPU futures

As with all benchmarks, the results are meaningful only when applied in context, and the only context available here is 32-bit operation.
These test results reflect both the advantages and the shortcomings of Microsoft’s first release of Windows XP Professional x64, as well as the effectiveness of dual core. But with all other things being equal (and they are, given that AMD has packaged dual-core Opteron as a chip upgrade and Microsoft is swapping 32-bit Windows licenses for 64-bit editions), dual-core Opteron doesn’t merely leave its own predecessor in the dust. It sets a mark for Intel that AMD is certain can’t be met before AMD moves on to better things such as faster HyperTransport buses, still more cores per chip, faster memory, and so on.

Opteron is as close to future-proof as any entry-level 64-bit server or workstation architecture can be. Systems purchased today should still be operating five to seven years from now, with only incremental upgrades through CPU swaps and higher-density RAM (1GB per module instead of today’s 512MB). The dual-core Opteron systems hitting the market this week won’t merely be the best investment in the PC server and workstation spaces. They’re the best bet if you want systems that you won’t have to replace for a long, long time.
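
For readers who want to check the arithmetic behind the comparisons in this article, here is a minimal Python sketch. It uses only the scores and percentages quoted above. The helper names `percent_gain` and `scaling_efficiency` are hypothetical, not part of any SPEC tooling, and "efficiency" is assumed here to mean the measured speedup divided by the ideal 2x from doubling the cores. Because the dual-core numbers come from a subset of the suite running on a different OS, the cross-system percentages should be read as rough indications only.

```python
# Scores quoted in the article (SPEC CPU2000 rate tests; higher is better).
DUAL_CORE_OPTERON = {"SPECint_rate2000": 57.9, "SPECfp_rate2000": 62.8}
OLDER_OPTERON_2_6GHZ = {"SPECint_rate2000": 40.5, "SPECfp_rate2000": 38.6}
XEON_3_6GHZ = {"SPECint_rate2000": 40.5, "SPECfp_rate2000": 31.7}

# Improvement over single core as reported in the article (85% and 64%).
REPORTED_GAIN = {"SPECint_rate2000": 0.85, "SPECfp_rate2000": 0.64}


def percent_gain(new_score, old_score):
    """Percentage by which new_score exceeds old_score."""
    return (new_score / old_score - 1.0) * 100.0


def scaling_efficiency(gain_fraction, core_multiplier=2):
    """Assumed definition: measured speedup divided by ideal speedup."""
    return (1.0 + gain_fraction) / core_multiplier


if __name__ == "__main__":
    for test, score in DUAL_CORE_OPTERON.items():
        eff = scaling_efficiency(REPORTED_GAIN[test])
        vs_opteron = percent_gain(score, OLDER_OPTERON_2_6GHZ[test])
        vs_xeon = percent_gain(score, XEON_3_6GHZ[test])
        print(f"{test}: {eff:.1%} of ideal 2x scaling, "
              f"{vs_opteron:.0f}% over the 2.6GHz Opteron, "
              f"{vs_xeon:.0f}% over the 3.6GHz Xeon")
```

Dividing speedup by the core multiplier is one common way to express scale-up; a stricter reading would compare the gain itself (0.85) against the ideal gain (1.0), which yields lower figures for the same data.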