paul_venezia
Senior Contributing Editor

Do the virtualization math: When four CPUs aren’t four CPUs

analysis
Oct 8, 2012 | 5 mins

Four virtual cores or four virtual sockets: what's the difference? It could be a lot

One of the major advantages of virtualization is the ability to dynamically add CPU and RAM to running virtual machines. Have a box that gets a sudden spike? Add more RAM on the fly and let it go. It’s a fantastic way to deal with certain compute issues, and it can make a tough decision disappear because no downtime or reboots are required.

However, allocating CPU and RAM with the click of a mouse — dynamically or otherwise — can have deleterious effects on your servers in some circumstances. You really need to understand your workload and your OS.


It all comes down to the type of workload you’re running, the OS scheduler, and the virtual CPU layout for the virtual machine. Virtual CPU allocations used to be simple. You specified how many virtual CPUs you wanted to assign and off you went. However, as the number of physical CPU cores increased and NUMA became the norm, that choice became trickier. Now, just about every major hypervisor presents a choice of virtual CPU types.

For instance, if you want to assign four virtual CPUs to your virtual machine, you can choose between four single-core CPUs, two dual-core CPUs, and one quad-core CPU. While all of these selections wind up presenting four virtual CPUs to the virtual machine, they do so in different ways, and the differences can impact the decisions made by the OS scheduler running on that virtual server.
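In VMware vSphere, for instance, this choice boils down to two virtual machine settings: the total vCPU count and the number of cores per virtual socket. The fragment below is a sketch of the relevant .vmx entries for four vCPUs presented as two dual-core sockets; the key names match vSphere 5.x-era releases, so verify them against your hypervisor's own documentation before relying on them.

```ini
numvcpus = "4"
cpuid.coresPerSocket = "2"
```

With `cpuid.coresPerSocket = "1"` the same four vCPUs appear as four single-core sockets; with `"4"`, as one quad-core socket.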

Virtual machine alchemy

There’s no hard-and-fast rule about these selections. The right choice depends heavily on the workload profile, the scheduler in use, and the OS or kernel version. Older kernels that are less adept at handling multicore CPUs may fare better with single-core CPU assignments; newer kernels and OS versions might prefer multicore CPU presentations.

Beyond that, the nature of the workload itself can have a big impact. Single- and multithreaded workloads will handle each of these instances differently. There may be only slight differences in some workloads, but massive differences in others.

Picture a modern OS that’s well versed in NUMA. Taking advantage of NUMA permits faster memory access and can significantly speed up processor and RAM-intensive processes. If a CPU core interacts only with memory controlled by that CPU, it will perform faster, as it does not need to cross to another CPU to allocate and use memory.
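The local-versus-remote arithmetic can be sketched with a toy model. Every number here is invented purely for illustration (real latencies and penalties vary widely by platform), but it shows why keeping memory local to the working CPU pays off:

```python
# Toy model of NUMA memory access latency.
# The latency figures are assumptions for illustration, not measurements.

LOCAL_NS = 100.0    # assumed latency for an access to local-node memory
REMOTE_NS = 160.0   # assumed latency when crossing to another node's memory

def effective_latency_ns(local_fraction: float) -> float:
    """Average access latency when `local_fraction` of memory accesses
    hit memory attached to the CPU doing the work."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * LOCAL_NS + (1.0 - local_fraction) * REMOTE_NS

# A NUMA-aware placement keeping 95% of accesses local, vs. a naive
# placement that splits accesses 50/50 across nodes:
aware = effective_latency_ns(0.95)   # roughly 103 ns on average
naive = effective_latency_ns(0.50)   # 130 ns on average
```

Even a modest remote-access penalty adds up quickly for memory-intensive processes, which is exactly the "store across the street" effect described above.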

This is fairly basic, sort of like how it’s quicker to go to the store across the street rather than one across town. However, when you insert a hypervisor underneath an OS, the relationships between CPU cores and RAM allocations can get a bit murky.

Depending on how the hypervisor presents CPUs to the virtual server, the OS may believe, for example, that each CPU has its own memory controller, or that a single memory controller is shared across four cores. Underneath that, the hypervisor is constantly polling the virtual server’s memory allocation status and assessing whether to move active memory closer to the CPU currently handling the load for that VM. With all of these factors in play at once, there are cases where performance dips as a result.
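The trade-off the hypervisor weighs can be sketched in miniature. To be clear, the penalty factor, threshold, and cost model below are made-up illustrations, not any vendor's actual algorithm: the question is simply whether enough memory is remote, for long enough, to justify the one-time cost of copying pages closer to the CPU.

```python
# Toy sketch of a NUMA rebalancing decision: migrate a VM's memory
# closer to its current CPU only when the projected savings outweigh
# the one-time migration cost. All parameters are invented.

def should_migrate(remote_fraction: float,
                   migration_cost_ms: float,
                   expected_runtime_ms: float,
                   remote_penalty: float = 0.25,
                   threshold: float = 0.5) -> bool:
    """Migrate if most accesses are remote AND the projected savings over
    the remaining runtime exceed the cost of copying the pages."""
    projected_savings_ms = remote_fraction * remote_penalty * expected_runtime_ms
    return remote_fraction > threshold and projected_savings_ms > migration_cost_ms

# Long-running VM with mostly remote memory: worth moving.
# Short-lived or mostly-local VM: leave it where it is.
long_job = should_migrate(0.8, migration_cost_ms=50.0, expected_runtime_ms=1000.0)
short_job = should_migrate(0.8, migration_cost_ms=50.0, expected_runtime_ms=100.0)
```

A real scheduler also has to worry about migrations fighting each other and about the polling overhead itself, which is how the dips described above can occur.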

Straw into gold

Fortunately, there’s a way to tell which method suits your workload best: test the hell out of it. Build several virtual servers, each with a different virtual CPU layout, and run a sample workload on each. Then look into the deeper tweaks you can make to NUMA allocations at the hypervisor level, and test various scenarios by adjusting those parameters.
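Once you've collected results from each layout, the comparison itself is simple. Here's a minimal sketch, with hypothetical throughput numbers standing in for real benchmark runs; in practice each list would come from repeated runs of your actual workload on a VM built with that topology:

```python
# Compare sample-workload results across vCPU layouts.
# The throughput numbers (requests/sec per run) are hypothetical
# placeholders, not real benchmark data.
from statistics import mean, stdev

results = {
    "4 sockets x 1 core":  [4120, 4075, 4190],
    "2 sockets x 2 cores": [4480, 4510, 4455],
    "1 socket x 4 cores":  [4310, 4290, 4350],
}

def best_layout(runs):
    """Return the layout name with the highest mean throughput."""
    return max(runs, key=lambda layout: mean(runs[layout]))

for layout, samples in results.items():
    print(f"{layout}: mean={mean(samples):.0f} stdev={stdev(samples):.1f}")
```

Run enough iterations for the spread (stdev) to be small relative to the differences between layouts before trusting the winner.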

For instance, VMware vSphere has deep tuning parameters, such as numa.vcpu.maxPerMachineNode and numa.vcpu.maxPerClient, which allow you to adjust the maximum number of virtual CPUs that can reside on a single NUMA node and the maximum number of virtual CPUs that are rebalanced as a single unit by the hypervisor. There are several other parameters as well that may have a greater impact in your specific case, but the fact is that you may be able to boost your workload performance with a few tiny tweaks and some testing.
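As a sketch, these parameters go into the VM's advanced configuration (its .vmx file). The values of "4" below are purely illustrative; the right settings depend entirely on your host's physical NUMA topology and your workload, so treat this as a starting point for testing, not a recommendation:

```ini
numa.vcpu.maxPerMachineNode = "4"
numa.vcpu.maxPerClient = "4"
```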

This isn’t a new concept. I noted this type of performance tweak 18 months ago in the last InfoWorld virtualization shoot-out, specifically with respect to Red Hat Enterprise Virtualization, but it’s a bit of knowledge that I find goes largely overlooked. So when you’re building and tweaking your virtual machines, remember that four doesn’t always equal four when you’re talking about virtual CPUs. Spending a little time testing can save a lot of time processing.

This story, “Do the virtualization math: When four CPUs aren’t four CPUs,” was originally published at InfoWorld.com. Read more of Paul Venezia’s The Deep End blog at InfoWorld.com. For the latest business technology news, follow InfoWorld.com on Twitter.