Channel: VMware Communities: Message List

Performance degradation related to CPU Scheduling and NUMA nodes.


I have an interesting scenario (HP vs DELL hardware) with potentially degraded performance (specific to the DELL R815 hardware). I would like to know whether I am interpreting what I am seeing correctly, or whether I am simply being overcautious and don't actually have an issue.

 

Summary;

 

  1. Although they have a much higher hardware specification, the DELL R815 ESXi hosts are not scheduling CPU cycles as efficiently as the HP DL585 G6 hardware. The impact we are seeing is increased CPU ready time and performance degradation of the guest VMs. This is evident even with a very low number of guest VMs on the host, and it worsens as the consolidation ratio is ramped up or the CPU load is increased on any of the guest VMs.
  2. There also appears to be an imbalance across the NUMA nodes: one particular node is favoured, and the % NUMA local memory is lower than it should be (i.e. the HP hardware performs much better than the DELL hardware).

 

DELL Technical Details;

 

Hypervisor       : VMware ESXi 4.1.0, build 582267

 

Hardware specification;

 

Dell PowerEdge R815

- Model : AMD Opteron(tm) Processor 6174

- Processor Speed : 2.2 GHz

- Processor Sockets : 4

- Processor Cores per Socket : 12

- Logical Processors : 48

- Memory : 256 GB

 

esxtop performance statistics;

 

DELL Memory (incl NUMA statistics);

DELL_Mem_NUMA.png

Dell CPU;

DELL_CPU.png

 

 

Observations;

 

  • NUMA home node #7 is favoured, rather than balancing the load across all 8x nodes
  • % NUMA local memory is inefficiently allocated
  • Very low consolidation ratio of guest VMs per host
  • Very low load on the host and already seeing ready time

 

Example of the affected Guest VM

Guest_VM.png

 

DELL Host is under no load whatsoever;

DELL_Resource.png

 

As a contrasting perspective from a heavily loaded HP DL585 G6 host, this is what I would “expect” to see;

 

HP Technical Details;

 

Hypervisor       : VMware ESXi 4.1.0, build 582267

 

HP Hardware specification;

 

HP ProLiant DL585 G6

- Model : Six-Core AMD Opteron(tm) Processor 8435

- Processor Speed : 2.6 GHz

- Processor Sockets : 4

- Processor Cores per Socket : 6

- Logical Processors : 24

- Memory : 128 GB

 

esxtop performance statistics;

 

HP Memory (incl NUMA statistics);

HP_Mem_NUMA.png

 

HP CPU;

HP_CPU.png

 

Observations;

 

  • HP host is of a lower hardware specification than the DELL host
  • HP host has almost 4x the number of guest VMs hosted and does not suffer from the same performance issues
  • NUMA home node #0 is favoured, but NUMA local memory is allocated much more efficiently (close to 100%)
  • Much higher consolidation ratio of guest VMs per host without performance issues
  • Much higher load on the host and almost ZERO ready time

 

HP host still has capacity, but is under much more load than the affected DELL host;

HP_Resource.png

 

In both cases (HP and DELL) we do expect to see a certain level of ready time, but the levels seen on the DELL hardware are of concern, as is the inefficient use of NUMA local memory. This issue is not seen on the HP hardware, including earlier and later generation hardware.
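
As a rough sanity check on the NUMA side, the per-node resources implied by the two hardware specifications can be worked out with a few lines of Python. This is only a sketch under stated assumptions: the 8 nodes on the DELL host come from the esxtop observation above, while the HP host is assumed to expose one NUMA node per socket (4 nodes), and memory is assumed to be populated evenly across nodes. The `Host` class and `fits_in_one_node` helper are hypothetical names used purely for illustration, and hypervisor memory overhead is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    sockets: int
    cores_per_socket: int
    memory_gb: int
    numa_nodes: int  # DELL: 8 as seen in esxtop; HP: ASSUMED one node per socket

    @property
    def cores_per_node(self) -> int:
        # Total physical cores divided evenly across the NUMA nodes.
        return self.sockets * self.cores_per_socket // self.numa_nodes

    @property
    def memory_per_node_gb(self) -> int:
        # ASSUMPTION: DIMMs are populated evenly, so each node owns an equal share.
        return self.memory_gb // self.numa_nodes

    def fits_in_one_node(self, vcpus: int, memory_gb: int) -> bool:
        # A VM can only be fully node-local if both its vCPUs and its memory
        # fit within a single node (overheads are ignored in this sketch).
        return (vcpus <= self.cores_per_node
                and memory_gb <= self.memory_per_node_gb)


# Figures taken from the hardware specifications listed in this post.
DELL_R815 = Host("Dell R815", sockets=4, cores_per_socket=12,
                 memory_gb=256, numa_nodes=8)
HP_DL585 = Host("HP DL585 G6", sockets=4, cores_per_socket=6,
                memory_gb=128, numa_nodes=4)

for host in (DELL_R815, HP_DL585):
    print(host.name, host.cores_per_node, "cores/node,",
          host.memory_per_node_gb, "GB/node")
```

Under these assumptions both hosts work out to 6 cores and 32 GB per node, so any guest VM configured with more vCPUs or more memory than that could not be kept entirely node-local on either platform. Comparing the affected guests against those limits may help narrow down where the lower % NUMA local memory comes from.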

 

So the questions are;

 

  1. Have I interpreted this correctly?

  2. Has anyone else seen this before? If yes, how was it resolved?

  3. What next steps can be taken to test and verify this information?
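
One possible way to put numbers on the comparison, rather than relying on screenshots, would be to capture esxtop in batch mode on both hosts (for example `esxtop -b -d 5 -n 60 > capture.csv`) and summarise the ready time per VM. The Python below is a minimal sketch, not a definitive tool: it assumes the batch output is a CSV whose CPU counter columns are named in the form `\\<host>\Group Cpu(<id>:<name>)\% Ready`, and that naming should be checked against an actual capture. The function name `summarize_ready` is hypothetical.

```python
import csv
import io
import re
from statistics import mean

# ASSUMPTION: ready-time columns end with "\Group Cpu(<id>:<name>)\% Ready".
READY_COLUMN = re.compile(r"\\Group Cpu\((?P<group>.*)\)\\% Ready$")


def summarize_ready(csv_text):
    """Return {group: {"mean": ..., "max": ...}} for every '% Ready' column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)

    # Map column index -> group label for the columns we care about.
    columns = {}
    for index, name in enumerate(header):
        match = READY_COLUMN.search(name)
        if match:
            columns[index] = match.group("group")

    samples = {group: [] for group in columns.values()}
    for row in reader:
        for index, group in columns.items():
            try:
                samples[group].append(float(row[index]))
            except (IndexError, ValueError):
                # Skip short rows or non-numeric cells instead of failing.
                continue

    return {
        group: {"mean": mean(values), "max": max(values)}
        for group, values in samples.items()
        if values
    }


# Example usage against a saved capture (file name is illustrative):
# with open("capture.csv", newline="") as handle:
#     for group, stats in summarize_ready(handle.read()).items():
#         print(group, stats)
```

Running the same capture window on a DELL host and an HP host under comparable load would give mean and peak % Ready per guest that can be compared directly. Note that the group-level % Ready figure is summed across a VM's vCPUs, so values for multi-vCPU guests should be divided by the vCPU count before comparing them.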

