Tuesday, February 21, 2017

virtualization - How to [politely?] tell software vendor they don't know what they're talking about



Not a technical question, but a valid one nonetheless. Scenario:



HP ProLiant DL380 Gen 8 with 2 x 8-core Xeon E5-2667 CPUs and 256GB RAM running ESXi 5.5. Eight VMs for a given vendor's system. Four VMs for test, four VMs for production. The four servers in each environment perform different functions, e.g.: web server, main app server, OLAP DB server and SQL DB server.



CPU shares configured to stop the test environment from impacting production. All storage on SAN.




We've had some queries regarding performance, and the vendor insists that we need to give the production system more memory and vCPUs. However, we can clearly see from vCenter that the existing allocations aren't being touched, e.g.: a monthly view of CPU utilization on the main application server hovers around 8%, with the odd spike up to 30%. The spikes tend to coincide with the backup software kicking in.



Similar story on RAM - the highest utilization figure across the servers is ~35%.



So, we've been doing some digging, using Process Monitor (Microsoft SysInternals) and Wireshark, and our recommendation to the vendor is that they do some TNS tuning in the first instance. However, this is besides the point.



My question is: how do we get them to acknowledge that the VMware statistics that we've sent them are evidence enough that more RAM/vCPU won't help?



--- UPDATE 12/07/2014 ---




Interesting week. Our IT management have said that we should make the change to the VM allocations, and we're now waiting for some downtime from the business users. Strangely, the business users are the ones saying that certain aspects of the app are running slowly (compared to what, I don't know), but they're going to "let us know" when we can take the system down (grumble, grumble!).



As an aside, the "slow" aspect of the system is apparently not the HTTP(S) element, i.e.: the "thin app" used by most of the users. It sounds like it's the "fat client" installs, used by the main finance bods, that is apparently "slow". This means that we're now considering the client and the client-server interaction in our investigations.



As the initial purpose of the question was to seek assistance as to whether to go down the "poke it" route, or just make the change, and we're now making the change, I'll close it using longneck's answer.



Thank you all for your input; as usual, serverfault has been more than just a forum - it's kind of like a psychologist's couch as well :-)


Answer



I suggest that you make the adjustments they have requested. Then benchmark the performance to show them that it made no difference. You could even go so far to benchmark it with LESS memory and vCPU to make your point.




Also, "We're paying you to support the software with actual solutions, not guesswork."


No comments:

Post a Comment

linux - How to SSH to ec2 instance in VPC private subnet via NAT server

I have created a VPC in aws with a public subnet and a private subnet. The private subnet does not have direct access to external network. S...