Server performance issues: Detection and causes
When server performance issues occur, every minute matters. Use Syskit to detect issues and troubleshoot them as soon as possible. If you neglect performance issues, your server may run noticeably slower, reducing work efficiency. Worse, on an overloaded server some services might stop working, making it impossible for your colleagues to perform their daily tasks.
Prevent your losses
First of all, you should be able to determine quickly what each metric or event captured about your servers represents. If an important service is down, you don’t want to waste time figuring out what your monitoring data are trying to tell you, so it pays to be familiar with detection methods and the most common causes of server performance issues. Both are covered later in the text. It is also important to collect your metrics in sufficient detail and over a wide enough range that significant changes are visible when troubleshooting server performance issues. Syskit provides you with real-time performance data and the ability to dig deeper into every performance counter per server, application, and user.
To speed up the detection of a problem, set up and get to know your dashboards and thresholds before you actually have to use them. Otherwise, you could lose precious time trying to figure things out on the spot. The Syskit Performance Reports Dashboard enables you to track all the important performance counters from a single dashboard in real time. You may also want to monitor server, application, or user performance and overlay relevant events on the graphs for correlation analysis. For that, you can use the Syskit real-time Application Performance and User Performance reports.
How to detect the cause of a problem
The first thing to keep in mind when investigating a server performance issue is problem characterization. If the issue is not clearly described at the outset, it’s easy to take a wrong turn while digging deeper into your data to diagnose it.
To begin, examine the high-level symptoms that are disrupting normal operation. They mostly show up as slow performance, unresponsiveness, or even a server crash. These symptoms will most likely point you to the source of the problem, or at least set your investigation in the right direction. For instance, data loss often points to hard disk problems, while a poor response time could suggest a network adapter issue. Another possible explanation is simply that the system is overburdened, so don’t forget about that option before digging deeper into the monitoring metrics.
Often you will not be able to find the cause of a problem by just inspecting the symptoms. What you need to do is examine the resources that the system uses – hardware resources as well as software or services that act as resources to the system. By using the Syskit Performance reports and Application reports, you can get insights into the state of the critical resources. Are they unavailable? Are they over-utilized or saturated?
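The three questions above can be turned into a simple check per resource. The following is a minimal Python sketch, not a Syskit feature: the `ResourceSample` type, the `classify` function, and both threshold defaults are hypothetical names and values chosen for illustration, and they should be tuned to your own environment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceSample:
    """One reading for a resource (hypothetical structure for illustration)."""
    name: str
    utilization_pct: Optional[float]  # None when the resource could not be queried
    queue_length: float               # work waiting for the resource


def classify(sample: ResourceSample,
             util_threshold: float = 90.0,    # assumed threshold, tune as needed
             queue_threshold: float = 2.0) -> str:
    """Answer: is the resource unavailable, saturated, or over-utilized?"""
    if sample.utilization_pct is None:
        return "unavailable"
    if sample.queue_length > queue_threshold:
        # Work is piling up faster than the resource can serve it.
        return "saturated"
    if sample.utilization_pct >= util_threshold:
        return "over-utilized"
    return "ok"


if __name__ == "__main__":
    for s in (ResourceSample("disk", None, 0.0),
              ResourceSample("cpu", 97.0, 0.5),
              ResourceSample("network", 60.0, 8.0)):
        print(s.name, classify(s))
```

Checking saturation before utilization is a design choice in this sketch: a growing queue usually affects users sooner than a high utilization figure alone.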
Most common causes of server performance issues
Physical components serve as resources to your system environment. For instance, a server requires resources from components such as the CPU, memory, disk, and network adapters. These are what you might call basic resources, and they are the most common cause of server performance issues. The most notable of them is the disk, which can cause important services to stop working when it runs out of free space.
CPU
Symptoms:
• Troubled or unresponsive applications or services
• High service response time or unavailability
Audit:
• Current CPU utilization
• Increase in CPU usage
• Number of processes and their CPU usage
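As a rough illustration of the CPU audit items, the sketch below summarizes per-process readings that you have already collected. The `cpu_audit` function, its input format, and the process names in the example are hypothetical. It assumes each percentage is expressed relative to the whole machine, so that the values can be added up.

```python
def cpu_audit(per_process_pct, baseline_total_pct, top_n=3):
    """Summarize CPU readings.

    per_process_pct: dict mapping process name to CPU percentage
                     (assumed to be relative to the whole machine).
    baseline_total_pct: total CPU percentage considered normal for this server.
    """
    total = sum(per_process_pct.values())
    # Sort processes from the heaviest consumer to the lightest.
    top = sorted(per_process_pct.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    return {
        "total_pct": total,                          # current CPU utilization
        "increase_pct": total - baseline_total_pct,  # increase over the baseline
        "process_count": len(per_process_pct),       # number of processes
        "top": top,                                  # heaviest consumers
    }


if __name__ == "__main__":
    readings = {"sqlservr": 55.0, "w3wp": 20.0, "svchost": 5.0, "idle_tool": 1.0}
    print(cpu_audit(readings, baseline_total_pct=40.0, top_n=2))
```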
Hard Disk
Symptoms:
• Sudden increase in the number of logs (transaction logs, error logs, etc.)
• Disk running out of space often
• Data loss
Audit:
• Disk space used
• Rate of increase in disk space utilization
• Disk operations per second (reads, writes, and transfers)
• Average disk queue length
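Two of the disk audit items, space used and its rate of increase, are easy to sketch with the Python standard library. `shutil.disk_usage` is a real standard-library call; the `used_percent` and `days_until_full` helpers are hypothetical, and the projection assumes growth stays roughly linear between the first and last sample, which real workloads may not follow.

```python
import shutil


def used_percent(path="."):
    """Percentage of disk space used on the volume that contains `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def days_until_full(samples, capacity_bytes):
    """Project when the disk fills up, assuming linear growth.

    samples: list of (day_number, used_bytes) tuples ordered by time.
    Returns the estimated days remaining after the last sample,
    or None when usage is flat or shrinking.
    """
    (first_day, first_used), (last_day, last_used) = samples[0], samples[-1]
    if last_day == first_day:
        return None
    rate_per_day = (last_used - first_used) / (last_day - first_day)
    if rate_per_day <= 0:
        return None
    return (capacity_bytes - last_used) / rate_per_day


if __name__ == "__main__":
    print(f"Disk used: {used_percent():.1f}%")
    # Usage grew from 100 to 200 units in 10 days on a 500-unit disk.
    print(days_until_full([(0, 100), (10, 200)], capacity_bytes=500))  # 30.0
```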
Memory
Symptoms:
• Troubled or unresponsive applications or services
• High service response time or unavailability
Audit:
• Current memory utilization
• Number of processes and their memory usage
• Spikes in memory usage
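One simple way to look for spikes in memory usage is to compare each reading with the average of the readings just before it. The sketch below assumes you already have a series of memory utilization percentages; the `find_spikes` function, the window size, and the factor of 1.5 are hypothetical choices, not values recommended by Syskit.

```python
from statistics import mean


def find_spikes(samples, window=5, factor=1.5):
    """Return indexes of samples that exceed `factor` times the average
    of the preceding `window` samples."""
    spikes = []
    for i in range(window, len(samples)):
        baseline = mean(samples[i - window:i])
        if baseline > 0 and samples[i] > factor * baseline:
            spikes.append(i)
    return spikes


if __name__ == "__main__":
    memory_pct = [40, 41, 39, 40, 42, 85, 41, 40]
    print(find_spikes(memory_pct))  # [5]
```

A trailing average keeps the check cheap, but a single spike raises the baseline for the next few samples, so closely spaced spikes may be missed.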
Network adapter
Symptoms:
• Unstable bandwidth
• Poor response time
• Intermittent network connectivity problems
• Poor application or service performance
Audit:
• Sudden peaks in sent/received traffic
• Large amounts of sent/received traffic
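Network counters are usually cumulative byte totals, so the traffic audit starts by converting them into rates. The sketch below assumes counters sampled at a fixed interval; the `throughput` and `flag_peaks` helpers and the limit used in the example are hypothetical.

```python
def throughput(counters, interval_s):
    """Convert cumulative byte counters into bytes-per-second rates.

    counters: cumulative sent or received bytes, sampled every `interval_s` seconds.
    """
    return [(later - earlier) / interval_s
            for earlier, later in zip(counters, counters[1:])]


def flag_peaks(rates, limit):
    """Return indexes of intervals whose rate exceeds `limit` bytes per second."""
    return [i for i, rate in enumerate(rates) if rate > limit]


if __name__ == "__main__":
    received = [0, 1000, 2000, 12000]  # cumulative bytes, one sample every 10 s
    rates = throughput(received, interval_s=10)
    print(rates)                         # [100.0, 100.0, 1000.0]
    print(flag_peaks(rates, limit=500))  # [2]
```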