SharePoint is an information-sharing and document management platform that many businesses around the world rely on today. But you already knew that, right? This article outlines SharePoint performance monitoring and its benefits. The focus is on highlighting the most common performance issues and advising on best practices to help you solve them.
If you’re a SharePoint admin – Yes, it’s absolutely crucial!
SharePoint performance monitoring is the key to providing an optimal experience to the end users on your servers. Often, finding out why page load issues occur, or whether your database is running out of free space, can be critical to keeping your business running.
It is imperative to monitor your SharePoint environment 24/7, as opposed to only starting when you’ve already experienced issues or slowness. When problems like massive database response times and page slowness occur, every minute matters. If such problems are neglected even for a short period of time, they will affect your users and disrupt work across your company.
Also, it is smart to have the SharePoint server performance logs of all farms at your disposal at all times. SharePoint performance monitoring, both in real time and historically, allows you to compare performance metrics against a set of server baselines, giving you insight into an emerging issue. Perhaps you’ll need to check the SQL Server logs from three days ago to find the cause of a problem, or need to prove that CPU usage on WFE servers is spiking only at a particular time of the day.
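To make the baseline idea concrete, here is a minimal sketch of how historical CPU samples could be grouped by hour and compared against a baseline. The function name, the 10-point margin, and the sample data are all hypothetical; in practice the samples would come from whatever log export your monitoring tool provides.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hours_above_baseline(samples, baseline_pct, margin_pct=10.0):
    """Return {hour: average CPU %} for hours whose average exceeds baseline + margin.

    samples: iterable of (datetime, cpu_percent) pairs.
    baseline_pct: the normal CPU level you have established for this server.
    margin_pct: tolerance above the baseline (an assumed value, tune it yourself).
    """
    by_hour = defaultdict(list)
    for timestamp, cpu in samples:
        by_hour[timestamp.hour].append(cpu)

    limit = baseline_pct + margin_pct
    return {
        hour: round(mean(values), 1)
        for hour, values in sorted(by_hour.items())
        if mean(values) > limit
    }


# Hypothetical WFE samples: quiet most of the day, busy around 09:00.
samples = [
    (datetime(2024, 1, 8, 8, 30), 22.0),
    (datetime(2024, 1, 8, 9, 0), 85.0),
    (datetime(2024, 1, 8, 9, 30), 91.0),
    (datetime(2024, 1, 8, 14, 0), 25.0),
    (datetime(2024, 1, 9, 9, 15), 88.0),
]

print(hours_above_baseline(samples, baseline_pct=30.0))  # {9: 88.0}
```

Grouping by hour across several days is what lets you show that a spike is tied to a time of day rather than being a one-off event.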
There are several benefits to be gained through SharePoint performance monitoring:
Like all technical platforms, SharePoint has a list of problems that you are likely to run into sooner or later in most environments. The focus here is on the performance issues that disrupt user workflow, and on best practices for solving them. These are just some of the more common ones in everyday SharePoint administration.
If the wait stats and latency numbers in your performance metrics point to your I/O subsystem as the root cause of the issue, you need to change the underlying storage configuration so that it can handle more IOPS and provide the MB/sec throughput your SharePoint environment needs.
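IOPS and MB/sec are two views of the same capacity, linked by the I/O size. The arithmetic can be sketched as below; the function names and the example figures are illustrative assumptions, not measurements or recommendations.

```python
def throughput_mb_per_sec(iops, io_size_kb):
    """Throughput in MB/s for a given IOPS rate and I/O size (1 MB = 1024 KB)."""
    return iops * io_size_kb / 1024


def required_iops(target_mb_per_sec, io_size_kb):
    """IOPS needed to sustain a target throughput at a given I/O size."""
    return target_mb_per_sec * 1024 / io_size_kb


# Hypothetical numbers: 5000 IOPS at 64 KB per I/O.
print(throughput_mb_per_sec(5000, 64))  # 312.5

# Hypothetical target: 200 MB/s at 8 KB per I/O.
print(required_iops(200, 8))  # 25600.0
```

The same storage can look fine on one number and poor on the other, which is why both should be checked against the workload's typical I/O size.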
Recycle bin files go through two stages. In the first stage, during the 30 days following deletion, they are user-recoverable. In the second stage, site collection admins can use them to recover long-deleted content. So, if you don’t enforce a quota, the recycle bin won’t stop growing. Believe it or not, with the default setting the second stage alone is allowed to take up space equal to 50% of the site quota, so be sure to stop it from getting out of control.
If you plan to grow your SharePoint environment, be sure to upgrade your SQL Servers accordingly. An underpowered SQL Server is a likely culprit in persistent system failures. What you need to look for are sufficient physical resources, an optimally tuned SQL configuration, and SharePoint databases with backup and recovery set up.
Limited bandwidth between servers is a frequent bottleneck in environments with lower network capacity. You need to ensure enough bandwidth between the WFE and the SQL Server: at least 1 Gbit/s to be on the safe side in preventing connection issues.
Make sure that your front-end servers have enough RAM so that the cache can be stored without swapping to slow physical or virtual disks. Every SharePoint server needs an appropriate amount of memory, and RAM is not expensive nowadays. It’s better to have some physical memory sitting idle than to lack it at a critical moment and force the server to swap to disk.
You can check out more detailed information about SharePoint performance issues and ways to solve them on Microsoft TechNet. Remember that SharePoint is a complex system and you need to monitor your environment constantly.
If George Orwell were a SharePoint admin, he’d probably say: “All performance metrics are equally important, but some are more important than others.” Guided by that thought, the following SharePoint performance counters are brought to your attention, as they are likely to point out the most common SharePoint performance issues.
| Performance metric | Warning | Metric Description |
|---|---|---|
| SharePoint Publishing Cache: Number of cache Compactions | >2 | Indicates that the SharePoint cache may lack in size. |
| SharePoint Publishing Cache: Object cache flushes/Sec | >0 | Cache flushing during peak-use hours slows down performance. |
| SharePoint Publishing Cache: Object Cache Hit Ratio | <1 | Small number indicates the searches for non-published items. |
| SharePoint Publishing Cache: BLOB Cache % full | >80% | Indicates that the SharePoint cache may lack in size. |
| ASP.NET Applications: Cache API trims | >1 | Indicates the lack of allocated memory for ASP.NET output cache. |
| ASP.NET: Requests Queued | >400 | Indicates the need for more Web Servers since a high number of requests can cause slow page load. |
| ASP.NET: Requests Rejected | >2 | Indicates the need for more Web Servers due to server busyness. |
| ASP.NET: Worker Processes Restarts | >1 | Any restarted worker process can indicate a potential problem. |
| Memory: Pages/Sec | >20 | Large value for page writes to memory indicates the lack of RAM memory. *Note: Tends to jump over 200 often. |

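The warning levels in the table above translate naturally into a small lookup that a script can check readings against. The following is a minimal sketch: the dictionary holds a subset of the counters with the values from the table, while the function name and the sample readings are hypothetical, and collecting the readings themselves is left to your monitoring tool.

```python
import operator

# (comparison, warning value) per counter, copied from the table above.
WEB_TIER_WARNINGS = {
    "SharePoint Publishing Cache: Number of cache Compactions": (">", 2),
    "SharePoint Publishing Cache: Object cache flushes/Sec": (">", 0),
    "SharePoint Publishing Cache: BLOB Cache % full": (">", 80),
    "ASP.NET: Requests Queued": (">", 400),
    "ASP.NET: Requests Rejected": (">", 2),
    "Memory: Pages/Sec": (">", 20),
}

COMPARISONS = {">": operator.gt, "<": operator.lt}


def find_warnings(readings, warnings=WEB_TIER_WARNINGS):
    """Return a message for every reading that crosses its warning level."""
    flagged = []
    for counter, value in readings.items():
        if counter not in warnings:
            continue  # no warning level defined for this counter
        symbol, limit = warnings[counter]
        if COMPARISONS[symbol](value, limit):
            flagged.append(f"{counter}: {value} (warning at {symbol}{limit})")
    return flagged


# Hypothetical readings from one front-end server.
readings = {
    "ASP.NET: Requests Queued": 512,
    "Memory: Pages/Sec": 12,
    "SharePoint Publishing Cache: BLOB Cache % full": 91,
}

for message in find_warnings(readings):
    print(message)
```

Keeping the comparison symbol next to the value means counters where a low number is the problem can be added to the same lookup with `"<"`.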

| Performance metric | Warning | Metric Description |
|---|---|---|
| SQL Server General Statistics: User Connections | N/A | Number of users connected to the SQL server at the moment. Warning value is server dependent. |
| SQL Server Buffer Manager: Page life expectancy | <400 | Time the page is saved in the buffer cache, a low number indicates possible low RAM memory. |
| SQL Server Buffer Manager: Lazy Writes/Sec | >15 | Number of bad pages being removed from buffer per second. Zero is ideal. |
| SQL Server Buffer Manager: Buffer cache hit ratio | <90% | Percentage of data successfully fetched from a cached page. Low RAM memory indicator. |
| SQL Server SQL Statistics: Batch Requests/Sec | >1000 | Indicator of CPU usage by SQL Server affected by number of queries. |
| SQL Server SQL Statistics: SQL Compilations/Sec | >100 | Large number of compilations per second indicates high resource usage. |
| SQL Server Access Methods: Page Splits/Sec | >200 | Large number of page splits indicates high resource usage. |
| SQL Server Access Methods: Full Scans/Sec | >100 | Full scans value can be ignored unless it peaks proportionally to CPU. |
| SQL Server Locks: Lock Waits/Sec | >1 | Indicates time passed for one server lock. |
| SQL Server Locks: Number of Deadlocks/Sec | >1 | Number of deadlocks on the SQL Server per second. |
| SQL Server Cache Manager: Cache Hit Ratio | <85% | Indicates the ratio between cache hits and lookups for plans. |
| SQL Server Databases: Transactions/Sec | N/A | Number of transactions per second on the SQL server, warning value is server dependent. |
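For the counters marked N/A, where the warning value is server dependent, one possible approach is to derive a local warning level from the server's own history. The sketch below assumes a 95th percentile with 20% headroom; both numbers, the function names, and the sample history are illustrative choices rather than recommendations from any vendor.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def local_warning_level(history, pct=95, headroom=1.2):
    """Warning level = chosen percentile of past readings, plus some headroom.

    pct and headroom are assumed defaults; tune them per server.
    """
    return percentile(history, pct) * headroom


# Hypothetical history of "User Connections" readings for one SQL Server.
history = [110, 120, 115, 130, 125, 140, 118, 122, 135, 128]

level = local_warning_level(history)
current = 190  # hypothetical current reading
if current > level:
    print(f"User Connections at {current} is above the local level of {level:.0f}")
```

Because the level is computed from each server's own readings, a busy production server and a quiet test server end up with different thresholds without any manual tuning.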
Learn more about SharePoint performance monitoring and metrics on Microsoft TechNet.
SharePoint performance monitoring is key to an effective SharePoint governance strategy, but it only works if you approach it proactively rather than reactively.
Start by setting up your dashboards and thresholds before problems occur, not after. Know which counters matter for each server role, keep historical logs at your disposal, and make checking the health of your farm a regular part of your routine rather than something you do only when users start complaining.
SharePoint is a complex system with many moving parts. The farms that run smoothly are almost always the ones where admins have taken the time to understand the metrics, set the right baselines, and built monitoring into their daily workflow. The ones that struggle are usually the ones where monitoring only starts after something has already gone wrong.