Scale-Up (OLTP)---a Technical Reference Guide for Designing Mission-Critical OLTP Solutions
Want more guides like this one? Go to Technical Reference Guides for Designing Mission-Critical Solutions. |
Most applications are designed for good scale-up performance because the design techniques for scale-up are reasonably well understood and are easier to develop and deploy than those for scale-out. (For example, review the bWin case study to understand the scale-up capability achieved by a customer - the link is provided below.)
However, more and more scale-out architectures are being adopted for large-scale deployments in which the expected number of concurrent users is huge and the total cost of ownership (TCO) must be kept low. Good design for scale-up is of utmost importance for both scale-up and scale-out because individual segments of a federated database or application have to perform well for the entire federation to be scalable at a reasonable cost.
The system scalability is dependent on several components, including: the database management system (DBMS) engine, the configuration settings, the database schema, and application design. Some would argue that the database schema and the application design are more important that the database engine component. Therefore, be sure that the application component gets sufficient emphasis and resources to make the system scalable.
Best Practices
The following resources provide examples of customer scenarios for scale-up solutions, in addition to general scale-up reference material. (Note that the full URLs for the hyperlinked text are provided in the Appendix at the end of this document.)
The SQL customer advisory team (SQL CAT) blog post Analyzing I/O Characteristics and Sizing Storage Systems for SQL Server Database Applications1 discusses input/output (I/O) in relation to Microsoft SQL Server database applications.
The SQL CAT blog post Data Compression: Strategy, Capacity Planning and Best Practices2 explains how the data compression feature in the SQL Server 2008 database software can help reduce the size of the database and improve the performance of I/O intensive workloads. Note that I/O is one of the most common bottleneck in current SQL Server systems.
The white paper The Data Performance Guide 20083 describes techniques for bulk loading large data sets into SQL Server.
The SQL CAT blog posts Tuning the Performance of Backup Compression in SQL Server 20084 and Tuning Backup Compression Part 25 provide guidance related to tuning options for backup performance, and discuss how backup compression interacts with other features of SQL Server 2008.
The white paper Troubleshooting Performance Problems in SQL Server 20086 provides step-by-step guidelines for diagnosing and troubleshooting common performance problems using publicly available tools such as SQL Server Profiler, Microsoft Performance Monitor, dynamic management views, and SQL Server Extended Events and the data collector.
The SQL CAT blog post Resolving PAGELATCH Contention on Highly Concurrent INSERT Workloads7 describes the process of diagnosing a problem and then using table partitioning to work around it.
Managing SQL Server Workloads with Resource Governor 8 is a good starting point for understanding how Resource Governor can help in scaling up. Additionally, review the Support Engineers blog Resource Governor and MAXDOP, Parallel plan generation9 for practical pointers.
The SQL CAT post Scripts and Tools for Performance Tuning and Troubleshooting SQL Server 200510 is a toolbox full of information to make performance tuning and troubleshooting easier.
The SQL CAT blog post Top SQL Server 2005 Performance Issues for OLTP Applications11 discusses some of the top performance bottlenecks for OLTP applications.
See DBCC Checks and Terabyte-Scale Databases12 for best practices on Database Console Command (DBCC) checks and terabyte-scale databases. Note that it is very important to regularly run DBCC in very large databases (VLDBs).
The white paper Service Broker: Performance and Scalability Techniques13 describes a large scale customer scenario and the techniques employed in scaling SQL Server Service Broker to process tens of thousands of messages per second on one server.
The white paper SQL Server 2005 Waits and Queues14 approaches SQL Server performance tuning through a methodology called "Waits and Queues."
The white paper Plan Caching in SQL Server 200815 explains how the SQL Server plan cache operates and how to avoid excessive recompilation.
The article Instant Initialization – What, Why and How?16 explains that instant file initialization (IFI) is very important for large databases. Note that implementing IFI requires consideration of the security implications.
Online indexing operations can help keep the database available while undergoing maintenance operations, as discussed in Online Indexing Operations in SQL Server 2005.17
Windows System Resource Manager 18 is a component that some customers use to manage the CPU resources for various instances of SQL Server executing under a Windows Server instance. Note that we recommend that you do not use Windows System Resource Manager to manage SQL Server memory usage. The SQL Server memory setting configuration should be used to manage the memory usage.
Case Studies and References
Examples of successful architectures are described in the following case studies and white papers:
bwin: Global Online Gaming Company Deploying SQL Server 2008 to support 100 Terabytes 19
The Progressive Group: Progressive Prepares for Future Growth, Gains Agility with SQL Server 2005 20
A Technical Case Study: Fast and Reliable Backup and Restore of Multi-Terabytes over the Network 21
Backup More Than 1GB per Second Using SQL2008 Backup Compression 22
Questions and Considerations
This section provides questions and issues to consider when working with your customers.
Experience shows that when new systems are tested and deployed, installations frequently encounter scalability issues in this sequence (though note that there are always exceptions):
Application scalability issues
Database schema and configuration issues
Internal database management system (DBMS) issues.
As you architect new systems, be watchful for factors that impact these areas.
Ask and look for data access patterns that adversely impact scalability, including the following:
A large number of concurrent users can lead to locking contention
Accessing unnecessary data can lead to wasted resources
Concurrently updating data can lead to contention
Appropriate indexing can avoid long-running queries and avoid blocking issues
Anticipate and design with these factors in mind (though note that this is not an exhaustive list). There are also issues internal to the DBMS that can prevent scalability (for example, latches, spin locks, and hash tables), and the application architect has little control over these. For information on how to anticipate and avoid the DBMS becoming a bottle neck for an application system, see the next consideration.
It is useful to perform an actual benchmark with some skeletal code to help ensure successful scalability; however, this may not be practical in many cases. Another good approach is to look for an existing equivalent benchmark; this is often a practical approach for an independent software vendor (ISV) application. If, for example, ISV package XYZ is used by installation A to serve 2 million users and installation B to serve 5 million users, you can conclude that with proper execution, installation C can also serve 4 million users and you can estimate the required hardware resources.
Do not forget the standard operations, like backup, restore, and CHECKDB, or features like database compression, backup compression, and resource governance. These can help with scalability and with meeting maintenance windows.
I/O subsystem performance has a major influence on scalability. For more information, see the SQL CAT blog post Analyzing I/O Characteristics and Sizing Storage Systems for SQL Server Database Applications.1
Appendix
Following are the full URLs for the hyperlinked text.
1 Analyzing I/O Characteristics and Sizing Storage Systems for SQL Server Database Applicationshttp://sqlcat.com/whitepapers/archive/2010/05/10/analyzing-i-o-characteristics-and-sizing-storage-systems-for-sql-server-database-applications.aspx
2 Data Compression: Strategy, Capacity Planning and Best Practiceshttp://sqlcat.com/whitepapers/archive/2009/05/29/data-compression-strategy-capacity-planning-and-best-practices.aspx
3 The Data Performance Guide 2008http://sqlcat.com/whitepapers/archive/2009/02/15/the-data-loading-performance-guide-2008.aspx
4 Tuning the Performance of Backup Compression in SQL Server 2008http://sqlcat.com/technicalnotes/archive/2008/04/21/tuning-the-performance-of-backup-compression-in-sql-server-2008.aspx
5 Tuning Backup Compression Part 2http://sqlcat.com/technicalnotes/archive/2009/02/16/tuning-backup-compression-part-2.aspx
6 Troubleshooting Performance Problems in SQL Server 2008http://sqlcat.com/whitepapers/archive/2009/04/14/troubleshooting-performance-problems-in-sql-server-2008.aspx
7 Resolving PAGELATCH Contention on Highly Concurrent INSERT Workloadshttp://sqlcat.com/technicalnotes/archive/2009/09/22/resolving-pagelatch-contention-on-highly-concurrent-insert-workloads-part-1.aspx
8 Managing SQL Server Workloads with Resource Governorhttps://msdn.microsoft.com/en-us/library/bb933866.aspx
9 Resource Governor and MAXDOP, Parallel plan generationhttps://blogs.msdn.com/b/psssql/archive/2011/02/03/resource-governor-and-maxdop-parallel-plan-generation.aspx
10 Scripts and Tools for Performance Tuning and Troubleshooting SQL Server 2005http://sqlcat.com/toolbox/archive/2008/02/21/scripts-and-tools-for-performance-tuning-and-troubleshooting-sql-server-2005.aspx
11 Top SQL Server 2005 Performance Issues for OLTP Applicationshttp://sqlcat.com/top10lists/archive/2007/11/21/top-sql-server-2005-performance-issues-for-oltp-applications.aspx
12 DBCC Checks and Terabyte-Scale Databaseshttp://sqlcat.com/technicalnotes/archive/2009/08/13/dbcc-checks-and-terabyte-scale-databases.aspx
13 Service Broker: Performance and Scalability Techniqueshttp://sqlcat.com/whitepapers/archive/2009/03/24/service-broker-performance-and-scalability-techniques.aspx
14 SQL Server 2005 Waits and Queueshttp://sqlcat.com/whitepapers/archive/2007/11/19/sql-server-2005-waits-and-queues.aspx
15 Plan Caching in SQL Server 2008https://msdn.microsoft.com/en-us/library/ee343986(SQL.100).aspx
16 Instant Initialization – What, Why and How?http://sqlskills.com/BLOGS/KIMBERLY/post/Instant-Initialization-What-Why-and-How.aspx
17 Online Indexing Operations in SQL Server 2005https://technet.microsoft.com/en-us/library/cc966402.aspx
18 Windows System Resource Managerhttps://technet.microsoft.com/en-us/library/cc755056.aspx
19 bwin: Global Online Gaming Company Deploying SQL Server 2008 to support 100 Terabyteshttps://www.microsoft.com/casestudies/Microsoft-SQL-Server-2008/bwin/Global-Online-Gaming-Company-Deploying-SQL-Server-2008-to-support-100-Terabytes/4000001470
20 The Progressive Group: Progressive Prepares for Future Growth, Gains Agility with SQL Server 2005https://www.microsoft.com/casestudies/Microsoft-SQL-Server-2005-Enterprise-X64-Edition/Progressive-Group-The/Progressive-Prepares-for-Future-Growth-Gains-Agility-with-SQL-Server-2005/4000002133
21 A Technical Case Study: Fast and Reliable Backup and Restore of Multi-Terabytes over the Networkhttp://sqlcat.com/whitepapers/archive/2009/08/13/a-technical-case-study-fast-and-reliable-backup-and-restore-of-a-vldb-over-the-network.aspx
22 Backup More Than 1GB per Second Using SQL2008 Backup Compressionhttp://sqlcat.com/msdnmirror/archive/2008/03/02/backup-more-than-1gb-per-second-using-sql2008-backup-compression.aspx