Relational (OLTP)---a Technical Reference Guide for Designing Mission-Critical OLTP Solutions
Want more guides like this one? Go to Technical Reference Guides for Designing Mission-Critical Solutions.
With the advent of relational databases in the 1970s, when Dr. E. F. Codd11 published his original work, the data types that were initially supported were well-structured types such as integer, decimal, float, and character. Most applications continue to store these data types for computation. However, non-structured data is now being captured and stored along with the traditional relational data. The schemas for the relational data predominantly remain the third normal form (3NF), and denormalization is occasionally adopted for better performance. The balanced B-tree indexes that are used for relational data are supplemented with sparse indexing, and row-level storage is enhanced with column-oriented storage. Volumetric statistics about the tables, rows, and columns are the foundation for the SQL Server Query Optimizer. However, volumetric statistics are being supplemented by "hints" when the complexity of the query is overwhelming and the optimizer needs manual help to provide consistent, good performance over a broad range of data that might not be distributed uniformly.
The following resources provide general reference material. (Note that the full URLs for the hyperlinked text are provided in the Appendix at the end of this document.)
General Index Design Guidelines
1 is a good article to review for information about relational systems.
Statistics are one of the more important features impacting the performance of Microsoft SQL Server. For more information, see Using Statistics to Improve Query Performance.2
The article Comparing Tables Organized with Clustered Indexes versus Heaps3 shows why the best practices for SQL Server deployments require the specification of a clustered index.
Using Sparse Columns
4 in SQL Server 2008 R2 Books Online provides useful insights on the uses of sparse columns and indexes.
The article Using Filtered Statistics with Partitioned Tables5 discusses the control of query resource consumption.
The article The Many Benefits of Money…Data Type!6discusses the money data type.
Best Practices for Semantic Data Modeling for Performance and Scalability
7 contains best practices, in addition to some theoretical information and some practical hints.
Case Studies and References
Examples of successful architectures are described in the following case studies and white papers:
A relational model of data for large shared data banks
8 by Dr. E. F. Codd
bwin: Global Online Gaming Company Deploying SQL Server 2008 to Support 100 Terabytes
ServiceU Corporation: Online Event Software Gets Disaster Recovery with SQL Server 2005 Database Mirroring
The Progressive Group: Progressive Prepares for Future Growth, Gains Agility with SQL Server 2005
Questions and Considerations
This section provides questions and issues to consider when working with your customers.
Customers have found the articles on SQL Server Performance.com12 useful in relation to performance tuning of SQL Server along, with the blogs from SQL Server MVPs such as Kimberly Tripp's blog13 and the blog aggregation on SQL Blog.com.14
To effectively architect and design solutions, you should understand the type and volume of the data to be stored. If there is a need to store non-relational data in addition to the relational data, review the "Beyond Relational" Technical Reference Guide.
Microsoft StreamInsight which was introduced in SQL Server 2008 R2 should be considered in addition to a relational data store if any of the following considerations apply:
Is the data coming from process, manufacturing, or other real time systems?
Is rapid computation is required on a stream of data?
Is it possible that a "burst" of data may arrive out of sequence but still need to be processed as though it was received in order?
Is there a need for computation of results over a time window such as the need for the calculation of running averages, maximum values, or minimum values for the last three minutes of input streaming data?
- Is the data coming from process, manufacturing, or other real time systems?
Understand the characteristics of the data to see if you have sparse data. You might be able to use filtered indexes that are small, concise, and that provide good performance.
In general, relational data compresses well; compression ratios of 70% to 80% have been observed. Generally, for read-oriented workloads, compressed objects appear to provide good performance. If appropriate, ask for a sample of the data to see if it is possible to actually load and examine the compression that can be achieved.
Pure relational systems have been around for long time, with documented best practices for design, performance, availability, and manageability readily available in text books and SQL Server documentation. The importance of physical database design, application code interaction, and operational best practices is well understood, but at times, it is skipped because of time pressure. Be on the lookout for these elements.
Ask about the usage pattern and volumetric information to understand if the system needs to be designed for scale-out to meet the potential user and usage growth. Will a traditional scale-up system suffice?
When designing high-throughput transactional systems, choose the data access library carefully. For high-volume systems consider using pure ADO.NET. The ADO.NET Entity Framework and similar libraries can save development time, but it can be difficult to control the exact queries issued by the system.
Following are the full URLs for the hyperlinked text.
1 General Index Design Guidelineshttp://msdn.microsoft.com/en-us/library/ms191195.aspx
2 Using Statistics to Improve Query Performancehttp://msdn.microsoft.com/en-us/library/ms190397.aspx
3 Comparing Tables Organized with Clustered Indexes versus Heapshttp://sqlcat.com/whitepapers/archive/2007/12/16/comparing-tables-organized-with-clustered-indexes-versus-heaps.aspx
4 Using Sparse Columnshttp://msdn.microsoft.com/en-us/library/cc280604.aspx
5 Using Filtered Statistics with Partitioned Tableshttp://sqlcat.com/msdnmirror/archive/2009/10/20/using-filtered-statistics-with-partitioned-tables.aspx
6 The Many Benefits of Money…Data Type!http://sqlcat.com/technicalnotes/archive/2008/09/25/the-many-benefits-of-money-data-type.aspx
7 Best Practices for Semantic Data Modeling for Performance and Scalabilityhttp://sqlcat.com/whitepapers/archive/2008/09/03/best-practices-for-semantic-data-modeling-for-performance-and-scalability.aspx
8 A relational model of data for large shared data banks by Dr. E. F. Coddhttp://portal.acm.org/citation.cfm?doid=362384.362685
9 bwin: Global Online Gaming Company Deploying SQL Server 2008 to Support 100 Terabyteshttp://www.microsoft.com/casestudies/Microsoft-SQL-Server-2008/bwin/Global-Online-Gaming-Company-Deploying-SQL-Server-2008-to-support-100-Terabytes/4000001470
10 ServiceU Corporation: Online Event Software Gets Disaster Recovery with SQL Server 2005 Database Mirroringhttp://www.microsoft.com/casestudies/Case_Study_Detail.aspx?CaseStudyID=49683
11 The Progressive Group: Progressive Prepares for Future Growth, Gains Agility with SQL Server 2005http://www.microsoft.com/casestudies/Microsoft-SQL-Server-2005-Enterprise-X64-Edition/Progressive-Group-The/Progressive-Prepares-for-Future-Growth-Gains-Agility-with-SQL-Server-2005/4000002133
12 SQL Server Performance.comhttp://www.sql-server-performance.com/
13 Kimberly Tripp's Bloghttp://www.sqlskills.com/BLOGS/KIMBERLY
14 SQL Blog.comhttp://sqlblog.com