Beyond Relational (OLTP)---a Technical Reference Guide for Designing Mission-Critical OLTP Solutions
Want more guides like this one? Go to Technical Reference Guides for Designing Mission-Critical Solutions.
Almost all tier-one deployments have relational characteristics. However, some also have a significant emphasis on data types such as binary large objects (BLOBs), spatial, and XML, with associated full-text search or other specialized processing. Such deployments require merging considerations for both relational and "beyond relational" data types, and benefit from the best practices from both disciplines. With beyond relational data types, it is not unusual to see tens, hundreds, and even thousands of terabytes of data. As the data size and the number of users increase, adoption of scale-out architectures, as expected, are often seen. For example, one customer (London Drugs) manages in excess of 30 TB of data. They started with scale-up architecture, but decided to adopt scale-out because of the size and growth of beyond relational content.
The following resources provide reference material about beyond relational data. (Note that the full URLs for the hyperlinked text are provided in the Appendix at the end of this document.)
The white paper Managing Unstructured Data with SQL Server 20081 provides a 200-level discussion of FILESTREAM and Remote BLOB Storage (RBS).
The white paper Delivering Location Intelligence with Spatial Data2 provides a 200-level discussion of the spatial data type.
Case Studies and References
Examples of successful architectures are described in the following case studies and white papers:
The case study NewsGator: RSS Aggregator NewsGator Manages 2.5 Billion Articles with SQL Server 20083 describes the use of integrated full-text search (iFTS).
The case study McLaren Electronics Systems: McLaren Electronics Fuels Analysis of Formula One Racing Data with SQL Server4 describes the use of FILESTREAM.
The case study FileControl Partners: Faster Searches of 1 Terabyte of Litigation Documents Gained with SQL Server 20055 describes the use of full text indexing.
Questions and Considerations
This section provides questions and issues to consider when working with your customers.
Is the nature of the data unstructured or semi-structured, and does the application require "relational-like" functionality?
For any given beyond relational feature, pay attention to the latency and volume requirements at different points in time. This is because beyond relational features (data types) are typically complex and/or large, demanding that attention be applied along the consumption path. This is different from a typical relational implementation, in which data tends to be atomic, small in size, and well aligned with common data types across the consumption path. For example, designing for the reading or writing of an XML or varchar(max) value is quite different than it is for an integer value.
When thinking about beyond relational formats and design considerations, or even about feature selection, consider the fact that data will be consumed differently in different scenarios:
In cache (in memory without atomicity, consistency, isolation, and durability [ACID] properties)
In memory with ACID properties
Persistent on a disk with ACID properties
- In stream
Fully understand the scalability and performance requirements.
Brainstorm with the customer to see if Windows Azure and/or SQL Azure can be used to help meet scalability challenges; Windows Azure and SQL Azure can help provide flexible scalability.
Investigate if scale-out architecture is possible for the database tier. Scale-out architectures are covered in other Technical Reference Guides.
Fully understand the user scenarios and the expected volumes. Experience with customers like MySpace indicates that when unexpected growth occurs, it is necessary to be flexible and adopt different architectures. The changes are likely to be minimized if scale-out architecture is adopted from the beginning.
Following are the full URLs for the hyperlinked text.
1 Managing Unstructured Data with SQL Server 2008http://www.microsoft.com/sqlserver/2008/en/us/wp-sql-2008-manage-unstructured.aspx
2 Delivering Location Intelligence with Spatial Datahttp://www.microsoft.com/sqlserver/2008/en/us/wp-sql-2008-spatial-data.aspx
3 NewsGator: RSS Aggregator NewsGator Manages 2.5 Billion Articles with SQL Server 2008http://download.microsoft.com/download/3/5/8/35802290-a7f6-4976-8855-74c8b3b7f035/NewsGator-SQLServer2008CaseStudy02-24-08b.docx
4 McLaren Electronics Systems: McLaren Electronics Fuels Analysis of Formula One Racing Data with SQL Serverhttp://www.microsoft.com/casestudies/Microsoft-SQL-Server-2008/McLaren-Electronic-Systems/McLaren-Electronics-Fuels-Analysis-of-Formula-One-Racing-Data-with-SQL-Server/4000001476
5 FileControl Partners: Faster Searches of 1 Terabyte of Litigation Documents Gained with SQL Server 2005http://www.microsoft.com/casestudies/Microsoft-Visual-Studio-6.0/FileControl-Partners/Faster-Searches-of-1-Terabyte-of-Litigation-Documents-Gained-with-SQL-Server-2005/200075