Custom Architecture (DW)---a Technical Reference Guide for Designing Mission-Critical DW Solutions
|
Want more guides like this one? Go to Technical Reference Guides for Designing Mission-Critical Solutions. |
Customer Microsoft SQL Server architectures for data warehousing (DW) cover a full range of configurations. Often, the configuration is predicated by the "deal" that can be made between the client and the HW Vendor, and not always on carefully analyzed, business-driven, balanced design based on DW workloads.
Best Practices
The following resources provide reference material and additional information.
-
Hardware 101 for SQL Server DBAs
1
-
Solid State Disk Drive Considerations for SQL Server
2
-
Benchmarks
3
-
TPC-H - All Results - Sorted by Database Vendor Version 2 Results
4
-
Storage Top 10 Best Practices
5
-
Top 10 SQL Server 2005 Performance Issues for Data Warehouse and Reporting Applications
6
-
Disk Partition Alignment Best Practices for SQL Server
7
-
SQLIO Disk Subsystem Benchmark Tool
8
-
Strategies for Partitioning Relational Data Warehouses in Microsoft SQL Server
9
-
Best Practices for Data Warehousing with SQL Server 2008
10
-
High Performance Data Warehouse with SQL Server 2005
11
Case Studies and References
This section provides questions and issues to consider when working with your customers.
-
Determine the customer’s hardware vendor of choice.
-
Understand the customer’s data center status:
-
Availability to accommodate special HW for Data Warehouse.
-
Will the DW have to share a SAN or can it use dedicated storage?
-
Availability to accommodate special HW for Data Warehouse.
-
Determine all connectivity requirements (e.g. connection managers, OLEDB, ODBC, JDBC, and so on).
-
Understand source database design (Star Schema, normalized).
-
Understand if the client will rely on Data Extract (e.g. BCP) versus ETL/ELT (e.g. SSIS).
-
For BCP Extracts, determine where data will land:
-
Storage platform and architecture.
-
Can extracts be multi-threaded?
-
Storage platform and architecture.
-
Determine ETL/ELT requirements with regards to:
-
Data volume.
-
Data “latency” or timing (how frequently does data need to be loaded.)
-
Transformation complexity (from simple table lookups to complex business-oriented /data enrichment transforms).
-
Data volume.
-
Determine geographical proximity of SQL Server source and target systems.
-
Understand characteristics of the target (is it another SQL Server instance, or a flat file, or another RDBMS or analytical platform (e.g. SSAS)?).
Questions and Considerations
This section provides questions and issues to consider when working with your customers.
-
Understand the DW workload in terms of data loading and query workloads.
-
Determine data latency requirements in conjunction with query/load concurrency.
-
Workload requirements may vary by user/time of day/ query type and must be governed to prevent exhausting the system for a single query.
-
Understand data security (row-level, column-level) requirements.
-
Understand High Availability/Disaster Recovery requirements.
-
Determine database backup/restore requirements in terms of frequency/storage.
-
Develop an ongoing maintenance strategy for performance monitoring and tuning.
-
Determine the technology landscape in terms of legacy RDBMS systems, ETL tools, BI tools.
-
Determine deployment strategy:
-
System Sizing Considerations
-
System Configuration
-
Software Installation/Configuration
-
Physical DW Database Design
-
Metadata Management
-
Designing the ETL system
-
Developing BI Applications
-
Developing the security model
-
Operations Management
-
Managing Growth
-
System Sizing Considerations
Appendix
Following are the full URLs for the hyperlinked text:
1 Hardware 101 for SQL Server DBAs http://www.mssqltips.com/tip.asp?tip=1331
2 Solid State Disk Drive Considerations for SQL Server http://www.mssqltips.com/tip.asp?tip=1389
3 Benchmarks www.microsoft.com/sqlserver/2008/en/us/benchmarks.aspx
4 TPC-H - All Results - Sorted by Database Vendor Version 2 Results http://www.tpc.org/tpch/results/tpch_results.asp?orderby=dbms
5 Storage Tope 10 Best Practices http://sqlcat.com/top10lists/archive/2007/11/21/storage-top-10-best-practices.aspx
6 Top 10 SQL Server 2005 Performance Issues for Data Warehouse and Reporting Applications http://technet.microsoft.com/en-us/library/cc917690.aspx
7 Disk Partition Alignment Best Practices for SQL Server http://technet.microsoft.com/en-us/library/dd758814%28SQL.100%29.aspx
8 SQLIO Disk Subsystem Benchmark Tool http://www.microsoft.com/downloads/en/details.aspx?familyid=9A8B005B-84E4-4F24-8D65-CB53442D9E19&displaylang=en
9 Strategies for Partitioning Relational Data Warehouses in Microsoft SQL Server http://technet.microsoft.com/en-us/library/cc966457.aspx
10 Best Practices for Data Warehousing with SQL Server 2008 http://msdn.microsoft.com/en-us/library/cc719165%28v=sql.100%29.aspx
11 High Performance Data Warehouse with SQL Server 2005 http://www.microsoft.com/downloads/en/details.aspx?FamilyID=8D708350-4867-4447-AF69-B9AF2C2769F9&displaylang=en
12 SQL Server 2008 hardware and software requirements http://searchsystemschannel.techtarget.com/generic/0,295582,sid99_gci1372853,00.html
13 EMC Celerra Unified Storage Platforms: Reference Architecture http://www.emc.com/collateral/hardware/technical-documentation/h2970-emc-celerra-ns20-iscsi-ref-arch.pdf
