SQL Server's BLOB Data Types
Return of the BLOB
Use the new ADO 2.6 Stream object to keep BLOBs contained
Editor's Note: This article was adapted from the SQL Server 2000 Developer's Guide (Osborne/McGraw-Hill) by Paul Conte and Michael Otey.
When you're designing and programming databases, images and audio or video clips probably aren't among the first things that you consider storing. However, today's databases often need to store employee photos, product pictures, and the like. Even the ability to play on-demand video clips and MP3 files isn't uncommon for the current crop of database applications. Fortunately, SQL Server can handle these binary files as easily as it does the more traditional integer and character data types. Let's look at how SQL Server treats binary objects, how to use SQL Server to store binary data, and how to use these data types in your applications. You can use the ADO Stream object to import binary data into SQL Server. And you can use an ADO Recordset object to retrieve that data and display it in both a data-bound image control and an unbound Microsoft Hierarchical FlexGrid.
From a database perspective, a binary data object is known as a binary large object, or BLOB. As the acronym suggests, this type of data can take up a lot of space. Using the image, text, or ntext data types, SQL Server can store binary data in tables. Table 1, page 32, lists the attributes of the data types that SQL Server can use for BLOB storage.
The image data type is the primary BLOB-storage data type; you can use it to hold virtually any type of binary data. Text and ntext are much like the Microsoft Access Memo data type and are particularly useful for storing variable amounts of text data. For example, you would use SQL Server's text data type if you needed to store the entire contents of a text (.txt) file. The text and ntext data types also let you search column data for text strings. And full-text indexing, a new SQL Server 2000 feature, enables searching of binary data types by associating the column with its target application.
Besides these natural BLOB data types, a few other SQL Server data typesvarchar, nvarchar, and varbinarymight seem to be viable candidates for storing BLOB data because they store variable amounts of data. For the most part, though, these types won't work for BLOB storage. The primary difference between these and the usable data types is the total amount of data that each type can store, as Table 2, page 32, shows. The true BLOB data types that Table 1 shows can hold as much as 2GB of data, whereas the data types in Table 2 can store a maximum of 8KB. (For more information about how SQL Server handles BLOB storage, see the sidebar "SQL Server BLOB Storage," page 33.) Under certain circumstances, you can use the varbinary data type to store BLOBs because, like the image data type, it can store binary data. Although varbinary isn't a true BLOB data type, you can use varbinary fields to hold small binary objects such as icons, wave files, or small bitmaps for which you know the maximum amount of data is less than 8000 bytes.
In many ways, you can treat SQL Server's binary data typesimage, text, and ntextexactly like standard char, varchar, and int data types. Although the binary data types have some limitations that might surprise you at first, when you consider the nature of binary data, they make sense. First, you can't use SQL Server's binary data types as local variables in stored procedures or T-SQL batches. Likewise, SQL Server can't use them as part of an index or in WHERE, ORDER BY, COMPUTE, or GROUP BY clauses. The exception to this restriction is that you can use the text and ntext data types with the LIKE keyword in a WHERE clause.
T-SQL provides a few rudimentary functions for dealing with BLOB data, but the primary mechanism for accessing BLOB data stored in SQL Server is ADO. (For more information about the T-SQL commands that can access BLOB data, see the Web sidebar "T-SQL Functions That Work with Text and Image Data" at http://www.sqlmag.com, InstantDoc ID 20842.) ADO 2.6's Stream object lets you read and write the contents of binary columns. The Stream object is a powerful replacement for the GetChunk and AppendChunk methods, which have existed in ADO since version 1.0. Although GetChunk and AppendChunk are workable, they're awkward to use. (For more information about using GetChunk and AppendChunk to work with BLOBs, see "Who's Afraid of the Big, Bad BLOB?" April 1999.) The Stream object vastly simplifies and streamlines the process of using ADO to access binary data. Figure 1, page 33, illustrates the ADO 2.6 object model that contains the new Stream Object.
Importing BLOB Data into SQL Server
Before you think about retrieving binary data from SQL Server, you need to get that data into the database in the first place. Binary files are typically produced by applications outside of SQL Server and are most commonly stored as standard OS files. Some of the most common binary file types contain image data and carry .bmp, .jpeg, .gif, and .tif extensions. Other binary data takes the form of sound files, which typically have the file extension .wav or .mp3. Another type of binary data that you might want to store in SQL Server is video data, which consists of both image and sound data. Video files typically end in the extension .avi, .mpeg, or .asx. Although these various BLOB files contain different types of data, SQL Server stores them all in the same way. The code in Listing 1 illustrates how you can use the ADO Stream object to import a .jpeg image into a SQL Server table. (Although this discussion centers on multimedia data, you can use SQL Server's image data type to store any type of binary object, including Microsoft Word .doc files, PowerPoint .ppt files, or even .exe executable programs.)
The subroutine in Listing 1 begins by passing in an instance of an active Connection object; it then creates a Recordset object named rs and an instance of the Stream object named stm. The Stream object reads the binary data from the OS file, then writes the data to the SQL Server database. Next, the code drops and recreates a three-column table named BinaryObject in the Pubs database. The first column is an Identity column that gives each row a unique identifier; the second, a varchar column, contains the path and name of the file that you're exporting; and the third is an image column that holds the binary object.
One thing to note about storing binary data is that storing the original filename or the data type extension can be useful when an image column contains more than one type of data. Storing the file type gives you an easy way to determine how to process each type of binary information in your application. For example, by examining the extension, you can decide whether to use an image control to display the data or use Microsoft Windows Media Player to play it.
After the subroutine creates the ADO objects, the next section of code opens the rs Recordset object, which will insert data into the BinaryObject table. The Where 1=2 clause ensures that the code returns no extraneous data to the application. Then, the code uses the Stream object to read the binary file in from disk. First, the code sets the stm Stream object's Type property to adTypeBinary, which specifies that the Stream object will contain binary data. If the Stream object contained text data, you'd use the adTypeText constant. The Stream object's Open method then opens the stream, essentially preparing it to be written to or read from by the object's methods. In this example, the code uses the Stream object's LoadFromFile method to load the Stream object from disk. The LoadFromFile method takes one parameter, which provides the path and name of the OS file that will be loaded into the Stream object. Here, the file named BLOBSample.jpg in the current application path will be loaded into the Stream object.
After loading the Stream object, the code assigns the object's contents to the image column in the rs Recordset object. First, the code executes the Recordset object's AddNew method to create a new record buffer, then assigns values to the blob_filename and blob_object columns. The code sets the blob_filename column to the name of the file, then assigns the contents of the Stream object to the blob_object column. The stm Stream object's Read method reads the data out of the Stream object and into the ADO Field object. Then, the code calls the Recordset object's Update method to write the data to the SQL Server BinaryObject table. After writing the binary data, the code closes the Recordset and Stream objects and releases their resources.
Using a Bound Image Control
After you store the binary data in SQL Server, you can retrieve that data and use it in your applications. Using a bound control to display binary data is the easiest way to incorporate multimedia information into your applications. The code in Listing 2 shows how you can use an ADO Recordset object to retrieve binary data and display that data in a bound image control.
As in the previous example, the subroutine passes an active instance of the Connection object to the subroutine, then creates a new instance of the Recordset object. From the BinaryObject table, the Open method then selects one row, in which the value of the blob_id column equals 1. (This row will be the first row in the BinaryObject table.)
After populating the recordset, the code assigns the DataSource property of the imgBinaryData image control to the rs Recordset object. Then, it sets the imgBinaryData object's DataField property to the recordset's blob_object column, which contains the binary data. Assigning the DataSource and DataField properties automatically loads the image into the image control, where the application can display it. Note that each data-bound interface object can display only the data types that it supports. In the case of the image control, the object supports displaying bitmap, icon, metafile, JPEG, and GIF files. Figure 2, page 36, shows a binary image of my favorite SQL Server author, retrieved as a data-bound image.
Using an Unbound Hierarchical Grid Control
Bound controls make displaying binary image data from a SQL Server table easy, but not every control supports data binding. Sometimes even if the control supports data binding, you can't use data binding with that control because you might want to display the data differently than data binding allows. For example, when binding an image to a grid control, you can't control the size that the grid automatically assigns to the cell. The DisplayBLOBGrid subroutine, which Listing 3, page 36, shows, demonstrates how to retrieve binary image information from the Pubs database's pub_info table and display it in an unbound Hierarchical FlexGrid control.
In this listing, note that using ADO to display binary data is much like working with standard character data, but the two have some differences. This sample subroutine begins by creating the required ADO objectsin this case, a Recordset object, a Field object, and a Stream object. In this example, the Recordset object contains the pub_id and logo columns from the pub_info table. So far, the code seems fairly standard. However, the logo column is an image data type containing a graphical image for each publisher's logo. The ADO Field object represents the current column data, and the Stream object reads the binary data from the image column. Next, the code creates the sFileName String variable that will hold the temporary filename and sets the variable's value to tempBLOB.bmp, the name of the temporary file that will contain the BLOB images before the LoadPicture function loads them to the Hierarchical FlexGrid.
After opening the Recordset object, the code initializes the grid that it will use to display the data. First, it turns off the Redraw property to improve performance and clears the grid of any existing data. Then, the code initially sizes the grid for two columns and uses the ColWidth property of the two grid columns to set the initial column size. Because the first column is a standard character data type, the Field object's ActualSize property specifies the column width. However, because the second column contains graphical data, the ActualSize property won't reflect the length required to display the data. To ensure that the column is large enough to display the binary image, the code sizes the second column for 200 characters. Next, the grid's RowHeightMin property increases the height for all the grid's rows; in this case, the original row height is tripled. After setting up the grid, the code assigns the column headings for the grid, using a For Each loop to iterate through the Fields collection. The column headings for binary data types are the same as the column headings for standard character columns.
Next, a Do...Until loop reads all the Recordset object's rows. For every row, a For...Each loop retrieves the column data and puts it into the grid. Because binary data requires different handling than text data, the code checks each Field object's Type property before moving the data to the grid. If the Field object contains binary data, the Field object's Type property will equal the constant adLongVarBinary, and the code will use the Stream object, stm, to read data from the current Field object. Inside the With block, the code first sets the Stream object's Type property to adTypeBinary, specifying that the Stream object will contain binary data; then the Open method executes, followed by the Write method. In this case, the Write method writes the fld Field object's value into the Stream object. Next, the Stream object's SaveToFile method writes the Stream object's contents into the temporary OS file that the code identified in the sFileName variable. Using the value of adSaveCreateOverWrite in the second parameter of the SaveToFile method ensures that running this code will overwrite any existing file with the same name. After the code writes the binary data to the temporary file, the LoadPicture function reads the binary data from the temporary file and loads it into the grid's CellPicture property. Figure 3 shows the results of the DisplayBLOBGrid subroutine.
ADO and SQL Server can handle BLOB objects just as well as they handle traditional text and numeric data types. Using ADO 2.6's new Stream object makes working with binary data easier than ever. Although dealing with binary objects is more difficult than handling standard data types, the ability to add graphics and sounds to your database applications makes the effort worthwhile.
Copyright © 2002 Penton Media, Inc. All rights reserved.