TechNet Magazine > Home > Issues > 2008 > September >  Use a disconnected recordset to sort large data...
Hey, Scripting Guy! Stay Connected with Your Toaster
The Microsoft Scripting Guys

Code download available at: HeyScriptingGuy2008_09.exe (150 KB)

If there's anything that sums up this modern world of ours it's these two words: stay connected. Thanks to cell phones, you no longer have to be home in order for people to give you a call; they can reach you wherever you are, at any time of the day or night. (Oh. How … nice.) Thanks to wireless computing, you don't have to be at the office to do your work; now you can work from home, from the beach, from just about anywhere you can think of.
True story: the Scripting Editor's parents recently went on a camping trip, but they were forced to rough it—just like the great explorers Lewis and Clark—when they ran into problems connecting to the campground's wireless network. Thank goodness the satellite TV still worked!
And that's not the half of it. GPS devices let you know exactly where you stand, within a matter of feet; depending on the device, they can also let other people know exactly where you stand. (The old phrase "you can run but you can't hide" was never more true than it is today.) If he wanted to, the Scripting Guy who writes this column could have his checking account phone him every time a check clears; likewise, he could have his car e-mail him monthly status reports. As if that wasn't enough, the toaster has offered to walk the dog and water the plants any time he goes on vacation.
Well, OK, so maybe that last part hasn't happened—yet. But if he wanted to, the Scripting Guy could buy an Internet-enabled toaster. Then he could phone his toaster on his way home and have piping hot toast waiting for him as he walked through the door. To be honest, we don't know why you'd want to have piping hot toast waiting for you when you walk through the door. But if you did …
Of course, if everyone's goal is to stay connected, it should come as no surprise that the Scripting Guys—who have never been slaves to trendiness—advocate that you become more disconnected. Does that mean that the Scripting Guys recommend you throw away your cell phone or laptop computer? No; even the Scripting Guys are smarter than that. However, what they are advocating is that you add disconnected recordsets to your scripting arsenal. But if you want to throw away your cell phone or laptop computer, well, we won't stop you.
Note According to a survey taken by Harris Interactive, 43 percent of Americans have used a laptop computer while on vacation in order to check and send work-related e-mail. And more than 50 percent of Americans use their cell phones while on vacation to check their e-mail and/or voicemail. And that doesn't include the 40 percent of Americans who take no vacation over the course of a year.
It goes without saying that a lot of people would happily add disconnected recordsets to their scripting arsenal, except for one thing: they have no idea what a disconnected recordset is. Well, in case you aren't familiar with the concept, a disconnected recordset is (more or less) a database table that isn't tied to an actual database; instead, it's created by a script, it exists only in memory, and it disappears the moment the script ends. In other words, a disconnected recordset is a made-up data structure that exists for only a few minutes and then disappears, taking your data with it. Gee, that sounds really useful, Scripting Guys. Thanks for your help!
OK, we admit it: disconnected recordsets don't sound too terribly exciting. And, the truth is, they aren't. But they can be extremely useful. As veteran VBScript writers know all too well, VBScript doesn't exactly have the best data-sorting capabilities in the world. (Well, not unless you consider no data-sorting capabilities to be the best data-sorting capabilities in the world.) Likewise, VBScript's ability to deal with large sets of data is limited, at best. Outside of the Dictionary object (which restricts you to working with items that have, at most, two properties) or the array (which is largely limited to single-property lists of data), well … that's about it.
The disconnected recordset lets you take care of both of those issues (and more). Need to sort your data, particularly multi-property data? No problem; like we said, a disconnected recordset is the virtual equivalent of a database table, and there's nothing easier in this world than sorting a database table. (OK, if you want to get picky about it, we suppose that not sorting a database table is easier than sorting a database table.) Or perhaps you've got a large set of items, items with multiple properties, that you need to keep track of? No problem; did we mention that a disconnected recordset is the virtual equivalent of a database table? Do you need to filter that information in some way, or maybe search that data for a specific value? Oh, if only there were some way to use the virtual equivalent of a database table …
Good point: maybe it's about that time to just show you what we're talking about. (Assuming we even know what we're talking about.) For starters, let's say we have the baseball statistics shown in Figure 1, statistics gleaned from the Web site MLB.com and stored in a tab-separated values file, C:\Scripts\Test.txt.
Player Home Runs RBI Average
D Pedroia 4 28 .276
K Kouzmanoff 8 25 .269
J Francouer 7 35 .254
C Guzman 5 20 .299
F Sanchez 2 25 .238
I Suzuki 3 15 .287
J Hamilton 17 67 .329
I Kinsler 7 35 .309
M Ramirez 12 39 .295
A Gonzalez 17 55 .299
That's all well and good, but suppose what we'd really like to do is show this list of players sorted by the number of home runs they've hit. Can a disconnected recordset help us with something like that? We're about to find out; take a look at Figure 2. Yes, there is a lot of code here, isn't there? But don't worry; as you'll soon see, its bark is far worse than its bite.
Const ForReading = 1
Const adVarChar = 200
Const MaxCharacters = 255
Const adDouble = 5

Set DataList = CreateObject("ADOR.Recordset")
DataList.Fields.Append "Player", _
  adVarChar, MaxCharacters
DataList.Fields.Append "HomeRuns", adDouble
DataList.Open

Set objFSO = _
  CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile _
  ("C:\Scripts\Test.txt", ForReading)

objFile.SkipLine

Do Until objFile.AtEndOfStream
    strStats = objFile.ReadLine
    arrStats = Split(strStats, vbTab)

    DataList.AddNew
    DataList("Player") = arrStats(0)
    DataList("HomeRuns") = arrStats(1)
    DataList.Update
Loop

objFile.Close

DataList.MoveFirst

Do Until DataList.EOF
    Wscript.Echo _
        DataList.Fields.Item("Player") & _
        vbTab & _
        DataList.Fields.Item("HomeRuns")
    DataList.MoveNext
Loop
To begin, we define four constants:
  • ForReading. We'll use this constant when we open, and read from, the text file.
  • adVarChar. This is a standard ADO constant for creating a field that uses the Variant data type.
  • MaxCharacters. This is a standard ADO constant used to indicate the maximum number of characters (in this case, 255) that a Variant field can hold.
  • adDouble. One final ADO constant, for creating a field that uses a double (numeric) data type.
After defining the constants, we encounter this block of code:
Set DataList = CreateObject _
    ("ADOR.Recordset")
DataList.Fields.Append "Player", _
    adVarChar, MaxCharacters
DataList.Fields.Append "HomeRuns", _
    adDouble
DataList.Open
This is the part of the script where we actually set up and configure our disconnected recordset. To accomplish this task, the first thing we do is create an instance of the ADOR.Recordset object; needless to say, that creates our virtual database table (that is, our disconnected recordset).
We then use this line of code (and the Append method) to add a new field to the recordset:
DataList.Fields.Append "Player", adVarChar, MaxCharacters
As you can see, there's nothing fancy going on here at all: we simply call the Append method followed by three parameters:
  • The name of the field (Players).
  • The data type for the field (adVarChar).
  • The maximum number of characters that can be stored in the field (MaxCharacters).
After adding the Players field, we can add a second field: HomeRuns, which has a numeric (adDouble) data type. When we've finished with that task, we then call the Open method to declare our recordset open and ready for business.
Next we create an instance of the Scripting.FileSystemObject and open the file C:\Scripts\Test.txt. This portion of the script actually has nothing to do with the disconnected recordset; it's there only because we need to retrieve data from a text file. The first line in the text file contains our header information:
Player     Home Runs     RBI        Average
We don't need this information for our recordset, so the first thing we do after opening the file is call the SkipLine method to skip this first line:
objFile.SkipLine
Now that we've moved to the first line with actual data in it, we set up a Do Until loop designed to let us read the rest of the file line by line. Each time we read in a line from the file, we store that value in a variable named strLine, then use the Split function to convert that line to an array of values (by splitting the line each time we encounter a tab):
arrStats = Split(strStats, vbTab)
Admittedly, that's kind of a quick overview, but we expect that, by now, most of you are pretty good at retrieving information from text files. To make a long story short, the first time through the loop the array, arrStats, will contain the items in Figure 3.
Item Number Item Name
0 D Pedroia
1 4
2 28
3 .276
Now we're ready to have some fun:
DataList.AddNew
DataList("Player") = arrStats(0)
DataList("HomeRuns") = arrStats(1)
DataList.Update
Here we're adding the information for player 1 (D Pedroia) to the disconnected recordset. To add a record to the recordset, we begin by calling the AddNew method; this creates a new, blank record for us to work with. We use the next two lines of code to assign values to two recordset fields (Player and HomeRuns), then we call the Update method to officially write that record to the recordset. And then it's back to the top of the loop once more, where we repeat the process with the next line—the next player—in the text file. See? There might be lots of code here, but it's all pretty simple and straightforward.
So what happens after all the players have been added to the recordset? Well, after we close the text file, we execute this block of code:
DataList.MoveFirst

Do Until DataList.EOF
  Wscript.Echo _
    DataList.Fields.Item("Player") & _
    vbTab & _
    DataList.Fields.Item("HomeRuns")
  DataList.MoveNext
Loop
In line 1 we use the MoveFirst method to position the cursor at the beginning of the recordset; if we don't do that, we run the risk of showing only some of the data in the recordset. We then set up a Do Until loop that will continue until we've run out of data (that is, until the recordset's EOF—end-of-file—property is True).
Inside that loop, all we do is echo back the values of the Player and HomeRuns fields (note the somewhat odd syntax used to indicate a particular field: DataList.Fields.Item("Player"). And then we simply call the Move­Next method to move on to the next record in the recordset.
Needless to say, that was really easy. When all is said and done we should get back the following:
D Pedroia       4
K Kouzmanoff    8
J Francouer     7
C Guzman        5
F Sanchez       2
I Suzuki        3
J Hamilton      17
I Kinsler       7
M Ramirez       12
A Gonzalez      17
As you can see, that's—well, come to think of it, that's really not all that good, is it? Granted, we did get back the players' names and home run totals, but we didn't get those home run totals in sorted order. Dang; why didn't the disconnected recordset sort our data for us?
There's actually a good reason: we didn't tell the recordset which field we wanted to sort on. But that's easy enough to correct: just modify the script to add sorting information right before the call to the MoveFirst method. In other words, make that part of the script look like this:
DataList.Sort = "HomeRuns"
DataList.MoveFirst
Obviously there's no trick involved here; we simply assign the field HomeRuns to the Sort property. Now take a look at the output we get when we run the script:
F Sanchez       2
I Suzuki        3
D Pedroia       4
C Guzman        5
J Francouer     7
I Kinsler       7
K Kouzmanoff    8
M Ramirez       12
J Hamilton      17
A Gonzalez      17
Much better. Well, except for one thing: usually home run totals are listed in descending order, with the player having the most home runs coming first. Is there any way to sort a disconnected recordset in descending order?
Of course there is; all we have to do is tack on the helpful DESC parameter, like so:
DataList.Sort = "HomeRuns DESC"
And what does the DESC parameter do for us? You got it:
A Gonzalez      17
J Hamilton      17
M Ramirez       12
K Kouzmanoff    8
I Kinsler       7
J Francouer     7
C Guzman        5
D Pedroia       4
I Suzuki        3
F Sanchez       2
Incidentally, it's perfectly legal to sort on multiple properties; all you have to do is assign each of those properties to the sort order. For example, suppose you'd like to sort first on home runs and then on RBIs. No problem; this command will do the trick:
DataList.Sort = "HomeRuns DESC, RBI DESC"
Give it a try and see for yourself. It's not as much fun as checking your e-mail while on vacation, but it's close.
Note Keep in mind that you can't sort on a field that hasn't been added to the recordset. What does that mean? It means that before you add a property such as RBI to the Sort statement, you need to add these lines to your script in the appropriate locations:
DataList.Fields.Append "RBI", adDouble

DataList("RBI") = arrStats(2)
And if you want to look at the output, you also need to modify your Wscript.Echo statement:
Wscript.Echo _
  DataList.Fields.Item("Player") & _
  vbTab & _
  DataList.Fields.Item("HomeRuns") & _
  vbTab & DataList.Fields.Item("RBI")
Let's see, what else can we do with disconnected recordsets? Oh, here's something. Suppose we retrieve all the information for all the players and then sort that data by batting average. (Among other things, that means we need to modify our original script to create fields named RBI and Batting­Average.) The output looks like this:
J Hamilton      17      67      0.329
I Kinsler       7       35      0.309
A Gonzalez      17      55      0.304
C Guzman        5       20      0.299
M Ramirez       12      39      0.295
I Suzuki        3       15      0.287
D Pedroia       4       28      0.276
K Kouzmanoff    8       25      0.269
J Francouer     7       35      0.254
F Sanchez       2       25      0.238
That's fine, but what if we just want a list of the players hitting .300 or better? How can we limit the displayed data just to the players who match some specified criteria? Well, one way is to assign a Filter to the recordset:
DataList.Filter = "BattingAverage >= .300"
A recordset filter serves the same general purpose as a database query: it provides a mechanism for limiting returned data to a subset of all the records in the recordset. In this case we're simply asking the Filter to weed out all the records except those where the Batting­Average field has a value greater than or equal to .300. And guess what? Filter will do exactly what we ask it to:
J Hamilton      17      67      0.329
I Kinsler       7       35      0.309
A Gonzalez      17      55      0.304
If only our kids would respond that same way, eh?
By the way, you can use multiple criteria in a single filter. For example, this command limits the returned data to records where the BattingAverage field is greater than or equal to .300 and the HomeRuns field is greater than 10:
DataList.Filter = _
  "BattingAverage >= .300 AND HomeRuns > 10"
By contrast, this filter limits data to records where the BattingAverage field is greater than or equal to .300 or the HomeRuns field is greater than 10:
DataList.Filter = "BattingAverage >= .300 OR HomeRuns > 10"
Give both of those a try and you'll see what the difference is. And what the heck: just for fun, here's another filter you can try:
DataList.Filter = "Player LIKE 'I*'"
As it turns out, you can also use wildcards in your filters. To do so, use the LIKE operator (as opposed to the equals sign) and then use the asterisk the same as you would when carrying out an MS-DOS® command like dir C:\Scripts\*.txt. In the preceding example, we should get back a list of players whose name starts with the letter I; that's because the syntax we employed says, "Show me a list of all the records where the value of the Player field begins with an I and then is followed by, well, anything." Give it a—well, OK; you know the routine by now.
By the way, you aren't stuck with batting averages like 0.309, either. (Typically batting averages are expressed without the leading 0, .309, for example.) But that's fine; you can simply use the FormatNumber function to format the batting average any old way you want:
FormatNumber (DataList.Fields.Item("BattingAverage"), 3, 0)
Simply include this function in your Wscript.Echo statement when you display the number (or alternatively, you could assign the output to a variable and put the variable in your Echo statement):
Wscript.Echo _
  DataList.Fields.Item("Player") & _
  vbTab & _
  DataList.Fields.Item("HomeRuns") & _
  vbTab & DataList.Fields.Item("RBI") & _
  vbTab & _
  FormatNumber _
  (DataList.Fields.Item("BattingAverage"), _
   3, 0)
Fun stuff, huh?
Unfortunately, though, it looks like we're out of time for this month. In summary, we'd just like to say that—sorry, the phone is ringing.
Anyway, we wanted to note that—oh, great, now the cell phone is ringing. And we just got e-mail from the toaster. Important stuff: our piping hot toast is ready and do we want butter or jam? Gotta go, but we'll see you all next month!
Dr. Scripto's Scripting Perplexer
The monthly challenge that tests not only your puzzle-solving ability but also your scripting skills.

September 2008: Scripting Search
Here's a simple (or maybe not so simple) word search. Find all the VBScript functions and statements from the list. But here's a twist: the remaining letters will spell out a hidden word, which just so happens to be—you guessed it—a Windows PowerShell™ cmdlet!
Word List: Abs, Array, Atn, CCur, CLng, CInt, DateValue, Day, Dim, Else, Exp, Fix, InStr, IsEmpty, IsObject, Join, Len, Log, Loop, LTrim, Mid, Month, MsgBox, Now, Oct, Replace, Set, Sin, Space, Split, Sqr, StrComp, String, Timer, TimeValue, WeekdayName.

Show Answer

The Scripting Guys work for—well, are employed by—Microsoft. When not playing/coaching/watching baseball (and various other activities), they run the TechNet Script Center. Check it out at www.scriptingguys.com.

Page view tracker