Hey, Scripting Guy!Chasing Cars... and XML

The Microsoft Scripting Guys

Download the code for this article: HeyScriptingGuy2007_02.exe (150KB)

"I DON’T KNOW why my dog chases cars," goes the old joke. "After all, what would he do with it if he ever actually caught one?"

Good question, and one that the Scripting Guys can’t answer. Well, unless it happens to be Lucy, the dog down the street, who catches the car. We have no doubt at all that that dog would go rob a liquor store and then use the car to make her getaway. And based on the condition of the front lawn, we’re willing to bet she wouldn’t visit any rest stops along the way, if you know what we mean.

If we had to update that old chestnut for today’s high-tech era, it might go something like this: "I don’t know why my system administrator keeps trying to learn XML. After all, what would she do with it if she ever actually learned it?"

Note In our defense, we only said we were going to update that tired old joke; we never said we were going to make it funny.

To be honest, XML might be a somewhat over-hyped technology, at least when it comes to system administration. The truth is, many system administrators have lived long and happy lives without ever writing a script that interacts with an XML file, and we don’t expect that to change any time in the near future.

That said, it’s also true that more and more applications are adopting XML as a standard for data storage. And why not? As nothing more than gussied-up text files, these XML data files are quick and easy to create, are portable across platforms, and require no complicated (not to mention expensive) database program. Got an operating system and a text editor? Then you can create an XML database.

Which means, of course, that there is at least one thing a system administrator can do with XML: write scripts that query an XML file as if that file were a full-fledged database. And, hey, who better than the Scripting Guys to give you a few pointers on doing that very thing?

Well, OK, good point. But if we had to venture a guess, we’d bet that Lucy is probably way too busy digging up someone’s flowerbed to write a column about XML.

To begin with, let’s take a look at a simple XML file (see Figure 1), a database file featuring four scripts found in the daily Hey, Scripting Guy! column.

Figure 1 Hey, Scripting Guy! Scripts in XML

<?xml version="1.0" encoding="ISO-8859-1"?>
<Repository>
  <Script>
    <Category>Microsoft Office</Category>
    <Subcategory>Microsoft Access</Subcategory>
    <Keyword>databases</Keyword>
    <Title>How Can I Print a Microsoft Access Report?</Title>
    <URL>https://www.microsoft.com/technet/scriptcenter/resources/qanda/oct06/hey1020.mspx</URL>
  </Script>
  <Script>
    <Category>Microsoft Office</Category>
    <Subcategory>Microsoft Access</Subcategory>
    <Keyword>databases</Keyword>
    <Title>How Can I Compact a Microsoft Access Database?</Title>
    <URL>https://www.microsoft.com/technet/scriptcenter/resources/qanda/oct06/hey1009.mspx</URL>
  </Script>
  <Script>
    <Category>Microsoft Office</Category>
    <Subcategory>Microsoft Word</Subcategory>
    <Keyword>hyperlinks</Keyword>
    <Title>How Can I Change an Existing Hyperlink in a Microsoft Word Document?</Title>
    <URL>https://www.microsoft.com/technet/scriptcenter/resources/qanda/oct06/hey1016.mspx</URL>
  </Script>
  <Script>
    <Category>Enterprise Servers</Category>
    <Subcategory>Microsoft SQL Server</Subcategory>
    <Keyword>databases</Keyword>
    <Title>How Can I Create a Table in a SQL Server Database?</Title>
    <URL>https://www.microsoft.com/technet/scriptcenter/resources/qanda/oct06/hey1016.mspx</URL>
  </Script>
</Repository>

As you can see, it’s a pretty simple little file. Each individual script is filed under a <Script> tag. From now on, we’ll refer to these scripts as records. Yes, you’re right—XML aficionados would never refer to those items as records. But that’s OK; our goal here is to show you how to apply something you already know—database querying techniques—to a new and different kind of data source. We thought we’d first focus on how to get things done, then maybe return to this topic in the future and supply you with the proper terminology.

OK, back to records. Each record contains the following fields: <Category>, <Subcategory>, <Keyword>, <Title>, and <URL>.

Hmm, you’re not quite as impressed as we thought you’d be. Maybe it would help if we showed you a script that could open up this XML file and echo back the contents:

Set xmlDoc = CreateObject("Microsoft.XMLDOM")
xmlDoc.Async = "False"
xmlDoc.Load("C:\Scripts\Scripts.xml")

Set colNodes = xmlDoc.selectNodes _
("/Repository/Script/*")

For Each objNode in colNodes
   Wscript.Echo objNode.Text
Next

Note Be careful: these tag names are case-sensitive. It’s "/Repository/Script," not "/repository/script" or "/REPOSITORY/SCRIPT".

Granted, that’s not particularly impressive either. But that’s the point: this isn’t anywhere near as hard as you might expect.

As you can see in the code, we begin by creating an instance of the Microsoft.XMLDOM object, the COM object used for working with XML files. After that, we set the Async property to False; this helps ensure that the entire file is read into memory before control is returned to the script. We then use the Load method to read the file C:\Scripts\Scripts.xml:

xmlDoc.Load("C:\Scripts\Scripts.xml")

As soon as the file has been loaded into memory, we can use the selectNodes method to select the specified records from the database:

Set colNodes = xmlDoc.selectNodes _
("/Repository/Script/*")

Take a close look at the parameter passed to selectNodes. As it turns out, this represents the path within the XML file. Our XML file has the following structure:

<Repository>
    <Script>
        <Category></Category>
        <Subcategory></Subcategory>
        <Keyword></Keyword>
        <Title></Title>
        <URL></URL>
    </Script>
</Repository>

Should you care about that? In a word, yes! When dealing with XML files, the structure is very important. For example, how do we get to the individual properties of a record? That’s easy: we access the <Repository> tag followed by the <Script> tag followed by the individual property tags. And guess what? That’s the same path we specify as the parameter to selectNodes. We simply follow that path with /*, which means "Select all the properties for these records." In other words, our returned collection will consist of all the properties for all the records in the database—akin to a "Select * From" query in Windows® Management Instrumentation (WMI).

Note Give the sample script a try and you’ll see exactly what we mean.

The rest of the script is easy; all we do is set up a For Each loop to walk through the collection and echo back the value of the Text property for each item and each property:

For Each objNode in colNodes
   Wscript.Echo objNode.Text
Next

Way easier than chasing cars.

But you’re right: if all we wanted to do was echo back the entire contents of the file, we could have bypassed all that XML stuff and just used the FileSystemObject. (It might have been a tad bit more complicated, but just a tad.) If our intention was to dazzle you and make you think "Wow, this XML stuff is pretty handy after all," well, we probably still have a ways to go.

So let’s see what we can do about that. In the first script, we got back all the property values for all the records. That’s fine, but suppose all we really wanted to get back was the script title? Can we do that? Um, just a second ...

OK, according to Lucy all we have to do is modify our selectNodes command, adding the property of interest to the end of the path. In other words:

    Set colNodes=xmlDoc.selectNodes _    ("/Repository/Script/Title")

As it turns out, the only reason we got back all the properties the first time is because we used the * wildcard character. Now we get back only the Title property. Why? Because that’s the only property value we asked to get back.

Note Hmmm, look at that: XML gives you exactly what you ask for. No offense, Lucy, but we think XML is man’s new best friend!

And, sure, you can return more than one property. For example, this modified command returns values for both the Title and URL properties:

    Set colNodes=xmlDoc.selectNodes _    ("/Repository/Script/(Title | URL)")

Admittedly, the syntax is a bit awkward-looking, but once you get used to it, it’s not too bad. To begin with, we specify the parent portion of the path, just like we did before:

    /Repository/Script/

We then add in a set of parentheses and, inside those parentheses, specify the properties to be returned. (Notice that we use the pipe separator—the | character—to split up the individual properties.) In other words:

    (Title | URL)

Like we said, it’s maybe a little goofy, but it works.

And no, you’re not limited to returning just two properties. As long as you keep adding in pipe separators you can return all the information your heart desires. For example, the command that follows returns three properties (Title, URL, and Keyword):

    Set colNodes=xmlDoc.selectNodes _    ("/Repository/Script/(Title | URL |          Keyword)")

Cool, huh?

Well, we thought it was cool. You guys are a tough crowd. But, again, you have a good point: all we’ve done so far is retrieve information about all the records in our database. However, there’s nothing wrong with that. Oftentimes that’s exactly what you want to do. On the other hand, there will also be times (lots of times) when we want to retrieve information about only a subset of those records. For example, maybe we want to get back information only about the scripts that have a Keyword equal to databases. You know, by using a query similar to this:

    Set colNodes=xmlDoc.selectNodes _
    ("/Repository/Script " & _
    "[Keyword = ‘databases’]")

Again, the syntax is a little weird. On the bright side, however, the syntax is also easy and straightforward. After specifying the parent path we place our "Where" clause (for example, where x equals y) inside square brackets, like so:

/Repository/Script [Keyword = ‘databases’]

As you probably already figured out, our Where clause equivalent tells the script "Return only those items where the Keyword attribute is equal to databases." Notice that databases is enclosed in single quotation marks. Not only is that perfectly fine, it’s pretty much required. If the filter term is enclosed in double quotation marks, you’ll get a syntax error when you try running the script.

Note Why? Because, remember, the entire query string—"/Repository/Script [Keyword = ‘databases’]"—is enclosed in double quotes.

Of course, you aren’t limited to using the equals sign (=) when filtering data. You can use greater than (>) or less than (<) signs, as well as everyone’s favorites, the less than or equal to (<=) and the greater than or equal to (>=) signs. You can also negate any of these signs by placing an exclamation mark in front of it. For example, this command requests all the scripts where the Keyword is not equal to databases (note the !=):

    Set colNodes=xmlDoc.selectNodes _
    ("/Repository/Script " & _
    "[Keyword != ‘databases’]")

Wait a minute: who said, "Sure, you can set a filter, but I bet you can’t set a filter and specify the properties you want returned"? Of course you can:

    Set colNodes=xmlDoc.selectNodes _
    ("/Repository/Script " & _
    "[Keyword = ‘databases’]/Title")

If all goes well, this command will return just the Title property for all the scripts that have the Keyword databases. Let’s see if it works:

    How Can I Print a Microsoft Access Report?
    How Can I Compact a Microsoft Access Database?
    How Can I Create a Table in a SQL Server     Database?

Let’s see if we can fit in one more command before we call it a day. Although we’re making progress, we might still be returning more records than we really want to return. After all, our preceding command returns a mixture of Microsoft® Access® scripts and Microsoft SQL Server™ scripts. (Why? Because all of these scripts use the Keyword databases.) What if we wanted to limit the returned data to scripts that have databases for a Keyword and that have Microsoft SQL Server as a Subcategory? Is it possible to filter on multiple criteria?

As if you had to ask:

    Set colNodes=xmlDoc.selectNodes _
    ("/Repository/Script " & _
    "[Keyword = ‘databases’ and " & _
    "Subcategory = ‘Microsoft SQL Server’]")

It’s the same basic syntax; we’ve just used two separate criteria (namely, Keyword = ‘databases’ and Subcategory = ‘Microsoft SQL Server’) and connected the two using the AND operator.

Note Yes, you’re right: with this simple little XML file we could have just requested all the scripts where the Subcategory was equal to Microsoft SQL Server. But that wouldn’t have been any fun, and nowhere near as educational.

You can also use OR clauses. For example, suppose we had a whole bunch of Microsoft Office scripts, including those with Subcategories equal to Microsoft Excel®, Microsoft PowerPoint®, Microsoft Outlook®, and so on. Is it possible to restrict the returned data to just those scripts from the subcategories Microsoft Word or Microsoft Access? You bet it is:

    Set colNodes=xmlDoc.selectNodes _
    ("/Repository/Script " & _
    "[Subcategory = ‘Microsoft Word’ " & _
    "or Subcategory = ‘Microsoft Access’]")

Is that more like it? Granted, this might not qualify as a column that changed your life forever; still, there’s a good chance that, sooner or later, being able to query an XML file in this manner will come in handy. But the decision is yours: you can learn some basic XML techniques or you can chase cars. It’s up to you. (If you choose the latter and you happen to met up with Lucy, the dog down the street, tell her the Scripting Guys say hi. And tell her to stay out of our yard!)

Hey, Scripting Guy! - Every Day

You’ve just perused this month’s Hey, Scripting Guy! column and thought it was the greatest piece of technical material you’ve ever read. You might even go so far as to say it’s the greatest piece of written material ever. What’s more, you’re eagerly standing by your mailbox in anticipation of your next issue of TechNet Magazine, just so you can read more articles like this.

What in the world are you waiting for? Want to know more about Lucy the neighborhood dog, or read about the latest exploits of the Scripting Son? Did you hear how the annual Turducken Bowl went last year? Stay abreast of all these important issues by reading the daily—yes, we said daily—Hey, Scripting Guy! columns. Every Monday through Friday (minus major holidays and Scripting Guy vacations), you can read all about high school baseball, college football and basketball, and occasionally get a local weather report. (Admittedly, it’s only local if "local" means Redmond to you, but who doesn’t want to know what’s going on with the weather in Redmond?)

Not only that, but you’ll actually learn something new about scripting each day. Yes, embedded amidst all this riveting content is actual scripting information. In each column, the Scripting Guys answer real questions submitted by what they assume to be real people. You can read the daily column online at microsoft.com/technet/scriptcenter/resources/qanda. Also, since they’ve already answered hundreds of questions, the archives are quite extensive and thus a great source of scripting information (see micros­oft.com/technet/scriptcenter/resources/qanda/hsgarch.mspx).

Do you have a question for the Scripting Guys? You stand a slim chance of getting your question answered by submitting it to scripter@microsoft.com. (In contrast, you stand no chance if you don’t submit it, so what have you got to lose? Unlike a lottery ticket, it’s free, and your odds are slightly better.)

The Microsoft Scripting Guys work for—well, are employed by—Microsoft. When not playing/coaching/watching baseball (and various other activities) they run the TechNet Script Center. Check it out at www.scriptingguys.com.

© 2008 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited.