Parsing Comma-Separated-Values Logs

Microsoft® Windows® 2000 Scripting Guide

A comma-separated-values (CSV) log consists of two primary components:

  • A header line, which indicates the data fields used in the log.

  • Record lines, containing the data for each record written to the log.

In both header and record lines, commas separate individual fields. By contrast, in a tab-separated-values (TSV) log, tab characters separate fields.

For example, in DHCPSrvLog.log, the header line and the first two record lines look something like the following:



ID Date,Time,Description,IP Address,Host Name,MAC Address
10,04/02/01,15:16:00,Assign,192.168.1.10,workstation1.fabrikam.com,0000F8083446,
10,04/02/01,15:16:00,Assign,192.168.1.11,workstation2.fabrikam.com,0000F8083447,

To parse a CSV log, the script needs to read in each line and then extract each field within the line. This can be done using the VBScript Split function, which splits a line into constituent parts based on the field delimiter (the character used to separate the individual fields). For example, this line:



Part 1,Part 2,Part 3,Part 4

Would be split into a four-item array, with the array consisting of the following elements:

  • Part 1

  • Part 2

  • Part 3

  • Part 4

Scripting Steps

Listing 12.17 contains a script that parses a CSV file. To carry out this task, the script must perform the following steps:

  1. Create a constant (ForReading) to be used with the FileSystemObject.

  2. Create an instance of the FileSystemObject.

  3. Open the DHCP server Log for Monday (C:\Windows\System32\DHCP\DhcpSrvLog-Mon.log).

  4. Use the SkipLine method to move through the first 25 lines in the file.

    The first 25 lines in a DHCP server log are always header information that is not required for this script. You must start at the beginning of the file and explicitly skip these lines, however, because the FileSystemObject always begins reading at the first line of the file. You cannot specify an alternative starting point, such as the 26th line of the file.

  5. Beginning with line 26, read each line by using the ReadLine method, and temporarily store the contents in the variable strNextLine.

  6. Use the VBScript Split function to split the line into 8 separate variables. (The comma is used as the split delimiter.)

    Each variable (representing a comma-delimited field in the log) is stored in the array arrDHCPRecord.

  7. Echo the contents of arrDHCPRecord to the screen.

    In a production script, you would probably either save selected variables to a database or take some action based on the value of one or more of those variables (for example, the Event ID).

Listing 12.17 Parsing a Comma-Separated-Values Log

  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20


Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile("C:\Windows\System32\DHCP\" _
 & "DhcpSrvLog-Mon.log", ForReading)
Do While objTextFile.AtEndOfStream <> True
 If inStr(objtextFile.Readline, ",") Then
 arrDHCPRecord = split(objTextFile.Readline, ",")
 Wscript.Echo "Event ID: " & arrDHCPRecord(0)
 Wscript.Echo "Date: " & arrDHCPRecord(1)
 Wscript.Echo "Time: " & arrDHCPRecord(2)
 Wscript.Echo "Description: " & arrDHCPRecord(3)
 Wscript.Echo "IP Address: " & arrDHCPRecord(4)
 Wscript.Echo "Host Name: " & arrDHCPRecord(5)
 Wscript.Echo "MAC Address: " & arrDHCPRecord(6)
 Else
 objTextFile.Skipline
 End If
 i = i + 1
Loop
objTextFile.Close

Show: