Windows PowerShell Tip of the Week

Here’s a quick tip on working with Windows PowerShell. These are published every week for as long as we can come up with new tips. If you have a tip you’d like us to share or a question about how to do something, let us know.


Working With Custom Objects

Scripting is always fun when the script does all the work for you. For example, suppose you want to get a list of all the files in the folder C:\Scripts, and then sort those files by size (Length). No problem; all you have to do is let the Get-ChildItem and the Sort-Object cmdlets take care of everything for you:

Get-ChildItem C:\Scripts | Sort-Object Length

Life is beautiful, eh?

Well, sometimes. Unfortunately, though, things aren’t always this much fun, or this easy. For example, suppose you have the following text file (C:\Scripts\Test.txt), a file containing baseball statistics:

Name,AtBats,Hits
Ken Myer,43,13
Pilar Ackerman,28,11
Jonathan Haas,37,17
Syed Abbas,41,20
Luisa Cazzaniga,22,6
Andrew Cencini,35,11
Baris Cetinok,19,4

What you’d like to do with this file is calculate the batting average for each player (something you can do by dividing each player’s hits by their number of at-bats), then sort those averages in descending order. That sounds easy enough; after all, can’t we just read in the data and calculate each player’s batting average:

$colStats = Import-CSV C:\Scripts\Test.txt
foreach ($objBatter in $colStats)
  {
    $objBatter.Name + " {0:N3}" -f ([int] $objBatter.Hits / $objBatter.AtBats)
  }

Well, we can, sort of. That approach will definitely give us each player’s batting average; what it won’t do, however, is sort those averages for us. Instead we end up with this:

Ken Myer 0.302
Pilar Ackerman 0.393
Jonathan Haas 0.459
Syed Abbas 0.488
Luisa Cazzaniga 0.273
Andrew Cencini 0.314
Baris Cetinok 0.211

Useful, but not what we had in mind.

The problem, of course, is obvious: we can’t sort the averages until we have all the averages. And that’s a problem because our script doesn’t work with the entire collection; it works with only a single batting average at a time. That’s not just a problem, it’s a big problem.

Of course, there are plenty of ways to solve this dilemma, all of them involving a sort of “secondary” data repository, be that an array, a hash table, or maybe a disconnected recordset. The idea is simple enough: what we need to do is calculate all the batting averages, store them in this secondary data repository, and then, as soon as we have all the averages, sort that data repository. Like we said, the idea is simple, but the execution isn’t always so straightforward; that’s because all these secondary data repositories have their own quirks and eccentricities, and it’s not always easy to get data into these things. (And it can sometimes be equally difficult to get data out of those things.)
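To make that idea concrete, here’s a minimal sketch of the hash table route. It’s an illustration only, not the approach this tip ends up recommending. The sample rows come from Test.txt, but they are typed inline and parsed with ConvertFrom-Csv (which assumes PowerShell 2.0 or later) so the sketch doesn’t depend on the file; the variable names $htAverages and $colSorted are ours.

```powershell
# Sample rows from Test.txt, parsed inline instead of read from disk.
# ConvertFrom-Csv is assumed to be available (PowerShell 2.0 or later).
$colStats = @"
Name,AtBats,Hits
Ken Myer,43,13
Pilar Ackerman,28,11
Jonathan Haas,37,17
"@ | ConvertFrom-Csv

# The hash table is the secondary repository: player name -> batting average.
$htAverages = @{}
foreach ($objBatter in $colStats)
  {
    $htAverages[$objBatter.Name] = [int] $objBatter.Hits / [int] $objBatter.AtBats
  }

# A hash table has no sort order of its own, so we enumerate its
# key/value pairs and sort those pairs by Value.
$colSorted = $htAverages.GetEnumerator() | Sort-Object Value -descending
$colSorted | ForEach-Object { "{0} {1:N3}" -f $_.Key, $_.Value }
```

Notice the quirk: we can’t sort the hash table itself; we have to call GetEnumerator() and sort what comes out.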

On top of that, trying to shoehorn data into an array or a hash table seems a bit heretical, at least when it comes to Windows PowerShell. After all, PowerShell’s claim to fame is that it enables you to work with objects. Wouldn’t it be better if we could store this information in a set of objects, and then work with those objects rather than some sort of secondary data repository?

You bet it would:

$colAverages = @()

$colStats = Import-CSV C:\Scripts\Test.txt

foreach ($objBatter in $colStats)
  {
    $objAverage = New-Object System.Object
    $objAverage | Add-Member -type NoteProperty -name Name -value $objBatter.Name
    $objAverage | Add-Member -type NoteProperty -name BattingAverage -value ("{0:N3}" -f ([int] $objBatter.Hits / $objBatter.AtBats))
    $colAverages += $objAverage
  }

$colAverages | Sort-Object BattingAverage -descending

Granted, at first glance this script might not seem all that impressive. But just wait; as you’re about to see, it’s actually kind of a neat little solution to this problem.

Our custom object script starts out by creating an empty array named $colAverages:

$colAverages = @()

And yes, we did say that using an array seemed a bit heretical. But don’t worry; we aren’t going to use this array to somehow store a batter’s name and batting average; that is, our array isn’t going to hold information like this, information we’d then have to somehow tease apart before we could sort the batting averages in descending order:

Ken Myer;0.302

Instead, we’re going to use this array to store some custom objects that we create ourselves.

See? We told you that this was a neat little solution to our problem.

After creating the empty array we use the Import-CSV cmdlet to read in the text file C:\Scripts\Test.txt and store the contents in a variable named $colStats. Incidentally, Import-CSV is a very underrated cmdlet. As long as your text file has a header line (which our text file does), Import-CSV will import each item in a comma-separated values file as a separate object, and an object with clearly-defined properties as well. Take a look at what we get if we pipe the data returned by Import-CSV to the Get-Member cmdlet:

Name        MemberType   Definition
----        ----------   ----------
Equals      Method       System.Boolean Equals(Object obj)
GetHashCode Method       System.Int32 GetHashCode()
GetType     Method       System.Type GetType()
ToString    Method       System.String ToString()
AtBats      NoteProperty System.String AtBats=43
Hits        NoteProperty System.String Hits=13
Name        NoteProperty System.String Name=Ken Myer

Look at the bottom of the list: the three fields referenced in our file header (AtBats, Hits, and Name) are all listed as object properties. Cool, huh?

As a matter of fact that is cool. Cool or not, however, it doesn’t solve our problem: we don’t even have each player’s batting average, let alone the ability to sort those averages in descending order. But that’s all right; we’re just about to address that issue.
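If you’d like to poke at those properties without creating a text file first, here’s a minimal sketch. It assumes PowerShell 2.0 or later, because it uses ConvertFrom-Csv to parse inline text the same way Import-CSV parses a file; the variable name $objFirst is ours.

```powershell
# Two sample rows from Test.txt, parsed inline (assumes PowerShell 2.0 or later).
$colStats = @"
Name,AtBats,Hits
Ken Myer,43,13
Pilar Ackerman,28,11
"@ | ConvertFrom-Csv

# Each row is an object; each field in the header line is a property.
$objFirst = $colStats[0]
$objFirst.Name      # Ken Myer
$objFirst.Hits      # 13

# Every value arrives as a string, even the ones that look like numbers.
$objFirst.Hits.GetType().FullName    # System.String

# List only the properties that came from the header line.
$colStats | Get-Member -MemberType NoteProperty
```

The fact that Hits and AtBats are strings matters later on, when we need to do arithmetic with them.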

To begin with, we set up a foreach loop that walks us through each item in $colStats:

foreach ($objBatter in $colStats)

Inside that loop, we use the New-Object cmdlet to create a brand-new object named $objAverage:

$objAverage = New-Object System.Object

So what kind of object is $objAverage? Well, that’s entirely up to us; at the moment $objAverage is essentially a blank object, one without any defined properties. That’s what the next line of code is for:

$objAverage | Add-Member -type NoteProperty -name Name -value $objBatter.Name

Here we’re piping $objAverage to the Add-Member cmdlet, a cmdlet that allows us to add properties to an object. In this case we’re adding a NoteProperty, giving our property the name Name and a value that represents the Name property for the first item in the collection $colStats. In other words, because the first object in $colStats has a Name equal to Ken Myer, that means that $objAverage will also have a Name equal to Ken Myer.
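Here’s the same step in isolation, as a small sketch you can paste into the console. The variable name $objPlayer and the literal value are ours; in the script the value comes from $objBatter.Name.

```powershell
# Start with a blank object, just as the script does.
$objPlayer = New-Object System.Object

# Attach a NoteProperty called Name and give it a literal value.
$objPlayer | Add-Member -type NoteProperty -name Name -value "Ken Myer"

# The new property can be read like any other property.
$objPlayer.Name    # Ken Myer

# Get-Member confirms that Name is now part of the object.
$objPlayer | Get-Member -MemberType NoteProperty
```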

But here’s the really cool part; in the next line of code, we create a property named BattingAverage, and assign that property the player’s batting average:

$objAverage | Add-Member -type NoteProperty -name BattingAverage -value ("{0:N3}" -f ([int] $objBatter.Hits / $objBatter.AtBats))

Admittedly that’s kind of a clunky-looking line of code; that’s because we applied a little bit of formatting to our batting average. In particular, we did two things to that batting average. To begin with, we used the syntax [int] to make sure that PowerShell treated the player’s hits ($objBatter.Hits) and the player’s at-bats ($objBatter.AtBats) as numbers. (Strictly speaking, the cast applies to $objBatter.Hits; once the left side of the division is a number, PowerShell converts $objBatter.AtBats to a number as well.) If we don’t explicitly cast these values as numbers PowerShell will treat them as strings, and we’ll get back an error message similar to this:

Method invocation failed because [System.String] doesn't contain a method named 'op_Division'.
At C:\scripts\test.ps1:11 char:88
+     $objAverage | Add-Member -type NoteProperty -name BattingAverage -value ($objBatter.Hits / <<<<  $objBatter.AtBats)

Not that this happened to us, mind you. It happened to a … friend … of ours.
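For what it’s worth, here’s a minimal sketch of the cast on its own. The variable names are ours, and we cast both values explicitly, which is a little more cautious than the script (the script casts only the hits).

```powershell
# Values read by Import-CSV are strings, so we start with strings here.
$strHits = "13"
$strAtBats = "43"

# Cast each value to an integer before dividing.
$dblAverage = [int] $strHits / [int] $strAtBats

# The result of the division is a number, not a string.
$dblAverage.GetType().Name    # Double
$dblAverage
```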

Second, we also used the .NET Framework formatting syntax "{0:N3}" -f to limit the resulting value to three decimal places; without that formatting we’ll get back batting averages similar to this:

0.210526315789474

Batting averages are always expressed using three decimal places; therefore, we used the construction {0:N3} to limit our batting averages to three decimal places as well.

Note. Needless to say, that’s what the 3 in the formatting string is for: it limits the value to 3 decimal places. If we wanted to show 5 decimal places we would just replace that 3 with a 5: {0:N5}. We should probably note, too, that our batting averages aren’t perfect; they all include a leading zero, which is definitely not the traditional way to display a batting average. However, getting rid of leading zeroes is another topic for another day.
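Here’s a short sketch of the formatting step by itself, using Baris Cetinok’s numbers (4 hits in 19 at-bats). The variable names are ours, and the results noted in the comments assume a system that uses a period as the decimal separator.

```powershell
$dblAverage = 4 / 19

# N3 keeps three decimal places; N5 keeps five.
$strThree = "{0:N3}" -f $dblAverage
$strFive  = "{0:N5}" -f $dblAverage

$strThree    # 0.211 (assuming a period as the decimal separator)
$strFive     # 0.21053 (same assumption)

# The -f operator always hands back a string, not a number.
$strThree.GetType().Name    # String
```

That last line is worth remembering: once a value has been formatted this way, PowerShell is holding text rather than a number.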

So what does all this mean? This means that we now have an object named $objAverage, an object containing the following properties and property values:

Property Name    Property Value
-------------    --------------
Name             Ken Myer
BattingAverage   0.302

OK, so what do we do with this object now that it’s been created? Actually, we can’t do much of anything with it right now; remember, we need to sort the averages in descending order, and we can’t do that until we’ve calculated all the averages. Therefore we need to set $objAverage aside until later; one easy way to do that is to add the object to an array:

$colAverages += $objAverage

And then it’s back to the top of the loop, where we repeat this process with the next player in the text file.
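As a side note, here’s a small sketch of what += does to an array, along with one alternative way of collecting the output of a loop. The variable names are ours, and the alternative is just an option, not something the script above relies on.

```powershell
$colItems = @()
$colItems.Count        # 0

# Arrays have a fixed size, so each += builds a new array that holds
# the old elements plus the new one.
$colItems += "first"
$colItems += "second"
$colItems.Count        # 2

# Alternative: let the output of the loop become the collection.
# Wrapping the loop in @( ) gives us an array even if only one item is emitted.
$colSquares = @(foreach ($intValue in 1..3) { $intValue * $intValue })
$colSquares.Count      # 3
```

For seven ballplayers the difference doesn’t matter; for very large collections, the second form avoids rebuilding the array on every pass through the loop.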

OK, so then what do we do once we’ve calculated – and stored – all the batting averages? To be honest, we don’t have to do much of anything. In fact, all we have to do is take our collection of batting averages and pipe them to the Sort-Object cmdlet, asking Sort-Object to sort the data by the BattingAverage property, and in descending order:

$colAverages | Sort-Object BattingAverage -descending

Is that going to do us any good? Of course it is:

Name                                               BattingAverage
------                                                    -------
Syed Abbas                                                  0.488
Jonathan Haas                                               0.459
Pilar Ackerman                                              0.393
Andrew Cencini                                              0.314
Ken Myer                                                    0.302
Luisa Cazzaniga                                             0.273
Baris Cetinok                                               0.211
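One thing to keep in mind: because "{0:N3}" -f returns a string, the BattingAverage property in our script holds text, and Sort-Object compares it as text. That works here because every value has the same 0.xxx shape. If you’d rather sort true numbers, here’s a minimal sketch of one variation. It assumes PowerShell 2.0 or later (for ConvertFrom-Csv), and it swaps the format string for [math]::Round, which is our substitution rather than part of the original script.

```powershell
# Three sample rows from Test.txt, parsed inline.
$colStats = @"
Name,AtBats,Hits
Ken Myer,43,13
Syed Abbas,41,20
Baris Cetinok,19,4
"@ | ConvertFrom-Csv

$colAverages = @()
foreach ($objBatter in $colStats)
  {
    # Round to three places, but keep the result as a number (Double).
    $dblAverage = [math]::Round(([int] $objBatter.Hits / [int] $objBatter.AtBats), 3)

    $objAverage = New-Object System.Object
    $objAverage | Add-Member -type NoteProperty -name Name -value $objBatter.Name
    $objAverage | Add-Member -type NoteProperty -name BattingAverage -value $dblAverage
    $colAverages += $objAverage
  }

# Sort-Object now compares numbers rather than text.
$colSorted = $colAverages | Sort-Object BattingAverage -descending
$colSorted
```

The trade-off is that a rounded number doesn’t keep trailing zeroes, so a .300 hitter would show up as 0.3.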

Again, this isn’t the only way we could have tackled this problem; however, it seemed easier than most other approaches. True, a hash table might be easier, as long as we need to keep track of only two items: Name and BattingAverage. But what if we wanted to keep track of Name, BattingAverage, AtBats, and Hits? To be honest, a hash table is of minimal value when each entry needs to hold more than a key and a single value. In that case, you’d probably need to use a disconnected recordset, something that requires code similar to this just to set the thing up:

$adVarChar = 200
$MaxCharacters = 255
$adFldIsNullable = 32
$adDouble = 5

$DataList = New-Object -com "ADOR.Recordset"
$DataList.Fields.Append("Name", $adVarChar, $MaxCharacters, $adFldIsNullable)
$DataList.Fields.Append("BattingAverage", $adDouble, $Null, $adFldIsNullable)
$DataList.Open()

See why we decided that using custom objects and the Add-Member cmdlet might be a bit easier?
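By comparison, here’s a rough sketch of a custom object that tracks all four items. The literal values are Ken Myer’s line from Test.txt; in a real script they would come from $objBatter inside the loop.

```powershell
# One object, four properties; each extra property costs one more Add-Member call.
$objBatter = New-Object System.Object
$objBatter | Add-Member -type NoteProperty -name Name -value "Ken Myer"
$objBatter | Add-Member -type NoteProperty -name AtBats -value 43
$objBatter | Add-Member -type NoteProperty -name Hits -value 13
$objBatter | Add-Member -type NoteProperty -name BattingAverage -value ("{0:N3}" -f (13 / 43))

# Pick out whichever properties we happen to need.
$objBatter | Select-Object Name, Hits, BattingAverage
```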

And if that’s not enough, remember that, because these are objects, we can use them with other Windows PowerShell cmdlets as well. For example, we could pipe this information to the Measure-Object cmdlet:

$colAverages | Measure-Object BattingAverage -minimum -maximum -average |
Select-Object Minimum, Maximum, Average

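In turn, Measure-Object will give us information similar to this (the values shown here are what you get when the averages are stored without the {0:N3} formatting):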

          Minimum                Maximum                        Average
          -------                -------                        -------
0.210526315789474       0.48780487804878              0.348569480651885

Again, not the prettiest formatting in the world, but you get the idea.
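If you want to try Measure-Object without the text file, here’s a minimal sketch. It builds three objects by hand, using three of the averages listed earlier (0.302, 0.488, and 0.211) as plain numbers; the variable names $dblValue and $objSummary are ours.

```powershell
# Build three objects, each with a numeric BattingAverage property.
$colAverages = @()
foreach ($dblValue in 0.302, 0.488, 0.211)
  {
    $objAverage = New-Object System.Object
    $objAverage | Add-Member -type NoteProperty -name BattingAverage -value $dblValue
    $colAverages += $objAverage
  }

# Measure-Object returns a single summary object for the whole collection.
$objSummary = $colAverages | Measure-Object BattingAverage -minimum -maximum -average

$objSummary.Count      # 3
$objSummary.Minimum    # the smallest average in the collection
$objSummary.Maximum    # the largest average in the collection
$objSummary.Average    # the mean of the three averages
```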

See you all next week.