September 2012

Volume 27 Number 09

Kinect - Working with Kinect Studio

By Leland Holmquest | September 2012

In June I covered the basics of creating a Windows Presentation Foundation (WPF) application that uses Kinect for Windows to track a user’s skeleton and draw it on the screen. I hope that those of you who had been hesitating to get a Kinect for Windows and start building applications were encouraged not to wait any longer.

If you just started developing with Kinect, I bet I can predict how you initially added a new feature or experimented with another visual: You eagerly hit F5, jumped out of your seat, moved in front of your Kinect, motioned and gyrated to express what you wanted in your application, ran into a bug or observed something that wasn’t quite what you expected or wanted, sat back down, stopped the application and went back to coding. Sound familiar? If you’re like me, this process was perfectly OK with you the first dozen or so times you went through it, excitement providing the force to overcome the object at rest (thank you, Sir Isaac Newton). But eventually you started to feel tired or lazy, not wanting to expend so much effort just to change a simple value in your code or add an element to the WPF that would result in about 30 seconds of down time between builds.

What’s a developer to do? The only way to test a Kinect-enabled application is to stand in front of the Kinect, right? It’s not like I can build a coded UI test to simulate the Kinect’s data feed—or can I? In this article I’ll show you how you can save time and energy when testing your Kinect-enabled applications by using Kinect Studio.

Introducing Kinect Studio

Kinect for Windows SDK 1.5 includes Kinect Studio, which is a very cool tool. I have the latest Developer Toolkit for Kinect for Windows (v1.5.1) installed on my development machine. To access Kinect Studio, I just go to All Programs/Kinect for Windows SDK v1.5 and select Kinect Studio v1.5.1. This tool can record all the data coming into an application from a Kinect unit. You can then view, review and store the data. Kinect Studio lets you inject the captured data streams back into a Kinect-enabled application, allowing you to test your code without getting out of your chair (although it might be healthier to make developers get up and move). If you’re working on a project in a distributed team, you can easily share data files with team members, enabling consistent testing across the team. Let’s look at a simple scenario that demonstrates how to use this tool.

I’m going to use the code I generated for the Kinect application in Figure 5 of my June 2012 article. Although simple, this code provides enough functionality to demonstrate how Kinect Studio works. To begin, open the solution for the Kinect application and start it in debug mode.

While the application is running, start Kinect Studio. As shown in Figure 1, Kinect Studio has four windows: a main window and Color Viewer, Depth Viewer and 3D Viewer windows. The main window shows a timeline as well as the controls used in Kinect Studio. In the Color Viewer window, you can see the couch in my living room. The Depth Viewer window uses color to give a visual indication of how far an object or a person is from the Kinect unit: red indicates a distance closer to the Kinect unit, and blue represents a distance farther from it. In the 3D Viewer window in Figure 1, notice that the view is slightly rotated with respect to the images in the Color Viewer and the Depth Viewer. The 3D Viewer has some really neat capabilities that I’ll explain in more detail later in this article.

Figure 1 Kinect Studio windows

Capturing Kinect Data

When Kinect Studio is first started, the Connect to a Kinect App & Sensor dialog box shown in Figure 2 opens.

Figure 2 Connecting to a Kinect-enabled application

In this dialog box, you specify which Kinect-enabled application you want to connect to. Connecting to an application enables Kinect Studio to capture the data coming into that application from the Kinect unit. Based on the windows available, Kinect Studio captures the data feeds from the color stream and the depth stream. With the application running and Kinect Studio connected, you can now capture the data. In the Kinect Studio main window, click the Record button (or press Ctrl+R) to begin collecting data. Next, have the test subject move through the scenario you want to test. When the scenario is complete, click the Stop button (or press Shift+F5). Kinect Studio then stores the data in memory. Once Kinect Studio is done processing the data, the timeline on the main window populates, as do the Color Viewer, Depth Viewer and 3D Viewer windows, as shown in Figure 3.

Figure 3 Captured data

By moving the cursor along the timeline, you can see the content relative to the timeline selection. As in most video editing software, sections of the timeline can be selected and saved as a separate file so you can use just the parts you want. The file saved is an .xed file. All the collected data is now contained in the file, enabling you to replay it whenever you want or distribute it to the rest of your team.

Notice that there isn’t a Skeleton Viewer. Kinect Studio doesn’t capture skeleton data (collection of joints) because that data is evaluated at run time based on the depth and color views. Capturing a skeleton view would defeat the purpose of recording the data for testing. In other words, the intrinsic data comes from the depth and color sensors. The skeleton data is the product of the Kinect for Windows software analyzing this data. Therefore, the runtime is going to re-evaluate the skeleton data from the depth and color data being injected back into the Kinect-enabled application under development just as though a user were in front of the Kinect unit providing live interaction.

Now let’s see just how useful this tool is. While the Kinect-enabled application is running, open Kinect Studio, connect to the application and open the .xed file you previously saved. When you click the Play button (or press F5), Kinect Studio injects the data saved in the .xed file into the Kinect-enabled application, simulating the user (or users). The application in development reacts to the .xed feed as though the user were actually doing the actions in front of the Kinect in real time. In addition to relieving you from having to hop in front of the Kinect unit every time you want to test a change you make to your application, this capability lets you test the application for different-sized users.

For example, when I’m the test user for an application I’m developing, I have to account for my size. I’m a big, relatively tall guy. So if I test the application only against myself, I run a risk of creating holes that other body types, say, those of small children, won’t fill. But it just isn’t practical to customize code for every possible body type. With Kinect Studio, I can capture people of various body types acting out different scenes for my application, save the files and then reuse them as appropriate. Using Kinect Studio to capture data streams of the supported user body types is a much more efficient, effective and thorough means of testing Kinect applications.

Capabilities of Kinect Studio

Let’s look at some of the other features Kinect Studio offers. When you use the timeline to select the point in the recording you want, the Color Viewer window shows the color image for that point at the resolution that you enabled for the color stream from the Kinect. You can right-click on the image and select “Save image as” to save the still image as a bitmap.

The Depth Viewer also has a useful utility. To use it, pause the video at a specific point and then move the mouse pointer across the image. At the bottom of the Depth Viewer window, notice the data points that are displayed. First, the frame number identifies the frame being displayed. Next, the x,y coordinates relative to the image being displayed are shown. Finally, and most interesting, the distance in millimeters from the Kinect to the object or person that the mouse pointer is “touching” is given. Let’s look at an example to understand this information a little better. I asked my daughters (Kenzy and Sherrie) to help me with this demonstration. I used Kinect Studio to capture them doing a routine that involved moving their arms and legs. Figure 4 shows how they were positioned: Kenzy is slightly behind but to the side of Sherrie.

Figure 4 Depth Viewer with data

In the image on the left, the mouse is on the point x=178, y=304. The depth is 2187 millimeters (mm). So this point (which corresponds to the “girl in red” [Sherrie]) is 2187 mm from the Kinect unit. The image on the right shows the mouse on the point x=326, y=304, with a depth of 2639 mm. So this point (which corresponds to the “girl in orange” [Kenzy]) is 2639 mm from the Kinect unit. Another way to interpret this data is to say that Kenzy is 452 mm behind Sherrie. The Kinect depth sensor provides this data and is one of the features that makes Kinect extremely powerful.
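
To make the Depth Viewer readout concrete, here is a minimal C# sketch of how an application might look up the same kind of value itself. It assumes the depth pixel layout used by the Kinect for Windows SDK 1.x, in which each 16-bit depth pixel stores the player index in its low 3 bits and the distance in millimeters in the remaining bits, with pixels laid out row by row. The names DepthProbe, Pack and DepthAt are hypothetical and chosen only for illustration, and the frame is meant to be filled with synthetic values rather than data captured from a sensor.

```csharp
using System;

// Hypothetical helper: reads the distance (in mm) stored at pixel (x, y)
// of a raw depth frame laid out row by row.
public static class DepthProbe
{
    // Assumption: the low 3 bits of each depth pixel hold the player index
    // (the SDK exposes this width as DepthImageFrame.PlayerIndexBitmaskWidth).
    public const int PlayerIndexBits = 3;

    // Packs a distance and a player index into one raw depth pixel.
    // Used here only to build synthetic test data.
    public static short Pack(int depthMm, int playerIndex)
    {
        return (short)((depthMm << PlayerIndexBits) | playerIndex);
    }

    // Returns the distance in millimeters at pixel (x, y).
    public static int DepthAt(short[] frame, int width, int x, int y)
    {
        // Mask to 16 bits so a negative short is not sign-extended.
        int raw = frame[y * width + x] & 0xFFFF;
        return raw >> PlayerIndexBits;
    }
}
```

For instance, assuming a 640x480 frame, storing Pack(2187, 1) at (178, 304) and Pack(2639, 2) at (326, 304) and then subtracting the two DepthAt lookups reproduces the 452 mm separation between the girls.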

We have had the ability to capture data from webcams and similar hardware for many years. Kinect gives us the ability to capture the distance of objects in a relatively inexpensive package. By combining these data streams, we can create applications that “understand” not only the imagery being presented but also the three-dimensional aspects of the scene. Think about when relational databases first became commonplace. The ability to relate two seemingly disparate pieces of data via a defined relationship revolutionized the way applications work and the capabilities they can offer. Similarly, using Kinect, we can now draw far more value from the scene in front of the Kinect and can relate objects through analysis of imagery as well as actual distances.

Let’s take another look at Figure 4 from the perspective of developing an application that wants to know which user is closest to the screen in order to select the main player, for example. If all I had to work with was a color image of two children, it would be difficult to ascertain which child was standing closer to the screen. In fact, I would probably have to use some other means of determining which child was the main player, such as having the player highlight and select the shape of her body outline. Using the depth data, however, I no longer need to have these artificial mechanisms for determining what is obvious based on the physical layout. With Kinect, we’re a step closer to ubiquitous computing because the application is able to infer information by evaluating the user’s situation without requiring direct, artificial interaction (such as clicking the mouse or typing on a keyboard). The user simply behaves naturally. That’s the real power of programming Kinect-enabled applications. Developers can build applications for average businesses and households that allow users to be themselves. Or as Microsoft puts it, “You are the controller.” Without the depth data, this would not be feasible.
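
The main-player decision described above can be sketched in a few lines of C#. This is only an illustration under stated assumptions: TrackedUser and MainPlayerSelector are hypothetical types, not part of the Kinect for Windows SDK, and the distances are supplied by hand. In a real application the distance would more likely come from the tracked skeleton's position or from the depth stream.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a tracked user; in a real application the
// distance would come from the sensor data rather than being hard-coded.
public class TrackedUser
{
    public string Name { get; set; }
    public int DistanceMm { get; set; }
}

public static class MainPlayerSelector
{
    // Returns the user closest to the sensor, or null if nobody is tracked.
    public static TrackedUser Closest(IEnumerable<TrackedUser> users)
    {
        return users.OrderBy(u => u.DistanceMm).FirstOrDefault();
    }
}
```

Given the two readings from Figure 4 (Sherrie at 2187 mm and Kenzy at 2639 mm), Closest would pick Sherrie as the main player, with no extra selection step required from either child.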

For me, the 3D Viewer window is the coolest feature of Kinect Studio. Again, you use the timeline to set the point in time that you want to view. The image in the 3D Viewer then represents a 3D model of the scene rendered by combining the depth and color data, as shown in Figure 5.

Figure 5 3D Viewer showing 3D model of depth and color data

The wire frame (or camera frustum) around the scene provides a means to understand the relative positioning of items in the image. You can also grab the scene and rotate it, enabling you to explore the 3D information from different angles. Figure 6 and Figure 7 show a couple of different views of the frame in Figure 5.

Figure 6 3D Viewer showing Figure 5 image from a different angle

Figure 7 3D Viewer showing a side view of the image in Figure 5

For me, Figure 7 shows the most interesting view. From this side view, I can see the profiles of the two girls as well as the curvature of their bodies and their relative position to one another. 

A Real-Live Test with Kinect Studio

Now let’s walk through a typical scenario of testing a Kinect-enabled application. Using the application from my June 2012 article (Figure 5 there), I asked Kenzy to walk through some of the movements, captured that data and saved it as an .xed file.

The original application drew ellipses on the screen only for the head, the right hand and the left hand. I wanted to include ellipses for both feet as well, so in my XAML, I added definitions for the feet ellipses. Then in the code-behind, I added some code to support drawing the feet. Because of how I designed the original application, adding feet ellipses was a trivial task. The SkeletonFrameReady event handler with code for adding the feet is shown in Figure 8. The new lines of code are lines 35-38, the two SetEllipsePosition calls for rightFoot and leftFoot.

void runtime_SkeletonFrameReady(object sender,
  SkeletonFrameReadyEventArgs e)
{
  bool receivedData = false;
 
  using (SkeletonFrame SFrame = e.OpenSkeletonFrame())
  {
    if (SFrame == null)
    {
      // The image processing took too long. More than 2 frames behind.
    }
    else
    {
      skeletons = new Skeleton[SFrame.SkeletonArrayLength];
      SFrame.CopySkeletonDataTo(skeletons);
      receivedData = true;
    }
  }
 
  if (receivedData)
  {
    Skeleton currentSkeleton = (from s in skeletons
                                where s.TrackingState ==
                                SkeletonTrackingState.Tracked
                                select s).FirstOrDefault();
 
    if (currentSkeleton != null)
    {
      SetEllipsePosition(head,
        currentSkeleton.Joints[JointType.Head]);
      SetEllipsePosition(leftHand,
        currentSkeleton.Joints[JointType.HandLeft]);
      SetEllipsePosition(rightHand,
        currentSkeleton.Joints[JointType.HandRight]);
      SetEllipsePosition(rightFoot,
        currentSkeleton.Joints[JointType.FootRight]);
      SetEllipsePosition(leftFoot,
        currentSkeleton.Joints[JointType.FootLeft]);
    }
  }
}

Figure 8 SkeletonFrameReady event handler with code to support drawing feet ellipses

Now that the code is in place, I want to run it to see whether it works. Rather than calling on Kenzy to re-enact the scene for me (greatly disturbing her day), I simply start the application, start Kinect Studio, connect my running application to Kinect Studio, open the .xed file with Kenzy’s motion and press Play. I see that the feet are indeed added and responding correctly, leaving me ready to tackle the next challenge.

Conclusion

Creating Kinect-enabled applications is a rewarding experience and also a lot of fun. Enabling applications that allow users to interact naturally and humanly rather than requiring artificial interaction mechanisms such as a mouse and keyboard can lead to amazing capabilities. Using Kinect Studio, developers can capture data feeds, save them as .xed files and reuse them to test the application. Testing with Kinect Studio by replaying .xed files is less tedious and more efficient than having to stand in front of the Kinect and act out scenarios every time the application is executed. In addition, Kinect Studio gives you another way to examine and explore the Kinect data. In short, if you’re developing Kinect-enabled applications, you definitely want to have Kinect Studio in your toolbox.

Next month, I’ll take you further into the capabilities of Kinect for Windows. Imagine what you will Kinect next!


Leland Holmquest is an Information Technology Architecture and Planning Advisor at Microsoft and a Ph.D. student at George Mason University. You can reach him at lelandholmquest.wordpress.com.