Title: LEGO NXT Meets Kinect

CMSC838F: Assignment 4 (11/21/12)

Cheng Fu, Ph.D. Student, Department of Geography

Kotaro Hara, Ph.D. Student, Department of Computer Science






We see an increasing interest in interaction modalities that go beyond the keyboard and mouse, so-called Natural User Interaction (NUI), which does not require mediated interaction with specialized devices.

Modern research has produced a variety of input channels, for example speech-based control (Siri) and interactive tabletops. Among these techniques, gestural interaction based on computer-vision body tracking is becoming increasingly popular.

One application of NUI that we are interested in is controlling physical objects with gestures. For example, radio-controlled helicopters, which used to require specially designed controllers, could instead be controlled through NUI. We believe this could make controlling toys an even more immersive and enjoyable experience.

To this end, we used the Microsoft Kinect (Kinect for Windows) and took advantage of its built-in joint-tracking feature to let users control a Lego Mindstorms NXT robot with whole-body gestures.


Fig. 1 Microsoft Kinect for Windows

Gesture-based control has become a popular way to manipulate characters and avatars in video games such as Wii Sports. In addition to traditional manipulation mechanisms that use specialized game controllers, computer-vision-based interaction techniques are becoming popular, as computers can now, using Kinect, detect game players' whole-body gestures. In addition to the RGB images a traditional webcam provides, the Microsoft Kinect provides depth information.

Although originally designed as a game controller, the Kinect has been extensively hacked by developers, who have come up with a number of interesting applications. In 2012, Microsoft officially announced Kinect for Windows along with an API that developers can use to collect sensor data from programming languages such as C#, C++, and Visual Basic. The API allows programmers to track skeleton joints, and from them body gestures, through well-wrapped class methods without much knowledge of image and signal processing.

In this assignment, we developed a C# program that lets users control a physical robot, the Lego Mindstorms NXT (NXT), with upper-body gestures detected from the user's joint position data collected through the Kinect API.

Lego Mindstorms NXT


Fig. 2 Lego Mindstorms NXT robot

Lego Mindstorms NXT is a robotics kit from LEGO that allows robot builders to combine different parts and actuators to build their own robots. The NXT we used has two motors that drive the left and right wheels/caterpillar tracks, and a core brick that communicates with a computer over a serial connection (Bluetooth or USB). We chose Bluetooth because a wireless robot can move more freely. We can control the left and right motors attached to the NXT, setting their rotation speed and direction by sending messages over the Bluetooth connection. This control capability lets us make the NXT, for example, move forward, move backward, and turn.

Some open-source projects, like MindSqualls, wrap these commands as class methods, which makes it convenient to get started quickly.

Controlling NXT

NXT movements

We can control the two motors attached to the NXT, which drive its right-side and left-side wheels. This allows us to programmatically command the NXT to:
  • Move forward (Move-Forward)
  • Move backward (Move-Backward)
  • Rotate counter-clock-wise (Rotate-CCW)
  • Rotate clock-wise (Rotate-CW)
  • Turn right while going forward (Turn-Right-Forward)
  • Turn left while going forward (Turn-Left-Forward)
  • Turn right while going backward (Turn-Right-Backward)
  • Turn left while going backward (Turn-Left-Backward)
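Each of these movements amounts to choosing a power for each of the two motors. The mapping can be sketched as follows (shown in Python for brevity; our actual implementation is in C# on top of MindSqualls, and the specific power values here are illustrative, not the ones we used):

```python
# Sketch: mapping the eight NXT movements to (left, right) motor powers.
# The NXT accepts motor powers in -100..100; the sign gives the direction.

MOVEMENTS = {
    "Move-Forward":        (75, 75),
    "Move-Backward":       (-75, -75),
    "Rotate-CCW":          (-50, 50),   # left motor backward, right forward
    "Rotate-CW":           (50, -50),
    "Turn-Right-Forward":  (75, 40),    # outer (left) wheel spins faster
    "Turn-Left-Forward":   (40, 75),
    "Turn-Right-Backward": (-75, -40),
    "Turn-Left-Backward":  (-40, -75),
}

def motor_powers(movement):
    """Return the (left, right) motor power pair for a named movement."""
    return MOVEMENTS[movement]
```

Turning is achieved by driving the two wheels at different speeds, and rotating in place by driving them in opposite directions.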


Fig. 3 Body gestures for detection (from left: initializing/brake, forward, backward, stop, right-hand-forward, left-hand-forward, clockwise rotation, and counterclockwise rotation)
We explored several upper-body gestures mapped to the NXT movements. For example, the Right-Hand-Forward gesture is mapped to Move-Forward. The list of gestures and the corresponding NXT movements is shown in the figure above (Fig. 3). Exploring the optimal gestures for controlling the NXT is beyond the scope of this assignment.
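A gesture like this can be detected with simple thresholds on the tracked joint positions, e.g. by checking how far a hand is held in front of the corresponding shoulder along the depth axis. The following is a sketch of that idea in Python (our implementation is in C#; the joint names, threshold value, and the handling of the remaining gestures are illustrative):

```python
# Sketch: threshold-based upper-body gesture classification.
# Joints are (x, y, z) positions in meters in the Kinect camera space,
# where a smaller z means closer to the sensor. Threshold is illustrative.

FORWARD_THRESHOLD = 0.40  # hand held ~40 cm in front of the shoulder

def classify(joints):
    """Classify one frame of joint positions into a gesture name (or None)."""
    rh, lh = joints["right_hand"], joints["left_hand"]
    rs, ls = joints["right_shoulder"], joints["left_shoulder"]

    right_fwd = rs[2] - rh[2] > FORWARD_THRESHOLD
    left_fwd = ls[2] - lh[2] > FORWARD_THRESHOLD

    if right_fwd and left_fwd:
        return "Forward"
    if right_fwd:
        return "Right-Hand-Forward"
    if left_fwd:
        return "Left-Hand-Forward"
    return None  # other gestures (stop, rotation, ...) use similar rules
```

The same pattern, comparing pairs of joints against distance thresholds, extends to the rotation and brake gestures.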

Serial communication

Once the Bluetooth connection is established, the PC recognizes the robot as a serial communication device. When a gesture is recognized, the corresponding command is sent through the serial port, and the microcontroller on the robot parses the message and drives the hardware accordingly.
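MindSqualls hides the wire format, but underneath, each message is an NXT "direct command" telegram; over Bluetooth it is additionally prefixed with a two-byte little-endian length. As a sketch, here is how the SETOUTPUTSTATE telegram that sets one motor's power could be built (byte layout as we understand the LEGO NXT Bluetooth Developer Kit; verify against that document before relying on it):

```python
import struct

def set_output_state(port, power):
    """Build an NXT SETOUTPUTSTATE direct-command telegram for Bluetooth.

    port: output port 0-2; power: -100..100 (sign gives rotation direction).
    """
    body = struct.pack(
        "<BBBbBBbBI",
        0x80,   # command type: direct command, no reply requested
        0x04,   # opcode: SETOUTPUTSTATE
        port,   # output port
        power,  # signed power set point
        0x01,   # mode: MOTORON
        0x00,   # regulation mode: idle
        0,      # turn ratio
        0x20,   # run state: RUNNING
        0,      # tacho limit: 0 = run forever
    )
    # Bluetooth framing: 2-byte little-endian length prefix
    return struct.pack("<H", len(body)) + body
```

The resulting bytes are written to the serial port that the operating system exposes for the paired NXT (e.g. a `SerialPort` in C#).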


The biggest challenge is determining which gestures should be detected as body commands. We have to consider the accuracy of joint recognition, especially during fast movements and when joints overlap. We also have to consider people's conventions for body gestures, which forced us to abandon some gestures that are easy to recognize but not user-friendly.
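One way to mitigate noisy joint estimates during fast movement is to smooth the raw positions before applying the gesture thresholds, for instance with a simple exponential moving average (a sketch in Python; the Kinect SDK also offers its own skeleton-smoothing parameters):

```python
class JointSmoother:
    """Exponential moving average over a stream of 3-D joint positions."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha  # 0 < alpha <= 1; smaller = smoother but laggier
        self.state = None

    def update(self, pos):
        """Feed one raw (x, y, z) sample; return the smoothed position."""
        if self.state is None:
            self.state = pos
        else:
            a = self.alpha
            self.state = tuple(a * p + (1 - a) * s
                               for p, s in zip(pos, self.state))
        return self.state
```

Smoothing trades responsiveness for stability, so the smoothing factor has to be tuned together with the gesture thresholds.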

The initializing process for skeleton tracking on the Kinect takes time (around 2 minutes in most cases), and the Bluetooth connection usually takes about 20 seconds. Once the connection is established, however, the robot is very responsive to commands. For new users, though, their body movements may not reach the distance threshold for gesture recognition, which can make the robot fail to respond.


In this project, we applied gesture recognition based on the Kinect API and converted specific gestures into corresponding commands that control a Lego robot over a wireless connection.


Kinect for Windows http://www.microsoft.com/en-us/kinectforwindows/
Lego Mindstorms NXT http://mindstorms.lego.com/en-us/Default.aspx
Microsoft NUI http://research.microsoft.com/en-us/collaboration/focus/nui/default.aspx
Mindsqualls http://www.mindsqualls.net/