triangles are the same, we can use similar triangles to determine where the point
P will project to:
y/f = Y/Z,    y = fY/Z.
Here we denote world coordinates in 3D space as capital letters, while the image
space coordinates are denoted with lowercase letters. The same relationship can
be used to show the mapping of P in the x -direction as well:
x = fX/Z.
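The projection step can be sketched in a few lines of code. This is only an illustration of the relationship above; the focal length and point coordinates used here are made-up sample values, not Kinect parameters.

```python
def project(X, Y, Z, f):
    """Project a world-space point (X, Y, Z) onto the image plane
    of a pinhole camera with focal length f: x = f*X/Z, y = f*Y/Z."""
    return (f * X / Z, f * Y / Z)

# Illustrative values: a point at depth Z = 4 with focal length f = 2.
x, y = project(1.0, 2.0, 4.0, f=2.0)
print(x, y)  # 0.5 1.0
```

Note that depth Z appears only in the denominator, which is why it cannot be recovered from (x, y) alone.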
With these two simple relationships, it is possible to project a point in the scene
to a point in the image. However, to go the other direction and take an image
point and unproject it back into the world, we can only determine a ray along
which the point must lie. This is because a traditional camera only produces
an intensity value at each pixel; you no longer have any information about how
deep into the scene the point was when the image was taken. Fortunately for
us, the Kinect has two different camera systems, one of which produces a depth
value at each pixel. The next section discusses how to take advantage of this fact
and find a mapping between the 2D depth and color images we are given and the
3D objects that appear in them.
2.3.2 Kinect Coordinate Systems
When we are given a depth image from the Kinect, we are essentially given three
pieces of data for each pixel. The first two are the x and y image coordinates
where that particular pixel is located. In addition, the value stored at that pixel
provides the world-space distance from the depth camera. This is precisely the
Z distance from Figure 2.6 that we used when projecting a point onto the image.
Given this additional piece of data, we can easily determine the world-space point
that every pixel in the depth image represents by rearranging our previous
equations:
X = xZ/f,
Y = yZ/f.
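The inverse mapping can be sketched the same way. Again, the focal length and pixel values below are illustrative assumptions, not actual Kinect calibration data; a real device would also require correcting for the principal point and lens distortion.

```python
def unproject(x, y, Z, f):
    """Recover the world-space point (X, Y, Z) for a depth pixel at
    image coordinates (x, y) holding depth Z: X = x*Z/f, Y = y*Z/f."""
    return (x * Z / f, y * Z / f, Z)

# Round trip: projecting a point and then unprojecting it with the
# known depth Z should recover the original world-space point.
f, Z = 2.0, 4.0
x, y = (f * 1.0 / Z, f * 2.0 / Z)   # projection of (1, 2, 4)
print(unproject(x, y, Z, f))  # (1.0, 2.0, 4.0)
```

Because the depth image supplies Z directly at every pixel, this inversion is exact, which is what makes the 3D reconstruction described next possible.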
This allows us to utilize the Kinect to produce 3D representations of the scene
that it is viewing. That is already an interesting capability, but we also want
to be able to map the color-image stream to this 3D representation so that we