Jan 082013

[Warning: this is a slightly more technical article containing talk of matrices, transforms and spaces.  If it doesn’t make sense just carry on to the next article, you don’t need to understand the details!]

We saw in the previous article how to draw a point in 3D when the camera is at the origin and looking down the Z axis.  Most of the time though this won’t be the case as you’ll want to move the camera around.  It’s really hard to directly work out the projection of a point onto an arbitrarily positioned camera, but we can solve this by using transforms to get us back to the easy case.

Transform matrices

A transform in computer graphics is represented by a 4×4 matrix.  If you’re not familiar with matrix maths don’t worry – the important things to know are that you can multiply a 4×4 matrix with a position to get a different position, and you can multiply two 4×4 matrices together to get a new 4×4 matrix.  A transform matrix can perform translations, rotations, scales, projections, shears, or any combination of those.

Here is a simple example of why this is useful.  This is the scene we want to draw:

The camera is at position Z = 2, X = 1, and is rotated 40 degrees (approximately, again exact numbers don’t matter for the example) around the Y axis compared to where it was in the last example.  What we can do here is find a transform that will move the camera from its original position (at the origin, facing down the Z axis), to where it is now.

First we look at the rotation, so we need a matrix describing a 40° rotation around the Y axis.  After that we need to apply a translation of (2, 0, 1) in (x, y, z) to move the camera from the origin to its final location.  Then we can multiply the matrices together to get the camera transform.

If you’re interested, this is what a Y rotation matrix and and a translation matrix looks like:

This camera transform describes how the camera has been moved and rotated away from the origin, but on its own this doesn’t help us.  But think about if you were to move a camera ten metres to the right – this is exactly the same as if the camera had stayed where it was and the whole world was moved ten metres to the left.  The same goes for rotations, scales and any other transform.

We can take the inverse of the camera matrix – this gives us a transform that will do the opposite of the original matrix.  More precisely, if you transform a point by a matrix and then transform the result by the inverse transform the point will end up back where it started.  So while the camera transform can be used to move the camera from the origin to its current position, applying the inverse (called the view transform) to every position in the world will move those points to be relative to the camera.  The ‘view’ of the world through the camera will look the same, but the camera will still be at the origin, looking down the Z axis:

This diagram may look familiar, and in fact is the exact same setup that was used for rendering in the previous article.  So we can now use the same rendering projection method!

World, camera, object and screen spaces

Using transforms like this enables the concept of spaces.  Each ‘space’ is a frame of reference with a different thing at the origin.  We met world space last time, which is where the origin is some arbitrary position in the world, the Y axis is up and everything is positioned in the world relative to this.

We’ve just met camera space – this is the frame of reference where the origin is defined as the position of the camera, the Z axis is straight forwards from the camera and the X and Y axes are left/right and up/down from the camera’s point of view.  The view transform will convert a point from world space to camera space.

You may also come across object space.  Imagine you’re building a model of a tree in a modelling package – you’ll work in object space, probably with the origin on ground level and at the bottom.  When you’ve build your model you’ll want to position it in the world, and for this we use the basis matrix, which is a transform (exactly the same as how the camera was positioned above) that will convert the points you’ve modelled in object space into world space, thus defining where the tree is in the world.  To render the tree, the points will then be further converted to camera space.  This means that the same object can be rendered multiple times in different places just by using a different basis matrix.

Finally, as we saw in the previous article, we can convert to screen space by applying the projection matrix.

Using this in a renderer

All of these transforms are represented exactly the same, as 4×4 matrices.  Because matrices can be multiplied together to make new matrices, all of these transforms can be combined into one.  Indeed, a rendering pipeline will often just use one transform to take each point in a model and project it directly to the final position on the screen.  In general, each point in eac object is rendered using this sequence of transforms:

Basis Matrix * Camera Matrix * Projection Matrix * Position = Final screen position

You now pretty much know enough to render a basic wireframe world.  Next time I’ll talk about rendering solid 3D – colours, textures and the depth buffer.

 Leave a Reply

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>