OpenGL Coordinates
3D Basics Everyone Should Know Before Touching OpenGL
In this part I will cover 3D graphics in general; most of the following topics are not specific to OpenGL. So what exactly is 3D, and how can it be represented to the viewer on a computer screen? To describe the idea behind rendering 3D objects on the screen, it's best to use a 3D object as an example.
Let's examine the following image of a wire-framed 3D cube.
And yet it's hard to think of this image as being "flat". 3D graphics on the visual level is (mostly) all about rendering objects to the screen. The question is: what are the main requirements for rendering an object so that you recognize it as a 3D object and not just a collection of lines or polygons? Obviously, the idea is to render objects to the screen the way you would see them in real life. And how do you see objects in real life? This is where perspective comes in. Long before computers, artists painted their masterpieces with the same techniques that today's 3D software uses to create 3D images. The point behind perspective is that objects farther away from the viewer look smaller than objects closer to the viewer, and ultimately they disappear into the vanishing point. This is true for most 3D graphics applications.
Now let's take a look at the OpenGL coordinate system we will be using. It is the so-called 3D Cartesian coordinate system. As you can see, in addition to the x- and y-axes known from 2D graphics we have the z-axis, which extends into negative space from the center of the screen away from the viewer and into positive space from the center of the screen towards the viewer. This image illustrates what I've just said.
Perspective and Orthographic Projections
As we take little steps towards the end of this tutorial, I think this is
the right time to explain projection. There are actually two types of
projection: Perspective Projection and Orthographic Projection (described
shortly). First I want to talk about perspective projection because I've
already explained perspective. The objects that you're going to render will
be what we might call "projected" onto the screen. What I mean by projection
is the conversion from 3D coordinates (usually the vertices of objects) to
the flat 2D surface of the screen. Since the computer screen has only two
dimensions, we somehow have to display 3D objects on a 2D screen, and that's
precisely what projection does for us.
Perspective projection works as follows. I will take a single point as an
example. Imagine we have a point with coordinates (5, -3, 2) on the x-, y-
and z-axes respectively, and we want to project it onto the screen. We do it
with the following formula. Assume we have a structure POINT3D containing
the coordinates of the point, initialized with the mentioned values for this
example.
// initialize point (values from the example above)
POINT3D point = { 5, -3, 2 };
int x2d = HALFWIDTH + point.x * ViewingDistance / point.z;
int y2d = HALFHEIGHT + point.y * ViewingDistance / point.z;
// project the 3D point to the screen
Pixel(x2d, y2d);
Let's take the formula apart. As you already know, 2D screen coordinates are
usually based on the 4th quadrant of the 2D Cartesian coordinate system.
That means that (0, 0) is at the upper left corner of the screen. In 3D
graphics, we want our view, or the camera to be exact (the camera is
explained a little further into this tutorial), to be located as in the
following image, so that we're always looking straight down the negative
space of the z-axis.
As you can see, if we had a 3D point at (0, 0, -16) it would be exactly in
the center of the screen. A little modification is required here. Take a
look at the projection formula again. There we add half of the screen width
and height first to center all results. We're in fact translating the
origin from (0, 0) to (HALFWIDTH, HALFHEIGHT) on the screen. At 640x480
resolution we would be translating it to (320, 240). Take the constant
ViewingDistance out of the equation for a second, and you will realize that
the second part of the formula is just the ratio between X and Z for x2d and
between Y and Z for y2d. This is the most important idea behind
perspective-projected objects. As you recall, objects that are farther from
the viewer appear smaller, and this is exactly the relationship between the
2D points and the perspective, which is achieved by dividing both the
horizontal and vertical coordinates by how far away the object is. However,
there is a problem. By merely dividing the x and y coordinates by depth (the
z coordinate) we only get the ratio between the depth and the
horizontal/vertical position of the point. What we need is how they are
related to the Viewing Distance and Viewing Volume. These two terms are
explained below.
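To make the division by depth concrete, here is a minimal sketch of the same formula as a C function. The names Point3D, Point2D and project_perspective are made up for this illustration and are not part of OpenGL; the sketch also assumes the depth is a positive distance in front of the viewer.

```c
/* Hypothetical types for this sketch; they are not part of OpenGL. */
typedef struct { double x, y, z; } Point3D;
typedef struct { int x, y; } Point2D;

/* Project a 3D point onto a screen whose center is (half_w, half_h).
   Assumes p.z is a positive depth in front of the viewer, as in the
   formula discussed above. */
Point2D project_perspective(Point3D p, int half_w, int half_h,
                            double viewing_distance)
{
    Point2D out;
    /* scale by viewing_distance / depth, then move the origin to the
       center of the screen */
    out.x = half_w + (int)(p.x * viewing_distance / p.z);
    out.y = half_h + (int)(p.y * viewing_distance / p.z);
    return out;
}
```

For example, with half_w = 320, half_h = 240 and a viewing distance of 4, the point (100, 50, 2) lands at (520, 340), while the same point pushed back to a depth of 4 lands at (420, 290), closer to the center of the screen.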
The Viewing Volume is the space between the near clipping plane (or the
viewing plane) and the far clipping plane, as seen in the second picture
below. So, back to our equation for a second: we simply multiply x and y by
ViewingDistance to get the right relationship between the Viewing Volume and
the X and Y coordinates. Simple as that. Viewing Distance is closely related
to the Viewing Volume. The longer the viewing distance, the narrower the
line of sight and therefore the smaller the viewing volume. The good news is
that we don't have to worry about any of this in OpenGL, since everything is
done behind the scenes. You still need to understand these terms, however,
to understand why images appear the way they do on the screen, and I just
wanted to explain the basics of perspective projection. The above formula
could be used in a software 3D renderer, but we're not interested in that at
this moment.
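The link between viewing distance and line of sight follows from the same similar triangles as the projection formula. The sketch below is only an illustration under that assumption; visible_half_width is a hypothetical helper, not an OpenGL call. It computes how far to the side a point at a given depth can be and still land on the edge of the screen.

```c
/* Half-width of the region visible at a given depth, for a screen of
   half-width half_width placed viewing_distance units from the eye.
   Derived by solving half_width = X * viewing_distance / depth for X. */
double visible_half_width(double half_width, double viewing_distance,
                          double depth)
{
    return half_width * depth / viewing_distance;
}
```

With half_width = 320 and depth = 100, a viewing distance of 320 gives a visible half-width of 100, while doubling the viewing distance to 640 halves it to 50: a longer viewing distance means a narrower view.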
In conclusion, here's how a whole object (as opposed to the single point in
the previous example) would be projected onto the screen in theory. At the
upper right corner of this image there is a real object (a cube) in space. I
tried to make the projected version of the cube as it appears on the screen
as close as possible to what it would really look like, but I'm sure it
isn't exact. Just keep in mind that the whole object is projected onto the
flat screen pixel by pixel (and polygon by polygon on a higher scale).
I talked about the Viewing Volume and how it is related to the perspective
projection equation. But what is the Viewing Volume? It is also known as the
Clipping Volume or the Frustum. Here's the visual representation of the
viewing volume.
There are two planes, the viewing plane and the far clipping plane. The
viewing plane is actually the screen, and the far plane indicates how far
you can "see": whatever is behind the far clipping plane will not be
visible. The viewing volume is the space between those two planes. It is
sometimes called the clipping volume because you usually want to clip your
polygons against it.
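As a rough sketch of what "inside the viewing volume" means, the function below tests a single point against a symmetric volume. It assumes a camera at the origin looking down the negative z-axis; the ViewVolume structure and the function name are hypothetical and only serve this example (OpenGL does its own clipping internally).

```c
#include <stdbool.h>

/* Hypothetical description of a symmetric viewing volume for a camera at
   the origin looking down the negative z-axis. */
typedef struct {
    double half_width;   /* half the width of the viewing plane  */
    double half_height;  /* half the height of the viewing plane */
    double near_dist;    /* distance to the viewing (near) plane */
    double far_dist;     /* distance to the far clipping plane   */
} ViewVolume;

/* Returns true when the point (x, y, z) lies inside the viewing volume. */
bool inside_view_volume(const ViewVolume *v, double x, double y, double z)
{
    double depth = -z;  /* points in front of the camera have negative z */
    if (depth < v->near_dist || depth > v->far_dist)
        return false;
    /* The visible extent grows linearly with depth (similar triangles). */
    double max_x = v->half_width * depth / v->near_dist;
    double max_y = v->half_height * depth / v->near_dist;
    return x >= -max_x && x <= max_x && y >= -max_y && y <= max_y;
}
```

A point straight ahead, such as (0, 0, -16), passes the test as long as 16 lies between the near and far distances; a point behind the camera or beyond the far plane does not.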
Orthographic Projection
As I mentioned before, there is another type of projection: the
Orthographic Projection. This type of projection does not give desirable
results for 3D games or other real-time applications that need a sense of
depth, since it ignores the z coordinate when sizing objects. In other
words, if you draw a bunch of trees close to and far away from the view,
they will all appear the same size. Orthographic projection is used in
technical design software, and OpenGL supports it as well. In this series
of OpenGL tutorials we will always be using the perspective projection.
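The difference between the two projections can be sketched in a few lines. Both helpers below are hypothetical (neither is an OpenGL function), use a scale of one screen unit per world unit for the orthographic case, and treat z as a positive depth, as in the earlier formula.

```c
/* Orthographic: depth z is ignored, so size does not change with distance. */
int ortho_screen_x(double x, double z, int half_w)
{
    (void)z;  /* unused on purpose */
    return half_w + (int)x;
}

/* Perspective: x is scaled by viewing_distance / z (z = positive depth). */
int perspective_screen_x(double x, double z, int half_w,
                         double viewing_distance)
{
    return half_w + (int)(x * viewing_distance / z);
}
```

With half_w = 320, a point at x = 100 maps to 420 under the orthographic helper whether its depth is 2 or 50. Under the perspective helper with a viewing distance of 4 it maps to 520 at depth 2 and to 328 at depth 50.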
The 3D Camera
At this point I should explain what the camera is. The camera is always
located at the origin of the virtual "view". Note, however, that it is NOT
NECESSARILY located at the origin of the COORDINATE SYSTEM, since you can
move the camera around and transform it to anywhere in the world. The camera
and the view are basically the same thing. The camera is only mentioned to
represent a virtual viewing point; there is no physical camera anywhere.
I already talked about it, but it is important to understand that there is
some space between the origin of the camera and the viewing plane, as you
saw in the previous image. That space is the VIEWING DISTANCE.
If you look straight ahead, for example, you are considered to be looking
down the camera's z-axis into the negative z space, in 3D terms. Camera
rotation is possible around all 3 axes, as you would expect, and is made
even easier for you by OpenGL. Camera rotation is responsible for turning
the view, and it's what happens when you move your virtual head around with
the mouse or arrow keys in a 3D first-person shooter. Let's examine the
camera a little closer. The camera, like any other object in space, has 2
coordinate systems: the Local Coordinate System and the World Coordinate
System. The local coordinates are the camera's rotation in degrees around
each of its LOCAL x-, y- and z-axes and its displacement within the local
coordinate system. The world coordinates specify the camera's position in
the world. For example, when you walk around in a 3D first-person shooter
you are actually changing the camera's world coordinates, and when you look
around you change the camera's local coordinates. It is also possible to use
the local camera coordinates for moving, by translating them to the new
location, but only once rotation has been performed. Rotation is also done
in local coordinates around (0,0,0), so if you move the camera to, say,
(0, 5, 0) before rotating, it will not rotate correctly: its displaced
center will be taken into account during rotation. Remember this rule:
always rotate around the local center (0,0,0). If this sounds confusing,
don't worry. It will all settle down the more you study and actually code in
OpenGL, if you haven't already.
Here's how the camera's coordinates are transformed.
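The rule about rotating around the local center can be checked with a small sketch. To keep it free of trigonometry it uses a fixed quarter turn about the z-axis; Vec3 and all four functions are hypothetical helpers for this illustration, not OpenGL calls.

```c
typedef struct { double x, y, z; } Vec3;

/* Quarter turn (90 degrees counter-clockwise) about the z-axis through
   the origin (0, 0, 0): (x, y) becomes (-y, x). */
Vec3 rotate_z_90(Vec3 p)
{
    Vec3 r = { -p.y, p.x, p.z };
    return r;
}

/* Move a point by the given offset. */
Vec3 translate(Vec3 p, Vec3 offset)
{
    Vec3 r = { p.x + offset.x, p.y + offset.y, p.z + offset.z };
    return r;
}

/* Rotate about the local center first, then move into place. */
Vec3 rotate_then_translate(Vec3 p, Vec3 offset)
{
    return translate(rotate_z_90(p), offset);
}

/* Move first, then rotate: the offset gets rotated as well. */
Vec3 translate_then_rotate(Vec3 p, Vec3 offset)
{
    return rotate_z_90(translate(p, offset));
}
```

Starting from the local point (1, 0, 0) with an offset of (0, 5, 0), rotating first gives (0, 6, 0): the point turned in place and then moved up. Translating first gives (-5, 1, 0), because the displacement itself was swung around the origin.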
If you understand this so far, that's good. Now, let's move on to object
rotation basics. This is exactly the same as demonstrated in the camera
rotation part of the above image. The only difference is that we're not
viewing the world FROM that object, but are in fact OBSERVING that object
from the current camera position. This is the way an object is rotated
around any of the 3 possible axes. When we get down to actually doing it in
the following tutorials, I will make it clearer, so don't worry if you
don't get something at this moment.
Just as with the camera, objects also have two coordinate systems and, as
you might have guessed already, they are positioned according to the LOCAL
and WORLD coordinate systems. The local coordinates are usually used for
rotating the object, and the world coordinates are used for positioning the
object in the world or, say, in a 3D level.
As you add objects and static polygons (e.g. walls, terrain, etc.) to your
3D world, you want to discard all of the polygons that are not located in
the camera's viewing volume. You also want to clip off the parts of polygons
that cross the edge of the viewing volume against the bounding box of the
screen. OpenGL does this clipping for us. Another issue associated with
drawing polygons is that you don't want to draw the back faces (or back
sides) of polygons when they are facing the camera. Imagine a textured
polygon which is rotated by 180 degrees so that its "back" is facing us.
Let's also assume that the polygon is part of a bigger structure, a wall for
example. Usually you will never want to see what's "behind" the wall. Have
you ever wanted to see what's behind your room's wallpaper? I surely hope
not. So the point is, if you rotate a textured polygon, its coordinates are
reversed as judged from the camera view. You never want to see that side
anyway, and that space is usually covered by another side of the wall, so
why bother drawing it? That's right, there is no reason to, and a technique
called Back-face Culling comes to our help. Back-face culling works this
way: it calculates the normal of the polygon (a normal is a vector
perpendicular to the polygon, pointing straight out of it at a 90-degree
angle, and is very common in 3D graphics), and if the normal is pointing in
the same direction as the camera, the surface of that polygon is not
rendered, as illustrated in this image.
This technique was so common among the older 3D engines that the developers
of OpenGL decided to take it into consideration and do all the dirty work
for us in hardware to speed up the pipeline, which is in fact the next topic
of this tutorial.
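The normal test described above can be sketched in plain C. This is only an illustration of the idea, with hypothetical helper names and a fixed view direction as a simplifying assumption. In OpenGL itself you would simply call glEnable(GL_CULL_FACE) and glCullFace(GL_BACK), and OpenGL decides which side is the front from the winding order of the vertices on screen (see glFrontFace) rather than from a normal you supply.

```c
#include <stdbool.h>

typedef struct { double x, y, z; } Vec3;

static Vec3 vec_sub(Vec3 a, Vec3 b)
{
    Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}

static Vec3 vec_cross(Vec3 a, Vec3 b)
{
    Vec3 r = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return r;
}

static double vec_dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* True when the triangle's normal points the same way the camera looks,
   i.e. the triangle shows its back to the viewer and can be skipped.
   The normal comes from the vertex order a, b, c. */
bool is_back_face(Vec3 a, Vec3 b, Vec3 c, Vec3 view_dir)
{
    Vec3 normal = vec_cross(vec_sub(b, a), vec_sub(c, a));
    return vec_dot(normal, view_dir) > 0.0;
}
```

For a camera looking down the negative z-axis (view_dir = (0, 0, -1)), the triangle (0,0,-5), (1,0,-5), (0,1,-5) has a normal pointing towards the viewer and is kept. Listing the same vertices in the opposite order flips the normal, and the triangle is culled.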
3D Graphics Pipeline
In case you're wondering what's up with all these pipelines everyone is
talking about, a pipeline is nothing more than an ordered sequence of
relatively distinct operations. At this stage it is too early to talk about
what the operations are. Depending on what kind of program you're writing,
be it a 3D FPS engine or a flight simulator, the pipeline might take
different forms that work best for the given task. Therefore I'm not going
to describe it here in detail, but I will as soon as we get some tasks to do
in further tutorials.
OpenGL Variable and Function Naming Conventions
In conclusion I want to say a few words on this topic. OpenGL was made for
use with various environments, not just Windows. You can always find more
information in the numerous OpenGL books, which are reasonably affordable
for technical books considering the amount of knowledge you will have gained
by the time you finish one. In this section I explain naming conventions for
both OpenGL functions and variables. Although you don't have to use
OpenGL-defined types, I still feel obligated to describe them here so that
anyone who wants their software to be platform-independent understands what
this all means. Well, let's see.
OpenGL has a number of predefined types. If you never plan on being
platform-independent, it might be simplest to use the native C types such as
int, float and double. If that's not the case, however, OpenGL has
definitions that will work on the current system, whatever the system is.
All you have to do is add GL in front of the standard C type. For example,
if you want a floating-point type use GLfloat instead of C's float, and if
you want an int, use GLint. That works for most of the other basic C types
as well, such as GLshort and GLdouble. If you want an unsigned value, just
add a "u" between GL and the type, like so: GLuint is an unsigned integer.
There is also GLboolean, which holds either GL_TRUE or GL_FALSE. GLbitfield
is used for bit fields (bit masks). A slightly less obvious kind of type in
OpenGL is the clamp type; its variations are GLclampf and GLclampd for float
and double values respectively. The name refers to values that are clamped
to the range [0.0, 1.0], and these types are mostly used for color
components. There are no special types for pointers. Pointers are declared
the usual way. For instance, this is an array of 16 pointers to GLint:
GLint *i[16];
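To show what "clamped" means for the GLclampf and GLclampd types, here is a tiny sketch. The helper clamp01 is hypothetical; OpenGL performs this clamping itself for functions such as glClearColor, whose arguments are declared as GLclampf in classic OpenGL headers, so you never have to write it. Plain float is used so the sketch compiles without the OpenGL headers.

```c
/* Limit a value to the range [0.0, 1.0], which is what happens to a
   value passed as a GLclampf parameter. */
float clamp01(float value)
{
    if (value < 0.0f)
        return 0.0f;
    if (value > 1.0f)
        return 1.0f;
    return value;
}
```

So a color component of 1.5 is treated as 1.0, a component of -0.25 as 0.0, and anything already inside the range is left alone.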
Each OpenGL function name follows a neat convention: a library prefix, the
name of the command, the number of parameters and a letter for their type.
To demonstrate this on a real function name I will use the glVertex3f
function.
glVertex3f(0.0f, 0.0f, 0.0f);
| |     ||
| |     |+- f means all parameters are floats
| |     +-- 3 is the number of parameters
| +-------- Vertex is the name of the function that renders a 3D point (or a vertex)
+---------- gl specifies the OpenGL library
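The breakdown above can also be expressed as a small parser, purely to illustrate the convention. GLNameParts and split_gl_name are hypothetical names, not part of OpenGL, and the sketch only handles the single-letter type codes b, s, i, f and d.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical result of splitting a name such as "glVertex3f". */
typedef struct {
    char root[32];  /* e.g. "Vertex" */
    int  count;     /* e.g. 3        */
    char type;      /* e.g. 'f'      */
} GLNameParts;

/* Returns 1 on success, 0 if the name does not have the form
   gl<Root><count><type>. */
int split_gl_name(const char *name, GLNameParts *out)
{
    size_t len = strlen(name);
    if (len < 5 || strncmp(name, "gl", 2) != 0)
        return 0;
    char type = name[len - 1];
    char digit = name[len - 2];
    if (strchr("bsifd", type) == NULL || !isdigit((unsigned char)digit))
        return 0;
    size_t root_len = len - 4;  /* minus "gl", the digit and the type */
    if (root_len >= sizeof out->root)
        return 0;
    memcpy(out->root, name + 2, root_len);
    out->root[root_len] = '\0';
    out->count = digit - '0';
    out->type = type;
    return 1;
}
```

Given "glVertex3f" it yields the root "Vertex", the count 3 and the type 'f'. A name without the count and type suffix, such as "glClear", is rejected by this sketch.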
The last two parts of the name (the parameter count and the type letter)
are mostly encountered in the functions that are responsible for drawing
primitives. Many other functions are usually used in
this form: