3D projection

A 3D projection is a mathematical transformation used to project three-dimensional points onto a two-dimensional plane[1]. This is often done to simulate the relationship of a camera to a subject: 3D projection is typically the first step in representing three-dimensional shapes two-dimensionally in computer graphics, a process known as rendering. The result can look similar to a picture taken with a camera[2], or to a perspective drawing in the graphic arts.

Data about the objects to render is usually stored as a collection of points, linked together in triangles. Each point is a set of three numbers representing its X, Y, Z coordinates relative to the origin of the object it belongs to. Each triangle is a set of three such points. In addition, the object has three coordinates X, Y, Z and some kind of rotation, for example three angles alpha, beta and gamma, describing its position and orientation relative to a "world" reference frame.

Last comes the observer (or camera). The observer has a set of three X,Y,Z coordinates and three alpha, beta and gamma angles, describing the observer's position and the direction in which it is pointing.
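
The data layout described above could be sketched in Python as follows. This is only an illustration: the names `Pose` and `SceneObject` are hypothetical, and storing each triangle as three indices into the point list (rather than as three points) is an assumption made here for compactness.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]   # X, Y, Z relative to the object's own origin
Triangle = Tuple[int, int, int]      # indices into the point list (an assumption)


@dataclass
class Pose:
    """Position and orientation relative to the "world" reference frame."""
    x: float
    y: float
    z: float
    alpha: float  # rotation about the x-axis, in radians
    beta: float   # rotation about the y-axis, in radians
    gamma: float  # rotation about the z-axis, in radians


@dataclass
class SceneObject:
    points: List[Point]
    triangles: List[Triangle]
    pose: Pose


# A single triangle placed five units along the world z-axis, unrotated.
triangle = SceneObject(
    points=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    triangles=[(0, 1, 2)],
    pose=Pose(0.0, 0.0, 5.0, 0.0, 0.0, 0.0),
)

# The observer is described by the same six numbers.
observer = Pose(0.0, 0.0, 0.0, 0.0, 0.0, 0.0)

# Look up the three corner points of the first triangle.
corners = [triangle.points[i] for i in triangle.triangles[0]]
```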

All this data is usually stored using floating-point values, although many programs convert them to integers at various points in the algorithm to speed up the calculations.

Points given relative to an object, as described above, are said to be in object space, and must be transformed into world space as a first step for projection [3]. The complete projection, which takes a point from object space to screen space, can be done in three steps, detailed in the following sections:

• world transform: transforms a point from object space to world space.
• camera transform: transforms a point from world space to camera space. This includes a translation so the projection plane goes through the origin, and a rotation so the projection plane is perpendicular to the camera direction [4].
• perspective transform: projects a point onto the camera's view plane. This results in a 2D point in screen space[4].

First step: world transform

The first step is to transform the point's coordinates, taking into account the position and orientation of the object they belong to. This is done using a set of four matrices. (The matrices used here follow the column-vector convention, often loosely called column major, i.e. v' = Matrix × v; this is the same as in OpenGL but different from DirectX.)

Object translation: $\begin{bmatrix} 1 & 0 & 0 & x \\ 0 & 1 & 0 & y \\ 0 & 0 & 1 & z \\ 0 & 0 & 0 & 1 \end{bmatrix}$
Rotation about the x-axis: $\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha & 0 \\ 0 & \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
Rotation about the y-axis: $\begin{bmatrix} \cos\beta & 0 & \sin\beta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\beta & 0 & \cos\beta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
Rotation about the z-axis: $\begin{bmatrix} \cos\gamma & -\sin\gamma & 0 & 0 \\ \sin\gamma & \cos\gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
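
As a minimal sketch, the four matrices can be written as plain nested lists in Python. The function names (`translation`, `rotation_x`, `rotation_y`, `rotation_z`, `apply`) are illustrative choices, not part of any library; angles are assumed to be in radians.

```python
import math


def translation(x, y, z):
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_x(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_y(beta):
    c, s = math.cos(beta), math.sin(beta)
    return [[c, 0.0, s, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-s, 0.0, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_z(gamma):
    c, s = math.cos(gamma), math.sin(gamma)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def apply(matrix, vector):
    """Column-vector convention: v' = Matrix * v."""
    return [sum(matrix[i][k] * vector[k] for k in range(4)) for i in range(4)]


# Rotating the point (1, 0, 0) by 90 degrees about the z-axis
# gives approximately (0, 1, 0).
rotated = apply(rotation_z(math.pi / 2), [1.0, 0.0, 0.0, 1.0])

# Translating the origin by (2, 3, 4) gives (2, 3, 4).
moved = apply(translation(2.0, 3.0, 4.0), [0.0, 0.0, 0.0, 1.0])
```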

The four matrices are multiplied together, and the result is the world transform matrix: a matrix that, when a point's coordinates are multiplied by it, gives the point's coordinates expressed in the "world" reference frame.

Note that, unlike multiplication between numbers, the order in which the matrices are multiplied is significant; changing the order changes the result. For the three rotation matrices, a fixed order must be chosen and then used consistently. The object should be rotated before it is translated, since otherwise the position of the object in the world would get rotated around the centre of the world, wherever that happens to be.

World transform = Translation × Rotation
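
The effect of the multiplication order can be checked numerically. The following sketch uses hypothetical helper names and an arbitrary example (a translation of 5 along x and a 90-degree rotation about z); it compares Translation × Rotation with Rotation × Translation for the same point.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(matrix, vector):
    return [sum(matrix[i][k] * vector[k] for k in range(4)) for i in range(4)]


def translation(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]


def rotation_z(gamma):
    c, s = math.cos(gamma), math.sin(gamma)
    return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


translate = translation(5.0, 0.0, 0.0)
rotate = rotation_z(math.pi / 2)
point = [1.0, 0.0, 0.0, 1.0]

# Translation x Rotation: the point is rotated in place, then moved.
# Result is approximately (5, 1, 0).
rotate_then_translate = apply(matmul(translate, rotate), point)

# Rotation x Translation: the point is moved first, then swung around
# the world origin. Result is approximately (0, 6, 0).
translate_then_rotate = apply(matmul(rotate, translate), point)
```

With the column-vector convention, the rightmost matrix is the one applied to the point first, which is why Translation × Rotation rotates before translating.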

To complete the transform in the most general way possible, another matrix, called the scaling matrix, is used to scale the model along the axes. This matrix is multiplied with the four given above to yield the complete world transform. The form of this matrix is:

$\begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$, where $s_x$, $s_y$ and $s_z$ are the scaling factors along the three coordinate axes.

Since it is usually convenient to scale the model in its own model space or coordinate system, scaling should be the first transformation applied. The final transform thus becomes:

World transform = Translation × Rotation × Scaling
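
To illustrate why scaling is applied first, the sketch below compares Translation × Scaling with Scaling × Translation, leaving the rotation out for brevity. The helper names and the example values (a translation of 10 along x, a uniform scale of 2) are assumptions chosen for easy checking.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(matrix, vector):
    return [sum(matrix[i][k] * vector[k] for k in range(4)) for i in range(4)]


def translation(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]


def scaling(sx, sy, sz):
    return [[sx, 0.0, 0.0, 0.0], [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0], [0.0, 0.0, 0.0, 1.0]]


move = translation(10.0, 0.0, 0.0)
scale = scaling(2.0, 2.0, 2.0)
point = [1.0, 0.0, 0.0, 1.0]   # a model point one unit from the model origin

# Scaling applied first (rightmost): the model doubles in size around its
# own origin and is then placed at x = 10, so the point lands at x = 12.
scaled_in_model_space = apply(matmul(move, scale), point)

# Scaling applied last: the object's world position is doubled as well,
# so the point lands at x = 22.
scaled_in_world_space = apply(matmul(scale, move), point)
```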

(Some computer graphics books and programming APIs, such as DirectX, use matrices with the translation vector in the bottom row; in that scheme, the order of the matrices is reversed.)

Final result of Translation × x × y × z × Scaling, where x, y and z stand for the rotations about the respective axes:

$\begin{bmatrix} s_x\cos\gamma\cos\beta & -s_y\sin\gamma\cos\beta & s_z\sin\beta & x \\ s_x\cos\gamma\sin\beta\sin\alpha + s_x\sin\gamma\cos\alpha & s_y\cos\gamma\cos\alpha - s_y\sin\gamma\sin\beta\sin\alpha & -s_z\cos\beta\sin\alpha & y \\ s_x\sin\gamma\sin\alpha - s_x\cos\gamma\sin\beta\cos\alpha & s_y\sin\gamma\sin\beta\cos\alpha + s_y\sin\alpha\cos\gamma & s_z\cos\beta\cos\alpha & z \\ 0 & 0 & 0 & 1 \end{bmatrix}$
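
The closed-form matrix can be checked numerically against the product of the individual matrices. The sketch below does this for one arbitrary set of parameters; all function names are illustrative, and the parameter values are made up for the check.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(x, y, z):
    return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]]


def rotation_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]


def rotation_y(b):
    c, s = math.cos(b), math.sin(b)
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]


def rotation_z(g):
    c, s = math.cos(g), math.sin(g)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def scaling(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]


def closed_form(x, y, z, a, b, g, sx, sy, sz):
    """The combined matrix, written out entry by entry."""
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cg, sg = math.cos(g), math.sin(g)
    return [
        [sx * cg * cb, -sy * sg * cb, sz * sb, x],
        [sx * cg * sb * sa + sx * sg * ca,
         sy * cg * ca - sy * sg * sb * sa, -sz * cb * sa, y],
        [sx * sg * sa - sx * cg * sb * ca,
         sy * sg * sb * ca + sy * sa * cg, sz * cb * ca, z],
        [0, 0, 0, 1],
    ]


# Arbitrary test values: position, three angles (radians), three scale factors.
params = (1.0, -2.0, 3.0, 0.3, -0.7, 1.1, 2.0, 0.5, 1.5)
x, y, z, a, b, g, sx, sy, sz = params

# Translation x Rx x Ry x Rz x Scaling, multiplied out step by step.
product = matmul(translation(x, y, z),
                 matmul(rotation_x(a),
                        matmul(rotation_y(b),
                               matmul(rotation_z(g), scaling(sx, sy, sz)))))

expected = closed_form(*params)
max_error = max(abs(product[i][j] - expected[i][j])
                for i in range(4) for j in range(4))
```

For these values `max_error` stays at the level of floating-point rounding, which is consistent with the closed form being the product in the stated order.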

Second step: camera transform

The second step is virtually identical to the first one, except that it uses the six coordinates of the observer instead of the object, the inverses of the matrices are used, and they are multiplied in the opposite order. (Note that $(A \times B)^{-1} = B^{-1} \times A^{-1}$.) The resulting matrix transforms coordinates from the world reference frame to the observer's.

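The camera typically looks in its z direction, and the y direction is typically up. With the screen convention used in the third step, where positive x'/ω' is the right-hand side of the screen, the x direction points right.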

Inverse translation by the observer's position (the inverse of a translation is a translation in the opposite direction): $\begin{bmatrix} 1 & 0 & 0 & -x \\ 0 & 1 & 0 & -y \\ 0 & 0 & 1 & -z \\ 0 & 0 & 0 & 1 \end{bmatrix}$
Inverse rotation about the x-axis (the inverse of a rotation is a rotation in the opposite direction; note that sin(−x) = −sin(x) and cos(−x) = cos(x)): $\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha & 0 \\ 0 & -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
Inverse rotation about the y-axis: $\begin{bmatrix} \cos\beta & 0 & -\sin\beta & 0 \\ 0 & 1 & 0 & 0 \\ \sin\beta & 0 & \cos\beta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
Inverse rotation about the z-axis: $\begin{bmatrix} \cos\gamma & \sin\gamma & 0 & 0 \\ -\sin\gamma & \cos\gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$

The two matrices obtained from the first two steps can be multiplied together to get a matrix capable of transforming a point's coordinates from the object's reference frame to the observer's reference frame.

Camera transform = inverse rotation × inverse translation
Transform so far = camera transform × world transform.
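
A quick numerical check of the camera transform: if the observer is placed in the world with Translation × x × y × z (the same order as for objects, which is an assumption of this sketch), then multiplying the inverse matrices in the opposite order should undo that placement exactly. The helper names and the observer's coordinates are made up for illustration.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(matrix, vector):
    return [sum(matrix[i][k] * vector[k] for k in range(4)) for i in range(4)]


def translation(x, y, z):
    return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]]


def rotation_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]


def rotation_y(b):
    c, s = math.cos(b), math.sin(b)
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]


def rotation_z(g):
    c, s = math.cos(g), math.sin(g)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


# Observer position and angles (radians); arbitrary example values.
x, y, z = 4.0, 1.5, -2.0
alpha, beta, gamma = 0.2, -0.9, 0.4

# How the observer is placed in the world: T x Rx x Ry x Rz.
placement = matmul(translation(x, y, z),
                   matmul(rotation_x(alpha),
                          matmul(rotation_y(beta), rotation_z(gamma))))

# Camera transform: inverse rotations (negated angles) in the opposite
# order, followed on the right by the inverse translation.
camera = matmul(rotation_z(-gamma),
                matmul(rotation_y(-beta),
                       matmul(rotation_x(-alpha), translation(-x, -y, -z))))

# The product should be the identity matrix up to rounding.
product = matmul(camera, placement)
max_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                for i in range(4) for j in range(4))

# The observer's own world position maps to the origin of camera space.
observer_in_camera_space = apply(camera, [x, y, z, 1.0])
```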

Third step: perspective transform

The resulting coordinates would be good for an isometric projection or something similar, but realistic rendering requires an additional step to simulate perspective distortion. Indeed, this simulated perspective is the main aid for the viewer to judge distances in the simulated view.

A perspective distortion can be generated using the following 4×4 matrix:

$\begin{bmatrix} \cot\mu & 0 & 0 & 0 \\ 0 & \cot\nu & 0 & 0 \\ 0 & 0 & \frac{B+F}{B-F} & \frac{-2BF}{B-F} \\ 0 & 0 & 1 & 0 \end{bmatrix}$

where μ is the angle between a line pointing out of the camera in the z direction and the plane through the camera and the right-hand edge of the screen, and ν is the angle between the same line and the plane through the camera and the top edge of the screen. This projection should look correct provided that you are looking with one eye, that your eye is located on the line through the centre of the screen normal to the screen, and that μ and ν are physically measured assuming your eye is the camera. On typical computer screens as of 2003, which are wider than they are tall by a ratio of about 4:3, cot ν is about 1⅓ times cot μ, and cot μ might be about 1 to 5, depending on how far from the screen you are.

F is a positive number representing the distance of the observer from the front clipping plane, which is the closest any object can be to the camera. B is a positive number representing the distance to the back clipping plane, the farthest away any object can be. If objects can be at an unlimited distance from the camera, B can be infinite, in which case (B + F)/(B − F) = 1 and −2BF/(B − F) = −2F.

If you are not using a Z-buffer and all objects are in front of the camera, you can just use 0 instead of (B + F)/(B − F) and −2BF/(B − F). (Or anything you want.)
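
The perspective matrix can be sketched in Python as below. The function names `perspective` and `depth` are hypothetical; the infinite-B branch uses the limiting values stated above. The example checks that a point on the front clipping plane gets depth −1 and a point on the back clipping plane gets depth 1.

```python
import math


def perspective(mu, nu, front, back):
    """Perspective matrix for half-angles mu, nu (radians) and clip distances."""
    if math.isinf(back):
        # Limiting case for an infinitely distant back clipping plane.
        a, b = 1.0, -2.0 * front
    else:
        a = (back + front) / (back - front)
        b = -2.0 * back * front / (back - front)
    return [
        [1.0 / math.tan(mu), 0.0, 0.0, 0.0],
        [0.0, 1.0 / math.tan(nu), 0.0, 0.0],
        [0.0, 0.0, a, b],
        [0.0, 0.0, 1.0, 0.0],
    ]


def depth(matrix, z):
    """z'/w' for a camera-space point at distance z along the view axis."""
    z_clip = matrix[2][2] * z + matrix[2][3]
    w_clip = matrix[3][2] * z + matrix[3][3]
    return z_clip / w_clip


# Example angles and clip distances are arbitrary.
finite = perspective(math.radians(40), math.radians(30), front=1.0, back=100.0)
front_depth = depth(finite, 1.0)     # approximately -1
back_depth = depth(finite, 100.0)    # approximately 1

# With an infinite back plane the front plane still maps to -1, and the
# depth approaches 1 from below as z grows.
infinite = perspective(math.radians(40), math.radians(30),
                       front=1.0, back=math.inf)
front_depth_infinite = depth(infinite, 1.0)
distant_depth_infinite = depth(infinite, 1.0e9)
```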

All the calculated matrices can be multiplied together to get a final transformation matrix. One can multiply each of the points (represented as a vector of three coordinates) by this matrix, and directly obtain the screen coordinate at which the point must be drawn. The vector must be extended to four dimensions using homogeneous coordinates:

$\begin{bmatrix} x' \\ y' \\ z' \\ \omega' \end{bmatrix} = \begin{bmatrix} {\rm Perspective\ transform} \end{bmatrix} \times \begin{bmatrix} {\rm Camera\ transform} \end{bmatrix} \times \begin{bmatrix} {\rm World\ transform} \end{bmatrix} \times \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}.$

Note that in computer graphics libraries such as OpenGL, you should give the matrices in the opposite order to the one in which they are applied: first the perspective transform, then the camera transform, then the object transform, because the graphics library applies the transformations in the opposite order to the one in which you give them. This is useful, since the world transform typically changes more often than the camera transform, and the camera transform changes more often than the perspective transform. One can, for example, pop the world transform off a stack of transforms and multiply a new world transform on, without having to do anything with the camera transform and perspective transform.
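
The stack idea can be illustrated with a small stand-alone sketch. The `MatrixStack` class below is hypothetical and only loosely modelled on the push/pop/multiply pattern; it is not the API of OpenGL or any other library. Simple translations are used as stand-ins for the perspective, camera and world transforms so that the result is easy to read.

```python
def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]


class MatrixStack:
    """Hypothetical transform stack: the top entry is the current matrix."""

    def __init__(self):
        self._stack = [identity()]

    @property
    def top(self):
        return self._stack[-1]

    def push(self):
        # Save the current matrix by duplicating it on top of the stack.
        self._stack.append([row[:] for row in self.top])

    def pop(self):
        # Discard the current matrix and return to the saved one.
        if len(self._stack) > 1:
            self._stack.pop()

    def multiply(self, matrix):
        # Right-multiply, so the matrix given last is applied to points first.
        self._stack[-1] = matmul(self.top, matrix)


# Stand-ins for the three transforms (plain translations, for readability).
perspective_stub = translation(100.0, 0.0, 0.0)
camera_stub = translation(0.0, 10.0, 0.0)
world_a = translation(0.0, 0.0, 1.0)
world_b = translation(0.0, 0.0, 2.0)

stack = MatrixStack()
stack.multiply(perspective_stub)   # given first, applied last
stack.multiply(camera_stub)

stack.push()
stack.multiply(world_a)            # full transform for object A
transform_a = stack.top
stack.pop()                        # back to perspective x camera

stack.push()
stack.multiply(world_b)            # full transform for object B
transform_b = stack.top
stack.pop()
```

Only the world transform is replaced between the two objects; the perspective and camera part of the product is reused unchanged.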

Remember that {x'/ω', y'/ω'} are the final coordinates, where {−1, −1} is typically the bottom left corner of the screen, {1, 1} is the top right corner of the screen, {1, −1} is the bottom right corner of the screen and {−1, 1} is the top left corner of the screen.
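
Putting the three steps together, the sketch below projects one point from object space to final coordinates. To keep it short, the object and the observer are unrotated, so the world transform is a plain translation and the camera transform a plain inverse translation; all names and numbers are illustrative assumptions.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(matrix, vector):
    return [sum(matrix[i][k] * vector[k] for k in range(4)) for i in range(4)]


def translation(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]


def perspective(mu, nu, front, back):
    return [
        [1.0 / math.tan(mu), 0.0, 0.0, 0.0],
        [0.0, 1.0 / math.tan(nu), 0.0, 0.0],
        [0.0, 0.0, (back + front) / (back - front),
         -2.0 * back * front / (back - front)],
        [0.0, 0.0, 1.0, 0.0],
    ]


# Object at world position (0, 0, 5); observer at (0, 0, 1); both unrotated.
world = translation(0.0, 0.0, 5.0)
camera = translation(0.0, 0.0, -1.0)          # inverse of the observer's translation
projection = perspective(math.radians(45), math.radians(45),
                         front=1.0, back=100.0)

# Perspective x Camera x World, applied to a homogeneous point.
total = matmul(projection, matmul(camera, world))
x_clip, y_clip, z_clip, w_clip = apply(total, [1.0, 1.0, 0.0, 1.0])

# Divide by w' to get the final coordinates. The point is at (1, 1, 4) in
# camera space, so with cot(45 degrees) = 1 it lands near (0.25, 0.25).
screen_x = x_clip / w_clip
screen_y = y_clip / w_clip
screen_depth = z_clip / w_clip
```

A real renderer would still have to map the range −1 to 1 onto pixel positions; that step is not covered here.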

If using a Z-buffer, a z'/ω' value of −1 corresponds to the front of the Z-buffer, and a value of 1 corresponds to the back of the Z-buffer. If the front clipping plane is too close, a finite-precision Z-buffer will be less accurate. The same applies to the back clipping plane, but to a significantly lesser degree; a Z-buffer works correctly with the back clipping plane at an infinite distance, but not with the front clipping plane at 0 distance.

Objects should only be drawn where −1 ≤ z'/ω' ≤ 1. If it is less than −1, the object is in front of the front clipping plane. If it is more than 1, the object is behind the back clipping plane. To draw a simple single-colour triangle, {x'/ω', y'/ω'} for the three corners contains sufficient information. To draw a textured triangle, where one of the corners of the triangle is behind the camera, all the coordinates {x', y', z', ω'} for all three points are needed, otherwise the texture would not have the correct perspective, and a point behind the camera would not appear in the correct location. In fact, the projection of a triangle where a point is behind the camera is not technically a triangle, since the area is infinite and two of the angles sum to more than 180°, the third angle being effectively negative. (Typical modern graphics libraries use all four coordinates, and can correctly draw "triangles" with some points behind the camera.) Also, if a point is on the plane through the camera normal to the camera direction, ω' is 0, and {x'/ω', y'/ω'} is meaningless.
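
The depth test described above can be sketched as a small classification function. The name `classify` and the returned labels are hypothetical. One assumption is added beyond the text: points with ω' ≤ 0 (on or behind the plane through the camera) are rejected before dividing, since the ratio z'/ω' is undefined or misleading for them.

```python
def to_clip(z, front, back):
    """z' and w' for a camera-space point at distance z on the view axis."""
    a = (back + front) / (back - front)
    b = -2.0 * back * front / (back - front)
    return a * z + b, z      # w' equals the camera-space z


def classify(z_clip, w_clip):
    if w_clip <= 0.0:
        # Added assumption: reject before dividing by w'.
        return "on or behind the camera plane"
    ratio = z_clip / w_clip
    if ratio < -1.0:
        return "in front of the front clipping plane"
    if ratio > 1.0:
        return "behind the back clipping plane"
    return "drawable"


FRONT, BACK = 1.0, 10.0   # example clipping distances

too_close = classify(*to_clip(0.5, FRONT, BACK))
visible = classify(*to_clip(5.0, FRONT, BACK))
too_far = classify(*to_clip(20.0, FRONT, BACK))
at_camera = classify(*to_clip(0.0, FRONT, BACK))
behind = classify(*to_clip(-1.0, FRONT, BACK))
```

Without the ω' check, the last point would produce a ratio greater than 1 and be misreported as lying behind the back clipping plane.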

Simple version

$X_{\mathrm{2D}} = X_{\mathrm{3D}} - \frac{DX}{Z_{\mathrm{3D}} + \mathrm{Eye\;distance}} \times X_{\mathrm{3D}}$
$Y_{\mathrm{2D}} = Y_{\mathrm{3D}} - \frac{DY}{Z_{\mathrm{3D}} + \mathrm{Eye\;distance}} \times Y_{\mathrm{3D}}$

where DX and DY are the distances between the eye and the 3D point along the X and Y axes respectively, a large positive Z is towards the horizon, and Z = 0 is the screen. Using this you can also calculate 3D points in space.

All of the above-described 4-by-4 transformation matrices containing homogeneous coordinates are often classified, somewhat improperly, as "homogeneous transformation matrices". However, most of them (except for simple rotation and scaling matrices) represent definitely non-homogeneous and non-linear transformations (translations, roto-translations, perspective projections). And even the matrices themselves, as you can see, look rather heterogeneous, i.e. composed of different kinds of elements. Since they are multi-purpose transformation matrices, capable of representing both affine and projective transformations, they might be called "general transformation matrices", or, depending on the application, "affine transformation" or "perspective projection" matrices. Moreover, since the homogeneous coordinates describe a projective vector space, they might also be called "projective space transformation matrices".

References

1. Kenneth C. Finney (2004). 3D Game Programming All in One. Thomson Course, p. 93. ISBN 159200136X.
2. Donald Hearn (1997). Computer Graphics, C Version. Prentice Hall, pp. 432-433. ISBN 0135309247.
3. Article on GameDev.
4. Ingrid Carlbom, Joseph Paciorek (1978). Planar Geometric Projections and Viewing Transformations. ACM Computing Surveys (CSUR), vol. 10, no. 4, pp. 465-502.