COMP3421 – Lec 2 – Transformations

August 23, 2009

Homogeneous Coordinates

Interestingly we can use the extra dimension in homogeneous coordinates to distinguish a point from a vector. A point will have a 1 in the last component, and a vector will have a 0. The difference between a point and a vector is a bit wishy-washy in my mind so I’m not sure why this distinction helps.
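One place the distinction does seem to pay off is translation. As a minimal sketch (the helper `mat_vec` and the sample numbers are made up for illustration), a translation matrix moves a point but leaves a vector untouched, which is what you would want for a direction:

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# 2D translation by (5, 7) in homogeneous coordinates.
T = [[1, 0, 5],
     [0, 1, 7],
     [0, 0, 1]]

point = [2, 3, 1]    # last component 1: a position
vector = [2, 3, 0]   # last component 0: a direction

moved_point = mat_vec(T, point)    # the translation column is picked up
moved_vector = mat_vec(T, vector)  # the translation column is multiplied by 0
```

Here `moved_point` comes out as `[7, 10, 1]` while `moved_vector` stays `[2, 3, 0]`.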

Transforming a Point

Say we have the 2D point (x, y). This point as a column vector in homogeneous coordinates is \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}. For a multiplication between this vector and a 3 by 3 transformation matrix to work, we need to do the matrix times the vector (in that order) to give the transformed vector, Av = v'.

Combining Transformations

Say we want to do a translation then a rotation (A then B) on the point x. First we must do Ax = x', then B(x'). That is BAx. The order is important as matrix multiplication is not commutative, i.e. in general AB \ne BA (just think: a translation then a rotation is not necessarily the same as a rotation then a translation by the same amounts). If we do lots of transformations we may get something like DCBAx, which is in effect doing transformation A then B then C then D. (Remember matrix multiplication is associative, i.e. (AB)C = A(BC).)
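To see the ordering concretely, here is a small sketch with a translation by (1, 0) as A and a 90 degree anticlockwise rotation about the origin as B. The helpers `mat_vec` and `mat_mul` and the particular matrices are my own choices for illustration:

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

A = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]    # translate by (1, 0)
B = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # rotate 90 degrees about the origin
x = [1, 0, 1]                            # the point (1, 0)

step_by_step = mat_vec(B, mat_vec(A, x))   # A first, then B
combined = mat_vec(mat_mul(B, A), x)       # same thing as the single matrix BA
other_order = mat_vec(mat_mul(A, B), x)    # B first, then A
```

Both `step_by_step` and `combined` give `[0, 2, 1]`, whereas `other_order` gives `[1, 1, 1]`.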

As a side note, if you express your point as a row vector (eg. \begin{pmatrix} x & y & z & 1\end{pmatrix}), then to do a transformation you must do xA (where x is the point/row vector). In this case xABC is equivalent to doing transformation A on point x, then transformation B then C (apparently this is how DirectX works).

Affine Transformations

Affine transformations are a special kind of transformation. They have a matrix form where the last row is [0 ... 0 1]. An affine transformation is equivalent to a linear transformation followed by a translation. That is, \begin{bmatrix} \vec{y} \\ 1 \end{bmatrix} = \begin{bmatrix} A & \vec{b} \ \\ 0, \ldots, 0 & 1 \end{bmatrix} \begin{bmatrix} \vec{x} \\ 1 \end{bmatrix} is the same as \vec{y} = A \vec{x} + \vec{b}.

Something interesting to note is that the inverse transformation of an (invertible) affine transformation is another affine transformation, whose matrix is the inverse matrix of the original. Also an affine transformation in 2D is uniquely defined by its action on three non-collinear points.
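As a quick check of the first claim (my own working rather than something from the text, and assuming A is invertible), the inverse can be written out explicitly and its last row is again [0 ... 0 1]:

\begin{bmatrix} A & \vec{b} \\ 0, \ldots, 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} A^{-1} & -A^{-1}\vec{b} \\ 0, \ldots, 0 & 1 \end{bmatrix}

This follows because \vec{y} = A \vec{x} + \vec{b} rearranges to \vec{x} = A^{-1}\vec{y} - A^{-1}\vec{b}.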

From page 209 of the text (Hill, 2006), affine transformations have some very useful properties.

1. Affine Transformations Preserve Affine Combinations of Points

For some affine transformation T, points P_1 and P_2, and reals a_1 and a_2 where a_1 + a_2 = 1,

T(a_1P_1 + a_2P_2) = a_1T(P_1)+a_2T(P_2)
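A numeric spot check of this property, as a sketch only. The matrix, the two points and the weights are arbitrary values I picked, and `mat_vec` is a made-up helper:

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# An arbitrary 2D affine transformation (last row is [0 0 1]).
T = [[2, 1, 3],
     [0, 1, -1],
     [0, 0, 1]]

P1 = [1, 2, 1]
P2 = [5, 0, 1]
a1, a2 = 0.25, 0.75   # the weights sum to 1

# Left side: combine the points first, then transform.
combo = [a1 * p + a2 * q for p, q in zip(P1, P2)]
lhs = mat_vec(T, combo)

# Right side: transform each point first, then combine.
rhs = [a1 * p + a2 * q for p, q in zip(mat_vec(T, P1), mat_vec(T, P2))]
```

With these values both sides come out as `[11.5, -0.5, 1.0]`. Note the last component stays 1 only because the weights sum to 1.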

2. Affine Transformations Preserve Lines and Planes

That is, under any affine transformation, transformed lines are still lines (they don’t suddenly become curved); similarly, planes that are transformed are still planes.

3. Parallelism of Lines and Planes is Preserved

“If two lines or planes are parallel, their images under an affine transformation are also parallel.” The explanation that Hill uses is rather good,

Take an arbitrary line A + bt having direction b. It transforms to the line given in homogeneous coordinates by M(A + bt) = MA + (Mb)t, this transformed line has direction vector Mb. This new direction does not depend on point A. Thus two different lines A_1 + \mathbf{b}t and A_2 + \mathbf{b}t that have the same direction will transform into two lines both having the direction M\mathbf{b}, so they are parallel. The same argument can be applied to planes and beyond.

4. The Columns of the Matrix Reveal the Transformed Coordinate Frame

Take a generic affine transformation matrix for 2D,

M = \begin{pmatrix}a & b & c\\ d & e & f\\ 0 & 0 & 1 \end{pmatrix}

The first two columns, \mathbf{m_1} = \begin{pmatrix}a \\ d\\ 0\end{pmatrix} and \mathbf{m_2} = \begin{pmatrix}b \\ e\\ 0\end{pmatrix}, are vectors (last component is 0). The last column \begin{pmatrix}c \\ f\\ 1\end{pmatrix} is a point (last component is a 1).

Using the standard basis vectors \mathbf{i} = \begin{pmatrix}1 \\ 0\\ 0\end{pmatrix}, \mathbf{j} = \begin{pmatrix}0 \\ 1\\ 0\end{pmatrix} with origin \phi = \begin{pmatrix}0 \\ 0\\ 1\end{pmatrix}, notice that \mathbf{i} transforms to the vector \mathbf{m_1}, i.e. \mathbf{m_1} = M\mathbf{i}. Similarly for \mathbf{j} and \phi.
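This is easy to confirm numerically. In the sketch below the entries of M are arbitrary and `mat_vec` is a made-up helper; the point is just that each product picks out one column of M:

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

M = [[2, 1, 3],
     [0, 1, -1],
     [0, 0, 1]]

i = [1, 0, 0]        # basis vector (last component 0)
j = [0, 1, 0]        # basis vector (last component 0)
origin = [0, 0, 1]   # point (last component 1)

m1 = mat_vec(M, i)               # first column of M
m2 = mat_vec(M, j)               # second column of M
new_origin = mat_vec(M, origin)  # third column of M
```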

5. Relative Ratios are Preserved

6. Areas Under an Affine Transformation

Given an affine transformation as a matrix M,

\frac{\mbox{area after transformation}}{\mbox{area before transformation}} = |\mbox{det} M|
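As a rough check in 2D, the sketch below transforms the unit square and compares areas using the shoelace formula. The matrix is an arbitrary example and the helper names are mine. Because the last row of M is [0 0 1], det M equals the determinant of the upper-left 2 by 2 block, so that is what the code computes:

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def polygon_area(pts):
    """Area of a simple polygon via the shoelace formula."""
    total = 0
    for k in range(len(pts)):
        x0, y0 = pts[k]
        x1, y1 = pts[(k + 1) % len(pts)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2

M = [[2, 1, 3],
     [0, 3, -1],
     [0, 0, 1]]

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [tuple(mat_vec(M, [x, y, 1])[:2]) for x, y in square]

# Determinant of the linear (upper-left 2x2) part, which equals det M here.
det_m = M[0][0] * M[1][1] - M[0][1] * M[1][0]
ratio = polygon_area(image) / polygon_area(square)
```

For this M the ratio and |det M| are both 6.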

7. Every Affine Transformation is Composed of Elementary Operations

Every affine transformation can be constructed by a composition of elementary operations (see below). That is,

M = (\mbox{shear})(\mbox{scale})(\mbox{rotation})(\mbox{translation})

For a 2D affine transformation M. In 3D,

M = (\mbox{scale})(\mbox{rotation})(\mbox{shear}_1)(\mbox{shear}_2)(\mbox{translation})

Rotations

Euler’s theorem: Any rotation (or sequence of rotations) about a point is equivalent to a single rotation about some axis through that point (not necessarily one of the coordinate axes). Pages 221-223 of Hill give a detailed explanation of this, as well as the equations to go from one form to the other.

W2V (Window to Viewport Mapping)

A simplified OpenGL pipeline applies the modelview matrix, projection matrix, clipping, then the viewport matrix. The viewport matrix is the window to viewport map.

The window coordinate system is somewhere on the projection plane. These coordinates need to be mapped to the viewport (the area on the screen).
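A minimal sketch of such a mapping, assuming both the window and the viewport are axis-aligned rectangles given as (left, bottom, right, top). The function name and the rectangle layout are my own conventions, not OpenGL's API:

```python
def window_to_viewport(x, y, window, viewport):
    """Map (x, y) from the window rectangle to the viewport rectangle.

    Both rectangles are (left, bottom, right, top) tuples.
    """
    wl, wb, wr, wt = window
    vl, vb, vr, vt = viewport
    # Scale factors: viewport size over window size on each axis.
    sx = (vr - vl) / (wr - wl)
    sy = (vt - vb) / (wt - wb)
    # Shift to the window's corner, scale, then shift to the viewport's corner.
    return (vl + (x - wl) * sx, vb + (y - wb) * sy)

window = (-1, -1, 1, 1)
viewport = (0, 0, 640, 480)
centre = window_to_viewport(0, 0, window, viewport)
```

With these rectangles the centre of the window lands at (320, 240), and the window's corners land on the viewport's corners.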

References

F.S. Hill, et al. (2006). Computer Graphics using OpenGL. Second Ed.


COMP3421 – Lec 1 – Colour

August 22, 2009

Colour

Pure spectral light is where the light source has just one single wavelength. This forms monochromatic (or pure spectral) colours.


However mostly light is made up of light of multiple wavelengths so you end up with a distribution of wavelengths. You could describe colour by this frequency distribution of wavelengths. For example brown is not in the spectrum, but we can get brown from this distribution of different light wavelengths,


Spectral Distribution for a Brown Colour (http://www.cs.rit.edu/~ncs/color/a_spectr.html)

We could describe colour like this (as opposed to RGB) but human eyes perceive many different distributions (spectral density functions) as the same colour (that is they are indistinguishable when placed side by side). The total power of the light is known as its luminance which is given by the area under the entire spectrum.

The human eye has three kinds of cones (these detect light): the short, medium and long cones. (We have two kinds of receptors, cones and rods. Rods are good for detecting in low light but they cannot detect colour or fine detail.) The graph below shows how these three kinds of cones respond to different wavelengths.


(Source: http://en.wikipedia.org/wiki/File:Cones_SMJ2_E.svg)

So the colour we see is the result of our cones’ relative responses to RGB light. Because of this the human eye cannot distinguish some distributions that are different; to the eye they appear as the same colour. Hence you don’t need to recreate the exact spectrum to create the same sensation of colour. We can just describe the colour as a mixture of three colours.

There are three CIE standard primaries X, Y, Z. An XYZ colour has a one to one match to RGB colour. (See http://www.cs.rit.edu/~ncs/color/t_spectr.html for the formulae.)

Not all visible colours can be produced using the RGB system.

=====

Where S, P, N are spectral functions,

if S = P then N + S = N + P (i.e. we can add a colour to both sides and if they were perceived the same before, they will be perceived the same after)

On one side you project S(\lambda) on the other you project combinations of A, B and C to give aA(\lambda) + bB(\lambda) + cC(\lambda)

By experimentation it was shown that to match any pure spectral colour \lambda you needed the amounts of RGB shown,

=====

To determine the XYZ of a colour from its spectral distribution you need to use the following equations,

X= \int_0^\infty I(\lambda)\,\overline{x}(\lambda)\,d\lambda
Y= \int_0^\infty I(\lambda)\,\overline{y}(\lambda)\,d\lambda
Z= \int_0^\infty I(\lambda)\,\overline{z}(\lambda)\,d\lambda

Where the \overline{x}, \overline{y} and \overline{z} functions are defined as,


The CIE 1931 XYZ colour matching functions. (Source: http://commons.wikimedia.org/wiki/File:CIE_1931_XYZ_Color_Matching_Functions.svg CC-BY-SA)
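In practice the spectrum and the matching functions are tabulated at sample wavelengths, so the integrals become sums. Here is a rough sketch using the trapezoidal rule. The sample arrays are placeholders I made up purely to exercise the code; they are NOT real CIE data, and the function names are hypothetical:

```python
def integrate(wavelengths, values):
    """Trapezoidal rule over sampled (wavelength, value) pairs."""
    total = 0.0
    for k in range(len(wavelengths) - 1):
        width = wavelengths[k + 1] - wavelengths[k]
        total += 0.5 * (values[k] + values[k + 1]) * width
    return total

def tristimulus(wavelengths, intensity, matching):
    """Approximate the integral of I(lambda) * matching(lambda)."""
    products = [i * m for i, m in zip(intensity, matching)]
    return integrate(wavelengths, products)

# Placeholder samples only: NOT real CIE 1931 values.
wavelengths = [400, 500, 600, 700]
intensity = [1.0, 1.0, 1.0, 1.0]           # a flat spectrum I(lambda)
x_bar_placeholder = [0.0, 1.0, 1.0, 0.0]   # stand-in for x-bar(lambda)

X = tristimulus(wavelengths, intensity, x_bar_placeholder)
```

The same `tristimulus` call with the y-bar and z-bar tables would give Y and Z.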

CIE Chromaticity Diagram

We can take a slice of the CIE space to get the CIE chromaticity diagram.

(Source: Hill) (Source: http://en.wikipedia.org/wiki/File:CIE1931xy_blank.svg)

RGB

(r, g, b) are the amounts of the red, green and blue primaries.

CMY

CMY is a subtractive colour model (inks and paints work this way). (c,m,y) = (1,1,1) – (r,g,b).
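The conversion is a one-liner. A small sketch, assuming each component is in the range 0 to 1 (the function name is mine):

```python
def rgb_to_cmy(r, g, b):
    """Convert RGB to CMY, with every component in the range 0..1."""
    return (1 - r, 1 - g, 1 - b)

red_in_cmy = rgb_to_cmy(1, 0, 0)     # no cyan, full magenta and yellow
white_in_cmy = rgb_to_cmy(1, 1, 1)   # no ink at all
```

So pure red is (0, 1, 1) in CMY, and white is (0, 0, 0), i.e. blank paper.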

But inks don’t always subtract well so printers usually use a black ink as well using CMYK.

HSV

The HSV colour model is really good for allowing the user to select a colour as they choose the hue (colour), saturation (how rich the colour is) and value (how dark the colour is).


HSV Colour Cone
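If you want to experiment with this, Python's standard library has a `colorsys` module that converts between RGB and HSV (all components in the range 0 to 1). A short sketch with colours I picked as examples:

```python
import colorsys

# Pure red: hue 0, fully saturated, full value.
red_hsv = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)    # (0.0, 1.0, 1.0)

# Mid grey: no saturation, so only the value carries information.
grey_hsv = colorsys.rgb_to_hsv(0.5, 0.5, 0.5)   # (0.0, 0.0, 0.5)

# Going back the other way recovers the RGB triple.
red_rgb = colorsys.hsv_to_rgb(*red_hsv)
```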

Gamut

Gamut is the range of colours available, which is represented as a triangle in the CIE Chromaticity diagram. Different devices have different gamuts (for instance the printer and LCD monitor).

  • Gamut Clipping – A shading in one image becomes just a solid colour in the other.
  • Gamut Scaling – Shading looks the same but the size of the gamut is minimal.


A problematic HSC ITG Question (2001 Q5a)

March 11, 2009

I discovered this back in 2007 when I was preparing for my HSC exams.

Here is the question (from the exam paper here),

2001 HSC ITG Q5a


Firstly I think this question is beyond the scope of the syllabus. The only relevant dot point says,

“Pictorial drawing

  • isometric
  • perspective (mechanical and measuring point)”

There is no reference to oblique drawing or oblique projection (this was the official answer).

Secondly, and more importantly the examiners say in their Notes from the Marking Centre, “This part was generally well answered; candidates had little trouble in identifying oblique and perspective projection.”

They claim that the first one is oblique projection, yet with just the information given it’s impossible to determine the projection used. For example the drawing given could be of a cube drawn in oblique projection or it could be of another object (shown below) in isometric projection, or some other object in some other projection. There are infinitely many different projections that it could have been drawn in.

An object (I call a Vube) shown in 3rd angle orthogonal which when drawn in isometric looks like a cube in oblique.


Vube shown in perspective.


The exam paper should have specified that the object in question is a cube.


The Mathematics Behind Graphical Drawing Projections in Technical Drawing

November 15, 2008

In the field of technical drawing, projection methods such as isometric, orthogonal, perspective are used to project three dimensional objects onto a two dimensional plane so that three dimensional objects can be viewed on paper or a computer screen. In this article I examine the different methods of projection and their mathematical roots (in an applied sense).

The approach that seems to be used by Technical Drawing syllabuses in NSW to draw simple 3D objects in 2D is almost entirely graphical. I don’t think you can say this is a bad thing because you don’t always want or need to know the mathematics behind the process, you just want to be able to draw without thinking about this. However to have an appreciation of what’s really happening the mathematical understanding is a great thing to learn.

Many 3D CAD/CAM packages available on the market today (such as AutoCAD, Inventor, Solidworks, CATIA, Rhinoceros) can generate isometric, three point perspective or orthogonal drawings from 3D geometry, however from what I’ve seen they can’t seem to do other projections such as dimetric, trimetric, oblique, planometric, one and two point perspective. Admittedly I don’t think these projections are any use or even needed, but when you’re at high school and you have to show that you know how to do oblique, et al. it can be a problem when the software cannot do it for you from your 3D model. (So I actually wrote a small piece of software to help with this in this article). But to do so, I needed to understand the mathematics behind these graphical projections. So I will try to explain that here.

The key idea is to think of everything having coordinates in a coordinate system (I will use the Cartesian system for simplicity). We can then express all these projections as mathematical transformations or maps. Like a function, you feed in the 3D point, and then you get out the projected 2D point. Things get a bit arbitrary here because an isometric view is essentially exactly the same as a front view. So we keep to the convention that when we assign the axes of the coordinate system we try to keep the three planes of the axes parallel to the three main planes of the object.

The three "main" planes of the object are placed parallel to the three planes of the axes. This is how we will choose our axes in relation to the object.

We will not do this though,

We will not choose it like this...


...or this.


In fact doing something like that shown just above with the object rotated is how we get projections like isometric.

Now what we do is take the coordinates of each point and “transform” them to get the projected coordinates, and join these points with lines where they originally were. However we can only do this for some kinds of projections, indeed for all the ones I have mentioned in this post this will do but only because these projections have a special property. They are linear maps (affine maps also hold this property and are a superset of the set of linear maps) which means that straight lines in 3D project to straight lines in 2D.

For curves we can just project a lot of points on the curve (subdivide it) and then join them up after they are projected. It all depends what our purpose is and if we are applying it practically. We can generate equations of the projected curves if we know the equation of the original curve but it won’t always be as simple. For example circles in 3D under isometric projection become ellipses on the projection plane.

Going back to the process of the projection, we can use matrices to represent these projections where

\begin{pmatrix}x'\\ y'\\ z'\end{pmatrix} = \begin{pmatrix}a&b&c\\ d&e&f\\ g&h&i\end{pmatrix}\begin{pmatrix}x\\ y\\ z\end{pmatrix}

is the same as,

x' = ax+by+cz\\ y' = dx+ey+fz\\ z' = gx+hy+iz.

We call the 3 by 3 matrix above the matrix of the projection.

Knowing all this, we can easily define orthogonal projection as you just take two of the dimensions and cull the third. So for say an orthographic top view the projection matrix is simply,

 \begin{pmatrix}1&0&0\\ 0&1&0\\ 0&0&0\end{pmatrix}.
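As a tiny sketch of this matrix in action (the sample point is arbitrary and `mat_vec` is a made-up helper), x and y pass straight through and z is culled:

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Orthographic top view: keep x and y, drop z.
TOP_VIEW = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]

projected = mat_vec(TOP_VIEW, [3, 4, 5])
```

Here `projected` is `[3, 4, 0]`.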

Now we want a projection matrix for isometric. One way would be to do the appropriate rotations on the object then do an orthographic projection, we can get the projection matrix by multiplying the matrices for the rotations and orthographic projection together. However I will not detail that here. Instead I will show you another method that I used to describe most of the projections that I learnt from high school (almost all except perspectives).

I can describe them as well as many “custom” projections in terms of what the three projected axes look like on the projection plane. I described them all in terms of a scale on each of the three axes, as well as the angle two of the axes make with the projection plane’s horizontal.

Projection attributes described in terms of the projected axes.

Using this approach we can think about the problem graphically, in terms of what the final projected drawing will look like, rather than looking at the mathematics of how the object gets rotated prior to taking an orthographic projection, or what angle the projection lines need to be at in relation to the projection plane to get oblique, etc. Note also that the x, y, z in the above diagram are the scales of the x, y, z axes respectively. So we can see in the table below that we can now describe these projections in terms of the graphical approach that I was first taught.

Projection         α (alpha)   β (beta)   Sx    Sy   Sz
Isometric          30°         30°        1     1    1
Cabinet Oblique    45°         0°         0.5   1    1
Cavalier Oblique   45°         0°         1     1    1
Planometric        45°         45°        1     1    1

Now all we need is a projection matrix that takes in alpha, beta and the three axis scales and does the correct transformation to give the projection. The matrix is,

\begin{bmatrix}x'\\y'\\z'\\1\end{bmatrix}=\begin{bmatrix}S_x\cos\alpha&-S_y\cos\beta&0&0\\ S_x\sin\alpha&S_y\sin\beta&S_z&0\\ 0&0&0&0\\ 0&0&0&1\end{bmatrix}\begin{bmatrix}x\\y\\z\\1\end{bmatrix}

Now for the derivation. First we pick a 3D Cartesian coordinate system to work with. I choose the Z-up Left Hand Coordinate System, shown below and we imagine a rectangular prism in the 3D coordinate system.

Block in 3D coordinate system.


Now we imagine what it would look like in a 2D coordinate system using isometric projection.

Block in 2D coordinate system (isometric).


As the alpha and beta angles (shown below) can change, and are therefore not limited to a specific projection, we need to use alpha and beta in the derivation.

pp-description

Now using these simple trig equations below we can deduce the following.

polar

All the points on the xz plane have y = 0. Therefore the x’ and y’ values on the 2D plane will follow the trig property shown above, so:

x'=x\cos\alpha
y' = z + x\sin\alpha

However not all the points lie on the xz plane, y is not always equal to zero. By visualising a point with a fixed x and z value but growing larger in y value, its x’ will become lower, and y’ will become larger. The extent of the x’ and y’ growth can again be expressed with the trig property shown, and this value can be added in the respective sense to obtain the final combined x’ and y’ (separately).

x'=x\cos\alpha -y\cos\beta
y' = z + x \sin \alpha + y \sin \beta

If y is in the negative direction then the sign will automatically change accordingly. The next step is to incorporate the scaling of the axes. This was done by replacing x, y and z with the corresponding scale factor multiplied by x, y and z respectively. Hence,

x'=S_x x\cos\alpha -S_y y\cos\beta
y' = S_z z + S_x x\sin\alpha + S_y y \sin \beta

This can now easily be transferred into matrix form as shown at the start of this derivation or left as is.
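Left as is, the two equations translate directly into a few lines of code. The following is only a sketch of that translation: the function name, the `PRESETS` dictionary and the choice of degrees for the angles are my own, with the preset values copied from the table earlier in this post.

```python
import math

def project(point, alpha_deg, beta_deg, sx=1.0, sy=1.0, sz=1.0):
    """Project a 3D point (x, y, z) to 2D (x', y') using the equations above."""
    x, y, z = point
    a = math.radians(alpha_deg)
    b = math.radians(beta_deg)
    xp = sx * x * math.cos(a) - sy * y * math.cos(b)
    yp = sz * z + sx * x * math.sin(a) + sy * y * math.sin(b)
    return (xp, yp)

# (alpha, beta, Sx, Sy, Sz) for each projection in the table.
PRESETS = {
    "isometric": (30, 30, 1, 1, 1),
    "cabinet oblique": (45, 0, 0.5, 1, 1),
    "cavalier oblique": (45, 0, 1, 1, 1),
    "planometric": (45, 45, 1, 1, 1),
}

# Project the corner (1, 0, 0) of a unit cube in isometric.
iso_corner = project((1, 0, 0), *PRESETS["isometric"])

# A point straight up the z axis stays on the vertical in every preset.
iso_up = project((0, 0, 2), *PRESETS["isometric"])
```

To draw an object you would run every vertex through `project` and then join the projected points with lines where the original edges were.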

References:
Harvey, A. (2007). Industrial Technology – Graphics Industries 2007 HSC Major Project Management Folio. (Link)


The New Industrial Technology Syllabus (HSC 2010)

October 18, 2008

Only a couple of days ago the new Industrial Technology Syllabus to be implemented for the HSC in 2010 was released. It appears they finally weeded out a lot of the bugs, making it much clearer and much less ambiguous. You wouldn’t think it would take them six years to do this, but turns out it did. The syllabus was not redone, rather just amended.

As for the changes… Well I guess the biggest change is the removal of the Building and Construction, and Plastics Industries. I can understand the removal of Building and Construction as there is already a Construction VET course available. It’s always a shame to see a subject go, so it’s bad news for plastics enthusiasts, although it’s understandable when it gathers next to zero candidates each year.

The four sections of the course,

A. Industry study
B. Design and management
C. Workplace communication
D. Industry-specific content and production

have been changed to,

A. Industry Study
B. Design, Management and Communication
C. Production
D. Industry Related Manufacturing Technology.

They have also separated a lot of the preliminary content from the HSC content. This makes a lot of sense; previously it appeared that you were supposed to learn the exact same content in both years. Also they have listed “Students learn about” and “Students learn to” dot points for the Major Work.

The most interesting (to me at least) changes were to that of the Graphics Industries specific content (note that they are now called technologies (collectively as focus areas) rather than industries e.g. you would now say the focus area Graphics Technologies). I support many, if not all of these changes although you get the feeling that this is what the original syllabus writers meant to be in the syllabus but simply forgot about and only now noticed that it was missing. I say this because much of the content from the previous HSC exams was based on material and content that was absent from the syllabus but has now been placed in the 2010 one. The order and categorising of this material has been redone and is much cleaner and nicer now.

For instance we now have oblique drawings (with references to cabinet and cavalier) mentioned in the syllabus along with,

  • A mention of architectural drawings including plans, elevations, sections, footing details, plumbing, electrical and roofing details, council requirements, site plans, set backs, shadow diagrams, landscape plans and colour palette and material selection. Previously they just said we need to know architectural styles and details without any elaboration.
  • axonometric projection
  • presentation techniques now include ‘fly-throughs’ and prototypes
  • and equipment includes, both computer software packages AND mechanical drafting equipment rather than just either, scanners, electronic storage mediums such as external hard drives and flash drives (although they could have mentioned storing files centrally on a file server for many people to access, which is the much more common practice in the workplace), display folders, appropriately sized paper and stationery.

The Multimedia Technologies section also is much better now. It now contains the study of different types of fonts, formatting features, page layout elements for publications, features of graphics such as file formats and resolution, methods of obtaining images, image manipulation and editing, audio features such as sampling rate, file formats, analogue vs. digital, video features like frame rate, compression, editing, compositing, animation techniques both 2D and 3D with references to motion capture, virtual reality, along with the world wide web, intellectual property, and the list goes on and on… Don’t just go by my description here, go read the syllabus document. You will be very pleased with the changes, or should I say additions.

If I were doing my HSC again, I know for sure I would have a very hard time choosing between multimedia and graphics technologies. They used to be together as one industry back pre 1999, although I must admit it is too much for someone who has done neither before to master both as one 2U subject. I wish you could do both, but they can’t allow that because the industry study, design and communication parts would be too common.

As for the common sections (Industry Study, Design and Management and Communication) the improvements here were good too with much more detail. But it’s not just the fact that the document is more detailed, but these details are what you would expect. They are in the right direction and are things that should be included. The Design section reinforces that the major project is not just about production of something, but the design aspects that go into it. The only problem I currently have is where this design is meant to be applied. It should be in the most obvious place, but the way the syllabus refers to production makes this slightly unclear. Timber Products and Furniture Technologies would look at the design aspects of the timber products or furniture product that they were producing. But if you were doing Graphics Technologies, your product is a series of drawings and perhaps related media such as fly-throughs, etc. Do you look at the design of these drawings? I would say not, rather you should apply design techniques to the thing you are drawing, whether that be a product, building or a mechanical system. I don’t think this has been cleared up.

I haven’t been up to date with all things related here, so I may have missed some things. But one thing is for sure that I congratulate the Board for their work on this, and I’m sure many HSC students will benefit immensely from this revised syllabus. The syllabus is in much better shape now. As for the content, well I could argue that the material from the stage 5 graphics technology syllabus is more advanced than that of the stage 6 syllabus, and this should not happen. But as long as the stage 5 course is not a prerequisite, and as long as you have less time to cover industry specific content from the stage 6 course than that of the stage 5 course, there is little that can be done.

(PS. As a self advertisement, my 2007 HSC Industrial Technology Graphics Industries Major Work in its entirety can be downloaded from my site here, http://andrew.harvey4.googlepages.com/)


(x,y,z,w) in OpenGL/Direct3D (Homogeneous Coordinates)

September 29, 2008

I always wondered why 3D points in OpenGL, Direct3D and in general computer graphics were always represented as (x,y,z,w) (i.e. why do we use four dimensions to represent a 3D point, what’s the w for?). This representation of coordinates with the extra dimension is known as homogeneous coordinates. Now after finally getting formally taught linear algebra I know the answer, and it’s rather simple, but I’ll start from the basics.

Points can be represented as vectors, eg. (1,1,1). Now a common thing we want to do in computer graphics is to move this point (translation). So we can do this by simply adding two vectors together,

\begin{pmatrix}x'\\y'\\z'\end{pmatrix} = \begin{pmatrix}x\\y\\z\end{pmatrix} + \begin{pmatrix}a\\b\\c\end{pmatrix} = \begin{pmatrix}x + a\\y + b\\z + c\end{pmatrix}.

If we wanted to do some kind of linear transformation such as rotate about the origin, scale about the origin, etc, then we could just multiply a certain matrix with the point vector to obtain the image of the vector under that transformation. For example,

\begin{pmatrix}x'\\ y'\\ z' \end{pmatrix} = \begin{pmatrix}\cos \theta &-\sin \theta &0\\ \sin \theta &\cos \theta &0\\ 0&0&1\end{pmatrix} \begin{pmatrix}x\\ y\\ z\end{pmatrix}

will rotate the vector (x,y,z) by angle theta about the z axis.
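Written out in code, that matrix multiplication is just the following (a sketch; the function name is mine and theta is assumed to be in radians):

```python
import math

def rotate_z(point, theta):
    """Rotate (x, y, z) by theta radians about the z axis."""
    x, y, z = point
    c, s = math.cos(theta), math.sin(theta)
    # Rows of the rotation matrix applied to (x, y, z).
    return (c * x - s * y, s * x + c * y, z)

# A quarter turn takes the x direction to the y direction; z is untouched.
rotated = rotate_z((1.0, 0.0, 2.0), math.pi / 2)
```

Up to floating point rounding, `rotated` is (0, 1, 2).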

However as you may have seen you cannot do a 3D translation on a 3D point by just multiplying a 3 by 3 matrix by the vector. To fix this problem and allow all affine transformations (linear transformation followed by a translation) to be done by matrix multiplication we introduce an extra dimension to the point (denoted w in this blog). For example, taking the 2D case to keep the matrices small, we can now perform the translation,

\mathbb{R}^2 : (x,y) \to (x+a, y+b)

by a matrix multiplication,

\begin{pmatrix}1 & 0 & a\\ 0 & 1 & b\\ 0 & 0 & 1\end{pmatrix} \begin{pmatrix}x\\ y\\ 1\end{pmatrix} = \begin{pmatrix}x + a\\ y + b\\ 1 \end{pmatrix}.

We need this extra dimension for the multiplication to make sense, and it allows us to represent all affine transformations as matrix multiplication.
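The 3D case that (x,y,z,w) is used for works the same way with a 4 by 4 matrix. A minimal sketch, where the helper names and sample values are mine:

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def translation(a, b, c):
    """4x4 homogeneous matrix that translates by (a, b, c)."""
    return [[1, 0, 0, a],
            [0, 1, 0, b],
            [0, 0, 1, c],
            [0, 0, 0, 1]]

p = [1, 1, 1, 1]                          # the point (1, 1, 1) with w = 1
moved = mat_vec(translation(2, 3, 4), p)  # w = 1 picks up the last column
```

Here `moved` is `[3, 4, 5, 1]`, i.e. the point (1, 1, 1) shifted by (2, 3, 4).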

REFERENCES:
Homogeneous coordinates. (2008, September 29). In Wikipedia, The Free Encyclopedia. Retrieved 04:33, September 29, 2008, from http://en.wikipedia.org/w/index.php?title=Homogeneous_coordinates&oldid=241693659