Mini Project 1

Watchanan Chantapakul (wcgzm)


Part A: Original feature space and Euclidean distance

Given the dataset with four classes (you can download it from the link provided on Canvas as a .mat file), where each class follows a specific distribution:

1. Estimate the mean and covariance of each class distribution using a library function (i.e. Matlab toolbox, or Python statistics package, etc.). Report on their values.

Compute means $\mu$

$$\mu_c = \frac{1}{N_c} \sum_{i=1}^{N_c} x_i$$

Compute variances $\sigma^2$

$$\sigma_c^2 = \frac{1}{N_c - \text{ddof}} \sum_{i=1}^{N_c} (x_i - \mu_c)^2$$

Note: numpy computes the variance according to the delta degrees of freedom (ddof) parameter. So, in order to compute the sample variance with the $1/(N_c - 1)$ normalization, ddof must be set to 1.
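As a minimal sketch of this estimation step (assuming the samples of class $c$ are stored in a NumPy array `X_c` of shape `(N_c, 2)`; the names are illustrative):

```python
import numpy as np

def estimate_class_parameters(X_c):
    """Estimate the sample mean and sample covariance of one class.

    X_c: array of shape (N_c, 2), one row per sample.
    """
    mu_c = np.mean(X_c, axis=0)          # per-feature sample mean
    # np.cov expects variables in rows, hence the transpose; ddof=1 gives
    # the sample covariance with the 1/(N_c - 1) normalization.
    sigma_c = np.cov(X_c.T, ddof=1)
    return mu_c, sigma_c
```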

Check that the computed eigenvectors are perpendicular to each other.
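One quick way to check this (a sketch, assuming `sigma_c` is the 2x2 covariance matrix estimated above):

```python
import numpy as np

eigvals_c, eigvecs_c = np.linalg.eigh(sigma_c)     # columns of eigvecs_c are eigenvectors
phi_1, phi_2 = eigvecs_c[:, 0], eigvecs_c[:, 1]

# Perpendicular eigenvectors have a (numerically) zero dot product.
assert np.isclose(np.dot(phi_1, phi_2), 0.0)
```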

2. Plot the data in each of the four classes using different colors and display their eigen-vectors.

3. Consider the four test samples in Table 1 below:

| Test Samples | x-value | y-value |
|---|---|---|
| s1 | 2.3 | 1.9 |
| s2 | 7 | -0.3 |
| s3 | 10 | 0.5 |
| s4 | -1.2 | 0.6 |

Table 1: Test samples to be classified

(a) On the same previous plot, display the four test samples.

Ellipse equation

$$\begin{aligned}
x(\alpha) &= \sigma^2_{c,x} \cos(\alpha)\cos(\theta) - \sigma^2_{c,y} \sin(\alpha)\sin(\theta) + \mu_{c,x} \\
y(\alpha) &= \sigma^2_{c,x} \cos(\alpha)\sin(\theta) + \sigma^2_{c,y} \sin(\alpha)\cos(\theta) + \mu_{c,y}
\end{aligned}$$

Here, the rotation angle $\theta$ is computed from the components of the eigenvector: $\theta = \arctan\!\left(\frac{\phi_{i,2}}{\phi_{i,1}}\right)$

The length of each eigenvector is scaled by its associated eigenvalue $\lambda_i$. Thus, an axis of a distribution is $\lambda_i \phi_i$.
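A minimal plotting sketch of this ellipse (assuming `mu_c`, `eigvals_c`, and `eigvecs_c` from the estimates above, and using the eigenvalue scaling described here; the names are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

alpha = np.linspace(0.0, 2.0 * np.pi, 200)               # ellipse parameter
theta = np.arctan2(eigvecs_c[1, 0], eigvecs_c[0, 0])     # rotation angle of the first eigenvector
a, b = eigvals_c[0], eigvals_c[1]                        # axis lengths scaled by the eigenvalues

x = a * np.cos(alpha) * np.cos(theta) - b * np.sin(alpha) * np.sin(theta) + mu_c[0]
y = a * np.cos(alpha) * np.sin(theta) + b * np.sin(alpha) * np.cos(theta) + mu_c[1]

plt.plot(x, y)                                           # class ellipse
# Eigenvectors drawn from the class mean, each scaled by its eigenvalue.
for lam, phi in zip(eigvals_c, eigvecs_c.T):
    plt.quiver(*mu_c, *(lam * phi), angles='xy', scale_units='xy', scale=1)
plt.show()
```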

(b) Compute the Euclidean distances d(μi,sj) between the center of each class i=1,2,3,4 and the test samples j=1,2,3,4.

$$d(\mu_c, s_j) = \lVert \mu_c - s_j \rVert_2$$

Classification based on Euclidean distance

$$\omega = \arg\min_{c} \, d(\mu_c, s_j)$$
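A sketch of this nearest-mean rule (assuming `means` is a list of the four estimated class means and `s_j` is a test sample as a NumPy array; the names are illustrative):

```python
import numpy as np

def classify_euclidean(s_j, means):
    """Assign s_j to the class whose mean is closest in Euclidean distance."""
    distances = [np.linalg.norm(mu_c - s_j) for mu_c in means]   # d(mu_c, s_j)
    return int(np.argmin(distances)) + 1, distances              # 1-based class label
```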

In the plotted figures below, the purple line indicates the minimum distance among the four class distances.

(c) Classify the test samples accordingly and report the results in Table 2 below:

| Test Samples | $d(\mu_1, s_j)$ | $d(\mu_2, s_j)$ | $d(\mu_3, s_j)$ | $d(\mu_4, s_j)$ | Class Assignment |
|---|---|---|---|---|---|
| s1 | 8.2553 | 6.7662 | 5.8157 | 4.2925 | class 4 |
| s2 | 6.0986 | 2.7688 | 9.1359 | 9.4408 | class 2 |
| s3 | 4.5482 | 4.7291 | 12.2354 | 12.1118 | class 1 |
| s4 | 11.9873 | 8.8865 | 2.7610 | 2.4548 | class 4 |

Table 2: Euclidean distances and classification results in the original feature space

Part B: Whitened space and Euclidean distance

1. Apply a whitening transformation to the data in each of the classes according to their own parameters (i.e. Mean and Covariance)

Whitened mean

$$\mu_{W,c} = \Lambda_c^{-\frac{1}{2}} \Phi_c^T \mu_{x,c}$$

Whitened covariance

$$\Sigma_{W,c} = \Lambda_c^{-\frac{1}{2}} \Phi_c^T \Sigma_c \Phi_c \Lambda_c^{-\frac{1}{2}} = I$$

The whitened covariance matrix $\Sigma_{W,c}$ becomes an identity matrix because the transformation rotates and rescales the data. Since $\Phi_c^T \Sigma_c \Phi_c$ equals $\Lambda_c$, it cancels the two outer $\Lambda_c^{-\frac{1}{2}}$ factors:

$$\Phi_c^T \Sigma_c \Phi_c = \Lambda_c \quad\Rightarrow\quad \Lambda_c^{-\frac{1}{2}} \Lambda_c \Lambda_c^{-\frac{1}{2}} = I$$

Whitened data sample

We can whiten a data sample in the same way we apply the whitening transformation to a mean vector $\mu_{x,c}$; however, it must be transformed with the corresponding $\Lambda_c$ and $\Phi_c$ of its class:

$$x_{W,c} = \Lambda_c^{-\frac{1}{2}} \Phi_c^T x$$
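A minimal sketch of this per-class whitening transformation (assuming `X_c`, `mu_c`, and `sigma_c` as estimated in Part A; the names are illustrative):

```python
import numpy as np

def whiten(X_c, mu_c, sigma_c):
    """Whiten one class's samples and mean using its own eigendecomposition."""
    eigvals, Phi = np.linalg.eigh(sigma_c)          # Sigma_c = Phi Lambda Phi^T
    W = np.diag(eigvals ** -0.5) @ Phi.T            # whitening matrix Lambda^{-1/2} Phi^T
    X_w = X_c @ W.T                                 # whitened samples (rows)
    mu_w = W @ mu_c                                 # whitened class mean
    return X_w, mu_w, W
```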

Notice the values of all whitened covariance matrices: as stated above, they are identity matrices. The off-diagonal values are not exactly zero, but they are numerically negligible (close to zero up to floating-point error).

Check that the computed eigenvectors are perpendicular to each other.

Whitened test sample

In order to whiten a test sample, since there are 4 classes, each test sample has to be whitened with respect to each class distribution, one at a time:

$$s_{W,j}^{(i)} = \Lambda_i^{-\frac{1}{2}} \Phi_i^T s_j$$

Classification based on Euclidean distance in the whitened spaces

$$\omega = \arg\min_{c} \, d\big(\mu_{W,c}, s_{W,j}^{(c)}\big)$$
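A sketch of this rule in the whitened spaces (assuming, for each class `c`, the whitened mean `mu_w[c]` and whitening matrix `W[c]` returned by the `whiten` helper above; the names are illustrative):

```python
import numpy as np

def classify_whitened(s_j, mu_w, W):
    """Whiten s_j with each class's own transform, then pick the nearest whitened mean."""
    distances = [np.linalg.norm(mu_w[c] - W[c] @ s_j) for c in range(len(W))]
    return int(np.argmin(distances)) + 1, distances   # 1-based class label
```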

2. Repeat questions A.1, A.2 and A.3, but this time using the whitened data and whitened test samples, and report the results in Table 3 below:

| Test Samples | $d(\mu_1, s_j)$ | $d(\mu_2, s_j)$ | $d(\mu_3, s_j)$ | $d(\mu_4, s_j)$ | Class Assignment |
|---|---|---|---|---|---|
| s1 | 4.4632 | 8.2269 | 3.6661 | 3.0176 | class 4 |
| s2 | 2.7344 | 2.5084 | 3.4876 | 6.5090 | class 2 |
| s3 | 2.4367 | 2.6259 | 4.6832 | 8.6817 | class 1 |
| s4 | 6.5137 | 10.3474 | 2.5280 | 3.0785 | class 3 |

Table 3: Euclidean distances and classification results in the whitened space

Part C: Original feature space and Mahalanobis distance

1. Using the original dataset from Part A (i.e., before whitening), repeat question A.3 using the Mahalanobis distance $r(\mu_i, s_j)$ instead of the Euclidean distance, and report the results in Table 4 below.

Mahalanobis distance

The Mahalanobis distance is a way of computing the distance from a pattern to a distribution. It is preferable because the computation can be performed in the original feature space, i.e., we don't have to apply the whitening transformation to the data. It is defined as follows:

$$r^2(\mu_c, x_j) = (x_j - \mu_c)^T \Sigma_c^{-1} (x_j - \mu_c)$$
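A minimal sketch of this computation (assuming `mu_c` and `sigma_c` estimated as in Part A; the names are illustrative):

```python
import numpy as np

def mahalanobis_distance(x_j, mu_c, sigma_c):
    """Mahalanobis distance r(mu_c, x_j) computed in the original feature space."""
    diff = x_j - mu_c
    r_squared = diff.T @ np.linalg.inv(sigma_c) @ diff
    return np.sqrt(r_squared)
```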

Classification based on Mahalanobis distance in the original feature space

$$\omega = \arg\min_{c} \, r(\mu_c, s_j)$$
| Test Samples | $r(\mu_1, s_j)$ | $r(\mu_2, s_j)$ | $r(\mu_3, s_j)$ | $r(\mu_4, s_j)$ | Class Assignment |
|---|---|---|---|---|---|
| s1 | 4.4632 | 8.2269 | 3.6661 | 3.0176 | class 4 |
| s2 | 2.7344 | 2.5084 | 3.4876 | 6.5090 | class 2 |
| s3 | 2.4367 | 2.6259 | 4.6832 | 8.6817 | class 1 |
| s4 | 6.5137 | 10.3474 | 2.5280 | 3.0785 | class 3 |

Table 4: Mahalanobis distances and classification results in the original feature space

2. Compare Tables 2, 3, 4 and comment on the classification results.


Report:

  1. Write and submit a Mini-Project Report 1 containing the answers to all the questions above, including a discussion on the results – i.e. the mean and covariance before and after the whitening; the class assignments in all three cases; etc.
  2. Submit your implementations.
| Test Samples | x-value | y-value | 2-norm | 2-norm on whitened space | Mahalanobis distance |
|---|---|---|---|---|---|
| s1 | 2.3 | 1.9 | class 4 | class 4 | class 4 |
| s2 | 7 | -0.3 | class 2 | class 2 | class 2 |
| s3 | 10 | 0.5 | class 1 | class 1 | class 1 |
| s4 | -1.2 | 0.6 | class 4 | class 3 | class 3 |

Table 5: Classification results of the four test samples based on different methods

The classification results from the three approaches are the same for the test samples s1, s2, and s3: they are classified as class 4, class 2, and class 1, respectively. Interestingly, the result for test sample s4 differs. In the original feature space, where classification is based on the Euclidean distances to the four class means, s4 is assigned to class 4. With the other two methods (whitening transformation and Mahalanobis distance), it is instead assigned to class 3. The Mahalanobis distances between s4 and the distributions of classes 3 and 4 are very close. The Euclidean distance is clearly inadequate here because the data of classes 3 and 4 are not aligned with each other; they spread in different directions.

Comparing mean vectors

Comparing covariance matrices

Whitened covariance matrices = identity matrices

This means that the covariance matrices become identity matrices, which is the result of applying the whitening transformation to the original space. The data then spread equally in every direction, which can also be seen from the unit standard deviation of each class distribution.

Comparing 3 distances

Why are they the same?

We can prove that the Euclidean distance in the whitened space equals the Mahalanobis distance in the original space.

Solve for $\Sigma_c$ from the equation that arises from the whitening transformation:

$$\Phi_c^T \Sigma_c \Phi_c = \Lambda_c$$
$$(\Phi_c^T)^{-1} \Phi_c^T \Sigma_c \Phi_c = (\Phi_c^T)^{-1} \Lambda_c$$
$$\Sigma_c \Phi_c = (\Phi_c^T)^{-1} \Lambda_c$$
$$\Sigma_c \Phi_c \Phi_c^{-1} = (\Phi_c^T)^{-1} \Lambda_c \Phi_c^{-1}$$
$$\Sigma_c = (\Phi_c^T)^{-1} \Lambda_c \Phi_c^{-1}$$

Since $\Phi_c$ is orthogonal, $(\Phi_c^T)^{-1} = \Phi_c$ and $\Phi_c^{-1} = \Phi_c^T$, hence

$$\Sigma_c = \Phi_c \Lambda_c \Phi_c^T$$

Substitute $\Sigma_c$ into the Mahalanobis distance equation:

$$\begin{aligned}
r^2(\mu_c, x_j) &= (x_j - \mu_c)^T \Sigma_c^{-1} (x_j - \mu_c) \\
&= (x_j - \mu_c)^T (\Phi_c \Lambda_c \Phi_c^T)^{-1} (x_j - \mu_c) \\
&= (x_j - \mu_c)^T (\Phi_c \Lambda_c^{-1} \Phi_c^T) (x_j - \mu_c) \\
&= (x_j - \mu_c)^T (\Phi_c \Lambda_c^{-\frac{1}{2}} \Lambda_c^{-\frac{1}{2}} \Phi_c^T) (x_j - \mu_c) \\
&= (x_j - \mu_c)^T (\Phi_c \Lambda_c^{-\frac{1}{2}}) (\Lambda_c^{-\frac{1}{2}} \Phi_c^T) (x_j - \mu_c) \\
&= \left[ (\Phi_c \Lambda_c^{-\frac{1}{2}})^T (x_j - \mu_c) \right]^T \left[ (\Lambda_c^{-\frac{1}{2}} \Phi_c^T) (x_j - \mu_c) \right] \\
&= \left[ (\Lambda_c^{-\frac{1}{2}} \Phi_c^T) (x_j - \mu_c) \right]^T \left[ (\Lambda_c^{-\frac{1}{2}} \Phi_c^T) (x_j - \mu_c) \right] \\
&= (x_{W,j} - \mu_{W,c})^T (x_{W,j} - \mu_{W,c}) \\
&= \lVert \mu_{W,c} - x_{W,j} \rVert_2^2 \\
&= d^2(\mu_{W,c}, x_{W,j})
\end{aligned}$$

Taking the square root of both sides:

$$r^2(\mu_c, x_j) = d^2(\mu_{W,c}, x_{W,j}) \quad\Rightarrow\quad r(\mu_c, x_j) = d(\mu_{W,c}, x_{W,j})$$
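As a quick numerical check of this equivalence (a sketch reusing the hypothetical `whiten` and `mahalanobis_distance` helpers above on synthetic data):

```python
import numpy as np

rng = np.random.default_rng(0)
X_c = rng.multivariate_normal(mean=[1.0, 2.0], cov=[[2.0, 0.8], [0.8, 1.0]], size=500)
x_j = np.array([3.0, -1.0])

mu_c = X_c.mean(axis=0)
sigma_c = np.cov(X_c.T, ddof=1)
_, mu_w, W = whiten(X_c, mu_c, sigma_c)

r = mahalanobis_distance(x_j, mu_c, sigma_c)   # Mahalanobis distance in the original space
d = np.linalg.norm(mu_w - W @ x_j)             # Euclidean distance in the whitened space
assert np.isclose(r, d)                        # the two distances agree
```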