[code]
morpho <- read.csv("http://www.chainsawriot.com/data_epb/pca_example.csv", header = TRUE, sep = ",", quote = "\"", dec = ".", fill = TRUE, comment.char = "")[, 2:7]
morpho <- as.matrix(morpho)
# dimension of morpho, 28 x 6
m <- dim(morpho)[1]
n <- dim(morpho)[2]
# mean normalization
column.mean <- apply(morpho, 2, mean)
# tedious, can use sweep function
mat.mean <- t(t(matrix(1, m, n)) * column.mean)
# the normalization is only x - mean
adj.morpho <- morpho - mat.mean
# covariance matrix, remember the averaging factor is 1/(m-1), instead of 1/m
# it is the same as cov(morpho)
mat.cov <- t(adj.morpho) %*% adj.morpho / (m - 1)
# mat.cov can be converted to a correlation matrix using the cov2cor function,
# but I want to do it using linear algebra
# conversion of covariance to correlation = cov(x,y) / (sd(x)*sd(y))
column.sd <- apply(morpho, 2, sd)
# calculate the outer product of column.sd. Can also be done using
# column.sd %o% column.sd
outer.sd <- column.sd %*% t(column.sd)
mat.cor <- mat.cov / outer.sd
# svd the mat.cor
svd.cor <- svd(mat.cor)
eigen.value <- svd.cor$d
#screeplot
plot(eigen.value, type="b", xlab="Component", ylab="Variances")
abline(h=1, lty=3)

# we are going to take the first two components and compute the two component scores
u.reduce <- svd.cor$u[,1:2]

# For compatibility with the scores from princomp()
# sds is the idiosyncratic scaling factor used by princomp,
# not the same as column.sd

sds <- sqrt(diag(mat.cov*(1 - 1/m)))
mat.sds <- t(t(matrix(1, m, n)) * sds)
mat.zscore <- adj.morpho / mat.sds
comp.score <- mat.zscore %*% u.reduce
[/code]

L2-3:

Read the example data set from chainsawriot.com, keeping only columns 2 to 7, and convert the data frame to a matrix.

L5-6:

Get the dimensions of morpho: m = 28 rows (cases) and n = 6 columns (measurements).

L7:

L10-12:

Mean normalization: compute the column means, expand them into an m x n matrix (mat.mean), and subtract it from morpho. As the comment in the code notes, building mat.mean this way is tedious; the sweep function does the same job.

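To see that the manual expand-and-subtract really matches sweep(), here is a minimal sketch. The matrix X is random stand-in data, not the blog's morpho:

[code]
# random stand-in data; the real morpho matrix is 28 x 6
set.seed(3)
X <- matrix(rnorm(28 * 6), nrow = 28, ncol = 6)
column.mean <- apply(X, 2, mean)

# the manual way used in the post
mat.mean <- t(t(matrix(1, nrow(X), ncol(X))) * column.mean)
adj.manual <- X - mat.mean

# the idiomatic way: sweep out the column means along margin 2
adj.sweep <- sweep(X, 2, column.mean)

all.equal(adj.manual, adj.sweep)  # TRUE
[/code]
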
L15:

t(adj.morpho) is a 6 x 28 matrix; multiplying it by adj.morpho, a 28 x 6 matrix, gives a 6 x 6 matrix. In R, to get the same covariance matrix as the one computed by cov(), you have to divide by m-1. The Stanford Machine Learning course I took taught dividing by m. ((Why m-1? Because dividing by m gives a biased estimator. When m is very large, subtracting that 1 makes little difference.))
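The covariance arithmetic described above can be checked against cov() directly. A minimal sketch, with random stand-in data instead of morpho:

[code]
set.seed(42)
X <- matrix(rnorm(28 * 6), nrow = 28, ncol = 6)
m <- nrow(X)

adj <- scale(X, center = TRUE, scale = FALSE)  # x - mean, column-wise
unbiased <- t(adj) %*% adj / (m - 1)           # what cov() computes
biased   <- t(adj) %*% adj / m                 # the ML-course version

all.equal(unbiased, cov(X), check.attributes = FALSE)  # TRUE
[/code]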

L19-23:

Compute the column standard deviations and their outer product (which can also be written as column.sd %o% column.sd), then divide the covariance matrix element-wise by it, following cor(x,y) = cov(x,y) / (sd(x)*sd(y)). The result, mat.cor, is the correlation matrix.

L25-32:

Singular value decomposition (svd) is used to compute the eigenvalues and the loadings (eigenvectors). The eigenvalues are then used to draw a scree plot. From the scree plot, only Component 1 and Component 2 have eigenvalues above 1. Therefore, only the first two components are kept: their loadings (u, the eigenvectors) are stored as u.reduce.
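For a symmetric, positive semi-definite matrix such as a correlation matrix, svd() and eigen() give the same decomposition (up to the sign of the vectors). A minimal sketch with random stand-in data:

[code]
set.seed(1)
X <- matrix(rnorm(28 * 6), nrow = 28, ncol = 6)
R <- cor(X)

sv <- svd(R)
ev <- eigen(R)

all.equal(sv$d, ev$values)  # TRUE: the singular values are the eigenvalues

# scree plot with the Kaiser criterion line (keep eigenvalues > 1)
plot(sv$d, type = "b", xlab = "Component", ylab = "Variances")
abline(h = 1, lty = 3)
[/code]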

L38-41:

Compute the two component scores. For compatibility with princomp(), the data are rescaled by sds, the standard deviation with divisor m instead of m-1 that princomp uses internally; the resulting z-score matrix is then projected onto u.reduce.
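Whether the manual scores really match princomp() can be verified up to sign (eigenvectors are only defined up to a sign flip per column). A minimal sketch with random stand-in data:

[code]
set.seed(7)
X <- matrix(rnorm(28 * 6), nrow = 28, ncol = 6)
m <- nrow(X)

pc <- princomp(X, cor = TRUE)

# princomp scales with the population sd (divisor m, not m - 1)
sds <- apply(X, 2, sd) * sqrt((m - 1) / m)
z <- scale(X, center = TRUE, scale = sds)
u.reduce <- svd(cor(X))$u[, 1:2]
manual.score <- z %*% u.reduce

# columns agree with pc$scores[, 1:2] up to a possible sign flip
all.equal(abs(manual.score), abs(pc$scores[, 1:2]), check.attributes = FALSE)
[/code]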