In the context of your question (comparing two sets of kPCA axes for one given $\mathbf x$) the first term is a constant and so the only relevant term is the

A good method for estimating which is better is to transform the new sample to each of the PCA spaces, truncate the coefficients, transform back and check the reconstruction error.

Actually, it makes your question more interesting (+1) and I will try to provide an answer later today or tomorrow. –amoeba Oct 13 '14 at 18:20
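The transform, truncate, transform back procedure described above can be sketched for ordinary (linear) PCA roughly as follows. This is only an illustration: the helper names `pca_basis` and `reconstruction_error`, the synthetic groups `A` and `B`, and the choice `p=1` are assumptions made for the example, not something taken from the thread.

```python
import numpy as np

def pca_basis(X, p):
    """Return the training mean and the p leading principal axes (as columns)."""
    mean = X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal axes
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:p].T

def reconstruction_error(x, mean, U_p):
    """Squared error after projecting x onto U_p and mapping back."""
    xc = x - mean
    coeffs = U_p.T @ xc      # transform, keeping only p coefficients
    x_hat = U_p @ coeffs     # transform back
    return float(np.sum((xc - x_hat) ** 2))

rng = np.random.default_rng(0)
# Two hypothetical training groups that vary along different directions
A = rng.normal(size=(200, 3)) * np.array([5.0, 0.1, 0.1])
B = rng.normal(size=(200, 3)) * np.array([0.1, 5.0, 0.1])
basis_A = pca_basis(A, p=1)
basis_B = pca_basis(B, p=1)

x = np.array([3.0, 0.05, 0.0])   # test sample that resembles group A
err_A = reconstruction_error(x, *basis_A)
err_B = reconstruction_error(x, *basis_B)
best = "A" if err_A < err_B else "B"
```

The basis with the smaller error is taken as the better fit for that sample. Each basis is centered with its own training mean here, which is one possible convention.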

I think I might have to get more dimensions for the PCA, though. –yoki Oct 13 '14 at 17:00 Any idea on how to do this on Kernel PCA?

Instead, we apply the kernel trick and compute the principal components directly, by eigen-decomposition of the kernel matrix $\mathbf K$: if its eigenvectors are $\mathbf V$ and eigenvalues are on the diagonal of

Now consider the reconstruction error in the target space: \begin{align} \|\phi(\mathbf x) - \mathbf U_p^\vphantom{\top} \mathbf U_p^\top \phi(\mathbf x)\|^2 &= \|\phi(\mathbf x)\|^2 - \|\mathbf U_p^\top \phi(\mathbf x)\|^2 \\ &= k(\mathbf x, \mathbf x) - \|\mathbf U_p^\top \phi(\mathbf x)\|^2. \end{align}
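A minimal sketch of how this feature-space error could be evaluated with kernels only. It rests on several assumptions that are not from the thread: an RBF kernel with a hypothetical `gamma`, no centering in feature space (for brevity), and the notation `lam`, `V` for the eigenvalues and eigenvectors of $\mathbf K$. Under those assumptions the axes $\mathbf U_p = \Phi^\top \mathbf V_p \Lambda_p^{-1/2}$ are orthonormal, so $\mathbf U_p^\top \phi(\mathbf x) = \Lambda_p^{-1/2} \mathbf V_p^\top \mathbf k_{\mathbf x}$, where $\mathbf k_{\mathbf x}$ holds the kernel values between the training points and $\mathbf x$.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kpca_fit(X, p, gamma=0.5):
    """Keep the p leading eigenpairs of the (uncentered) kernel matrix."""
    lam, V = np.linalg.eigh(rbf_kernel(X, X, gamma))  # ascending order
    lam, V = lam[::-1][:p], V[:, ::-1][:, :p]         # p largest first
    return X, lam, V, gamma

def kpca_reconstruction_error(x, model):
    """k(x, x) minus the squared norm of the projection onto p components."""
    X, lam, V, gamma = model
    k_x = rbf_kernel(X, x[None, :], gamma)[:, 0]            # k(x_i, x) for all i
    k_xx = rbf_kernel(x[None, :], x[None, :], gamma)[0, 0]  # first term
    proj = (V.T @ k_x) / np.sqrt(lam)                       # U_p^T phi(x)
    return float(k_xx - proj @ proj)

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 2))          # hypothetical group A, around the origin
B = rng.normal(size=(30, 2)) + 6.0    # hypothetical group B, far from A
model_A = kpca_fit(A, p=5)
model_B = kpca_fit(B, p=5)

x = np.array([0.2, -0.1])             # test sample close to group A
err_A = kpca_reconstruction_error(x, model_A)
err_B = kpca_reconstruction_error(x, model_B)
```

No inverse transform back to input space is needed: the comparison happens entirely in feature space. A centered variant would need the usual centering of $\mathbf K$ and $\mathbf k_{\mathbf x}$, which this sketch leaves out.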

Sorry for all the scrutiny, but I would really like my toy example (at least) to work properly.

It obviously cannot be simply inverted, and using pseudo-inverse for $V$ yields an ill-conditioned problem. As far as I see, when (numerically) very small negative or positive eigenvalues are obtained, the reconstruction will have errors. –yoki Oct 15 '14 at 17:51

I have two PCA bases obtained by decomposing two groups of training data. This means that you take your test sample, project it on the first $p$ components, and look at the variance.

How can I decide which PCA basis better fits each test sample?

answered Oct 14 '14 at 15:48, edited Oct 15 '14 at 16:30, amoeba

@ido: yes, kernel scalar product of the test vector with itself (so

Actually, it means that in the context of your question (comparing two sets of PCA axes for one $\mathbf x$) the only relevant term is the second, which is simply the

now I can't use an "inverse" transform. –yoki Oct 13 '14 at 17:58

@ido: Yes, it would also work for kernel PCA.

I updated the answer. –amoeba Oct 15 '14 at 16:27

It is quite likely to get eigenvalues that are zero, or very close to zero.
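One common way to cope with such near-zero (or slightly negative) eigenvalues is to discard them before dividing by their square roots. The sketch below assumes a relative threshold `rel_tol` and a helper name `stable_components`. Both are illustrative choices, not something proposed in the thread, and the toy kernel is picked only because a duplicated sample makes $\mathbf K$ exactly rank-deficient.

```python
import numpy as np

def stable_components(K, rel_tol=1e-10):
    """Eigenpairs of a symmetric PSD kernel matrix, without numerically null ones."""
    lam, V = np.linalg.eigh(K)           # ascending order
    lam, V = lam[::-1], V[:, ::-1]       # largest first
    keep = lam > rel_tol * lam[0]        # drops zero and tiny negative values
    return lam[keep], V[:, keep]

# Hypothetical data: the second and fourth samples are identical
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
K = X @ X.T + 1.0                        # kernel k(a, b) = a.b + 1
lam, V = stable_components(K)
n_kept = len(lam)
```

The retained components still reproduce $\mathbf K$ up to rounding, so nothing of substance is lost by dropping the null directions.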

Your way is probably fine too. –amoeba Oct 15 '14 at 20:52

Now it should work out. For any given $\mathbf x$ it is a constant. This should already lead to near-zero error.

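Setting up some notation, if $\mathbf x$ is your test sample and $\mathbf U_p$ a matrix with $p$ leading principal axes in columns, then the reconstruction error is given by $$\|\mathbf x - \mathbf U_p^\vphantom{\top} \mathbf U_p^\top \mathbf x\|^2 = \|\mathbf x\|^2 - \|\mathbf U_p^\top \mathbf x\|^2.$$ This you know.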

The higher the variance -- the better the fit. That sounds reasonable. So you can take your test sample, project it on the first $p$ components, and look at the variance.