scikit-learn/scikit-learn

PCA, LDA, unexpected explained_variance_ratio

Open

#9400 opened on Jul 18, 2017

Labels: Bug, help wanted, module:decomposition

Description

from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

iris = datasets.load_iris()

X = iris.data
y = iris.target
target_names = iris.target_names

# Dimensionality reduction using PCA
pca = PCA(n_components=2)
X_r = pca.fit(X).transform(X)

# Fraction of variance explained by each component
print('PCA: explained variance ratio (first two components): %s'
      % str(pca.explained_variance_ratio_))

# Dimensionality reduction using LDA
lda = LinearDiscriminantAnalysis(n_components=2)
X_r2 = lda.fit(X, y).transform(X)

print('LDA: explained variance ratio (first two components): %s'
      % str(lda.explained_variance_ratio_))

Expected Results

The first component of the PCA should have a larger explained variance ratio than the first component of the LDA.

Actual Results

PCA: explained variance ratio (first two components): [ 0.925 0.053]
LDA: explained variance ratio (first two components): [ 0.991 0.009]
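One way to narrow this down is to check what total each attribute is normalized against. The sketch below is only a diagnostic. It assumes a scikit-learn version in which `LinearDiscriminantAnalysis` with the default `svd` solver sets `explained_variance_ratio_`. The variable names (`pca_full`, `pca_sum_2`, `pca_sum_all`, `lda_sum`) are illustrative and not part of the original report. If the LDA ratios sum to 1 over its two components while the PCA ratios only reach 1 over all four, the two vectors are fractions of different totals and may not be directly comparable.

```python
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Fit PCA with all components kept (4 for iris) so the full set of ratios is available.
pca_full = PCA().fit(X)
pca_sum_2 = pca_full.explained_variance_ratio_[:2].sum()
pca_sum_all = pca_full.explained_variance_ratio_.sum()

# LDA on a 3-class problem yields at most n_classes - 1 = 2 components.
lda = LinearDiscriminantAnalysis(n_components=2).fit(X, y)
lda_sum = lda.explained_variance_ratio_.sum()

print('PCA: sum of ratios, first two components: %.6f' % pca_sum_2)
print('PCA: sum of ratios, all components:       %.6f' % pca_sum_all)
print('LDA: sum of ratios, both components:      %.6f' % lda_sum)
```

Comparing the three sums shows whether the LDA figures are fractions of the same quantity as the PCA figures, which is what the expectation above relies on.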

Versions

Darwin-14.5.0-x86_64-i386-64bit
Python 3.5.3 |Anaconda custom (x86_64)| (default, Mar 6 2017, 12:15:08) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
NumPy 1.12.1
SciPy 0.19.1
Scikit-Learn 0.18.2
