scikit-learn/scikit-learn
PCA, LDA, unexpected explained_variance_ratio
Open
#9400 opened on Jul 18, 2017
Bug, help wanted, module:decomposition
Description
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
iris = datasets.load_iris()
X = iris.data
y = iris.target
target_names = iris.target_names
# Dimensionality reduction using PCA
pca = PCA(n_components=2)
X_r = pca.fit(X).transform(X)
# Percentage of variance explained by each component
print('PCA: explained variance ratio (first two components): %s'
% str(pca.explained_variance_ratio_))
# Dimensionality reduction using LDA
lda = LinearDiscriminantAnalysis(n_components=2)
X_r2 = lda.fit(X, y).transform(X)
print('LDA: explained variance ratio (first two components): %s'
% str(lda.explained_variance_ratio_))
Expected Results
The first component of the PCA should have a larger explained variance ratio than the first component of the LDA.
Actual Results
PCA: explained variance ratio (first two components): [ 0.925 0.053]
LDA: explained variance ratio (first two components): [ 0.991 0.009]
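One possible reading of these numbers, offered as an assumption and not a confirmed diagnosis: the two attributes may be normalized against different totals. PCA's ratio is each eigenvalue of the total covariance matrix divided by the total variance, whereas LDA's ratio appears to be each discriminant's share of the between-class separation (the eigenvalues of `inv(S_w) @ S_b`). The sketch below recomputes both quantities by hand with NumPy. The names `S_w`, `S_b`, `pca_ratio`, and `lda_ratio` are illustrative and not part of the scikit-learn API.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
X, y = iris.data, iris.target

# PCA-style ratio: eigenvalues of the total covariance matrix,
# each divided by the total variance of the data.
total_cov = np.cov(X, rowvar=False)
pca_eigvals = np.sort(np.linalg.eigvalsh(total_cov))[::-1]
pca_ratio = pca_eigvals / pca_eigvals.sum()

# LDA-style ratio (assumed definition): eigenvalues of inv(S_w) @ S_b,
# each divided by their sum. S_w is the within-class scatter and
# S_b is the between-class scatter.
overall_mean = X.mean(axis=0)
n_features = X.shape[1]
S_w = np.zeros((n_features, n_features))
S_b = np.zeros((n_features, n_features))
for label in np.unique(y):
    X_k = X[y == label]
    mean_k = X_k.mean(axis=0)
    centered = X_k - mean_k
    S_w += centered.T @ centered
    diff = (mean_k - overall_mean).reshape(-1, 1)
    S_b += X_k.shape[0] * (diff @ diff.T)

# The matrix is not symmetric, so take the real part of the eigenvalues.
lda_eigvals = np.linalg.eigvals(np.linalg.solve(S_w, S_b)).real
lda_eigvals = np.sort(lda_eigvals)[::-1]
lda_ratio = lda_eigvals / lda_eigvals.sum()

print("PCA-style ratio (first two):", np.round(pca_ratio[:2], 3))
print("LDA-style ratio (first two):", np.round(lda_ratio[:2], 3))
```

If the hand-computed values land close to the figures reported above, the two ratios are answering different questions (share of total variance versus share of class separation), so the first LDA ratio being larger than the first PCA ratio would not by itself indicate a bug.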
Versions
Darwin-14.5.0-x86_64-i386-64bit
Python 3.5.3 |Anaconda custom (x86_64)| (default, Mar 6 2017, 12:15:08) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
NumPy 1.12.1
SciPy 0.19.1
Scikit-Learn 0.18.2