scikit-learn/scikit-learn
Dictionary learning is slower with n_jobs > 1
Open
#4769 opened on May 26, 2015
Labels: Performance, help wanted, module:decomposition
Description
Setting n_jobs > 1 in MiniBatchDictionaryLearning (and in the function dict_learning_online) makes dictionary fitting much slower instead of faster.
Multiprocessing is handled in sklearn.decomposition, function dict_learning, line 249:
code_views = Parallel(n_jobs=n_jobs)(
    delayed(_sparse_encode)(
        X[this_slice], dictionary, gram, cov[:, this_slice], algorithm,
        regularization=regularization, copy_cov=copy_cov,
        init=init[this_slice] if init is not None else None,
        max_iter=max_iter)
    for this_slice in slices)
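For readers unfamiliar with the pattern in that snippet, here is a minimal standard-library sketch of the same split, encode, and reassemble idea. It is not the scikit-learn implementation: `even_slices`, `encode_rows`, and `parallel_encode` are hypothetical names, `encode_rows` is a dummy stand-in for `_sparse_encode`, a thread pool stands in for joblib's `Parallel`, and the slices are assumed to be contiguous, near-equal row ranges (the kind `sklearn.utils.gen_even_slices` yields).

```python
from concurrent.futures import ThreadPoolExecutor


def even_slices(n, n_packs):
    """Yield up to n_packs contiguous slices that together cover range(n)."""
    start = 0
    for pack in range(n_packs):
        # Spread the remainder over the first (n % n_packs) slices.
        size = n // n_packs + (1 if pack < n % n_packs else 0)
        if size > 0:
            yield slice(start, start + size)
            start += size


def encode_rows(rows):
    # Hypothetical stand-in for _sparse_encode: one value per input row.
    return [sum(v * v for v in row) for row in rows]


def parallel_encode(X, n_jobs):
    """Encode each slice of X in its own worker, then restore row order."""
    slices = list(even_slices(len(X), n_jobs))
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # pool.map returns results in submission order, so the views
        # line up with the slices they came from.
        views = list(pool.map(encode_rows, (X[s] for s in slices)))
    return [code for view in views for code in view]


X = [[float(i), float(i + 1)] for i in range(10)]
codes = parallel_encode(X, 3)
```

Every call pays for slicing, dispatching, and collecting the results, whatever the size of `X`. That fixed cost matters when the function is called many times on small inputs.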
Minimal example : https://gist.github.com/arthurmensch/091d16c135f4a3ba5580
Output n_jobs == 1
Distorting image...
Extracting reference patches...
done in 0.05s.
Learning the dictionary...
done in 5.12s.
Extracting noisy patches...
done in 0.02s.
Lasso LARS...
done in 10.24s.
Output n_jobs == 2
Distorting image...
Extracting reference patches...
done in 0.05s.
Learning the dictionary...
done in 78.98s.
Extracting noisy patches...
done in 0.02s.
Lasso LARS...
done in 6.15s.
Output n_jobs == 4
Distorting image...
Extracting reference patches...
done in 0.05s.
Learning the dictionary...
done in 83.24s.
Extracting noisy patches...
done in 0.02s.
Lasso LARS...
done in 3.82s.
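The transform method of MiniBatchDictionaryLearning (which relies on the sparse_encode function) benefits from multiprocessing as expected: the Lasso LARS step drops from 10.24s with n_jobs == 1 to 6.15s with n_jobs == 2 and 3.82s with n_jobs == 4.
Dictionary learning, by contrast, goes from 5.12s to 78.98s and 83.24s. It relies on successive calls to sparse_encode, so the slowdown may come from the overhead of repeating the parallel dispatch on every call.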
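One way to probe that hypothesis in isolation is to time many small successive calls that each set up their own worker pool against a plain loop doing the same work. The sketch below uses only the standard library. `tiny_task`, `serial_calls`, `pool_per_call`, and `timed` are hypothetical names, the batch sizes are arbitrary, and a thread pool is only a stand-in for joblib's process-based workers, so the absolute numbers will not match the timings reported above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def tiny_task(x):
    # Deliberately cheap, so that dispatch cost dominates the work itself.
    return x * x


def serial_calls(batches):
    """Process each batch with a plain loop."""
    return [[tiny_task(x) for x in batch] for batch in batches]


def pool_per_call(batches, n_jobs):
    """Process each batch in a fresh pool, paying the setup cost each time."""
    out = []
    for batch in batches:
        with ThreadPoolExecutor(max_workers=n_jobs) as pool:
            out.append(list(pool.map(tiny_task, batch)))
    return out


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call of fn."""
    t0 = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - t0


# Many small batches mimic successive calls on mini-batches.
batches = [list(range(8)) for _ in range(200)]
serial, t_serial = timed(serial_calls, batches)
pooled, t_pooled = timed(pool_per_call, batches, 2)
print(f"serial: {t_serial:.4f}s, pool per call: {t_pooled:.4f}s")
```

If the per-call cost is the culprit, the gap between the two timings should grow with the number of batches rather than with the size of each batch. A similar harness around the real fit call would be needed to confirm this for MiniBatchDictionaryLearning.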