PyTorch feature parity · FluxML/Flux.jl#1431

94 comments · 33 reactions · 0 assignees · Julia · 4,725 stars · 619 forks · batch import

help wanted

Description

A list of PyTorch 1.7 features. Items are checked if we have something more or less equivalent in Flux, or in the Julia ecosystem and supported by Flux. This list is not complete; it comes from a rough scan of PyTorch's documentation. Please feel free to add anything I missed in the comments; anyone with write access can modify the list directly. Related issue: https://github.com/FluxML/ML-Coordination-Tracker/issues/16, and more generally anything in https://github.com/FluxML/ML-Coordination-Tracker/issues

PyTorch Features

Conv Layers

Conv1d, Conv2d, Conv3d.
ConvTranspose1d, ConvTranspose2d, ConvTranspose3d.
groups in convolution layers
Fold, Unfold. In progress: https://github.com/FluxML/NNlib.jl/pull/444

Pooling Layers

MaxPool1d, MaxPool2d, MaxPool3d
MaxUnpool1d, MaxUnpool2d, MaxUnpool3d
AvgPool1d, AvgPool2d, AvgPool3d
FractionalMaxPool2d
LPPool1d, LPPool2d
AdaptiveAvgPool1d, AdaptiveAvgPool2d, AdaptiveAvgPool3d
AdaptiveMaxPool1d, AdaptiveMaxPool2d, AdaptiveMaxPool3d

Padding Layers

ReflectionPad (1d,2d)
ReplicationPad (1d,2d,3d) (NNlib.pad_repeat)
ZeroPad (2d)
ConstantPad (1d,2d,3d)
~~Add corresponding layers for all of the above, wrapping the NNlib functions~~ Keep as functions. Need to add them to Flux's docs.
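Since the padding operations stay as plain functions, a short usage sketch may help when documenting them. The example below assumes the `pad_zeros`, `pad_constant`, `pad_repeat` and `pad_reflect` functions of a recent NNlib release and their `dims` keyword; the mapping to PyTorch layer names in the comments is approximate.

```julia
using NNlib  # assumed to export the pad_* functions used below

# A 3×3 single-channel "image" in WHCN layout (width, height, channels, batch).
x = reshape(Float32.(1:9), 3, 3, 1, 1)

# (left, right) padding for dim 1, then (left, right) padding for dim 2.
pad = (1, 1, 2, 2)

z = pad_zeros(x, pad; dims = (1, 2))          # roughly ZeroPad2d
c = pad_constant(x, pad, 7f0; dims = (1, 2))  # roughly ConstantPad2d
r = pad_repeat(x, pad; dims = (1, 2))         # roughly ReplicationPad2d
m = pad_reflect(x, pad; dims = (1, 2))        # roughly ReflectionPad2d

@show size(z)
```

Passing `dims` explicitly keeps the channel and batch dimensions untouched regardless of each function's default.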

Activations

... . NNlib has an extensive collection of activation functions, and any Julia function can be used as an activation.

Normalization Layers

BatchNorm1d, BatchNorm2d, BatchNorm3d
LayerNorm
GroupNorm
InstanceNorm1d, InstanceNorm2d, InstanceNorm3d
SyncBatchNorm
LocalResponseNorm. Very old unfinished PR #312. It is an outdated technique, so we can probably live without it.
Move the functional implementations to NNlib.jl (https://github.com/FluxML/NNlib.jl/issues/19)

Recurrent Layers

RNN
GRU
LSTM

Attention Layers

Transformer. Well-maintained implementations in Transformers.jl.
MultiHeadAttention ~~Should be moved from Transformers.jl to Flux.jl~~ (ensure hitting cudnn kernels). PR #2146

Linear Layers

Identity
Linear
Bilinear

Dropout Layers

Dropout
Dropout2d, Dropout3d (#1490)
AlphaDropout

Sparse Layers

Embedding PR #1516
EmbeddingBag PR #2031

Distance Functions

CosineSimilarity. We have this in Distances.jl. Also easy to hand-code. TODO: check if it is AD- and GPU-friendly.
PairwiseDistance. We have this in Distances.jl. TODO: check if it is AD- and GPU-friendly (could use Tullio.jl to achieve both).
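As an illustration of the hand-coded route, here is a minimal sketch of a column-wise cosine similarity in plain Julia. The function name and the `eps` guard are illustrative choices, not an existing Flux or Distances.jl API. It uses only broadcasting and reductions, which is the style that tends to work with AD and GPU arrays, though that still needs to be checked as noted above.

```julia
# Column-wise cosine similarity between two feature×batch matrices.
# `cosine_similarity` is a hypothetical helper written for this sketch.
# `eps` guards against division by zero for all-zero columns.
function cosine_similarity(x::AbstractMatrix, y::AbstractMatrix; eps = 1f-8)
    dot_xy = sum(x .* y; dims = 1)            # per-column dot product
    norm_x = sqrt.(sum(abs2, x; dims = 1))    # per-column Euclidean norm
    norm_y = sqrt.(sum(abs2, y; dims = 1))
    return vec(dot_xy ./ max.(norm_x .* norm_y, eps))
end

x = Float32[1 0; 0 1]
y = Float32[1 0; 0 -1]
sims = cosine_similarity(x, y)  # one similarity per column (sample)
```

The first pair of columns is identical and the second pair points in opposite directions, so the two similarities sit at the extremes of the range.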

Loss Functions

... . We should be well covered here.
CTCLoss. Being implemented in #1287 (TODO: remove the separate GPU case, integrate with cuDNN).

Vision Layers

PixelShuffle. #1468
Upsample (for 1d, 2d, and 3d). (partially done in #1468)
- 'nearest'
- 'linear' (CPU version merged in NNlib, CUDA PR still to come)
- 'bilinear'
- 'bicubic'
- 'trilinear' (CPU version merged in NNlib, CUDA PR still open)
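For reference, the 'nearest' mode with an integer scale factor amounts to repeating each pixel along the spatial dimensions. The sketch below shows this with `Base.repeat` only; `upsample_nearest_sketch` is a hypothetical name used for illustration and is not the implementation in #1468.

```julia
# Nearest-neighbour upsampling by an integer factor for WHCN arrays.
# `upsample_nearest_sketch` is a hypothetical helper, not a Flux/NNlib function.
function upsample_nearest_sketch(x::AbstractArray{T,4}, scale::Int) where {T}
    # Repeat every element `scale` times along width and height;
    # channel and batch dimensions are left unchanged.
    return repeat(x; inner = (scale, scale, 1, 1))
end

x = reshape(Float32[1 2; 3 4], 2, 2, 1, 1)
y = upsample_nearest_sketch(x, 2)
@show size(y)
```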

Initialization

xavier_uniform, xavier_normal. Called glorot here.
kaiming_normal, kaiming_uniform
sparse
orthogonal (#1496)
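To make the xavier/glorot naming correspondence concrete, the following hand-written sketch draws a dense weight matrix from the Glorot (Xavier) uniform distribution with gain 1. The function name is hypothetical and the code is only an illustration of the formula, not Flux's own initializer.

```julia
using Random

# Glorot (Xavier) uniform initialisation with gain 1:
# entries are drawn uniformly from [-bound, bound),
# where bound = sqrt(6 / (fan_in + fan_out)).
# `glorot_uniform_sketch` is a hypothetical name for this illustration.
function glorot_uniform_sketch(rng::AbstractRNG, fan_out::Int, fan_in::Int)
    bound = sqrt(6f0 / (fan_in + fan_out))
    # rand gives [0, 1); shift and scale to [-bound, bound).
    return (rand(rng, Float32, fan_out, fan_in) .- 0.5f0) .* 2bound
end

rng = MersenneTwister(0)
W = glorot_uniform_sketch(rng, 4, 8)
bound = sqrt(6f0 / (4 + 8))
@show size(W) maximum(abs, W) <= bound
```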

Parallelism and Distributed

DataParallel
DistributedDataParallel (solved by https://github.com/DhairyaLGandhi/DaggerFlux.jl)
set_num_threads, set_num_interop_threads. Not sure which operations are parallelized in PyTorch. Here we have parallelization only in BLAS operations.

Distributions

Differentiation rules for logpdf are offered by DistributionsAD.jl.
rsample. Differentiating with respect to parameters through sampling is supported for many distributions, e.g. gradient(mu -> rand(Normal(mu, 1)), 0) == (1,).
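The gradient of 1 with respect to mu follows from the reparameterization trick: a Normal(mu, sigma) draw can be written as mu + sigma * noise, with the noise sampled independently of the parameters. The sketch below checks this numerically in plain Julia with a finite difference and a fixed seed; the helper names are hypothetical and no AD package is involved.

```julia
using Random

# Reparameterised draw from Normal(mu, sigma): the noise does not depend on
# the parameters, so the sample is a smooth function of mu and sigma.
rsample_normal(rng, mu, sigma) = mu + sigma * randn(rng)

# Central finite difference in mu, reusing the same noise via a fixed seed.
function dsample_dmu(mu, sigma; seed = 1, h = 1e-4)
    hi = rsample_normal(MersenneTwister(seed), mu + h, sigma)
    lo = rsample_normal(MersenneTwister(seed), mu - h, sigma)
    return (hi - lo) / 2h
end

d = dsample_dmu(0.0, 1.0)
@show d
```

Because the same noise value is used for both evaluations, the difference quotient isolates the dependence on mu.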

ONNX

Current best support in ONNXmutable. See this discussion
- ONNX.jl's old implementation has been replaced
- Overcome the limitations reported here

FFT

... . Zygote has the adjoints for AbstractFFTs.

Quantization

Pruning

WIP pruning package here

Optim

LinAlg

det
norm

Tensorboard

integration offered by TensorBoardLogger.jl

XLA

Some work in XLA.jl

Misc

PyTorch has both layers and their functional counterparts.
einsum. AD- and CUDA-compatible Einstein summation is provided by Tullio.jl and other packages.
- Add documentation to Flux.jl
LazyModuleMixin (pytorch 1.8) PR #2078
weight_norm. Attempt in #1005, PR #2053
modules iterator. #1444
spectral_norm. Old attempt in #115

PyTorch Extras

Torchvision

datasets. Some are implemented in DLDatasets.jl (unreleased), some in FastAI.jl, some in MLDatasets.jl, many are missing.
- Will consolidate in MLDatasets.jl (see https://github.com/lorenzoh/DLDatasets.jl/issues/1)
models. Some are implemented in Metalhead.jl, but it is a bit stale and not comprehensive.
- Metalhead's PR should add a bunch of models and generally revive the repo
- We should expose the possibility to load pretrained weights
io
transforms. Some ~~unreleased~~ work in DataAugmentation.jl

Torchaudio ...

Torchtext ...

Contributor guide

Tech stack: none
Domain: machine learning, backend
Issue type: feature
Difficulty: 5
Estimated time: over 1 week
Activity status: stale
Clarity: unclear
Prerequisites: Julia programming, deep learning basics, familiarity with Flux and PyTorch
Beginner friendliness: 10
Research direction: This is a meta issue tracking many missing features. To contribute, pick a specific unchecked item (e.g., 'Fold', 'Unfold', or 'MaxUnpool1d') and look at its status and linked PRs. For 'Fold' and 'Unfold', there is an in-progress PR in NNlib.jl that you could help finish. Study the existing Flux codebase and the corresponding PyTorch documentation. The issue has many comments that may give further guidance.