Shape-propagating Chain · FluxML/Flux.jl#703

8 comments · 8 reactions · 0 assignees · Julia · 4,725 stars · 619 forks

Labels: discussion, enhancement, help wanted

Description

It'd be nice to be able to write something like

model = @Chain(
  Input(28^2),
  Dense(32, relu),
  Dense(10),
  softmax)

It's a relatively minor convenience but it does avoid some redundancy when specifying chains, which is tedious to correct and easy to get wrong when trying different layer sizes.

Here's roughly how I imagine this working. The @Chain macro would expand to something like

shape = nothing
layer1, shape = fromshape(Input, shape, 28^2)
layer2, shape = fromshape(Dense, shape, 32, relu)
...
Chain(layer1, layer2, ...)

fromshape can then forward to an appropriate constructor, or throw an error for unsupported layers. Hopefully this strikes the right balance of simplicity and generality, and we don't end up having to turn it into a full shape inference system.
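To make the idea concrete, here is a minimal sketch of the shape threading, written without Flux so that it runs on its own. Everything in it is an assumption for illustration: `Input` is the hypothetical marker from the example above, `ToyDense` is a stand-in for a dense layer rather than Flux's `Dense`, `build_layers` is a made-up helper that does by hand what the macro expansion would do, and the `fromshape` signature (layer type, incoming shape, remaining arguments, returning the layer and its output shape) is only one possible reading of the proposal.

```julia
# Stand-in types so the sketch runs without Flux. None of this is Flux API.
struct Input end                       # hypothetical marker: only declares the input size

struct ToyDense{F}                     # minimal substitute for a dense layer
    W::Matrix{Float64}
    b::Vector{Float64}
    σ::F
end
ToyDense(nin::Integer, nout::Integer, σ = identity) =
    ToyDense(randn(nout, nin) ./ sqrt(nin), zeros(nout), σ)
(d::ToyDense)(x) = d.σ.(d.W * x .+ d.b)

relu(x) = max(zero(x), x)

# fromshape(LayerType, incoming_shape, args...) -> (layer_or_nothing, outgoing_shape)
# Input builds no layer; it only seeds the shape.
fromshape(::Type{Input}, ::Nothing, n::Integer) = (nothing, n)
# ToyDense takes its input size from the incoming shape.
fromshape(::Type{ToyDense}, n::Integer, nout::Integer, σ = identity) =
    (ToyDense(n, nout, σ), nout)
# Fallback: anything without a matching method fails loudly instead of guessing.
fromshape(::Type{T}, shape, args...) where {T} =
    error("fromshape is not defined for $T with input shape $shape")

# Hypothetical helper that threads `shape` through the specs, as the
# macro expansion would. Each spec is a tuple (LayerType, args...).
function build_layers(specs...)
    shape = nothing
    layers = Any[]
    for spec in specs
        layer, shape = fromshape(spec[1], shape, Base.tail(spec)...)
        layer === nothing || push!(layers, layer)
    end
    return layers, shape
end

layers, outshape = build_layers((Input, 28^2), (ToyDense, 32, relu), (ToyDense, 10))

# Apply the layers in order to a dummy input vector.
x = rand(28^2)
y = foldl((h, f) -> f(h), layers; init = x)
```

Each layer size is written once: the second `ToyDense` gets its input width of 32 from the shape returned by the first. Placing a `ToyDense` before any `Input` hits the fallback method and raises an error, which is the "error for unsupported layers" behaviour in its simplest form.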

Contributor guide

Tech stack: none
Domain: machine learning
Issue type: feature
Difficulty: 3
Estimated time: 3-5 days
Activity status: stale
Clarity: clear
Prerequisites: Basic knowledge of Flux.jl; understanding of Julia macros; familiarity with neural network layer construction
Beginner friendliness: 30
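Research directions: Note that Flux.jl has no existing @Chain macro; this issue proposes one. Start from the existing Chain type (defined in src/layers/basic.jl) to see how layers are stored and applied. Define a fromshape function that takes a layer type, the incoming shape, and the layer's remaining constructor arguments, and returns the constructed layer together with its output shape. Implement fromshape for common layers such as Dense and for the proposed Input marker, and have it raise an error for layers it does not support. Keep the existing Chain(...) constructor unchanged for backward compatibility. Write tests for various chain configurations. Consult the issue comments for additional use cases.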