This repository was archived by the owner on Jul 1, 2023. It is now read-only.
@dominikgrewe pointed out that K-FAC needs to access the model's non-differentiable internal state from `Optimizer.update(_:along:)`. My initial idea is to change the optimizer API to the following:
```swift
public protocol Optimizer {
    associatedtype Model: Layer
    associatedtype Scalar: FloatingPoint
    var learningRate: Scalar { get }
    mutating func update(_ variables: inout Model, along gradient: Model.CotangentVector)
}
```
But this comes at the cost of complicating concrete optimizers' implementations of `update(_:along:)`:

- Too long: `model.allDifferentiableVariables.recursivelyAllWritableKeyPaths(to: Tensor<Scalar>.self)`.
- Even longer: `model.allDifferentiableVariables[keyPath: kp] -= stepSize * firstMoments[keyPath: kp] / (sqrt(secondMoments[keyPath: kp]) + epsilon)`.
And we obviously can't assign `model.allDifferentiableVariables` to a local variable to shorten these expressions, because we need setter access that writes back into the model. Something like `inout var modelVariables = model.allDifferentiableVariables` is not possible in Swift until the ownership model and related features get fleshed out.
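For concreteness, here is a minimal, self-contained sketch of the key-path update pattern in plain Swift (no TensorFlow dependency). `Params` and its `allKeyPaths` property are hypothetical stand-ins for a model's differentiable variables and `recursivelyAllWritableKeyPaths(to:)`:

```swift
// Hypothetical stand-in for a model's differentiable variables.
struct Params {
    var w: Double = 1.0
    var b: Double = 0.5
    // Stand-in for `recursivelyAllWritableKeyPaths(to: Tensor<Scalar>.self)`.
    static let allKeyPaths: [WritableKeyPath<Params, Double>] = [\Params.w, \Params.b]
}

var model = Params()
let gradients = Params(w: 0.2, b: 0.1)
let learningRate = 0.1

// Each update must spell out the full subscript chain on the left-hand side;
// in a real optimizer, `model.allDifferentiableVariables` prefixes it, and
// moment buffers are indexed with the same key path, making each line longer still.
for kp in Params.allKeyPaths {
    model[keyPath: kp] -= learningRate * gradients[keyPath: kp]
}
```

The verbosity in the real Adam implementation comes from repeating this full chain, plus `firstMoments[keyPath: kp]` and `secondMoments[keyPath: kp]`, on every line of the update rule.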
Another issue is that making `model` be `inout` is not semantically accurate: an optimizer should not mutate a model's non-differentiable state.
Maybe what we really need is optimizer-specific protocols that require the model state each optimizer needs, which models would conform to. Each concrete optimizer would then have such a generic constraint and take these states as initializer parameters.
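A rough sketch of this direction, assuming nothing about the final API. `Layer` here is a minimal stand-in for the swift-apis protocol, and all other names (`KFACTrainable`, `kfacStatistics`, `DenseStatistics`, `KFAC`) are illustrative only:

```swift
// Minimal stand-in for the swift-apis `Layer` protocol.
protocol Layer {
    associatedtype Scalar: FloatingPoint
}

// Optimizer-specific protocol: a model trainable by K-FAC exposes the
// internal statistics the optimizer needs, read-only.
protocol KFACTrainable: Layer {
    associatedtype Statistics
    var kfacStatistics: Statistics { get }
}

// Illustrative per-layer state.
struct DenseStatistics {
    var activationCovariance: [[Double]] = []
}

struct Dense: KFACTrainable {
    typealias Scalar = Double
    var statistics = DenseStatistics()
    var kfacStatistics: DenseStatistics { statistics }
}

// The concrete optimizer constrains its Model to the protocol and takes the
// state as an initializer parameter, so the model never needs to be `inout`
// for the sake of non-differentiable state.
struct KFAC<Model: KFACTrainable> {
    var learningRate: Model.Scalar
    var statistics: Model.Statistics

    init(for model: Model, learningRate: Model.Scalar) {
        self.learningRate = learningRate
        self.statistics = model.kfacStatistics
    }
}

let optimizer = KFAC(for: Dense(), learningRate: 0.01)
```

This keeps `update(_:along:)` free to mutate only differentiable variables, while the optimizer observes model-specific state through the protocol requirement.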