add differentiable versions of are and hinf #844

Merged: 7 commits, May 24, 2023
2 changes: 1 addition & 1 deletion .github/workflows/docs.yml
@@ -16,7 +16,7 @@ jobs:
- uses: actions/checkout@v2
- uses: julia-actions/setup-julia@v1
  with:
    version: '1.8'
    version: '1'
- name: Setup docs environment
  run: |
    julia --project=docs -e '
2 changes: 2 additions & 0 deletions docs/Project.toml
@@ -1,5 +1,7 @@
[deps]
ChainRules = "082447d4-558c-5d27-93f4-14fc19e9eca2"
ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
ComponentArrays = "b0b7db55-cfe3-40fc-9ded-d10e2dbeff66"
ControlSystems = "a6e380b2-a6ca-5380-bf3e-84a91bcd477e"
ControlSystemsBase = "aaaaaaaa-a6ca-5380-bf3e-84a91bcd477e"
DSP = "717857b8-e6f2-59f4-9121-6e50c889abd2"
99 changes: 54 additions & 45 deletions docs/src/examples/automatic_differentiation.md
@@ -1,3 +1,30 @@
```@setup autodiff
using ChainRules, ForwardDiff, ChainRulesCore, LinearAlgebra
# @ForwardDiff_frule LinearAlgebra.exp!(x1::AbstractMatrix{<:ForwardDiff.Dual})

# Replace by ForwardDiffChainRules when https://github.com/ThummeTo/ForwardDiffChainRules.jl/pull/16 is merged
function LinearAlgebra.exp!(A::AbstractMatrix{<:ForwardDiff.Dual})
    Av = ForwardDiff.value.(A)                             # primal values
    J = reduce(vcat, transpose.(ForwardDiff.partials.(A))) # partials, one column per dual seed
    CS = length(ForwardDiff.partials(A[1,1]))              # chunk size, i.e., number of seeds
    dself = NoTangent();
    cAv = copy(Av)
    eA, newJ1 = ChainRules.frule((dself, reshape(J[:,1], size(A))), LinearAlgebra.exp!, cAv)

    newJt = ntuple(Val(CS - 1)) do i
        xpartialsi = reshape(J[:, i+1], size(A))
        cAv .= Av # exp! mutates its argument, so restore the primal before each pushforward
        _, ypartialsi = ChainRulesCore.frule((dself, xpartialsi), LinearAlgebra.exp!, cAv)
        ypartialsi
    end
    newJ = hcat(vec(newJ1), vec.(newJt)...)
    T = ForwardDiff.tagtype(eltype(A))
    flaty = ForwardDiff.Dual{T}.( # reassemble Duals from the primal and the pushed-forward partials
        eA, reshape(ForwardDiff.Partials.(NTuple{CS}.(eachrow(newJ))), size(A)),
    )
end
```

# Automatic Differentiation
In Julia, it is often possible to automatically compute derivatives, gradients, Jacobians and Hessians of arbitrary Julia functions with precision matching the machine precision, that is, without the numerical inaccuracies incurred by finite-difference approximations.
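For example (a minimal sketch assuming only ForwardDiff.jl), the derivative computed by AD agrees with the analytic derivative exactly:
```julia
using ForwardDiff
ForwardDiff.derivative(sin, 1.0) ≈ cos(1.0) # true, with no truncation error
```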

@@ -41,9 +68,7 @@ linear_sys = ss(A, B, C, D)
```
The example above linearizes `f` in the point ``x_0, u_0`` to obtain the linear statespace matrices ``A`` and ``B``, and linearizes `g` to obtain the linear output matrices ``C`` and ``D``.
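A minimal sketch of this pattern using `ForwardDiff.jacobian` (the dynamics `f` and output `g` below are hypothetical stand-ins, not the example's actual model):
```julia
using ControlSystemsBase, ForwardDiff

f(x, u) = [x[2]; -x[1] - 0.5x[2] + u[1]] # hypothetical nonlinear dynamics ẋ = f(x, u)
g(x, u) = [x[1]]                         # hypothetical output y = g(x, u)
x0, u0 = [0.0, 0.0], [0.0]               # linearization point

A = ForwardDiff.jacobian(x -> f(x, u0), x0)
B = ForwardDiff.jacobian(u -> f(x0, u), u0)
C = ForwardDiff.jacobian(x -> g(x, u0), x0)
D = ForwardDiff.jacobian(u -> g(x0, u), u0)
linear_sys = ss(A, B, C, D)
```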

## Optimization-based tuning

### PID controller
## Optimization-based tuning--PID controller

This example will demonstrate simple usage of AD using ForwardDiff.jl for optimization-based auto tuning of a PID controller.
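To give a flavor of the approach before the full example, here is a hedged sketch with a hypothetical plant and a simplified frequency-domain cost (the actual example's objective differs):
```julia
using ControlSystemsBase, ForwardDiff

P = tf(1, [1, 1, 1])                 # hypothetical plant
w = exp10.(range(-2, 2, length=200)) # frequency grid

function cost(p)
    C = pid(p[1], p[2], p[3], form=:parallel) # PID with parallel-form gains
    S = feedback(1, P*C)                      # sensitivity function
    maximum(abs, freqresp(S, w))              # proxy for the sensitivity peak
end

ForwardDiff.gradient(cost, [1.0, 1.0, 0.1]) # gradient for a gradient-based optimizer
```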

@@ -89,7 +114,7 @@ To make the automatic gradient computation through the matrix exponential used i
```@example autodiff
using Optimization, Statistics, LinearAlgebra
using OptimizationGCMAES, ChainRules, ForwardDiffChainRules
@ForwardDiff_frule LinearAlgebra.exp!(x1::AbstractMatrix{<:ForwardDiff.Dual})
# @ForwardDiff_frule LinearAlgebra.exp!(x1::AbstractMatrix{<:ForwardDiff.Dual}) # https://github.com/ThummeTo/ForwardDiffChainRules.jl/pull/16

function plot_optimized(P, params, res)
    fig = plot(layout=(1,3), size=(1200,400), bottommargin=2Plots.mm)
@@ -148,47 +173,16 @@ plot_optimized(P, params, res.u)

The optimized controller achieves more or less the same low peak in the sensitivity function, but does so while *both* making the step responses significantly faster *and* using much less controller gain at high frequencies (the orange sensitivity function), an altogether better tuning. The only potentially negative effect of this tuning is that the overshoot in response to a reference step increased slightly, indicated also by the slightly higher peak in the complementary sensitivity function (green). However, the response to reference steps can (and most often should) be additionally shaped by reference pre-filtering (sometimes referred to as "feedforward" or "reference shaping"), introducing an additional filter in the feedforward path only, which allows elimination of the overshoot without affecting the closed-loop properties.
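A minimal sketch of such a prefilter (the first-order filter and its time constant are illustrative assumptions, with `P` the plant and `C` the tuned controller from above):
```julia
F   = tf(1, [0.5, 1])   # hypothetical first-order reference prefilter
Try = F * feedback(P*C) # shaped r → y response; feedback properties are unchanged
```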

### LQG controller
We could attempt a similar automatic tuning of an LQG controller. This time, we choose to optimize the weight matrices of the LQR problem and the state covariance matrix of the noise. The synthesis of an LQR controller involves the solution of a Riccati equation, which in turn involves performing a Schur decomposition. These steps are hard to differentiate through in a conventional way, but we can make use of implicit differentiation using the implicit function theorem. To do so, we load the package `ImplicitDifferentiation`, and defines the conditions that hold at the solution of the Riccati equation:
## Optimization-based tuning--LQG controller
We could attempt a similar automatic tuning of an LQG controller. This time, we choose to optimize the weight matrices of the LQR problem and the state covariance matrix of the noise. The synthesis of an LQR controller involves the solution of a Riccati equation, which in turn involves performing a Schur decomposition. These steps are hard to differentiate through in a conventional way, but we can make use of implicit differentiation using the implicit function theorem. To do so, we load the package `ImplicitDifferentiation`, and define the conditions that hold at the solution of the Riccati equation:
```math
A^TX + XA - XBR^{-1}B^T X + Q = 0
```
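By the implicit function theorem, if the left-hand side above is viewed as a residual ``g(p, X) = 0``, with ``p`` collecting the parameters ``A, B, Q, R``, the sensitivity of the solution ``X`` follows as
```math
\frac{\partial X}{\partial p} = -\left(\frac{\partial g}{\partial X}\right)^{-1} \frac{\partial g}{\partial p}
```
which is what `ImplicitDifferentiation` computes from the condition function, without differentiating through the Schur-based Riccati solver itself.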

We then define differentiable versions of [`lqr`](@ref) and [`kalman`](@ref) that make use of the "implicit function".
When `ImplicitDifferentiation` and `ComponentArrays` are loaded, differentiable versions of [`lqr`](@ref) and [`kalman`](@ref) that make use of the "implicit function" become available automatically.

```@example autodiff
using ImplicitDifferentiation

function forward_are(Q0; A,B,R)
    Q = reshape(Q0, size(A))
    ControlSystemsBase.are(Continuous, A, B, Q, R), 0
end

function conditions_are(Q, X, noneed; A,B,R)
    XB = X*B
    T = promote_type(eltype(Q), eltype(X))
    C = T.(Q)
    mul!(C,A',X,1,1)
    mul!(C,X,A,1,1)
    mul!(C, XB, R\XB', -1, 1)
end

const implicit_are = ImplicitFunction(forward_are, conditions_are)

function lqrdiff(P::AbstractStateSpace{Continuous}, Q, R)
    (; A, B) = P
    X0, _ = implicit_are(Q; A, B, R)
    X = X0 isa AbstractMatrix ? X0 : reshape(X0, size(A))
    R\(B'X)
end

function kalmandiff(P::AbstractStateSpace{Continuous}, Q, R)
    A = P.A'
    B = P.C'
    X0, _ = implicit_are(Q; A, B, R)
    X = X0 isa AbstractMatrix ? X0 : reshape(X0, size(A))
    (R\(B'X))'
end
using ImplicitDifferentiation, ComponentArrays # Both these packages are required to load the implicit differentiation rules
```

Since this is a SISO system, we do not need to tune the control-input matrix or the measurement covariance matrix; any non-unit weight assigned to those can be associated with the state matrices instead. Since these matrices are supposed to be positive semi-definite, we optimize Cholesky factors rather than the full matrices. We will also reformulate the problem slightly and use proper constraints to limit the sensitivity peak.
@@ -208,14 +202,14 @@ function triangular(x)
end
invtriangular(T) = [T[i,j] for i = 1:size(T,1) for j = i:size(T,1)]

function systemslqr(params, P)
function systemslqr(params::AbstractVector{T}, P) where T
    n2 = length(params) ÷ 2
    Qchol = triangular(params[1:n2])
    Rchol = triangular(params[n2+1:2n2])
    Q = Qchol'Qchol
    R = Rchol'Rchol
    L = lqrdiff(P, Q, I(1))
    K = kalmandiff(P, R, I(1))
    L = lqr(P, Q, one(T)*I(1)) # The last matrix must have the correct type
    K = kalman(P, R, one(T)*I(1)) # The last matrix must have the correct type
    C = observer_controller(P, L, K)
    G = extended_gangoffour(P, C) # [S PS; CS T]
    C, G
@@ -227,7 +221,7 @@ function systems(params, P)
end

function sim(G)
    Gd = c2d(G, 0.1, :zoh) # Discretize the system
    Gd = c2d(G, 0.1, :zoh)          # Discretize the system
    res1 = step(Gd[1, 1], 0:0.1:15) # Simulate S
    res2 = step(Gd[1, 2], 0:0.1:15) # Simulate PS
    res1, res2
@@ -277,7 +271,7 @@ plot_optimized(P, params, res.u)
This controller should perform better than the PID controller, which is known to be incapable of properly damping the resonance in a double-mass system. However, we did not include any integral action in the LQG controller, which has implications for the disturbance response, as indicated by the green step response in the simulation above.
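This can be verified directly; a short sketch reusing the names from the example above (`res.u` holds the optimized parameters):
```julia
C, G = systemslqr(res.u, P) # controller and gang-of-four for the optimized parameters
dcgain(G[1, 2])             # static gain d → e; nonzero since the controller lacks integral action
```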


#### Robustness analysis
### Robustness analysis
To check the robustness of the designed LQG controller w.r.t. parametric uncertainty in the plant, we load the package [`MonteCarloMeasurements`](https://github.com/baggepinnen/MonteCarloMeasurements.jl) and recreate the plant model with 20% uncertainty in the spring coefficient.
```@example autodiff
using MonteCarloMeasurements
@@ -298,4 +292,19 @@ r2 = step(Gd[1,2], 0:0.05:15) # Simulate PS
plot([r1, r2]; title="Time response",
    lab = [" \$r → e\$" " \$d → e\$"], legend=:bottomright,
    fillalpha=0.05, linealpha=0.8, seriestype=:path, c=[1 3], ri=false, N=32)
```

## Known limitations
The following issues are currently known to exist when using AD through ControlSystems.jl:

### ForwardDiff
[ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl) works for many workflows without any user intervention. The following known limitations exist:
- The function `svdvals` does not have a forward rule defined. This means that the functions [`sigma`](@ref) and `opnorm` will not work for MIMO systems with ForwardDiff. SISO, MISO and SIMO systems will, however, work.
- [`hinfnorm`](@ref) requires ImplicitDifferentiation.jl and ComponentArrays.jl to be manually loaded by the user, after which implicit differentiation rules are defined for [`hinfnorm`](@ref). The implicit rule calls `opnorm`, and is thus affected by the first limitation above for MIMO systems. [`hinfnorm`](@ref) has a reverse rule defined in RobustAndOptimalControl.jl, which is not affected by this limitation.
- [`are`](@ref), [`lqr`](@ref) and [`kalman`](@ref) all require ImplicitDifferentiation.jl and ComponentArrays.jl to be manually loaded by the user, after which implicit differentiation rules are defined. To invoke the correct method of these functions, it is important that the second matrix (corresponding to input or measurement) has the `Dual` number type, i.e., the `R` matrix in `lqr(P, Q, R)` or `lqr(Continuous, A, B, Q, R)`. A usage sketch is shown after this list.
- The `schur` factorization is not amenable to differentiation using ForwardDiff. This is the fundamental reason for requiring ImplicitDifferentiation.jl to differentiate through the Riccati-equation solver. `schur` is called in several additional places, including [`balreal`](@ref) and all [`lyap`](@ref) solvers. To make `schur` differentiable, an implicit differentiation rule would be required.
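A usage sketch of the `lqr` point above (system matrices are hypothetical; note how `R` is given the `Dual` element type via `one(T)*I(1)`):
```julia
using ControlSystemsBase, ForwardDiff, LinearAlgebra
using ImplicitDifferentiation, ComponentArrays # load the implicit rules

A = [0.0 1.0; -1.0 -0.5]
B = reshape([0.0, 1.0], 2, 1)
function lqr_gain(q::AbstractVector{T}) where T
    Q = diagm(q)
    vec(lqr(Continuous, A, B, Q, one(T)*I(1))) # R carries the Dual type, invoking the implicit rule
end
ForwardDiff.jacobian(lqr_gain, [1.0, 2.0]) # sensitivity of the LQR gain w.r.t. the state weights
```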

### Reverse-mode AD
- Zygote does not work very well at all, due to
  - frequent use of mutation for performance
  - try/catch blocks
13 changes: 12 additions & 1 deletion lib/ControlSystemsBase/Project.toml
@@ -23,10 +23,18 @@ SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
UUIDs = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"

[weakdeps]
ComponentArrays = "b0b7db55-cfe3-40fc-9ded-d10e2dbeff66"
ImplicitDifferentiation = "57b37032-215b-411a-8a7c-41a003a55207"

[extensions]
ControlSystemsBaseImplicitDifferentiationExt = ["ImplicitDifferentiation", "ComponentArrays"]

[compat]
Aqua = "0.5"
DSP = "0.6.1, 0.7"
ForwardDiff = "0.10"
ImplicitDifferentiation = "0.4.2"
IterTools = "1.0"
LaTeXStrings = "1.0"
MacroTools = "0.5"
@@ -41,10 +49,13 @@ julia = "1.6"

[extras]
Aqua = "4c88cf16-eb10-579e-8560-4a9242c79595"
ComponentArrays = "b0b7db55-cfe3-40fc-9ded-d10e2dbeff66"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
FiniteDifferences = "26cc04aa-876d-5657-8c51-4c34ba976000"
GR = "28b8d3ca-fb5f-59d9-8090-bfdbd6d07a71"
ImplicitDifferentiation = "57b37032-215b-411a-8a7c-41a003a55207"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
test = ["Test", "Aqua", "Documenter", "GR", "Plots"]
test = ["Test", "Aqua", "ComponentArrays", "Documenter", "FiniteDifferences", "ImplicitDifferentiation", "GR", "Plots"]
185 changes: 185 additions & 0 deletions lib/ControlSystemsBase/ext/ControlSystemsBaseImplicitDifferentiationExt.jl
@@ -0,0 +1,185 @@
module ControlSystemsBaseImplicitDifferentiationExt
using ControlSystemsBase
import ControlSystemsBase: kalman, lqr, are, ContinuousType, DiscreteType
using ForwardDiff
using ForwardDiff: Dual
using ImplicitDifferentiation
using ComponentArrays
using LinearAlgebra


function forward_arec(pars)
    (; A,B,Q,R) = pars
    ControlSystemsBase.are(Continuous, A, B, Q, R), 0
end

function conditions_arec(pars, X, noneed)
    (; A,B,Q,R) = pars
    C = A'X
    C .+= C'                  # C = A'X + XA (X is symmetric)
    C .+= Q                   # C = A'X + XA + Q
    XB = X*B
    mul!(C, XB, R\XB', -1, 1) # C -= XB*inv(R)*XB'; zero at the ARE solution
end

const implicit_arec = ImplicitFunction(forward_arec, conditions_arec)


"""
are(::Continuous, A, B, Q, R::AbstractMatrix{<:Dual}; kwargs)

To make the ARE solver work with dual numbers, make sure that the `R` matrix has the dual number element type.
"""
function are(::ContinuousType, A::AbstractMatrix, B, Q, R::AbstractMatrix{<:Dual}; kwargs...)
pars = ComponentVector(; A,B,Q,R)
X0, _ = implicit_arec(pars)
X = X0 isa AbstractMatrix ? X0 : reshape(X0, size(A))
X
end

"""
lqr(::Continuous, A, B, Q, R::AbstractMatrix{<:Dual})

To make the LQR solver work with dual numbers, make sure that the `R` matrix has the dual number element type.
"""
function lqr(::ContinuousType, A, B, Q, R::AbstractMatrix{<:Dual})
X = are(Continuous, A, B, Q, R)
R\(B'X)
end

"""
kalman(::Continuous, A, C, Q, R::AbstractMatrix{<:Dual})

To make the Kalman solver work with dual numbers, make sure that the `R` matrix has the dual number element type.
"""
function kalman(::ContinuousType, A, C, Q, R::AbstractMatrix{<:Dual})
X = are(Continuous, A', C', Q, R)
(R\(C*X))'
end



## Discrete


function forward_ared(pars)
    (; A,B,Q,R) = pars
    ControlSystemsBase.are(Discrete, A, B, Q, R), 0
end

function conditions_ared(pars, X, noneed)
    # A'XA - X - (A'XB+S)(R+B'XB)^(-1)(B'XA+S') + Q = 0
    # A'X*A - X - (A'X*B)*((R+B'X*B)\(B'X*A)) + Q
    (; A,B,Q,R) = pars
    AX = A'X
    C = AX*A
    C .+= Q .- X
    AXB = AX*B
    C .-= AXB*(((B'X*B) .+= R)\AXB') # (B'X*B) .+= R reuses the temporary to form R + B'XB
    C
end

const implicit_ared = ImplicitFunction(forward_ared, conditions_ared)


"""
are(::Discrete, A, B, Q, R::AbstractMatrix{<:Dual}; kwargs)

To make the ARE solver work with dual numbers, make sure that the `R` matrix has the dual number element type.
"""
function are(::DiscreteType, A::AbstractMatrix, B, Q, R::AbstractMatrix{<:Dual}; kwargs...)
pars = ComponentVector(; A,B,Q,R)
X0, _ = implicit_ared(pars)
X = X0 isa AbstractMatrix ? X0 : reshape(X0, size(A))
X
end

"""
lqr(::Discrete, A, B, Q, R::AbstractMatrix{<:Dual})

To make the LQR solver work with dual numbers, make sure that the `R` matrix has the dual number element type.
"""
function lqr(::DiscreteType, A, B, Q, R::AbstractMatrix{<:Dual})
X = are(Discrete, A, B, Q, R)
BX = B'X
(R+BX*B)\(BX*A)
end

"""
kalman(::Discrete, A, C, Q, R::AbstractMatrix{<:Dual})

To make the Kalman solver work with dual numbers, make sure that the `R` matrix has the dual number element type.
"""
function kalman(::DiscreteType, A, C, Q, R::AbstractMatrix{<:Dual})
X = are(Discrete, A', C', Q, R)
CX = C*X
(R+CX*B)\(CX*A)
end


## Hinf norm
import ControlSystemsBase: hinfnorm
function forward_hinfnorm(pars; kwargs...)
    (; A,B,C,D) = pars
    sys = ss(A,B,C,D)
    hinfnorm(sys; kwargs...) # returns (γ, w): the norm and the peak frequency
end

function conditions_hinfnorm(pars, γ, w; tol=1e-10)
    (; A,B,C,D) = pars
    sys = ss(A,B,C,D)
    [opnorm(freqresp(sys, w)) - γ] # at the peak frequency w, the gain equals γ
end

const implicit_hinfnorm = ImplicitFunction(forward_hinfnorm, conditions_hinfnorm)

"""
hinfnorm(sys::StateSpace{Continuous, <:Dual}; kwargs)

The H∞ norm can be differentiated through using ForwardDiff.jl, but at the time of writing, is limited to systems with *either* a signel input *or* a signle output.

A reverse-differention rule is defined in RobustAndOptimalControl.jl, which means that hinfnorm is differentiable using, e.g., Zygote in reverse mode.
"""
function hinfnorm(sys::StateSpace{Continuous, <:Dual}; kwargs...)
A,B,C,D = ssdata(sys)
pars = ComponentVector(; A,B,C,D)
γ, w = implicit_hinfnorm(pars)
γ, w
end
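# Usage sketch (hypothetical; assumes ForwardDiff is loaded and the system is SISO,
# per the limitation noted in the docstring above):
#
#   ForwardDiff.derivative(1.0) do k
#       sys = ss([-1.0 0.0; 0.0 -2.0], reshape([1.0, k], 2, 1), [1.0 1.0], 0)
#       hinfnorm(sys)[1] # differentiate the peak gain w.r.t. the input gain k
#   end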


# ## Schur, currently not working when the matrix A has complex eigenvalues.
# Sec 4.2 in "A PROCEDURE FOR DIFFERENTIATING PERFECT-FORESIGHT-MODEL REDUCED-FORM COEFFICIENTS", Gary ANDERSON has a formula for the derivative, but it looks rather expensive to compute, involving the factorization of a rather large Kronecker matrix. This factorization only has to be done once, though, since it does not depend on the partials.
# function forward_schur(A)
# F = schur(A)
# ComponentVector(; F.Z, F.T), F
# end

# function conditions_schur(A, F, noneed)
# (; Z, T) = F
# [
# vec(Z' * A * Z - T);
# vec(Z' * Z - I + LowerTriangular(T) - Diagonal(T))
# ]
# end

# linear_solver = (A, b) -> (Matrix(A) \ b, (solved=true,))
# const implicit_schur = ImplicitFunction(forward_schur, conditions_schur, linear_solver)

# # vectors = Z
# # Schur = T
# # A = F.vectors * F.Schur * F.vectors'
# # A = Z * T * Z'
# function LinearAlgebra.schur(A::AbstractMatrix{<:Dual})
# ZT, F = implicit_schur(A)
# n = length(A)
# Z = reshape(ZT[1:n], size(A))
# T = reshape(ZT[n+1:end], size(A))
# Schur(T, Z, F.values)
# end


end # module
