add differentiable versions of are and hinf #844

Merged: 7 commits, May 24, 2023
2 changes: 1 addition & 1 deletion .github/workflows/docs.yml
@@ -16,7 +16,7 @@ jobs:
- uses: actions/checkout@v2
- uses: julia-actions/setup-julia@v1
  with:
    version: '1.8'
    version: '1'
- name: Setup docs environment
  run: |
    julia --project=docs -e '
2 changes: 2 additions & 0 deletions docs/Project.toml
@@ -1,5 +1,7 @@
[deps]
ChainRules = "082447d4-558c-5d27-93f4-14fc19e9eca2"
ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
ComponentArrays = "b0b7db55-cfe3-40fc-9ded-d10e2dbeff66"
ControlSystems = "a6e380b2-a6ca-5380-bf3e-84a91bcd477e"
ControlSystemsBase = "aaaaaaaa-a6ca-5380-bf3e-84a91bcd477e"
DSP = "717857b8-e6f2-59f4-9121-6e50c889abd2"
99 changes: 54 additions & 45 deletions docs/src/examples/automatic_differentiation.md
@@ -1,3 +1,30 @@
```@setup autodiff
using ChainRules, ForwardDiff, ChainRulesCore, LinearAlgebra
# @ForwardDiff_frule LinearAlgebra.exp!(x1::AbstractMatrix{<:ForwardDiff.Dual})

# Replace by ForwardDiffChainRules when https://github.com/ThummeTo/ForwardDiffChainRules.jl/pull/16 is merged
function LinearAlgebra.exp!(A::AbstractMatrix{<:ForwardDiff.Dual})
    Av = ForwardDiff.value.(A)                             # primal values
    J = reduce(vcat, transpose.(ForwardDiff.partials.(A))) # partials, one column per dual seed
    CS = length(ForwardDiff.partials(A[1,1]))              # chunk size, i.e., number of seeds
    dself = NoTangent();
    cAv = copy(Av)
    eA, newJ1 = ChainRules.frule((dself, reshape(J[:,1], size(A))), LinearAlgebra.exp!, cAv)

    newJt = ntuple(Val(CS - 1)) do i
        xpartialsi = reshape(J[:, i+1], size(A))
        cAv .= Av # exp! mutates its argument, so restore the primal before each pushforward
        _, ypartialsi = ChainRulesCore.frule((dself, xpartialsi), LinearAlgebra.exp!, cAv)
        ypartialsi
    end
    newJ = hcat(vec(newJ1), vec.(newJt)...)
    T = ForwardDiff.tagtype(eltype(A))
    flaty = ForwardDiff.Dual{T}.( # reassemble Duals from the primal and the pushed-forward partials
        eA, reshape(ForwardDiff.Partials.(NTuple{CS}.(eachrow(newJ))), size(A)),
    )
end
```

# Automatic Differentiation
In Julia, it is often possible to automatically compute derivatives, gradients, Jacobians and Hessians of arbitrary Julia functions with precision matching the machine precision, that is, without the numerical inaccuracies incurred by finite-difference approximations.
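For example (a minimal sketch assuming only ForwardDiff.jl), the derivative computed by AD agrees with the analytic derivative exactly:
```julia
using ForwardDiff
ForwardDiff.derivative(sin, 1.0) ≈ cos(1.0) # true, with no truncation error
```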

@@ -41,9 +68,7 @@ linear_sys = ss(A, B, C, D)
```
The example above linearizes `f` in the point ``x_0, u_0`` to obtain the linear statespace matrices ``A`` and ``B``, and linearizes `g` to obtain the linear output matrices ``C`` and ``D``.
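A minimal sketch of this pattern using `ForwardDiff.jacobian` (the dynamics `f` and output `g` below are hypothetical stand-ins, not the example's actual model):
```julia
using ControlSystemsBase, ForwardDiff

f(x, u) = [x[2]; -x[1] - 0.5x[2] + u[1]] # hypothetical nonlinear dynamics ẋ = f(x, u)
g(x, u) = [x[1]]                         # hypothetical output y = g(x, u)
x0, u0 = [0.0, 0.0], [0.0]               # linearization point

A = ForwardDiff.jacobian(x -> f(x, u0), x0)
B = ForwardDiff.jacobian(u -> f(x0, u), u0)
C = ForwardDiff.jacobian(x -> g(x, u0), x0)
D = ForwardDiff.jacobian(u -> g(x0, u), u0)
linear_sys = ss(A, B, C, D)
```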

## Optimization-based tuning

### PID controller
## Optimization-based tuning--PID controller

This example will demonstrate simple usage of AD using ForwardDiff.jl for optimization-based auto tuning of a PID controller.
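To give a flavor of the approach before the full example, here is a hedged sketch with a hypothetical plant and a simplified frequency-domain cost (the actual example's objective differs):
```julia
using ControlSystemsBase, ForwardDiff

P = tf(1, [1, 1, 1])                 # hypothetical plant
w = exp10.(range(-2, 2, length=200)) # frequency grid

function cost(p)
    C = pid(p[1], p[2], p[3], form=:parallel) # PID with parallel-form gains
    S = feedback(1, P*C)                      # sensitivity function
    maximum(abs, freqresp(S, w))              # proxy for the sensitivity peak
end

ForwardDiff.gradient(cost, [1.0, 1.0, 0.1]) # gradient for a gradient-based optimizer
```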

@@ -89,7 +114,7 @@ To make the automatic gradient computation through the matrix exponential used i
```@example autodiff
using Optimization, Statistics, LinearAlgebra
using OptimizationGCMAES, ChainRules, ForwardDiffChainRules
@ForwardDiff_frule LinearAlgebra.exp!(x1::AbstractMatrix{<:ForwardDiff.Dual})
# @ForwardDiff_frule LinearAlgebra.exp!(x1::AbstractMatrix{<:ForwardDiff.Dual}) # https://github.com/ThummeTo/ForwardDiffChainRules.jl/pull/16

function plot_optimized(P, params, res)
    fig = plot(layout=(1,3), size=(1200,400), bottommargin=2Plots.mm)
@@ -148,47 +173,16 @@ plot_optimized(P, params, res.u)

The optimized controller achieves more or less the same low peak in the sensitivity function, but does so while *both* making the step responses significantly faster *and* using much less controller gain at high frequencies (the orange sensitivity function), an altogether better tuning. The only potentially negative effect of this tuning is that the overshoot in response to a reference step increased slightly, indicated also by the slightly higher peak in the complementary sensitivity function (green). However, the response to reference steps can (and most often should) be additionally shaped by reference pre-filtering (sometimes referred to as "feedforward" or "reference shaping"), introducing an additional filter in the feedforward path only, which allows elimination of the overshoot without affecting the closed-loop properties.
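A minimal sketch of such a prefilter (the first-order filter and its time constant are illustrative assumptions, with `P` the plant and `C` the tuned controller from above):
```julia
F   = tf(1, [0.5, 1])   # hypothetical first-order reference prefilter
Try = F * feedback(P*C) # shaped r → y response; feedback properties are unchanged
```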

### LQG controller
We could attempt a similar automatic tuning of an LQG controller. This time, we choose to optimize the weight matrices of the LQR problem and the state covariance matrix of the noise. The synthesis of an LQR controller involves the solution of a Riccati equation, which in turn involves performing a Schur decomposition. These steps are hard to differentiate through in a conventional way, but we can make use of implicit differentiation using the implicit function theorem. To do so, we load the package `ImplicitDifferentiation`, and defines the conditions that hold at the solution of the Riccati equation:
## Optimization-based tuning--LQG controller
We could attempt a similar automatic tuning of an LQG controller. This time, we choose to optimize the weight matrices of the LQR problem and the state covariance matrix of the noise. The synthesis of an LQR controller involves the solution of a Riccati equation, which in turn involves performing a Schur decomposition. These steps are hard to differentiate through in a conventional way, but we can make use of implicit differentiation using the implicit function theorem. To do so, we load the package `ImplicitDifferentiation`, and define the conditions that hold at the solution of the Riccati equation:
```math
A^TX + XA - XBR^{-1}B^T X + Q = 0
```
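By the implicit function theorem, if the left-hand side above is viewed as a residual ``g(p, X) = 0``, with ``p`` collecting the parameters ``A, B, Q, R``, the sensitivity of the solution ``X`` follows as
```math
\frac{\partial X}{\partial p} = -\left(\frac{\partial g}{\partial X}\right)^{-1} \frac{\partial g}{\partial p}
```
which is what `ImplicitDifferentiation` computes from the condition function, without differentiating through the Schur-based Riccati solver itself.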

We then define differentiable versions of [`lqr`](@ref) and [`kalman`](@ref) that make use of the "implicit function".
When `ImplicitDifferentiation` and `ComponentArrays` are loaded, differentiable versions of [`lqr`](@ref) and [`kalman`](@ref) that make use of the "implicit function" become available automatically.

```@example autodiff
using ImplicitDifferentiation

function forward_are(Q0; A,B,R)
    Q = reshape(Q0, size(A))
    ControlSystemsBase.are(Continuous, A, B, Q, R), 0
end

function conditions_are(Q, X, noneed; A,B,R)
    XB = X*B
    T = promote_type(eltype(Q), eltype(X))
    C = T.(Q)
    mul!(C,A',X,1,1)
    mul!(C,X,A,1,1)
    mul!(C, XB, R\XB', -1, 1)
end

const implicit_are = ImplicitFunction(forward_are, conditions_are)

function lqrdiff(P::AbstractStateSpace{Continuous}, Q, R)
    (; A, B) = P
    X0, _ = implicit_are(Q; A, B, R)
    X = X0 isa AbstractMatrix ? X0 : reshape(X0, size(A))
    R\(B'X)
end

function kalmandiff(P::AbstractStateSpace{Continuous}, Q, R)
    A = P.A'
    B = P.C'
    X0, _ = implicit_are(Q; A, B, R)
    X = X0 isa AbstractMatrix ? X0 : reshape(X0, size(A))
    (R\(B'X))'
end
using ImplicitDifferentiation, ComponentArrays # Both these packages are required to load the implicit differentiation rules
```

Since this is a SISO system, we do not need to tune the control-input matrix or the measurement covariance matrix; any non-unit weight assigned to those can be associated with the state matrices instead. Since these matrices are supposed to be positive semi-definite, we optimize Cholesky factors rather than the full matrices. We will also reformulate the problem slightly and use proper constraints to limit the sensitivity peak.
@@ -208,14 +202,14 @@ function triangular(x)
end
invtriangular(T) = [T[i,j] for i = 1:size(T,1) for j = i:size(T,1)]

function systemslqr(params, P)
function systemslqr(params::AbstractVector{T}, P) where T
    n2 = length(params) ÷ 2
    Qchol = triangular(params[1:n2])
    Rchol = triangular(params[n2+1:2n2])
    Q = Qchol'Qchol
    R = Rchol'Rchol
    L = lqrdiff(P, Q, I(1))
    K = kalmandiff(P, R, I(1))
    L = lqr(P, Q, one(T)*I(1)) # The last matrix must have the correct type
    K = kalman(P, R, one(T)*I(1)) # The last matrix must have the correct type
    C = observer_controller(P, L, K)
    G = extended_gangoffour(P, C) # [S PS; CS T]
    C, G
@@ -227,7 +221,7 @@ function systems(params, P)
end

function sim(G)
    Gd = c2d(G, 0.1, :zoh) # Discretize the system
    Gd = c2d(G, 0.1, :zoh)          # Discretize the system
    res1 = step(Gd[1, 1], 0:0.1:15) # Simulate S
    res2 = step(Gd[1, 2], 0:0.1:15) # Simulate PS
    res1, res2
@@ -277,7 +271,7 @@ plot_optimized(P, params, res.u)
This controller should perform better than the PID controller, which is known to be incapable of properly damping the resonance in a double-mass system. However, we did not include any integral action in the LQG controller, which has implications for the disturbance response, as indicated by the green step response in the simulation above.
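This can be verified directly; a short sketch reusing the names from the example above (`res.u` holds the optimized parameters):
```julia
C, G = systemslqr(res.u, P) # controller and gang-of-four for the optimized parameters
dcgain(G[1, 2])             # static gain d → e; nonzero since the controller lacks integral action
```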


#### Robustness analysis
### Robustness analysis
To check the robustness of the designed LQG controller w.r.t. parametric uncertainty in the plant, we load the package [`MonteCarloMeasurements`](https://github.com/baggepinnen/MonteCarloMeasurements.jl) and recreate the plant model with 20% uncertainty in the spring coefficient.
```@example autodiff
using MonteCarloMeasurements
@@ -298,4 +292,19 @@ r2 = step(Gd[1,2], 0:0.05:15) # Simulate PS
plot([r1, r2]; title="Time response",
    lab = [" \$r → e\$" " \$d → e\$"], legend=:bottomright,
    fillalpha=0.05, linealpha=0.8, seriestype=:path, c=[1 3], ri=false, N=32)
```

## Known limitations
The following issues are currently known to exist when using AD through ControlSystems.jl:

### ForwardDiff
[ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl) works for many workflows without any user intervention. The following known limitations exist:
- The function `svdvals` does not have a forward rule defined. This means that the functions [`sigma`](@ref) and `opnorm` will not work for MIMO systems with ForwardDiff. SISO, MISO and SIMO systems will, however, work.
- [`hinfnorm`](@ref) requires ImplicitDifferentiation.jl and ComponentArrays.jl to be manually loaded by the user, after which implicit differentiation rules are defined for [`hinfnorm`](@ref). The implicit rule calls `opnorm`, and is thus affected by the first limitation above for MIMO systems. [`hinfnorm`](@ref) has a reverse rule defined in RobustAndOptimalControl.jl, which is not affected by this limitation.
- [`are`](@ref), [`lqr`](@ref) and [`kalman`](@ref) all require ImplicitDifferentiation.jl and ComponentArrays.jl to be manually loaded by the user, after which implicit differentiation rules are defined. To invoke the correct method of these functions, it is important that the second matrix (corresponding to input or measurement) has the `Dual` number type, i.e., the `R` matrix in `lqr(P, Q, R)` or `lqr(Continuous, A, B, Q, R)`. A usage sketch is shown after this list.
- The `schur` factorization is not amenable to differentiation using ForwardDiff. This is the fundamental reason for requiring ImplicitDifferentiation.jl to differentiate through the Riccati-equation solver. `schur` is called in several additional places, including [`balreal`](@ref) and all [`lyap`](@ref) solvers. To make `schur` differentiable, an implicit differentiation rule would be required.
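A usage sketch of the `lqr` point above (system matrices are hypothetical; note how `R` is given the `Dual` element type via `one(T)*I(1)`):
```julia
using ControlSystemsBase, ForwardDiff, LinearAlgebra
using ImplicitDifferentiation, ComponentArrays # load the implicit rules

A = [0.0 1.0; -1.0 -0.5]
B = reshape([0.0, 1.0], 2, 1)
function lqr_gain(q::AbstractVector{T}) where T
    Q = diagm(q)
    vec(lqr(Continuous, A, B, Q, one(T)*I(1))) # R carries the Dual type, invoking the implicit rule
end
ForwardDiff.jacobian(lqr_gain, [1.0, 2.0]) # sensitivity of the LQR gain w.r.t. the state weights
```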

### Reverse-mode AD
- Zygote does not work very well at all, due to
  - frequent use of mutation for performance
  - try/catch blocks
13 changes: 12 additions & 1 deletion lib/ControlSystemsBase/Project.toml
@@ -23,10 +23,18 @@ SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
UUIDs = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"

[weakdeps]
ComponentArrays = "b0b7db55-cfe3-40fc-9ded-d10e2dbeff66"
ImplicitDifferentiation = "57b37032-215b-411a-8a7c-41a003a55207"

[extensions]
ControlSystemsBaseImplicitDifferentiationExt = ["ImplicitDifferentiation", "ComponentArrays"]

[compat]
Aqua = "0.5"
DSP = "0.6.1, 0.7"
ForwardDiff = "0.10"
ImplicitDifferentiation = "0.4.2"
IterTools = "1.0"
LaTeXStrings = "1.0"
MacroTools = "0.5"
@@ -41,10 +49,13 @@ julia = "1.6"

[extras]
Aqua = "4c88cf16-eb10-579e-8560-4a9242c79595"
ComponentArrays = "b0b7db55-cfe3-40fc-9ded-d10e2dbeff66"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
FiniteDifferences = "26cc04aa-876d-5657-8c51-4c34ba976000"
GR = "28b8d3ca-fb5f-59d9-8090-bfdbd6d07a71"
ImplicitDifferentiation = "57b37032-215b-411a-8a7c-41a003a55207"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
test = ["Test", "Aqua", "Documenter", "GR", "Plots"]
test = ["Test", "Aqua", "ComponentArrays", "Documenter", "FiniteDifferences", "ImplicitDifferentiation", "GR", "Plots"]
185 changes: 185 additions & 0 deletions lib/ControlSystemsBase/ext/ControlSystemsBaseImplicitDifferentiationExt.jl
@@ -0,0 +1,185 @@
module ControlSystemsBaseImplicitDifferentiationExt
using ControlSystemsBase
import ControlSystemsBase: kalman, lqr, are, ContinuousType, DiscreteType
using ForwardDiff
using ForwardDiff: Dual
using ImplicitDifferentiation
using ComponentArrays
using LinearAlgebra


function forward_arec(pars)
    (; A,B,Q,R) = pars
    ControlSystemsBase.are(Continuous, A, B, Q, R), 0
end

function conditions_arec(pars, X, noneed)
    (; A,B,Q,R) = pars
    C = A'X
    C .+= C'                  # C = A'X + XA (X is symmetric)
    C .+= Q                   # C = A'X + XA + Q
    XB = X*B
    mul!(C, XB, R\XB', -1, 1) # C -= XB*inv(R)*XB'; zero at the ARE solution
end

const implicit_arec = ImplicitFunction(forward_arec, conditions_arec)


"""
are(::Continuous, A, B, Q, R::AbstractMatrix{<:Dual}; kwargs)

To make the ARE solver work with dual numbers, make sure that the `R` matrix has the dual number element type.
"""
function are(::ContinuousType, A::AbstractMatrix, B, Q, R::AbstractMatrix{<:Dual}; kwargs...)
pars = ComponentVector(; A,B,Q,R)
X0, _ = implicit_arec(pars)
X = X0 isa AbstractMatrix ? X0 : reshape(X0, size(A))
X
end

"""
lqr(::Continuous, A, B, Q, R::AbstractMatrix{<:Dual})

To make the LQR solver work with dual numbers, make sure that the `R` matrix has the dual number element type.
"""
function lqr(::ContinuousType, A, B, Q, R::AbstractMatrix{<:Dual})
X = are(Continuous, A, B, Q, R)
R\(B'X)
end

"""
kalman(::Continuous, A, C, Q, R::AbstractMatrix{<:Dual})

To make the Kalman solver work with dual numbers, make sure that the `R` matrix has the dual number element type.
"""
function kalman(::ContinuousType, A, C, Q, R::AbstractMatrix{<:Dual})
X = are(Continuous, A', C', Q, R)
(R\(C*X))'
end



## Discrete


function forward_ared(pars)
    (; A,B,Q,R) = pars
    ControlSystemsBase.are(Discrete, A, B, Q, R), 0
end

function conditions_ared(pars, X, noneed)
    # A'XA - X - (A'XB+S)(R+B'XB)^(-1)(B'XA+S') + Q = 0
    # A'X*A - X - (A'X*B)*((R+B'X*B)\(B'X*A)) + Q
    (; A,B,Q,R) = pars
    AX = A'X
    C = AX*A
    C .+= Q .- X
    AXB = AX*B
    C .-= AXB*(((B'X*B) .+= R)\AXB') # (B'X*B) .+= R reuses the temporary to form R + B'XB
    C
end

const implicit_ared = ImplicitFunction(forward_ared, conditions_ared)


"""
are(::Discrete, A, B, Q, R::AbstractMatrix{<:Dual}; kwargs)

To make the ARE solver work with dual numbers, make sure that the `R` matrix has the dual number element type.
"""
function are(::DiscreteType, A::AbstractMatrix, B, Q, R::AbstractMatrix{<:Dual}; kwargs...)
pars = ComponentVector(; A,B,Q,R)
X0, _ = implicit_ared(pars)
X = X0 isa AbstractMatrix ? X0 : reshape(X0, size(A))
X
end

"""
lqr(::Discrete, A, B, Q, R::AbstractMatrix{<:Dual})

To make the LQR solver work with dual numbers, make sure that the `R` matrix has the dual number element type.
"""
function lqr(::DiscreteType, A, B, Q, R::AbstractMatrix{<:Dual})
X = are(Discrete, A, B, Q, R)
BX = B'X
(R+BX*B)\(BX*A)
end

"""
kalman(::Discrete, A, C, Q, R::AbstractMatrix{<:Dual})

To make the Kalman solver work with dual numbers, make sure that the `R` matrix has the dual number element type.
"""
function kalman(::DiscreteType, A, C, Q, R::AbstractMatrix{<:Dual})
X = are(Discrete, A', C', Q, R)
CX = C*X
(R+CX*B)\(CX*A)
end


## Hinf norm
import ControlSystemsBase: hinfnorm
function forward_hinfnorm(pars; kwargs...)
    (; A,B,C,D) = pars
    sys = ss(A,B,C,D)
    hinfnorm(sys; kwargs...) # returns (γ, w): the norm and the peak frequency
end

function conditions_hinfnorm(pars, γ, w; tol=1e-10)
    (; A,B,C,D) = pars
    sys = ss(A,B,C,D)
    [opnorm(freqresp(sys, w)) - γ] # at the peak frequency w, the gain equals γ
end

const implicit_hinfnorm = ImplicitFunction(forward_hinfnorm, conditions_hinfnorm)

"""
hinfnorm(sys::StateSpace{Continuous, <:Dual}; kwargs)

The H∞ norm can be differentiated through using ForwardDiff.jl, but at the time of writing, is limited to systems with *either* a signel input *or* a signle output.

A reverse-differention rule is defined in RobustAndOptimalControl.jl, which means that hinfnorm is differentiable using, e.g., Zygote in reverse mode.
"""
function hinfnorm(sys::StateSpace{Continuous, <:Dual}; kwargs...)
A,B,C,D = ssdata(sys)
pars = ComponentVector(; A,B,C,D)
γ, w = implicit_hinfnorm(pars)
γ, w
end
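# Usage sketch (hypothetical; assumes ForwardDiff is loaded and the system is SISO,
# per the limitation noted in the docstring above):
#
#   ForwardDiff.derivative(1.0) do k
#       sys = ss([-1.0 0.0; 0.0 -2.0], reshape([1.0, k], 2, 1), [1.0 1.0], 0)
#       hinfnorm(sys)[1] # differentiate the peak gain w.r.t. the input gain k
#   end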


# ## Schur, currently not working when the matrix A has complex eigenvalues.
# Sec 4.2 in "A PROCEDURE FOR DIFFERENTIATING PERFECT-FORESIGHT-MODEL REDUCED-FORM COEFFICIENTS", Gary ANDERSON has a formula for the derivative, but it looks rather expensive to compute, involving the factorization of a rather large Kronecker matrix. This factorization only has to be done once, though, since it does not depend on the partials.
# function forward_schur(A)
# F = schur(A)
# ComponentVector(; F.Z, F.T), F
# end

# function conditions_schur(A, F, noneed)
# (; Z, T) = F
# [
# vec(Z' * A * Z - T);
# vec(Z' * Z - I + LowerTriangular(T) - Diagonal(T))
# ]
# end

# linear_solver = (A, b) -> (Matrix(A) \ b, (solved=true,))
# const implicit_schur = ImplicitFunction(forward_schur, conditions_schur, linear_solver)

# # vectors = Z
# # Schur = T
# # A = F.vectors * F.Schur * F.vectors'
# # A = Z * T * Z'
# function LinearAlgebra.schur(A::AbstractMatrix{<:Dual})
# ZT, F = implicit_schur(A)
# n = length(A)
# Z = reshape(ZT[1:n], size(A))
# T = reshape(ZT[n+1:end], size(A))
# Schur(T, Z, F.values)
# end


end # module
