lqcontrol¶
- class quantecon.lqcontrol.LQ(Q, R, A, B, C=None, N=None, beta=1, T=None, Rf=None)[source]¶
Bases:
object
This class is for analyzing linear quadratic optimal control problems of either the infinite horizon form
\[\min \mathbb{E} \Big[ \sum_{t=0}^{\infty} \beta^t r(x_t, u_t) \Big]\]with
\[r(x_t, u_t) := x_t' R x_t + u_t' Q u_t + 2 u_t' N x_t\]or the finite horizon form
\[\min \mathbb{E} \Big[ \sum_{t=0}^{T-1} \beta^t r(x_t, u_t) + \beta^T x_T' R_f x_T \Big]\]Both are minimized subject to the law of motion
\[x_{t+1} = A x_t + B u_t + C w_{t+1}\]Here \(x\) is n x 1, \(u\) is k x 1, \(w\) is j x 1 and the matrices are conformable for these dimensions. The sequence \({w_t}\) is assumed to be white noise, with zero mean and \(\mathbb{E} [ w_t' w_t ] = I\), the j x j identity.
If \(C\) is not supplied as a parameter, the model is assumed to be deterministic (and \(C\) is set to a zero matrix of appropriate dimension).
For this model, the time t value (i.e., cost-to-go) function \(V_t\) takes the form
\[x' P_T x + d_T\]and the optimal policy is of the form \(u_T = -F_T x_T\). In the infinite horizon case, \(V, P, d\) and \(F\) are all stationary.
- Parameters:
- Qarray_like(float)
Q is the payoff (or cost) matrix that corresponds with the control variable u and is k x k. Should be symmetric and non-negative definite
- Rarray_like(float)
R is the payoff (or cost) matrix that corresponds with the state variable x and is n x n. Should be symmetric and non-negative definite
- Aarray_like(float)
A is part of the state transition as described above. It should be n x n
- Barray_like(float)
B is part of the state transition as described above. It should be n x k
- Carray_like(float), optional(default=None)
C is part of the state transition as described above and corresponds to the random variable today. If the model is deterministic then C should take default value of None
- Narray_like(float), optional(default=None)
N is the cross product term in the payoff, as above. It should be k x n.
- betascalar(float), optional(default=1)
beta is the discount parameter
- Tscalar(int), optional(default=None)
T is the number of periods in a finite horizon problem.
- Rfarray_like(float), optional(default=None)
Rf is the final (in a finite horizon model) payoff(or cost) matrix that corresponds with the control variable u and is n x n. Should be symmetric and non-negative definite
- Attributes:
- Q, R, N, A, B, C, beta, T, Rfsee Parameters
- Parray_like(float)
P is part of the value function representation of \(V(x) = x'Px + d\)
- darray_like(float)
d is part of the value function representation of \(V(x) = x'Px + d\)
- Farray_like(float)
F is the policy rule that determines the choice of control in each period.
- k, n, jscalar(int)
The dimensions of the matrices as presented above
Methods
compute_sequence
(x0[, ts_length, method, ...])Compute and return the optimal state and control sequences \(x_0, ..., x_T\) and \(u_0,..., u_T\) under the assumption that \({w_t}\) is iid and \(N(0, 1)\).
stationary_values
([method])Computes the matrix \(P\) and scalar \(d\) that represent the value function
This method is for updating in the finite horizon case.
- compute_sequence(x0, ts_length=None, method='doubling', random_state=None)[source]¶
Compute and return the optimal state and control sequences \(x_0, ..., x_T\) and \(u_0,..., u_T\) under the assumption that \({w_t}\) is iid and \(N(0, 1)\).
- Parameters:
- x0array_like(float)
The initial state, a vector of length n
- ts_lengthscalar(int)
Length of the simulation – defaults to T in finite case
- methodstr, optional(default=’doubling’)
Solution method used in solving the associated Riccati equation, str in {‘doubling’, ‘qz’}. Only relevant when the T attribute is None (i.e., the horizon is infinite).
- random_stateint or np.random.RandomState/Generator, optional
Random seed (integer) or np.random.RandomState or Generator instance to set the initial state of the random number generator for reproducibility. If None, a randomly initialized RandomState is used.
- Returns:
- x_patharray_like(float)
An n x T+1 matrix, where the t-th column represents \(x_t\)
- u_patharray_like(float)
A k x T matrix, where the t-th column represents \(u_t\)
- w_patharray_like(float)
A j x T+1 matrix, where the t-th column represent \(w_t\)
- stationary_values(method='doubling')[source]¶
Computes the matrix \(P\) and scalar \(d\) that represent the value function
\[V(x) = x' P x + d\]in the infinite horizon case. Also computes the control matrix \(F\) from \(u = - Fx\). Computation is via the solution algorithm as specified by the method option (default to the doubling algorithm) (see the documentation in matrix_eqn.solve_discrete_riccati).
- Parameters:
- methodstr, optional(default=’doubling’)
Solution method used in solving the associated Riccati equation, str in {‘doubling’, ‘qz’}.
- Returns:
- Parray_like(float)
P is part of the value function representation of \(V(x) = x'Px + d\)
- Farray_like(float)
F is the policy rule that determines the choice of control in each period.
- darray_like(float)
d is part of the value function representation of \(V(x) = x'Px + d\)
- update_values()[source]¶
This method is for updating in the finite horizon case. It shifts the current value function
\[V_t(x) = x' P_t x + d_t\]and the optimal policy \(F_t\) one step back in time, replacing the pair \(P_t\) and \(d_t\) with \(P_{t-1}\) and \(d_{t-1}\), and \(F_t\) with \(F_{t-1}\)
- class quantecon.lqcontrol.LQMarkov(Π, Qs, Rs, As, Bs, Cs=None, Ns=None, beta=1)[source]¶
Bases:
object
This class is for analyzing Markov jump linear quadratic optimal control problems of the infinite horizon form
\[\min \mathbb{E} \Big[ \sum_{t=0}^{\infty} \beta^t r(x_t, s_t, u_t) \Big]\]with
\[r(x_t, s_t, u_t) := (x_t' R(s_t) x_t + u_t' Q(s_t) u_t + 2 u_t' N(s_t) x_t)\]subject to the law of motion
\[x_{t+1} = A(s_t) x_t + B(s_t) u_t + C(s_t) w_{t+1}\]Here \(x\) is n x 1, \(u\) is k x 1, \(w\) is j x 1 and the matrices are conformable for these dimensions. The sequence \({w_t}\) is assumed to be white noise, with zero mean and \(\mathbb{E} [ w_t' w_t ] = I\), the j x j identity.
If \(C\) is not supplied as a parameter, the model is assumed to be deterministic (and \(C\) is set to a zero matrix of appropriate dimension).
The optimal value function \(V(x_t, s_t)\) takes the form
\[x_t' P(s_t) x_t + d(s_t)\]and the optimal policy is of the form \(u_t = -F(s_t) x_t\).
- Parameters:
- Πarray_like(float, ndim=2)
The Markov chain transition matrix with dimension m x m.
- Qsarray_like(float)
Consists of m symmetric and non-negative definite payoff matrices Q(s) with dimension k x k that corresponds with the control variable u for each Markov state s
- Rsarray_like(float)
Consists of m symmetric and non-negative definite payoff matrices R(s) with dimension n x n that corresponds with the state variable x for each Markov state s
- Asarray_like(float)
Consists of m state transition matrices A(s) with dimension n x n for each Markov state s
- Bsarray_like(float)
Consists of m state transition matrices B(s) with dimension n x k for each Markov state s
- Csarray_like(float), optional(default=None)
Consists of m state transition matrices C(s) with dimension n x j for each Markov state s. If the model is deterministic then Cs should take default value of None
- Nsarray_like(float), optional(default=None)
Consists of m cross product term matrices N(s) with dimension k x n for each Markov state,
- betascalar(float), optional(default=1)
beta is the discount parameter
- Attributes:
- Π, Qs, Rs, Ns, As, Bs, Cs, betasee Parameters
- Psarray_like(float)
Ps is part of the value function representation of \(V(x, s) = x' P(s) x + d(s)\)
- dsarray_like(float)
ds is part of the value function representation of \(V(x, s) = x' P(s) x + d(s)\)
- Fsarray_like(float)
Fs is the policy rule that determines the choice of control in each period at each Markov state
- mscalar(int)
The number of Markov states
- k, n, jscalar(int)
The dimensions of the matrices as presented above
Methods
compute_sequence
(x0[, ts_length, random_state])Compute and return the optimal state and control sequences \(x_0, ..., x_T\) and \(u_0,..., u_T\) under the assumption that \({w_t}\) is iid and \(N(0, 1)\), with Markov states sequence \(s_0, ..., s_T\)
stationary_values
([max_iter])Computes the matrix \(P(s)\) and scalar \(d(s)\) that represent the value function
- compute_sequence(x0, ts_length=None, random_state=None)[source]¶
Compute and return the optimal state and control sequences \(x_0, ..., x_T\) and \(u_0,..., u_T\) under the assumption that \({w_t}\) is iid and \(N(0, 1)\), with Markov states sequence \(s_0, ..., s_T\)
- Parameters:
- x0array_like(float)
The initial state, a vector of length n
- ts_lengthscalar(int), optional(default=None)
Length of the simulation. If None, T is set to be 100
- random_stateint or np.random.RandomState/Generator, optional
Random seed (integer) or np.random.RandomState or Generator instance to set the initial state of the random number generator for reproducibility. If None, a randomly initialized RandomState is used.
- Returns:
- x_patharray_like(float)
An n x T+1 matrix, where the t-th column represents \(x_t\)
- u_patharray_like(float)
A k x T matrix, where the t-th column represents \(u_t\)
- w_patharray_like(float)
A j x T+1 matrix, where the t-th column represent \(w_t\)
- statearray_like(int)
Array containing the state values \(s_t\) of the sample path
- stationary_values(max_iter=1000)[source]¶
Computes the matrix \(P(s)\) and scalar \(d(s)\) that represent the value function
\[V(x, s) = x' P(s) x + d(s)\]in the infinite horizon case. Also computes the control matrix \(F\) from \(u = - F(s) x\).
- Parameters:
- max_iterscalar(int), optional(default=1000)
The maximum number of iterations allowed
- Returns:
- Psarray_like(float)
Ps is part of the value function representation of \(V(x, s) = x' P(s) x + d(s)\)
- dsarray_like(float)
ds is part of the value function representation of \(V(x, s) = x' P(s) x + d(s)\)
- Fsarray_like(float)
Fs is the policy rule that determines the choice of control in each period at each Markov state