# lqcontrol¶

Provides a class called LQ for solving linear quadratic control problems, and a class called LQMarkov for solving Markov jump linear quadratic control problems.

class quantecon.lqcontrol.LQ(Q, R, A, B, C=None, N=None, beta=1, T=None, Rf=None)[source]

Bases: object

This class is for analyzing linear quadratic optimal control problems of either the infinite horizon form

$\min \mathbb{E} \Big[ \sum_{t=0}^{\infty} \beta^t r(x_t, u_t) \Big]$

with

$r(x_t, u_t) := x_t' R x_t + u_t' Q u_t + 2 u_t' N x_t$

or the finite horizon form

$\min \mathbb{E} \Big[ \sum_{t=0}^{T-1} \beta^t r(x_t, u_t) + \beta^T x_T' R_f x_T \Big]$

Both are minimized subject to the law of motion

$x_{t+1} = A x_t + B u_t + C w_{t+1}$

Here $$x$$ is n x 1, $$u$$ is k x 1, $$w$$ is j x 1 and the matrices are conformable for these dimensions. The sequence $${w_t}$$ is assumed to be white noise, with zero mean and $$\mathbb{E} [ w_t' w_t ] = I$$, the j x j identity.

If $$C$$ is not supplied as a parameter, the model is assumed to be deterministic (and $$C$$ is set to a zero matrix of appropriate dimension).

For this model, the time t value (i.e., cost-to-go) function $$V_t$$ takes the form

$x' P_T x + d_T$

and the optimal policy is of the form $$u_T = -F_T x_T$$. In the infinite horizon case, $$V, P, d$$ and $$F$$ are all stationary.

Parameters: Q : array_like(float) Q is the payoff (or cost) matrix that corresponds with the control variable u and is k x k. Should be symmetric and non-negative definite R : array_like(float) R is the payoff (or cost) matrix that corresponds with the state variable x and is n x n. Should be symmetric and non-negative definite A : array_like(float) A is part of the state transition as described above. It should be n x n B : array_like(float) B is part of the state transition as described above. It should be n x k C : array_like(float), optional(default=None) C is part of the state transition as described above and corresponds to the random variable today. If the model is deterministic then C should take default value of None N : array_like(float), optional(default=None) N is the cross product term in the payoff, as above. It should be k x n. beta : scalar(float), optional(default=1) beta is the discount parameter T : scalar(int), optional(default=None) T is the number of periods in a finite horizon problem. Rf : array_like(float), optional(default=None) Rf is the final (in a finite horizon model) payoff(or cost) matrix that corresponds with the control variable u and is n x n. Should be symmetric and non-negative definite Q, R, N, A, B, C, beta, T, Rf : see Parameters P : array_like(float) P is part of the value function representation of $$V(x) = x'Px + d$$ d : array_like(float) d is part of the value function representation of $$V(x) = x'Px + d$$ F : array_like(float) F is the policy rule that determines the choice of control in each period. k, n, j : scalar(int) The dimensions of the matrices as presented above

Methods

 compute_sequence(x0[, ts_length, method, …]) Compute and return the optimal state and control sequences $$x_0, ..., x_T$$ and $$u_0,..., u_T$$ under the assumption that $${w_t}$$ is iid and $$N(0, 1)$$. stationary_values([method]) Computes the matrix $$P$$ and scalar $$d$$ that represent the value function update_values() This method is for updating in the finite horizon case.
compute_sequence(x0, ts_length=None, method='doubling', random_state=None)[source]

Compute and return the optimal state and control sequences $$x_0, ..., x_T$$ and $$u_0,..., u_T$$ under the assumption that $${w_t}$$ is iid and $$N(0, 1)$$.

Parameters: x0 : array_like(float) The initial state, a vector of length n ts_length : scalar(int) Length of the simulation – defaults to T in finite case method : str, optional(default=’doubling’) Solution method used in solving the associated Riccati equation, str in {‘doubling’, ‘qz’}. Only relevant when the T attribute is None (i.e., the horizon is infinite). random_state : int or np.random.RandomState, optional Random seed (integer) or np.random.RandomState instance to set the initial state of the random number generator for reproducibility. If None, a randomly initialized RandomState is used. x_path : array_like(float) An n x T+1 matrix, where the t-th column represents $$x_t$$ u_path : array_like(float) A k x T matrix, where the t-th column represents $$u_t$$ w_path : array_like(float) A j x T+1 matrix, where the t-th column represent $$w_t$$
stationary_values(method='doubling')[source]

Computes the matrix $$P$$ and scalar $$d$$ that represent the value function

$V(x) = x' P x + d$

in the infinite horizon case. Also computes the control matrix $$F$$ from $$u = - Fx$$. Computation is via the solution algorithm as specified by the method option (default to the doubling algorithm) (see the documentation in matrix_eqn.solve_discrete_riccati).

Parameters: method : str, optional(default=’doubling’) Solution method used in solving the associated Riccati equation, str in {‘doubling’, ‘qz’}. P : array_like(float) P is part of the value function representation of $$V(x) = x'Px + d$$ F : array_like(float) F is the policy rule that determines the choice of control in each period. d : array_like(float) d is part of the value function representation of $$V(x) = x'Px + d$$
update_values()[source]

This method is for updating in the finite horizon case. It shifts the current value function

$V_t(x) = x' P_t x + d_t$

and the optimal policy $$F_t$$ one step back in time, replacing the pair $$P_t$$ and $$d_t$$ with $$P_{t-1}$$ and $$d_{t-1}$$, and $$F_t$$ with $$F_{t-1}$$

class quantecon.lqcontrol.LQMarkov(Π, Qs, Rs, As, Bs, Cs=None, Ns=None, beta=1)[source]

Bases: object

This class is for analyzing Markov jump linear quadratic optimal control problems of the infinite horizon form

$\min \mathbb{E} \Big[ \sum_{t=0}^{\infty} \beta^t r(x_t, s_t, u_t) \Big]$

with

$r(x_t, s_t, u_t) := (x_t' R(s_t) x_t + u_t' Q(s_t) u_t + 2 u_t' N(s_t) x_t)$

subject to the law of motion

$x_{t+1} = A(s_t) x_t + B(s_t) u_t + C(s_t) w_{t+1}$

Here $$x$$ is n x 1, $$u$$ is k x 1, $$w$$ is j x 1 and the matrices are conformable for these dimensions. The sequence $${w_t}$$ is assumed to be white noise, with zero mean and $$\mathbb{E} [ w_t' w_t ] = I$$, the j x j identity.

If $$C$$ is not supplied as a parameter, the model is assumed to be deterministic (and $$C$$ is set to a zero matrix of appropriate dimension).

The optimal value function $$V(x_t, s_t)$$ takes the form

$x_t' P(s_t) x_t + d(s_t)$

and the optimal policy is of the form $$u_t = -F(s_t) x_t$$.

Parameters: Π : array_like(float, ndim=2) The Markov chain transition matrix with dimension m x m. Qs : array_like(float) Consists of m symmetric and non-negative definite payoff matrices Q(s) with dimension k x k that corresponds with the control variable u for each Markov state s Rs : array_like(float) Consists of m symmetric and non-negative definite payoff matrices R(s) with dimension n x n that corresponds with the state variable x for each Markov state s As : array_like(float) Consists of m state transition matrices A(s) with dimension n x n for each Markov state s Bs : array_like(float) Consists of m state transition matrices B(s) with dimension n x k for each Markov state s Cs : array_like(float), optional(default=None) Consists of m state transition matrices C(s) with dimension n x j for each Markov state s. If the model is deterministic then Cs should take default value of None Ns : array_like(float), optional(default=None) Consists of m cross product term matrices N(s) with dimension k x n for each Markov state, beta : scalar(float), optional(default=1) beta is the discount parameter Π, Qs, Rs, Ns, As, Bs, Cs, beta : see Parameters Ps : array_like(float) Ps is part of the value function representation of $$V(x, s) = x' P(s) x + d(s)$$ ds : array_like(float) ds is part of the value function representation of $$V(x, s) = x' P(s) x + d(s)$$ Fs : array_like(float) Fs is the policy rule that determines the choice of control in each period at each Markov state m : scalar(int) The number of Markov states k, n, j : scalar(int) The dimensions of the matrices as presented above

Methods

 compute_sequence(x0[, ts_length, random_state]) Compute and return the optimal state and control sequences $$x_0, ..., x_T$$ and $$u_0,..., u_T$$ under the assumption that $${w_t}$$ is iid and $$N(0, 1)$$, with Markov states sequence $$s_0, ..., s_T$$ stationary_values([max_iter]) Computes the matrix $$P(s)$$ and scalar $$d(s)$$ that represent the value function
compute_sequence(x0, ts_length=None, random_state=None)[source]

Compute and return the optimal state and control sequences $$x_0, ..., x_T$$ and $$u_0,..., u_T$$ under the assumption that $${w_t}$$ is iid and $$N(0, 1)$$, with Markov states sequence $$s_0, ..., s_T$$

Parameters: x0 : array_like(float) The initial state, a vector of length n ts_length : scalar(int), optional(default=None) Length of the simulation. If None, T is set to be 100 random_state : int or np.random.RandomState, optional Random seed (integer) or np.random.RandomState instance to set the initial state of the random number generator for reproducibility. If None, a randomly initialized RandomState is used. x_path : array_like(float) An n x T+1 matrix, where the t-th column represents $$x_t$$ u_path : array_like(float) A k x T matrix, where the t-th column represents $$u_t$$ w_path : array_like(float) A j x T+1 matrix, where the t-th column represent $$w_t$$ state : array_like(int) Array containing the state values $$s_t$$ of the sample path
stationary_values(max_iter=1000)[source]

Computes the matrix $$P(s)$$ and scalar $$d(s)$$ that represent the value function

$V(x, s) = x' P(s) x + d(s)$

in the infinite horizon case. Also computes the control matrix $$F$$ from $$u = - F(s) x$$.

Parameters: max_iter : scalar(int), optional(default=1000) The maximum number of iterations allowed Ps : array_like(float) Ps is part of the value function representation of $$V(x, s) = x' P(s) x + d(s)$$ ds : array_like(float) ds is part of the value function representation of $$V(x, s) = x' P(s) x + d(s)$$ Fs : array_like(float) Fs is the policy rule that determines the choice of control in each period at each Markov state