lqcontrol

Filename: lqcontrol.py

Authors: Thomas J. Sargent, John Stachurski

Provides a class called LQ for solving linear quadratic control problems.

class quantecon.lqcontrol.LQ(Q, R, A, B, C=None, N=None, beta=1, T=None, Rf=None)[source]

Bases: object

This class is for analyzing linear quadratic optimal control problems of either the infinite horizon form

. min E sum_{t=0}^{infty} beta^t r(x_t, u_t)

with

r(x_t, u_t) := x_t’ R x_t + u_t’ Q u_t + 2 u_t’ N x_t

or the finite horizon form

min E sum_{t=0}^{T-1} beta^t r(x_t, u_t) + beta^T x_T’ R_f x_T

Both are minimized subject to the law of motion

x_{t+1} = A x_t + B u_t + C w_{t+1}

Here x is n x 1, u is k x 1, w is j x 1 and the matrices are conformable for these dimensions. The sequence {w_t} is assumed to be white noise, with zero mean and E w_t w_t = I, the j x j identity.

If C is not supplied as a parameter, the model is assumed to be deterministic (and C is set to a zero matrix of appropriate dimension).

For this model, the time t value (i.e., cost-to-go) function V_t takes the form

x’ P_T x + d_T

and the optimal policy is of the form u_T = -F_T x_T. In the infinite horizon case, V, P, d and F are all stationary.

Parameters:

Q : array_like(float)

Q is the payoff(or cost) matrix that corresponds with the control variable u and is k x k. Should be symmetric and nonnegative definite

R : array_like(float)

R is the payoff(or cost) matrix that corresponds with the state variable x and is n x n. Should be symetric and non-negative definite

N : array_like(float)

N is the cross product term in the payoff, as above. It should be k x n.

A : array_like(float)

A is part of the state transition as described above. It should be n x n

B : array_like(float)

B is part of the state transition as described above. It should be n x k

C : array_like(float), optional(default=None)

C is part of the state transition as described above and corresponds to the random variable today. If the model is deterministic then C should take default value of None

beta : scalar(float), optional(default=1)

beta is the discount parameter

T : scalar(int), optional(default=None)

T is the number of periods in a finite horizon problem.

Rf : array_like(float), optional(default=None)

Rf is the final (in a finite horizon model) payoff(or cost) matrix that corresponds with the control variable u and is n x n. Should be symetric and non-negative definite

Attributes

Q, R, N, A, B, C, beta, T, Rf (see Parameters)
P (array_like(float)) P is part of the value function representation of V(x) = x’Px + d
d (array_like(float)) d is part of the value function representation of V(x) = x’Px + d
F (array_like(float)) F is the policy rule that determines the choice of control in each period.
k, n, j (scalar(int)) The dimensions of the matrices as presented above

Methods

compute_sequence(x0[, ts_length]) Compute and return the optimal state and control sequences x_0, ..., x_T and u_0,..., u_T under the assumption that {w_t} is iid and N(0, 1).
stationary_values() Computes the matrix P and scalar d that represent the value
update_values() This method is for updating in the finite horizon case.
compute_sequence(x0, ts_length=None)[source]

Compute and return the optimal state and control sequences x_0, ..., x_T and u_0,..., u_T under the assumption that {w_t} is iid and N(0, 1).

Parameters:

x0 : array_like(float)

The initial state, a vector of length n

ts_length : scalar(int)

Length of the simulation – defaults to T in finite case

Returns:

x_path : array_like(float)

An n x T matrix, where the t-th column represents x_t

u_path : array_like(float)

A k x T matrix, where the t-th column represents u_t

w_path : array_like(float)

A j x T matrix, where the t-th column represent w_t

stationary_values()[source]

Computes the matrix P and scalar d that represent the value function

V(x) = x’ P x + d

in the infinite horizon case. Also computes the control matrix F from u = - Fx

Returns:

P : array_like(float)

P is part of the value function representation of V(x) = xPx + d

F : array_like(float)

F is the policy rule that determines the choice of control in each period.

d : array_like(float)

d is part of the value function representation of V(x) = xPx + d

update_values()[source]

This method is for updating in the finite horizon case. It shifts the current value function

V_t(x) = x’ P_t x + d_t

and the optimal policy F_t one step back in time, replacing the pair P_t and d_t with P_{t-1} and d_{t-1}, and F_t with F_{t-1}