The seminal paper by Barles and Souganidis (1991) provides a powerful framework for analyzing a class of 2nd-order non-linear PDEs that are commonly used in economics. Among the required conditions, the monotonicity of the numerical scheme is critical and often violated in practice. This blog post explains how to analyze the monotonicity of a wide class of numerical schemes, with multiple examples that share the structure of commonly used economic models. Readers can follow the examples to design their own numerical schemes.
Notation
A class of 2nd-order non-linear PDEs
Before discussing the monotonicity of numerical schemes, let’s set up the problem and introduce notation. Consider a generic non-linear PDE abstracted from an infinite-horizon stochastic optimal control problem with discounting:
where:
is the vector of state variables where represents the -th dimension element of the vector
is the drift function
is the volatility function that is an element of the volatility matrix
are the Jacobian and Hessian of function
is the flow payoff/utility function
is the discounting rate
is a 1-dimensional Brownian motion
is a collection of boundary conditions as functionals
Remark:
A generic volatility matrix has one row per state variable and one column per dimension of the Brownian motion. WLOG, the above formulation assumes a square (full-size) volatility matrix such that the dimension of the Brownian motion is exactly the dimension of the state vector. Neither a “wide” nor a “long” volatility matrix changes the conclusion.
If is a functional and includes infinite-dimensional objects (e.g. distribution function), then the conclusion of this post still holds. The above formulation can be seen as a finite-moment approximation of such problems.
In the remainder of this post, long function arguments are omitted when they do not affect the interpretation. Meanwhile, for convenience, let’s define the following clamping functions:
Discretization & grid
To numerically solve this PDE, we need to discretize it over a hyper-rectangle . In the case of linear state constraints (such as borrowing constraints), one can always rectangularize the computational domain by redefining the state variables. We do not discuss state constraints that cannot be rectangularized.
A standard evenly spaced dense grid can be defined by the number of grid points along each dimension: . The total number of grid points is . To index a specific grid point, we use a multi-index where for each . Let be a unit vector whose elements are all zero except the -th element, which is 1. Let be the grid mesh size of dimension . Thus, the first right neighbor along dimension of the current grid point can be denoted as .
For a more generic type of grid representation, we use a finite set of discrete supporting nodes . We will discuss generic interpolation supported by .
Stacking
To make the linear algebra used later well defined, we need an ordering of the elements in . For the evenly spaced dense grid , the stacking is done via the Cartesian product. This leads to a lexicographic sorting of the multi-indices . A vector of the sorted multi-indices is denoted by . For the more generic type of grid, one can always fix an order for the stacking; we use the same notation .
Remark: an alternative representation of the stacking is a vector of supporting nodes rather than “set + sorting”. The reason for the split here is: 1. to conveniently represent neighboring nodes in the state space; 2. to accommodate the fact that users must manually define the sorting and keep it consistent in their computer programs.
In general, a grid (structure) can be represented by a 2-tuple of the supporting node set and the stacking order.
Similarly, the stacking of function values at the grid nodes, sorted by , can be represented by , or a vector .
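To make the stacking concrete, here is a minimal NumPy sketch (illustrative names only, not code from any particular package) of the “supporting nodes + stacking order” idea: an evenly spaced grid, a lexicographic (row-major) flattening of the multi-indices, and the indexing of a right neighbor along one dimension.

```python
# A minimal sketch of "grid = supporting nodes + stacking order" for an evenly spaced
# dense grid, using NumPy's C-order (lexicographic) flattening as the stacking rule.
import numpy as np

# number of nodes per dimension and the hyper-rectangle [lo_d, hi_d] in each dimension
shape = (4, 3)                        # N_1 = 4, N_2 = 3, so N = 12 nodes in total
lo, hi = np.array([0.0, -1.0]), np.array([1.0, 1.0])

# supporting nodes along each dimension and their mesh sizes
axes = [np.linspace(lo[d], hi[d], shape[d]) for d in range(len(shape))]
dx = [(hi[d] - lo[d]) / (shape[d] - 1) for d in range(len(shape))]

# lexicographic stacking: multi-index I = (i_1, i_2) <-> flat index via row-major order
def flat(I, shape=shape):
    return np.ravel_multi_index(I, shape)

def coords(I):
    return np.array([axes[d][I[d]] for d in range(len(shape))])

I = (1, 2)                            # a multi-index
e0 = (1, 0)                           # unit step along dimension 0
neighbor = tuple(np.add(I, e0))       # the first right neighbor along dimension 0
print(flat(I), coords(I))             # position of node I in the stacked vector
print(flat(neighbor), coords(neighbor))

# a stacked vector of function values, ordered consistently with flat(.)
V = np.zeros(np.prod(shape))
V[flat(I)] = 1.0                      # set the value at node I
```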
Interpolation & stencil
An interpolation of a function is a linear combination of basis functions
where is the degree of the interpolation; is called an interpolant; and is the interpolation coefficient. In this post, we particularly discuss a class of linear interpolations in which is the function value evaluated at supporting nodes from the set . Readers can quickly verify that most conclusions also hold for more generic types of interpolation in which is not necessarily the value of at the supporting nodes (e.g. orthogonal polynomials).
A stencil is a “template” of linear operations. Consider the following degree interpolation:
where is a constant which can be seen as ; is a supporting node. We call the stacked basis vector a basis stencil, which fully determines the interpolated value at point given the grid . The constant term is introduced to accommodate the strategy of hypothetical (ghost) nodes outside the grid.
For interpolations where the coefficient is not the function value at supporting nodes, we can define a similar structure and also the basis stencil:
Remark: There are two common methods to handle boundary conditions:
Assume the boundary conditions hold exactly on the boundary nodes
Assume there exist hypothetical nodes outside but close to the boundary nodes. Assume boundary conditions hold on these hypothetical nodes.
The 2nd method is usually employed to generate solutions with smoother boundary behaviors.
The stencil representation defined above allows for standard arithmetic such as addition, subtraction, and scalar multiplication.
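As a concrete and purely illustrative sketch of these objects, the snippet below represents a basis stencil as a weight vector over the stacked node values plus a constant term, implements the arithmetic just mentioned, and builds the stencil of a forward difference between two interpolated values under 1D piecewise-linear interpolation. All names are hypothetical.

```python
# A minimal sketch of a basis stencil: weights on the stacked node values V plus a
# constant term, with the "standard arithmetic" (addition, subtraction, scalar multiply).
import numpy as np
from dataclasses import dataclass

@dataclass
class Stencil:
    w: np.ndarray   # weights on the stacked node values V
    c: float = 0.0  # constant term (e.g., from ghost-node boundary handling)

    def __add__(self, other):
        return Stencil(self.w + other.w, self.c + other.c)

    def __sub__(self, other):
        return Stencil(self.w - other.w, self.c - other.c)

    def scale(self, a):
        return Stencil(a * self.w, a * self.c)

    def apply(self, V):
        return self.w @ V + self.c

# 1D piecewise-linear interpolation on nodes x_grid: the stencil of the value at x
def linear_interp_stencil(x, x_grid):
    w = np.zeros(len(x_grid))
    j = np.searchsorted(x_grid, x) - 1
    j = np.clip(j, 0, len(x_grid) - 2)
    t = (x - x_grid[j]) / (x_grid[j + 1] - x_grid[j])
    w[j], w[j + 1] = 1.0 - t, t
    return Stencil(w)

x_grid = np.linspace(0.0, 1.0, 6)
V = x_grid**2                               # stacked node values
h = 0.05
# stencil of the forward difference (v(x+h) - v(x)) / h, built by stencil arithmetic
d = (linear_interp_stencil(0.43 + h, x_grid) - linear_interp_stencil(0.43, x_grid)).scale(1.0 / h)
print(d.w, d.apply(V))
```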
Monotonicity of numerical schemes
For second-order elliptic and parabolic PDEs, the equation's ellipticity determines both the system's tendency to reach a stationary state and its smoothness. In finite difference methods, a numerical scheme for a PDE consists of difference equations applied to discretized nodes and function values, approximating the original PDE. Within this framework, the monotonicity of a scheme is crucial for ensuring numerical stability and smoothness, and these properties are closely related.
In their seminal paper, Barles and Souganidis (1991) outline sufficient conditions for a numerical scheme to converge locally to the true solution, including monotonicity, consistency, and uniform stability. While consistency and uniform stability are typically easier to achieve in economic models, maintaining monotonicity requires special attention, as its absence is a common cause of failure in many proposed methods. In this post, we assume the other conditions are met and focus on the issue of monotonicity.
Consider a numerical scheme of for the boundary problem Eq ()-():
where is the explicitly written time dimension, which is used to update the guess across iterations; and represent the stacked discretized Jacobian and Hessian, respectively, which can be approximated with finite differences.
This equation of consolidates all components of the discretized boundary problem equations onto one side. Monotonicity then requires that be monotonically increasing with respect to but monotonically decreasing with respect to and all other points in the stack , i.e., the neighbors of .
Example (Neo-classical growth model)
The neo-classical growth model:
is stationary (thus the time derivative term is zero). By plugging in the policy function , we get a non-linear elliptic PDE:
By adding back the always-zero time derivative term, we make it a non-linear parabolic PDE:
Suppose now we use a forward difference along the dimension to approximate and a forward difference along the time dimension to approximate at an inner point given the discretization . Then:
which is a difference equation in and its neighboring points along the dimensions. Moving the RHS to the LHS, we get the same format as . The example scheme in Eq () is non-linear.
Remarks:
Monotonicity is also required for the boundary condition Eq (), which is often ignored in practice.
A commonly used alternative (if applicable) is to collect the terms involving on one side of the equation and put the remaining terms involving on the other side. Then both sides of the equation are required to be monotonically increasing. We will use this style of organizing terms in the rest of this post.
The example scheme in Eq () is a one-(time-)layer scheme because, standing at a given moment, we only use information from and . More complex schemes may use information from more moments: for example, Crank–Nicolson (a two-layer implicit method), the “leapfrog” method (a three-layer explicit method), and the alternating direction implicit (ADI) method (a multi-layer implicit method with fractional layers).
Explicit (forward Euler) method & CFL condition
Naive case
The explicit (forward Euler) method is one of the most intuitive schemes, relying solely on the previous iteration's guess to update the solution. It is called "forward" because it applies a forward difference along the time dimension. In the context of the HJB equation, this approach typically avoids the need to solve any linear or nonlinear system, requiring only a loop over the grid nodes, so it is easy to parallelize. However, this method usually demands a small and adaptively adjusted time step to maintain stability, resulting in the so-called Courant–Friedrichs–Lewy (CFL) condition.
We illustrate the forward Euler method using the neo-classical growth model in Eq (). For convenience, we use solely to index the spatial dimension , while a prime mark indicates the forward direction in time. This notation is permissible since the scheme involves only one time layer. For instance, refers to the right/forward neighbor at time of the current node at time , with .
In Eq (), we provided an example of a fully non-linear scheme whose properties are very hard to characterize. However, this is not the only approach. Due to the structure of the HJB equation, a common alternative is to treat the flow utility and drift terms as scalar functions without discretizing them. This renders the PDE linear with respect to the partial derivatives of the unknown function. Consistent with the directional choice in Eq (), we continue to use forward differences along the dimension. Thus,
Rearranging the scheme:
The monotonicity requires:
while there is no special restriction on if we do not discretize this term with and other nodes. The first inequality always holds. The CFL condition comes from the 2nd inequality, which implies:
Eq () is the typical textbook form of the CFL condition. It is easy to see that the admissible time step shrinks quickly as increases, which leads to slow convergence of the iteration.
The monotonicity of the scheme in Eq () depends on the computational domain : the larger the space, the smaller the required time step, and the slower the convergence of the algorithm, up to the point of infeasibility. Meanwhile, the inequality in Eq () says nothing about the case of negative drift, so the scheme is monotonic only in the region of net saving. These poor properties show that Eq () is not a workable scheme; we need a strategy to improve it.
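To make the restriction tangible, here is a small sketch under assumed primitives (Cobb-Douglas production, a constant placeholder consumption level): it computes the largest admissible time step for the naive forward-difference scheme and flags the net-dissaving nodes where no time step can restore monotonicity. Whether the discount rate enters the bound depends on where the discounting term is evaluated; the sketch includes it to be safe.

```python
# A minimal sketch (assumed primitives, not the author's code) of the CFL bound for the
# naive explicit scheme that always uses a forward difference in k. Monotonicity at a
# node roughly requires dt <= 1 / (drift/dk + rho) when drift > 0, and cannot be
# restored by any dt where drift < 0 (the "net de-saving" region).
import numpy as np

def naive_cfl_dt(drift, dk, rho):
    """Largest admissible time step; also report nodes where monotonicity fails."""
    bad = drift < 0.0                          # forward difference has the wrong sign here
    pos = np.maximum(drift, 0.0)
    dt_max = 1.0 / (pos.max() / dk + rho)      # binding constraint from the largest drift
    return dt_max, bad

# illustrative primitives: f(k) = k^alpha, depreciation delta, placeholder consumption c
k = np.linspace(0.1, 10.0, 200)
dk = k[1] - k[0]
alpha, delta, rho = 0.36, 0.05, 0.05
c = np.full_like(k, 1.0)                       # placeholder policy for illustration only
drift = k**alpha - delta * k - c               # net saving

dt_max, bad = naive_cfl_dt(drift, dk, rho)
print(dt_max, bad.sum(), "nodes where no time step restores monotonicity")
```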
Upwind scheme example: flux terms
In the context of second-order elliptic, parabolic, or hyperbolic PDEs, there is a strategy to improve a scheme’s monotonicity called the upwind scheme. This terminology reflects the physical intuition behind the approach, particularly for heat or wave equations. Here, the direction of differencing is set against the direction of wave or heat propagation, ensuring that information from the "origin of the wave/heat" (considered more crucial) is accurately incorporated. Mathematically, this strategy involves clamping the flux (drift) terms, which act as coefficients for the partial derivative terms, to ensure they remain positive or negative as required.
Let’s stick to the growth model example of Eq (), but this time we discretize the flux term as:
For an arbitrary value of the drift , only one direction of differencing is acceptable, or else , indicating a stationary state. But why does Eq () improve monotonicity? Simplifying Eq (), we have:
Notice that all the terms of Eq () appear on the RHS of the discretization. Specifically, let's apply the upwind scheme to the growth model example to get a full picture:
The monotonicity conditions for , , and are automatically satisfied. However, to ensure the monotonicity condition holds for the term, we need to solve for the CFL condition with respect to .
Compared to Eq (), applying the upwind scheme enables us to ensure monotonicity simply by adjusting the time step . The scheme can now handle both the net-saving and the net-dissaving cases. Soon, we will demonstrate that the upwind scheme remains effective in the multi-dimensional case.
Remark: The illustrative examples by Benjamin Moll suggest an estimate of , which works well for small . If you expect your discount rate to be significantly larger than the usual choice (e.g. ), then a safer estimate is .
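Below is a minimal sketch of the upwind treatment of a 1D flux term, with the clamping helpers assumed to be the positive and negative parts max(·, 0) and min(·, 0). The drift and the value-function guess are placeholders; the point is only how the clamped drift is paired with one-sided differences.

```python
# A minimal sketch (assumed clamping helpers and illustrative drift) of the upwind
# treatment of a flux term mu * dV/dk on a 1D evenly spaced grid: pair the positive part
# of the drift with a forward difference and the negative part with a backward
# difference, so the coefficients keep the required signs.
import numpy as np

def pos(x):   # clamping helper, assumed to be max(x, 0)
    return np.maximum(x, 0.0)

def neg(x):   # clamping helper, assumed to be min(x, 0)
    return np.minimum(x, 0.0)

def upwind_flux(V, mu, dk):
    """Upwind approximation of mu * V_k at interior nodes (boundary rows left out)."""
    dV_fwd = (V[2:] - V[1:-1]) / dk    # forward difference at interior nodes
    dV_bwd = (V[1:-1] - V[:-2]) / dk   # backward difference at interior nodes
    return pos(mu[1:-1]) * dV_fwd + neg(mu[1:-1]) * dV_bwd

# illustrative data
k = np.linspace(0.1, 10.0, 200)
dk = k[1] - k[0]
V = np.log(k)                          # placeholder guess of the value function
mu = 0.5 - 0.1 * k                     # placeholder drift: positive for small k, negative for large k
print(upwind_flux(V, mu, dk)[:5])

# With this clamping, the coefficient on V[i+1] is pos(mu)/dk >= 0, on V[i-1] it is
# -neg(mu)/dk >= 0, and the CFL bound only has to control the coefficient on V[i].
```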
Upwind scheme example: diffusion terms
The HJB equation is a 2nd-order PDE which may contain diffusion terms, as in Eq (). We have shown how to apply the upwind scheme to the flux terms. Now, let's discuss how to apply it to the diffusion terms. WLOG, let's consider a stochastic neo-classical growth model:
where represents the diffusion term in this equation. As before, we do not discretize but instead discretize . Similarly, we use a three-point finite difference, given by . By clamping the (co)variance , we have:
Readers can readily verify the unconditional monotonicity of the and terms. Then, collecting the terms here and the terms from the flux terms of state and , we have the following CFL condition:
One may have realized that, in the case of the explicit scheme plus the upwind scheme, the CFL condition has a generic formula as in Eq ():
As we noted at the beginning of this section, the explicit method requires progressively smaller time steps as the dimensionality of the problem and the size of the computational space increase. For a medium-sized problem, this typically means millions of iterations are needed to achieve convergence.
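For reference, here is a small helper that evaluates the generic CFL bound just described, assuming the standard form in which the bound is the reciprocal of the discount rate plus the scaled absolute drifts and diagonal variances, taken at the most restrictive node (cross-derivative terms are ignored for simplicity).

```python
# A minimal sketch (assumed standard form; cross-derivative terms ignored) of the
# generic CFL time-step bound for the explicit + upwind scheme:
#   dt <= 1 / ( rho + sum_i |mu_i|/dx_i + sum_i var_ii/dx_i^2 ),
# taken at the most restrictive grid node.
import numpy as np

def cfl_dt(mu, var, dx, rho):
    """
    mu:  array of shape (N, d)  -- drift of each dimension at each of the N nodes
    var: array of shape (N, d)  -- diagonal (co)variance terms (sigma sigma')_ii
    dx:  array of shape (d,)    -- mesh size of each dimension
    """
    denom = rho + (np.abs(mu) / dx).sum(axis=1) + (var / dx**2).sum(axis=1)
    return 1.0 / denom.max()

# illustrative 2-dimensional example
rng = np.random.default_rng(0)
N, d = 1000, 2
mu = rng.normal(size=(N, d))
var = rng.uniform(0.0, 0.2, size=(N, d))
dx = np.array([0.05, 0.1])
print(cfl_dt(mu, var, dx, rho=0.05))
```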
Implicit (backward Euler) method
Compared to the explicit method, the implicit method, or backward Euler method, is often preferred in practice. For the HJB equation with an evenly spaced dense grid, this scheme is unconditionally monotonic for any size of when the upwind scheme is applied. However, the trade-off with the implicit method is that it requires solving a linear or nonlinear system at each iteration, which can be large but sparse in most cases.
The explicit method is “forward” in the sense of differencing forward along the time dimension (standing at moment ). The implicit method is “backward” in that we difference backward along the time dimension: we no longer stand at moment but at moment , looking back at moment .
A fully implicit scheme is one in which only uses information from moment , while all other discretizations along spatial dimensions use information from moment . This type of scheme typically results in nonlinear schemes (with respect to partial derivatives), which are challenging to analyze. Let’s use the deterministic growth model example again, but this time we have:
Observe that in the fully implicit scheme, both and are evaluated at moment . These two terms are highly nonlinear in . We cannot treat them as scalar functions as before, because we cannot evaluate them without knowing . Their partial derivatives with respect to and its neighboring values are extremely complex. In most economic models, analyzing these derivatives is impractical.
Remark: We did not apply the upwind scheme to Eq (). However, readers can quickly see that the upwind scheme does not work here due to the high non-linearity of and .
Thus, we focus on a class of quasi-implicit methods, where we evaluate the flow utility term, flux terms, and diffusion terms using the estimate at moment , while leaving the partial derivatives at moment . This scheme corresponds precisely to the “implicit method” used by Benjamin Moll and other economists. Under such a quasi-implicit scheme, and applying the upwind scheme, the growth model is discretized as:
All coefficients of , , and its spatial neighbors satisfy the monotonicity condition unconditionally, regardless of the size of the time step . Collecting all difference equations (including the boundary conditions), one obtains a linear system in the nodes at moment . Solving this linear system updates the guess of for the time-invariant (Markovian) HJB equation, or updates the guess at moment for finite-horizon HJB equations.
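The following sketch assembles and solves one such quasi-implicit step for the 1D deterministic growth model (diffusion omitted, boundary rows handled in a simplified state-constraint fashion). The primitives (CRRA utility, Cobb-Douglas production, the placeholder consumption policy) are assumptions for illustration; the scheme itself follows the description above: flow utility and drift are evaluated at the previous guess, the derivatives at the new one.

```python
# A minimal sketch (illustrative, deterministic 1D case; diffusion omitted; not the
# author's code) of one quasi-implicit update: the flow utility u and the drift mu are
# evaluated at the previous guess, while the partial derivatives sit at the new time.
# With upwind clamping, the resulting linear system is unconditionally monotone.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def build_transition_matrix(mu, dk):
    """Sparse A such that (A V)_i approximates mu_i * V_k(k_i) with upwind differences."""
    N = len(mu)
    pos = np.maximum(mu, 0.0)
    neg = np.minimum(mu, 0.0)
    # simplified state-constraint boundaries: drop the difference that leaves the grid
    pos[-1] = 0.0
    neg[0] = 0.0
    lower = -neg[1:] / dk                 # coefficient on V[i-1]  (>= 0)
    diag = -(pos - neg) / dk              # coefficient on V[i]    (<= 0)
    upper = pos[:-1] / dk                 # coefficient on V[i+1]  (>= 0)
    return sp.diags([lower, diag, upper], offsets=[-1, 0, 1], format="csr")

def quasi_implicit_step(V_old, u_flow, mu, dk, dt, rho):
    N = len(V_old)
    A = build_transition_matrix(mu, dk)
    B = (1.0 / dt + rho) * sp.identity(N, format="csr") - A
    rhs = u_flow + V_old / dt
    return spla.spsolve(B, rhs)

# illustrative primitives (CRRA utility, Cobb-Douglas production; placeholder policy)
k = np.linspace(0.1, 10.0, 300)
dk = k[1] - k[0]
alpha, delta, rho, gamma, dt = 0.36, 0.05, 0.05, 2.0, 100.0
c = np.full_like(k, 1.0)                  # placeholder consumption policy for illustration
u_flow = c**(1.0 - gamma) / (1.0 - gamma)
mu = k**alpha - delta * k - c             # net saving evaluated at the previous guess
V = u_flow / rho                          # initial guess
V = quasi_implicit_step(V, u_flow, mu, dk, dt, rho)
print(V[:5])
```

Note that, by construction, the off-diagonal entries of the transition matrix are non-negative and each row sums to zero, so the system matrix is an M-matrix with a non-negative inverse; this is exactly why the update stays monotone for any time step.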
Remark: 1. In later sections of this post, we briefly introduce how to check the monotonicity when a transition matrix is available. The check procedure suggests how to adaptively adjust the time step size per iteration. 2. A transition matrix is a linear approximation of the infinitesimal generator operator such that ; it is basically a stacking of the flux & covariance terms in our previous examples.
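A small check along those lines: given a candidate transition matrix, verify that all off-diagonal entries are non-negative, and report the explicit-scheme time-step bound implied by its diagonal (a hedged sketch; the exact bound depends on how the discount term is treated).

```python
# A minimal sketch of checking monotonicity from a transition matrix A (off-diagonals
# must be non-negative) and of the adaptive time-step bound it implies for an explicit
# step: dt <= 1 / (rho + max_i |A_ii|).
import numpy as np
import scipy.sparse as sp

def check_transition_matrix(A, rho, tol=1e-12):
    A = sp.csr_matrix(A)
    off = A - sp.diags(A.diagonal())
    off_diag_ok = off.min() >= -tol if off.nnz else True   # all off-diagonal entries >= 0
    dt_explicit = 1.0 / (rho + np.abs(A.diagonal()).max())
    return off_diag_ok, dt_explicit

# example: a 3x3 upwind-style transition matrix
A = sp.csr_matrix(np.array([[-1.0, 1.0, 0.0],
                            [0.5, -1.5, 1.0],
                            [0.0, 0.8, -0.8]]))
print(check_transition_matrix(A, rho=0.05))
```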
Non-monotonicity of a class of approximation-based methods
The standard finite difference method discussed above is non-parametric, relying on a dense grid with no dependencies between any two nodes in . However, this dense grid approach is highly susceptible to the curse of dimensionality, becoming quickly infeasible for three-dimensional and higher-dimensional problems. One may consider function interpolation or approximation methods (to approximate ), which offer a wide range of choices. However, it is important to recognize that when these approximations of are combined with finite difference methods, the monotonicity condition often fails to hold for a large class of these function approximation methods, even when the implicit method and upwind scheme are applied. This section is an intuitive illustration of the proof idea in Garcke and Ruttscheidt (2019).
We consider the two types of interpolation as outlined in the notation section.
First type
The first type uses interpolation coefficients that are the function values at supporting nodes from the set . This type of interpolation can be abstracted as Eq (). Examples include local methods such as piecewise polynomials, splines, etc. Multi-dimensional piecewise linear (a.k.a. multi-linear) interpolation fits exactly into the discussion of this subsection.
The key idea here is that any approximated value can be represented as a linear combination of all supporting . Taking this a step further, any linear operation on the approximated function values (e.g., finite difference) among arbitrary points can be represented by a basis stencil. For example:
where , as previously arranged, is the stack of function values at the supporting nodes, formatted as a vector. Notably, all finite difference approximations of partial derivatives fall within this category.
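The stencil of any such linear operation can be recovered mechanically by feeding unit vectors through it. The sketch below does this for a forward difference between a supporting node and an interpolated point under 1D piecewise-linear interpolation (all names illustrative).

```python
# A minimal sketch of the claim that any linear operation on interpolated values is
# "stencil dot stacked node values": feed unit vectors through the operation to recover
# the stencil, here for a forward difference between a supporting node and an
# interpolated point under 1D piecewise-linear interpolation.
import numpy as np

x_grid = np.linspace(0.0, 1.0, 11)      # supporting nodes
h = 0.03                                 # spatial step used by the finite difference
i = 4                                    # index of the supporting node we difference at

def forward_diff_of_interp(V):
    """(interp(x_i + h) - V_i) / h, where interp is piecewise-linear in the node values V."""
    return (np.interp(x_grid[i] + h, x_grid, V) - V[i]) / h

# recover the basis stencil entry by entry with unit vectors
stencil = np.array([forward_diff_of_interp(e) for e in np.eye(len(x_grid))])

V = np.sin(x_grid)                       # any stacked vector of node values
print(np.allclose(stencil @ V, forward_diff_of_interp(V)))   # True: the operation is stencil . V
print(stencil[i - 1 : i + 3])            # only nearby nodes get nonzero weights
```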
To illustrate the non-monotonicity issue when combining the above interpolation with the finite difference method, we use the stochastic growth model in Eq () as an example. Here, we apply finite differences and the quasi-implicit scheme described in the previous section, but with a key modification: the finite difference is now computed between a supporting node and an interpolated neighboring point, rather than between two supporting nodes, for a given spatial mesh step size . For instance, .
Remark: The reasons for not directly differencing two neighboring supporting nodes are:
In some interpolations, a “neighboring” node may not exist for supporting nodes in
Even if neighboring supporting nodes can be defined, there would be a consistency issue that makes the solution diverge from the true solution (Garcke & Ruttscheidt, 2019).
Remark: In this illustration, we set up a difference equation system over the supporting nodes from such that the finite difference happens between a supporting node and an interpolated neighboring point. However, the conclusion of non-monotonicity trivially holds in more generic cases. This implies the failure of the following attempts:
Sample or design an evenly spaced “virtual” grid over
Evaluate the interpolant at these points
Apply the standard finite difference + quasi-implicit scheme
Discretizing Eq () and applying the quasi-implicit scheme and the upwind scheme:
which appears very similar to Eq (). One might mistakenly assume that Eq () shares the same unconditional monotonicity as in the previous section. However, it is crucial to recognize that monotonicity is required for all supporting nodes, not the interpolated points. Therefore, an additional step is needed: substituting the basis stencil and rearranging terms as in Eq (). A trick here is to represent the supporting node as an interpolated value, using a unit vector as its basis stencil.
After simplification, Eq () becomes a linear equation simultaneously involving all supporting nodes in :
where is a complex linear combination of all basis stencils; is the summation of all constant terms. The monotonicity condition, in fact, requires Eq () to satisfy:
All elements in the basis stencil must be non-negative such that the coefficient of every supporting node’s function value of moment is non-negative.
All elements in the basis stencil , except the current node’s coefficient of moment , must be non-positive.
Meanwhile, the node’s coefficient of moment must be non-negative.
These three conditions impose around sign restrictions on and require these restrictions to hold in every iteration. If elements of can be negative, then satisfying these numerous restrictions becomes nearly impossible. This can be verified by manually expanding the term and examining it. A more straightforward example illustrates this vulnerability: if any element of the basis stencil is negative, then monotonicity with respect to (or ) is violated, leaving economists no feasible way to restore it.
Even though we are discussing a specific example, it contains all the ingredients needed to trivially generalize it to an arbitrary-dimensional HJB equation driven by an -dimensional Brownian motion, as shown in Eq (). I leave the formal statement to those who are interested in further exploration.
The issue is evident, but what could be a potential solution? The answer lies in using interpolation methods that represent any point as a convex combination of the function values of supporting nodes in . Examining Eq (), if all elements of the basis stencils used in this equation are non-negative, then applying the upwind scheme to Eq () preserves unconditional monotonicity. There are specific types of interpolation that meet this convexity requirement.
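A quick way to see the difference is to recover the interpolation weights at an off-node point by interpolating unit vectors: piecewise-linear weights form a convex combination, while cubic-spline cardinal weights typically do not stay non-negative. The sketch below illustrates this numerically; it is not a proof.

```python
# A minimal sketch contrasting two interpolations' basis stencils at an off-node point,
# recovered by interpolating unit vectors. Piecewise-linear weights are non-negative
# (a convex combination); cubic-spline cardinal weights typically are not, which is
# what breaks the monotonicity of schemes built on them.
import numpy as np
from scipy.interpolate import CubicSpline

x_grid = np.linspace(0.0, 1.0, 12)       # supporting nodes
x = 0.53                                  # an interpolated point between two nodes

linear_weights = np.array([np.interp(x, x_grid, e) for e in np.eye(len(x_grid))])
cubic_weights = np.array([CubicSpline(x_grid, e)(x) for e in np.eye(len(x_grid))])

print(np.isclose(linear_weights.sum(), 1.0), linear_weights.min() >= 0.0)   # convex weights
print(np.isclose(cubic_weights.sum(), 1.0), cubic_weights.min())            # typically < 0
```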
Remark: The non-negativity of basis stencil or function elements is NOT equivalent to the monotonicity of the interpolant. Some readers may recognize methods like monotonic interpolations (e.g., monotonic cubic splines). These interpolations, while designed to ensure monotonicity of the interpolant, do not necessarily guarantee monotonicity of numerical schemes—two forms of "monotonicity" that are fundamentally different.
More specifically, monotonic interpolation ensures that, within a single iteration, if the guessed solution is monotonic along certain dimensions, then evaluating the interpolant preserves this property. In contrast, the monotonicity of a numerical scheme ensures that, as the guessed solution is updated from one iteration to the next, the monotonicity of the previous iteration's guess is maintained in the next iteration, thereby preserving the solution's shape over successive updates. This is sometimes referred to as the “shape-preserving” property of the scheme.
Second type
The second type of interpolation we consider does not use function values at supporting nodes as interpolation coefficients but instead utilizes a generic . Here, the degree of interpolation is not necessarily equal to the number of supporting nodes . Examples of this type of interpolation include global methods such as orthogonal polynomials (spectral), neural networks, and Gaussian processes, among others.
In short, this type of interpolation suffers from the same issue as the first type. The key to understanding this is to recognize that, after substituting the interpolation into Eq (), we are simply projecting the original space onto the space of interpolation coefficients , while everything else remains unchanged. For example (note that now even is an evaluated value),
which results in a collection of difference equations. If the stencil elements cannot be guaranteed to be non-negative, then the same issue of irreparably broken monotonicity arises once again.
The solution in this case is the same as for the first type of interpolation: employ interpolation methods that ensure non-negative stencils for all . For example, one might consider using ReLU, Sigmoid, or other non-negative activation functions in neural network implementations.
Bonus section: determine which scheme to use
The combined strategy of the quasi-implicit method and the upwind scheme is sufficient for handling a broad class of economic models in research. However, certain models with unique structures may require customization beyond the standard approach. Learning how to design tailored linear or nonlinear schemes for specific problems can be highly beneficial (even though most customized schemes perform worse than the standard approach when the latter is available).
A quick example: the absolute value of a partial derivative appears in some applications (e.g. adjustment costs). The trick to discretize it monotonically is:
i.e. use backward difference when , and use forward difference when .
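One common monotone (Godunov-type) realization of this switching rule is to take the maximum of the backward difference, the negated forward difference, and zero; whether this is the right variant depends on the sign with which the absolute-value term enters the equation. A hedged sketch:

```python
# A minimal sketch (one common Godunov-type realization of the rule above; whether it is
# the appropriate variant depends on the sign with which |dV/dx| enters the equation)
# for a monotone discretization of |dV/dx| on an evenly spaced 1D grid.
import numpy as np

def abs_derivative_upwind(V, dx):
    """max(backward difference, -forward difference, 0) at interior nodes."""
    d_bwd = (V[1:-1] - V[:-2]) / dx      # backward difference; selected where V_x > 0
    d_fwd = (V[2:] - V[1:-1]) / dx       # forward difference; -d_fwd selected where V_x < 0
    return np.maximum.reduce([d_bwd, -d_fwd, np.zeros_like(d_bwd)])

x = np.linspace(-1.0, 1.0, 201)
V = np.abs(x)                            # |V_x| = 1 away from the kink
print(abs_derivative_upwind(V, x[1] - x[0])[:3])
```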
The general steps to determine which numerical scheme to use are:
Define the economic model as a boundary problem and solve for the policy functions in the interior and on the boundary.
Plug the policy functions back in and consider:
How tractable is the problem? If the maximum principle gives an analytical solution, then solve it. If not, consider solving it numerically.
Is finite difference the best choice? If yes, move on. If not, try generalized residual methods (I’ll discuss this in another post)
How non-linear is the problem? If it is linear enough or easy to linearize, then consider linearizing the PDE and solving it analytically or numerically; e.g., some portfolio or risk-neutral problems can be approximated as a linear-quadratic regulator (LQR) problem, which is tractable.
What is the dimensionality of the problem? If it is low, then primarily use an evenly spaced grid plus the standard strategy. If it is high, then choose approximation methods that satisfy the above non-negativity conditions, and potentially combine them with other techniques such as sparse grids.
Does your model have mixed-frequency noise that requires multi-grid methods? If yes, things become more complex and you may need an advanced course in numerical PDEs.
Based on your grid, decide whether to use approximation methods:
Evenly spaced dense grid: No need for approximation; everything is standard.
Unevenly spaced dense grid: No need for approximation, but be careful about the mesh step sizes used in the differences.
(Regular or adaptive) sparse grid: Must use approximation. Carefully choose approximation methods that satisfy the above conditions
Design the discretization strategy, i.e. the scheme. This is the core part.
How many layers do you need? Multiple layers are needed in some special cases, such as dimension reduction using methods like ADI, or when handling special time-dependent (parabolic) problems.
Can a fully implicit scheme be made unconditionally monotonic? If yes, use it; it converges faster than the quasi-implicit scheme. If not, one may need to compare it (after applying the CFL condition) with a quasi-implicit scheme.
Is there any non-linearity that must be discretized? If no, great: just fit your PDE into the standard strategy. If yes, you need more tricks to handle these non-linearities before moving forward.
Do you have an excellent idea that beats the standard strategy even though the latter is applicable? If yes, go ahead. If not, stay with the standard strategy.
Carefully double-check that all Barles–Souganidis sufficient conditions are satisfied.
Carefully verify the chosen scheme on the discretized boundary conditions. PDEs, especially elliptic/parabolic/hyperbolic ones, are pinned down largely by their boundary conditions. For multi-dimensional problems, pay extra attention to “corners”, i.e. where multiple boundary conditions must hold simultaneously.
Write your code, debug, test and run.
References
Garcke, J., & Ruttscheidt, S. (2019). Finite differences on sparse grids for continuous time heterogeneous agent models.
Barles, G., & Souganidis, P. E. (1991). Convergence of approximation schemes for fully nonlinear second order equations. Asymptotic Analysis, 4(3), 271–283.