Dept. of Math./CMA University of Oslo
Pure Mathematics No. 4
ISSN 0806-2439 April 2007

A MAXIMUM PRINCIPLE FOR STOCHASTIC DIFFERENTIAL GAMES WITH PARTIAL INFORMATION

TA THI KIEU AN AND BERNT ØKSENDAL

Date: Revised in October 2007.
Key words and phrases. Jump diffusion, stochastic control, stochastic differential game, sufficient maximum principle, necessary maximum principle.

Abstract. In this paper we first deal with the problem of optimal control for zero-sum stochastic differential games. We give a necessary and a sufficient maximum principle for this problem with partial information. Then we use the result to solve a problem in finance. Finally, we extend our approach to general stochastic games (nonzero-sum) and obtain an equilibrium point of such a game.

1. Introduction

Game theory has been an active area of research and a useful tool in many applications, particularly in biology and economics. In the recent paper by Mataramvura and Øksendal [6], a stochastic differential game was solved under the restriction to Markov controls; the equilibrium point, or other type of solution, is then constructed by means of Hamilton-Jacobi-Bellman (HJB) equations. In this paper we require instead that the control process be adapted to a given sub-filtration of the filtration generated by the underlying Lévy processes, so dynamic programming and HJB equations are not available. We therefore establish a maximum principle for such stochastic control problems. There is already a substantial literature on the maximum principle; see e.g. [1], [2], [5], [9] and the references therein.

Our paper is organized as follows: In Section 2 we give a sufficient maximum principle for zero-sum stochastic differential games (Theorem 1), and a necessary maximum principle for this problem is given in Section 3. In Section 4 we put a problem in finance into the framework of a stochastic differential game with partial information and use Theorem 1 to solve it; with complete information this problem is solved in [8] by using HJB equations. In Section 5 we generalize our approach to the general case, not necessarily of zero-sum type, and give an equilibrium point for nonzero-sum games.

2. The sufficient maximum principle for zero-sum games

Suppose the dynamics of a stochastic system is described by a stochastic differential equation on a complete filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, P)$ of the form
(2.1) $dX(t) = b(t, X(t), u_0(t))\,dt + \sigma(t, X(t), u_0(t))\,dB(t) + \int_{\mathbb{R}^n} \gamma(t, X(t^-), u_1(t,z), z)\,\tilde N(dt,dz), \quad t \in [0,T];$
$X(0) = x \in \mathbb{R}^n.$

Here $b : [0,T] \times \mathbb{R}^n \times K \to \mathbb{R}^n$, $\sigma : [0,T] \times \mathbb{R}^n \times K \to \mathbb{R}^{n \times n}$ and $\gamma : [0,T] \times \mathbb{R}^n \times K \times \mathbb{R}^n \to \mathbb{R}^{n \times n}$ are given continuous functions, $B(t)$ is an $n$-dimensional Brownian motion, $\tilde N(\cdot,\cdot)$ are $n$ independent compensated Poisson random measures and $K$ is a given closed subset of $\mathbb{R}^n$. The processes $u_0(t) = u_0(t,\omega)$ and $u_1(t) = u_1(t,z,\omega)$, $\omega \in \Omega$, are our control processes. We assume that $u_0(t)$ and $u_1(t,z)$ have values in the given set $K$ for a.a. $t$, $z$ and that $u_0(t)$, $u_1(t,z)$ are càdlàg and adapted to a given filtration $\{\mathcal{E}_t\}_{t \ge 0}$, where
$$\mathcal{E}_t \subseteq \mathcal{F}_t, \quad t \ge 0.$$
For example, we could have
$$\mathcal{E}_t = \mathcal{F}_{(t-\delta)^+}, \quad t \ge 0,$$
where $(t-\delta)^+ = \max(0, t-\delta)$. This models a situation where the controller only has delayed information available about the state of the system.

Let $f : [0,T] \times \mathbb{R}^n \times K \to \mathbb{R}$ be a continuous function (the profit rate) and let $g : \mathbb{R}^n \to \mathbb{R}$ be a concave function (the bequest function). We call $u = (u_0, u_1)$ an admissible control if (2.1) has a unique strong solution and

(2.2) $E^x\Big[\int_0^T |f(t, X(t), u_0(t))|\,dt + |g(X(T))|\Big] < \infty.$

If $u$ is an admissible control we define the performance criterion $J(u)$ by

(2.3) $J(u) = E^x\Big[\int_0^T f(t, X(t), u_0(t))\,dt + g(X(T))\Big].$

Now suppose that the controls $u_0(t)$ and $u_1(t,z)$ have the form

(2.4) $u_0(t) = (\theta_0(t), \pi_0(t)), \quad t \ge 0;$
(2.5) $u_1(t,z) = (\theta_1(t,z), \pi_1(t,z)), \quad (t,z) \in [0,\infty) \times \mathbb{R}^n.$

We let $\Theta$ and $\Pi$ be given families of admissible controls $\theta = (\theta_0, \theta_1)$ and $\pi = (\pi_0, \pi_1)$, respectively. The partial information zero-sum stochastic differential game problem is to find $(\theta^*, \pi^*) \in \Theta \times \Pi$ such that

(2.6) $\Phi_{\mathcal{E}}(x) = J(\theta^*, \pi^*) = \sup_{\pi \in \Pi}\Big(\inf_{\theta \in \Theta} J(\theta, \pi)\Big).$

Such a control $(\theta^*, \pi^*)$ is called an optimal control (if it exists). The intuitive idea is that there are two players, I and II. Player I controls $\theta := (\theta_0, \theta_1)$ and player II controls $\pi := (\pi_0, \pi_1)$. The actions of the players are antagonistic: between I and II there is a payoff $J(\theta, \pi)$ which is a cost for I and a reward for II.
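To fix ideas, the following sketch simulates a one-dimensional instance of (2.1) by an Euler scheme, with the jump part driven by a compound Poisson process and with a feedback control that may only read the state with a delay, mimicking $\mathcal{E}_t = \mathcal{F}_{(t-\delta)^+}$. All coefficients and the feedback rule are hypothetical toy choices, not taken from the paper.

```python
import numpy as np

def simulate(x0=1.0, T=1.0, n_steps=200, delta=0.05, lam=2.0, seed=0):
    """Euler scheme for a 1-D instance of (2.1):
    dX = b dt + sigma dB + gamma d(compensated Poisson),
    where the feedback control may only use delayed information,
    mimicking E_t = F_{(t-delta)+}. All coefficients are toy choices."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    lag = max(1, int(delta / dt))       # information delay in grid steps
    X = np.empty(n_steps + 1)
    X[0] = x0
    for k in range(n_steps):
        u = -0.5 * X[max(0, k - lag)]   # hypothetical E_t-adapted feedback
        b = u                           # drift b(t, x, u) = u
        sigma = 0.2 * X[k]              # diffusion sigma(t, x, u) = 0.2 x
        dB = rng.normal(0.0, np.sqrt(dt))
        dN = rng.poisson(lam * dt)      # Poisson increment, intensity lam
        jump = 0.1 * (dN - lam * dt)    # jump coefficient gamma = 0.1, compensated
        X[k + 1] = X[k] + b * dt + sigma * dB + jump
    return X

print(simulate()[-1])
```

The only point of the sketch is the information constraint: the control at step $k$ reads the state at step $k-\mathrm{lag}$, never the current one.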
Let $K_1$, $K_2$ be two sets such that $\theta(t,z) \in K_1$ and $\pi(t,z) \in K_2$ for a.a. $t$, $z$. As in [5] we now define the Hamiltonian
$H : [0,T] \times \mathbb{R}^n \times K_1 \times K_2 \times \mathbb{R}^n \times \mathbb{R}^{n \times n} \times \mathcal{R} \to \mathbb{R}$
by

(2.7) $H(t,x,\theta,\pi,p,q,r) = f(t,x,\theta,\pi) + b^T(t,x,\theta,\pi)\,p + \mathrm{tr}\big(\sigma^T(t,x,\theta,\pi)\,q\big) + \sum_{i,j=1}^n \int_{\mathbb{R}} \gamma_{ij}(t,x,\theta,\pi,z)\,r_{ij}(t,z)\,\nu_j(dz_j),$

where $\mathcal{R}$ is the set of functions $r : [0,T] \times \mathbb{R}^n \to \mathbb{R}^{n \times n}$ such that the integral in (2.7) converges. From now on we assume that $H$ is continuously differentiable with respect to $x$.

The adjoint equation in the unknown adapted processes $p(t) \in \mathbb{R}^n$, $q(t) \in \mathbb{R}^{n \times n}$ and $r(t,\cdot) \in \mathbb{R}^{n \times n}$ is the backward stochastic differential equation (BSDE)

(2.8) $dp(t) = -\nabla_x H(t, X(t), \theta(t), \pi(t), p(t), q(t), r(t,\cdot))\,dt + q(t)\,dB(t) + \int_{\mathbb{R}^n} r(t,z)\,\tilde N(dt,dz), \quad t < T;$
$p(T) = \nabla g(X(T)),$

where $\nabla_y \varphi(\cdot) = \big(\frac{\partial \varphi}{\partial y_1}, \dots, \frac{\partial \varphi}{\partial y_n}\big)^T$ is the gradient of $\varphi : \mathbb{R}^n \to \mathbb{R}$ with respect to $y = (y_1, \dots, y_n)$.
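Before stating the verification theorem, here is a small symbolic sketch of how the Hamiltonian (2.7) and the gradients feeding the adjoint equation (2.8) are assembled in a scalar case. The coefficients are toy assumptions, and the jump measure is taken to be a single atom so the $\nu$-integral collapses to one term.

```python
import sympy as sp

# A minimal 1-D instance of the Hamiltonian (2.7), with hypothetical toy
# coefficients and a jump measure nu consisting of a single atom of mass
# lam (so the nu-integral reduces to one term):
#   H = f + b*p + sigma*q + gamma*r*lam
x, theta, pi, p, q, r, lam = sp.symbols('x theta pi p q r lam')

f = -theta**2 + x * pi           # profit rate f(t, x, theta, pi)
b = theta + pi                   # drift b
sigma = x / 5                    # diffusion coefficient sigma
gamma = x / 10                   # jump coefficient gamma (z-independent here)

H = f + b * p + sigma * q + gamma * r * lam

print(sp.diff(H, x))    # -grad_x H is the drift of the adjoint BSDE (2.8)
print(sp.diff(H, pi))   # enters the first-order conditions behind (2.13)
```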
We can now state the following verification theorem for optimality:

Theorem 1. Let $(\hat\theta, \hat\pi) \in \Theta \times \Pi$ with corresponding state process $\hat X(t) = X^{(\hat\theta,\hat\pi)}(t)$, and denote by $X^{(\pi)}(t)$ and $X^{(\theta)}(t)$ the processes $X^{(\hat\theta,\pi)}(t)$ and $X^{(\theta,\hat\pi)}(t)$, respectively. Suppose there exists a solution $(\hat p(t), \hat q(t), \hat r(t,z))$ of the corresponding adjoint equation (2.8) such that for all $\theta \in \Theta$ and $\pi \in \Pi$ we have

(2.9) $E\Big[\int_0^T (\hat X(t) - X^{(\pi)}(t))^T\Big\{\hat q\hat q^T(t) + \int_{\mathbb{R}^n}\hat r\hat r^T(t,z)\,\nu(dz)\Big\}(\hat X(t) - X^{(\pi)}(t))\,dt\Big] < \infty,$

(2.10) $E\Big[\int_0^T (\hat X(t) - X^{(\theta)}(t))^T\Big\{\hat q\hat q^T(t) + \int_{\mathbb{R}^n}\hat r\hat r^T(t,z)\,\nu(dz)\Big\}(\hat X(t) - X^{(\theta)}(t))\,dt\Big] < \infty,$

(2.11) $E\Big[\int_0^T \hat p(t)^T\Big\{\sigma\sigma^T(t, X^{(\theta)}(t), \theta(t), \hat\pi(t)) + \int_{\mathbb{R}^n}\gamma\gamma^T(t, X^{(\theta)}(t), \theta(t), \hat\pi(t), z)\,\nu(dz)\Big\}\hat p(t)\,dt\Big] < \infty,$

(2.12) $E\Big[\int_0^T \hat p(t)^T\Big\{\sigma\sigma^T(t, X^{(\pi)}(t), \hat\theta(t), \pi(t)) + \int_{\mathbb{R}^n}\gamma\gamma^T(t, X^{(\pi)}(t), \hat\theta(t), \pi(t), z)\,\nu(dz)\Big\}\hat p(t)\,dt\Big] < \infty,$

ensuring that the integrals with respect to $B$ and the compensated small jumps part indeed have zero mean. Moreover, suppose that for all $t \in [0,T]$ the following partial information maximum principle holds:

(2.13) $\inf_{\theta \in K_1} E[H(t, X(t), \theta, \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot)) \mid \mathcal{E}_t] = E[H(t, X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot)) \mid \mathcal{E}_t] = \sup_{\pi \in K_2} E[H(t, X(t), \hat\theta(t), \pi, \hat p(t), \hat q(t), \hat r(t,\cdot)) \mid \mathcal{E}_t].$

i) Suppose that, for all $t \in [0,T]$, $g(x)$ is concave and $(x,\pi) \mapsto H(t, x, \hat\theta(t), \pi, \hat p(t), \hat q(t), \hat r(t,\cdot))$ is concave. Then
$J(\hat\theta, \hat\pi) \ge J(\hat\theta, \pi)$ for all $\pi \in \Pi$, and $J(\hat\theta, \hat\pi) = \sup_{\pi \in \Pi} J(\hat\theta, \pi)$.

ii) Suppose that, for all $t \in [0,T]$, $g(x)$ is convex and $(x,\theta) \mapsto H(t, x, \theta, \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))$ is convex. Then
$J(\hat\theta, \hat\pi) \le J(\theta, \hat\pi)$ for all $\theta \in \Theta$, and $J(\hat\theta, \hat\pi) = \inf_{\theta \in \Theta} J(\theta, \hat\pi)$.

iii) If both case (i) and case (ii) hold (which implies, in particular, that $g$ is an affine function), then $(\theta^*, \pi^*) := (\hat\theta, \hat\pi)$ is an optimal control and

(2.14) $\Phi_{\mathcal{E}}(x) = \sup_{\pi \in \Pi}\Big(\inf_{\theta \in \Theta} J(\theta, \pi)\Big) = \inf_{\theta \in \Theta}\Big(\sup_{\pi \in \Pi} J(\theta, \pi)\Big).$

Proof. i) Suppose (i) holds. Choose $(\theta, \pi) \in \Theta \times \Pi$ and consider
$$J(\hat\theta, \hat\pi) - J(\hat\theta, \pi) = I_1 + I_2,$$
where

(2.15) $I_1 = E\Big[\int_0^T \big\{f(t, \hat X(t), \hat\theta(t), \hat\pi(t)) - f(t, X^{(\pi)}(t), \hat\theta(t), \pi(t))\big\}\,dt\Big],$

(2.16) $I_2 = E\big[g(\hat X(T)) - g(X^{(\pi)}(T))\big].$

Since $g$ is concave in $x$, the integration by parts formula for jump processes yields the following, where the $L^2$ conditions (2.9)-(2.12)
ensure that the stochastic integrals with respect to the local martingales have zero expectation:

$I_2 = E\big[g(\hat X(T)) - g(X^{(\pi)}(T))\big] \ge E\big[(\hat X(T) - X^{(\pi)}(T))^T \nabla g(\hat X(T))\big] = E\big[(X^{(\hat\theta,\hat\pi)}(T) - X^{(\hat\theta,\pi)}(T))^T \hat p(T)\big]$

$= E\Big[\int_0^T (\hat X(t^-) - X^{(\pi)}(t^-))^T\,d\hat p(t) + \int_0^T \hat p^T(t)\big(d\hat X(t) - dX^{(\pi)}(t)\big)$
$\qquad + \int_0^T \mathrm{tr}\Big(\big\{\sigma(t, \hat X(t), \hat\theta(t), \hat\pi(t)) - \sigma(t, X^{(\pi)}(t), \hat\theta(t), \pi(t))\big\}^T \hat q(t)\Big)\,dt$
$\qquad + \int_0^T \sum_{i,j=1}^n \int_{\mathbb{R}} \big\{\gamma_{ij}(t, \hat X(t), \hat\theta(t), \hat\pi(t), z) - \gamma_{ij}(t, X^{(\pi)}(t), \hat\theta(t), \pi(t), z)\big\}\,\hat r_{ij}(t,z)\,\nu_j(dz_j)\,dt\Big]$

(2.17) $= E\Big[\int_0^T (\hat X(t) - X^{(\pi)}(t))^T\big(-\nabla_x H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))\big)\,dt$
$\qquad + \int_0^T \hat p^T(t)\big\{b(t, \hat X(t), \hat\theta(t), \hat\pi(t)) - b(t, X^{(\pi)}(t), \hat\theta(t), \pi(t))\big\}\,dt$
$\qquad + \int_0^T \mathrm{tr}\Big(\big\{\sigma(t, \hat X(t), \hat\theta(t), \hat\pi(t)) - \sigma(t, X^{(\pi)}(t), \hat\theta(t), \pi(t))\big\}^T \hat q(t)\Big)\,dt$
$\qquad + \int_0^T \sum_{i,j=1}^n \int_{\mathbb{R}} \big\{\gamma_{ij}(t, \hat X(t), \hat\theta(t), \hat\pi(t), z) - \gamma_{ij}(t, X^{(\pi)}(t), \hat\theta(t), \pi(t), z)\big\}\,\hat r_{ij}(t,z)\,\nu_j(dz_j)\,dt\Big].$

By the definition (2.7) of $H$ we have

$I_1 = E\Big[\int_0^T \{f(t, \hat X(t), \hat\theta(t), \hat\pi(t)) - f(t, X^{(\pi)}(t), \hat\theta(t), \pi(t))\}\,dt\Big]$
$= E\Big[\int_0^T \{H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot)) - H(t, X^{(\pi)}(t), \hat\theta(t), \pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))\}\,dt\Big]$
$\quad - E\Big[\int_0^T \hat p^T(t)\big\{b(t, \hat X(t), \hat\theta(t), \hat\pi(t)) - b(t, X^{(\pi)}(t), \hat\theta(t), \pi(t))\big\}\,dt\Big]$
$\quad - E\Big[\int_0^T \mathrm{tr}\Big(\big\{\sigma(t, \hat X(t), \hat\theta(t), \hat\pi(t)) - \sigma(t, X^{(\pi)}(t), \hat\theta(t), \pi(t))\big\}^T \hat q(t)\Big)\,dt\Big]$
(2.18) $\quad - E\Big[\int_0^T \sum_{i,j=1}^n \int_{\mathbb{R}} \big\{\gamma_{ij}(t, \hat X(t), \hat\theta(t), \hat\pi(t), z) - \gamma_{ij}(t, X^{(\pi)}(t), \hat\theta(t), \pi(t), z)\big\}\,\hat r_{ij}(t,z)\,\nu_j(dz_j)\,dt\Big].$

By concavity of $H$ in $x$ and $\pi$ we have

(2.19) $H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot)) - H(t, X^{(\pi)}(t), \hat\theta(t), \pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))$
$\ge \nabla_x H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))^T(\hat X(t) - X^{(\pi)}(t)) + \nabla_\pi H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))^T(\hat\pi(t) - \pi(t)).$

Since $\pi \mapsto E[H(t, X^{(\pi)}(t), \hat\theta(t), \pi, \hat p(t), \hat q(t), \hat r(t,\cdot)) \mid \mathcal{E}_t]$ is maximal at $\pi = \hat\pi(t)$, and $\pi(t)$, $\hat\pi(t)$ are $\mathcal{E}_t$-measurable, we get

$E\big[\nabla_\pi H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))^T(\hat\pi(t) - \pi(t)) \mid \mathcal{E}_t\big] = (\hat\pi(t) - \pi(t))^T\,\nabla_\pi E\big[H(t, X^{(\pi)}(t), \hat\theta(t), \pi, \hat p(t), \hat q(t), \hat r(t,\cdot)) \mid \mathcal{E}_t\big]\Big|_{\pi = \hat\pi(t)} \ge 0.$

Combining this with (2.19) we obtain

(2.20) $E\Big[\int_0^T \{H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot)) - H(t, X^{(\pi)}(t), \hat\theta(t), \pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))\}\,dt\Big]$
$\ge E\Big[\int_0^T \nabla_x H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))^T(\hat X(t) - X^{(\pi)}(t))\,dt\Big].$

Hence

(2.21) $I_1 \ge E\Big[\int_0^T \nabla_x H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))^T(\hat X(t) - X^{(\pi)}(t))\,dt\Big]$
$\quad - E\Big[\int_0^T \hat p^T(t)\big\{b(t, \hat X(t), \hat\theta(t), \hat\pi(t)) - b(t, X^{(\pi)}(t), \hat\theta(t), \pi(t))\big\}\,dt\Big]$
$\quad - E\Big[\int_0^T \mathrm{tr}\Big(\big\{\sigma(t, \hat X(t), \hat\theta(t), \hat\pi(t)) - \sigma(t, X^{(\pi)}(t), \hat\theta(t), \pi(t))\big\}^T \hat q(t)\Big)\,dt\Big]$
$\quad - E\Big[\int_0^T \sum_{i,j=1}^n \int_{\mathbb{R}} \big\{\gamma_{ij}(t, \hat X(t), \hat\theta(t), \hat\pi(t), z) - \gamma_{ij}(t, X^{(\pi)}(t), \hat\theta(t), \pi(t), z)\big\}\,\hat r_{ij}(t,z)\,\nu_j(dz_j)\,dt\Big].$

Adding (2.17) and (2.21) we get

(2.22) $J(\hat\theta, \hat\pi) - J(\hat\theta, \pi) = I_1 + I_2 \ge 0.$
We therefore conclude that $J(\hat\theta, \hat\pi) \ge J(\hat\theta, \pi)$ for all $\pi \in \Pi$, and hence $J(\hat\theta, \hat\pi) = \sup_{\pi \in \Pi} J(\hat\theta, \pi)$.

ii) Proceeding in the same way as in (i) we can show that $J(\hat\theta, \hat\pi) \le J(\theta, \hat\pi)$ for all $\theta \in \Theta$ if (ii) holds.

iii) If both (i) and (ii) hold, then $J(\hat\theta, \pi) \le J(\hat\theta, \hat\pi) \le J(\theta, \hat\pi)$ for any $(\theta, \pi) \in \Theta \times \Pi$. Thereby
$$J(\hat\theta, \hat\pi) \le \inf_{\theta \in \Theta} J(\theta, \hat\pi) \le \sup_{\pi \in \Pi}\Big(\inf_{\theta \in \Theta} J(\theta, \pi)\Big).$$
On the other hand,
$$J(\hat\theta, \hat\pi) \ge \sup_{\pi \in \Pi} J(\hat\theta, \pi) \ge \inf_{\theta \in \Theta}\Big(\sup_{\pi \in \Pi} J(\theta, \pi)\Big).$$
Now due to the inequality
$$\inf_{\theta \in \Theta}\Big(\sup_{\pi \in \Pi} J(\theta, \pi)\Big) \ge \sup_{\pi \in \Pi}\Big(\inf_{\theta \in \Theta} J(\theta, \pi)\Big)$$
we have
$$\Phi_{\mathcal{E}}(x) = \sup_{\pi \in \Pi}\Big(\inf_{\theta \in \Theta} J(\theta, \pi)\Big) = J(\hat\theta, \hat\pi) = \inf_{\theta \in \Theta}\Big(\sup_{\pi \in \Pi} J(\theta, \pi)\Big). \qquad \Box$$
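As a purely deterministic illustration of the minimax identity (2.14), consider the toy payoff $J(\theta, \pi) = (\theta - 1)^2 - (\pi - 2)^2$, convex in the minimizer's variable and concave in the maximizer's. The example is ours, not the paper's; it only shows the saddle-point structure numerically.

```python
import numpy as np

# Toy check of sup_pi inf_theta J = inf_theta sup_pi J for
# J(theta, pi) = (theta - 1)^2 - (pi - 2)^2, saddle point (1, 2), value 0.
grid = np.linspace(-5, 5, 1001)
J = (grid[:, None] - 1.0)**2 - (grid[None, :] - 2.0)**2  # J[i, j] = J(theta_i, pi_j)

sup_inf = J.min(axis=0).max()   # sup over pi of inf over theta
inf_sup = J.max(axis=1).min()   # inf over theta of sup over pi
print(sup_inf, inf_sup)         # both 0.0, attained at the saddle point (1, 2)
```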
If the control process $(\theta, \pi)$ is admissible and adapted to the filtration $\mathcal{F}_t$, we have the following corollary.

Corollary 2. Suppose that $\mathcal{E}_t = \mathcal{F}_t$ for all $t$ and that (2.9)-(2.12) hold. Moreover, suppose that for all $t$ the following maximum principle holds:

(2.23) $\inf_{\theta \in K_1} H(t, X(t), \theta, \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot)) = H(t, X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot)) = \sup_{\pi \in K_2} H(t, X(t), \hat\theta(t), \pi, \hat p(t), \hat q(t), \hat r(t,\cdot)).$

Then for all $t \in [0,T]$ we have:

i) If $g(x)$ is concave and $(x,\pi) \mapsto H(t, x, \hat\theta(t), \pi, \hat p(t), \hat q(t), \hat r(t,\cdot))$ is concave, then $J(\hat\theta, \hat\pi) \ge J(\hat\theta, \pi)$ for all $\pi \in \Pi$, and $J(\hat\theta, \hat\pi) = \sup_{\pi \in \Pi} J(\hat\theta, \pi)$.

ii) If $g(x)$ is convex and $(x,\theta) \mapsto H(t, x, \theta, \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))$ is convex, then $J(\hat\theta, \hat\pi) \le J(\theta, \hat\pi)$ for all $\theta \in \Theta$, and $J(\hat\theta, \hat\pi) = \inf_{\theta \in \Theta} J(\theta, \hat\pi)$.

iii) If both cases (i) and (ii) hold, then $(\theta^*, \pi^*) := (\hat\theta, \hat\pi)$ is an optimal control based on the information flow $\mathcal{F}_t$ and

(2.24) $\Phi_{\mathcal{E}}(x) = \sup_{\pi \in \Pi}\Big(\inf_{\theta \in \Theta} J(\theta, \pi)\Big) = \inf_{\theta \in \Theta}\Big(\sup_{\pi \in \Pi} J(\theta, \pi)\Big).$

3. A necessary maximum principle for zero-sum games

In addition to the assumptions in Section 2, we now assume the following:

(A1) For all $t$, $h$ such that $0 \le t < t + h \le T$ and all bounded $\mathcal{E}_t$-measurable random variables $\alpha$ and $\rho$, the controls $\beta(s) := (0, \dots, 0, \beta_i(s), 0, \dots, 0)$ and $\eta(s) := (0, \dots, 0, \eta_i(s), 0, \dots, 0)$, $i = 1, \dots, n$, $s \in [0,T]$, with
$$\beta_i(s) := \alpha_i\,\chi_{[t,t+h]}(s), \qquad \eta_i(s) := \rho_i\,\chi_{[t,t+h]}(s),$$
belong to $\Theta$ and $\Pi$, respectively.

(A2) For given $\theta, \beta \in \Theta$ and $\pi, \eta \in \Pi$ with $\beta$, $\eta$ bounded, there exists $\delta > 0$ such that
$$\theta + y\beta \in \Theta \quad \text{and} \quad \pi + v\eta \in \Pi$$
for all $y, v \in (-\delta, \delta)$.

Denote $X^{\theta + y\beta}(t) = X^{(\theta + y\beta,\,\pi)}(t)$ and $X^{\pi + v\eta}(t) = X^{(\theta,\,\pi + v\eta)}(t)$. For given $\theta, \beta \in \Theta$ and $\pi, \eta \in \Pi$ with $\beta$, $\eta$ bounded, we define the processes $Y^\theta(t)$ and $Y^\pi(t)$ by

(3.1) $Y^\theta(t) = \frac{d}{dy} X^{\theta + y\beta}(t)\big|_{y=0} = (Y_1^\theta(t), \dots, Y_n^\theta(t))^T,$
(3.2) $Y^\pi(t) = \frac{d}{dv} X^{\pi + v\eta}(t)\big|_{v=0} = (Y_1^\pi(t), \dots, Y_n^\pi(t))^T.$

We have that

(3.3) $dY_i^\theta(t) = \lambda_i^\theta(t)\,dt + \sum_{j=1}^n \xi_{ij}^\theta(t)\,dB_j(t) + \sum_{j=1}^n \int_{\mathbb{R}} \zeta_{ij}^\theta(t,z)\,\tilde N_j(dt,dz),$
(3.4) $dY_i^\pi(t) = \lambda_i^\pi(t)\,dt + \sum_{j=1}^n \xi_{ij}^\pi(t)\,dB_j(t) + \sum_{j=1}^n \int_{\mathbb{R}} \zeta_{ij}^\pi(t,z)\,\tilde N_j(dt,dz),$

where, for $i = 1, \dots, n$,

(3.5) $\lambda_i^\theta(t) = \nabla_x b_i(t, X(t), \theta(t), \pi(t))^T Y^\theta(t) + \nabla_\theta b_i(t, X(t), \theta(t), \pi(t))^T \beta(t),$
$\xi_{ij}^\theta(t) = \nabla_x \sigma_{ij}(t, X(t), \theta(t), \pi(t))^T Y^\theta(t) + \nabla_\theta \sigma_{ij}(t, X(t), \theta(t), \pi(t))^T \beta(t),$
$\zeta_{ij}^\theta(t,z) = \nabla_x \gamma_{ij}(t, X(t), \theta(t), \pi(t), z)^T Y^\theta(t) + \nabla_\theta \gamma_{ij}(t, X(t), \theta(t), \pi(t), z)^T \beta(t),$
(3.6) $\lambda_i^\pi(t) = \nabla_x b_i(t, X(t), \theta(t), \pi(t))^T Y^\pi(t) + \nabla_\pi b_i(t, X(t), \theta(t), \pi(t))^T \eta(t),$
$\xi_{ij}^\pi(t) = \nabla_x \sigma_{ij}(t, X(t), \theta(t), \pi(t))^T Y^\pi(t) + \nabla_\pi \sigma_{ij}(t, X(t), \theta(t), \pi(t))^T \eta(t),$
$\zeta_{ij}^\pi(t,z) = \nabla_x \gamma_{ij}(t, X(t), \theta(t), \pi(t), z)^T Y^\pi(t) + \nabla_\pi \gamma_{ij}(t, X(t), \theta(t), \pi(t), z)^T \eta(t).$

Theorem 3. Suppose $(\hat\theta, \hat\pi) \in \Theta \times \Pi$ is a directional critical point for $J(\theta, \pi)$, in the sense that for all bounded $\beta \in \Theta$ and $\eta \in \Pi$ there exists $\delta > 0$ such that $\hat\theta + y\beta \in \Theta$ and $\hat\pi + v\eta \in \Pi$ for all $y, v \in (-\delta, \delta)$, and
$$h(y, v) := J(\hat\theta + y\beta, \hat\pi + v\eta), \quad y, v \in (-\delta, \delta),$$
has a critical point at $(0, 0)$, i.e.

(3.7) $\frac{\partial h}{\partial y}(0, 0) = \frac{\partial h}{\partial v}(0, 0) = 0.$

Suppose there exists a solution $(\hat p(t), \hat q(t), \hat r(t,\cdot))$ of the associated adjoint equation

(3.8) $d\hat p(t) = -\nabla_x H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))\,dt + \hat q(t)\,dB(t) + \int_{\mathbb{R}^n} \hat r(t,z)\,\tilde N(dt,dz), \quad t < T;$
$\hat p(T) = \nabla g(\hat X(T)).$

Moreover, suppose that if $Y^{\hat\theta}$, $Y^{\hat\pi}$, $(\lambda_i^{\hat\theta}, \xi_{ij}^{\hat\theta}, \zeta_{ij}^{\hat\theta})$ and $(\lambda_i^{\hat\pi}, \xi_{ij}^{\hat\pi}, \zeta_{ij}^{\hat\pi})$ are the corresponding derivative processes and coefficients (see (3.1)-(3.6)), then

(3.9) $E\Big[\int_0^T Y^{\hat\theta\,T}(t)\Big\{\hat q\hat q^T(t) + \int_{\mathbb{R}^n} \hat r\hat r^T(t,z)\,\nu(dz)\Big\} Y^{\hat\theta}(t)\,dt\Big] < \infty,$

(3.10) $E\Big[\int_0^T Y^{\hat\pi\,T}(t)\Big\{\hat q\hat q^T(t) + \int_{\mathbb{R}^n} \hat r\hat r^T(t,z)\,\nu(dz)\Big\} Y^{\hat\pi}(t)\,dt\Big] < \infty,$

(3.11) $E\Big[\int_0^T \hat p^T(t)\Big\{\xi^{\hat\theta}\xi^{\hat\theta\,T}(t) + \int_{\mathbb{R}^n} \zeta^{\hat\theta}\zeta^{\hat\theta\,T}(t,z)\,\nu(dz)\Big\}\hat p(t)\,dt\Big] < \infty,$

(3.12) $E\Big[\int_0^T \hat p^T(t)\Big\{\xi^{\hat\pi}\xi^{\hat\pi\,T}(t) + \int_{\mathbb{R}^n} \zeta^{\hat\pi}\zeta^{\hat\pi\,T}(t,z)\,\nu(dz)\Big\}\hat p(t)\,dt\Big] < \infty.$

Then for a.a. $t \in [0,T]$ we have

(3.13) $E[\nabla_\theta H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot)) \mid \mathcal{E}_t] = E[\nabla_\pi H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot)) \mid \mathcal{E}_t] = 0.$
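Condition (3.7) can be tested by simulation. The sketch below estimates $h(y, v) = J(\hat\theta + y\beta, \hat\pi + v\eta)$ by Monte Carlo with common random numbers and checks that the finite-difference gradient at $(0, 0)$ vanishes. The dynamics, profit rate, bequest function and candidate controls are all toy assumptions, chosen so that $\hat\theta = \hat\pi \equiv 0$ is a directional critical point in the constant directions $\beta = \eta \equiv 1$.

```python
import numpy as np

# Toy setup (not from the paper): dX = (theta + pi) dt + 0.2 X dB,
# f(theta, pi) = -theta - theta^2 - pi - pi^2, bequest g(x) = x.
# Then h(y, v) = T(-y - y^2 - v - v^2) + x0 + (y + v)T, whose gradient
# at (0, 0) is exactly zero, matching criterion (3.7).
def J(y, v, n_paths=100_000, n_steps=50, T=1.0, seed=1):
    rng = np.random.default_rng(seed)          # common random numbers
    dt = T / n_steps
    X = np.ones(n_paths)
    for _ in range(n_steps):
        X += (y + v) * dt + 0.2 * X * rng.normal(0.0, np.sqrt(dt), n_paths)
    running = T * (-y - y**2 - v - v**2)       # integral of f over [0, T]
    return running + X.mean()                  # plus bequest E[g(X(T))]

eps = 1e-3
print((J(+eps, 0) - J(-eps, 0)) / (2 * eps))   # ~ 0: dh/dy at (0, 0)
print((J(0, +eps) - J(0, -eps)) / (2 * eps))   # ~ 0: dh/dv at (0, 0)
```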
Proof. Since $y \mapsto h(y, 0)$ is minimal at $y = 0$ we have

$0 = \frac{\partial}{\partial y} h(y, 0)\Big|_{y=0} = E\Big[\int_0^T \nabla_x f(t, \hat X(t), \hat\theta(t), \hat\pi(t))^T \frac{d}{dy} X^{\hat\theta + y\beta}(t)\Big|_{y=0}\,dt + \int_0^T \nabla_\theta f(t, \hat X(t), \hat\theta(t), \hat\pi(t))^T \beta(t)\,dt + \nabla g(\hat X(T))^T \frac{d}{dy} X^{\hat\theta + y\beta}(T)\Big|_{y=0}\Big]$

(3.14) $= E\Big[\int_0^T \nabla_x f(t, \hat X(t), \hat\theta(t), \hat\pi(t))^T Y^{\hat\theta}(t)\,dt + \int_0^T \nabla_\theta f(t, \hat X(t), \hat\theta(t), \hat\pi(t))^T \beta(t)\,dt + \nabla g(\hat X(T))^T Y^{\hat\theta}(T)\Big].$

By the Itô formula,

(3.15) $E\big[\nabla g(\hat X(T))^T Y^{\hat\theta}(T)\big] = E\big[\hat p^T(T) Y^{\hat\theta}(T)\big]$
$= E\Big[\int_0^T \Big\{\sum_{i=1}^n \hat p_i(t)\big(\nabla_x b_i(t, \hat X(t), \hat\theta(t), \hat\pi(t))^T Y^{\hat\theta}(t) + \nabla_\theta b_i(t, \hat X(t), \hat\theta(t), \hat\pi(t))^T \beta(t)\big)$
$\qquad - \sum_{i=1}^n Y_i^{\hat\theta}(t)\,\frac{\partial}{\partial x_i} H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))$
$\qquad + \sum_{i,j=1}^n \hat q_{ij}(t)\big(\nabla_x \sigma_{ij}(t, \hat X(t), \hat\theta(t), \hat\pi(t))^T Y^{\hat\theta}(t) + \nabla_\theta \sigma_{ij}(t, \hat X(t), \hat\theta(t), \hat\pi(t))^T \beta(t)\big)$
$\qquad + \sum_{i,j=1}^n \int_{\mathbb{R}} \hat r_{ij}(t,z)\big(\nabla_x \gamma_{ij}(t, \hat X(t), \hat\theta(t), \hat\pi(t), z)^T Y^{\hat\theta}(t) + \nabla_\theta \gamma_{ij}(t, \hat X(t), \hat\theta(t), \hat\pi(t), z)^T \beta(t)\big)\,\nu_j(dz)\Big\}\,dt\Big].$

On the other hand,

(3.16) $\nabla_x H(t, x, \theta, \pi, p, q, r) = \nabla_x f(t, x, \theta, \pi) + \sum_{i=1}^n \nabla_x b_i(t, x, \theta, \pi)\,p_i + \sum_{j,i=1}^n \nabla_x \sigma_{ji}(t, x, \theta, \pi)\,q_{ji} + \sum_{j,i=1}^n \int_{\mathbb{R}} \nabla_x \gamma_{ji}(t, x, \theta, \pi, z)\,r_{ji}(t,z)\,\nu_j(dz).$

Substituting this into (3.15) and combining with (3.14) we get

$0 = E\Big[\int_0^T \sum_{i=1}^n \Big\{\frac{\partial f}{\partial \theta_i}(t, \hat X(t), \hat\theta(t), \hat\pi(t)) + \sum_{j=1}^n \Big(\hat p_j(t)\,\frac{\partial b_j}{\partial \theta_i}(t, \hat X(t), \hat\theta(t), \hat\pi(t)) + \sum_{k=1}^n \hat q_{kj}(t)\,\frac{\partial \sigma_{kj}}{\partial \theta_i}(t, \hat X(t), \hat\theta(t), \hat\pi(t))$
$\qquad + \int_{\mathbb{R}} \hat r_{kj}(t,z)\,\frac{\partial \gamma_{kj}}{\partial \theta_i}(t, \hat X(t), \hat\theta(t), \hat\pi(t), z)\,\nu_j(dz)\Big)\Big\}\,\beta_i(t)\,dt\Big]$

(3.17) $= E\Big[\int_0^T \nabla_\theta H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))^T \beta(t)\,dt\Big].$

By assumption (A1), applying (3.17) to $\beta_i(s) = \alpha_i\,\chi_{[t,t+h]}(s)$ gives
$$E\Big[\int_t^{t+h} \frac{\partial}{\partial \theta_i} H(s, \hat X(s), \hat\theta(s), \hat\pi(s), \hat p(s), \hat q(s), \hat r(s,\cdot))\,\alpha_i\,ds\Big] = 0.$$
Differentiating with respect to $h$ at $h = 0$ gives
$$E\Big[\frac{\partial}{\partial \theta_i} H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))\,\alpha_i\Big] = 0.$$
Since this holds for all bounded $\mathcal{E}_t$-measurable $\alpha_i$, we conclude that
$$E\Big[\frac{\partial}{\partial \theta_i} H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))\,\Big|\,\mathcal{E}_t\Big] = 0, \quad i = 1, \dots, n,$$
as claimed. Proceeding in the same way, by differentiating the function $h(0, v)$ with respect to $v$ we get
$$E\Big[\frac{\partial}{\partial \pi_i} H(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p(t), \hat q(t), \hat r(t,\cdot))\,\Big|\,\mathcal{E}_t\Big] = 0, \quad i = 1, \dots, n.$$
This completes the proof. $\Box$

4. Applications to Finance

In this section we use our result to solve a partial information version of the problem studied in [8]. Consider the following jump diffusion market:

(4.1) (risk free asset) $dS_0(t) = \rho(t)S_0(t)\,dt; \quad S_0(0) = 1,$
(4.2) (risky asset) $dS_1(t) = S_1(t^-)\Big[\alpha(t)\,dt + \beta(t)\,dB(t) + \int_{\mathbb{R}_0} \gamma(t,z)\,\tilde N(dt,dz)\Big]; \quad S_1(0) > 0,$

where $\rho(t)$ is a deterministic function and $\alpha(t)$, $\beta(t)$ and $\gamma(t,z)$ are given $\mathcal{F}_t$-predictable processes satisfying the integrability condition

(4.3) $E\Big[\int_0^T \Big\{|\rho(s)| + |\alpha(s)| + \tfrac12 \beta(s)^2 + \int_{\mathbb{R}_0} \big|\log(1 + \gamma(s,z)) - \gamma(s,z)\big|\,\nu(dz)\Big\}\,ds\Big] < \infty,$

where $T > 0$ is fixed. We assume that

(4.4) $\gamma(t,z) \ge -1$ for a.a. $(t,z) \in [0,T] \times \mathbb{R}_0,$

where $\mathbb{R}_0 = \mathbb{R} \setminus \{0\}$.
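For orientation, here is a minimal simulation of the risky asset (4.2), assuming constant coefficients and a single jump size with intensity $\lambda$; the paper allows general predictable processes, so these are simplifying assumptions of the sketch only.

```python
import numpy as np

# Euler discretization of (4.2) with constant alpha, beta and a single
# jump size gamma0 arriving with intensity lam (toy assumptions).
def risky_asset_path(S0=1.0, alpha=0.08, beta=0.2, gamma0=-0.1, lam=1.0,
                     T=1.0, n_steps=250, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    S = np.empty(n_steps + 1)
    S[0] = S0
    for k in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt))
        dN = rng.poisson(lam * dt)
        # dS = S(t-)[alpha dt + beta dB + gamma0 (dN - lam dt)]
        S[k + 1] = S[k] * (1.0 + alpha * dt + beta * dB
                           + gamma0 * (dN - lam * dt))
    return S

print(risky_asset_path()[-1])
```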
This model represents a natural generalization of the classical Black-Scholes market model to the case where the coefficients are not necessarily constants, but are allowed to be (predictable) stochastic processes. Moreover, we have added a jump component. See e.g. [3] or [7] for discussions of such markets.

Let $\mathcal{E}_t \subseteq \mathcal{F}_t$ be a given sub-filtration. Let $\pi(t)$ be a portfolio, i.e. an $\mathcal{E}_t$-measurable random variable representing the fraction of the wealth invested in the risky asset at time $t$. Then the dynamics of the corresponding wealth process $V^{(\pi)}(t)$ is

(4.5) $dV^{(\pi)}(t) = V^{(\pi)}(t^-)\Big[\{\rho(t) + (\alpha(t) - \rho(t))\pi(t)\}\,dt + \pi(t)\beta(t)\,dB(t) + \pi(t^-)\int_{\mathbb{R}_0} \gamma(t,z)\,\tilde N(dt,dz)\Big]; \quad V^{(\pi)}(0) = v > 0.$

A portfolio $\pi$ is called admissible if it is a measurable càdlàg stochastic process adapted to the filtration $\mathcal{E}_t$ and satisfies
$$\pi(t^-)\gamma(t,z) > -1 \quad \text{a.s.}$$
and

(4.6) $\int_0^T \Big\{|\rho(t) + (\alpha(t) - \rho(t))\pi(t)| + \pi^2(t)\beta^2(t) + \pi^2(t)\int_{\mathbb{R}_0} \gamma^2(t,z)\,\nu(dz)\Big\}\,dt < \infty \quad \text{a.s.}$

The requirement that $\pi$ be adapted to the filtration $\mathcal{E}_t$ is a mathematical way of requiring that the choice of the portfolio value $\pi(t)$ at time $t$ is allowed to depend only on the information ($\sigma$-algebra) $\mathcal{E}_t$. The wealth process corresponding to an admissible portfolio $\pi$ is the solution of (4.5):

(4.7) $V^{(\pi)}(t) = v \exp\Big(\int_0^t \Big\{\rho(s) + (\alpha(s) - \rho(s))\pi(s) - \tfrac12 \pi^2(s)\beta^2(s) + \int_{\mathbb{R}_0} \big(\ln(1 + \pi(s)\gamma(s,z)) - \pi(s)\gamma(s,z)\big)\,\nu(dz)\Big\}\,ds + \int_0^t \pi(s)\beta(s)\,dB(s) + \int_0^t \int_{\mathbb{R}_0} \ln(1 + \pi(s)\gamma(s,z))\,\tilde N(ds,dz)\Big).$

The family of admissible portfolios is denoted by $\Pi$.

Now we introduce a family $\mathcal{Q}$ of measures $Q_\theta$ parameterized by processes $\theta = (\theta_0(t), \theta_1(t,z))$ such that

(4.8) $dQ_\theta(\omega) = Z_\theta(T)\,dP(\omega)$ on $\mathcal{F}_T$,

where

(4.9) $dZ_\theta(t) = Z_\theta(t^-)\Big[\theta_0(t)\,dB(t) + \int_{\mathbb{R}_0} \theta_1(t,z)\,\tilde N(dt,dz)\Big]; \quad Z_\theta(0) = 1.$
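Before continuing with the family of measures $Q_\theta$, here is a sketch that checks the closed-form wealth formula (4.7) against an Euler discretization of the wealth SDE (4.5), for a constant portfolio and constant market coefficients (toy assumptions of the sketch, not of the paper):

```python
import numpy as np

# Compare Euler steps of (4.5) with the exact log-dynamics from (4.7),
# using the same noise increments, for constant coefficients and a single
# jump size gamma0 with intensity lam.
rng = np.random.default_rng(0)
rho, alpha, beta, gamma0, lam = 0.03, 0.08, 0.2, -0.1, 1.0
pi, v, T, n = 0.5, 1.0, 1.0, 2000
dt = T / n

V_euler, log_V = v, np.log(v)
for _ in range(n):
    dB = rng.normal(0.0, np.sqrt(dt))
    dNc = rng.poisson(lam * dt) - lam * dt   # compensated jump increment
    # Euler step of (4.5)
    V_euler *= 1 + (rho + (alpha - rho) * pi) * dt + pi * beta * dB \
                 + pi * gamma0 * dNc
    # exact increment of log V from (4.7)
    log_V += (rho + (alpha - rho) * pi - 0.5 * pi**2 * beta**2
              + lam * (np.log(1 + pi * gamma0) - pi * gamma0)) * dt \
             + pi * beta * dB + np.log(1 + pi * gamma0) * dNc
print(V_euler, np.exp(log_V))   # close for small dt
```

For small time steps the two outputs agree closely, which is a convenient sanity check on the exponent in (4.7).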
We assume that $\theta_1(t,z) \ge -1$ for a.a. $t$, $z$ and

(4.10) $\int_0^T \Big\{\theta_0^2(s) + \int_{\mathbb{R}_0} \theta_1^2(s,z)\,\nu(dz)\Big\}\,ds < \infty \quad \text{a.s.}$

Then by the Itô formula the solution of (4.9) is given by

(4.11) $Z_\theta(t) = \exp\Big(\int_0^t \theta_0(s)\,dB(s) - \frac12 \int_0^t \theta_0^2(s)\,ds + \int_0^t \int_{\mathbb{R}_0} \ln(1 + \theta_1(s,z))\,\tilde N(ds,dz) + \int_0^t \int_{\mathbb{R}_0} \{\ln(1 + \theta_1(s,z)) - \theta_1(s,z)\}\,\nu(dz)\,ds\Big).$

If $\theta = (\theta_0(t), \theta_1(t,z))$ satisfies

(4.12) $E[Z_\theta(T)] = 1,$

then $Q_\theta$ is a probability measure. If, in addition,

(4.13) $\beta(t)\theta_0(t) + \int_{\mathbb{R}_0} \gamma(t,z)\theta_1(t,z)\,\nu(dz) = \alpha(t) - \rho(t), \quad t \in [0,T],$

then $dQ_\theta(\omega) = Z_\theta(T)\,dP(\omega)$ is an equivalent local martingale measure; see e.g. [7, Ch. 1]. But here we do not assume that (4.13) holds.

Every $\theta = (\theta_0, \theta_1)$ which is adapted to the sub-filtration $\mathcal{E}_t$ and satisfies (4.10)-(4.12) is called an admissible control of the market. The family of admissible controls $\theta$ is denoted by $\Theta$.

The problem is to find $(\theta^*, \pi^*) \in \Theta \times \Pi$ such that

(4.14) $\inf_{\theta \in \Theta}\Big(\sup_{\pi \in \Pi} E_{Q_\theta}\big[U(V^{(\pi)}(T))\big]\Big) = E_{Q_{\theta^*}}\big[U(V^{(\pi^*)}(T))\big],$

where $U : [0,\infty) \to \mathbb{R}$ is a given utility function which is increasing, concave and twice continuously differentiable on $(0,\infty)$. We can consider this problem as a stochastic differential game between the agent and the market: the agent wants to maximize her expected discounted utility over all portfolios $\pi$, while the market wants to minimize the maximal expected utility of the representative agent over all scenarios, represented by the probability measures $Q_\theta$, $\theta \in \Theta$.

To put this problem into the Markovian context discussed in the previous sections we combine the Radon-Nikodym process $Z_\theta(t)$ and the wealth process $V^{(\pi)}(t)$ into a 2-dimensional state process $X(t) = (X_1(t), X_2(t))$, as follows:
(4.15) $dX(t) = \begin{pmatrix} dX_1(t) \\ dX_2(t) \end{pmatrix} = \begin{pmatrix} dZ_\theta(t) \\ dV^{(\pi)}(t) \end{pmatrix} = \begin{pmatrix} 0 \\ V^{(\pi)}(t^-)\{\rho(t) + (\alpha(t) - \rho(t))\pi(t)\} \end{pmatrix} dt + \begin{pmatrix} Z_\theta(t^-)\,\theta_0(t) \\ V^{(\pi)}(t^-)\,\beta(t)\pi(t) \end{pmatrix} dB(t) + \int_{\mathbb{R}_0} \begin{pmatrix} Z_\theta(t^-)\,\theta_1(t,z) \\ V^{(\pi)}(t^-)\,\pi(t^-)\gamma(t,z) \end{pmatrix} \tilde N(dt,dz).$

To solve this problem we first write down the Hamiltonian function (2.7) for this case (there is no profit rate, $f \equiv 0$, and the bequest function is $g(x_1, x_2) = x_1 U(x_2)$, since $E_{Q_\theta}[U(V^{(\pi)}(T))] = E[Z_\theta(T)U(V^{(\pi)}(T))]$):

(4.16) $H(t, x_1, x_2, \theta, \pi, p, q, r) = x_2\{\rho(t) + (\alpha(t) - \rho(t))\pi(t)\}\,p_2 + x_1\theta_0(t)\,q_1 + x_2\beta(t)\pi(t)\,q_2 + \int_{\mathbb{R}_0} \{x_1\theta_1(t,z)\,r_1(t,z) + x_2\pi(t)\gamma(t,z)\,r_2(t,z)\}\,\nu(dz).$

The adjoint equations are

(4.17) $dp_1(t) = -\Big(\theta_0(t)q_1(t) + \int_{\mathbb{R}_0} \theta_1(t,z)r_1(t,z)\,\nu(dz)\Big)dt + q_1(t)\,dB(t) + \int_{\mathbb{R}_0} r_1(t,z)\,\tilde N(dt,dz); \quad p_1(T) = \frac{\partial g}{\partial x_1} = U(X_2(T)),$

(4.18) $dp_2(t) = -\Big(\{\rho(t) + (\alpha(t) - \rho(t))\pi(t)\}p_2(t) + \beta(t)\pi(t)q_2(t) + \pi(t)\int_{\mathbb{R}_0} \gamma(t,z)r_2(t,z)\,\nu(dz)\Big)dt + q_2(t)\,dB(t) + \int_{\mathbb{R}_0} r_2(t,z)\,\tilde N(dt,dz); \quad p_2(T) = \frac{\partial g}{\partial x_2} = X_1(T)U'(X_2(T)).$

Let $(\hat\theta, \hat\pi)$ be a candidate for an optimal control, let $\hat X(t) = (\hat X_1(t), \hat X_2(t))$ be the corresponding state process, and let $\hat p(t) = (\hat p_1(t), \hat p_2(t))$, $\hat q(t) = (\hat q_1(t), \hat q_2(t))$, $\hat r(t,\cdot) = (\hat r_1(t,\cdot), \hat r_2(t,\cdot))$ be the corresponding solution of the adjoint equations.

We first maximize $E[H(t, x_1, x_2, \theta, \pi, p, q, r) \mid \mathcal{E}_t]$ over all $\pi \in K_2$. This gives the following first-order condition for a maximum point $\hat\pi$:

(4.19) $E[(\alpha(t) - \rho(t))\hat p_2(t) \mid \mathcal{E}_t] + E[\beta(t)\hat q_2(t) \mid \mathcal{E}_t] + \int_{\mathbb{R}_0} E[\gamma(t,z)\hat r_2(t,z) \mid \mathcal{E}_t]\,\nu(dz) = 0.$

Then we minimize $E[H(t, x_1, x_2, \theta, \pi, p, q, r) \mid \mathcal{E}_t]$ over all $\theta \in K_1$ and get the following first-order conditions for a minimum point $\hat\theta = (\hat\theta_0, \hat\theta_1)$:

(4.20) $E[\hat X_1(t)\hat q_1(t) \mid \mathcal{E}_t] = 0,$

(4.21) $\int_{\mathbb{R}_0} E[\hat X_1(t)\hat r_1(t,z) \mid \mathcal{E}_t]\,\nu(dz) = 0.$
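The first-order conditions (4.19)-(4.21) come from differentiating (4.16) in $\pi$, $\theta_0$ and $\theta_1$. A short symbolic check, with the $\nu$-integral reduced to a single atom of mass $\lambda$ (an assumption made only for this sketch):

```python
import sympy as sp

# Differentiate the Hamiltonian (4.16) with respect to pi, theta0, theta1,
# with the jump measure nu taken as one atom of mass lam.
x1, x2, th0, th1, pi = sp.symbols('x1 x2 theta0 theta1 pi')
rho, alpha, beta, gamma, lam = sp.symbols('rho alpha beta gamma lam')
p2, q1, q2, r1, r2 = sp.symbols('p2 q1 q2 r1 r2')

H = (x2 * (rho + (alpha - rho) * pi) * p2 + x1 * th0 * q1
     + x2 * beta * pi * q2 + lam * (x1 * th1 * r1 + x2 * pi * gamma * r2))

print(sp.diff(H, pi))    # x2*((alpha-rho)*p2 + beta*q2 + lam*gamma*r2) -> (4.19)
print(sp.diff(H, th0))   # x1*q1                                       -> (4.20)
print(sp.diff(H, th1))   # lam*x1*r1                                   -> (4.21)
```

The common positive factors $x_2$ and $x_1$ explain why (4.19) does not involve $x_2$ and why (4.20)-(4.21) reduce to conditions on $\hat q_1$ and $\hat r_1$.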
We try a process $\hat p_1(t)$ of the form

(4.22) $\hat p_1(t) = U(f(t)\hat X_2(t)),$

with $f$ a deterministic differentiable function satisfying $f(T) = 1$. Differentiating (4.22) by the Itô formula and using (4.5) we get

$d\hat p_1(t) = \Big[f'(t)\hat X_2(t)U'(f(t)\hat X_2(t)) + \hat X_2(t)\{\rho(t) + (\alpha(t) - \rho(t))\hat\pi(t)\}f(t)U'(f(t)\hat X_2(t)) + \tfrac12 f^2(t)\hat X_2^2(t)\beta^2(t)\hat\pi^2(t)U''(f(t)\hat X_2(t))$
$\qquad + \int_{\mathbb{R}_0} \big\{U\big(f(t)\hat X_2(t)(1 + \hat\pi(t)\gamma(t,z))\big) - U(f(t)\hat X_2(t)) - f(t)\hat X_2(t)\hat\pi(t)\gamma(t,z)U'(f(t)\hat X_2(t))\big\}\,\nu(dz)\Big]\,dt$
$\qquad + f(t)\hat X_2(t)\beta(t)\hat\pi(t)U'(f(t)\hat X_2(t))\,dB(t) + \int_{\mathbb{R}_0} \big\{U\big(f(t)\hat X_2(t)(1 + \hat\pi(t)\gamma(t,z))\big) - U(f(t)\hat X_2(t))\big\}\,\tilde N(dt,dz).$

Comparing this with (4.17) by equating the $dt$, $dB(t)$ and $\tilde N(dt,dz)$ coefficients, respectively, we get

(4.23) $\hat q_1(t) = f(t)\hat X_2(t)\beta(t)\hat\pi(t)U'(f(t)\hat X_2(t)),$

(4.24) $\hat r_1(t,z) = U\big(f(t)\hat X_2(t)(1 + \hat\pi(t)\gamma(t,z))\big) - U(f(t)\hat X_2(t)),$

(4.25) $f'(t)\hat X_2(t)U'(f(t)\hat X_2(t)) + \tfrac12 f^2(t)\hat X_2^2(t)\beta^2(t)\hat\pi^2(t)U''(f(t)\hat X_2(t)) + \hat X_2(t)\{\rho(t) + (\alpha(t) - \rho(t))\hat\pi(t)\}f(t)U'(f(t)\hat X_2(t))$
$\quad + \int_{\mathbb{R}_0} \big\{U\big(f(t)\hat X_2(t)(1 + \hat\pi(t)\gamma(t,z))\big) - U(f(t)\hat X_2(t)) - f(t)\hat X_2(t)\hat\pi(t)\gamma(t,z)U'(f(t)\hat X_2(t))\big\}\,\nu(dz) = -\hat\theta_0(t)\hat q_1(t) - \int_{\mathbb{R}_0} \hat\theta_1(t,z)\hat r_1(t,z)\,\nu(dz).$

Substituting (4.23) into (4.20) we get

(4.26) $\hat\pi(t)\,E\big[\hat X_1(t)\hat X_2(t)f(t)\beta(t)U'(f(t)\hat X_2(t)) \mid \mathcal{E}_t\big] = 0,$
and hence, since the conditional expectation in (4.26) is nonzero (e.g. when $\beta(t) > 0$, because $\hat X_1$, $\hat X_2$, $f$ and $U'$ are all positive),

(4.27) $\hat\pi(t) = 0.$

Now we try a process $\hat p_2(t)$ of the form

(4.28) $\hat p_2(t) = \hat X_1(t)f(t)U'(f(t)\hat X_2(t)).$

Differentiating (4.28) and using (4.27) (so that $d\hat X_2(t) = \rho(t)\hat X_2(t)\,dt$) we get

(4.29) $d\hat p_2(t) = f'(t)\hat X_1(t)U'(f(t)\hat X_2(t))\,dt + f(t)U'(f(t)\hat X_2(t))\,d\hat X_1(t) + f(t)\hat X_1(t)\,dU'(f(t)\hat X_2(t))$
$= \hat X_1(t)\Big(f'(t)U'(f(t)\hat X_2(t)) + f(t)f'(t)\hat X_2(t)U''(f(t)\hat X_2(t)) + f^2(t)\hat X_2(t)\rho(t)U''(f(t)\hat X_2(t))\Big)\,dt$
$\quad + f(t)\hat X_1(t)\hat\theta_0(t)U'(f(t)\hat X_2(t))\,dB(t) + \int_{\mathbb{R}_0} f(t)\hat X_1(t^-)\hat\theta_1(t,z)U'(f(t)\hat X_2(t))\,\tilde N(dt,dz).$

Comparing this with (4.18) (with $\hat\pi = 0$) we get

(4.30) $\hat q_2(t) = f(t)\hat X_1(t)\hat\theta_0(t)U'(f(t)\hat X_2(t)),$

(4.31) $\hat r_2(t,z) = f(t)\hat X_1(t)\hat\theta_1(t,z)U'(f(t)\hat X_2(t)),$

(4.32) $f'(t)U'(f(t)\hat X_2(t)) + f(t)\hat X_2(t)U''(f(t)\hat X_2(t))\big(f'(t) + f(t)\rho(t)\big) = -\rho(t)f(t)U'(f(t)\hat X_2(t)).$

Substituting (4.30) and (4.31) into (4.19) we get

(4.33) $E\big[(\alpha(t) - \rho(t))f(t)\hat X_1(t)U'(f(t)\hat X_2(t)) \mid \mathcal{E}_t\big] + \hat\theta_0(t)\,E\big[\beta(t)f(t)\hat X_1(t)U'(f(t)\hat X_2(t)) \mid \mathcal{E}_t\big] + \int_{\mathbb{R}_0} \hat\theta_1(t,z)\,E\big[\gamma(t,z)f(t)\hat X_1(t)U'(f(t)\hat X_2(t)) \mid \mathcal{E}_t\big]\,\nu(dz) = 0.$

This can be written as

(4.34) $\hat\theta_0(t)\,E[\beta(t) \mid \mathcal{E}_t] + \int_{\mathbb{R}_0} \hat\theta_1(t,z)\,E[\gamma(t,z) \mid \mathcal{E}_t]\,\nu(dz) = E[\alpha(t) \mid \mathcal{E}_t] - \rho(t).$

From (4.32) we get

(4.35) $\Big(U'(f(t)\hat X_2(t)) + \hat X_2(t)f(t)U''(f(t)\hat X_2(t))\Big)\big(f'(t) + \rho(t)f(t)\big) = 0,$

or

(4.36) $f'(t) + \rho(t)f(t) = 0,$
i.e.

(4.37) $f(t) = \exp\Big(\int_t^T \rho(s)\,ds\Big).$

We have proved:

Theorem 4. The optimal portfolio $\pi^* \in \Pi$ for the agent is

(4.38) $\pi^*(t) = \hat\pi(t) = 0,$

and the optimal measure $Q_{\hat\theta}$ for the market is obtained by choosing $\hat\theta = (\hat\theta_0, \hat\theta_1)$ such that

(4.39) $\hat\theta_0(t)\,E[\beta(t) \mid \mathcal{E}_t] + \int_{\mathbb{R}_0} \hat\theta_1(t,z)\,E[\gamma(t,z) \mid \mathcal{E}_t]\,\nu(dz) = E[\alpha(t) \mid \mathcal{E}_t] - \rho(t).$

Remark. In the case when $\mathcal{E}_t = \mathcal{F}_t$ for all $t$, this was proved in [8]. In that case the interpretation of the result is the following: the market minimizes the maximal expected utility of the agent by choosing a scenario (represented by a probability law $dQ_\theta = Z_\theta(T)\,dP$) which is an equivalent martingale measure for the market (see (4.13)), and the optimal strategy for the agent is then to place all the money in the risk free asset, i.e. to choose $\pi(t) = 0$ for all $t$. Theorem 4 states that an analogous result holds also in the case when both players have only the partial information $\mathcal{E}_t \subseteq \mathcal{F}_t$ at their disposal, but now the coefficients $\beta(t)$, $\gamma(t,z)$ and $\alpha(t)$ must be replaced by their conditional expectations $E[\beta(t) \mid \mathcal{E}_t]$, $E[\gamma(t,z) \mid \mathcal{E}_t]$ and $E[\alpha(t) \mid \mathcal{E}_t]$.
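Note that (4.39) is one equation in the two unknowns $\hat\theta_0(t)$ and $\hat\theta_1(t,\cdot)$, so the market generally has many optimal scenarios. A toy numerical instance with a single jump size, where all numbers and the equal split of the excess return are made up for illustration:

```python
import numpy as np

# Toy instance of (4.39) with one jump size of intensity lam, assuming the
# conditional expectations given E_t are already known numbers.
E_alpha, E_beta, E_gamma = 0.08, 0.25, -0.10  # E[alpha|E_t], E[beta|E_t], E[gamma|E_t]
rho, lam = 0.03, 1.0

# One way (a choice, not dictated by the theorem) to satisfy (4.39):
# split the excess return equally between the Brownian and jump parts.
excess = E_alpha - rho
theta0 = 0.5 * excess / E_beta                # Brownian component theta0_hat
theta1 = 0.5 * excess / (lam * E_gamma)       # jump component theta1_hat
assert np.isclose(theta0 * E_beta + theta1 * E_gamma * lam, excess)
print(theta0, theta1)
```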
5. The sufficient maximum principle for nonzero-sum games

Let $X(t)$ be a stochastic process describing the state of the system. We now consider the case where two controllers, I and II, intervene on the dynamics of the system and their interests are not necessarily antagonistic: each one acts so as to serve her own interests. This situation is a nonzero-sum game.

Let $\mathcal{E}_t^1$, $\mathcal{E}_t^2$ be filtrations satisfying $\mathcal{E}_t^i \subseteq \mathcal{F}_t$, $t \ge 0$, $i = 1, 2$, and let $u = (\theta, \pi)$, where $\theta = (\theta_0, \theta_1)$ and $\pi = (\pi_0, \pi_1)$ are the controls of players I and II, respectively. We assume that $\theta = (\theta_0, \theta_1)$ is adapted to $\mathcal{E}_t^1$ and that $\pi = (\pi_0, \pi_1)$ is adapted to $\mathcal{E}_t^2$. Denote by $\Theta$ and $\Pi$ the sets of admissible controls $\theta$ and $\pi$, respectively.

Suppose the players act on the system with strategy $(\theta, \pi) \in \Theta \times \Pi$; then the payoffs associated with I and II are, respectively, $J_1^{(\theta,\pi)}(x)$ and $J_2^{(\theta,\pi)}(x)$ of the form

(5.1) $J_i(\theta, \pi) = E^x\Big[\int_0^T f_i(t, X(t), u(t))\,dt + g_i(X(T))\Big], \quad i = 1, 2.$

The problem is to find a control $(\theta^*, \pi^*) \in \Theta \times \Pi$ such that

(5.2) $J_1(\theta, \pi^*) \le J_1(\theta^*, \pi^*)$ for all $\theta \in \Theta$,

(5.3) $J_2(\theta^*, \pi) \le J_2(\theta^*, \pi^*)$ for all $\pi \in \Pi$.

The pair of controls $(\theta^*, \pi^*)$ is called a Nash equilibrium point for the game, because when player I (resp. II) acts with the strategy $\theta^*$ (resp. $\pi^*$), the best that II (resp. I) can do is to act with $\pi^*$ (resp. $\theta^*$).

Let us introduce the Hamiltonian functions $H_1$ and $H_2$ associated with this game, defined from $[0,T] \times \mathbb{R}^n \times K_1 \times K_2 \times \mathbb{R}^n \times \mathbb{R}^{n \times n} \times \mathcal{R}$ to $\mathbb{R}$ by

(5.4) $H_i(t, x, \theta, \pi, p^i, q^i, r^i) = f_i(t, x, \theta, \pi) + b^T(t, x, \theta, \pi)\,p^i + \mathrm{tr}\big(\sigma^T(t, x, \theta, \pi)\,q^i\big) + \sum_{k,j=1}^n \int_{\mathbb{R}} \gamma_{kj}(t, x, \theta, \pi, z)\,r_{kj}^i(t,z)\,\nu_j(dz_j), \quad i = 1, 2.$

And we also have the adjoint equations for the game, as follows:

(5.5) $dp^i(t) = -\nabla_x H_i(t, X(t), \theta(t), \pi(t), p^i(t), q^i(t), r^i(t,\cdot))\,dt + q^i(t)\,dB(t) + \int_{\mathbb{R}^n} r^i(t,z)\,\tilde N(dt,dz), \quad t < T;$
$p^i(T) = \nabla g_i(X(T)), \quad i = 1, 2.$

The following result is a generalization of Theorem 1:

Theorem 5. Let $(\hat\theta, \hat\pi) \in \Theta \times \Pi$ with corresponding state process $\hat X(t) = X^{(\hat\theta,\hat\pi)}(t)$. Suppose there exists a solution $(\hat p^i(t), \hat q^i(t), \hat r^i(t,z))$, $i = 1, 2$, of the corresponding adjoint equations (5.5) such that for all $\theta \in \Theta$ and $\pi \in \Pi$ we have

(5.6) $E[H_1(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p^1(t), \hat q^1(t), \hat r^1(t,\cdot)) \mid \mathcal{E}_t^1] \ge E[H_1(t, \hat X(t), \theta(t), \hat\pi(t), \hat p^1(t), \hat q^1(t), \hat r^1(t,\cdot)) \mid \mathcal{E}_t^1],$

(5.7) $E[H_2(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p^2(t), \hat q^2(t), \hat r^2(t,\cdot)) \mid \mathcal{E}_t^2] \ge E[H_2(t, \hat X(t), \hat\theta(t), \pi(t), \hat p^2(t), \hat q^2(t), \hat r^2(t,\cdot)) \mid \mathcal{E}_t^2].$

Moreover, suppose that for all $t \in [0,T]$, $H_i(t, x, \theta, \pi, \hat p^i(t), \hat q^i(t), \hat r^i(t,\cdot))$, $i = 1, 2$, is concave in $(x, \theta, \pi)$ and $g_i(x)$, $i = 1, 2$, is concave in $x$. Then $(\hat\theta(t), \hat\pi(t))$ is a Nash equilibrium point for the game and

(5.8) $J_1(\hat\theta, \hat\pi) = \sup_{\theta \in \Theta} J_1(\theta, \hat\pi),$

(5.9) $J_2(\hat\theta, \hat\pi) = \sup_{\pi \in \Pi} J_2(\hat\theta, \pi).$
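The Nash property (5.2)-(5.3) is easy to check numerically in a static toy game. The payoffs below are made-up concave stand-ins for the functionals in (5.1), chosen so that the best responses can be computed by hand; this illustrates the definition of an equilibrium point only, not Theorem 5 itself.

```python
import numpy as np

def J1(th, pi):  # player I's payoff, concave in th
    return -(th - 1.0)**2 + 0.5 * th * pi

def J2(th, pi):  # player II's payoff, concave in pi
    return -(pi - 2.0)**2 + 0.5 * th * pi

# Best responses: th = 1 + pi/4 and pi = 2 + th/4, whose fixed point is
# (th*, pi*) = (1.6, 2.4). Check (5.2)-(5.3) on a grid of deviations.
th_star, pi_star = 1.6, 2.4
grid = np.linspace(-5, 5, 1001)
assert J1(th_star, pi_star) >= J1(grid, pi_star).max() - 1e-9   # (5.2)
assert J2(th_star, pi_star) >= J2(th_star, grid).max() - 1e-9   # (5.3)
print("Nash equilibrium verified at", (th_star, pi_star))
```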
Proof. As in the proof of Theorem 1 we have

$J_1(\hat\theta, \hat\pi) - J_1(\theta, \hat\pi) = E\Big[\int_0^T \big\{f_1(t, \hat X(t), \hat\theta(t), \hat\pi(t)) - f_1(t, X^{(\theta)}(t), \theta(t), \hat\pi(t))\big\}\,dt + g_1(\hat X(T)) - g_1(X^{(\theta)}(T))\Big]$

(5.10) $\ge E\Big[\int_0^T \big\{H_1(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p^1(t), \hat q^1(t), \hat r^1(t,\cdot)) - H_1(t, X^{(\theta)}(t), \theta(t), \hat\pi(t), \hat p^1(t), \hat q^1(t), \hat r^1(t,\cdot))$
$\qquad - \nabla_x H_1(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p^1(t), \hat q^1(t), \hat r^1(t,\cdot))^T(\hat X(t) - X^{(\theta)}(t))\big\}\,dt\Big],$

where $X^{(\theta)}(t) := X^{(\theta,\hat\pi)}(t)$. From (5.6) and concavity of $H_1$ in $(x, \theta)$ we have

(5.11) $E\Big[\int_0^T \{H_1(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p^1(t), \hat q^1(t), \hat r^1(t,\cdot)) - H_1(t, X^{(\theta)}(t), \theta(t), \hat\pi(t), \hat p^1(t), \hat q^1(t), \hat r^1(t,\cdot))\}\,dt\Big]$
$\ge E\Big[\int_0^T \nabla_x H_1(t, \hat X(t), \hat\theta(t), \hat\pi(t), \hat p^1(t), \hat q^1(t), \hat r^1(t,\cdot))^T(\hat X(t) - X^{(\theta)}(t))\,dt\Big].$

Hence

(5.12) $J_1(\hat\theta, \hat\pi) \ge J_1(\theta, \hat\pi).$

Since this holds for all $\theta \in \Theta$ we have

(5.13) $J_1(\hat\theta, \hat\pi) = \sup_{\theta \in \Theta} J_1(\theta, \hat\pi).$

In the same way we show that $J_2(\hat\theta, \pi) \le J_2(\hat\theta, \hat\pi)$ for all $\pi \in \Pi$ and $J_2(\hat\theta, \hat\pi) = \sup_{\pi \in \Pi} J_2(\hat\theta, \pi)$, whence the desired result. $\Box$

References

[1] A. Bensoussan. Maximum principle and dynamic programming approaches of the optimal control of partially observed diffusions. Stochastics 9 (1983).
[2] F. Baghery and B. Øksendal. A maximum principle for stochastic control with partial information. Stochastic Analysis and Applications 25, 705-717 (2007).
[3] R. Cont and P. Tankov. Financial Modelling with Jump Processes. Chapman & Hall/CRC, 2004.
[4] W. H. Fleming and P. E. Souganidis. On the existence of value functions of two-player, zero-sum stochastic differential games. Indiana Univ. Math. J. 38, 293-314 (1989).
[5] N. Framstad, B. Øksendal and A. Sulem. Stochastic maximum principle for optimal control of jump diffusions and applications to finance. J. Optimization Theory and Appl. 121 (1), 77-98 (2004). Errata: J. Optimization Theory and Appl. 124 (2), 511-512 (2005).
[6] S. Mataramvura and B. Øksendal. Risk minimizing portfolios and HJB equations for stochastic differential games. E-print 4, University of Oslo 2005. To appear in Stochastics.
[7] B. Øksendal and A. Sulem. Applied Stochastic Control of Jump Diffusions. Second Edition. Springer (2007).
[8] B. Øksendal and A. Sulem. A game theoretic approach to martingale measures in incomplete markets. E-print 24, University of Oslo 2006.
[9] S. Tang. The maximum principle for partially observed optimal control of stochastic differential equations. SIAM Journal on Control and Optimization 36 (5), 1596-1617 (1998).

(Ta Thi Kieu An) Centre of Mathematics for Applications (CMA), Department of Mathematics, University of Oslo, P.O. Box 1053, Blindern, N-0316 Oslo, Norway
E-mail address: atkieu@math.uio.no

(Bernt Øksendal) Centre of Mathematics for Applications (CMA), Department of Mathematics, University of Oslo, P.O. Box 1053, Blindern, N-0316 Oslo, Norway, and Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway
E-mail address: oksendal@math.uio.no