Riemannian metrics on positive definite matrices related to means (joint work with Dénes Petz) Fumio Hiai (Tohoku University) 010, August (Leipzig) 1
Plan 0. Motivation and introduction 1. Geodesic shortest curve and geodesic distance. Characterizing isometric transformation 3. Two kinds of isometric families of Riemannian metrics 4. Comparison property 5. Problems Reference F. Hiai and D. Petz, Riemannian metrics on positive definite matrices related to means, Linear Algebra Appl. 430 (009), 3105 3130.
0. Motivation and introduction Question? For n n positive definite matrices A, B and 0 < t < 1, the well-known convergences of Lie-Trotter type are: lim α 0 ( (1 t)a α + tb α) 1/α = exp((1 t) log A + t log B), lim α 0 ( A α # α B α) 1/α = exp((1 t) log A + t log B). What is the Riemannian geometry behind? Can we explain these convergences in terms of Riemannian geometry? 3
Notation M n (the n n complex matrices) is a Hilbert space with respect to the Hilbert-Schmidt inner product X, Y HS := Tr X Y. H n (the n n Hermitian matrices) is a real subspace of M n, = the Euclidean space R n. P n (the n n positive definite matrices) is an open subset of H n, a smooth differentiable manifold with T D P n = H n. D n (the n n positive definite matrices of trace 1) is a smooth differentiable submanifold of P n with T D D n = {H H n : Tr H = 0}. A Riemannian metric K D (H, K) is a family of inner products on H n depending smoothly on the foot point D P n. 4
For D P n, write L D X := DX and R D X := XD, X M n. L D and R D are commuting positive operators on (M n,, HS ). Statistical Riemannian metric [Mostow, Skovgaard, Ohara-Suda-Amari, Lawson-Lim, Moakher, Bhatia-Holbrook] g D (H, K) := Tr D 1 HD 1 K = H, (L D R D ) 1 K HS This is considered as a geometry on the Gaussian distributions p D with zero mean and covariance matrix D. The Boltzmann entropy is S(p D ) = 1 log(det D) + const. and g D (H, K) = s t S(p D+sH+tK) s=t=0 (Hessian). 5
Congruence-invariant For any invertible X M n, Geodesic curve g XDX (XHX, XKX ) = g D (H, K) γ(t) = A # t B := A 1/ (A 1/ BA 1/ ) t A 1/ (0 t 1) The geodesic midpoint γ(1/) is the geometric mean A # B [Pusz-Woronowicz]. Geodesic distance δ(a, B) = log(a 1/ BA 1/ ) HS This is the so-called Thompson metric. 6
Monotone metrics [Petz] K β(d) (β(x), β(y )) K D (X, Y ) if β : M n M m is completely positive and trace-preserving. Theorem (Petz, 1996) There is a one-to-one correspondence: { monotone metrics with KD (I, I) = Tr D 1} { operator monotone functions f : (0, ) (0, ) with f(1) = 1 } by K f D (X, Y ) = X, (Jf D ) 1 Y HS, J f D := f(l DR 1 D )R D. K f D is symmetric (i.e., Kf D (X, Y ) = Kf D (Y, X )) if and only if f is symmetric, (i.e., xf(x 1 ) = f(x)). A symmetric monotone metric is also called a quantum Fisher information. 7
Theorem (Kubo-Ando, 1980) There is a one-to-one correspondence: { } operator means σ { operator monotone functions f : (0, ) (0, ) with f(1) = 1 } by A σ f B = A 1/ f(a 1/ BA 1/ )A 1/, A, B P n. σ f is symmetric if and only if f is symmetric. 8
Quantum skew information When 0 < p < 1, the Wigner-Yanase-Dyson skew information is I WYD D (p, K) := 1 Tr [Dp, K][D 1 p, K] = p(1 p) K f p D (i[d, K], i[d, K]) for D D n, K H n, where f p is an operator monotone function: f p (x) := p(1 p) (x 1) (x p 1)(x 1 p 1). For each operator monotone function f that is regular (i.e., f(0) := lim x 0 f(x) > 0), Hansen introduced the metric adjusted skew information (or quantum skew information): I f f(0) D (K) := Kf D (i[d, K], i[d, K]), D D n, K H n. 9
Generalized covariance For an operator monotone function f, ϕ D [H, K] := H, J f D K HS, D D n, H, K H n, Tr H = Tr K = 0, ϕ D [H, K] = Tr DHK if D and K are commuting. Motivation The above quantities are Riemannian metrics in the form K φ D (H, K) := H, φ(l D, L R ) 1 K HS = k i,j=1 φ(λ i, λ j ) 1 Tr P i HP j K, where D = k i=1 λ ip i is the spectral decomposition, and the kernel function φ : (0, ) (0, ) (0, ) is in the form φ(x, y) = M(x, y) θ, a degree θ R power of a certain mean M(x, y). A systematic study is desirable, from the viewpoints of geodesic curves, scalar curvature, information geometry, etc. 10
1. Geodesic shortest curve and geodesic distance Let M 0 denote the set of smooth symmetric homogeneous means M : (0, ) (0, ) (0, ) satisfying M(x, y) = M(y, x), M(αx, αy) = αm(x, y), α > 0, M(x, y) is non-decreasing and smooth in x, y, min{x, y} M(x, y) max{x, y}. For M M 0 and θ R, define φ(x, y) := M(x, y) θ and consider a Riemannian metric on P n given by K φ D (H, K) := H, φ(l D, R D ) 1 K HS, D P n, H, K H n. 11
When D = UDiag(λ 1,..., λ n )U is the diagonalization, ([ ] ) φ(l D, R D ) 1/ 1 H = U (U HU) U, φ(λi, λ j ) [ ] K φ D (H, H) = φ(l D, R D ) 1/ H 1 HS = φ(λi, λ j ) where denotes the Schur product. ij ij (U HU) HS, For a C 1 curve γ : [0, 1] P n, the length of γ is L φ (γ) := 1 0 K φ γ(t) (γ (t), γ (t)) dt = The geodesic distance between A, B is 1 0 φ(l γ(t), R γ(t) ) 1/ γ (t) HS dt. δ φ (A, B) := inf { L φ (γ) : γ is a C 1 curve joining A, B }. A geodesic shortest curve is a γ joining A, B s.t. L φ (γ) = δ φ (A, B) if exists. 1
When φ(x, y) := M(x, y) θ for M M 0 and θ R as above, Theorem Assume A, B P n are commuting (i.e., AB = BA). Then, independently of the choice of M M 0, the following hold: The geodesic distance between A, B is θ A θ B θ δ φ (A, B) = HS if θ, log A log B HS if θ =, A geodesic shortest curve joining A, B is ( θ ) (1 t)a + tb θ θ, 0 t 1 if θ, γ(t) := exp((1 t) log A + t log B), 0 t 1 if θ =, If M(x, y) is an operator monotone mean and θ = 1, then γ(t) = ( (1 t)a 1/ + tb 1/), 0 t 1, is a unique geodesic shortest curve joining A, B. 13
Theorem (P n, K φ ) is complete (i.e., the geodesic distance δ φ (A, B) is complete) if and only if θ =. Proposition For every M M 0 and A, B P n there exists a smooth geodesic shortest curve for K φ joining A, B whenever θ is sufficiently near depending on M and A, B. 14
. Characterizing isometric transformation For N, M M 0 and κ, θ R, define ψ, φ : (0, ) (0, ) (0, ) by ψ(x, y) := N(x, y) κ, φ(x, y) := M(x, y) θ, and Riemannian metrics K ψ, K φ by K ψ D (H, K) := H, ψ(l D, R D ) 1 K HS, K φ D (H, K) := H, φ(l D, R D ) 1 K HS. F : (0, ) (0, ) is an onto smooth function such that F (x) 0 for all x > 0. Theorem When α > 0, the transformation D P n F (D) P n is isometric from (P n, α K φ ) onto (P n, K ψ ) if and only if one of the following (1 ) (5 ) holds: 15
(1 ) κ = θ = 0 and F (x) = αx, x > 0. (N, M are irrelevant; K ψ and K φ are the Euclidean metric.) ( ) κ = 0, θ 0, and F (x) = α θ x θ, x > 0, ( ) /θ θ x y M(x, y) =, x, y > 0. x θ y θ (N is irrelevant; K φ is a pull-back of the Euclidean metric.) (3 ) κ 0,, θ = 0 and F (x) = N(x, y) = ( α ( κ κ ) x κ, x > 0, x y x κ y κ ) /κ, x, y > 0. (M is irrelevant.) 16
(4 ) κ, θ 0, and F (x) = M(x, y) = (5 ) κ = θ = and ( α ( κ θ θ κ ) κ θ x x θ κ x y κ, x > 0, y θ κ ) /θ ) N (x θ θ κ/θ, κ, y κ x, y > 0. or F (x) = cx α, x > 0 (c > 0 is a constant), ( ) x y M(x, y) = α x α y α N ( x α, y α), x, y > 0, F (x) = cx α, x > 0 (c > 0 is a constant), ( ) x y M(x, y) = α y α x α N ( x α, y α), x, y > 0. 17
3. Two kinds of isometric families of Riemannian metrics For N M 0, κ R \ {}, θ R \ {0, }, and α R \ {0}, define N κ,θ (x, y) := ( θ κ ( x y N α (x, y) := α x α y α x θ κ x y y θ κ ) /θ ) N (x θ θ κ/θ, κ, y κ ) N ( x α, y α), x, y > 0. In particular, N 0,θ s are Stolarsky means S θ (x, y) := ( θ x y x θ y θ ) /θ, interpolating the following typical means: 18
S (x, y) = M A (x, y) := x + y (arithmetic mean), ( ) x + y S 1 (x, y) = M (x, y) := (root mean), x y S (x, y) := lim S θ (x, y) = M L (x, y) := (logarithmic mean), θ log x log y S 4 (x, y) = M G (x, y) := xy (geometric mean). The metric corresponding to the root mean (called the Wigner-Yanase metric) is a unique monotone metric that is a pull-back of the Euclidean metric [Gibilisco-Isola]. 19
Proposition (a) For any N M 0, κ R \ {}, and θ R \ {0, }, N κ,θ (x, y) = S (θ κ) κ (x, y) (θ κ) θ( κ) N ( x θ κ, y θ κ ) κ/θ, lim N κ,θ(x, y) = M L (x, y). θ If 0 κ θ < or < θ κ, then N κ,θ M 0. (b) For any N M 0 and α R \ {0}, If 0 < α 1, then N α M 0. N α (x, y) = S α (x, y) 1 α N(x α, y α ), lim N α(x, y) = M L (x, y). α 0 0
For any N M 0, the above theorem and proposition show: When κ 0 and κ, K N θ κ,θ (κ θ < or κ θ > ) is a one-parameter isometric family of Riemannian metrics starting from K N κ and converging to K M L as θ. When κ =, K N α (1 α > 0) is a one-parameter isometric family of Riemannian metrics starting from K N and converging to K M L as α 0. Claim The metric K M L is an attractor among the Riemannian metrics K M θ (M M 0, θ 0). The geodesic shortest curve for K M L joining A, B Pn is γ A,B (t) := exp((1 t) log A + t log B) (0 t 1). The geodesic distance between A, B with respect to K M L is δ M L (A, B) := log A log B HS. 1
Theorem Let N M 0 and A, B P n be arbitrary. (a) For the one-parameter family K N θ κ,θ (0 κ θ < or κ θ > ), where δ N θ κ,θ (A, B) = δ N κ(a k,θ, B κ,θ ) log A log B HS (θ ), A κ,θ := ( ) κ κ θ A κ, Bκ,θ := θ ( ) κ κ θ B κ. θ (b) For the one-parameter family K N α (1 α > 0), δ N α (A, B) = 1 α δ N (Aα, B α ) log A log B HS (α 0).
Theorem Let N M 0 and A, B P n be arbitrary. In the following, assume that geodesic shortest curves are always parametrized under constant speed. (a) If γ Aκ,θ,B κ,θ (t) is the geodesic shortest curve for K N κ A κ,θ, B κ,θ, then the geodesic shortest curve for K N θ κ,θ ( ) ( ) κ θ θ given by γ Aκ,θ,B κ,θ (t) and θ κ joining joining A, B is ( ) θ θ ( ) κ θ lim γ Aκ,θ,B θ κ κ,θ (t) = exp((1 t) log A + t log B) (0 t 1). (b) If γ A α,b α (t) is the geodesic shortest curve for K N joining A α, B α, then the geodesic shortest curve for K N α ( 1/α γ Aα,B (t)) α and joining A, B is given by ( 1/α lim γ Aα,B (t)) α = exp((1 t) log A + t log B) (0 t 1). α 0 3
The above convergences for the geodesic shortest curves may be considered as variations of the Lie-Trotter formula. Examples When κ = 0, N 0,θ = S θ is the family of Stolarsky means. The geodesic distance and the geodesic shortest curve for K Sθ θ δ S θ θ (A, B) = θ A θ B θ HS, are γ A,B (t) = ) ((1 t)a θ + tb θ θ. We have lim θ θ θ A B θ HS = log A log B HS, ) lim ((1 t)a θ + tb θ θ θ = exp((1 t) log A + t log B). 4
When N = M G (geometric mean), K M G ( ) is the statistical Riemannian x y metric and N α (x, y) = α x α y α (xy) α/, x, y > 0. The geodesic distance and the geodesic shortest curve for K N α are δ N α (A, B) = 1 α δ M G (Aα, B α ) = log(a α/ B α A α/ ) 1/α HS, γ A,B (t) = (A α # t B α ) 1/α. We have lim α 0 log(a α/ B α A α/ ) 1/α HS = log A log B HS (decreasing), lim α 0 (Aα # t B α ) 1/α = exp((1 t) log A + t log B). Remark When σ is an operator mean corresponding to an operator monotone function f and s := f (1), lim α 0 (Aα σ B α ) 1/α = exp((1 s) log A + s log B). 5
4. Comparison property Theorem Let φ (1), φ () : (0, ) (0, ) (0, ) be smooth symmetric kernel functions. The following conditions are equivalent: (i) φ (1) (x, y) φ () (x, y) for all x, y > 0; (ii) K φ(1) D (H, H) Kφ() D (H, H) for all D P n and H H n ; (iii) L φ (1)(γ) L φ ()(γ) for all C 1 curve γ P n ; (iv) δ φ (1)(A, B) δ φ ()(A, B) for all A, B P n. For example, for θ R, let φ θ (x, y) := S θ (x, y) θ and φ(x, y) := M(x, y) θ with M M 0. If θ > 0 and M(x, y) S θ (x, y) for all x, y > 0, then θ θ A B θ HS if θ, δ φ (A, B) δ φθ (A, B) = log A log B HS if θ =. 6
Theorem If AB BA and φ(x, y) φ θ (x, y) for all x, y > 0 with x y, then, δ φ (A, B) δ φθ (A, B). In the case θ = and φ(x, y) = M G (x, y), log(a 1/ BA 1/ ) HS log A log B HS (exponential metric increasing [Mostow, Bhatia, Bhatia-Holbrook]) In the case θ = and φ(x, y) = M A (x, y), δ M A (A, B) log A log B HS (exponential metric decreasing) In the case θ = 1, δ MG (A, B) δ ML (A, B) A 1/ B 1/ HS δ MA (A, B) (square metric increasing/decreasing) Bogoliubov Wigner-Yanase Bures-Uhlmann 7
Unitarily invariant norms For a unitarily invariant norm, L φ, (γ) := 1 0 φ(l γ(t), R γ(t) ) 1/ γ (t) dt, δ φ, (A, B) := inf{l φ, (γ) : γ is a C 1 curve joining A, B}. (P n, δ φ, ) is no longer a Riemannian manifold but a differential manifold of Finsler type. Many results above hold true even when HS is replaced by. Let φ (k) (x, y) := M (k) (x, y) θ, k = 1,. To compare L φ (1), (γ) and L φ (), (γ), the infinite divisibility of M (1) (x, y)/m () (x, y) is crucial: ( M (1) (e t, 1) M () (e t, 1) is positive definite on R for any r > 0 [Bhatia-Kosaki, Kosaki]. ) r 8
5. Problems Want to prove the unique existence of geodesic shortest curve between A, B P n with respect to K φ. Need to study (D n, K φ ) rather than (P n, K φ ) for applicatioins to quantum information. 9
Thank you for your attention. 30