Derivation for Input of Factor Graph Representation


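1 Sum-Product Primal

Based on the original LP formulation

\[
\max_{b}\ \sum_{\alpha}\sum_{x_\alpha} b_\alpha(x_\alpha)\,\theta_\alpha(x_\alpha) + \sum_{i}\sum_{x_i} b_i(x_i)\,\theta_i(x_i)
\quad \text{s.t.}\quad b_i, b_\alpha \in \Delta,\quad \forall i,\ \alpha \in N(i),\ x_i:\ \sum_{x_\alpha \setminus x_i} b_\alpha(x_\alpha) = b_i(x_i), \tag{1}
\]

we define $V_k$ as the node set allocated to the $k$-th core. $F \in \{\alpha \cap V_k\} \setminus \{\{i\} \mid i = 1, 2, 3, \ldots, n\}$ is defined as the index of a free variable set on the $k$-th core. Note that the set of $F$ does not include single variables. Factors can share the same free variable set on the $k$-th core. By requiring the factors with the same free variable set to have the same marginal distribution on their common free variable set, we can compress these factors and dramatically reduce the number of dual variables to increase efficiency. In other words, we replace the original local marginal polytope with a hierarchical consistency polytope, where marginal consistency is jointly replaced by factor to free variable set consistency and free variable set to node consistency.

\[
\begin{aligned}
\max_{b}\ \sum_{k}\Big[\, & \sum_{i \in V_k}\sum_{x_i} b_i(x_i)\,\theta_i(x_i)
 + \sum_{i \in V_k}\sum_{\alpha \in N(i)}\sum_{x_\alpha} b^k_\alpha(x_\alpha)\,\hat\theta_\alpha(x_\alpha)
 + \sum_{F}\sum_{\alpha \in N(F)}\sum_{x_\alpha} b^k_\alpha(x_\alpha)\,\hat\theta_\alpha(x_\alpha) \\
 & + \epsilon \sum_{i \in V_k} c_i H(b_i) + \epsilon \sum_{i \in V_k}\sum_{\alpha \in N(i)} c_\alpha H(b^k_\alpha)
 + \epsilon \sum_{F} c_F H(b_F) + \epsilon \sum_{F}\sum_{\alpha \in N(F)} c_\alpha H(b^k_\alpha) \Big] \\
\text{s.t.}\quad & \sum_{x_\alpha \setminus x_F} b^k_\alpha(x_\alpha) = b_F(x_F), \qquad \forall F,\ \alpha \in N(F),\ x_F \\
 & \sum_{x_\alpha \setminus x_i} b^k_\alpha(x_\alpha) = b_i(x_i), \qquad \forall i \in V_k,\ \alpha \in N(i),\ x_i \\
 & \sum_{x_F \setminus x_i} b_F(x_F) = b_i(x_i), \qquad \forall F,\ i \in N(F),\ x_i \\
 & b^k_\alpha(x_\alpha) = b_\alpha(x_\alpha), \qquad \forall k, \alpha : \alpha \cap V_k \neq \emptyset,\ x_\alpha
\end{aligned} \tag{2}
\]

More formally, the hierarchical consistency polytope is the feasible set of the above LP with entropy barrier functions. Inside the sum over $k$, $F$ ranges over the indices of free variable sets on the $k$-th core. $\alpha \in N(F)$ if and only if $\alpha \cap V_k = F$. $i \in N(F)$ if and only if $i$ is a neighbor of $F$ on the $k$-th core. And $\alpha \in N(i)$ if and only if $\alpha \cap V_k = \{i\}$. The new formulation is an approximation to the LP relaxation of the original MAP problem. Note that we distinguish factors intersecting the free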

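variables in a single variable from those intersecting them in a multi-variate set, so that we obtain a clearer factor graph message passing representation after compressing factors. As

\[
\sum_{x_\alpha \setminus x_i} b_\alpha(x_\alpha) = \sum_{x_F \setminus x_i}\sum_{x_\alpha \setminus x_F} b_\alpha(x_\alpha) = \sum_{x_F \setminus x_i} b_F(x_F) = b_i(x_i),
\]

every configuration of $b_i$, $b_\alpha$ in the hierarchical consistency polytope is in the local marginal polytope. For an arbitrary valid distribution denoted by $b$, setting $b_\alpha(x_\alpha) = b(x_\alpha)$, $b_F(x_F) = b(x_F)$ and $b_i(x_i) = b(x_i)$, we can see that the valid distribution corresponds to a point in the hierarchical consistency polytope. In other words, the hierarchical consistency polytope (the new feasible set) is guaranteed to be a subset of the local marginal polytope. On the other hand, every valid probability distribution in the marginal polytope is guaranteed to be in the hierarchical consistency polytope. We therefore have a smaller search space than the local marginal polytope, while all valid distributions in the marginal polytope are still included in the new search space.

2 Sum-Product Dual

By associating dual variables $\delta_{F\to\alpha}$, $\delta_{i\to\alpha}$, $\lambda_{i\to F}$ and $\nu_{k\to\alpha}(x_\alpha)$ with the three types of constraints in Eq. (2), we have the following claim.

Claim 1. Let $\alpha \in N(F)$ if and only if $\alpha \cap V_k = F$. The dual of the program in Eq. (2) is

\[
\begin{aligned}
\min_{\lambda,\delta,\nu}\ \sum_{k}\Big[\, & \sum_{i \in V_k} \ln \sum_{x_i} \exp\Big(\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i) - \sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i)\Big) \\
 & + \sum_{i \in V_k}\sum_{\alpha \in N(i)} \ln \sum_{x_i}\sum_{x_\alpha \setminus x_i} \exp\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{i\to\alpha}(x_i)\Big) \\
 & + \sum_{F} \ln \sum_{x_F} \exp\Big(-\sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big) \\
 & + \sum_{F}\sum_{\alpha \in N(F)} \ln \sum_{x_F}\sum_{x_\alpha \setminus x_F} \exp\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{F\to\alpha}(x_F)\Big) \Big] \\
\text{s.t.}\quad & \forall \alpha, x_\alpha:\ \sum_{k:\, \alpha \cap V_k \neq \emptyset} \nu_{k\to\alpha}(x_\alpha) = 0
\end{aligned} \tag{3}
\]

When the consistency variables $\nu$ are fixed, the sub-program for the $k$-th core and the corresponding $\lambda$, $\delta$ update rules are shown in Claim 2.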

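Claim 2. When $\nu$ is fixed, block coordinate descent can be achieved by solving the program

\[
\min_{\lambda}\ \sum_{i \in V_k} \epsilon \hat c_i \ln \sum_{x_i} \exp\Big(\frac{\hat\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)}{\epsilon \hat c_i}\Big)
 + \sum_{F} \epsilon \hat c_F \ln \sum_{x_F} \exp\Big(\frac{\hat\theta_F(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)}{\epsilon \hat c_F}\Big) \tag{4}
\]

where

\[
\gamma_{\alpha\to F}(x_F) = \ln \sum_{x_\alpha \setminus x_F} \exp\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha)\big), \qquad
\gamma_{\alpha\to i}(x_i) = \ln \sum_{x_\alpha \setminus x_i} \exp\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha)\big),
\]
\[
\hat\theta_F(x_F) = \sum_{\alpha \in N(F)} \gamma_{\alpha\to F}(x_F), \qquad
\hat\theta_i(x_i) = \theta_i(x_i) + \sum_{\alpha \in N(i)} \gamma_{\alpha\to i}(x_i),
\]

and $\hat c_F = c_F + \sum_{\alpha \in N(F)} c_\alpha$, $\hat c_i = c_i + \sum_{\alpha \in N(i)} c_\alpha$.

3 Sum-Product Update Rule

From the programs in Eq. (18) and Eq. (20), the update rule is summarized in Claim 3. Note that a shortened notation for $\lambda$ is used in the following for simplicity.

Claim 3. The message passing rule for $\lambda$ is exactly the same as the convex BP rule:

\[
\forall i,\ F \in N(i):\quad \lambda_{i\to F}(x_i) = \frac{\hat c_F}{\bar c_i}\Big(\hat\theta_i(x_i) + \sum_{\beta \in N(i)} \mu_{\beta\to i}(x_i)\Big) - \mu_{F\to i}(x_i) \tag{5}
\]

where

\[
\mu_{F\to i}(x_i) = \epsilon \hat c_F \ln \sum_{x_F \setminus x_i} \exp\Big(\frac{\hat\theta_F(x_F) + \sum_{j \in N(F) \setminus i} \lambda_{j\to F}(x_j)}{\epsilon \hat c_F}\Big)
\qquad\text{and}\qquad \bar c_i = \hat c_i + \sum_{F \in N(i)} \hat c_F .
\]

The block coordinate rule for the variables $\nu$ is

\[
\nu_{k\to\alpha}(x_\alpha) = \frac{1}{N_P(\alpha)} \sum_{j:\, \alpha \cap V_j \neq \emptyset} \delta^{j}_{\alpha}(x_{\alpha \cap V_j}) - \delta^{k}_{\alpha}(x_{\alpha \cap V_k}) \tag{6}
\]

where $N_P(\alpha)$ is the number of sub-programs in which $\alpha$ is involved, and $\delta^{k}_{\alpha}$ denotes $\delta_{i\to\alpha}$ or $\delta_{F\to\alpha}$ on the $k$-th core, depending on whether $\alpha \cap V_k$ is a single node or a free variable set. For an arbitrary configuration of $\lambda$, the variables $\delta$ can be decoded as

\[
\delta_{F\to\alpha}(x_F) = \frac{c_\alpha}{\hat c_F}\Big(\hat\theta_F(x_F) + \sum_{j \in N(F)} \lambda_{j\to F}(x_j)\Big) - \gamma_{\alpha\to F}(x_F) \tag{7}
\]

and

\[
\delta_{i\to\alpha}(x_i) = \frac{c_\alpha}{\hat c_i}\Big(\hat\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\Big) - \gamma_{\alpha\to i}(x_i) \tag{8}
\]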

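which is identical to

\[
\delta_{i\to\alpha}(x_i) = \frac{c_\alpha}{\bar c_i}\Big(\hat\theta_i(x_i) + \sum_{\beta \in N(i)} \mu_{\beta\to i}(x_i)\Big) - \gamma_{\alpha\to i}(x_i) \tag{9}
\]

by substituting $\lambda$.

4 Max-Product Primal

The LP primal formulation, which corresponds to Max-Product message passing, is

\[
\begin{aligned}
\max_{b}\ \sum_{k}\Big[\, & \sum_{i \in V_k}\sum_{x_i} b_i(x_i)\,\theta_i(x_i)
 + \sum_{i \in V_k}\sum_{\alpha \in N(i)}\sum_{x_\alpha} b^k_\alpha(x_\alpha)\,\hat\theta_\alpha(x_\alpha)
 + \sum_{F}\sum_{\alpha \in N(F)}\sum_{x_\alpha} b^k_\alpha(x_\alpha)\,\hat\theta_\alpha(x_\alpha) \Big] \\
\text{s.t.}\quad & \sum_{x_\alpha \setminus x_F} b^k_\alpha(x_\alpha) = b_F(x_F), \qquad \forall F,\ \alpha \in N(F),\ x_F \\
 & \sum_{x_\alpha \setminus x_i} b^k_\alpha(x_\alpha) = b_i(x_i), \qquad \forall i \in V_k,\ \alpha \in N(i),\ x_i \\
 & \sum_{x_F \setminus x_i} b_F(x_F) = b_i(x_i), \qquad \forall F,\ i \in N(F),\ x_i \\
 & b^k_\alpha(x_\alpha) = b_\alpha(x_\alpha), \qquad \forall k, \alpha : \alpha \cap V_k \neq \emptyset,\ x_\alpha \\
 & \sum_{x_i} b_i(x_i) = 1, \quad \sum_{x_F} b_F(x_F) = 1, \quad \sum_{x_\alpha} b^k_\alpha(x_\alpha) = 1, \qquad \forall k, \alpha : \alpha \cap V_k \neq \emptyset \\
 & b_i,\ b_F,\ b^k_\alpha,\ b_\alpha \geq 0
\end{aligned} \tag{10}
\]

5 Max-Product Dual

By associating the constraints with $\delta_{F\to\alpha}$, $\delta_{i\to\alpha}$, $\lambda_{i\to F}$, $\nu_{k\to\alpha}(x_\alpha)$ and four families of normalization multipliers $\eta$ (one each for $b_i$, $b_F$, and $b^k_\alpha$ with $\alpha \in N(i)$ or $\alpha \in N(F)$), the dual problem is shown in the following claim.

Claim 4. The dual of the Max-Product problem is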

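\[
\begin{aligned}
\min_{\lambda,\delta,\nu}\ \sum_{k}\Big[\, & \sum_{i \in V_k} \max_{x_i}\Big(\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i) - \sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i)\Big)
 + \sum_{i \in V_k}\sum_{\alpha \in N(i)} \max_{x_\alpha}\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{i\to\alpha}(x_i)\Big) \\
 & + \sum_{F} \max_{x_F}\Big(-\sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big)
 + \sum_{F}\sum_{\alpha \in N(F)} \max_{x_\alpha}\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{F\to\alpha}(x_F)\Big) \Big] \\
\text{s.t.}\quad & \sum_{k:\, \alpha \cap V_k \neq \emptyset} \nu_{k\to\alpha}(x_\alpha) = 0, \qquad \forall \alpha, x_\alpha
\end{aligned} \tag{11}
\]

Claim 5. When $\nu$ is fixed, the objective function of Eq. (42) is lower bounded by

\[
\min_{\lambda}\ \sum_{i \in V_k} \max_{x_i}\Big(\hat\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\Big) + \sum_{F} \max_{x_F}\Big(\hat\theta_F(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big) \tag{12}
\]

where

\[
\gamma_{\alpha\to F}(x_F) = \max_{x_\alpha \setminus x_F}\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha)\big), \qquad
\gamma_{\alpha\to i}(x_i) = \max_{x_\alpha \setminus x_i}\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha)\big),
\]
\[
\hat\theta_F(x_F) = \sum_{\alpha \in N(F)} \gamma_{\alpha\to F}(x_F), \qquad
\hat\theta_i(x_i) = \theta_i(x_i) + \sum_{\alpha \in N(i)} \gamma_{\alpha\to i}(x_i).
\]

6 Max-Product Update Rule

The Max-Product update rule is summarized in the following. Note that a shortened notation for $\lambda$ is used in the following for simplicity.

Claim 6. The message passing rule for $\lambda$ is exactly the same as the convex BP rule:

\[
\forall i,\ F \in N(i):\quad \lambda_{i\to F}(x_i) = \frac{1}{1 + |N(i)|}\Big(\hat\theta_i(x_i) + \sum_{\beta \in N(i)} \mu_{\beta\to i}(x_i)\Big) - \mu_{F\to i}(x_i),
\qquad\text{where}\quad
\mu_{F\to i}(x_i) = \max_{x_F \setminus x_i}\Big(\hat\theta_F(x_F) + \sum_{j \in N(F) \setminus i} \lambda_{j\to F}(x_j)\Big). \tag{13}
\]

The block coordinate rule for the variables $\nu$ is

\[
\nu_{k\to\alpha}(x_\alpha) = \frac{1}{N_P(\alpha)} \sum_{j:\, \alpha \cap V_j \neq \emptyset} \delta^{j}_{\alpha}(x_{\alpha \cap V_j}) - \delta^{k}_{\alpha}(x_{\alpha \cap V_k}) \tag{14}
\]

where $N_P(\alpha)$ is the number of sub-programs in which $\alpha$ is involved. For an arbitrary configuration of $\lambda$, the variables $\delta$ can be decoded as

\[
\delta_{F\to\alpha}(x_F) = \frac{1}{1 + |N(F)|}\Big(\hat\theta_F(x_F) + \sum_{j \in N(F)} \lambda_{j\to F}(x_j)\Big) - \gamma_{\alpha\to F}(x_F) \tag{15}
\]

and

\[
\delta_{i\to\alpha}(x_i) = \frac{1}{1 + |N(i)|}\Big(\hat\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\Big) - \gamma_{\alpha\to i}(x_i) \tag{16}
\]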

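which is identical to

\[
\delta_{i\to\alpha}(x_i) = \frac{1}{1 + |N(i)|}\Big(\hat\theta_i(x_i) + \sum_{\beta \in N(i)} \mu_{\beta\to i}(x_i)\Big) - \gamma_{\alpha\to i}(x_i) \tag{17}
\]

by substituting $\lambda$.

7 Algorithm

7.1 Sum-Product

The inference algorithm can be expressed in the following sum-product fashion.

Algorithm 1 Inference
 1: Input: $\psi_i = \exp(\theta_i)$, $\hat\psi_\alpha(x_\alpha) = \exp\big(\theta_\alpha(x_\alpha) / N_P(\alpha)\big)$
 2: while not converged do
 3:   for all $k$ do
 4:     $\forall F,\ \alpha \in N(F),\ x_F$: $\sigma_{\alpha\to F}(x_F) = \sum_{x_\alpha \setminus x_F} \hat\psi_\alpha(x_\alpha)\, n_{k\to\alpha}(x_\alpha)$
 5:     $\forall i \in V_k,\ \alpha \in N(i),\ x_i$: $\sigma_{\alpha\to i}(x_i) = \sum_{x_\alpha \setminus x_i} \hat\psi_\alpha(x_\alpha)\, n_{k\to\alpha}(x_\alpha)$
 6:     $\forall F,\ x_F$: $\hat\psi_F(x_F) = \prod_{\alpha \in N(F)} \sigma_{\alpha\to F}(x_F)$
 7:     $\forall i \in V_k,\ x_i$: $\hat\psi_i(x_i) = \psi_i(x_i) \prod_{\alpha \in N(i)} \sigma_{\alpha\to i}(x_i)$
 8:   end for
 9:   for all $k$ do
10:     $\eta$ = Sub-Inference($\hat\psi_i$, $\hat\psi_F$, $\sigma$)
11:   end for
12:   for all $k$ and all $\alpha$ with $\alpha \cap V_k \neq \emptyset$ do
13:     $n_{k\to\alpha}(x_\alpha) = \Big(\prod_{j:\, \alpha \cap V_j \neq \emptyset} \eta^{j}_{\alpha}(x_{\alpha \cap V_j})\Big)^{1/N_P(\alpha)} \Big/\ \eta^{k}_{\alpha}(x_{\alpha \cap V_k})$
14:   end for
15: end while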

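Algorithm 2 Sub-Inference
 1: Input: $\sigma_{\alpha\to i}(x_i)$, $\sigma_{\alpha\to F}(x_F)$, $\hat\psi_i$, $\hat\psi_F$, $n = \exp(\lambda)$, $m = \exp(\mu)$
 2: for all $t \leq$ number of inner iterations do
 3:   for all $i \in V_k$ do
 4:     $\forall F \in N(i),\ x_i$: $n_{i\to F}(x_i) = \Big(\hat\psi_i(x_i) \prod_{\beta \in N(i)} m_{\beta\to i}(x_i)\Big)^{\hat c_F / \bar c_i} \Big/\ m_{F\to i}(x_i)$
 5:     $\forall F \in N(i),\ x_i$: $m_{F\to i}(x_i) = \Big(\sum_{x_F \setminus x_i} \Big(\hat\psi_F(x_F) \prod_{j \in N(F) \setminus i} n_{j\to F}(x_j)\Big)^{1/(\epsilon \hat c_F)}\Big)^{\epsilon \hat c_F}$
 6:   end for
 7:   for all $\alpha : \alpha \cap V_k \neq \emptyset$ do
 8:     if $\alpha \cap V_k$ is a single node then
 9:       $\eta_{i\to\alpha}(x_i) = \Big(\hat\psi_i(x_i) \prod_{\beta \in N(i)} m_{\beta\to i}(x_i)\Big)^{c_\alpha / \bar c_i} \Big/\ \sigma_{\alpha\to i}(x_i)$
10:     else
11:       $\eta_{F\to\alpha}(x_F) = \Big(\hat\psi_F(x_F) \prod_{j \in N(F)} n_{j\to F}(x_j)\Big)^{c_\alpha / \hat c_F} \Big/\ \sigma_{\alpha\to F}(x_F)$
12:     end if
13:   end for
14: end for
15: Return $\eta$

7.2 Max-Product

The inference algorithm can be expressed in the following max-product fashion.

Algorithm 3 Inference
 1: Input: $\theta_i$, $\hat\theta_\alpha(x_\alpha) = \theta_\alpha(x_\alpha) / N_P(\alpha)$
 2: while not converged do
 3:   for all $k$ do
 4:     $\forall F,\ \alpha \in N(F),\ x_F$: $\gamma_{\alpha\to F}(x_F) = \max_{x_\alpha \setminus x_F}\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha)\big)$
 5:     $\forall i \in V_k,\ \alpha \in N(i),\ x_i$: $\gamma_{\alpha\to i}(x_i) = \max_{x_\alpha \setminus x_i}\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha)\big)$
 6:     $\forall F,\ x_F$: $\hat\theta_F(x_F) = \sum_{\alpha \in N(F)} \gamma_{\alpha\to F}(x_F)$
 7:     $\forall i \in V_k,\ x_i$: $\hat\theta_i(x_i) = \theta_i(x_i) + \sum_{\alpha \in N(i)} \gamma_{\alpha\to i}(x_i)$
 8:   end for
 9:   for all $k$ do
10:     $\delta$ = Sub-Inference($\hat\theta_i$, $\hat\theta_F$, $\gamma$)
11:   end for
12:   for all $k$ and all $\alpha$ with $\alpha \cap V_k \neq \emptyset$ do
13:     $\nu_{k\to\alpha}(x_\alpha) = \frac{1}{N_P(\alpha)} \sum_{j:\, \alpha \cap V_j \neq \emptyset} \delta^{j}_{\alpha}(x_{\alpha \cap V_j}) - \delta^{k}_{\alpha}(x_{\alpha \cap V_k})$
14:   end for
15: end while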
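The consistency update in the last loop of the max-product inference algorithm can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `update_nu`, the toy factor over three binary variables, and the convention that each core's $\delta$ is stored with singleton axes for the variables it does not own are all hypothetical. The sketch checks two properties used in the derivation: the updated $\nu$ sums to zero over cores, and $\hat\theta_\alpha + \nu + \delta$ is the same on every core.

```python
import numpy as np

def update_nu(deltas, shape):
    """Consistency update nu_k = mean_j(delta_j) - delta_k.

    deltas: one array per core, each broadcastable to `shape`
            (the joint domain x_alpha of the shared factor).
    Returns an array of shape (n_cores, *shape).
    """
    full = np.stack([np.broadcast_to(np.asarray(d, dtype=float), shape)
                     for d in deltas])
    return full.mean(axis=0) - full

# Hypothetical factor over three binary variables (x0, x1, x2):
# core 0 owns {x0, x1}, core 1 owns {x2}.
rng = np.random.default_rng(0)
shape = (2, 2, 2)
delta_core0 = rng.normal(size=(2, 2, 1))   # depends on (x0, x1) only
delta_core1 = rng.normal(size=(1, 1, 2))   # depends on x2 only
theta = rng.normal(size=shape)

deltas = [delta_core0, delta_core1]
nu = update_nu(deltas, shape)

# theta_hat = theta / N_P(alpha), with N_P(alpha) = number of cores involved.
theta_hat = theta / len(deltas)
full = np.stack([np.broadcast_to(d, shape) for d in deltas])
beta = theta_hat + nu + full               # one slice per core
```

Averaging over the cores first and subtracting each core's own $\delta$ afterwards keeps the update a single broadcasted expression, which is convenient when a factor spans more than two cores.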

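Algorithm 4 Sub-Inference
 1: Input: $\gamma_{\alpha\to i}$, $\gamma_{\alpha\to F}$, $\hat\theta_i$, $\hat\theta_F$
 2: for all $t \leq$ number of inner iterations do
 3:   for all $i \in V_k$ do
 4:     $\forall F \in N(i),\ x_i$: $\lambda_{i\to F}(x_i) = \frac{1}{1 + |N(i)|}\Big(\hat\theta_i(x_i) + \sum_{\beta \in N(i)} \mu_{\beta\to i}(x_i)\Big) - \mu_{F\to i}(x_i)$
 5:     $\forall F \in N(i),\ x_i$: $\mu_{F\to i}(x_i) = \max_{x_F \setminus x_i}\Big(\hat\theta_F(x_F) + \sum_{j \in N(F) \setminus i} \lambda_{j\to F}(x_j)\Big)$
 6:   end for
 7:   for all $\alpha : \alpha \cap V_k \neq \emptyset$ do
 8:     if $\alpha \cap V_k$ is a single node then
 9:       $\delta_{i\to\alpha}(x_i) = \frac{1}{1 + |N(i)|}\Big(\hat\theta_i(x_i) + \sum_{\beta \in N(i)} \mu_{\beta\to i}(x_i)\Big) - \gamma_{\alpha\to i}(x_i)$
10:     else
11:       $\delta_{F\to\alpha}(x_F) = \frac{1}{1 + |N(F)|}\Big(\hat\theta_F(x_F) + \sum_{j \in N(F)} \lambda_{j\to F}(x_j)\Big) - \gamma_{\alpha\to F}(x_F)$
12:     end if
13:   end for
14: end for
15: Return $\delta$

8 Appendix

8.1 Proof of the Claims

8.1.1 Proof of Claim 1

Claim 1. Let $\alpha \in N(F)$ if and only if $\alpha \cap V_k = F$. The dual of the program in Eq. (2) is

\[
\begin{aligned}
\min_{\lambda,\delta,\nu}\ \sum_{k}\Big[\, & \sum_{i \in V_k} \ln \sum_{x_i} \exp\Big(\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i) - \sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i)\Big) \\
 & + \sum_{i \in V_k}\sum_{\alpha \in N(i)} \ln \sum_{x_i}\sum_{x_\alpha \setminus x_i} \exp\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{i\to\alpha}(x_i)\Big) \\
 & + \sum_{F} \ln \sum_{x_F} \exp\Big(-\sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big) \\
 & + \sum_{F}\sum_{\alpha \in N(F)} \ln \sum_{x_F}\sum_{x_\alpha \setminus x_F} \exp\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{F\to\alpha}(x_F)\Big) \Big] \\
\text{s.t.}\quad & \forall \alpha, x_\alpha:\ \sum_{k:\, \alpha \cap V_k \neq \emptyset} \nu_{k\to\alpha}(x_\alpha) = 0
\end{aligned} \tag{18}
\]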

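Proof. The Lagrangian of the primal problem is

\[
\begin{aligned}
L = \sum_{k}\Big[\, & \sum_{i \in V_k}\sum_{x_i} b_i(x_i)\,\theta_i(x_i)
 + \sum_{i \in V_k}\sum_{\alpha \in N(i)}\sum_{x_\alpha} b^k_\alpha(x_\alpha)\,\hat\theta_\alpha(x_\alpha)
 + \sum_{F}\sum_{\alpha \in N(F)}\sum_{x_\alpha} b^k_\alpha(x_\alpha)\,\hat\theta_\alpha(x_\alpha) \\
 & + \epsilon \sum_{i \in V_k} c_i H(b_i) + \epsilon \sum_{i \in V_k}\sum_{\alpha \in N(i)} c_\alpha H(b^k_\alpha)
 + \epsilon \sum_{F} c_F H(b_F) + \epsilon \sum_{F}\sum_{\alpha \in N(F)} c_\alpha H(b^k_\alpha) \\
 & + \sum_{F}\sum_{\alpha \in N(F)}\sum_{x_F} \delta_{F\to\alpha}(x_F)\Big(\sum_{x_\alpha \setminus x_F} b^k_\alpha(x_\alpha) - b_F(x_F)\Big)
 + \sum_{i \in V_k}\sum_{\alpha \in N(i)}\sum_{x_i} \delta_{i\to\alpha}(x_i)\Big(\sum_{x_\alpha \setminus x_i} b^k_\alpha(x_\alpha) - b_i(x_i)\Big) \\
 & + \sum_{F}\sum_{i \in N(F)}\sum_{x_i} \lambda_{i\to F}(x_i)\Big(\sum_{x_F \setminus x_i} b_F(x_F) - b_i(x_i)\Big) \\
 & + \sum_{F}\sum_{\alpha \in N(F)}\sum_{x_\alpha} \nu_{k\to\alpha}(x_\alpha)\big(b^k_\alpha(x_\alpha) - b_\alpha(x_\alpha)\big)
 + \sum_{i \in V_k}\sum_{\alpha \in N(i)}\sum_{x_\alpha} \nu_{k\to\alpha}(x_\alpha)\big(b^k_\alpha(x_\alpha) - b_\alpha(x_\alpha)\big) \Big] \\
= \sum_{k}\Big[\, & \sum_{i \in V_k}\Big(\sum_{x_i} b_i(x_i)\Big(\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i) - \sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i)\Big) + \epsilon c_i H(b_i)\Big) \\
 & + \sum_{i \in V_k}\sum_{\alpha \in N(i)}\Big(\sum_{x_\alpha} b^k_\alpha(x_\alpha)\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{i\to\alpha}(x_i)\big) + \epsilon c_\alpha H(b^k_\alpha)\Big) \\
 & + \sum_{F}\Big(\sum_{x_F} b_F(x_F)\Big(-\sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big) + \epsilon c_F H(b_F)\Big) \\
 & + \sum_{F}\sum_{\alpha \in N(F)}\Big(\sum_{x_\alpha} b^k_\alpha(x_\alpha)\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{F\to\alpha}(x_F)\big) + \epsilon c_\alpha H(b^k_\alpha)\Big) \Big] \\
 & - \sum_{\alpha}\sum_{x_\alpha} b_\alpha(x_\alpha) \sum_{k:\, \alpha \cap V_k \neq \emptyset} \nu_{k\to\alpha}(x_\alpha)
\end{aligned} \tag{19}
\]

We maximize $L$ over $b$ analytically, using the fact that the log-sum-exp function is the conjugate function of the entropy under the simplex constraints. In addition, to prevent the supremum over $b_\alpha$ of the last term in the Lagrangian from becoming unbounded, the last term gives the additional constraints $\sum_{k:\, \alpha \cap V_k \neq \emptyset} \nu_{k\to\alpha}(x_\alpha) = 0$, which results in the dual program in Claim 1.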
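The conjugacy fact used in this proof can be checked numerically. The following is a small sketch, with hypothetical names (`conjugate_value`, `maximizer`, `objective`) and a randomly drawn score vector standing in for the bracketed coefficient of one belief: it verifies that maximizing $\langle b, \theta \rangle + \epsilon c\, H(b)$ over the simplex gives $\epsilon c \ln \sum_x \exp(\theta(x) / (\epsilon c))$, attained at the softmax of $\theta / (\epsilon c)$.

```python
import numpy as np

def log_sum_exp(v):
    """Numerically stable log(sum(exp(v)))."""
    m = np.max(v)
    return m + np.log(np.sum(np.exp(v - m)))

def entropy(b):
    """Shannon entropy H(b) = -sum_x b(x) ln b(x), with 0 ln 0 = 0."""
    b = np.asarray(b, dtype=float)
    nz = b > 0
    return -np.sum(b[nz] * np.log(b[nz]))

def objective(b, theta, eps_c):
    """Linear score plus weighted entropy for one belief b."""
    return float(b @ theta + eps_c * entropy(b))

def conjugate_value(theta, eps_c):
    """Closed form of max_b <b, theta> + eps_c * H(b) over the simplex."""
    return float(eps_c * log_sum_exp(theta / eps_c))

def maximizer(theta, eps_c):
    """The maximizing belief is the softmax of theta / eps_c."""
    z = theta / eps_c
    w = np.exp(z - np.max(z))
    return w / w.sum()

rng = np.random.default_rng(0)
theta = rng.normal(size=5)      # hypothetical scores for one variable
eps_c = 0.7                     # stands for the product epsilon * c

b_star = maximizer(theta, eps_c)
primal = objective(b_star, theta, eps_c)
dual = conjugate_value(theta, eps_c)

# Any other point of the simplex should not exceed the closed form.
others = rng.dirichlet(np.ones(5), size=100)
best_other = max(objective(b, theta, eps_c) for b in others)
```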

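8.1.2 Proof of Claim 2

Claim 2. When $\nu$ is fixed, block coordinate descent can be achieved by solving the program

\[
\min_{\lambda}\ \sum_{i \in V_k} \epsilon \hat c_i \ln \sum_{x_i} \exp\Big(\frac{\hat\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)}{\epsilon \hat c_i}\Big)
 + \sum_{F} \epsilon \hat c_F \ln \sum_{x_F} \exp\Big(\frac{\hat\theta_F(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)}{\epsilon \hat c_F}\Big) \tag{20}
\]

where

\[
\gamma_{\alpha\to F}(x_F) = \ln \sum_{x_\alpha \setminus x_F} \exp\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha)\big), \qquad
\gamma_{\alpha\to i}(x_i) = \ln \sum_{x_\alpha \setminus x_i} \exp\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha)\big),
\]
\[
\hat\theta_F(x_F) = \sum_{\alpha \in N(F)} \gamma_{\alpha\to F}(x_F), \qquad
\hat\theta_i(x_i) = \theta_i(x_i) + \sum_{\alpha \in N(i)} \gamma_{\alpha\to i}(x_i),
\]

and $\hat c_F = c_F + \sum_{\alpha \in N(F)} c_\alpha$, $\hat c_i = c_i + \sum_{\alpha \in N(i)} c_\alpha$.

Proof. Derivation for free variable set terms: By extracting all the terms involving $x_F$ from Eq. (18), we have

\[
\begin{aligned}
L_F &= \ln \sum_{x_F} \exp\Big(-\sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big)
 + \sum_{\alpha \in N(F)} \ln \sum_{x_F}\sum_{x_\alpha \setminus x_F} \exp\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{F\to\alpha}(x_F)\Big) \\
 &= \ln \sum_{x_F} \exp\Big(-\sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big)
 + \sum_{\alpha \in N(F)} \ln \sum_{x_F} \exp\Big(\gamma_{\alpha\to F}(x_F) + \delta_{F\to\alpha}(x_F)\Big)
\end{aligned} \tag{21}
\]

Setting the derivative with respect to $\delta_{F\to\alpha}(x_F)$ to 0, we have

\[
\frac{\exp\big(\gamma_{\alpha\to F}(x_F) + \delta_{F\to\alpha}(x_F)\big)}{\sum_{x_F} \exp\big(\gamma_{\alpha\to F}(x_F) + \delta_{F\to\alpha}(x_F)\big)}
= \frac{\exp\big(-\sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\big)}{\sum_{x_F} \exp\big(-\sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\big)} \tag{22}
\]

Introducing a degree of freedom in the normalization, we have

\[
\gamma_{\alpha\to F}(x_F) + \delta_{F\to\alpha}(x_F) = \frac{c_\alpha}{c_F}\Big(-\sum_{\alpha' \in N(F)} \delta_{F\to\alpha'}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big) \tag{23}
\]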

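Summing over $\alpha \in N(F)$ gives

\[
\sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) = \frac{\sum_{\alpha \in N(F)} c_\alpha}{\hat c_F} \sum_{i \in N(F)} \lambda_{i\to F}(x_i) - \frac{c_F}{\hat c_F} \sum_{\alpha \in N(F)} \gamma_{\alpha\to F}(x_F) \tag{24}
\]

Substituting it back into Eq. (23), we have

\[
\frac{\gamma_{\alpha\to F}(x_F) + \delta_{F\to\alpha}(x_F)}{c_\alpha}
= \frac{-\sum_{\alpha' \in N(F)} \delta_{F\to\alpha'}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)}{c_F}
= \frac{\sum_{i \in N(F)} \lambda_{i\to F}(x_i) + \sum_{\alpha' \in N(F)} \gamma_{\alpha'\to F}(x_F)}{\hat c_F} \tag{25}
\]

with which we compress $L_F$ in Eq. (21) into a single term

\[
L_F = \epsilon \hat c_F \ln \sum_{x_F} \exp\Big(\frac{\hat\theta_F(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)}{\epsilon \hat c_F}\Big) \tag{26}
\]

Derivation for node terms: The derivation proceeds similarly to the above for the terms that are compressed into a node. We include it here for clearer future reference.

\[
\begin{aligned}
L_i &= \ln \sum_{x_i} \exp\Big(\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i) - \sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i)\Big)
 + \sum_{\alpha \in N(i)} \ln \sum_{x_i}\sum_{x_\alpha \setminus x_i} \exp\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{i\to\alpha}(x_i)\Big) \\
 &= \ln \sum_{x_i} \exp\Big(\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i) - \sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i)\Big)
 + \sum_{\alpha \in N(i)} \ln \sum_{x_i} \exp\Big(\gamma_{\alpha\to i}(x_i) + \delta_{i\to\alpha}(x_i)\Big)
\end{aligned} \tag{27}
\]

Setting the derivative with respect to $\delta_{i\to\alpha}(x_i)$ to 0, we have

\[
\frac{\exp\big(\gamma_{\alpha\to i}(x_i) + \delta_{i\to\alpha}(x_i)\big)}{\sum_{x_i} \exp\big(\gamma_{\alpha\to i}(x_i) + \delta_{i\to\alpha}(x_i)\big)}
= \frac{\exp\big(\theta_i(x_i) - \sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\big)}{\sum_{x_i} \exp\big(\theta_i(x_i) - \sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\big)} \tag{28}
\]

Introducing a degree of freedom in the normalization, we have

\[
\gamma_{\alpha\to i}(x_i) + \delta_{i\to\alpha}(x_i) = \frac{c_\alpha}{c_i}\Big(\theta_i(x_i) - \sum_{\alpha' \in N(i)} \delta_{i\to\alpha'}(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\Big) \tag{29}
\]

Summing over $\alpha \in N(i)$ gives

\[
\sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i) = \frac{\sum_{\alpha \in N(i)} c_\alpha}{\hat c_i}\Big(\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\Big) - \frac{c_i}{\hat c_i} \sum_{\alpha \in N(i)} \gamma_{\alpha\to i}(x_i) \tag{30}
\]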

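Substituting it back into Eq. (29), we have

\[
\frac{\gamma_{\alpha\to i}(x_i) + \delta_{i\to\alpha}(x_i)}{c_\alpha}
= \frac{\theta_i(x_i) - \sum_{\alpha' \in N(i)} \delta_{i\to\alpha'}(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)}{c_i}
= \frac{\theta_i(x_i) + \sum_{\alpha' \in N(i)} \gamma_{\alpha'\to i}(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)}{\hat c_i} \tag{31}
\]

with which we compress $L_i$ in Eq. (27) into a single term

\[
L_i = \epsilon \hat c_i \ln \sum_{x_i} \exp\Big(\frac{\hat\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)}{\epsilon \hat c_i}\Big) \tag{32}
\]

Combining Eq. (26) and Eq. (32) gives the expression in Eq. (20).

8.1.3 Proof of Claim 3

Claim 3. The message passing rule for $\lambda$ is exactly the same as the convex BP rule:

\[
\forall i,\ F \in N(i):\quad \lambda_{i\to F}(x_i) = \frac{\hat c_F}{\bar c_i}\Big(\hat\theta_i(x_i) + \sum_{\beta \in N(i)} \mu_{\beta\to i}(x_i)\Big) - \mu_{F\to i}(x_i) \tag{33}
\]

where

\[
\mu_{F\to i}(x_i) = \epsilon \hat c_F \ln \sum_{x_F \setminus x_i} \exp\Big(\frac{\hat\theta_F(x_F) + \sum_{j \in N(F) \setminus i} \lambda_{j\to F}(x_j)}{\epsilon \hat c_F}\Big)
\qquad\text{and}\qquad \bar c_i = \hat c_i + \sum_{F \in N(i)} \hat c_F .
\]

The block coordinate rule for the variables $\nu$ is

\[
\nu_{k\to\alpha}(x_\alpha) = \frac{1}{N_P(\alpha)} \sum_{j:\, \alpha \cap V_j \neq \emptyset} \delta^{j}_{\alpha}(x_{\alpha \cap V_j}) - \delta^{k}_{\alpha}(x_{\alpha \cap V_k}) \tag{34}
\]

where $N_P(\alpha)$ is the number of sub-programs in which $\alpha$ is involved. For an arbitrary configuration of $\lambda$, the variables $\delta$ can be decoded as

\[
\delta_{F\to\alpha}(x_F) = \frac{c_\alpha}{\hat c_F}\Big(\hat\theta_F(x_F) + \sum_{j \in N(F)} \lambda_{j\to F}(x_j)\Big) - \gamma_{\alpha\to F}(x_F) \tag{35}
\]

and

\[
\delta_{i\to\alpha}(x_i) = \frac{c_\alpha}{\hat c_i}\Big(\hat\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\Big) - \gamma_{\alpha\to i}(x_i) \tag{36}
\]

which is identical to

\[
\delta_{i\to\alpha}(x_i) = \frac{c_\alpha}{\bar c_i}\Big(\hat\theta_i(x_i) + \sum_{\beta \in N(i)} \mu_{\beta\to i}(x_i)\Big) - \gamma_{\alpha\to i}(x_i) \tag{37}
\]

by substituting $\lambda$.

Proof. For the $\lambda$ variables, the rule is exactly the same as convex BP.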

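In terms of $\nu$, from the program in Claim 1, we take the Lagrangian of the objective function

\[
\hat L = \sum_{k:\, \alpha \cap V_k \neq \emptyset} \ln \sum_{x_\alpha} \exp\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta^{k}_{\alpha}(x_{\alpha \cap V_k})\Big)
 + \sum_{x_\alpha} \eta(x_\alpha) \sum_{k:\, \alpha \cap V_k \neq \emptyset} \nu_{k\to\alpha}(x_\alpha) \tag{38}
\]

Setting the derivative with respect to $\nu_{k\to\alpha}(x_\alpha)$ to zero gives the following equation

\[
\frac{\exp\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta^{k}_{\alpha}(x_{\alpha \cap V_k})\big)}{\sum_{x_\alpha} \exp\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta^{k}_{\alpha}(x_{\alpha \cap V_k})\big)} = -\eta(x_\alpha) \tag{39}
\]

Any set of $\nu_{k\to\alpha}(x_\alpha)$ satisfying the above equation is guaranteed to be optimal, and every optimum is guaranteed to satisfy $\sum_{k:\, \alpha \cap V_k \neq \emptyset} \nu_{k\to\alpha}(x_\alpha) = 0$, so we can add an arbitrary constant to $\nu_{k\to\alpha}(x_\alpha)$ as long as Eq. (39) is satisfied. The effect of the additional constants accumulates to 0 over $k$, so the resulting optimal value of $\hat L$ is unchanged, i.e., we can simply take

\[
\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta^{k}_{\alpha}(x_{\alpha \cap V_k}) = \beta(x_\alpha) \tag{40}
\]

Summing over $k : \alpha \cap V_k \neq \emptyset$, the $\nu_{k\to\alpha}(x_\alpha)$ are eliminated, which gives

\[
\theta_\alpha(x_\alpha) + \sum_{k:\, \alpha \cap V_k \neq \emptyset} \delta^{k}_{\alpha}(x_{\alpha \cap V_k}) = N_P(\alpha)\, \beta(x_\alpha) \tag{41}
\]

We can substitute it back into Eq. (40) and obtain the rule in Claim 3. Note that the explicit expressions of $\delta$ in Eq. (35) and Eq. (36) can be directly derived from Eq. (25) and Eq. (31).

8.1.4 Proof of Claim 4

Claim 4. The dual of the Max-Product problem is

\[
\begin{aligned}
\min_{\lambda,\delta,\nu}\ \sum_{k}\Big[\, & \sum_{i \in V_k} \max_{x_i}\Big(\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i) - \sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i)\Big)
 + \sum_{i \in V_k}\sum_{\alpha \in N(i)} \max_{x_\alpha}\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{i\to\alpha}(x_i)\Big) \\
 & + \sum_{F} \max_{x_F}\Big(-\sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big)
 + \sum_{F}\sum_{\alpha \in N(F)} \max_{x_\alpha}\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{F\to\alpha}(x_F)\Big) \Big] \\
\text{s.t.}\quad & \sum_{k:\, \alpha \cap V_k \neq \emptyset} \nu_{k\to\alpha}(x_\alpha) = 0, \qquad \forall \alpha, x_\alpha
\end{aligned} \tag{42}
\]

Proof. With the standard LP primal-dual transformation

\[
\max_{x}\ c^{T} x \quad \text{s.t.}\ Ax = b,\ x \geq 0
\qquad = \qquad
\min_{y}\ b^{T} y \quad \text{s.t.}\ A^{T} y \geq c \tag{43}
\]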

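we have the following dual formulation

\[
\begin{aligned}
\min\ \sum_{k}\Big\{ & \sum_{i \in V_k} \eta_i + \sum_{F} \eta_F + \sum_{i \in V_k}\sum_{\alpha \in N(i)} \eta^{k}_{\alpha} + \sum_{F}\sum_{\alpha \in N(F)} \eta^{k}_{\alpha} \Big\} \\
\text{s.t.}\quad & \eta_i + \sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i) + \sum_{F \in N(i)} \lambda_{i\to F}(x_i) \geq \theta_i(x_i), && \forall i \in V_k,\ x_i \\
 & \eta_F + \sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) - \sum_{i \in N(F)} \lambda_{i\to F}(x_i) \geq 0, && \forall F,\ x_F \\
 & \eta^{k}_{\alpha} - \delta_{i\to\alpha}(x_i) - \nu_{k\to\alpha}(x_\alpha) \geq \hat\theta_\alpha(x_\alpha), && \forall i \in V_k,\ \alpha \in N(i),\ x_\alpha \\
 & \eta^{k}_{\alpha} - \delta_{F\to\alpha}(x_F) - \nu_{k\to\alpha}(x_\alpha) \geq \hat\theta_\alpha(x_\alpha), && \forall F,\ \alpha \in N(F),\ x_\alpha \\
 & \sum_{k:\, \alpha \cap V_k \neq \emptyset} \nu_{k\to\alpha}(x_\alpha) \geq 0, && \forall \alpha,\ x_\alpha
\end{aligned} \tag{44}
\]

Note that we reverse the sign of $\delta$, $\lambda$ and $\nu$. By substituting $\eta$ back into the dual objective function, we have the expression in Eq. (42). Note that when $\sum_{k:\, \alpha \cap V_k \neq \emptyset} \nu_{k\to\alpha}(x_\alpha) > 0$, we can always construct $\nu'$ satisfying $\sum_{k:\, \alpha \cap V_k \neq \emptyset} \nu'_{k\to\alpha}(x_\alpha) = 0$ with which the objective value is not increased. Thus we can replace the last inequality with an equality.

8.1.5 Proof of Claim 5

Claim 5. When $\nu$ is fixed, the objective function of Eq. (42) is lower bounded by

\[
\min_{\lambda}\ \sum_{i \in V_k} \max_{x_i}\Big(\hat\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\Big) + \sum_{F} \max_{x_F}\Big(\hat\theta_F(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big) \tag{45}
\]

where

\[
\gamma_{\alpha\to F}(x_F) = \max_{x_\alpha \setminus x_F}\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha)\big), \qquad
\gamma_{\alpha\to i}(x_i) = \max_{x_\alpha \setminus x_i}\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha)\big),
\]
\[
\hat\theta_F(x_F) = \sum_{\alpha \in N(F)} \gamma_{\alpha\to F}(x_F), \qquad
\hat\theta_i(x_i) = \theta_i(x_i) + \sum_{\alpha \in N(i)} \gamma_{\alpha\to i}(x_i).
\]

Proof. Derivation for node terms:

\[
\begin{aligned}
L_i &= \max_{x_i}\Big(\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i) - \sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i)\Big)
 + \sum_{\alpha \in N(i)} \max_{x_i} \max_{x_\alpha \setminus x_i}\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{i\to\alpha}(x_i)\Big) \\
 &= \max_{x_i}\Big(\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i) - \sum_{\alpha \in N(i)} \delta_{i\to\alpha}(x_i)\Big)
 + \sum_{\alpha \in N(i)} \max_{x_i}\Big(\gamma_{\alpha\to i}(x_i) + \delta_{i\to\alpha}(x_i)\Big) \\
 &\geq \max_{x_i}\Big(\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i) + \sum_{\alpha \in N(i)} \gamma_{\alpha\to i}(x_i)\Big)
 = \max_{x_i}\Big(\hat\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\Big)
\end{aligned} \tag{46}
\]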

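Derivation for free variable set terms:

\[
\begin{aligned}
L_F &= \max_{x_F}\Big(-\sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big)
 + \sum_{\alpha \in N(F)} \max_{x_F} \max_{x_\alpha \setminus x_F}\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta_{F\to\alpha}(x_F)\Big) \\
 &= \max_{x_F}\Big(-\sum_{\alpha \in N(F)} \delta_{F\to\alpha}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big)
 + \sum_{\alpha \in N(F)} \max_{x_F}\Big(\gamma_{\alpha\to F}(x_F) + \delta_{F\to\alpha}(x_F)\Big) \\
 &\geq \max_{x_F}\Big(\sum_{\alpha \in N(F)} \gamma_{\alpha\to F}(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big)
 = \max_{x_F}\Big(\hat\theta_F(x_F) + \sum_{i \in N(F)} \lambda_{i\to F}(x_i)\Big)
\end{aligned} \tag{47}
\]

8.1.6 Proof of Claim 6

Claim 6. The message passing rule for $\lambda$ is exactly the same as the convex BP rule:

\[
\forall i,\ F \in N(i):\quad \lambda_{i\to F}(x_i) = \frac{1}{1 + |N(i)|}\Big(\hat\theta_i(x_i) + \sum_{\beta \in N(i)} \mu_{\beta\to i}(x_i)\Big) - \mu_{F\to i}(x_i),
\qquad\text{where}\quad
\mu_{F\to i}(x_i) = \max_{x_F \setminus x_i}\Big(\hat\theta_F(x_F) + \sum_{j \in N(F) \setminus i} \lambda_{j\to F}(x_j)\Big). \tag{48}
\]

The block coordinate rule for the variables $\nu$ is

\[
\nu_{k\to\alpha}(x_\alpha) = \frac{1}{N_P(\alpha)} \sum_{j:\, \alpha \cap V_j \neq \emptyset} \delta^{j}_{\alpha}(x_{\alpha \cap V_j}) - \delta^{k}_{\alpha}(x_{\alpha \cap V_k}) \tag{49}
\]

where $N_P(\alpha)$ is the number of sub-programs in which $\alpha$ is involved. For an arbitrary configuration of $\lambda$, the variables $\delta$ can be decoded as

\[
\delta_{F\to\alpha}(x_F) = \frac{1}{1 + |N(F)|}\Big(\hat\theta_F(x_F) + \sum_{j \in N(F)} \lambda_{j\to F}(x_j)\Big) - \gamma_{\alpha\to F}(x_F) \tag{50}
\]

and

\[
\delta_{i\to\alpha}(x_i) = \frac{1}{1 + |N(i)|}\Big(\hat\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\Big) - \gamma_{\alpha\to i}(x_i) \tag{51}
\]

Proof. From Eq. (45), the sum of the terms involving node $i$ is

\[
\begin{aligned}
& \max_{x_i}\Big(\hat\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\Big) + \sum_{F \in N(i)} \max_{x_F}\Big(\hat\theta_F(x_F) + \sum_{j \in N(F)} \lambda_{j\to F}(x_j)\Big) \\
 &= \max_{x_i}\Big(\hat\theta_i(x_i) - \sum_{F \in N(i)} \lambda_{i\to F}(x_i)\Big) + \sum_{F \in N(i)} \max_{x_i}\Big(\lambda_{i\to F}(x_i) + \max_{x_F \setminus x_i}\Big(\hat\theta_F(x_F) + \sum_{j \in N(F) \setminus i} \lambda_{j\to F}(x_j)\Big)\Big) \\
 &\geq \max_{x_i}\Big(\hat\theta_i(x_i) + \sum_{\beta \in N(i)} \mu_{\beta\to i}(x_i)\Big)
\end{aligned} \tag{52}
\]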

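It is a lower bound achievable with Eq. (48), which gives the block coordinate descent rule over $\lambda$.

From Eq. (42), the sum of the terms involving $\nu_{k\to\alpha}(x_\alpha)$ is

\[
\sum_{k:\, \alpha \cap V_k \neq \emptyset} \max_{x_\alpha}\Big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta^{k}_{\alpha}(x_{\alpha \cap V_k})\Big)
\geq \max_{x_\alpha}\Big(\theta_\alpha(x_\alpha) + \sum_{k:\, \alpha \cap V_k \neq \emptyset} \delta^{k}_{\alpha}(x_{\alpha \cap V_k})\Big) \tag{53}
\]

The lower bound can be achieved by setting $\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha) + \delta^{k}_{\alpha}(x_{\alpha \cap V_k}) = \frac{1}{N_P(\alpha)}\Big(\theta_\alpha(x_\alpha) + \sum_{k:\, \alpha \cap V_k \neq \emptyset} \delta^{k}_{\alpha}(x_{\alpha \cap V_k})\Big)$, which is equivalent to the rule in Eq. (49).

By evaluating $\delta$ with Eq. (50) and Eq. (51), the lower bound in Eq. (45) can be achieved, which gives the block coordinate descent rule for $\delta$.

8.2 Another Expression of Sum-Product

The algorithm can be identically expressed as follows. In the implementation, we use the computation procedure shown in this expression.

Algorithm 5 Inference
 1: Input: $\theta_i$, $\hat\theta_\alpha(x_\alpha) = \theta_\alpha(x_\alpha) / N_P(\alpha)$
 2: while not converged do
 3:   for all $k$ do
 4:     $\forall F,\ \alpha \in N(F),\ x_F$: $\gamma_{\alpha\to F}(x_F) = \ln \sum_{x_\alpha \setminus x_F} \exp\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha)\big)$
 5:     $\forall i \in V_k,\ \alpha \in N(i),\ x_i$: $\gamma_{\alpha\to i}(x_i) = \ln \sum_{x_\alpha \setminus x_i} \exp\big(\hat\theta_\alpha(x_\alpha) + \nu_{k\to\alpha}(x_\alpha)\big)$
 6:     $\forall F,\ x_F$: $\hat\theta_F(x_F) = \sum_{\alpha \in N(F)} \gamma_{\alpha\to F}(x_F)$
 7:     $\forall i \in V_k,\ x_i$: $\hat\theta_i(x_i) = \theta_i(x_i) + \sum_{\alpha \in N(i)} \gamma_{\alpha\to i}(x_i)$
 8:   end for
 9:   for all $k$ do
10:     $\delta$ = Sub-Inference($\hat\theta_i$, $\hat\theta_F$, $\gamma$)
11:   end for
12:   for all $k$ and all $\alpha$ with $\alpha \cap V_k \neq \emptyset$ do
13:     $\nu_{k\to\alpha}(x_\alpha) = \frac{1}{N_P(\alpha)} \sum_{j:\, \alpha \cap V_j \neq \emptyset} \delta^{j}_{\alpha}(x_{\alpha \cap V_j}) - \delta^{k}_{\alpha}(x_{\alpha \cap V_k})$
14:   end for
15: end while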
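The relation between the log-domain marginalization of this section and the product-domain marginalization of Section 7.1 can be illustrated with a short NumPy sketch. It is only an illustration under assumptions: the function names, the random toy potentials, and the `keep_axes` convention (the axes of $x_\alpha$ that belong to the free variable set) are hypothetical and not taken from the implementation. The sketch checks that $\exp(\gamma)$ equals $\sigma$ when $\hat\psi = \exp(\hat\theta)$ and $n = \exp(\nu)$, and that the log-domain form stays finite for large potentials where plain exponentiation would overflow.

```python
import numpy as np

def gamma_log_domain(theta_hat, nu, keep_axes):
    """gamma(x_keep) = ln sum over the other axes of exp(theta_hat + nu).

    Uses the usual max-shift so that large scores do not overflow.
    """
    s = theta_hat + nu
    drop = tuple(a for a in range(s.ndim) if a not in keep_axes)
    m = s.max(axis=drop, keepdims=True)
    out = m + np.log(np.exp(s - m).sum(axis=drop, keepdims=True))
    return np.squeeze(out, axis=drop)

def sigma_product_domain(psi_hat, n, keep_axes):
    """sigma(x_keep) = sum over the other axes of psi_hat * n."""
    p = psi_hat * n
    drop = tuple(a for a in range(p.ndim) if a not in keep_axes)
    return p.sum(axis=drop)

# Hypothetical factor over three variables with 2, 3 and 4 states;
# the first two variables form the free variable set on this core.
rng = np.random.default_rng(1)
theta_hat = rng.normal(size=(2, 3, 4))
nu = rng.normal(size=(2, 3, 4))

gamma = gamma_log_domain(theta_hat, nu, keep_axes=(0, 1))
sigma = sigma_product_domain(np.exp(theta_hat), np.exp(nu), keep_axes=(0, 1))

# Shifting all scores by 1000 only shifts gamma by 1000 in the log domain.
gamma_big = gamma_log_domain(theta_hat + 1000.0, nu, keep_axes=(0, 1))
```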

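Algorithm 6 Sub-Inference
 1: Input: $\gamma_{\alpha\to i}(x_i)$, $\gamma_{\alpha\to F}(x_F)$, $\hat\theta_i$, $\hat\theta_F$
 2: for all $t \leq$ number of inner iterations do
 3:   for all $i \in V_k$ do
 4:     $\forall F \in N(i),\ x_i$: $\lambda_{i\to F}(x_i) = \frac{\hat c_F}{\bar c_i}\Big(\hat\theta_i(x_i) + \sum_{\beta \in N(i)} \mu_{\beta\to i}(x_i)\Big) - \mu_{F\to i}(x_i)$
 5:     $\forall F \in N(i),\ x_i$: $\mu_{F\to i}(x_i) = \epsilon \hat c_F \ln \sum_{x_F \setminus x_i} \exp\Big(\frac{\hat\theta_F(x_F) + \sum_{j \in N(F) \setminus i} \lambda_{j\to F}(x_j)}{\epsilon \hat c_F}\Big)$
 6:   end for
 7:   for all $\alpha : \alpha \cap V_k \neq \emptyset$ do
 8:     if $\alpha \cap V_k$ is a single node then
 9:       $\delta_{i\to\alpha}(x_i) = \frac{c_\alpha}{\bar c_i}\Big(\hat\theta_i(x_i) + \sum_{\beta \in N(i)} \mu_{\beta\to i}(x_i)\Big) - \gamma_{\alpha\to i}(x_i)$
10:     else
11:       $\delta_{F\to\alpha}(x_F) = \frac{c_\alpha}{\hat c_F}\Big(\hat\theta_F(x_F) + \sum_{j \in N(F)} \lambda_{j\to F}(x_j)\Big) - \gamma_{\alpha\to F}(x_F)$
12:     end if
13:   end for
14: end for
15: Return $\delta$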
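The $\mu$ message of the sum-product sub-inference and its max-product counterpart differ only in replacing $\epsilon \hat c \ln \sum \exp(\cdot / (\epsilon \hat c))$ by $\max$. The sketch below, with hypothetical function names and a random toy score table standing in for $\hat\theta_F + \sum_j \lambda_{j\to F}$, checks the standard bound $\max \leq \epsilon \hat c \ln \sum \exp(\cdot / (\epsilon \hat c)) \leq \max + \epsilon \hat c \ln n$, where $n$ is the number of summed configurations, so the soft message approaches the max-product message as $\epsilon \hat c$ shrinks.

```python
import numpy as np

def mu_sum_product(score, eps_c, axis):
    """eps_c * ln sum exp(score / eps_c) over `axis`, computed stably."""
    z = score / eps_c
    m = z.max(axis=axis, keepdims=True)
    out = m + np.log(np.exp(z - m).sum(axis=axis, keepdims=True))
    return eps_c * np.squeeze(out, axis=axis)

def mu_max_product(score, axis):
    """Max-product counterpart: plain maximization over `axis`."""
    return score.max(axis=axis)

# Hypothetical score table: rows index x_i, columns index the
# remaining configurations x_F \ x_i that are eliminated.
rng = np.random.default_rng(2)
score = rng.normal(size=(3, 4))

hard = mu_max_product(score, axis=1)
gaps = {
    eps_c: float(np.max(mu_sum_product(score, eps_c, axis=1) - hard))
    for eps_c in (1.0, 0.1, 0.01)
}
```

Here `gaps[eps_c]` is the largest difference between the soft and hard messages over the rows, and it is bounded by `eps_c * np.log(4)` for this table.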