Reading Order Detection for Text Layout Excluded by Image


JOURNAL OF CHINESE INFORMATION PROCESSING, Vol. 19, No. 5
Article ID: 1003-0077(2005)05-0067-09
CLC number: TP391.12; document code: A

JIA Juan (1), CHEN Kun-qiu (1), ZHOU Dong-hao (2)
(1. National Key Laboratory for Text Processing, Institute of Computer Science and Technology, Peking University, Beijing 100871, China; 2. IBM China Co. Ltd, Beijing 100027, China)

Abstract: Detecting the reading order of text laid out around images ("text layout excluded by image") is a key problem in document image understanding (DIU) and in text typesetting. In Chinese and other East Asian languages in particular, text regions whose words wrap to the next line when they meet an image boundary make the reading order vary. This paper defines a new layout model built on a new page object, PMRegion. Based on an ordered tree, it presents an algorithm that detects the reading order after the page has been decomposed top-down into layout objects. Both have proven effective in a special typesetting system and should also help further work on DIU.

Key words: computer application; Chinese information processing; reading order; text layout excluded by image; ordered tree

Received: 2004-09-20; finalized: 2005-02-02
Author biography: born 1978 […] XML.

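Related work includes [2] and [3], as well as the 13 interval relations of Allen [4]. This paper introduces a new page object, PMRegion.

2

The page layout PL is described by two components, L and S:

$PL = \langle L, S \rangle$

$L = \{l_1, l_2, \dots, l_n\}$, $l_i = \langle O_i, Rl_i \rangle$, where $O$ is a set of page objects and $Rl$ is a relation on it; $\langle O_i, O_j \rangle \in Rl$ relates $O_i$ to $O_j$.

$S = \{s_1, \dots, s_m\}$, $s_i = \langle O_i, Rs_i \rangle$, where $Rs$ is likewise a relation; $\langle O_i, O_j \rangle \in Rs$ relates $O_i$ to $O_j$.

2.1

The page objects are PageR, ArticleR, Column, Tile, LineSeg and Box. A LineSeg is made up of Boxes and a Tile of LineSegs. The new object PMRegion is defined on Column.

With page coordinates x and y, $P(x, y)$ denotes the point with abscissa x and ordinate y.

Relation PtConnect on Column:

$P_1(x_1, y_1) \in \mathrm{Column} \wedge P_2(x_2, y_2) \in \mathrm{Column} \wedge x_1 < x_2 \wedge y_2 = y_1 \wedge \forall x\,(x_1 < x < x_2 \Rightarrow P_m(x, y_1) \in \mathrm{Column}) \Rightarrow \langle P_1, P_2 \rangle \in \mathrm{PtConnect}$

That is, two points of a Column on the same horizontal line are connected when every point between them also lies in the Column.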
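The PtConnect relation can be checked directly on a discretized Column. The following is a minimal sketch, assuming the Column is given as a set of integer grid points; the grid representation and the name `pt_connect` are illustrative assumptions, not part of the paper.

```python
from typing import Set, Tuple

# Assumption: a Column is modelled as a set of integer (x, y) grid points.
Point = Tuple[int, int]


def pt_connect(column: Set[Point], p1: Point, p2: Point) -> bool:
    """Return True if <p1, p2> is in PtConnect for the given Column.

    Both points must lie in the Column on the same horizontal line,
    p1 must be strictly left of p2, and every grid point strictly
    between them must also lie in the Column.
    """
    (x1, y1), (x2, y2) = p1, p2
    if p1 not in column or p2 not in column:
        return False
    if y1 != y2 or not x1 < x2:
        return False
    return all((x, y1) in column for x in range(x1 + 1, x2))


# Hypothetical Column: row 0 has a gap at x = 3 (e.g. an image), row 1 is full.
column = {(x, 0) for x in (0, 1, 2, 4, 5)} | {(x, 1) for x in range(6)}
```

In this hypothetical Column, `pt_connect(column, (0, 0), (2, 0))` holds, while `(0, 0)` and `(4, 0)` are not connected because the point at x = 3 is missing.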

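Thread:

$\mathrm{Thread}(y_0, x_0, x_1) = \{ P(x, y_0) \mid P \in \mathrm{Column} \wedge x_0 \le x \le x_1 \wedge P(x_0 - 1, y_0) \notin \mathrm{Column} \wedge P(x_1 + 1, y_0) \notin \mathrm{Column} \wedge \langle P, P' \rangle \in \mathrm{PtConnect} \}$

where $P'$ ranges over the other points of the set. A Thread is thus a maximal horizontal run of Column points on the line $y = y_0$.

$\mathrm{TCluster}(y_0) = \{ \mathrm{Thread}(y_0, x_0, x_1) \mid P(x_0, y_0), P(x_1, y_0) \in \mathrm{Column} \}$

$\mathrm{ThreadSet} = \mathrm{TCluster}(y_0) \cup \mathrm{TCluster}(y_1) \cup \dots \cup \mathrm{TCluster}(y_n)$, with $y_{\min} \le y_0 < y_1 < \dots < y_n \le y_{\max}$, where $y_{\min}$ and $y_{\max}$ are the smallest and largest ordinates of Column.

Relation VConnect on ThreadSet: for $T_1 = \mathrm{Thread}(y_0, x_{11}, x_{12})$ and $T_2 = \mathrm{Thread}(y_0 + 1, x_{21}, x_{22})$,

$\langle T_1, T_2 \rangle \in \mathrm{VConnect} \Leftrightarrow \exists P_1(x, y_0) \in T_1\,(\exists P_2(x', y_0 + 1) \in T_2\,(x = x'))$

Relation SoleNeighbor on ThreadSet: for $T_1$, $T_2$ as above,

$\langle T_1, T_2 \rangle \in \mathrm{SoleNeighbor} \Leftrightarrow \langle T_1, T_2 \rangle \in \mathrm{VConnect} \wedge (\forall T \in \mathrm{TCluster}(y_0)\,(T \ne T_1 \Rightarrow \langle T, T_2 \rangle \notin \mathrm{VConnect})) \wedge (\forall T \in \mathrm{TCluster}(y_0 + 1)\,(T \ne T_2 \Rightarrow \langle T_1, T \rangle \notin \mathrm{VConnect}))$

Relation SoleConnect on ThreadSet: for $T_1 = \mathrm{Thread}(y_1, x_{11}, x_{12})$ and $T_2 = \mathrm{Thread}(y_2, x_{21}, x_{22})$ with $y_1 \le y_2$,

$\langle T_1, T_2 \rangle \in \mathrm{SoleConnect} \Leftrightarrow T_1 = T_2 \vee \exists T'_1(y_1), T'_2(y_1 + 1), \dots, T'_{n-1}(y_2 - 1), T'_n(y_2) \in \mathrm{ThreadSet}\,(\forall 1 \le i < n\,(\langle T'_i, T'_{i+1} \rangle \in \mathrm{SoleNeighbor}))$

where the chain runs from $T'_1 = T_1$ to $T'_n = T_2$.

Relation SameR on Column:

$\mathrm{SameR} = \{ \langle P_1, P_2 \rangle \mid P_1, P_2 \in \mathrm{Column} \wedge P_1 \in \mathrm{Thread}_1(y_1) \wedge P_2 \in \mathrm{Thread}_2(y_2) \wedge \langle \mathrm{Thread}_1, \mathrm{Thread}_2 \rangle \in \mathrm{SoleConnect} \}$

PMRegion: each class of points of Column related by SameR is called a PMRegion. In the degenerate case, PMRegion = Column = ArticleR.

2.2

Under Rl, $\langle \mathrm{ArticleR}, \mathrm{PageR} \rangle \in Rl$, $\langle \mathrm{Column}, \mathrm{ArticleR} \rangle \in Rl$, $\langle \mathrm{PMRegion}, \mathrm{Column} \rangle \in Rl$, $\langle \mathrm{Tile}, \mathrm{PMRegion} \rangle \in Rl$, $\langle \mathrm{LineSeg}, \mathrm{Tile} \rangle \in Rl$ and $\langle \mathrm{Box}, \mathrm{LineSeg} \rangle \in Rl$. PageR, ArticleR, Column, PMRegion, Tile, LineSeg and Box therefore form a tree, the HLMTree.

2.3

Constructing the HLMTree involves PageR, ArticleR and the excluded regions ExcludeR; Column, Tile and Box; and LineSeg and PMRegion (see [5], which concerns the Web).

Deriving the PMRegions of a Column directly from SameR costs $O(N^2)$, where $N = |\mathrm{Column}|$. Because SameR on Column is determined by SoleNeighbor on ThreadSet, it is enough to examine ThreadSet, namely the Threads of adjacent clusters TCluster(y) and TCluster(y + 1). Let $PG_1, PG_2, \dots, PG_n$ be the vertices of the polygon PolyColumn of the Column, with the clusters $\mathrm{TCluster}(PG_i.y)$ and $\mathrm{TCluster}(PG_i.y + 1)$ ($1 \le i \le n$) taken at each $PG_i$ […]. The polygon of ExcludeR is PolyExclude.

1) From PolyColumn, obtain the rectangle rcframe.
2) Within rcframe, PolyColumn and PolyExclude give the ordinates $y_1, y_2, \dots, y_n$.
3) The lines $y = y_i$ and $y = y_{i+1}$ ($1 \le i < n$) cut a rectangle rcscan out of rcframe; from rcscan, PolyColumn and PolyExclude, the PMRegions inside rcscan are obtained.
4) From rcframe, PolyColumn and PolyExclude, each PMRegion found in this way is assigned layer = i […].

Figure 1: PMRegions. The example contains 39 PMRegions, $PMR_i$ ($1 \le i \le 39$); the figure labels PMR2, PMR5, PMR7, PMR10, PMR13, PMR15, PMR18, PMR20, PMR22, PMR25, PMR27, PMR30, PMR33, PMR35 and PMR38.

3

3.1

The following relations are defined on the nodes of the HLMTree.

1) IsLeftColumn on Column: for $C_1, C_2 \subset \mathrm{Column}$, $\forall P_1(x_1, y_1) \in C_1\,\forall P_2(x_2, y_2) \in C_2\,(x_1 < x_2) \Rightarrow \langle C_1, C_2 \rangle \in \mathrm{IsLeftColumn}$.
2) IsUpTile on Tile: for $T_1, T_2 \subset \mathrm{PMRegion}$, $\forall P_1(x_1, y_1) \in T_1\,(\forall P_2(x_2, y_2) \in T_2\,(y_1 < y_2)) \Rightarrow \langle T_1, T_2 \rangle \in \mathrm{IsUpTile}$.
3) IsLeftSeg on LineSeg: for $S_1, S_2 \subset \mathrm{Tile}$, $\forall P_1(x_1, y_1) \in S_1\,(\forall P_2(x_2, y_2) \in S_2\,(x_1 < x_2)) \Rightarrow \langle S_1, S_2 \rangle \in \mathrm{IsLeftSeg}$.
4) IsLeftBox on Box: for $B_1, B_2 \subset \mathrm{LineSeg}$, $\forall P_1(x_1, y_1) \in B_1\,(\forall P_2(x_2, y_2) \in B_2\,(x_1 < x_2)) \Rightarrow \langle B_1, B_2 \rangle \in \mathrm{IsLeftBox}$.
5) IsUpPMRegion on PMRegion: for $PMR_1, PMR_2 \subset \mathrm{Column}$, $PMR_1.layer = PMR_2.layer - 1 \Rightarrow \langle PMR_1, PMR_2 \rangle \in \mathrm{IsUpPMRegion}$.
6) IsLeftPMRegion on PMRegion: for $PMR_1, PMR_2 \subset \mathrm{Column}$, if $PMR_1.layer = PMR_2.layer$ and $PMR_1$ lies immediately to the left of $PMR_2$, then $\langle PMR_1, PMR_2 \rangle \in \mathrm{IsLeftPMRegion}$.
7) For $PMR_1$ and $PMR_2$, if there are $PMR_{a_1}, PMR_{a_2}, \dots, PMR_{a_n}$ with $\langle PMR_1, PMR_{a_1} \rangle \in \mathrm{IsLeftPMRegion}$, $\langle PMR_{a_i}, PMR_{a_{i+1}} \rangle \in \mathrm{IsLeftPMRegion}$ ($1 \le i < n$) and $\langle PMR_{a_n}, PMR_2 \rangle \in \mathrm{IsLeftPMRegion}$, then $PMR_1$ is also to the left of $PMR_2$; PMR32 and PMR34 in Figure 1 are an example.
8) IsLeftTextPMRegion on PMRegion: if $PMR_1$ is to the left of $PMR_2$ and there is a $PMR_k$ with $\langle PMR_1, PMR_k \rangle \in \mathrm{IsUpPMRegion}$ and $\langle PMR_2, PMR_k \rangle \in \mathrm{IsUpPMRegion}$, or with $\langle PMR_k, PMR_1 \rangle \in \mathrm{IsUpPMRegion}$ and $\langle PMR_k, PMR_2 \rangle \in \mathrm{IsUpPMRegion}$, then $\langle PMR_1, PMR_2 \rangle \in \mathrm{IsLeftTextPMRegion}$.

3.2

$\mathrm{ReadFirst}(O_1, O_2)$ states that $O_1$ is read before $O_2$. The reading order of a page is the order of its Boxes, the leaves of the HLMTree. The rules are:

$\langle \mathrm{Column}_1, \mathrm{Column}_2 \rangle \in \mathrm{IsLeftColumn} \Rightarrow \mathrm{ReadFirst}(\mathrm{Column}_1, \mathrm{Column}_2)$
$\langle PMR_1, PMR_2 \rangle \in \mathrm{IsUpPMRegion} \Rightarrow \mathrm{ReadFirst}(PMR_1, PMR_2)$
$\langle PMR_1, PMR_2 \rangle \in \mathrm{IsLeftTextPMRegion} \Rightarrow \mathrm{ReadFirst}(PMR_1, PMR_2)$
$\langle T_1, T_2 \rangle \in \mathrm{IsUpTile} \Rightarrow \mathrm{ReadFirst}(T_1, T_2)$
$\langle B_1, B_2 \rangle \in \mathrm{IsLeftBox} \Rightarrow \mathrm{ReadFirst}(B_1, B_2)$
$PMR_1 \in \mathrm{Column}_1 \wedge PMR_2 \in \mathrm{Column}_2 \wedge \mathrm{ReadFirst}(\mathrm{Column}_1, \mathrm{Column}_2) \Rightarrow \mathrm{ReadFirst}(PMR_1, PMR_2)$
$\mathrm{Tile}_1 \in PMR_1 \wedge \mathrm{Tile}_2 \in PMR_2 \wedge \mathrm{ReadFirst}(PMR_1, PMR_2) \Rightarrow \mathrm{ReadFirst}(\mathrm{Tile}_1, \mathrm{Tile}_2)$
$\mathrm{Box}_1 \in \mathrm{Tile}_1 \wedge \mathrm{Box}_2 \in \mathrm{Tile}_2 \wedge \mathrm{ReadFirst}(\mathrm{Tile}_1, \mathrm{Tile}_2) \Rightarrow \mathrm{ReadFirst}(\mathrm{Box}_1, \mathrm{Box}_2)$

For Column, Tile and Box, ReadFirst follows directly from these rules. The PMRegions of a Column are harder: in Figure 1 they cannot simply be read as PMR1, PMR2, …, PMR38, PMR39 (consider PMR33 and PMR38) […].

3.3

PMRegionSet: the set of PMRegions of an rcframe.

PMRegionAtomSet: a set $\mathrm{PMRegionAtomSet} \subseteq \mathrm{PMRegionSet}$ whose PMRegions are tied together by IsUpPMRegion […]; a PMRegionAtomSet is not partitioned any further.

For $\mathrm{PMRegionSet}_1, \mathrm{PMRegionSet}_2 \subset \mathrm{PMRegionSet}$, let $min_1$, $max_1$ and $min_2$, $max_2$ be the smallest and largest layer in each set. Then:

1) if $max_1 < min_2$, $\mathrm{PMRegionSet}_1$ is above $\mathrm{PMRegionSet}_2$; for example {PMR1, PMR2, PMR3} and {PMR4, PMR5, PMR6, PMR7, PMR8};
2) if $min_1 = min_2$ and $max_1 = max_2$, and $PMR_1$ is to the left of $PMR_2$ for $PMR_1 \in \mathrm{PMRegionSet}_1$ and $PMR_2 \in \mathrm{PMRegionSet}_2$ […], $\mathrm{PMRegionSet}_1$ is to the left of $\mathrm{PMRegionSet}_2$; for example {PMR1, PMR4, PMR5, PMR6} and {PMR2, PMR3, PMR7, PMR8}.

For $PMR_1$ and $PMR_2$: if $\langle PMR_1, PMR_2 \rangle \in \mathrm{IsUpPMRegion}$, $PMR_1$ is treated as the parent of $PMR_2$; if $\langle PMR_2, PMR_1 \rangle \in \mathrm{IsUpPMRegion}$, $PMR_2$ is treated as the parent of $PMR_1$ […]; PMR2 and PMR7 are an example.

Let $\mathrm{PMRegionSet} = \mathrm{PMRegionSet}_1 \cup \mathrm{PMRegionSet}_2 \cup \dots \cup \mathrm{PMRegionSet}_n$ with $\mathrm{PMRegionSet}_i \cap \mathrm{PMRegionSet}_j = \emptyset$ ($i \ne j$), such that related $PMR_1$ and $PMR_2$ of PMRegionSet fall into the same $\mathrm{PMRegionSet}_i$ ($1 \le i \le n$). Then $\mathrm{PMRegionSet}_1, \mathrm{PMRegionSet}_2, \dots, \mathrm{PMRegionSet}_n$ is a partition of PMRegionSet:

1) if $\mathrm{PMRegionSet}_i$ is above $\mathrm{PMRegionSet}_{i+1}$ for all $1 \le i < n$, it is a vertical partition of PMRegionSet;
2) if $\mathrm{PMRegionSet}_i$ is to the left of $\mathrm{PMRegionSet}_{i+1}$ for all $1 \le i < n$, it is a horizontal partition of PMRegionSet.

3.4

The reading order of the PMRegions in a PMRegionSet is found by building an ordered tree:

1) Take PMRegionSet as the root.
2) Partition PMRegionSet into $\mathrm{PMRegionSet}_1, \mathrm{PMRegionSet}_2, \dots, \mathrm{PMRegionSet}_k$ and create the nodes $n_1, n_2, \dots, n_k$.
3) For the $\mathrm{PMRegionSet}_i$ of each node $n_i$ ($1 \le i \le k$), partition it into $\mathrm{PMRegionSet}_{i1}, \mathrm{PMRegionSet}_{i2}, \dots, \mathrm{PMRegionSet}_{it}$ and give $n_i$ the children $n_{i1}, n_{i2}, \dots, n_{it}$.
4) For each $n_{ij}$ ($1 \le i \le k$, $1 \le j \le t$), go to 2).
5) […] PMRegion […].
6) […] the order of the PMRegions is obtained […].

For the 39 PMRegions of Figure 1, see Figure 2 […]; the resulting PMRegion sequence is: PMR5, PMR2, PMR7, PMR13, PMR18, PMR10, PMR15, PMR20, PMR25, PMR22, PMR27, PMR30, PMR33, PMR38, PMR35.

3.5

[…] PMRegion […]: for the PMRegionSet of rcFrame, […]: 1) […] PMRegion; 2) […] PMRegion […]; 3) […] PMRegion […].

Vertical partition of a PMRegionSet:

1) Group the PMRegions of PMRegionSet by layer into $\mathrm{PMRegionSet}_1, \mathrm{PMRegionSet}_2, \dots, \mathrm{PMRegionSet}_n$.
2) For $\mathrm{PMRegionSet}_i$ and $\mathrm{PMRegionSet}_j$ ($1 \le i, j \le n$, $i \ne j$), […].

Horizontal partition of a PMRegionSet:

1) PMRegionSet […].
2) […] the PMRegions of PMRegionSet […] IsUpPMRegion […].
3) Let min and max be the smallest and largest layer of the PMRegions in PMRegionSet. A G of max − min + 1 […] gives the set PMRegionGSet. If PMRegionGSet and PMRegionSet − PMRegionGSet […] PMRegionSet, apply 1) to PMRegionGSet and PMRegionSet − PMRegionGSet.
4) Otherwise, for each PMR in PMRegionSet − PMRegionGSet: if some $PMR_G \in \mathrm{PMRegionGSet}$ has PMR to the left of $PMR_G$, put PMR into PMRegionGLeftSet. If PMRegionGLeftSet and PMRegionSet − PMRegionGLeftSet […] PMRegionSet, apply 2) to PMRegionGLeftSet and PMRegionSet − PMRegionGLeftSet.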

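5) Otherwise, for each PMR in PMRegionSet − PMRegionGSet: if some $PMR_G \in \mathrm{PMRegionGSet}$ lies to the left of PMR, put PMR into PMRegionGRightSet. If PMRegionGRightSet and PMRegionSet − PMRegionGRightSet […] PMRegionSet, apply 2) to PMRegionGRightSet and PMRegionSet − PMRegionGRightSet.

For the PMRegionSet {PMR1, PMR2, …, PMR8} of Figure 1 (see Figure 2), this yields {PMR1, PMR4, PMR5, PMR6} and {PMR2, PMR3, PMR7, PMR8} […].

3.6

Within a PMRegionAtomSet, the PMRegions are related by both IsUpPMRegion and IsLeftTextPMRegion. For example, $\langle PMR_{18}, PMR_{20} \rangle \in \mathrm{IsLeftTextPMRegion} \Rightarrow \mathrm{ReadFirst}(PMR_{18}, PMR_{20})$ and $\langle PMR_{15}, PMR_{20} \rangle \in \mathrm{IsUpPMRegion} \Rightarrow \mathrm{ReadFirst}(PMR_{15}, PMR_{20})$, but these two facts alone do not decide between $\mathrm{ReadFirst}(PMR_{15}, PMR_{18})$ and $\mathrm{ReadFirst}(PMR_{18}, PMR_{15})$. The PMRegions are therefore ordered as follows:

1) From IsUpPMRegion, build $G_1 = \langle V_1, E_1 \rangle$ with $V_1 = \{ PMR \mid PMR \in \mathrm{PMRegionAtomSet} \}$ and $E_1 = \{ e \mid e \text{ joins } PMR_i \text{ to } PMR_j \wedge \langle PMR_i, PMR_j \rangle \in \mathrm{IsUpPMRegion} \}$.
2) From IsLeftTextPMRegion, build $G_2 = \langle V_2, E_2 \rangle$ with $V_2 = V_1$ and $E_2 = \{ e \mid e \text{ joins } PMR_i \text{ to } PMR_j \wedge \langle PMR_i, PMR_j \rangle \in \mathrm{IsLeftTextPMRegion} \}$.
3) Build $G = \langle V, E \rangle$ with $V = V_1$ as follows:
   a) initialise $E(G)$;
   b) for every $\langle PMR_i, PMR_j \rangle \in E(G_2)$, add $\langle PMR_i, PMR_j \rangle$ to $E(G)$;
   c) for every $\langle PMR_i, PMR_j \rangle \in E(G_1)$, $\langle PMR_i, PMR_j \rangle$ […];
   d) if $PMR_i$ and $PMR_j$ […] a PMR (in $G_1$), $\langle PMR_i, PMR_j \rangle \in E(G_2)$, and for the PMR […] $PMR_k$ ($k \ne i, j$), $\langle PMR_i, PMR_k \rangle \in E(G_2)$: if $PMR_i$ […] $PMR_j$ and $PMR_j$ […] $PMR_{anc}$ […], add $\langle PMR_i, PMR_{anc} \rangle$ to $E(G)$; else if $PMR_i$ […] $PMR_j$ […] $PMR_{same}$ […], add $\langle PMR_i, PMR \rangle$ to $E(G)$;
   e) if $PMR_i$ and $PMR_j$ […] a PMR, $\langle PMR_i, PMR_j \rangle \in E(G_2)$, and for the PMR […] $PMR_k$ ($k \ne i, j$), $\langle PMR_i, PMR_k \rangle \in E(G_2)$: if $PMR_i$ […] $PMR_j$ and $PMR_i$ […] $PMR_{des}$ […], add $\langle PMR_{des}, PMR_j \rangle$ to $E(G)$; else if $PMR_i$ […] $PMR_j$ […] $PMR_{same}$ […], add $\langle PMR, PMR_j \rangle$ to $E(G)$.
4) The order of the PMRegions is then obtained from G.

Figure 3: graph G for PMRegions of Figure 1.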
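The idea of Section 3.6, merging two precedence relations into one graph and reading an order from it, can be illustrated with a simplified sketch. This is not the paper's procedure: it assumes that E(G) is just the union of the IsUpPMRegion and IsLeftTextPMRegion edges, that the order is produced by a topological sort, and that ties are broken by the smallest PMRegion index. The additional edge rules of step 3 (those involving the PMR_anc, PMR_des and PMR_same regions) are not reproduced, and the name `order_pmregions` is hypothetical.

```python
import heapq
from typing import Dict, Iterable, List, Set, Tuple

# (i, j) means PMR_i must be read before PMR_j.
Edge = Tuple[int, int]


def order_pmregions(nodes: Iterable[int],
                    is_up: Iterable[Edge],
                    is_left_text: Iterable[Edge]) -> List[int]:
    """Topologically sort PMRegion indices under both relations.

    Assumption: G contains exactly the union of both edge sets.
    Ties are broken by the smallest index, which is an arbitrary
    choice made for this sketch only.
    """
    successors: Dict[int, Set[int]] = {n: set() for n in nodes}
    indegree: Dict[int, int] = {n: 0 for n in successors}
    for a, b in set(is_up) | set(is_left_text):
        if b not in successors[a]:
            successors[a].add(b)
            indegree[b] += 1

    # Kahn's algorithm with a min-heap for a deterministic result.
    ready = [n for n, d in indegree.items() if d == 0]
    heapq.heapify(ready)
    order: List[int] = []
    while ready:
        n = heapq.heappop(ready)
        order.append(n)
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                heapq.heappush(ready, m)

    if len(order) != len(indegree):
        raise ValueError("the precedence relations contain a cycle")
    return order


# The two facts quoted in Section 3.6: PMR18 before PMR20 (left-text)
# and PMR15 before PMR20 (up).
example = order_pmregions([15, 18, 20], is_up=[(15, 20)], is_left_text=[(18, 20)])
```

Both constraints hold in `example`, but the relative position of PMR15 and PMR18 is decided here only by the index tie-break. That undetermined pair is exactly what the extra edges of step 3 are meant to settle, so the sketch should not be expected to reproduce the paper's order.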

For the PMRegionAtomSet {PMR9, PMR10, …, PMR28} of Figure 1, the graph G is shown in Figure 3; the resulting sequence is PMR13, PMR18, PMR10, PMR15, PMR20, PMR25, PMR22, PMR27.

4

[…] PMRegion […] V Nikkan […] PMRegion […].

References:

[1] […] [D]. […], 2000.
[2] Gatos B, Mantzaris S, Perantonis S, et al. Automatic page analysis of a digital library from newspaper archives [J]. International Journal of Digital Libraries, 2000, 3(1): 77-84.
[3] Aiello M, Monz C, Todoran L, et al. Document understanding for a broad class of documents [J]. International Journal on Document Analysis and Recognition, 2002, 5(1): 1-16.
[4] Allen J. Maintaining knowledge about temporal intervals [J]. Communications of the ACM, 1983, 26(11): 832-843.
[5] […] Web […] [J]. […], 2004, 18(1): 6-13.