Physical DB Design. B-Trees Index files can become quite large for large main files Indices on index files are possible.

Σχετικά έγγραφα
Αρχεία και Βάσεις Δεδομένων

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 6/5/2006

Πανεπιστήμιο Κρήτης, Τμήμα Επιστήμης Υπολογιστών. Δημήτρης Πλεξουσάκης. Physical DB Design

The challenges of non-stable predicates

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 19/5/2007

Ευρετήρια. Βάσεις Δεδομένων. Διδάσκων: Μαρία Χαλκίδη

C.S. 430 Assignment 6, Sample Solutions

HOMEWORK 4 = G. In order to plot the stress versus the stretch we define a normalized stretch:

2 Composition. Invertible Mappings

EE512: Error Control Coding

ω ω ω ω ω ω+2 ω ω+2 + ω ω ω ω+2 + ω ω+1 ω ω+2 2 ω ω ω ω ω ω ω ω+1 ω ω2 ω ω2 + ω ω ω2 + ω ω ω ω2 + ω ω+1 ω ω2 + ω ω+1 + ω ω ω ω2 + ω

Phys460.nb Solution for the t-dependent Schrodinger s equation How did we find the solution? (not required)

Συστήματα Διαχείρισης Βάσεων Δεδομένων

ΚΥΠΡΙΑΚΟΣ ΣΥΝΔΕΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY 21 ος ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ Δεύτερος Γύρος - 30 Μαρτίου 2011

Finite Field Problems: Solutions

Section 8.3 Trigonometric Equations

Homework 3 Solutions

ST5224: Advanced Statistical Theory II

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 24/3/2007

3.4 SUM AND DIFFERENCE FORMULAS. NOTE: cos(α+β) cos α + cos β cos(α-β) cos α -cos β

Matrices and Determinants

Block Ciphers Modes. Ramki Thurimella

Approximation of distance between locations on earth given by latitude and longitude

Εργαστήριο Ανάπτυξης Εφαρμογών Βάσεων Δεδομένων. Εξάμηνο 7 ο

Fractional Colorings and Zykov Products of graphs

ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΥΠΡΟΥ ΤΜΗΜΑ ΠΛΗΡΟΦΟΡΙΚΗΣ. ΕΠΛ342: Βάσεις Δεδομένων. Χειμερινό Εξάμηνο Φροντιστήριο 10 ΛΥΣΕΙΣ. Επερωτήσεις SQL

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS

department listing department name αχχουντσ ϕανε βαλικτ δδσϕηασδδη σδηφγ ασκϕηλκ τεχηνιχαλ αλαν ϕουν διξ τεχηνιχαλ ϕοην µαριανι

Example Sheet 3 Solutions

Math 6 SL Probability Distributions Practice Test Mark Scheme

Other Test Constructions: Likelihood Ratio & Bayes Tests

The Simply Typed Lambda Calculus

2. THEORY OF EQUATIONS. PREVIOUS EAMCET Bits.

Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8 questions or comments to Dan Fetter 1

Assalamu `alaikum wr. wb.

Areas and Lengths in Polar Coordinates

Section 7.6 Double and Half Angle Formulas

Nowhere-zero flows Let be a digraph, Abelian group. A Γ-circulation in is a mapping : such that, where, and : tail in X, head in

k A = [k, k]( )[a 1, a 2 ] = [ka 1,ka 2 ] 4For the division of two intervals of confidence in R +

DESIGN OF MACHINERY SOLUTION MANUAL h in h 4 0.

Every set of first-order formulas is equivalent to an independent set

Areas and Lengths in Polar Coordinates

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 11/3/2006

Πρόβλημα 1: Αναζήτηση Ελάχιστης/Μέγιστης Τιμής

derivation of the Laplacian from rectangular to spherical coordinates

Ordinal Arithmetic: Addition, Multiplication, Exponentiation and Limit

Exercises 10. Find a fundamental matrix of the given system of equations. Also find the fundamental matrix Φ(t) satisfying Φ(0) = I. 1.

Right Rear Door. Let's now finish the door hinge saga with the right rear door

Lecture 2: Dirac notation and a review of linear algebra Read Sakurai chapter 1, Baym chatper 3

Instruction Execution Times

Capacitors - Capacitance, Charge and Potential Difference

5.4 The Poisson Distribution.

9.09. # 1. Area inside the oval limaçon r = cos θ. To graph, start with θ = 0 so r = 6. Compute dr

Solution Series 9. i=1 x i and i=1 x i.

SCHOOL OF MATHEMATICAL SCIENCES G11LMA Linear Mathematics Examination Solutions

PARTIAL NOTES for 6.1 Trigonometric Identities

ΕΛΛΗΝΙΚΗ ΔΗΜΟΚΡΑΤΙΑ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΡΗΤΗΣ. Ψηφιακή Οικονομία. Διάλεξη 7η: Consumer Behavior Mαρίνα Μπιτσάκη Τμήμα Επιστήμης Υπολογιστών

Inverse trigonometric functions & General Solution of Trigonometric Equations

Συστήματα Διαχείρισης Βάσεων Δεδομένων

Jordan Form of a Square Matrix

4.6 Autoregressive Moving Average Model ARMA(1,1)

Homework 8 Model Solution Section

Statistical Inference I Locally most powerful tests

6.1. Dirac Equation. Hamiltonian. Dirac Eq.

Reminders: linear functions

Section 9.2 Polar Equations and Graphs

Fourier Series. MATH 211, Calculus II. J. Robert Buchanan. Spring Department of Mathematics

[1] P Q. Fig. 3.1

Main source: "Discrete-time systems and computer control" by Α. ΣΚΟΔΡΑΣ ΨΗΦΙΑΚΟΣ ΕΛΕΓΧΟΣ ΔΙΑΛΕΞΗ 4 ΔΙΑΦΑΝΕΙΑ 1

EPL 603 TOPICS IN SOFTWARE ENGINEERING. Lab 5: Component Adaptation Environment (COPE)

SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM

Code Breaker. TEACHER s NOTES

SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM

ΑΡΙΣΤΟΤΕΛΕΙΟ ΠΑΝΕΠΙΣΤΗΜΙΟ ΘΕΣΣΑΛΟΝΙΚΗΣ ΤΜΗΜΑ ΟΔΟΝΤΙΑΤΡΙΚΗΣ ΕΡΓΑΣΤΗΡΙΟ ΟΔΟΝΤΙΚΗΣ ΚΑΙ ΑΝΩΤΕΡΑΣ ΠΡΟΣΘΕΤΙΚΗΣ

Εργαστήριο Ανάπτυξης Εφαρμογών Βάσεων Δεδομένων. Εξάμηνο 7 ο

ΤΕΧΝΟΛΟΓΙΚΟ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΥΠΡΟΥ ΤΜΗΜΑ ΝΟΣΗΛΕΥΤΙΚΗΣ

Galatia SIL Keyboard Information

AKAΔΗΜΙΑ ΕΜΠΟΡΙΚΟΥ ΝΑΥΤΙΚΟΥ ΜΑΚΕΔΟΝΙΑΣ ΣΧΟΛΗ ΜΗΧΑΝΙΚΩΝ ΠΤΥΧΙΑΚΗ ΕΡΓΑΣΙΑ ΘΕΜΑ: Η ΧΡΗΣΗ ΒΙΟΚΑΥΣΙΜΩΝ ΣΤΗΝ ΝΑΥΤΙΛΙΑ ΠΛΕΟΝΕΚΤΗΜΑΤΑ-ΜΕΙΟΝΕΚΤΗΜΑΤΑ ΠΡΟΟΠΤΙΚΕΣ

ΙΠΛΩΜΑΤΙΚΗ ΕΡΓΑΣΙΑ. ΘΕΜΑ: «ιερεύνηση της σχέσης µεταξύ φωνηµικής επίγνωσης και ορθογραφικής δεξιότητας σε παιδιά προσχολικής ηλικίας»

(1) Describe the process by which mercury atoms become excited in a fluorescent tube (3)

Lecture 2. Soundness and completeness of propositional logic

the total number of electrons passing through the lamp.

ΠΤΥΧΙΑΚΗ ΕΡΓΑΣΙΑ ΒΑΛΕΝΤΙΝΑ ΠΑΠΑΔΟΠΟΥΛΟΥ Α.Μ.: 09/061. Υπεύθυνος Καθηγητής: Σάββας Μακρίδης

Srednicki Chapter 55

On a four-dimensional hyperbolic manifold with finite volume

Math221: HW# 1 solutions

Γιπλυμαηική Δπγαζία. «Ανθπυποκενηπικόρ ζσεδιαζμόρ γέθςπαρ πλοίος» Φοςζιάνηρ Αθανάζιορ. Δπιβλέπυν Καθηγηηήρ: Νηθφιανο Π. Βεληίθνο

Pg The perimeter is P = 3x The area of a triangle is. where b is the base, h is the height. In our case b = x, then the area is

Οδηγίες Αγοράς Ηλεκτρονικού Βιβλίου Instructions for Buying an ebook

( y) Partial Differential Equations

ΕΛΛΗΝΙΚΗ ΔΗΜΟΚΡΑΤΙΑ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΡΗΤΗΣ. Ψηφιακή Οικονομία. Διάλεξη 9η: Basics of Game Theory Mαρίνα Μπιτσάκη Τμήμα Επιστήμης Υπολογιστών

Potential Dividers. 46 minutes. 46 marks. Page 1 of 11

CRASH COURSE IN PRECALCULUS

Πανεπιστήμιο Πειραιώς Τμήμα Πληροφορικής Πρόγραμμα Μεταπτυχιακών Σπουδών «Πληροφορική»

Concrete Mathematics Exercises from 30 September 2016

1) Formulation of the Problem as a Linear Programming Model

ΕΛΛΗΝΙΚΗ ΔΗΜΟΚΡΑΤΙΑ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΡΗΤΗΣ. Ψηφιακή Οικονομία. Διάλεξη 10η: Basics of Game Theory part 2 Mαρίνα Μπιτσάκη Τμήμα Επιστήμης Υπολογιστών

Εγκατάσταση λογισμικού και αναβάθμιση συσκευής Device software installation and software upgrade

Numerical Analysis FMN011

Transcript:

B-Trees Index files can become quite large for large main files Indices on index files are possible 3 rd -level index 2 nd -level index 1 st -level index Main file 1

The 1 st -level index consists of pairs (v,b), where b is a pointer to a block B of the main file and v is the key of the first record in the block. The index is also sorted by key values. The 2 nd -level index consists of pairs (v,b) where b points to a block of the 1 st -level index whose first key is v, and so on. Multilevel indexing can be considerably more efficient than a single level of indexing. A multilevel index structure can have many forms. These are collectively referred to as balanced trees (B-trees). The main file is part of the B-tree and it is assumed that the file contains unpinned records. 2

Although the insertion / deletion procedures of single-level index structures can be used, they do not result in the optimum organization of the B-tree: nodes can have between one and the maximum possible number of records. B-trees use a particular insertion / deletion strategy that ensures that no node, except possibly the root, is less than half full. A B-tree is characterized by two parameters d,e: d and e are integers such that, the number of index records a block can hold is 2*d-1 and the number of records of the main file a block can hold is 2*e-1. Convention: the key value in the first record is omitted. It is assumed that all values that are less than the key value of the second record are covered by the missing value. 3

Operations on B-trees: 1. Lookup: to search for a record with key value v, find a path from the root of the B-tree to some leaf node where the desired record will be found if it exists Paths in B-trees: every search path begins at the root. When a block B is reached during the search, if B is a leaf node, then B has to be examined for a record with key value v. If B is not a leaf, then it is an index record. The key value that covers v has to be found and the associated pointer must be followed, leading to another index block or main file block. Property: the key value in record i of block B is the lowest key value of any leaf descending from the i-th child of B. 4

This property and the fact that the main file is sorted, guarantee that, if a record exists in a leaf node, then it can only be found by following the pointers as described above. 2. Modification: If a key field is to be modified, then modification is performed by a deletion followed by an insertion. Otherwise, modification is a lookup followed by the rewriting of the record involved. 5

Example: d=e=2, i.e., 3 records in blocks of the index and main files. B1 25 144 B2 B3 B4 9 64 100 196 B5 B6 B7 B8 B9 B10 B11 1 4 9 16 25 36 49 64 81 100121 144 169 196 225 256 6

3. Insertion: to insert a record with key value v Apply the lookup procedure to find the block B in which this record belongs If there are fewer than 2e-1 records in B, insert the new record in sorted order If there are already 2e-1 records in B, create a new block B1 and divide the records of B and the inserted record in two groups of e records each. The first e go to block B and the remaining e to block B1. A record for B1 needs to be inserted in the parent record of B. The insertion procedure is applied recursively (with d in place of e) for inserting a record for B1 to the right of the record for B. 7

3. Insertion: to insert a record with key value v If many ancestors of B have the maximum 2d-1 records, the effects of inserting a record may ripple up the tree. It is only ancestors of B that are affected. If the insertion ripples up to the root, then the root node is split and a new node with two children is created. Example: assume we want to insert the record with key value 32 in the B-tree of the previous example Record 32 belongs to B7, but B7 is full. A new block (B12) is created and a record for B12 must be inserted in B3. B3 is also full, so a new block (B13) is created. 8

Example: d=e=2, i.e., 3 records in blocks of the index and main files. B1 25 144 B2 B3 B4 9 64 100 196 B5 B6 B7 B8 B9 B10 B11 1 4 9 16 25 36 49 64 81 100121 144 169 196 225 256 Record 32 belongs to B7, but B7 is full; a new block (B12) must be created and a record must be inserted in B3. But B3 is also full. 9

B15 64 B1 25 144 B14 144 B13 B2 B3 100 B4 9 36 196 B5 B6 B7 B8 B9 B10 B11 1 4 9 16 25 32 64 81 100121 144 169 196 225 256 36 49 B12 10

4. Deletion: to delete a record with key value v Apply the lookup procedure to find the block B in which this record belongs If after the deletion, B still has e or more records, we re done. If the deleted record was the first in B, the value of the parent record must be changed to contain the value of the new first key of B. If B is the first child of its parent node, the parent has no key value for B, so the parent s parent must be changed. The process continues until an ancestor A1 of B is found that is not the first child of its parent A2. Then, the new lowest value of B goes in the record of A2 that points to A1. 11

4. Deletion: to delete a record with key value v If after deletion block B has e-1 records, we examine the block B1 that has the same parent as B and that is immediately next to B, If B1 has more than e records, we distribute the records of B and B1 as evenly as possible, keeping them sorted. The key values in the parents of B and B1 may need to be modified. If B1 has only e records, then combine the records of B and B1. This requires that a record be deleted from the parent node. Example: delete the record with key value 64 in the B-tree of the previous example. 12

B15 64 B1 25 144 B14 144 B13 B2 B3 100 B4 9 36 196 B5 B6 B7 B8 B9 B10 B11 1 4 9 16 25 32 64 81 100121 144 169 196 225 256 36 49 B12 13

B-tree after the deletion of record 64 B14 25 81 B13 B2 B3 9 36 144 196 B5 B6 B7 B8 B10 B11 1 4 9 16 25 32 81 100121 144 169 196 225 256 36 49 B12 14

Cost of Operations on B-trees: Assume that a file of n records is organized as a B-tree with parameters d and e. Then the tree will have at most n/e leaf nodes and n/de parent nodes of leaf nodes, n/d 2 e parents of parents of leaf nodes, etc. Lookup: if there exist i nodes on a path from the root to a leaf node where a particular record is located, then i block accesses are needed. For insertion, deletion and modification, 2 + log d (n/e) accesses are required on average We will assume that all operations take 2 + log d (n/e) block accesses on average. 15

Example: A file contains 1,000,000 records of 200 bytes each. Blocks are 2 12 =4096 bytes long. The length of the key fields is 20 bytes. Pointers take 4 bytes. e=10 (up to 20 records can fit in a block) 171 index records can fit in a block (171*20 + 171*4 = 4084). Thus, d=86. The expected number of block accessed per operation is 2 + log d (n/e) = 2 + log 86 (1000000 / 10) < 5 16