Principles of Database Systems

V. Megalooikonomou
Functional Dependencies (Συναρτησιακές Εξαρτήσεις)
(based on notes by Silberschatz, Korth, and Sudarshan and notes by C. Faloutsos)

General Overview
Formal query languages: relational algebra and calculi
Commercial query languages: SQL, QBE, (QUEL)
Integrity constraints
Functional Dependencies
Normalization - good DB design

Overview
Domain; Ref. Integrity constraints
Assertions and Triggers
Security
Functional Dependencies: why, definition, Armstrong's axioms, closure and cover

Functional Dependencies
motivation: good tables
takes1 (ssn, c-id, grade, name, address)
good or bad?

Functional Dependencies
takes1 (ssn, c-id, grade, name, address)
Ssn  c-id  Grade  Name   Address
123  413   A      smith  Main
123  415   B      smith  Main
123  211   A      smith  Main

Functional Dependencies
Bad - why?

Functional Dependencies
Redundancy: wasted space, inconsistencies, insertion/deletion anomalies (later)
What caused the problem?

Functional Dependencies
name depends on ssn
define "depends"

Functional Dependencies
a -> b
Definition: a functionally determines b (a συναρτησιακά καθορίζει το b)

Functional Dependencies
Informally: if you know a, there is only one b to match

Functional Dependencies
formally: X -> Y holds iff for all tuples t1, t2: t1[X] = t2[X] => t1[Y] = t2[Y]
if two tuples agree on the X attribute, they *must* agree on the Y attribute, too (e.g., if ssn is the same, so should address)
a functional dependency is a generalization of the notion of a key. Why?
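The formal definition can be turned into a direct check on a table instance. The sketch below is only an illustration: the helper name `fd_holds` and the dictionary-per-row representation are assumptions, not part of the notes. Note that an instance can only refute an FD; an FD is a constraint on all legal instances, so a passing check on one instance does not prove it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


# The takes1 instance from the slides.
takes1 = [
    {"ssn": 123, "c-id": 413, "grade": "A", "name": "smith", "address": "Main"},
    {"ssn": 123, "c-id": 415, "grade": "B", "name": "smith", "address": "Main"},
    {"ssn": 123, "c-id": 211, "grade": "A", "name": "smith", "address": "Main"},
]

print(fd_holds(takes1, ["ssn"], ["name", "address"]))  # True
print(fd_holds(takes1, ["ssn"], ["grade"]))            # False
print(fd_holds(takes1, ["ssn", "c-id"], ["grade"]))    # True
```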

Functional Dependencies
X, Y can be sets of attributes
other examples?

Functional Dependencies
ssn -> name, address
ssn, c-id -> grade

Functional Dependencies
K is a superkey for relation R iff K -> R
K is a candidate key for relation R iff:
K -> R
for no a ⊂ K, a -> R

Functional Dependencies
Closure (κλειστότητα) of a set of FDs: all implied FDs
e.g.:
ssn -> name, address
ssn, c-id -> grade
imply (συνάγουν)
ssn, c-id -> grade, name, address
ssn, c-id -> ssn

FDs - Armstrong's axioms
Closure of a set of FDs: all implied FDs
e.g.:
ssn -> name, address
ssn, c-id -> grade
how to find all the implied ones, systematically?

FDs - Armstrong's axioms
Armstrong's axioms guarantee soundness and completeness:
Reflexivity (ανακλαστικότητα): Y ⊆ X => X -> Y
e.g., ssn, name -> ssn
Augmentation (επαυξητικότητα): X -> Y => XW -> YW
e.g., ssn -> name then ssn, grade -> name, grade

FDs - Armstrong's axioms
Transitivity (μεταβατικότητα): X -> Y, Y -> Z => X -> Z
e.g., ssn -> address and address -> county-tax-rate
THEN: ssn -> county-tax-rate

FDs - Armstrong's axioms
Reflexivity: Y ⊆ X => X -> Y
Augmentation: X -> Y => XW -> YW
Transitivity: X -> Y, Y -> Z => X -> Z
"sound" and "complete"

FDs - finding the closure F+
F+ = F
repeat
  for each functional dependency f in F+
    apply reflexivity and augmentation rules on f
    add the resulting functional dependencies to F+
  for each pair of functional dependencies f1 and f2 in F+
    if f1 and f2 can be combined using transitivity
      then add the resulting functional dependency to F+
until F+ does not change any further
We can further simplify manual computation of F+ by using the following additional rules
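The loop above can be run literally on a very small schema. The following is a brute-force sketch, not an efficient method: F+ grows exponentially with the number of attributes, so it is only practical for three or four of them. The function names and the representation of an FD as a pair of frozensets are assumptions made for this illustration.

```python
from itertools import combinations


def all_subsets(attrs):
    """Every subset of attrs (including the empty set) as a frozenset."""
    items = sorted(attrs)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def fd_closure(attrs, fds):
    """Brute-force F+ over a small attribute set via Armstrong's axioms."""
    subsets = all_subsets(attrs)
    f_plus = {(frozenset(lhs), frozenset(rhs)) for lhs, rhs in fds}
    # Reflexivity: X -> Y whenever Y is a subset of X.
    f_plus |= {(x, y) for x in subsets for y in subsets if y <= x}
    while True:
        derived = set()
        for x, y in f_plus:
            # Augmentation: X -> Y gives XW -> YW for every W.
            for w in subsets:
                derived.add((x | w, y | w))
            # Transitivity: X -> Y and Y -> Z give X -> Z.
            for y2, z in f_plus:
                if y == y2:
                    derived.add((x, z))
        if derived <= f_plus:
            return f_plus  # nothing new: fixpoint reached
        f_plus |= derived


# Hypothetical toy schema R = (A, B, C) with F = {A -> B, B -> C}.
F = [("A", "B"), ("B", "C")]
f_plus = fd_closure("ABC", F)
print((frozenset("A"), frozenset("C")) in f_plus)  # True
print((frozenset("C"), frozenset("A")) in f_plus)  # False
```

Even for this three-attribute schema F+ contains dozens of (mostly trivial) FDs, which is why the attribute closure is normally used instead of materializing F+.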

FDs - Armstrong's axioms
Additional rules:
Union (ένωση): X -> Y, X -> Z => X -> YZ
Decomposition (διασπαστικότητα): X -> YZ => X -> Y, X -> Z
Pseudo-transitivity (ψευδομεταβατικότητα): X -> Y, YW -> Z => XW -> Z

FDs - Armstrong's axioms
Prove Union from the three axioms:
X -> Y, X -> Z =>? X -> YZ

FDs - Armstrong's axioms
Prove Union from the three axioms:
X -> Y (1)
X -> Z (2)
(1) + augm. w/ Z => XZ -> YZ (3)
(2) + augm. w/ X => XX -> XZ (4)
but XX is X; thus (3) + (4) and transitivity => X -> YZ

FDs - Armstrong's axioms
Prove Pseudo-transitivity:
X -> Y, YW -> Z =>? XW -> Z
using the axioms:
Y ⊆ X => X -> Y
X -> Y => XW -> YW
X -> Y, Y -> Z => X -> Z
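One possible derivation of pseudo-transitivity, written out step by step as a worked example (the slide leaves it as an exercise, so this is a suggested answer rather than the author's):

```latex
\begin{align*}
X  &\to Y  && \text{(given)} \\
XW &\to YW && \text{(augmentation with } W\text{)} \\
YW &\to Z  && \text{(given)} \\
XW &\to Z  && \text{(transitivity)}
\end{align*}
```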

FDs - Armstrong's axioms
Prove Decomposition:
X -> YZ =>? X -> Y, X -> Z
using the axioms:
Y ⊆ X => X -> Y
X -> Y => XW -> YW
X -> Y, Y -> Z => X -> Z
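One possible derivation of decomposition, again as a suggested answer to the exercise. Only reflexivity and transitivity are needed:

```latex
\begin{align*}
X  &\to YZ && \text{(given)} \\
YZ &\to Y  && \text{(reflexivity, } Y \subseteq YZ\text{)} \\
X  &\to Y  && \text{(transitivity)} \\
YZ &\to Z  && \text{(reflexivity, } Z \subseteq YZ\text{)} \\
X  &\to Z  && \text{(transitivity)}
\end{align*}
```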

FDs - Closure F+
Given a set F of FDs (on a schema), F+ is the set of all implied FDs. E.g.,
takes(ssn, c-id, grade, name, address)
F:
  ssn, c-id -> grade
  ssn -> name, address

FDs - Closure F+
F+:
  ssn, c-id -> grade
  ssn -> name, address
  ssn -> ssn
  ssn, c-id -> address
  c-id, address -> c-id
  ...

FDs - Closure F+
R = (A, B, C, G, H, I)
F = {A -> B, A -> C, CG -> H, CG -> I, B -> H}
Some members of F+:
A -> H
AG -> I
CG -> HI
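Membership in F+ can be tested without enumerating F+: X -> Y is implied by F exactly when Y is contained in the attribute closure of X. The sketch below assumes single-letter attribute names so that a string such as "CG" can stand for the set {C, G}; the helper names are illustrative.

```python
def attribute_closure(attrs, fds):
    """Attributes determined by attrs under fds (fixpoint iteration)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is a member of F+."""
    return set(rhs) <= attribute_closure(lhs, fds)


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(implies(F, "A", "H"))    # True
print(implies(F, "AG", "I"))   # True
print(implies(F, "CG", "HI"))  # True
print(implies(F, "A", "I"))    # False
```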

FDs - Closure A+
Given a set F of FDs (on a schema), A+ is the set of all attributes determined by A:
takes(ssn, c-id, grade, name, address)
F:
  ssn, c-id -> grade
  ssn -> name, address
{ssn}+ = ??

FDs - Closure A+
takes(ssn, c-id, grade, name, address)
F:
  ssn, c-id -> grade
  ssn -> name, address
{ssn}+ = {ssn, name, address}

FDs - Closure A+
takes(ssn, c-id, grade, name, address)
F:
  ssn, c-id -> grade
  ssn -> name, address
{c-id}+ = ??

FDs - Closure A+
takes(ssn, c-id, grade, name, address)
F:
  ssn, c-id -> grade
  ssn -> name, address
{c-id, ssn}+ = ??

FDs - Closure A+
if A+ = {all attributes of table} then A is a superkey; it is a candidate key only if, in addition, no proper subset of A has the same property

FDs - Closure A+
Algorithm to compute α+, the closure of α under F:
result := α;
while (changes to result) do
  for each β -> γ in F do
    begin
      if β ⊆ result then result := result ∪ γ
    end
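A minimal Python sketch of the algorithm above, applied to the takes schema. Representing each FD as a (β, γ) pair of tuples is an assumption of this example, as is the function name.

```python
def attribute_closure(alpha, fds):
    """Compute alpha+ under fds, following the slide's algorithm."""
    result = set(alpha)
    changed = True
    while changed:                       # while (changes to result)
        changed = False
        for beta, gamma in fds:          # for each beta -> gamma in F
            if set(beta) <= result and not set(gamma) <= result:
                result |= set(gamma)     # result := result U gamma
                changed = True
    return result


# F for takes(ssn, c-id, grade, name, address)
F = [
    (("ssn", "c-id"), ("grade",)),
    (("ssn",), ("name", "address")),
]

print(sorted(attribute_closure({"ssn"}, F)))
# ['address', 'name', 'ssn']
print(sorted(attribute_closure({"c-id"}, F)))
# ['c-id']
print(sorted(attribute_closure({"c-id", "ssn"}, F)))
# ['address', 'c-id', 'grade', 'name', 'ssn']
```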

FDs - Closure A+ (example)
R = (A, B, C, G, H, I)
F = {A -> B, A -> C, CG -> H, CG -> I, B -> H}
(AG)+
1. result = AG
2. result = ABCG (A -> C and A -> B)
3. result = ABCGH (CG -> H and CG ⊆ AGBC)
4. result = ABCGHI (CG -> I and CG ⊆ AGBCH)
Is AG a candidate key?
1. Is AG a superkey?
   1. Does AG -> R?
2. Is any subset of AG a superkey?
   1. Does A+ contain all of R?
   2. Does G+ contain all of R?
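The two questions in this example (superkey, then minimality) can be checked mechanically with the attribute closure. The sketch below assumes single-letter attributes written as strings; `is_superkey` and `is_candidate_key` are illustrative names, not from the notes.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """K is a superkey iff K+ contains every attribute of R."""
    return attribute_closure(attrs, fds) >= set(schema)


def is_candidate_key(attrs, schema, fds):
    """A superkey with no proper subset that is also a superkey."""
    attrs = sorted(set(attrs))
    if not is_superkey(attrs, schema, fds):
        return False
    return not any(
        is_superkey(sub, schema, fds)
        for r in range(len(attrs))          # sizes 0 .. len-1: proper subsets
        for sub in combinations(attrs, r)
    )


R = "ABCGHI"
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(is_superkey("AG", R, F))        # True
print(is_superkey("A", R, F))         # False
print(is_superkey("G", R, F))         # False
print(is_candidate_key("AG", R, F))   # True
print(is_candidate_key("ABG", R, F))  # False
```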

FDs - A+ closure
Diagrams
AB -> C (1)
A -> BC (2)
B -> C (3)
A -> B (4)
[diagram over attributes A, B, C]

FDs - canonical cover Fc
Given a set F of FDs (on a schema), Fc (ελάχιστο κάλυμμα) is a minimal set of equivalent FDs. E.g.,
takes(ssn, c-id, grade, name, address)
F:
  ssn, c-id -> grade
  ssn -> name, address
  ssn, name -> name, address
  ssn, c-id -> grade, name

FDs - canonical cover Fc
Fc:
  ssn, c-id -> grade
  ssn -> name, address
F:
  ssn, c-id -> grade
  ssn -> name, address
  ssn, name -> name, address
  ssn, c-id -> grade, name

FDs - canonical cover Fc
why do we need it?
  easier to compute candidate keys
define it properly
compute it efficiently

FDs - canonical cover Fc
define it properly - three properties:
every FD a -> b has no extraneous attributes on the RHS
same for the LHS
all LHS parts are unique

FDs - canonical cover Fc
extraneous attribute: an attribute of an FD is extraneous
if the closure F+ is the same before and after its elimination
or, equivalently, if F-before implies F-after and vice-versa
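In practice the "same closure" condition is tested with attribute closures instead of comparing two full F+ sets. The sketch below uses the standard textbook tests; the function names are assumptions, and the FD set is the one traced later in these notes (single-letter attributes written as strings).

```python
def attribute_closure(attrs, fds):
    """Attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def extraneous_in_lhs(attr, lhs, rhs, fds):
    """attr is extraneous on the LHS if F already implies (lhs - attr) -> rhs."""
    reduced = set(lhs) - {attr}
    return set(rhs) <= attribute_closure(reduced, fds)


def extraneous_in_rhs(attr, lhs, rhs, fds):
    """attr is extraneous on the RHS if lhs still determines it
    after attr is removed from this FD's right-hand side."""
    others = [fd for fd in fds if fd != (lhs, rhs)]
    weakened = others + [(lhs, set(rhs) - {attr})]
    return attr in attribute_closure(lhs, weakened)


F = [("AB", "C"), ("A", "BC"), ("B", "C"), ("A", "B")]

print(extraneous_in_lhs("A", "AB", "C", F))  # True:  B -> C is already in F
print(extraneous_in_rhs("C", "A", "BC", F))  # True:  A -> B and B -> C give A -> C
print(extraneous_in_rhs("C", "B", "C", F))   # False: nothing else gives B -> C
```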

FDs - canonical cover Fc
F:
  ssn, c-id -> grade
  ssn -> name, address
  ssn, name -> name, address
  ssn, c-id -> grade, name

FDs - canonical cover Fc
Algorithm:
examine each FD; drop extraneous LHS or RHS attributes
merge FDs with same LHS
repeat until no change

FDs - canonical cover Fc
Trace the algorithm for:
AB -> C (1)
A -> BC (2)
B -> C (3)
A -> B (4)

FDs - canonical cover Fc
Trace the algorithm for:
AB -> C (1)
A -> BC (2)
B -> C (3)
A -> B (4)
(4) and (2) merge:
AB -> C (1)
A -> BC (2)
B -> C (3)

FDs - canonical cover Fc
before:
AB -> C (1)
A -> BC (2)
B -> C (3)
in (2): C is extraneous
after:
AB -> C (1)
A -> B (2')
B -> C (3)

FDs - canonical cover Fc
before:
AB -> C (1)
A -> B (2')
B -> C (3)
in (1): A is extraneous
after:
B -> C (1')
A -> B (2')
B -> C (3)

FDs - canonical cover Fc
before:
B -> C (1')
A -> B (2')
B -> C (3)
(1') and (3) merge
after:
A -> B (2')
B -> C (3)
nothing is extraneous: canonical cover

FDs - canonical cover Fc
BEFORE:
AB -> C (1)
A -> BC (2)
B -> C (3)
A -> B (4)
AFTER:
A -> B (2')
B -> C (3)
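The trace can be reproduced with a short program. This is a sketch of the merge-and-drop loop under some assumptions of this example: FDs are pairs of frozensets, one extraneous attribute is removed at a time before restarting, and a single-attribute LHS is never reduced further. A canonical cover is not unique in general, so a different scan order may give a different (equivalent) result for other inputs.

```python
def attribute_closure(attrs, fds):
    """Attributes determined by attrs under fds (sets of attributes)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def canonical_cover(fds):
    cover = [(frozenset(lhs), frozenset(rhs)) for lhs, rhs in fds]
    changed = True
    while changed:
        changed = False
        # Merge FDs that share a left-hand side (union rule).
        merged = {}
        for lhs, rhs in cover:
            merged[lhs] = merged.get(lhs, frozenset()) | rhs
        if len(merged) != len(cover):
            changed = True
        cover = list(merged.items())
        # Remove a single extraneous attribute, then start over.
        for i, (lhs, rhs) in enumerate(cover):
            reduced = None
            for a in sorted(lhs):
                if len(lhs) > 1 and rhs <= attribute_closure(lhs - {a}, cover):
                    reduced = (lhs - {a}, rhs)
                    break
            if reduced is None:
                for a in sorted(rhs):
                    rest = cover[:i] + [(lhs, rhs - {a})] + cover[i + 1:]
                    if a in attribute_closure(lhs, rest):
                        reduced = (lhs, rhs - {a})
                        break
            if reduced is not None:
                cover[i] = reduced
                changed = True
                break
        # An FD whose right-hand side became empty says nothing; drop it.
        cover = [(lhs, rhs) for lhs, rhs in cover if rhs]
    return set(cover)


F = [("AB", "C"), ("A", "BC"), ("B", "C"), ("A", "B")]
result = canonical_cover(F)
for lhs, rhs in sorted(("".join(sorted(l)), "".join(sorted(r))) for l, r in result):
    print(f"{lhs} -> {rhs}")
# A -> B
# B -> C
```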

Overview - conclusions
Domain; Ref. Integrity constraints
Assertions and Triggers
Functional Dependencies: why, definition, Armstrong's axioms, closure and cover