Προχωρημένα Θέματα σε Κατανεμημένα Συστήματα GFS / HDFS

Σχετικά έγγραφα
Google File System, HDFS, BigTable, Hbase

Google File System, HDFS, BigTable και HBase. Διδάσκων: Νεκτάριος Κοζύρης, Kαθηγητής

Google File System, HDFS, BigTable, Hbase

Instruction Execution Times

EPL 660: Lab 4 Introduction to Hadoop

Physical DB Design. B-Trees Index files can become quite large for large main files Indices on index files are possible.

Google File System, HDFS, BigTable, Hbase

EE512: Error Control Coding

Block Ciphers Modes. Ramki Thurimella

Συστήματα Διαχείρισης Βάσεων Δεδομένων

the total number of electrons passing through the lamp.

Στο εστιατόριο «ToDokimasesPrinToBgaleisStonKosmo?» έξω από τους δακτυλίους του Κρόνου, οι παραγγελίες γίνονται ηλεκτρονικά.

The challenges of non-stable predicates

Homework 3 Solutions

(C) 2010 Pearson Education, Inc. All rights reserved.

The Simply Typed Lambda Calculus

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 6/5/2006

HOMEWORK 4 = G. In order to plot the stress versus the stretch we define a normalized stretch:

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 19/5/2007

ΠΑΝΕΠΙΣΤΗΜΙΟ ΘΕΣΣΑΛΙΑΣ ΤΜΗΜΑ ΠΟΛΙΤΙΚΩΝ ΜΗΧΑΝΙΚΩΝ ΤΟΜΕΑΣ ΥΔΡΑΥΛΙΚΗΣ ΚΑΙ ΠΕΡΙΒΑΛΛΟΝΤΙΚΗΣ ΤΕΧΝΙΚΗΣ. Ειδική διάλεξη 2: Εισαγωγή στον κώδικα της εργασίας

Approximation of distance between locations on earth given by latitude and longitude

Potential Dividers. 46 minutes. 46 marks. Page 1 of 11

2 Composition. Invertible Mappings

ΕΙΣΑΓΩΓΗ ΣΤΗ ΣΤΑΤΙΣΤΙΚΗ ΑΝΑΛΥΣΗ

ΑΝΙΧΝΕΥΣΗ ΓΕΓΟΝΟΤΩΝ ΒΗΜΑΤΙΣΜΟΥ ΜΕ ΧΡΗΣΗ ΕΠΙΤΑΧΥΝΣΙΟΜΕΤΡΩΝ ΔΙΠΛΩΜΑΤΙΚΗ ΕΡΓΑΣΙΑ

DESIGN OF MACHINERY SOLUTION MANUAL h in h 4 0.

Lecture 2: Dirac notation and a review of linear algebra Read Sakurai chapter 1, Baym chatper 3

Εγκατάσταση λογισμικού και αναβάθμιση συσκευής Device software installation and software upgrade

Εικονική Μνήμη (virtual memory)

Πανεπιστήµιο Πειραιώς Τµήµα Πληροφορικής

Other Test Constructions: Likelihood Ratio & Bayes Tests

Section 8.3 Trigonometric Equations

Κάθε γνήσιο αντίγραφο φέρει υπογραφή του συγγραφέα. / Each genuine copy is signed by the author.

Case 1: Original version of a bill available in only one language.

Bayesian statistics. DS GA 1002 Probability and Statistics for Data Science.

C.S. 430 Assignment 6, Sample Solutions

derivation of the Laplacian from rectangular to spherical coordinates

Abstract Storage Devices

Main source: "Discrete-time systems and computer control" by Α. ΣΚΟΔΡΑΣ ΨΗΦΙΑΚΟΣ ΕΛΕΓΧΟΣ ΔΙΑΛΕΞΗ 4 ΔΙΑΦΑΝΕΙΑ 1

Study of In-vehicle Sound Field Creation by Simultaneous Equation Method

ST5224: Advanced Statistical Theory II

ΠΑΝΔΠΙΣΗΜΙΟ ΜΑΚΔΓΟΝΙΑ ΠΡΟΓΡΑΜΜΑ ΜΔΣΑΠΣΤΥΙΑΚΧΝ ΠΟΤΓΧΝ ΣΜΗΜΑΣΟ ΔΦΑΡΜΟΜΔΝΗ ΠΛΗΡΟΦΟΡΙΚΗ

6.1. Dirac Equation. Hamiltonian. Dirac Eq.

ΚΑΘΟΡΙΣΜΟΣ ΠΑΡΑΓΟΝΤΩΝ ΠΟΥ ΕΠΗΡΕΑΖΟΥΝ ΤΗΝ ΠΑΡΑΓΟΜΕΝΗ ΙΣΧΥ ΣΕ Φ/Β ΠΑΡΚΟ 80KWp

Example of the Baum-Welch Algorithm

A browser-based digital signing solution over the web

Εργαστήριο Ανάπτυξης Εφαρμογών Βάσεων Δεδομένων. Εξάμηνο 7 ο

Modbus basic setup notes for IO-Link AL1xxx Master Block

[1] P Q. Fig. 3.1

Δίκτυα Δακτυλίου. Token Ring - Polling

ΠΕΡΙΕΧΟΜΕΝΑ. Κεφάλαιο 1: Κεφάλαιο 2: Κεφάλαιο 3:

CMOS Technology for Computer Architects

Προχωρηµένα Θέµατα Αρχιτεκτονικής Η/Υ. Storage Systems.. Λιούπης

Εθνικό Μετσόβιο Πολυτεχνείο Σχολή Ηλεκτρολόγων Μηχανικών - Μηχανικών Υπολογιστών. Αρχιτεκτονική Υπολογιστών Νεκτάριος Κοζύρης.

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 24/3/2007

CYTA Cloud Server Set Up Instructions

Elements of Information Theory

ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΑΤΡΩΝ ΤΜΗΜΑ ΗΛΕΚΤΡΟΛΟΓΩΝ ΜΗΧΑΝΙΚΩΝ ΚΑΙ ΤΕΧΝΟΛΟΓΙΑΣ ΥΠΟΛΟΓΙΣΤΩΝ ΤΟΜΕΑΣ ΣΥΣΤΗΜΑΤΩΝ ΗΛΕΚΤΡΙΚΗΣ ΕΝΕΡΓΕΙΑΣ

Example Sheet 3 Solutions

Εργαστήριο Ανάπτυξης Εφαρμογών Βάσεων Δεδομένων. Εξάμηνο 7 ο

Lecture 34 Bootstrap confidence intervals

Practice Exam 2. Conceptual Questions. 1. State a Basic identity and then verify it. (a) Identity: Solution: One identity is csc(θ) = 1

VBA ΣΤΟ WORD. 1. Συχνά, όταν ήθελα να δώσω ένα φυλλάδιο εργασίας με ασκήσεις στους μαθητές έκανα το εξής: Version ΗΜΙΤΕΛΗΣ!!!!

ΚΥΠΡΙΑΚΟΣ ΣΥΝΔΕΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY 21 ος ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ Δεύτερος Γύρος - 30 Μαρτίου 2011

Assalamu `alaikum wr. wb.

ω ω ω ω ω ω+2 ω ω+2 + ω ω ω ω+2 + ω ω+1 ω ω+2 2 ω ω ω ω ω ω ω ω+1 ω ω2 ω ω2 + ω ω ω2 + ω ω ω ω2 + ω ω+1 ω ω2 + ω ω+1 + ω ω ω ω2 + ω

7 Present PERFECT Simple. 8 Present PERFECT Continuous. 9 Past PERFECT Simple. 10 Past PERFECT Continuous. 11 Future PERFECT Simple

Partial Trace and Partial Transpose

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 11/3/2006

Γιπλυμαηική Δπγαζία. «Ανθπυποκενηπικόρ ζσεδιαζμόρ γέθςπαρ πλοίος» Φοςζιάνηρ Αθανάζιορ. Δπιβλέπυν Καθηγηηήρ: Νηθφιανο Π. Βεληίθνο

Strain gauge and rosettes

Phys460.nb Solution for the t-dependent Schrodinger s equation How did we find the solution? (not required)

Bayesian modeling of inseparable space-time variation in disease risk

(1) Describe the process by which mercury atoms become excited in a fluorescent tube (3)

6.003: Signals and Systems. Modulation

ΙΑΤΜΗΜΑΤΙΚΟ ΠΡΟΓΡΑΜΜΑ ΜΕΤΑΠΤΥΧΙΑΚΩΝ ΣΠΟΥ ΩΝ ΣΤΑ ΠΛΗΡΟΦΟΡΙΑΚΑ ΣΥΣΤΗΜΑΤΑ "VIDEO ΚΑΤΟΠΙΝ ΖΗΤΗΣΗΣ" ΑΝΝΑ ΜΟΣΧΑ Μ 11 / 99

Δίκτυα Επικοινωνιών ΙΙ: OSPF Configuration

Capacitors - Capacitance, Charge and Potential Difference

ΠΑΝΔΠΗΣΖΜΗΟ ΠΑΣΡΩΝ ΣΜΖΜΑ ΖΛΔΚΣΡΟΛΟΓΩΝ ΜΖΥΑΝΗΚΩΝ ΚΑΗ ΣΔΥΝΟΛΟΓΗΑ ΤΠΟΛΟΓΗΣΩΝ ΣΟΜΔΑ ΤΣΖΜΑΣΩΝ ΖΛΔΚΣΡΗΚΖ ΔΝΔΡΓΔΗΑ

Statistical Inference I Locally most powerful tests

Προσομοίωση BP με το Bizagi Modeler

Use Cases: μια σύντομη εισαγωγή. Heavily based on UML & the UP by Arlow and Neustadt, Addison Wesley, 2002

Calculating the propagation delay of coaxial cable

Matrices and Determinants

TaxiCounter Android App. Περδίκης Ανδρέας ME10069

Εισαγωγή στα Πληροφοριακά Συστήματα. Ενότητα 11: Αρχιτεκτονική Cloud

Advanced Subsidiary Unit 1: Understanding and Written Response

Congruence Classes of Invertible Matrices of Order 3 over F 2

ΕΘΝΙΚΟ ΜΕΤΣΟΒΙΟ ΠΟΛΥΤΕΧΝΕΙΟ ΣΧΟΛΗ ΠΟΛΙΤΙΚΩΝ ΜΗΧΑΝΙΚΩΝ. «Θεσμικό Πλαίσιο Φωτοβολταïκών Συστημάτων- Βέλτιστη Απόδοση Μέσω Τρόπων Στήριξης»

ΓΡΑΜΜΙΚΟΣ & ΔΙΚΤΥΑΚΟΣ ΠΡΟΓΡΑΜΜΑΤΙΣΜΟΣ

ΑΡΙΣΤΟΤΕΛΕΙΟ ΠΑΝΕΠΙΣΤΗΜΙΟ ΘΕΣΣΑΛΟΝΙΚΗΣ ΤΜΗΜΑ ΟΔΟΝΤΙΑΤΡΙΚΗΣ ΕΡΓΑΣΤΗΡΙΟ ΟΔΟΝΤΙΚΗΣ ΚΑΙ ΑΝΩΤΕΡΑΣ ΠΡΟΣΘΕΤΙΚΗΣ

k A = [k, k]( )[a 1, a 2 ] = [ka 1,ka 2 ] 4For the division of two intervals of confidence in R +

ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΕΙΡΑΙΑ ΤΜΗΜΑ ΝΑΥΤΙΛΙΑΚΩΝ ΣΠΟΥΔΩΝ ΠΡΟΓΡΑΜΜΑ ΜΕΤΑΠΤΥΧΙΑΚΩΝ ΣΠΟΥΔΩΝ ΣΤΗΝ ΝΑΥΤΙΛΙΑ

Section 1: Listening and responding. Presenter: Niki Farfara MGTAV VCE Seminar 7 August 2016

Code Breaker. TEACHER s NOTES

Benchmarking Personal Cloud Storage

Partial Differential Equations in Biology The boundary element method. March 26, 2013

department listing department name αχχουντσ ϕανε βαλικτ δδσϕηασδδη σδηφγ ασκϕηλκ τεχηνιχαλ αλαν ϕουν διξ τεχηνιχαλ ϕοην µαριανι

PARTIAL NOTES for 6.1 Trigonometric Identities

Διάρκεια μιας Ομολογίας (Duration) Ανοσοποίηση (Immunization)

Transcript:

Προχωρημένα Θέματα σε Κατανεμημένα Συστήματα GFS / HDFS

Προτωρημένα Θέματα σε Κατανεμημένα Σσστήματα Google File System (GFS)

Overview Main functionalities of a distributed filesystem Naming Load Distribution Persistent Storage File operations Should provide: Consistency Reliability Availability Scalability Security Transparency In the Cloud Storage and management of Big Data! Data produced and consumed at highly diverse geographic locations Hardware failures are very frequent Files are of enormous sizes 3

GFS: Google File System Google s scalable, distributed filesystem for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware. It delivers high aggregate performance to a large number of clients 4

Design Assumptions GFS has been designed with the following assumptions in mind: High frequency of failures Files are huge, typically multi-gb Two types of reads: Large streaming reads Small random reads Once written, files are seldom modified Modifications are mostly done by appending Random writes are supported, but inefficient Targeted towards data analytics Typically huge files High sustained bandwidth more important than low latency 5

Filesystem Interface create/delete open/close read/write Snapshot Creates a copy of a file or directory almost instantaneously, masking current mutations that may be going on. Record Append Append some data (a record ) at the end of a file Many record appends may be taking place concurrently There is no guarantee about the offset where each record will be written The only guarantee is that each record will be written at a contiguous part of the file The actual offset where the record was finally placed is returned at the end of the operation 6

Architecture Three types of nodes: One master node Multiple chunkservers Multiple clients 7

Architecture Chunks Fixed size -- 64MB Smaller file system structures Lower network load Each chunk has a unique 64-bit ID Chunks divided in 1024 blocks of 64KB each Each block has a 32-bit checksum No caching below the filesystem interface Most operations read data once! 8

Master Node The master nodes maintains: The namespace Access control information Mapping: file chunk IDs Mapping: chunk ID chunk location Periodically the master sends heartbeats to each chunkserver to receive updates on its state. The master also is in charge of chunk placement and chunk replication keeps track of chunk leases 9

Consistency Model A file can be in one of the following states: Consistent Data is consistent across all replicas Defined Data is consistent across all replicas The latest mutation is complete (and available to clients) Inconsistent When a mutation has failed 10

Leases and Mutations Mutation = an update of a file Mutations (updates) are applied on all replicas in the same order One of the replicas of a chunk is given a lease, appointing it as the primary replica. The primary replica decides on the order of the mutations. The client copies data to all replicas (including the primary one) in a pipeline route, respecting the relative proximity (for efficiency). 11

Write request Data flow during a write Current lease holder? Operation completed or Error report 3a. data 3b. data identity of primary location of replicas (cached by client) Operation completed Primary assign s/n to mutations Applies it Forward write request 3c. data Operation completed 12

Locking Read lock Prevents a file/directory from being deleted, renamed, or snapshotted Write lock Prevents a file from being modified by someone else 13

Replica Placement The master node is responsible for replica placement Replicate data for Reliability / availability Scalability (maximize network bandwidth utilization in reads) Number of replicas By default 3 User may ask for more 14

Load balancing As we said, the master node is responsible for replica placement Allocates a new chunk on a chunkserver whose utilization is below average 15

Garbage collection Mechanism similar to a Recycle Bin. A deleted file is renamed to a hidden filename Periodically, hidden names are considered for purging (permanent deletion). Purged only after 3 days have passed since deletion. 16

Προτωρημένα Θέματα σε Κατανεμημένα Σσστήματα Hadoop Distributed File System (HDFS)

Hadoop HDFS The primary storage FS used by Hadoop applications Splits up files in blocks (similar to GFS chunks) Strong points: Block replication Locality of data access Started as an open source implementation of GFS. 21

HDFS advantages Very large scale storage 10,000 nodes 100,000,000 files 10PB storage space Based on inexpensive commodity hardware Replication used to address failures Fault detection Optimized for batch processing Data location is visible to the application Computation can be transferred where the data is Very high aggregate bandwidth Storage may rely on heterogeneous underlying operating systems 22

HDFS principles Namespace is universal for the whole cluster Takes care of consistency Write-once-read-many Append is the only supported write operation Files are split in blocks Fixed size 128 MB Each block replicated on multiple DataNodes Based on smart clients Clients know the location of blocks Clients access data directly from Data Nodes 23

HDFS architecture NameNode Cluster Membership Client Secondary NameNode Cluster Membership NameNode : Maps a file to a file-id and list of MapNodes DataNode : Maps a block-id to a physical location on disk SecondaryNameNode: Periodic merge of Transaction log DataNodes 24

NameNode - DataNode NameNode Metadata in RAM No paging! Metadata types List of files Mapping: file blocks Mapping: block DataNode File attributes Logging File creations, deletions, etc. DataNode Stores blocks Data + CRC stored on local FS (e.g., EXT3) Serves data to clients Block Reports Periodically sends a report on blocks state to the NameNode Helps with pipelining data For transferring data to other nodes 25

Writing / Reading in HDFS 26

Write Data Pipelining The Client receives a list of Data Nodes that will serve as replicas of a given block Client streams data to the first DataNode The first DataNode streams data to the next DataNode, etc. 27

Block Replication Policy Replication Policy A replica on the local node Another replica on another node of the same rack A third replica on a node of a different rack Extra replicas on random other nodes Clients read from the closest replica 28

Replication Datanodes 1 2 2 1 1 4 2 5 5 3 3 4 3 5 4 Rack 1 Rack 2 29

Checksums Use of CRC32 during file creation CRC32 per 512 bytes DataNodes store the checksums When reading data If the client detects a checksum error, it tries a different replica DataNode 30

NameNode failures A single point of failure Operation log is stored on different local filesystems In case of failure, a new NameNode may be started to continue where the failed NameNode left off. 31