Benchmarking Personal Cloud Storage

Σχετικά έγγραφα
Εισαγωγή στα Πληροφοριακά Συστήματα. Ενότητα 11: Αρχιτεκτονική Cloud

Connected Threat Defense

Εργαστήριο Ανάπτυξης Εφαρμογών Βάσεων Δεδομένων. Εξάμηνο 7 ο

Block Ciphers Modes. Ramki Thurimella

Στο εστιατόριο «ToDokimasesPrinToBgaleisStonKosmo?» έξω από τους δακτυλίους του Κρόνου, οι παραγγελίες γίνονται ηλεκτρονικά.

Other Test Constructions: Likelihood Ratio & Bayes Tests

Αλίκη Λέσση. CNS&P Presales Engineer

The challenges of non-stable predicates

5.4 The Poisson Distribution.

Δίκτυα Επικοινωνιών ΙΙ: OSPF Configuration

Instruction Execution Times

Bayesian statistics. DS GA 1002 Probability and Statistics for Data Science.

Connected Threat Defense

ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΕΙΡΑΙΩΣ ΤΜΗΜΑ ΠΛΗΡΟΦΟΡΙΚΗΣ ΠΜΣ «ΠΡΟΗΓΜΕΝΑ ΣΥΣΤΗΜΑΤΑ ΠΛΗΡΟΦΟΡΙΚΗΣ» ΚΑΤΕΥΘΥΝΣΗ «ΕΥΦΥΕΙΣ ΤΕΧΝΟΛΟΓΙΕΣ ΕΠΙΚΟΙΝΩΝΙΑΣ ΑΝΘΡΩΠΟΥ - ΥΠΟΛΟΓΙΣΤΗ»

ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΑΤΡΩΝ ΠΟΛΥΤΕΧΝΙΚΗ ΣΧΟΛΗ ΤΜΗΜΑ ΜΗΧΑΝΙΚΩΝ Η/Υ & ΠΛΗΡΟΦΟΡΙΚΗΣ. του Γεράσιμου Τουλιάτου ΑΜ: 697

TaxiCounter Android App. Περδίκης Ανδρέας ME10069

the total number of electrons passing through the lamp.

Statistical Inference I Locally most powerful tests

(C) 2010 Pearson Education, Inc. All rights reserved.

ΠΑΝΕΠΙΣΤΗΜΙΟ ΘΕΣΣΑΛΙΑΣ ΤΜΗΜΑ ΠΟΛΙΤΙΚΩΝ ΜΗΧΑΝΙΚΩΝ ΤΟΜΕΑΣ ΥΔΡΑΥΛΙΚΗΣ ΚΑΙ ΠΕΡΙΒΑΛΛΟΝΤΙΚΗΣ ΤΕΧΝΙΚΗΣ. Ειδική διάλεξη 2: Εισαγωγή στον κώδικα της εργασίας

Παλεπηζηήκην Πεηξαηώο Τκήκα Πιεξνθνξηθήο Πξόγξακκα Μεηαπηπρηαθώλ Σπνπδώλ «Πξνεγκέλα Σπζηήκαηα Πιεξνθνξηθήο»

Επίπεδο Μεταφοράς. (ανεβαίνουμε προς τα πάνω) Εργαστήριο Δικτύων Υπολογιστών Τμήμα Μηχανικών Η/Υ και Πληροφορικής

Εργαστήριο Ανάπτυξης Εφαρμογών Βάσεων Δεδομένων. Εξάμηνο 7 ο

UDZ Swirl diffuser. Product facts. Quick-selection. Swirl diffuser UDZ. Product code example:

Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8 questions or comments to Dan Fetter 1

CYTA Cloud Server Set Up Instructions

EPL 603 TOPICS IN SOFTWARE ENGINEERING. Lab 5: Component Adaptation Environment (COPE)

Business English. Ενότητα # 9: Financial Planning. Ευαγγελία Κουτσογιάννη Τμήμα Διοίκησης Επιχειρήσεων

Υλοποίηση Δικτυακών Υποδομών και Υπηρεσιών: OSPF Cost

CMOS Technology for Computer Architects

ΕΙΣΑΓΩΓΗ ΣΤΗ ΣΤΑΤΙΣΤΙΚΗ ΑΝΑΛΥΣΗ

Approximation of distance between locations on earth given by latitude and longitude

ω ω ω ω ω ω+2 ω ω+2 + ω ω ω ω+2 + ω ω+1 ω ω+2 2 ω ω ω ω ω ω ω ω+1 ω ω2 ω ω2 + ω ω ω2 + ω ω ω ω2 + ω ω+1 ω ω2 + ω ω+1 + ω ω ω ω2 + ω

HOMEWORK 4 = G. In order to plot the stress versus the stretch we define a normalized stretch:

Calculating the propagation delay of coaxial cable

Προσομοίωση BP με το Bizagi Modeler

Τ.Ε.Ι. ΔΥΤΙΚΗΣ ΜΑΚΕΔΟΝΙΑΣ ΠΑΡΑΡΤΗΜΑ ΚΑΣΤΟΡΙΑΣ ΤΜΗΜΑ ΔΗΜΟΣΙΩΝ ΣΧΕΣΕΩΝ & ΕΠΙΚΟΙΝΩΝΙΑΣ

Τεχνολογίες Παγκόσμιου Ιστού. 1η διάλεξη

Fractional Colorings and Zykov Products of graphs

DESIGN OF MACHINERY SOLUTION MANUAL h in h 4 0.

Homework 3 Solutions

Μηχανική Μάθηση Hypothesis Testing

ΓΕΩΜΕΣΡΙΚΗ ΣΕΚΜΗΡΙΩΗ ΣΟΤ ΙΕΡΟΤ ΝΑΟΤ ΣΟΤ ΣΙΜΙΟΤ ΣΑΤΡΟΤ ΣΟ ΠΕΛΕΝΔΡΙ ΣΗ ΚΤΠΡΟΤ ΜΕ ΕΦΑΡΜΟΓΗ ΑΤΣΟΜΑΣΟΠΟΙΗΜΕΝΟΤ ΤΣΗΜΑΣΟ ΨΗΦΙΑΚΗ ΦΩΣΟΓΡΑΜΜΕΣΡΙΑ

Test Data Management in Practice

PortSip Softphone. Ελληνικά Ι English 1/20

Bayesian modeling of inseparable space-time variation in disease risk

ίκτυο προστασίας για τα Ελληνικά αγροτικά και οικόσιτα ζώα on.net e-foundatio // itute: toring Insti SAVE-Monit

ΙΑΤΜΗΜΑΤΙΚΟ ΠΡΟΓΡΑΜΜΑ ΜΕΤΑΠΤΥΧΙΑΚΩΝ ΣΠΟΥ ΩΝ ΣΤΑ ΠΛΗΡΟΦΟΡΙΑΚΑ ΣΥΣΤΗΜΑΤΑ "VIDEO ΚΑΤΟΠΙΝ ΖΗΤΗΣΗΣ" ΑΝΝΑ ΜΟΣΧΑ Μ 11 / 99

ΤΕΧΝΟΛΟΓΙΚΟ ΕΚΠΑΙΔΕΥΤΙΚΟ ΙΔΡΥΜΑ ΚΡΗΤΗΣ. Σχολή Τεχνολογικών Εφαρμογών Τμήμα Εφαρμοσμένης Πληροφορικής & Πολυμέσων

The Simply Typed Lambda Calculus

Συντακτικές λειτουργίες

Web and HTTP. Βασικά Συστατικά: Web Server Web Browser HTTP Protocol

Special edition of the Technical Chamber of Greece on Video Conference Services on the Internet, 2000 NUTWBCAM

Bring Your Own Device (BYOD) Legal Challenges of the new Business Trend MINA ZOULOVITS LAWYER, PARNTER FILOTHEIDIS & PARTNERS LAW FIRM

Διπλωματική Εργασία του φοιτητή του Τμήματος Ηλεκτρολόγων Μηχανικών και Τεχνολογίας Υπολογιστών της Πολυτεχνικής Σχολής του Πανεπιστημίου Πατρών

ΣΤΥΛΙΑΝΟΥ ΣΟΦΙΑ

Επιμέλεια: Αδαμαντία Τραϊφόρου (Α.Μ 263) Επίβλεψη: Καθηγητής Μιχαήλ Κονιόρδος

TMA4115 Matematikk 3

department listing department name αχχουντσ ϕανε βαλικτ δδσϕηασδδη σδηφγ ασκϕηλκ τεχηνιχαλ αλαν ϕουν διξ τεχηνιχαλ ϕοην µαριανι

ΕΛΛΗΝΙΚΗ ΔΗΜΟΚΡΑΤΙΑ ΠΑΝΕΠΙΣΤΗΜΙΟ ΚΡΗΤΗΣ. Ψηφιακή Οικονομία. Διάλεξη 7η: Consumer Behavior Mαρίνα Μπιτσάκη Τμήμα Επιστήμης Υπολογιστών

Αναερόβια Φυσική Κατάσταση

Assalamu `alaikum wr. wb.

GREECE BULGARIA 6 th JOINT MONITORING

3.4 SUM AND DIFFERENCE FORMULAS. NOTE: cos(α+β) cos α + cos β cos(α-β) cos α -cos β

SPEEDO AQUABEAT. Specially Designed for Aquatic Athletes and Active People

ST5224: Advanced Statistical Theory II

«Χρήσεις γης, αξίες γης και κυκλοφοριακές ρυθμίσεις στο Δήμο Χαλκιδέων. Η μεταξύ τους σχέση και εξέλιξη.»

ιαδικτυακές Εφαρµογές

6.1. Dirac Equation. Hamiltonian. Dirac Eq.

Ηλεκτρονικός Ιατρικός Φάκελος: Νέες Τάσεις, Κατανεµηµένες Αρχιτεκτονικές και Κινητές

Γκάγκος ηµήτρης Μεταπτυχιακός Φοιτητής

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 6/5/2006

LESSON 14 (ΜΑΘΗΜΑ ΔΕΚΑΤΕΣΣΕΡΑ) REF : 202/057/34-ADV. 18 February 2014

Εγκατάσταση λογισμικού και αναβάθμιση συσκευής Device software installation and software upgrade

Abstract Storage Devices

Math 6 SL Probability Distributions Practice Test Mark Scheme

Exercises to Statistics of Material Fatigue No. 5

EE512: Error Control Coding

Ασφάλεια Πληροφοριακών και Επικοινωνιακών Συστημάτων

Current Status of PF SAXS beamlines. 07/23/2014 Nobutaka Shimizu

b. Use the parametrization from (a) to compute the area of S a as S a ds. Be sure to substitute for ds!

MUM ATHENS, GREECE 2015

Ανοιχτές Διαδικτυακές Υπηρεσίες και Υποδομές Cloud

«ΑΓΡΟΤΟΥΡΙΣΜΟΣ ΚΑΙ ΤΟΠΙΚΗ ΑΝΑΠΤΥΞΗ: Ο ΡΟΛΟΣ ΤΩΝ ΝΕΩΝ ΤΕΧΝΟΛΟΓΙΩΝ ΣΤΗΝ ΠΡΟΩΘΗΣΗ ΤΩΝ ΓΥΝΑΙΚΕΙΩΝ ΣΥΝΕΤΑΙΡΙΣΜΩΝ»

Advanced Subsidiary Unit 1: Understanding and Written Response

Κεφάλαιο 1ο Πολυπρογραμματισμός Πολυδιεργασία Κατηγορίες Λειτουργικών Συστημάτων

Πέτρος Γ. Οικονομίδης Πρόεδρος και Εκτελεστικός Διευθυντής

Partial Trace and Partial Transpose

Terabyte Technology Ltd

VBA ΣΤΟ WORD. 1. Συχνά, όταν ήθελα να δώσω ένα φυλλάδιο εργασίας με ασκήσεις στους μαθητές έκανα το εξής: Version ΗΜΙΤΕΛΗΣ!!!!

Policy Coherence. JEL Classification : J12, J13, J21 Key words :

ΠΟΛΥΤΕΧΝΕΙΟ ΚΡΗΤΗΣ ΤΜΗΜΑ ΜΗΧΑΝΙΚΩΝ ΠΕΡΙΒΑΛΛΟΝΤΟΣ

EPL324: Tutorials* on Communications and Networks Tutorial 2: Chapter 1 Review Questions

Physical DB Design. B-Trees Index files can become quite large for large main files Indices on index files are possible.

Study of urban housing development projects: The general planning of Alexandria City

Cloud Computing with Google and Microsoft. Despoina Trikomitou Andreas Diavastos Class: EPL425

PARTIAL NOTES for 6.1 Trigonometric Identities

ΣΔΥΝΟΛΟΓΗΚΟ ΔΚΠΑΗΓΔΤΣΗΚΟ ΗΓΡΤΜΑ ΗΟΝΗΧΝ ΝΖΧΝ «ΗΣΟΔΛΗΓΔ ΠΟΛΗΣΗΚΖ ΔΠΗΚΟΗΝΧΝΗΑ:ΜΔΛΔΣΖ ΚΑΣΑΚΔΤΖ ΔΡΓΑΛΔΗΟΤ ΑΞΗΟΛΟΓΖΖ» ΠΣΤΥΗΑΚΖ ΔΡΓΑΗΑ ΔΤΑΓΓΔΛΗΑ ΣΔΓΟΤ

ΣΧΕΔΙΑΣΜΟΣ ΔΙΚΤΥΩΝ ΔΙΑΝΟΜΗΣ. Η εργασία υποβάλλεται για τη μερική κάλυψη των απαιτήσεων με στόχο. την απόκτηση του διπλώματος

Τομέας: Ανανεώσιμων Ενεργειακών Πόρων Εργαστήριο: Σχεδιομελέτης και κατεργασιών

Transcript:

Benchmarking Personal Cloud Storage Idilio Drago Enrico Bocchi Marco Mellia Herman Slatman Aiko Pras IMC 2013 Barcelona

Motivation and goals 1 Personal Cloud Storage: very popular, large amount of traffic Dropbox: 100 million users, 1 billion uploads / day

Motivation and goals 1 Personal Cloud Storage: very popular, large amount of traffic Dropbox: 100 million users, 1 billion uploads / day Lots of different applications Do they follow different design? Which are their client capabilities? What is the impact on end-user performance?

Motivation and goals 1 Personal Cloud Storage: very popular, large amount of traffic Dropbox: 100 million users, 1 billion uploads / day Lots of different applications Do they follow different design? Which are their client capabilities? What is the impact on end-user performance?

Which infrastructure is being used? 2

Which infrastructure is being used? 2 Manual preliminary evaluation of each application Mostly HTTPS Control and Storage servers

Which infrastructure is being used? 2 Manual preliminary evaluation of each application Mostly HTTPS Control and Storage servers Where are their data centers Collect the list of server names of each application Resolve the names using 2,000 open DNS resolvers Collect the list of server IP addresses

Which infrastructure is being used? 2 Manual preliminary evaluation of each application Mostly HTTPS Control and Storage servers Where are their data centers Collect the list of server names of each application Resolve the names using 2,000 open DNS resolvers Collect the list of server IP addresses Geolocation IP addresses Use common techniques Example: Shortest RTT from PlanetLab nodes

Where are data centers located? 3

Where are data centers located? 3 Dropbox Wuala Cloud Drive SkyDrive Few Control and Storage locations Centralized services

Where are data centers located? 3 Google Drive Users reach the closest Google s Edge Point of Presence 1 Reduced RTT, offload public Internet 1 https://peering.google.com/about/delivery ecosystem.html

Protocol design: What happens when the app is idle? 4

Protocol design: What happens when the app is idle? 4 Cumulative traffic (kb) 250 200 150 100 50 0 Dropbox SkyDrive Wuala Google Drive 0 2 4 6 8 10 12 14 16 Time (min) Generally silent protocols e.g., SkyDrive: 1 min polling interval (32 b/s)

Protocol design: What happens when the app is idle? 4 Cumulative traffic (kb) 800 600 400 200 0 Dropbox SkyDrive Wuala Google Drive Cloud Drive 0 2 4 6 8 10 12 14 16 Time (min) Cloud Drive: polling every 15 s over a new HTTPS session 6 kb/s per user 65 MB per day per user 1 million users 6 Gb/s of signaling traffic

Methodology Creating benchmarks 5 Ad-hoc crafted workloads to detect specific features Check implications on end-user performance

Methodology Creating benchmarks 5 0: Parameters Test computer FTP server 1: Create files Testing application Provider App-under-test 2: Upload files 3: Statistics Application-under-test service clients running on a virtual machine Testing application python scripts controlling the experiments

Methodology Creating benchmarks 5 0: Parameters Test computer FTP server 1: Create files Testing application Provider App-under-test 2: Upload files 3: Statistics Configuration parameters e.g., number of repetitions Workload definition e.g., number of files, type of content etc.

Methodology Creating benchmarks 5 0: Parameters Test computer FTP server 1: Create files Testing application Provider App-under-test 2: Upload files 3: Statistics Workload manipulated via FTP Files created at run-time e.g., 1 MB file with random text

Methodology Creating benchmarks 5 0: Parameters Test computer FTP server 1: Create files Testing application Provider App-under-test 2: Upload files 3: Statistics Application-under-test synchronizes files Traffic is intercepted (packet level)

Methodology Creating benchmarks 5 0: Parameters Test computer FTP server 1: Create files Testing application Provider App-under-test 2: Upload files 3: Statistics Post-process to calculate performance metrics Compute upload time, overhead, etc.

What are the client capabilities? 6

What are the client capabilities? 6 Dropbox SkyDrive Wuala Google Drive Cloud Drive Chunking 4 MB variable variable 8 MB Bundling Deduplication Delta encoding Compression always never never smart never Clients differ considerably

What are the client capabilities? 6 Dropbox SkyDrive Wuala Google Drive Cloud Drive Chunking 4 MB variable variable 8 MB Bundling Deduplication Delta encoding Compression always never never smart never Uploading a large file Upload it as a single content?... or chop it into smaller chunks?

What are the client capabilities? 6 Dropbox SkyDrive Wuala Google Drive Cloud Drive Chunking 4 MB variable variable 8 MB Bundling Deduplication Delta encoding Compression always never never smart never Uploading lots of files Create a bundle?... or one/many TCP connections/transactions? Google Drive and Cloud Drive open one/three TCP connections per file

What are the client capabilities? 6 Dropbox SkyDrive Wuala Google Drive Cloud Drive Chunking 4 MB variable variable 8 MB Bundling Deduplication Delta encoding Compression always never never smart never Creating a second copy of the same file Re-upload it completely?... or just update metadata? Deduplication in Wuala is compatible with per-user encryption

What are the client capabilities? 6 Dropbox SkyDrive Wuala Google Drive Cloud Drive Chunking 4 MB variable variable 8 MB Bundling Deduplication Delta encoding Compression always never never smart never Modifying only a fraction of a file Re-upload everything?... or just the modified portion? This has some implications with chunking

What are the client capabilities? 6 Dropbox SkyDrive Wuala Google Drive Cloud Drive Chunking 4 MB variable variable 8 MB Bundling Deduplication Delta encoding Compression always never never smart never Uploading compressible content? Transmit it as quickly as possible?... or compress content?

Example: Compression 7 Upload (MB) 3 2 1 Dropbox SkyDrive Wuala Text files Cloud Drive Google Drive Upload (MB) 3 2 1 Dropbox SkyDrive Wuala Fake JPEGs Cloud Drive Google Drive 0 0.1 0.5 1 1.5 2 File size (MB) 0 0.1 0.5 1 1.5 2 File size (MB)

Example: Compression 7 Upload (MB) 3 2 1 Dropbox SkyDrive Wuala Text files Cloud Drive Google Drive Upload (MB) 3 2 1 Dropbox SkyDrive Wuala Fake JPEGs Cloud Drive Google Drive 0 0.1 0.5 1 1.5 2 File size (MB) 0 0.1 0.5 1 1.5 2 File size (MB)

Example: Compression 7 Upload (MB) 3 2 1 Dropbox SkyDrive Wuala Text files Cloud Drive Google Drive Upload (MB) 3 2 1 Dropbox SkyDrive Wuala Fake JPEGs Cloud Drive Google Drive 0 0.1 0.5 1 1.5 2 File size (MB) 0 0.1 0.5 1 1.5 2 File size (MB) Only Dropbox and Google Drive compress files Dropbox has higher control overhead

Example: Compression 7 Upload (MB) 3 2 1 Dropbox SkyDrive Wuala Text files Cloud Drive Google Drive Upload (MB) 3 2 1 Dropbox SkyDrive Wuala Fake JPEGs Cloud Drive Google Drive 0 0.1 0.5 1 1.5 2 File size (MB) 0 0.1 0.5 1 1.5 2 File size (MB) Only Dropbox and Google Drive compress files Dropbox has higher control overhead Google Drive identifies JPEG content and skip compressing it

Implications to end-user performance? 8

Implications to end-user performance? 8 100 10 Dropbox SkyDrive Wuala Cloud Drive Google Drive Time (s) 1 0.1 0.01 1x100kB 1x1MB 10x100kB 100x10kB Benchmark set Consider different workloads Upload time? One versus many files Small versus large files Notice: test run from Europe

Implications to end-user performance? 8 100 10 Dropbox SkyDrive Wuala Cloud Drive Google Drive Time (s) 1 0.1 0.01 1x100kB 1x1MB 10x100kB 100x10kB Benchmark set Network latency dominates the upload because of TCP SkyDrive (160 ms RTT) 4sto send a 1 MB file Google Drive (15 ms RTT) 300 ms to send a 1 MB file

Implications to end-user performance? 8 100 10 Dropbox SkyDrive Wuala Cloud Drive Google Drive Time (s) 1 0.1 0.01 1x100kB 1x1MB 10x100kB 100x10kB Benchmark set Client capabilities boost performance in this scenarios Dropbox (90 ms RTT) 10 s to send 100 files of 10 kb each Google Drive (15 ms RTT) 42 s to send 100 files of 10 kb each

Overhead = Total Traffic / Content Size? 9 10 Dropbox SkyDrive Wuala Cloud Drive Google Drive Fraction 1 0.1 1x100kB 1x1MB 10x100kB 100x10kB Benchmark set Cloud Drive 3 HTTPS connections per file Dropbox signaling cost of client capabilities

Overhead = Total Traffic / Content Size? 9 10 Dropbox SkyDrive Wuala Cloud Drive Google Drive Fraction 1 0.1 1x100kB 1x1MB 10x100kB 100x10kB Benchmark set 100 small files Bundling files pays back in network overhead

Conclusions 10 Designed specific benchmarks for Personal Cloud Storage

Conclusions 10 Designed specific benchmarks for Personal Cloud Storage Highlighted design choices... and their implications on performance

Conclusions 10 Designed specific benchmarks for Personal Cloud Storage Highlighted design choices... and their implications on performance Data center placement Centralized vs. distributed topologies

Conclusions 10 Designed specific benchmarks for Personal Cloud Storage Highlighted design choices... and their implications on performance Data center placement Centralized vs. distributed topologies Client capabilities Performance gains from bundling, deduplication etc.

Conclusions 10 Designed specific benchmarks for Personal Cloud Storage Highlighted design choices... and their implications on performance Data center placement Centralized vs. distributed topologies Client capabilities Performance gains from bundling, deduplication etc. Protocol design

Questions? 11 Thanks for your attention Data and scripts can be downloaded from: http://www.simpleweb.org/wiki/cloud benchmarks