PLE-074 Computer Architecture 2

PLE-074 Computer Architecture 2
Lecture 2: Dependability, performance, MIPS machine instructions
Aris Efthymiou
Slide sources: companion slides of the English edition of the textbook

Dependability
- Historically, ICs have been very reliable: a disk is much more likely to fail than a processor
- At feature sizes of 65 nm and below we see rising numbers of
  - transient faults
  - permanent faults
- Most faults are limited to a single component, at different levels of abstraction
- Utter failure of a module -> component error in a higher-level module

What is failure?
- Hard to decide
- Service Level Agreements (SLA) or Service Level Objectives (SLO): guarantees of quality of service

Dependability
- Systems alternate between two states of service with respect to an SLA/SLO:
  1. Service accomplishment, where service is delivered as specified by the SLA
  2. Service interruption, where the delivered service is different from the SLA
- Failure (F) = transition from state 1 to state 2
- Restoration (R) = transition from state 2 to state 1

Measures
- Reliability measures:
  - Mean time to failure (MTTF)
  - Failures in time (FIT) = 1/MTTF, expressed as failures per billion hours
- Service interruption: mean time to repair (MTTR)
- Commonly quoted reliability measure: mean time between failures (MTBF) = MTTF + MTTR
- Availability = MTTF / MTBF
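A minimal Python sketch of these measures. The function names and the example MTTF and MTTR values are illustrative assumptions, not figures from the slides.

```python
HOURS_PER_BILLION = 1e9  # FIT is quoted per billion hours of operation

def mtbf(mttf_hours: float, mttr_hours: float) -> float:
    """Mean time between failures = MTTF + MTTR."""
    return mttf_hours + mttr_hours

def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the service is delivered: MTTF / MTBF."""
    return mttf_hours / mtbf(mttf_hours, mttr_hours)

def fit(mttf_hours: float) -> float:
    """Failures in time: the failure rate 1/MTTF scaled to a billion hours."""
    return HOURS_PER_BILLION / mttf_hours

# Hypothetical module: 1,000,000-hour MTTF, 24-hour repair time (assumed values).
example_mttf = 1_000_000.0
example_mttr = 24.0
print("MTBF (hours):", mtbf(example_mttf, example_mttr))
print("Availability:", availability(example_mttf, example_mttr))
print("FIT:", fit(example_mttf))
```

Availability approaches 1 as MTTR shrinks relative to MTTF, which is why fast repair matters as much as rare failure.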

Example
- In a disk subsystem:
  - 10 disks, each rated at 1,000,000-hour MTTF
  - 1 ATA controller, 500,000-hour MTTF
  - 1 power supply, 200,000-hour MTTF
  - 1 fan, 200,000-hour MTTF
  - 1 ATA cable, 1,000,000-hour MTTF
- Assuming the lifetimes are exponentially distributed and that failures are independent, compute the MTTF of the system as a whole

- Because the overall failure rate of the collection is the sum of the failure rates of the modules, the failure rate of the system is:
  System failure rate = 10 * (1/1,000,000) + 1/500,000 + 1/200,000 + 1/200,000 + 1/1,000,000
                      = 23/1,000,000 per hour, or 23,000 FIT
- Because MTTF is the inverse of the failure rate:
  MTTF_system = 1 / (23/1,000,000) ≈ 43,500 hours, just under 5 years
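The same calculation can be checked with a short Python sketch. The component list comes from the example; the variable names are illustrative.

```python
# (name, count, MTTF in hours) for each module of the disk subsystem example
components = [
    ("disk", 10, 1_000_000),
    ("ATA controller", 1, 500_000),
    ("power supply", 1, 200_000),
    ("fan", 1, 200_000),
    ("ATA cable", 1, 1_000_000),
]

# With independent, exponentially distributed lifetimes the failure rates add.
failure_rate_per_hour = sum(count / mttf for _, count, mttf in components)

system_mttf_hours = 1.0 / failure_rate_per_hour
system_fit = failure_rate_per_hour * 1e9  # failures per billion hours

print(f"Failure rate: {failure_rate_per_hour:.2e} per hour")
print(f"FIT: {system_fit:.0f}")
print(f"System MTTF: {system_mttf_hours:.0f} hours "
      f"({system_mttf_hours / (24 * 365):.2f} years)")
```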

Improving dependability
- Add redundancy to the system
  - in time: repeat the operation
  - in resources: have spare components ready

Example disk unit (2)
- Dual power supply unit, each supply with MTTF = 200,000 hours
- The MTTF of the dual power supply is the mean time until one power supply fails, divided by the chance that the other will fail before the first one is replaced
- Assuming an exponential distribution, the mean time until one of the two fails = MTTF_pwr / 2
- Estimate of that probability = repair time / mean time to failure of the other unit = MTTR_pwr / MTTF_pwr

$$\mathrm{MTTF}_{\text{dual pwr}} = \frac{\mathrm{MTTF}_{pwr}/2}{\mathrm{MTTR}_{pwr}/\mathrm{MTTF}_{pwr}} = \frac{\mathrm{MTTF}_{pwr}^{2}}{2\,\mathrm{MTTR}_{pwr}}$$
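A small Python sketch of this estimate. The slide gives no repair time, so the 24-hour MTTR below is an assumed value chosen only to make the formula concrete; the function name is also illustrative.

```python
def redundant_pair_mttf(mttf_hours: float, mttr_hours: float) -> float:
    """Estimate the MTTF of two redundant units, following the slide's reasoning."""
    # Two units running together: the first failure arrives twice as fast.
    mean_time_to_first_failure = mttf_hours / 2
    # Chance that the surviving unit also fails while the first is being repaired.
    prob_second_fails_during_repair = mttr_hours / mttf_hours
    return mean_time_to_first_failure / prob_second_fails_during_repair

MTTF_PWR = 200_000.0   # from the slide
ASSUMED_MTTR = 24.0    # assumption: not stated on the slide

pair_mttf = redundant_pair_mttf(MTTF_PWR, ASSUMED_MTTR)
print(f"Dual power supply MTTF: {pair_mttf:,.0f} hours")
print(f"Improvement over a single supply: {pair_mttf / MTTF_PWR:,.0f}x")
```

Under these assumptions the pair is thousands of times more reliable than a single supply, because the second failure has to land inside a short repair window.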

Measuring Performance
- Typical performance metrics:
  - Execution/response time
  - Throughput
- Speedup of X relative to Y = Execution time_Y / Execution time_X
- Performance = 1 / Execution time
- Speedup of X relative to Y = Performance_X / Performance_Y

Execution time
- Wall clock time: includes all system overheads
  - also called response time or elapsed time
- CPU time: only computation time
  - excludes I/O and time spent running other processes

Benchmarks
- Benchmarks
  - Kernels (e.g. matrix multiply)
  - Toy programs (e.g. sorting)
  - Synthetic benchmarks (e.g. Dhrystone)
  - Benchmark suites (e.g. SPEC06fp, TPC-C)
- Suites
  - more chance of covering more users' needs
  - help understand variability in performance
- SPEC CPU benchmarks
  - SPEC2006: CINT integer benchmarks, CFP floating point
  - Real programs, modified for portability and minimized I/O
- Server benchmarks
  - TPC, SPECWeb, ...

Summarizing results
- Need to report a single number
- Arithmetic mean?
  - some programs take much longer than others
- Weighted mean?
  - Hard to pick weights: SPEC is a consortium of competing companies
- Normalize execution times to a reference computer
  - i.e. Execution time_reference / Execution time
  - SPEC calls this the SPECratio

Comparing normalised times

$$\frac{SPECRatio_A}{SPECRatio_B} = \frac{ExecutionTime_{ref} / ExecutionTime_A}{ExecutionTime_{ref} / ExecutionTime_B} = \frac{ExecutionTime_B}{ExecutionTime_A} = \frac{Performance_A}{Performance_B}$$

- Choice of reference computer is irrelevant
- For ratios, the geometric rather than the arithmetic mean is used:

$$Gmean = \sqrt[n]{\prod_{i=1}^{n} ratio_i}$$

Example

                 Computer A   Computer B   Computer C
  Program 1           1           10           20
  Program 2        1000          100           20
  Arithm. mean      500.5         55           20
  Geom. mean         31.622       31.622       20

Which system is better?
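The two summary rows of this table can be reproduced with a short Python sketch. The helper names are illustrative, and `math.prod` assumes Python 3.8 or later.

```python
import math

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    # n-th root of the product of the n values
    return math.prod(values) ** (1.0 / len(values))

# Rows of the example table: [Program 1, Program 2] for each computer
table = {
    "Computer A": [1, 1000],
    "Computer B": [10, 100],
    "Computer C": [20, 20],
}

for name, values in table.items():
    print(f"{name}: arithmetic mean = {arithmetic_mean(values):.3f}, "
          f"geometric mean = {geometric_mean(values):.3f}")
```

The arithmetic mean is dominated by the long-running Program 2, while the geometric mean treats a 10x change in either program the same way, which is why A and B tie on it.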

Normalization and averaging

Relative to A:
                 Computer A   Computer B   Computer C
  Program 1           1          10           20
  Program 2           1           0.1          0.02
  Arithm. mean        1           5.05        10.01
  Geom. mean          1           1            0.632

Relative to B:
                 Computer A   Computer B   Computer C
  Program 1           0.1         1            2
  Program 2          10           1            0.2
  Arithm. mean        5.05        1            1.1
  Geom. mean          1           1            0.632
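A Python sketch of the normalization step, assuming the raw values from the earlier example table (1 and 1000 for A, 10 and 100 for B, 20 and 20 for C). It illustrates that the arithmetic mean of normalized values changes its ranking with the reference machine, while the geometric mean does not. All names are illustrative.

```python
import math

raw = {"A": [1, 1000], "B": [10, 100], "C": [20, 20]}

def normalize(values_by_machine, reference):
    """Divide every machine's values by the reference machine's values."""
    ref = values_by_machine[reference]
    return {m: [v / r for v, r in zip(vals, ref)]
            for m, vals in values_by_machine.items()}

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    return math.prod(values) ** (1.0 / len(values))

amean_by_ref = {}
gmean_by_ref = {}
for reference in ("A", "B"):
    norm = normalize(raw, reference)
    amean_by_ref[reference] = {m: arithmetic_mean(v) for m, v in norm.items()}
    gmean_by_ref[reference] = {m: geometric_mean(v) for m, v in norm.items()}
    print(f"relative to {reference}:")
    for m in raw:
        print(f"  {m}: arithmetic = {amean_by_ref[reference][m]:.3f}, "
              f"geometric = {gmean_by_ref[reference][m]:.3f}")
```

Relative to A the arithmetic mean puts A first, relative to B it puts B first; the geometric means are identical under both references.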

Principles of Computer Design
- Take advantage of parallelism
  - e.g. multiple processors, disks, memory banks, pipelining, multiple functional units
- Principle of locality
  - Reuse of data and instructions: temporal and spatial
- Focus on the common case

Amdahl's Law
- Fraction_enhanced: the fraction of time (on the original machine) that can be converted to take advantage of the enhancement
- Speedup_enhanced: how much faster the task would run if the enhancement were applied to the entire program

Dirty laundry -> washing machine -> drying machine -> clean laundry
- 30 minutes washing, 90 minutes drying
- Total execution time: 30 + 90 = 120 minutes
- Washing portion: 30/120 = ¼
- Drying portion: 90/120 = ¾

Dirty laundry -> washing machine -> drying machine (2x faster) -> clean laundry
- 30 minutes washing, 90/2 = 45 minutes drying
- Speedup = (30 + 90) / (30 + 45) = 1.6
- Fraction_enhanced = 90/120 = ¾
- Speedup_enhanced = 2
- Speedup = 1 / (0.25 + 0.75/2) = 1/0.625 = 1.6

Law of diminishing returns
- The incremental improvement in speedup gained by improving just a portion of the computation diminishes as improvements are added
- If an enhancement is usable only for a fraction of a task, we can't speed up the task by more than the reciprocal of 1 minus that fraction

If we could have super-fast drying machines
Dirty laundry -> washing machine -> drying machine -> clean laundry
- 30 minutes washing, 90/∞ ≈ 0 minutes drying
- Speedup = (30 + 90) / (30 + 0) = 4

Example
- If the new processor is 10 times faster than the original processor, and we assume that the original processor is busy with computation 40% of the time and is waiting for I/O 60% of the time, what is the overall speedup gained by incorporating the enhancement?
  - Fraction_enhanced = 0.4, Speedup_enhanced = 10
  - Speedup_overall = 1 / (0.6 + 0.4/10) = 1.56
- What is the upper bound of the overall speedup?
  - Upper bound = 1/0.6 = 1.67
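A minimal Python sketch of Amdahl's law as used in the laundry and processor examples: overall speedup = 1 / ((1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced). The function names are illustrative.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only fraction_enhanced of the time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

def speedup_upper_bound(fraction_enhanced: float) -> float:
    """Limit of the overall speedup as speedup_enhanced grows without bound."""
    return 1.0 / (1.0 - fraction_enhanced)

laundry = amdahl_speedup(0.75, 2)        # dryer twice as fast
processor = amdahl_speedup(0.4, 10)      # CPU 10x faster, busy 40% of the time
bound = speedup_upper_bound(0.4)

print(f"Laundry speedup: {laundry:.2f}")
print(f"Processor speedup: {processor:.2f}, upper bound: {bound:.2f}")

# Diminishing returns: ever larger enhancements approach, but never reach, the bound.
for s in (2, 10, 100, 1000):
    print(f"  speedup_enhanced = {s:>4}: overall = {amdahl_speedup(0.4, s):.4f}")
```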

Principles of Computer Design
- The processor performance equation:

$$\text{CPU time} = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time}$$

Instruction count, CPI
- Number of (dynamic) instructions of the program
  - Determined by the program, the ISA and the compiler
- Average cycles per instruction (CPI)
  - Determined by the hardware
  - If the CPI differs from instruction to instruction, the average CPI is affected by the program's instruction mix

CPI example
- Computer A: cycle time = 250 ps, CPI = 2.0
- Computer B: cycle time = 500 ps, CPI = 1.2
- Same ISA
- Which one is faster, and by how much?
  - A is faster: 250 ps × 2.0 = 500 ps per instruction against 500 ps × 1.2 = 600 ps for B, a factor of 1.2
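Because both machines run the same ISA, the instruction count cancels and only CPI × cycle time needs comparing. A short Python sketch of that comparison, with illustrative names:

```python
def time_per_instruction_ps(cycle_time_ps: float, cpi: float) -> float:
    """Average time per instruction = CPI x clock cycle time."""
    return cpi * cycle_time_ps

time_a = time_per_instruction_ps(cycle_time_ps=250, cpi=2.0)
time_b = time_per_instruction_ps(cycle_time_ps=500, cpi=1.2)

# Same ISA, same program: the instruction count is identical on both machines,
# so the ratio of CPU times equals the ratio of per-instruction times.
speedup_a_over_b = time_b / time_a

print(f"A: {time_a:.0f} ps per instruction")
print(f"B: {time_b:.0f} ps per instruction")
print(f"A is {speedup_a_over_b:.2f}x faster than B")
```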

Example
- Computer A: 2 GHz clock, 10 s CPU time
- We are designing Computer B
  - We aim for 6 s CPU time
  - We can increase the clock rate, but this causes a 1.2× increase in clock cycles
- What must the clock rate of Computer B be?
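One way to work this out, sketched in Python: recover A's cycle count from CPU time × clock rate, scale it by 1.2 for B, and divide by B's target time. Variable names are illustrative.

```python
clock_rate_a_hz = 2e9      # Computer A: 2 GHz
cpu_time_a_s = 10.0        # Computer A: 10 s
target_time_b_s = 6.0      # Computer B: desired CPU time
cycle_increase = 1.2       # B needs 1.2x as many clock cycles as A

# CPU time = clock cycles / clock rate, so cycles = CPU time x clock rate.
cycles_a = cpu_time_a_s * clock_rate_a_hz
cycles_b = cycle_increase * cycles_a

# Clock rate B must reach to finish cycles_b within the target time.
clock_rate_b_hz = cycles_b / target_time_b_s

print(f"Cycles on A: {cycles_a:.3e}")
print(f"Cycles on B: {cycles_b:.3e}")
print(f"Required clock rate for B: {clock_rate_b_hz / 1e9:.1f} GHz")
```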

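CPI details
- If different instruction classes take different numbers of cycles to execute:

$$\text{Clock cycles} = \sum_{i=1}^{n} \left( CPI_i \times IC_i \right)$$

- The average CPI must use the relative frequency with which instructions occur:

$$CPI = \frac{\text{Clock cycles}}{IC} = \sum_{i=1}^{n} \left( CPI_i \times \frac{IC_i}{IC} \right)$$

  where $IC_i / IC$ is the relative frequency of class $i$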

CPI example
- Two alternative programs using instructions of classes A, B, C

  Class              A   B   C
  CPI of class       1   2   3
  IC, program 1      2   1   2
  IC, program 2      4   1   1

- Program 1: IC = 5
  - Cycles = 2×1 + 1×2 + 2×3 = 10
  - Average CPI = 10/5 = 2.0
- Program 2: IC = 6
  - Cycles = 4×1 + 1×2 + 1×3 = 9
  - Average CPI = 9/6 = 1.5
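The table can be evaluated with a few lines of Python. This is a sketch with illustrative names; the numbers are those of the example.

```python
cpi_per_class = {"A": 1, "B": 2, "C": 3}

instruction_counts = {
    "program 1": {"A": 2, "B": 1, "C": 2},
    "program 2": {"A": 4, "B": 1, "C": 1},
}

def average_cpi(counts):
    """Return (total cycles, total instructions, average CPI) for one program."""
    cycles = sum(cpi_per_class[cls] * n for cls, n in counts.items())
    total_instructions = sum(counts.values())
    return cycles, total_instructions, cycles / total_instructions

for name, counts in instruction_counts.items():
    cycles, ic, cpi = average_cpi(counts)
    print(f"{name}: IC = {ic}, cycles = {cycles}, average CPI = {cpi:.2f}")
```

Program 2 executes more instructions but fewer cycles, so instruction count alone is not a reliable performance indicator.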

Where does your own program spend its time?
- Profiling a program (with gprof) shows where the time is spent, per function
  - so one can see which part is the slowest and improve it
- We usually see a 90/10 or 80/20 split: 10% of the code is responsible for 90% of the time
- Compile and link with the appropriate flags
  - gcc -pg progr.c
- Run the program as usual (it is slightly slower)
  - this creates a file gmon.out
- Run gprof to see the results

Compiler optimization
- gcc, like other compilers, has various optimization options
  - Usually -O[1-3]. The higher the number, the harder the compiler tries to optimize the code
- After debugging, it is a good idea to do a final compile with one of the -O options

Instruction count: how to obtain it
- Modern processors have performance counters
  - they count various events (e.g. cache misses, ...)
  - Using them usually requires a special version of the OS kernel and a special library
    - search for perfmon2, Intel Performance Counter Monitor, ...
  - They cannot measure everything an architect might want
- Alternatively, run the program on a fast simulator or a dynamic binary instrumentation tool
  - PIN: http://www.pintool.org

Example: instruction count
- We will use the gzip benchmark (SPEC Int 2000) and pin:

    ~efthym/pin/pin*/pin \
      -t ~efthym/pin/pin*/source/tools/ManualExamples/obj-ia32/inscount2.so \
      -- ~efthym/icarus/spec/gzip \
      ~efthym/icarus/spec/input.source 60

- Results are written to the file inscount.out
  - about 70.5 billion IA-32 instructions

Computing CPI (roughly)
- Without pin, the run takes 19.95 s on hp6000ws12:

    /usr/bin/time --verbose ~efthym/.../gzip \
      ~efthym/.../input.source 60

- Clock cycle: 376 ps (2.66 GHz)
  - from: more /proc/cpuinfo
- 19.95 × 10^12 / 376 ≈ 53 billion cycles
- CPI = cycles / instruction count
  - = 0.7526 in this example
- The processor executes more than one instruction (1.33) per cycle
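The rough CPI estimate can be reproduced in Python. This sketch assumes the instruction count is 70.5 × 10^9, which is the magnitude the quoted CPI of 0.7526 implies; the run time and cycle time are the measured values from the slide, and the variable names are illustrative.

```python
execution_time_s = 19.95        # wall-clock time reported by /usr/bin/time
clock_cycle_ps = 376.0          # 2.66 GHz clock
instruction_count = 70.5e9      # from inscount.out (assumed to be in billions)

# Convert the run time to picoseconds, then count how many cycles fit in it.
cycles = execution_time_s * 1e12 / clock_cycle_ps

cpi = cycles / instruction_count
ipc = 1.0 / cpi                 # instructions per cycle

print(f"Cycles: {cycles:.3e}")
print(f"CPI: {cpi:.4f}")
print(f"IPC: {ipc:.2f}")
```

The estimate is rough because wall-clock time also includes I/O and operating system overhead, not just CPU cycles spent in gzip.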

MIPS Instructions
- All instructions are exactly 32 bits wide
- Different formats for different purposes
- Similarities in formats ease implementation

  R-format:  op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
  I-format:  op (6 bits) | rs (5 bits) | rt (5 bits) | offset (16 bits)
  J-format:  op (6 bits) | address (26 bits)

MIPS registers (figure)

MIPS Instruction Types
- Arithmetic & logical: manipulate data in registers
    add $s1, $s2, $s3     # $s1 = $s2 + $s3
    or  $s3, $s4, $s5     # $s3 = $s4 OR $s5
- Data transfer: move register data to/from memory (load & store)
    lw  $s1, 100($s2)     # $s1 = Memory[$s2 + 100]
    sw  $s1, 100($s2)     # Memory[$s2 + 100] = $s1
- Branch: alter program flow
    beq $s1, $s2, 25      # if ($s1 == $s2) PC = PC + 4 + 4*25
                          # else PC = PC + 4

MIPS Arithmetic & Logical Instructions
- Instruction usage (assembly)
    add dest, src1, src2  # dest = src1 + src2
    sub dest, src1, src2  # dest = src1 - src2
    and dest, src1, src2  # dest = src1 AND src2
- Instruction characteristics
  - Always 3 operands: destination + 2 sources
  - Operand order is fixed
  - Operands are always general-purpose registers
- Design principles:
  - Design Principle 1: Simplicity favors regularity
  - Design Principle 2: Smaller is faster

Instruction Binary Representation

  bit 31 | op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits) | bit 0

- Used for arithmetic, logical and shift instructions
  - op: basic operation of the instruction (opcode)
  - rs: first register source operand
  - rt: second register source operand
  - rd: register destination operand
  - shamt: shift amount (more about this later)
  - funct: function, the specific type of operation
- Also called R-Format or R-Type instructions

Example
- Machine language for add $8, $17, $18
- See the reference card for the op and funct values

  bit 31 | op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits) | bit 0
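A Python sketch of how the fields of this example pack into a 32-bit word. It assumes the standard MIPS values for add (op = 0, funct = 0x20), which the slide leaves to the reference card; the function name is illustrative.

```python
def encode_r_format(op: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the six R-format fields into one 32-bit instruction word."""
    fields = ((op, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        # Shift what we have left and append the next field on the right.
        word = (word << width) | value
    return word

# add $8, $17, $18: destination rd = 8, sources rs = 17 and rt = 18.
# op = 0 and funct = 0x20 are the standard MIPS values for add (assumed here).
add_word = encode_r_format(op=0, rs=17, rt=18, rd=8, shamt=0, funct=0x20)

print(f"binary: {add_word:032b}")
print(f"hex:    {add_word:#010x}")
```

Note that the assembly operand order (rd, rs, rt) differs from the field order in the encoded word (rs, rt, rd).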

Immediate Operands
- Constant data specified in an instruction
    addi $s3, $s3, 4
- No subtract immediate instruction
  - Just use a negative constant
    addi $s2, $s1, -1
- Design Principle 3: Make the common case fast
  - Small constants are common
  - An immediate operand avoids a load instruction
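A Python sketch of how a negative immediate ends up in the 16-bit field of an I-format instruction, using addi $s2, $s1, -1 from the slide. It assumes the standard MIPS encoding (addi opcode 8, $s1 = register 17, $s2 = register 18); the function name is illustrative.

```python
def encode_i_format(op: int, rs: int, rt: int, immediate: int) -> int:
    """Pack an I-format instruction; the immediate is a signed 16-bit value."""
    if not -(1 << 15) <= immediate < (1 << 15):
        raise ValueError("immediate does not fit in 16 signed bits")
    # Masking with 0xFFFF yields the 16-bit two's complement bit pattern.
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

ADDI_OPCODE = 8    # standard MIPS opcode for addi (assumed here)
S1, S2 = 17, 18    # register numbers of $s1 and $s2

# addi $s2, $s1, -1: source rs = $s1, destination rt = $s2
addi_word = encode_i_format(ADDI_OPCODE, rs=S1, rt=S2, immediate=-1)
print(f"hex: {addi_word:#010x}")
```

The constant -1 becomes 0xFFFF in the low half of the word, which is why no separate subtract-immediate instruction is needed.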

Conditional Operations
- Branch to a labeled instruction if a condition is true
  - Otherwise, continue sequentially
- beq rs, rt, L1
  - if (rs == rt) branch to the instruction labeled L1
- bne rs, rt, L1
  - if (rs != rt) branch to the instruction labeled L1
- j L1
  - unconditional jump to the instruction labeled L1

More Conditional Operations
- Set the result to 1 if a condition is true
  - Otherwise, set it to 0
- slt rd, rs, rt
  - if (rs < rt) rd = 1; else rd = 0;
- slti rt, rs, constant
  - if (rs < constant) rt = 1; else rt = 0;
- Use in combination with beq, bne
    slt $t0, $s1, $s2     # if ($s1 < $s2)
    bne $t0, $zero, L     #   branch to L

PLE 074 Computer Architecture II, 2012-2013