ΗΜΥ 312 COMPUTER ARCHITECTURE, Winter Semester 2017

LECTURE 11: COMMUNICATION / INTERCONNECTION SYSTEMS
Instructor: Haris Theocharides, ECE (ΗΜΜΥ)
ttheocharides@ucy.ac.cy
[Adapted from Computer Architecture: A Quantitative Approach, Patterson & Hennessy, 2005, UCB, and José Duato]

(Review) Data Input/Output Systems
[Figure: a processor with its cache on a memory-I/O bus, together with main memory and three I/O controllers (two disks, graphics, network); the I/O controllers signal the processor through interrupts]
- Ideal: high bandwidth, low latency
ΗΜΥ312 L11 Communication Networks.2 Theocharides, ECE, 2017

Networks
- Goal: communication between computers
- Strategic aim: treat a network of computers as one large single computer, with distributed and shared computing resources (distributed resource sharing)
- The issue: the computers on the network must agree on many things
  - Standards and communication protocols matter
  - Fault tolerance is essential
- WARNING: a great deal of terminology

Examples of Networks
[Figure: map of example networks and links: ARPAnet, NSFNet, CSNet (Phonenet, Relay), Bitnet, X.25 (Telenet, Uninet), Ethernet, Token Ring, FDDI, ATM; the link speeds shown range from 56Kbps to 100Mbps]
- IP: Internet Protocol
- TCP: Transmission Control Protocol

Networks in Our Life!

Networks of Workstations (NOWs): Clusters
- Clusters are built from off-the-shelf systems: complete computers with multiple private address spaces
- Clusters are connected through the I/O buses of the computers
  - lower bandwidth than multiprocessors that use the memory bus
  - lower speed network links
  - more conflicts with I/O traffic
- A cluster of N processors has N copies of the OS, which limits the memory available to applications
- Improved system availability and expandability
  - easier to replace a machine without bringing down the whole system
  - allows rapid, incremental expandability
- Economy of scale: cost advantages

Commercial (NOW) Clusters

| System         | Proc           | Proc Speed | # Proc  | Network             |
|----------------|----------------|------------|---------|---------------------|
| Dell PowerEdge | P4 Xeon        | 3.06GHz    | 2,500   | Myrinet             |
| eServer IBM SP | Power4         | 1.7GHz     | 2,944   |                     |
| VPI BigMac     | Apple G5       | 2.3GHz     | 2,200   | Mellanox Infiniband |
| HP ASCI Q      | Alpha 21264    | 1.25GHz    | 8,192   | Quadrics            |
| LLNL Thunder   | Intel Itanium2 | 1.4GHz     | 1,024*4 | Quadrics            |
| Barcelona      | PowerPC 970    | 2.2GHz     | 4,536   | Myrinet             |

Networks in Our Daily Activities, and in Our Not-So-Daily Activities

Networks
- What concerns researchers:
  - direct (point-to-point) vs. indirect (multi-hop) connection
  - topology (e.g., bus, ring, DAG)
  - routing algorithms
  - switching (aka multiplexing)
  - wiring (e.g., choice of media: copper, coax, fiber)
- What concerns us:
  - latency
  - bandwidth
  - cost
  - reliability

More Terminology
- Connecting 2 or more networks: internetworking
- 3 families of networks
  - MPP (Massively Parallel Processing): performance, latency and bandwidth
  - LAN (Local Area Network): workstations, cost
  - WAN (Wide Area Network): telecommunications, phone call revenue
- An effort to condense these
- An effort to build a hierarchy from small to large, for simplicity

Network Basics
- Starting point: transfer one bit from one system (or computer) to a second system (or computer)
- A communication queue (FIFO) on each side
- The transferred information is called a message
- Simultaneous two-way communication (full duplex)
- Rules of communication? A protocol!
  - Inside the computer: loads/stores are a request (address) and a response (data)
  - Need request and response signaling

Simple Example
- What is the format of the message? Fixed? Number of bytes?

| Request/Response | Address/Data |
|------------------|--------------|
| 1 bit            | 32 bits      |

  - 0: Please send data from Address
  - 1: Packet contains data corresponding to request
- Header/Trailer: information on how to deliver the message
- Payload: the data in the message (1 word in this example)

Questions
- How do we get 3 or more computers to communicate?
  - Need an address field (destination) in the packet
- What if errors enter the packet?
  - Need a way to find errors: an error detection field in the packet (e.g., CRC)
- What if the packet is lost?
  - More elaborate protocols to detect loss (e.g., NAK, ARQ, time outs)
- How do we tell which process is the recipient?
  - Each process has its own queue for protection (queue per process to provide protection)
- Simple questions like these lead to more specialized protocols and packet formats => complex architectures

Back to the Simple Example
- What is the format of the message? Fixed? Number of bytes?

| Request/Response | Address/Data | CRC    |
|------------------|--------------|--------|
| 2 bits           | 32 bits      | 4 bits |

  - 00: Request: please send data from Address
  - 01: Reply: packet contains data corresponding to request
  - 10: Acknowledge request
  - 11: Acknowledge reply
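The 38-bit format above can be sketched in a few lines of Python. This is only an illustration: the slide does not name a CRC polynomial, so the generator x^4 + x + 1 is an assumption, and the names `crc4`, `encode`, and `decode` are hypothetical.

```python
# Sketch of the slide's packet: [2-bit type][32-bit address/data][4-bit CRC].
REQUEST, REPLY, ACK_REQUEST, ACK_REPLY = 0b00, 0b01, 0b10, 0b11

CRC_POLY = 0b10011  # assumed generator x^4 + x + 1; the slide names none


def crc4(value: int, nbits: int) -> int:
    """Remainder of (value * x^4) divided by CRC_POLY over GF(2)."""
    reg = value << 4
    for shift in range(nbits - 1, -1, -1):
        if reg & (1 << (shift + 4)):
            reg ^= CRC_POLY << shift
    return reg & 0xF


def encode(ptype: int, word: int) -> int:
    """Pack type and 32-bit word, then append the 4-bit CRC."""
    body = ((ptype & 0b11) << 32) | (word & 0xFFFFFFFF)
    return (body << 4) | crc4(body, 34)


def decode(packet: int):
    """Return (type, word), or None if the CRC does not match."""
    body, crc = packet >> 4, packet & 0xF
    if crc4(body, 34) != crc:
        return None
    return body >> 32, body & 0xFFFFFFFF


pkt = encode(REQUEST, 0xDEADBEEF)
print(decode(pkt) == (REQUEST, 0xDEADBEEF))
print(decode(pkt ^ (1 << 10)))  # a flipped bit is detected
```

With this generator every single-bit error is caught, because a lone flipped bit is never a multiple of a polynomial with more than one term.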

Network Support Software
- SW send steps
  - 1: Application copies data to OS buffer
  - 2: OS calculates checksum, starts timer
  - 3: OS sends data to network interface HW and says start
- SW receive steps
  - 3: OS copies data from network interface HW to OS buffer
  - 2: OS calculates checksum; if it matches, sends ACK; if not, deletes the message (sender resends when timer expires)
  - 1: If OK, OS copies data to user address space and signals application to continue
- Sequence of steps for SW: the protocol, i.e. the agreed procedure for sending and receiving
  - Example: the UDP/IP protocol in UNIX
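The send and receive steps can be traced with a small in-memory model. It is a sketch under stated assumptions: the additive 16-bit checksum, the `flaky_channel` link, and the retry limit are all hypothetical, and timer expiry is modeled simply as moving on to the next attempt.

```python
def checksum(data: bytes) -> int:
    """Toy 16-bit additive checksum (assumed; the slide names no algorithm)."""
    return sum(data) & 0xFFFF


def os_receive(frame):
    """Receive steps 3-2-1: take the frame, verify it, then deliver or drop."""
    data, check = frame
    if checksum(data) != check:
        return None, False   # drop silently; the sender's timer will expire
    return data, True        # copy to user space and send ACK


def os_send(data: bytes, channel, max_tries: int = 5):
    """Send steps 1-2-3, resending whenever no ACK comes back."""
    buffer = bytes(data)                         # 1: copy into an OS buffer
    check = checksum(buffer)                     # 2: checksum (timer starts)
    for attempt in range(1, max_tries + 1):
        frame = channel(buffer, check, attempt)  # 3: hand to the interface
        delivered, acked = os_receive(frame)
        if acked:
            return delivered, attempt
    raise TimeoutError("no ACK after max_tries attempts")


def flaky_channel(data, check, attempt):
    """Hypothetical link that corrupts the first byte on the first attempt."""
    if attempt == 1:
        data = bytes([data[0] ^ 0xFF]) + data[1:]
    return data, check


print(os_send(b"hello", flaky_channel))  # delivered on the second attempt
```

The receiver never reports a bad frame explicitly; the missing ACK is the only signal, which is why the sender needs the timer from step 2.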

Interconnecting Many Devices
- Additional network structure and functions
  - Additional functions: routing, arbitration, switching
- Routing
  - Which of the possible paths are allowable (valid) for packets?
  - Provides the set of operations needed to compute a valid path
  - Executed at source, intermediate, or even at destination nodes
- Arbitration
  - When are paths available for packets? (along with flow control)
  - Resolves packets requesting the same resources at the same time
  - For every arbitration, there is a winner and possibly many losers
  - Losers are buffered (lossless) or dropped on overflow (lossy)
- Switching
  - How are paths allocated to packets?
  - The winning packet (from arbitration) proceeds towards destination
  - Paths can be established one fragment at a time or in their entirety

Interconnecting Many Devices
- Shared-media networks
  - The network media is shared by all the devices
  - Operation: half-duplex or full-duplex
[Figure: several nodes attached to one shared medium]

Interconnecting Many Devices
- Shared-media networks: arbitration
  - Centralized arbiter for smaller distances between devices
    - Dedicated control lines
  - Distributed forms of arbiters
    - CSMA/CD
      - The device first checks the network (carrier sensing)
      - Then checks if the data sent was garbled (collision detection)
      - If collision, the device must send the data again (retransmission), after first waiting a random amount of time whose range grows exponentially
      - Fairness is not guaranteed
    - Token ring provides fairness
      - Owning the token provides permission to use the network media
[Figure: nodes on a shared medium, one marked as the token holder]
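The "exponentially increasing random wait" can be sketched as truncated binary exponential backoff. The cap of 10 doublings mirrors classic Ethernet but is an assumption here, as are the function names and the toy contention loop.

```python
import random


def backoff_slots(collisions: int, rng: random.Random, cap: int = 10) -> int:
    """Random wait in [0, 2^k - 1] slot times, with k = min(collisions, cap)."""
    k = min(collisions, cap)
    return rng.randrange(2 ** k)


def resolve(stations: int, rng: random.Random, max_rounds: int = 16):
    """Toy contention: all stations collided once and back off together.

    A round succeeds when exactly one station picks the smallest slot.
    Returns (winner index or None, collision count reached).
    """
    collisions = 1
    for _ in range(max_rounds):
        picks = [backoff_slots(collisions, rng) for _ in range(stations)]
        if picks.count(min(picks)) == 1:
            return picks.index(min(picks)), collisions
        collisions += 1   # tie on the smallest slot: another collision
    return None, collisions


winner, collisions = resolve(3, random.Random(2017))
print(winner, collisions)
```

Nothing in the loop favors a station that has already waited a long time, which is one way to see why fairness is not guaranteed.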

Interconnecting Many Devices
- Shared-media networks: switching
  - Switching is straightforward
  - The granted device connects to the shared media
- Shared-media networks: routing
  - Routing is straightforward
  - Performed at all the potential destinations
    - Each end node device checks whether it is the target of the packet
  - Broadcast and multicast are easy to implement
    - Every end node device sees the data sent on the shared link anyway
- Established order: arbitration, switching, and then routing

Interconnecting Many Devices
- Switched-media networks
  - Disjoint portions of the media are shared via switching
  - Switch fabric components
    - Passive point-to-point links
    - Active switches
      - Dynamically establish communication between sets of source-destination pairs
  - Aggregate bandwidth can be many times higher than that of shared-media networks
[Figure: four nodes connected through a switch fabric]

Interconnecting Many Devices
- Switched-media networks: routing
  - Every time a packet enters the network, it is routed
- Switched-media networks: arbitration
  - Centralized or distributed
  - Resolves conflicts among concurrent requests
- Switched-media networks: switching
  - Once conflicts are resolved, the network switches in the required connections
- Established order: routing, arbitration, and then switching

Interconnecting Many Devices
- Comparison of shared- versus switched-media networks
  - Shared-media networks
    - Low cost
    - Aggregate network bandwidth does not scale with # of devices
    - Global arbitration scheme required (a possible bottleneck)
    - Time of flight increases with the number of end nodes
  - Switched-media networks
    - Aggregate network bandwidth scales with number of devices
    - Concurrent communication
      - Potentially much higher network effective bandwidth
    - Beware: inefficient designs are quite possible
      - Superlinear network cost but sublinear network effective bandwidth

Routing, Arbitration, and Switching
- Arbitration
  - Performed at each switch, regardless of topology
  - Determines use of paths supplied to packets (When allocated?)
  - Needed to resolve conflicts for shared resources by requestors
  - Ideally: maximize the matching between available network resources and packets requesting them
    - At the switch level, arbiters maximize the matching of free switch output ports and packets located at switch input ports
  - Problems: starvation
    - Arises when packets can never gain access to requested resources
    - Solution: grant resources to packets with fairness, even if prioritized
  - Many straightforward distributed arbitration techniques for switches
    - Two-phased arbiters, three-phased arbiters, and iterative arbiters
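One simple way to grant resources "with fairness" is a round-robin arbiter, sketched below. The slide does not prescribe this scheme; it is named here only as a common starvation-free option, and the class and method names are hypothetical.

```python
class RoundRobinArbiter:
    """Grants one requester per round, rotating priority after each grant."""

    def __init__(self, n: int):
        self.n = n
        self.next = 0   # requester with the highest priority next round

    def grant(self, requests):
        """requests: set of requester indices. Returns the winner or None."""
        for offset in range(self.n):
            candidate = (self.next + offset) % self.n
            if candidate in requests:
                # The winner drops to lowest priority, so nobody starves.
                self.next = (candidate + 1) % self.n
                return candidate
        return None


arb = RoundRobinArbiter(4)
print([arb.grant({0, 2, 3}) for _ in range(6)])  # [0, 2, 3, 0, 2, 3]
```

Every persistent requester is served within n rounds, at the cost of ignoring any priority among the packets.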

Dynamic Interconnection Networks
- Implemented with a crossbar: a multi-way switch built from crossed bars
  - Connects, e.g., α processors to β memories
  - α * β matrix: α horizontal lines, β vertical lines
  - Crosspoints: on/off switches
  - Exactly one switch for each (row, column) pair
  - Non-blocking: connecting port X_i to port Y_j does not interrupt the connection from port Z_l to W_k
  - Quite expensive, does not scale linearly: α * β switches
  - Complex timing and control
- We will look at it in detail later
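The crossbar properties listed above (one switch per crosspoint, non-blocking for distinct ports, α * β cost) can be illustrated with a minimal model. This is a sketch, not a hardware description; the class and its method names are hypothetical.

```python
class Crossbar:
    """rows x cols grid of on/off crosspoint switches."""

    def __init__(self, rows: int, cols: int):
        self.rows, self.cols = rows, cols
        self.row_to_col = {}   # closed crosspoints, keyed by row

    def crosspoints(self) -> int:
        """Hardware cost: one switch per (row, column) pair."""
        return self.rows * self.cols

    def connect(self, row: int, col: int) -> bool:
        """Close a crosspoint unless that row or column is already in use."""
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise IndexError("no such crosspoint")
        if row in self.row_to_col or col in self.row_to_col.values():
            return False
        self.row_to_col[row] = col
        return True

    def disconnect(self, row: int) -> None:
        self.row_to_col.pop(row, None)


xb = Crossbar(4, 3)
print(xb.crosspoints())   # 12
print(xb.connect(0, 1))   # True
print(xb.connect(2, 0))   # True: distinct ports never block each other
print(xb.connect(1, 1))   # False: column 1 is busy
```

Doubling both port counts quadruples `crosspoints()`, which is the non-linear growth the slide warns about.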

Static Networks
- Fixed wires (channels) between devices
- Many topologies
  - Fully connected
    - (n(n-1))/2 channels
    - Static counterpart of crossbar
  - Star
    - One central PE for message passing
    - Static counterpart of bus
  - Multi-level networks with a device at each switch

Performance Metrics
- Network cost
  - number of switches
  - number of (bidirectional) links on a switch to connect to the network (plus one link to connect to the processor)
  - width in bits per link, length of link
- Network bandwidth (NB): represents the best case
  - bandwidth of each link * number of links
- Bisection bandwidth (BB): represents the worst case
  - divide the machine in two parts, each with half the nodes, and sum the bandwidth of the links that cross the dividing line
- Other aspects
  - latency on an unloaded network to send and receive messages
  - throughput: maximum # of messages transmitted per unit time
  - # routing hops worst case, congestion control and delay

Bus IN
[Figure legend: bidirectional network switch; processor node]
- N processors, 1 switch, 1 link (the bus)
- Only 1 transfer at a time
  - NB = link (bus) bandwidth * 1
  - BB = link (bus) bandwidth * 1

Ring IN
- N processors, N switches, 2 links/switch, N links
- N simultaneous transfers
  - NB = link bandwidth * N
  - BB = link bandwidth * 2
- If a link is as fast as a bus, the ring is only twice as fast as a bus in the worst case, but is N times faster in the best case

Fully Connected IN
- N processors, N switches, N-1 links/switch, (N*(N-1))/2 links
- N simultaneous transfers
  - NB = link bandwidth * (N*(N-1))/2
  - BB = link bandwidth * (N/2)^2

Crossbar (Xbar) Connected IN
- N processors, N^2 switches (unidirectional), 2 links/switch, N^2 links / processor
- N simultaneous transfers
  - NB = link bandwidth * N
  - BB = link bandwidth * N/2

Hypercube (Binary N-cube) Connected IN
[Figure: a 2-cube and a 3-cube]
- N processors, N switches, logN links/switch, (N*logN)/2 links
- N simultaneous transfers
  - NB = link bandwidth * (N*logN)/2
  - BB = link bandwidth * N/2
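The link counts above follow from the fact that hypercube neighbors differ in exactly one address bit. The sketch below checks this and adds dimension-order routing, a standard technique that is not on the slide; all function names are hypothetical.

```python
def neighbors(node: int, dim: int):
    """The dim nodes whose address differs from node in exactly one bit."""
    return [node ^ (1 << i) for i in range(dim)]


def route(src: int, dst: int, dim: int):
    """Dimension-order path: fix the differing bits from lowest to highest."""
    path, cur = [src], src
    for i in range(dim):
        if (cur ^ dst) & (1 << i):
            cur ^= 1 << i
            path.append(cur)
    return path


def link_count(dim: int) -> int:
    """Count each bidirectional link once; equals (N * log2 N) / 2."""
    n = 1 << dim
    return sum(1 for a in range(n) for b in neighbors(a, dim) if a < b)


print(neighbors(0, 3))         # [1, 2, 4]
print(route(0b000, 0b101, 3))  # [0, 1, 5]
print(link_count(6))           # 192 links for N = 64
```

The path length equals the number of differing address bits, so no route needs more than logN hops.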

2D and 3D Mesh/Torus Connected IN
- N processors, N switches, 2, 3, 4 (2D torus) or 6 (3D torus) links/switch, 4N/2 links or 6N/2 links
- N simultaneous transfers
  - NB = link bandwidth * 4N or link bandwidth * 6N
  - BB = link bandwidth * 2*N^(1/2) or link bandwidth * 2*N^(2/3)

Fat Tree
- N processors, log(N-1)*logN switches, 2 up + 4 down = 6 links/switch, N*logN links
- N simultaneous transfers
  - NB = link bandwidth * N*logN
  - BB = link bandwidth * 4

Fat Tree
- Trees are very useful structures, and they are very popular in networks. People in CS use them all the time. Suppose we wanted to make a tree network.
[Figure: tree network with leaf nodes A, B, C, D]
- Any time A wants to send to C, it ties up the upper links, so that B can't send to D.
  - The bisection bandwidth on a tree is horrible: 1 link, at all times
- The solution is to 'thicken' the upper links.
  - More links as the tree gets thicker increases the bisection
- Rather than design a bunch of N-port switches, use pairs

SGI NUMAlink Fat Tree
www.embedded-computing.com/articles/woodacre

Network Comparisons
- For a 64 processor system

|                      | Bus | Ring | Torus | 6-cube | Fully connected |
|----------------------|-----|------|-------|--------|-----------------|
| Network bandwidth    | 1   |      |       |        |                 |
| Bisection bandwidth  | 1   |      |       |        |                 |
| Total # of switches  | 1   |      |       |        |                 |
| Links per switch     |     |      |       |        |                 |
| Total # of links     | 1   |      |       |        |                 |

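IN Comparison
- For a 64 processor system

|                         | Bus | Ring  | 2D Torus | 6-cube | Fully connected |
|-------------------------|-----|-------|----------|--------|-----------------|
| Network bandwidth       | 1   | 64    | 256      | 192    | 2016            |
| Bisection bandwidth     | 1   | 2     | 16       | 32     | 1024            |
| Total # of switches     | 1   | 64    | 64       | 64     | 64              |
| Links per switch        |     | 2+1   | 4+1      | 6+1    | 63+1            |
| Total # of links (bidi) | 1   | 64+64 | 128+64   | 192+64 | 2016+64         |

Bandwidths are in units of one link's bandwidth. In "Links per switch" the "+1" is the link to the processor, and in "Total # of links" the "+64" counts those 64 processor links.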
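The bandwidth rows of the table can be reproduced from the per-topology formulas on the earlier slides. The sketch below is only a check of that arithmetic: it takes the formulas as the slides state them (including NB = 4N for the 2D torus) and assumes N is both a power of two and a perfect square; the function name is hypothetical.

```python
import math


def in_metrics(n: int):
    """(NB, BB) in units of one link's bandwidth, per the slide formulas."""
    log_n = n.bit_length() - 1   # log2(n), assuming n is a power of two
    side = math.isqrt(n)         # torus side, assuming n is a perfect square
    return {
        "Bus": (1, 1),
        "Ring": (n, 2),
        "2D Torus": (4 * n, 2 * side),
        "6-cube": (n * log_n // 2, n // 2),   # hypercube; a 6-cube for n = 64
        "Fully connected": (n * (n - 1) // 2, (n // 2) ** 2),
    }


for name, (nb, bb) in in_metrics(64).items():
    print(f"{name:16s} NB={nb:5d} BB={bb:5d}")
```

Running it for other values of N shows how quickly the gap between the best case (NB) and the worst case (BB) opens up for the ring and the torus.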

How About On-Chip Networks?

Differences between on-chip and off-chip networks
- Off-chip: I/O bottlenecks
  - Pin-limited bandwidth
  - Inherent overheads of off-chip I/O transmission
- On-chip
  - Tight area and power budgets
  - Ultra-low on-chip latencies

Multicore Examples (1): XBAR
- Sun Niagara
[Figure: crossbar with ports numbered 0 to 5 on each side]

Multicore Examples (2): RING
- IBM Cell
- Element Interconnect Bus
  - 4 rings
  - Packet size: 16B-128B
  - Credit-based flow control
  - Up to 64 outstanding requests
  - Latency: 1 cycle/hop

Many Core Example: 2D MESH
- Intel Polaris
  - 80 core prototype
- Academic research examples: MIT Raw, TRIPS
  - 2-D mesh topology
  - Scalar operand networks

Suggested Reading
- William Dally and Brian Towles. Principles and Practices of Interconnection Networks. Morgan Kaufmann Pub., San Francisco, CA, 2003.
- William Dally and Brian Towles. Route packets not wires: On-chip interconnection networks. In Proceedings of the 38th Annual Design Automation Conference (DAC-38), 2001, pp. 684-689.
- David Wentzlaff, Patrick Griffin, Henry Hoffman, Liewei Bao, Bruce Edwards, Carl Ramey, Matthew Mattina, Chi-Chang Miao, John Brown III, and Anant Agarwal. On-chip interconnection architecture of the tile processor. IEEE Micro, pages 15-31, 2007.
- Michael Bedford Taylor, Walter Lee, Saman Amarasinghe, and Anant Agarwal. Scalar operand networks: On-chip interconnect for ILP in partitioned architectures. In Proceedings of the International Symposium on High Performance Computer Architecture, February 2003.
- S. Vangal, J. Howard, G. Ruhl, S. Dighe, H. Wilson, J. Tschanz, D. Finan, P. Iyer, A. Singh, T. Jacob, S. Jain, S. Venkataraman, Y. Hoskote, and N. Borkar. An 80-tile 1.28TFLOPS network-on-chip in 65nm CMOS. Solid-State Circuits Conference, 2007. ISSCC 2007. Digest of Technical Papers. IEEE International, pages 98-589, Feb. 2007.

Examples of Networked Processors

| System          | Proc           | Proc Speed | # Proc   | IN Topology                 | BW/link (MB/sec) |
|-----------------|----------------|------------|----------|-----------------------------|------------------|
| SGI Origin      | R16000         |            | 128      | fat tree                    | 800              |
| Cray T3E        | Alpha 21164    | 300MHz     | 2,048    | 3D torus                    | 600              |
| Intel ASCI Red  | Intel          | 333MHz     | 9,632    | mesh                        | 800              |
| IBM ASCI White  | Power3         | 375MHz     | 8,192    | multistage Omega            | 500              |
| NEC ES          | SX-5           | 500MHz     | 640*8    | 640-xbar                    | 16000            |
| NASA Columbia   | Intel Itanium2 | 1.5GHz     | 512*20   | fat tree, Infiniband        |                  |
| IBM BG/L        | Power PC 440   | 0.7GHz     | 65,536*2 | 3D torus, fat tree, barrier |                  |

IBM BlueGene

|              | 512-node proto         | BlueGene/L             |
|--------------|------------------------|------------------------|
| Peak Perf    | 1.0 / 2.0 TFlops/s     | 180 / 360 TFlops/s     |
| Memory Size  | 128 GByte              | 16 / 32 TByte          |
| Foot Print   | 9 sq feet              | 2500 sq feet           |
| Total Power  | 9 KW                   | 1.5 MW                 |
| # Processors | 512 dual proc          | 65,536 dual proc       |
| Networks     | 3D Torus, Tree, Barrier | 3D Torus, Tree, Barrier |
| Torus BW     | 3 B/cycle              | 3 B/cycle              |

A BlueGene/L Chip
[Figure: chip block diagram. Two 700 MHz 440 CPUs, each with a double FPU, 32K/32K L1, and a 2KB L2; a 16KB multiport SRAM buffer; 4MB L3 ECC eDRAM (128B line, 8-way assoc); internal paths labeled 5.5 GB/s and 11GB/s. Interfaces: Gbit ethernet; 3D torus (6 in, 6 out, 1.4Gb/s links); fat tree (3 in, 3 out, 2.8Gb/s links); 4 global barriers; DDR control (144b DDR, 256MB, 5.5GB/s)]

On-Chip Networking
- Connects processors, memory, and I/O devices
- Dynamic networks
  - Components are connected at any point through switches or buses
  - Two kinds of switches
    - On / off: 1 input, 1 output
    - Pass through / cross over: 2 inputs, 2 outputs
- Static networks
  - Components are connected by wires

Crossbar vs bus
- Crossbar
  - Performance depends on the number of I/O ports
  - Easy to design
- Bus
  - Not affected by the number of I/O ports
  - Becomes more complex as the number of I/O ports grows
- Compromise: multistage network

Compromise: Multistage Network
- Connects n components with each other
- Usually built from on the order of n*log(n) 2x2 switches
- Cheaper than a crossbar
- Faster than a bus
- Many topologies, e.g., Banyan, Tree, Perfect Shuffle
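As an illustration of one shuffle-based multistage topology, the sketch below models an Omega network: log2(n) stages, each a perfect shuffle followed by a column of 2x2 switches. Destination-tag routing is a standard technique that is not on the slide, and the function names are hypothetical.

```python
def shuffle(i: int, bits: int) -> int:
    """Perfect shuffle: rotate the bits-wide address left by one position."""
    mask = (1 << bits) - 1
    return ((i << 1) | (i >> (bits - 1))) & mask


def omega_route(src: int, dst: int, bits: int):
    """Line positions after each stage, using destination-tag routing.

    At stage s the 2x2 switch sends the packet to its upper (0) or lower (1)
    output according to bit s of dst, counted from the most significant bit.
    """
    pos, path = src, [src]
    for stage in range(bits):
        pos = shuffle(pos, bits)
        want = (dst >> (bits - 1 - stage)) & 1
        pos = (pos & ~1) | want
        path.append(pos)
    return path


print([shuffle(i, 3) for i in range(8)])  # [0, 2, 4, 6, 1, 3, 5, 7]
print(omega_route(3, 6, 3)[-1])           # 6: the packet reaches dst
```

Each packet has exactly one path, so two packets that need the same switch output in the same stage conflict. That blocking is the price paid for using far fewer switches than a crossbar.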