Automatic generation of Network-on-Chip topology under link length and latency constraint


THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS
TECHNICAL REPORT OF IEICE

Hideo TANIDA, Hiroaki YOSHIDA, Takeshi MATSUMOTO, and Masahiro FUJITA

Dept. of Electrical Engineering and Information Systems, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656 Japan
VLSI Design and Education Center, The University of Tokyo, 2-11-16 Yayoi, Bunkyo-ku, Tokyo, 113-0032 Japan
CREST, Japan Science and Technology Agency
E-mail: {tanida,hiroaki,matsumoto}@cad.t.u-tokyo.ac.jp, fujita@ee.t.u-tokyo.ac.jp

Abstract. As wire delay comes to dominate transistor delay in the deep-submicron era, SoC performance is increasingly determined by the interconnect. Although many NoC (Network-on-Chip) architectures that improve interconnect performance have been proposed, automatically finding the most efficient one for a given application, and mapping the function blocks onto it, is still an open issue. This paper proposes a method for generating a custom NoC that meets communication link-length and latency requirements. Adding constraints on floor-planning and interconnect architecture generation to the existing integer-linear-programming-based approach [5] enables the generated NoC architecture to meet the link-length and latency requirements.

Key words: Network-on-Chip, integer linear programming, guaranteed performance, floor planning

1.

[Section text not preserved; it opens with SoC (System-on-a-Chip).]

[Section 1, continued; surviving terms: NoC (Network-on-Chip) [2], SoC, and an outline of Sections 2 to 6.]

2. Network-on-Chip

[Figure 1: NoC with CPU, DSP, and memory cores.]

2.1 NoC

[Section text not preserved; it cites [1] and mentions FPGA.]

[Figure 2: NoC topologies [4]: (a) SPIN, (b) CLICHÉ (2-D mesh), (c) Torus, (d) Folded torus, (e) Octagon, (f) BFT.]

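[Figure 3: Communication trace graph [5].]

Symbols used in the floor-planning formulation [5]:

dist(u, v)      distance between nodes u and v
Ψ_l             link power coefficient
ω(e), σ(e)      bandwidth and latency requirements of edge e
X_max, Y_max    horizontal and vertical extent of the floor plan
α, β            weights of the communication term and of the size term X_max + Y_max

3. Network-on-Chip [5]

3.1

The input is the CTG (communication trace graph, Fig. 3) together with the width and height W_i, H_i of each core.

3.2

The cores of the CTG are placed by minimizing

$$\min \; \alpha \left[ \sum_{e=(u,v) \in E} \mathrm{dist}(u,v) \, \Psi_l \, \frac{\omega(e)}{\sigma^2(e)} \right] + \beta \left[ X_{max} + Y_{max} \right] \tag{1}$$

Each core v_i occupies the rectangle with lower-left corner (X_{i,min}, Y_{i,min}), size W_i by H_i, and upper-right corner (X_{i,max}, Y_{i,max}); dist(u, v) is computed from the corner coordinates X_{i,min}, Y_{i,min}. For every pair v_i, v_j ∈ V the rectangles must not overlap, so at least one of the following must hold:

$$X_{i,min} \ge X_{j,max} \;\lor\; X_{j,min} \ge X_{i,max} \;\lor\; Y_{i,min} \ge Y_{j,max} \;\lor\; Y_{j,min} \ge Y_{i,max} \tag{2}$$

and every core must lie inside the floor plan:

$$X_{i,max} \le X_{max}, \qquad Y_{i,max} \le Y_{max} \tag{3}$$

3.3

[Section text only partly preserved; it refers to the CTG of Fig. 3, to Figs. 5 and 6, and to the bounding box used in [5].]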
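As a rough illustration of the floor-planning formulation of Section 3.2, the sketch below evaluates objective (1) for a fixed placement and rejects placements that violate the non-overlap condition (2). It is not the ILP itself. The names `non_overlapping` and `floorplan_cost`, the two cores `p` and `q`, their coordinates, and the parameter values are hypothetical, and dist(u, v) is assumed here to be the Manhattan distance between lower-left corners.

```python
from itertools import combinations


def non_overlapping(a, b):
    """Eq. (2): at least one of the four separating conditions holds."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return (ax >= bx + bw or bx >= ax + aw or
            ay >= by + bh or by >= ay + ah)


def floorplan_cost(place, edges, psi_l=1.0, alpha=1.0, beta=1.0):
    """Objective (1) for a fixed placement, or None if Eq. (2) is violated.

    place: core name -> (x_min, y_min, width, height)
    edges: (u, v) -> (omega, sigma)
    """
    if not all(non_overlapping(place[u], place[v])
               for u, v in combinations(place, 2)):
        return None
    # Smallest X_max, Y_max that satisfy Eq. (3) for this placement.
    x_max = max(x + w for x, y, w, h in place.values())
    y_max = max(y + h for x, y, w, h in place.values())
    comm = 0.0
    for (u, v), (omega, sigma) in edges.items():
        # Assumption: Manhattan distance between lower-left corners.
        dist = abs(place[u][0] - place[v][0]) + abs(place[u][1] - place[v][1])
        comm += dist * psi_l * omega / sigma ** 2
    return alpha * comm + beta * (x_max + y_max)


# Hypothetical two-core example.
place = {"p": (0, 0, 2, 1), "q": (2, 0, 2, 1)}
edges = {("p", "q"): (8.0, 2)}
cost = floorplan_cost(place, edges)
overlapping = floorplan_cost({"p": (0, 0, 2, 1), "q": (1, 0, 2, 1)}, edges)
```

In an ILP the disjunction (2) is usually encoded with auxiliary binary variables; the direct check above is only meant to make the geometry of the constraint concrete.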

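The interconnect architecture is generated with the following variables:

r_i                  router i
p_{i,j}              port j of router r_i, 0 ≤ j < ν, where ν is the number of ports per router
NR_{k,i,j}           1 if core v_k is connected to port p_{i,j}, 0 otherwise
RR_{i,j,k,l}         1 if ports p_{i,j} and p_{k,l} are connected, 0 otherwise
O_{i,j,k,l}          1 if the traffic from v_i to v_j leaves through port p_{k,l}, 0 otherwise
I_{i,j,k,l}          1 if the traffic from v_i to v_j enters through port p_{k,l}, 0 otherwise

The output bandwidth BO_{k,l} and the input bandwidth BI_{k,l} of port p_{k,l} are

$$BO_{k,l} = \sum_{e_m=(v_i,v_j) \in E} \omega(e_m) \, O_{i,j,k,l} \tag{4}$$

$$BI_{k,l} = \sum_{e_m=(v_i,v_j) \in E} \omega(e_m) \, I_{i,j,k,l} \tag{5}$$

Z_{i,j,k,l,m,n} is 1 if the traffic from v_i to v_j uses the link between ports p_{k,l} and p_{m,n}, and 0 otherwise:

$$Z_{i,j,k,l,m,n} = O_{i,j,k,l} \cdot RR_{k,l,m,n} \tag{6}$$

This product is linearized as

$$O_{i,j,k,l} + RR_{k,l,m,n} \ge 2 \, Z_{i,j,k,l,m,n} \tag{7}$$

$$O_{i,j,k,l} + RR_{k,l,m,n} \le Z_{i,j,k,l,m,n} + 1 \tag{8}$$

The objective is to minimize the total power

$$\min \; (P_R + P_L) \tag{9}$$

where P_R is the router power and P_L is the link power. With Ψ_i and Ψ_o the power coefficients of input and output ports,

$$P_R = \sum_{r_i \in R} \sum_{p_{i,j}} \Psi_i \, BI_{i,j} + \sum_{r_i \in R} \sum_{p_{i,j}} \Psi_o \, BO_{i,j} \tag{10}$$

$$P_L = \Psi_L \left( \sum_{i,j,k,l,m,n} \omega(i,j) \, RD_{k,m} \, Z_{i,j,k,l,m,n} + \sum_{i,j,k,l} ND_{i,k} \, \omega(i,j) \, NR_{i,k,l} + \sum_{i,j,k,l} ND_{j,k} \, \omega(i,j) \, NR_{j,k,l} \right) \tag{11}$$

where Ψ_L is the link power coefficient, RD_{k,m} is the distance between routers k and m, and ND_{i,k} is the distance between node i and router k.

The route from a source v_s to a destination v_d is a path

$$p = \{ (v_s, r_a), (r_a, r_b), \ldots, (r_z, v_d) \} \tag{12}$$

4. NoC

4.1

[Section text not preserved; it refers to the Intel TeraFLOPS Processor [6] and to [5].]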
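The linearization in Eqs. (7) and (8) can be checked by enumeration. The short sketch below (function and variable names are hypothetical) lists, for every binary pair (O, RR), the values of Z that satisfy both inequalities, and confirms that the only feasible value is the product required by Eq. (6).

```python
from itertools import product


def satisfies_linearization(o, rr, z):
    """True if the binary triple (o, rr, z) satisfies Eqs. (7) and (8)."""
    return o + rr >= 2 * z and o + rr <= z + 1


# For each (O, RR) pair, collect the Z values the two inequalities allow.
feasible = {
    (o, rr): [z for z in (0, 1) if satisfies_linearization(o, rr, z)]
    for o, rr in product((0, 1), repeat=2)
}

# Eq. (6): the single feasible Z is always the product O * RR.
product_is_forced = all(zs == [o * rr] for (o, rr), zs in feasible.items())
```

Eq. (7) forces Z to 0 whenever either factor is 0, and Eq. (8) forces Z to 1 when both factors are 1, which is why two inequalities are enough to replace the nonlinear product.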

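[Section 4.1, continued; surviving value: 4 GHz.]

4.2 NoC

4.2.1

Table 1: Cores of the NoC

Core   W_i   H_i
a      1     0.5
b      2     1
d      4     1
e      2     1
f      4     1

Table 2: Communication between the cores of the NoC

Source   Destination   ω(e)   σ(e)
a        e             3000   6
d        e             3000   6
d        f             3000   6
b        d             3000   6
e        f             3000   6
b        a             10     1
b        e             3000   6
f        e             3000   6

The following constraint is added to the floor-planning step:

$$\forall (u,v) \in E: \quad \mathrm{dist}(u,v) \le D_{max} \cdot \sigma(e_{u,v}) \tag{13}$$

where D_max is the maximum link length and σ(e_{u,v}) is the latency constraint on the communication between u and v.

4.2.2

In the interconnect generation step of Section 3.3, the binary variables RR_{i,j,k,l} and NR_{i,j,k} of connections longer than D_max are fixed to 0. For every core v_k and every port p_{i,j} of a router located farther than D_max from v_k:

$$NR_{k,i,j} = 0 \tag{14}$$

For every pair of ports p_{m,i}, p_{n,j} of routers r_m, r_n located farther than D_max from each other:

$$RR_{m,i,n,j} = 0 \tag{15}$$

5. NoC

The ILP problems were solved with lp_solve 5.5.0.14 [3] on an Intel Xeon X5470 3.33 GHz CPU. The NoC of Tables 1 and 2 was used as the input.

[Surviving fragments: edge b, a; value 1; Fig. 7.]

5.1

[Surviving fragments: cores b, d, e, f; ILP 4.75 and 4.68; Figs. 7 and 8; cores b, a; values 1.5 and 0.]

5.2

[Surviving fragments: Figs. 7 and 10; Ψ_i, Ψ_o: 328, 66, 11; Ψ_L: 80 [5].]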
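The effect of the link-length constraints (13) to (15) can be sketched as follows. Given positions for cores and routers, the code checks Eq. (13) and lists the core-router and router-router pairs whose NR and RR variables would be fixed to 0. All names, positions, and the value of D_max are hypothetical; the sketch assumes Manhattan distance and treats all ports of a router as sharing the router's location.

```python
def manhattan(p, q):
    """Manhattan distance between two (x, y) points (an assumed metric)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def latency_feasible(pos, edges, d_max):
    """Eq. (13): dist(u, v) <= D_max * sigma(e_uv) for every CTG edge."""
    return all(manhattan(pos[u], pos[v]) <= d_max * sigma
               for (u, v), sigma in edges.items())


def pruned_links(core_pos, router_pos, d_max):
    """Pairs whose NR (Eq. 14) or RR (Eq. 15) variables are fixed to 0."""
    nr_zero = {(c, r)
               for c, cp in core_pos.items()
               for r, rp in router_pos.items()
               if manhattan(cp, rp) > d_max}
    rr_zero = {(r1, r2)
               for r1, p1 in router_pos.items()
               for r2, p2 in router_pos.items()
               if r1 != r2 and manhattan(p1, p2) > d_max}
    return nr_zero, rr_zero


# Hypothetical placement: two cores and two routers on a line.
core_pos = {"u": (0, 0), "v": (3, 0)}
router_pos = {"r0": (1, 0), "r1": (5, 0)}
nr_zero, rr_zero = pruned_links(core_pos, router_pos, d_max=2)
tight = latency_feasible(core_pos, {("u", "v"): 1}, d_max=2)
relaxed = latency_feasible(core_pos, {("u", "v"): 2}, d_max=2)
```

Fixing these variables before the solver runs only removes candidate connections, so it shrinks the search space without changing the remaining constraints of the formulation.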

[Figures 8, 9, and 10: generated floor plans and NoC architectures; node labels a, b, d, e, f, SW0, SW8.]

[Surviving fragments: ILP 824.21 and 1800; in Fig. 9, cores a and b are associated with switch 0 and cores d, e, and f with switch 8; Fig. 10; Tables 3 and 4; switches 0 to 8.]

6. NoC

[Section text not preserved; it mentions the NoC and Fig. 10.]

Table 3: Routes

Source   Destination   Route
a        e             a, SW0, e
d        e             d, SW0, e
d        f             d, SW0, f
b        d             b, SW0, d
e        f             e, SW0, f
b        a             b, SW0, a
b        e             b, SW0, e
f        e             f, SW0, e

Table 4: Routes

Source   Destination   Route
a        e             a, SW0, e
d        e             d, SW8, e
d        f             d, SW8, f
b        d             b, SW0, SW8, d
e        f             e, SW8, f
b        a             b, SW0, a
b        e             b, SW0, SW8, e
f        e             f, SW8, e

References

[1] M. Dall'Osso, G. Biccari, L. Giovannini, D. Bertozzi, and L. Benini. xpipes: a Latency Insensitive Parameterized Network-on-chip Architecture For Multi-Processor SoCs. Proceedings of the 21st International Conference on Computer Design, pp. 536-539, 2003.
[2] W.J. Dally and B. Towles. Route packets, not wires: on-chip interconnection networks. Proceedings of the 38th Design Automation Conference (DAC), pp. 684-689, 2001.
[3] lp_solve. http://lpsolve.sourceforge.net/5.5/.
[4] P.P. Pande, C. Grecu, M. Jones, A. Ivanov, and R. Saleh. Performance Evaluation and Design Trade-Offs for Network-on-Chip Interconnect Architectures. IEEE Transactions on Computers, pp. 1025-1040, 2005.
[5] K. Srinivasan, K.S. Chatha, and G. Konjevod. Linear-programming-based techniques for synthesis of network-on-chip architectures. IEEE Trans. Very Large Scale Integr. Syst., Vol. 14, No. 4, pp. 407-420, 2006.
[6] S.R. Vangal, J. Howard, G. Ruhl, S. Dighe, H. Wilson, J. Tschanz, D. Finan, A. Singh, T. Jacob, S. Jain, V. Erraguntla, C. Roberts, Y. Hoskote, N. Borkar, and S. Borkar. An 80-Tile Sub-100-W TeraFLOPS Processor in 65-nm CMOS. IEEE Journal of Solid-State Circuits, Vol. 43, No. 1, pp. 29-41, Jan. 2008.