Clock Optimization for High-Performance Pipelined Design

Hsiao-Ping Juan, Daniel D. Gajski and Smita Bakshi
Department of Information and Computer Science
University of California, Irvine, CA 92717-3425, USA

EURO-DAC '96 with EURO-VHDL '96. 0-89791-848-7/96 $4.00 1996 IEEE

Abstract

In order to reduce the design cost of pipelined systems, resources may be shared by operations within and across different pipe stages. In order to maximize resource sharing, a crucial decision is the selection of a clock period, since a bad choice can adversely affect the performance and cost of the design. In this paper, we present an algorithm to select a clock period that attempts to minimize design area while satisfying a given throughput constraint. Experimental results on several examples demonstrate the quality of our selection algorithm and the benefit of allowing resource sharing across pipe stages.

1 Introduction

In general, high-performance constraints are met by pipelining the design into several concurrently executing stages, such that the pipe stages operate one after the other on the same sample but at the same time on different samples. In designing such a pipelined system, the current practice is to partition the description under development into stages; each stage, along with its performance and cost constraints, is then given to a different design group. In this scheme, different stages are implemented separately and have their own datapaths and control units. Thus, different stages can use different clock signals, as long as their delays satisfy the throughput constraint.

Clearly, implementing each pipe stage separately would result in a large number of hardware resources, thereby increasing the cost of the design. However, the design cost can be reduced by sharing resources among these stages. For instance, the same functional unit can be utilized to perform several different operations from different pipe stages over different time-steps.

When performing resource sharing, an important decision is the selection of a clock period to schedule the operations into different states. A bad choice of the clock period could adversely affect the performance and cost of the final design. For instance, Figure 1(a) shows a two-stage pipeline with a pipe stage delay (the inverse of throughput) constraint of 120 ns. Figures 1(b), (c) and (d) show scheduling results of the given pipeline using 60, 30 and 20 ns as the clock period respectively. When the clock period is equal to either 60 or 30 ns, the pipe stage delay constraint is satisfied. The operations a and e can share the same multiplier since they are in different clock cycles. Note that two of the additions are scheduled in the same clock cycle when the clock period is 60 ns, but in different clock cycles when the clock period is 30 ns. Consequently, these two operations can share one adder when the clock period is 30 ns, but not when the clock period is 60 ns. Thus, the implementation with a 30 ns clock period requires one less adder than the implementation with a 60 ns clock period. However, this does not imply that a shorter clock period is always preferable. For example, Figure 1(d) shows that, when the clock period is equal to 20 ns, pipe stage 1 requires seven clock cycles, which results in a delay of 140 ns and violates the pipe stage delay constraint.

Figure 1: An example illustrating the effect of different clock periods on resource sharing of a pipelined design

Using this example, we have shown that the selection of a clock period is a non-trivial problem. In this paper, we propose a clock estimation algorithm that determines the clock period which satisfies the throughput constraint and requires a minimum number of resources.

The rest of the paper is organized as follows. In the next section, we discuss previous research done in this area and also explain how we differ from it. We give the problem definition and present the assumed design model in Section 3. The clock selection algorithm is explained in Section 4. Finally, we present experimental results and give conclusions.

2 Previous Work

Several previous papers addressed the issue of clock period estimation for a given data flow graph. For example, there are several clock estimation schemes [4] [7] [8] that use the delay of the slowest component as the estimated clock period. However, using the slowest component delay as the clock period can lead to under-utilized functional units and consequently a higher design cost in cases where the components have widely differing delays.

A clock estimation method based on slack minimization is proposed in [6]. This estimation method aims to select the clock period that optimizes the performance of the design. In our problem, performance optimization is not the goal; our goal is to minimize the design cost while satisfying the performance constraint.

In [1], a methodology is proposed to estimate the clock period for time-constrained scheduling as well as resource-constrained scheduling. However, this methodology does not consider pipelined designs, while our algorithm aims to select a clock period for pipelined designs.

Finally, the algorithm presented in this paper differs from all the algorithms mentioned above in that it also takes the control unit delay into account. When the number of states is very large, the control unit tends to become very complex, and the control unit delay contributes significantly to the clock period and cannot be neglected. By considering the control unit delay, our algorithm provides a more realistic estimation than the previously published work.

3 Problem Definition and Assumptions

Given (1) a pipeline of n pipe stages PS_1 ... PS_n, where each pipe stage PS_i is represented by a data flow graph DFG_i, (2) a component library, (3) the pipe stage delay constraint PSDelay and (4) a range of allowable clock periods, represented by [clkmin, clkmax], the goal of our algorithm is to find a clock period clk such that, for all i, DFG_i can be scheduled into $\lfloor PSDelay/clk \rfloor$ states of delay clk and the design area is minimized.

The maximum clock period allowed, clkmax, is equal to PSDelay. Design libraries often specify the maximum clock frequency at which the clock input of a bistable circuit may be driven such that stable transitions of logic levels are maintained. This frequency is used to determine the value of clkmin if it is not already specified by the user. The cost of a pipeline is approximated by the total area of datapath components.

Our clock estimation algorithm assumes a design model, as shown in Figure 2, similar to the design model used in [6].
In this model, the datapath consists of registers, functional units and tri-state drivers. A two-level bus structure is assumed for the interconnection across the registers and functional units. Note that a register could be used to store a temporary value that is used in different states of the same pipe stage, or could be used as a pipeline latch between pipe stages. Operation chaining is supported in this model by allowing connections from the output ports of some functional units directly to the input ports of other functional units. Moreover, operations can execute over several clock cycles; that is, multi-cycle operations are possible.

The control unit consists of the state register, a decoder, the control logic to drive the control lines for the datapath components, and the next-state logic to compute the next state to be stored in the state register. The control unit implements a state machine that sequences the design through a series of states; each state represents the set of datapath operations performed concurrently in the same or different pipe stages of the design.

Figure 2: Design model for clock estimation

The clock period is determined by the longest register-to-register delay. Typically, the path through the control logic, as shown in Figure 2, has the largest delay. Therefore, in our estimation, the minimal clock period is approximated using the sum of all the delays associated with the components in the path, including the datapath and the control unit.

4 Algorithm

Our algorithm selects the clock period in three basic steps.

1. Pipe stage shape function generation: The first step in our algorithm is to produce a shape function in terms of clock periods versus the pipe stage delay, individually, for each pipe stage of the description. This shape function can clearly indicate the clocks that can satisfy the pipe stage delay constraint.

2. Clock candidates selection: Next, given the pipe stage delay constraint, PSDelay, and the shape functions of each pipe stage, a set of clock periods that can satisfy the pipe stage delay constraint in all stages can be easily obtained. These clock periods are called clock candidates.

3. Resource estimation: Having obtained the set of clock candidates, the final step in our algorithm is to estimate the amount of resources required by each clock candidate. The algorithm then returns the clock period that requires the least amount of resources.

Details of pipe stage shape function generation and resource estimation are presented in the following sections.

4.1 Pipe Stage Shape Function Generation

The shape function generation algorithm basically consists of three steps. It first produces a shape function in terms of clock periods versus the minimum number of states by considering only the datapath delay. Then the algorithm estimates the control unit delay and updates the shape function accordingly. Finally, the shape function of clock periods versus the pipe stage delay can be computed by multiplying the clock periods by the corresponding number of states.

    Procedure: MinClkPeriod
    Inputs: a data flow graph DFG, the number of states N;
    Output: the minimum clock period;
    begin Procedure
      Cstep = 1;
      ComputePathLength(DFG);
      MaxPathLength = delay of the longest path in DFG;
      MinClk = MaxPathLength / N;
      InsertReadyOps(DFG, PList);
      while (PList ≠ ∅) do
        if Cstep = N then
          schedule all the non-scheduled operations;
          MinClk = maximum state delay;
          PList = ∅;
        else
          op = First(PList);
          if op is a single-cycle operator then
            determine chaining or non-chaining;
            schedule op and update MinClk;
          else
            determine the number of cycles of op;
            schedule op and update MinClk;
          end if;
          InsertReadyOps(DFG, PList);
          Cstep = Cstep + 1;
        end if;
      end while;
      return MinClk;
    end Procedure

Figure 3: The procedure to estimate the minimum clock period, given N states

Given a data flow graph DFG of a pipe stage and the range of allowable clock periods, [clkmin, clkmax], the shape function is generated incrementally by fixing the number of states, and then computing the minimum clock period for the fixed number of states using the procedure MinClkPeriod outlined in Figure 3. This process produces one (clock period, number of states) point in the shape function. To obtain the entire shape function, we iteratively increase the number of states, beginning with the smallest possible number, which is $\lceil PSDelay/clkmax \rceil$, and finishing with the largest possible number, which is $\lfloor PSDelay/clkmin \rfloor$.

Given a data flow graph DFG, the procedure MinClkPeriod first computes the path length for each of the operations in DFG. The path length of an operation is defined as the longest path delay starting from this operation till the output node. Therefore, by definition, the maximum path length, MaxPathLength, of all operations in DFG is the critical path length. The variable MinClk is initialized to the optimal clock period MaxPathLength/N, where N is the number of states that the DFG will be scheduled into.
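The initialization just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `path_lengths` and `state_range`, the dictionary-based graph encoding, and the three-operation chain with its delays are all hypothetical and are not taken from the paper's implementation.

```python
import math
from functools import lru_cache


def path_lengths(delays, succs):
    """Longest delay from each operation to an output node (own delay included)."""
    @lru_cache(maxsize=None)
    def length(op):
        # An operation with no successors is an output node.
        return delays[op] + max((length(s) for s in succs.get(op, ())), default=0.0)

    return {op: length(op) for op in delays}


def state_range(ps_delay, clk_min, clk_max):
    """State counts to try: ceil(PSDelay/clkmax) up to floor(PSDelay/clkmin)."""
    return range(math.ceil(ps_delay / clk_max), math.floor(ps_delay / clk_min) + 1)


# Hypothetical chain: multiply -> add -> multiply (delays in ns, chosen for illustration).
delays = {"m1": 56.0, "a1": 24.0, "m2": 56.0}
succs = {"m1": ("a1",), "a1": ("m2",)}

lengths = path_lengths(delays, succs)
max_path_length = max(lengths.values())  # critical path length
initial_min_clk = max_path_length / 5    # starting MinClk for N = 5 states
```

Calling `state_range(140.0, 20.0, 140.0)` yields the state counts 1 through 7, each of which would be passed to the minimum clock period procedure to produce one point of the shape function.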
The next step of the procedure involves determining whether a ready operation can be scheduled and whether chaining or multi-cycling should be performed, depending upon its effect on the clock period. The scheduling of an operation may increase the clock period, and the variable MinClk is updated if it does. Once an operation is scheduled, other non-ready operations become ready and these are then inserted into the ready list. This process continues, and when it reaches the last state, all the non-scheduled operations are scheduled into the last state and the procedure returns the variable MinClk.

Clearly, the result of this algorithm depends upon how chaining and multi-cycling are performed. We now illustrate how chaining and multi-cycling are determined on the example in Figure 4.

Figure 4: Determining the minimum clock period

Given that a multiplication operation takes 56 ns and an addition takes 24 ns, the procedure computes that the maximum path length is 136 ns. Since the data flow graph needs to be scheduled into five states, the optimal clock period, that is, the current MinClk, is 136/5 = 27.2 ns. In the first iteration, the procedure attempts to schedule the first multiplication on the critical path and needs to determine whether to schedule it across $\lfloor 56/27.2 \rfloor = 2$ states or $\lceil 56/27.2 \rceil = 3$ states. If it is scheduled across two states, the average delay of the first two states is 56/2 = 28 ns, and the addition and multiplication that follow it could be scheduled across three states, which results in an estimated delay per state of (24 + 56)/3 = 26.7 ns. Thus, the clock period is 28 ns. If it is scheduled across three states, then the average state delay is 18.7 ns for the first three states. However, the following addition and multiplication now need to be finished within two states, which gives an estimated delay per state of (24 + 56)/2 = 40 ns. That is, the clock period in this case is 40 ns. Since scheduling the multiplication as a two-cycle operation gives a shorter estimated clock period, the procedure decides to schedule it across the first two states as shown in Figure 4(b), and the clock period MinClk is updated to 28 ns.

Next, the following operation is scheduled. Since the delay of this operation is less than 28 ns, it is a single-cycle operation and its scheduling does not change the current clock period.
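The arithmetic of this decision, for a 56 ns multiplication followed by a 24 ns addition and a second 56 ns multiplication in five states, can be replayed with a small Python sketch. The helper name `clock_if_spread` and the rule "take the larger of the two average state delays" are a reading of the worked example, not a statement of how the authors' implementation is written.

```python
import math


def clock_if_spread(op_delay, k, remaining_path, total_states):
    """Estimated clock period if the operation spans k states and the rest of
    the critical path must fit into the remaining states."""
    rest_states = total_states - k
    return max(op_delay / k, remaining_path / rest_states)


mult, add, n_states = 56.0, 24.0, 5
ideal = (mult + add + mult) / n_states  # 136 / 5, the initial MinClk

# Try the two neighbouring state counts around mult / ideal.
choices = (max(1, math.floor(mult / ideal)), math.ceil(mult / ideal))
options = {k: clock_if_spread(mult, k, add + mult, n_states) for k in choices}

# Pick the state count that gives the shorter estimated clock period.
best_k = min(options, key=options.get)
```

Here `options` maps 2 states to 28 ns and 3 states to 40 ns, so `best_k` is 2, which agrees with the two-cycle choice in the text.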
The result of this iteration is shown in Figure 4(c). The procedure continues this process for the remaining operations, and the final result is shown in Figure 4(d). The algorithm obtains a minimum clock period of 28 ns.

Similarly, we can estimate that the minimum clock periods for scheduling the data flow graph in Figure 4(a) into one, two, three, or four states are 136 ns, 80 ns, 56 ns and 56 ns respectively. Therefore, we can conclude that for any clock period larger than 136 ns, the minimum number of states the DFG requires is one; for any clock period between 136 and 80 ns, the minimum number of states is two, etc. Figure 5 shows the resulting shape function.

Figure 5: The shape function of clock periods versus number of states

Thus far, the shape function does not incorporate control unit delays. As illustrated in Figure 2, the control unit consists of a state register, a decoder, the control logic and the next-state logic. Therefore, given the number of states N, the delay of an N-state control unit, $T_{CU}(N)$, is estimated as the sum of the decoder delay ($T_{DEC}$), the control logic delay ($T_{CL}$), the next-state logic delay ($T_{NS}$), and the propagation delay and the setup time of the state register. The propagation delay and the setup time of the state register can be obtained from the component library. The following equations are used to estimate $T_{DEC}$, $T_{CL}$, and $T_{NS}$.

$$T_{DEC} = T_{INV} + \lceil \log_M \log_2 N \rceil \, T_{AND}$$
$$T_{CL} = \lceil \log_M N \rceil \, T_{OR}$$
$$T_{NS} = \lceil \log_M (N/2) \rceil \, T_{OR}$$

For lack of space, we will not explain how these equations have been derived; a detailed discussion is provided in [5].

Given a point $(clk_i, N)$ in the shape function, the algorithm next updates the point $(clk_i, N)$ to $(clk_i + T_{CU}(N), N)$. Note that given two points $(clk_i, N)$ and $(clk_{i+1}, N+1)$, where $clk_i > clk_{i+1}$, it is possible that $clk_i + T_{CU}(N) < clk_{i+1} + T_{CU}(N+1)$. In this case, the algorithm drops the point $(clk_i + T_{CU}(N), N)$. After the shape function of clock periods versus the number of states is updated, the shape function of clock periods versus the pipe stage delay can be obtained by multiplying the clock periods by the corresponding number of states.

4.2 Resource Estimation

From the shape functions for each pipe stage, the set of clock candidates can be easily obtained. The next step of our algorithm is to estimate the number and type of resources required for each clock candidate. An example to illustrate the algorithm is shown in Figure 6.

We know that all pipe stages are executed concurrently, and in order to consider resource sharing across the stages, the algorithm needs to consider the operations in all stages at the same time.
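One possible way to obtain clock candidates from per-stage shape functions is sketched below in Python. Several things here are assumptions made for illustration only: the shape functions are lists of (clock period, number of states) points, the control unit delay is passed in as an arbitrary callable rather than computed from the equations of Section 4.1, the candidate pool is taken to be the clock periods that appear as shape-function points, and all numeric values are hypothetical.

```python
def add_control_delay(points, t_cu):
    """Shift each (clk, n_states) point by the control unit delay for n_states."""
    return [(clk + t_cu(n), n) for clk, n in points]


def feasible(clk, points, ps_delay):
    """True if some point allows scheduling at period clk within ps_delay."""
    return any(p_clk <= clk and n * clk <= ps_delay for p_clk, n in points)


def clock_candidates(stage_shapes, ps_delay):
    """Clock periods (drawn from the shape-function points) feasible in every stage."""
    pool = sorted({clk for points in stage_shapes for clk, _ in points})
    return [clk for clk in pool
            if all(feasible(clk, points, ps_delay) for points in stage_shapes)]


def t_cu(n_states):
    """Hypothetical linear control unit delay model (ns), for illustration only."""
    return 2.0 * n_states


# Hypothetical datapath-only shape functions for two pipe stages (ns, states).
stage1 = add_control_delay([(136.0, 1), (80.0, 2), (56.0, 3), (56.0, 4), (28.0, 5)], t_cu)
stage2 = add_control_delay([(80.0, 1), (40.0, 2), (28.0, 3)], t_cu)

candidates = clock_candidates([stage1, stage2], ps_delay=200.0)
```

With these made-up inputs, `candidates` is `[38.0, 62.0, 64.0, 84.0, 138.0]`; each of these periods would then go through resource estimation, and the one needing the fewest resources would be returned.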
In order to demonstrate this, we put the DFGs from two pipe stages, DFG_1 and DFG_2, side by side in Figure 6. Note that these pipe stages are executed in parallel but on different input samples.

Figure 6: An example illustrating the resource estimation algorithm

Given the clock period and the number of states, the first step of the algorithm is to compute the time frame of each operation. Let ASAP_i and ALAP_i denote the ASAP and ALAP value of operation o_i respectively; the time frame of o_i is defined as $(ALAP_i - ASAP_i + cycle(o_i))$, where $cycle(o_i)$ represents the number of clock cycles required to finish the operation o_i. Figure 6 shows the time frames of all the operations in the two DFGs.

The next step is to partition the states into a set of disjoint operation distribution intervals such that there are no overlapping time frames between two consecutive intervals. For example, in Figure 6, there is no way of partitioning the five states into intervals such that there are no overlapping time frames of the multiplication operations; therefore, there is only one operation distribution interval for multiplication operations, written as a pair giving the starting state and the ending state of the interval. On the other hand, there are three operation distribution intervals for additions.

After the operation intervals are obtained, the algorithm estimates the required number of components for each interval separately, and the maximum number of required components over all intervals is the minimum number of components needed to perform all the operations.
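The partitioning step can be sketched as a merge of overlapping time frames. The Python below is an illustration under assumptions: states are numbered with integers, a time frame is stored as an inclusive (first state, last state) pair, ALAP is taken to be the latest state in which the operation may start, and the function names and example frames are hypothetical.

```python
def time_frame(asap, alap, cycles):
    """Inclusive span of states an operation may occupy.
    Its length is alap - asap + cycles, matching the definition in the text."""
    return (asap, alap + cycles - 1)


def distribution_intervals(frames):
    """Merge overlapping time frames of one operation type into disjoint
    intervals; returns (first_state, last_state, number_of_operations)."""
    merged = []
    for start, end in sorted(frames):
        if merged and start <= merged[-1][1]:
            # Overlaps the current interval: extend it and count the operation.
            merged[-1][1] = max(merged[-1][1], end)
            merged[-1][2] += 1
        else:
            merged.append([start, end, 1])
    return [tuple(m) for m in merged]


# Hypothetical frames for three operations of the same type.
frames = [time_frame(1, 1, 2), time_frame(2, 3, 1), time_frame(5, 5, 1)]
intervals = distribution_intervals(frames)
```

For these frames the first two operations overlap in state 2 and form one interval covering states 1 to 3, while the third operation gets its own interval in state 5.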
The underlying concept of determining the minimum number of components for one distribution interval is that, if there are n operations that need to be finished within s states, and a component used to perform an operation requires at least c clock cycles to finish the execution before it can be used again to execute another operation, then clearly, the minimum number of components required is equal to $\lceil (n \times c)/s \rceil$. For example, for the multiplication operations shown in Figure 6, the single distribution interval covers s = 5 states; hence, the minimum number of multipliers required is $\lceil (n \times c)/5 \rceil$, that is, at least two multipliers are needed. Similarly, from Figure 6, it can be estimated that at least one adder is required.
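The lower bound is simple enough to state directly in code. In this Python sketch the function names are hypothetical and the operation counts in the usage lines are illustrative values, not the ones from Figure 6.

```python
def min_components(n_ops, cycles_per_op, n_states):
    """Lower bound ceil((n * c) / s) on components for one distribution interval."""
    # Integer ceiling division avoids floating-point rounding.
    return (n_ops * cycles_per_op + n_states - 1) // n_states


def components_needed(intervals):
    """Maximum of the per-interval bounds; intervals are (n, c, s) triples."""
    return max(min_components(n, c, s) for n, c, s in intervals)


# Illustrative values: three two-cycle operations inside a five-state interval.
multipliers = min_components(3, 2, 5)

# Illustrative values: three separate intervals, one single-cycle operation each.
adders = components_needed([(1, 1, 2), (1, 1, 1), (1, 1, 2)])
```

With these inputs `multipliers` is 2 and `adders` is 1. Because the bound ignores data dependencies inside an interval, a real schedule may need more components than it reports.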

5 Experiments

In this section, we present results of three experiments with the clock estimation algorithm, which we have implemented in C on a SUN SPARC 5 station. In the first experiment, we demonstrate the quality of our algorithm by comparing the selected clock against the "best" clock obtainable using force-directed scheduling. The second experiment studies the impact of resource sharing across different pipe stages on the cost of the design, and finally, the third experiment demonstrates the effect of considering control unit delay on the clock selection.

For all experiments we have used the VLSI Technology Inc. VDP370 1.0 micron Datapath Element Library [9] to obtain the area and delays of the functional units. The datapath elements used are shown in Figure 7.

Figure 7: Datapath component library (delay in ns and area for the adder, subtractor and multiplier)

5.1 Experiment 1: Quality of Results

As discussed in Section 2, there are no existing clock selection algorithms for pipelined designs; furthermore, the existing clock selection algorithms do not take control unit delay into account. Thus, in order to demonstrate the quality of our algorithm, we have been unable to compare our results with related research in clock selection; instead, we have utilized force-directed scheduling (FDS), which is a well-known time-constrained scheduling algorithm.

This experiment is conducted on four examples: the AR lattice filter (AR) [4], the linear phase B-spline interpolated filter (BSpline) [6], the elliptical filter (EF) and the HAL benchmark [2] [3]. For each of the examples, we first generated a number of input descriptions by manually pipelining the specification into different numbers of stages, where the delay of the pipe stages in each pipeline is as equal as possible. We then placed different pipe stage delay constraints on each of the pipelined descriptions, and for a given pipe stage delay constraint we obtained the estimated and the "best" clock period. The estimated clock period is obtained by executing our clock-selection algorithm. The best clock period is obtained by executing the force-directed scheduling algorithm for a number of clock periods, each corresponding to a different number of states (from one state to fifteen states).
The clock period that gives the minimal-area design is then the best period.

The results of comparing the best and the estimated clock period for the four examples mentioned above are shown in Figure 8. The last column of the table shows the percentage difference in design area, which is approximated by the sum of the areas of all the components, between the designs obtained by the force-directed scheduling algorithm and by our clock-selection algorithm. As can be seen from the results, the estimated clock period was identical to the one obtained with FDS in most cases; however, in three cases our algorithm estimated a clock period that resulted in the use of either one more multiplier or one more adder than that obtained with FDS.

Figure 8: Comparing the best and the estimated clock period for four benchmarks: AR, BSpline, EF, and HAL (columns: example, number of stages, PSDelay (ns), clock (ns) and resources for FDS and for our algorithm, resource difference (%); A: adder, S: subtractor, M: multiplier)

This discrepancy between the estimated and the best clock period may be explained by considering the fidelity of our resource estimation method, which essentially gives a lower bound on the number of resources. It is important to note that the correct selection of the clock depends more on the fidelity than on the accuracy of the resource estimation. In order to illustrate the role of fidelity of our resource estimates, we compared the results of our resource estimates with the resources obtained by the force-directed scheduling algorithm for all examples and PSDelay constraints shown in Figure 8. Due to space limitations, we give only the results for one AR and one EF pipeline in Figure 9. From the results of the comparison, we conclude that when fidelity is high, our clock selection algorithm selected the best clock, in spite of the fact that the accuracy of the estimation is low in some cases, such as in the AR design. When fidelity is low, our clock selection algorithm selected the wrong clock, even though the accuracy may be high, such as in the EF design.
Figure 9: Comparing our resource estimates against the results of the FDS algorithm for the AR and EF examples (area versus clock period in ns, FDS and ours)

From the results it may appear that the FDS approach is superior to our approach; however, we would like to point out that in the case of the elliptical filter example, whereas it took approximately 1 second to estimate and select the clock period for a given pipe stage delay constraint using our algorithm, it took more than 17 minutes to obtain the best clock period using the FDS algorithm, since it had to be iterated over approximately fifteen different clock periods.

5.2 Experiment 2: Resource Sharing

This experiment is conducted on the same examples that were used in the previous section. For each description and constraint, we compare the minimum number of resources obtained by implementing all the pipe stages individually (that is, by disallowing resource sharing across different pipe stages) to that obtained by implementing all the pipe stages together and thus allowing resource sharing across different pipe stages.

The minimum cost of a design without resource sharing is computed by first obtaining the best clock period and the minimum number of resources required for each pipe stage separately using force-directed scheduling, and then summing up the resources of all the pipe stages. To compute the minimum number of resources required with sharing, we first select a clock period by applying our algorithm to the pipelined descriptions and then generate the minimum number of resources required using force-directed scheduling.

Figure 10: The effects of resource sharing on four benchmarks: AR, BSpline, EF, and HAL (columns: example, number of stages, PSDelay (ns), clock (ns) and resources without sharing, clock (ns) and resources with sharing, improvement (%); A: adder, S: subtractor, M: multiplier)

The results on the four examples are shown in Figure 10. Note that when each stage is implemented separately, in some cases more than one clock signal is used, because different pipe stages can use different clock signals. In all the cases, the results indicate that resource sharing within and across different pipe stages reduces the design area. This shows a substantial reduction in area when resource sharing across different pipe stages is allowed, and it also indicates the effectiveness of our algorithm.

5.3 Experiment 3: Control Unit Delay

This experiment is conducted for the AR filter and the elliptical filter benchmarks. Figure 11 shows the result of the elliptical filter benchmark. There are two shape functions of clock periods versus total delay.
The shape function drawn as a solid line is obtained by our shape function generation algorithm with the control unit delay estimation, while the shape function drawn as a dashed line is generated by our algorithm assuming the control unit delay is zero. From the results, we observe that the difference between delays obtained with and without considering the control unit delay becomes larger when the clock period becomes smaller. Note that the difference can be as large as 7 ns. Therefore, we conclude that the control unit delay contributes significantly to the clock period, and neglecting the control unit delay may result in a bad choice of the clock period. The same conclusion can be reached for the AR filter benchmark [5].

Figure 11: The clock period vs. delay shape functions of the EF example, generated with and without the control unit delay estimation

6 Conclusions and Future Work

In summary, we have presented a clock selection algorithm that, given a pipelined behavioral description and a throughput constraint, selects the clock period leading to the minimal-area design. We tested our clock-selection algorithm on several examples and the results show that, in most cases, our algorithm selects a clock period that uses minimal-area resources within less than one second.

We plan to extend our model to incorporate wire delays. Currently, we are working on a clock selection algorithm that allows multiple clock signals.

7 Acknowledgements

This work was supported by the Semiconductor Research Corporation (grant #94-DJ-146), and by the National Science Foundation (grant CDA-9495). We gratefully acknowledge their support.

8 References

[1] S. Chaudhuri, S. A. Blythe, and R. A. Walker, "An Exact Methodology for Scheduling in a 3D Design Space," in Proc. 8th ISSS, 1995.
[2] N. D. Dutt and C. Ramachandran, "Benchmarks for the 1992 High-Level Synthesis Workshop," TR#92-107, Dept. of ICS, UCI, 1992.
[3] D. D. Gajski, N. Dutt, A. Wu, and S. Lin, High-Level Synthesis: Introduction to Chip and System Design, Kluwer Academic Publishers, 1992.
[4] R. Jain, A. C. Parker, and N. Park, "Module Selection for Pipelined Synthesis," in Proc. 25th DAC, 1988.
[5] H.-P. Juan, D. D. Gajski, and S. Bakshi, "Clock Optimization for High-Performance Pipelined Design," TR#96-1, Dept. of ICS, UCI, 1995.
[6] S. Narayan and D. D. Gajski, "System Clock Estimation based on Clock Slack Minimization," in Proc. EuroDAC, 1992.
[7] N. Park and A. C. Parker, "Synthesis of Optimal Clocking Schemes," in Proc. 22nd DAC, 1985.
[8] A. C. Parker, T. Pizzaro, and M. Mlinar, "MAHA: A Program for Datapath Synthesis," in Proc. 23rd DAC, 1986.
[9] VLSI Technology Inc., VDP370 1.0 Micron CMOS Datapath Cell Library, 1991.