"Design Timing Closure" PPT courseware


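FPGA Design Timing Closure
Wang Wei
2007 Xilinx Joint Laboratory Directors' Meeting, 2023/8/1

Outline
- Timing constraint concepts
- Timing closure flow
- Timing closure flow: coding style
- Timing closure flow: synthesis techniques
- Timing closure flow: pin constraints
- Timing closure flow: timing constraints
- Timing closure flow: static timing analysis
- Timing closure flow: implementation techniques
- Timing closure flow: FloorPlanner and PACE

Basic roles of constraints
- Raise the operating frequency of the design. Constraints steer synthesis, mapping, placement, and routing so that logic and routing delays shrink, which raises the achievable operating frequency.
- Obtain correct timing analysis reports. FPGA design platforms include static timing analysis tools that produce timing reports after mapping or after place and route, so the performance of the design can be evaluated. The static timing analysis tool uses the constraints as the criterion for judging whether timing meets the design requirements.
- Specify FPGA pin locations and electrical standards. Because the FPGA is programmable, board design and fabrication can proceed in parallel with FPGA design instead of waiting until the FPGA pin locations are completely settled, which saves system development time. Constraints can also specify the interface standard and other electrical characteristics of each I/O pin.

Period constraint
A PERIOD constraint covers the paths between synchronous elements clocked by the referenced net, including flip-flops, latches, and synchronous RAM.
The period constraint does not optimize the following paths:
- Paths from input pins to output pins (purely combinational logic)
- Paths from input pins to synchronous elements
- Paths from synchronous elements to output pins
[Figure: paths covered by the period constraint]

The period constraint is a basic timing and synthesis constraint attached to the clock net. The timing analysis tool uses it to check whether the delay of every path connected to a synchronous timing port (a port with setup and hold requirements) meets the requirement. Pad-to-register paths are not included.
The period is the simplest and most important timing concept. Many other timing concepts differ slightly between tool vendors, but the period is universal and is the foundation of FPGA/ASIC timing definitions. The other timing constraints covered later are all built on the period constraint, and many other timing formulas can be derived from the period formula.
Before applying a period constraint, estimate the clock period the circuit can actually achieve rather than constraining blindly. If the constraint is too loose, the required performance is not reached. If it is too tight, place-and-route time grows substantially and the result can even get worse.

Calculating the period constraint
The highest operating frequency the internal logic can reach depends on the setup and hold times of the synchronous elements themselves and on the logic and routing delays between them. The minimum clock period is:
    Tperiod = Tcko + Tlogic + Tnet + Tsetup - Tclk_skew
    Tclk_skew = Tcd1 - Tcd2
where Tcko is the clock-to-output time, Tlogic is the combinational logic delay between synchronous elements, Tnet is the net delay, Tsetup is the setup time of the synchronous element, and Tclk_skew is the clock skew.

An example of a period constraint:
    NET SYS_CLK PERIOD=10ns HIGH 4ns
This constraint is applied to all synchronous elements driven by SYS_CLK.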
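
The period formula above can be sketched in Python as a quick sanity check before choosing a PERIOD value. This is only an illustration: the function names and the delay values are hypothetical and do not come from any Xilinx tool or from the slides.

```python
def min_clock_period_ns(t_cko, t_logic, t_net, t_setup, t_clk_skew=0.0):
    """Tperiod = Tcko + Tlogic + Tnet + Tsetup - Tclk_skew (all values in ns)."""
    return t_cko + t_logic + t_net + t_setup - t_clk_skew


def max_frequency_mhz(period_ns):
    """Convert a clock period in ns to a frequency in MHz."""
    return 1000.0 / period_ns


# Hypothetical delays, chosen only to exercise the formula.
period = min_clock_period_ns(t_cko=1.0, t_logic=5.0, t_net=3.0, t_setup=1.0)
print(f"minimum period: {period} ns, maximum frequency: {max_frequency_mhz(period)} MHz")
```

With these assumed delays the path needs a full 10 ns, so a PERIOD constraint tighter than 10 ns would not be achievable without reducing Tlogic or Tnet.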

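The PERIOD constraint automatically handles inverted register clocks: if adjacent synchronous elements are clocked on opposite phases, the delay between them is limited by default to half of the PERIOD value.
[Figure: example of the period constraint with inverted clocks]

Offset constraint
An offset constraint is a constraint between data and clock. It specifies the timing relationship between the external clock and the data input or output pins. It applies only to signals connected to pads and cannot be used on internal signals.
[Figure: offset constraint diagram]

Offset constraints optimize the following delay paths:
- From input pins to synchronous elements: offset in (OFFSET IN)
- From synchronous elements to output pins: offset out (OFFSET OUT)
To sample data reliably and exchange data correctly with the downstream chip, the timing relationship between the external clock and the data input and output pins must be constrained.
What an offset constraint says: it tells the synthesis and routing tools when input data arrives, or when output data must be stable, so that the timing relationship with the next-stage circuit is guaranteed.

OFFSET_IN_BEFORE states how long before the active clock edge the input data is ready. The combinational delay inside the chip from the input pin therefore must not exceed this time (an upper limit, a maximum), otherwise sampling errors occur.
OFFSET_IN_AFTER states how long after the active clock edge the input data arrives at the chip's input pin. The upper limit of the internal delay can be derived from it as well.

Calculating the input arrival time
[Figure: timing description]
OFFSET_IN_AFTER means that the input data arrives at time Tarrival after the active clock edge:
    Tarrival = Tcko + Toutput + Tlogic
The synthesis and implementation tools try to make the input delay Tinput satisfy:
    Tarrival + Tinput + Tsetup < Tperiod
where Tinput is the sum of the combinational logic, net, and pad delays at the input, Tsetup is the setup time of the input synchronous element, and Tcko is the clock-to-output time of the synchronous element.

Example: assume Tperiod = 20 ns, Tcko = 1 ns, Toutput = 3 ns, and Tlogic = 8 ns. Give the offset constraint.
    Tarrival = Tcko + Toutput + Tlogic = 12 ns
The offset constraint using OFFSET_IN_AFTER is:
    NET DATA_IN OFFSET=IN 12ns AFTER CLK
OFFSET_IN_BEFORE can be used instead; the two are equivalent:
    NET DATA_IN OFFSET=IN 8ns BEFORE CLK

OFFSET_OUT_BEFORE states how long before the active clock edge the input data of the next-stage chip must be ready. From the input delay of the next stage you can work out when the output data of the current design must be stable, and then constrain the output logic and routing accordingly, so that the setup requirement of the next stage is met and it samples stable data.
OFFSET_OUT_AFTER specifies how long after the active clock edge (an upper limit, a maximum) the output data becomes stable. The output delay inside the chip must be less than this value.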
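
The input-offset example above (Tperiod = 20 ns) can be reproduced with a small Python sketch. The helper names are hypothetical, and the generated strings simply mirror the constraint lines shown in the slides; nothing here is produced or checked by a Xilinx tool.

```python
def input_arrival_ns(t_cko, t_output, t_logic):
    """Tarrival = Tcko + Toutput + Tlogic, measured after the active clock edge."""
    return t_cko + t_output + t_logic


def offset_in_constraints(net, clock, t_period, t_arrival):
    """Return the equivalent AFTER and BEFORE forms of an OFFSET IN constraint."""
    t_before = t_period - t_arrival  # data is ready this long before the next edge
    after = f"NET {net} OFFSET=IN {t_arrival}ns AFTER {clock}"
    before = f"NET {net} OFFSET=IN {t_before}ns BEFORE {clock}"
    return after, before


arrival = input_arrival_ns(t_cko=1, t_output=3, t_logic=8)
for line in offset_in_constraints("DATA_IN", "CLK", t_period=20, t_arrival=arrival):
    print(line)
```

The two forms describe the same requirement because the BEFORE value is just the period minus the AFTER value.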

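Calculating the required output stable time
Definition:
    Tstable = Tlogic + Tinput + Tsetup
As long as the data at the output of the current design is stable Tstable before the rising clock edge, the next stage can sample it correctly. The implementation tools try to make the output delay satisfy:
    Tcko + Toutput + Tstable < Tperiod
This is the basic timing relationship that Tstable must satisfy: it states how stable the output of the current stage must be so that the downstream chip samples reliably.

Example: the clock period is 20 ns, the input logic delay of the next stage Tinput is 4 ns, its setup time Tsetup is 1 ns, and the delay of the intermediate logic Tlogic is 8 ns. Give the output offset constraint of the design.
Answer: the OFFSET_OUT_BEFORE constraint is:
    NET DATA_OUT OFFSET=OUT 13ns BEFORE CLK
The OFFSET_OUT_AFTER constraint is:
    NET DATA_OUT OFFSET=OUT 7ns AFTER CLK

Exercise: Given the system diagram below, what values would you put in the Constraints Editor so that the system will run at 100 MHz? (Assume no clock skew between devices.)
[Figure: system diagram]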
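
The output-offset worked example (20 ns clock, Tinput = 4 ns, Tsetup = 1 ns, Tlogic = 8 ns) follows the same pattern as the input case and can be checked with a short Python sketch. The function names are hypothetical; only the formulas and numbers come from the slides.

```python
def required_stable_ns(t_logic, t_input, t_setup):
    """Tstable = Tlogic + Tinput + Tsetup: how early the output must settle."""
    return t_logic + t_input + t_setup


def offset_out_after_ns(t_period, t_stable):
    """Latest time after the clock edge at which the output may still change."""
    return t_period - t_stable


t_stable = required_stable_ns(t_logic=8, t_input=4, t_setup=1)
print(f"NET DATA_OUT OFFSET=OUT {t_stable}ns BEFORE CLK")
print(f"NET DATA_OUT OFFSET=OUT {offset_out_after_ns(20, t_stable)}ns AFTER CLK")
```

The AFTER value is also the budget for Tcko + Toutput inside the current design, since Tcko + Toutput + Tstable must stay below Tperiod.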

Path-Specific Timing Constraints
- Using global timing constraints (PERIOD, OFFSET, and PAD-TO-PAD) will constrain your entire design.
- Using only global constraints often leads to over-constrained designs:
  - Constraints are too tight
  - Compile time increases, and timing objectives can be prevented from being met
  - Review the performance estimates provided by your synthesis tool or the Post-Map Static Timing Report
- Path-specific constraints override the global constraints on specified paths. This allows you to loosen the timing requirements on specific paths.

Areas of your design that can benefit from path-specific constraints:
- Multi-cycle paths
- Paths that cross between clock domains
- Bidirectional buses
- I/O timing
Path-specific timing constraints should be used to define your performance objectives and should not be placed indiscriminately.

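Path-Specific Timing Constraints: prescaled counter
Suppose you need a 32-bit high-speed counter. The speed of a counter is determined by the carry delay from the least significant bit to the most significant bit, so a prescaled counter structure is used to raise the speed: the counter is split into a small counter and a large counter, as shown in the figure.
[Figure: prescaled counter]
The small counter is 2 bits wide and the large counter is 30 bits wide; both are driven by the same clock. The enable input EN of the large counter is driven by the carry of the small counter. The small counter carries once every 4 CLK cycles and holds EN active for one CLK cycle, and the large counter increments on the active clock edge that arrives during that cycle.
The registers of the small counter may toggle on every CLK, so data from a lower-order register must reach the input of a higher-order register within 1 CLK: the maximum register-to-register delay is 1 CLK. The registers inside the large counter can toggle only once every 4 clock cycles, so data from a lower-order register only needs to reach the input of a higher-order register within 4 CLK: the maximum register-to-register delay is 4 CLK. This relaxes the timing requirement of the counter and makes a large high-speed counter feasible.
[Figure: constraint file]

Path and pin OFFSET timing constraints
- Use the Pad to Setup and Clock to Pad columns to specify OFFSETs for all I/O paths on each clock domain.
  - This is the easiest way to constrain most I/O paths
  - However, it can lead to an over-constrained design
- Use the Pad to Setup and Clock to Pad columns to specify OFFSETs for each I/O pin.
  - Use this type of constraint when only a few I/O pins need different timing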
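
The reasoning behind the prescaled counter described above can be checked with a small behavioral model in Python. This is a sketch under stated assumptions, not the design from the slides: the function name is hypothetical, and the model only demonstrates that the 30-bit part changes at most once every 4 clocks, which is why its internal paths can be given a 4-cycle budget.

```python
def simulate_prescaled_counter(cycles):
    """Model a 2-bit small counter whose carry enables a 30-bit large counter."""
    small = 0  # 2-bit prescaler, may change on every clock
    big = 0    # 30-bit counter, increments only when EN is active
    big_update_cycles = []
    for cycle in range(1, cycles + 1):
        en = small == 3                  # carry: active for exactly one clock
        small = (small + 1) & 0x3
        if en:
            big = (big + 1) & 0x3FFFFFFF
            big_update_cycles.append(cycle)
    return (big << 2) | small, big_update_cycles


value, updates = simulate_prescaled_counter(10)
print(value, updates)
```

The concatenation of the two counters counts like an ordinary 32-bit counter, while the recorded update cycles of the large counter are always 4 clocks apart.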

False Path Constraints
- If a PERIOD constraint were placed on this design, what delay paths would be constrained?
- If the goal is to optimize the input and output times without constraining the paths between registers, what constraints are needed? Assume that a global PERIOD constraint is already defined.

Timing Constraint Priority (highest to lowest)
- False paths: must be allowed to override any timing constraint
- FROM THRU TO
- FROM TO
- Pin-specific OFFSETs
- Group OFFSETs: groups of pads or registers
- Global PERIOD and OFFSETs: lowest-priority constraints
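
The priority order above can be pictured as a lookup: when several constraints cover the same path, the one highest in the list wins. The Python sketch below illustrates only that ordering; the string labels and the function are hypothetical and are not how the Xilinx tools represent constraints internally.

```python
# Highest priority first, following the list above.
PRIORITY = [
    "FALSE_PATH",
    "FROM_THRU_TO",
    "FROM_TO",
    "PIN_OFFSET",
    "GROUP_OFFSET",
    "GLOBAL",
]


def winning_constraint(kinds_covering_path):
    """Return the constraint kind that takes effect on a path."""
    return min(kinds_covering_path, key=PRIORITY.index)


print(winning_constraint(["GLOBAL", "FROM_TO"]))
print(winning_constraint(["GLOBAL", "FROM_TO", "FALSE_PATH"]))
```

A path covered by both a global PERIOD and a FROM TO constraint is therefore analyzed against the FROM TO value, and a false-path declaration removes it from analysis regardless of what else covers it.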

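Timing closure flow
Once the design is finished, how do you judge whether it is successful?
- Does the design meet the area requirement: can it be implemented in the chosen device?
- Does the design meet the performance requirement: can it reach the required operating frequency?
- Do the pin definitions meet the requirements: signal names, locations, I/O standards, data-flow direction, and so on?

How do you judge whether the design fits the chosen device?
- Does the chosen device have enough resources to hold more logic? If so, how much?
- If the design fits the chosen device, can it be routed completely?
- Method: check the Map Report or the Place & Route Report.

Project Navigator produces two kinds of timing report:
- Post-Map Static Timing Report
- Post-Place & Route Static Timing Report
The timing reports contain detailed descriptions of the paths that fail timing, which are used to analyze why the timing requirements were not met. Timing Analyzer is used to create and read timing reports.

Basis for reasonable performance constraints
- The Post-Map Static Timing Report contains the actual logic delays (block delays) and 0.1 ns net delays.
- Guideline for reasonable timing constraints: the 60/40 rule
  - If less than 60 percent of the timing budget is used for logic delays, the Place & Route tools should be able to meet the constraint easily.
  - Between 60 and 80 percent, the software run time will increase.
  - Above 80 percent, the tools may have trouble meeting your goals.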
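
The 60/40 rule above amounts to a threshold check on the logic-delay share of the timing budget. The following Python sketch applies it to a post-map logic delay; the function name, the category strings, and the sample numbers are hypothetical, while the 60 and 80 percent thresholds are the ones quoted above.

```python
def classify_logic_share(logic_delay_ns, period_ns):
    """Apply the 60/40 rule to the logic-delay share of the period budget."""
    share = logic_delay_ns / period_ns
    if share < 0.60:
        return "should be met easily"
    if share <= 0.80:
        return "expect longer run time"
    return "tools may have trouble"


# Hypothetical post-map logic delays against a 10 ns PERIOD constraint.
for logic_ns in (5.0, 7.0, 9.0):
    print(logic_ns, classify_logic_share(logic_ns, period_ns=10.0))
```

Running this check on the post-map report, before place and route, is a cheap way to decide whether a constraint is realistic or whether the logic on the path has to be reduced first.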

Three steps to a performance breakthrough
1. Make full use of the embedded (dedicated) resources
   - DSP48, PowerPC processor, EMAC, MGT, FIFO, block RAM, ISERDES, OSERDES, and so on
2. Aim for good coding style
   - Use synchronous design methodology
   - Ensure the code is written optimally for critical paths
   - Pipeline (Xilinx FPGAs have abundant registers)
3. Make full use of the synthesis and Place & Route tool options
   - Try different optimization techniques
   - Add critical timing constraints in synthesis
   - Preserve hierarchy
   - Apply full and correct constraints
   - Use High effort

[Figure: Use embedded blocks]

Simple Coding Steps Yield 3x Performance
- Use pipeline stages: more bandwidth
- Use synchronous reset: better system control
- Use Finite State Machine optimizations
- Use inferable resources
  - Multiplexer
  - Shift Register LUT (SRL)
  - Block RAM, LUT RAM
  - Cascade DSP
- Avoid high-level constructs (loops, for example) in code
  - Many synthesis tools produce slow implementations

Synthesis guidelines
- Use timing constraints
  - Define tight but realistic individual clock constraints
  - Put unrelated clocks into different clock groups
- Use proper options and attributes
  - Turn off resource sharing
  - Move flip-flops from IOBs closer to logic
  - Turn on FSM optimization
  - Use the retiming option

[Figure: Impact of Constraints]

Place & Route Guidelines
- Timing constraints
  - Use tight, realistic constraints
- Recommended options
  - High-effort Place & Route (by default, effort is set to Standard)
  - Timing-driven MAP
  - Multi-Pass Place & Route (MPPR)
- Tools to help meet timing
  - Floorplanning (use the PACE and PlanAhead software tools)
  - Physical synthesis tools
- Other available options: incremental design, modular design flows

[Figure: Impact of Constraints in Tools]

Coding style
- Use synchronous design techniques
- Use Xilinx-specific code
- Use cores provided by Xilinx
- Use hierarchical design
- Use the static timing analysis reports produced by ISE to find the timing-critical paths and optimize them

Synthesis techniques
- Using the options offered by the synthesis tool, constraint-driven techniques in particular, can optimize the design netlist and improve system performance.
- When critical paths are specified for the synthesis tool, it can raise its effort level and apply more thorough algorithms to reduce the delay of those paths.

Synthesis tools offer many optimization options for reaching the desired system performance and area. See the F1 help or the XST User Guide.
- Register Duplication
- Timing-Driven Synthesis
- Timing Constraint Editor
- FSM Extraction
- Retiming
- Hierarchy Management
- Schematic Viewer
- Error Navigation
- Cross-Probing
- Physical Optimization

Duplicating Flip-Flops
- High-fanout nets can be slow and hard to route
- Duplicating flip-flops can fix both problems
  - Reduced fanout shortens net delays
  - Each flip-flop can fan out to a different physical region of the chip to reduce routing congestion
- Design trade-offs
  - Gain routability and performance
  - Increase design area
  - Increase fanout of other nets
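
The trade-off listed above can be put into numbers with a small Python sketch: splitting the loads of one register across several copies lowers the fanout of each copy, but every net that feeds the original register now has to drive all copies. The function names and the sample figures are hypothetical and assume the loads are divided as evenly as possible.

```python
import math


def fanout_per_copy(loads, copies):
    """Worst-case fanout of each register copy when loads are split evenly."""
    return math.ceil(loads / copies)


def extra_loads_on_input_net(copies):
    """Additional loads seen by each net feeding the duplicated register."""
    return copies - 1


# Hypothetical case: one flip-flop driving 100 loads, duplicated 4 times.
print(fanout_per_copy(100, 4), extra_loads_on_input_net(4))
```

This is the arithmetic behind "increase fanout of other nets": duplication moves load from the register output onto whatever drives the register inputs, so it helps most when those input nets have slack to spare.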

Timing-Driven Synthesis
Synplify, Precision, and XST software
- Timing-driven synthesis uses performance objectives to drive the optimization of the design
  - Based on your performance objectives, the tools will try several algorithms to attempt to meet performance while keeping the amount of resources in mind
- Performance objectives are provided to the synthesis tool via timing constraints

- Apply period constraints and input/output constraints (.xcf file).
- Typically, over-constrain by 1.5x to 2x relative to the desired performance target. The synthesis tool then raises its effort level, which makes the timing target easier to meet during implementation.
- Remember: if you over-constrain, do not pass these constraints on to the implementation tools.
- Use multi-cycle and false path constraints.
- Use critical path constraints to optimize the critical paths.
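
The over-constraining advice above can be turned into numbers as follows. The sketch assumes that "1.5x to 2x" refers to the target frequency, so the synthesis period is the real period divided by the factor; that reading, the function name, and the 10 ns example are assumptions, not statements from the slides.

```python
def synthesis_period_ns(target_period_ns, overconstraint_factor):
    """Tightened period for synthesis only, assuming the factor scales frequency."""
    return target_period_ns / overconstraint_factor


# Hypothetical 100 MHz target (10 ns) over-constrained for synthesis only.
for factor in (1.5, 2.0):
    print(factor, round(synthesis_period_ns(10.0, factor), 2))
```

The implementation constraint file would still carry the real 10 ns value, in line with the reminder not to pass over-constraints on to the implementation tools.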

Retiming
Synplify, Precision, and XST software
- Retiming: the synthesis tool automatically tries to move register stages to balance the combinatorial delay on each side of the registers
[Figure: Before Retiming / After Retiming]

Hierarchy Management
Synplify, Precision, and XST software
- The basic settings are:
  - Flatten the design: allows total combinatorial optimization across all boundaries
  - Maintain hierarchy: preserves hierarchy without allowing optimization of combinatorial logic across boundaries
- If you have followed the synchronous design guidelines, use the "maintain hierarchy" setting
- If you have not followed the synchronous design guidelines, use the "flatten the design" setting
- Your synthesis tool may have additional settings
  - Refer to your synthesis documentation for details on these settings

Hierarchy Preservation Benefits
- Easily locate problems in the code based on the hierarchical instance names contained within static timing analysis reports
- Enables floorplanning and incremental design flow
- The primary advantage of flattening is to optimize combinatorial logic across hierarchical boundaries
  - If the outputs of leaf-level blocks are registered, there is no need to flatten

Pin constraints
- Pin constraints are usually fixed early in the design so that board design can proceed in parallel.
- For high-speed designs, complex designs, and designs with many I/O pins, Xilinx recommends assigning pins manually.
  - The implementation tools can place logic and pins automatically, but the result is generally not optimal.
  - Pin constraints can guide the internal data flow, and a poor pin layout easily degrades system performance.
- A good pin layout requires detailed knowledge of the system being designed and of the Xilinx device architecture, for example I/O banks and I/O electrical standards.
- Clocks (single-ended or differential) must be constrained to dedicated clock pins. Note that the number of clock resources is limited.
- Use dual-purpose pins (such as configuration and DCI pins) last.

Guiding pin constraints by data flow
- Place the I/Os for control signals at the top or bottom of the device, so that control signals run vertically.
- Place the I/Os for data buses on the left and right sides of the device, so that data flows horizontally.
This layout takes full advantage of the way resources are arranged in Xilinx devices:
- Orientation of the carry chains
- Locations of block RAMs and multipliers

[Figure: pin constraints with PACE]

Timing constraints
- If the performance target is met after implementation, the design is done.
- Otherwise, apply path-specific timing constraints:
  - Apply multi-cycle, false path, and critical path constraints
  - The implementation tools give these path-specific constraints priority

Static timing analysis
- Post-map: after Map, use the post-map timing report to determine the logic delay of the critical paths.
- Post-PAR: after PAR, use the post-PAR static timing report to determine whether the timing constraints are met.
- Logic delay vs. routing delay: the 60%/40% rule.
- Timing Analyzer can read the timing reports, locate critical paths, and work together with the Floorplanner to resolve timing problems.

[Figure: Report Example]

Analyzing Post-Place & Route Timing
- There are many factors that contribute to timing errors, including:
  - Neglecting synchronous design rules or using incorrect HDL coding style
  - Poor synthesis results (too many logic levels in the path)
  - Inaccurate or incomplete timing constraints
  - Poor logic mapping or placement
- Each root cause has a different solution:
  - Rewrite HDL code
  - Add timing constraints
  - Resynthesize or re-implement with different software options
- Correct interpretation of timing reports can reveal the most likely cause, and therefore the most likely solution

[Figure: Case 1]

Poor Placement: Solutions
- Increase the Placement effort level (or the Overall effort level)
- Timing-driven packing, if the poor placement is caused by packing unrelated logic together
  - Cross-probe to the Floorplanner to see what has been packed together
  - This option is covered in the "Advanced Implementation Options" module
- PAR extra effort or MPPR options
  - Covered in the "Advanced Implementation Options" module
- Floorplanning or Relative Location Constraints (RLOCs), if you have the skill

[Figure: Case 2]

High Fanout: Solutions
- The most likely solution is to duplicate the source of the high-fanout net
  - If the net is the output of a flip-flop, the solution is to duplicate the flip-flop
    - Use manual duplication (recommended) or synthesis options
  - If the net is driven by combinatorial logic, locating the source of the net in the HDL code may be more difficult
    - Use synthesis options to duplicate the source

[Figure: Case 3]

Too Many Logic Levels: Solutions
- The implementation tools cannot do much to improve performance
- The netlist must be altered to reduce the amount of logic between flip-flops
- Possible solutions:
  - Check whether the path is a multicycle path. If yes, add a multicycle path constraint
  - Use the retiming option during synthesis to distribute logic more evenly between flip-flops
  - Confirm that good coding techniques were used to build this logic (no nested if or case statements)
  - Add a pipeline stage

Implementation techniques
PAR options: Effort Level
- A higher Effort Level can improve timing performance without any other measures (such as applying more advanced timing constraints, using advanced tools, or changing the code).
- Xilinx recommendation: for the first implementation pass, use global timing constraints and the default implementation options. If timing is not met:
  - Try modifying the code, for example by using an appropriate coding style or adding pipeline stages
  - Change the synthesis options, such as Optimization Effort, Use Synthesis Constraints File, Keep Hierarchy, Register Duplication, and Register Balancing
  - Increase the PAR Effort Level
  - Apply path-specific timing constraints for synthesis and implementation

As with PAR, the Map timing options can be used to constrain critical paths. For example, the "Timing-Driven Packing and Placement" option gives critical paths priority when timing constraints are applied. User constraints are passed into the design from the User Constraints File (UCF) by the Translate step.

Timing-Driven Packing
- Timing constraints are used to optimize which pieces of logic are packed into each slice
  - Normal (standard) packing is performed
  - PAR is run through the placement phase
  - Timing analysis analyzes the amount of slack in constrained paths
  - If necessary, packing changes are made to allow better placement
- The output of MAP contains both mapping and pla
