Embedded System Architecture and Software Design - using ARM
Day #3, #4, #5 Modules Outline

Course Introduction
  Day #3
    Simple RISC Assembly Language
    ARM Assembly Language
    ARM Developer Suite hands-on practice
  Day #4
    ARM Instruction Set
    Important ASM Programming Skills
    ARM/THUMB/C Interworking
  Day #5
    ARM Exception Handler
    Build ARM ROM Image
    Use NET-Start! uClinux BSP

Overview of the Embedded System Product Design Flow
  ARM System-on-Chip Architecture, 2nd ed.
  ARM Architecture Reference Manual, 2nd ed.
  ARM Developer Suite - Getting Started
  ARM Developer Suite - Developer Guide
  ARM Developer Suite - Assembler Guide
  http://www.uclinux.org/
  2002 Embedded System Development Experience
  Building Powerful Platform with Windows CE
  Software Engineering: A Practitioner's Approach, 3rd ed.
  Professional Symbian Programming

Embedded System Architecture and Software Design - using ARM
Module #3-1: Simple RISC Assembly Concept

RISC (Reduced Instruction Set) vs. CISC (Complex Instruction Set)
  RISC
    Hardware instruction decode logic
    Pipelined execution
    Single-cycle execution
  CISC
    Large microcode ROMs to decode instructions
    Allows little pipelining
    Many cycles to complete a single instruction
  RISC advantages: a smaller die size, a shorter development time, higher performance
  RISC drawback: poor code density

MU0: A Simple Processor

MU0 Instruction Set and Datapath

Instruction Format

Instruction Execution Example
    ADD 0x16A    ; ACC := ACC + mem[0x16A]

Computation Example
  C function:
    main() { C = A + B; }
  MU0 machine instructions:
    LDA 0x100
    ADD 0x104
    STO 0x108

Exercise: MU0 Microprocessor Operation
    0x000  LDA 0x100
    0x002  SUB 0x104
    0x004  STO 0x100
    0x006  JNE 0x000
    0x008  STP
  Describe what this program does, how the register values change, and the data flow.
  Then write the same program in C.

Embedded System Architecture and Software Design - using ARM
Module #3-2: ARM Assembly Language

ARM7TDMI Data Flow
  e.g. r3 := r4 + (r4 << 2)
    ADD r3, r4, r4, LSL #2
  Operand buses: A bus, B bus

ARM Registers
  30 general-purpose 32-bit registers
  1 Program Counter (PC)
  1 Current Program Status Register (CPSR)
  5 Saved Program Status Registers (SPSR)
  Modes: User, FIQ, IRQ, SVC, Abort, Undefined

Program Status Register
  CPSR: Current Program Status Register
  SPSR: Saved Program Status Register
  Condition code flags
    N: Negative result from ALU
    Z: Zero result from ALU
    C: ALU operation carried out
    V: ALU operation overflowed
  Interrupt disable bits
    I: disables the IRQ
    F: disables the FIQ
  T bit (architecture xT only)
    T = 0: ARM state
    T = 1: Thumb state
  Q: sticky overflow flag (architecture 5TE only), used by QADD, QSUB, ...
  J: processor in Jazelle state (architecture 5TEJ only)
  Mode bits: specify the processor mode
    10000  User
    10001  FIQ
    10010  IRQ
    10011  SVC
    10111  Abort
    11011  Undef
    11111  System

    Bit    31  30  29  28  27  24  (others)   7  6  5  4:0
    Field  N   Z   C   V   Q   J   undefined  I  F  T  mode

Program Counter R15
  In ARM state: all ARM instructions are four bytes long (one 32-bit word) and are always aligned on a word boundary. The PC value is stored in bits [31:2], with bits [1:0] undefined.
  In Thumb state: all instructions are 16 bits wide and halfword aligned. The PC value is stored in bits [31:1], with bit 0 undefined.
  In Jazelle state: all instructions are 8 bits wide. The processor performs a word access to read 4 instructions at once.

Link Register R14
  Register 14 is the Link Register (LR). This
  register holds the address of the next instruction after a Branch and Link (BL) instruction, which is the instruction used to make a subroutine call. At all other times, R14 can be used as a general-purpose register.

Other Registers R0-R13
  The remaining 14 registers have no special hardware purpose. Their uses are defined purely by software. By convention, ARM assembly language uses R13 as the Stack Pointer. C and C++ compilers always use R13 as the Stack Pointer (SP).

Structure of an ARM Assembly Language Module
  AREA sectionname{,attr}{,attr}...
    Starts a new code or data section.
    CODE: the section contains machine instructions.
    READONLY: the section should not be written to.
    Other attributes: DATA, NOINIT, READWRITE
  ENTRY: declares an entry point to the program.
  Labels.
  END: declares the end of the source file.

Calling Subroutines Using BL
  BL destination
    destination is the label on the first instruction of the subroutine.
  BL does the following:
    places the return address in the link register (R14)
    sets the PC to the address of the subroutine
  In the subroutine, "MOV pc, lr" returns to the caller.
  By convention, R0-R3 are used to pass parameters.

Calling Subroutines Example
            AREA  subrout, CODE, READONLY    ; Name this block of code
            ENTRY                            ; Mark first instruction to execute
    start   MOV   r0, #10                    ; Set up parameters
            MOV   r1, #3
            BL    doadd                      ; Call subroutine
    stop    MOV   r0, #0x18                  ; angel_SWIreason_ReportException
            LDR   r1, =0x20026               ; ADP_Stopped_ApplicationExit
            SWI   0x123456                   ; ARM semihosting SWI
    doadd   ADD   r0, r0, r1                 ; Subroutine code
            MOV   pc, lr                     ; Return from subroutine
            END                              ; Mark end of file

Constant Data Types
  Numbers
    Numeric constants are accepted in three forms:
      Decimal, for example, 123
      Hexadecimal, for example, 0x7B
      n_XXX, where n is a base between 2 and 9 and XXX is a number
      in that base.
  Boolean
    The Boolean constants TRUE and FALSE must be written as {TRUE} and {FALSE}.
  Characters
    Character constants consist of opening and closing single quotes, 'X', enclosing either a single character or an escaped character, using the standard C escape characters.
  Strings
    Strings consist of opening and closing double quotes, "XXXX". If double quotes or dollar signs are used within a string as literal text characters, they must be represented by a pair of the appropriate character. For example, you must use $$ if you require a single $ in the string. The standard C escape sequences can be used within string constants.

  Almost all ARM instructions can be conditionally executed.
  e.g.
    ADDS  r0, r1, r2
    ADDEQ r0, r1, r2
  An instruction executes if the N, Z, C and V flags in the CPSR satisfy the condition specified in the instruction; otherwise it acts as a NOP.

Conditional ARM Instructions
  Almost every ARM instruction can be executed conditionally on the state of the ALU status flags in the CPSR.
  Add an S suffix to an ARM data processing instruction to make it update the ALU status flags in the CPSR.
  E.g.
    ADDS r0, r1, r2    ; r0 = r1 + r2 and update ALU status in CPSR
  In ARM state, you can:
    update the ALU status flags in the CPSR on the result of a data operation
    execute several other data operations
      without updating the flags
    execute following instructions or not, according to the state of the flags updated in the first operation.
  In Thumb state, most data operations always update the flags, and conditional execution can only be achieved using the conditional branch instruction (B).
  Do not use the S suffix with CMP, CMN, TST, or TEQ. These comparison instructions always update the flags.

Conditional Execution

ALU Status Flags in the CPSR
  N  Set when the result of the operation was Negative.
  Z  Set when the result of the operation was Zero.
  C  Set when the operation resulted in a Carry. A carry occurs if the result of an addition is greater than or equal to 2^32, if the result of a subtraction is positive or zero, or as the result of an inline barrel shifter operation in a move or logical instruction.
  V  Set when the operation caused oVerflow. Overflow occurs if the result of an add, subtract, or compare is greater than
     or equal to 2^31, or less than -2^31.
  Q  ARM architecture v5E only. Sticky flag. Used to detect saturation in special saturating arithmetic instructions (QADD, QSUB, QDADD, and QDSUB), or overflow in certain multiply instructions (SMLAxy and SMLAWy).

Conditional Code Suffixes

Conditional Code Examples
    ADD    r0, r1, r2    ; r0 = r1 + r2, don't update flags
    ADDS   r0, r1, r2    ; r0 = r1 + r2, and update flags
    ADDCSS r0, r1, r2    ; if C flag set then r0 = r1 + r2, and update flags
    CMP    r0, r1        ; update flags based on r0 - r1
  Example code sequence:
          MOV  R0, #0
    LOOP  ADD  R0, R0, #1
          CMP  R0, #10
          BNE  LOOP
          SUB  R1, R1, R0

Write Efficient, Small Code with Conditional Instructions

Exercise
  Write the following program in ARM assembly and evaluate its execution cost in clock cycles. A branch takes 3 cycles; every other instruction takes 1.
  Note: CMP, SUB, and B, combined with condition codes, are the only instructions needed.
    while (r1 != r2) {
        if (r1 > r2)
            r1 = r1 - r2;
        else
            r2 = r2 - r1;
    }

Embedded System Architecture and Software Design - using ARM
Module #3-3: ARM Developer Suite Hands-on Practice

ARM ADS 1.2
  Others:
    C and C++ libraries
    ARM Firmware Suite
    ARM application library
    RealMonitor: real-time debug monitor

Implementation Integration

Pre-configured Project Stationery Files
  Debug
    This build target is configured to build output binaries that are fully debuggable, at the expense of optimization.
  Release
    This build target is configured
    to build output binaries that are fully optimized, at the expense of debug information.
  DebugRel
    This build target is configured to build output binaries that provide adequate optimization and give a good debug view.

Possible Development Environment

Reference
  ARM Developer Suite Version 1.2 Getting Started
  Use Chapter 3 to practice using ADS.

Embedded System Architecture and Software Design - using ARM
Module #3-4: ARM Instruction Set

ARM Instruction Set Features
  All instructions are 32 bits wide.
    ADD r0, r1, r2    ; r0 := r1 + r2
  Most instructions execute in a single cycle.
  Instructions can be conditionally executed.
  Load/store architecture.

Thumb Instruction Set
  Thumb instructions are 16 bits wide.
  Optimized for code density: about 65% of the ARM code size.
  Suited to systems with small memory.
  The functions supported by Thumb instructions are a subset of the ARM instruction set.
  The processor must switch to Thumb state at run time.
    ADDS r1, r1, #3    ; ARM
    ADD  r1, #3        ; Thumb

Jazelle
  Jazelle technology lets an ARM core execute 8-bit Java bytecode.
  Hardware supports up to 95% of the bytecodes.
  About five times faster than a typical software JVM.

ARM Instruction Set Categories
  Branch instructions
  Data-processing instructions
  Load and store instructions
  Status register transfer instructions
  Coprocessor instructions
  Exception-generating instructions

Branch Instructions
  B   Branch
  BL  Branch with link: stores the return address in r14
  e.g.
              CMP   r2, #0
              BLEQ  function
    function  MOV   pc, r14

Branch Instruction Encoding
  The range of the branch instruction is +/-32 Mbytes.
  L: selects the branch-and-link variant.
  Assembly format: BLSRmBLS

Branch Instructions Example
  C:
    if (a == 0)
        function1(1);
    else
        ...
    function1() { function2(); }
    function2() { return; }
  ASM:
    function1  STMFD  r13!, {r0-r4, r14}
               BL     function2
               LDMFD  r13!, {r0-r4, pc}
    function2  MOV    pc, r14

Data-processing Instructions Encoding
  Assembly format:
    op{cond}{S} Rd, Rn, #immediate
    op{cond}{S} Rd, Rn, Rm {, shift}

Data Processing Opcode
  Opcode [24:21]  Mnemonic  Meaning                        Effect
  0000            AND       Logical bit-wise AND           Rd := Rn AND Op2
  0001            EOR       Logical bit-wise exclusive OR  Rd := Rn EOR Op2
  0010            SUB       Subtract                       Rd := Rn - Op2
  0011            RSB       Reverse subtract               Rd := Op2 - Rn
  0100            ADD       Add                            Rd := Rn + Op2
  0101            ADC       Add with carry                 Rd := Rn + Op2 + C
  0110            SBC       Subtract with carry            Rd := Rn - Op2 + C - 1
  0111            RSC       Reverse subtract with carry    Rd := Op2 - Rn + C - 1
  1000            TST       Test                           Scc on Rn AND Op2
  1001            TEQ       Test equivalence               Scc on Rn EOR Op2
  1010            CMP       Compare                        Scc on Rn - Op2
  1011            CMN       Compare negated                Scc on Rn + Op2
  1100            ORR       Logical bit-wise OR            Rd := Rn OR Op2
  1101            MOV       Move                           Rd := Op2
  1110            BIC       Bit clear                      Rd := Rn AND NOT Op2
  1111            MVN       Move negated                   Rd := NOT Op2

Example Data-processing Instructions
  Arithmetic operations
    ADD r0, r1, r2    ; r0 = r1 + r2
    SUB r0, r1, r2    ; r0 = r1 - r2
    RSB r0, r1, r2    ; r0 = r2 - r1
  Bit-wise logical operations
    AND r0, r1, r2    ; r0 = r1 AND r2
    BIC r0, r1, r2    ; r0 = r1 AND NOT r2 (bit clear)

Example Data-processing Instructions (cont.)
  Register movement operations
    MOV r0, r2        ; r0 = r2
    MVN r0, r2        ; r0 = NOT r2
  Comparison operations (set condition code bits N, Z, C, V)
    CMP r1, r2        ; set cc on r1 - r2
  Immediate operands
    ADD r3, r3, #1    ; r3 = r3 + 1
    AND r8, r7, #&ff  ; r8 = r7[7:0]; & means base 16

Shifter
  LSL: Logical Shift Left (x2 per bit position)
  LSR: Logical Shift Right (/2 per bit position)
  ASR: Arithmetic Shift Right
  ROR: Rotate Right

Shifter Applications
  e.g. #1
    ADD r3, r2, r1, LSL #3    ; r3 := r2 + 8 * r1
  e.g. #2: r0 = r1 * 5, i.e. r0 = r1 + (r1 * 4)
    ADD r0, r1, r1, LSL #2

Multiply Instruction Binary Encoding
  Assembly format:
    MUL{cond}{S} Rd, Rm, Rs
    MLA{cond}{S} Rd, Rm, Rs, Rn
    mul{cond}{S} RdHi, RdLo, Rm, Rs    (mul is one of the long multiplies below)
  RdHi: the most significant 32 bits of the 64-bit number
  RdLo: the least significant 32 bits of the 64-bit number

  Opcode [23:21]  Mnemonic  Meaning                              Effect
  000             MUL       Multiply (32-bit result)             Rd := (Rm * Rs)[31:0]
  001             MLA       Multiply-accumulate (32-bit result)  Rd := (Rm * Rs + Rn)[31:0]
  100             UMULL     Unsigned multiply long               RdHi:RdLo := Rm * Rs
  101             UMLAL     Unsigned multiply-accumulate long    RdHi:RdLo += Rm * Rs
  110             SMULL     Signed multiply long                 RdHi:RdLo := Rm * Rs
  111             SMLAL     Signed multiply-accumulate long      RdHi:RdLo += Rm * Rs

  Assembly format: CLZ{cond} Rd, Rm
  Sets Rd to the number of zero bits above the most significant 1 in
  Rm. If Rm = 0, Rd = 32.
  E.g.
    MOV r0, #&100
    CLZ r1, r0        ; r1 = 23 (bit 8 is the most significant 1, so 23 zero bits lie above it)

Count Leading Zeros Instruction (v5T only)

Exercise
  Write an ARM assembly program that includes a subroutine, mul_ten, which multiplies its argument by 10. Use the ADS environment and assume an ARM core without multiplier support.
    main() { x = 5; y = mul_ten(x); }
    int mul_ten(x) { return 10 * x; }

Single Word and Unsigned Byte Data Transfer Instruction Binary Encoding
  Assembly format:
    LDR|STR{cond}{B} Rd, [Rn, offset]{!}    ; pre-indexed form
    LDR|STR{cond}{B} Rd, [Rn], offset       ; post-indexed form
    LDR|STR{cond}{B} Rd, LABEL              ; PC-relative form

Load and Store Examples
  Single register load and store
      LDR r0, [r1]         ; r0 := mem32[r1]
      STR r0, [r1]         ; mem32[r1] := r0
  Base plus offset addressing
    Pre-indexing
      LDR r0, [r1, #4]     ; r0 := mem32[r1 + 4]
    Auto-indexing
      LDR r0, [r1, #4]!    ; r0 := mem32[r1 + 4], r1 := r1 + 4
    Post-indexing
      LDR r0, [r1], #4     ; r0 := mem32[r1], r1 := r1 + 4
  PC-relative
      LDR  r1, UART_ADD    ; UART address into r1
      STRB r0, [r1]        ; store data to UART
    UART_ADD               ; address literal

Half-word and Signed Byte Data Transfer Instruction Binary Encoding
  Assembly format:
    LDR|STR{cond}H|SH|SB Rd, [Rn, offset]{!}    ; pre-indexed form
    LDR|STR{cond}H|SH|SB Rd, [Rn], offset       ; post-indexed form
  An unsigned value is zero-extended to 32 bits when loaded; a signed value is extended to 32 bits by replicating the most significant bit of the data.

Half-word Load/Store Example
          ADR   r1, ARRAY1      ; half-word array start
          ADR   r2, ARRAY2      ; word array start
          ADR   r3, ENDARR1     ; ARRAY1 end + 2
    LOOP  LDRSH r0, [r1], #2    ; get signed half-word
          STR   r0, [r2], #4    ; save word
          CMP   r1, r3          ; check for end of array
          BLT   LOOP            ; if not finished, loop

Exercise: String Copy
  Write an assembly program that copies a string. Use the ADS environment.
    A = "Hello,this is a sunny day!"
    B = ""

Multiple Register Data Transfer Instruction Binary Encoding
  In a non-user mode, the CPSR may be restored by:
    LDM{cond}{add mode} Rn{!}, {registers + pc}^
  Full or empty: the stack pointer can either point to the last item in the stack (a full stack), or the next free space on the stack (an empty stack).
  Assembly format:
    LDM|STM{cond}{add mode} Rn{!}, {registers}
  Addressing modes:
    IA: Increment after
    IB: Increment before
    DA: Decrement after
    DB: Decrement before

Example Addressing Mode for LDM/STM

ISR Example
  e.g.
  Interrupt handler
    __irq void IRQHandler(void)
    {
        volatile unsigned int *base = (unsigned int *)0x80000000;
        if (*base == 1)
        {
            C_int_handler();
        }
        *(base + 1) = 0;
    }

    IRQHandler PROC
        STMFD  sp!, {r0-r4, r12, lr}
        MOV    r4, #0x80000000
        LDR    r0, [r4, #0]
        SUB    sp, sp, #4
        CMP    r0, #1
        BLEQ   C_int_handler
        MOV    r0, #0
        STR    r0, [r4, #4]
        ADD    sp, sp, #4
        LDMFD  sp!, {r0-r4, r12, lr}
        SUBS   pc, lr, #4

Swap Memory and Register Instruction Binary Encoding
  Assembly format: SWP{cond}{B} Rd, Rm, [Rn]

SWP Example
    ADR  r0, SEMAPHORE
    SWPB r1, r1, [r0]    ; exchange byte

Status Register to General Register Transfer Instruction Binary Encoding
  Assembly format: MRS{cond} Rd, CPSR|SPSR
  E.g.
    MRS r0, CPSR    ; move the CPSR to r0
    MRS r3, SPSR    ; move the SPSR to r3
  Note: the SPSR form should not be used in User or System mode.

Transfer to Status Register Instruction Binary Encoding
  Assembly format:
    MSR{cond} CPSR_f|SPSR_f, #immediate
    MSR{cond} CPSR_field|SPSR_field, Rm
  field is one of:
    c: the control field, PSR[7:0]
    x: the extension field, PSR[15:8]
    s: the status field, PSR[23:16]
    f: the flags field, PSR[31:24]

MSR Example
  Set the N, Z, C, V flags:
    MSR CPSR_f, #&f0000000
  Set only the C flag:
    MRS r0, CPSR
    ORR r0, r0, #&20000000    ; set bit 29 of r0
    MSR CPSR_f, r0            ; move back to CPSR

Exercise: Switching the ARM Operating Mode
  Write a program that switches the ARM from Supervisor mode to IRQ mode. Use the ADS environment.

    Bit    31  30  29  28  27  24  (others)   7  6  5  4:0
    Field  N   Z   C   V   Q   J   undefined  I  F  T  mode

  Mode bits: specify the processor mode
    10000  User
    10001  FIQ
    10010  IRQ
    10011  SVC
    10111  Abort
    11011  Undef
    11111  System

Coprocessor Instructions
  There are 3 types:
    Coprocessor data operations
      CDP: initiate a coprocessor data processing operation
    Coprocessor register transfers
      MRC: move to ARM register from coprocessor register
      MCR: move to coprocessor register from ARM register
    Coprocessor memory transfers
      LDC: load coprocessor register from memory
      STC: store from coprocessor register to memory

Exception-generating & Semaphore Instructions
  SWI
    Used to cause a Software Interrupt exception to occur.
    e.g. SWI 0x123456
  BKPT
    Used for software breakpoints in ARM architecture 5 or above.
    Causes a Prefetch Abort
    exception to occur.

Summary of ARM Architectures
  Core                                   Architecture
  ARM1                                   v1
  ARM2                                   v2
  ARM2as, ARM3                           v2a
  ARM6, ARM600, ARM610                   v3
  ARM7, ARM700, ARM710                   v3
  ARM7TDMI, ARM710T, ARM720T, ARM740T    v4T
  StrongARM, ARM8, ARM810                v4
  ARM9TDMI, ARM920T, ARM940T             v4T
  ARM9E-S, XScale microarchitecture      v5TE
  ARM10TDMI, ARM1020E                    v5TE
  ARM926EJ-S, ARM1026EJ-S                v5TEJ
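
Three ideas from this module can be cross-checked with a small C reference model: the N, Z, C, V flags left by an ADDS, the multiply-by-10 exercise on a core without a multiplier (shift and add), and the subtract-based loop from the conditional-execution exercise. This is an illustrative sketch only, not part of the original slides. The names flags_t, adds, and gcd_sub are assumptions made for this example (mul_ten is the name used in the exercise), and the instruction mappings in the comments are one possible choice, not the expected solution.

```c
#include <assert.h>
#include <stdint.h>

/* Condition flags as an ADDS instruction would leave them in the CPSR.
   (Hypothetical helper type, introduced only for this sketch.) */
typedef struct {
    unsigned n, z, c, v;
} flags_t;

/* Model of "ADDS rd, rn, op2": a 32-bit add that also reports N, Z, C, V. */
static uint32_t adds(uint32_t rn, uint32_t op2, flags_t *f)
{
    uint64_t wide = (uint64_t)rn + op2;   /* keeps the carry-out in bit 32 */
    uint32_t rd = (uint32_t)wide;

    f->n = rd >> 31;                      /* bit 31 of the result */
    f->z = (rd == 0);                     /* result is zero */
    f->c = (unsigned)(wide >> 32);        /* unsigned sum >= 2^32 */
    /* Signed overflow: both operands have the same sign, result differs. */
    f->v = ((~(rn ^ op2) & (rn ^ rd)) >> 31) & 1u;
    return rd;
}

/* x * 10 without a multiplier: x * 10 = (x + x * 4) * 2.
   One possible mapping: ADD r0, r0, r0, LSL #2 then MOV r0, r0, LSL #1. */
static uint32_t mul_ten(uint32_t x)
{
    uint32_t five_x = x + (x << 2);
    return five_x << 1;
}

/* The loop from the exercise: repeated subtraction until r1 == r2.
   Both inputs must be non-zero, otherwise the loop never terminates. */
static uint32_t gcd_sub(uint32_t r1, uint32_t r2)
{
    assert(r1 > 0 && r2 > 0);
    while (r1 != r2) {
        if (r1 > r2)
            r1 = r1 - r2;   /* candidate for a conditional SUB (GT) */
        else
            r2 = r2 - r1;   /* candidate for a conditional SUB (LT) */
    }
    return r1;
}
```

Comparing the assembly versions against these functions for a few inputs (for example x = 5 in mul_ten, as in the exercise) is a quick way to confirm that the register results and flags match.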

Reference
  S. Furber, ARM System-on-Chip Architecture, 2nd ed., Addison-Wesley
  D. Seal, ARM Architecture Reference Manual, 2nd ed., Addison-Wesley
  ARM Developer Suite User Guide

Embedded System Architecture and Software Design - using ARM
Module #3-5: Important ARM ASM Programming Skills

Load Constant into Register
  Direct loading with MOV and MVN
  Loadi