PRU Assembly Advanced Topics
Content is no longer maintained and is being kept for reference only!
This article is part of a collection of articles describing software development on the PRU subsystem included in OMAP-L1x8/C674m/AM18xx devices (where m is an even number).
Advanced Topics
Using Macros
Macros are used to define custom instructions for the CPU. They are similar to in-line subroutines in C.
Defining a Macro
A macro is defined by first declaring the start of a macro block and specifying the macro name, then specifying the assembly code to implement the intended function, and finally closing the macro block.
    .macro  <macro name>
    .mparam <macro parameters>
        <lines of assembly code>
        <lines of assembly code>
        <lines of assembly code>
    .endm
The assembly code within a macro block is identical to that used outside a macro block with minor variances:
- No dot-commands may appear within a macro block other than ".mparam".
- Pre-processor definitions and conditional assembly are processed when the macro is defined.
- Structure references are expanded when the macro is used.
- Labels defined within a macro are considered local and can only be referenced from within the macro.
- References to external labels from within a macro are allowed.
Macro Parameters
The macro parameters can be specified on one ".mparam" line or on several. They are processed in the order in which they are encountered. There are two types of parameters, mandatory and optional. Optional parameters are assigned a default value that is used when they are not specified in the macro invocation. Since parameters are always processed in order, any optional parameters must come last, and once an optional parameter is omitted (so that its default value is used), none of the remaining parameters may be specified.
For example:
    .macro  mv1                 // Define macro "mv1"
    .mparam dst=r0, src=5       // Two optional parameters
        mov     dst, src
    .endm
For the above macro, the following expansions are possible:
Macro Invocation | Result |
---|---|
mv1 r1, 7 | mov r1, 7 |
mv1 r2 | mov r2, 5 |
mv1 | mov r0, 5 |
Note that optional parameters cannot be passed by using "empty" delimiters. For example, the following invocation of "mv1" is illegal:
    mv1 , 7     // Illegal attempt to do 'mov r0, 7'
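The parameter rules above can be modeled in a few lines of Python. This is only a sketch of the behavior described in this section, not PASM's actual implementation; the names `bind_macro_params` and `expand_mv1` are hypothetical and exist only for this illustration.

```python
def bind_macro_params(params, args):
    """Bind positional arguments to macro parameters.

    params: list of (name, default) pairs; default is None for a mandatory
            parameter (hypothetical representation for this sketch).
    args:   list of argument strings from the macro invocation.
    """
    if len(args) > len(params):
        raise ValueError("too many arguments")
    bound = {}
    for index, (name, default) in enumerate(params):
        if index < len(args):
            if args[index] == "":
                # "Empty" delimiters such as `mv1 , 7` are rejected.
                raise ValueError("empty argument is not allowed")
            bound[name] = args[index]
        elif default is not None:
            # Omitted trailing optional parameter: use its default.
            bound[name] = default
        else:
            raise ValueError(f"missing mandatory parameter {name!r}")
    return bound


# The two optional parameters of the "mv1" macro defined above.
MV1_PARAMS = [("dst", "r0"), ("src", "5")]


def expand_mv1(*args):
    """Return the instruction that an invocation of mv1 would expand to."""
    bound = bind_macro_params(MV1_PARAMS, list(args))
    return f"mov {bound['dst']}, {bound['src']}"
```

Because arguments are bound strictly by position, omitting `dst` while supplying `src` cannot be expressed, which is why the `mv1 , 7` form is rejected.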
Example Macros
Example 1: Move 32-bit Value (mov32)
The mov32 macro is a good example of a simple macro that saves some typing and makes source code look a little cleaner.
Specification:
    //
    // mov32 : Move a 32bit value to a register
    //
    // Usage:
    //     mov32   dst, src
    //
    // Sets dst = src. Src must be a 32 bit immediate value.
    //
    .macro  mov32
    .mparam dst, src
        mov     dst.w0, src & 0xFFFF
        mov     dst.w2, src >> 16
    .endm
Example Invocation: The invocation for this macro is the same as the standard mov pseudo op:
mov32 r0, 0x12345678
Example Expansion: The expansion of the above invocation uses two immediate-value moves to accomplish the 32-bit load.
    mov     r0.w0, 0x12345678 & 0xFFFF
    mov     r0.w2, 0x12345678 >> 16
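As a quick check of the arithmetic in the expansion, the following Python sketch evaluates the two immediate expressions. The helper name `mov32_halves` is hypothetical and only illustrates the masking and shifting; it is not part of PASM.

```python
def mov32_halves(value):
    """Split a 32-bit immediate into the (w0, w2) halves used by mov32."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value must fit in 32 bits")
    low = value & 0xFFFF    # goes to dst.w0 (bits 15..0)
    high = value >> 16      # goes to dst.w2 (bits 31..16)
    return low, high


low, high = mov32_halves(0x12345678)
print(hex(low), hex(high))  # 0x5678 0x1234
```

Each half fits in 16 bits, so each can be loaded with a single immediate move.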
Example 2: Quick Branch If in Range (qbir)
Any label defined within a macro is altered upon expansion to be unique. Thus internal labels are local to the macro and code defined outside of a macro can not make direct use of a label that is defined inside a macro. However code contained within a macro can make free use of externally defined labels.
The qbir macro is a simple example that uses a local label. The macro instruction will jump to the supplied label if the test value is within the specified range.
Specification:
    //
    // qbir : Quick branch in range
    //
    // Usage:
    //     qbir    label, test, low, high
    //
    // Jumps to label if (low <= test <= high).
    // Test must be a register. Low and high can be
    // a register or an 8 bit immediate value.
    //
    .macro  qbir
    .mparam label, test, low, high
        qbgt    out_of_range, test, low
        qbge    label, test, high
    out_of_range:
    .endm
Example Invocation: The example below checks the value in R5 for membership of two different ranges. Note that the range "low" and "high" values could also come from registers. They do not need to be immediate values:
    qbir    range1, r5, 1, 9        // Jump if (1 <= r5 <= 9)
    qbir    range2, r5, 25, 50      // Jump if (25 <= r5 <= 50)
Example Expansion: The expansion of the above invocation illustrates how external labels are used unmodified while internal labels are altered on expansion to make them unique.
        qbgt    _out_of_range_1_, r5, 1
        qbge    range1, r5, 9
    _out_of_range_1_:
        qbgt    _out_of_range_2_, r5, 25
        qbge    range2, r5, 50
    _out_of_range_2_:
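The label handling shown in the expansion can be sketched in Python as follows. The `_out_of_range_N_` naming simply mirrors the example above; the exact scheme PASM uses to make labels unique is not specified here, so treat the format as an assumption. The class name `QbirExpander` is hypothetical.

```python
import itertools


class QbirExpander:
    """Toy expander for the qbir macro with per-expansion unique labels."""

    def __init__(self):
        # Each expansion gets the next number, starting at 1 (assumed scheme).
        self._counter = itertools.count(1)

    def expand(self, label, test, low, high):
        # The internal label is renamed; the external label is used as given.
        local = f"_out_of_range_{next(self._counter)}_"
        return [
            f"qbgt {local}, {test}, {low}",
            f"qbge {label}, {test}, {high}",
            f"{local}:",
        ]


if __name__ == "__main__":
    expander = QbirExpander()
    lines = expander.expand("range1", "r5", 1, 9)
    lines += expander.expand("range2", "r5", 25, 50)
    print("\n".join(lines))
```

Renaming the internal label on every expansion is what allows the macro to be invoked more than once in the same source file without a duplicate label definition.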
Using Structures and Scope
Basic Structures
Structures are used in PASM to eliminate the tedious process of defining structure offset fields for use in LBBO/SBBO, and the even more painful process of mapping structures to registers.
Declaring Structure Types
Structures are declared in PASM using the ".struct" dot command. This is similar to using a "typedef" in C. PASM automatically processes each declared structure template and creates an internal structure type. The named structure type is not yet associated with any registers or storage. For example, say the application programmer has the following structure in C:
    typedef struct _PktDesc_ {
        struct _PktDesc_ *pNext;
        char             *pBuffer;
        unsigned short   Offset;
        unsigned short   BufLength;
        unsigned short   Flags;
        unsigned short   PktLength;
    } PKTDESC;
The equivalent PASM structure type is created using the following syntax:
    .struct PktDesc
        .u32    pNext
        .u32    pBuffer
        .u16    Offset
        .u16    BufLength
        .u16    Flags
        .u16    PktLength
    .ends
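To cross-check the declaration against a memory image, the layout can be described with Python's standard `struct` module. This is a sketch that assumes 32-bit pointers and no padding between fields, matching the two `.u32` and four `.u16` fields above; the constant name `PKTDESC_FORMAT` and the sample field values are made up for the illustration.

```python
import struct

# "<" = little endian, no padding; I = 32-bit unsigned, H = 16-bit unsigned.
# Field order: pNext, pBuffer, Offset, BufLength, Flags, PktLength.
PKTDESC_FORMAT = "<IIHHHH"

# Total size of one descriptor in bytes.
pktdesc_size = struct.calcsize(PKTDESC_FORMAT)

# Build a sample descriptor (arbitrary illustrative values) and read it back.
raw = struct.pack(PKTDESC_FORMAT, 0x1000, 0x2000, 4, 64, 1, 60)
p_next, p_buffer, offset, buf_length, flags, pkt_length = struct.unpack(
    PKTDESC_FORMAT, raw
)
print(pktdesc_size, hex(p_buffer), offset)
```

A size of 16 bytes corresponds to exactly four 32-bit registers, which is what makes the register assignment described in the next section possible.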
Assigning Structure Instances to Registers
The second function of the PASM structure is to allow the application developer to map structures onto the PRU register file without the need to manually allocate registers to each field. This is done through the ".assign" dot command. For example, say the application programmer performs the following assignment:
.assign PktDesc, R4, R7, RxDesc // Make sure this uses R4 thru R7
When PASM sees this assignment, it will perform three tasks for the application developer:
- PASM will verify that the structure perfectly spans the declared range (in this case R4 through R7). The application developer can avoid the formal range declaration by substituting ’*’ for ’R7’ above.
- PASM will verify that all structure fields are able to be mapped onto the declared range without any alignment issues. If an alignment issue is found, it is reported as an error along with the field in question. Note that assignments can begin on any register boundary.
- PASM will create an internal data type named "RxDesc", which is of type "PktDesc".
For the above assignment, PASM will use the following variable equivalencies. Note that PASM uses the little endian register mapping.
Variable | Assignment |
---|---|
RxDesc | R4 |
RxDesc.pNext | R4 |
RxDesc.pBuffer | R5 |
RxDesc.Offset | R6.w0 |
RxDesc.BufLength | R6.w2 |
RxDesc.Flags | R7.w0 |
RxDesc.PktLength | R7.w2 |
For example the source line below will be converted to the output shown:
    // Input Source Line
    add     r20, RxDesc.pBuffer, RxDesc.Offset
    // Output Source Line
    add     r20, R5, R6.w0
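The mapping in the table above can be reproduced with a short Python sketch. It assumes fields are packed back to back in little endian order starting at byte 0 of the first register; the alignment checks are a simplified guess at what PASM verifies, and `assign_registers` is a hypothetical name.

```python
# (field name, size in bytes) for the PktDesc structure declared earlier.
PKTDESC_FIELDS = [
    ("pNext", 4), ("pBuffer", 4), ("Offset", 2),
    ("BufLength", 2), ("Flags", 2), ("PktLength", 2),
]


def assign_registers(fields, first_reg):
    """Map each field to a register expression; return (mapping, last_reg)."""
    mapping = {}
    offset = first_reg * 4              # byte offset within the register file
    for name, size in fields:
        reg, byte = divmod(offset, 4)
        if size == 4:
            if byte != 0:
                raise ValueError(f"{name}: 32-bit field must start at b0")
            mapping[name] = f"R{reg}"
        elif size == 2:
            if byte == 3:
                raise ValueError(f"{name}: 16-bit field would cross registers")
            mapping[name] = f"R{reg}.w{byte}"
        elif size == 1:
            mapping[name] = f"R{reg}.b{byte}"
        else:
            raise ValueError(f"{name}: unsupported field size {size}")
        offset += size
    last_reg = (offset - 1) // 4        # last register touched by the struct
    return mapping, last_reg


mapping, last_reg = assign_registers(PKTDESC_FIELDS, first_reg=4)
print(mapping["pBuffer"], mapping["Offset"], last_reg)
```

The returned `last_reg` plays the role of the range check: for `PktDesc` starting at R4 it has to come out as 7 for the declared R4 through R7 span to be exact.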
SIZE and OFFSET Operators
SIZE and OFFSET are two useful operators that can be applied to either structure types or structure assignments. The SIZE operator returns the byte size of the supplied structure or structure field. The OFFSET operator returns the byte offset of the supplied field from the start of the structure.
SIZE Operator Example
Using the assignment example from the previous section, the following SIZE equivalencies would apply:
Variable Operation | Results |
---|---|
SIZE(PktDesc) | 16 |
SIZE(PktDesc.pNext) | 4 |
SIZE(PktDesc.pBuffer) | 4 |
SIZE(PktDesc.Offset) | 2 |
SIZE(PktDesc.BufLength) | 2 |
SIZE(PktDesc.Flags) | 2 |
SIZE(PktDesc.PktLength) | 2 |
SIZE(RxDesc) | 16 |
SIZE(RxDesc.pNext) | 4 |
SIZE(RxDesc.pBuffer) | 4 |
SIZE(RxDesc.Offset) | 2 |
SIZE(RxDesc.BufLength) | 2 |
SIZE(RxDesc.Flags) | 2 |
SIZE(RxDesc.PktLength) | 2 |
OFFSET Operator Example
Using the assignment example from the previous section, the following OFFSET equivalencies would apply:
Variable Operation | Results |
---|---|
OFFSET(PktDesc) | 0 |
OFFSET(PktDesc.pNext) | 0 |
OFFSET(PktDesc.pBuffer) | 4 |
OFFSET(PktDesc.Offset) | 8 |
OFFSET(PktDesc.BufLength) | 10 |
OFFSET(PktDesc.Flags) | 12 |
OFFSET(PktDesc.PktLength) | 14 |
OFFSET(RxDesc) | 0 |
OFFSET(RxDesc.pNext) | 0 |
OFFSET(RxDesc.pBuffer) | 4 |
OFFSET(RxDesc.Offset) | 8 |
OFFSET(RxDesc.BufLength) | 10 |
OFFSET(RxDesc.Flags) | 12 |
OFFSET(RxDesc.PktLength) | 14 |
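The SIZE and OFFSET values tabulated in this section follow directly from the field sizes. The Python sketch below recomputes them, assuming fields are packed in declaration order with no padding; the function names are hypothetical stand-ins for the PASM operators.

```python
# (field name, size in bytes) for the PktDesc structure type.
PKTDESC_FIELDS = [
    ("pNext", 4), ("pBuffer", 4), ("Offset", 2),
    ("BufLength", 2), ("Flags", 2), ("PktLength", 2),
]


def size_of(fields, name=None):
    """SIZE of the whole structure, or of one field if name is given."""
    if name is None:
        return sum(size for _, size in fields)
    return dict(fields)[name]


def offset_of(fields, name=None):
    """OFFSET of a field from the start of the structure (0 for the struct)."""
    if name is None:
        return 0
    offset = 0
    for field, size in fields:
        if field == name:
            return offset
        offset += size
    raise KeyError(name)


for field, _ in PKTDESC_FIELDS:
    print(field, size_of(PKTDESC_FIELDS, field), offset_of(PKTDESC_FIELDS, field))
```

The results are the same for the structure type (`PktDesc`) and for an assignment of that type (`RxDesc`), since both share one layout.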
Using Variable Scopes
On larger PASM applications, it is common for different structures to be applied to the same register range for use at different times in the code. For example, assume the programmer uses three structures, one called "global", one called "init" and one called "work". Assume that the global structure is always valid, but that the init and work structures do not need to be used at the same time.
The programmer could assign the structures as follows:
    .assign struct_global, R2, R8,  myGlobal
    .assign struct_init,   R9, R12, init     // Registers shared with "work"
    .assign struct_work,   R9, R13, work     // Registers shared with "init"
The program code may look something like the following:
Using R9 to R12 for the "init" structure:
    Start:
        call    InitGlobalData
        mov     init.stuff, myGlobal.data
        call    InitProcessing
        qbbs    InitComplete, init.flags.fComplete
Using R9 to R13 for the "work" structure:
    DoWork:
        call    LoadWorkRecord
        mov     r0, myGlobal.Status
        qbeq    type1, work.type, myGlobal.WorkType1
        ...
Using R9 to R12 for the "init" structure:
    InitProcessing:
        mov     init.start, init.stuff
        set     init.flags.fComplete
        ret
The code has been labeled to emphasize when the shared registers are being used for the "init" structure and when they are being used for the "work" structure. The above is quite legal, but in this example, PASM does not provide any enforcement for the register sharing. For example, assume the work section of the code contained a reference to the "init" structure:
    DoWork:
        call    LoadWorkRecord
        mov     r0, myGlobal.Status
        set     init.flags.fWorkStarted     // The reference to "init" would not cause an assembly error!
        qbeq    type1, work.type, myGlobal.WorkType1
        ...
The above example would not result in an assembly error even though using the same registers for two different purposes at the same time would result in a functional error.
To solve this potential problem, named variable scopes can be defined in which the register assignments are to be made. For example, the above shared assignments can be revised as shown below to include the creation of variable scopes:
    .assign struct_global, R2, R8, myGlobal     // Available in all scopes
    .enter  Init_Scope                          // Create new scope Init_Scope
    .assign struct_init, R9, R12, init          // Only available in Init_Scope
    .leave  Init_Scope                          // Leave scope Init_Scope
    .enter  Work_Scope                          // Create new scope Work_Scope
    .assign struct_work, R9, R13, work          // Only available in Work_Scope
    .leave  Work_Scope                          // Leave scope Work_Scope
Once the scopes have been defined, the structures assigned within can only be accessed while the scope is open. Previously defined scopes can be reopened via the ".using" command.
Using "Init_Scope":
    .using  Init_Scope
    Start:
        call    InitGlobalData
        mov     init.stuff, myGlobal.data
        call    InitProcessing
        qbbs    InitComplete, init.flags.fComplete
    .leave  Init_Scope
Using "Work_Scope":
    .using  Work_Scope
    DoWork:
        call    LoadWorkRecord
        mov     r0, myGlobal.Status
        qbeq    type1, work.type, myGlobal.WorkType1
        ...
    .leave  Work_Scope
Using "Init_Scope":
    .using  Init_Scope
    InitProcessing:
        mov     init.start, init.stuff
        set     init.flags.fComplete
        ret
    .leave  Init_Scope
When using scopes as in the above example, any attempted reference to a structure assignment made outside a currently open scope will result in an assembly error.
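The scope rules can be modeled with a small Python class. This is a sketch of the behavior described in this section only; how PASM treats nested or re-entered scopes is not covered here, so those details are assumptions, and `ScopeModel` and `ScopeError` are hypothetical names.

```python
class ScopeError(Exception):
    """Stands in for the assembly error PASM would report."""


class ScopeModel:
    def __init__(self):
        self.global_names = set()   # assignments made outside any scope
        self.scopes = {}            # scope name -> set of assigned names
        self.open_scopes = []       # currently open scopes, innermost last

    def enter(self, scope):
        """Model of .enter: create a new scope and open it."""
        if scope in self.scopes:
            raise ScopeError(f"scope {scope!r} is already defined")
        self.scopes[scope] = set()
        self.open_scopes.append(scope)

    def using(self, scope):
        """Model of .using: reopen a previously defined scope."""
        if scope not in self.scopes:
            raise ScopeError(f"scope {scope!r} is not defined")
        if scope in self.open_scopes:
            raise ScopeError(f"scope {scope!r} is already open")
        self.open_scopes.append(scope)

    def leave(self, scope):
        """Model of .leave: close an open scope."""
        if scope not in self.open_scopes:
            raise ScopeError(f"scope {scope!r} is not open")
        self.open_scopes.remove(scope)

    def assign(self, name):
        """Model of .assign: bind a name in the innermost open scope."""
        if self.open_scopes:
            self.scopes[self.open_scopes[-1]].add(name)
        else:
            self.global_names.add(name)

    def check(self, name):
        """Raise ScopeError unless name is visible at this point."""
        if name in self.global_names:
            return
        if any(name in self.scopes[s] for s in self.open_scopes):
            return
        raise ScopeError(f"{name!r} is not available in an open scope")
```

Replaying the example: after defining `Init_Scope` and `Work_Scope` and then reopening only `Work_Scope`, `check("work")` and `check("myGlobal")` pass, while `check("init")` raises `ScopeError`, which corresponds to the assembly error described above.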
Register Addressing and Spanning
Certain PRU instructions act upon or affect more than a single register field. These include MVIx, ZERO, SCAN, LBxO, and SBxO. It is important to understand how register fields are packed into registers, and how these fields are addressed when using one of these PRU functions.
Little Endian Register Mapping
The registers of the PRU are memory mapped with the little endian byte ordering scheme. For example, say we have the following registers set to the given values:
    R0 = 0x80818283
    R1 = 0x84858687
The following table shows the register mapping to byte offset in little endian:
Byte Offset | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|---|
Register Field | R0.b0 | R0.b1 | R0.b2 | R0.b3 | R1.b0 | R1.b1 | R1.b2 | R1.b3 |
Example Value | 0x83 | 0x82 | 0x81 | 0x80 | 0x87 | 0x86 | 0x85 | 0x84 |
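The table can be reproduced by packing the two register values as little endian 32-bit words with Python's standard `struct` module. This is only an illustration of the byte ordering; the variable names are arbitrary.

```python
import struct

# Pack R0 and R1 as consecutive little endian 32-bit words.
regfile = struct.pack("<II", 0x80818283, 0x84858687)

# Walk the bytes in order of increasing byte offset.
for offset, value in enumerate(regfile):
    reg, byte = divmod(offset, 4)
    print(f"offset {offset}: R{reg}.b{byte} = {value:#04x}")
```

The least significant byte of each register (b0) lands at the lowest byte offset, which is the defining property of the little endian mapping.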
Three factors are affected by the register mapping and little endian byte ordering: register spans, the first byte affected in a register field, and register addressing. In addition, there are some alterations in PRU opcode encoding.
Register Spans
The concept of how the register file is spanned can be best viewed using the table from the little endian register mapping example above. Registers are spanned by incrementing the byte offset from the start of the register file for each subsequent byte.
For example assume we have the following registers set to their indicated values:
    R0 = 0x80818283
    R1 = 0x84858687
    R2 = 0x00001000
If the instruction "SBBO R0.b2, R2, 0, 5" is executed, it will result in a memory write to memory address 0x1000 as shown in little endian:
Byte Address | 0x1000 | 0x1001 | 0x1002 | 0x1003 | 0x1004 |
---|---|---|---|---|---|
Value | 0x81 | 0x80 | 0x87 | 0x86 | 0x85 |
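The span in this example can be traced with a small Python model of the register file. It is a sketch under the little endian mapping described above, not a PRU simulator; the `sbbo` helper and its parameters are hypothetical, and memory is modeled as a plain dictionary.

```python
import struct


def make_register_file(values):
    """Pack 32-bit register values (R0 first) into little endian bytes."""
    return struct.pack(f"<{len(values)}I", *values)


def sbbo(regfile, memory, src_byte_offset, base_reg, offset, count):
    """Copy `count` bytes from the register file, starting at the given
    register-file byte offset, to memory at (value of base_reg) + offset."""
    base = struct.unpack_from("<I", regfile, base_reg * 4)[0]
    for i in range(count):
        memory[base + offset + i] = regfile[src_byte_offset + i]


regfile = make_register_file([0x80818283, 0x84858687, 0x00001000])
memory = {}

# SBBO R0.b2, R2, 0, 5  ->  R0.b2 is byte offset 2 in the register file.
sbbo(regfile, memory, src_byte_offset=2, base_reg=2, offset=0, count=5)

for address in sorted(memory):
    print(f"{address:#06x}: {memory[address]:#04x}")
```

The five bytes written come from R0.b2, R0.b3, R1.b0, R1.b1 and R1.b2, so the transfer spans the boundary between R0 and R1.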
First Byte Affected
The first affected byte in a register field is literally the first byte to be altered when executing a PRU instruction. For example, in the instruction "LBBO R0, R1, 0, 4", the first byte to be affected by the LBBO is R0.b0 in little endian. The width of a field in a register span operation is almost irrelevant in little endian, since the first byte affected is independent of field width. For example, consider the following table:
Register Expression | First Byte Affected |
---|---|
R0 | R0.b0 |
R0.w0 | R0.b0 |
R0.w1 | R0.b1 |
R0.w2 | R0.b2 |
R0.b0 | R0.b0 |
R0.b1 | R0.b1 |
R0.b2 | R0.b2 |
R0.b3 | R0.b3 |
As can be seen in the table above, for any expression the first byte affected is always the byte offset of the field within the register. Thus in little endian, the expressions listed below all result in identical behavior.
    LBBO    R0, R1, 0, 4
    LBBO    R0.w0, R1, 0, 4
    LBBO    R0.b0, R1, 0, 4
Register Address
The MVIx, ZERO, SCAN, LBxO, and SBxO instructions may use or require a register address instead of the direct register field in the instruction. In the assembler a leading ’&’ character is used to specify that a register address is to be used. The address of a register is defined to be the byte offset within the register file of the first affected byte in the supplied field.
Given the information already presented in this chapter, it should be straightforward to verify the following register address mappings:
Register Address Expression | First Byte Affected (Little Endian) | Register Address (Little Endian) |
---|---|---|
&Rn | Rn.b0 | (n*4) |
&Rn.w0 | Rn.b0 | (n*4) |
&Rn.w1 | Rn.b1 | (n*4) + 1 |
&Rn.w2 | Rn.b2 | (n*4) + 2 |
&Rn.b0 | Rn.b0 | (n*4) |
&Rn.b1 | Rn.b1 | (n*4) + 1 |
&Rn.b2 | Rn.b2 | (n*4) + 2 |
&Rn.b3 | Rn.b3 | (n*4) + 3 |
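The register address rule in the table reduces to `(n*4)` plus the byte offset of the field. The Python sketch below computes it from an expression string; the parser is a hypothetical helper that accepts only the field forms listed in the table (`Rn`, `Rn.w0` to `Rn.w2`, `Rn.b0` to `Rn.b3`) and assumes registers R0 through R31.

```python
import re

# Optional leading '&', register number, optional .bN or .wN field.
_REG_EXPR = re.compile(r"^&?[Rr](\d+)(?:\.([bw])([0-3]))?$")


def register_address(expr):
    """Return the register-file byte offset of the first affected byte."""
    match = _REG_EXPR.match(expr)
    if match is None:
        raise ValueError(f"unrecognized register expression: {expr!r}")
    reg = int(match.group(1))
    kind = match.group(2)
    index = int(match.group(3) or 0)
    if reg > 31:
        raise ValueError("register number must be in the range 0..31")
    if kind == "w" and index > 2:
        raise ValueError("word fields are w0, w1 and w2")
    # For both .bN and .wN the field starts at byte N of the register.
    return reg * 4 + index


print(register_address("&R1.w1"))  # 5
```

Because the result depends only on the starting byte, `&R0`, `&R0.w0` and `&R0.b0` all yield the same address, in line with the first-byte-affected table earlier in this section.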
Register addresses are very useful for writing endian agnostic code, or for overriding the declared field widths in a structure element.
PRU Opcode Generation
The PRU binary opcode formats for LBBO, SBBO, LBCO, and SBCO use a byte offset for the source/destination register in the PRU register file. For example, only the following destination fields can actually be encoded into a PRU opcode for register R1:
- LBBO R1.b0, R0, 0, 4
- LBBO R1.b1, R0, 0, 4
- LBBO R1.b2, R0, 0, 4
- LBBO R1.b3, R0, 0, 4