I am trying to ditch Energia, Texas Instruments' Arduino-style IDE. I have used IAR to program a Tiva C series development board, where I was able to use pointers to memory locations to do things like toggle an LED. I am having a hard time doing the same on a dev board with an MSP430FR5994 MCU. I know the green LED is on port 1, pin 1 (P1.1) on the board, and I have included the msp430.h header file that my IDE provides as an API for the board. What I don't understand is why, in the debugger, my code changes the correct registers to the correct values but nothing happens on the board. I have verified that the debugger is connected to the board, since it will not start debugging with the board unplugged.

My direct questions are:

1. I should be able to alter memory locations with no header files or any special APIs as long as I know the specific addresses, correct?
2. I did not see anything about clock gating in the datasheet, and in the debugger I can see the registers changing value, so is there something other than setting the pin direction and value that I need to do? (The default pin function is generic GPIO; I checked, so I left that register alone.)

Any ideas, or pointing out obvious errors in my approach, would be very helpful. In the code below I used the register names from the header file because I could not get the direct pointers to work. I was also confused by the datasheet: the base address for port 1 is written as 0200H, which is 5 hex numbers when I was expecting 4 since the chip is a 16-bit system. I assumed that with the offsets it meant 0x202H etc. Am I incorrect in this assumption?
[Image: registers during debugging]
ti.com/lit/ds/symlink/msp430fr5994.pdf (datasheet port 1 mem locations page 130)
#include <msp430.h>
/**
* main.c
*/
int main(void)
{
WDTCTL = WDTPW | WDTHOLD; // stop watchdog timer
while(1){
int i ;
int j ;
P1DIR = 2;
//*((unsigned int *)0x204Hu) = 2;
P1OUT = 2;
//*((unsigned int *)0x202Hu)= 2;
for( i = 0; i< 2 ; i++){}
P1OUT = 0;
//*((unsigned int *)0x202Hu)= 0;
for (j = 0 ; j< 2; j++){}
}
return 0;
}
See section 12.3.1 of the MSP430FR59xx User's Guide.
After a BOR reset, all port pins are high-impedance with Schmitt
triggers and their module functions disabled to prevent any cross
currents. The application must initialize all port pins including
unused ones (Section 12.3.2) as input high impedance, input with
pulldown, input with pullup, output high, or output low according to
the application needs by configuring PxDIR, PxREN, PxOUT, and PxIES
accordingly. This initialization takes effect as soon as the LOCKLPM5
bit in the PM5CTL register (described in the PMM chapter) is cleared;
until then, the I/Os remain in their high-impedance state with Schmitt
trigger inputs disabled.
And here is the example blinky code from TI's downloadable MSP430FR599x Code Examples.
#include <msp430.h>
int main(void)
{
WDTCTL = WDTPW | WDTHOLD; // Stop WDT
// Configure GPIO
P1OUT &= ~BIT0; // Clear P1.0 output latch for a defined power-on state
P1DIR |= BIT0; // Set P1.0 to output direction
PM5CTL0 &= ~LOCKLPM5; // Disable the GPIO power-on default high-impedance mode
// to activate previously configured port settings
while(1)
{
P1OUT ^= BIT0; // Toggle LED
__delay_cycles(100000);
}
}
You probably need to add the PM5CTL0 &= ~LOCKLPM5; line to your code.
Then single-step through your code in the debugger to observe the LED. If you let the code run at full speed, the delay loops are far too short for the LED flashes to be visible to the eye.
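Applied to the LED in the question (P1.1 rather than P1.0), the same sequence might look like the sketch below. It assumes the toolchain defines __MSP430__ when building for the target; the host stand-ins in the #else branch are hypothetical and exist only so the logic can be compiled and checked on a PC. Also note that the trailing H in the datasheet marks a hexadecimal number, so 0200H is 0x0200 in C, and a literal such as 0x204Hu will not compile.

```c
#include <stdint.h>

#ifdef __MSP430__
#include <msp430.h>                 /* real register definitions on the target */
#else
/* Host stand-ins (assumption: only for compiling the sketch off-target). */
volatile uint16_t WDTCTL;
volatile uint16_t PM5CTL0 = 0x0001; /* LOCKLPM5 is set after reset */
volatile uint8_t  P1OUT, P1DIR;
#define WDTPW    0x5A00
#define WDTHOLD  0x0080
#define LOCKLPM5 0x0001
#define BIT1     0x02
#endif

/* Configure P1.1 as an output, then release the I/O lock. */
void led_init(void)
{
    WDTCTL = WDTPW | WDTHOLD;       /* stop the watchdog timer */
    P1OUT &= ~BIT1;                 /* defined start state: LED off */
    P1DIR |= BIT1;                  /* P1.1 as output */
    PM5CTL0 &= ~LOCKLPM5;           /* port settings now reach the pins */
}

/* Flip the LED state. */
void led_toggle(void)
{
    P1OUT ^= BIT1;
}
```

Until LOCKLPM5 is cleared, the register writes are visible in the debugger but the pins stay high-impedance, which matches the symptom described in the question.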
This is derived from one of my examples; I don't have a card handy to test it on, but it should just work or be close.
startup.s
.word hang /* 0xFFE0 */
.word hang /* 0xFFE2 */
.word hang /* 0xFFE4 */
.word hang /* 0xFFE6 */
.word hang /* 0xFFE8 */
.word hang /* 0xFFEA */
.word hang /* 0xFFEC */
.word hang /* 0xFFEE */
.word hang /* 0xFFF0 */
.word hang /* 0xFFF2 */
.word hang /* 0xFFF4 */
.word hang /* 0xFFF6 */
.word hang /* 0xFFF8 */
.word hang /* 0xFFFA */
.word hang /* 0xFFFC */
.word reset /* 0xFFFE */
reset.s
.global reset
reset:
mov #0x03FF,r1
call #notmain
jmp hang
.global hang
hang:
jmp hang
.globl dummy
dummy:
ret
so.c
void dummy ( unsigned short );
#define WDTCTL (*((volatile unsigned short *)0x015C))
#define P1OUT (*((volatile unsigned short *)0x0202))
#define P1DIR (*((volatile unsigned short *)0x0204))
void notmain ( void )
{
unsigned short ra;
WDTCTL = 0x5A80;
P1DIR|=0x02;
while(1)
{
P1OUT |= 0x0002;
for(ra=0;ra<10000;ra++) dummy(ra);
P1OUT &= 0xFFFD;
for(ra=0;ra<10000;ra++) dummy(ra);
}
}
memmap
MEMORY
{
rom : ORIGIN = 0xC000, LENGTH = 0xFFE0-0xC000
ram : ORIGIN = 0x1C00, LENGTH = 0x2C00-0x1C00
vect : ORIGIN = 0xFFE0, LENGTH = 0x20
}
SECTIONS
{
VECTORS : { startup.o } > vect
.text : { *(.text*) } > rom
.bss : { *(.bss*) } > ram
.data : { *(.data*) } > ram
}
build
msp430-gcc -Wall -O2 -c so.c -o so.o
msp430-ld -T memmap reset.o so.o startup.o -o so.elf
msp430-objdump -D so.elf > so.list
msp430-objcopy -O ihex so.elf out.hex
examine output
so.elf: file format elf32-msp430
Disassembly of section VECTORS:
0000ffe0 <VECTORS>:
ffe0: 0a c0 bic r0, r10
ffe2: 0a c0 bic r0, r10
ffe4: 0a c0 bic r0, r10
ffe6: 0a c0 bic r0, r10
ffe8: 0a c0 bic r0, r10
ffea: 0a c0 bic r0, r10
ffec: 0a c0 bic r0, r10
ffee: 0a c0 bic r0, r10
fff0: 0a c0 bic r0, r10
fff2: 0a c0 bic r0, r10
fff4: 0a c0 bic r0, r10
fff6: 0a c0 bic r0, r10
fff8: 0a c0 bic r0, r10
fffa: 0a c0 bic r0, r10
fffc: 0a c0 bic r0, r10
fffe: 00 c0 bic r0, r0
Disassembly of section .text:
0000c000 <reset>:
c000: 31 40 ff 03 mov #1023, r1 ;#0x03ff
c004: b0 12 0e c0 call #0xc00e
c008: 00 3c jmp $+2 ;abs 0xc00a
0000c00a <hang>:
c00a: ff 3f jmp $+0 ;abs 0xc00a
0000c00c <dummy>:
c00c: 30 41 ret
0000c00e <notmain>:
c00e: 0b 12 push r11
c010: b2 40 80 5a mov #23168, &0x015c ;#0x5a80
c014: 5c 01
c016: a2 d3 04 02 bis #2, &0x0204 ;r3 As==10
c01a: a2 d3 02 02 bis #2, &0x0202 ;r3 As==10
c01e: 0b 43 clr r11
c020: 0f 4b mov r11, r15
c022: b0 12 0c c0 call #0xc00c
c026: 1b 53 inc r11
c028: 3b 90 10 27 cmp #10000, r11 ;#0x2710
c02c: f9 23 jnz $-12 ;abs 0xc020
c02e: b2 f0 fd ff and #-3, &0x0202 ;#0xfffd
c032: 02 02
c034: 0b 43 clr r11
c036: 0f 4b mov r11, r15
c038: b0 12 0c c0 call #0xc00c
c03c: 1b 53 inc r11
c03e: 3b 90 10 27 cmp #10000, r11 ;#0x2710
c042: f9 23 jnz $-12 ;abs 0xc036
c044: ea 3f jmp $-42 ;abs 0xc01a
This looks fine: the vector table is there and points to the right place, etc.
A count of 10,000 might not be enough to see the LED blink.
From the datasheet it appears that 0x202 is P1OUT and 0x204 is P1DIR.
You then have to get it programmed into the board. I use mspdebug for the boards I have, but that program may have stopped working with the eval boards from TI a while ago. mspdebug supported the Intel hex format, so use objcopy if you need other formats.
If you do not want to use the GNU tools, you still have to deal with the vector table and the bootstrap in front of the C code if you are trying to get away from someone else's sandbox and do your own thing.
You are on the right path. It may be as simple as your delay loops being far too short and/or getting optimized out, since they are dead code as written.
If you rely on .bss or .data being initialized, then you have more work to do in the linker script and bootstrap. I don't, so I don't have that problem; I am actually wondering why .data was in this linker script...
I've matched the addresses to the datasheet for your part, which increases the odds of success. If you call an external function and pass the loop variable to it (the external function can be C or asm, it doesn't matter), the optimizer won't remove the loop as dead code. Alternatively, make the loop variable volatile and check the disassembly to see that the loop wasn't removed.
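As a rough sketch of the volatile variant (the function name delay_loop is made up for illustration), the counter is declared volatile so every read and write of it must actually be performed, which keeps the otherwise empty loop from being discarded:

```c
#include <stdint.h>

/* Busy-wait for `count` iterations and return how many were executed.
 * Because `i` is volatile, the compiler may not drop the empty loop. */
uint16_t delay_loop(uint16_t count)
{
    volatile uint16_t i;

    for (i = 0; i < count; i++) {
        /* intentionally empty */
    }
    return i;
}
```

It is still worth checking the disassembly, since the time per iteration depends on the compiler and the clock.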
Related
Recently I tried to fit my code into a small ATtiny13 with 1 kB of flash. While optimizing, I discovered something that looks weird to me. Take this example code:
#include <avr/interrupt.h>
int main() {
TCNT0 = TCNT0 * F_CPU / 58000;
}
It makes no sense, of course, but the interesting thing is the output size: it produces 248 bytes.
Quick explanation of the code: F_CPU is a constant defined by the -DF_CPU=... switch for avr-gcc, and TCNT0 is an 8-bit register (on the ATtiny13). In the real program I assign the result to a uint16_t, but the same behaviour was observed.
If part of the expression is wrapped in parentheses:
TCNT0 = TCNT0 * (F_CPU / 58000);
The output file size is 70 bytes. That is a huge difference, but the results of these operations are the same (right?).
I looked into the generated assembly code and, despite the fact that I don't understand ASM very well, I see that the version without parentheses adds some labels like:
00000078 <__divmodsi4>:
78: 05 2e mov r0, r21
7a: 97 fb bst r25, 7
7c: 16 f4 brtc .+4 ; 0x82 <__divmodsi4+0xa>
7e: 00 94 com r0
80: 0f d0 rcall .+30 ; 0xa0 <__negsi2>
82: 57 fd sbrc r21, 7
84: 05 d0 rcall .+10 ; 0x90 <__divmodsi4_neg2>
86: 14 d0 rcall .+40 ; 0xb0 <__udivmodsi4>
88: 07 fc sbrc r0, 7
8a: 02 d0 rcall .+4 ; 0x90 <__divmodsi4_neg2>
8c: 46 f4 brtc .+16 ; 0x9e <__divmodsi4_exit>
8e: 08 c0 rjmp .+16 ; 0xa0 <__negsi2>
And much more. I only learned x86 assembler for a while, but as far as I remember, there was a simple mnemonic for division. Why does avr-gcc add so much code in the first example?
Another question is why the compiler does not fold the right-hand part of the expression if both numbers are known at compile time.
We have this:
x = x * 1200000 / 58000
Note that 1200000/58000 = 20.69... is not an integer, so this must be computed by first multiplying and then performing a truncating division. Your architecture does not have native integer division for this data type, so the division has to be emulated in software (by library routines such as __divmodsi4), resulting in a lot of code.
However this:
x = x * (1200000 / 58000)
Here we find that 1200000 / 58000 = 20, since C integer division truncates, so this code is simplified to just:
x = x * 20
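The difference is easy to check on a PC. The sketch below assumes F_CPU is 1200000, as in the numbers above; the function names are invented for illustration.

```c
#include <stdint.h>

#define F_CPU 1200000L  /* assumed value, matching the numbers used above */

/* Multiply first, then divide: needs a real 32-bit division at run time. */
int32_t scale_multiply_first(uint8_t x)
{
    return x * F_CPU / 58000;
}

/* Divide first: the constant quotient (20) is computed at compile time,
 * so only a multiplication by 20 is left. */
int32_t scale_divide_first(uint8_t x)
{
    return x * (F_CPU / 58000);
}
```

For x = 10 the first form gives 206 and the second gives 200, so the two expressions are not equivalent and the compiler is not allowed to turn one into the other.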
Still struggling with AVR assembly. This time avr-gcc seems to completely ignore my directive to permanently bind a local variable to a register. Here's an example — this is of course just an illustration, not the final code:
// C code:
ISR(USART1_RX_vect)
{
register uint8_t c asm("r3") = UDR1;
tty1::buffer[tty1::ptr.head] = c;
}
// Generated assembly:
000000d8 <__vector_20>:
d8: 1f 92 push r1
da: 0f 92 push r0
dc: 0f b6 in r0, SREG ; 63
de: 0f 92 push r0
e0: 11 24 eor r1, r1
e2: 8f 93 push r24
e4: ef 93 push r30
e6: ff 93 push r31
e8: 80 91 73 00 lds r24, UDR1 ; 0x800073 <__EEPROM_REGION_LENGTH__+0x7f0073>
ec: e0 91 01 01 lds r30, 0x0101 ; 0x800101 <tty<drv::uart1>::ptr>
f0: e2 95 swap r30
f2: ef 70 andi r30, 0x0F ; 15
f4: f0 e0 ldi r31, 0x00 ; 0
f6: ee 5f subi r30, 0xFE ; 254
f8: fe 4f sbci r31, 0xFE ; 254
fa: 80 83 st Z, r24
fc: ff 91 pop r31
fe: ef 91 pop r30
100: 8f 91 pop r24
102: 0f 90 pop r0
104: 0f be out SREG, r0 ; 63
106: 0f 90 pop r0
108: 1f 90 pop r1
10a: 18 95 reti
- Give me r3, please.
- Sure thing! here's r24.
I can ask for any register between r2 and r7, and the compiler just takes whatever it wants! It has nothing to do with UDR1, either; it does that with whatever I assign to c. What's the point of that directive if it does nothing at all? How am I supposed to control the register the compiler selects?
To the question «Why the heck do I want to assign a variable to a register?» I reply «Because the generated code is sub-optimal for an interrupt and I want fine control over the generated assembly.» So far it has been a struggle.
Still using avr-gcc version 7.1.0...
Move the declaration out of the Interrupt Service Routine (ISR) and into the global scope. GCC allows both global and local permanent bindings. Yours is in the ISR scope, which is useless in your case.
I will try my best to give a more detailed explanation.
Local scope
A locally bound register stops being a dedicated register and is treated as just another storage location.
The same optimisation rules apply as to any other automatic variables.
The ISR has to restore all of the registers (except those bound in the global scope), so no side effects are possible, even if registers are bound in the local scope.
See the code:
https://godbolt.org/g/HsjvCa and https://godbolt.org/g/ZGjvRE with optimisations on.
Local bindings do not change any general register use rules.
If you bind registers in the global scope, you can't call any function that was compiled without the header file containing those register bindings. In the local scope, a call to any function invalidates the local bindings.
avr-gcc seems to completely ignore my directive to permanently bind a local variable to a register.
That's correct:
Local Register Variables
According to the GCC documentation for local register variables, the only supported use of local register variables is as operands of inline assembly:
The only supported use for this feature is to specify registers for input and output operands when calling Extended asm
As you just have a definition of a local reg variable but no inline assembly, there's no need to allocate c to R3. And even if avr-gcc would use R3, it would have to push / pop it so you wouldn't gain anything.
Global Register Variables
Using a global register variable might not do what you expect, either. Take for example
char a, b;
register char r3 asm ("r3");
void func (void)
{
r3 = a;
b = r3;
}
which will be compiled to (tested with avr-gcc -Os with v5, v8, v11, v13):
func:
lds r24,a
mov r3,r24
sts b,r24
ret
Apart from that, global register variables come with their own caveats:
Such variables are supposed to be global, i.e. the compiler must not use them for register allocation (which is the reason for the above code, by the way). This implies that the global reg definition must be present in each compilation unit, even in those that do not use it explicitly. Alternatively, one can turn R3 into a fixed register by means of the command line option -ffixed-3 or -ffixed-r3.
You actually do not have control over the options used for each compilation unit, in particular libraries like libgcc, AVR-LibC or 3rd party libraries might use R3.
Many of the modules in AVR-LibC are written in assembly, so even compiling them with -ffixed-* would not have an effect on R3 usage. None of the libgcc assembly functions are using R2...R9 though, but there are also C functions in libgcc.
Conclusion
Register variables won't bring the expected code, but they will introduce unexpected caveats and need extra attention.
I want a fine control over the generated assembly
The go-to technique for actually having fine control is writing the ISR in assembly, or at least writing parts of it in inline assembly. This is advisable anyway when you need control over each cycle in an ISR.
You might also consider avr-gcc v8, because it implements optimized ISR prologues and epilogues as of PR81268 (but not avr-gcc v9...v13, due to PR90706).
Take for example the following code with avr-gcc v8 on ATmega16 -Os:
unsigned char c, x, a[10];
__attribute__((signal))
void __vector1 (void)
{
a[x] = c;
}
With versions older than v8 (or with attribute no_gccisr), the outcome is:
00000000 <__vector1>:
0: 1f 92 push r1
2: 0f 92 push r0
4: 0f b6 in r0, 0x3f ; SREG
6: 0f 92 push r0
8: 11 24 eor r1, r1
a: 8f 93 push r24
c: ef 93 push r30
e: ff 93 push r31
... 6 instructions in function body
20: ff 91 pop r31
22: ef 91 pop r30
24: 8f 91 pop r24
26: 0f 90 pop r0
28: 0f be out 0x3f, r0 ; SREG
2a: 0f 90 pop r0
2c: 1f 90 pop r1
2e: 18 95 reti
Not counting reti, prologue + epilogue take 15 instructions and 27 ticks, whereas the fully optimized version takes 10 instructions and 18 ticks:
00000000 <__vector1>:
0: 8f 93 push r24
2: 8f b7 in r24, 0x3f ; SREG
4: 8f 93 push r24
6: ef 93 push r30
8: ff 93 push r31
... 6 instructions in function body
1a: ff 91 pop r31
1c: ef 91 pop r30
1e: 8f 91 pop r24
20: 8f bf out 0x3f, r24 ; SREG
22: 8f 91 pop r24
24: 18 95 reti
I am having trouble with the creation and addressing of an array created purely in assembly using the instruction set for the Atmel ATMega8535.
What I understand so far is as follows:
The array contains contiguous data that is equal in length.
The creation of the array involves defining the beginning and end locations of the array (much like you would the stack).
You would address an index in the array by adding an offset to the base address of the array.
What I am looking to do specifically is create a 1-D array of 8-bit integers, populated with predefined values during initialization. It does not have to be written to, only read when needed. The problem ultimately lies in not being able to translate this logic into assembly code.
I have tried with little progress to do so using support from the following books:
Some Assembly Required: Assembly Language Programming with the AVR Microcontroller by Timothy S Margush
Get Going with...AVR Microcontrollers by Peter Sharpe
Any help, advice or further resources would be greatly appreciated.
If your array is read-only, you do not need to copy it to RAM. You can
keep it in Flash and read it from there when needed. This will save you
precious RAM, at the cost of slower access (read from RAM is 2 cycles,
read from flash is 3 cycles).
You can declare your array like this:
.global my_array
.type my_array, @object
my_array:
.byte 12, 34, 56, 78
Then, to read a member of the array, you have to compute:
address of member = array base address + member index
If your members were more than one byte, you would have to also multiply
the index by the size, but this is not the case here. Then, you put the
address of the required member in the Z register and issue an lpm
instruction. Here is a function implementing this logic:
.global read_data
; input: r24 = array index, r1 = 0
; output: r24 = array value
; clobbers: r30, r31
read_data:
ldi r30, lo8(my_array) ; load Z = address of my_array
ldi r31, hi8(my_array) ; ...high byte also
add r30, r24 ; add the array index
adc r31, r1 ; ...and add 0 to propagate the carry
lpm r24, Z
ret
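For members wider than one byte, the index has to be scaled by the member size before it is added to the base address. Here is a small C sketch of that arithmetic, assuming 16-bit members stored low byte first; table_bytes and read_word are made-up names, and on a PC the table simply lives in ordinary memory rather than flash.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table of three 16-bit members, stored low byte first. */
const uint8_t table_bytes[] = { 0x34, 0x12, 0x78, 0x56, 0xBC, 0x9A };

/* address of member = base address + index * size of one member */
uint16_t read_word(const uint8_t *base, size_t index)
{
    const uint8_t *member = base + index * sizeof(uint16_t);

    return (uint16_t)(member[0] | (member[1] << 8));
}
```

With this table, read_word(table_bytes, 2) reads the bytes at offsets 4 and 5 and returns 0x9ABC.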
@scottt advised you to first write in C, then look at the generated
assembly. I consider this very good advice, let's follow it:
#include <stdint.h>
__flash const uint8_t my_array[] = {12, 34, 56, 78};
uint8_t read_data(uint8_t index)
{
return my_array[index];
}
The __flash keyword identifying a “named address space” is an embedded
C extension supported by
gcc. The
generated assembly is slightly different from the previous one: instead
of computing base_address + index, gcc does index − (−base_address):
read_data:
mov r30, r24 ; load Z = array index
ldi r31, 0 ; ...high byte of index is 0
subi r30, lo8(-(my_array)) ; subtract -(address of my array)
sbci r31, hi8(-(my_array)) ; ...high byte also
lpm r24, Z
ret
This is just as efficient as the previous hand-rolled assembly, except
that it does not need the r1 register to be initialized to zero. But
keeping r1 to zero is part of the gcc ABI anyway, so it should make no
difference.
The role of the linker
This section is meant to answer the question in the comment: how can we
access the array if we do not know its address? The answer is: we access
it by its name, just like in the code snippets above. Choosing the final
address for the array, as well as replacing the name by the appropriate
address, is the linker’s job.
Assembling (with avr-gcc -c) and disassembling (with avr-objdump -d)
the first code snippet gives this:
my_array.o, section .text:
00000000 <my_array>:
0: 0c 22 38 4e ."8N
If we were compiling from C, gcc would have put the array in the
.progmem.data section instead of .text, but it makes little difference.
The numbers “0c 22 38 4e” are the array contents, in hex. The characters
to the right are the ASCII equivalents, ‘.’ being the placeholder for
non printing characters.
The object file also carries this symbol table, shown by avr-nm:
my_array.o:
00000000 T my_array
meaning the symbol “my_array” has been defined as referring to offset 0
into the .text section (implied by “T”) of this object.
Assembling and disassembling the second code snippet gives this:
read_data.o, section .text:
00000000 <read_data>:
0: e0 e0 ldi r30, 0x00
2: f0 e0 ldi r31, 0x00
4: e8 0f add r30, r24
6: f1 1d adc r31, r1
8: 84 91 lpm r24, Z
a: 08 95 ret
Comparing the disassembly with the actual source code, it can be seen
that the assembler replaced the address of my_array with 0x00, which is
almost guaranteed to be wrong. But it also left a note to the linker in
the form of “relocation records”, shown by avr-objdump -r:
read_data.o, RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
00000000 R_AVR_LO8_LDI my_array
00000002 R_AVR_HI8_LDI my_array
This tells the linker that the ldi instructions at offsets 0x00 and
0x02 are intended to load the low byte and the high byte (respectively)
of the final address of my_array. The object file also carries this
symbol table:
read_data.o:
U my_array
00000000 T read_data
where the “U” line means the file makes use of an undefined symbol named
“my_array”.
Linking these pieces together, with a suitable main(), yields a binary
containing the C runtime from avr-libc, together with our code:
0000003c <my_array>:
3c: 0c 22 38 4e ."8N
00000040 <read_data>:
40: ec e3 ldi r30, 0x3C
42: f0 e0 ldi r31, 0x00
44: e8 0f add r30, r24
46: f1 1d adc r31, r1
48: 84 91 lpm r24, Z
4a: 08 95 ret
It should be noted that, not only has the linker moved the pieces around
to their final addresses, it has also fixed the arguments of the ldi
instructions so that they now point to the correct address of my_array.
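To make the fix-up concrete, here is a toy C model of what the two relocations do to the ldi opcodes. It is only a sketch, not the linker's real code; the function names are invented. It relies on the ldi encoding 1110 KKKK dddd KKKK, where K is the 8-bit constant and dddd selects the register.

```c
#include <stdint.h>

/* Insert the 8-bit constant k into an AVR ldi opcode (1110 KKKK dddd KKKK). */
uint16_t patch_ldi(uint16_t opcode, uint8_t k)
{
    opcode &= 0xF0F0;                        /* keep opcode and register bits */
    opcode |= (uint16_t)((k & 0xF0) << 4);   /* high nibble of k -> bits 11..8 */
    opcode |= (uint16_t)(k & 0x0F);          /* low nibble of k  -> bits 3..0 */
    return opcode;
}

/* Toy version of R_AVR_LO8_LDI: patch in the low byte of the address. */
uint16_t apply_lo8_ldi(uint16_t opcode, uint16_t address)
{
    return patch_ldi(opcode, (uint8_t)(address & 0xFF));
}

/* Toy version of R_AVR_HI8_LDI: patch in the high byte of the address. */
uint16_t apply_hi8_ldi(uint16_t opcode, uint16_t address)
{
    return patch_ldi(opcode, (uint8_t)(address >> 8));
}
```

Feeding in the unrelocated opcodes from read_data.o (0xE0E0 and 0xE0F0, shown as "e0 e0" and "f0 e0") with the final address 0x003C gives 0xE3EC and 0xE0F0, which are the "ec e3" and "f0 e0" bytes in the linked listing.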
The code should look something like this:
.section .text
.global main
main:
ldi r30,lo8(data)
ldi r31,hi8(data)
ldd r24,Z+3
sts output,r24
ld r24,Z
sts output,r24
ldi r24,0
ldi r25,0
ret
.global data
.data
data:
.byte 1, 2, 3, 4
.comm output,1,1
Explanation
For people who have programmed in assembler using the GNU toolchain before, there are lessons that are transferable even to unfamiliar instruction sets:
You reserve space for an array with the assembler directives .byte 1, 2, 3, 4, .word 1, 2 (.word is 16 bits for AVR) or .space 100.
When learning a new instruction set, write C programs and ask the C compiler to generate assembler output. Find a good assembler programming reference for the instruction set as you read the assembler code.
Applying this trick below.
byte-array.c
/* volatile, so our code doesn't get optimized out even when compiler optimization is on */
volatile char output;
char data[] = { 1, 2, 3, 4 };
int main(void)
{
output = data[3];
output = data[0];
return 0;
}
Generate Assembler from C
avr-gcc -mmcu=atmega8 -Wall -Os -S byte-array.c
This will generate the assembler file byte-array.s.
byte-array.s
.file "byte-array.c"
__SP_H__ = 0x3e
__SP_L__ = 0x3d
__SREG__ = 0x3f
__tmp_reg__ = 0
__zero_reg__ = 1
.section .text.startup,"ax",@progbits
.global main
.type main, @function
main:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
ldi r30,lo8(data)
ldi r31,hi8(data)
ldd r24,Z+3
sts output,r24
ld r24,Z
sts output,r24
ldi r24,0
ldi r25,0
ret
.size main, .-main
.global data
.data
.type data, @object
.size data, 4
data:
.byte 1
.byte 2
.byte 3
.byte 4
.comm output,1,1
.ident "GCC: (Fedora 4.9.2-1.fc21) 4.9.2"
.global __do_copy_data
.global __do_clear_bss
Read this explanation of Pointer Registers to see how the AVR instruction set uses the r30, r31 register pair as the pointer register Z. Read up on the ld, st, ldi, ldd, sts and std instructions.
Implementation Notes
If you link the program then disassemble it:
avr-gcc -mmcu=atmega8 -Os byte-array.c -o byte-array.elf
avr-objdump -d byte-array.elf
00000000 <__vectors>:
0: 12 c0 rjmp .+36 ; 0x26 <__ctors_end>
2: 2c c0 rjmp .+88 ; 0x5c <__bad_interrupt>
4: 2b c0 rjmp .+86 ; 0x5c <__bad_interrupt>
6: 2a c0 rjmp .+84 ; 0x5c <__bad_interrupt>
8: 29 c0 rjmp .+82 ; 0x5c <__bad_interrupt>
a: 28 c0 rjmp .+80 ; 0x5c <__bad_interrupt>
c: 27 c0 rjmp .+78 ; 0x5c <__bad_interrupt>
e: 26 c0 rjmp .+76 ; 0x5c <__bad_interrupt>
10: 25 c0 rjmp .+74 ; 0x5c <__bad_interrupt>
12: 24 c0 rjmp .+72 ; 0x5c <__bad_interrupt>
14: 23 c0 rjmp .+70 ; 0x5c <__bad_interrupt>
16: 22 c0 rjmp .+68 ; 0x5c <__bad_interrupt>
18: 21 c0 rjmp .+66 ; 0x5c <__bad_interrupt>
1a: 20 c0 rjmp .+64 ; 0x5c <__bad_interrupt>
1c: 1f c0 rjmp .+62 ; 0x5c <__bad_interrupt>
1e: 1e c0 rjmp .+60 ; 0x5c <__bad_interrupt>
20: 1d c0 rjmp .+58 ; 0x5c <__bad_interrupt>
22: 1c c0 rjmp .+56 ; 0x5c <__bad_interrupt>
24: 1b c0 rjmp .+54 ; 0x5c <__bad_interrupt>
00000026 <__ctors_end>:
26: 11 24 eor r1, r1
28: 1f be out 0x3f, r1 ; 63
2a: cf e5 ldi r28, 0x5F ; 95
2c: d4 e0 ldi r29, 0x04 ; 4
2e: de bf out 0x3e, r29 ; 62
30: cd bf out 0x3d, r28 ; 61
00000032 <__do_copy_data>:
32: 10 e0 ldi r17, 0x00 ; 0
34: a0 e6 ldi r26, 0x60 ; 96
36: b0 e0 ldi r27, 0x00 ; 0
38: e4 e8 ldi r30, 0x84 ; 132
3a: f0 e0 ldi r31, 0x00 ; 0
3c: 02 c0 rjmp .+4 ; 0x42 <__SREG__+0x3>
3e: 05 90 lpm r0, Z+
40: 0d 92 st X+, r0
42: ac 36 cpi r26, 0x6C ; 108
44: b1 07 cpc r27, r17
46: d9 f7 brne .-10 ; 0x3e <__SP_H__>
00000048 <__do_clear_bss>:
48: 10 e0 ldi r17, 0x00 ; 0
4a: ac e6 ldi r26, 0x6C ; 108
4c: b0 e0 ldi r27, 0x00 ; 0
4e: 01 c0 rjmp .+2 ; 0x52 <.do_clear_bss_start>
00000050 <.do_clear_bss_loop>:
50: 1d 92 st X+, r1
00000052 <.do_clear_bss_start>:
52: ad 36 cpi r26, 0x6D ; 109
54: b1 07 cpc r27, r17
56: e1 f7 brne .-8 ; 0x50 <.do_clear_bss_loop>
58: 02 d0 rcall .+4 ; 0x5e <main>
5a: 12 c0 rjmp .+36 ; 0x80 <_exit>
0000005c <__bad_interrupt>:
5c: d1 cf rjmp .-94 ; 0x0 <__vectors>
0000005e <main>: ...
00000080 <_exit>:
80: f8 94 cli
00000082 <__stop_program>:
82: ff cf rjmp .-2 ; 0x82 <__stop_program>
You can see that avr-gcc automatically generates startup code, which:
sets up the interrupt vector table (__vectors), which uses rjmp to jump to the Interrupt Service Routines
initializes the status register, SREG, and the stack pointer, SPL/SPH (__ctors_end)
copies the data segment content from flash to RAM for initialized, writable global variables (__do_copy_data)
clears the BSS segment for uninitialized writable global variables (__do_clear_bss etc.)
calls our main() function
calls _exit() if main() ever returns; _exit() is just a cli to disable interrupts followed by an infinite loop (__stop_program)
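In C terms, the two memory-initialization steps amount to roughly the following. This is a simplified sketch with made-up function names; the real routines are the assembly loops shown above and get their start and end addresses from the linker.

```c
#include <stddef.h>
#include <stdint.h>

/* What __do_copy_data does: copy the initial values of .data from the
 * load image in flash to the variables' run-time location in RAM. */
void copy_data(uint8_t *ram, const uint8_t *flash_image, size_t len)
{
    while (len--) {
        *ram++ = *flash_image++;
    }
}

/* What __do_clear_bss does: zero every byte of the .bss region. */
void clear_bss(uint8_t *bss, size_t len)
{
    while (len--) {
        *bss++ = 0;
    }
}
```

In the listing above, the copy runs from flash address 0x84 to RAM 0x60..0x6B, and the clear covers the single byte at 0x6C.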
Edit: I forgot to add an -mmcu flag during the linker step, meaning my program was not being compiled for an avr microcontroller. The code itself is correct.
I am using this piece of code to drive a seven segment display:
#include <avr/io.h>
int main(void)
{
DDRA = 0xff;
DDRB = 0xff;
for (;;) {
PORTA = _BV(7);
PORTB = ~0x07;
}
return 0;
}
This works fine, but when I try to set the DDRs in a helper function like this, it no longer works:
#include <avr/io.h>
void initIO(void)
{
DDRA = 0xff;
DDRB = 0xff;
}
int main(void)
{
initIO();
for (;;) {
PORTA = _BV(7);
PORTB = ~0x07;
}
return 0;
}
Why is this incorrect?
This is the disassembled code:
Disassembly of section .text:
00000000 <initIO>:
0: 8f ef ldi r24, 0xFF ; 255
2: 8a bb out 0x1a, r24 ; 26
4: 87 bb out 0x17, r24 ; 23
6: 08 95 ret
00000008 <main>:
8: fb df rcall .-10 ; 0x0 <initIO>
a: 90 e8 ldi r25, 0x80 ; 128
c: 88 ef ldi r24, 0xF8 ; 248
e: 9b bb out 0x1b, r25 ; 27
10: 88 bb out 0x18, r24 ; 24
12: fd cf rjmp .-6 ; 0xe <main+0x6>
If the device model is not specified during the final link step then avr-gcc won't generate the proper preamble required to initialize variables and to call the main() function. Be sure to specify the proper model at each invocation of avr-gcc or avr-ld.
I have a simple code that I am trying to compile with lm32-rtems4.11-gcc.
I have the code, the compile command and the .lst output below. When I compile, I see a bunch of code added at the top instead of the startup code that I want there. The code I want the processor to start with after reset is at location 3f4 instead of 0. I would like help figuring out how the rest of the code got in, and a way to either remove it or move all of it to addresses after my code. I appreciate the help.
Thanks
The code:
//FILE: crt.S
.globl _start
.text
_start:
xor r0, r0, r0
mvhi sp, hi(_fstack)
ori sp, sp, lo(_fstack)
mv fp,r0
mvhi r1, hi(_fbss)
ori r1, r1, lo(_fbss)
mvhi r2, hi(_ebss)
ori r2, r2, lo(_ebss)
1:
bge r1, r2, 2f
sw (r1+0), r0
addi r1, r1, 4
bi 1b
2:
calli main
mvhi r1, 0xdead
ori r2, r0, 0xbeef
sw (r1+0), r2
//FILE: hello_world.c
void putc(char c)
{
char *tx = (char*)0xff000000;
*tx = c;
}
void puts(char *s)
{
while (*s) putc(*s++);
}
void main(void)
{
puts("Hello World\n");
}
//FILE: linker.ld
OUTPUT_FORMAT("elf32-lm32")
ENTRY(_start)
__DYNAMIC = 0;
MEMORY {
pmem : ORIGIN = 0x00000000, LENGTH = 0x8000
dmem : ORIGIN = 0x00008000, LENGTH = 0x8000
}
SECTIONS
{
.text :
{
_ftext = .;
*(.text .stub .text.* .gnu.linkonce.t.*)
_etext = .;
} > pmem
.rodata :
{
. = ALIGN(4);
_frodata = .;
*(.rodata .rodata.* .gnu.linkonce.r.*)
*(.rodata1)
_erodata = .;
} > dmem
.data :
{
. = ALIGN(4);
_fdata = .;
*(.data .data.* .gnu.linkonce.d.*)
*(.data1)
_gp = ALIGN(16);
*(.sdata .sdata.* .gnu.linkonce.s.*)
_edata = .;
} > dmem
.bss :
{
. = ALIGN(4);
_fbss = .;
*(.dynsbss)
*(.sbss .sbss.* .gnu.linkonce.sb.*)
*(.scommon)
*(.dynbss)
*(.bss .bss.* .gnu.linkonce.b.*)
*(COMMON)
. = ALIGN(4);
_ebss = .;
_end = .;
} > dmem
}
The compile command
lm32-rtems4.11-gcc -Tlinker.ld -fno-builtin -o hello_world.elf crt.S hello_world.c
lm32-rtems4.11-objdump -DS hello_world.elf > hello_world.lst
The lst file
00000000 <rtems_provides_crt0>:
#include <signal.h> /* sigset_t */
#include <time.h> /* struct timespec */
#include <unistd.h> /* isatty */
void rtems_provides_crt0( void ) {} /* dummy symbol so file always has one */
0: c3 a0 00 00 ret
00000004 <rtems_stub_malloc>:
#define RTEMS_STUB(ret, func, body) \
ret rtems_stub_##func body; \
ret func body
/* RTEMS provides some of its own routines including a Malloc family */
RTEMS_STUB(void *,malloc(size_t s), { return 0; })
4: 34 01 00 00 mvi r1,0
8: c3 a0 00 00 ret
0000000c <malloc>:
c: 34 01 00 00 mvi r1,0
10: c3 a0 00 00 ret
.
.
.
//omitting other such unrelated code that was inserted into the code and going to the
//code at 3f4 that is the code I wanted at 0
000003f0 <__assert_func>:
3f0: c3 a0 00 00 ret
000003f4 <_start>:
3f4: 98 00 00 00 xor r0,r0,r0
3f8: 78 1c 00 00 mvhi sp,0x0
3fc: 3b 9c ff fc ori sp,sp,0xfffc
400: b8 00 d8 00 mv fp,r0
404: 78 01 00 00 mvhi r1,0x0
408: 38 21 84 48 ori r1,r1,0x8448
40c: 78 02 00 00 mvhi r2,0x0
410: 38 42 84 48 ori r2,r2,0x8448
414: 4c 22 00 04 bge r1,r2,424 <_start+0x30>
418: 58 20 00 00 sw (r1+0),r0
41c: 34 21 00 04 addi r1,r1,4
420: e3 ff ff fd bi 414 <_start+0x20>
424: f8 00 00 28 calli 4c4 <main>
428: 78 01 de ad mvhi r1,0xdead
42c: 38 02 be ef mvu r2,0xbeef
430: 58 22 00 00 sw (r1+0),r2
.
.
.
As far as the .elf object you have generated is concerned, execution starts from 0x3f4, not from location 0. That's a result of your linker map specifying the entry point as the _start symbol. Whatever parses the .elf object should jump to that location when transferring execution to the program.
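To show what "parsing the .elf object" means here, the sketch below pulls the entry address out of a 32-bit ELF header by hand. The function name is made up; the offsets follow the standard ELF32 header layout (class at byte 4, byte order at byte 5, e_entry at byte 24). I would expect an lm32 image to be big-endian, but the code handles both byte orders.

```c
#include <stdint.h>

/* Return e_entry from an ELF32 header, or 0 if the buffer is not ELF32.
 * Caveat of this sketch: 0 is also a legal entry address, so a real
 * loader would report errors separately. */
uint32_t elf32_entry(const uint8_t *hdr)
{
    const uint8_t *e = hdr + 24;     /* e_entry in a 32-bit ELF header */

    if (hdr[0] != 0x7F || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return 0;                    /* missing ELF magic */
    if (hdr[4] != 1)
        return 0;                    /* not a 32-bit ELF file */
    if (hdr[5] == 2)                 /* big-endian data encoding */
        return ((uint32_t)e[0] << 24) | ((uint32_t)e[1] << 16) |
               ((uint32_t)e[2] << 8)  |  (uint32_t)e[3];
    return ((uint32_t)e[3] << 24) | ((uint32_t)e[2] << 16) |
           ((uint32_t)e[1] << 8)  |  (uint32_t)e[0];
}
```

A processor that simply starts fetching at address 0 after reset never looks at this field, which is why the entry symbol alone does not help on bare metal.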
Now, perhaps an .elf object is not what you want to end up with - if the result isn't to be loaded by something which knows how to parse an .elf object, then you may need some other format, such as a flat binary image.
It's quite common when using a gcc elf toolchain with a small embedded chip to turn the .elf object into a flat binary using a command along the lines of
toolchain-prefix-objcopy -O binary something.elf something.bin
It's also possible you may need to create some sort of stub to jump to the _start label, and adjust your linker map to make sure that is the first thing in the image.
More generally though, you can probably find a working example for this toolchain and either this processor or a comparable one. Setting up embedded build systems from scratch is a bit tricky, so don't do it the hard way if there's any chance of finding an example to follow.
So I could not figure out why the compiler does not move the _start label to 0 when linker.ld clearly tells it to do so. But I did figure out a workaround.
I created a section name for the startup code, as shown below. I then created a memory region starting at 0, which I reserved only for this startup code. That seemed to do the trick: I ran the code and got a hello world :). All the changes I made are marked with the comments //Change 1, //Change 2 and //Change 3.
//FILE: crt.S
.section .init// Change 1
.globl _start
_start:
xor r0, r0, r0
mvhi sp, hi(_fstack)
ori sp, sp, lo(_fstack)
mv fp,r0
mvhi r1, hi(_fbss)
ori r1, r1, lo(_fbss)
.
.
//linker.ld
OUTPUT_FORMAT("elf32-lm32")
ENTRY(_start)
__DYNAMIC = 0;
MEMORY {
init : ORIGIN = 0x00000000, LENGTH = 0x40 //Change 2
pmem : ORIGIN = 0x00000040, LENGTH = 0x8000
dmem : ORIGIN = 0x00008000, LENGTH = 0x8000
}
SECTIONS
{
.init : {*(.init)}>init //Change 3
.text :
{
_ftext = .;
*(.text .stub .text.* .gnu.linkonce.t.*)
_etext = .;
} > pmem