p89lpc936 keil programming help required - c

I am trying to program the Blinky example, built with the Keil compiler, into a P89LPC936 microcontroller using a universal programmer (SuperPro), but the microcontroller does not run. When I write a simple program in assembly and program the same hardware, it works fine. Please help me find what I am doing wrong.
Here is code >>>
Code:
/* Blinky.C - LED Flasher for the Keil LPC900 EPM Emulator/Programmer Module */
#include <REG936.H> // register definition
void delay (unsigned long cnt)
{
    while (--cnt);
}
void main()
{
    unsigned char i;
    P1M1 |= 0x20;
    P1M2 &= 0xDF;
    P2M1 &= 0xE7;
    P2M2 |= 0x18;
    delay (20000);
    for(;;)
    {
        for (i = 0x01; i; i <<= 1)
        {
            P2 = i; // simulate running lights
            delay (20000);
        }
        for (i = 0x80; i; i >>= 1)
        {
            P2 = i;
            delay (20000);
        }
    }
}
Here is Hex file >>>
:10006B008F0B8E0A8D098C08780874FF12004DECEB
:06007B004D4E4F70F32210
:100003004391205392DF53A4E743A5187F207E4EEC
:100013007D007C0012006B7B01EB6013F5A07F2059
:100023007E4E7D007C0012006BEB25E0FB80EA7BBB
:1000330080EB60E3F5A07F207E4E7D007C00120004
:070043006BEBC313FB80EA25
:01004A002293
:04FFF00023001E00CC
:08FFF800000000000000000001
:030000000200817A
:0C00810078FFE4F6D8FD75810B02000347
:10004B007401FF3395E0FEFDFC080808E62FFFF670
:10005B0018E63EFEF618E63DFDF618E63CFCF622E9
:00000001FF
And here is the assembly code and its hex file which is working absolutely right.
Code:
; LPC936A1.A51
; Oct 7, 2010 PCB: ?
; Features: ?
; ?
$mod51
RL1 bit P2.3
RL2 bit P2.4
DSEG AT 20H
FLAG1: ds 1
STACK: ds 1
FRL1 bit FLAG1.0 ; Relay 1
CSEG
org 0H
ajmp Reset
org 30H
Reset: mov 0A5H,#0FFH
Start: mov c,FRL1 ;
mov RL1,c
cpl c
mov FRL1,c
mov RL2,c
acall Delay0
ajmp Start
Delay0: mov R7,#250
Delay: mov R6,#61
Delay1: nop
nop
nop
nop
nop
nop
nop
nop
djnz R6,Delay1
djnz R7,Delay
ret
Text: DB '(C) DIGIPOWER 2010'
Text0: DB ' LPC936A1 '
END
And its hex is
:020000000130CD
:1000300075A5FFA20092A3B3920092A411400133D0
:100040007FFA7E3D0000000000000000DEF6DFF2D7
:10005000222843292044494749504F5745522032CE
:0D006000303130204C5043393336413120CF
:00000001FF
Please help, I'm stuck.
Regards
Dani

I haven't worked with Keil tools for a long time and I have never used that micro, so I probably won't be able to help you much.
Did you try running it on the emulator?
Try putting a breakpoint in main and check whether execution stops there. There might be some issue with c_start so that your main isn't being called.
Look at the assembly of the initialization code and check for anything odd. I think you can inspect the assembly code generated by the compiler; you might have to turn on an option to generate intermediate files.
You might also check "Electronics and Robotics" at Stack Exchange, where people working with electronics may be able to provide better help.

You say that you write a program in assembly and it works fine, but not in C. Have you verified that your C environment is configured to place your code and data in the correct spots in memory?
Also, some chips have a "reset vector" that is called when the chip is first powered and also when the chip resets. Does your C environment set this vector correctly? Does it put code that will jump to your program when it starts to run?
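One way to check the reset vector without the toolchain is to decode the Intel HEX record that lands at address 0000 in the posted file. The following is only a minimal sketch in C; the names hex_record, parse_hex_record and ljmp_target_at_reset are hypothetical helpers made up for this illustration, and it assumes records of at most 16 data bytes, as in the file above.

```c
#include <stdint.h>

/* One decoded Intel HEX record (at most 16 data bytes, as in the posted file). */
struct hex_record {
    uint8_t  count;     /* number of data bytes */
    uint16_t address;   /* load address of the first data byte */
    uint8_t  type;      /* 0 = data, 1 = end of file */
    uint8_t  data[16];
};

/* Convert one hex digit; returns -1 for anything else (including '\0'). */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Convert two hex digits; stops at the first bad character. */
static int hex_byte(const char *s)
{
    int hi = hex_nibble(s[0]);
    int lo = (hi < 0) ? -1 : hex_nibble(s[1]);
    return (hi < 0 || lo < 0) ? -1 : ((hi << 4) | lo);
}

/* Parse a line such as ":030000000200817A".
 * Returns 0 on success, -1 on a malformed line or a checksum mismatch. */
static int parse_hex_record(const char *line, struct hex_record *rec)
{
    uint8_t raw[4 + 16 + 1];   /* count, addr hi, addr lo, type, data, checksum */
    unsigned sum = 0;
    int count, total, i;

    if (line[0] != ':')
        return -1;
    count = hex_byte(line + 1);
    if (count < 0 || count > 16)
        return -1;
    total = count + 5;
    for (i = 0; i < total; i++) {
        int b = hex_byte(line + 1 + 2 * i);
        if (b < 0)
            return -1;
        raw[i] = (uint8_t)b;
        sum += (unsigned)b;
    }
    if ((sum & 0xFFu) != 0)    /* all bytes including the checksum sum to 0 mod 256 */
        return -1;

    rec->count = raw[0];
    rec->address = (uint16_t)((raw[1] << 8) | raw[2]);
    rec->type = raw[3];
    for (i = 0; i < count; i++)
        rec->data[i] = raw[4 + i];
    return 0;
}

/* If the record puts an 8051 LJMP (opcode 0x02) at address 0x0000,
 * return its 16-bit target; otherwise return -1. */
static long ljmp_target_at_reset(const struct hex_record *rec)
{
    if (rec->type != 0 || rec->address != 0 || rec->count < 3 || rec->data[0] != 0x02)
        return -1;
    return ((long)rec->data[1] << 8) | rec->data[2];
}
```

Fed the line :030000000200817A from the C build, this reports an LJMP to 0x0081, and the 12-byte record at 0081 in the same file ends in 02 00 03, which reads as a jump on to 0003. So, on the face of it, the reset vector and a startup stub do seem to be present in the posted hex file.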

Disassemble the C, or compile it to assembler, to see what the compiler is doing. What exactly is or isn't working in your C program? Does the LED just glow? Your assembler delay looks to be burning about 140,000 instructions, but the C maybe 40,000. That could make the difference between an LED you can see blinking and one that looks to be on but not blinking.
The C program appears to be setting up registers that the assembler does not. Is there a bug there? Is it disabling something that shouldn't be touched?
The bottom line is that you need to move the two programs toward each other: complicate the assembler until it approaches what the C is doing, and adjust the C toward the assembler (you will have to look at the output of the compiler, though).
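The "about 140,000 instructions" estimate for the assembler delay can be reproduced by counting what the posted Delay0 routine executes. This is a rough sketch only: delay0_instruction_count is a hypothetical helper, it counts instructions rather than machine cycles, and it assumes non-zero loop counts.

```c
/* Instructions executed by the posted routine:
 *   Delay0: mov  R7,#outer
 *   Delay:  mov  R6,#inner
 *   Delay1: nop  (repeated 'nops' times)
 *           djnz R6,Delay1
 *           djnz R7,Delay
 *           ret
 */
static unsigned long delay0_instruction_count(unsigned outer, unsigned inner, unsigned nops)
{
    /* each inner pass runs the nops plus one djnz R6 */
    unsigned long inner_loop = (unsigned long)inner * (nops + 1UL);
    /* each outer pass runs mov R6, the whole inner loop, and one djnz R7 */
    unsigned long per_outer = 1UL + inner_loop + 1UL;
    /* plus the initial mov R7 and the final ret */
    return 1UL + (unsigned long)outer * per_outer + 1UL;
}
```

With the values from the listing (outer 250, inner 61, eight nops) this gives 137,752 instructions per call, which is in line with the figure quoted above.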

Try:
void delay (unsigned long cnt)
{
    while (--cnt) {
#pragma asm
        NOP
#pragma endasm
    }
}

Related

Custom Instruction crashing with SIGNAL 4 (Illegal Instruction): RISC-V (32) GNU-Toolchain with QEMU

I want to understand the process of creating custom instruction extensions for a larger task that involves RISC-V compilation and the QEMU emulator. I have been loosely following this guide: https://chowdera.com/2021/04/20210430120004272q.html, trying to understand where and how the instruction is parsed, implemented and executed in QEMU. I created an entry for my R-type instruction, the same as in the guide. However, when I issue it through an .insn directive in inline asm, the following exception is thrown:
'Program stopped with Signal 4 (Illegal Instruction)'
The code that produced this exception:
#include <stdio.h>
static int custom_cube(int addr)
{
    int cube;
    asm volatile(".insn r 0x7b, 6, 6, %0, %1, x0" : "=r"(cube) : "r"(addr));
    return cube;
}
int main() {
    int a = 3;
    int ret = 0;
    ret = custom_cube((int)&a);
    if(ret == a * a * a) {
        printf("Success");
    }
    else {
        printf("Failed");
    }
    return 0;
}
I have looked into the encoding of the instruction (exactly as the tutorial describes, with the same opcode and bitfields) to see whether there was a conflict or whether the helper functions were incorrectly implemented. I also reconfigured and rebuilt QEMU after the implementation and ran again, with the same error. I then changed the encoding to something different, changed the .insn call accordingly, and got the same behaviour, even after verifying and type-checking. I am still rather new to the process of implementing custom RISC-V instructions through QEMU, and would appreciate any input on anything that either I or this tutorial has neglected. A similar guide using the RiscFree IDE followed a similar process (implementing trans and helper functions, then rebuilding QEMU and running), so I am still unsure why the Linux kernel is throwing this exception. Is there something blindingly obvious I might have missed? Should I have rebuilt the entire toolchain and not just QEMU? From the tutorial, I presumed nothing further was required.
Disassembly under the custom header: With the corresponding source
static int custom_cube(int addr)
{
1018c: fd010113 addi sp,sp,-48
10190: 02812623 sw s0,44(sp)
10194: 03010413 addi s0,sp,48
10198: fca42e23 sw a0,-36(s0)
int cube;
asm volatile (
1019c: fdc42783 lw a5,-36(s0)
101a0: 0c07e7fb .4byte 0xc07e7fb
101a4: fef42623 sw a5,-20(s0)
".insn r 0x7b, 6, 6, %0, %1, x0"
:"=r"(cube)
:"r"(addr)
);
return cube;
101a8: fec42783 lw a5,-20(s0)
}
101ac: 00078513 mv a0,a5
101b0: 02c12403 lw s0,44(sp)
101b4: 03010113 addi sp,sp,48
101b8: 00008067 ret
(Since it doesn't disassemble as an existing RISC-V instruction, I presume that I did put it in a free spot. I changed the encoding a fair few times and still experienced this, even after checking for all other possible overlaps, none of which were found, so I suspect it has to do with the implementation or its call.)
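As a sanity check on the encoding, the word in the disassembly can be rebuilt by hand from the .insn fields. The sketch below packs the standard RISC-V R-type layout (funct7, rs2, rs1, funct3, rd, opcode); encode_rtype is a hypothetical helper written just for this check, and the register numbers assume a5 is x15, as in the listing.

```c
#include <stdint.h>

/* Pack an R-type instruction word:
 *   [31:25] funct7  [24:20] rs2  [19:15] rs1  [14:12] funct3  [11:7] rd  [6:0] opcode
 */
static uint32_t encode_rtype(uint32_t opcode, uint32_t funct3, uint32_t funct7,
                             uint32_t rd, uint32_t rs1, uint32_t rs2)
{
    return ((funct7 & 0x7Fu) << 25) |
           ((rs2    & 0x1Fu) << 20) |
           ((rs1    & 0x1Fu) << 15) |
           ((funct3 & 0x07u) << 12) |
           ((rd     & 0x1Fu) << 7)  |
           (opcode  & 0x7Fu);
}
```

Calling encode_rtype(0x7b, 6, 6, 15, 15, 0) reproduces the 0x0c07e7fb shown at 101a0, so the assembler side appears to be emitting what was asked for, and the question is then whether QEMU's decoder has a matching entry. One thing that may be worth checking for other encodings: opcode 0110011 with funct3 111 and funct7 0100000 is the slot the Zbb extension uses for andn, so that choice could collide with an existing decoder entry depending on the QEMU version and the enabled extensions.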
UPDATE TO ISSUE: I have since used a different encoding for the instruction
OPCODE = "0110011", FUNCT3 = "111" and FUNCT7 = "0100000".
Upon recompiling QEMU and running this using the .insn directive, QEMU appears to just hang upon decoding the instruction - I am unsure what further implementation is required after using TCG directives.

Is there any way to generate inline assembly programmatically?

In my program I need to insert NOP as inline assembly into a loop, and the number of NOPs can be controlled by an argument. Something like this:
char nop[] = "nop\nnop";
for(offset = 0; offset < CACHE_SIZE; offset += BLOCK_SIZE) {
asm volatile (nop
:
: "c" (buffer + offset)
: "rax");
}
Is there any way to tell compiler to convert the above inline assembly into the following?
asm volatile ("nop\n"
"nop"
:
: "c" (buffer + offset)
: "rax");
Well, there is this trick you can do:
#define NOPS(n) asm volatile (".fill %c0, 1, 0x90" :: "i"(n))
This macro inserts the desired number of nop instructions into the instruction stream. Note that n must be a compile time constant. You can use a switch statement to select different lengths:
switch (len) {
case 1: NOPS(1); break;
case 2: NOPS(2); break;
...
}
You can also do this for more code size economy:
if (len & 040) NOPS(040);
if (len & 020) NOPS(020);
if (len & 010) NOPS(010);
if (len & 004) NOPS(004);
if (len & 002) NOPS(002);
if (len & 001) NOPS(001);
Note that you should really consider using pause instructions instead of nop instructions for this sort of thing as pause is a semantic hint that you are just trying to pass time. This changes the definition of the macro to:
#define NOPS(n) asm volatile (".fill %c0, 2, 0x90f3" :: "i"(n))
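To convince yourself that the bit-test ladder emits exactly len nops for any len below 64, the asm can be swapped for a counter and checked on any host. This is only an illustration: the NOPS stand-in below adds to a counter instead of emitting instructions, and nops_selected is a hypothetical name.

```c
/* Stand-in for the real NOPS(n) macro: instead of emitting n nop bytes it
 * adds n to a counter, so only the selection logic is exercised here. */
#define NOPS(n) (emitted += (n))

/* Returns how many nops the ladder would emit for a given run-time len.
 * Only the low six bits of len are examined (octal 040 down to 001). */
static unsigned nops_selected(unsigned len)
{
    unsigned emitted = 0;
    if (len & 040) NOPS(040);
    if (len & 020) NOPS(020);
    if (len & 010) NOPS(010);
    if (len & 004) NOPS(004);
    if (len & 002) NOPS(002);
    if (len & 001) NOPS(001);
    return emitted;
}
```

Each constant is a distinct power of two, so the six tests together rebuild the binary representation of len; values of 64 and above would need more rungs.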
No, the inline asm template needs to be compile-time constant, so the assembler can assemble it to machine code.
If you want a flexible template that you modify at run-time, that's called JIT compiling or code generation. You normally generate machine-code directly, not assembler source text which you feed to an assembler.
For example, see this complete example which generates a function composed of a variable number of dec eax instructions and then executes it. Code golf: The repetitive byte counter
BTW, dec eax runs at 1 per clock on all modern x86 CPUs, unlike NOP which runs at 4 per clock, or maybe 5 on Ryzen. See http://agner.org/optimize/.
A better choice for a tiny delay might be a pause instruction, or a dependency chain of some variable number of imul instructions, or maybe sqrtps, ending with an lfence to block out-of-order execution (at least on Intel CPUs). I haven't checked AMD's manuals to see if lfence is documented as being an execution barrier there, but Agner Fog reports it can run at 4 per clock on Ryzen.
But really, you probably don't need to JIT any code at all. For a one-off experiment that only has to work on one or a few systems, hack up a delay loop with something like
for (int i=0 ; i<delay_count ; i++) {
    asm volatile("" : "+r" (i)); // defeat optimization
}
This forces the compiler to have the loop counter in a register on every iteration, so it can't optimize the loop away, or turn it into a multiply. You should get compiler-generated asm like delayloop: dec eax; jnz delayloop. You might want to put _mm_lfence() after the loop.

SPARC assembly jmp \boot

I'll explain the problem briefly. I have a Leon3 board (gr-ut-g99). Using GRMON2 I can load executables at the desired address in the board.
I have two programs. Let's call them A and B. I tried to load both in memory and individually they work.
What I would like to do now is to make the A program call the B program.
Both programs are written in C using a variant of the gcc compiler (the Gaisler Sparc GCC).
To do the jump I wrote a tiny inline assembler function in program A that jumps to a memory address where I loaded the program B.
Below is a snippet of program A:
unsigned int return_address;
unsigned int * const RAM_pointer = (unsigned int *) RAM_ADDRESS;
printf("RAM pointer set to: 0x%08x \n",(unsigned int)RAM_pointer);
printf("jumping...\n");
__asm__(" nop;" //clean the pipeline
"jmp %1;" // jmp to programB
:"=r" (return_address)
:"r" (RAM_pointer)
);
RAM_ADDRESS is a #define
#define RAM_ADDRESS 0x60000000
The program B is a simple hello world. The program B is loaded at the 0x60000000 address. If I try to run it, it works!
int main()
{
printf ("HELLO! I'M BOOTED! \n");
fflush(stdout);
return 0;
}
What I expect when I run program A is to see the "jumping..." message on the console, followed by "HELLO! I'M BOOTED!" from program B.
What happens instead is an IU exception.
Below are the messages shown by the grmon2 monitor. I also include the "inst" report, which should show the last operations performed before the exception.
grmon2> run
IU exception (tt = 0x07, mem address not aligned)
0x60004824: 9fc04000 call %g1
grmon2> inst
TIME ADDRESS INSTRUCTION RESULT SYMBOL
407085 600047FC mov %i3, %o2 [600063B8] -
407086 60004800 cmp %i4 [00000013] -
407089 60004804 be 0x60004970 [00000000] -
407090 60004808 mov %i0, %o0 [6000646C] -
407091 6000480C mov %i4, %o3 [00000013] -
407092 60004810 cmp %i4, %l0 [80000413] -
407108 60004814 bleu 0x60004820 [00000000] -
407144 60004818 ld [%i1 + 0x20], %o1 [FFFFFFFF] -
407179 60004820 ld [%i1 + 0x28], %g1 [FFFFFFFF] -
407186 60004824 call %g1 [ TRAP ] -
I also tried to substitute the "jmp" with a "jmpl" or a "call", but that did not work either.
I'm quite confused.
I do not know how best to approach the problem, and therefore I do not know what other information I need to provide.
I can say that program B is loaded at 0x60000000 and its entry point is, of course, 0x60000000. Running program B directly from that entry point works fine!
Thanks in advance for your help!
Looks to me like you did execute the jump, and it got to program B, as evidenced by the addresses of the instructions in the trace buffer. But where you crashed was in stdio trying to print stuff. Stdio makes extensive use of function pointers, and the sequence clearly shows a call instruction with the target address in a register, which indicates use of a function pointer.
I suggest putting fflush(stdout) in program A just before the jump, and this will allow you to see the messages before doing the jump. Then, in program B, instead of using printf, just put some known value in memory that you can examine later via the monitor to verify that it got there.
My guess is that the stdio library has some data or parameter that needs to be set up at the start of the program, and that's not being done or not done properly. Not sure about the platform you are running on, but do you have some sort of debugging or single stepping ability, like in a debugger? If so, just single step through the jump and follow where the program goes.
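The trace fits that reading: the load just before the trap returned FFFFFFFF into %g1, and the following call %g1 faulted with tt = 0x07 (mem address not aligned). SPARC instructions are 4 bytes wide, so an indirect jump or call target needs its two low bits clear. A tiny sketch of that check, with is_valid_sparc_target as a hypothetical helper name:

```c
#include <stdint.h>

/* A SPARC jmpl/call target must be word aligned: the two low address bits
 * have to be zero, otherwise the CPU raises mem_address_not_aligned. */
static int is_valid_sparc_target(uint32_t addr)
{
    return (addr & 0x3u) == 0;
}
```

The entry point 0x60000000 passes this check while 0xFFFFFFFF does not, which suggests the jump into program B itself was fine and that the function pointer read from memory afterwards was never initialised.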

keil 4 and stm32f4discovery fpu not working

////// EDIT: SOLVED, read solution below
I'm trying to use the FPU on the stm32f4Discovery board, programmed with Keil 4 (free version), but when I try to use it the program enters an infinite loop.
I don't know exactly why, I'm using a very simple code in C and the debugger:
#include "stm32f4_discovery.h"
#include <stdio.h>
float a = 1.332, b = 2.994;
int main(void)
{
printf("Hola");
printf("%f",a*b);
return(0);
}
Here are the results from the debugger: nothing in the printf viewer, and an infinite loop because a "Hard Fault exception occurs" (image here, imgur)
Without the line printf("%f",a*b) the debugger shows "Hola" correctly and the program ends.
I have been searching Google for a possible solution since I used this board for a university project months ago, but nobody seems to know how to fix it.
I know I can disable the FPU and use libraries, but that's not the point...
Thank you for your help
/////////////// SOLUTION
I had to change the code in startup_stm32f4xx.s and the function SystemInit() from system_stm32f4xx.c
In startup_stm32f4xx.s, search for the Reset Handler, the code should look like this, but the part below ";FPU settings" is mostly not in the original:
; Reset handler
Reset_Handler PROC
EXPORT Reset_Handler [WEAK]
IMPORT SystemInit
IMPORT __main
;FPU settings
LDR R0, =0xE000ED88 ; Enable CP10,CP11
LDR R1,[R0]
ORR R1,R1,#(0xF << 20)
STR R1,[R0]
LDR R0, =SystemInit
BLX R0
LDR R0, =__main
BX R0
ENDP
Then add these lines in system_stm32f4xx.c, in the function SystemInit(void):
void SystemInit(void)
{
/* FPU settings ------------------------------------------------------------*/
#if (__FPU_PRESENT == 1) && (__FPU_USED == 1)
SCB->CPACR |= ((3UL << 10*2)|(3UL << 11*2)); /* set CP10 and CP11 Full Access */
#endif
The debugger now shows the result of the operation in the printf viewer. I don't really know whether it's using the FPU, but I will test it tomorrow (it should be faster now).
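For what it's worth, the assembly in the reset handler and the C line in SystemInit() set the same four CPACR bits; two bits per coprocessor, CP10 at bits 20-21 and CP11 at bits 22-23. The host-side sketch below only checks that arithmetic; the macro names and cpacr_enable_fpu are hypothetical and it does not touch any hardware register.

```c
#include <stdint.h>

/* Full-access field (0b11) for coprocessors 10 and 11; each coprocessor
 * owns a 2-bit field at bit position 2 * n in CPACR. */
#define CPACR_CP10_FULL (3UL << (10 * 2))
#define CPACR_CP11_FULL (3UL << (11 * 2))

/* Returns the CPACR value with CP10 and CP11 set to full access,
 * leaving every other bit of the input unchanged. */
static uint32_t cpacr_enable_fpu(uint32_t cpacr)
{
    return cpacr | (uint32_t)(CPACR_CP10_FULL | CPACR_CP11_FULL);
}
```

Starting from zero this yields 0x00F00000, the same mask as the #(0xF << 20) in the ORR instruction, so either place should be enough on its own as long as it runs before the first floating-point instruction.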
Source here

Empty array when printed

I'm writing a C/asm program for an AVR MCU. I'm still learning as I go, so I expect I have made some sort of mistake in my code.
I have a buffer volatile unsigned char suart_0_rx_buffer[SUART_0_BUF_SIZE+1]; in my C code that I access in my asm code as shown below. All I want to do is store a byte s0_Rxbyte in the buffer and increment the index s0_index every time. s0_Rxbyte is always a non-zero value.
suart_0_wr_buf_2: ldi s0_z_low, lo8(suart_0_rx_buffer)
ldi s0_temp1, hi8(suart_0_rx_buffer)
add s0_z_low, s0_index
adc s0_z_high,s0_temp1
suart_0_wr_buf_3: st Z+, s0_Rxbyte
inc s0_index
clr s0_temp1
st Z, s0_temp1
If I try to print the contents in a loop in my C code, I get absolutely nothing.
I didn't want to attach everything here because it would be cluttered.
So does anyone see any problems with the asm code above?
I managed to figure it out in the end. A simple error in the assembly code caused it to write to an incorrect location in SRAM: s0_z_high was never loaded with the high byte of the buffer address.
suart_0_wr_buf_2: clr s0_temp1
ldi s0_z_low, lo8(suart_0_rx_buffer)
ldi s0_z_high, hi8(suart_0_rx_buffer)
add s0_z_low, s0_index
adc s0_z_high, s0_temp1
suart_0_wr_buf_3: st Z+, s0_Rxbyte
inc s0_index
st Z, s0_temp1
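The difference between the two versions is easier to see as plain arithmetic on the Z pointer. Below is a small C model of both sequences; z_fixed and z_original are hypothetical names, and stale_zh stands for whatever happened to be in s0_z_high before the routine ran.

```c
#include <stdint.h>

/* Corrected sequence: ZL and ZH are loaded from the buffer address, the
 * index is added to ZL, and only the carry is added to ZH (adc with a
 * cleared register). */
static uint16_t z_fixed(uint16_t base, uint8_t index)
{
    uint8_t zl = (uint8_t)(base & 0xFF);
    uint8_t zh = (uint8_t)(base >> 8);
    uint16_t sum = (uint16_t)(zl + index);
    uint8_t carry = (uint8_t)(sum >> 8);       /* 0 or 1 */
    zl = (uint8_t)sum;
    zh = (uint8_t)(zh + carry);
    return (uint16_t)((zh << 8) | zl);
}

/* Original sequence: hi8(buffer) went into the temp register, so
 * adc added it (plus carry) to whatever ZH already held. */
static uint16_t z_original(uint8_t stale_zh, uint16_t base, uint8_t index)
{
    uint8_t zl = (uint8_t)(base & 0xFF);
    uint8_t temp = (uint8_t)(base >> 8);
    uint16_t sum = (uint16_t)(zl + index);
    uint8_t carry = (uint8_t)(sum >> 8);
    zl = (uint8_t)sum;
    return (uint16_t)(((uint8_t)(stale_zh + temp + carry) << 8) | zl);
}
```

The original only lands in the buffer when ZH happens to be zero on entry; with any other leftover value the store goes to a different page of SRAM, which would match the empty buffer seen from C.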
