C18 Passing a char array to function - c

I am new to C programming and microcontrollers. I am using a PIC18F24K20 microcontroller with C18. I have it set up to receive information from a computer input using USART transmit and receive functions. My goal is to compare the received word against known words, and transmit something back to the computer based on what word was received. Below is the relevant code.
#include "p18f24k20.h"
#include "delays.h"
#include "string.h"
#include "stdlib.h"
void CommTransmit ( rom char * );
void main (void)
{
char buf[11], data, T;
int i;
i = 0;
memset(buf, 0, sizeof buf);
while(1)
{
if (PIR1bits.RCIF)
{
data = USART_receive();
if (data != 47) // 47 is /, indicates end of string
{
buf[i] = data;
i++;
}
else
{
// T = strcmppgm2ram(buf,(const far rom char*)"test");
CommTransmit(buf);
USART_transmit('t');
buf[0] = 0;
}
}
}
}
void CommTransmit ( rom char *CommVariable )
{
char test;
test = strcmppgm2ram(CommVariable, (const far rom char*)"test");
if (test == 0)
{
USART_transmit('g');
}
}
The code is currently set up as a test to determine what is wrong. If I run it as is, the computer receives a 't', as if the microcontroller had run through the CommTransmit function. However, it never transmits the 'g'. Even if I put a USART_transmit('g') call in CommTransmit, outside of and after the if statement, it never gets called (as if it gets stuck in strcmppgm2ram?), yet the 't' is still transmitted.
It is also strange because if I put a breakpoint at the CommTransmit function and step through line by line, it seems to work properly. However, if I watch CommVariable inside MPLAB IDE, it is never what it is supposed to be (though 'buf' is correct just before it is passed to the function). From what I can tell, the value of CommVariable when I watch it depends on the size of the array.
From reading, I think it may be caused by how the microcontroller stores the variable (program vs data memory?), but I'm not sure. Any help is greatly appreciated!
edit: I should also add that if I uncomment the T = strcmppgm2ram line in the else statement before the CommTransmit line, it works properly (T = 0 when the two strings are the same). I believe the array changes when I pass it through the function, which causes the strcmppgm2ram function to not work properly.

Looking at the signature of strcmppgm2ram:
signed char strcmppgm2ram(const char * str1, const rom char * str2 );
I don't understand why you have rom char * for CommVariable. From chapter 2.4.3, ram/rom Qualifiers, of the MPLAB® C18 C Compiler User’s Guide:
Because the PICmicro microcontrollers use separate program memory and
data memory address busses in their design, MPLAB C18 requires
extensions to distinguish between data located in program memory and
data located in data memory. /---/ Pointers can point to either data memory (ram pointers) or program
memory (rom pointers). Pointers are assumed to be ram pointers unless
declared as rom.
And in 2.7.3 String Constants:
An important consequence of the separate address spaces for MPLAB C18
is that pointers to data in program memory and pointers to data in
data memory are not compatible. /---/ because they refer to different
address spaces. /---/
MPLAB C18 automatically places all string constants in program memory.
This type of a string constant is “array of char located in program
memory”, (const rom char []).
The purpose of casting the second argument to const far rom char* is also unclear. It may cause stack corruption, because a far pointer is larger (24 bits). So it looks like the function should be rewritten as:
void CommTransmit (const char *CommVariable )
{
if (!strcmppgm2ram(CommVariable, "test")) {
USART_transmit('g');
}
}
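To try the corrected signature off-target, here is a minimal host-side sketch. It assumes stand-ins that are not part of C18: `rom` is defined as an empty macro and `strcmppgm2ram` is emulated with `strcmp`. The helper `feed_byte` and `BUF_SIZE` are hypothetical names. The sketch also bounds the index and resets it after each word, which the loop in the question does not do; that is an observation, not something the answer above claims.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: host-side stand-ins so the sketch compiles off-target.
   On C18, `rom` is a real qualifier and strcmppgm2ram is a library routine. */
#define rom
static signed char strcmppgm2ram(const char *ram_str, const rom char *rom_str)
{
    int r = strcmp(ram_str, rom_str);
    return (signed char)((r > 0) - (r < 0));
}

#define BUF_SIZE 11  /* same size as buf in the question */

/* Hypothetical helper: feed one received byte into buf (a RAM buffer).
   Returns 1 when '/' terminates a word equal to "test", otherwise 0. */
static int feed_byte(char *buf, size_t *len, char data)
{
    int match;

    if (data != '/') {
        if (*len < BUF_SIZE - 1) {   /* leave room for the terminator */
            buf[(*len)++] = data;
        }
        return 0;
    }
    buf[*len] = '\0';                /* terminate before comparing */
    match = (strcmppgm2ram(buf, "test") == 0);
    *len = 0;                        /* start the next word at index 0 */
    return match;
}
```

On the real target, the first parameter stays a plain (ram) pointer, as in the rewritten CommTransmit above, and only the string constant lives in program memory.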

Related

Why does initializing C local character arrays internally store the strings in different stack/data segments?

While working on some position-independent C injected shellcode, the strings were initially coded using this array initialization
char winexec[] = "WinExec";
However, this caused the shellcode to fail because the string WinExec was stored in the data segment of the injector but the injectee did not have access to that data.
To fix, the array initialization was changed to
char winexec[] = { 'W','i','n','E','x','e','c','\0' };
which worked perfectly because the string was stored in the injectee local stack segment.
For example https://godbolt.org/z/v8cqn5E56
#include <stdio.h>
int main()
{
/* String stored in the stack segment */
char winexecStack[] = { 'W','i','n','E','x','e','c','\0' };
/* String stored in the data segment */
char winexecData[] = "WinExec";
printf("Stack Segment: %s\n", winexecStack);
printf("Data Segment: %s\n", winexecData);
return 0;
}
Question
Why does C have multiple ways to initialize local arrays which externally appear the same, but internally the strings are stored very differently?
Do tidier methods exist to initialize a C character array on the stack?
Maybe something like
char winexecStack[8];
winexecStack[0] = 'W';
winexecStack[1] = 'i';
winexecStack[2] = 'n';
winexecStack[3] = 'E';
winexecStack[4] = 'x';
winexecStack[5] = 'e';
winexecStack[6] = 'c';
winexecStack[7] = '\0';
or convert strings such as Hello, World! to little endian values in an array
unsigned long long hello[] = { 0x57202C6F6C6C6548,0x00000021646C726F };
printf("Stack Segment: %s\n", (char*)&hello);
Perhaps for strings <= 8 bytes, they could be represented as a numerical value, stored on the stack but treated as a char* for example "WinExec"
unsigned long long winexec = 0x00636578456e6957;
printf("Stack Segment: %s\n", (char*)&winexec);
Why does C have multiple ways to initialize local arrays which externally appear the same, but internally the strings are stored very differently?
It doesn't. That you observe the source data for the initializers to be stored differently in the two cases is a function of your C implementation. It is not required by the C language itself. More generally, C has a lot to say about what is stored, but less to say about how it is stored, and almost nothing to say about where it is stored.
Do tidier methods exist to initialize a C character array on the stack?
A valid character array initializer takes one of the two forms you show.
Note also that "on the stack" is not a C concept (refer to "almost nothing to say about where").
Turning on optimization with /O2 makes the difference vanish. This suggests that, without optimization, the compiler implements C somewhat literally, putting the array induced by a string literal in a data segment (for static storage) while individual character initializers are treated as small constants. With optimization turned on, the compiler performs deeper semantic analysis and optimizes the generated code, and in fact the constant proposed in the question, 0x00636578456e6957, is seen in the generated assembly.
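As a small check of the "externally appear the same" point, the sketch below compares the two initializer forms from the question. The function name `initializers_match` is made up for illustration; the sketch only shows that the resulting arrays have the same size and contents, and says nothing about where a given compiler keeps the initializer data.

```c
#include <string.h>

/* Returns 1 if the two initializer forms yield identical arrays. */
static int initializers_match(void)
{
    char from_list[] = { 'W','i','n','E','x','e','c','\0' };
    char from_literal[] = "WinExec";

    /* Both arrays hold 8 chars: 7 letters plus the terminating '\0'. */
    return sizeof from_list == sizeof from_literal
        && memcmp(from_list, from_literal, sizeof from_list) == 0;
}
```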

C how can I decrease my program size

I am writing a C program for a microcontroller and I have managed to fill 32 KB with code.
The program is full with the following 'print' function:
void print(char *x)
{
while(*x) {
SerialWrite(*x++);}
}
SerialWrite looks like:
void SerialWrite(unsigned char c)
{
while(tx_buffer_size>250);
ES=0;
tx_buffer_size++;
if (tx_buffer_empty == 0){
txBuffer[tx_in++] = c;}
else {
tx_buffer_empty = 0;
SBUF = c;}
ES=1;
}
I call the print function with a string literal as its argument, like:
print("Hello World");
I have only used 2.2 KB of my 32 KB external memory, so my conclusion is to move "Hello World" to the external memory instead of having it hardcoded in the main program.
Unfortunately my attempts actually made it worse, I did the following:
char xdata *msg1 = "Hello World"; // <-- this made it worse, and the external memory space was not used at all
print(msg1);
Then I tried:
char xdata msg1[] = "Hello World";
print(msg1);
This did increase the external memory size by 12 which is correct as the string contains 11 chars + the null. But the program memory also increased by 3. I tried different string lengths but the program memory keeps increasing by 3 bytes.
How can I solve this problem?
Addition:
I am using Keil's C compiler and I compile with favor for program size.
The uController is an FPGA chip which has an emulated 80C51 chip. For this virtual 80c51 I am writing the code. I have 32kb memory for the main program and 32kb for variables
EDIT: *msg[] was a typo, it had to be *msg
char xdata *msg1 = "Hello World";
char xdata msg1[] = "Hello World";
I don't think you can shorten your program this way: even if the message is accessed from xdata, the C runtime first has to copy it there from code (initialized data).
If you really have a lot of print() calls, you can squeeze out some bytes by declaring the correct pointer type for its parameter. To explain: due to the strange architecture of the 8051, a "high level" C "generic" pointer takes three bytes: two bytes for the offset, and a third byte to specify whether it points into XDATA or CODE. Every call to print() has to construct such a pointer.
If, for example, you declare:
void print(char xdata *x)
now print() only accepts pointers into xdata, and only two-byte pointers are required. Every call to print() will be simpler.
It seems that most of your texts will be constant, so they will be in code; the pointer type you need is therefore not XDATA but CODE (or whatever the correct keyword is).
Once you have declared print() with a code pointer, you will no longer be able to print() data that is not in the code segment; perhaps you will need another function, like
printx(char* what)
which will accept char* from any segments, like before, but you will call this costly printx() only when needed.
Hope this helps; the 8051 is very peculiar and compilers for it are complicated (but Keil is superb). I would suggest taking a look at the generated (low level) code; maybe you can spot some other memory waste. Good luck!
Addendum:
I once had very little room for texts myself. I solved it nicely by applying some text compression. Given that all the texts were plain ASCII, I used characters over 127 to point to common, repeated words in the texts. For example, "Hello world" can be replaced with 0x80 " world" if the string "Hello" is repeated a few times. Of course, the print routine must detect those special markers and manage the decompression, but this is very easy.
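The marker scheme from the addendum can be sketched roughly as follows. This is an illustration under assumptions, not the original routine: the dictionary `words`, the function `expand`, and the choice to write into a buffer (instead of calling SerialWrite for each character) are all made up here so the sketch can run on a host.

```c
#include <stddef.h>

/* Hypothetical dictionary: byte 0x80 + n expands to words[n]. */
static const char *const words[] = { "Hello", "world" };
#define WORD_COUNT (sizeof words / sizeof words[0])

/* Expands src into dst (capacity cap, including the terminator).
   Returns the number of characters written, not counting the '\0'. */
static size_t expand(const char *src, char *dst, size_t cap)
{
    size_t n = 0;

    if (cap == 0) {
        return 0;
    }
    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;

        if (c >= 0x80 && (size_t)(c - 0x80) < WORD_COUNT) {
            /* Marker byte: copy the dictionary word it refers to. */
            const char *w = words[c - 0x80];
            while (*w != '\0' && n + 1 < cap) {
                dst[n++] = *w++;
            }
        } else if (n + 1 < cap) {
            /* Plain character (or unknown marker): copy it as is. */
            dst[n++] = (char)c;
        }
    }
    dst[n] = '\0';
    return n;
}
```

On the 8051, the inner copy would presumably be replaced by a SerialWrite call per character, and the dictionary would be placed in code memory.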
In short, put this line in the beginning of your source file:
#pragma STRING (XDATA)
These pages may help you understand better:
http://www.keil.com/support/man/docs/c51/c51_le_const.htm
http://www.keil.com/support/man/docs/c51/c51_string.htm
2018/06/25 edit:
try this line:
#pragma O2 STRING (XDATA)
As stated in the 2nd link above and on this page, OMF2 must be enabled to use the STRING directive:
http://www.keil.com/support/man/docs/c51/c51_omf2.htm

Can't align "pointer targets signedness"

I've been struggling with this issue for a while where this code
uint8_t *PMTK = "$PSIMIPR,W,115200*1C";
gives me the error
pointer targets in initialization differ in signedness [-Wpointer-sign]
Changing it to only char * or unsigned char * does not make a difference, and const char * causes the program to complain further down where PMTK is supposed to be used, in the following code:
if (HAL_UART_Transmit(&huart3, PMTK, 32, 2000) != HAL_TIMEOUT)
{
HAL_GPIO_TogglePin(GPIOB, GPIO_PIN_7);
HAL_Delay(500);
HAL_GPIO_TogglePin(GPIOB, GPIO_PIN_7);
}
else
{ ....
The program is supposed to establish uart communication from STM32F0xx to a GPS receiver (SIM33ELA), using HAL driver.
Yeah, that's a really annoying corner of the STM32 Cube libs. Someone should give them a big clue that random read-only buffers are best expressed as const void * in C ... mumble.
So, to fix it: It's convenient to use a string literal since the data is textual. So, make it so and then cast in the call, instead:
const char PMTK[] = "$PSIMIPR,W,115200*1C";
if (HAL_UART_Transmit(&huart3, (uint8_t *) PMTK, strlen(PMTK), 2000) != HAL_TIMEOUT)
Note the use of strlen() to get the proper length. Hardcoding literal values is never the right choice and was broken here (the string is 20 characters long, not 32). We could use sizeof too (it's an array, after all), but that's a bit more error-prone since you must subtract 1 for the terminator. I'm pretty sure compilers will optimize this strlen() call out, anyway.
C string literals have type char[]. If you add a cast to (uint8_t *) before the literal, the warning is silenced.
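A quick host-side check of the two length calculations, without any HAL dependency. The function names `pmtk_length` and `pmtk_length_sizeof` are hypothetical and exist only for this sketch.

```c
#include <stddef.h>
#include <string.h>

static const char PMTK[] = "$PSIMIPR,W,115200*1C";

/* Length to transmit: the characters only, without the terminating '\0'. */
static size_t pmtk_length(void)
{
    return strlen(PMTK);
}

/* Same value derived from the array size; the -1 drops the terminator. */
static size_t pmtk_length_sizeof(void)
{
    return sizeof PMTK - 1;
}
```

Both return 20 for this string. The sizeof form only works while PMTK is an array; if it is ever changed to a pointer, sizeof silently yields the pointer size instead.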
uint8_t *PMTK = (uint8_t *)"$PSIMIPR,W,115200*1C";

returned pointer address getting modified when ASLR turned on

I have this piece of C code running on a development box with ASLR enabled. It returns a char pointer (char *) to its caller, but somehow a few bytes of the returned pointer are getting changed; printf output below:
kerb_selftkt_cache is 0x00007f0b8e7fc120
cache_str from get_self_ticket_cache 0xffffffff8e7fc120
The char pointer 0x00007f0b8e7fc120 is returned to another function, where it shows up as 0xffffffff8e7fc120. The upper 4 bytes differ (0xffffffff instead of 0x00007f0b) while the lower 4 bytes (8e7fc120) are the same. Any idea what might be going on, and how can I fix it? The code is running on 64-bit Linux on an Intel Xeon. This code is from an existing proprietary library, so I can't share the exact code, but the logic looks something like this:
typedef struct mystr {
int num;
char addr[10];
}mystr;
static mystr m1;
char *get_addr() {
return m1.addr;
}
void myprint() {
printf("mystr m1 address %p\n",&m1);
printf("mystr m1 addr %p\n",m1.addr);
}
int main (int argc, char *argv[]) {
char *retadd;
myprint();
retadd = get_addr();
printf("ret address %p\n",retadd);
return 0;
}
retadd and m1.addr are different when ASLR is turned on.
My guess is the func takes an int or something else only 4 bytes wide, then the argument gets cast to a pointer type, which sign-extends it. The compiler (gcc?) should warn you about that even without flags like -Wall, but hey, maybe you have weird-ass macros or something that obfuscate it.
Alternatively, if the pointer is returned rather than passed as an argument, this could easily be explained by C defaulting to int as the return type when the function declaration is missing. So in that case, make sure every function is declared before it is called.
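The arithmetic behind that guess can be reproduced with plain integers. The sketch below models a 64-bit pointer value passing through a 32-bit int; `through_int` is a hypothetical name, and the narrowing conversion is implementation-defined in older C standards, so two's complement wraparound (as GCC and Clang do on x86-64) is assumed.

```c
#include <stdint.h>

/* Models a 64-bit pointer value squeezed through a 32-bit int
   and widened back to 64 bits.
   Assumption: out-of-range narrowing wraps (two's complement). */
static uint64_t through_int(uint64_t ptr_value)
{
    int32_t narrowed = (int32_t)(uint32_t)ptr_value; /* keeps the low 32 bits */
    return (uint64_t)(int64_t)narrowed;              /* sign-extends bit 31 */
}
```

For the address in the question, the low half 0x8e7fc120 has its top bit set, so widening fills the upper half with ones and `through_int(0x00007f0b8e7fc120)` gives 0xffffffff8e7fc120, matching the printf output. With ASLR off, addresses that fit in 31 bits would survive the round trip, which would explain why the problem only shows up with ASLR on.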

examining and modifying memory addresses in C

I want to essentially do the following (which is probably dangerous and what not) just for the heck of it:
int main() {
int x = 0x00ff00ff;
printf("Value at addr x: %x\n",*x);
return 0;
}
Basically take a look at the contents of a certain address in my machine. Maybe write to it. I'm guessing I'm not allowed to do the latter.
The error I get is error: invalid type argument of 'unary *'.
Is there any way to do this?
You need a pointer:
int *x = (int*)0x00ff00ff;
And you're right, it's probably not a good idea, unless you know that 0x00ff00ff is a valid address of some sort. It's not actually undefined behaviour since the standard says you can't dereference illegal addresses but then states that "illegal" includes things like:
addresses of freed heap objects.
NULL pointers.
wrong alignment.
but doesn't explicitly list arbitrary pointer values, since that would make memory-mapped I/O in embedded systems problematic.
For example, you may control a UART (universal asynchronous receiver/transmitter, basically a serial port device) in an embedded system by reading or writing known memory-mapped I/O addresses:
#define UART_READ_READY ((volatile char*)0xff00)
#define UART_READ_CLEAR ((volatile char*)0xff01)
#define UART_DATA ((volatile char*)0xff02)
char getUartCharWithWait (unsigned int tries) {
char retChar;
unsigned int limit;
// Keep looping until character available, at least for a while.
limit = tries;
while (*UART_READ_READY == 0)
if (limit-- == 0)
return '\0';
// Get character, tell UART to clear it, then return it.
retChar = *UART_DATA;
*UART_READ_CLEAR = 1;
return retChar;
}
In this example, you have code like:
retChar = *UART_DATA;
which will read a byte (C char) from "memory" address 0xff02, which will actually be from a device monitoring the address bus and intercepting specific addresses.
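Since the addresses above only make sense on that particular hardware, here is a hedged host-side variant of the same polling pattern. It assumes a small RAM array (`fake_regs`, a made-up name) standing in for the device registers, so the function can be exercised without real memory-mapped I/O; the pointers are volatile so the compiler does not optimize the polling loop away.

```c
#include <stdint.h>

/* Assumption: a RAM array stands in for the three device registers. */
static volatile uint8_t fake_regs[3];
#define UART_READ_READY (&fake_regs[0])
#define UART_READ_CLEAR (&fake_regs[1])
#define UART_DATA       (&fake_regs[2])

/* Polls the ready flag up to `tries` times; returns '\0' on timeout. */
static char get_uart_char_with_wait(unsigned int tries)
{
    char c;

    while (*UART_READ_READY == 0) {
        if (tries-- == 0) {
            return '\0';
        }
    }
    c = (char)*UART_DATA;   /* fetch the received byte */
    *UART_READ_CLEAR = 1;   /* tell the (simulated) device to clear it */
    return c;
}
```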
You are getting the previously mentioned error because an int cannot be dereferenced. Making x a pointer-to-int will yield the "correct" result (i.e. it will compile).
int * x = (int*)0x00ff00ff;
"It works, IT WORKS! Or er.. I mean, it compiles. Now, what's a segfault?"

Resources