Why does memchr() return an out-of-bounds pointer on Mac OS X? - c

I ran into an out-of-bounds pointer when using memchr() on Mac OS X.
Here is my test code:
#include <stdio.h>
#include <stdlib.h>
int main(void){
char *content="http\r\nUser";
int content_size = strlen(content);
char *contmem = malloc(content_size+1);
memset(contmem, '\0', content_size+1);
memcpy(contmem, content, content_size);
printf("%c\n", *(content+content_size));
printf("%c\n", *(contmem+content_size));
char *t = memchr(content, 't', content_size);
printf("%c\n", *t);
return 0;
}
It works normally on Linux (my Fedora 16 machine) and prints the correct value of t.
But when I run the same code on the Mac, a segmentation fault occurs.
Debugging with gdb shows:
(gdb) print t
$7 = 0xf4b <Address 0xf4b out of bounds>
Then I tried rewriting the memchr function in the test file:
static char*
memchr(const char *data, int c, unsigned long len){
char *tp = data;
unsigned long i;
for( i = 0; i<len; i++){
if((int)*tp == c){
return tp;
}else{
tp = tp+1;
}
}
}
And the output seems correct!
(gdb) print t
$1 = 0x100000f1d "ttp\r\nUser"
So I am confused by the abnormal behavior of memchr() on Mac OS X, while other mem functions such as memset() and memcpy() work fine.
How can I run the test on the Mac without rewriting memchr()?
Thanks.

The function memchr() is declared in string.h, for which there is no include directive in the posted code. The compiler therefore generates an implicit declaration (and should emit a warning) in which memchr() returns an int. If sizeof(int) and sizeof(char*) differ on your system, the returned pointer is truncated to the width of an int, which is consistent with gdb showing 0xf4b instead of an address like 0x100000f1d. Add:
#include <string.h>
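With the header in place, the compiler sees the real prototype and the returned pointer keeps its full width. The following is a minimal sketch of that idea; find_offset is a hypothetical helper introduced only for illustration and is not part of the original test program.

```c
#include <stddef.h>
#include <string.h>   /* declares memchr(), so its return type is void *, not int */

/* Hypothetical helper (not in the original post): returns the offset of the
 * first byte equal to c within the first len bytes of buf, or -1 if absent. */
static long find_offset(const char *buf, int c, size_t len)
{
    /* Because <string.h> is included, the pointer is not truncated to int. */
    const char *hit = memchr(buf, c, len);
    return hit != NULL ? (long)(hit - buf) : -1L;
}
```

Called on the string from the question, find_offset("http\r\nUser", 't', 10) should give 1, the index of the first 't'. Checking the result against NULL before dereferencing also avoids a crash when the byte is not found.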

Your code should indeed work. Your compiler may be using built-in versions of the mem***() functions. Try to include string.h to force the use of the libc versions.

Related

Trying to concatenate strings using variadic function looping odd number of times in C

I am using FreeBSD arm64 on my Pi 4 to test some C code, and I am seeing strange behavior.
I know that I should be doing the parsing a bit differently, but I'd like to get the basics working first.
#include <stdlib.h>
#include <stdint.h> /* uints */
#include <string.h> /* char */
#include <stdio.h> /* printf*/
#include <stdarg.h> /* unknown number of arguments */
char* combineString(int num, ...) {
char* finalStr;
finalStr = calloc(600, sizeof(char));
va_list vaList;
/* initialize */
va_start(vaList, num);
printf("%i\n", num);
for (int x = 0; x < num; num++) {
char* str = va_arg(vaList, char*);
strcat(finalStr, str);
printf("%s\n", finalStr);
}
va_end(vaList);
return finalStr;
}
Somehow, this program loops 5 times instead of 2 (the number of arguments I told the function I had).
The number of arguments is indicated by 'num'.
combineString(2, "f", char* type here)
will produce:
f
fchartypehere
fchartypehere
fchartypehere
fchartypeherechartypehere
where the doubled chartypehere indicates that it appended that char* twice, even though it should only loop twice. I am using gcc to compile this, and when I run it under gdb I get this:
Program received signal SIGSEGV, Segmentation fault.
Address not mapped to object.
strcat (s=<optimized out>, append=<optimized out>) at /usr/src/lib/libc/string/strcat.c:46
46 /usr/src/lib/libc/string/strcat.c: No such file or directory.
I suppose it's just coincidentally running 5 times due to undefined behavior, but I am not sure where exactly in my code that is caused.
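You loop out of bounds of the va_arg list: the loop increments num instead of x, so x < num never becomes false and va_arg keeps reading past the arguments that were actually passed.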
for (int x = 0; x < num; num++)
should be
for (int i = 0; i < num; i++)
You can avoid bugs like this by always naming your loop iterator i unless you have very good reasons not to.
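For illustration, here is one way the whole function could look with the loop fixed. This is a sketch under my own assumptions rather than the asker's final code: the name combine_strings and the two-pass sizing (instead of the fixed 600-byte calloc) are choices made here.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenates `count` C strings passed as variadic arguments.
 * Returns a malloc'd string the caller must free, or NULL on allocation failure. */
char *combine_strings(int count, ...)
{
    va_list args;
    size_t total = 0;

    /* First pass: add up the lengths so the buffer is exactly large enough. */
    va_start(args, count);
    for (int i = 0; i < count; i++)
        total += strlen(va_arg(args, char *));
    va_end(args);

    char *result = malloc(total + 1);
    if (result == NULL)
        return NULL;
    result[0] = '\0';

    /* Second pass: append each argument. Note that i is incremented, not count. */
    va_start(args, count);
    for (int i = 0; i < count; i++)
        strcat(result, va_arg(args, char *));
    va_end(args);

    return result;
}
```

A call such as combine_strings(2, "f", "oo") reads exactly two arguments and returns "foo". Measuring first is optional; it only removes the silent 600-byte limit of the original.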

Function pointer different behaviour in GCC and Arduino

In my C program I have a skeleton for a command interpreter. It works fine on Linux/GCC, but in Arduino it does not return the expected results.
Below is the PC code. I made the appropriate changes for Arduino, and it works except for the returned string (see the comment near the bottom of the listing).
C-Code on PC (working):
#include <stdio.h>
char* help(char *s){
char *helpString="This is the help string\n";
return helpString;
}
typedef struct {
const char* command;
char* (*cmdExec)(char *s);
}S_COMMAND;
S_COMMAND cmdTable[]= {
{"he", help}
};
int main(void){
char *text;
printf("\n%s\t",cmdTable[0].command); // returns "he"
text = (cmdTable[0].cmdExec)("0");
/* returns help string on PC as expected, but garbage on Arduino */
printf("\n%s", text);
}
Thanks for the advice. I modified the code as follows (peeking at Node.js), and it is working fine now on both platforms.
#include <string.h> /* strcpy */
int help(char *req, char *res){
strcpy(res, "This is the help string...\n");
return 0;
}
typedef struct {
char* command;
int (*cmdExec)(char *request, char *result);
}S_COMMAND;
S_COMMAND cmdTable[]= {
{"he", help}
};
int main(void){
char text[32]; /* must hold the 27-character help string plus the terminating NUL */
cmdTable[0].cmdExec("0", text);
return 0;
}
This is valid C code (with the exception of the missing return statement in main), and a conforming C compiler should accept it and produce a working executable. In particular, your use of function pointers isn’t related to the problem at hand. Furthermore, the manual for avr-gcc does not mention any relevant restrictions. I don’t have an Arduino on hand to test the behaviour but if avr-gcc does not produce working code for the input you’ve shown then this suggests a bug in the compiler.
char* help(char *s){
char *helpString="This is the help string\n";
return helpString;
}
You're returning a local variable - it ceases to exist once you go out of scope from the function. That it works at all on any platform is pure luck as once it ceases to exist, trying to access the string is undefined behaviour.

skip the next instruction in c code

Hi everyone, I am trying to skip an instruction:
void func(char *str) {
char buffer[24];
int *ret;
strcpy(buffer, str);
}
int main(int argc, char **argv) {
int x;
x = 0;
func(argv[1]);
x = 1;
printf("%d\n", x);
}
How can I use the pointer ret defined in func() to modify the function's return address in such a way that x = 1 is skipped?
It's a bad idea to use this in production code! (Reasons copied from dvnrrs' comment.) This is undefined behavior; it is severely abusing (assumed) knowledge about the way the stack is laid out under the hood, and the size of the compiled instructions. This is doomed to failure, especially if optimization is turned on.
Please note that modifying the memory next to local variables this way is incorrect C code, and it invokes undefined behavior. I think there is no standard C solution to your problem, so if you want to do that, your best bet is architecture-specific assembly code. The following code happens to work for me in C on i386 with my GCC, but it's still undefined behavior, so it's inherently fragile and can stop working with any change in the compiler or in the ABI.
This prints 42 for me:
#include <stdio.h>
void func(char *str) {
(&str)[-1] += 2;
}
int main(int argc, char **argv) {
(void)argc; (void)argv;
int x;
x = 42;
func(argv[1]);
x = 137;
printf("%d\n",x);
return 0;
}
Compile and run on Linux i386:
$ gcc -m32 -W -Wall -s -O0 t.c && ./a.out
42
You can't do this in C; the closest thing would be the setjmp() and longjmp() functions but those won't work for your exact case.
As others pointed out in comments, the best thing to do might be to give your function a return value, and then check that return value in the caller to decide what code to run next. Your sample code seems like a contrived test case so I'm not sure exactly what you're trying to accomplish.
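A minimal sketch of the return-value approach might look like the following. The length check inside func and the helper compute_x are assumptions added here for illustration; the original question does not say what condition should cause x = 1 to be skipped.

```c
#include <stddef.h>
#include <string.h>

/* Copies str into a fixed-size local buffer.
 * Returns 0 on success, -1 if str is NULL or does not fit (assumed policy). */
static int func(const char *str)
{
    char buffer[24];

    if (str == NULL || strlen(str) >= sizeof buffer)
        return -1;
    strcpy(buffer, str);
    return 0;
}

/* Hypothetical caller: the assignment x = 1 only runs when func() succeeds. */
static int compute_x(const char *arg)
{
    int x = 0;

    if (func(arg) == 0)
        x = 1;   /* skipped when func() reports failure */
    return x;
}
```

Here the caller decides what to run next from an ordinary return code, so nothing depends on the stack layout or on instruction sizes, and the bounds check also removes the buffer overflow that the original strcpy allowed.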

Microchip C18 - Weird code behavior (maybe extended-mode / non-extended-mode related)

I have this weird problem with the Microchip C18 compiler for PIC18F67J60.
I have created a very simple function that should return the index of a Sub-String in a larger String.
I don't know what's wrong, but the behavior seems to be related to whether extended mode is enabled or not.
With Extended-Mode enabled in MPLAB.X I get:
The memcmppgm2ram function returns zero all the time.
With Extended-Mode disabled in MPLAB.X I get:
The value of iterator variable i counts as: 0, 1, 3, 7, 15, 21
I'm thinking some stack issue or something, because this is really weird.
The complete code is shown below.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char bigString[] = "this is a big string";
unsigned char findSubStr(char *str, const rom char *subStr, unsigned char n, unsigned char m)
{
unsigned char i;
for (i=0; i < n-m; i++)
{
if(0 == memcmppgm2ram(&str[i], (const far rom void*)subStr, m))
return i;
}
return n; // not found
}
void main(void)
{
char n;
n = findSubStr(bigString, (const rom void*)"big", sizeof(bigString), 3);
}
memcmppgm2ram() expects a pointer to data memory (ram) as its first argument. You are passing a pointer to a string literal, which is located in program memory (rom).
You can use memcmppgm() instead, or copy the other string to ram using memcpypgm2ram() or strcpypgm2ram().
Unfortunately I can't test this, as I don't have access to this compiler at the moment.

LLVM Compiler 2.0 bug?

When the following code is compiled with the LLVM Compiler, it doesn't operate correctly
(i doesn't increase).
It operates correctly when compiled with GCC 4.2.
Is this a bug in the LLVM Compiler?
#include <stdio.h>
#include <string.h>
void BytesFromHexString(unsigned char *data, const char *string) {
printf("bytes:%s:", string);
int len = (int)strlen(string);
for (int i=0; i<len; i+=2) {
unsigned char x;
sscanf((char *)(string + i), "%02x", &x);
printf("%02x", x);
data[i] = x;
}
printf("\n");
}
int main (int argc, const char * argv[])
{
// insert code here...
unsigned char data[64];
BytesFromHexString(data, "4d4f5cb093fc2d3d6b4120658c2d08b51b3846a39b51b663e7284478570bcef9");
return 0;
}
For sscanf you'd use %2x instead of %02x. Furthermore, %2x indicates that an unsigned int* argument will be passed, but you're passing an unsigned char*. Finally, sscanf takes a const char* as its first argument, so there's no need for that cast.
So give this a try:
unsigned int x;
sscanf((string + i), "%2x", &x);
EDIT: to clarify why this change resolves the issue: in your code, sscanf tried to write sizeof(int) bytes to a memory location (&x) that could only hold sizeof(unsigned char) bytes (i.e. 1 byte). So you were overwriting adjacent memory, which could very well have been (part of) the i variable.
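Putting these points together, a corrected version of the conversion routine could be sketched as follows. The name bytes_from_hex_string, the capacity parameter, and the return value are assumptions introduced here. The sketch also stores each byte at data[i / 2] on the assumption that a packed output was intended; the original wrote to data[i], which leaves every other slot unused.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Converts a string of hex digit pairs into bytes.
 * Returns the number of bytes written, or -1 if the input has odd length,
 * does not fit in `cap` bytes, or contains a pair that is not valid hex. */
int bytes_from_hex_string(unsigned char *data, size_t cap, const char *hex)
{
    size_t len = strlen(hex);

    if (len % 2 != 0 || len / 2 > cap)
        return -1;

    for (size_t i = 0; i < len; i += 2) {
        unsigned int x;                      /* %x writes a full unsigned int */
        if (sscanf(hex + i, "%2x", &x) != 1)
            return -1;
        data[i / 2] = (unsigned char)x;      /* narrow only after the read */
    }
    return (int)(len / 2);
}
```

Because sscanf now writes into a variable of the size it expects, nothing next to x on the stack is overwritten, so the loop counter advances normally regardless of how the compiler lays out locals.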
From the compiler side of things, the reason this code behaves differently is that gcc and llvm (or any other compiler) may lay out the stack differently. With gcc you were likely clobbering something on the stack that this example didn't need, whereas with the layout chosen by the llvm compiler you were clobbering something that mattered.
This is another good reason to use stack protectors when debugging a problem (-fstack-protector-all/-fstack-protector). It can help flush out these issues.
