builtin_memcpy warning on linux but not on windows - c

While trying to copy a string larger than the "string" variable, I get this warning. I know the reason for it: I am trying to fit a 21-byte string into a 6-byte region. What confuses me is why I am not getting the warning from the Windows compiler.
On Windows I am using MinGW with Visual Studio Code; the loop runs but there is no warning of any kind, while on Linux I get this warning:
rtos_test.c: In function 'main':
rtos_test.c:18:5: warning: '__builtin_memcpy' writing 21 bytes into a region of size 6 overflows the destination [-Wstringop-overflow=]
18 | strcpy(string, "Too long to fit ahan");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <stdio.h>
#include <stdint.h>
#include <pthread.h>
#include <string.h>

uint8_t test = 0;
char string[] = "Short";

int main()
{
    while (test < 12)
    {
        printf("\nA sample C program\n\n");
        test++;
    }
    strcpy(string, "Too long to fit ahan");
    return 0;
}

I don't have enough reputation points to comment on your post.
I think the gcc -Wall flag is enabled on Linux; you can try adding the -Wall flag in your IDE on Windows.
Additionally, I checked some compilers and saw that
char string[] = "Short";
only allocates 6 bytes for string (five characters plus the terminating NUL).
Your code uses string incorrectly; if you try to use more than the allocated space, the program may crash. You can verify the allocated size in the generated assembly:
└─[0] <> gcc test.c -S
test.c: In function ‘main’:
test.c:18:5: warning: ‘__builtin_memcpy’ writing 21 bytes into a region of size 6 overflows the destination [-Wstringop-overflow=]
18 | strcpy(stringssss, "Too long to fit ahan");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
┌─[longkl#VN-MF10-NC1011M] - [~/tmp] - [2021-12-22 07:00:36]
└─[0] <> grep stringsss test.s
.globl stringssss
.type stringssss, #object
.size stringssss, 6

This warning on Linux implies that GCC replaced the strcpy() call with a GCC builtin (__builtin_memcpy) and that GCC can detect, and is configured to detect, such an error. That may not be the case on Windows, depending on compiler options, version, mood, etc.
You are also comparing Windows and Linux, which are very different platforms; don't expect the same behavior on both. GCC is not very Windows-oriented either (MinGW = Minimalist GNU for Windows). Even between Linux distros, GCC differs; there is a huge number of variables to consider, especially when optimizations are involved.
To sum up: different environments produce different results, warnings, and errors. You can't really do anything about that beyond fixing your code when it relies on environment-specific behavior (often without knowing it) or tweaking compiler options. Usually the answer is to fix your source, which is the source of your problems ~100% of the time.
As a side note, setting up CI with different environments is a great bug-catching system, since behavior that looks fine on one system may not be on another, as in your case, where the memory corruption would happen on both Linux and Windows.

Related

memcpy behaves differently with optimization flags compared to without

Consider this demo programme:
#include <string.h>
#include <unistd.h>

typedef struct {
    int a;
    int b;
    int c;
} mystruct;

int main() {
    int TOO_BIG = getpagesize();
    int SIZE = sizeof(mystruct);
    mystruct foo = {
        123, 323, 232
    };
    mystruct bar;
    memset(&bar, 0, SIZE);
    memcpy(&bar, &foo, TOO_BIG);
}
I compile this two ways:
gcc -O2 -o buffer -Wall buffer.c
gcc -g -o buffer_debug -Wall buffer.c
i.e. the first time with optimizations enabled, the second time with debug flags and no optimization.
The first thing to notice is that there are no warnings when compiling, despite getpagesize returning a value that will cause buffer overflow with memcpy.
Secondly, running the first programme produces:
*** buffer overflow detected ***: terminated
Aborted (core dumped)
whereas the second produces
*** stack smashing detected ***: terminated
Aborted (core dumped)
or, and you'll have to believe me here since I can't reproduce this with the demo programme, sometimes no warning at all. The programme doesn't even interrupt, it runs as normal. This was a behaviour I encountered with some more complex code, which made it difficult to debug until I realised that there was a buffer overflow happening.
My question is: why are there two different behaviours with different build flags? And why does this sometimes execute with no errors when built as a debug build, but always errors when built with optimizations?
..I can't reproduce this with the demo program, sometimes no warning at all...
The rules around undefined behavior are very broad; there is no requirement for the compiler to issue any warning for a program that exhibits it:
why are there two different behaviours with different build flags? And why does this sometimes execute with no errors when built as a debug build, but always errors when built with optimizations?
Compiler optimizations tend to remove unused variables. If I compile your code with optimizations enabled, I don't get a segmentation fault; looking at the assembly (link above), you'll notice that the problematic variables are optimized away and memcpy is never called, so there is no reason for the program not to run successfully: it exits with success code 0. If I don't optimize, the undefined behavior manifests itself and the program exits with code 139, the classic segmentation-fault exit code.
As you can see these results are different from yours and that is one of the features of undefined behavior, different compilers, systems or even compiler versions can behave in a completely different way.
Accessing memory behind what's been allocated is undefined behavior, which means the compiler is allowed to do anything. When there are no optimizations, the compiler may try to guess and do something reasonable. When optimizations are turned on, the compiler may take advantage of the fact that any behavior is allowed to do something that runs faster.
The first thing to notice is that there are no warnings when compiling, despite getpagesize returning a value that will cause buffer overflow with memcpy.
That is the programmer's responsibility to fix, not the compiler's. You'll be very lucky if a compiler manages to find potential buffer overflows for you. Its job is to check that your code is valid C, then translate it to machine code.
If you want a tool that catches bugs, you want a static analyser, which is a different type of program. To some extent, static analysis may be integrated into a compiler as a feature. There is one for clang, but most static analysers are commercial tools rather than open source.
Secondly, running the first programme produces: ... whereas the second produces
Undefined behavior simply means there is no defined behavior. What is undefined behavior and how does it work?. Meaning there's not likely anything to learn from examining the results, no interesting mystery to solve. In one case it apparently accessed forbidden memory, in the other case it mangled a poor little "stack canary". The difference will be related to different memory layouts. Who cares - bugs are bugs. Focus on why the bug happened (you already know!), instead of trying to make sense of the undefined results.
Now when I run your code with optimizations actually enabled for real (gcc -O2 on an x86 Linux), the compiler gives me
main:
        subq    $8, %rsp
        call    getpagesize
        xorl    %eax, %eax
        addq    $8, %rsp
        ret
With optimizations actually enabled, it didn't even bother calling memcpy & friends because there are no side effects and the variables aren't used, so they can be safely removed from the executable.

C error: size of array is too large

I have tried to compile this C code:
#define MAX_INT 2147483647

int main()
{
    int vector[MAX_INT];
    return 0;
}
I'm using the C compilers provided by both the MinGW and MSYS projects. The MinGW compiler is "gcc version 6.3.0 (MinGW.org GCC-6.3.0-1)", which is the most recent version and has the win32 thread model, and the MSYS compiler is "gcc version 3.4.4 (msys special)" with the posix thread model.
That MAX_INT value is the same as the constant INT_MAX provided by the "limits.h" header.
How can I avoid this problem and get my simplest code compiled?
The main problem is that your stack is not large enough to contain the array.
Try setting the stack size while compiling, as suggested in "Increase stack size when compiling with mingw?":
gcc -Wl,--stack,N
where N is stack size. E.g. gcc -Wl,--stack,4194304
Also, as mentioned in the comments, you might have to compile for 64 bits, and you will need that much RAM, or possibly a large page file.

compile error: expected identifier or '(' numeric constant using sigaction() system call

So far I have been trying to compile Wolfenstein: Enemy Territory to be x86_64 native. After dealing with ASM instructions, amd64-specific processor registers and other strange stuff, I eventually got three beautiful GCC compile errors:
url.c:xxxx:xx error: expected identifier or '(' before numeric constant
sigaction( SIGALRM, &sigact, NULL );
^
If I type "sigaction" onto Google, almost every single link is visited by me, so any help would be greeeeaaatly appreciated.
#include <signal.h>
is present.
#define _POSIX_SOURCE
#define _XOPEN_SOURCE
#define _POSIX_C_SOURCE
are present, too. Using a string search, I can see no "overwrite" of the sigaction structure / function, so I guess this is not the problem. Because the source file is ~4,000 lines long, I won't paste the whole code here (but the original is available there). I'm compiling with the following flags (the original Makefile(s) only had the -m32 flag, which, of course, I removed):
gcc --std=c99 -D_XOPEN_SOURCE
I still get the same error. This is getting beyond my understanding, which is why I'm asking you: how do I get rid of this error?
From what I've read, I think it is related to POSIX not being compatible with ANSI (which won't allow sigaction() to compile).
Also, I'm running Ubuntu 14.04 64-bit (because it came preinstalled and I've been too lazy to install another distro). Enemy Territory compiles with scons (my version is 2.3.0) and the GCC version is 4.8.2-19.
Thanks in advance.

IMAGE_REL_AMD64_ADDR64 64-bit relocation

I was trying to get the Microsoft compiler to generate a relocation of type IMAGE_REL_AMD64_ADDR64 for testing purposes. The compiler prefers to use relative 32-bit LEA instructions, which is understandable, but I thought the following program would require it to use a 64-bit relocation:
#include <stdio.h>

char a[5000000000];
char b[5000000000];

int main(){
    printf("%p %p %p %p\n", a, b, b - a, a - b);
}
It didn't, so I tried it with MinGW (both compilers in 64-bit mode on 64-bit Windows), and got the same results, respectively:
0000000169CE5A40 000000013FC86840 FFFFFFFFD5FA0E00 000000002A05F200
000000002A46D580 000000000040E380 FFFFFFFFD5FA0E00 000000002A05F200
I then tried it with GCC on Linux and the result was more enlightening, being a linker error message: 'relocation truncated to fit: R_X86_64_32'.
So just to make sure I'm not missing something, this is a bug in both compilers, except that in the case of GCC on Linux, at least the linker notices the problem instead of silently giving a wrong answer?
And how do you get the Microsoft compiler to generate 64-bit relocations? They occur in the Microsoft standard library (which is what prompted me to try to generate some in the first place) so presumably there must be a way?
Mind you, the bug is understandable because the compiler doesn't know that the offsets will end up exceeding 32 bits; basically it's an interaction between the separate compiling and linking model and a quirk of the x64 instruction set.
Anyway, it turns out you can get actual 64-bit relocations with e.g.
char *p = a;

Is this a valid C program?

I wrote a program, where the size of an array is taken as an input from user.
#include <stdio.h>

main()
{
    int x;
    scanf("%d", &x);
    int y[x];
    /* some stuff */
}
This program failed to compile on my school's compiler Turbo C (an antique compiler).
But when I tried this on my PC with GNU CC, it compiled successfully.
So my question is, is this a valid C program? Can I set the size of the array using a user's input?
It is a valid C program now, but it wasn't 15 years ago.
Either way, it's a buggy C program because x is used without any knowledge of how large it might be. The user can input a malicious value for x and cause the program to crash or worse.
C99 gives C programmers the ability to use variable length arrays, which are arrays whose sizes are not known until run time. --C: A Reference Manual
C90 does not support variable length arrays; you can see this using this command line:
gcc -std=c90 -pedantic code.c
You will see an error message like this:
warning: ISO C90 forbids variable length array ‘y’ [-Wvla]
But in C99 this is perfectly valid:
gcc -std=c99 -pedantic code.c
Instead of asking whether this is strictly valid C code, it may be better to ask whether it is good C code. Although it is valid, as you have seen, a number of compilers do not support variable length arrays.
Variable length arrays are not supported by a number of modern compilers, including Microsoft Visual Studio and some versions of the IBM XL compilers. As you have found, variable length arrays are not entirely portable. That's fine if the code will only be used on systems that support the feature, but not if it has to run on other systems. Instead, it may be better to allocate the array with a constant size using a reasonable limit, or to create it with malloc and free in a portable manner.
