Gcc compiler C string assignment issue


I wrote this code because I'm having a similar problem in a larger program I'm writing. For all I know the problem is the same so I made this small example.
#include <stdio.h>
typedef struct
{
int x;
char * val;
}my_struct;
int main()
{
my_struct me = {4, " "};
puts("Initialization works.");
me.val[0] = 'a';
puts("Assignment works.");
puts(me.val);
puts("Output works.");
return 0;
}
When compiled with tcc (Tiny C Compiler) it compiles and executes fine. But using GCC 4.6.0 20110513 (prerelease) it compiles, however, when I execute it I only get past "Initialization works." before getting a segfault.
What am I doing wrong? Is it my code or my GCC compiler?

Your code. ANSI C permits string constants to be read-only, and this is encouraged because it means they can be shared system-wide across all running instances of a program. gcc places them in read-only memory (older releases accepted -fwritable-strings to override this, but that option was removed in GCC 4.0, so it is not available in your 4.6.0), while tcc makes them writable (probably because it's easier).

val points to a read-only location.
char *readOnly = "Data in read only location" ;
The data that readOnly points to cannot be modified.

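As other answers have pointed out, val is pointing at a string constant. Try
my_struct me = {4, malloc(2)};
(with #include <stdlib.h>), and remember to check if val is NULL if you're using this in a real program. The allocated bytes start out uninitialized, so also store a terminating '\0' in val[1] before passing val to puts.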
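A minimal sketch of that suggestion, assuming you want the struct to own a writable copy of its string. The helper names my_struct_init and my_struct_free are hypothetical, not part of the question's code:

```c
#include <stdlib.h>
#include <string.h>

typedef struct
{
    int x;
    char *val;
} my_struct;

/* Hypothetical helper (not from the question): give s->val its own
   heap copy of text, so writing through val is well defined.
   Returns 0 on success, -1 if malloc fails. */
int my_struct_init(my_struct *s, int x, const char *text)
{
    size_t len = strlen(text) + 1;   /* +1 for the terminating '\0' */

    s->x = x;
    s->val = malloc(len);
    if (s->val == NULL)
        return -1;
    memcpy(s->val, text, len);       /* copies the terminator too */
    return 0;
}

/* Release the buffer owned by s. */
void my_struct_free(my_struct *s)
{
    free(s->val);
    s->val = NULL;
}
```

After a successful my_struct_init(&me, 4, " "), the assignment me.val[0] = 'a'; writes to heap memory rather than to the literal, so it should behave the same under gcc and tcc.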

Related

What should happen, when we try to modify a string constant?

#include<stdio.h>
#include<string.h>
int main()
{
int i, n;
char *x="Alice"; // ....... 1
n = strlen(x); // ....... 2
*x = x[n]; // ....... 3
for(i=0; i<=n; i++)
{
printf("%s ", x);
x++;
}
printf("\n");
return 0;
}
A string constant cannot be modified. In the above code *x means 'A'. In line 3 we are trying to modify a string constant. Is it correct to write that statement? When I ran this code on Linux, I got a segmentation fault. But www.indiabix.com gives this answer:
If you compile and execute this program in windows platform with Turbo C, it will give lice ice ce e It may give different output in other platforms (depends upon compiler and machine). The online C compiler given in this site will give Alice lice ice ce e as output (it runs on Linux platform).

Your analysis is correct. The line
*x = x[n];
is trying to modify a string literal, so it's undefined behavior.
BTW, I checked the website that you linked. Just browsing it for two minutes, I've already found multiple incorrect code samples (to name a few, using gets, using char(not int) to assign return value of getchar, etc), so my suggestion is don't use it.

Your analysis is correct, but doesn't contradict what you quoted.
The code is broken. The answer already acknowledges that it may behave differently on different implementations, and has given two different outputs by two different implementations. You happen to have found an implementation that behaves in a third way. That's perfectly fine.

Modification of a string literal is Undefined Behaviour. So the behaviour you observe, and the two described, are consistent with the requirements of the C standard (as is emailing your boss and your spouse, or making demons fly out of your nose). Those three are all actually quite reasonable actions (modify the 'constant', ignore the write, or signal an error).
With GCC, you can ask to be warned when you assign the address of a string literal to a pointer to (writable) char:
cc -g -Wall -Wextra -Wwrite-strings -c -o 27211884.o 27211884.c
27211884.c: In function ‘main’:
27211884.c:7:13: warning: initialization discards ‘const’ qualifier from pointer target type [enabled by default]
char *x="Alice"; // ....... 1
^
This warning is on by default when compiling C++, but not for C, because char* is often used for string literals in old codebases. I recommend using it when writing new code.
There are two correct ways to write the code of the example, depending on whether you want your string to actually be constant or not:
const char *x = "Alice";
char x[] = "Alice";
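To illustrate the second form, here is a small sketch. The function names blank_first and demo_writable_array are made up for this example; the write is the same one as line 3 of the question, but applied to an array that holds its own copy of the characters:

```c
#include <string.h>

/* Hypothetical helper: the same write as line 3 of the question,
   applied to a buffer the caller owns. */
void blank_first(char *s)
{
    s[0] = s[strlen(s)];   /* s[strlen(s)] is the terminating '\0' */
}

/* Returns the length of the string after the write. */
size_t demo_writable_array(void)
{
    char x[] = "Alice";    /* array: a private, writable copy of the literal */

    blank_first(x);        /* defined behavior: x is ordinary array storage */
    return strlen(x);
}
```

Because x is an array here, it cannot be incremented as in the question's loop; a separate pointer such as char *p = x; would be needed to walk through it.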

In this code, the memory for "Alice" will typically be in the read-only data section of the executable file, and x is a pointer to that read-only location. An attempt to modify the read-only data section should not be allowed. But char *x="Alice"; tells the compiler that x is a pointer to a character that can be modified (i.e. is not read-only), so the compiler accepts the write. The line *x = x[n]; then behaves differently on different compilers and platforms, which is why it is undefined behavior.
The correct way of declaring a pointer to a string literal is as below:
const char *x ="Alice";
Then the compiler rejects *x = x[n]; at compile time instead of leaving the outcome to the platform.

Array of char* and how to allocate memory for each

I have a very simple problem that I cannot seem to figure out. I have this:
char* array[10];
So, I then have 10 char* pointers on the stack. Now all I want to do is allocate memory for each pointer. As in:
array[0] = malloc(sizeof(char)*6);
And then store some characters at this location:
strncpy(array[0], "hello", sizeof("hello"));
Yet, I am getting a compile-time error at the first step of allocating the memory:
error: invalid conversion from ‘void*’ to ‘char*’ [-fpermissive]
But it works as expected at Ideone.
What am I doing wrong? I understand what I am trying to do, but I do not understand why it does not work. At each index in array there is a char*. By using the = symbol I am trying to assign each pointer to a block of memory allocated to it.
What am I doing wrong?
Compiling with g++ -g -Wall

What am I doing wrong? Compiling with g++ -g -Wall
g++ compiles a .c file as C++. Compile it with a C compiler (like gcc). In C++, you have to cast the return value of malloc. In C, do not cast the return value of malloc.

Your code is valid C, but you are compiling your code as C++, which, unlike C, has no implicit conversion from void* to char*.
If you intended to compile the code as C (in which case you do not require the cast), use gcc instead of g++. Also make sure your file does not end with an extension that gcc interprets as C++ (.cpp, .C, .cxx or .cc). Or play it safe and use the .c extension.
If you want to make the code valid C++, you need to cast to char*:
array[0] = (char*)malloc(sizeof(char)*6);
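For the C route, a sketch of allocating memory for every pointer in the array might look like the following. The helper names fill_strings and free_strings are hypothetical, and the block is meant to be compiled as C (gcc), not C++:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: make every slot of array[0..n-1] point at its
   own heap copy of text. Returns 0 on success; on failure it frees
   what was allocated so far and returns -1. */
int fill_strings(char *array[], size_t n, const char *text)
{
    size_t len = strlen(text) + 1;   /* room for the terminating '\0' */
    size_t i;

    for (i = 0; i < n; i++) {
        array[i] = malloc(len);      /* no cast needed in C */
        if (array[i] == NULL) {
            while (i > 0) {
                --i;
                free(array[i]);
                array[i] = NULL;
            }
            return -1;
        }
        memcpy(array[i], text, len);
    }
    return 0;
}

/* Free every slot and reset it to NULL. */
void free_strings(char *array[], size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        free(array[i]);
        array[i] = NULL;
    }
}
```

With char* array[10]; from the question, fill_strings(array, 10, "hello") gives each element its own six-byte buffer, and free_strings(array, 10) releases them.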

This is probably the most visible difference between C and C++: C can implicitly convert the void* returned by malloc() to any other object pointer type, C++ can't.
Now, by compiling with g++, or by using a .cpp file name extension, you are compiling your code as C++ code, not C code. Use gcc instead and make sure that your source file ends with .c, and your code will compile fine.
An alternative solution is to add the cast that C++ requires: array[0] = static_cast<char*>(malloc(sizeof(char)*6));

As others have pointed out, C++ does not allow an implicit conversion from void * to char *.
If this is really supposed to be C++ code, I'd advise using new instead of malloc for dynamic memory allocation, and for this particular code I'd advise using a vector of string instead of an array of char *:
#include <vector>
#include <string>
...
std::vector< std::string > array(10); // ten empty strings, so array[0] below is a valid element
...
array[0] = "hello"; // literal is implicitly converted to an instance of string
The string and vector implementations do all the memory management for you.
If this is really supposed to be C code, simply compile it using gcc instead of g++.

Try something like this:
array[0] = static_cast<char *>(malloc(sizeof(char)*6));
How should I cast the result of malloc in C++?

getting address of particular instruction in a function [duplicate]

I want to know the length of C function (written by me) at runtime. Any method to get it? It seems sizeof doesn't work here.

There is a way to determine the size of a function. The command is:
nm -S <object_file_name>
This will print the size of each function in the object file. See the GNU manual page ('man nm') for more information.

You can get this information from the linker if you are using a custom linker script. Add a linker section just for the given function, with linker symbols on either side:
mysec_start = .;
*(.mysection)
mysec_end = .;
Then you can specifically assign the function to that section. The difference between the symbols is the length of the function:
#include <stdio.h>
int i;
__attribute__((noinline, section(".mysection"))) void test_func (void)
{
i++;
}
int main (void)
{
extern unsigned char mysec_start[];
extern unsigned char mysec_end[];
printf ("Func len: %lu\n", (unsigned long) (mysec_end - mysec_start));
test_func ();
return 0;
}
This example is for GCC, but any C toolchain should have a way to specify which section to assign a function to. I would check the results against the assembly listing to verify that it's working the way you want it to.

There is no way in standard C to get the amount of memory occupied by a function.

I have just come up with a solution for the exact same problem, but the code I have written is platform dependent.
The idea is to put known opcodes at the end of the function and search for them from the start, counting the bytes skipped along the way.
Here is the Medium post where I explain it with some code:
https://medium.com/#gurhanpolat/calculate-c-function-size-x64-x86-c1f49921aa1a

Executables (at least ones which have debug info stripped) don't store function lengths in any way. So there's no way for a program to retrieve this information about itself at runtime. If you have to manipulate functions, you should do something with your objects in the linking phase or access them as files from your executable. For example, you may tell the linker to link the symbol tables into the executable as an ordinary data section, give them a name, and parse them when the program runs. But remember, this would be specific to your linker and object format.
Also note that function layout is platform specific, and there are some things that make the term "function length" unclear:
Functions may store the constants they use in the code section directly after the function code and access them using PC-relative addressing (ARM compilers do this).
Functions may have "prologs" and "epilogs" which may be common to several functions and thus lie outside the main body.
Function code may inline other function code.
Each of these may or may not count toward the function length.
Also, a function may be completely inlined by the compiler, so that it loses its body.

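A fully worked out solution without a custom linker script (it relies on the __start_ and __stop_ symbols that GNU ld generates for sections whose names are valid C identifiers):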
#include <stdio.h>
int i;
__attribute__((noinline, section("mysec"))) void test_func (void)
{
i++;
}
int main (void)
{
extern char __start_mysec[];
extern char __stop_mysec[];
printf ("Func len: %lu\n", (unsigned long) (__stop_mysec - __start_mysec));
test_func ();
return 0;
}
That's what you get when you read FazJaxton's answer with jakobbotsch's comment

In e.g. Codewarrior, you can place labels around a function, e.g.
label1:
void someFunc()
{
/* code goes here. */
}
label2:
and then calculate the size like (int)(label2-label1), but this is obviously very compiler dependent. Depending on your system and compiler, you may have to hack linker scripts, etc.

The start of the function is the function pointer, you already know that.
The problem is to find the end, but that can be done this way:
#include <time.h>
int foo(void)
{
int i = 0;
++i + time(0); // time(0) is to prevent optimizer from just doing: return 1;
return i;
}
int main(int argc, char *argv[])
{
return (int)((long)main - (long)foo);
}
It works here because the program has ONLY TWO functions. If the code is reordered (main placed before foo) you will get an irrelevant (negative) result, which tells you that it did not work this way, but it WOULD work if you moved the foo() code into main(): just subtract the main() size you got from the initial negative result.
If the result is positive, then it will be correct, provided no padding is added (yes, some compilers happily inflate the code, either for alignment or for other, less obvious reasons).
The (int)(long) casts are meant to cope with both 32-bit and 64-bit code (function pointers are longer on a 64-bit platform), although long itself is only 32 bits wide on 64-bit Windows.
The C standard does not guarantee any of this, but it works reasonably well with compilers that emit functions in source order.

There's no facility defined within the C language itself to return the length of a function; there are simply too many variables involved (compiler, target instruction set, object file/executable file format, optimization settings, debug settings, etc.). The very same source code may result in functions of different sizes for different systems.
C simply doesn't provide any sort of reflection capability to support this kind of information (although individual compilers may supply extensions, such as the Codewarrior example cited by sskuce). If you need to know how many bytes your function takes up in memory, then you'll have to examine the generated object or executable file directly.
sizeof func won't work because the operand func has function type: standard C does not allow sizeof to be applied to a function (GCC accepts it as an extension and yields 1), and sizeof &func gives you the size of a pointer value, not of the function itself.
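A minimal sketch of what sizeof can legitimately report here. The names func and func_pointer_size are placeholders for this example:

```c
#include <stddef.h>

int func(void)
{
    return 0;
}

/* Size of a pointer to func, not of func's machine code. */
size_t func_pointer_size(void)
{
    return sizeof &func;
}
```

The value is the same for every function with this signature, however long its body is, which is why it says nothing about the length of the code.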

Just subtract the address of your function from the address of the next function. But note it may not work on your system, so use it only if you are 100% sure:
#include <stdint.h>
int function() {
return 0;
}
int function_end() {
return 0;
}
int main(void) {
intptr_t size = (intptr_t) function_end - (intptr_t) function;
}

There is no standard way of doing it either in C or C++. There might naturally exist implementation/platform-specific ways of doing it, but I am not aware of any.

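// Heuristic, x86 only (C++): scan forward from pfn for a ret opcode, optionally requiring a typical epilogue byte right before it.
size_t try_get_func_size_x86(void* pfn, bool check_prev_opcode = true, size_t max_opcodes_runout = 10000)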
{
const unsigned char* op = (const unsigned char*)pfn;
for(int i = 0; i < max_opcodes_runout; i++, op++)
{
size_t sz_at = (size_t)(op - (const unsigned char*)pfn) + 1;
switch(*op)
{
case 0xC3: // ret Opcode
case 0xC2: // ret x Opcode
if(!check_prev_opcode)
return sz_at;
switch(*(op-1)) // Checking Previous Opcode
{
case 0x5D: // pop ebp
case 0x5B: // pop ebx
case 0x5E: // pop esi
case 0x5F: // pop edi
case 0xC9: // leave
return sz_at;
}
}
}
return 0;
}

You can find the length of your C function by subtracting the addresses of functions.
Let me provide you an example
#include <stdio.h>
int function01(void)
{
return 0;
}
int function02(void)
{
int a,b; //just defining some variable for increasing the memory size
printf("This function would take more memory than earlier function i.e function01 ");
return 0;
}
int main(void)
{
printf("Printing the address of function01 %p\n",(void*)function01);
printf("Printing the address of function02 %p\n",(void*)function02);
printf("Printing the address of main %p\n",(void*)main);
return 0;
}
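Hope you would get your answer after compiling it. After compiling and running it you will be able to see the difference between the addresses of function01 and function02.
Note: the difference includes any alignment padding the compiler inserts between functions (often up to a 16-byte boundary), so it is an upper bound on the size of function01 rather than its exact length, and it is only meaningful if the compiler keeps the functions in source order.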

passing a two-dimensional array to function

I am trying to compile the following simple code in Workbench:
1. typedef float matrixType[3][3]
2.
3. void my_func(matrixType matrix)
4. {
5. printf("matrix[0][0] = %g\n",matrix[0][0]);
6. }
7.
8. void main()
9. {
10. matrixType my_matrix = {{0,1,2},{3,4,5},{6,7,8}};
11. matrixType* ptr_matrix = &my_matrix;
12.
13. my_func(*ptr_matrix);
14. }
I receive the following warning:
test.c:13: warning: passing arg 1 of `my_func' from incompatible pointer type
I can't understand, what am I doing wrong. The compilation of the same code in Visual Studio works without any warnings, but in Workbench something is going wrong.
Thanks.

With gcc (GCC) 4.5.3 with all warnings turned on it also compiles fine after making the following changes:
Add a semicolon after the first line.
Add #include <stdio.h> at top.
Change the return type of main to int.
Add return 0; as the last line.
The void main() is not correct C even though it appears in various books, manuals, and web tutorials. On some architectures it will cause strange problems, usually as the program terminates.
Taking the address of an array type is challenging the workbench type checker. I'm not going to drag out the C standard to figure out if the workbench warning is correct. It's probably a bug.
But I'm pretty sure that if you recode this way you will see no errors with any compiler:
#include <stdio.h>
typedef float rowType[3];
typedef rowType matrixType[3];
void my_func(matrixType matrix)
{
printf("matrix[0][0] = %g\n",matrix[0][0]);
}
int main()
{
matrixType my_matrix = {{0,1,2},{3,4,5},{6,7,8}};
rowType* ptr_matrix = my_matrix;
my_func(ptr_matrix);
return 0;
}
The reason is that my_matrix is automatically converted to a pointer to its first element in the assignment
rowType* ptr_matrix = my_matrix;
This is just as in
char s[] = "hello world!";
char *p = s;
the array name s is converted to a pointer to its first element.
The parameter in void my_func(matrixType matrix) has a type identical to rowType* because all arrays are also passed as pointers to first elements. So all the types in this code must match in a way that's very clearly defined in the C standard. &my_matrix may not be incorrect, but it's an "edge case" more likely to expose type checking bugs.
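The type relationships described above can be sketched as follows. The functions matrix_trace and trace_both_ways are hypothetical, introduced only to have something to compute; the typedef is the one from the question:

```c
typedef float matrixType[3][3];

/* A parameter declared as matrixType is adjusted to float (*)[3],
   a pointer to the first row. */
float matrix_trace(matrixType matrix)
{
    return matrix[0][0] + matrix[1][1] + matrix[2][2];
}

/* Hypothetical demo: reach the same function through both pointer kinds. */
float trace_both_ways(void)
{
    matrixType my_matrix = {{0,1,2},{3,4,5},{6,7,8}};
    matrixType *ptr_matrix = &my_matrix;  /* float (*)[3][3]: pointer to the whole matrix */
    float (*row_ptr)[3] = my_matrix;      /* float (*)[3]: pointer to the first row */

    /* *ptr_matrix has type float[3][3] and decays to float (*)[3] */
    return matrix_trace(*ptr_matrix) + matrix_trace(row_ptr);
}
```

Both calls pass an argument of type float (*)[3], so a conforming compiler should accept either form without a diagnostic.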

You are missing a semicolon at the end of line 1.

Strange compile errors in Linux Eclipse C

I'm hoping someone can help me out with this. I'm a Linux & Eclipse noob, but I'm pretty familiar with C/C++, though it's been a while since I've used them. When I try to compile I get strange errors. No matter what I do to fix them they don't seem to go away.
You can see that there's a simple main function with a little bit of code. There are only 15 lines of code, but if you look at the errors they are in external libraries, stdio.h. In main it says there's one error at line 11 but that one doesn't make sense. I assume it's an Eclipse settings problem, but I have no idea what to do to fix it. Any help would be very appreciated. By the way I'm using SciLinux and Eclipse Indigo Service Release 2. Thanks
Code:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int *ptr;
int a;
a = 20;
ptr = &a;
int b;
b = *ptr;
printf(" ptr is %d\n",b);
return 0;
}

int *ptr;
int a;
int b; //<- move to block top declaration
a = 20;
ptr = &a;

Some older compilers implement C89, which only accepts variable declarations at the beginning of a block, before any statements.
So most probably the error is because you have not declared the variable b at the start of the block. I suggest you try using a different compiler (or a newer language standard), or be prepared to declare all the variables at the beginning.

As other answers say, mixing code and declarations is illegal in old fashioned plain C. See:
Variable declaration placement in C
How to enforce C89-style variable declarations in gcc?
In Eclipse, the standard version used depends on the flags passed to the C compiler gcc, such as -std=c89 or -std=c99. Depending on how the project is set up, these are either in the Eclipse project properties or in a Makefile.
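As a sketch of the C89-compatible layout, here is the question's logic with every declaration placed before the first statement. The function name read_through_pointer is made up for this example:

```c
/* Compiles under -std=c89 -pedantic as well as -std=c99:
   every declaration precedes the first statement of the block. */
int read_through_pointer(void)
{
    int *ptr;
    int a;
    int b;

    a = 20;
    ptr = &a;
    b = *ptr;
    return b;
}
```

Written this way the code does not depend on which -std setting the Eclipse project passes to gcc.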
