Segmentation fault when accessing specific pointer - c

I'm trying to code my own malloc and free in C for a project. For the most part, it's going fine but I can't wrap my head around a strange Segmentation Fault it's giving me. I reproduced the code that was giving the error in a simplified form:
#include <stdio.h>
#define MEMORY_SIZE 4096
static char memory[MEMORY_SIZE];
typedef struct metaData{
unsigned short isFree; //1 if free, 0 if allocated
unsigned short size;
} metaData;
int main(){
metaData *head = (metaData *) memory;
printf("%d\n", (head+2031)->size);
printf("%d\n", (head+2032)->size);
puts("Segmentation up here???");
return 0;
}
memory is a static array of chars of size 4096. The first printf prints out a 0. But the next printf is a segfault. I am able to manipulate the metaData struct at every pointer up until head+2032. Does anyone have any idea why?

Pointer arithmetic in C is performed in base units of the size of the pointed-to type. Your head is a pointer to a metaData structure, which has a size of 2 × sizeof(unsigned short). Assuming (as is likely, but not certain) that an unsigned short has a size of 2 bytes on your platform, then that "base unit" will be 4 bytes.
Thus, when the head + 2031 calculation is made, the value of 4 × 2031 (which is 8124) will be added to the address in head to give the result of that expression. So, with the following ->size operator, you are attempting to reference memory that is 8,126 bytes [1] from the location of the beginning of your memory array – but that array is declared as only 4096 bytes (sizeof(char) is, by definition, 1 byte).
Accessing memory beyond the declared size of an array is undefined behaviour (UB); once you have invoked such UB (as you do in both printf calls), many different things can happen, and in unpredictable ways. A "segmentation fault" (trying to read or write memory to which your program does not have access) is one possible manifestation of UB. (Another possible manifestation is that no error is reported and the program appears to work properly; in many people's opinion, that's the worst kind!)
[1] 8,124 bytes will be the offset of the start of the potentially pointed-to metaData structure; because size is preceded by another unsigned short member, another two bytes will be added.

Related

Why does node* root = malloc(sizeof(int)) allocate 16 bytes of memory instead of 4?

I'm messing around with Linked List type data structures to get better with pointers and structs in C, and I don't understand this.
I thought that malloc returned the address of the first block of memory of size sizeof to the pointer.
In this case, my node struct looks like this and is 16 bytes:
typedef struct node{
int index;
struct node* next;
}node;
I would expect that if I try to do this: node* root = malloc(sizeof(int))
malloc would allocate only a block of 4 bytes and return the address of that block to the pointer node.
However, I'm still able to assign a value to index and get root to point to a next node, as such:
root->index = 0;
root->next = malloc(sizeof(node));
And the weirdest part is that if I try to run: printf("size of pointer root: %lu \n", sizeof(*root));
I get size of pointer root: 16, when I clearly expected to see 4.
What's going on?
EDIT: I just tried malloc(sizeof(char)) and it still tells me that *root is 16 bytes.
There are a few things going on here, plus one more that probably isn't a problem in this example but is a problem in general.
1) int isn't guaranteed to be 4 bytes, although it is in most C implementations. I would double-check sizeof(int) to see what you get.
2) node* root = malloc(sizeof(int)) is likely to cause all sorts of problems, because sizeof(struct node) is not the same as an int. As soon as you try to access root->next, you have undefined behavior.
3) sizeof(struct node) is not just an int, it is an int and a pointer. Pointers are typically all the same size within a program, depending on how it was compiled (32-bit vs 64-bit, for example), although as far as I know the C standard does not strictly require this for every pointer type. You can easily check this on your compiler with sizeof(void*). On mainstream platforms it will be the same as sizeof(int*) or sizeof(double*) or any other object pointer type.
4) Your struct should be sizeof(int) + sizeof(node*), but isn't guaranteed to be. For example, say I have this struct:
struct Example
{
char c;
int i;
double d;
};
You'd expect its size to be sizeof(char) + sizeof(int) + sizeof(double), which is 1 + 4 + 8 = 13 on my compiler, but in practice it won't be. Compilers can "align" members internally to match the underlying instruction architecture, which generally increases the struct's size. The tradeoff is that they can access data more quickly. This is not standardized and varies from one compiler to another, or even between different versions of the same compiler with different settings. You can learn more about it here.
5) Your line printf("size of pointer root: %lu \n", sizeof(*root)) does not print the size of the pointer root; it prints the size of the struct that root points to. This leads me to believe that you are compiling this as 64-bit code, so sizeof(int) is 4 and sizeof(void*) is 8, and the members are being aligned to match the system word (8 bytes), although I can't be positive without seeing your compiler, system, and settings. If you want to know the size of the pointer, you need sizeof(node*) or sizeof(root). You dereference the pointer in your version, so it is equivalent to sizeof(node).
Bottom line, is that the weirdness you are experiencing is undefined behavior. You aren't going to find a concrete answer, and just because you think you find a pattern in the behavior doesn't mean you should use it (unless you want impossible to find bugs later that make you miserable).
You didn't mention what system (Windows or Linux, 32-bit or 64-bit), but your assumptions about memory allocation are wrong. Memory allocations are aligned to some specified boundary to guarantee that all allocations for supported types are properly aligned; typically it is 16 bytes in 64-bit mode.
Check this - libc manual:
http://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html
The address of a block returned by malloc or realloc in GNU systems is
always a multiple of eight (or sixteen on 64-bit systems). If you need
a block whose address is a multiple of a higher power of two than
that, use aligned_alloc or posix_memalign. aligned_alloc and
posix_memalign are declared in stdlib.h.
There's a few things happening here. First, C has no bounds checking. C doesn't track how much memory you allocated to a variable, either. You didn't allocate enough memory for a node, but C doesn't check that. The following "works", but really it doesn't.
node* root = malloc(sizeof(int));
root->index = 0;
root->next = malloc(sizeof(node));
Since there wasn't enough memory allocated for the struct, someone else's memory has been overwritten. You can see this by printing out the pointers.
printf("sizeof(int): %zu\n", sizeof(int));
printf("root: %p\n", root);
printf("&root->index: %p\n", &root->index);
printf("&root->next: %p\n", &root->next);
sizeof(int): 4
root: 0x7fbde5601560
&root->index: 0x7fbde5601560
&root->next: 0x7fbde5601568
I've only allocated 4 bytes, so I'm only good from 0x7fbde5601560 through 0x7fbde5601563. root->index is fine, but root->next is writing to someone else's memory. It might be unallocated, in which case it might get allocated to some other variable and then you'll see weird things happening. Or it might be memory for some existing variable, in which case it will overwrite that memory and cause very difficult to debug memory problems.
But it didn't go so far out of bounds so as to walk out of the memory allocated to the whole process, so it didn't trigger your operating system's memory protection. That's usually a segfault.
Note that root->next is 8 bytes after root->index because this is a 64-bit machine, where pointers are 8 bytes and are aligned on 8-byte boundaries, so the compiler inserts 4 bytes of padding after index. If you were to put another integer into the struct after index, next would still be at offset 8.
There's another possibility: even though you only asked for sizeof(int) memory, malloc probably allocated more. Most memory allocators do their work in chunks. But this is all implementation defined, so your code still has undefined behavior.
And the weirdest part is that if I try to run: printf("size of pointer root: %lu \n", sizeof(*root)); I get size of pointer root: 16, when I clearly expected to see 4.
root is a pointer to a struct, and you'd expect sizeof(root) to be pointer sized, 8 bytes on a 64 bit machine to address 64 bits of memory.
*root dereferences that pointer, sizeof(*root) is the actual size of the struct. That's 16 bytes. (4 for the integer, 4 for padding, 8 for the struct pointer). Again, C doesn't track how much memory you allocated, it only tracks what the size of the variable is supposed to be.

C or C++ sprintf and value in struct

code:
sprintf(tmp, "xbitmap_width %d\n", symbol->scale);
Output:
xbitmap_width 1075052544
expected output - value of scale which is 5 so it should be:
xbitmap_width 5
What am i missing??? Why is sprintf taking pointer value?
Update:
If symbol->scale is indeed not a pointer, then also ensure tmp is big enough, to avoid overflow. I hope tmp is at least 18 chars big, but best make it big enough (like 30 or bigger), and if it's allocated on the heap: initialize it to zeroes: memset or calloc(30, sizeof *tmp) would be preferable.
You may also want to ensure that symbol is not a stack value, returned by a function. This, too, would be undefined behaviour. However, given that you say you're using new or malloc (which does not initialize the struct, BTW), that can't be the issue.
The not-initializing bit here (when using malloc) might be, though: malloc merely reserves enough memory to store a given object one or more times. The memory is not initialized, though:
char *str = malloc(100);
Is something like that thing where you give a bunch of monkeys type-writers: eventually one of them might wind up punching in a line of Shakespeare: well, if you malloc strings like this, and print them, eventually one of them might end up containing the string "Don't panic".
Now, this isn't exactly true, but you get the point...
To ensure your struct is initialized, either use calloc or memset those members that are giving you grief.
if your struct looks like this:
struct symbol
{
int *scale;
};
Then you are passing the value of scale to sprintf. This value is a memory address, not an int. An int, as you may know, is guaranteed to be at least 2 bytes in size (most commonly it's 4, though). A pointer is typically 4 or 8 bytes in size, so by passing a pointer and having sprintf interpret it as an int, you get undefined behaviour.
To print 5 in your case:
struct symbol *symbol = malloc(sizeof *symbol);
int s = 5;
symbol->scale = &s;
printf("%d\n", *(symbol->scale));//dereference the scale pointer
But this is undefined behaviour:
printf("%d\n", symbol->scale);//passing pointer VALUE ==> memory address
//for completeness & good practices' sake:
free(symbol);
Oh, and as stated in the comments: snprintf is to sprintf what strncpy is to strcpy and strncat is to strcat: it's safer to use the function which allows you to specify a maximum of chars to set

pointer typecasting

int main()
{
int *p,*q;
p=(int *)1000;
q=(int *)2000;
printf("%d:%d:%d",q,p,(q-p));
}
output
2000:1000:250
1. I cannot understand the line p=(int *)1000;. Does this mean that p is pointing to address location 1000? What if I do *p=22: is this value stored at address 1000, overwriting the existing value? If it overwrites the value, what if another program is working with address 1000?
2. How is q-p equal to 250?
EDIT: I tried printf("%u:%u:%u",q,p,(q-p)); the output is the same
int main()
{
int *p;
int i=5;
p=&i;
printf("%u:%d",p,i);
return 0;
}
the output
3214158860:5
does this mean the addresses used by compiler are integers? there is no difference between normal integers and address integers?
does this mean that p is pointing to 1000 address location?
Yes.
what if I do *p=22
It's invoking undefined behavior - your program will most likely crash with a segfault.
Note that in modern OSes, addresses are virtual - you can't overwrite another process's address space like this, but you can attempt writing to an invalid memory location in your own process's address space.
how q-p=250?
Because pointer arithmetic works like this (in order to be compatible with array indexing). The difference of two pointers is the difference of their value divided by sizeof(*ptr). Similarly, adding n to a pointer ptr of type T results in a numeric value ptr + n * sizeof(T).
Read this on pointers.
does this mean the addresses used by compiler are integers?
That "used by compiler" part is not even necessary. Addresses are integers, it's just an abstraction in C that we have nice pointers to ease our life. If you were coding in assembly, you would just treat them as unsigned integers.
By the way, writing
printf("%u:%d", p, i);
is also undefined behavior - the %u format specifier expects an unsigned int, and not a pointer. To print a pointer, use %p:
printf("%p:%d", (void *)p, i);
Yes, with *p=22 you write to 1000 address.
q-p is 250 because the size of int is 4, so it's (2000-1000)/4 = 250.
The meaning of p = (int *) 1000 is implementation-defined. But yes, in a typical implementation it will make p point to address 1000.
Doing *p = 22 afterwards will indeed attempt to store 22 at address 1000. However, in general case this attempt will lead to undefined behavior, since you are not allowed to just write data to arbitrary memory locations. You have to allocate memory in one way or another in order to be able to use it. In your example you didn't make any effort to allocate anything at address 1000. This means that most likely your program will simply crash, because it attempted to write data to a memory region that was not properly allocated. (Additionally, on many platforms in order to access data through pointers these pointers must point to properly aligned locations.)
Even if you somehow succeed in writing your 22 at address 1000, it does not mean that it will in any way affect "other programs". On some old platforms it would (like DOS, for one example). But modern platforms implement independent virtual memory for each running program (process). This means that each running process has its own separate address 1000 and it cannot see the other program's address 1000.
Yes, p is pointing to virtual address 1000. If you use *p = 22;, you are likely to get a segmentation fault; quite often, the whole first 1024 bytes are invalid for reading or writing. It can't affect another program assuming you have virtual memory; each program has its own virtual address space.
The value of q - p is the number of units of sizeof(*p) or sizeof(*q) or sizeof(int) between the two addresses.
Casting arbitrary integers to pointers is implementation-defined, and dereferencing the result is undefined behavior. Anything can happen, including nothing, a segmentation fault, or silently overwriting other processes' memory (unlikely in modern virtual memory models).
But we used to use absolute addresses like this back in the real mode DOS days to access interrupt tables and BIOS variables :)
About q-p == 250, it's the result of semantics of pointer arithmetic. Apparently sizeof int is 4 in your system. So when you add 1 to an int pointer it actually gets incremented by 4 so it points to the next int not the next byte. This behavior helps with array access.
does this mean that p is pointing to 1000 address location?
Yes. But address 1000 may not belong to your process. In that case you are accessing memory illegally, which may result in a segmentation fault.

Dynamic memory allocation in 'c' Issues

I was writing a code using malloc for something and then faced a issue so i wrote a test code which actually sums up the whole confusion which is below::
# include <stdio.h>
# include <stdlib.h>
# include <error.h>
int main()
{
int *p = NULL;
void *t = NULL;
unsigned short *d = NULL;
t = malloc(2);
if(t == NULL) perror("\n ERROR:");
printf("\nSHORT:%d\n",sizeof(short));
d =t;
(*d) = 65536;
p = t;
*p = 65536;
printf("\nP:%p: D:%p:\n",p,d);
printf("\nVAL_P:%d ## VAL_D:%d\n",(*p),(*d));
return 0;
}
Output:: abhi#ubuntu:~/Desktop/ad/A1/CC$ ./test
SHORT:2
P:0x9512008: D:0x9512008:
VAL_P:65536 ## VAL_D:0
I am allocating 2 bytes of memory using malloc. The void * pointer that malloc returns is stored in a void * pointer 't'.
Then two pointers are declared: p, of integer type, and d, of short type. Then I assigned t to both of them (p = t and d = t), which means both d and p point to the same memory location on the heap.
On trying to save 65536 (2^16) to (*d) I get a warning that the large int value is truncated, which is as expected.
Now I again saved 65536 (2^16) to (*p), which did not cause any warning.
On printing both (*p) and (*d) I got different values (though each is correct for its own pointer type).
My questions are:
Though I have allocated 2 bytes (i.e. 16 bits) of heap memory using malloc, how am I able to save 65536 in those two bytes (by using (*p), which is a pointer of integer type)?
I have a feeling that the cause of this is the automatic type conversion of void * to int * (in p = t), so is it that assigning t to p leads to access to memory regions outside of what was allocated through malloc?
Even though all this is happening, how does dereferencing the same memory region through (*p) and (*d) print two different answers (though this can also be explained if what I am thinking is the cause in question 1)?
Can somebody put some light on this, it will be really appreciated..and also if some one can explain the reasons behind this..
Many thanks
Answering your second question first:
The explanation is that an int is generally 4 bytes while a short is only 2, and both pointers refer to the same starting address. On a little-endian machine (as yours most likely is), the least significant bytes of an int are stored first, so the storage for the short overlaps the two least significant bytes of the int. 65536 is 0x00010000: its two low-order bytes are both 0, and the 1 sits in the third byte, outside the short.
Therefore, when the program prints *d, it interprets the memory as a short and reads only those first two bytes, which is not where the set bit of 65536 was stored when *p was written. Note that writing *p = 65536; overwrote the previous *d = 65536; (which had itself been truncated to 0), populating the two least significant bytes with 0.
Regarding the first question: The compiler does not store the 65536 for *p within 2 bytes. It simply goes outside the bounds of the memory you've allocated - which is likely to cause a bug at some point.
In C there is no protection at all for writing out of bounds of an allocation. Just don't do it, anything can happen. Here it seems to work for you because by some coincidence the space behind the two bytes you allocated isn't used for something else.
1) The granularity of the OS memory manager is typically 4K. An overwrite by a couple of bytes is unlikely to trigger an AV/segfault, but it will corrupt any data in the adjacent location, leading to:
2) Undefined behaviour. This set of behaviours includes 'apparently correct operation' (for now!).

C Language - Malloc unlimited space?

I'm having difficulty learning C language's malloc and pointer:
What I learned so far:
Pointer is memory address pointer.
malloc() allocate memory locations and returns the memory address.
I'm trying to create a program to test malloc and pointer, here's what I have:
#include<stdio.h>
main()
{
char *x;
x = malloc(sizeof(char) * 5);
strcpy(*x, "123456");
printf("%s",*x); //Prints 123456
}
I'm expecting an error since the size I provided to malloc is 5, where I put 6 characters (123456) to the memory location my pointer points to. What is happening here? Please help me.
Update
Where to learn malloc and pointer? I'm confused by the asterisk thing, like when to use asterisk etc. I will not rest till I learn this thing! Thanks!
You are invoking undefined behaviour because you are writing (or trying to write) beyond the bounds of allocated memory.
Other nitpicks:
Because you are using strcpy(), you are copying 7 bytes, not 6 as you claim in the question.
Your call to strcpy() is flawed - you are passing a char instead of a pointer to char as the first argument.
If your compiler is not complaining, you are not using enough warning options. If you're using GCC, you need at least -Wall in your compiler command line.
You need to include both <stdlib.h> for malloc() and <string.h> for strcpy().
You should also explicitly specify int main() (or, better, int main(void)).
Personally, I'm old school enough that I prefer to see an explicit return(0); at the end of main(), even though C99 follows C++98 and allows you to omit it.
You may be unlucky and get away with invoking undefined behaviour for a while, but a tool like valgrind should point out the error of your ways. In practice, many implementations of malloc() allocate a multiple of 8 bytes (and some a multiple of 16 bytes), and given that you delicately do not step over the 8 byte allocation, you may actually get away with it. But a good debugging malloc() or valgrind will point out that you are doing it wrong.
Note that since you don't free() your allocated space before you return from main(), you (relatively harmlessly in this context) leak it. Note too that if your copied string was longer (say as long as the alphabet), and especially if you tried to free() your allocated memory, or tried to allocate other memory chunks after scribbling beyond the end of the first one, then you are more likely to see your code crash.
Undefined behaviour is unconditionally bad. Anything could happen. No system is required to diagnose it. Avoid it!
If you call malloc you get the address of a memory region on the heap.
If it returns e.g. 1000, your memory would look like:
Adr Value
----------
1000 1
1001 2
1002 3
1003 4
1004 5
1005 6
1006 0
after the call to strcpy(). You wrote 7 chars (2 more than allocated).
x == 1000 (pointer address)
*x == 1 (dereferenced the value x points to)
There are no warnings or error messages from the compiler, since C doesn't have any range-checking.
My three cents:
Use x, as (*x) is the value that is stored at x (which is unknown in your case) - you are writing to unknown memory location. It should be:
strcpy(x, "123456");
Secondly - "123456" is not 6 bytes, it's 7. You forgot about trailing zero-terminator.
Your program with its current code might work, but that is not guaranteed.
What I would do:
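#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char str[] = "123456";
char *x;
x = malloc(sizeof(str)); //7 bytes: 6 characters plus the terminating '\0'
if (x == NULL) return 1;
strcpy(x, str);
printf("%s",x); //Prints 123456
free(x);
return 0;
}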
Firstly, there is one problem with your code:
x is a pointer to a memory area where you allocated space for 5 characters.
*x is the value of the first character.
You should use strcpy(x, "123456");
Secondly, the memory after your 5 allocated bytes may happen to be valid, so you will not necessarily receive an error.
#include<stdio.h>
main()
{
char *x;
x = malloc(sizeof(char) * 5);
strcpy(x, "123456");
printf("%s",x); //Prints 123456
}
Use this; it will appear to work.
Compare your program with mine.
Here you are allocating 5 bytes and writing 7 (six characters plus the terminating '\0'), so the 6th and 7th bytes are stored at the next consecutive addresses. Those extra bytes can be handed to someone else by the memory manager, so at any time they can be changed by other code, because they are not yours: you haven't malloc'd them. That's why this is called undefined behaviour.

Resources