memcpy seg fault, seemingly innocuous

Got a seg fault from my memcpy that gdb can't tell me anything more about (at least beyond the simple ways I know how to use gdb...). This thing is deeply embedded in some code using Berkeley DB; I have extracted the only lines that should be relevant.
void *databuf;
int smallest;
databuf=malloc(2*sizeof(int));
memset(databuf,0,2*sizeof(int));
/* This next line comes from the DB structures; key.data is a void*... */
smallest=*(int *)key.data;
memcpy(databuf,(void *)smallest,sizeof(int));
To confirm the variable smallest is correct, I can run gdb and get
(gdb) print smallest
$1 = 120321
The error I receive (in gdb) is the useless
Program received signal SIGSEGV, Segmentation fault.
0x08048efa in main (argc=4, argv=0xbffff344) at index_DB-1.1.c:128
128 memcpy(databuf,(void *)smallest,sizeof(int));
(gdb) backtrace
#0 0x08048efa in main (argc=4, argv=0xbffff344) at index_DB-1.1.c:128
The reason I am doing this is mostly because I am bastardizing the Berkeley DB tutorial, but also later I want to do
memcpy(databuf+len,(void *)largest,sizeof(int));
i.e. have a void pointer databuf whose first bytes hold the smallest integer and whose next bytes hold the largest integer. What am I missing?

In this step, you are interpreting the value in smallest as a pointer:
memcpy(databuf,(void *)smallest,sizeof(int));
Since that value is almost certainly not a valid pointer, this is causing your segfault. Instead, you likely want:
memcpy(databuf, &smallest, sizeof smallest);
Unless you need smallest for some other reason, though, you can just copy directly from key.data to databuf:
memcpy(databuf, key.data, sizeof(int));

(void *)smallest
That takes the integer value of smallest and treats it as a pointer. What you meant to do was this:
(void *)&smallest

It's hard to tell what you're doing, considering the code is so awful, but this looks very suspicious:
memcpy(databuf,(void *)smallest,sizeof(int));
I believe smallest contains normal integer data, not a pointer to anything. So why are you dereferencing it? It doesn't point to anything.
You might want:
memcpy(databuf,(void *) &smallest,sizeof(int));
Also, this is suspect:
smallest=*(int *)key.data;
Is key.data guaranteed to be suitably aligned for an int?

Related

Using long int as a parameter for malloc

int main() {
int n;
long u=0,d=0,count=0,i=0;
char *p=(char *)malloc(sizeof(char)*n);
scanf("%ld",&n);
scanf("%s",p);
for(i=0;i<n;i++){
if(p[i]=='U'){
u=u+1;
}
if(p[i]=='D'){
d=d+1;
}
if((d-u)==0 && p[i]=='U'){
count=count+1;}
}
printf("%ld",count);
return 0;
}
In this standard syntax for memory allocation, if I replace "int n;" with "long int n;"
an error pops up saying:
GDB trace:
Reading symbols from solution...done.
[New LWP 10056]
Core was generated by `solution'.
Program terminated with signal SIGSEGV, Segmentation fault.
I have searched everywhere for a solution, but I don't quite know what to search for.
I would be grateful if anyone could help me out. Thanks :)
(This was executed on an online compiler)

There are a couple of things that I would like to point out:
First of all, you do not have to declare n as "long int". "long int" and "long" are the same. So,
long int n; //is same as
long n;
malloc() works perfectly fine whether n is an "int" or a "long". However, you don't seem to have initialized n. What is the value of n? C does not perform auto-initialization of variables and n might have a garbage value (even negative) which might cause your program to crash. So please give a value to n.
long n = 10; //example
or use a scanf() to input a value.
Now in your code, what is scanf() doing "after" malloc? I presume that you intended to read a value for n and then pass it to malloc. So please change the order of code to this:
scanf("%ld",&n);
char *p=(char *)malloc(sizeof(char)*n);
I ran your program with these changes on my system and it works fine (no segmentation fault)
malloc() limits: We know that malloc allocates from a heap. But I really don't see malloc returning NULL on current platforms (which are generally 64 bit). However, if you do try to allocate a very large chunk of memory, malloc might return NULL which will cause your program to crash.
So it's good to check the return value for malloc() and if that's NULL then take appropriate actions (such as retry or exit the program)
Having a check like the one below will always help:
if (p == NULL) {
printf("Malloc error");
exit(1);
}

Extracting the relevant parts of your code:
int n;
char *p=(char *)malloc(sizeof(char)*n);
The parameter to malloc is of type size_t, which is an unsigned type. If you pass an argument of any other integer type, it will be implicitly converted to size_t.
You report that with int n; you don't see a problem, but with long int n; your program dies with a segmentation fault.
In either case, you're passing an uninitialized value to malloc(). Just referring to the value of an uninitialized object has undefined behavior.
It may be that the arbitrary long int value you're passing to malloc() happens to cause it to fail and return a null pointer, causing a segmentation fault later when you try to dereference the pointer; the arbitrary int value might just happen to cause malloc to succeed. Checking whether malloc succeeded or failed would likely avoid the segmentation fault.
Passing an uninitialized value to malloc() is a completely useless thing to do. The fact that it behaves differently depending on whether that uninitialized value is an int or a long int is not particularly significant.
If you're curious, you might add a line to print the value of n before calling malloc(), and you definitely should check whether malloc() reported failure by returning a null pointer. Beyond that, you know the code is incorrect. Don't waste too much time figuring out the details of how it fails (or, worse, why it sometimes doesn't fail). Just fix the code by initializing n to the number of bytes you actually want to allocate. (And define n as an object of type size_t.)
Some more points:
The code in your question is missing several required #include directives. If they're missing in your actual code, you should add them. If they're present in your actual code, you should have included them in your question. Don't make assumptions about what you can safely leave out.
int main() should be int main(void). (This is a minor point that probably doesn't make any practical difference.)
scanf("%s",p);
This is inherently dangerous. It reads a blank-delimited string that can be arbitrarily long. If the user enters more characters than the buffer p points to can hold, you have undefined behavior.
u=u+1;
Not incorrect, but more idiomatically written as u++;.
(d-u)==0 is more clearly and safely written as d == u. (For extreme values of d and u the subtraction can overflow; an equality comparison doesn't have that problem.)

Strange Pointers Behaviour in C

I was experimenting with pointers. Look at this code:
#include <stdio.h>
int main() {
int numba = 1;
int *otherintptr = &numba;
printf("%d\n", otherintptr);
printf("%d\n", *otherintptr);
*otherintptr++;
printf("%d\n", otherintptr);
printf("%d\n", *otherintptr);
return 0;
}
The output is:
2358852
1
2358856
2358856
Now, I am well aware that (*otherintptr)++ would have incremented my int, but my question is not this.
After the increment, the memory location is correctly increased by 4 bytes, which is the size of an integer.
I'd like to know why the last printf instruction prints the memory location, while I am clearly asking to print the content of memory locations labelled 2358856 (I was expecting some dirty random content).
Note that the second printf statement prints the content of memory cell 2358852, (the integer 1) as expected.

What happens with these two lines
int numba = 1;
int *otherintptr = &numba;
In this build the compiler happened to lay the two variables out next to each other in the stack frame allocated when main was called. otherintptr initially points to the address of the numba variable.
Now, the next position on the stack (actually the previous one, if we consider that the stack grows down on x86-based architectures) is occupied by the otherintptr variable itself. Incrementing otherintptr therefore makes it point to itself, which is why you see the same value twice. None of this layout is guaranteed by the language.
To exemplify, let's assume that the stack for main begins at the 0x20 offset in memory:
0x20: 0x1 #numba variable
0x24: 0x20 #otherintptr variable pointing to numba
After executing the otherintptr++ instruction, the memory layout will look like this:
0x20: 0x1 #numba variable
0x24: 0x24 #otherintptr variable now pointing to itself
This is why the last two printf calls produce the same output.

When you did otherintptr++, you accidentally made otherintptr to point to otherintptr, i.e. to itself. otherintptr just happened to be stored in memory immediately after your numba.
In any case, you got lucky on several occasions here. It is illegal to use an int * pointer to access something that is not an int and not compatible with int. It is illegal to use %d to print pointer values.

I suppose you wanted to increment the integer otherintptr points to (numba). However, you actually incremented the pointer, because postfix ++ binds more tightly than unary *.
So otherintptr pointed past the variable. As there is no valid int object there, dereferencing the pointer is undefined behaviour. Thus anything can happen, and you were just lucky the program did not crash. It just happened by chance that otherintptr itself resided at that address.

Doubts about pointer and memory access

I have just started learning pointers in C and have the following few doubts. Answers to the questions below would really help me understand the concept of pointers in C. Thanks in advance.
i)
char *cptr;
int value = 2345;
cptr = (char *)value;
What is the use of (char *), and what does it mean in the above code snippet?
ii)
char *cptr;
int value = 2345;
cptr = value;
This also compiles without any error. Then what is the difference between code snippets i and ii?
iii) &value returns the address of the variable. Is it a virtual memory address in RAM? Suppose another C program is running in parallel: can that program have the same memory address as &value? Can each process have addresses that duplicate those in another process while remaining independent of each other?
iv)
#define MY_REGISTER (*(volatile unsigned char*)0x1234)
void main()
{
MY_REGISTER=12;
printf("value in the address tamil is %d",(MY_REGISTER));
}
The above snippet compiled successfully, but it fails with a segmentation fault. I don't know what mistake I am making. I want to know how to access the value at an arbitrary address using pointers. Is there any way? Will the program really have the address 0x1234?
v) printf("value at the address %d",*(236632));//consider the address 236632 available in
//stack
Why does the above printf statement produce an error?

1. That's a type cast; it tells the compiler to treat one type as some other (possibly unrelated) type. As for the result, see point 2 below.
2. That makes cptr point to the address 2345.
3. Modern operating systems isolate the processes. The address of one variable in one process is not valid in another process, even if started with the same program. In fact, the second process may have a completely different memory map due to Address Space Layout Randomisation (ASLR).
4. It's because you try to write to address 0x1234, which might be a valid address on some systems, but not on most, and almost never on a PC running e.g. Windows or Linux.

i)
(char *) means that you cast the data stored in value to a pointer ptr, which points to a char. This means that ptr points to the memory location 2345. In your code snippet ptr is undefined though. I guess there is more in that program.
ii)
The difference is that you now write to cptr, which is (as you defined) a pointer pointing to a char. There is not much of a difference from i), except that you write to a different variable and that you use an implicit conversion, which gets resolved by the compiler. Again, cptr now points to the location 2345 and expects there to be a char.
iii)
Yes you can say it is a virtual address. Also segmentation plays some parts in this game, but at your stage you don't need to worry about it at all. The OS will resolve that for you and makes sure, that you only overwrite variables in the memory space dedicated to your program. So if you run a program twice at the same time, and you print a pointer, it is most likely the same value, but they won't point at the same value in memory.
iv)
Didn't see the write instruction at first. You can't just write anywhere into memory, as you could overwrite another program's value.
v)
Similar issue as above. You cannot just dereference any number you want to; you first need to cast it to a pointer, otherwise neither the compiler, your OS nor your CPU will have a clue what exactly it is pointing to.
Hope I could help you, but I recommend, that you dive again in some books about pointers in C.

i.) Type cast: you cast the integer to a char pointer (char *).
ii.) You point to the address of 2345.
iii.) Refer to answer from Joachim Pileborg. ^ ASLR
iv.) You can't directly write into an address without knowing if there's already something in / if it even exists.
v.) Because you're actually using a pointer to print a normal integer out, which should throw the error C2100: illegal indirection.

You may think of pointers as numbers on mailboxes. When you assign a value to a pointer, e.g. cptr = 2345, it is like moving in front of mailbox 2345. That's OK: there is no actual interaction with the memory, hence no crash. When you write something like *cptr, this refers to the actual "content of the mailbox". Setting a value for *cptr is like trying to put something in the mailbox in front of you (the memory location). If you don't know who it belongs to (how the application uses that memory), it's probably a bad idea. You could use "malloc" to allocate memory and initialize a pointer, and "free" to clean up after you finish the job.

pointer typecasting

int main()
{
int *p,*q;
p=(int *)1000;
q=(int *)2000;
printf("%d:%d:%d",q,p,(q-p));
}
output
2000:1000:250
1. I cannot understand the line p=(int *)1000; does this mean that p points to address 1000? What if I do *p=22: is this value stored at address 1000, overwriting the existing value? If it overwrites the value, what if another program is working with address 1000?
2. How is q-p=250?
EDIT: I tried printf("%u:%u:%u",q,p,(q-p)); the output is the same
int main()
{
int *p;
int i=5;
p=&i;
printf("%u:%d",p,i);
return 0;
}
the output
3214158860:5
Does this mean the addresses used by the compiler are integers? Is there no difference between normal integers and address integers?

does this mean that p is pointing to 1000 address location?
Yes.
what if I do *p=22
It's invoking undefined behavior - your program will most likely crash with a segfault.
Note that in modern OSes, addresses are virtual: you can't overwrite another process's address space like this, but you can attempt to write to an invalid memory location in your own process's address space.
how q-p=250?
Because pointer arithmetic works like this (in order to be compatible with array indexing). The difference of two pointers is the difference of their value divided by sizeof(*ptr). Similarly, adding n to a pointer ptr of type T results in a numeric value ptr + n * sizeof(T).
Read this on pointers.
does this mean the addresses used by compiler are integers?
That "used by compiler" part is not even necessary. Addresses are integers, it's just an abstraction in C that we have nice pointers to ease our life. If you were coding in assembly, you would just treat them as unsigned integers.
By the way, writing
printf("%u:%d", p, i);
is also undefined behavior - the %u format specifier expects an unsigned int, and not a pointer. To print a pointer, use %p:
printf("%p:%d", (void *)p, i);

Yes, with *p=22 you write to 1000 address.
q-p is 250 because the size of int is 4, so it's (2000-1000)/4 = 250

The meaning of p = (int *) 1000 is implementation-defined. But yes, in a typical implementation it will make p to point to address 1000.
Doing *p = 22 afterwards will indeed attempt to store 22 at address 1000. However, in general case this attempt will lead to undefined behavior, since you are not allowed to just write data to arbitrary memory locations. You have to allocate memory in one way or another in order to be able to use it. In your example you didn't make any effort to allocate anything at address 1000. This means that most likely your program will simply crash, because it attempted to write data to a memory region that was not properly allocated. (Additionally, on many platforms in order to access data through pointers these pointers must point to properly aligned locations.)
Even if you somehow succeed in writing your 22 at address 1000, it does not mean that it will in any way affect "other programs". On some old platforms it would (like DOS, for one example). But modern platforms implement independent virtual memory for each running program (process). This means that each running process has its own separate address 1000 and it cannot see the other program's address 1000.

Yes, p is pointing to virtual address 1000. If you use *p = 22;, you are likely to get a segmentation fault; quite often, the whole first 1024 bytes are invalid for reading or writing. It can't affect another program assuming you have virtual memory; each program has its own virtual address space.
The value of q - p is the number of units of sizeof(*p) or sizeof(*q) or sizeof(int) between the two addresses.

Converting an arbitrary integer to a pointer gives an implementation-defined result, and dereferencing that pointer is undefined behavior. Anything can happen, including nothing, a segmentation fault, or silently overwriting other processes' memory (unlikely in modern virtual memory models).
But we used to use absolute addresses like this back in the real mode DOS days to access interrupt tables and BIOS variables :)
About q-p == 250: it's the result of the semantics of pointer arithmetic. Apparently sizeof(int) is 4 on your system. So when you add 1 to an int pointer it actually gets incremented by 4, so that it points to the next int, not the next byte. This behavior helps with array access.

does this mean that p is pointing to 1000 address location?
Yes. But address 1000 may not belong to your process. In that case you are illegally accessing memory outside your process's address space, which may result in a segmentation fault.

Help interpreting gdb: segfault in function

I am trying to debug a segfault, and I have this output from gdb:
(gdb) n
Program received signal SIGSEGV, Segmentation fault.
0x08048af9 in parse_option_list (ptr=0x6f72505f <Address 0x6f72505f out of bounds>, box_name=0x696d6978 <Address 0x696d6978 out of bounds>, option_list=0x313a7974,
num_elements=0x33313532) at submit.c:125
125 memcpy(&(option_list[(*num_elements)].value), value, 24);
(gdb) p num_elements
$15 = (int *) 0x33313532
(gdb) p *num_elements
Cannot access memory at address 0x33313532
(gdb)
It looks to me like something in memcpy() is going haywire. But I can't figure out what exactly the problem is, since that line references so many variables.
Can somebody help figure out what the 0x8048af9 in parse_option_list... line is telling me?
My function signature is:
int parse_option_list(char *ptr, char *box_name,
struct option_list_values *option_list, int *num_elements)
And this might be useful:
struct option_list_values {
char value[24];
char name[24];
};
Also, the variables value and name are not segfaulting (but if you think they are, i can post the code which sets those values.) But right now, if I can understand this gdb output, I will be happy as a clam! Thank you!

You have all the signs of a classic buffer overflow. The values of all the stack parameters have been overwritten by ASCII text - here is the translation of those values (assuming you have a little-endian architecture, which looks right):
ptr = 0x6f72505f = "_Pro"
box_name = 0x696d6978 = "ximi"
option_list = 0x313a7974 = "ty:1"
num_elements = 0x33313532 = "2513"
Concatenating them together gives "_Proximity:12513" - if this substring looks familiar to you, you should be able to track down where that data is being copied around - somewhere you are copying it into an array stored on the stack, without proper bounds checking.

0x8048af9 is the instruction pointer - the address of the executable code in memory that your code was at when the SEGFAULT occurred.
Are you sure that option_list[(*num_elements)].value is a valid address? You might have a buffer overflow, and be overwriting something you shouldn't be.
If num_elements is the length of option_list, then option_list[(*num_elements)] refers to just after the end of the list.

ptr=0x6f72505f - Address 0x6f72505f out of bounds
This is the useful part in this case
The first input to parse_option_list is invalid. Possibly an uninitialized pointer.