Example code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    char b[] = {"abcd"};
    char *c = NULL;
    printf("\nsize: %zu\n", sizeof(b));
    c = (char *)malloc(sizeof(char) * 3);
    memcpy(c, b, 10); // here invalid read and invalid write
    printf("\nb: %s\n", b);
    printf("\nc: %s\n", c);
    return 0;
}
As the comment shows, the code performs an invalid read and an invalid write, yet this small program runs fine and does not produce a core dump.
In my big library, however, even a 1-byte invalid read or invalid write always produced a core dump.
Question:
Why do I sometimes get a core dump from an invalid read/write and sometimes do not get a core dump?
It depends entirely on what you overwrite or dereference when you do an invalid read or write. For example, if you overwrite part of a pointer that is later dereferenced (say, its most significant byte), that pointer can end up referring to a completely different, and completely invalid, area of memory.
So, for example, if memory were arranged such that the memcpy past the end of c overwrote a pointer that is later passed to printf(), printf() would try to dereference it to print a string. Since it would no longer be a valid pointer, that would cause a segfault. But memory layout is platform dependent (and compiler dependent), so you may not see the same behavior with similar examples in different programs.
What you are doing is basically a buffer overflow, and in your code sample more specifically a heap overflow. Whether you see a crash depends on which memory area you access and whether you have permission to read or write it (which has been well explained by Dan Fego). I think the example provided by Dan Fego is more about a stack-based overflow (correction welcome!). gcc has protection against buffer overflows on the stack (stack smashing). You can see a stack-based overflow in the following example:
#include <stdio.h>
#include <string.h>
int main (void)
{
    char b[] = { "abcdefghijk"};
    char c [8];
    memcpy (c, b, sizeof c + 1); // here invalid read and invalid write
    printf ("\nsize: %zu\n", sizeof b);
    printf ("\nc: %s\n", c);
    return 0;
}
Sample output:
$ ./a.out
size: 12
c: abcdefghi���
*** stack smashing detected ***: ./a.out terminated
This protection can be disabled with the -fno-stack-protector option in gcc.
Buffer overflows are one of the major causes of security vulnerabilities. Unfortunately, functions like memcpy do not check for these kinds of problems, but there are ways to protect against them.
Hope this helps!
1 - You create a 3-char buffer c, but you copy 10 chars into it. That is an error.
It is called a buffer overflow: you write to memory that does not belong to you, so the behavior is undefined. It could crash, it could appear to work fine, or it could modify another variable you created.
So the right thing to do is to allocate enough memory for c to hold the contents of b:
c = (char *)malloc(sizeof(b)); // sizeof(b) is 5 and already counts the '\0' that ends every string in C.
2 - When you copy b into c, don't forget to copy the end-of-string char '\0' as well. Every C string function relies on it,
so printf("%s",c); knows where the string finishes.
3 - You copied 10 chars from b to c, but b contains only 5 chars (a, b, c, d and '\0'), so the behavior of memcpy is undefined (e.g. memcpy can try to read memory that can't be read).
You can copy only the memory you own: the 5 chars of b.
4 - Your definition of b is valid; the more common spellings are char b[] = "abcd"; or char b[] = {'a','b','c','d',0};
This is what I have tried. I have not even ended my string with a \0 character.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
    int size=5;
    char *str = (char *)malloc((size)*sizeof(char));
    *(str+0) = 'a';
    *(str+1) = 'b';
    *(str+2) = 'c';
    *(str+3) = 'd';
    *(str+4) = 'e';
    *(str+5) = 'f'; // one byte past the 5 allocated
    printf("%d %s", (int)strlen(str), str);
    return 0;
}
According to the rule, it can store only 4 characters and one \0, since I asked malloc for 5 bytes.
It gives me the perfect output.
Output
6 abcdef
Check this out here : https://onlinegdb.com/B1UeOXbjH
You allocated your own memory, so it is up to you to manage it responsibly. In your example, you allocated 5 bytes of RAM and created a pointer that points to the first address of it. Your pointer is not a string, and it is not an array. You then wrote 6 bytes, starting at the address your pointer points to. The 6th byte overflows into memory you did not allocate, which may be in use for something else, so the write could cause unknown problems. That is a buffer overflow, and because you didn't free the memory you allocated before quitting the program, you also leaked it. You didn't add a \0 anywhere, so I honestly think you lucked out: there really isn't any way to know how strlen() would respond. If you want C to handle it for you, then write char *str = "abcdef"; and that will create your string of length 6 plus the \0. But if you do it manually like you did, then you have to handle everything.
C does not count its arrays: if you ask for a chunk of memory of so-many bytes, it gives you that chunk via a pointer, and it's entirely up to you to manage it responsibly.
This leads to a certain efficiency - no overhead from the compiler/runtime checking all this for you - but creates enormous challenges for incorrect code (which you've shown an example of).
Many of us very much like the down-to-the-metal efficiency of C, but there's a reason that so many prefer languages such as Java or C# that do manage this for you and enforce array bounds. It's a tradeoff.
While working on dynamic memory allocation in C, I am getting confused when allocating memory to a char pointer. Although I give only 1 byte as the limit, the char pointer accepts input of any length, even though each letter takes 1 byte.
Also I have tried to find sizes of pointer before and after input. How can I understand what is happening here? The output is confusing me.
Look at this code:
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int limit;
    printf("Please enter the limit of your string - ");
    gets(&limit);
    char *text = (char*) malloc(limit*4);
    printf("\n\nThe size of text before input is %d bytes",sizeof(text));
    printf("\n\nPlease input your string - ");
    scanf("%[^\n]s",text);
    printf("\n\nYour string is %s",text);
    printf("\n\nThe size of char pointer text after input is %d bytes",sizeof(text));
    printf("\n\nThe size of text value after input is %d bytes",sizeof(*text));
    printf("\n\nThe size of ++text value after input is %d bytes",sizeof(++text));
    free(text);
    return 0;
}
Check this output:
It works because malloc usually doesn't allocate exactly the number of bytes you pass to it.
It reserves memory in multiples of "blocks", and it usually reserves more memory than requested so it can reuse it for later malloc calls as an optimization. (This is implementation specific.)
Check the glibc malloc internals, for example.
Using more memory than malloc allocated is undefined behavior. You may overwrite malloc's metadata saved on the heap or corrupt other data.
Also I have tried to find sizes of pointer before and after input. How
can I understand what is happening here? The output is confusing me.
The size of a pointer is fixed for a given pointer type on a given platform; it is usually 4 or 8 bytes, depending on the address size. It doesn't have anything to do with the size of the data it points to.
Welcome to the world of Undefined Behaviour!
char *text = malloc(limit*4); (don't cast malloc in C) makes text point to the first element of an array of size limit*4.
C will not prevent you from writing past the end of any array; the behaviour is simply undefined by the standard. It may work fine, it may crash immediately, or you may experience abnormal behaviour later in the program.
Here, the underlying system call has probably allocated a full memory page (often 4k), and as you have not called malloc again, you have just used memory that belongs to the process but is still officially unused. But do not rely on that, and never do it in production code.
And sizeof does not tell you anything about what a pointer points to. sizeof(text) is sizeof(char *) (the same goes for sizeof(++text), for the same reason), which is the size of a pointer (generally 2, 4 or 8 bytes), and sizeof(*text) is sizeof(char), which by definition is 1.
C trusts that you as the programmer know how much memory you have asked for and will not try to use more. Anything can happen if you do (including the expected result), but do not blame the language or the compiler if it breaks: only you will be to blame.
I am trying to understand the array concept in string.
char a[5]="hello";
Here, array a is a character array of size 5. "hello" occupies array indices 0 to 4. Since we have declared the array size as 5, there is no space to store the null character at the end of the string.
So my understanding is that when we try to print a, it should print until a null character is encountered. Otherwise it may also run into a segmentation fault.
But when I ran it on my system, it always printed "hello" and terminated.
So can anyone clarify whether my understanding is correct? Or does it depend on the system we execute it on?
As ever so often, the answer is:
Undefined behavior is undefined.
What this means is, trying to feed this character array to a function handling strings is wrong. It's wrong because it isn't a string. A string in C is a sequence of characters that ends with a \0 character.
The C standard will tell you that this is undefined behavior. So, anything can happen. In C, you don't have runtime checks, the code just executes. If the code has undefined behavior, you have to be prepared for any effect. This includes working like you expected, just by accident.
It's very well possible that the byte following in memory after your array happens to be a \0 byte. In this case, it will look to any function processing this "string" as if you passed it a valid string. A crash is just waiting to happen on some seemingly unrelated change to the code.
You could try to add some char foo = 42; before or after the array definition, it's quite likely that you will see that in the output. But of course, there's no guarantee, because, again, undefined behavior is undefined :)
What you have done is undefined behavior. Apparently, with whatever compiler you used, the memory right after your array happened to contain 0.
Here, array a is an character array of size 5. "hello" occupies the array index from 0 to 4. Since, we have declared the array size as 5, there is no space to store the null character at the end of the string.
So my understanding is when we try to print a, it should print until a null character is encountered.
Yes, when you use printf("%s", a), it prints characters until it hits a '\0' character (or segfaults or something else bad happens - undefined behavior). I can demonstrate that with a simple program:
#include <stdio.h>
int main()
{
    char a[5] = "hello";
    char b[5] = "world";
    int c = 5;
    printf("%s%s%d\n", a, b, c);
    return 0;
}
Output:
$ ./a.out
helloworldworld5
You can see the printf function continuing to read characters after it has already read all the characters in array a. I don't know when it will stop reading characters, however.
I've slightly modified my program to demonstrate how this undefined behavior can create bad problems.
#include <stdio.h>
#include <string.h>
int main()
{
    char a[5] = "hello";
    char b[5] = "world";
    int c = 5;
    printf("%s%s%d\n", a, b, c);
    char d[5];
    strcpy(d, a);
    printf("%s", d);
    return 0;
}
Here's the result:
$ ./a.out
helloworld��world��5
*** stack smashing detected ***: <unknown> terminated
helloworldhell�p��UAborted (core dumped)
This is a classic case of stack overflow (pun intended) due to undefined behavior.
Edit:
I need to emphasize: this is UNDEFINED BEHAVIOR. What happened in this example may or may not happen to you, depending on your compiler, architecture, libraries, etc. You can make guesses to what will happen based on your understanding of different implementations of various libraries and compilers on different platforms, but you can NEVER say for certain what will happen. My example was on Ubuntu 17.10 with gcc version 7. My guess is that something very different could happen if I tried this on an embedded platform with a different compiler, but I cannot say for certain. In fact, something different could happen if I had this example inside of a larger program on the same machine.
After writing a program to reverse a string, I am having trouble understanding why I got a seg fault while trying to reverse the string. I have listed my program below.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void reverse(char *);
int main() {
    char *str = calloc(1,'\0');
    strcpy(str,"mystring0123456789");
    reverse(str);
    printf("Reverse String is: %s\n",str);
    return 0;
}
void reverse(char *string) {
    char ch, *start, *end;
    int c=0;
    int length = strlen(string);
    start = string;
    end = string;
    while (c < length-1){
        end++;
        c++;
    }
    c=0;
    while(c < length/2){
        ch = *end;
        *end = *start;
        *start = ch;
        start++;
        end--;
        c++;
    }
}
1st Question:
Even though I have allocated only 1 byte of memory to the char pointer str (calloc(1,'\0')), and I copied an 18-byte string mystring0123456789 into it, it didn't throw any error and the program worked fine without any SEGFAULT.
Why did my program not throw an error? Ideally it should throw some error, as it doesn't have any memory to store that big string. Can someone throw light on this?
The program ran perfectly and gave me the output Reverse String is: 9876543210gnirtsym.
2nd Question:
If I replace the statement
strcpy(str,"mystring0123456789");
with
str="mystring0123456789\0";
the program gives a segmentation fault even though I have allocated enough memory for str (malloc(100)).
Why is the program throwing a segmentation fault?
Even though i have allocated only 1 byte of memory to the char pointer str(calloc(1,'\0')), and i copied a 18 bytes string "mystring0123456789" into it, and it didn't throw any error and the program worked fine without any SEGFAULT.
Your code had a bug -- of course it's not going to do what you expect. Fix the bug and the mystery will go away.
If I replace the statement
strcpy(str,"mystring0123456789");
with
str="mystring0123456789\0";
the program gives segmentation fault even though i have allocated enough memory for str (malloc(100)).
Because when you finish this, str points to a constant. This throws away the previous value of str, a pointer to memory you allocated, and replaces it with a pointer to that constant.
You cannot modify a constant, that's what makes it a constant. The strcpy function copies the constant into a variable which you can then modify.
Imagine if you could do this:
int* h = &2;
Now, if you did *h = 1; you'd be trying to change that constant 2 in your code, which of course you can't do.
That's effectively what you're doing with str="mystring0123456789\0";. It makes str point to that constant in your source code which, of course, you can't modify.
There's no requirement that it throw a segmentation fault. All that happens is that your broken code invokes undefined behavior. If that behavior has no visible effect, that's fine. If it formats the hard drive and paints the screen blue, that's fine too. It's undefined.
You're overwriting the pointer value with the address of a string literal, which totally doesn't use the allocated memory. Then you try to reverse the string literal which is in read-only memory, which causes the segmentation fault.
Your program did not throw an error because, even though you did the wrong thing, nothing caught you (more below). You wrote data where you were not supposed to, but you got “lucky” and did not break anything by doing this.
strcpy(str,"mystring0123456789"); copies data into the place where str points. It so happens that, at that place, you are able to write data without causing a trap (this time). In contrast, str="mystring0123456789\0"; changes str to point to a new place. The place it points to is the place where "mystring0123456789\0" is stored. That place is likely read-only memory, so, when you try to write to it in the reverse routine, you get a trap.
More about 1:
When calloc allocates memory, it merely arranges for there to be some space that you are allowed to use. Physically, there is other memory present. You can write to that other memory, but you should not. This is exactly the way things work in the real world: If you rent a hotel room, you are allowed to use that hotel room, but it is wrong for you to use other rooms even if they happen to be open.
Sometimes when you trespass where you are not supposed to, in the real world or in a program, nobody will see, and you will get away with it. Sometimes you will get caught. The fact that you do not get caught does not mean it was okay.
One more note about calloc: You asked it to allocate space for one thing of zero size (the source code '\0' evaluates to zero). So you are asking for zero bytes. Various standards (such as C and Open Unix) may say different things about this, so it may be that, when you ask for zero bytes, calloc gives you one byte. However, it certainly does not give you as many bytes as you wrote with strcpy.
It sounds like you are writing C programs having come from a dynamic language, or at least a language that does automatic string handling. For lack of a more formal definition, I find C to be a language very close to the architecture of the machine. That is, you make a lot of the programming decisions. A lot of your program's problems are the result of your code causing undefined behavior. With strcpy, you copied into memory beyond what you had allocated; the behavior was undefined, and it happened not to crash. Assigning your fixed string "mystring0123456789\0", on the other hand, just assigned that literal's address to str, and the segfault came when reverse tried to write to it.
When you implement in C, you decide whether you want to define your storage areas at compile or run-time, or decide to have storage allocated from the heap (malloc/calloc). In either case, you have to write housekeeping routines to make sure you do not exceed the storage you have defined.
Assigning a string to a pointer merely assigns the string's address in memory; it does not copy the string, and a fixed string inside quotes "test-string" is read-only, and you cannot modify it. Your program may have worked just fine, having done that assignment, even though it would not be considered good C coding practice.
There are advantages to handling storage allocations this way, which is why C is a popular language.
Another case: you can get a segfault even when you use memory correctly, if your heap grows so large that memory can no longer accommodate it without overlapping the stack, text, data, or bss segments.
Proof: link, section Possible Cause #2
I'm having difficulty learning C language's malloc and pointer:
What I learned so far:
Pointer is memory address pointer.
malloc() allocate memory locations and returns the memory address.
I'm trying to create a program to test malloc and pointer, here's what I have:
#include<stdio.h>
main()
{
char *x;
x = malloc(sizeof(char) * 5);
strcpy(*x, "123456");
printf("%s",*x); //Prints 123456
}
I'm expecting an error since the size I provided to malloc is 5, where I put 6 characters (123456) to the memory location my pointer points to. What is happening here? Please help me.
Update
Where to learn malloc and pointer? I'm confused by the asterisk thing, like when to use asterisk etc. I will not rest till I learn this thing! Thanks!
You are invoking undefined behaviour because you are writing (or trying to write) beyond the bounds of allocated memory.
Other nitpicks:
Because you are using strcpy(), you are copying 7 bytes, not 6 as you claim in the question.
Your call to strcpy() is flawed - you are passing a char instead of a pointer to char as the first argument.
If your compiler is not complaining, you are not using enough warning options. If you're using GCC, you need at least -Wall in your compiler command line.
You need to include both <stdlib.h> for malloc() and <string.h> for strcpy().
You should also explicitly specify int main() (or, better, int main(void)).
Personally, I'm old school enough that I prefer to see an explicit return(0); at the end of main(), even though C99 follows C++98 and allows you to omit it.
You may be unlucky and get away with invoking undefined behaviour for a while, but a tool like valgrind should point out the error of your ways. In practice, many implementations of malloc() allocate a multiple of 8 bytes (and some a multiple of 16 bytes), and given that you delicately do not step over the 8 byte allocation, you may actually get away with it. But a good debugging malloc() or valgrind will point out that you are doing it wrong.
Note that since you don't free() your allocated space before you return from main(), you (relatively harmlessly in this context) leak it. Note too that if your copied string was longer (say as long as the alphabet), and especially if you tried to free() your allocated memory, or tried to allocate other memory chunks after scribbling beyond the end of the first one, then you are more likely to see your code crash.
Undefined behaviour is unconditionally bad. Anything could happen. No system is required to diagnose it. Avoid it!
If you call malloc, you get the address of a memory region on the heap.
If it returns e.g. 1000, your memory would look like this:
Adr Value
----------
1000 1
1001 2
1002 3
1003 4
1004 5
1005 6
1006 0
after the call to strcpy(). You wrote 7 chars (2 more than allocated).
x == 1000 (pointer address)
*x == '1' (dereferenced: the value x points to)
The overflow itself produces no warning or error message, since C doesn't have any range checking.
My three cents:
Use x, as (*x) is the value that is stored at x (which is unknown in your case) - you are writing to unknown memory location. It should be:
strcpy(x, "123456");
Secondly - "123456" is not 6 bytes, it's 7. You forgot about trailing zero-terminator.
Your program with its current code might work, but that is not guaranteed.
What I would do:
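#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    char str[] = "123456";
    char *x;
    x = malloc(sizeof(str)); // 7 bytes: 6 characters plus the terminator
    if (x == NULL)
        return 1;
    strcpy(x, str);
    printf("%s", x); // Prints 123456
    free(x);
    return 0;
}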
Firstly, there is one problem with your code:
x is a pointer to a memory area where you allocated space for 5 characters.
*x is the value of the first character.
You should use strcpy(x, "123456");
Secondly, the memory after your 5 allocated bytes may happen to be valid, so you will not necessarily receive an error.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    char *x;
    x = malloc(sizeof(char) * 5);
    strcpy(x, "123456"); // x, not *x; this still writes 7 bytes into 5
    printf("%s", x); // Prints 123456
}
Use this; it passes the pointer correctly.
See the difference between your program and mine.
Here you are allocating 5 bytes and writing 7 (six characters plus the terminator), so the extra bytes are stored at the next consecutive addresses. The memory manager may hand those bytes to some other allocation, so they can be changed at any time by other code; they are not yours because you haven't malloc'd them. That's why this is called undefined behaviour.