I started learning C yesterday, so this might be a trivial question, but I still don't get this.
Let's say I have the following code:
#include <stdio.h>
#include <string.h>
int main()
{
char text[8];
strcpy(text, "Lorem ");
puts(text);
strcat(text, "ipsum!");
puts(text);
return 0;
}
This results in a segmentation fault when (or after) concatenating the strings. However, if I change the size of text from 8 to 9, it doesn't.
Please correct me if I'm wrong but this is what I thought was right:
"Lorem " - size 6 (or 7 with \0)
"ipsum!" - size 6 (or 7 with \0)
"Lorem ipsum!" - size 12 (or 13 with \0)
So, where does the 8/9 come from? Is this caused by the implementation of strcat? Or is there something like a minimum array length? Or am I making a stupid beginner's mistake?
Thanks in advance.
It's just pure luck that it didn't crash, at least on Linux I get *** stack smashing detected ***.
You are trying to append a string to another string even though the storage for the latter is insufficient. It is an example of undefined behaviour (as pointed out in the comments).
C is the sort of language that always trusts the programmer with what is in the program and so you may not even get a warning for this when compiling.
Always ensure that you have enough storage in your buffers, there are very few facilities in C that guarantee safe behaviour so do not assume things such as minimum array length.
When you overrun the end of an array, the program has undefined behaviour. This means it might do what you expect it to do, or it might not. It might run as if you hadn't invoked undefined behaviour. It might crash. It might reformat your hard drive. It might print a blank page on your printer. It might do all of those things depending on when you run it.
You can't know. That's what 'undefined behaviour' is. Undefined.
I could give you an explanation for the behaviour, but it'd be unhelpful, and very hardware and implementation specific.
You can malloc whatever size you need, and you can realloc later if further input requires more space.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(void)
{
    /* "Lorem ipsum!" is 12 characters, plus 1 byte for the terminating '\0' */
    size_t size = 13;
    char *text = malloc(size);
    if (text == NULL)
        return 1;
    strcpy(text, "Lorem ");
    puts(text);
    /* Append in place. Passing text to sprintf as both the destination and
       a source argument would be undefined behaviour. */
    strcat(text, "ipsum!");
    puts(text);
    free(text);
    return 0;
}
I am using SHA1 to hash my ID.
However, even if I enter the same ID, the resulting hash is different.
This is my code:
#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
char *sha1_hash(char *input_url, char *hashed_url) {
unsigned char hashed_160bits[20];
char hashed_hex[41];
int i;
SHA1(input_url, 160, hashed_160bits);
for(i=0; i < sizeof(hashed_160bits); i++) {
sprintf(hashed_hex + i*2, "%02x", hashed_160bits[i]);
}
strcpy(hashed_url, hashed_hex);
return hashed_url;
}
int main()
{
char *input_url;
char *hashed_url;
while(1) {
input_url = malloc(sizeof(char)* 1024);
hashed_url = malloc(sizeof(char) * 1024);
printf("input url> ");
scanf("%s", input_url);
if (strcmp(input_url, "bye") == 0) {
free(hashed_url);
free(input_url);
break;
}
sha1_hash(input_url, hashed_url);
printf("hashed_url: %s\n", hashed_url);
free(hashed_url);
free(input_url);
}
return 0;
}
If I enter the same value for the first attempt and the second attempt, it will be hashed differently, but the third attempt will be hashed the same as the second attempt.
I think the dynamic allocation is a problem, but I can not think of a way to fix it.
You're not calling SHA1 correctly:
SHA1(input_ID, 160, hashed_ID_160bits);
The second parameter is the length of the data to hash. You're instead passing in the number of bits in the hash. As a result, you're reading past the end of the string contained in input_ID into uninitialized memory and possibly past the end of the allocated memory segment. This triggers undefined behavior.
You instead want:
SHA1(input_ID, strlen(input_ID), hashed_ID_160bits);
SHA1(input_ID, 160, hashed_ID_160bits);
That line is wrong. You are always getting hash for 160 bytes. I assume you want the hash for the input text only, so use that length:
SHA1(input_ID, strlen(input_ID), hashed_ID_160bits);
SHA1 always produces a 160-bit hash, so there is no need to pass 160 as a parameter. If you want a SHA hash of a different size, you need to use a different function, documented here, and then of course modify the rest of the code to match that hash size.
The reason you get different hashes at different times is that you are accessing the uninitialized part of the malloc buffer. This is Undefined Behavior, so "anything" can happen, and it's not generally useful to try and figure out what exactly happens, because it's not necessarily very deterministic. If you want to dig deeper than that, you could for example use a debugger to examine the memory addresses and contents on different loop iterations to see what exactly changed. Though, since this is Undefined Behavior, it's notoriously common for bad code to behave differently when you try to run it under a debugger, or add debug prints.
The problem seems to be in the uninitialized memory you are allocating.
malloc reserves memory for you, but the contents are 'whatever has been in there before'. And since you are not only hashing the string contents, but the entire buffer, you get different results each time.
Try using calloc, running memset over the buffer, or limiting your hashing to strlen(input), and see if that helps.
I've written the following code to understand better how strnlen behaves:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
char bufferOnStack[10]={'a','b','c','d','e','f','g','h','i','j'};
char *bufferOnHeap = (char *) malloc(10);
bufferOnHeap[ 0]='a';
bufferOnHeap[ 1]='b';
bufferOnHeap[ 2]='c';
bufferOnHeap[ 3]='d';
bufferOnHeap[ 4]='e';
bufferOnHeap[ 5]='f';
bufferOnHeap[ 6]='g';
bufferOnHeap[ 7]='h';
bufferOnHeap[ 8]='i';
bufferOnHeap[ 9]='j';
int lengthOnStack = strnlen(bufferOnStack,39);
int lengthOnHeap = strnlen(bufferOnHeap, 39);
printf("lengthOnStack = %d\n",lengthOnStack);
printf("lengthOnHeap = %d\n",lengthOnHeap);
return 0;
}
Note the deliberate lack of null termination in both buffers.
According to the documentation, it seems that the lengths should
both be 39:
RETURN VALUE
The strnlen() function returns strlen(s), if that is less than maxlen, or
maxlen if there is no null terminating ('\0') among the first maxlen characters
pointed to by s.
Here's my compile line:
$ gcc ./main_08.c -o main
And the output:
$ ./main
lengthOnStack = 10
lengthOnHeap = 10
What's going on here? Thanks!
First of all, strnlen() is not defined by the C standard; it's a POSIX standard function.
That being said, read the documentation carefully:
The strnlen() function returns the number of bytes in the string pointed to by s, excluding the terminating null byte ('\0'), but at most maxlen. In doing this, strnlen() looks only at the first maxlen bytes at s and never beyond s+maxlen.
So when calling the function, you need to make sure that, for the value you provide for maxlen, indexing the supplied array at [maxlen - 1] is valid, i.e. the array has at least maxlen elements in it.
Otherwise, while scanning the string, the function will venture into memory that is not allocated to you (an out-of-bounds array access), thereby invoking undefined behaviour.
Remember, this function calculates the length of a string, bounded above by a value (maxlen). That implies the supplied array is at least as large as the bound, not the other way around.
[Footnote]:
By definition, a string is null-terminated.
Quoting C11, chapter §7.1.1, Definitions of terms
A string is a contiguous sequence of characters terminated by and including the first null
character. [...]
Firstly, don't cast malloc.
Secondly, you are reading past the end of your arrays. The memory outside your array bounds is undefined, and therefore there is no guarantee that it is not zero; in this instance, it is!
In general, this kind of behaviour is sloppy - see this answer for a good summary of the potential consequences
Your question is roughly equivalent to the following:
I know that a burglar alarm is supposed to prevent your house from getting robbed. This morning when I left the house, I turned off the burglar alarm. Sometime during the day when I was away, a burglar broke in and stole my stuff. How did this happen?
Or to this:
I know you can use the cruise control on your car to help you avoid getting speeding tickets. Yesterday I was driving on a road where the speed limit was 65. I set the cruise control to 95. A cop pulled me over and I got a speeding ticket. How did this happen?
Actually, those aren't quite right. Here's a more contrived analogy:
I live in a house with a 10 yard long driveway to the street. I have trained my dog to fetch my newspaper. One day I made sure there were no newspapers on the driveway. I put my dog on a 39 yard leash, and I told him to fetch the newspaper. I expected him to go to the end of the leash, 39 yards away. But instead, he only went 10 yards, then stopped. How did this happen?
And of course there are many answers. Perhaps, when your dog got to the end of your newspaper-free driveway, right away he found someone else's newspaper in the gutter. Or perhaps, when the leash failed to stop him at the end of the driveway and he continued into the street, he got run over by a car.
The point of putting your dog on a leash is to restrict him to a safe area -- in this case, your property, that you control. If you put him on such a long leash that he can go off into the street, or into the woods, you're kind of defeating the purpose of controlling him by putting him on a leash.
Similarly, the whole point of strnlen is to behave gracefully if, within the buffer you have defined, there is no null character for strnlen to find.
The problem with non-null-terminated strings is that functions like strlen (which blindly search for null terminators) sail off the end and rummage blindly around in undefined memory, desperately trying to find the terminator. For example, if you say
char non_null_terminated_string[3] = "abc";
int len = strlen(non_null_terminated_string);
the behavior is undefined, because strlen sails off the end. One way to fix this is to use strnlen:
char non_null_terminated_string[3] = "abc";
int len = strnlen(non_null_terminated_string, 3);
But if you hand a bigger number to strnlen, it defeats the whole purpose. You're back wondering what will happen when strnlen sails off the end, and there's no way to answer that.
What happens when ... "Undefined behaviour (UB)"?
“When the compiler encounters [a given undefined construct] it is legal for it to make demons fly out of your nose”
Your heading is actually not UB, since calling strnlen("hi", 5) is perfectly legal, but the specifics of your question show it is indeed UB...
Both strlen and strnlen expect a string, i.e. a nul-terminated char sequence. Providing your non-nul-terminated char array to the function is UB.
What happens in your case is that the function reads the first 10 chars, finds no '\0', and, since it has not yet reached maxlen, continues to read further, thereby invoking UB (reading unallocated memory). It could be that your compiler took the liberty of ending your array with '\0', it could be that the '\0' was there before... the possibilities are limited only by the compiler designers.
I'm learning the C language by following a YouTube video, and I've got a couple of questions.
I typed the code below in Xcode:
#include<stdio.h>
int main() // 1. why do we have to use this line?
{
char food[] = "tuna";
printf("the best food is %s",food);
strcpy(food,"bacon"); // error here
return 0;
}
When I typed all of this, an error came up saying "implicitly declaring library function 'strcpy' with type 'char *(char *, const char *)'".
I have no idea what this means or why it happened. Sorry to bother you with this question, but I need your help. Thank you.
C is a great language to learn, but because it is so low level compared to Python, JavaScript, etc., there are many tasks you need to do manually, including memory management.
But before we get into that, your initial problem is not including the string.h header. Once you include that header, you'll likely get a segmentation fault instead.
Below is the explanation of why you'd get the segmentation fault.
So when you defined:
char food[] = "tuna";
C created a char array of 5 bytes. Why 5? Because all strings end with a null byte ('\0').
So your string in memory looks like "tuna\0".
Side note:
The null byte signifies the end of a string. This is very important when you are using a function like printf, because when printing a string, printf looks for the null byte to know where to stop.
So this char array has a total size of 5 bytes, 1 of which holds the null byte, leaving room for just 4 characters.
Then you try to copy the word "bacon" into your 5 byte array.
So strcpy copies the b, then the a, then the c, then the o, then the n, and when it tries to terminate the string with a null byte, it writes one byte past the end of the array. That is memory you don't own, so the behaviour is undefined and the program may seg fault.
So in order to fix your issue, you could strcpy a string that is the same length as or shorter than your original string. Or you can look into something like strdup and use pointers.
That code will look like:
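#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    const char *food = "tuna";        /* points at a read-only string literal */
    printf("the best food is %s\n", food);
    char *copy = strdup("bacon");     /* heap-allocated copy; strdup is a POSIX function */
    if (copy == NULL)
        return 1;
    food = copy;
    printf("the best food is %s\n", food);
    free(copy);                       /* strdup memory must be released with free */
    return 0;
}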
Hope this helps!
As an exercise, I suggest you try to explain the behavior of this code to me. You have to understand how "strings" are laid out in memory by the C language.
The code I post below is not right: it writes beyond the space allocated for the variable food. In this case the code doesn't generate a segmentation fault, but it might generate such an error!
If you compile the code below with gcc -Wall --pedantic -std=c99 prgname.c, the compiler may report neither warnings nor errors, but the code is NOT RIGHT.
PAY ATTENTION! :)
#include <stdio.h>
#include <string.h>
int main() // 1. why do we have to use this line?
{ // --- 'cause is a standard!!! :)
char food[] = "tuna";
char b[]="prova";
printf("the best food is %s - %s - food-size %u\n",food,b,(unsigned int)sizeof(food));
strcpy(food,"fried or roasted bacon"); // error here
printf("the best food is %s - %s - food-size %u\n",food,b,(unsigned int)sizeof(food));
return 0;
}
We have to use main(); it is the function where execution of a C program starts.
To use strcpy() (a predefined function) you need to declare it first. Its declaration is in string.h, so just #include <string.h>.
This will resolve your problem: implicitly declaring library function 'strcpy' with type 'char *(char *, const char *)'.
There is one more problem. When you declare
char food[]="tuna";
you get only 5 bytes allocated: four for the characters and one more for the null terminator. When you try putting bacon in it (using strcpy()), which is 6 bytes long, you write past the end of the array. That is undefined behaviour and may cause a segmentation fault at run time (the compiler does not create it).
I'm new to C, so I apologize if the answer is obvious, I've searched elsewhere.
the libraries I'm including are:
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <sys/stat.h>
the code that is failing is:
char *USER = getlogin();
char CWD[128];
if (USER == NULL)
printf("cry\n");
getcwd(CWD, sizeof(CWD));
printf("this prints\n");
printf(USER);
printf("this does not\n");
printf("%s#myshell:%s> ", USER, CWD);
cry does not print, so that should mean that getlogin is successful. The segfault is caused by printf(USER);
Further testing shows that the following block prints entirely:
printf("this prints\n");
printf(USER);
printf("this prints\n");
but the following block will print this prints and then segfault without showing USER:
printf("this prints\n");
printf(USER);
EDIT:
Sorry for wasting your time. I accidentally deleted an fgets that was supposed to follow it and that was causing the segfault. I've been on this bug for a couple hours now, I love it when the problem is so small.
Thanks
You should check getcwd return value. According to man page of getcwd:
If the length of the absolute pathname of the current working
directory, including the terminating null byte, exceeds size bytes,
NULL is returned, and errno is set to ERANGE; an application should
check for this error, and allocate a larger buffer if necessary.
If USER is null, printf will dereference a null pointer, which is undefined behavior. The compiler isn't required to do something that makes sense when undefined behavior occurs, so it would be allowed to not print "cry" when USER is null. You'll want to avoid undefined behavior.
Something else that might be causing your result is the fact that data sent to stdout is usually buffered. If the program crashes before the data is flushed from the buffer, the data will be lost instead of printed.
So here is how printf works...
it has to read in the format string.
it has to grab values off of the stack in order to fulfill format specifiers, format them and output them.
examples of things that could cause this...
char stringOfTest[5] = {'1','2','3','4','5'};
or
char * stringOfTest = "here are some formats that will be unsatisfied: %d%f%i%s%x";
so the first one could crash because the string isn't null terminated, and depending on the state of the application could basically read until it overruns a buffer (a good implementation should guard against this), or just happens to run into a format specifier that causes a crash... This goes for any garbage data also.
and the second one deals with how variadic functions work... all of the variables are pushed onto the stack in some order and the function doesn't have a safe way to know which the last one is... so it will keep grabbing things that are specified until it grabs something out of the stack and (maybe) crashes.
the third way is also in the second example... if you have a %s it will cause a pointer to be dereferenced... so that can also crash.
The following code reads up to 10 chars from stdin and outputs them.
When I input more than 10 chars, I expect it to crash because msg does not have enough room, but it does NOT! How could that be?
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[])
{
char* msg = malloc(sizeof(char)*10);
if (NULL==msg)
exit(EXIT_FAILURE);
scanf("%s", msg);
printf("You said: %s\n", msg);
if (strlen(msg)<10)
free(msg);
return EXIT_SUCCESS;
}
Use fgets instead; scanf is not buffer safe. What you are seeing is Undefined Behavior.
You may allocate a "safe" large buffer when using scanf(). For user input, about 2 lines (approx. 2x80 chars) should do; for files, somewhat more.
Conclusion: scanf() is kinda quick-and-dirty stuff, don't use it in serious projects.
You can specify max size in scanf() format string
scanf("%9s", msg);
I would imagine that malloc() allocates blocks of memory aligned to word boundaries. On a 32-bit machine, that means whatever you ask for will be rounded up to the nearest multiple of 4, so a request for 10 bytes yields at least 12. That means you might get away with a string of up to 11 characters (plus a '\0' terminator) without suffering any problems.
But don't ever assume this to be the case. Like everyone else is saying, you should always specify a safe maximum length in your format string if you want to avoid problems.
It does not crash because C is very lenient, contrary to popular belief. The program is not required to crash or even complain if a buffer is overflowed. Say you define
union {
    uint8_t a[3];
    uint32_t b;
} u;
then u.a[3] lies in perfectly fine memory (the fourth byte of the union, which b also occupies), so there is no reason for a crash (but don't ever do this). Even u.a[5] or u.a[100] may happen to land in accessible memory.
On the other hand, I may try to access u.a[-1], which may happen to be memory the OS does not allow you to access, causing a segfault.
As to what you should do to fix this: as others have pointed out, scanf is not safe to use with buffers. Use one of their suggestions.