Weird behavior of printf() calls after usage of itoa() function - c

I am brushing up my C skills. I tried the following code to learn how to use the itoa() function:
#include<stdio.h>
#include<stdlib.h>
void main(){
int x = 9;
char str[] = "ankush";
char c[] = "";
printf("%s printed on line %d\n",str,__LINE__);
itoa(x,c,10);
printf(c);
printf("\n %s \n",str); //this statement is printing nothing
printf("the current line is %d",__LINE__);
}
and I got the following output:
ankush printed on line 10
9
//here nothing is printed
the current line is 14
The thing is that if I comment out the statement itoa(x,c,10); the missing line is printed, and I get the following output:
ankush printed on line 10
ankush //so I got it printed
the current line is 14
Is this the behavior of itoa(), or am I doing something wrong?
Regards.

As folks pointed out in the comments, the size of the array c is 1. Since C strings require a null terminator, you can only store a string of length 0 in c. However, when you call itoa, it has no idea that the buffer you're handing it is only 1 character long, so it will happily keep writing digits into the memory after c (which is likely to be memory that contains str).
To fix this, declare c with a size large enough to hold the string you plan to put into it, plus 1 for the null terminator. The longest decimal representation of a 32-bit int is "-2147483648", which is 11 characters including the minus sign, so you can use char c[12].
To further explain the memory overwriting situation above, let's assume that c and str are allocated in contiguous regions on the stack (since they are local variables). So c might occupy memory address 1000 (because it is a zero-character string plus a null terminator), and str would occupy memory addresses 1001 through 1007 (because it has 6 characters, plus the null terminator). When you try to write the string "9" into c, the digit 9 is put into memory address 1000 and the null terminator is put into memory address 1001. Since 1001 is the first address of str, str now represents a zero-length string (a null terminator before any other characters). That's why you are getting the blank line.

c must be a buffer long enough to hold your number.
Write
char c[20];
instead of
char c[] = "";

Related

How does realloc treat null bytes in strings?

Relatively new C programmer here. I am reviewing the following code for a tutorial for a side project I am working on to practice C. The point of the abuf struct is to create a string that can be appended to. Here is the code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef struct abuf {
char* str;
unsigned int size;
} abuf;
void abAppend(abuf *ab, const char *s, int len) {
char *new = realloc(ab->str, ab->size + len);
if (new == NULL) return;
memcpy(&new[ab->size], s, len);
ab->str = new;
ab->size += len;
}
int main(void) {
abuf ab = {
NULL,
0
};
char *s = "Hello";
abAppend(&ab, s, 5);
abAppend(&ab, ", world", 7);
return 0;
}
Everything compiles and my tests (redacted for simplicity) show that the string "Hello" is stored in ab's str pointer, and then "Hello, world" after the second call to abAppend. However, something about this code confuses me. On the initial call to abAppend, the str pointer is null, so realloc, according to its man page, should behave like malloc and allocate 5 bytes of space to store the string. But the string "Hello" also contains the terminating null byte, \0. This should be the sixth and final byte of the string, if I understand this correctly. Isn't this null byte lost if we store "Hello\0" in a malloc-ed container large enough only to store "Hello"?
On the second call to abAppend, we concatenate ", world" to str. The realloc will enlarge str to 12 bytes, but the 13th byte, \0, is not accounted for. And yet, everything works, and if I test for the null byte with a loop like for (int i = 0; ab.str[i] != '\0'; i++), the loop works fine and increments i 12 times (0 thru 11), and stops, meaning it encountered the null byte on the 13th iteration. What I don't get is why does it encounter the null byte, if we don't allocate space for it?
I tried to break this code by doing weird combinations of strings, to no avail. I also tried to allocate an extra byte in each call to abAppend and changed the function a little to account for the extra space, and it performed the exact same as this version. How the null byte gets processed is eluding me.
> How does realloc treat null bytes in strings?
The behavior of realloc is not affected by the contents of the memory it manages.
> But the string "Hello" also contains the terminating null byte, \0. This should be the sixth and final byte of the string,…
The characters are copied with memcpy(&new[ab->size], s, len);, where len is 5. memcpy copies characters without regard to whether there is a terminating null byte. Given a length of 5, it copies 5 bytes. It does not append a terminating null character to those.
> The realloc will enlarge str to 12 bytes, but the 13th byte, \0, is not accounted for.
On the second call to abAppend, 7 more bytes are copied with memcpy, after the first 5 bytes. memcpy is given a length of 7 and copies only 7 bytes.
> … it encountered the null byte on the 13th iteration.
When you tested ab.str[12], you went beyond the behavior that the C standard defines. ab.str[12] is outside the allocated memory. It is possible it contained a null byte solely because nothing else in your process had used that memory for another purpose, and that is why your loop stopped. If you attempted this in the middle of a larger program that had done previous work, that byte might have contained a different value, and your test might have gone awry in a variety of ways.
You're correct that you initially allocated space only for the characters in the string "Hello" but not the terminating null byte, and that the second call added only enough bytes for the characters in the string ", world", with no terminating null byte.
So what you have is an array of characters but not a string, since it's not null terminated. If you then attempt to read past the allocated bytes, you trigger undefined behavior, and one of the ways UB can manifest itself is that things appear to work properly.
So you got "lucky" that things happened to work as if you had allocated space for the null byte and set it.

Printing second string causes the first string to repeat itself in C

When I run this code with the size of string b set to 5, it prints
mangoapple
Despite not printing string a, the output still is
mangoapple
But if I increase the size of the string b to 6, it only prints
mango
int i=0;
char a[5]="apple";
char b[5]="mango";
pf("\n%s",b);
You are reading past the end of the buffer.
The string "mango" needs 6 chars including the null terminator, and when you declare b[5] there is no room for the terminator.
Without a null terminator, printf with %s keeps reading whatever follows b in memory, which in this case happens to be the contents of a, so you see "mangoapple".
This is undefined behavior, since you cannot know what actually follows b on the stack.
Best practice: don't forget to leave room for the null terminator when storing a string.
Either enlarge the array by one for the \0 terminator, as mentioned in the comments above, or print only the characters you have:
pf("\n%.*s", (int)sizeof(b), b);

When using strcpy() does the destination string need to be one element bigger than the source string?

Consider this code snippet (simplified syntax for clarity).
void simple (char *bar) {
char MyArray[12];
strcpy(MyArray, bar);
}
My instructor says that MyArray can copy at most 12 elements from bar, but from what I've read, MyArray can only store 11 characters because it needs room for the null character at the end. So if the received value of bar is 12 or greater, a buffer overflow would occur. My instructor says that this will only happen if the received value of bar is 13 or greater. Who's right? I'd appreciate if you could cite a credible source so I can convince him.
The definition char MyArray[12] creates an array of 12 char, which can be used to store a string. Since strings in C are null terminated, one of those characters needs to be able to store the null byte at the end of the string.
So a variable of type char [12] can hold a string of at most 11 characters. Attempting to copy a string of length 12 or longer using strcpy as in your example will overflow the bounds of the array.
If you were to use strncpy as follows:
strncpy(MyArray, bar, 12);
This will not overflow the buffer, as it would copy at most 12 characters. However, if 12 characters are copied, the result is not null terminated and is therefore not technically a string. Any other string function that expects a null-terminated string would then read off the end of MyArray.
A proper use of strncpy would be:
void simple(char *bar) {
char MyArray[12];
strncpy(MyArray, bar, sizeof(MyArray)-1);
MyArray[sizeof(MyArray)-1]= '\0';
}
This just puts in a terminating null character, whether strncpy was able to do that or not.
It's hard to tell, because your question is a bit confusingly worded, but I think you're right, and that your instructor is wrong.
Given the code
void simple(char *bar) {
char MyArray[12];
strcpy(MyArray, bar);
}
if the passed-in bar points to a string of 11 or fewer characters, a valid string will be copied to MyArray, with no buffer overflow. But if the string is 12 (or more) characters long, you're right, there'll be a buffer overflow, because strcpy will also copy the 13th, terminating null character.
Earlier you asked about strncpy. Given the code
void simple2(char *bar) {
char MyArray[12];
strncpy(MyArray, bar, 12);
}
if the passed-in bar points to a string of 11 or fewer characters, a valid string will be copied to MyArray. But if the string is 12 characters long, we have a different problem. strncpy will copy 12 characters and stop, meaning that it won't copy the terminating null character. There won't be a buffer overflow, but MyArray still won't end up containing a valid string.
Also, you asked for a credible source. I wrote the C FAQ list -- would you consider that credible? :-)
> My instructor says that MyArray can copy at most 12 elements from bar
It would be more correct to say that the array MyArray can accommodate at most 12 elements of the array bar. Otherwise there will be an attempt to access memory beyond the array.
So in fact your instructor is right.
The array MyArray is declared having only 12 elements
char MyArray[12];
> but from what I've read, MyArray can only store 11 characters because it needs room for the null character at the end
The terminating zero is also a character. And the function strcpy copies all characters from the source string including the terminating zero that is present in the source string.
> So if the received value of bar is 12 or greater, a buffer overflow would occur
What does the magic number 12 mean in this context? Is it the number of characters in the array bar, or is it the length of the string stored in the array bar (which, used as an argument, is converted to a pointer to its first element)?
If the number 12 means the length of the string stored in the array bar, then the function strcpy will try to copy all characters of the string including the terminating zero, and in this case the array MyArray has to be declared as having 13 elements.
char MyArray[13];
However, if the number 12 means the number of elements in the array bar (used as an argument of the function) and it contains a string, then the length of the string is evidently less than 12. So the array MyArray can accept all characters of the source array including the terminating zero.
So the reason for the confusion is that you and your instructor have not settled what the number 12 means: whether it is the length of the source string or the size of the source array.
In the first case there will be indeed undefined behavior.
In the second case if the source array contains a string then the code will be well-formed.
A char array and string are similar, but not the same.
In C,
A string is a contiguous sequence of characters terminated by and including the first null character. C11dr §7.1.1 1
void simple (char *bar) {
char MyArray[12];
strcpy(MyArray, bar);
}
> My instructor says that MyArray can copy at most 12 elements from bar,
This is correct: MyArray[] can receive up to 12 characters.
strcpy() copies the memory, starting at bar, to the array MyArray[] and continues until it copies a null character. If more than 12 characters (the count of 12 includes the null character) are copied, the result is undefined behavior (UB).
> MyArray can only store 11 characters
Not quite. MyArray[] can store 12 characters. To treat that data as a string, a null character must be one of those 12. When interpreted as a string, the string includes all the characters up to the null character. It also includes the null character. Each element of MyArray[] could be an 'x', but then that memory would not be a string as it lacks a null character.
> So if the received value of bar is 12 or greater, a buffer overflow would occur.
Not quite. If the strcpy() attempts to write outside MyArray[], the result is undefined. Buffer overflow may occur. The program may stop, etc. The result is not defined. It is undefined behavior.
> My instructor says that this will only happen if the received value of bar is 13 or greater.
bar is a pointer - it likely does not have a "value of 13". bar likely points to memory that is a string. A string includes its terminating null character, so the string may consist of 12 non-null characters and a final null character for a total of 13 characters. MyArray[] is insufficient to store a copy of that string.
> Who's right?
I suspect the disconnect is in the imprecise meaning of "bar is 13". I see nothing reported from the instructor that is incorrect.

Does strlen return same value for a binary and ascii data

Please find the below code snippet.
unsigned char bInput[20];
unsigned char cInput[20];
From a function, I get binary data in bInput, and I determined its length using strlen(bInput).
I converted bInput from binary to ASCII, stored the result in cInput, and printed its length. But the two lengths are different.
I am new to programming. Please explain this behaviour.
Function strlen returns the index of the first character in memory with a value of 0 (AKA '\0'), starting from the memory address indicated by the input argument passed to this function.
If you pass a memory address of "something else" other than a zero-terminated string of characters (which has been properly allocated at that memory address), then there's a fair chance that it will result in a memory-access violation (AKA segmentation fault).
The result won't be the same in both cases.
Below is one sample scenario:
A null byte is valid UTF-8; it just doesn't work with C 'strings'.
char temp[8] = "abcde\0f";
What we have here is a buffer of length 8, which contains these char values:
97 98 99 100 101 0 102 0
Here, strlen(temp) is equal to 5 by design; however, the actual length of the buffer is eight.
strlen() counts each byte until it reaches the null character ('\0', meaning a byte whose value is zero). So if you are getting different lengths for the binary and ASCII data, you need to check the two points below in your conversion logic:
What you do when a binary value is zero.
Whether you convert any nonzero binary value to zero.

difference between sizeof and strlen in C linux

The first printf statement is giving output 3 and second giving 20.
Can anybody please explain what's the difference between the two here?
char frame[20],str[20];
printf("\nstrlen(frame)= %zu",strlen(frame));
printf("\nsizeof(frame) = %zu",sizeof(frame));
Thanks :)
sizeof is a compile-time operator and determines the size in bytes that a type or object consumes. In the case of frame (char[20]) that is 20 bytes.
strlen is a run-time function that scans from a given pointer until the first occurrence of a nul terminator '\0', returning the number of characters before it.
Because the contents of frame are not initialized, it is not a valid C string, so strlen(frame) could return any value, or crash. Its behavior is undefined in this case.
Because frame is an array of 20 characters, sizeof(frame) will return 20 * sizeof(char), which will always be 20 (sizeof(char) always equals 1).
strlen actually gives you the length of the string, whereas sizeof gives you the size of the allocated memory in bytes. It is in fact quite nicely explained here http://www.cplusplus.com/reference/cstring/strlen/ Extract given below.
The length of a C string is determined by the terminating null-character: A C string is as long as the number of characters between the beginning of the string and the terminating null character (without including the terminating null character itself).
This should not be confused with the size of the array that holds the string. For example:
char mystr[100]="test string";
defines an array of characters with a size of 100 chars, but the C string with which mystr has been initialized has a length of only 11 characters. Therefore, while sizeof(mystr) evaluates to 100, strlen(mystr) returns 11.
And yes, as per the other comments, you are trying to get the length of an uninitialized string, and that leads to undefined behaviour: it can be 3 or anything else, depending on whatever garbage is present in the memory that was allocated for your string.

Resources