How sprintf works in C

Linux Kernel = 2.6.32-41-generic #94-Ubuntu
Language : C
Code Snippet:
#include <stdio.h>
int main ()
{
char buf[5];
int index = 0;
for (index = 0; index < 5; index++ )
{
sprintf(buf,"sud_%d", index);
printf("for index = %d\n",index);
printf("buf = %s\n",buf);
}
return 0;
}
Question 1: Why does the above code snippet go into an infinite loop when executed?
Question 2: Does sprintf require the last byte of the target buffer to be filled with 0 or '\0'?
If I make the buffer size 6 (buf[6]) in the above code, it works fine.
Can anyone please let me know the reason for this behavior?
Regards,
Sudhansu

Because undefined behavior.
The output buffer buf is only 5 characters big, but the first call to sprintf() will generate the string "sud_0", which requires 6 characters due to the terminator. It then writes outside of buf, triggering undefined behavior. Use snprintf().
snprintf() doesn't "require" the last character to be filled with anything before you call it, but it will make sure it's set to '\0' after the call has completed. This is because it aims to build a complete and valid C string in the given buffer, and thus it must make sure the string is properly terminated.
It's hard (and some would say pointless) to reason about undefined behavior, but I suspect that what happens is that the 6th character written into buf (the '\0' terminator) overflows into index, setting its first byte to 0. If you're on a little-endian system, this is the same as doing index &= ~255. Since the value of index only ever lies between 0 and 5, it is reset to 0, which causes the loop to go on forever.
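As a minimal sketch of the snprintf() suggestion above: the helper name format_label is made up for illustration, and the format string is the one from the question. It never writes more than size bytes and reports whether the whole string fit.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: formats "sud_<index>" into buf without ever
 * writing more than size bytes (terminator included).
 * Returns 1 if the whole string fit, 0 if it was truncated or
 * snprintf reported an error. */
int format_label(char *buf, size_t size, int index)
{
    /* snprintf returns the length the complete string would have had,
     * not counting the terminator; a negative value signals an error. */
    int needed = snprintf(buf, size, "sud_%d", index);

    return needed >= 0 && (size_t)needed < size;
}
```

With a 5-byte buffer the call reports truncation and leaves "sud_" in the buffer; with 6 bytes "sud_0" fits, terminator included.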

You are writing outside the declared buffer. That is undefined behavior.
Your char buf[5]; is too small. You require at least 6 chars for "sud_0" because of the '\0' terminator.

Related

A null character '\0' at the end of a string

I have the following code.
#include <stdio.h>
#include <string.h>
#define MAXLINE 1000
int main()
{
int i;
char s[MAXLINE];
for (i = 0; i < 20; ++i) {
s[i] = i + 'A';
printf("%c", s[i]);
}
printf("\nstrlen = %d\n", strlen(s)); // strlen(s): 20
return 0;
}
Should I write
s[i] = '\0';
explicitly after the loop finishes to mark the end of the string, or is that done automatically? Without s[i] = '\0';, strlen(s) still returns the correct value 20.
Yes, you need to add a null terminator yourself. One is not added automatically.
You can verify this by explicitly initializing s to something that doesn't contain a NUL at byte 20.
char s[MAXLINE] = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
If you do that strlen(s) won't return 20.
Yes, you should add the null terminator after the loop. Alternatively, you could initialize the entire array with 0. That way, you don't have to add a 0 after the loop because there is one already:
...
char s[MAXLINE] = {0};
...
You MUST add a NUL terminator to mark the end of a C string.
Adding a NUL terminator character isn't automatic (unless documentation states that a function call writes the NUL terminator character for you).
In your case, use:
s[20] = 0;
As mentioned in the comments, C strings are defined by the terminator NUL character. The NUL character is required also by all the strXXX C functions.
If you don't mark the end of the string with a NUL, you have a (binary) sequence of characters, but not a C string. These are sometimes referred to as binary strings, and they cannot be passed to the strXXX library functions.
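A small sketch of the point above, assuming a made-up helper name fill_letters: the loop writes the letters and the terminator is stored explicitly, so strlen() no longer depends on whatever happened to be in the array beforehand.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: stores up to count letters starting at 'A'
 * in s (which holds size bytes) and terminates the result.
 * Returns the resulting string length. */
size_t fill_letters(char *s, size_t size, size_t count)
{
    size_t i;

    if (size == 0)
        return 0;              /* no room even for the terminator */
    if (count > size - 1)
        count = size - 1;      /* leave one byte for '\0' */

    for (i = 0; i < count; ++i)
        s[i] = (char)('A' + i);

    s[i] = '\0';               /* the step that is NOT automatic */
    return strlen(s);
}
```

Calling it on an array that was previously filled with non-zero bytes still gives a length of 20 for count 20, because the terminator is written explicitly.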
Why do you get Correct Results
It is likely that you get correct results mostly by chance.
The most probable explanation for the correct results is that the OS you are using provides you with a "clean" memory stack (the initial stack memory is all zero)... this isn't always the case.
Since you never wrote on the stack memory prior to executing your code, the following byte is whatever was there before (on your OS, that byte was set to zero when the stack was first initialized).
However, this will not be true if the OS does not provide you with a "clean" stack or if your code runs on a previously used stack.

Is this C user input code vulnerable?

I have this code that reads input from the user:
unsigned int readInput(char * buffer, unsigned int len){
size_t stringlen = 0;
char c;
while((c = getchar()) != '\n' && c != EOF){
if(stringlen < (len-1)){
buffer[stringlen] = c;
stringlen++;
}
}
buffer[stringlen+1] = '\x00';
return stringlen;
}
The buffer that char *buffer points to already has size len and has been memset to all zeros. Is this code vulnerable to any attacks?
Depending on the platform, unsigned int might be too small to hold the number 13194139533312. You should always use size_t for buffer sizes in C; not doing so might be a vulnerability, yes.
Also, of course getchar() doesn't return char, so that's broken too.
I'd say "yes", that code is vulnerable.
Your code potentially writes out of bounds and leaves an array element uninitialized:
char buf[10]; // uninitialized!
readInput(buf, 10); // feed 12 TB of data
This has undefined behaviour because you write to buf[10].
readInput(buf, 10); // feed 8 bytes of data
strlen(buf);
This has undefined behaviour because you read the uninitialized value buf[8].
The error lies in the way you assign the null terminator, which uses the wrong index. It should say:
buffer[stringlen] = '\0';
// ^^^^^^^^^
Because you compute len - 1, your code should also have a precondition that len must be strictly positive. This is sensible, because you promise to produce a null-terminated string.
Assuming the buffer is allocated len bytes, the most glaring problem is:
buffer[stringlen+1] = '\x00';
This is because the loop can exit with stringlen equal to len-1, and therefore you are writing to buffer[len]. However, you should only be writing to indices up to len-1.
So let's fix this as follows:
buffer[stringlen] = '\x00';
This is what you really want because you have not written to buffer[stringlen] yet.
A subtler error is that if len is 0 (which you would probably say should never happen), then len-1 wraps around to UINT_MAX (len is unsigned), and hence (stringlen < (len-1)) is true for any realistic input. Thus, the code will always overflow a 0-length buffer.
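Putting the fixes from the answers above together, a corrected version might look like the following sketch. The name read_line and the FILE * parameter are assumptions made here so that the function can be tried on any stream; the original reads from stdin via getchar().

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical rewrite of readInput: reads one line from in,
 * stores at most len - 1 characters in buffer, and always
 * terminates the result. Returns the number of characters stored. */
size_t read_line(FILE *in, char *buffer, size_t len)
{
    size_t stored = 0;
    int c;                         /* int, so EOF can be told apart */

    if (len == 0)
        return 0;                  /* nothing can be stored; nothing is read */

    while ((c = getc(in)) != '\n' && c != EOF) {
        if (stored < len - 1)      /* len >= 1 here, so no wrap-around */
            buffer[stored++] = (char)c;
        /* excess characters on the line are read and discarded */
    }

    buffer[stored] = '\0';         /* stored <= len - 1, always in bounds */
    return stored;
}
```

Passing stdin as the first argument gives the behaviour of the original interface, with size_t used for the length throughout.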
Your question is whether or not the code is vulnerable to attacks. The answer is no, it is not vulnerable, certainly not by the common definitions of vulnerability.
This is an interface between some unknown input (possibly by an adversary) and a buffer. You have correctly included a mechanism that prevents a buffer overflow, so your code is safe. [We assume here that everything from getchar() down is not the subject of your question.]
Whether the code will work as intended is a different story (others already pointed out the hole before the terminating NUL), but that was not your question.

Why does the program print out an "#" when I enter nothing?

I wrote a program in order to reverse a string. However, when I enter nothing but press the "enter" key, it prints out an "#".
The code is as follows:
#include <stdio.h>
#include <string.h>
int main(void)
{
int i, j, temp;
char str[80];
scanf("%[^\n]s", str);
i = strlen(str);
//printf("%d\n", i);
//printf("%d\n", sizeof(str));
for (j=0; j<i/2; j++) {
temp=str[i-j-1];
str[i-j-1]=str[j];
str[j]=temp;
}
for(i = 0; str[i] != 0; i++)
putchar(str[i]);
}
I tried to use printf() function to see what happens when I press the "Enter" key.
However, after adding printf("%d\n", i);, the output became "3 #".
After adding printf("%d\n", sizeof(str));, the output became "0 80".
It seemed as if "sizeof" had automatically "fixed" the problem.
My roommate said that the problem may result from initialization. I tried to change the code char str[80] to char str[80] = {0}, and everything works well. But I still don't understand what "sizeof" does when it exists in the code. If it really results in the initialization, why will such thing happen when the program runs line by line?
When you declare a local array without initializing any part of it, its contents are indeterminate: the memory could contain anything. In fact, you're lucky it stopped just at the #.
By specifying char str[80] = {0} you are effectively saying:
char str[80] = {0, 0, 0, /* 77 more times */ };
Thereby initializing the string to all null values. This is because the compiler automatically pads an array with nulls if it is only partially initialized. (However, this is not the case when you allocate memory from the heap with malloc; just a warning.)
To understand why everything was happening, let's follow through your code.
When you set i to the value returned by strlen(str), strlen iterates over memory starting at the location pointed to by str. Since your memory is not initialized, it finds a # at location 0 and then 0 at location 1, so it correctly returns 1.
What happens with the loops when you don't enter anything? i is set to 1, j is set to 0, so the condition j<i/2 evaluates to 0<0 (1/2 is 0 in integer division), which is false, so it moves on to the second loop. The second loop only tests whether the current element of the array is null. Coincidentally, you were given a memory location where the first char is #. It prints it, and luckily the next value is null.
When you use the sizeof operator, you receive the size of the entire array that was allocated on the stack (this is important because you may run into this issue later if you start using pointers). If you used strlen, you would have received 1 instead.
Suggestions
Instead of calling i = strlen(str); unconditionally, I would suggest checking the return value of scanf("%[^\n]", str) first. scanf returns the number of input items it successfully assigned (not the number of chars read), so it returns 0 when the line is empty; in that case str is left untouched and must not be used. Also, try to use more descriptive variable names; it makes reading code so much easier.
Do a memset of str; then it will print nothing instead of garbage:
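A minimal sketch of checking the conversion result before touching the buffer. To make it easy to try without a terminal it uses sscanf on a string instead of scanf on stdin; the helper names reverse and reverse_first_line are made up for this example. The scanf family returns the number of items assigned, so anything other than 1 here means str was not filled in.

```c
#include <stdio.h>
#include <string.h>

/* Reverses the string s in place. */
void reverse(char *s)
{
    size_t len = strlen(s);

    for (size_t j = 0; j < len / 2; j++) {
        char tmp = s[len - j - 1];
        s[len - j - 1] = s[j];
        s[j] = tmp;
    }
}

/* Hypothetical helper: copies the first line of input (at most 79
 * characters) into str, which must hold at least 80 bytes, and
 * reverses it. Returns 1 on success and 0 if the line was empty,
 * in which case str is set to the empty string. */
int reverse_first_line(const char *input, char *str)
{
    /* %79[^\n] limits the conversion to the buffer size minus one. */
    if (sscanf(input, "%79[^\n]", str) != 1) {
        str[0] = '\0';   /* nothing was assigned; make str a valid string */
        return 0;
    }
    reverse(str);
    return 1;
}
```

The width in %79[^\n] also guards against input longer than the 80-byte buffer, which the original format string does not.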
char str[80];
memset(str,0,80);

C - How can I concatenate an array of strings into a buffer?

I am trying to concatenate a random number of lines from the song "Twinkle, Twinkle" into a buffer before sending it out, because I need to count the size of the buffer.
My code:
char temp_buffer[10000];
char lyrics_buffer[10000];
char *twinkle[20];
int arr_num;
int i;
twinkle[0] = "Twinkle, twinkle, little star,";
twinkle[1] = "How I wonder what you are!";
twinkle[2] = "Up above the world so high,";
twinkle[3] = "Like a diamond in the sky.";
twinkle[4] = "When the blazing sun is gone,";
twinkle[5] = "When he nothing shines upon,";
srand(time(NULL));
arr_num = rand() % 5;
for (i=0; i<arr_num; i++);
{
sprintf(temp_buffer, "%s\n", twinkle[i]);
strcat(lyrics_buffer, temp_buffer);
}
printf("%s%d\n", lyrics_buffer, arr_num);
My current code only prints 1 line even when I get a number greater than 0.
There are two problems: The first was found by BLUEPIXY and it's that your loop never does what you think it does. You would have found this out very easily if you just used a debugger to step through the code (please do that first in the future).
The second problem is that the contents of non-static local variables (like your lyrics_buffer) are indeterminate. Using such variables without initialization leads to undefined behavior. The reason this happens is that the strcat function looks for the end of the destination string, and it does that by looking for the terminating '\0' character. If the contents of the destination string are indeterminate, they will seem random, and the terminator may not be anywhere in the array.
To initialize the array you simply do e.g.
char lyrics_buffer[10000] = { 0 };
That will make the compiler initialize it all to zero, which is what '\0' is.
This initialization is not needed for temp_buffer because sprintf unconditionally starts to write at the first location, it doesn't examine the content in any way. It does, in other words, initialize the buffer.
After initializing the buffer with 0, pass sprintf the address just past what is already in the buffer on each call:
char temp_buffer[10000] = {0};
for (i=0; i<arr_num; i++) //removed semicolon from here
{
sprintf(temp_buffer + strlen(temp_buffer), "%s\n", twinkle[i]);
}
temp_buffer then contains the final output. Make sure the buffer is large enough.
You don't need strcat.
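The same append-at-the-end idea can be sketched with snprintf so that the buffer size is respected. The function name join_lines and its parameters are assumptions for this example, not part of the original code.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: appends each of the count strings in lines,
 * followed by '\n', to dst (which holds size bytes). Stops before the
 * first line that does not fit completely. Returns the number of
 * characters stored, not counting the terminator. */
size_t join_lines(char *dst, size_t size,
                  const char *const *lines, size_t count)
{
    size_t used = 0;

    if (size == 0)
        return 0;
    dst[0] = '\0';                 /* start from a valid empty string */

    for (size_t i = 0; i < count; i++) {   /* note: no ';' after the for */
        int n = snprintf(dst + used, size - used, "%s\n", lines[i]);

        if (n < 0 || (size_t)n >= size - used) {
            dst[used] = '\0';      /* drop the partially written line */
            break;
        }
        used += (size_t)n;         /* next write starts after this line */
    }
    return used;
}
```

The return value is the size of the assembled text, which is what the question wanted to count before sending the buffer.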

Find String Length without recursion in C

#include<stdio.h>
#include<conio.h>
void main()
{
int str1[25];
int i=0;
printf("Enter a string\n");
gets(str1);
while(str1[i]!='\0')
{
i++;
}
printf("String Length %d",i);
getch();
return 0;
}
I'm always getting a string length of 33. What is wrong with my code?
That is because you have declared your array as type int:
int str1[25];
^^^-----------Change it to `char`
You don't show an example of your input, but in general I would guess that you're suffering from buffer overflow due to the dangers of gets(). That function is deprecated (and was removed from the C standard in C11), meaning it should never be used in newly-written code.
Use fgets() instead:
if(fgets(str1, sizeof str1, stdin) != NULL)
{
/* your code here */
}
Also, of course your entire loop is just strlen() but you knew that, right?
EDIT: Gaah, completely missed the mis-declaration, of course your string should be char str1[25]; and not int.
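A sketch of the fgets() suggestion with the array declared as char. The function name next_line_length and the FILE * parameter are assumptions made for this example; passing stdin reproduces the original setup.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into a 25-byte char
 * array and stores its length (without the newline) in *length.
 * Returns 1 on success, 0 on end of file or read error. */
int next_line_length(FILE *in, size_t *length)
{
    char str1[25];                       /* char, not int */

    if (fgets(str1, sizeof str1, in) == NULL)
        return 0;

    /* fgets keeps the newline if it fit; cut the string there. */
    str1[strcspn(str1, "\n")] = '\0';

    *length = strlen(str1);              /* same result as the while loop */
    return 1;
}
```

Because fgets() is told the array size, a line longer than 24 characters is split across calls instead of overflowing str1.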
So, a lot of answers have already told you to use char str1[25]; instead of int str1[25] but nobody explained why. So here goes:
A char has a length of one byte (by definition in the C standard). But an int uses more bytes (how many depends on architecture and compiler; let's assume 4 here). So if you access index 2 of a char array, you get 1 byte at memory offset 2, but if you access index 2 of an int array, you get 4 bytes at memory offset 8.
When you call gets (which should be avoided since it's unbounded and thus might overflow your array), a string gets copied to the address of str1. That string really is an array of char. So imagine the string is 123 plus the terminating null character. The memory would look like:
Address: 0 1 2 3
Content: 0x31 0x32 0x33 0x00
When you read str1[0] you get 4 bytes at once, so str1[0] does not return 0x31, you'll get either 0x00333231 (little-endian) or 0x31323300 (big endian).
Accessing str1[1] is already beyond the string.
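The effect can be reproduced safely with memcpy. This is only an illustrative sketch: the helper name first_word is made up, and uint32_t is used to pin down the 4-byte assumption made above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterprets the first four bytes of a char
 * array as one 32-bit unsigned value, which is roughly what reading
 * str1[0] from an int array filled by gets() amounts to. */
uint32_t first_word(const char *bytes)
{
    uint32_t word;

    /* memcpy avoids the alignment and aliasing problems of a cast. */
    memcpy(&word, bytes, sizeof word);
    return word;
}
```

For the four bytes of "123" (including its terminator) the result is 0x00333231 on a little-endian machine and 0x31323300 on a big-endian one, so comparing the first element against '\0' does not find the end of the string.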
Now, why do you get a string length of 33? That's actually random and you're "lucky" that the program didn't crash instead. From the start address of str1, you fetch int values until you finally get four 0 bytes in a row. In your memory, there's some random garbage and by pure luck you encounter four 0 bytes after having read 33*4=132 bytes.
So here you can already see that bounds checks are very important: your array is supposed to contain 25 characters. But gets may already write beyond that (solution: use fgets instead). Then you scan without bounds and may thus also access memory well beyond your array and may finally run into non-existent memory regions (which would crash your program). Solution for that: do bounds checks, for example:
// "sizeof(str1)" only works correctly on real arrays here,
// not on "char *" or something!
int l;
for (l = 0; l < sizeof(str1); ++l) {
if (str1[l] == '\0') {
// End of string
break;
}
}
if (l == sizeof(str1)) {
// Did not find a null byte in array!
} else {
// l contains valid string length.
}
I would suggest certain changes to your code.
1) conio.h
This is a non-standard header (it comes from old DOS compilers), so avoid using it.
2) gets
gets is also not recommended by anyone. So avoid using it. Use fgets() instead
3) int str1[25]
If you want to store a string it should be
char str1[25]
The problem is in the string declaration int str1[25]. It must be char and not int:
char str1[25]
void main() //"void" should be "int"
{
int str1[25]; //"int" should be "char"
int i=0;
printf("Enter a string\n");
gets(str1);
while(str1[i]!='\0')
{
i++;
}
printf("String Length %d",i);
getch();
return 0;
}
