Weird behaviour while copying address in string - C

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int main()
{
int *p = (int *)malloc(4 * sizeof(int));
char str1[20] ;
char str2[20] ;
sprintf(str1,"%20.20p",p);
sprintf(str2,"%20.20p",p);
printf("%zu\t%20.20s\n",strlen(str1),str1);
printf("%zu\t%20.20s\n",strlen(str2),str2);
if(strcmp(str1,str2) == 0)
printf("SAME\n");
else
printf("DIFFERENT\n");
free(p);
return 0;
}
OUTPUT:
42 0x000000000000083bc0
22 0x000000000000083bc0
DIFFERENT
The string lengths differed on every compiler I tried, even though the pointer was the same every single time. I'm not sure why. Because the lengths differ, the strings don't match either.

It looks like you are printing a 22 character string (23 counting the trailing \0) into a pair of 20 character buffers. This means that the program is free to overwrite the end of your strings at any point, since that memory is not reserved. You will see even stranger results if you replace %20.20s with plain %s in your print statements. The fix is to declare str1[23]; str2[23]; (don't forget the trailing \0), and don't restrict the print output: use plain %s.
You got off lucky in that your program prints an output without crashing. Not allocating enough memory can cause two problems:
You overwrite something important without realizing it.
Something else overwrites your data. This can cause things like your strings suddenly having a length of thousands of characters if the terminator is overwritten.

That's because sprintf(str1,"%20.20p",p); writes more than 20 characters into str1, so your code has a buffer overflow: the first sprintf writes to str1 and into part of str2, and the second sprintf then overwrites that part of str2.
It actually prints 0x0000000000000173d010, which takes 23 bytes, including the zero character at the end.

Related

string gets filled with garbage

I have a string and a scanf that reads input until it finds a *, which is the character I picked to mark the end of the text. After the *, all the remaining cells are filled with random characters.
I thought that when a string does not fill its array completely, all the remaining cells after the \0 character are filled with \0. Why is this not the case, and how can I make all the remaining cells after the last input letter hold the same value?
char string1 [100];
scanf("%[^*]s", string1);
for (int i = 0; i < 100; ++i) {
printf("\n %d=%d",i,string1[i]);
}
If I try to input something like hello*, here's the output:
0=104
1=101
2=108
3=108
4=111
5=0
6=0
7=0
8=92
9=0
10=68
You have an uninitialized array:
char string1 [100];
that has indeterminate values. You could initialize the array like
char string1 [100] = { 0 };
or
char string1 [100] = "";
In this call
scanf("%[^*]s", string1);
you need to remove the trailing character s, because %[] and %s are distinct format specifiers. There is no %[]s format specifier. It should look like this:
scanf("%[^*]", string1);
The array contains a string terminated by the zero character '\0'.
So to output the string you should write for example
for ( int i = 0; string1[i] != '\0'; ++i) {
printf( "%c", string1[i] ); // or putchar( string1[i] );
}
putchar( '\n' );
or like
for ( int i = 0; string1[i] != '\0'; ++i) {
printf("\n %d=%c",i,string1[i]);
}
putchar( '\n' );
or just
puts( string1 );
As for your statement
printf("\n %d=%d",i,string1[i]);
then it outputs each character (including non-initialized characters) as an integer, because it uses the conversion specifier d instead of c. That is, the function outputs the internal numeric codes (ASCII, for example) of the characters.
I know that a string after the \0 character if not filled completly
until the last cell will fill all the remaining empty ones with \0
No, that's not true.
It couldn't be true: a string carries no length. Neither the compiler nor any function can know the size of the string. Only you do. So, no, strings don't autofill with '\0'.
Keep in mind that there is no string type in C, just pointers to chars (sometimes those pointers are constant pointers to an array, but still, they are just pointers). We know where they start, but there is no way (other than deciding it and being consistent while coding) to know where they stop.
Sure, most of the time there is an obvious answer that makes the size of the allocated memory clear to any reader of the code.
For example, when you code
char string1[20];
sprintf(string1, "hello");
it is quite obvious to a reader of that code that the allocated memory is 20 bytes. So you may think that the compiler should know, when you sprintf into it or sscanf into it, that it should fill the unused part of the 20 bytes with 0. But, first of all, the compiler is no longer there when you call sscanf or sprintf. Those calls happen at run time, and the compiler works at compile time. At run time, there is no trace of that 20.
Plus, it can be more complicated than that
void fillString(char *p){
sprintf(p, "hello");
}
int main(){
char string1[20];
string1[0]='O';
string1[1]='t';
fillString(&(string1[2]));
}
How, in this case, is sprintf supposed to know that it must fill the remaining 18 bytes with the string followed by '\0'?
And that is for normal usage. I haven't started yet with convoluted but legal usages. Such as using char buffer[1000]; as an array of 50 length-20 strings (buffer, buffer+20, buffer+40, ...) or things like
union {
char str[40];
struct {
char substr1[20];
char substr2[20];
} s;
}
So, no, strings are not filled up with '\0'. It is not the habit in C to have implicit things happening under the hood. And it could not be the case, even if we wanted it to.
Your "star-terminated string" behaves exactly as a "null-terminated string" does. Sometimes the rest of the allocated memory is full of 0, sometimes it is not. scanf won't touch anything other than what is strictly needed. The rest of the allocated memory remains untouched. If that memory happened to be full of '\0' before the call to scanf, then it remains so. Otherwise not. Which leads me to my last remark: you seem to believe that it is scanf that fills the memory with non-null chars. It is not. Those chars were already there before. If you had the feeling that some other methods fill the rest of memory with '\0', that was just an impression. It is a natural one, since most of the time newly allocated memory is 0. Not because a rule says so, but because that is the most frequent byte to be found in a random area of memory. That is why uninitialized-variable bugs are so painful: they show up only from time to time, because very often uninitialized variables are 0 just by chance, but they are still bugs.
The easiest way to create a zeroed array is to use calloc.
Try replacing
char string1 [100];
with
char *string1=calloc(1,100);

What is the point of assigning the size of a string?

For instance, if I store ABCDE using the scanf function, the later printf call gives me ABCDE as output. So what is the point of specifying the size of the string (here 4)?
#include <stdio.h>
int main() {
int c[4];
printf("Enter your name:");
scanf("%s",c);
printf("Your Name is:%s",c);
return 0;
}
I'll start with this: don't use an int array to store strings!
int c[4] allocates an array of 4 integers. An int is typically 4 bytes, so usually this would be 16 bytes (but might be 8 or 32 or something else on some platforms).
Then, you use this allocation first to read characters with scanf. If you enter ABCDE, it uses up 6 characters (there is an extra 0 byte at the end of the string marking the end, which needs space too), which happens to fit into the memory reserved for array of 4 integers. Now you could be really unlucky and have a platform where int has a so called "trap representation", which would cause your program to crash. But, if you are not writing the code for some very exotic device, there won't be. Now it just so happens, that this code is going to work, for the same reason memcpy is going to work: char type is special in C, and allows copying bytes to and from different types.
Same special treatment happens, when you print the int[4] array with printf using %s format. It works, because char is special.
This also demonstrates how very unsafe scanf and printf are. They happily accept c you give them, and assume it is a char array with valid size and data.
But, don't do this. If you want to store a string, use char array. Correct code for this would be:
#include <stdio.h>
int main() {
char c[16]; // fits 15 characters plus terminating 0
printf("Enter your name:");
int items = scanf("%15s",c); // note: added maximum characters
// scanf returns number of items read successfully, *always* check that!
if (items != 1) {
return 1; // exit with error, maybe add printing error message
}
printf("Your Name is: %s\n",c); // note added newline, just as an example
return 0;
}
The size of an array must be defined while declaring a C String variable because it is used to calculate how many characters are going to be stored inside the string variable and thus how much memory will be reserved for your string. If you exceed that amount the result is undefined behavior.
You have used int c, not char c. In C, a char is always 1 byte, while an int is typically 4 bytes, so int c[4] usually reserves 16 bytes. That's why you didn't face any issues.
(Simplifying a fair amount)
When you initialize that array of length 4, C goes and finds a free spot in memory that has enough consecutive space to store 4 integers. But if you try to set c[4] to something, C will write that thing in the memory just after your array. Who knows what’s there? That might not be free, so you might be overwriting something important (generally bad). Also, if you do some stuff, and then come back, something else might’ve used that memory slot (properly) and overwritten your data, replacing it with bizarre, unrelated, and useless (to you) data.
In C, the last character of a string is '\0'.
The program below prints each character of the string, so you can see that last character.
scanf("%s", c); adds the last character, '\0', for you.
So if you use another function such as getc or getch, you have to add the last character yourself.
#include<stdio.h>
#include<string.h>
int main(){
char c[4+1]; // You should add +1 for the '\0' character.
char *p;
int len;
printf("Enter your name:");
scanf("%s", c);
len = strlen(c);
printf("Your Name is:%s (%d)\n", c, len);
p = c;
do {
printf("%x\n", *(p++));
} while((len--)+1);
return 0;
}
Enter your name:1234
Your Name is:1234 (4)
31
32
33
34
0 --> last character added by scanf("%s");
ffffffae --> garbage

Unexplainable behaviour when printing out strings in C

The following code works as expected and outputs ABC:
#include <stdio.h>
void printString (char toPrint [100]);
int main()
{
char hello [100];
hello[0] = 'A';
hello[1] = 'B';
hello[2] = 'C';
hello[3] = '\0';
printString(hello);
}
void printString (char toPrint [100])
{
int i = 0;
while (toPrint[i] != '\0')
{
printf("%c", toPrint[i]);
++i;
}
}
But if I remove the line that adds the null character
hello[3] = '\0';
I get random output like wBCÇL, ╗BCÄL, ┬BCNL etc.
Why is that so? What I expected is the loop in printString() to run forever because it doesn't run into a '\0', but what happened to 'A', 'B' and 'C'? Why do B and C still show up in the output but A is replaced by some random character?
Your declaration of hello leaves it uninitialized and filled with random bytes:
int main()
{
char hello [100];
...
}
If you want zero initialized array use
int main()
{
char hello [100] = {0};
...
}
There must have been, by pure chance, the value for \r somewhere in the memory cells following those of my array hello. That's why my character 'A' was overwritten.
On other machines, "ABC" was output as expected, followed by random characters.
Initializing the array with 0s, purposely omitted here, of course solves the problem.
edit:
I let the code print out each character in binary and toPrint[5] was indeed 00001101 which is ASCII for \r (carriage return).
When you declare an automatic variable like char hello [100];, the first thing to understand is that the 100 bytes can contain just about anything. You must assign values to each byte explicitly to have something meaningful.
You terminate your loop when you find the \0, a.k.a. the NUL character. Now, if you comment out the instruction that puts the \0 after the character 'C', your loop runs until it actually finds a \0 somewhere.
Your array might contain \0 at some point or it might not. There is a chance you go beyond the 100 bytes still looking for a \0 and invoke undefined behaviour. You also invoke UB when you try to work with an unassigned piece of memory.

Program only works if dummy char array is declared [C]

The following code will print to the file correctly if char finalstr[2048]; is declared, however if I remove it (since it's not used anywhere) the program prints garbage ascii instead. This makes me believe it's something related to memory, however I have no clue.
#include <stdio.h>
#include <stdlib.h>
int main()
{
FILE *fp;
FILE *fp2;
char str[2048];
char finalstr[2048];
fp = fopen("f_in.txt", "r");
fp2 = fopen("f_out.txt", "w");
while(fgets(str,2047,fp))//read line by line until end of file
{
int i;
for(i=0;i<=strlen(str);i++)//go trough the string cell by cell
{
if(str[i]>47 && str[i]<58 && str[i+1]>47 && str[i+1]<58)//from 0 to 9
{
char temp[2];//to hold temporary two digit string number
temp[0]=str[i];
i++;
temp[1]=str[i];
if(atoi(temp)<27)//if it's an upper case letter
fprintf(fp2,"%c",atoi(temp)+64);
else//if it's lowercase, skip the special characters between Z and a
fprintf(fp2,"%c",atoi(temp)+70);
}
else fprintf(fp2,"%c",str[i]);
}
}
fclose(fp);
fclose(fp2);
}
Input
20343545 3545 27 494140303144324738 343150 404739283144: ffabcd. 094540' 46 3546?
01404146343144 283127474635324738 404739283144 09 453131 3545 abcdefYXWVUTSRQP
2044474546 3931. 09 37404149 27 384146!
Output if finalstr[] is declared
This is a wonderful hex number: ffabcd. Isn' t it?
Another beautiful number I see is abcdefYXWVUTSRQP
Trust me. I know a lot!
Output if finalstr[] is not declared
?99? 9? 9 ?9999?9?9 99? 9?999?: ffabcd. ??9' ? 9??
((((.(( (((((.((. ((.((( ( ((( .( abcdefYXWVUTSRQP
øòòøò øò. ø òòòò ø òòò!
I did notice that the first if() statement could cause an overflow, however replacing <= with < had no effect on the end result.
I really wonder what the explanation behind this is, and whether it's C specific or if it would have happened in C++ too.
The main problem is with the temporary string you're using. It's not long enough to store a null terminating character, so you have an unterminated string.
Make the array 3 bytes long and add the terminator:
char temp[3];//to hold temporary two digit string number
temp[0]=str[i];
i++;
temp[1]=str[i];
temp[2]=0;
Also, you're looking too far off of the end of the array in your for loop. Use < instead of <=:
for(i=0;i<strlen(str);i++)//go trough the string cell by cell
Finally, make sure you #include <string.h> so that you have a proper declaration for strlen.
atoi(temp) causes undefined behaviour. The atoi function expects a pointer to null-terminated string as argument, however you provided a pointer to two characters without a terminator.
The atoi function will read off the end of your array. Your dummy array influences this because it changes what junk is present after the temp array.
BTW you could use (str[i] - '0') * 10 + (str[i+1] - '0') instead of atoi.
To my understanding, the problem is that the program fills the array potentially up to its full capacity by
fgets(str,2047,fp)
which means that the condition
i <= strlen(str)
works as expected only if the location after str is terminated with a zero; this might be the case when declaring finalstr.

puts() displays the whole content of strcpy even if an overflow of dest occurs

After creating a char array of size 5, I use strcpy to fill it with a string longer than the array. Then I use puts() to display the contents of the array, and the whole string is displayed. That seems odd, because when I iterate through the array contents it doesn't seem that the whole string is stored in memory (yet it is displayed). This is the code I am testing:
#include <stdio.h>
#include <string.h>
int main(){
char str1[5];
int i = 0;
strcpy(str1,"Hello world");
puts(str1);
printf("Size of str1: %d\n",sizeof(str1));
for(i = 0;i < 15; i++){
printf("%c",str1[i]);
}
puts(""); // Blank space
puts(str1); // Display contents again... Different result!
return 0;
}
Output:
Hello world
Size of str1: 5
Hello ld [
Hello
The 3rd line in the output is the actual contents in memory (I iterated further to verify).
I wouldn't expect the first puts(str1) to display the whole phrase but it does, also after displaying the contents I repeat puts(str1) and the output changes which seems random to me, also the array size is only 5.
Could you help me figure out what is going on?
strcpy doesn't know about the length of arrays/strings. It just keeps going until the string is copied (till a null character is hit).
This writes into memory you haven't allocated and is not guaranteed to return consistent results.
As the other answer mentions, strcpy does not know how many characters the destination can hold. Use the strncpy() function instead, and then terminate the string with str1[4]='\0';, since 4 is the index of the 5th character, the last one that fits in the array. Otherwise the program may crash unpredictably.
Try this:
char str1[6];
strncpy(str1,"Hello world",5);
str1[5] = 0;
This works by using strncpy. You have to tell strncpy how many characters to actually copy. Also, you must mark the end of the string with a null (0). That is what the last line does. Note that str1[6] must have enough storage for your string plus the terminating null character.
