Having difficulty printing strings - c

When I run the program, the second printf() prints string2 with whatever was scanned into string1 attached to the end.
For example, if 123 was scanned into string1, it prints: Is before "12ab123":
Why does it not print just "12ab"?
char string1[MAX_STR_LEN];
char string2[4]={'1','2','a','b'};
char five='5';
char abc[3]={'a','b','c'};
printf("Enter a string:");
scanf("%s", string1);
printf("Is before \"%s\":",string2);

A string is a null terminated char array in C.
Change
char string2[4]={'1','2','a','b'};
to
char string2[5]={'1','2','a','b', '\0'};
(which is the same as char string2[] = "12ab";)

You need to terminate your array with a null character ('\0'):
char string2[5]={'1','2','a','b','\0'};
Here string1 happens to be stored right after string2 in memory, so printf() runs on from string2 into string1. %s keeps printing until it reaches a '\0', and reading past the end of string2 is undefined behavior.

In your code
char string2[4]={'1','2','a','b'};
string2 is not null-terminated. Using that array as an argument to %s format specifier invokes undefined behavior, as it runs past the allocated memory in search of the null-terminator.
You need to add the null-terminator yourself like
char string2[5]={'1','2','a','b','\0'};
to use string2 as a string.
Alternatively, you can write
char string2[ ]= "12ab";
to allow the compiler to decide the size, which considers the space for (and adds) the null-terminator.
Same goes for abc also.
That said, you're scanning into string1 and printing string2, which is certainly not wrong, but does not make much sense, either.

Expanding on the previous answers, the strings appear to be joined due to the order the variables are stored in the stack memory. This won't always work the same on every processor architecture or compiler (optimiser settings can change this behaviour too).

If the %s conversion specifier has no precision, the function outputs characters until it encounters the zero character '\0'.
The character array string2 is defined in such a way that it does not contain the terminating zero:
char string2[4]={'1','2','a','b'};
So the function outputs characters beyond the array until it meets a zero character.
You could use a precision to specify explicitly how many characters to output at most. For example
printf("Is before \"%4.4s\":",string2);
Or you could define the array that includes the terminating zero. For example
char string2[5] = { '1', '2', 'a', 'b', '\0' };
Note that in this case the size of the array, if it is specified, must be at least 5 (it can be greater than 5; the elements without initializers are then zero-initialized)
or simply
char string2[] = { "12ab" };
or without braces
char string2[] = "12ab";

Related

string gets filled with garbage

I have a string and a scanf that reads from input until it finds a *, which is the character I picked to mark the end of the text. After the *, all the remaining cells are filled with random characters.
I understood that when a string does not fill the array completely, all the remaining cells after the \0 character get filled with \0. Why is this not the case, and how can I make all the remaining cells after the last input letter hold the same value?
char string1 [100];
scanf("%[^*]s", string1);
for (int i = 0; i < 100; ++i) {
printf("\n %d=%d",i,string1[i]);
}
If I try to input something like hello*, here's the output:
0=104
1=101
2=108
3=108
4=111
5=0
6=0
7=0
8=92
9=0
10=68
You have an uninitialized array:
char string1 [100];
that has indeterminate values. You could initialize the array like
char string1 [100] = { 0 };
or
char string1 [100] = "";
In this call
scanf("%[^*]s", string1);
you need to remove the trailing character s, because %[] and %s are distinct format specifiers. There is no %[]s format specifier. It should look like this:
scanf("%[^*]", string1);
After the call, the array contains a string terminated by the zero character '\0'.
So to output the string you could write, for example,
for ( int i = 0; string1[i] != '\0'; ++i) {
    printf( "%c", string1[i] ); // or putchar( string1[i] );
}
putchar( '\n' );
or
for ( int i = 0; string1[i] != '\0'; ++i) {
    printf("\n %d=%c",i,string1[i]);
}
putchar( '\n' );
or just
puts( string1 );
As for your statement
printf("\n %d=%d",i,string1[i]);
it outputs every element (including the uninitialized ones) as an integer, because it uses the conversion specifier d instead of c. That is, the function outputs the numeric character codes (ASCII here) rather than the characters.
I know that a string after the \0 character if not filled completly
until the last cell will fill all the remaining empty ones with \0
No, that's not true.
It couldn't be true: a string carries no length. Neither the compiler nor any library function knows the size of the buffer behind a string. Only you do. So, no, strings don't autofill with '\0'.
Keep in mind that there is no string type in C, just pointers to chars (sometimes what you have is an array, but as soon as it is passed to a function it is just a pointer). We know where they start, but there is no way (other than deciding it and being consistent while coding) to know where they stop.
Sure, most of the time there is an obvious answer that makes it clear to any reader of the code what the size of the allocated memory is.
For example, when you code
char string1[20];
sprintf(string1, "hello");
it is quite obvious for a reader of that code that the allocated memory is 20 bytes. So you may think that the compiler should know, when sprintf writes to it or scanf reads into it, that the unused part of the 20 bytes should be filled with 0. But, first of all, the compiler is no longer there when scanf or sprintf runs. That happens at run time, and the compiler works at compile time. At run time, there is no trace of that 20.
Plus, it can be more complicated than that
void fillString(char *p){
sprintf(p, "hello");
}
int main(){
char string1[20];
string1[0]='O';
string1[1]='t';
fillString(&(string1[2]));
}
In this case, how is sprintf supposed to know that it has 18 bytes to fill with the string and then '\0'?
And that is for normal usage. I haven't started yet with convoluted but legal usages. Such as using char buffer[1000]; as an array of 50 length-20 strings (buffer, buffer+20, buffer+40, ...) or things like
union {
char str[40];
struct {
char substr1[20];
char substr2[20];
} s;
};
So, no, strings are not filled up with '\0'. It is not the habit in C to have implicit things happening under the hood, and it could not be done even if we wanted to.
Your "star-terminated string" behaves exactly as a "null-terminated string" does. Sometimes the rest of the allocated memory is full of 0, sometimes it is not. scanf won't touch anything beyond what is strictly needed; the rest of the allocated memory remains untouched. If that memory happened to be full of '\0' before the call to scanf, it remains so. Otherwise not.
Which leads me to my last remark: you seem to believe that it is scanf that fills the memory with non-null chars. It is not. Those chars were already there before. If you had the feeling that some other functions fill the rest of the memory with '\0', that was just an impression. It is a natural one, since uninitialized memory very often happens to contain 0: not because a rule says so, but because 0 is the most frequent byte found in a random area of memory. That is why uninitialized-variable bugs are so painful: they show up only from time to time, because uninitialized variables are very often 0 just by chance.
One way to get a zeroed buffer is to use calloc (declared in <stdlib.h>).
Try replacing
char string1 [100];
with
char *string1=calloc(1,100);

What are the trailing symbols printed after a char array initialized with a brace-enclosed list in clang?

#include <stdio.h>
int main(int argc, const char *argv[]) {
char name1[] = {'f','o','o'};
char name2[] = "foo";
printf("%s\n", name1);
printf("%s\n", name2);
return 0;
}
running the code above results in :
foox\363\277\357\376
foo
Program ended with exit code: 0
So, what's the difference between these 2 initializers?
name1 is an array of three characters {'f', 'o', 'o'}.
name2 is an array of four characters {'f', 'o', 'o', '\0'}.
printf("%s", ...) expects an array terminated with a null character. Because name1 isn't, printf reads characters past the end of the array, which is undefined behavior.
The first array (i.e., {'f','o','o'}) will not have the null character '\0', whereas the second (i.e., "foo") will.
The printf specification when using the %s says the following:
If no l modifier is present: The const char * argument is expected to
be a pointer to an array of character type (pointer to a string).
Characters from the array are written up to (but not including) a
terminating null byte ('\0'); if a precision is specified, no more
than the number specified are written. If a precision is given, no
null byte need be present; if the precision is not specified, or is
greater than the size of the array, the array must contain a
terminating null byte.
Since your printf did not include a precision, it writes characters from the array until it reaches the null byte ('\0'). Consequently, with char name1[] = {'f','o','o'}; printf reads characters outside the memory that was allocated for the name1 array. Such behaviour is undefined.
This is the reason why printf("%s\n", name1); prints foo plus some extra symbols from the next positions in memory that should not have been accessed, whereas with printf("%s\n", name2); it prints exactly the string "foo" as it is.
There are no trailing symbols in the array.
But printf’s %s format expects a string, and the array name1 isn’t a string: by definition, C strings are zero terminated … and your array isn’t. So the behaviour is undefined, and what seems to happen in your particular case is that printf continues printing random values that happen to be in memory just behind the contents of name1.
In C, if you initialize a string character by character, you need to add '\0', the null (terminating) character, to mark the end of the string.
So name1 should be {'f', 'o', 'o', '\0'}.
x\363\277\357\376 that you see at the end of your output is just garbage that gets printed because printf could not find a '\0' at the end of name1.
For name2 you used double quotes to initialize the string, which automatically puts a '\0' at the end of the string.

String Initialization in c

I am quite new to C programming, so feel free to correct me. My basic understanding of strings in C is that when we initialize a string, a null character is automatically placed at the end of the string, and that this null character cannot be read or written but is only used internally.
So when I declare a string as char str[3], assign a word to it, say "RED", and print it using the puts function or printf("%s",str), I get an unusual output printed as RED(smiley face).
I then reduce the size of the string to char str[2], assign RED to it, compile it, and again receive an output: RE(smiley face).
If someone can explain this to me I will be thankful. The C code is posted below.
int main()
{
char s1[3]="RED";
char s2[]="RED";
puts(s1);
puts(s2);
printf("%s",s1);
return 0;
}
char s1[3] = "RED";
Is a valid statement. It copies 3 characters from the constant string literal "RED" (which is 4 characters long) into the character array s1. There is no terminating '\0' in s1, because there is no room for it.
Note the copy, because s1 is mutable, while "RED" is not. This makes the statement different from e.g. const char *s1 = "RED";, where the string is not copied.
The behavior of both puts(s1) and printf("%s", s1) is undefined. There is no terminating '\0' in s1. Treating it as a string with one can lead to arbitrary behavior.
char s2[] = "RED";
Here, sizeof(s2) == 4, because "RED" has four characters: you need to count the trailing '\0' when calculating space.
The null character takes one extra character (byte). So you need one extra element in addition to the number of characters in the word you are initializing with.
char s1[4]="RED"; //3 for RED and 1 for the null character
On the other hand
char s2[3]="RED";
there is no space for the null character. "RED" is in there, but printing it as a string is a problem: printf cannot tell where the data ends, as there is no null character stored at the end.
char s2[]="RED";
This would work as memory of 4 (bytes) is automatically assigned which includes space for the terminating null character.

Why is the entirety of this first array being added onto the second, on top of the two values (from the first) that I assign it?

I want to assign the first two values from the hash array to the salt array.
char hash[] = {"HAodcdZseTJTc"};
char salt[] = {hash[0], hash[1]};
printf("%s", salt);
However, when I attempt this, the first two values are assigned and then all thirteen values are also assigned to the salt array. So my output here is not:
HA
but instead:
HAHAodcdZseTJTC
salt is not null-terminated. Try:
char salt[] = {hash[0], hash[1], '\0'};
You are adding just two characters to the salt array and you are not adding the '\0' terminator.
Passing a non-null-terminated array to printf() with a "%s" specifier causes undefined behavior. In your case it printed hash as well; in my case
HA#
was printed.
Strings in C use a special convention to know where they end: a non-printable special character '\0' is appended at the end of a sequence of non-'\0' bytes, and that's how a C string is built.
For example, if you were to compute the length of a string you would do something like
size_t stringlength(const char *string)
{
size_t length;
for (length = 0 ; string[length] != '\0' ; ++length);
return length;
}
there are of course better ways of doing it, but I just want to illustrate what the significance of the terminating '\0' is.
Now that you know this, you should notice that
char string[] = {'A', 'B', 'C'};
is an array of char but it's not a string. For it to be a string, it needs a terminating '\0', so
char string[] = {'A', 'B', 'C', '\0'};
would actually be a string.
Notice, then, that when you allocate space to store a string of n characters, you need to allocate n + 1 bytes to make room for the '\0'.
printf() consumes the bytes that the passed pointer points at until one of them is '\0'; there it stops iterating through the bytes.
That also explains the undefined behavior: printf() would clearly be reading out of bounds, and anything could happen, depending on what is actually stored at the memory addresses that do not belong to the passed data.
There are many functions in the standard library that expect strings, i.e. sequences of non-nul bytes followed by a nul byte.

Sizeof(char[]) in C

Consider this code:
char name[]="123";
char name1[]="1234";
And this result
The size of name (char[]):4
The size of name1 (char[]):5
Why is the size of char[] always one more than the number of characters?
Note the difference between sizeof and strlen. The first is an operator that gives the size of the whole data item. The second is a function that returns the length of the string, which will be less than its sizeof (unless you've managed to get string overflow), depending how much of its allocated space is actually used.
In your example
char name[]="123";
sizeof(name) is 4, because of the terminating '\0', and strlen(name) is 3.
But in this example:
char str[20] = "abc";
sizeof(str) is 20, and strlen(str) is 3.
As Michael pointed out in the comments the strings are terminated by a zero. So in memory the first string will look like this
"123\0"
where \0 is a single char with the value 0. So the above string has size 4.
Without this terminating character, how would one know where the string (or char[] for that matter) ends? One other way is to store the length somewhere. Some languages do that. C doesn't.
In C, strings are stored as arrays of chars. With a recognised terminating character ('\0' or just 0) you can pass a pointer to the string, with no need for any further meta-data. When processing a string, you read chars from the memory pointed at by the pointer until you hit the terminating value.
As your array initialisation is using a string literal:
char name[]="123";
is equivalent to:
char name[]={'1','2','3',0};
If you want your array to be of size 3 (without the terminating character, as you are not storing a string), you can use:
char name[]={'1','2','3'};
or
char name[3]="123";
(thanks alk)
which will do as you were expecting.
Because there is a null character that is attached to the end of string in C.
Like here in your case
name[0] = '1'
name[1] = '2'
name[2] = '3'
name[3] = '\0'
name1[0] = '1'
name1[1] = '2'
name1[2] = '3'
name1[3] = '4'
name1[4] = '\0'
A string in C (and probably, behind the scenes, in many other programming languages) is an array of characters terminated by \0, the character with the value 0.
When initializing with char arr[] = "1234";, you use a string literal, which is null-terminated by default (\0 is also called the null character).
To avoid the null (assuming you want just an array of chars and not a string), you can declare it as char arr[] = {'1', '2', '3', '4'}; and the program will behave as you wish (sizeof(arr) would be 4).
name = {'1','2','3','\0'};
name1 = {'1','2','3','4','\0'};
So
sizeof(name) = 4;
sizeof(name1) = 5;
sizeof returns the size of the object and in this case the object is an array and it is defined that your array is 4 bytes long in first case and 5 bytes in second case.
In C, string literals have a null terminating character added to them.
Your strings,
char name[]="123";
char name1[]="1234";
are stored as if you had written:
char name[]={'1','2','3','\0'};
char name1[]={'1','2','3','4','\0'};
Hence, the size is always plus one. Keep in mind when reading strings from files or from whatever source, the variable where you store your string, should always have extra space for the null terminating character.
For example if you are expected to read string, whose maximum size is 100, your buffer variable, should have size of 101.
Every string is terminated with the null byte '\0', which adds 1 to the size.
