C language strlen function with new line character in the string

Generally strlen() function in C language returns unsigned int, but if the string has a newline character, what will be the output?
For example :
What will be the output of strlen("stack\n") in C language ?

strlen("stack\n") --> 6. Nothing special about '\n'.

"What will be the output of strlen("stack\n")?"
6. The newline character, like every character other than '\0' (NUL), is counted like any other character.
"Generally strlen() function in C language returns unsigned int."
That is not correct. strlen() returns a size_t value, which is a distinct type name from unsigned int. On some implementations size_t is an alias for unsigned int, but on many 64-bit platforms it is an alias for unsigned long or unsigned long long, so keeping the distinction is important.
Note: If the string is stored in a char array instead (rather than being a string literal, which must not be modified) and you want to remove the newline, you can use strcspn():
char a[7] = "stack\n"; // 7 elements, not 6. Required to store terminating NUL.
printf("%zu\n", strlen(a)); // This will print 6.
a[strcspn(a, "\n")] = 0; // Replace newline with NUL.
printf("%zu", strlen(a)); // This will print 5.

Related

Array character as parameters in c [duplicate]

When should I use single quotes and double quotes in C or C++ programming?
In C and in C++ single quotes identify a single character, while double quotes create a string literal. 'a' is a single a character literal, while "a" is a string literal containing an 'a' and a null terminator (that is a 2 char array).
In C++ the type of a character literal is char, but note that in C, the type of a character literal is int, that is sizeof 'a' is 4 in an architecture where ints are 32bit (and CHAR_BIT is 8), while sizeof(char) is 1 everywhere.
Some compilers also implement an extension, that allows multi-character constants. The C99 standard says:
6.4.4.4p10: "The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined."
This could look like this, for instance:
const uint32_t png_ihdr = 'IHDR';
The resulting constant (in GCC, which implements this) has the value you get by taking each character and shifting it up, so that 'I' ends up in the most significant bits of the 32-bit value. Obviously, you shouldn't rely on this if you are writing platform independent code.
Single quotes are characters (char), double quotes are null-terminated strings (char *).
char c = 'x';
char *s = "Hello World";
'x' is an integer, representing the numerical value of the
letter x in the machine’s character set
"x" is an array of characters, two characters long,
consisting of ‘x’ followed by ‘\0’
I was poking around with things like int cc = 'cc'; and it turns out to be basically a byte-wise copy into an integer (the exact value is implementation-defined). The way to look at it is that the two c's of 'cc' are copied to the lower 2 bytes of the integer cc. If you are looking for trivia, then
printf("%d %d", 'c', 'cc'); would give:
99 25443
that's because 25443 = 99 + 256*99
So 'cc' is a multi-character constant and not a string.
Cheers
Single quotes are for a single character. Double quotes are for a string (array of characters). You can use single quotes to build up a string one character at a time, if you like.
char myChar = 'A';
char myString[] = "Hello Mum";
char myOtherString[] = { 'H','e','l','l','o','\0' };
single quote is for character;
double quote is for string.
In C, single-quotes such as 'a' indicate character constants whereas "a" is an array of characters, always terminated with the \0 character
Double quotes are for string literals, e.g.:
char str[] = "Hello world";
Single quotes are for single character literals, e.g.:
char c = 'x';
EDIT As David stated in another answer, the type of a character literal is int.
A single quote is used for character, while double quotes are used for strings.
For example...
printf("%c \n",'a');
printf("%s","Hello World");
Output
a
Hello World
If you swap them, using single quotes for the string and double quotes for the character, this is what happens:
printf("%c \n","a");
printf("%s",'Hello World');
Output:
For the first line, you will get a garbage or unexpected value, or output like this:
�
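For the second statement, you will typically see nothing. 'Hello World' is a multi-character constant of type int, not a string, so passing it to %s is undefined behavior, and most compilers warn about it or reject it. One more thing: if you have more statements after this, they may also produce no output.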
Note: PHP language gives you the flexibility to use single and double-quotes easily.
Use single quote with single char as:
char ch = 'a';
here 'a' is a char constant and is equal to the ASCII value of char a.
Use double quote with strings as:
char str[] = "foo";
here "foo" is a string literal.
It's okay to use "a", but it's not okay to use 'foo'.
Single quotes are denoting a char, double denote a string.
In Java, it is also the same.
While I'm sure this doesn't answer what the original asker asked, in case you end up here looking for single quote in literal integers like I have...
C++14 added the ability to add single quotes (') in the middle of number literals to add some visual grouping to the numbers.
constexpr int oneBillion = 1'000'000'000;
constexpr int binary = 0b1010'0101;
constexpr int hex = 0x12'34'5678;
constexpr double pi = 3.1415926535'8979323846'2643383279'5028841971'6939937510;
In C and C++, single quotes denote a character ('a'), whereas double quotes denote a string ("Hello"). The difference is that a character can hold anything, but only one letter, digit, etc. A string can hold any number of them.
But also remember that there is a difference between '1' and 1.
If you type
cout<<'1'<<endl<<1;
The output would be the same, but not in this case:
cout<<int('1')<<endl<<int(1);
This time the first line would be 49, because converting a character to an int yields its character code, and the ASCII code for '1' is 49.
Similarly, if you do:
string s="Hi";
s+=49; // This will add '1' to the string
s+="1"; // This will also add "1" to the string
Different ways to declare a char or a string:
char char_simple = 'a'; // 1 byte: -128 to 127 or 0 to 255 (signedness is implementation-defined)
signed char char_signed = 'a'; // 1 byte: -128 to 127
unsigned char char_u = 'a'; // 1 byte: 0 to 255
// double quotes are for strings.
char string_simple[] = "myString"; // 9 bytes: 8 characters plus '\0'
char string_simple_2[] = {'m', 'S', 't', 'r', 'i', 'n', 'g'}; // 7 bytes, no '\0': a char array, not a string
char string_fixed_size[8] = "myString"; // exactly 8 bytes, no room for '\0': valid C, but not a string
char *string_pointer = "myString"; // pointer to a string literal (must not be modified)
char string_pointer_2 = *"myString"; // a single char: the first character, 'm'
printf("char = %zu\n", sizeof(char_simple));
printf("char_signed = %zu\n", sizeof(char_signed));
printf("char_u = %zu\n", sizeof(char_u));
printf("string_simple[] = %zu\n", sizeof(string_simple));
printf("string_simple_2[] = %zu\n", sizeof(string_simple_2));
printf("string_fixed_size[8] = %zu\n", sizeof(string_fixed_size));
printf("string_pointer = %zu\n", sizeof(string_pointer)); // size of the pointer, not of the string
printf("string_pointer_2 = %zu\n", sizeof(string_pointer_2));

'\0' and printf() in C

In an introductory C course, I learned that strings are stored with a null character \0 at the end. But what if I want to print a string, say printf("hello")? I found that it doesn't end with \0, judging by the following statement:
printf("%d", printf("hello"));
Output: 5
But this seems inconsistent. As far as I know, variables such as strings are stored in main memory, and I guess that something being printed is also stored in main memory, so why the difference?
The null byte marks the end of a string. It isn't counted in the length of the string and isn't printed when a string is printed with printf. Basically, the null byte tells functions that do string manipulation when to stop.
Where you will see a difference is if you create a char array initialized with a string. Using the sizeof operator will reflect the size of the array including the null byte. For example:
char str[] = "hello";
printf("len=%zu\n", strlen(str)); // prints 5
printf("size=%zu\n", sizeof(str)); // prints 6
printf returns the number of characters printed. '\0' is not printed; it just signals that there are no more chars in this string. It is not counted towards the string length either.
#include <stdio.h>
#include <string.h>
int main(void)
{
char string[] = "hello";
printf("sizeof(string) = %zu, strlen(string) = %zu\n", sizeof(string), strlen(string));
}
https://godbolt.org/z/wYn33e
sizeof(string) = 6, strlen(string) = 5
Your assumption is wrong. Your string indeed ends with a \0.
It consists of the 5 characters h, e, l, l, o and the 0 character.
What the "inner" printf() call returns is the number of characters that were printed, and that's 5.
In C all literal strings are really arrays of characters, which include the null-terminator.
However, the null terminator is not counted in the length of a string (literal or not), and it's not printed. Printing stops when the null terminator is found.
All answers are really good but I would like to add another example to complete all these
#include <stdio.h>
int main()
{
char a_char_array[12] = "Hello world";
printf("%s", a_char_array);
printf("\n");
a_char_array[4] = 0; //0 is ASCII for null terminator
printf("%s", a_char_array);
printf("\n");
return 0;
}
For those who don't want to try this on online gdb, the output is:
Hello world
Hell
https://linux.die.net/man/3/printf
Does this help to show what the null terminator does? It is not a boundary of the char array itself. It is the character that tells whatever is parsing the string: stop, parse (print) only up to here.
PS: And if you parse and print it as a char array
for(i=0; i<12; i++)
{
printf("%c", a_char_array[i]);
}
printf("\n");
you get:
Hell world
where the gap after the double l is the null terminator. When you walk the char array yourself, every byte is simply printed as a character, including the null byte (how it is displayed depends on the terminal). If you loop again and print the int value of each byte (printf("%d ", a_char_array[i])), you'll see the character codes, and the byte in that gap has the value 0.
In C, the function printf() returns the number of characters printed. \0 is the null terminator, which marks the end of a string in C; there is no built-in string type as there is in C++. Your array therefore needs to be at least one element larger than the number of chars you want to store.
Here is the ref: cpp ref printf()
But what if I wanted to print a string, say printf("hello") although
I've found that that it doesn't end with \0 by following statement
printf("%d", printf("hello"));
Output: 5
You are wrong. This statement does not confirm that the string literal "hello" does not end with the terminating zero character '\0'. This statement confirmed that the function printf outputs elements of a string until the terminating zero character is encountered.
When you use a string literal as in the statement above, the compiler creates a character array with static storage duration that contains the elements of the string literal.
So in fact this expression
printf("hello")
is processed by the compiler something like the following
static char string_literal_hello[] = { 'h', 'e', 'l', 'l', 'o', '\0' };
printf( string_literal_hello );
You can imagine the action of the function printf in this case in the following way
int printf( const char *string_literal )
{
int result = 0;
for ( ; *string_literal != '\0'; ++string_literal )
{
putchar( *string_literal );
++result;
}
return result;
}
To get the number of characters stored in the string literal "hello" you can run the following program
#include <stdio.h>
int main(void)
{
char literal[] = "hello";
printf( "The size of the literal \"%s\" is %zu\n", literal, sizeof( literal ) );
return 0;
}
The program output is
The size of the literal "hello" is 6
You have to get the concept clear first. It becomes clearer when you deal with arrays. The printf call you are using just counts the characters placed within the parentheses. A string stored in an array must end with \0.
A string is a vector of characters. It contains the sequence of characters that form the string, followed by the special string-ending character '\0'.
Example:
char str[10] = {'H', 'e', 'l', 'l', 'o', '\0'};
Example: the following character vector is not a string because it doesn't end with '\0'
char str[2] = {'h', 'e'};

Indexing in arrays when using fgets

I'm teaching myself programming. I read that a string stored in an array of characters can be indexed to extract the nth character.
However, I've been trying to solve this for hours: I realized trying to solve an exercise that I can only access the first character of the array myarray[0]; whereas the rest of index (1,2,3...) values will return nothing. However, if I use the puts function it does return the whole string. Curious thing: strlen is returning the length of my array +1.
example:
int main (void)
{
char myarray[1000]={0};
int i;
fgets(myarray,1000,stdin);
for(i=0;i<strlen(myarray);i++)
printf("myarray[%d]:%c\n",i,myarray[i]);
printf("\n");
printf("strlen:%d\n",strlen(myarray));
puts(myarray);
return 0;
}
input:
6536
output:
strlen:5
myarray[0]:6
myarray[1]:
myarray[2]:
myarray[3]:
myarray[4]:
6536
You are most probably getting this result because of undefined behavior in your program. You are using the wrong format specifier to print a size_t value (strlen returns size_t). Change the format specifier to %zu.
Also note that in for loop you need to declare i as size_t type.
Here is the fixed code: http://ideone.com/0sMadV
fgets writes a newline character \n into the buffer to represent newlines in the input stream. Thus strlen returns 5.
The actual output is :
6536
myarray[0]:6
myarray[1]:5
myarray[2]:3
myarray[3]:6
myarray[4]:
strlen:5
6536
As you can see, myarray[4] stores the newline (due to fgets). Also, it would be better to calculate strlen once by placing it above the loop, instead of in every iteration.
From here:
char *fgets(char *restrict s, int n, FILE *restrict stream);
The fgets() function shall read bytes from stream into the array
pointed to by s, until n-1 bytes are read, or a <newline> is read and
transferred to s, or an end-of-file condition is encountered. The
string is then terminated with a null byte.
A simple way is to loop only up to strlen(myarray)-1. Another way is to replace the newline with the null terminating character:
if (length > 0 && myarray[length - 1] == '\n')
myarray[length - 1] = '\0';
where length = strlen(myarray);
The strlen counts the '\n' at the end of the string. You can "fix" it with strtok:
strtok(myarray, "\n");

Having difficulty printing strings

When I run the program, the second printf() prints string2 with whatever was scanned into string1 attached to the end.
e.g. 123 was scanned into string1 then it prints: Is before "12ab123".
as opposed to 12ab.
Why not just "12ab"?
char string1[MAX_STR_LEN];
char string2[4]={'1','2','a','b'};
char five='5';
char abc[3]={'a','b','c'};
printf("Enter a string:");
scanf("%s", string1);
printf("Is before \"%s\":",string2);
A string is a null terminated char array in C.
Change
char string2[4]={'1','2','a','b'};
to
char string2[5]={'1','2','a','b', '\0'};
(which is the same as char string2[] = "12ab";)
You need to terminate your array with a null character, as in
char string2[5]={'1','2','a','b','\0'};
When you call scanf(), string1 happens to be stored in the adjacent memory, so printing string2 runs on into string1. printf keeps printing until it finds a \0, and reading past the end of string2 like this is undefined behavior.
In your code
char string2[4]={'1','2','a','b'};
string2 is not null-terminated. Using that array as an argument to %s format specifier invokes undefined behavior, as it runs past the allocated memory in search of the null-terminator.
You need to add the null-terminator yourself like
char string2[5]={'1','2','a','b','\0'};
to use string2 as a string.
Also, alternatively, you can write
char string2[ ]= "12ab";
to allow the compiler to decide the size, which considers the space for (and adds) the null-terminator.
Same goes for abc also.
That said, you're scanning into string1 and printing string2, which is certainly not wrong, but does not make much sense, either.
Expanding on the previous answers, the strings appear to be joined due to the order the variables are stored in the stack memory. This won't always work the same on every processor architecture or compiler (optimiser settings can change this behaviour too).
If the format specifier %s does not have a precision, then the function outputs characters until it encounters the zero character '\0'.
The character array string2 is defined in such a way that it does not have the terminating zero:
char string2[4]={'1','2','a','b'};
So the function outputs characters beyond the array until it meets a zero character.
You could use a precision to specify explicitly the maximum number of characters to output. For example
printf("Is before \"%4.4s\":",string2);
Or you could define the array that includes the terminating zero. For example
char string2[5] = { '1', '2', 'a', 'b', '\0' };
Take into account that in this case the size of the array, if it is specified, must be at least 5 (the size can be greater than 5, in which case the elements without initializers are zero-initialized)
or simply
char string2[] = { "12ab" };
or without braces
char string2[] = "12ab";

sizeof() showing different output

Here is a snippet of C99 code:
int main(void)
{
char c[] = "\0";
printf("%d %d\n", sizeof(c), strlen(c));
return 0;
}
The program is outputting 2 0. I do not understand why sizeof(c) implies 2 seeing as I defined c to be a string literal that is immediately NULL terminated. Can someone explain why this is the case? Can you also provide a (some) resource(s) where I can investigate this phenomenon further on my own time.
didn't understand why size of is showing 2.
A string literal has an implicit terminating null character, so c[] is actually \0\0, and the size is two. From section 6.4.5 String literals of the C99 standard (draft n1124), clause 5:
In translation phase 7, a byte or code of value zero is appended to each multibyte
character sequence that results from a string literal or literals
As for strlen(), it stops counting when it encounters the first null terminating character. The value returned is unrelated to the sizeof of the array containing the string. In the case of c[], zero is returned because the first character in the array is a null terminator.
In C, "" means: give me a string and null terminate it for me.
For example arr[] = "A" is completely equivalent to arr[] = {'A', '\0'};
Thus "\0" means: give me a string containing a null termination, then null terminate it for me.
arr[] = "\0" is equivalent to arr[] = {'\0', '\0'};
"\0" is not the same as "". String literals are nul-terminated, so the first is the same as the compound literal (char[]){ 0, 0 } whereas the second is just (char[]){ 0 }. strlen finds the first character to be zero, so it assumes the string ends there. That doesn't mean the data ends.
When you declare a string literal as :
char c[]="\0";
It already has a '\0' character at the end so the sizeof(c) gives 2 because your string literal is actually : \0\0.
strlen(c) still gives 0 because it stops at the first \0.
strlen measures to the first \0 and gives the count of characters before the \0, so the answer is zero
sizeof on a char x[] gives the amount of storage used in bytes, which is two: the explicit \0 in the literal plus the implicit \0 appended at the end of the string
Great question. Consider this ...
ubuntu#amrith:/tmp$ more x.c
#include <stdio.h>
#include <string.h>
int main() {
char c[16] = ""; /* initialized, so that strlen(c) is well defined */
printf("%zu %zu\n",sizeof(c),strlen(c));
return 0;
}
ubuntu#amrith:/tmp$ ./x
16 0
ubuntu#amrith:/tmp$
Consider also this:
ubuntu#amrith:/tmp$ more x.c
#include <stdio.h>
#include <string.h>
int main() {
int c[16];
printf("%zu\n",sizeof(c));
return 0;
}
ubuntu#amrith:/tmp$ ./x
64
ubuntu#amrith:/tmp$
When you declare a variable as an array (which is effectively what c[] is), sizeof(c) gives you the allocated size of the array.
The string "\0" is the literal string \NUL\NUL, which takes two bytes.
On the other hand, strlen() computes the string length, which is the offset into the string of the first termination character. That turns out to be zero, and hence you get 2 0.

Resources