what exactly happens when I do this assignment - c

Consider this snippet
char a[]="";
Will a NULL pointer be assigned to the character pointer *a?
If not, how do I check that no string has been assigned to a?

Will a NULL pointer be assigned to the character pointer *a?
There is no character pointer here, but an array a of char.
a will be defined as an array of char and initialised to hold an empty string, that is, a C string consisting only of the 0-terminator, which is one char.
How do I check that no string has been assigned to a?
So a will have exactly one element. This element compares equal to '\0', which in turn compares equal to 0.
To test this do
#include <stdio.h> /* for puts() */
#include <string.h> /* for strlen() */
int main(void)
{
char a[] = ""; /* The same as: char a[1] = ""; */
/* Possibility 1: */
if (0 == a[0]) /* alternatively use '\0' == a[0] */
{
puts("a is an empty string.");
}
/* Possibility 2: */
if (0 == strlen(a))
{
puts("a has length zero.");
}
}

a will contain 1 element:
a[0] == '\0'
Note: a is not a pointer.

First of all, a is not a character pointer, it is an array of char. There are some cases where the latter is converted to the former, but they are inherently not the same type.
That said, in this initialization, array a will be initialized with an empty string.
An empty string means the first element is the terminating null character, so the easiest way to check whether an array contains an empty string is to compare the first element with the null character, like
if (a[0] == '\0') { /*do the action*/ }

==> Will a NULL-pointer be assigned to the a character pointer *a?
Arrays are not pointers.
For better understanding, let's refer to an example from the C Standard, 6.7.9p32 [emphasis mine]
EXAMPLE 8 The declaration
char s[] = "abc", t[3] = "abc";
defines ''plain'' char array objects s and t whose elements are initialized with character string literals. This declaration is identical to
char s[] = { 'a', 'b', 'c', '\0' },
t[] = { 'a', 'b', 'c' };
The contents of the arrays are modifiable. On the other hand, the declaration
char *p = "abc";
defines p with type ''pointer to char'' and initializes it to point to an object with type ''array of char'' with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.
So, this statement
char a[]="";
defines char array object a whose elements are initialized with character string literal "". Note that "" is a string literal containing a single character '\0'.
The above statement is equivalent to
char a[] = {'\0'};
If we omit the dimension, the compiler computes it for us based on the size of the initializer (here it will be 1 because the initializer has only one character, '\0'). So the statement is the same as
char a[1] = {'\0'};
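To make the "one element, not a pointer" point concrete, here is a minimal sketch. The helper names is_empty_string and size_of_empty_array are made up for illustration and are not part of any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if buf holds an empty string, 0 otherwise. */
int is_empty_string(const char *buf)
{
    assert(buf != NULL);
    return buf[0] == '\0';
}

/* Returns the size in bytes of an array declared as char a[] = "". */
size_t size_of_empty_array(void)
{
    char a[] = "";   /* same as char a[1] = {'\0'}; */
    assert(is_empty_string(a));
    return sizeof a; /* sizeof sees the whole array, not a pointer */
}
```

Called from your own main, size_of_empty_array() should return 1, which is the single terminating '\0'.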

Related

In this solution, if *inputString is a memory address (and the input is a char type) then does *inputString qualify as an array?

I was solving a challenge on CodeSignal in C. Even though the correct libraries were included, I couldn't use the strrev function in the IDE, so I looked up a similar solution and modified it to work. This is good. However, I don't understand the distinction between a literal string and an array. Reading all this online has left me a bit confused. If C stores all strings as an array with each character terminated by \0 (null terminated), how can there be any such thing as a literal string? Also, if it is the case that strings are stored as an array, would *inputString store the address of the array, or is it itself an array of all the individual characters stored?
Thanks in advance for any clarification provided!
Here is the original challenge, C:
Given the string, check if it is a palindrome.
bool solution(char * inputString) {
// The input will be character array type, storing a single character each terminated by \0 at each index
// * inputString is a pointer that stores the memory address of inputString. The memory address points to the user inputted string
// bonus: inputString is an array object starting at index 0
// The solution function is set up as a Boolean type ("1" is TRUE and the default "0" is FALSE)
int begin = 0;
// The first element of the inputString array is at position 0, so is the 'counter'
int end = strlen(inputString) - 1;
// The last element is the length of the string minus 1 since the counter starts at 0 (not 1) by convention
while (end > begin) {
if (inputString[begin++] != inputString[end--]) {
return 0;
}
}
return 1;
}
A string is also an array of symbols. I think that what you don't understand is the difference between a char pointer and a string. Let me explain in an example:
Imagine I have the following:
char str[20]="helloword";
When used in an expression, str is converted to the address of the first symbol of the array. In this case that is the address of h. Now try to printf the following:
printf("%c",str[0]);
You can see that it has printed the element at that address, which is 'h'.
If I now declare a char pointer, it can point to whatever char address I want:
char *c_pointer = str+1;
Now print the element of c_pointer:
printf("%c",c_pointer[0]);
You can see that it will print 'e', as it is the element at the second address of the original string str.
In addition, what printf("%s", string) does is print every element/symbol/char from the starting address (string) up to the address whose element is '\0'.
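As a rough sketch of that last point, the loop below walks a string one char at a time until it meets '\0'. The name print_string is hypothetical, and the real printf does much more than this; it only illustrates the stopping rule.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for what printf("%s", s) does with its argument:
   prints every char from s up to (not including) the first '\0'.
   Returns the number of chars printed. */
size_t print_string(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    while (*s != '\0') {
        putchar(*s);
        ++s;     /* move to the next address */
        ++count;
    }
    return count;
}
```

With char str[20]="helloword";, calling print_string(str + 1) starts at 'e' and prints the remaining 8 characters.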
The linked question/answers in the comments pretty much cover this, but saying the same thing a slightly different way helps sometimes.
A string literal is a quoted string in the source code, and it is typically assigned to a char pointer. It is considered read only. That is, any attempt to modify it results in undefined behavior. I believe that most implementations put string literals in read-only memory. IMO, it's a shortcoming of C (fixed in C++) that a const char* type isn't required for assigning a string literal. Consider:
int main(void)
{
char* str = "hello";
}
str points to a string literal. If you try to modify it like:
#include <string.h>
...
str[2] = 'f'; // BAD, undefined behavior
strcpy(str, "foo"); // BAD, undefined behavior
you've broken the rules. String literals are read only. In fact, you should get in the habit of assigning them to const char* types so the compiler can warn you if you try to do something stupid:
const char* str = "hello"; // now you should get some compiler help if you
// ever try to write to str
The string "hello" resides somewhere in memory, and str points to it:
str
|
|
+-------------------> "hello"
If you assign a string to an array, things are different:
int main(void)
{
char str2[] = "hello";
}
str2 is not read only, you are free to modify it as you want. Just take care not to exceed the buffer size:
#include <string.h>
...
str2[2] = 'f'; // this is OK
strcpy(str2, "foo"); // this is OK
strcpy(str2, "longer than hello"); // this is _not_ OK, we've overflowed the buffer
In memory, str2 is an array
str2 = { 'h', 'e', 'l', 'l', 'o', '\0' }
and is present right there in automatic storage. It doesn't point to some string elsewhere in memory.
In most cases, str2 can be used as a char* because in C, in most contexts, an array will decay to a pointer to its first element. So, you can pass str2 to a function with a char* argument. One instance where this is not true is with sizeof:
sizeof(str) // this is the size of a pointer (typically 4 or 8 depending on your
// architecture). It _does not matter_ how long the string that
// str points to is
sizeof(str2) // this is 6, string length plus the NUL terminator.
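A small sketch of the sizeof difference, assuming nothing beyond the standard library. The function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* sizeof applied to the array itself: string length plus the terminator. */
size_t size_of_array(void)
{
    char str2[] = "hello";
    return sizeof str2;
}

/* sizeof applied to a pointer: the size of the pointer, whatever it points to. */
size_t size_of_pointer(void)
{
    const char *str = "hello";
    return sizeof str;
}

/* Inside a function the array has already decayed to a pointer,
   so the length has to be found by scanning for '\0'. */
size_t length_seen_by_callee(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}
```

size_of_array() gives 6, while size_of_pointer() gives sizeof(char *) on whatever platform you compile for.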

what is the relation between these char str[10], char *str and char *str[10] in C?

Can I consider *str[10] as a two-dimensional array?
If I declare char *str[10]={"ONE","TWO","THREE"}, how can we access a single character?
This record
char str[10];
is a declaration of an array with 10 elements of the type char. For example, you can initialize the array like
char str[10] = "ONE";
This initialization is equivalent to
char str[10] = { 'O', 'N', 'E', '\0' };
All elements of the array that are not explicitly initialized are zero-initialized.
And you may change elements of the array like
str[0] = 'o';
or
strcpy( str, "TWO" );
This record
char *str;
declares a pointer to an object of the type char. You can initialize it for example like
char *str = "ONE";
In this case the pointer will be initialized with the address of the first character of the string literal.
This record
char * str[10];
is a declaration of an array of 10 elements that has the pointer type char *.
You can initialize it as for example
char * str[10] = { "ONE", "TWO", "THREE" };
In this case the first three elements of the array will be initialized by addresses of first characters of the string literals specified explicitly. All other elements will be initialized as null pointers.
You may not change the string literals pointed to by elements of the array. Any attempt to change a string literal results in undefined behavior.
To access elements of the string literals using the array you can use, for example, two subscript operators. For example
for ( size_t i = 0; str[0][i] != '\0'; ++i )
{
putchar( str[0][i] );
}
putchar( '\n' );
If you want to change strings then you need to declare for example a two dimensional array like
char str[][10] = { "ONE", "TWO", "THREE" };
In this case you can change elements of the array that are in turn one-dimensional arrays as for example
str[0][0] = 'o';
or
strcpy( str[0], "FOUR" );
Yes: char* str[10]; would create an array of 10 pointers to chars.
To access a single character, we can access it like a 2 dimensional array; i.e.:
char* str[10]={"ONE","TWO","THREE"};
char first = str[0][0];
Can I consider *str[10] as a two-dimensional array?
It's unclear what you mean. *str[10] is not a valid type name, and the context is a bit lacking to determine how else to interpret it.
If you mean it as an expression referencing the subsequent definition then no, its type is char, but evaluating it produces undefined behavior.
If you are asking about the type of the object identified by str, referencing the subsequent definition, then again no. In this case it is a one-dimensional array of pointers to char.
If I declare char *str[10]={"ONE","TWO","THREE"}, how can we access a single character?
You can access one of the pointers by indexing str, among other ways. For example, str[1]. You can access one of the characters in the string into which that pointer points by using the indexing operator again, among other ways. For example, str[1][0]. That you are then using a double index does not make str a 2D array. The memory layout is quite different than if you declared, say, char str[3][10];.
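One way to see the layout difference is to compare sizes. This is only a sketch with invented function names; the pointer table is declared const char * here as a precaution, since the literals must not be modified.

```c
#include <assert.h>
#include <stddef.h>

/* An array of 10 pointers: its size depends on the pointer size,
   not on the length of the strings pointed to. */
size_t size_of_pointer_table(void)
{
    const char *str[10] = { "ONE", "TWO", "THREE" };
    assert(str[3] == NULL); /* elements without an initializer are null pointers */
    return sizeof str;
}

/* A true 2D array: 3 rows of 10 chars stored contiguously. */
size_t size_of_2d_array(void)
{
    char str[3][10] = { "ONE", "TWO", "THREE" };
    str[0][0] = 'o'; /* allowed: the characters live in the array itself */
    return sizeof str;
}

/* Double indexing works on the pointer table too. */
char first_char_of_second_string(void)
{
    const char *str[10] = { "ONE", "TWO", "THREE" };
    return str[1][0];
}
```

size_of_2d_array() is always 30, whereas size_of_pointer_table() is 10 * sizeof(char *), so the two objects clearly are not the same kind of thing even though both accept str[i][j].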

When does a while loop stop when it reads a string in C?

I'm trying to implement the strcpy function by myself. The original strcpy is part of the string.h library.
char *strcpy(char *dest, const char *src)
{
assert(dest != NULL && src != NULL);
char *temp = dest;
while (*src)
{
*dest = *src;
src++;
dest++;
}
return temp;
}
void strcpyTest()
{
char source[20] = "aaaaaa";
char dest1[20] = "bbbbbbbbb";
char desta[10]="abcd";
puts(dest1); // bbbbbbbbb
strcpy(dest1, source);
puts(dest1); // aaaaaa
strcpy(desta, source);
puts(desta); // aaaaaa
strcpy(desta, dest1);
puts(desta); // aaaaaa
strcpy(dest1, desta);
puts(dest1); // aaaaaa
strcpy(source, desta);
puts(source); // aaaaaa
}
As you can see, even the first call to the function, with a longer dest than the src, gives the right result although, by logic, it should give
aaaaaabbb and not aaaaaa:
char source[20] = "aaaaaa";
char dest1[20] = "bbbbbbbbb";
strcpy(dest1, source);
puts(dest1);
/** aaaaaa **/
Why does my function work? I would guess that I'll have to manually add the \0 char at the end of dest after the while (*src) loop exits.
I mean, the whole point of this while (*src) is to exit when it reaches the end of src, which is the last char in the string, which is \0.
Therefore, I would guess I'll have to add this character to dest myself, but the code somehow works and copies the string without the manual addition of \0.
So my question is why and how it still works?
When I create a new array of 10 elements, let's say char arr[10] or int arr[10], and I initialize only the first 2 indexes, what happens to the values in the rest of the indexes? Will they be filled with zeros, garbage values, or something else?
Maybe my code works because the array is filled with zeros and that's why the while loop stops?
For starters you should select another name instead of strcpy.
Let's consider all the calls of your function step by step.
The variable source is declared like
char source[20] = "aaaaaa";
This declaration is equivalent to the following declaration
char source[20] =
{
'a', 'a', 'a', 'a', 'a', 'a', '\0', '\0', '\0', '\0',
'\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0'
};
because according to the C Standard elements of the array that were not explicitly initialized are implicitly initialized by zeroes.
The variable desta is declared like
char desta[10]="abcd";
This declaration is equivalent to the following declaration
char desta[10]= { 'a', 'b', 'c', 'd', '\0', '\0', '\0', '\0', '\0', '\0' };
So the first call
strcpy(desta, source);
just overwrites the first six elements of desta ("abcd" and two zeros) with the six characters "aaaaaa". The resulting array desta will still contain a string because the zero characters at indices 6 through 9 are not overwritten.
After this call
strcpy(desta, dest1);
the array desta will contain the string "bbbbbbbbb" because the last zero character of the array desta is not overwritten by this call.
This call
strcpy(dest1, desta);
in fact does not change the array dest1.
In this call
strcpy(source, desta);
as not all the zero characters of the array source are overwritten, the array will contain a string.
You could get an unpredictable result if you first called
strcpy(desta, dest1);
and then
strcpy(desta, source);
because your function does not append a terminating zero to the destination array.
Here is a demonstrative program.
#include <stdio.h>
#include <assert.h>
char * my_strcpy(char *dest, const char *src)
{
assert(dest != NULL && src != NULL);
char *temp = dest;
while (*src)
{
*dest = *src;
src++;
dest++;
}
return temp;
}
int main(void)
{
char source[20] = "aaaaaa";
char dest1[20] = "bbbbbbbbb";
char desta[10]="abcd";
my_strcpy(desta, dest1);
my_strcpy(desta, source);
puts( desta );
return 0;
}
The program output is
aaaaaabbb
That is, desta contains the string "aaaaaabbb" instead of the string "aaaaaa".
The updated function could look as follows
char * my_strcpy(char *dest, const char *src)
{
assert(dest != NULL && src != NULL);
char *temp = dest;
while ( ( *dest++ = *src++ ) );
return temp;
}
You are correct that this function will not add a \0 to the end of the dest string. You will need to add a final \0 assignment to dest.
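A minimal sketch of that fix, keeping the original loop and adding the final assignment. The name my_strcpy is only a placeholder chosen to avoid clashing with the library strcpy.

```c
#include <assert.h>
#include <stddef.h>

/* Copies src into dest, including the terminator.
   The caller must make sure dest is large enough. */
char *my_strcpy(char *dest, const char *src)
{
    assert(dest != NULL && src != NULL);
    char *start = dest;
    while (*src != '\0') {
        *dest++ = *src++;
    }
    *dest = '\0'; /* the loop stops before copying the terminator, so add it here */
    return start;
}
```

With char d[10] = "bbbbbbbbb";, calling my_strcpy(d, "aaaaaa") now leaves d holding the string "aaaaaa": d[6] is '\0', even though the old b's are still sitting in d[7] and d[8].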
Why does it seem to work as-is?
It "works" because your initialization of dest just happens to place a \0 character at the right point in the string. These cases are honestly "unlucky" as they hide all sorts of problems. Another case where this can happen is if you are running a debug build where memory is automatically set to 0 and therefore the final set bug is hidden.
So my question is why and how it still works?
If you initialize only the first 2 values of an array, the rest are not garbage: they are set to 0. An automatic array holds garbage (indeterminate values) only when it has no initializer at all. If the array is "global" or "static", the compiler will set it to 0 for you even without an initializer.
Your arrays are padded with zeros, which are equivalent to the null-terminator '\0'. When you initialize an array, any elements not explicitly set will be set the same way as if it were a static variable, which is to say the elements not explicitly initialized will be implicitly initialized to 0. So in this case, your strings just happen to have a null-terminator after you finished your copy because when you initialized the array, all of the values not explicitly set by your initializer were set to 0.
If you copy a 4-character string into a buffer holding an 8-character string, you'll only see the first 4 characters changed in your destination string while leaving another 4 characters still there before you hit a null-terminator.
From the C11 Standard Working Draft
6.7.9 p21
If there are fewer initializers in a brace-enclosed list than there are elements or members
of an aggregate, or fewer characters in a string literal used to initialize an array of known
size than there are elements in the array, the remainder of the aggregate shall be
initialized implicitly the same as objects that have static storage duration.
So according to the above passage, because you initialized some elements of the array, the elements that you did not explicitly initialize will be treated the same as if it were a static-storage duration object.
So for the rules for static storage duration, we have 6.7.9 p10:
If an object that has automatic storage duration is not initialized explicitly, its value is
indeterminate. If an object that has static or thread storage duration is not initialized
explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules,
and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these
rules, and any padding is initialized to zero bits;
The above passage tells us that every member of the aggregate (an array in this case) will be initialized as per the element's rules, and in this case, they are of type char, which is considered an arithmetic type, which the rules state will be initialized to 0.
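The zero-fill rule quoted above can be checked with a short sketch. The helper count_zero_bytes and the other function name are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the zero bytes in buf[0..n). */
size_t count_zero_bytes(const char *buf, size_t n)
{
    assert(buf != NULL);
    size_t zeros = 0;
    for (size_t i = 0; i < n; ++i) {
        if (buf[i] == 0) {
            ++zeros;
        }
    }
    return zeros;
}

/* A local (automatic) array with only two explicit initializers:
   the remaining eight elements are implicitly set to 0. */
size_t zeros_after_partial_init(void)
{
    char arr[10] = { 'x', 'y' };
    return count_zero_bytes(arr, sizeof arr);
}
```

zeros_after_partial_init() returns 8, and the same helper applied to char source[20] = "aaaaaa"; finds 14 zero bytes.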

Why can you have an pointer to array of strings in C

why does
char *names [] = {"hello", "Jordan"};
work fine
but this does not
char names [] = {"hello", "Jordan"};
would appreciate if someone could explain this to me, thank you :).
Here
char *names [] = {"hello", "Jordan"};
names is an array of char pointers, i.e. each element holds a pointer to the first character of a string literal, which is itself a char array. But here
char names [] = {"hello", "Jordan"};
names is just a char array, i.e. it can hold only a single char array like "hello", not multiple.
In second case like
int main(void) {
char names[] = {"hello", "Jordan"};
return 0;
}
when you compile (I suggest compiling with the -Wall -pedantic -Wstrict-prototypes -Werror flags), the compiler clearly says
error: excess elements in char array initializer
which means you can't have more than one char array in this case. A correct version is
char names[] = {'h','e','l','l','o','\0'}; /* here names is array of characters */
Edit: there is one more possibility. If the declaration of names looks like the one below
char names[] = { "hello" "Jordan" }; /* it's a valid one */
then hello and Jordan are joined and become the single char array helloJordan.
char names[] = { "helloJordan" };
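A quick sketch to confirm the joining of adjacent string literals; the function name is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Adjacent string literals are concatenated by the compiler,
   so this initializer contains exactly one string. */
size_t size_of_joined_literals(void)
{
    char names[] = { "hello" "Jordan" };
    assert(strcmp(names, "helloJordan") == 0);
    return sizeof names; /* 11 characters plus the terminator */
}
```

size_of_joined_literals() returns 12: five chars for hello, six for Jordan, and one '\0'.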
The first is an array of pointers to char. The second is an array of char and would have to look like char names[] = {'a', 'b', 'c'}.
A string literal, such as "hello", is stored in static memory as an array of chars. In fact, a string literal has type char [N], where N is the number of characters in the array (including the \0 terminator). In most cases, an array identifier decays to a pointer to the first element of the array, so in most expressions a string literal such as "hello" will decay to a pointer to the char element 'h'.
char *names[] = { "hello", "Jordan" };
Here the two string literals decay to pointers to char which point to 'h' and 'J', respectively. That is, here the string literals have type char * after the conversion. These types agree with the declaration on the left, and the array names[] (which is not an array of character type, but an array of char *) is initialized using these two pointer values.
char names[] = "hello";
or similarly:
char names[] = { "hello" };
Here we encounter a special case. Expressions of array type are not converted to pointers to their first elements when they are operands of the sizeof operator or the unary & operator, or when they are string literals used to initialize an array of character type. So in this case, the string literal "hello" does not decay to a pointer; instead the characters contained in the string literal are used to initialize the array names[].
char names[] = {"hello", "Jordan"};
Again, the string literals would be used to initialize the array names[], but there are excess initializers in the initializer list. This is a constraint violation according to the Standard. From §6.7.9 ¶2 of the C11 Draft Standard:
No initializer shall attempt to provide a value for an object not contained within the entity being initialized.
A conforming implementation must issue a diagnostic in the event of a constraint violation, which may take the form of a warning or an error. On the version of gcc that I am using at the moment (gcc 6.3.0) this diagnostic is an error:
error: excess elements in char array initializer
Yet, for arrays of char that are initialized by an initializer list of char values rather than by string literals, the same diagnostic is a warning instead of an error.
In order to initialize an array of char that is not an array of pointers, you would need a 2d array of chars here. Note that the second dimension is required, and must be large enough to contain the largest string in the initializer list:
char names[][100] = { "hello", "Jordan" };
Here, each string literal is used to initialize an array of 100 chars contained within the larger 2d array of chars. Or, put another way, names[][] is an array of arrays of 100 chars, each of which is initialized by a string literal from the initializer list.
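A sketch of working with such a 2D array, using invented function names. Note that the row count can be left for the compiler to deduce while the second dimension has to be spelled out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The number of rows is deduced from the initializer list. */
size_t count_rows(void)
{
    char names[][100] = { "hello", "Jordan" };
    return sizeof names / sizeof names[0];
}

/* Each row is a modifiable buffer of 100 chars, so a longer
   string can be copied in as long as it fits. */
size_t length_after_overwrite(void)
{
    char names[][100] = { "hello", "Jordan" };
    assert(sizeof names[0] == 100);
    strcpy(names[0], "a longer greeting");
    return strlen(names[0]);
}
```

count_rows() gives 2, and overwriting names[0] is fine here because the characters are stored in the array itself rather than in a string literal.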
char name[] is an array of characters so you can store a word in it:
char name[] = "Muzol";
This is the same as:
char name[] = {'M', 'u', 'z', 'o', 'l', '\0'}; /* '\0' is the null character; it marks the end of the string */
And char* names[] is an array of pointers, where each element points to the first character of a separate char array.
char* names[] = {"name1", "name2"};
It's similar to:
char name1[] = {'n', 'a', 'm', 'e', '1', '\0'}; /* or char name1[] = "name1"; */
char name2[] = {'n', 'a', 'm', 'e', '2', '\0'}; /* or char name2[] = "name2"; */
char* names[] = { name1, name2 };
So basically names[0] points to &name1[0], from where the memory can be read until name1[5]; this is where the '\0' (null) character is found and reading stops. The same happens for name2[].

Identification: Is that a string?

I don't know if that is a string or an array...
char str4[100] = { 0 };
Is that code a string?
If yes, what does it print?
I don't know if that is a string or an array...
It is definitely an array. It can also be a string since a string is an array of characters terminated by a null character in C.
You can use it as an array:
char str4[100] = { 0 };
str4[0] = 'a';
You can also use it as a string:
if ( strcmp(str4, "ABC") == 0 )
{
// This string contains "ABC"
}
When an array of characters is not a string
You can create an array of characters that cannot be used like a string.
char str[4] = { 'a', 'b', 'c', 'd' };
if ( str[0] == 'a' ) // OK
{
// Do something
}
if ( strcmp(str, "ABC") == 0 ) // Problem. str does not have a null character.
// It cannot be used like a string.
{
}
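If you do have a char array without a terminator, length-bounded functions such as memcmp and memchr are still safe to use. Below is a small sketch; both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if a '\0' occurs somewhere in buf[0..n), i.e. the buffer
   can safely be treated as a string; 0 otherwise. */
int has_terminator(const char *buf, size_t n)
{
    assert(buf != NULL);
    return memchr(buf, '\0', n) != NULL;
}

/* Compares the first 4 bytes only, so no terminator is needed. */
int starts_with_abcd(const char *buf, size_t n)
{
    assert(buf != NULL);
    return n >= 4 && memcmp(buf, "abcd", 4) == 0;
}
```

For char str[4] = { 'a', 'b', 'c', 'd' };, has_terminator(str, sizeof str) is 0 while starts_with_abcd(str, sizeof str) is 1.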
str4 is an array of char's, so yes: it can be a string. You're initializing it to {0}. This means the first element in the array is being initialized to a terminating nul character (the end of a string), the result being: str4 is a valid, albeit empty, string. Implicitly, the rest of the array will be initialized to 0, too BTW.
Printing this string is the same as printing an empty string:
printf("");
The code you posted is exactly the same as this:
char str4[100] = "";
//or this
char str4[100] = {0, 0, '\0'};//'\0' is the same as 0
//or even
char str4[] = {0, 0, ..., 0};//100 0's is just a pain to write...
Or, in case of a global variable:
char str4[100];
simply because objects that have static storage are initialized to their nul-values (integer compatible types are initialized to 0, pointers to NULL):
If an object that has static storage duration is not initialized explicitly, it is initialized implicitly as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant.
Either way, the short answer is: str4 is an empty string.
By definition in C a string is a contiguous sequence of characters terminated by and including the first null character. So here your array also represents a string of length 0.
In C a string is just an array of bytes that follows a particular convention, namely that the array of bytes be terminated by a null character. In this case, if you try to print str4 with something like printf, you'll find it looks like an empty string because the first byte is a null character, terminating it immediately.
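To illustrate, here is a sketch that starts from the same char str4[100] = { 0 }; and writes a few characters into it. The function name is made up; because the rest of the array is zero, the result stays a valid string without any extra work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes up to n_chars 'a' characters into a zero-initialized buffer
   (always leaving the last element untouched) and returns the
   resulting string length. */
size_t length_after_writes(size_t n_chars)
{
    char str4[100] = { 0 };
    assert(strlen(str4) == 0); /* initially an empty string */
    for (size_t i = 0; i < n_chars && i < sizeof str4 - 1; ++i) {
        str4[i] = 'a';
    }
    return strlen(str4); /* the untouched zeros act as the terminator */
}
```

length_after_writes(0) is 0 (the empty string), and length_after_writes(3) is 3.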

Resources