Printf functions when referring to strings - c

If
char d[3];
d[0] ='p';
d[1] ='o';
d[2] ='\0';
how come
printf("%s\n",d[0]);
won't work properly.
But if I have
char n[2][4];
n[0][0]='T'; n[0][1]='o'; n[0][2]='m'; n[0][3]=0;
n[1][0]='S'; n[1][1]='u'; n[1][2]='e'; n[1][3]=0;
printf("%s %s\n", n[0],n[1]);
it will print the entire string?

Because
d[0] - is a character
And
n - is an array of arrays of characters, i.e. an array of strings

d[0] is the first character contained in the array whereas printf requires the address of that first character.
That address is what you get when you use d in an expression such as a function argument, or you can spell it out explicitly with &(d[0]), the address of the character at the start of the array :-).
The reason why your two-dimensional arrays work is exactly the same: n[0] is the address of n[0][0], the same way that d is the address of d[0].
If you were to pass n[0][0] (the character) to printf, you would have the same problem as when you passed d[0].

printf("%s\n",d[0]); is technically undefined behavior. The documentation for printf describes the various conversion specifiers.
s
If no l modifier is present: The const char * argument is expected to
be a pointer to an array of character type (pointer to a string).
Characters from the array are written up to (but not including) a
terminating null byte ('\0');
If you enable warnings, i.e. -Wall, you may get:
warning: format '%s' expects argument of type 'char *', but argument 2 has type
'int' [-Wformat=]
printf("%s\n",d[0]);
For why the second example works, read about array-to-pointer conversions. James McNellis writes:
In both C and C++, an array can be used as if it were a pointer to its
first element. Effectively, given an array named x, you can replace
most uses of &x[0] with just x.
[...]
void f(int* p);
int x[5];
f(x); // this is the same as f(&x[0])
So n[0] is equivalent to &n[0][0], just as d is equivalent to &d[0]. But d is not equivalent to d[0].

When you try to print your string d, you pass printf the first character of the array:
printf("%s\n",d[0]);
d[0] means "The thing stored in index 0 of array d". This is the literal character 'p' and not the beginning of the string.
Passing a character to printf with the string specifier (%s) is undefined behaviour and could cause problems. Don't do it.
When printing a string with printf, you need to pass a pointer to the string, that is, a pointer to the first element of the string (and not the element itself).
&d[0];
^^ ^
|| Index 0
|array d
Address of
This evaluates to the address of the 0th index of array d.
Additionally, we can take advantage of the fact that arrays decay to pointers to their element type.
&d[0] == &(*(d+0)) == d
This means that we can do away with the & and [] operators and just pass in d:
printf("%s\n", d);
This will print the string.
With your second example, the two-dimensional array [][] adds one more level of indexing. Note that it is not a char **: n[0] and n[1] are each a char[4], and each decays to a char * when passed to a function.
This means that when you called printf with n[0] and n[1], you were actually passing pointers to the first character of each string, i.e. &n[0][0] and &n[1][0].

Related

Array of Pointers in C how is the operator [ ] used

Hello, I am new to C and its pointers. I thought I understood pointers until I stumbled upon arrays of pointers.
Why is the output of these two code fragments exactly the same? I would expect that my normal array gives me the values, and that the array that holds pointers gives me the addresses of the values.
char *array[] = {'a','b','c','d'};
for(int i=0; i<4; i++){
printf("%c\n", array[i]);
}
char array[] = {'a','b','c','d'};
for(int i=0; i<4; i++){
printf("%c\n", array[i]);
}
I know that '[]' is used to dereference and get the value at the address the pointer is pointing at, but it is also used to access array elements, and the only reasonable explanation here is that it does both at the same time. Is this how I should think about it?
In the first code snippet, you are initializing an array of pointers with character constants. This results in an integer-to-pointer conversion of those constants. So for example the first element of the array contains the address 97 (assuming ASCII encoding).
When you later attempt to print, you are passing a char * where a char is expected. Using the wrong format specifier triggers undefined behavior. One of the ways that UB can manifest is that things appear to work properly which is the case here.
What probably happened is that pointers and integers get passed to functions in the same manner. And if your system uses little-endian byte representation (which it appears it does), it will end up reading the value used to initialize the array.
Regarding the array index operator [], the expression E1[E2] is exactly the same as *((E1) + (E2)). In the first code snippet array[i] has type char * while in the second code snippet it has type char because that is the type of the respective array elements.
As dbush has already explained, your first example is wrong.
A more practical use case to demonstrate the pointer part with a string:
#include <stdio.h>
int main(void) {
    const char *str = "abc"; // string literal - must not be modified
    // %p expects a void *, so the pointers are cast before printing
    printf("%p address of str[0] - str[0] = %c \n", (void *)str, *str);
    printf("%p address of str[1] - str[1] = %c \n", (void *)(str+1), *(str+1));
    printf("%p address of str[2] - str[2] = %c \n", (void *)&str[2], str[2]); // alternative notation
    return 0;
}
There are also several similarities and differences between arrays and pointers. In the string example you cannot change the string (the string literal is constant), but you can make the pointer point to another string. With an array it is the other way around: you can change the contents, but you cannot step through the array with ++arr as you can with a pointer, nor can you change the address of arr. Also, in most expressions the array name converts to the address of its first element, which is the same location as the address of the array itself.
With time you will find several useful cases for arrays of pointers, because, as you may already know, swapping pointers (the addresses of the objects they point to) is often faster than moving the objects themselves...
I would expect that my normal array gives me the values, and that the array that holds pointers gives me the addresses of the values.
You put the same data into each array, so it's not surprising that you got the same data out of them. Specifically:
char *array[] = {'a','b','c','d'};
In this case, you created an array containing four values, and you told the compiler that those values are of type char *. But whatever their type, they're still the same values, 'a', 'b', and so on. You might as well have written:
char *array[] = {97, 98, 99, 100};
because that's exactly the same thing. Specifying the type of the contents of the array as char * doesn't make the compiler treat the values in the array differently -- it doesn't say "oh, I guess in this case I should take the addresses of those values instead of using the values themselves" just because the type is char *. If you want an array of pointers to some set of values, you'll need to get those pointers yourself using other means, such as the & operator, allocating space for each one, etc.
Note: In a comment below, M.M. points out that using values of type char to initialize an array of type char * isn't allowed by the C standard. In my experience, compilers typically warn about this kind of thing (you should've gotten several warnings like warning: incompatible integer to pointer conversion initializing 'char *' with an expression of type 'int' when you compiled your code), but they'll still soldier on and compile the code. In summary: 1) Don't do that, and 2) different compilers may do different things in this situation.
I know that '[]' is used to dereference and get the value of the address the pointer is pointing at but it is also used to access array elements and the only reasonable explanation here is that it does both at the same time.
An array in C is a single contiguous piece of memory in which a number of values, all of the same type and size, are arranged one after another. The [] operator accesses individual values within the array by calculating an offset from the array's base address and using that to get the value. I think the thing that's confusing in your example is that you've created an array of char *, but with values that look like char. As far as the compiler is concerned, though, 'a' (a.k.a. 97, a.k.a. 0x61) is an acceptable value for a pointer to a character, and you could dereference that and get whatever character is stored at location 0x61. (In reality, doing that might cause an exception; the lowest region of memory is reserved on many machines.)
I know that [] is used to dereference and get the value at the address the pointer is pointing at, but it is also used to access array elements, and the only reasonable explanation here is that it does both at the same time. Is this how I should think about it?
No. [] is used (in the place you use it) to declare an array whose number of elements is determined by the initializer. Dereferencing a pointer is done with the unary * operator.
The expression
char *array[] = ...
is used to declare an array of pointers to chars, that will be initialized (normally) with a list of string literals, like these:
char *array[] = { "First string", "Second string", "third string" };
and
char array[] = { 'a', 'b', 'c', 'd' };
making use of the information given above, declares an array of characters with space for four characters, and initialized with array[0] = 'a', array[1] = 'b', array[2] = 'c', array[3] = 'd'.
So the first example should give you a diagnostic when it tries to initialize an array cell (of pointer to char type) with an integer value (a char is a kind of integer), and the second example will compile and execute fine.
The reason you get the same output in both cases is not guaranteed (it is undefined behaviour), though I have some idea: you have assigned an integer value to a pointer variable, and the stored value is probably small enough to be reinterpreted back without losing information. But the point is that you have stored the same thing in the array cells, so why ask why the output is the same? What did you expect? You store a char value in a pointer variable and then get it back and print it, in both samples with exactly the same format specifier. Why would you expect (apart from conversion errors back to an integer) the two programs to print different things? (Well, it's undefined behaviour, so you can expect anything and the computer can do otherwise.)
There are two compiler warnings you elegantly skip:
warning: initialization of 'char *' from 'int' makes pointer from
integer without a cast [-Wint-conversion]
    6 | char *array[] = {'a','b','c','d'};
      |                  ^~~
And the second warning:
warning: format '%c' expects argument of type 'int', but argument 2
has type 'char *' [-Wformat=]
    8 | printf("%c\n", array[i]);
      |         ~^     ~~~~~~~~
      |          |          |
      |          int        char *
      |         %s
The suggested %s would of course not help. To get valid char pointers into array[], you would need "..." string literals, malloc, or the address-of operator & applied to char variables.

Can pointers or memory addresses be arguments to the printf() function?

I got confused making a printing function.
void Printing(int* pi, char* pa)
{
printf("%d", *pi);
printf("%s", *pa);
}
The code above has an error in the 2nd printf().
But the code below doesn't. It prints the integer and the string correctly.
void Printing(int* pi, char* pa)
{
printf("%d", *pi);
printf("%s", pa);
}
So far, I have given variables to printf(). But I don't understand why I need to give the pointer to the 2nd printf().
In your code
printf("%s", *pa);
should be
printf("%s", pa);
as %s expects the starting address of a null-terminated character array (i.e., a pointer, not the char as you have supplied).
From C11, chapter 7.21.6.1, The fprintf function:
s
If no l length modifier is present, the argument shall be a pointer to the initial element of an array of character type. Characters from the array are written up to (but not including) the terminating null character. [...]
To add, *pa is the same as pa[0], which is of type char. To print that, you'd need to use the %c conversion specifier.
But I don't understand why I need to give the pointer to the 2nd printf().
Because strings work a little weirdly in C. Technically, there is no type for strings in C. So const char */char * is used for strings. The way this works is that the pointer points to the beginning of the string, and the string ends with a NUL character '\0'. To visualize it, say you call Printing with Printing(0, "Hello");, you pass a pointer to the beginning of a string literal which looks like this in memory:
+---+---+---+---+---+---+
| H | e | l | l | o |END|
+---+---+---+---+---+---+
And the pointer you pass points to the first character, H. If you understand this, you will understand why it needs a pointer. If you dereference it, you will only give the first character H, so it won't be able to print the whole string.

What is different between array String and common array in C?

#include<stdio.h>
int main()
{
char str[7];
scanf("%s",&str);
for(int i=0; i<7; i++)
{
printf("%x(%d) : %c\n",&str[i], &str[i], str[i]);
}
printf("\n\n%x(%d) : %c or %s",&str, &str, str, str);
return 0;
}
I'm confused about pointers to C arrays because of arrays that hold strings.
Actually, I want to save each character of a single line of input.
It works, but I found something strange...
The main issue is that &str and &str[0] have the same address value.
But str has a string value with %s,
and str[0] has a char value with %c.
When I used str with %c, it had the first two numbers of str's address.
What is going on in the array?
Where is the real address of the string value?
And how can scanf("%s",&str) distribute the string to each char array slot?
Input : 123456789
62fe40(6487616) : 1
62fe41(6487617) : 2
62fe42(6487618) : 3
62fe43(6487619) : 4
62fe44(6487620) : 5
62fe45(6487621) : 6
62fe46(6487622) : 7
62fe40(6487616) : # 123456789
This is result window of my code.
You are confused because the string and the array are the same thing. In memory there is only data (and pointers to that data).
When you allocate an integer or a buffer for a string, you reserve some of this memory. A string in C is defined as a sequence of bytes terminated by one byte with the value 0, so its length is not stored anywhere. With a fixed-length array you have a known size to work with.
What you actually pass around for a string is the pointer to its first character.
When you print with %c, printf expects a char (str[0]), not the pointer. When you print with %s, it expects a pointer to a sequence of chars.
printf("\n\n%x(%d) : %c or %s",&str, &str, str[0], str);
What is different between array String and common array in C?
An array is a contiguous sequence of objects of one type. [1]
A string is a contiguous sequence of characters terminated by the first null character. [2] So a string is simply an array of characters where we mark the end by putting a character with value zero. (Often, strings are temporarily held in larger arrays that have more elements after the null character.)
So every string is an array. A string is simply an array with two extra properties: Its elements are characters, and a zero marks the end.
&str and &str[0] have same address value.
&str is the address of the array. &str[0] is the address of the first element.
These are the same place in memory, because the first element starts in the same place the array does. So, when you print them or examine them, they will often appear the same. (Addresses can have different representations, the same way you might write “200” or “two hundred” or “2•10^2” for the same number. So the same address might sometimes look different. In most modern systems, an address is just a simple number for a place in memory, and you will not see differences. But it can happen.)
printf("%x(%d) : %c\n",&str[i], &str[i], str[i]);
This is not a correct way to print addresses. To print an address properly, convert it to void * and use %p [3]:
printf("%p(%p) : %c\n", (void *) &str[i], (void *) &str[i], str[i]);
printf("\n\n%x(%d) : %c or %s",&str, &str, str, str);
…
I used str with %c then it has first two numbers of str's address.
In the above printf, the third conversion specification is %c, and the corresponding argument is str. %c is intended to be used for a character [4], but you are passing it an argument that is a pointer. What may have happened here is that printf used the pointer you passed it as if it were an int. Then printf may have used a part of that int as if it were a character and printed that. So you saw part of the address shown as a character. However, it is a bit unclear when you write “it has the first two numbers of str's address”. You could show the exact output to clarify that.
Although printf may have used the pointer as if it were an int, the behavior for this is not defined by the C standard. Passing the wrong type for a printf conversion is improper, and other results can occur, including the program printing garbage or crashing.
And how can scanf("%s",&str) distribute String to each char array space?
The proper way to pass str to scanf for %s is to pass the address of the first character, &str[0]. C has a special rule for arrays like str: If an array is used in an expression other than as the operand of sizeof or the address-of operator &, it is converted to a pointer to its first element [5]. So, you can use scanf("%s", str), and it will be the same as scanf("%s", &str[0]).
However, when you use scanf("%s",&str), you are passing the address of the array instead of the address of the first character. Although these are the same location, they are different types. Recall that two different types of pointers to the same address might have different representations. Because scanf does not have knowledge of the actual argument type you pass it, it must rely on the conversion specifier. %s tells scanf to expect a pointer to a character [6]. Passing it a pointer to an array is improper.
C has this rule because some machines have different types of pointers, and some systems might pass different types of pointers in different ways. Nonetheless, often code that passes &str instead of str behaves as the author desired because the C implementation uses the same representation for both pointers. So scanf may actually receive the pointer value that it needs to make %s work [7].
Footnotes
[1] C 2018 6.2.5 20. (This means the information comes from the 2018 version of the C standard, ISO/IEC 9899, Information technology - Programming Languages - C, clause 6.2.5, paragraph 20.)
[2] C 2018 7.1.1 1. Note that the terminating null character is considered to be a part of the string, although it is not counted by the strlen function.
[3] C 2018 7.21.6.1 8.
[4] Technically, the argument should have type int, and printf converts it to unsigned char and prints the character with that code. C 2018 7.21.6.1 8.
[5] C 2018 6.3.2.1 3. A string literal used to initialize an array, as in char x[] = "Hello";, is also not converted to a pointer.
[6] C 2018 7.21.6.2 12.
[7] Even if a C implementation uses the same representations for different types of pointers, that does not guarantee that using one pointer type where another is required will work. When a compiler optimizes a program, it relies on the program's author having obeyed the rules, and the optimizations may change the program in ways that would not break a program that followed the rules but that do break a program that breaks the rules.
A string is just shorthand for a zero-terminated char array. So there is no difference between a string and a "normal" array.
Where is real address for String value??
Arrays are not pointers; they only decay to pointers. So there is no physical place in memory where the address of the first element of the array is stored.
The main issue is &str and &str[0] have same address value.
It is not an issue: an array is a chunk of memory, so the address of this chunk is the same as the address of its first element. Only the types are different.

Does printf only allow taking a string literal as the first argument?

According to the definition of printf, the first argument should be an array, i.e. char*, followed by an ellipsis ..., i.e. variable arguments after that. If I write:
printf(3+"helloWorld"); //Output is "loWorld"
According to the definition, shouldn't it give an error?
Here is the definition of printf:
#include <libioP.h>
#include <stdarg.h>
#include <stdio.h>
#undef printf
/* Write formatted output to stdout from the format string FORMAT. */
/* VARARGS1 */
int __printf(const char *format, ...) {
va_list arg;
int done;
va_start(arg, format);
done = vfprintf(stdout, format, arg);
va_end (arg);
return done;
}
#undef _IO_printf
ldbl_strong_alias(__printf, printf);
/* This is for libg++. */
ldbl_strong_alias(__printf, _IO_printf);
This is not an error.
If you pass "helloWorld" to printf, the string literal is converted to a pointer to the first character.
If you pass 3+"helloWorld", you're adding 3 to a pointer to the first character, which results in a pointer to the 4th character. This is still a valid pointer to a string, it's just not the whole string that was defined.
3+"helloWorld" is of char * type (after conversion in the call to printf). In C, the type of a string literal is char []. When passed as an argument to a function, a char [] is converted to a pointer to its first element (the array-to-pointer conversion rule). Therefore, "helloWorld" is converted to a pointer to the element h, and 3+"helloWorld" moves the pointer to the 4th element of the array "helloWorld".
From Pointer Arithmetic:
If the pointer P points at an element of an array with index I, then
P+N and N+P are pointers that point at an element of the same array with index I+N
P-N is a pointer that points at an element of the same array with index I-N
The behavior is defined only if both the original pointer and the result pointer are pointing at elements of the same array or one past the end of that array. ....
The type of string literal is char[N], where N is size of string (including null terminator).
From C Standard#6.3.2.1p3 [emphasis mine]
3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ''array of type'' is converted to an expression with type ''pointer to type'' that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
So, in the expression
3+"helloWorld"
"helloWorld", which is of type char [11] (array of characters), is converted to a pointer to character that points to the initial element of the array object.
Which means, the expression is:
3 + P
where P is pointer to initial element of "helloWorld" string
----------------------------------------------
| h | e | l | l | o | W | o | r | l | d | \0 |
----------------------------------------------
  ^
  |
  P (pointer pointing to initial element of array)
when 3 gets added to pointer P, the resulting pointer will be pointing to 4th character:
----------------------------------------------
| h | e | l | l | o | W | o | r | l | d | \0 |
----------------------------------------------
              ^
              |
              P (after adding 3 the resulting pointer pointing to 4th element of array)
The pointer resulting from the expression 3+"helloWorld" is passed to printf(). Note that the first parameter of printf() is not an array but a pointer to a null-terminated string, and the expression 3+"helloWorld" yields a pointer to the 4th element of the "helloWorld" string. Hence, you get the output "loWorld".
The answers from dbush and haccks are concise and illuminating, and I upvoted the one from haccks in support of the bounty offered by dbush.
The only thing that I find left unsaid is that the way the question title is phrased makes me wonder if the OP would think that e.g. this also should produce an error:
char sometext[] = {'h', 'e', 'l', 'l', 'o', '\0'};
printf (sometext);
since there is no string literal involved at all. The OP needs to understand that one should never think that a function call that takes a char * argument can "only allow taking a string literal as [that] argument".
The answers from dbush and haccks hint at this by mentioning the conversion of a string literal to a char * (and how adding an integer to that evaluates), but I feel that it's worth pointing out explicitly that anything that is treated as a char * can be used, even things not converted from a string literal.
printf(3+"helloWorld"); //Output is "loWorld"
It will not give an error, because a string in C is an array of characters and the array name gives the base address of the array.
In the case of printf(3+"helloWorld");, 3+"helloWorld" gives the address of the fourth element of the array of characters, i.e. of the string.
This is still a valid pointer to a string, i.e. a char*.
printf only requires a char* as the first argument.
The first argument to printf is declared as const char *format. It means printf should be passed a pointer to char and the characters pointed to by this pointer will not be changed by printf. There are additional constraints on this first argument:
it should point to a proper C string, that is an array of characters terminated by a null byte.
it may contain conversion specifiers, which must be properly constructed and the corresponding arguments must be passed as extra arguments to printf, with the expected types and order as derived from the format string.
Passing a string constant such as "helloWorld" as the format argument is the most common way to invoke printf. String constants are arrays of char terminated by a null byte which should not be modified by the program. Passing them to functions expecting a pointer to char will cause a pointer to their first byte to be passed, as is the case for all arrays in C.
The expression "helloWorld" + 3 or 3 + "helloWorld" evaluates to a pointer to the 4th byte of the string. It is equivalent to the expression 3 + &("helloWorld"[0]), &("helloWorld"[3]) or simply &"helloWorld"[3]. As a matter of fact, it is also equivalent to &3["helloWorld"], but this last form is only used for pathological obfuscation.
printf does not use the bytes that precede the format argument, so passing 3 + "helloWorld" is equivalent to passing "loWorld" and produces the same output.
To be more precise, the first argument to the printf() function is not an array of chars but a pointer to the first char of an array. The difference is similar to the difference between ByVal and ByRef in the VB world.
A pointer can be incremented and decremented using ++ and --, or adjusted with the arithmetic operators + and -.
In your case you are passing a pointer to "helloWorld" incremented by three, so it points to the fourth element of the array of chars "helloWorld".
Let's simplify this in asm pseudocode:
MOV EAX, offset ("helloWorld")
ADD EAX, 3
PUSH EAX
CALL printf
You may think that 3+"hello world" concatenates 3 and "hello world", but in C concatenation is done differently. The simplest way is sprintf(buff, "%d%s", 3, "hello world");
It's pretty strange, but not wrong. By doing 3 + you are moving your pointer to a different location.
The same thing works when you initialize a char *:
char *str1 = "Hello";
char *str2 = 2 + str1;
str2 now points to "llo".

Assign pointer type to string type

I'm expecting a compile error, taking into account that a pointer has to be assigned to %p, but the code below doesn't give me an error when I intentionally assign a pointer to %s. By adding an ampersand &, by rights it should produce the address of the array, and that memory address should go with %p, instead of giving the value of the string. That would change only if I dereferenced the pointer, but I don't dereference it at all: I never put an asterisk * in front of my_pointer in printf.
#include <stdio.h>
int main()
{
char words[] = "Daddy\0Mommy\0Me\0";
char *my_pointer;
my_pointer = &words[0];
printf("%s \n", my_pointer);
return 0;
}
Please look at this:
printf("%s \n", my_pointer);
My understanding is that *my_pointer (with the asterisk *) should give me the value of the string.
But my_pointer (without the asterisk) shouldn't give me the value of the string; it should give me only the memory address. Yet when I run this code, I get the value of the string even though I didn't put the asterisk * in front. I hope I'm making myself clear this time.
Here:
printf("%s \n", my_pointer);
%s, expects a char* and since my_pointer is a char* which points to an array holding a NUL-terminated string, the printf has no problems and is perfectly valid. Relevant quote from the C11 standard (emphasis mine):
7.21.6.1 The fprintf function
[...]
The conversion specifiers and their meanings are:
[...]
s - If no l length modifier is present, the argument shall be a pointer to the initial
element of an array of character type. Characters from the array are
written up to (but not including) the terminating null character. If the
precision is specified, no more than that many bytes are written. If the
precision is not specified or is greater than the size of the array, the array shall
contain a null character.
[...]
IMO, you are confused here:
taking into account that a pointer has to be assigned to %p, but the code below doesn't give me an error when I intentionally assign a pointer to %s
First of all, %s, %p, etc. are conversion specifiers. They are used in functions like printf, scanf, etc.
Next, you are the one specifying the type of the pointers. So here:
my_pointer = &words[0];
Both &words[0] and my_pointer are of type char*. Assigning one to the other is therefore perfectly valid, as both are of the same type.
The compiler is treating your code exactly as it is required to.
The %s format specifier tells printf() to expect a const char * as the corresponding argument. It then deems that pointer to be the address of the first element of an array of char and prints every char it finds until it encounters one with value zero ('\0').
Strictly speaking, the compiler is not even required to check that my_pointer is, or can be implicitly converted to, a const char *. However, most modern compilers (assuming the format string is supplied at compile time) do that.
In C, an array name used in most expressions is converted to a pointer to its first element, which means that in your case words and &words[0], when used as pointers, have the same value.
And you assign it to another pointer of the same type, so this is legal.
As for strings in C, a string is just an array of chars ending with '\0', and its name converts to a pointer to the first char.

Resources