Confused why function isn't printing address - c

Could anyone explain why running the following code prints only the newline character?
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv) {
int x = 12;
char *s = (char *) &x;
printf("%s\n", s);
return 0;
}
Since we're casting &x as a string, shouldn't what is printed be the string representation of the address of x (presumably some hexadecimal memory address)?

A string is a sequence of characters, terminated by the special character '\0'. When you print a string using the "%s" format, the printf function takes the address as a base address and prints characters from that base until it finds the terminator. If the "string" isn't actually a string, you have undefined behavior.
If you want to print an address you should use the "%p" format:
printf("Address of variable x is %p\n", (void *) &x);

Your code exhibits undefined behavior because you are trying to print an int's address using %s.
%s in the printf family of functions is used to print \0-terminated character arrays, i.e. C-style strings.
From C11 specs, 7.21.6.1 The fprintf function
(8) %s: If no l length modifier is present, the argument shall be a pointer to
the initial element of an array of character type.280) Characters from
the array are written up to (but not including) the terminating null
character. If the precision is specified, no more than that many bytes
are written. If the precision is not specified or is greater than the
size of the array, the array shall contain a null character.
280) No special provisions are made for multibyte characters
And later
(9) If a conversion specification is invalid, the behavior is
undefined.282) If any argument is not the correct type for the
corresponding conversion specification, the behavior is undefined.
282) See ‘‘future library directions’’ (7.31.11).
One of the many possibilities that may happen is: (I am assuming a lot about the implementation here)
your int appears in memory as the following 4 bytes
s (not guaranteed to hold the same address)
s   s+1 s+2 s+3 s+4
+---+---+---+---+
| 0 | 0 | 0 | 12|
+---+---+---+---+
&x
or
s (not guaranteed to hold the same address)
s   s+1 s+2 s+3 s+4
+---+---+---+---+
| 12| 0 | 0 | 0 |
+---+---+---+---+
&x
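Here 12 is the form-feed character (\f), a non-printable ASCII character that may not show anything on the screen.
When you reinterpret the int as char * and print it, the first layout starts with a 0 byte, so an empty string is printed followed by the newline. In the second layout a single form-feed is printed before the 0 byte at s+1 ends the string. None of this is guaranteed, though, and anything may happen, from crashing to printing indefinitely (or even worse).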
Correct way to print an int is:
printf("%d\n", x);

Related

Why the second scanf also changes the variate presented in the first scanf?

I want to input an integer and a character with the scanf function, but it didn't work as I expected.
The code is as follows.
#include <stdio.h>
int main()
{
int a;
char c;
scanf("%d",&a);
scanf("%2c",&c);
printf("%d%c",a,c);
return 0;
}
I tried to input 12a (there is a space after a) from the terminal, but the output was not "12a" but "32a".
I also stepped through the code above and found that after the first scanf the value of a is 12, but after the second scanf the value of a has become 32.
I want to figure out why the second scanf changes the value of a, which is not one of its arguments.
The problem is that the compiler has put variable a just after variable c in memory. When you do the second scanf() you ask it to read two characters into a variable that has space for only one. You have caused a buffer overflow and have overwritten memory past the variable c (and a happens to be there). The space has been written into a, and this is the reason that you get 32 as output (presumably the low-order byte of a now holds an ASCII SPACE, which is 32).
What has happened is known as Undefined Behaviour, and it's common when you make this kind of mistake. You can solve this by defining an array of char with at least two cells to hold the two characters, and then use something like:
#include <stdio.h>
int main()
{
int a;
char c[2];
scanf("%d", &a);
scanf("%2c", c); /* now c is a char array so don't use & */
printf("%d%.2s", a, c); /* use %.2s format instead */
return 0;
}
Note:
The %.2s format specifier is needed because c is an array of two chars that has been filled completely, leaving no room for a \0 string terminator. Without the precision, printf would read past the end of the array, which is undefined behaviour. With %.2s, output stops at the second character (or earlier, if a \0 is found in the first or second array position).
Quoting C11, chapter 7.21.6.2, The fscanf function (emphasis mine)
c
[...]If an l length modifier is present, the input shall be a sequence of multibyte characters that begins in the initial shift state. Each multibyte character in the sequence is converted to a wide character as if by a call to the mbrtowc function, with the conversion state described by an mbstate_t object initialized to zero before the first multibyte character is converted. The corresponding argument shall be a pointer to the initial element of an array of wchar_t large enough to accept the resulting sequence of wide characters. [...]
and you're supplying a char *. The supplied argument does not match the expected type of argument, so this is undefined behavior.
Therefore the outcome cannot be justified.
To hold an input like "a ", you'll need a (long enough) char array, a char variable is not sufficient.

What does char a[50][50] mean in C?

I'm working on a homework that has to do with strings.
Here's the code
int main(){
char a[50][50];
int n;
printf("Enter the value of n\n");
scanf("%d",&n);
printf("Enter %d names\n",n);
fflush(stdin);
for(int i=0; i<n; i++){
gets(a[i]);
}
I tried to change the char a[50][50] into char a[50], but then the program didn't compile and I got this error message: "Invalid conversion from 'char' to '*char'"
I don't really understand how this works.
char a[50][50] declares a to be an array of 50 arrays of 50 char.
Then a[0] is an array of 50 char, and so is a[1], a[2], a[3], and so on up to a[49]. There are 50 separate arrays, and each of them has 50 char.
Since a[0] is an array of 50 char, a[0][0] is a char. In general, a[i][j] is character j of array i.
gets(a[i]) says to read characters from input and put them into a[i]. For this to work, a[i] must be an array of char—gets reads multiple characters and puts them in the array. If a[i] were a single character, gets could not work.
Although gets(a[i]) says to put characters into a[i], it works by passing an address instead of passing the array. When an array is used in an expression other than as the operand of sizeof or the address operator &, C automatically converts it to a pointer to its first element. Since a[i] is an array, it is automatically converted to a pointer to its first element (a pointer to a[i][0]). gets receives this pointer and uses it to fill in characters that it reads from the standard input stream.
char a[50][50] declares a as a 50-element array of 50-element arrays of char. That means each a[i] is a 50-element array of char. It will be laid out in memory like:
+---+
a: | | a[0][0]
+---+
| | a[0][1]
+---+
| | a[0][2]
+---+
...
+---+
| | a[0][49]
+---+
| | a[1][0]
+---+
| | a[1][1]
+---+
...
+---+
| | a[1][49]
+---+
| | a[2][0]
+---+
...
This code is storing up to 50 strings, each up to 49 characters long, in a (IOW, each a[i] can store a 49-character string). In C, a string is a sequence of character values including a 0-valued terminator. For example, the string "hello", is represented as the sequence {'h', 'e', 'l', 'l', 'o', 0}. That trailing 0 marks the end of the string. String handling functions and output functions like puts and printf with the %s specifier need that 0 terminator in order to process the string correctly.
Strings are stored in arrays of character type, either char (for ASCII, UTF-8, or EBCDIC character sets) or wchar_t for "wide" strings (character sets that require more than 8 or so bits to encode). An N-character string requires an array that's at least N+1 elements wide to account for the 0 terminator.
Unless it is the operand of the sizeof or unary & operator, or is a string literal used to initialize an array of character type, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element of the array.
When you call
gets( a[i] );
the expression a[i] is converted from type "50-element array of char" to "pointer to char", and the value of the expression is the address of the first element of the array (&a[i][0]) [1]. gets will read characters from standard input and store them to the array starting at that address. Note that gets is no longer part of the standard C library - it was removed in the 2011 version of the standard because it is unsafe. C does not require any sort of bounds checking on array accesses - if you type in more characters than the target buffer is sized to hold (in this case, 50), those extra characters will be written to memory immediately following the last element of the array, which can cause all sorts of mayhem. Buffer overflows are a popular malware exploit. You should replace the gets call with
fgets( a[i], 50, stdin );
which will read up to 49 characters into a[i] from standard input. Note that any excess characters are left in the input stream.
Also, the behavior of fflush is not defined for input streams [2] - there's no good, safe, portable way to clear excess input except to read it using getchar or fgetc.
[1] This is why you got the error message you did when you changed a from char [50][50] to char [50] - in that case, a[i] has type char, not char *, and the value of a[i] is not an address.
[2] Microsoft's Visual Studio C compiler is a notable exception - it will clear excess input from the input stream. However, that's specific to MSVC, and not portable across different compilers. The operation is also a little nonsensical with respect to "flush" semantics.
Basically, in C this declares a two-dimensional array of char: 50 rows, each with 50 cells, with valid indices running from 0 to 49 in each dimension.
That program seems to store n names in the array a. It first asks for the number of names, and then the names. The function char *gets(char *str) stores each line in a separate entry of a.
a has 2 dimensions. The first refers to the number of names, and the second is for the length of each name. Something like a[number_of_names][length_of_name]
However, it will probably crash if the user provides an n > 50, or if a name contains more than 49 chars (one cell is needed for the terminating \0).
Also, gets() is dangerous. See this other post.
EDIT: Changing a to one dimensions makes the program try to store a whole line inside a char, hence the error

Why does printf function ignore the latter \0? [duplicate]

I am stuck on some aspects of how \0 behaves.
I know that \0 is the null character and that it marks the end of a string.
int j;
j = printf("abcdef\0abcdefg\0");
printf("%d", j);
return 0;
When I tried to print "abcdef\0abcdefg\0", C printed only 'abcdef' and '6', instead of both 'abcdef' and 'abcdefg', which would sum to 13. Why does this happen?
"abcdef\0abcdefg\0", the string literal, is effectively a static, const (for all intents and purposes) char array and so it has an associated size that the compiler maintains:
#include <stdio.h>
#define S "abcdef\0abcdefg\0"
//^string literals implicitly add a(nother) hidden \0 at the end
int main()
{
printf("%zu\n", sizeof(S)); //prints 16
}
But arrays are treated specially in C and passing them as a parameter to a function or almost any operator converts them to a pointer to their first element.
Pointers do not have an associated size.
When you pass a char const* to a function (e.g., printf), the function receives just one number--the address of the first element.
The way printf and most string functions in C obtain the size is by counting characters until the first '\0'.
If you pass a pointer to the first element of a char array that has explicit embedded zeros in it, then for a function that counts until the first '\0', the string effectively ends at the first '\0'.
A string in C is a sequence of characters followed by a NUL character, which is '\0'. There is no separate "length" field.
When a string appears as a literal, such as "hello", what actually gets stored in memory is:
'h', 'e', 'l', 'l', 'o', '\0'
So you can see that if your string itself contains a '\0', as far as any of the C standard library functions are concerned, that's the end of the string.
Once printf sees the first '\0' in your string, it stops printing and returns, because that's the end of the format string. printf has no way of knowing that there's another string after the '\0'. Maybe there is--or maybe there's just random other program data in memory after that point. It can't tell the difference.
If you want to actually print the '\0' characters, then you need to have some other way to track the "real" length of the string and use a function that accepts that length as a parameter. Alternately you could add the '\0' characters during the formatting process by specifying %c in the format string and passing 0 as the character value.
Here
j = printf("abcdef\0abcdefg\0"); /* printf stops printing at the first \0, so it prints abcdef */
printf() starts printing from the base address of the string literal "abcdef\0abcdefg\0", i.e. from a, and continues until it encounters the first \0. So it prints abcdef.
-----------------------------------------------------------------
| a | b | c | d | e | f | \0 | a | b | c | d | e | f | g | \0 |
-----------------------------------------------------------------
0x100 0x101 ...           0x106
(0x100: assume this is the base address of the string literal)
printf starts printing from memory location 0x100.
When printf sees the first \0 (at 0x106), it stops printing and returns.
printf() then returns the number of characters printed, i.e. 6.
printf("%d", j); /* prints 6 */
From the manual page of printf
RETURN VALUE
Upon successful return, these functions return the number of characters printed (excluding the null byte used to end output to strings).

Printf prints more than the size of an array

So, I'm rewriting the tar extract command, and I stumbled upon a weird problem:
In short, I allocate a HEADER struct that contains multiple char arrays, let's say:
struct HEADER {
char foo[42];
char bar[12];
};
When I fprintf foo, I get a 3 character-long string, which is OK since the fourth character is a '\0'. But when I print bar, 25 characters are printed.
How can I get only the 12 characters of bar?
EDIT The fact that the array isn't null terminated is 'normal' and cannot be changed, otherwise I wouldn't have so much trouble with it. What I want to do is parse the x first characters of my array, something like
char res[13];
magicScanf(res, 12, bar);
res[12] = '\0';
EDIT It turns out the string WAS null-terminated already. I thought it wasn't, since that seemed the most logical explanation for my bug. As that is a separate question, I'll accept an answer that matches the problem as described. If someone has an idea as to why sprintf could have printed 25 characters INCLUDING 2 \0, I would be glad to hear it.
You can print strings without NUL terminators by including a precision:
printf ("%.25s", s);
or, if your precision is unknown at compilation time:
printf ("%.*s", length, s);
The problem is that the size of an array is lost when it is passed to a function. Thus, the fprintf function does not know the size of the array and can only stop at a \0.
No, unless you have supplied the precision, fprintf() has no magical way to know the size of the array supplied as argument to %s, it still relies on the terminating null.
Quoting C11, chapter §7.21.6.1, (emphasis mine)
s
If no l length modifier is present, the argument shall be a pointer to the initial
element of an array of character type.280) Characters from the array are
written up to (but not including) the terminating null character. If the
precision is specified, no more than that many bytes are written. If the
precision is not specified or is greater than the size of the array, the array shall
contain a null character.
So, in case your array is not null terminated, you must use a precision to avoid out-of-bounds access.
void printbar(struct HEADER *h) {
printf("%.12s", h->bar);
}
You can use it like this
struct HEADER data[100];
/* ... */
printbar(data + 42); /* print data[42].bar */
Note that if one of the 12 bytes of bar has a value of zero, not all of them get printed.
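You might be better off printing them one by one, as hex bytes
void printbar(struct HEADER *h) {
/* convert through unsigned char so that negative char values are not sign-extended */
printf("%02x", (unsigned)(unsigned char)h->bar[0]);
for (int i = 1; i < 12; i++) printf(" %02x", (unsigned)(unsigned char)h->bar[i]);
}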

Weird behavior of printf() calls after usage of itoa() function

I am brushing up my C skills. I tried the following code to learn how to use the itoa() function:
#include<stdio.h>
#include<stdlib.h>
void main(){
int x = 9;
char str[] = "ankush";
char c[] = "";
printf("%s printed on line %d\n",str,__LINE__);
itoa(x,c,10);
printf(c);
printf("\n %s \n",str); //this statement is printing nothing
printf("the current line is %d",__LINE__);
}
and i got the following output:
ankush printed on line 10
9
//here nothing is printed
the current line is 14
The thing is that if I comment out the statement itoa(x,c,10); the above-mentioned statement does print, and I get the following output:
ankush printed on 10 line
ankush //so i got it printed
the current line is 14
Is this the expected behavior of itoa(), or am I doing something wrong?
Regards.
As folks pointed out in the comments, the size of the array c is 1. Since C strings require a null terminator, you can only store a string of length 0 in c. However, when you call itoa, it has no idea that the buffer you're handing it is only 1 character long, so it will happily keep writing digits into memory after c (which is likely to be memory that contains str).
To fix this, declare c with a size large enough to hold the string you plan to put into it, plus 1 for the null terminator. A 32-bit int needs at most 10 digits plus a possible minus sign, so you can use char c[12].
To further explain the memory overwriting situation above, let's consider that c and str are allocated in contiguous regions on the stack (since they are local variables). So c might occupy memory address 1000 (because it is a zero-character string plus a null terminator), and str would occupy memory addresses 1001 through 1007 (because it has 6 characters, plus the null terminator). When you try to write the string "9" into c, the digit 9 is put into memory address 1000 and the null terminator is put in memory address 1001. Since 1001 is the first address of str, str now represents a zero-length string (null terminator before any other characters). That's why you are getting the blank.
c must be a buffer long enough to hold your number.
Write
char c[20] ;
instead of
char c[] = "";

Resources