Unexpected output in C program when printing - c

Assuming that all the header files are included.
void main() {
int i;
clrscr();
printf("india"-'A'+'B');
getch(); }
The output of this program is: ndia
Can anyone please explain this output to me?

int printf(const char *restrict format, ...);
When you do format - 'A' + 'B', it's equal to format + 1, because the ASCII values of 'A' and 'B' are 65 and 66.
format is the base address of the string, so format + 1 points to the second character of the string. Printing starts from there, which gives ndia.

The expression "india"-'A'+'B' does not make sense in any sane code. In practice it usually yields a pointer to the second element, because 'B' - 'A' evaluates to 1 (but see the caveat that follows).
However, the subexpression "india"-'A' will invoke undefined behavior, because the resulting pointer will point outside the array. This is explained here: Why is pointing to one before the first element of an array not allowed in C?
If we rewrite a bit and add parentheses: "india"+('B'-'A') the expression is well defined and equal to "india" + 1

Related

Why the second scanf also changes the variate presented in the first scanf?

I want to input an integer and a character with the scanf function, but it doesn't work the way I want.
The code is as follows.
#include <stdio.h>
int main()
{
int a;
char c;
scanf("%d",&a);
scanf("%2c",&c);
printf("%d%c",a,c);
return 0;
}
I tried to input 12a (with a space after the a) from the terminal, but the output is not "12a" but "32a".
I also stepped through the code and found that after the first scanf the value of a is 12, but after the second scanf the value of a becomes 32.
I want to figure out why the second scanf changes the value of a, which is not passed to it.
The problem is that the compiler has put variable a just behind variable c. When you do the second scanf() you ask it to read two characters into a variable that has space for only one. You have caused a buffer overflow and overwritten memory past the variable c (and a happens to be there). The space has been written into a, and this is why you get 32 in the output (a now holds the value of an ASCII SPACE, which is 32).
What has happened is known as Undefined Behaviour, and it's common when you make this kind of mistake. You can solve this by defining an array of char with at least two elements to hold the two characters, and then use something like:
#include <stdio.h>
int main()
{
int a;
char c[2];
scanf("%d", &a);
scanf("%2c", c); /* now c is a char array so don't use & */
printf("%d%.2s", a, c); /* use %.2s format instead */
return 0;
}
Note:
The %.2s format specifier is needed because c is an array of two chars that has been filled completely, leaving no room for a \0 string terminator. Without the precision, printf would read past the end of the array, which is undefined behaviour. %.2s makes printing stop at the second character (or earlier, if a \0 appears in the first or second position).
Quoting C11, chapter 7.21.6.2, The fscanf function (emphasis mine)
c
[...]If an l length modifier is present, the input shall be a sequence of multibyte characters that begins in the initial shift state. Each multibyte character in the sequence is converted to a wide character as if by a call to the mbrtowc function, with the conversion state described by an mbstate_t object initialized to zero before the first multibyte character is converted. The corresponding argument shall be a pointer to the initial element of an array of wchar_t large enough to accept the resulting sequence of wide characters. [...]
and you're supplying a char *. The supplied argument does not match the expected type of argument, so this is undefined behavior.
Therefore the outcome cannot be justified.
To hold an input like "a ", you'll need a (long enough) char array, a char variable is not sufficient.

Difference between printf("%c",*(*(ptr+i)+x)) and printf("%s",*(*(ptr+i)+x))

I have an array of pointers. When I run this C code I get a segmentation fault. What am I doing wrong here?
char *ptr[] = {"exam","example","testexample"};
printf("%c\n",*(*(ptr+2)+7));
printf("%s\n",*(*(ptr+2)+7));
The first print statement gives the expected result
printf("%c\n",*(*(ptr+2)+7));
m
but the second one instead of giving output of
mple
is giving
printf("%s\n",*(*(ptr+2)+7)); Segmentation fault (core dumped)
What I'm doing wrong here?
The type of the expression
*(*(ptr+2)+7)
is char. So the first call of printf is correct.
But the second call is incorrect because the format specifier %s expects an argument of the type char *. So the value of the character obtained by the expression *(*(ptr+2)+7), that is the character 'm' (which in ASCII has the value 109), is interpreted as an address.
In the second call just use
*(ptr+2)+7
Here is a demonstrative program
#include <stdio.h>
int main(void)
{
char *ptr[] = { "exam", "example", "testexample" };
printf( "%c\n", *( *( ptr + 2 ) + 7 ) );
printf( "%s\n", *( ptr + 2 ) + 7 );
return 0;
}
Its output is
m
mple
Your argument *(*(ptr+2)+7) evaluates to one character with the value 'm', which (on an ASCII-based platform) is the same as the number 109.
When you do
printf("%s\n",*(*(ptr+2)+7));
this is exactly the same as
printf("%s\n",'m');
It gets compiled to machine code that loads the number 109 into a register and pushes that register on the stack. It doesn't tell printf anything about where it found that 'm' -- only that one raw ASCII value gets passed to the function.
printf then tries to interpret the 109 not as a character itself but as a pointer to some characters -- because that's what you asked it to do by writing %s. This, unsurprisingly, goes horribly wrong, since 109 is not the address of anything the program is allowed to access.
If you want to print the tail end of the string, you could instead write
printf("%s\n", *(ptr+2)+7 );
where you don't apply the * operator to the pointer you want to pass.
*(*(ptr+2)+7) is a char value.
So the first statement is correct, since "%c" tells printf that the first parameter is a char:
printf("%c\n",*(*(ptr+2)+7));
But in the second statement, "%s" tells printf that the first parameter is a char *, that is, a pointer to a char.
That way, when the program executes
printf("%s\n",*(*(ptr+2)+7));
it treats the value of *(*(ptr+2)+7) (a char value, in this case 'm') as if it were a char pointer.
That is the reason for the segmentation fault.
So the corrected second statement would be
printf("%s\n",*(ptr+2)+7);
Hope this helps.
The problem is the use of the expression *(*(ptr+2)+7) - it is hard to see what it does. That construct is used mostly by beginners who do not know C intimately yet. That is why C has a syntactic sugar for it: *(*(ptr+2)+7) is exactly equivalent to ptr[2][7], which is the form you should be using.
Now, ptr[2][7] clearly is a char, but %s expects a pointer to a char, the first character in a null-terminated string, so let's pass in a pointer to that character:
printf("%s\n", &ptr[2][7]);

Scanf a Character into an Array in C

I'm trying to take a single character in an array and then print that character using a specific syntax. Here's my code :
int main(){
char in[18];
scanf("%c",in);
printf("%c",in);
return 0;
}
I know how to read a character from the user in C, and many other ways to do the same task, but I'm curious why this code prints nothing on the screen. Here's my explanation of the code. Kindly correct me if I'm wrong.
First, an array of 18 characters is declared.
Using scanf, a character is stored in the first position of the array ("in" refers to the address of its first element).
Then, when I try to print that character, it prints nothing.
When I change "in" to "in[0]", the character is printed on the screen.
I think "in" points to the first element, and so does in[0]. Then why am I getting two different results?
Thanks In Advance !!
in[0] does not point to the first element in the array. It is the first element in the array.
in has type char * (when passed to a function) while in[0] has type char. And the %c format specifier to printf expects a char, not a char *.
Your code invokes undefined behavior. The compiler may warn that the "%c" specifier expects a char parameter (strictly speaking, it expects an int that is then converted to unsigned char), but you passed a char * (the array of char decays to a pointer).
To make it print the character use
printf("%c", in[0]);
Passing the wrong type for a given format specifier in both printf() and scanf() is undefined behavior.

C: Why does C need the memory address of a char in order to convert it to an int? [closed]

Coming from Python, where I would simply use type() to find out the type of an object, C, lacking introspection, is forcing me to better grasp its data types, their relatedness, and pointers, before moving on to more advanced topics. This is a good thing. So I have the following piece of code, which I will tweak in various ways and try understand the resulting behavior:
#include <stdio.h>

int main(int argc, char *argv[])
{
    int i = 0;
    for(i = 0; argv[1][i] != '\0'; i++) {
        printf("%d\n", argv[1][i]);
        char letter = argv[1][i];
        switch(letter) {
        case 2:
            printf("%d: 2\n", i);
            break;
        }
    }
    return 0;
}
If I run this and pass the number 2 as a single argument, nothing happens. My understanding then is that because I have defined argv[1][i] as a char, comparing it to 2 (an int) will return false, hence the code does not get called. And indeed, if I change the case from 2 to '2', the code does get called. Makes sense, but it leads me to my first question:
Q1. I have read in various places that to C, a character and an integer are essentially the same thing. So why doesn't C "know" that the 2 passed as an argument should be interpreted as an integer and not a string? After all, C allows me to, for example, use the %d string formatter for characters.
If I then change the type of variable letter from a char to an int:
int letter = argv[1][i];
... I still get the same behavior as in the first variant (i.e. nothing happens), even though now I am apparently comparing an int to an int in the case statement. This leads me to surmise that although I am defining letter now as an int, C is still reading it in as a char on the command line, and just calling it an int isn't enough to change its type from the point of view of subsequent program flow.
Q2. Is the above reasoning correct?
So now I figure that if I change the type of letter to an int using atoi(), things should go OK. So:
int letter = atoi(argv[1][i]);
When I now try compile, I get:
Switch.c:14:27: warning: incompatible integer to pointer conversion passing
'char' to parameter of type 'const char *'; take the address with &
[-Wint-conversion]
int letter = atoi(argv[1][i]);
^~~~~~~~~~
&
/usr/include/stdlib.h:132:23: note: passing argument to parameter here
int atoi(const char *);
I then look up the documentation for atoi() and see that it can only be used to convert a string (or more precisely, a const char *), not a character. I would have thought that since a char * is just a sequence of chars, atoi() would work with both. And apparently there is no equivalent of atoi() for a char, only workarounds, such as the one described here.
Anyway, I decide to take the warning's instructions, and place an ampersand before the value (knowing that this implies a memory address, but not yet knowing why it is being suggested). So:
int letter = atoi(&argv[1][i]);
When I do so, it compiles. And now the program in this final form - with letter defined as an int, with the case statement comparing to an int, and with atoi being passed the address rather than value of argv[1][i] - runs successfully.
But I don't know why, so I strip this down to test the values of argv[1][i] and &argv[1][i] by printing them. I observe that the program will only compile if I use the %s string formatter to print &argv[1][i], as it tells me that &argv[1][i] is a char *.
Q3. Why is &argv[1][i], an address in memory, a char *?
In my printout, I observe that the values of &argv[1][i] and argv[1][i] are the same, namely: 2. So:
Q4. Why didn't the compiler allow me to use argv[1][i], if its value is no different to that of &argv[1][i]?
Q5. Any time I've printed a memory address in a C program, it has always been some long number such as 1246377222. Why is the memory address == the value in this case?
No doubt there will be someone objecting that this mammoth post should be split into separate posts with separate questions, but I think the flow of trial-and-error, and the answers you provide, will help not only me but others looking to understand these aspects of C. Also feel free to suggest a better title for the post.
Many thanks.
Your misunderstanding is in treating 2 and '2' as the same value. Both have integral values, but those values differ. The ASCII value of '2' is not 2; it is 50. This means that int a = '2'; sets a to 50. The standard way of converting a digit character to its numeric value is to write int a = '2' - '0'; which sets a to 2.
argv is an array of char *. This means that argv[j][i] is a char and &argv[j][i] is a char *: the address of the character at argv[j][i]. So atoi(&argv[j][i]) will compile, but it may not do what you expect, because it converts the entire string starting at argv[j][i] into a number rather than only the single character at argv[j][i].
Q1: char and int are "the same" only in the sense that both are integer types (char is usually signed). There are differences: for example, char is always 1 byte long, while int is usually 4 bytes long.
Q2: your reasoning is wrong, because you are comparing the ASCII code of the character '2' (which is 50) with the number 2 (which, as an ASCII code, is a non-printable control character).
Q3: you made a mistake in your debugging - &argv[1][i] is an address of i-th character in the argv[1] string. So essentially this is a pointer. In your debugger you probably saw the character that was "pointed to".
Q4: see above - argv[1][i] is '2', while &argv[1][i] is an address in memory where this '2' can be found.
Q5: you probably made a mistake in debugging - see answer to Q3.
A string in C is a sequence of characters with a null terminator. It will always be referenced by address (of the first character), because its length is variable.
Q4. Why didn't the compiler allow me to use argv[1][i], if its value is no different to that of &argv[1][i]?
They are different
&argv[1][i] is a pointer to a memory position and
argv[1][i] is the value of the char in that position
It is just that
printf("%s", &argv[1][i]); // Prints the c-string at memory position
printf("%c", argv[1][i]); // Prints the char
I assume that when you say "printed" you mean the printf() function.
For Q5: since you say you come from Python, note that id is implemented in CPython as the address of the object. And even in Python, the id of a numeric variable is not the value of the variable.

Not able to understand Obfuscated C code

I am not able to understand this. Please explain.
Edit: It prints: 'hello, world!'
#include <stdio.h>
int i;
main()
{
for(;i["]<i;++i){--i;}"];read('-'-'-',i+++"hell\o, world!\n",'/'/'/'));
//For loop executes once, calling function read with three arguments.
}
read(j,i,p)
{
write(j/p+p,i---j,i/i); //how does it work? like printf?
}
Breaking it down you have:
for({initial expr};{conditional expr};{increment expr})
The '{initial expr}' is blank so it does nothing.
The '{conditional expr}' is 'i["]<i;++i){--i;}"]'
which is the same as
"]<i;++i){--i;}"[i]
or
const char* str = "]<i;++i){--i;}";
for (; str[i]; )
so it's looping until the expression is false (i.e. it hits the null at the end of the string).
The {increment expr} is
read('-'-'-',i+++"hell\o, world!\n",'/'/'/')
If you break that down the read parameters you have:
'-' - '-' == char('-') - char('-') == 0
For parameter two you have:
i+++"hell\o, world!\n"
which is the same as:
i++ + "hell\o, world!\n"
So it increments the 'i' variable, which means the for loop runs once for each character in the condition string "]<i;++i){--i;}".
For the first time around you end up with:
0 + "hell\o, world!\n"
The second time around the loop will be 1 + "hell\o, world!\n", etc.
So the second parameter is a pointer into the "hell\o, world!\n".
The third parameter is:
'/'/'/' == '/' / '/' == char('/') / char('/') == 1
So the third parameter is always 1.
Now we break down the read function that calls write:
write(j/p+p,i---j,i/i);
There are three parameters, the first is:
j/p+p where j == 0, p == 1 so 0/1+1 == 1.
If you read the documentation for the write function, file descriptor 1 is standard output.
The second parameter to write is
i---j
which is the same as i-- - j, where i is the pointer to the string and j == 0. Since i is post-decremented, the decrement doesn't affect the value passed, and '- 0' does nothing, so it simply passes the pointer through to the write function.
The third parameter is 'i / i' which will always be 1.
So for each call to 'read' it writes one character out of the "hell\o, world!\n" string each time.
read('-'-'-',i+++"hell\o, world!\n",'/'/'/')
Calls read with the first argument:
'-' - '-'
So that's the subtraction of a char from itself, i.e. zero.
The second argument is:
i++ + "hell\o, world!\n"
So that is an address within the string constant "hell\o, world!\n" which will depend on the value of i.
The third argument is:
'/' / '/'
A reprise of the arithmetic on character literals theme, this time producing 1.
Rather than the normal read, that call goes to the function defined at the bottom, which actually performs a write.
Argument 1 to the write is:
j/p+p
Which is 0/1+1 = 1.
Argument 2 is:
i-- - j
Which evaluates to the pointer that was passed in (the post-decrement does not affect the value used, and j is 0), i.e. the current position in "hell\o, world!\n".
The third argument is:
i/i
i.e. 1.
So the net effect of the read is to write one byte from the string passed in to file descriptor 1.
It doesn't return anything, though it should, so the result and therefore the exact behaviour of the earlier loop is undefined.
The subscript on i in the for loop is identical to writing:
*((i) + (the string given))
i.e. it grabs a byte from within that string. As the initial value of i is undefined, this could be an out-of-bounds access.
Note that the i within read is local, not the global. So the global one continues to increment, passing along one character at a time, until it gets to a terminating null in the other string literal.
If i were given 0 as an initial value then this code would be correct.
(EDIT: as has been pointed out elsewhere, I was wrong here: i is initially zero because it's a global. Teleologically, it costs nothing at runtime to give globals defined initial values so C does. It would cost to give anything on the stack an initial value, so C doesn't.)
First see the syntax of read and write function in C and what they do:
ssize_t read(int fildes, void *buf, size_t nbyte);
The read() function shall attempt to read nbyte bytes from the file associated with the open file descriptor, fildes, into the buffer pointed to by buf.
ssize_t write(int fildes, const void *buf, size_t nbyte);
The write() function shall attempt to write nbyte bytes from the buffer pointed to by buf to the file associated with the open file descriptor, fildes.
Now, rewriting your for loop as
for(;i["]<i;++i){--i;}"]; read('-' - '-', i++ + "hell\o, world!\n", '/' / '/'));
Starting with i["]<i;++i){--i;}"];
"]<i;++i){--i;}" is a string. In C, if
char ch;
char *a = "string";
then you can write ch = "string"[i], which is equivalent to i["string"] (because a[i] == i[a]; both mean *(a + i)). Here i is 0 at the start because it is a global. So the condition i["]<i;++i){--i;}"] reads the character at index i of that string, and the loop continues until it reaches the terminating \0.
Now the point is that the for loop is not iterating only once!
The expression read('-' - '-', i++ + "hell\o, world!\n", '/' / '/') can be rewritten as (for the sake of convenience);
read(0, i++ + "hell\o, world!\n", 1)
^ read only one byte (one character)
What this actually does is call read with the address of the character at index i of hell\o, world!\n, and then increment i (the call uses the previous value). So the first call of read prints just h. On the next iteration i has been incremented, so the call to read prints the next character.
This will continue until i["]<i;++i){--i;}"] becomes false (at \0).
Overall the behavior of code is undefined!
EXPLANATION for UB:
Note that a function call f(a,b,c) is not a use of the comma operator and the order of evaluation for a, b, and c is unspecified.
Also C99 states that:
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.
Hence the call
write(j/p+p, i-- -j, i/i);
invokes UB. You can't modify a variable and separately read it in the same expression without an intervening sequence point. The compiler may raise a warning
[Warning] operation on 'i' may be undefined [-Wsequence-point]
