Pointers in c language program output - c

I have a question:
char *c[] = {"GeksQuiz", "MCQ", "TEST", "QUIZ"};
char **cp[] = {c+3, c+2, c+1, c};
char ***cpp = cp;
int main()
{
printf("%s ", *--*++cpp+3);
}
I am not able to understand the output, sQUIZ.
My approach: first it will point to cpp+3, i.e. c. Now ++c means pointing to "MCQ", and * of that would give the value "MCQ". I can't understand what the -- before * would do here. Or is my approach totally wrong?

I will post this as an answer, as was suggested in the comments. You should first read this: http://en.wikipedia.org/wiki/Sequence_point. Also look here, and you can find dozens of articles across the Internet about sequence points. This stuff is as BAD as undefined behaviour and unspecified behaviour. You can read this post, especially the part What is the relation between Undefined Behaviour and Sequence Points? in the accepted answer.
Probably this interview question was meant to test your knowledge of sequence points, in which case it is not as bad as I see it, but nevertheless NEVER EVER write such code, even for your pet projects, let alone production code. This is silly.
If they are looking for an experienced C/C++ developer, they shouldn't ask such questions at all.
EDIT
Just a tip about sequence points, because I saw some misunderstandings in the other posted answer and in the comments. The expression *--*++cpp+3 is not unspecified or undefined behaviour (although it is bad code in general), but this IS:
int i =1;
*--*++cpp+i+i++;
The code above reads and modifies i without an intervening sequence point, so its behaviour is undefined. Please read about the differences between undefined behaviour, unspecified behaviour, implementation-defined behaviour and sequence points, e.g. here. I wrote all this to explain why you should avoid such terrible code altogether (whether or not it is legal according to the language standard). Yes, your code is legal, but unreadable, and, as you see in my edit, a small change made it illegal. Do not think we don't want to help you; I mean that code similar to yours is bad code in general, wherever it is asked. It would be better if they asked you to explain WHY such code is bad and fragile; then it would be a good interview question.
P.S. The actual output is an empty string, because you print a null terminator. See the excellent answer below: it explains the output in terms of C operator precedence (you should learn that too, and then such questions will not bother you at all).

All variables in this expression are modified only once. Maybe I don't understand something about sequence points, but I have no idea why people call this expression undefined behavior.
char *c[] = {"GeksQuiz", "MCQ", "TEST", "QUIZ"};
char **cp[] = {c+3, c+2, c+1, c};
char ***cpp = cp;
/*1*/ cpp; // == &cp[0]
/*2*/ ++cpp; // == &cp[1] (`cpp` changed)
/*3*/ *++cpp; // == cp[1] == c+2
/*4*/ --*++cpp; // == c+2-1 == &c[1] (`cp[1]` changed)
/*5*/ *--*++cpp; // == "MCQ"
/*6*/ *--*++cpp+3; // == "MCQ"+3 - a pointer to the terminating '\0'
So it should not print anything.
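The numbered steps above can be checked with a small sketch. The function names evaluate_stepwise and evaluate_oneliner are made up for this illustration, and the arrays are declared local (and const) so that each call starts from the same state; the original question used globals.

```c
#include <stdio.h>

/* Hypothetical helper: performs *--*++cpp+3 one operator at a time. */
const char *evaluate_stepwise(void)
{
    const char *c[] = {"GeksQuiz", "MCQ", "TEST", "QUIZ"};
    const char **cp[] = {c + 3, c + 2, c + 1, c};
    const char ***cpp = cp;

    ++cpp;                   /* step 2: cpp == &cp[1], which holds c + 2 */
    --*cpp;                  /* step 4: cp[1] changes from c + 2 to c + 1 */
    const char *s = **cpp;   /* step 5: c[1], i.e. "MCQ" */
    return s + 3;            /* step 6: address of the terminating '\0' */
}

/* Hypothetical helper: the same expression written as in the question. */
const char *evaluate_oneliner(void)
{
    const char *c[] = {"GeksQuiz", "MCQ", "TEST", "QUIZ"};
    const char **cp[] = {c + 3, c + 2, c + 1, c};
    const char ***cpp = cp;

    return *--*++cpp + 3;
}
```

Passing either result to printf("%s ", ...) prints only the trailing space, because the pointer already sits on the null terminator of "MCQ".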

Related

One string is affecting the size, length and value of another one on C

int main ()
{
/*
char a[] = "abc";
printf("strlen(a): %li", strlen(a));
printf("\nsizeof(a): %li", sizeof(a));
*/
char b[3];
printf("\nstrlen(b): %li", strlen(b));
printf("\nsizeof(b): %li", sizeof(b));
printf("\nb = ");
puts(b);
return 0;
}
When I run the above code it outputs the following:
strlen(b): 1
sizeof(b): 3
b =
but if I undo the comment, it outputs:
strlen(a): 3
sizeof(a): 4
strlen(b): 6
sizeof(b): 3
b = ���abc
Why does this happen? I would appreciate a good in depth explanation about it principally and if possible a quick "fix" for it so I don't get this problem again.
I'm relatively a beginner in programming and C in general and based on what I learned until now, this shouldn't happen.
Thanks, and sorry if I broke any rule of this website, I'm new here too!
strlen(b) causes undefined behavior because the array b is not initialized. The contents of the array are therefore indeterminate. strlen may return a small number if there happens to be a null byte in the garbage contents of the array (acting as a null terminator), or a large number if there is no null byte in the array but there is one in memory adjacent to it (that happens not to crash when accessed), or it may segfault, or fail in some other unpredictable way. The particular misbehavior you observe can easily depend on the contents of other nearby memory and therefore be influenced by adding or removing other variables, or altering surrounding code in apparently unrelated ways.
puts(b) is similarly undefined behavior.
(Another bug: sizeof and strlen both return size_t, for which the correct printf format specifier is %zu, not %li which would be for long int.)
I would appreciate a good in depth explanation about it principally and if possible a quick "fix" for it so I don't get this problem again.
Do not attempt to read or use the contents of local variables that have not been initialized.
See also What happens to a declared, uninitialized variable in C? Does it have a value? and (Why) is using an uninitialized variable undefined behavior?.
If you enable compiler warnings, your compiler can warn you about some instances of this, e.g. gcc catches this example. Tools like valgrind can help too.
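A minimal sketch of the "quick fix", assuming the intent is for b to start out as an empty string: initialize the array when it is declared, and print size_t values with %zu. The function names below are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the array is initialized, so strlen is well defined. */
size_t initialized_length(void)
{
    char b[3] = "";   /* all three bytes are set to '\0' */
    return strlen(b);
}

/* Hypothetical helper: prints the same report as the question, with %zu. */
void report(void)
{
    char b[3] = "";
    printf("strlen(b): %zu\n", strlen(b));
    printf("sizeof(b): %zu\n", sizeof(b));
    printf("b = ");
    puts(b);
}
```

With the initializer in place, the result no longer depends on whatever happens to sit next to b in memory.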
I'm relatively a beginner in programming and C in general and based on what I learned until now, this shouldn't happen
On the contrary, such behavior is extremely common in C. The C language does not guarantee any checks for bugs like this, and implementations generally don't provide them. You should get used to the possibility that the language will not stop you from doing something erroneous, and will instead misbehave in unpredictable ways (or worse, appear to work just fine for a while). As a result, when programming in C, you have to be much more careful and attentive to the language rules than when working with "safer" languages. It's a tough and unfriendly language for beginners.

Where can I find weird, specific C syntax rules?

I will take an exam and my teacher asks weird C syntax rules. Like:
int q=5;
for(q=-2;q=-5;q+=3) { //assignment in condition part??
printf("%d",q); //prints -5
break;
}
Or
int d[][3][2]={4,5,6,7,8,9,10,11,12,13,14,15,16};
int i=-1;
int j;
j=d[i++][++i][++i];
printf("%d",j); //prints 4?? why j=d[0][0][0] ?
Or
extern int a;
int main() {
do {
do {
printf("%o",a); //prints 12
} while(!1);
} while(0);
return 0;
}
int a=10;
I could not find these rules on any site or in any book. They are really absurd and uncommon. Where can I find them?
To me it seems that your teacher is asking questions which involve undefined behavior.
If you tell him that this is incorrect, you're directly confronting him.
However, you could do the following:
Compile the code on different platforms
Compile the code with different compilers
Compile the code with different versions of the same compiler
Build a matrix with the results. You'll find out that they differ
Show the results to your teacher and ask him to explain why that happens
That way you do not say that he's wrong, you're just showing some facts and you're showing that you're willing to learn and work.
Do that long before the exam so that the teacher can look into it and think about his questions so that he can change the exam in time.
I could not find these rules on any site or in any book. Where can I find them?
See Where do I find the current C or C++ standard documents?. If you have a good library at university, they should own a copy.
Concerning for(q=-2;q=-5;q+=3) {, all you need to do is break this down into its components. q=-2 is run first, then q=-5 is evaluated and tested, and since that is not 0 (it is an expression with value -5), the loop body runs once. Then break forces a premature exit from an otherwise infinite loop. The expression q+=3 is therefore never reached.
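The breakdown of the loop can be sketched as a function that returns the value the body sees. The name first_loop_value and the variable seen are invented for this example; the explicit != 0 only spells out the test that the original condition performs implicitly.

```c
/* Hypothetical helper: returns the value of q inside the loop body. */
int first_loop_value(void)
{
    int q = 5;
    int seen = 0;

    /* init: q = -2; condition: assign -5 to q, then test it against 0 */
    for (q = -2; (q = -5) != 0; q += 3) {
        seen = q;   /* q is -5 here, because the condition assigned it */
        break;      /* leaves the loop before q += 3 can run */
    }
    return seen;
}
```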
The behaviour of d[i++][++i][++i] is undefined. Tell your teacher that, tactfully.
The "%o" format denotes octal output. a is set to 10 in decimal which is 12 in octal. Your code would be clearer if you had written:
int a=012; // octal constant.
The online version of the C language standard has what you need (and is what I will be referring to in this answer); just bear in mind it is a language definition and not a tutorial, and as such may not be easy to read for someone who doesn't have a lot of experience yet.
Having said that, your teacher is throwing you a few foul balls. For example:
j=d[i++][++i][++i];
This statement results in undefined behavior for several reasons. The first several paragraphs of section 6.5 of the document linked above explain the problem, but in a nutshell:
Except in a few situations, C does not guarantee left-to-right evaluation of expressions; neither does it guarantee that side effects are applied immediately after evaluation;
Attempting to modify the value of an object more than once between sequence points [1], or modifying and then trying to use the value of an object without an intervening sequence point, results in undefined behavior.
Basically, don't write anything of the form:
x = x++;
x++ * x++;
a[i] = i++;
a[i++] = i;
C does not guarantee that each ++i and i++ is evaluated from left to right, and it does not guarantee that the side effect of each evaluation is applied immediately. So the result of d[i++][++i][++i] is not well-defined, and the result will not be consistent over different programs, or even different builds of the same program [2].
AND, on top of that, i++ evaluates to the current value of i; so clearly, your teacher's intent was for d[i++][++i][++i] to evaluate to d[-1][1][2], which would also result in undefined behavior since you're attempting to index outside of the array bounds.
This is why I hate, hate, hate it when teachers throw this kind of code at their students - not only is it needlessly confusing, not only does it encourage bad practice, but more often than not it's just plain wrong.
As for the other questions:
for(q=-2;q=-5;q+=3) { //assignment in condition part??
See sections 6.5.16 and 6.8.5.3. In short, an assignment expression has a value (the value of the left operand after any type conversions), and it can appear as part of a controlling expression in a for loop. As long as the result of the assignment is non-zero (as in the case above), the loop will execute.
printf("%o",a); //prints 12
See section 7.21.6.1. The o conversion specifier tells printf to format the integer value as octal: 10 (base 10) == 12 (base 8).
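A short sketch of the octal conversion, using snprintf so the result can be inspected as a string. The helper name to_octal is an assumption made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: writes the octal representation of value into out. */
void to_octal(unsigned int value, char *out, size_t size)
{
    /* %o expects an unsigned int and prints it in base 8 */
    snprintf(out, size, "%o", value);
}
```

Calling to_octal(10, buf, sizeof buf) leaves the text 12 in buf, which is what the teacher's program prints.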
[1] A sequence point is a point in a program's execution where an expression has been fully evaluated and any side effects have been applied. Sequence points occur at the ends of statements, between the evaluation of a function's parameters and the function call, after evaluating the left operand of the &&, ||, and ?: operators, and a few other places. See Annex C for the complete list.
[2] Or even different runs of the same build, although in practice you won't see values change from run to run unless you're doing something really hinky.

how is an expression involving several ^= operators evaluated?

#include<stdio.h>
int main(){
int arr[ 5 ] = { 1, 2, 3, 4, 5 };
int *f = arr;
int *l = (4+arr);
while(f<l){
*f^=*l^=*f^=*l;
++f; --l;
}
printf("\n%d\t%d\t%d\n", *arr, *f, *l);
return 0;
}
My output is 1 3 3 on paper, but the compiler is showing 0 3 3.
Could anyone please explain it to me?
Thanks in advance.
*f^=*l^=*f^=*l;
The evaluation of the operands of ^= is not sequenced, and you use the same variables several times in the same expression, with no sequence point in between.
This means that the behavior of the program is undefined. Nobody can know how that expression will be evaluated and anything can happen. The program may crash or the output can be anything.
You have to fix this bug by changing the code into this:
*f ^= *l;
*l ^= *f;
*f ^= *l;
Then each semi-colon will introduce a sequence point and there are no order of evaluation issues.
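Put together, the corrected loop might look like the sketch below. The function name reverse_xor is invented here, and the sketch assumes the goal of the original program was to reverse the array in place.

```c
#include <stddef.h>

/* Hypothetical helper: reverses arr in place with a three-statement XOR swap. */
void reverse_xor(int *arr, size_t n)
{
    if (n < 2)
        return;

    int *f = arr;
    int *l = arr + n - 1;

    /* f < l guarantees f and l never point at the same element, which
       matters because XOR-swapping an element with itself zeroes it. */
    while (f < l) {
        *f ^= *l;   /* each statement ends at a sequence point */
        *l ^= *f;
        *f ^= *l;
        ++f;
        --l;
    }
}
```

A plain swap through a temporary variable does the same job and is usually easier to read; the XOR form is kept only to mirror the question.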
Standard references.
I actually don't care one bit how this is evaluated. If you have code and ask "what exactly does this code do", then the correct answer is "don't write that kind of code". (Except if you are writing a compiler, in which case the answer is "you shouldn't be writing compilers if you ask on stack overflow how some code should be executed").
In addition, the result is undefined behaviour in C, C++ before C++ 11, and Objective-C, so that's a good reason not to do it where it is defined. In addition, it has zero chance to pass any code review, and there is a rule "always assume that the next maintenance programmer reading your code is a violent sociopath who knows your home address".
The output is correct:
*arr = ((1^5)^1)^5;
which indeed is 0. Note that *arr is modified during the first iteration and that's all you need to worry about. In the second and later iterations you're not modifying *arr anymore. At that point f has been updated and doesn't point to arr anymore.
A few lines explaining the maths:
1^5 = 4
4^1 = 5
5^5 = 0
EDIT:
I assumed that the language guarantees that the order of evaluation is from left to right. As pointed out by Lundin this is not the case. However, judging by the output of the executable that's most likely the way that the compiler has handled it.

C Exam Char Array

I am working through a previous year's C programming exam, and I came across this:
A program (see below) defines the two variables x and y.
It produces the given output. Explain why the character ‘A’ appears in the output of variable x.
Program:
#include <stdio.h>
main ()
{
char x[6] = "12345\0";
char y[6] = "67890\0";
y[7]='A';
printf("X: %s\n",x);
printf("Y: %s\n",y);
}
Program output:
X: 1A345
Y: 67890
It has pretty high points (7). And I don't know how to explain it in detail. My answer would be:
The char array (y) only has 6 chars allocated, so changing the character at index 7 will change whatever comes after it on the stack.
Any help would be highly appreciated! (I'm only in 1st year)
Your formal answer should be that this program yields undefined behavior.
The C-language standard does not define the result of an out-of-bound access operation.
With char y[6], by reading from or writing into y[7], this is exactly what you are doing.
Some compilers may choose to allocate array x[6] immediately after array y[6] on the stack.
So by writing 'A' into y[7], this program might indeed write 'A' into x[1].
But the standard does not dictate that, so it depends on compiler implementation.
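Since C performs no bounds check on its own, one defensive option is to check the index before writing. This is only a sketch; the helper name set_char_checked and its return convention are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical helper: writes ch at buf[index] only if index is in range.
   Returns 0 on success and -1 if the write would be out of bounds. */
int set_char_checked(char *buf, size_t size, size_t index, char ch)
{
    if (buf == NULL || index >= size)
        return -1;   /* refuse the write instead of corrupting a neighbour */
    buf[index] = ch;
    return 0;
}
```

With char y[6], a call such as set_char_checked(y, sizeof y, 7, 'A') is rejected, so x is never touched regardless of how the compiler lays out the stack.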
As others have implied on previous comments to your question, if it was really given on a formal exam, then you may want to consider continuing your studies elsewhere...
The classic stack corruption problem in C. With the help of a debugger, you will find that your stack frame looks like this after the original assignments:
67890\012345\0
y points to the char 6. y[7] means 7 positions after that (2). So y[7] = 'A' replaces the char 2.
Accessing an array beyond its bounds is undefined in the C standard, just one more quirk of C to be aware of. Some references:
Understanding stack corruption
Why do compilers not warn about out-of-bounds static array indices?

C char array overflow, okay practice?

I'm going through the K & R book and the answer to one of the exercises is troubling me.
In the solutions manual, exercise 1-22 declares a char array:
#define MAXCOL 10
char line[MAXCOL];
My understanding is that in C arrays go from 0 ... n-1. If that's the case, then the above declaration should allocate memory for a char array of length 10, starting with 0 and ending with 9. More to the point, line[10] is out of bounds according to my understanding? A function in the sample program is eventually passed an integer value pos that is equal to 10, and the following comparison takes place:
int findblnk(int pos) {
while(pos > 0 && line[pos] != ' ')
--pos;
if (pos == 0) //no blanks in line ?
return MAXCOL;
else //at least one blank
return pos+1; //position after blank
}
If pos is 10 and line[] is only of length 10, then isn't line[pos] out of bounds for the array?
Is it okay to make comparisons this way in C, or could this potentially lead to a segmentation fault? I am sure the solutions manual is right this just really confused me. Also I can post the entire program if necessary. Thanks!
Thanks for the speedy and very helpful responses, I guess it is definitely a bug then. It is called through the following branch:
else if (++pos >= MAXCOL) {
pos = findblnk(pos);
printl(pos);
pos = newpos(pos);
}
MAXCOL is defined as 10 as stated above. So for this branch findblnk(pos) pos would be passed 10 as a minimum.
Do you think the solution manual for K & R is worth going through or is it known for having buggy code examples?
It is never, ever okay to over-run the bounds of an array in C. (Or any language really).
If 10 is really passed to that function, that is certainly a bug. While there are better ways of doing it, that function should at least verify that pos is within the bounds of line before attempting to use it as an index.
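One way to add such a check is sketched below. Clamping pos to the last valid index is an assumption about what the solution intended, not something the manual states, and findblnk_checked and load_line are names made up for this example.

```c
#include <string.h>

#define MAXCOL 10
char line[MAXCOL];

/* Hypothetical helper: copies at most MAXCOL characters of text into line. */
void load_line(const char *text)
{
    size_t n = strlen(text);
    if (n > MAXCOL)
        n = MAXCOL;
    memset(line, 0, MAXCOL);
    memcpy(line, text, n);
}

/* Hypothetical variant of findblnk that never indexes outside line. */
int findblnk_checked(int pos)
{
    if (pos >= MAXCOL)
        pos = MAXCOL - 1;   /* assumption: start at the last valid index */
    if (pos < 0)
        pos = 0;

    while (pos > 0 && line[pos] != ' ')
        --pos;

    if (pos == 0)           /* no blanks in line */
        return MAXCOL;
    return pos + 1;         /* position after the blank */
}
```

After load_line("abc defghi"), calling findblnk_checked(10) scans from index 9 down to the blank at index 3 and returns 4, without ever reading line[10].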
If pos is indeed 10, then it would be an out-of-bounds access. Accessing an array out of bounds is undefined behavior, so anything can happen, including a program that appears to work properly at the time; the results are unreliable. The draft C99 standard's Annex J.2, undefined behavior, contains the following bullet:
An array subscript is out of range, even if an object is apparently accessible with the
given subscript (as in the lvalue expression a[1][7] given the declaration int
a[4][5]) (6.5.6).
I don't have a copy of K&R handy but the errata does not list anything for this problem. My best guess is the condition should be < instead of >=.
The code above is fine as long as pos == 9 when it's passed to that function. If pos == 10 when it's passed, then it's undefined behaviour and you are correct, it should be avoided.
However, it may or may not give a segmentation fault.
my_type buffer[SOME_CONSTANT_NAME]; almost always is a bug.
Code like the one you present in the question is the source of the majority of security problems: when the buffer overflows, it invokes undefined behaviour, and that undefined behaviour (if it does not directly crash the program) can frequently be exploited by attackers to execute their own code within your process.
So, my advice is to stay away from all fixed buffer sizes and either use C++'s std::vector<> or dynamically allocate enough memory to fit. The asprintf() function and friends make this quite easy even in C, although asprintf() is a GNU/BSD extension rather than part of the POSIX 2008 standard; POSIX 2008 itself added getline() and open_memstream().
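Where asprintf() is not available, a similar effect can be sketched with standard C99 calls: ask vsnprintf for the required length first, then allocate exactly that much. The function name format_alloc is hypothetical, and the caller is assumed to free the returned buffer.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a freshly allocated, formatted string,
   or NULL on failure. The caller must free() the result. */
char *format_alloc(const char *fmt, ...)
{
    va_list args, copy;
    char *buf = NULL;

    va_start(args, fmt);
    va_copy(copy, args);

    /* With a NULL buffer and size 0, vsnprintf only reports the length
       the output would need, not counting the terminating '\0'. */
    int needed = vsnprintf(NULL, 0, fmt, copy);
    va_end(copy);

    if (needed >= 0) {
        buf = malloc((size_t)needed + 1);
        if (buf != NULL)
            vsnprintf(buf, (size_t)needed + 1, fmt, args);
    }

    va_end(args);
    return buf;
}
```

The buffer is sized from the data instead of from a constant such as MAXCOL, so there is no fixed limit to overrun.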

Resources