sub expression and grouping of subexpressions - c

I'm new to the C language. Does "precedence" determine the grouping of subexpressions? Can you explain how grouping works?
Please explain the strange results I get with i=7: ++i+++i+++i; gives an error, while adding spaces, as in ++i + ++i + ++i;, gives no error and the answer is 22 in GCC. How does this output come about?
I also checked books; most of them give a "precedence" order and then some "associativity rules", but no clear explanation of grouping.
Can you explain what to do whenever I see this kind of mixed expression? Almost every C language aptitude test asks this type of question.

This is a duplicate of a few questions on SO, but here goes.
Maximal Munch
The C lexer grabs as many characters as it can for each token when it splits your program into tokens. In ++i+++i+++i; it splits the string into:
++
i
++
+
i
++
+
i
;
It then sees that preincrement (token 1) and postincrement (token 3) are both applied to the first i (token 2), and reports an error, because the result of i++ is not an lvalue that ++ can modify. The parser does not backtrack and reparse the string to use + for token 3 and ++ for token 4. If the compiler had the license to do this, a malicious program could take an arbitrarily long time to parse.
Multiple Side-Effects
C and its family of languages define a sequence point as a point in a program's execution where all side effects of previous evaluations are complete. It is undefined behavior to modify a variable more than once between sequence points. Simplify your example a bit. What could this code do? I have changed a preincrement to a predecrement so I can talk about them more easily.
int j = ++i + --i;
1. Increment i.
2. Use the incremented value for the first summand.
3. Decrement i.
4. Use the decremented value for the second summand.
5. Add the two values and assign to j.
However, the C standard does not fix the order of these effects except that step 1 must precede step 2, step 3 must precede step 4, and step 5 must be last. What your compiler does need not be what another compiler does, and it need not be consistent, even in the same program. As the joke in the Jargon File goes:
nasal demons, n.
Recognized shorthand on the Usenet group comp.std.c for any unexpected behavior of a C
compiler on encountering an undefined construct. During a discussion on that group in early
1992, a regular remarked “When the compiler encounters [a given undefined construct] it is
legal for it to make demons fly out of your nose” (the implication is that the compiler may
choose any arbitrarily bizarre way to interpret the code without violating the ANSI C
standard). Someone else followed up with a reference to “nasal demons”, which quickly
became established. The original post is web-accessible at http://groups.google.com/groups?hl=en&selm=10195%40ksr.com.

Related

Need to understand syntax in C program

I have been tasked with studying and modifying a C program. Generally, I write code in pl/sql, but not C. I have been able to decipher most of the code, but the program flow is still eluding me. After looking up several C references guides, I am not understanding how the C code works. I'm hoping someone here can answer a few syntax questions and tell me what each statement is trying to do.
Here is one sample, with my guesses below.
input(ask_fterm,TM_NLS_Get("0004","FROM TERM: "),6,ALPHA);
if ( !*ask_fterm ) goto opt_fterm;
tmstrcpy(fterm,ask_fterm);
goto nextparmb;
opt_fterm:
tmstrcpy(parm_no,_TMC("02"));
sel_optional_ind(FIRST_ROW);
if ( compare(rpt_optional_ind,_TMC("O"),EQS) ) goto nextparmb;
goto missing_parms;
First, I don't understand !*. What does the exclamation-asterisk combination do?
Second, I assume that if must be ended with endif, unless it is on a single line?
Third, tmstrcpy() apparently copies the value of the 2nd parameter into the 1st parameter?
I also have several parameters which I don't understand. I'm hoping someone gives me a hint.
tmstrcpy(valid_ind,_TMC("N"));
input(ask_toterm,TM_NLS_Get("0005","TO TERM: "),6,ALPHA);
I don't know where to find _TMC and TM_NLS_Get.
First, I don't understand !*. What does the exclamation-asterisk combination do?
That's two separate operators. ! is logical negation. Unary * is for dereferencing a pointer. Put together, they each have their separate effect, so !*ask_fterm means determine the value of the object to which pointer ask_fterm points (this is *); if that value is 0 then the result is 1, else the result is 0 (this is !). If ask_fterm is a pointer to the first character of a string, then that's a check for whether the string is empty (zero-length), because C strings are terminated by a character with value 0.
Second I assume that if must be ended with endif, unless it is on a single line?
There is no endif in C. An if construct controls exactly one statement, but that can be and often is a compound one (which you can recognize by the { and } delimiters enclosing it). There may also be an else clause, also controlling exactly one statement, which can be a compound one.
Third, tmstrcpy() apparently copies the value of the 2nd parameter into the 1st parameter?
That appears to be a user-defined function. It is certainly not from the C standard library. If I were to guess based on the name and usage, I would guess that it copies a trimmed version of the string to which the right-hand argument points into the space to which the left-hand argument points.
I don't know where to find _TMC and TM_NLS_Get.
Those are not standard C features. Possibly they are recognized directly by your C implementation, or possibly they are macros defined earlier in the file or in one of the header files it includes.

What is the syntax in c to combine statements as a parameter

I have an inkling there is an old nasty way to get a function run as a parameter is calculated, but since I do not know what it is called I cannot search out the rules.
An example
char dstr[20];
printf("a dynamic string %s\n", (prep_dstr(dstr),dstr));
The idea is that the "()" will return the address dstr after having executed the prep_dstr function.
I know it is ugly and I could just do it on the line before - but it is complicated...
#
Ok - in answer to the pleading not to do it.
I am actually doing a MISRA cleanup on some existing code (not mine, don't shoot me). Currently the 'prep_dstr' function takes a buffer, modifies it (without regard to the length of the buffer), and returns the pointer it was passed as a parameter.
I like to take a small step - test then another small step.
So - a slightly less nasty approach than returning a pointer with no clue about its persistence is to stop the function returning a pointer and use the comma operator (after making sure it does not romp off the end of the buffer).
That gets the MISRA error count down, when it all still works and the MISRA errors are gone I will try to get around to elegance - perhaps the year after next :).
The comma operator has the appropriate precedence and, besides, it introduces a sequence point, that is, a point in the execution of the program where all previous side effects are complete.
So, whatever your function prep_dstr() does to the string dstr, it is completely performed before the right operand of the comma is evaluated.
On the other hand, the comma operator yields an expression whose value is that of its rightmost operand.
The following examples give you the value dstr, as you want:
5+3, prep_dstr(dstr), sqrt(25.0), dstr;
a+b-c, NULL, dstr;
(prep_dstr(dstr), dstr);
Of course, such expression can be used wherever you need the string dstr.
Therefore, the syntax you employed in the question does the job perfectly.
Since you are open to play with the syntax, there is another possibility you can use.
Since a call to printf() is itself an expression,
it can be put in a comma expression:
prep_dstr(dstr), printf("Show me the string: %s\n", dstr);
It seems that everybody is telling you "don't write code this way" and so on.
This kind of religious advice about programming style is overrated.
If you need to do something, just do it.
One of the principles of C says: "Don't prevent the programmer from doing what needs to be done."
However, whatever you do, try to write readable code.
Yes, the syntax you use will work for your purpose.
However, please consider writing clean and readable code. For instance,
char buffer[20];
char *destination = prepare_destination_string(buffer);
printf("a dynamic string %s\n", destination);
Everything can be cleanly named and understood, and the intended behaviour is easy to infer. You could even omit certain parts if you wished, like destination, or add error checking more easily.
Your inkling and your code are both correct. That said, please don't do this. Putting prep_dstr on its own line makes it much easier to reason about what happens and when.
What you're thinking of is the comma operator. In a context where the comma doesn't already have another meaning (such as separating function arguments), the expression a, b has the value of b, but evaluates a first. The extra parentheses in your code cause the comma to be interpreted this way, rather than as a function argument separator.

3 plus symbols between two variables (like a+++b) in C [duplicate]

#include <stdio.h>
int main()
{
int a=8,b=9,c;
c=a+++b;
printf("%d%d%d\n",a,b,c);
return 0;
}
The program above outputs a=9, b=9 and c=17. In a+++b, why does the compiler take a++ and then add b? Why does it not take a + and ++b? Is there a specific name for this a+++b? Please help me to understand.
I like the explanation from Expert C Programming:
The ANSI standard specifies a convention that has come to be known as
the maximal munch strategy. Maximal munch says that if there's more
than one possibility for the next token, the compiler will prefer to
bite off the one involving the longest sequence of characters. So the
example will be parsed
c = a++ + b;
Read about the maximal munch principle:
"maximal munch" or "longest match" is the principle that when creating some construct, as much of the available input as possible should be consumed.
Every compiler has a tokenizer, which is a component that splits a source file into distinct tokens (keywords, operators, identifiers etc.). One of the tokenizer's rules is called "maximal munch", which says that the tokenizer should keep reading characters from the source file until adding one more character causes the current token to stop making sense.
The order of operations in C dictates that unary operators have higher precedence than binary operators, but by the time precedence applies the tokenizer has already split a+++b into a, ++, + and b.
You could use a + (++b) if you wanted b to be incremented first.

"Simplify" to one line

Just doing my homework and discovered this piece:
A[j]=A[j-1];
j--;
Is there a way to simplify this to one line? Edit: to one statement?
I've tried
A[j--]=A[j];
but it doesn't seem to work well.
The code is from an insertion sort algorithm.
Edit: this question is not required for my homework, I am just curious.
From the standard:
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.
That is, A[j] = A[--j]; will result in undefined behavior. Don't do it. A[j]=A[j-1]; j--; is perfectly clear, concise, and satisfactory.
If the goal is just to eliminate the ; in the middle so you can use this in a macro context or as a single statement without braces, try using the comma operator:
A[j]=A[j-1], j--;
or if you want the assigned value as the result of the expression:
j--, A[j+1]=A[j];
Both should generate identical code on a decent compiler if the result of the expression is not used.
As others have said, any attempt to do this without the comma operator will result in undefined behavior due to sequence point issues. If you don't have a good reason for condensing code like this, I would recommend not even doing it. Unless you're very experienced with C, you're almost sure to mess it up and introduce subtle bugs (some of which may manifest not with your current compiler, but in future versions of it, creating hell for whoever gets stuck debugging the code).
There is actually a way:
A[j+1]=A[--j];
It happens to work in VC, but it is still undefined behavior in every compiler, VC and g++ included, so it cannot be relied on.

Syntactic errors

I'm writing code for the exercise 1-24, K&R2, which asks to write a basic syntactic debugger.
I made a parser with states normal, dquote, squote etc...
So I'm wondering if a code snippet like
/" text "
is allowed in the code? Should I report this as an error? (The problem is my parser goes into comment_entry state after / and ignores the ".)
Since a single / just means division it should not be interpreted as a comment. There is no division operator defined for strings, so something like "abc"/"def" doesn't make much sense, but it should not be a syntax error. Figuring out if this division is possible should not be done by the parser, but be left for later stages of the compilation to be decided there.
That is syntactically valid, but not semantically. It should parse as the division operator followed by a string literal. You can't divide stuff by a string literal, so it's not legal code, overall.
Comments start with a two-character token, /*, and end with */.
As a standalone syntactical element this should be reported as an error.
Theoretically (as part of an expression) it would be possible to write
a = b /"text"; /* a = b divided by the address of string literal "text" */
which is also wrong (you can't divide by a pointer).
But on the surface it would seem okay, because it would syntactically decode as: variable operator variable operator constant-expression (address of string).
The real error would have to be caught in a later stage of analysis (i.e. when checking whether the operand types are suitable for the division operator).

Resources