Syntactic errors - c

I'm writing code for the exercise 1-24, K&R2, which asks to write a basic syntactic debugger.
I made a parser with states normal, dquote, squote etc...
So I'm wondering if a code snippet like
/" text "
is allowed in the code? Should I report this as an error? (The problem is my parser goes into comment_entry state after / and ignores the ".)

Since a single / just means division it should not be interpreted as a comment. There is no division operator defined for strings, so something like "abc"/"def" doesn't make much sense, but it should not be a syntax error. Figuring out if this division is possible should not be done by the parser, but be left for later stages of the compilation to be decided there.

That is syntactically valid, but not semantically. It should parse as the division operator followed by a string literal. You can't divide stuff by a string literal, so it's not legal code, overall.
Comments start with a two-character token, /*, and end with */.

As a standalone syntactical element this should be reported as an error.
Theoretically (as part of an expression) it would be possible to write
a= b /"text"; / a = b divided through address of string literal "text"
which is also wrong (you can't divide through a pointer).
But on the surface level would seem okay because it would syntactically decode as: variable operator variable operator constant-expression (address of string).
The real error would probably have to be caught in a deeper state of syntactical analysis (i.e. when checking if given types are suitable for the division operator).

Related

Need to understand syntax in C program

I have been tasked with studying and modifying a C program. Generally, I write code in pl/sql, but not C. I have been able to decipher most of the code, but the program flow is still eluding me. After looking up several C references guides, I am not understanding how the C code works. I'm hoping someone here can answer a few syntax questions and tell me what each statement is trying to do.
Here is one sample, with my guesses below.
input(ask_fterm,TM_NLS_Get("0004","FROM TERM: "),6,ALPHA);
if ( !*ask_fterm ) goto opt_fterm;
tmstrcpy(fterm,ask_fterm);
goto nextparmb;
opt_fterm:
tmstrcpy(parm_no,_TMC("02"));
sel_optional_ind(FIRST_ROW);
if ( compare(rpt_optional_ind,_TMC("O"),EQS) ) goto nextparmb;
goto missing_parms;
First, I don't understand !*. What does the exclamation asterisk combination?
Second I assume that if must be ended with endif, unless it is on a single line?
Third tmstrcopy() apparently copies the value of the 2nd parameter into the 1st parameter?
I also have several parameters which I don't understand. I'm hoping someone gives me a hint.
tmstrcpy(valid_ind,_TMC("N"));
input(ask_toterm,TM_NLS_Get("0005","TO TERM: "),6,ALPHA);
I don't know where to find _TMC and TM_NLS_Get.
First, I don't understand !*. What does the exclamation asterisk combination?
That's two separate operators. ! is logical negation. Unary * is for dereferencing a pointer. Put together, they each have their separate effect, so !*ask_fterm means determine the value of the object to which pointer ask_fterm points (this is *); if that value is 0 then the result is 1, else the result is 0 (this is !). If ask_fterm is a pointer to the first character of a string, then that's a check for whether the string is empty (zero-length), because C strings are terminated by a character with value 0.
Second I assume that if must be ended with endif, unless it is on a single line?
There is no endif in C. An if construct controls exactly one statement, but that can be and often is a compound one (which you can recognize by the { and } delimiters enclosing it). There may also be an else clause, also controlling exactly one statement, which can be a compound one.
Third tmstrcopy() apparently copies the value of the 2nd parameter into the 1st parameter?
That appears to be a user-defined function. It is certainly not from the C standard library. If I were to guess based on the name and usage, I would guess that it copies a trimmed version of the string to which the right-hand argument points into the space to which the left-hand argument points.
I don't know where to find _TMC and TM_NLS_Get.
Those are not standard C features. Possibly they are recognized directly by your C implementation, or possibly they are macros defined earlier in the file or in one of the header files it includes.

What is the syntax in c to combine statements as a parameter

I have an inkling there is an old nasty way to get a function run as a parameter is calculated, but sine I do not know what it is called I cannot search out the rules.
An example
char dstr[20];
printf("a dynamic string %s\n", (prep_dstr(dstr),dstr));
The idea is that the "()" will return the address dstr after having executed the prep_dstr function.
I know it is ugly and I could just do it on the line before - but it is complicated...
#
Ok - in answer to the pleading not to do it.
I am actually doing a MISRA cleanup on some existing code (not mine don't shoot me), currently the 'prep_dstr' function takes a buffer modifies it (without regard to the length of the buffer) and returns the pointer it was passed as a parameter.
I like to take a small step - test then another small step.
So - a slightly less nasty approach than returning a pointer with no clue about its persistence is to stop the function returning a pointer and use the comma operator (after making sure it does not romp off the end of the buffer).
That gets the MISRA error count down, when it all still works and the MISRA errors are gone I will try to get around to elegance - perhaps the year after next :).
Comma operator has the appropriate precedence and, besides, it gives a sequence point, that is, it defines a point in the execution flow of the program where all the previous side effects are resolved.
So, whatever your function prep_dstr() does to the string dstr, it's completely performed before the comma operator is reached.
On the other hand, comma operator gives an expression whose value is the rightest operand.
The following examples give you the value dstr, as you want:
5+3, prep_dstr(dstr), sqrt(25.0), dstr;
a+b-c, NULL, dstr;
(prep_dstr(dstr), dstr);
Of course, such expression can be used wherever you need the string dstr.
Theerefore, the syntax you employed in the question, then, it does the job perfectly.
Since you are open to play with the syntax, there is another possibility you can use.
By taking in account that the function printf() is a function, it is, in particular, an expression.
In this way, it can be put in a comma expression:
prep_dstr(dstr), printf("Show me the string: %s\n", dstr);
It seems that every body is telling you that "don't write code in this way and so and so...".
This kind of religious advices in the programming style are overestimated.
If you need to do something, just do it.
One of the principles of C says: "Don't prevent the programmer of doing what have be done."
However, whatever you do, try to write readable code.
Yes, the syntax you use will work for your purpose.
However, please consider writing clean and readable code. For instance,
char buffer[20];
char *destination = prepare_destination_string(buffer);
printf("a dynamic string %s\n", destination);
Everything can be cleanly named & understood, and intended behaviour easy to infer. You could even omit certain parts if you so would, like destination, or perform easier error checking.
Your inkling and your code are both correct. That said, please don't do this. Putting prep_dstr on its own line makes it much easier to reason about what happens and when.
What you're thinking of is the comma operator. In a context where the comma doesn't already have another meaning (such as separating function arguments), the expression a, b has the value of b, but evaluates a first. The extra parentheses in your code cause the comma to be interpreted this way, rather than as a function argument separator.

VHDL Array element in if-statement

This is my first work with VHDL so it's surely something basic but just don't know what to do.
I have this code:
--this is in the architecture segment
type my_code is array(0 to 15) of integer;
signal code: my_code;
....
--here I use the array
code(count) <=0; --I save a value into the array on position defined by the count variable
if (code(0) = '0') then --fail line (want to do something if the first element is 0)
--do something
end if;
The compiler stops me because "can not have such operands in this context". The problem is on line with the if statement. What is wrong with that?
I am basically working on a digital lock like you write a code and it will open or remain closed if the code is wrong so I just want to check the array of pressed keys if the code in there is right.
Sorry for bothering but I just don't get it. Thanks and have a nice day ^^
You have an array of integers, so code(0) must be an integer. You cannot compare an integer to character literal '0'.
Either check for code(0) = 0 or redefine your array as type my_code is array(0 to 15) of bit; You can use bit or std_logic, or any other type that has '0' as valid element.
code is an array of integers. You are trying to compare it to a character literal '0', which has no interpretation as an enumeration literal in the context of integers. Try 0 instead.
Also the parentheses surrounding the expression (code(0) = '0') are redundant.
There is no operator defined to compare an integer to a type represented by a character literal in the code context (which you haven't display in your example, not having any context clauses visible nor other declarations).
The "=" equality operator to use is selected by comparing the input operands and result type. There is no operator visible by selection providing a left argument of integer and a right argument of some (not known from your code fragment) type represented by the character literal '0' as an enumeration literal.
Operators that can be selected among several possibilities based on operand type are said to be overloaded.
After a bit of Googling and finding the answer to the general question raised by this Xilinx error message to always dwell on the immediate case I found a few places you can look to understand the issue.
Reference texts:
There are two sub sections in IEEE Standard VHDL Language Reference Manual (the LRM, e.g. IEEE Std 1076-2008, -1993), entitled Subprogram overloading (operators on non predefined typed are functions) in the section Subprograms and packages, and The context of overload resolution in the section Scope and visibility.
VHDL: HARDWARE DESCRIPTION AND DESIGN, Lipsett, Schaefer and Ussery, 1989, Kluwer Academic Publishers.
Also Peter Ashenden and Jim Lewis's 3rd Edition of The Designer's Guide to VHDL, Morgan Kaufman, 2008.
The solution can be either expanding or adding a context clause for missing references to overloaded operators or fixing (as in your case) a syntax error. Which of the two can be case specific.

"Simplify" to one line

just doing my Homeworks and discovered this piece
A[j]=A[j-1];
j--;
is there a way to simplify this to one line? edit one statement?
I've tried
A[j--]=A[j];
but it doesn't seem to work well.
the code is from an InsertSort algorithm
edit this question is not required to do my homework, i am just curious
From the standard:
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.
That is, A[j] = A[--j]; will result in undefined behavior. Don't do it. A[j]=A[j-1]; j--; is perfectly clear, concise, and satisfactory.
If the goal is just to eliminate the ; in the middle so you can use this in a macro context or as a single statement without braces, try using the comma operator:
A[j]=A[j-1], j--;
or if you want the assigned value as the result of the expression:
j--, A[j+1]=A[j];
Both should generate identical code on a decent compiler if the result of the expression is not used.
As others have said, any attempt to do this without the comma operator will result in undefined behavior due to sequence point issues. If you don't have a good reason for condensing code like this, I would recommend not even doing it. Unless you're very experienced with C, you're almost sure to mess it up and introduce subtle bugs (some of which may manifest not with your current compiler, but in future versions of it, creating hell for whoever gets stuck debugging the code).
There is actually a way
A[j+1]=A[--j];
is works well in VC but causes UB on g++

parsing of mathematical expressions

(in c90) (linux)
input:
sqrt(2 - sin(3*A/B)^2.5) + 0.5*(C*~(D) + 3.11 +B)
a
b /*there are values for a,b,c,d */
c
d
input:
cos(2 - asin(3*A/B)^2.5) +cos(0.5*(C*~(D)) + 3.11 +B)
a
b /*there are values for a,b,c,d */
c
d
input:
sqrt(2 - sin(3*A/B)^2.5)/(0.5*(C*~(D)) + sin(3.11) +ln(B))
/*max lenght of formula is 250 characters*/
a
b /*there are values for a,b,c,d */
c /*each variable with set of floating numbers*/
d
As you can see infix formula in the input depends on user.
My program will take a formula and n-tuples value.
Then it calculate the results for each value of a,b,c and d.
If you wonder I am saying ;outcome of program is graph.
/sometimes,I think i will take input and store in string.
then another idea is arise " I should store formula in the struct"
but ı don't know how I can construct
the code on the base of structure./
really, I don't know way how to store the formula in program code so that
I can do my job.
can you show me?
/* a,b,c,d is letters
cos,sin,sqrt,ln is function*/
You need to write a lexical analyzer to tokenize the input (break it into its component parts--operators, punctuators, identifiers, etc.). Inevitably, you'll end up with some sequence of tokens.
After that, there are a number of ways to evaluate the input. One of the easiest ways to do this is to convert the expression to postfix using the shunting yard algorithm (evaluation of a postfix expression is Easy with a capital E).
You should look up "abstract syntax trees" and "expression trees" as well as "lexical analysis", "syntax", "parse", and "compiler theory". Reading text input and getting meaning from it is quite difficult for most things (though we often try to make sure we have simple input).
The first step in generating a parser is to write down the grammar for your input language. In this case your input language is some Mathematical expressions, so you would do something like:
expr => <function_identifier> ( stmt )
( stmt )
<variable_identifier>
<numerical_constant>
stmt => expr <operator> stmt
(I haven't written a grammar like this {look up BNF and EBNF} in a few years so I've probably made some glaring errors that someone else will kindly point out)
This can get a lot more complicated depending on how you handle operator precedence (multiply and device before add and subtract type stuff), but the point of the grammar in this case is to help you to write a parser.
There are tools that will help you do this (yacc, bison, antlr, and others) but you can do it by hand as well. There are many many ways to go about doing this, but they all have one thing in common -- a stack. Processing a language such as this requires something called a push down automaton, which is just a fancy way of saying something that can make decisions based on new input, a current state, and the top item of the stack. The decisions that it can make include pushing, popping, changing state, and combining (turning 2+3 into 5 is a form of combining). Combining is usually referred to as a production because it produces a result.
Of the various common types of parsers you will almost certainly start out with a recursive decent parser. They are usually written directly in a general purpose programming language, such as C. This type of parser is made up of several (often many) functions that call each other, and they end up using the system stack as the push down automaton stack.
Another thing you will need to do is to write down the different types of words and operators that make up your language. These words and operators are called lexemes and represent the tokens of your language. I represented these tokens in the grammar <like_this>, except for the parenthesis which represented themselves.
You will most likely want to describe your lexemes with a set of regular expressions. You should be familiar with these if you use grep, sed, awk, or perl. They are a way of describing what is known as a regular language which can be processed by something known as a Finite State Automaton. That is just a fancy way of saying that it is a program that can make a decision about changing state by considering only its current state and the next input (the next character of input). For example part of your lexical description might be:
[A-Z] variable-identifier
sqrt function-identifier
log function-identifier
[0-9]+ unsigned-literal
+ operator
- operator
There are also tools which can generate code for this. lex which is one of these is highly integrated with the parser generating program yacc, but since you are trying to learn you can also write your own tokenizer/lexical analysis code in C.
After you have done all of this (it will probably take you quite a while) you will need to have your parser build a tree to represent the expressions and grammar of the input. In the simple case of expression evaluation (like writing a simple command line calculator program) you could have your parser evaluate the formula as it processed the input, but for your case, as I understand it, you will need to make a tree (or Reverse Polish representation, but trees are easier in my opinion).
Then after you have read the values for the variables you can traverse the tree and calculate an actual number.
Possibly the easiest thing to do is use an embedded language like Lua or Python, for both of which the interpreter is written in C. Unfortunately, if you go the Lua route you'll have to convert the binary operations to function calls, in which case it's likely easier to use Python. So I'll go down that path.
If you just want to output the result to the console this is really easy and you won't even have to delve too deep in Python embedding. Since, then you only have to write a single line program in Python to output the value.
Here is the Python code you could use:
exec "import math;A=<vala>;B=<valb>;C=<valc>;D=<vald>;print <formula>".replace("^", "**").replace("log","math.log").replace("ln", "math.log").replace("sin","math.sin").replace("sqrt", "math.sqrt").replace("cos","math.cos")
Note the replaces are done in Python, since I'm quite sure it's easier to do this in Python and not C. Also note, that if you want to use xor('^') you'll have to remove .replace("^","**") and use ** for powering.
I don't know enough C to be able to tell you how to generate this string in C, but after you have, you can use the following program to run it:
#include <Python.h>
int main(int argc, char* argv[])
{
char* progstr = "...";
Py_Initialize();
PyRun_SimpleString(progstr);
Py_Finalize();
return 0;
}
You can look up more information about embedding Python in C here: Python Extension and Embedding Documentation
If you need to use the result of the calculation in your program there are ways to read this value from Python, but you'll have to read up on them yourself.
Also, you should review your posts to SO and other posts regarding Binary Trees. Implement this using a tree structure. Traverse as infix to evaluate. There have been some excellent answers to tree questions.
If you need to store this (for persistance as in a file), I suggest XML. Parsing XML should make you really appreciate how easy your assignment is.
Check out this post:
http://blog.barvinograd.com/2011/03/online-function-grapher-formula-parser-part-2/
It uses ANTLR library for parsing math expression, this one specifically uses JavaScript output but ANTLR has many outputs such as Java, Ruby, C++, C# and you should be able to use the grammar in the post for any output language.

Resources