I ran into this confusion while working on infix-to-prefix conversion in C.
Infix Expression: A-B+(N$N)(O+P)+Q/R^ST+Z
Here I found $ being used as an operator.
To clear my doubt, I wrote a C program that performs a$b to see whether it yields a result like other operators do, but I got no answer.
The output only increased my doubt about the $ operator and its use in C.
If a=5 and b=6, then computing a+b and printing the sum gives 11. But if I treat $ as an operator and compute a$b, I get no result. So is $ an operator in C or not?
Some compilers (including GCC) accept, as an extension to the C language, a $ inside an identifier. You may also need support from your linker and assembler.
But standard C usually disallows that. Read e.g. Modern C to learn more, and also the n2176 draft standard. Check also this C reference website (it seems that $ is permitted in recent C standards).
IIRC, on VAX computers under VMS, you did have a lot of system identifiers with a $.
I guess you could improve some open source C compiler (not only GCC, but also Clang, tinycc, nwcc...) to accept $ as some operator.
My recommendation would be to not use $ in your identifiers, and to take inspiration from existing open source software (e.g. GTK).
For example the following /tmp/dollar.c file
#include <stdio.h>
int main(int argc, char **argv)
{
    int doll$ar = 2;
    printf("from %s doll$ar is %d and argc is %d\n",
           argv[0], doll$ar, argc);
}
using GCC 10 on Linux/x86-64 in December 2020 as gcc -Wall -Wextra /tmp/dollar.c -o /tmp/dollar compiles without warnings. When I run /tmp/dollar the output is:
from /tmp/dollar doll$ar is 2 and argc is 1
Few open source programs (look into the source code of GNU make or GNU bash) use $ in identifiers.
PS. I believe that future standards might allow UTF-8 letters in identifiers, but you need to check with your ISO representative.
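Since C has no $ operator, one way to experiment with an expression like a$b is to spell the operation out as a function. The sketch below assumes that the textbook's $ stands for exponentiation, as it does in some infix-notation exercises; that reading is an assumption, and ipow is a hypothetical helper name, not a standard library function.

```c
/* Hypothetical helper: integer power by repeated squaring.
   Assumes the textbook's a$b means "a raised to the power b".
   No overflow checking is done. */
long ipow(long base, unsigned exp)
{
    long result = 1;
    while (exp > 0) {
        if (exp & 1u)       /* low bit set: multiply this factor in */
            result *= base;
        exp >>= 1;
        if (exp > 0)        /* skip the final, unneeded squaring */
            base *= base;
    }
    return result;
}
```

With a = 5 and b = 6, ipow(a, b) gives 15625. Writing a$b directly does not work, because a compiler that accepts $ in identifiers will most likely read a$b as one (undeclared) identifier.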
The carefully crafted, self-including code in this IOCCC winning entry from 1988:
http://www.ioccc.org/years.html#1988_isaak
...was still too much for certain systems back then. Also, ANSI C was finally emerging as a stable alternative to the chaotic K&R ecosystem. As a result, the IOCCC judges also provided an ANSI version of this entry:
http://www.ioccc.org/1988/isaak.ansi.c
Its main attraction is its gimmick of including <stdio.h> in the last line (!) with well-thought-out #defines, both inside the source and at compile time, to only allow certain parts of the code into the right level. This is what allows the <stdio.h> header to be ultimately included at the latest stage possible, just before it is necessary, in the source fed to the compiler.
However, this version still fails to produce its output when compiled today, with the provided compiler settings:
gcc -std=c89 -DI=B -DO=- -Dy isaak.ansi.c
tcc -DI=B -DO=- -Dy isaak.ansi.c
Versions used: GCC 9.3.0, TCC 0.9.27
There isn't any evident reliance on the compiled binary filename, hence I left it to the compiler's choice. Even when using -o isaak or -o isaak.ansi, the same result happens: no output.
What is causing this? How are the output functions failing? What can be done to correct this?
Thanks in advance!
NOTE: The IOCCC judges, realising that this entry had portability issues that would detract from its obfuscation value, decided to also include a UUENCODEd version of the code's output:
http://www.ioccc.org/1988/isaak.encode
There is nothing remotely portable about this program. As far as I can see, it tries to replace the standard library function exit with its own code, expecting that returning from an empty main() will call that exit(), which is not the case. And even if it were, such behaviour is not standard-conforming: even in C89 it is undefined behaviour.
You can "fix" the program on modern GCC / Linux by actually calling exit() inside main: just change the first line to
main(){exit(0);}
I compiled it with gcc -std=c89 -DI=B -DO=- -Dy isaak.ansi.c, ran ./a.out, and got sensible output.
I'm using a proprietary development environment that compiles code written in C, as well as the IEC 61131 languages. For the C compilation, it uses GCC 4.1.2 with these build options:
-fPIC -O0 -g -nostartfiles -Wall -trigraphs -fno-asm
The compilation is done by a program running on windows utilizing Cygwin.
My issue is that the IEC language preprocessor is not very useful (it doesn't support #define at all) and I want to use macros! I don't see why the GCC preprocessor would really care what language it is processing (my target language is Structured Text), so I'm looking for a way to get it to process files of other types that are then not compiled further (I just want macro expansion before the file is run through the IEC compiler). I know very little about compiler options and environments since I've never had to deal with them; I just write C code and it magically compiles and transfers to my target system to run.
The only things I can really do are add build options and execute a batch file before anything is executed. I think my best hope lies in using a batch file to process all files of a certain extension, but I don't even know what executable in the gnuinst folder to use, let alone what flags to use to run through the files.
Just about any C preprocessor, including gcc's cpp, is going to assume that its input is valid C code. It has to tokenize the input following C (or C++, or Objective-C) rules, because it has to resolve its input into tokens (more precisely, preprocessing tokens). Constructs above the token level shouldn't be an issue.
You certainly can use cpp or gcc -E to preprocess text that isn't C source code, but some input constructs will cause problems.
Taking an example from the comments:
$ cat foo.txt
#define ADDTHEM(x, y) ((x) + (y))
ADDTHEM(2, 3)
$ gcc -E - < foo.txt
# 1 "<stdin>"
# 1 "<command-line>"
# 1 "<stdin>"
((2) + (3))
Note that I had to use gcc -E - < foo.txt rather than gcc -E foo.txt, because gcc treats a .txt file as a linker input file by default.
But if you add some content to foo.txt that doesn't consist of valid C preprocessor tokens, you can have problems:
$ cat foo.txt
#define ADDTHEM(x, y) ((x) + (y))
ADDTHEM(2, 3)
ADDTHEM('c, "s)
$ gcc -E - < foo.txt
# 1 "<stdin>"
# 1 "<command-line>"
# 1 "<stdin>"
((2) + (3))
<stdin>:3:9: warning: missing terminating ' character [enabled by default]
<stdin>:3:0: error: unterminated argument list invoking macro "ADDTHEM"
ADDTHEM
(Attempts to feed Ada source code to a C preprocessor have run into this kind of problem, since Ada uses isolated apostrophe ' characters for its attribute syntax.)
So you can do it if the input language doesn't use things that aren't valid C preprocessor tokens.
See the N1570 draft of the C standard, section 6.4, for more information about preprocessing tokens.
I actually wrote the above before I checked the GNU cpp manual, which says:
The C preprocessor is intended to be used only with C, C++, and
Objective-C source code. In the past, it has been abused as a general
text processor. It will choke on input which does not obey C's lexical
rules. For example, apostrophes will be interpreted as the beginning of
character constants, and cause errors. Also, you cannot rely on it
preserving characteristics of the input which are not significant to
C-family languages. If a Makefile is preprocessed, all the hard tabs
will be removed, and the Makefile will not work.
Having said that, you can often get away with using cpp on things
which are not C. Other Algol-ish programming languages are often safe
(Pascal, Ada, etc.) So is assembly, with caution. `-traditional-cpp'
mode preserves more white space, and is otherwise more permissive. Many
of the problems can be avoided by writing C or C++ style comments
instead of native language comments, and keeping macros simple.
Wherever possible, you should use a preprocessor geared to the
language you are writing in. Modern versions of the GNU assembler have
macro facilities. Most high level programming languages have their own
conditional compilation and inclusion mechanism. If all else fails,
try a true general text processor, such as GNU M4.
(The authors of that manual apparently missed the problem with Ada's attribute syntax.)
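Before handing non-C text to cpp, it can help to flag lines that are likely to trip the tokenizer in the way shown above. The following is only a rough sketch: has_unbalanced_quote is a hypothetical helper, and it ignores escapes, comments, and one kind of quote nested inside the other, so it is a heuristic and not a faithful model of the C lexical rules.

```c
#include <stddef.h>

/* Hypothetical pre-check: returns 1 if the line holds an odd number of
   apostrophes or an odd number of double quotes, which a C preprocessor
   would probably treat as an unterminated character or string constant.
   Returns 0 otherwise. Heuristic only. */
int has_unbalanced_quote(const char *line)
{
    size_t single = 0, dbl = 0;
    for (; *line != '\0'; ++line) {
        if (*line == '\'')
            ++single;
        else if (*line == '"')
            ++dbl;
    }
    return (single % 2 != 0) || (dbl % 2 != 0);
}
```

For the example input above, the line ADDTHEM('c, "s) is flagged, while ADDTHEM(2, 3) is not.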
I wrote a program, where the size of an array is taken as an input from user.
#include <stdio.h>
main()
{
int x;
scanf("%d", &x);
int y[x];
/* some stuff */
}
This program failed to compile on my school's compiler Turbo C (an antique compiler).
But when I tried this on my PC with GNU CC, it compiled successfully.
So my question is, is this a valid C program? Can I set the size of the array using a user's input?
It is a valid C program now, but it wasn't 15 years ago.
Either way, it's a buggy C program because x is used without any knowledge of how large it might be. The user can input a malicious value for x and cause the program to crash or worse.
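One way to address the unchecked size is to validate it before any array is created. The sketch below assumes the user's input has already been read into a string with no trailing newline; parse_count and MAX_ELEMS are hypothetical names, and the limit of 1000 is an arbitrary choice for illustration.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_ELEMS 1000L  /* hypothetical upper bound; choose one that suits the program */

/* Parses text as an element count. Returns the count on success, or -1 if
   the text is not a whole number in the range 1..MAX_ELEMS. */
long parse_count(const char *text)
{
    char *end;
    long n;

    errno = 0;
    n = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE)
        return -1;              /* empty, trailing junk, or out of range for long */
    if (n < 1 || n > MAX_ELEMS)
        return -1;              /* negative, zero, or unreasonably large */
    return n;
}
```

Only a value accepted by a check like this would then be used as an array size.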
C99 gives C programmers the ability to use variable length arrays, which are arrays whose sizes are not known until run time. --C: A Reference Manual
C90 does not support variable length arrays. You can see this with the following command line:
gcc -std=c90 -pedantic code.c
You will see a diagnostic like this:
warning: ISO C90 forbids variable length array ‘y’ [-Wvla]
But in C99 this is perfectly valid:
gcc -std=c99 -pedantic code.c
Instead of asking whether this is strictly valid C code, it may be better to ask whether it is good C code. Although it is valid C99, as you have seen, a number of modern compilers do not support variable length arrays.
These include Microsoft Visual Studio and some versions of the IBM XL compilers, so variable length arrays are not entirely portable. That's fine if the code will only be built on systems that support the feature, but not if it has to be built elsewhere. Instead, it may be better to allocate the array with a constant size using a reasonable limit, or to use malloc and free to create the array in a portable manner.
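The heap-based alternative can be sketched as follows. make_int_array is a hypothetical helper name; the sketch uses calloc rather than malloc so that the elements start out as zero, which is a choice made here and not something the original program requires.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n zero-initialised ints on the heap.
   Returns NULL if n is 0 or if the allocation fails. The caller is
   responsible for calling free() on the result. */
int *make_int_array(size_t n)
{
    if (n == 0)
        return NULL;
    return calloc(n, sizeof(int));
}
```

A caller would check the result against NULL, use it like an ordinary array (y[0] through y[n - 1]), and pass it to free() when done.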
This code does not compile for me on GCC version 4.3.2 (Debian 4.3.2-1.1)
main() {
int unix;
}
I've checked the C keywords list and "unix" is not one of them.
Why am I getting the following error?
unix.c:2: error: expected identifier or ‘(’ before numeric constant
unix is not an identifier reserved by the Standard.
If you compile with -std=c89 or -std=c99 the gcc compiler will accept the program as you expected.
From gcc manual ( https://gcc.gnu.org/onlinedocs/cpp/System-specific-Predefined-Macros.html ), the emphasis is mine.
... However,
historically system-specific macros
have had names with no special prefix;
for instance, it is common to find
unix defined on Unix systems. For all
such macros, GCC provides a parallel
macro with two underscores added at
the beginning and the end. If unix is
defined, __unix__ will be defined too.
There will never be more than two
underscores; the parallel of _mips is
__mips__.
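The double-underscore spelling described in the manual excerpt can be tested directly in code, which avoids any dependence on the legacy unix macro. This is a minimal sketch; built_for_unix is a hypothetical function name.

```c
/* Returns 1 when the compiler predefines __unix__ (the reserved-namespace
   counterpart of the legacy unix macro), 0 otherwise. */
int built_for_unix(void)
{
#ifdef __unix__
    return 1;
#else
    return 0;
#endif
}
```

Because __unix__ lies in the namespace reserved for the implementation, checking it does not collide with ordinary identifiers the way a variable named unix does.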
unix is one of the macros that gcc's preprocessor predefines.
To get a list of the predefined macros, use
gcc -dM -E -x c /dev/null
(-dM tells gcc to dump the macro definitions, -E tells it to stop after preprocessing, and -x c /dev/null tells it to treat /dev/null as a C file.)
Run your code through the preprocessor to find out what the compiler is actually seeing:
gcc -E unix.c
Then see if your variable unix is preserved or converted by the preprocessor.
It is not a keyword.
It is a predefined macro to identify the type of system. On Unix and Unix like systems it is defined to be 1.
To disable this use the -ansi option:
In C mode, this is equivalent to -std=c89. In C++ mode, it is equivalent to -std=c++98.
This turns off certain features of GCC that are incompatible with ISO C90 (when compiling C code), or of standard C++ (when compiling C++ code), such as the "asm" and "typeof" keywords, and predefined macros such as "unix" and "vax" that identify the type of system you are using. It also enables the undesirable and rarely used ISO trigraph feature. For the C compiler, it disables recognition of C++ style // comments as well as the "inline" keyword.
I'm going to take a wild stab at this and guess that gcc effectively #defines unix as 1 on UNIX systems.
Try
#include <stdio.h>
int main(void) {
    printf("%d\n", unix);
    return 0;
}
and see what you get.
To answer your question: no, unix is not a reserved word in C.
However, the symbol unix is most likely defined by the preprocessor either because you include a header file or because the compiler defines it.
Let's say I have written a program in C and compiled it with both gcc (as C) and g++ (as C++). Which compiled executable will run faster: the one created by gcc or the one created by g++? I think using the g++ compiler will make the executable slower, but I'm not sure about it.
Let me clarify my question again because of confusion about gcc:
Let's say I compile program a.c like this in the terminal:
gcc a.c
g++ a.c
Which a.out executable will run faster?
Firstly: the question (and some of the other answers) seem to be based on the faulty premise that C is a strict subset of C++, which is not in fact the case. Compiling C as C++ is not the same as compiling it as C: it can change the meaning of your program!
C will mostly compile as C++, and will mostly give the same results, but there are some things that are explicitly defined to give different behaviour.
Here's a simple example - if this is your a.c:
#include <stdio.h>
int main(void)
{
    /* sizeof yields a size_t, so cast it to match the %d conversion */
    printf("%d\n", (int)sizeof('x'));
    return 0;
}
then compiling as C will give one result:
$ gcc a.c
$ ./a.out
4
and compiling as C++ will give a different result (unless you're using an unusual platform where int and char are the same size):
$ g++ a.c
$ ./a.out
1
because the C specification defines a character literal to have type int, and the C++ specification defines it to have type char.
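If a source file has to be built both ways, the difference can be made visible from inside the program. The sketch below relies on the __cplusplus macro, which C++ compilers predefine and C compilers do not; the two function names are hypothetical.

```c
#include <stddef.h>

/* Size of a character literal: sizeof(int) when compiled as C,
   sizeof(char) (that is, 1) when compiled as C++. */
size_t char_literal_size(void)
{
    return sizeof('x');
}

/* Returns 1 if this translation unit was compiled as C++, 0 if as C. */
int compiled_as_cplusplus(void)
{
#ifdef __cplusplus
    return 1;
#else
    return 0;
#endif
}
```

Code that needs the same value under both languages can use sizeof(char), which is 1 in C and in C++.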
Secondly: gcc and g++ are not "the same compiler". The same back end code is used, but the C and C++ front ends are different pieces of code (gcc/c-*.c and gcc/cp/*.c in the gcc source).
Even if you stick to the parts of the language that are defined to do the same thing, there is no guarantee that the C++ front end will parse the code in exactly the same way as the C front end (i.e. giving exactly the same input to the back end), and hence no guarantee that the generated code will be identical. So it is certainly possible that one might happen to generate faster code than the other in some cases - although I would imagine that you'd need complex code to have any chance of finding a difference, as most of the optimisation and code generation magic happens in the common back end of the compiler; and the difference could be either way round.
I think they will both produce the same machine code, and therefore run at the same speed on your computer.
If you want to find out, you could compile the assembly for both and compare the two, but I'm betting that they create the same assembly, and therefore the same machine code.
Profile it and try it out. I'm certain it will depend on the actual code, even if it would potentially take a really weird case to get any different machine code. Though if you don't have extern "C" {} around your C code, and it works fine in C, I'm not sure how "compiling it as though it were C++" could provide any speed-up, unless the particular compiler optimizations in g++ just happen to be a bit better for your particular situation...
The machine code generated should be identical. The g++ version of a.out will probably link in a couple of extra support libraries. This will make the startup time of a.out be slower by a few system calls.
There is not really any practical difference though. The Linux linker will not become noticeably slower until you reach 20-40 linked libraries and thousands of symbols to resolve.
The gcc and g++ executables are just frontends, they are not the actual compilers. They both run the actual C or C++ compilers (and ld, ar, whatever is needed to produce the output you asked for) based on the file extensions. So you'll get the exact same result. G++ is commonly used for C++ because it links with the standard C++ library (iostreams etc.).
If you want to compile C code as C++, either change the file extension, or do something like this:
gcc -x c++ test.c -o test
http://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/G_002b_002b-and-GCC.html
GCC is a compiler collection. It is mainly used to compile C, C++, Ada, Java and many other programming languages.
G++ is a part of the GNU Compiler Collection (GCC).
That is, gcc includes g++ as well. When we use gcc to compile C++, it uses g++. The output files will be different because the G++ compiler uses its own run-time library.
Edit: Okay, to clarify things, because we have a bit of confusion in naming here. GCC is the GNU Compiler Collection. It can compile Ada, C++, C, and a billion and a half other languages. It is a "back end" to the various languages' "front end" compilers like GNAT. Go read the link I made at the top of the page from GCC.GNU.Org.
GCC can also refer to the GNU C Compiler. This will compile C++ code if given the -lstdc++ option, but normally will choke and die because it's not pulling in the C++ libraries.
G++, the GNU C++ Compiler, like the GNU C Compiler, is a front end to the GNU Compiler Collection. Its difference from the C compiler is that it automatically includes those libraries and makes a few other small tweaks, because it assumes it's going to be fed C++ code to compile.
This is where the confusion comes from. Does this clarify things a bit?