Which compiler should I trust? - c

This is going to be some what of a newbie question but I was trying to work on a small exercise in the C Language (not C++) and I was running into some issues.
Say I wanted to use an array within a method whose size depended on one of the arguments:
void someFunc(int arSize)
{
char charArray[arSize];
// DO STUFF
...
}
When I try to compile this as a .c file within Visual Studio 2013 I get an error saying that a non-constant array size is not allowed. However the same code works within CodeBlocks under a GNU Compiler. Which should I trust? Is this normal for compilers to behave so differently? I always thought that if you're doing something that a compiler doesn't like you shouldn't be doing it in the first place because it's not a standard.
Any input is useful! I come from a Background in Python and I am trying to get more heavily involved in programming with Data-Structures and Algorithms.
My platform is Windows as you can probably tell. Please let me know if this question needs more information before it can be answered.

Variable length arrays(VLA) are a C99 feature and Visual Studio until recently did not support C99 and I am not sure if it currently supports VLA in the lastest version. gcc on the other hand does support C99 although not fully. gcc supports VLA as an extension outside of C99 mode, even in C++.
From the draft C99 standard section 6.7.5.2 Array declarators paragraph 4:
[...] If the size is an integer constant expression and the element type has a known constant size, the array type is not a variable length array type; otherwise, the array type is a variable length array type.

You should trust the compilers that you're using and that you want to support.
On that particular issue: non-constant array sizes are valid in C99, which isn't fully supported by either gcc or by MSVC (Microsoft's C/C++ compiler). gcc, however, has this feature from the standard implemented even outside of C99 mode, while MSVC hasn't.

It depends upon the particular standard your C compiler is following.
The feature you want is called variable length array (VLA) and was introduced into the C99 standard.
Maybe your Visual Studio is supporting some earlier version of the standard. Perhaps you might configure it to support a later version.
Notice that using VLA with a huge size could be a bad habit: VLA are generally stack allocated, and a call frame stack should usually have a small size (a few kilobytes at most on current processors), especially for kernel code or for recursive or multithreaded functions. You may want to heap-allocate (e.g. with calloc) your array if it is has more than a thousand words. Then you'll need to free it later.

This is a GCC extension acting on you.

Related

Compatibility of C89/C90, C99 and C11

I just read: C Wikipedia entry. As far as I know there are 3 different versions of C that are widely used: C89, C99 and C11. My question concerns the compatibility of source code of different versions.
Suppose I am going to write a program (in C11 since it is the latest version) and import a library written in C89. Are these two versions going to work together properly when compiling all files according to the C11 specification?
Question 1:
Are the newer versions of C i.e. C99, C11 supersets of older C versions? By superset I mean, that old code will compile without errors and the same meaning when compiled according to newer C specifications.
I just read, that the // has different meanings in C89 and C99. Apart from this feature, are C99 and C11 supersets of C89?
If the answer to Question 1 is no, then I got another 2 questions.
How to 'port' old code to the new versions? Is there a document which explains this procedure?
Is it better to use C89 or C99 or C11?
Thanks for your help in advance.
EDIT: changed ISO C to C89.
Are the newer versions of C i.e. C99, C11 supersets of older C versions?
There are many differences, big and subtle both. Most changes were adding of new features and libraries. C99 and C11 are not supersets of C90, although there was a great deal of effort made to ensure backwards-compatibility. C11 is however mostly a superset of C99.
Still older code could break when porting from C90, in case the code was written poorly. Particularly various forms of "implicit int" and implicit function declarations were banned from the language with C99. C11 banned the gets function.
A complete list of changes can be found in the C11 draft page 13 of the pdf, where "the third edition" refers to C11 and "the second edition" refers to C99.
How to 'port' old code to the new versions? Is there a document which explains this procedure?
I'm not aware about any such document. If you have good code, porting is easy. If you have rotten code, porting will be painful. As for the actual porting procedure, it will be easy if you know the basics of C99 and C11, so the best bet is to find a reliable source of learning which addresses C99/C11.
Porting from C99 to C11 should be effortless.
Is it better to use C89 or C99 or C11?
It is best to use C11 as that is the current standard. C99 and C11 both contained various "language bug fixes" and introduced new, useful features.
In most ways, the later versions are supersets of earlier versions. While C89 code which tries to use restrict as an identifier will be broken by C99's addition of a reserved word with the same spelling, and while there are some situations in which code which is contrived to exploit some corner cases with a parser will be treated differently in the two languages, most of those are unlikely to be important.
A more important issue, however, has to do with memory aliasing. C89 include
rules which restrict the types of pointers that can be used to access certain
objects. Since the rules would have made functions like malloc() useless if
they applied, as written, to the objects created thereby, most programmers and
compiler writers alike treated the rules as applying only in limited cases (I doubt C89 would have been widely accepted if people didn't believe the rules applied only narrowly). C99 claimed to "clarify" the rules, but its new rules are much more expansive in effect than contemporaneous interpretations of the old ones, breaking a lot of code that would have had defined behavior under those common interpretations of C89, and even some code which would have been unambiguously defined under C89 has no practical C99 equivalent.
In C89, for example, memcpy could be used to copy the bit pattern associated with an object of any type to an object of any other type with the same size, in any cases where that bit pattern would represent a valid value in the destination type. C99 added language which allows compilers to behave in arbitrary fashion if memcpy is used to copy an object of some type T to storage with no declared type (e.g. storage returned from malloc), and that storage is then read as object of a type that isn't alias-compatible with T--even if the bit pattern of the original object would have a valid meaning in the new type. Further, the rules that apply to memcpy also apply in cases where an object is copied as an array of character type--without clarifying exactly what that means--so it's not clear exactly what code would need to do to achieve behavior matching the C89 memcpy.
On many compilers such issues can be resolved by adding a -fno-strict-aliasing option to the command line. Note that specifying C89 mode may not be sufficient, since compilers writers often use the same memory semantics regardless of which standard they're supposed to be implementing.
The newer versions of C are definitely not strict super-sets of the older versions.
Generally speaking, this sort of problem only arises when upgrading the compiler or switching compiler vendors. You must plan for a lot of minor touches of the code to deal with this event. A lot of the time, the new compiler will catch issues that were left undiagnosed by the old compiler, in addition to minor incompatibilities that may have occurred by enforcing the newer C standard.
If it is possible to determine, the best standard to use is the one that the compiler supports the best.
The C11 wikipedia article has a lengthy description of how it differs from C99.
In general, newer versions of the standard are backward compatible.
If not, you can compile different .c files to different .o files, using different standards, and link them together. That does work.
Generally you should use the newest standard available for new code and, if it's easy, fix code that the new standard does break instead of using the hacky solution above.
EDIT: Unless you're dealing with potentially undefined behavior.

Does GCC atomic buitlins work with std=C99?

I am using this built-in atomic methods link
It is mentioned that:
The following built-in functions approximately match the requirements
for the C++11 memory model.
However I have tried compiling these methods with std=C99 and std=C89. The program compiles and I get the right results. Is there something I am missing here ?
Does C99 and C89 have a memory model as well ?
It is a compiler extension and therefore it is allowed to provide functionality outside of the what the standard allows but that page does not make it obvious that is the can be used in C.
Fortunately, gcc does have good online documents and if we check out for example the 4.9 series document on C extensions the __atomic Builtins points to the same page.
So that would indicate that it is valid to use in C and it will stick the requirements as laid out in the documentation and so it will work in the C99 as it does in C++. Usually if there is a difference between how a feature/extension is implemented between C and C++ the documents will note this, for example compound literals have significant differences.

Efficiency issues when using C99 and C11.

The other day I was converting a program written with C99 standard into C11. Basically the motive was to use the code with MSVC but It was written in Linux and was mostly compiled with default GCC behaviour. During the code conversion, I found out that you can not decalre variables of a function after any statement i.e. you must declare them at the top of the function.
But my question is that wouldn't it be against the efficient programming rule that variables should be declared near their use so that it maximizes the cache hits? For example, In a large function of say 200 LOC, I want to use some big static look up array at nearly the end of the function. Wouldn't declaring and initializing it just before the usage cause more cache hits? or am I simple missing some basic point of C11 C language standard?
You seem to have some confusion for which version of the standard you are compiling your program. AFAIK, MSVC doesn't support any of the more recent C standards.
But to come to the core of your question, no this is not an efficiency issue. The compiler is allowed to reorder statements to its liking, as long as the observable behavior of the program doesn't change. Thus a modern compiler will always touch a new variable the latest possible before its first use.
Where the variable declaration appears has no effect on cache behavior. Just having a declaration doesn't touch memory.
You may need to separate out initialization into a separate assignment, however, in order to make sure you don't have an initializer causing a memory access at (near) the beginning of the function.

Which version of C is more appropriate for students to learn- C89/90 or C99?

I'm looking into learning C basics and syntax before beginning Systems Programming next month. When doing some reading, I came across the C89/99 standards. According to Wikipedia,
C99 introduced several new features,
including inline functions, several
new data types (including long long
int and a complex type to represent
complex numbers), variable-length
arrays, support for variadic macros
(macros of variable arity) and support
for one-line comments beginning with
//, as in BCPL or C++. Many of these
had already been implemented as
extensions in several C compilers.
C99 is for the most part backward
compatible with C90, but is stricter
in some ways; in particular, a
declaration that lacks a type
specifier no longer has int
implicitly assumed. A standard macro
STDC_VERSION is defined with value 199901L to indicate that C99 support
is available. GCC, Sun Studio and
other compilers now support many or
all of the new features of C99.
I borrowed a copy of K&R, 2nd Edition, and it uses the C89 standard. For a student, does the use of C89 invalidate some subjects covered in K&R, and if so, what should I look out for?
There is no reason to learn C89 or C90 over C99- it's been very literally superseded. It's easy to find C99 compilers and there's no reason whatsoever to learn an earlier standard.
This doesn't mean that your professor won't force C89 upon you. From the various questions posted here marked homework, I get the feeling that many, many C (and, unfortunately, C++) courses haven't moved on since C89.
From the perspective of a starting student, the chances are that you won't really notice the difference- there's plenty of C that's both C99 and C89/90 to be covered.
Use the C99 standard, it's newer and has more features. Particularly useful may be the bool type in <stdbool.h> and the int32_t etc. family of types; the latter prevents a lot of unportable code that relies on ints having a certain size. AFAIK, it doesn't invalidate K&R, though some example programs may be written in a slightly different style now.
Note that some compilers still don't support C99 properly. I believe that GCC still requires the use of a -std=c99 flag to enable it; many Unix/Linux systems have a c99 command that wraps GCC and enables C99.
The same goes for many university professors. I surprised mine by handing in a program that used bool in my freshman year. He'd never heard of that type in C :)
While I generally agree with the others, it is worth noting that K&R is such a good book that it might be worth learning C from it and then updating your knowledge as you read about the C99 standard.
If you are at student level you probably won't even notice the differences.
Yes, it's a bit odd that you can get a loud consensus that K&R is a great C book, and also a loud consensus that C99 is the correct/current/best version of C. The two positions are incompatible - even if K&R is the best book available to learn "C meaning C99", that just implies the rest are rubbish, or are also hopelessly outdated.
I would advise learning and using C99, but keeping an eye to C89 as you do so. If you use a compiler that has both C89 and C99 compliant modes, then you can write a few bits of C89 just to get an idea of the differences. Then if you ever need to write some code intended to be portable to places that C99 doesn't go, you'll know what to do. If you never have to write any such code, then you've wasted perhaps a day.
Writing C89 properly is actually surprisingly difficult, because getting hold of a copy of the C89 standard is difficult. So, C99 if you can, C89 if for some odd reason you have to, and have some awareness what the difference is. Maybe use K&R to cover the very basics, but get a look at some idiomatic C99 as soon as possible.
As for specific issues to be aware of when reading K&R: there's a list of major changes in the foreword of the standard (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf), although the details aren't laid out there. A lot of them are new features added to C99, so it's not that K&R is wrong, it just may not always use the best tools for a given job. Some of them are quite fiddly things where you should probably consult the standard if you need the details anyway. The rest are things removed from C89, that usually a C99 compiler will tell you about as and when you try to use them.
As a student, that doesn't influence you so much. But if possible, you should find a new C book which covers C99
The term "C89" describes two very different languages:
The language that programmers in 1989 thought the Committee was describing in places where the Standard was ambiguous, and which supported features that were common in pre-existing implementations.
The language that the Committee has since decided that it wanted to have described, which threw compatibility with existing functionality out the
window.
C99 "clarifies" ambiguous parts of the standard by saying that they meant
to have the Standard interpreted in a way that would have broken a substantial
fraction of existing code and made it impossible to perform many tasks as
efficiently as they had been performed in C before 1989.
The right language to program in, for many applications, would be the superset of pre-Standard C, C89, C99, and C11. It's important, however, that anyone programming in that language be clear that they're using that language rather than a shrinking subset which favors speed over reliability.
While I think it's beneficial to know which features are more recent and less likely to be supported by obscure (or intentionally-broken, like MSVC) compilers, there are a few C99 features that you should absolutely use:
snprintf: This is the definitive function for safe and clean string assembly in C. If your compiler is missing it, you can either replace the whole printf subsystem (probably a good idea since most implementations with missing snprintf are also full of (often intentional) bugs in printf behavior), or wrap tmpfile/fprintf/fread/fclose.
stdint.h: If you need fixed-size types (16/32/64-bit), use the standard names int16_t, uint16_t, int32_t, etc. Do not invent your own, and absolutely don't use system-specific ones like INT64 or u32. It just makes your code ugly and hard to integrate and reuse. If your compiler is missing stdint.h, just drop in your own to define the types in terms of the correct-for-your-platform types.
Specifically uint64_t, in place of int foo[2]; or struct { int lo, int hi; } foo; or other hideous legacy hacks to work with 64-bit numbers. Any sane compiler even without C99 support has its own 64-bit types you can use to define int64_t and uint64_t.

Is C99 backward compatible with C89?

I'm used to old-style C and and have just recently started to explore c99 features. I've just one question: Will my program compile successfully if I use c99 in my program, the c99 flag with gcc and link it with prior c99 libraries?
So, should I stick to old C89 or evolve?
I believe that they are compatible in that respect. That is as long as the stuff that you are compiling against doesn't step on any of the new goodies. For instance, if the old code contains enum bool { false, true }; then you are in trouble. As a similar dinosaur, I am slowly embracing the wonderful new world of C99. After all, it has only been out there lurking for about 10 years now ;)
You should evolve. Thanks for listening :-)
Actually, I'll expand on that.
You're right that C99 has been around for quite a while. You should (in my opinion) be using that standard for anything other than legacy code (where you just fix bugs rather than add new features). It's probably not worth it for legacy code but you should (as with all business decisions) do your own cost/benefit analysis.
I'm already ensuring my new code is compatible with C1x - while I'm not using any of the new features yet, I try to make sure it won't break.
As to what code to look out for, the authors of the standards take backward compatibility very seriously. Their job was not ever to design a new language, it was to codify existing practices.
The phase they're in at the moment allows them some more latitude in updating the language but they still follow the Hippocratic oath in terms of their output: "first of all, do no harm".
Generally, if your code is broken with a new standard, the compiler is forced to tell you. So simply compiling your code base will be an excellent start. However, if you read the C99 rationale document, you'll see the phrase "quiet change" appear - this is what you need to watch out for.
These are behavioral changes in the compiler that you don't need to be informed about and may be the source of much angst and gnashing of teeth if your application starts acting strange. Don't worry about the "quiet change in c89" bits - if they were a problerm, you would have already been bitten by them.
That document, by the way, is an excellent read to understand why the actual standard says what it says.
Some C89 features are not valid C99
Arguably, those features exist only for historical reasons, and should not be used in modern C89 code, but they do exist.
The C99 N1256 standard draft foreword paragraph 5 compares C99 to older revisions, and is a good place to start searching for those incompatibilities, even though it has by far more extensions than restrictions.
Implicit int return and variable types
Mentioned by Lutz in a comment, e.g. the following are valid C89:
static i;
f() { return 1; }
but not C99, in which you have to write:
static int i;
int f() { return 1; }
This also precludes calling functions without prototypes in C99: Are prototypes required for all functions in C89, C90 or C99?
n1256 says:
remove implicit int
Return without expression for non void function
Valid C89, invalid C99:
int f() { return; }
I think in C89 it returns an implementation defined value. n1256 says:
return without expression not permitted in function that returns a value
Integer division with negative operand
C89: rounds to an implementation defined direction
C99: rounds to 0
So if your compiler rounded to -inf, and you relied on that implementation defined behavior, your compiler is now forced to break your code on C99.
https://stackoverflow.com/a/3604984/895245
n1256 says:
reliable integer division
Windows compatibility
One major practical concern is being able to compile in Windows, since Microsoft does not intend to implement C99 fully too soon.
This is for example why libgit2 limits allowed C99 features.
Respectfully: Try it and find out. :-)
Though, keep in mind that even if you need to fix a few minior compiling differences, moving up is probably worth it.
If you don't violate the explicit C99 features,a c90 code will work fine c99 flag with another prior c99 libraries.
But there are some dos based libraries in C89 like ,, that will certainly not work.
C99 is much flexible so feel free to migrate :-)
The calling conventions between C libraries hasn't changed in ages, and in fact, I'm not sure it ever has.
Operating systems at this point rely heavily on the C calling conventions since the C APIs tend to be the glue between the pieces of the OS.
So, basically the answer is "Yes, the binaries will be backwards compatible. No, naturally, code using C99 features can't later be compiled with a non-C99 compiler."
It's intended to be backwards compatible. It formalizes extensions that many vendors have already implemented. It's possible, maybe even probable, that a well written program won't have any issues when compiling with C99.
In my experience, recompiling some modules and not others to save time... wastes a lot of time. Usually there is some easily overlooked detail that needs the new compiler to make it all compatible.
There are a few parts of the C89 Standard which are ambiguously written, and depending upon how one interprets the rule about types of pointers and the objects they're accessing, the Standard may be viewed as describing one of two very different languages--one of which is semantically much more powerful and consequently usable in a wider range of fields, and one of which allows more opportunities for compiler-based optimization. The C99 Standard "clarified" the rule to make clear that it makes no effort to mandate compatibility with the former language, even though it was overwhelmingly favored in many fields; it also treats as undefined some things that were defined in C89 but only because the C89 rules weren't written precisely enough to forbid them (e.g. the use of memcpy for type punning in cases where the destination has heap duration).
C99 may thus be compatible with the language that its authors thought was described by C89, but is not compatible with the language that was processed by most C89 compilers throughout the 1990s.

Resources