Related
The declaration of strtok_s in C11, and its usage, look to be very different from the strtok_s in compilers like the latest bundled with Visual Studio 2022 (17.4.4) and also GCC 12.2.0 (looking at MinGW64 distribution).
I fear the different form has been developed as a safer and accepted alternative to strtok long before C11. What happens now if someone wants to use strtok_s and stay C11 compliant?
Are the compiler supplied libraries C11 compliant?
Maybe it's just that I've been fooled by something otherwise obvious, and someone can help me...
This is C11 (and similar is to C17 and early drafts of C23):
char *strtok_s(char * restrict s1,
rsize_t * restrict s1max,
const char * restrict s2,
char ** restrict ptr);
the same can be found as a good reference in the safec library
While MSC/VC and GCC have the form
char* strtok_s(
char* str,
const char* delimiters,
char** context
);
The C11 "Annex K bounds checking interfaces" was received with a lot of scepticism and in practice nearly no standard lib implemented it. See for example Field Experience With Annex K — Bounds Checking Interfaces.
As for the MSVC compiler, it doesn't conform to any C standard and never made such claims - you can try this out to check if you are using such a compiler or not:
#if !defined(__STDC__) || (__STDC__==0)
#error This compiler is non-conforming.
#endif
In particular, MSVC did not implement Annex K either, but already had non-standard library extensions in place prior to C11.
In practice _s means:
Possibly more safe or possibly less safe, depending on use and what the programmer expected.
Non-portable.
Possibly non-conforming.
If portability and standard conformance are important, then avoid _s functions.
In practice _s functions protect against two things: getting passed non-sanitized input or null pointers. So assuming that you do proper input sanitation and don't pass null pointers to library functions, the _s functions aren't giving you extra safety, just extra execution bloat and portability problems.
What happens now if someone wants to use strtok_s and stay C11 compliant?
You de facto can't.
And it's not limited to just strtok_s(). The entire C11 Annex K set of implementations is fractured, and because the major deviations from the standard are from Microsoft's implementation, there will probably never be a way to write portable, standard-conforming code using the Annex K functions.
Per N1967 Field Experience With Annex K — Bounds Checking Interface:
Available Implementations
Despite the specification of the APIs having been around for over a
decade only a handful of implementations exist with varying degrees of
completeness and conformance. The following is a survey of
implementations that are known to exist and their status.
While two of the implementations below are available in portable
source code form as Open Source projects, none of the popular Open
Source distribution such as BSD or Linux has chosen to make either
available to their users. At least one (GNU C Library) has repeatedly
rejected proposals for inclusion for some of the same reasons as those
noted by the Austin Group in their initial review of TR 24731-1
N1106]. It appears unlikely that the APIs will be provided by future
versions of these distributions.
Microsoft Visual Studio
Microsoft Visual Studio implements an early version of the APIs.
However, the implementation is incomplete and conforms neither to C11
nor to the original TR 24731-1. For example, it doesn't provide the
set_constraint_handler_s function but instead defines a
_invalid_parameter_handler _set_invalid_parameter_handler(_invalid_parameter_handler) function with similar behavior but a slightly different and incompatible
signature. It also doesn't define the abort_handler_s and
ignore_handler_s functions, the memset_s function (which isn't
part of the TR), or the RSIZE_MAX macro. The Microsoft
implementation also doesn't treat overlapping source and destination
sequences as runtime-constraint violations and instead has undefined
behavior in such cases.
As a result of the numerous deviations from the specification the
Microsoft implementation cannot be considered conforming or portable.
...
Safe C Library
Safe C Library [SafeC] is a fairly efficient and portable but
unfortunately very incomplete implementation of Annex K with support
for the string manipulation subset of functions declared in
<string.h>.
Due to its lack of support for Annex K facilities beyond the
<string.h> functions the Safe C Library cannot be considered a
conforming implementation.
Even the Safe C library is non-conforming.
Whether these functions are "safer" is debatable. Read the entire document.
Unnecessary Uses
A widespread fallacy originated by Microsoft's deprecation of the standard functions in an effort to increase the adoption of the APIs is that every call to the standard functions is necessarily unsafe and should be replaced by one to the "safer" API. As a result, security-minded teams sometimes naively embark on months-long projects rewriting their working code and dutifully replacing all instances of the "deprecated" functions with the corresponding APIs. This not only leads to unnecessary churn and raises the risk of injecting new bugs into correct code, it also makes the rewritten code less efficient.
Also, read the updated N1969 Updated Field Experience With Annex K — Bounds Checking Interfaces:
Despite more than a decade since the original proposal and nearly ten years since the ratification of ISO/IEC TR 24731-1:2007, and almost five years since the introduction of the Bounds checking interfaces into the C standard, no viable conforming implementations has emerged. The APIs continue to be controversial and requests for implementation continue to be rejected by implementers.
The design of the Bounds checking interfaces, though well-intentioned, suffers from far too many problems to correct. Using the APIs has been seen to lead to worse quality, less secure software than relying on established approaches or modern technologies. More effective and less intrusive approaches have become commonplace and are often preferred by users and security experts alike.
Therefore, we propose that Annex K be either removed from the next revision of the C standard, or deprecated and then removed.
I just read: C Wikipedia entry. As far as I know there are 3 different versions of C that are widely used: C89, C99 and C11. My question concerns the compatibility of source code of different versions.
Suppose I am going to write a program (in C11 since it is the latest version) and import a library written in C89. Are these two versions going to work together properly when compiling all files according to the C11 specification?
Question 1:
Are the newer versions of C i.e. C99, C11 supersets of older C versions? By superset I mean, that old code will compile without errors and the same meaning when compiled according to newer C specifications.
I just read, that the // has different meanings in C89 and C99. Apart from this feature, are C99 and C11 supersets of C89?
If the answer to Question 1 is no, then I got another 2 questions.
How to 'port' old code to the new versions? Is there a document which explains this procedure?
Is it better to use C89 or C99 or C11?
Thanks for your help in advance.
EDIT: changed ISO C to C89.
Are the newer versions of C i.e. C99, C11 supersets of older C versions?
There are many differences, big and subtle both. Most changes were adding of new features and libraries. C99 and C11 are not supersets of C90, although there was a great deal of effort made to ensure backwards-compatibility. C11 is however mostly a superset of C99.
Still older code could break when porting from C90, in case the code was written poorly. Particularly various forms of "implicit int" and implicit function declarations were banned from the language with C99. C11 banned the gets function.
A complete list of changes can be found in the C11 draft page 13 of the pdf, where "the third edition" refers to C11 and "the second edition" refers to C99.
How to 'port' old code to the new versions? Is there a document which explains this procedure?
I'm not aware about any such document. If you have good code, porting is easy. If you have rotten code, porting will be painful. As for the actual porting procedure, it will be easy if you know the basics of C99 and C11, so the best bet is to find a reliable source of learning which addresses C99/C11.
Porting from C99 to C11 should be effortless.
Is it better to use C89 or C99 or C11?
It is best to use C11 as that is the current standard. C99 and C11 both contained various "language bug fixes" and introduced new, useful features.
In most ways, the later versions are supersets of earlier versions. While C89 code which tries to use restrict as an identifier will be broken by C99's addition of a reserved word with the same spelling, and while there are some situations in which code which is contrived to exploit some corner cases with a parser will be treated differently in the two languages, most of those are unlikely to be important.
A more important issue, however, has to do with memory aliasing. C89 include
rules which restrict the types of pointers that can be used to access certain
objects. Since the rules would have made functions like malloc() useless if
they applied, as written, to the objects created thereby, most programmers and
compiler writers alike treated the rules as applying only in limited cases (I doubt C89 would have been widely accepted if people didn't believe the rules applied only narrowly). C99 claimed to "clarify" the rules, but its new rules are much more expansive in effect than contemporaneous interpretations of the old ones, breaking a lot of code that would have had defined behavior under those common interpretations of C89, and even some code which would have been unambiguously defined under C89 has no practical C99 equivalent.
In C89, for example, memcpy could be used to copy the bit pattern associated with an object of any type to an object of any other type with the same size, in any cases where that bit pattern would represent a valid value in the destination type. C99 added language which allows compilers to behave in arbitrary fashion if memcpy is used to copy an object of some type T to storage with no declared type (e.g. storage returned from malloc), and that storage is then read as object of a type that isn't alias-compatible with T--even if the bit pattern of the original object would have a valid meaning in the new type. Further, the rules that apply to memcpy also apply in cases where an object is copied as an array of character type--without clarifying exactly what that means--so it's not clear exactly what code would need to do to achieve behavior matching the C89 memcpy.
On many compilers such issues can be resolved by adding a -fno-strict-aliasing option to the command line. Note that specifying C89 mode may not be sufficient, since compilers writers often use the same memory semantics regardless of which standard they're supposed to be implementing.
The newer versions of C are definitely not strict super-sets of the older versions.
Generally speaking, this sort of problem only arises when upgrading the compiler or switching compiler vendors. You must plan for a lot of minor touches of the code to deal with this event. A lot of the time, the new compiler will catch issues that were left undiagnosed by the old compiler, in addition to minor incompatibilities that may have occurred by enforcing the newer C standard.
If it is possible to determine, the best standard to use is the one that the compiler supports the best.
The C11 wikipedia article has a lengthy description of how it differs from C99.
In general, newer versions of the standard are backward compatible.
If not, you can compile different .c files to different .o files, using different standards, and link them together. That does work.
Generally you should use the newest standard available for new code and, if it's easy, fix code that the new standard does break instead of using the hacky solution above.
EDIT: Unless you're dealing with potentially undefined behavior.
This question is about the same subject as strdup or _strdup? but it is not the same. That question asks how to work around MS's renamings, this question asks why they did it in the first place.
For some reason Microsoft has deprecated a whole slew of POSIX C functions and replaced them with _-prefixed variants. One example among many is isatty:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/posix-isatty
This POSIX function is deprecated. Use the ISO C++ conformant _isatty instead.
What exactly is ISO C++ conformant about _isatty? It appears to me that the MSDN help is totally wrong.
The other questions answer explained how to deal with this problem. You add the _CRT_NONSTDC_NO_DEPRECATE define. Fine. But I want to know what Microsoft's thinking is. What was their point in renaming and deprecating functions? Was it just to make C programmers lives even harder?
The fact that _isatty() is ISO C++ conformant makes sense if you think of it like a language-lawyer.
Under ISO C++, the compiler is only supposed to provide the functions in the standard (at least for the standard headers) -- they're not allowed to freely add extra functions, because it could conflict with functions declared in the code being compiled. Since isatty() is not listed in the standard, providing an isatty() function in a standard header would not be ISO C++ compliant.
However, the standard does allow the compiler to provide any function it wants as long as the function starts with a single underscore. So -- language lawyer time -- _isatty() is compliant with ISO C++.
I believe that's the logic that leads to the error message being phrased the way it is.
(Now, in this specific case, isatty() was provided in io.h, which is not actually a C++ standard header, so technically Microsoft could provide it and still claim to be standards-conformant. But, they had other non-compliant functions like strcmpi() in string.h, which is a standard header. So, for consistency, they deprecated all of the POSIX functions the same way and they all report the same error message.)
Names starting with an underscore, like _isatty are reserved for the implementation. They do not have a meaning defined by ISO C++, nor by ISO C, and you can't use them for your own purposes. So Microsoft is entirely right in using this prefix, and POSIX is actually wrong.
C++ has namespaces, so a hypthetical "Posix C++" could define namespace posix, but POSIX has essentially become fossilized - no new innovation in that area.
isatty & co., although POSIX, are not standard C, and are provided as "extensions" by the VC++ runtime1.
As such, they are prefixed with an underscore supposedly to avoid name clashes - as names starting with an underscore followed by a lowercase letter are reserved for implementation-defined stuff at global scope. So if, for example, you wanted to use an actual POSIX compatibility layer providing its own versions of these functions, they wouldn't have to fight with the VC++-provided "fake" ones for the non-underscored names.
Extensions which have no presumption to be actually POSIX-compliant, by the way.
I'm looking into learning C basics and syntax before beginning Systems Programming next month. When doing some reading, I came across the C89/99 standards. According to Wikipedia,
C99 introduced several new features,
including inline functions, several
new data types (including long long
int and a complex type to represent
complex numbers), variable-length
arrays, support for variadic macros
(macros of variable arity) and support
for one-line comments beginning with
//, as in BCPL or C++. Many of these
had already been implemented as
extensions in several C compilers.
C99 is for the most part backward
compatible with C90, but is stricter
in some ways; in particular, a
declaration that lacks a type
specifier no longer has int
implicitly assumed. A standard macro
STDC_VERSION is defined with value 199901L to indicate that C99 support
is available. GCC, Sun Studio and
other compilers now support many or
all of the new features of C99.
I borrowed a copy of K&R, 2nd Edition, and it uses the C89 standard. For a student, does the use of C89 invalidate some subjects covered in K&R, and if so, what should I look out for?
There is no reason to learn C89 or C90 over C99- it's been very literally superseded. It's easy to find C99 compilers and there's no reason whatsoever to learn an earlier standard.
This doesn't mean that your professor won't force C89 upon you. From the various questions posted here marked homework, I get the feeling that many, many C (and, unfortunately, C++) courses haven't moved on since C89.
From the perspective of a starting student, the chances are that you won't really notice the difference- there's plenty of C that's both C99 and C89/90 to be covered.
Use the C99 standard, it's newer and has more features. Particularly useful may be the bool type in <stdbool.h> and the int32_t etc. family of types; the latter prevents a lot of unportable code that relies on ints having a certain size. AFAIK, it doesn't invalidate K&R, though some example programs may be written in a slightly different style now.
Note that some compilers still don't support C99 properly. I believe that GCC still requires the use of a -std=c99 flag to enable it; many Unix/Linux systems have a c99 command that wraps GCC and enables C99.
The same goes for many university professors. I surprised mine by handing in a program that used bool in my freshman year. He'd never heard of that type in C :)
While I generally agree with the others, it is worth noting that K&R is such a good book that it might be worth learning C from it and then updating your knowledge as you read about the C99 standard.
If you are at student level you probably won't even notice the differences.
Yes, it's a bit odd that you can get a loud consensus that K&R is a great C book, and also a loud consensus that C99 is the correct/current/best version of C. The two positions are incompatible - even if K&R is the best book available to learn "C meaning C99", that just implies the rest are rubbish, or are also hopelessly outdated.
I would advise learning and using C99, but keeping an eye to C89 as you do so. If you use a compiler that has both C89 and C99 compliant modes, then you can write a few bits of C89 just to get an idea of the differences. Then if you ever need to write some code intended to be portable to places that C99 doesn't go, you'll know what to do. If you never have to write any such code, then you've wasted perhaps a day.
Writing C89 properly is actually surprisingly difficult, because getting hold of a copy of the C89 standard is difficult. So, C99 if you can, C89 if for some odd reason you have to, and have some awareness what the difference is. Maybe use K&R to cover the very basics, but get a look at some idiomatic C99 as soon as possible.
As for specific issues to be aware of when reading K&R: there's a list of major changes in the foreword of the standard (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf), although the details aren't laid out there. A lot of them are new features added to C99, so it's not that K&R is wrong, it just may not always use the best tools for a given job. Some of them are quite fiddly things where you should probably consult the standard if you need the details anyway. The rest are things removed from C89, that usually a C99 compiler will tell you about as and when you try to use them.
As a student, that doesn't influence you so much. But if possible, you should find a new C book which covers C99
The term "C89" describes two very different languages:
The language that programmers in 1989 thought the Committee was describing in places where the Standard was ambiguous, and which supported features that were common in pre-existing implementations.
The language that the Committee has since decided that it wanted to have described, which threw compatibility with existing functionality out the
window.
C99 "clarifies" ambiguous parts of the standard by saying that they meant
to have the Standard interpreted in a way that would have broken a substantial
fraction of existing code and made it impossible to perform many tasks as
efficiently as they had been performed in C before 1989.
The right language to program in, for many applications, would be the superset of pre-Standard C, C89, C99, and C11. It's important, however, that anyone programming in that language be clear that they're using that language rather than a shrinking subset which favors speed over reliability.
While I think it's beneficial to know which features are more recent and less likely to be supported by obscure (or intentionally-broken, like MSVC) compilers, there are a few C99 features that you should absolutely use:
snprintf: This is the definitive function for safe and clean string assembly in C. If your compiler is missing it, you can either replace the whole printf subsystem (probably a good idea since most implementations with missing snprintf are also full of (often intentional) bugs in printf behavior), or wrap tmpfile/fprintf/fread/fclose.
stdint.h: If you need fixed-size types (16/32/64-bit), use the standard names int16_t, uint16_t, int32_t, etc. Do not invent your own, and absolutely don't use system-specific ones like INT64 or u32. It just makes your code ugly and hard to integrate and reuse. If your compiler is missing stdint.h, just drop in your own to define the types in terms of the correct-for-your-platform types.
Specifically uint64_t, in place of int foo[2]; or struct { int lo, int hi; } foo; or other hideous legacy hacks to work with 64-bit numbers. Any sane compiler even without C99 support has its own 64-bit types you can use to define int64_t and uint64_t.
I'm used to old-style C and and have just recently started to explore c99 features. I've just one question: Will my program compile successfully if I use c99 in my program, the c99 flag with gcc and link it with prior c99 libraries?
So, should I stick to old C89 or evolve?
I believe that they are compatible in that respect. That is as long as the stuff that you are compiling against doesn't step on any of the new goodies. For instance, if the old code contains enum bool { false, true }; then you are in trouble. As a similar dinosaur, I am slowly embracing the wonderful new world of C99. After all, it has only been out there lurking for about 10 years now ;)
You should evolve. Thanks for listening :-)
Actually, I'll expand on that.
You're right that C99 has been around for quite a while. You should (in my opinion) be using that standard for anything other than legacy code (where you just fix bugs rather than add new features). It's probably not worth it for legacy code but you should (as with all business decisions) do your own cost/benefit analysis.
I'm already ensuring my new code is compatible with C1x - while I'm not using any of the new features yet, I try to make sure it won't break.
As to what code to look out for, the authors of the standards take backward compatibility very seriously. Their job was not ever to design a new language, it was to codify existing practices.
The phase they're in at the moment allows them some more latitude in updating the language but they still follow the Hippocratic oath in terms of their output: "first of all, do no harm".
Generally, if your code is broken with a new standard, the compiler is forced to tell you. So simply compiling your code base will be an excellent start. However, if you read the C99 rationale document, you'll see the phrase "quiet change" appear - this is what you need to watch out for.
These are behavioral changes in the compiler that you don't need to be informed about and may be the source of much angst and gnashing of teeth if your application starts acting strange. Don't worry about the "quiet change in c89" bits - if they were a problerm, you would have already been bitten by them.
That document, by the way, is an excellent read to understand why the actual standard says what it says.
Some C89 features are not valid C99
Arguably, those features exist only for historical reasons, and should not be used in modern C89 code, but they do exist.
The C99 N1256 standard draft foreword paragraph 5 compares C99 to older revisions, and is a good place to start searching for those incompatibilities, even though it has by far more extensions than restrictions.
Implicit int return and variable types
Mentioned by Lutz in a comment, e.g. the following are valid C89:
static i;
f() { return 1; }
but not C99, in which you have to write:
static int i;
int f() { return 1; }
This also precludes calling functions without prototypes in C99: Are prototypes required for all functions in C89, C90 or C99?
n1256 says:
remove implicit int
Return without expression for non void function
Valid C89, invalid C99:
int f() { return; }
I think in C89 it returns an implementation defined value. n1256 says:
return without expression not permitted in function that returns a value
Integer division with negative operand
C89: rounds to an implementation defined direction
C99: rounds to 0
So if your compiler rounded to -inf, and you relied on that implementation defined behavior, your compiler is now forced to break your code on C99.
https://stackoverflow.com/a/3604984/895245
n1256 says:
reliable integer division
Windows compatibility
One major practical concern is being able to compile in Windows, since Microsoft does not intend to implement C99 fully too soon.
This is for example why libgit2 limits allowed C99 features.
Respectfully: Try it and find out. :-)
Though, keep in mind that even if you need to fix a few minior compiling differences, moving up is probably worth it.
If you don't violate the explicit C99 features,a c90 code will work fine c99 flag with another prior c99 libraries.
But there are some dos based libraries in C89 like ,, that will certainly not work.
C99 is much flexible so feel free to migrate :-)
The calling conventions between C libraries hasn't changed in ages, and in fact, I'm not sure it ever has.
Operating systems at this point rely heavily on the C calling conventions since the C APIs tend to be the glue between the pieces of the OS.
So, basically the answer is "Yes, the binaries will be backwards compatible. No, naturally, code using C99 features can't later be compiled with a non-C99 compiler."
It's intended to be backwards compatible. It formalizes extensions that many vendors have already implemented. It's possible, maybe even probable, that a well written program won't have any issues when compiling with C99.
In my experience, recompiling some modules and not others to save time... wastes a lot of time. Usually there is some easily overlooked detail that needs the new compiler to make it all compatible.
There are a few parts of the C89 Standard which are ambiguously written, and depending upon how one interprets the rule about types of pointers and the objects they're accessing, the Standard may be viewed as describing one of two very different languages--one of which is semantically much more powerful and consequently usable in a wider range of fields, and one of which allows more opportunities for compiler-based optimization. The C99 Standard "clarified" the rule to make clear that it makes no effort to mandate compatibility with the former language, even though it was overwhelmingly favored in many fields; it also treats as undefined some things that were defined in C89 but only because the C89 rules weren't written precisely enough to forbid them (e.g. the use of memcpy for type punning in cases where the destination has heap duration).
C99 may thus be compatible with the language that its authors thought was described by C89, but is not compatible with the language that was processed by most C89 compilers throughout the 1990s.