What happened to antlr3NewAsciiStringCopyStream in ANTLR 3.4 (C runtime)?

The C runtime distribution of ANTLR 3.2 used to have a function declared as
ANTLR3_API pANTLR3_INPUT_STREAM antlr3NewAsciiStringCopyStream
(pANTLR3_UINT8 inString, ANTLR3_UINT32 size, pANTLR3_UINT8 name);
in include/antlr3defs.h. There were also a few similar functions, such as antlr3NewAsciiStringInPlaceStream, antlr3NewUCS2StringInPlaceStream and so forth.
But in the 3.4 version these functions seem to have gone. They are neither declared in any of the .h files, nor are definitions compiled into the library.
I checked the release notes for 3.3 and 3.4 and the FAQ, but I couldn't find any mention of this. On the contrary, the FAQ recommends (see [2] below):
How to get a pANTLR3_INPUT_STREAM from a std::string (or char* variable)?
Functions
[1]pANTLR3_INPUT_STREAM
[2]antlr3NewAsciiStringCopyStream
([3]pANTLR3_UINT8 inString, [4]ANTLR3_UINT32 size, [5]pANTLR3_UINT8 name)
Create an ASCII string stream as input to ANTLR 3,
copying the input string.
I have legacy code that uses ANTLR 3 and this function, and I can't easily switch to ANTLR 4. I could continue using the 3.2 version or one of the other functions listed above, but it would be good to know what happened, and how to best handle this.

I came across this problem just recently too. Seemingly, you can use the following function to get the same functionality:
pANTLR3_INPUT_STREAM antlr3StringStreamNew (pANTLR3_UINT8 data, ANTLR3_UINT32 encoding, ANTLR3_UINT32 size, pANTLR3_UINT8 name);
For encoding, you'd use ANTLR3_ENC_8BIT (or otherwise)
Tom.

Related

Meaning of notation (_()) in C code [duplicate]

I'm here looking at some C source code and I've found this:
fprintf(stderr, _("Try `%s --help' for more information.\n"), command);
I already saw the underscore when I had a look at wxWidgets, and I read it's used for internationalization. I found it really horrible (the least intuitive name ever), but I thought it was just another weird wxWidgets convention.
Now I find it again in some Alsa source. Does anyone know where it comes from?
It comes from GNU gettext, a package designed to ease the internationalization process. The _() function is simply a string wrapper. This function basically replaces the given string at runtime with a translation in the system's language, if available (i.e. if they shipped a .mo file for this language with the program).
It comes from gettext. Originally, "internationalization" was thought too long to type each time you needed a string internationalized. So programmers created the shortcut i18n (because there are 18 letters in between the 'i' and the 'n' in internationalization) and you may see source code out there using that. Apparently, though, i18n was still too long, so now it's just an underscore.
That would be from gettext
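To make the wrapper concrete, here is a minimal sketch of how a program typically defines _() on top of gettext(). The helper name format_help_hint is made up for illustration; gettext() and <libintl.h> are the real GNU gettext interface. When no translation catalog is found, gettext() returns its argument unchanged, so the English text is used.

```c
#include <libintl.h>
#include <stdio.h>

/* Conventional shorthand: look up the translation of a string literal. */
#define _(msgid) gettext(msgid)

/* Hypothetical helper: formats the (possibly translated) hint into buf.
   Returns whatever snprintf() returns. */
int format_help_hint(char *buf, size_t size, const char *command)
{
    return snprintf(buf, size,
                    _("Try `%s --help' for more information.\n"), command);
}
```

A real program would normally also call setlocale(LC_ALL, ""), bindtextdomain() and textdomain() at startup so that the shipped .mo catalogs can be found.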

How to walk a directory in C

I am using glib in my application, and I see there are convenience wrappers in glib for C's remove, unlink and rmdir. But these only work on a single file or directory at a time.
As far as I can see, neither the C standard nor glib include any sort of recursive directory walk functionality. Nor do I see any specific way to delete an entire directory tree at once, as with rm -rf.
For what I'm doing, I'm not worried about any complications like permissions, symlinks back up the tree (infinite recursion), or anything that would rule out a very naive implementation... so I am not averse to writing my own function for it.
However, I'm curious if this functionality is out there somewhere in the standard libraries gtk or glib (or in some other easily reused C library) already and I just haven't stumbled on it. Googling this topic generates a lot of false leads.
Otherwise my plan is to use this type of algorithm:
dir_walk(char* path, int (*callback)(const char*)) {
    if (is_dir(path)) {
        entries = get_entries(path);
        for (entry in entries) { dir_walk(entry, callback); }
    }
    /* called after the children, so a directory is already empty when removed */
    callback(path);
}
dir_walk("/home/user/trash", remove);
Obviously I would build in some error handling and the like to abort the process as soon as a fatal error is encountered.
Have you looked at <dirent.h>? AFAIK this belongs to the POSIX specification, which should be part of the standard library of most, if not all C compilers. See e.g. this <dirent.h> reference (Single UNIX specification Version 2 by the Open Group).
P.S., before someone comments on this: No, this does not offer recursive directory traversal. But then I think this is best implemented by the developer; requirements can differ quite a lot, so one-size-fits-all recursive traversal function would have to be very powerful. (E.g.: Are symlinks followed up? Should recursion depth be limited? etc.)
You can use GFileEnumerator if you want to do it with glib.
Several platforms include ftw and nftw: "(new) file tree walk". Checking the man page on an iMac shows that these are legacy, and new users should prefer fts. Portability may be an issue with either of these choices.
Standard C libraries are meant to provide primitive functionality. What you are talking about is composite behavior. You can easily implement it using the low level features present in your API of choice -- take a look at this tutorial.
Note that the "convenience wrappers" you mention for remove(), unlink() and rmdir(), assuming you mean the ones declared in <glib/gstdio.h>, are not really "convenience wrappers". What is the convenience in prefixing totally standard functions with a "g_"? (And note that I say this even though I am the one who introduced them in the first place.)
The only reason these wrappers exist is for file name issues on Windows, where these wrappers actually consist of real code; they take file name arguments in Unicode, encoded in UTF-8. The corresponding "unwrapped" Microsoft C library functions take file names in system codepage.
If you aren't specifically writing code intended to be portable to Windows, there is no reason to use the g_remove() etc wrappers.

Customizable implementation of sprintf()

Can anyone point me to a source code file or to a package that has a good, reusable implementation of sprintf() in C which I can customize as per my own need?
An explanation of why I need it: strings are not null-terminated in my code (binary compatible). Therefore sprintf("%s") is useless unless I fix the code to understand how to render such a string.
Thanks to quinmars for pointing out that there is a way to print a string through %s without it being null-terminated. Though it solves the requirement right now, I shall eventually need the sprintf (or snprintf) implementation for higher-level functions which use variants. Of the others mentioned till now, it seems to me that the SQLite implementation is the best. Thanks to Doug Currie for pointing it out.
I haven't tried it, because I don't have a compiler here, but reading the man page, it looks like you can pass a precision for '%s':
... If a precision is given, no null character
need be present; if the precision is not specified, or is greater
than the size of the array, the array must contain a terminating
NUL character.
So have you tried to do something like that?
snprintf(buffer, sizeof(buffer), "%.*s", bstring_len, bstring);
As I said, I haven't tested it, and if it works, it of course works only if you have no '\0' byte inside the string.
EDIT: I've tested it now and it works!
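For reference, a small self-contained sketch of the precision trick. The struct byte_string and the function format_byte_string are hypothetical names standing in for whatever counted-string type the code uses. One detail worth noting: the argument consumed by the * in "%.*s" must be an int, so a size_t length needs a cast.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counted string: data is NOT null-terminated. */
struct byte_string {
    const char *data;
    size_t len;
};

/* Formats s into buf using a precision, so snprintf() never reads past
   s->len bytes. Returns what snprintf() returns. */
int format_byte_string(char *buf, size_t size, const struct byte_string *s)
{
    /* The precision argument must be an int, so clamp and cast. */
    int precision = s->len > (size_t)INT_MAX ? INT_MAX : (int)s->len;
    return snprintf(buf, size, "%.*s", precision, s->data);
}
```

As the answer says, output still stops early if the data contains an embedded '\0' byte.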
You should really be looking for snprintf (sprintf respecting output buffer size); google suggests http://www.ijs.si/software/snprintf/.
There is a nice public domain implementation as part of SQLite here.
I agree with Dickon Reed that you want snprintf, which is included in the SQLite version.
I have used this guy's source code.
It is small, understandable and easy to modify (as opposed to glib & libc).
According to this link- http://www.programmingforums.org/thread12049.html :
If you have the full gcc distribution,
the source for the C library (glib or
libc) is one of the subdirectories
that comes for the ride.
So you can look it up there.
I don't know how helpful that will be...
The only reason I can think of for wanting to modify sprintf is to extend it, and the only reason for extending it is when you're on your way to writing some sort of parser.
If you are looking to create a parser for something like a coding language, XML, or really anything with a syntax, I suggest you look into Lexers and Parser Generators (2 of the most commonly used ones are Flex and Bison) which can pretty much write the extremely complex code for parsers for you (though the tools themselves are somewhat complex).
Otherwise, you can find the code for it in the source files that are included with Visual Studio (at least 2005 and 2008, others might have it, but those 2 definitely do).
snprintf from glibc is customizable via hook/handler mechanism
Just an idea...
Example:
#include <stdio.h>
#include <string.h>
#include <stdarg.h>

/* Note: redefining a standard library function is not guaranteed to work
   on every toolchain. */
int sprintf(char *str, const char *format, ...)
{
    /* Here you can redefine your input before continuing, to comply with the standard inputs */
    va_list args;
    int written;

    va_start(args, format);
    written = vsprintf(str, format, args); /* This still uses the standard formatting */
    va_end(args);
    return written; /* Before returning you can redefine it back if you want... */
}

int main(void)
{
    char h[20];
    sprintf(h, "hei %d ", 10);
    printf("t %s\n", h);
    getchar();
    return 0;
}
Look at Hanson's C Interfaces: Implementations and Techniques. It is an interesting book in that it is written using Knuth's Literate Programming techniques, and it specifically includes an extensible formatted I/O interface based on snprintf().
There is a small implementation originally authored by Marco Paland, which I have been maintaining, fixing many bugs and adding missing functionality, in this repository: eyalroz/printf. It's ~1170 lines of code for full C99 sprintf/vsprintf/etc., compared to SQLite's 3993 (although SQLite could use a lot less; it includes this sqliteint.h header with a lot of unrelated stuff). Note that %a is not yet supported, nor are sub-normal doubles.

Do you use the TR 24731 'safe' functions? [closed]

The ISO C committee (ISO/IEC JTC1/SC22/WG14) has published TR 24731-1 and is working on TR 24731-2:
TR 24731-1: Extensions to the C Library Part I: Bounds-checking interfaces
WG14 is working on a TR on safer C library functions. This TR is oriented towards modifying existing programs, often by adding an extra parameter with the buffer length. The latest draft is in document N1225. A rationale is in document N1173. This is to become a Technical Report type 2.
TR 24731-2: Extensions to the C Library - Part II: Dynamic allocation functions
WG14 is working on a TR on safer C library functions. This TR is oriented towards new programs using dynamic allocation instead of an extra parameter for the buffer length. The latest draft is in document N1337. This is to become a Technical Report type 2.
Questions
Do you use a library or compiler with support for the TR24731-1 functions?
If so, which compiler or library and on which platform(s)?
Did you uncover any bugs as a result of fixing your code to use these functions?
Which functions provide the most value?
Are there any that provide no value or negative value?
Are you planning to use the library in the future?
Are you tracking the TR24731-2 work at all?
I have been a vocal critic of these TRs since their inception (when it was a single TR) and would never use them in any of my software. They mask symptoms instead of addressing causes and it is my opinion that if anything they will have a negative impact on software design as they provide a false sense of security instead of promoting existing practices that can accomplish the same goals much more effectively. I am not alone, in fact I am not aware of a single major proponent outside of the committee developing these TRs.
I use glibc and as such know that I will be spared having to deal with this nonsense, as Ulrich Drepper, lead maintainer for glibc, said about the topic:
The proposed safe(r) ISO C library
fails to address to issue completely.
... Proposing to make the life of a
programmer even harder is not going to
help. But this is exactly what is
proposed. ... They all require more
work to be done or are just plain
silly.
He goes on to detail problems with a number of the proposed functions and has elsewhere indicated that glibc would never support this.
The Austin Group (responsible for maintaining POSIX) provided a very critical review of the TR, their comments and the committee responses available here. The Austin Group review does a very good job detailing many of the problems with the TR so I won't go into individual details here.
So the bottom line is: I don't use an implementation that supports or will support this, I don't plan on ever using these functions, and I see no positive value in the TR. I personally believe that the only reason the TR is still alive in any form is because it is being pushed hard by Microsoft, who has recently proved very capable of getting things rammed through standards committees despite widespread opposition. If these functions are ever standardized I don't think they will ever become widely used, as the proposal has been around for a few years now and has failed to garner any real community support.
Direct answer to question
I like Robert's answer, but I also have some views on the questions I raised.
Do you use a library or compiler with support for the TR24731-1 functions?
No, I don't.
If so, which compiler or library and on which platform(s)?
I believe the functions are provided by MS Visual Studio (MS VC++ 2008 Edition, for example), and there are warnings to encourage you to use them.
Did you uncover any bugs as a result of fixing your code to use these functions?
Not yet. And I don't expect to uncover many in my code. Some of the other code I work with - maybe. But I've yet to be convinced.
Which functions provide the most value?
I like the fact that the printf_s() family of functions do not accept the '%n' format specifier.
Are there any that provide no value or negative value?
The tmpfile_s() and tmpnam_s() functions are a horrible disappointment. They really needed to work more like mkstemp() which both creates the file and opens it to ensure there is no TOCTOU (time-of-check, time-of-use) vulnerability. As it stands, those two provide very little value.
I also think that strerrorlen_s() provides very little value.
Are you planning to use the library in the future?
I am in two minds about it. I started work on a library that would implement the capabilities of TR 24731 over a standard C library, but got caught by the amount of unit testing needed to demonstrate that it is working correctly. I'm not sure whether to continue that. I have some code that I want to port to Windows (mainly out of a perverse desire to provide support on all platforms - it's been working on Unix derivatives for a couple of decades now). Unfortunately, to get it to compile without warnings from the MSVC compilers, I have to plaster the code with stuff to prevent MSVC wittering about me using the perfectly reliable (when carefully used) standard C library functions. And that is not appetizing. It is bad enough that I have to deal with most of two decades worth of a system that has developed over that period; having to deal with someone's idea of fun (making people adopt TR 24731 when they don't need to) is annoying. That was partly why I started the library development - to allow me to use the same interfaces on Unix and Windows. But I'm not sure what I'll do from here.
Are you tracking the TR24731-2 work at all?
I'd not been tracking it until I went to the standards site while collecting the data for the question. The asprintf() and vasprintf() functions are probably valuable; I'd use those. I'm not certain about the memory stream I/O functions. Having strdup() standardized at the C level would be a huge step forward. This seems less controversial to me than the part 1 (bounds checking) interfaces.
Overall, I'm not convinced by part 1 'Bounds-Checking Interfaces'. The material in the draft of part 2 'Dynamic Allocation Functions' is better.
If it were up to me, I'd move somewhat along the lines of part 1, but I'd also revise the interfaces in the C99 standard C library that return a char * to the start of the string (e.g. strcpy() and strcat()) so that instead of returning a pointer to the start, they'd return a pointer to the null byte at the end of the new string. This would make some common idioms (such as repeatedly concatenating strings onto the end of another) more efficient because it would make it trivial to avoid the quadratic behaviour exhibited by code that repeatedly uses strcat(). The replacements would all ensure null-termination of output strings, like the TR24731 versions do. I'm not wholly averse to the idea of the checking interface, nor to the exception handling functions. It's a tricky business.
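A sketch of the kind of interface meant here, under the assumption that the caller tracks the end of the buffer. The name append_str is hypothetical and not a proposed or standard function. (POSIX.1-2008 does have stpcpy(), which returns a pointer to the terminating null byte but does no bounds checking.)

```c
/* Hypothetical helper: copies src to dst without writing at or past
   dst_end, always null-terminates, and returns a pointer to the
   terminating null byte so the next append can start there.
   Requires dst < dst_end. */
char *append_str(char *dst, const char *dst_end, const char *src)
{
    while (*src != '\0' && dst + 1 < dst_end)
        *dst++ = *src++;
    *dst = '\0';
    return dst;
}
```

Chained calls such as p = append_str(p, end, "foo"); p = append_str(p, end, "bar"); never rescan the text already written, which is what avoids the quadratic cost of repeated strcat().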
Microsoft's implementation is not the same as the standard specification
Update (2011-05-08)
See also this question. Sadly, and fatally to the usefulness of the TR24731 functions, the definitions of some of the functions differ between the Microsoft implementation and the standard, rendering them useless (to me). My answer there cites vsnprintf_s().
For example, TR 24731-1 says the interface to vsnprintf_s() is:
#define __STDC_WANT_LIB_EXT1__ 1
#include <stdarg.h>
#include <stdio.h>
int vsnprintf_s(char * restrict s, rsize_t n,
const char * restrict format, va_list arg);
Unfortunately, MSDN says the interface to vsnprintf_s() is:
int vsnprintf_s(
char *buffer,
size_t sizeOfBuffer,
size_t count,
const char *format,
va_list argptr
);
Parameters
buffer - Storage location for output.
sizeOfBuffer - The size of the buffer for output.
count - Maximum number of characters to write (not including the terminating null), or _TRUNCATE.
format - Format specification.
argptr - Pointer to list of arguments.
Note that this is not simply a matter of type mapping: the number of fixed arguments is different, and therefore irreconcilable. It is also unclear to me (and presumably to the standards committee too) what benefit there is to having both 'sizeOfBuffer' and 'count'; it looks like the same information twice (or, at least, code will commonly be written with the same value for both parameters).
Similarly, there are also problems with scanf_s() and its relatives. Microsoft says that the type of the buffer length parameter is unsigned (explicitly stating 'The size parameter is of type unsigned, not size_t'). In contrast, in Annex K, the size parameter is of type rsize_t, which is the restricted variant of size_t (rsize_t is another name for size_t, but RSIZE_MAX is smaller than SIZE_MAX). So, again, the code calling scanf_s() would have to be written differently for Microsoft C and Standard C.
Originally, I was planning to use the 'safe' functions as a way of getting some code to compile cleanly on Windows as well as Unix, without needing to write conditional code. Since this is defeated because the Microsoft and ISO functions are not always the same, it is pretty much time to give up.
Changes in Microsoft's vsnprintf() in Visual Studio 2015
In the Visual Studio 2015 documentation for vsnprintf(), it notes that the interface has changed:
Beginning with the UCRT in Visual Studio 2015 and Windows 10, vsnprintf is no longer identical to _vsnprintf. The vsnprintf function complies with the C99 standard; _vsnprintf is retained for backward compatibility.
However, the Microsoft interface for vsnprintf_s() has not changed.
Other examples of differences between Microsoft and Annex K
The C11 standard variant of localtime_s() is defined in ISO/IEC 9899:2011 Annex K.3.8.2.4 as:
struct tm *localtime_s(const time_t * restrict timer,
struct tm * restrict result);
compared with the MSDN variant of localtime_s() defined as:
errno_t localtime_s(struct tm* _tm, const time_t *time);
and the POSIX variant localtime_r() defined as:
struct tm *localtime_r(const time_t *restrict timer,
struct tm *restrict result);
The C11 standard and POSIX functions are equivalent apart from name. The Microsoft function is different in interface even though it shares a name with the C11 standard.
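In practice, code that has to build on both Unix and Windows tends to hide this difference behind its own wrapper. A minimal sketch, assuming _MSC_VER is an adequate test for the Microsoft library; the name portable_localtime is made up, and the Annex K variant is deliberately left out:

```c
#include <time.h>

/* Hypothetical wrapper: fills *out with the local time for *t.
   Returns 0 on success, -1 on failure. */
int portable_localtime(const time_t *t, struct tm *out)
{
#if defined(_MSC_VER)
    /* Microsoft: result first, returns an errno_t (0 on success). */
    return localtime_s(out, t) == 0 ? 0 : -1;
#else
    /* POSIX: time first, returns the result pointer or NULL. */
    return localtime_r(t, out) != NULL ? 0 : -1;
#endif
}
```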
Another example of differences is Microsoft's strtok_s() and Annex K's strtok_s():
char *strtok_s(char *strToken, const char *strDelimit, char **context);
vs:
char *strtok_s(char * restrict s1, rsize_t * restrict s1max, const char * restrict s2, char ** restrict ptr);
Note that the Microsoft variant has 3 arguments whereas the Annex K variant has 4. This means that the argument list to Microsoft's strtok_s() is compatible with POSIX's strtok_r() — so calls to these are effectively interchangeable if you change the function name (e.g. by a macro) — but the Standard C (Annex K) version is different from both with the extra argument.
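The macro approach mentioned above might look like the following sketch. portable_strtok and count_tokens are hypothetical names, and the _MSC_VER check is an assumption about how the Microsoft library is detected.

```c
#include <stddef.h>
#include <string.h>

/* Same three arguments on both sides, so a rename is all that is needed. */
#if defined(_MSC_VER)
#define portable_strtok(str, delims, ctx) strtok_s((str), (delims), (ctx))
#else
#define portable_strtok(str, delims, ctx) strtok_r((str), (delims), (ctx))
#endif

/* Counts the tokens in text; text is modified in place, as with strtok(). */
size_t count_tokens(char *text, const char *delims)
{
    size_t count = 0;
    char *context = NULL;
    char *token;

    for (token = portable_strtok(text, delims, &context);
         token != NULL;
         token = portable_strtok(NULL, delims, &context))
        count++;
    return count;
}
```

The four-argument Annex K strtok_s() cannot be slotted into this macro without also tracking the remaining length.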
The question Different declarations of qsort_r() on Mac and Linux has an answer that also discusses qsort_s() as defined by Microsoft and qsort_s() as defined by TR24731-1 — again, the interfaces are different.
ISO/IEC 9899:2011 — C11 Standard
The C11 standard (December 2010 Draft; you could at one time obtain a PDF copy of the definitive standard, ISO/IEC 9899:2011, from the ANSI web store for 30 USD) does have the TR24731-1 functions in it as an optional part of the standard. They are defined in Annex K (Bounds-checking Interfaces), which is 'normative' rather than 'informational', but it is optional.
The C11 standard does not have the TR24731-2 functions in it — which is sad because the vasprintf() function and its relatives could be really useful.
Quick summary:
C11 contains TR24731-1
C11 does not contain TR24731-2
C18 is the same as C11 w.r.t TR24731.
Proposal to remove Annex K from the successor to C11
Deduplicator pointed out in a comment to another question that there is a proposal before the ISO C standard committee (ISO/IEC JTC1/SC22/WG14)
N1967 Field Experience with Annex K — Bounds Checking Interfaces
It contains references to some of the extant implementations of the Annex K functions — none of them widely used (but you can find them via the document if you are interested).
The document ends with the recommendation:
Therefore, we propose that Annex K be either removed from the next revision of the C standard, or deprecated and then removed.
I support that recommendation.
The C18 standard did not alter the status of Annex K. There is a paper N2336 advocating for making some changes to Annex K, repairing its defects rather than removing it altogether.
Ok, now a stand for TR24731-2:
Yes, I've used asprintf()/vasprintf() ever since I've seen them in glibc, and yes I am a very strong advocate of them.
Why?
Because they deliver precisely what I need over and over again: A powerful, flexible, safe and (relatively) easy to use way to format any text into a freshly allocated string.
I am also much in favor of the memstream functions: Like asprintf(), open_memstream() (not fmemopen()!!!) allocates a sufficiently large buffer for you and gives you a FILE* to do your printing, so your printing functions can be entirely ignorant of whether they are printing into a string or a file, and you can simply forget about how much space you will need.
Do you use a library or compiler with support for the TR24731-1 functions?
If so, which compiler or library and on which platform(s)?
Yes, Visual Studio 2005 & 2008 (for Win32 development obviously).
Did you uncover any bugs as a result of fixing your code to use these functions?
Sort of.... I wrote my own library of safe functions (only about 15 that we use frequently) that would be used on multiple platforms -- Linux, Windows, VxWorks, INtime, RTX, and uItron. The reasons for creating the safe functions were:
We had encountered a large number of bugs due to improper use of the standard C functions.
I was not satisfied with the information passed into or returned from the TR functions, or in some cases, their POSIX alternatives.
Once the functions were written, more bugs were discovered. So yes, there was value in using the functions.
Which functions provide the most value?
Safer versions of vsnprintf, strncpy, strncat.
Are there any that provide no value or negative value?
fopen_s and similar functions add very little value for me personally. I'm OK if fopen returns NULL. You should always check the return value of the function. If someone ignores the return value of fopen, what is going to make them check the return value of fopen_s? I understand that fopen_s will return more specific error information which can be useful in some contexts. But for what I'm working on, this doesn't matter.
Are you planning to use the library in the future?
We are using it now -- inside our own "safe" library.
Are you tracking the TR24731-2 work at all?
No.
No, these functions are absolutely useless and serve no purpose other than to encourage code to be written so it only compiles on Windows.
snprintf is perfectly safe (when implemented correctly) so snprintf_s is pointless. strcat_s will destroy data if the buffer is overflowed (by clearing the concatenated-to string). There are many many other examples of complete ignorance of how things work.
The real useful functions are the BSD strlcpy and strlcat. But both Microsoft and Drepper have rejected these for their own selfish reasons, to the annoyance of C programmers everywhere.
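For readers who have not met them, here is a sketch of the strlcpy() semantics on a platform that lacks it. bounded_copy is a hypothetical stand-in written for illustration, not the BSD source: it always null-terminates when size is non-zero and returns the length of src, so truncation shows up as a return value greater than or equal to size.

```c
#include <string.h>

/* Hypothetical stand-in with strlcpy()-style semantics. */
size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t src_len = strlen(src);

    if (size > 0) {
        /* Copy as much as fits, leaving room for the terminator. */
        size_t n = src_len < size - 1 ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

A caller checks for truncation with if (bounded_copy(buf, src, sizeof buf) >= sizeof buf).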
