fputs( _("") ) what does the underscore stand for? - c

I finally got myself to look at some Linux code. I am looking right now at ls.c.
At the function usage() at the bottom I found a lot of these statements:
fputs (_("\
List information about the FILEs (the current directory by default).\n\
Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.\n\
\n\
"), stdout);
What does _("") mean? Is it something like L"string" or _T"string" or something completely new? I must also admit I don't know what words to use to search for something like this.

It's a convention used by libintl, a.k.a. gettext, for translatable strings. At run time, the gettext function (which _ is aliased to) returns either the original or the translated string, depending on the locale settings and on whether a translation of that string is available.

_ is a macro often used with the GNU gettext package.
GNU gettext is a package that:
takes lists of message strings intended for humans to read, and translations of those strings into other languages, and compiles them into databases;
provides a routine, named gettext(), to look up message strings in that database and return the translation for the message into a particular language.
If a program wanted to print a message in the language selected by the user in an environment variable and picked up by a setlocale() call, it would normally do something such as
fprintf(stderr, gettext("I cannot open the file named %s\n"), filename);
gettext() would look up the appropriate translation of the string "I cannot open the file named %s\n" in the database and return the translated string.
However, that's a bit awkward; as the documentation for GNU gettext notes, many programs use a macro to make just _(string) be an alias for gettext(string).
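As a minimal sketch of that convention, assuming a system that provides <libintl.h> (glibc does; on other systems you may need to link against libintl), the macro and a call through it could look like this. The helper name describe_open_failure is made up for illustration and is not part of gettext.

```c
#include <assert.h>
#include <libintl.h>   /* gettext() */
#include <stdio.h>     /* snprintf(), size_t */

/* The conventional shorthand: _("text") expands to gettext("text"). */
#define _(String) gettext(String)

/* Hypothetical helper: formats a translatable message into buf and
   returns the number of characters snprintf() wanted to write. */
int describe_open_failure(char *buf, size_t size, const char *filename)
{
    assert(buf != NULL && filename != NULL);
    /* If no message catalog supplies a translation, gettext() returns
       its argument unchanged, so the English text is used. */
    return snprintf(buf, size, _("I cannot open the file named %s\n"), filename);
}
```

Because this sketch never calls setlocale() or binds a message catalog, the lookup falls back to the original English string; a real program would do that setup once at startup.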

Function names can, of course, contain an _, and an _ can begin a function name. So, it's possible to name a function simply _.
All that's happening is that a #define or a real function is called _.
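To illustrate the "real function" case, here is a small sketch in which _ is an ordinary function rather than a macro. The function body is a stand-in that returns its argument, and greeting is a hypothetical caller; neither comes from ls.c.

```c
#include <assert.h>
#include <stddef.h>

/* A function whose entire name is a single underscore. A real program
   would look up a translation here; this stand-in returns its argument. */
static const char *_(const char *msgid)
{
    assert(msgid != NULL);
    return msgid;
}

/* Hypothetical caller showing that _("...") is an ordinary function call. */
const char *greeting(void)
{
    return _("hello");
}
```

One caveat: the C standard reserves file-scope identifiers that begin with an underscore for the implementation, so this is something that works in practice rather than something the standard promises.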

Related

How to find all C functions starting with a prefix in a library

For a small test framework, I want to do automatic test discovery. Right now, my plan is that all tests just have a prefix, which could basically be implemented like this
#define TEST(name) void TEST_##name(void)
And be used like this (in different c files)
TEST(one_eq_one) { assert(1 == 1); }
The ugly part is that you would need to list all test-names again in the main function.
Instead of doing that, I want to collect all tests in a library (say lib-my-unit-tests.so) and generate the main function automatically, and then just link the generated main function against the library. All of this internal action can be hidden nicely with cmake.
So, I need a script that does:
1. Write "int main(void) {"
2. For all functions $f starting with 'TEST_' in lib-my-unit-tests.so do
a) write "extern void $f(void);"
b) write "$f();"
3. Write "}"
Most parts of that script are easy, but I am unsure how to reliably get a list of all functions starting with the prefix.
On POSIX systems, I can try to parse the output of nm. But here, I am not sure if the names will always be the same (on my MacBook, all names start with an additional '_'). To me, it looks like the names generated for the binary might be OS/architecture-dependent. For Windows, I do not yet have an idea how to do that.
So, my questions are:
Is there a better way to implement test-discovery in C? (maybe something like dlsym)
How do I reliably get a list of all function names starting with a certain prefix on macOS/Linux/Windows?
A partial solution for the problem is parsing nm with a regex:
for line in $(nm "$1") ; do
    # Matches every symbol starting with "TEST_" or "_TEST_"
    if [[ $line =~ ^_?(TEST_.*)$ ]] ; then
        echo "${BASH_REMATCH[1]}"
    fi
done
A second script then consumes this output to generate a C file that calls these functions. Finally, cmake calls the second script to create the test executable:
add_executable(test-executable generated_source.c)
target_link_libraries(test-executable PRIVATE library_with_test_functions)
add_custom_command(
OUTPUT generated_source.c
COMMAND second_script.sh library_with_test_functions.so > generated_source.c
DEPENDS second_script.sh library_with_test_functions)
I think this works on POSIX systems, but I don't know how to solve it for Windows.
You can write a shell script using the nm or objdump utilities to list the symbols, pipe through awk to select the appropriate name and output the desired source lines.
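A different technique from parsing nm output, offered only as a sketch: let each test register itself before main() runs, using the constructor function attribute. This is a GCC/Clang extension and is not available in MSVC, so it does not solve the Windows case. The names register_test, run_all_tests and MAX_TESTS are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*test_fn)(void);

#define MAX_TESTS 64   /* arbitrary limit for this sketch */
static test_fn registered_tests[MAX_TESTS];
static size_t registered_count;

/* Called from each test's constructor, before main() starts. */
static void register_test(test_fn fn)
{
    assert(registered_count < MAX_TESTS);
    registered_tests[registered_count++] = fn;
}

/* Declares TEST_<name>, emits a constructor that registers it,
   then opens the definition of TEST_<name>. */
#define TEST(name)                                                      \
    void TEST_##name(void);                                             \
    static void __attribute__((constructor)) register_TEST_##name(void) \
    { register_test(TEST_##name); }                                     \
    void TEST_##name(void)

TEST(one_eq_one) { assert(1 == 1); }
TEST(two_gt_one) { assert(2 > 1); }

/* Runs every registered test and returns how many were run. */
size_t run_all_tests(void)
{
    for (size_t i = 0; i < registered_count; i++)
        registered_tests[i]();
    return registered_count;
}
```

With this approach main() only needs to call run_all_tests(), and no list of test names has to be generated. To spread tests over several source files, the registry and register_test would have to live in one file with external linkage instead of being static.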

emacs count c functions in a .c/.h src file

Does Emacs have a function to count the number of functions in a .c/.h C source file?
I would like to be able to count a (large) number of functions in a .c file and compare this to the number of functions in the associated unit-test .c file, to establish whether every function has a unit test.
Ideally there is something built in to do this, as opposed to requiring some sort of regexp?
You can use semantic's tag generation to achieve this.
Semantic is a very powerful and underused feature of Emacs. Semantic can parse your .c and .h files and generate tags that you can look through to find your answer. I have written up examples for you:
First, ensure you have the semantic library loaded.
(require 'cl-lib)    ; for cl-remove-if-not
(require 'semantic)  ; for semantic-fetch-tags and semantic-tag-class
(defun c--count-things-in-buffer (thing buffer)
  "Return the count of THINGs in BUFFER.
THING may be: 'function, 'variable, or 'type"
  (with-current-buffer buffer
    ;; Get the buffer's tags (they are generated if not already present),
    ;; remove the ones whose class is not THING,
    ;; and return the count of what is left.
    (length (cl-remove-if-not (lambda (tag) (eq thing (semantic-tag-class tag)))
                              (semantic-fetch-tags)))))
(defun c-count-functions-in-buffer (buffer)
  "Count and message the number of function declarations in BUFFER"
  (interactive "b")
  (message "%s has %d functions"
           buffer
           (c--count-things-in-buffer 'function buffer)))
(defun c-count-variables-in-buffer (buffer)
  "Count and message the number of variable declarations in BUFFER"
  (interactive "b")
  (message "%s has %d variables"
           buffer
           (c--count-things-in-buffer 'variable buffer)))
(defun c-count-types-in-buffer (buffer)
  "Count and message the number of type declarations in BUFFER"
  (interactive "b")
  (message "%s has %d types"
           buffer
           (c--count-things-in-buffer 'type buffer)))
Try evaluating this in your scratch buffer then switch to your .c file and do M-x c-count-functions-in-buffer
The information brought back from semantic-fetch-tags has everything you need to solve your unit test problems.
Let's say you have a function called Foobar and your unit tests are named like Test_Foobar. You could get the tags for the .c file and the tags for the test file, and check that for every function in the .c file there is a tag in the test file named Test_ followed by that function's name. This would likely be better than simply counting the total number of functions.
Run this code in your scratch buffer with C-j:
(with-current-buffer "what-ever-your-c-buffer-is.c" (semantic-fetch-tags))
Here you'll be able to see all the great info that comes back that you can use for this.

Arguments to printf and g_print: many syntaxes, same result

I saw the following kind of code:
g_print("%s\n",_("foo"));
I hadn't seen this style of passing arguments to a print function before, so I tried these:
g_print("%s\n","foo");
g_print("%s\n",("foo"));
Then I thought it had something to do with GTK (I'm fairly new to it), so I tried the same thing with printf:
printf("%s\n",_("foo"));
printf("%s\n","foo");
printf("%s\n",("foo"));
All of the above do the same thing: print foo to stdout. So my question is: does passing the argument as "foo", _("foo"), or ("foo") make any difference at all, or is any one of them syntactic sugar for the others, both in the case of printf and of g_print?
Sorry if this turns out to be a duplicate question, but I couldn't put my finger on what I should have searched for in the first place.
The _() is actually a C macro defined as:
#define _(x) some_func(x)
Don't confuse it with ("foo") or "foo". Both of these are the same thing: just C strings.
You are probably seeing some sort of gettext macro, such as the one described by the glib docs. A common source for these is including glib/gi18n.h directly or (most likely) indirectly.
Marks a string for translation, gets replaced with the translated
string at runtime.
That header file contains a few in this vein:
#define _(String) gettext (String)
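To make the difference between the three spellings concrete, here is a small sketch. It uses a stand-in named fake_gettext (an assumption for this example, not a glib or gettext function) so that it compiles without any i18n library; spellings_agree is likewise a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for gettext(): returns its argument unchanged, which is also
   what gettext() does when no translation is available. */
static const char *fake_gettext(const char *msgid)
{
    assert(msgid != NULL);
    return msgid;
}

#define _(String) fake_gettext(String)

/* Returns 1 if all three spellings yield the same text, 0 otherwise. */
int spellings_agree(void)
{
    const char *plain = "foo";            /* a string literal */
    const char *parenthesized = ("foo");  /* same literal; the parentheses only group */
    const char *marked = _("foo");        /* a function call that may translate */

    return strcmp(plain, parenthesized) == 0 && strcmp(plain, marked) == 0;
}
```

The three only agree here because the stand-in never translates. With the real gettext and a translation catalog for the user's locale, _("foo") could return different text, while "foo" and ("foo") always stay the same.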

Access #ident information in an executable?

gcc (and other compilers) support the #ident preprocessor directive:
The ‘#ident’ directive takes one argument, a string constant. On some
systems, that string constant is copied into a special segment of the
object file. On other systems, the directive is ignored. The ‘#sccs’
directive is a synonym for ‘#ident’.
And I see (with a hex dump) that by e.g. adding this to a source file:
#ident "Hello there !"
This string gets embedded in the executable.
Now, are there any tools (readelf, objdump, gdb or others) that can extract or view these strings?
If you have RCS installed, I think the ident command will display them. This assumes you format them in the conventional way: $keyword: value $.
Without this directive, the traditional way to get ident strings into binaries is by putting them in static variables, e.g.
static char const rcsid[] =
"$Id: f.c,v 5.4 1993/11/09 17:40:15 eggert Exp $";
The problem with this is that you get warnings about unused variables, and compilers might optimize them away. So you have to put bogus uses of the variables in your code to prevent this. Also, if ident strings are put in header files, they have to follow naming conventions to avoid conflicts.
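One way to give the variable a genuine use, sketched here under the assumption that exposing the string through a function is acceptable in your program (the accessor name program_rcsid is made up):

```c
#include <assert.h>

/* Version string in the conventional RCS "$keyword: value $" form,
   copied from the example above. */
static char const rcsid[] =
    "$Id: f.c,v 5.4 1993/11/09 17:40:15 eggert Exp $";

/* Hypothetical accessor: a real use of rcsid, so the compiler has no
   reason to warn about an unused variable. */
char const *program_rcsid(void)
{
    /* RCS keywords are delimited by '$' on both sides. */
    assert(rcsid[0] == '$' && rcsid[sizeof rcsid - 2] == '$');
    return rcsid;
}
```

A program could print this from a --version option, for instance. Whether the string survives in the final binary can still depend on linker settings that discard unused functions.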
I can't find a specification for where #ident puts them. I suspect they're just stuck somewhere in the pure data section, so that they can be found just like the above string.
strings -a will find it -- along with a lot of other stuff. (Experiment shows that strings without -a doesn't find it; apparently it's not stored in the data section.)
As the name #sccs implies, it was originally intended for use with the old SCCS version control system, which can expand keywords in files as they're checked out. You can do something similar with the slightly more modern RCS and CVS systems, which have a different keyword syntax. I don't remember the details of SCCS, but for RCS or CVS you can have something like:
#ident "$Header:$"
which will be expanded on RCS or CVS checkout to, for example,
#ident "$Header: /path/to/foo.txt,v 1.6 2013/04/02 20:21:33 yourname Exp $"
You can then use the ident command to find strings like that, even in binaries. (ident is part of RCS, not CVS.)
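To give an idea of what such a search involves, here is a deliberately simplified sketch of a scanner for the "$keyword: text $" pattern in a block of bytes. It is not the RCS ident implementation; the function name find_ident_keyword and its exact matching rules are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Finds the first "$keyword: text $" pattern in data[0..len).
   On success stores its start offset in *start and returns its length
   (including both '$' characters); returns 0 if none is found.
   Simplified: the keyword must consist of ASCII letters, and the text
   may not contain '$' or a newline. */
size_t find_ident_keyword(const unsigned char *data, size_t len, size_t *start)
{
    assert(data != NULL && start != NULL);
    for (size_t i = 0; i < len; i++) {
        if (data[i] != '$')
            continue;
        size_t j = i + 1;
        while (j < len && ((data[j] >= 'A' && data[j] <= 'Z') ||
                           (data[j] >= 'a' && data[j] <= 'z')))
            j++;
        /* Need at least one keyword letter followed by ": ". */
        if (j == i + 1 || j + 1 >= len || data[j] != ':' || data[j + 1] != ' ')
            continue;
        for (size_t k = j + 2; k < len; k++) {
            if (data[k] == '\n')
                break;
            if (data[k] == '$') {
                *start = i;
                return k - i + 1;
            }
        }
    }
    return 0;
}
```

Reading a binary into memory and calling this in a loop, resuming after each match, would list every keyword string it contains, which is roughly what you get from running ident on the file.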
Even if you're not using RCS or CVS, you can still use the same keyword syntax:
#ident "$Header: anything you like here$"
(But your text can be clobbered if you check the file into an RCS or CVS repository.)

How can I validate if a file name is valid in Windows?

Is there a Windows API function that I can pass a string value to that will return a value indicating whether a file name is valid or not?
I need to verify that a file name is valid, and I'm looking for an easy way to do it without re-inventing the wheel. I'm working in straight C, but targeting the Win32 API.
If there's no such function built-in, how would I go about writing my own? Is there a general algorithm or pattern that Windows follows for determining file name validity?
The problem is not so simple, because it depends on what you consider a "valid file name".
The Windows APIs, when given a path with the \\?\ prefix, will happily let you create many names that are deemed invalid in normal paths: the prefix tells the APIs to hand the path to the filesystem driver without performing any checks. The filesystems themselves often do not care what is used as a file name; once they know that a string is only the file name (i.e. the path/name split has already been done), they generally treat it as an opaque sequence of characters.
On the other hand, if you want to play it safe, you should validate according to the rules specified by the MSDN document you already linked for Win32 names; I don't think any file system is allowed to have more stringent naming rules than these. Violating those rules, although it may be supported by the kernel itself, often gives bad headaches to "normal" applications that expect to deal with "traditional" Win32 paths.
But, in my opinion, if you have to create the file immediately, the best validation you can do is to actually try to create/open the file, letting the OS do the work for you, and be prepared to handle a failure gracefully (GetLastError should report an error such as ERROR_INVALID_NAME or ERROR_BAD_PATHNAME). This also checks every other restriction on creating the file, e.g. that your application has the appropriate permissions, that the path is not on a read-only medium, and so on.
If, for some reason, this is not possible, you may like the shell function PathCleanupSpec: given the requested file name and the directory in which it is to be created, this function removes all invalid characters (I'm not sure about reserved DOS names; they are not listed in its documentation), making the path "probably valid" and notifying you if any modification was made (so you can also use it purely for validation).
Notice that this function is marked as "modifiable or removable in any future Windows version", although Microsoft's policy is generally that "anything that made its way to a public header will remain public forever".
In case you are checking whether the file name is valid in the sense "can the file be named like this?":
No, there is no function to directly check that. You will have to write your own function.
But if you know what a valid file name is (a valid file name does not contain any of the following: \ / : * ? " < > |), that shouldn't be such a problem.
You could perhaps help yourself with some of these functions from ctype.h (with them you can check whether a character belongs to a specific character class):
http://www.cplusplus.com/reference/clibrary/cctype/
This function gives you the list of invalid chars for a filename. Up to you to check that your filename doesn't contain any:
public static char[] Path.GetInvalidFileNameChars()
Docs here.
Note that if you want to validate a directory name, you should use GetInvalidPathChars().
EDIT: Oooops! Sorry, I thought you were on .NET. Using Reflector, here's what this function boils down to:
'"', '<', '>', '|',
'\0', '\x0001', '\x0002', '\x0003', '\x0004', '\x0005', '\x0006',
'\a', '\b', '\t', '\n', '\v', '\f', '\r',
'\x000e', '\x000f', '\x0010', '\x0011', '\x0012', '\x0013', '\x0014', '\x0015',
'\x0016', '\x0017', '\x0018', '\x0019', '\x001a', '\x001b', '\x001c', '\x001d',
'\x001e', '\x001f',
':', '*', '?', '\\', '/'
Note that, in addition, there are reserved names such as prn, con, com1, com2,... , lpt1, lpt2,...
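Putting the pieces from these answers together, a hand-written check in straight C might look like the sketch below. It is a plausibility test, not a guarantee: it covers the invalid characters and reserved device names listed above, plus two rules from Microsoft's file naming documentation that were not mentioned here (AUX and NUL are also reserved, and a name must not end in a space or a period). The function names are made up, and the code uses no Win32 API, so trying to create the file remains the authoritative check.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the first n characters of a, upper-cased, equal the
   upper-case string b, and b is exactly n characters long. */
static int equals_upper(const char *a, size_t n, const char *b)
{
    if (strlen(b) != n)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (toupper((unsigned char)a[i]) != b[i])
            return 0;
    return 1;
}

/* Returns 1 if the base name (the text before the first '.') is a
   reserved DOS device name such as CON, PRN, COM1 or LPT1. */
static int is_reserved_device(const char *name)
{
    static const char *const devices[] = { "CON", "PRN", "AUX", "NUL" };
    size_t base = strcspn(name, ".");

    for (size_t i = 0; i < sizeof devices / sizeof devices[0]; i++)
        if (equals_upper(name, base, devices[i]))
            return 1;
    /* COM1..COM9 and LPT1..LPT9 */
    if (base == 4 && name[3] >= '1' && name[3] <= '9')
        return equals_upper(name, 3, "COM") || equals_upper(name, 3, "LPT");
    return 0;
}

/* Returns 1 if name (a single path component, not a full path) passes
   all checks, 0 otherwise. */
int is_plausible_win32_filename(const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);

    if (len == 0)
        return 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];
        /* Control characters and the nine reserved punctuation characters. */
        if (c < 32 || strchr("\\/:*?\"<>|", c) != NULL)
            return 0;
    }
    /* Names ending in a space or a period are rejected as well. */
    if (name[len - 1] == ' ' || name[len - 1] == '.')
        return 0;
    return !is_reserved_device(name);
}
```

For example, is_plausible_win32_filename("report.txt") accepts the name, while "a?b.txt", "con" and "COM1.log" are rejected. A name such as "console.txt" is accepted, because only the exact device names are reserved.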
