Expand integer macro inside quoted string - c

While contributing to exim, I saw many values that were hard-coded:
uschar filebuffer[256];
(void)sprintf(CS filebuffer, "%.256s.db", filename);
rc = lf_check_file(-1, filebuffer, S_IFREG, modemask, owners, owngroups,
"dbm", errmsg);
if (rc < 0) /* stat() failed */
{
(void)sprintf(CS filebuffer, "%.256s.dir", filename);
rc = lf_check_file(-1, filebuffer, S_IFREG, modemask, owners, owngroups,
"dbm", errmsg);
if (rc == 0) /* x.dir was OK */
{
(void)sprintf(CS filebuffer, "%.256s.pag", filename);
rc = lf_check_file(-1, filebuffer, S_IFREG, modemask, owners, owngroups,
"dbm", errmsg);
}
}
}
As the code isn't Windows-specific, every 256 value should be converted to PATH_MAX.
I know that expanding macros inside quoted strings isn't possible, but string literal concatenation is trivial:
#define STR "string"
size_t len=strlen("part"STR"part 2");
However, things like :
"%."PATH_MAX".db"
This shouldn't work because PATH_MAX expands to an integer, not a string.
So is there a way to do this without calling a function that converts integers to C strings?

The right way to do this is to use a * in your format string, which will cause it to take the value from your argument list. For example:
printf("%.*s\n", 3, "abcde");
This is equivalent to:
printf("%.3s\n", "abcde");
That way you can use PATH_MAX or any other value to control the format without having to worry about how they're defined (e.g., whether they contain parentheses or addition operators, etc.)
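As a minimal sketch of how the * precision could be combined with a bounded write, the helper below builds "<name><ext>" while copying at most max_name bytes of the name. The function name build_db_path and its parameters are hypothetical and not part of exim; the only requirement taken from the C standard is that the argument consumed by * has type int.

```c
#include <limits.h>   /* INT_MAX */
#include <stdio.h>    /* snprintf */

/* Hypothetical helper: writes "<name><ext>" into buf, copying at most
 * max_name bytes of name. Returns 0 on success, -1 if the result did
 * not fit in buf (buf then holds a truncated, NUL-terminated string). */
static int build_db_path(char *buf, size_t bufsize,
                         const char *name, size_t max_name,
                         const char *ext)
{
    /* The precision consumed by '*' must be an int. */
    int prec = max_name > (size_t)INT_MAX ? INT_MAX : (int)max_name;
    int n = snprintf(buf, bufsize, "%.*s%s", prec, name, ext);

    return (n >= 0 && (size_t)n < bufsize) ? 0 : -1;
}
```

A call such as build_db_path(filebuffer, sizeof filebuffer, filename, PATH_MAX, ".db") would then work regardless of how PATH_MAX is spelled in the headers.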

You can stringify a macro argument by using the # operator. But you need an indirect macro invocation to expand the argument:
#define Q(x) Q_(x)
#define Q_(x) #x
So you could do something like:
char filebuffer[PATH_MAX + 10];
sprintf(filebuffer, "%." Q(PATH_MAX)"s.db", filename);
The existing code limits the string substitution to 256 characters but then adds a file extension (and a NUL terminator), which will overflow the 256-byte buffer when the length is close to 256. I used an arbitrary 10-byte overallocation above, but it would be better to use a checked-length sprintf like snprintf. That would have the additional advantage of not requiring macro games.
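To make the two-level expansion concrete, here is a small sketch that also uses snprintf as suggested. DEMO_LIMIT, DB_FORMAT and format_db_name are made-up names for illustration. The approach only works when the macro expands to a plain integer token; if a platform defines its limit as an expression, the stringified text would not be a valid precision.

```c
#include <stdio.h>

#define Q_(x) #x      /* stringifies x exactly as written */
#define Q(x)  Q_(x)   /* expands x first, then stringifies the result */

#define DEMO_LIMIT 4  /* hypothetical limit; must expand to a plain integer */

/* Adjacent string literals are concatenated, giving "%.4s.db". */
#define DB_FORMAT "%." Q(DEMO_LIMIT) "s.db"

/* Returns 0 on success, -1 if the result did not fit in buf. */
static int format_db_name(char *buf, size_t bufsize, const char *name)
{
    int n = snprintf(buf, bufsize, DB_FORMAT, name);
    return (n >= 0 && (size_t)n < bufsize) ? 0 : -1;
}
```

Here Q(DEMO_LIMIT) yields the string "4", whereas Q_(DEMO_LIMIT) would yield "DEMO_LIMIT", which is why the extra level of indirection is needed.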

Related

portability and safety of C macro using lambda parameter

Preface
I know that there are several libraries for auto-testing available.
Let's ignore that for this question, please.
Motivation
Implementing some library I got tired of manual testing, so I started to write a "self-test" program, starting with code using many assert()s.
Unfortunately when an assert() fails, only limited information is shown on the screen, and I would typically have to use the debugger to examine the core dump to get more details of the failure.
So I added a macro that allows a printf()-like output (implemented via the E() (for error) macro) when an assertion fails; I named that macro VA() (for verbose assertion):
#define VA(assert_cond, msg, ...) do { \
if ( !(assert_cond) ) E(msg, ##__VA_ARGS__); \
assert(assert_cond); \
} while (0)
Using that would look like this:
VA(FASTWORD(FASTWORD_BITS - 1) == 0, "%s: FASTWORD() failed", __func__);
As the self-test program used array-like data structures, I needed to inspect those as well, so I output those before doing the tests, resulting in a lot of output even when all tests succeed.
So I invented another macro, VFA() (verbose failed assertion) that uses a "lambda parameter" like this:
#define VFA(assert_cond, cmd, msg, ...) do { \
if ( !(assert_cond) ) { \
E(msg, ##__VA_ARGS__); \
cmd; \
} \
assert(assert_cond); \
} while (0)
While writing that I wondered how the preprocessor would parse commas for a use case like this:
VFA(fw[0] == out_fw0 && fw[1] == out_fw1,
dump_fastwords_range(fw, 4, pos, (pos + count) % FASTWORD_BITS),
"%s: __clear_fw_bits_up(%d, %d) failed", context, pos, count);
I mean it could be possible that the condition could be the first parameter, dump_fastwords_range(fw could be the second, 4 could be the third, and so on...
However that is not the case with gcc at least.
The other thing is cmd; in the macro:
My first version did not include the semicolon, so I would have to write (which looks really ugly):
VFA(fw[0] == out_fw0 && fw[1] == out_fw1,
dump_fastwords_range(fw, 4, pos, (pos + count) % FASTWORD_BITS);,
"%s: __clear_fw_bits_up(%d, %d) failed", context, pos, count);
OK, here's another use example of my macro:
VFA(fw[0] == out_fw0 && fw[1] == out_fw1,
{
const unsigned first = pos >= count ?
pos - count : FASTWORD_BITS + pos - count + 1;
dump_fastwords_range(fw, 4, first, pos);
},
"%s: __clear_fw_bits_dn(%d, %d) failed", context, pos, count);
Questions
The questions I have are:
Is parsing of the macro parameters portable across compilers?
Will the cmd use create any trouble, considering the parameter could be rather complex (as the last example suggests)?
Is parsing of the macro parameters portable across compilers?
No. ##__VA_ARGS__ is a non-portable gcc extension (see the related question "What does ##__VA_ARGS__ mean?").
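One common way to avoid the extension, sketched here under the assumption that E() simply forwards to a printf-style function, is to fold the format string into the variadic part so that __VA_ARGS__ is never empty. The E() definition below is a hypothetical stand-in, since the question does not show the real one.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the question's E() macro. The format string is
 * the first element of __VA_ARGS__, so the list is never empty and the
 * ## extension is not needed. */
#define E(...) ((void)fprintf(stderr, __VA_ARGS__), (void)fputc('\n', stderr))

/* Verbose assertion: prints the message, then lets assert() abort. */
#define VA(assert_cond, ...) do { \
        if (!(assert_cond)) E(__VA_ARGS__); \
        assert(assert_cond); \
    } while (0)
```

Both VA(x == 0, "plain message") and VA(x == 0, "%s: failed", __func__) are accepted by this form. Note that the condition is evaluated twice when assertions are enabled, exactly as in the original macro, so it should not have side effects.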
Will the cmd use create any trouble, considering the parameter could be rather complex (as the last example suggests)?
Commas inside a matched pair of () within a macro argument do not separate arguments, so each function call is passed to the macro as part of a single argument and expanded as such. You can peek at the pre-processor output if you are curious. Formally this is specified in C17 6.10.3/10:
Each subsequent instance of the function-like macro name followed by a ( as the next preprocessing token introduces the sequence of preprocessing tokens that is replaced by the replacement list in the definition (an invocation of the macro). The replaced sequence of preprocessing tokens is terminated by the matching ) preprocessing token, skipping intervening matched pairs of left and right parenthesis preprocessing tokens.
The following paragraph, 6.10.3/11, adds that comma preprocessing tokens between matching inner parentheses do not separate arguments.
So it shouldn't create any trouble unless you do truly evil stuff like using goto or setjmp etc from inside it.
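One caveat is easy to check: only parentheses protect commas, braces and square brackets do not. The sketch below uses a hypothetical argument-counting macro, NARGS, to show how many arguments the preprocessor sees in each case; the argument tokens themselves are discarded, so f and a do not need to exist.

```c
/* Counts its arguments (1 to 4) by shifting a list of numbers. */
#define NARGS_(a, b, c, d, n, ...) n
#define NARGS(...) NARGS_(__VA_ARGS__, 4, 3, 2, 1, 0)

enum {
    in_parens   = NARGS(f(1, 2)),      /* 1: the comma is inside ( ) */
    in_braces   = NARGS({ 1, 2 }),     /* 2: braces do not protect the comma */
    in_brackets = NARGS(a[1, 2]),      /* 2: neither do square brackets */
    wrapped     = NARGS(({ 1, 2 }))    /* 1: extra parentheses protect it again */
};
```

So the brace-block example in the question works because its only commas sit inside the call to dump_fastwords_range(); a declaration such as `int a, b;` inside that block would split the cmd argument.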

In C, How to get capturing group RegEx?

This is the C function I am having problems with:
char get_access_token(char *client_credentials)
{
regex_t regex;
int reti;
char msgbuf[100];
reti = regcomp(&regex, "\\\"access_token\\\".\\\"(.*?)\\\"", 0);
regmatch_t pmatch[1];
if (reti) {
fprintf(stderr, "Could not compile regex\n");
exit(1);
}
reti = regexec(&regex, client_credentials, 1, pmatch, 0);
if (!reti) {
puts("Match");
} else if (reti == REG_NOMATCH) {
puts("No match");
} else {
regerror(reti, &regex, msgbuf, sizeof(msgbuf));
fprintf(stderr, "Regex match failed: %s\n", msgbuf);
exit(1);
}
return (char) "";
}
The string that I'm trying to parse is a JSON string, I don't care about the actual structure I only care about the access token.
It should look like this:
{"access_token": "blablablabla"}
I want my function to return just "blablablabla"
The RegEx that I'm trying to use is this one:
\"access_token"."(.*?)"
but I can't find that in the variable pmatch, I only find two numbers in that array, I don't really know what those numbers mean.
What am I doing wrong?
P.S. I'm a C noob, I'm just learning.
There are several problems. You have typos in your regex. And you're trying to use extended regex features with a basic POSIX regex.
First the typos.
reti = regcomp(&regex, "\\\"access_token\\\".\\\"(.*?)\\\"", 0);
                                            ^ (the . here matches only one character)
That should be:
reti = regcomp(&regex, "\\\"access_token\\\": \\\"(.*?)\\\"", 0);
Then we don't need to escape quotes in regexes. That makes it easier to read.
reti = regcomp(&regex, "\"access_token\": \"(.*?)\"", 0);
This still doesn't work because it's using features that basic POSIX regexes do not have. Capture groups must be escaped in a basic POSIX regex. This can be fixed by using REG_EXTENDED. The *? non-greedy operator is a non-POSIX feature borrowed from Perl. On macOS you get it with REG_ENHANCED, which is not part of POSIX and is not available in, for example, glibc.
reti = regcomp(&regex, "\"access_token\": \"(.*?)\"", REG_ENHANCED|REG_EXTENDED);
But don't try to parse JSON with a regex for all the same reasons we don't parse HTML with a regex. Use a JSON library such as json-glib.
Your pmatch array must have at least two elements. Group 0 is the whole match (as if the entire regular expression were wrapped in a pair of parentheses); you want group 1, so pmatch[1] will be filled with the information for the first subexpression group.
If you look in the documentation, each pmatch element has two fields: rm_so, the index in the original buffer where the group's match begins, and rm_eo, the index one past the last character of the match. As with pmatch[0], they give the offsets at which the regular (sub)expression begins and ends, respectively.
Once you know that they are valid (see the documentation), you can print the matched elements with:
#define SIZEOF(arr) (sizeof (arr) / sizeof (arr)[0])
...
regmatch_t pmatch[2]; /* for global regexp and group 1 */
size_t i;
...
/* you don't need to escape " chars for regcomp, they are not special there;
 * they are special in a C string literal, so a single \ is needed.
 * REG_EXTENDED makes ( ) a capture group; [^"]* stops at the closing quote. */
reti = regcomp(&regex, "\"access_token\": *\"([^\"]*)\"", REG_EXTENDED);
...
reti = regexec(&regex, client_credentials, SIZEOF(pmatch), pmatch, 0);
/* group 0 is the whole match, groups 1..re_nsub are the subexpressions */
for (i = 0; reti == 0 && i <= regex.re_nsub && i < SIZEOF(pmatch); i++) {
    const char *p = client_credentials + pmatch[i].rm_so; /* p points to beginning of match */
    int l = (int)(pmatch[i].rm_eo - pmatch[i].rm_so); /* match length; %.*s needs an int */
    printf("Group #%zu: %.*s\n", i, l, p);
}
My apologies for submitting a snippet of code instead of a complete, verifiable example, but since the question did not include one (so we could not test your sample code), I have not done so in the answer either. So the code is not tested and may contain errors on my side. Beware of this.
Testing a sample response takes time, and even more if we first have to make your sample code testable at all. (This is a complaint about beginners, and some non-beginners, not posting a Minimal, Complete, and Verifiable example.)
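Putting the points from both answers together, a self-contained version might look like the sketch below. It is only an illustration: the pattern tolerates optional whitespace around the colon, it does not handle escaped quotes inside the token, and the return type is changed to a malloc'd string (the caller frees it), which is an assumption about what the asker wants. A JSON library remains the more robust choice.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd copy of the access_token value, or NULL if the
 * pattern does not match or an error occurs. */
static char *get_access_token(const char *json)
{
    regex_t regex;
    regmatch_t pmatch[2];   /* [0] whole match, [1] first capture group */
    char *token = NULL;

    /* [^"]* stops at the closing quote, so no non-greedy operator is needed. */
    if (regcomp(&regex,
                "\"access_token\"[[:space:]]*:[[:space:]]*\"([^\"]*)\"",
                REG_EXTENDED) != 0)
        return NULL;

    if (regexec(&regex, json, 2, pmatch, 0) == 0 && pmatch[1].rm_so != -1) {
        size_t len = (size_t)(pmatch[1].rm_eo - pmatch[1].rm_so);
        token = malloc(len + 1);
        if (token != NULL) {
            memcpy(token, json + pmatch[1].rm_so, len);
            token[len] = '\0';
        }
    }
    regfree(&regex);
    return token;
}
```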

How to compare my string, which is stored in an array, to function names from a complete library in c

After I enter a string in C and store it in, for example, char s[100], how can I compare that string to all function names in math.h? For example, I enter pow and the result will look like this in stored form.
s[0]='p'
s[1]='o'
s[2]='w'
s[3]='\0'
Since my string is the equivalent of pow(), I want my program to recognise that and then call pow() during execution of my program. I know it is not that hard to do string comparison within the code, but that would mean that I would have to do string comparison for every function name in the library. I don't want to do that. How is it possible to compare my string against all names in the library without hard coding every comparison?
Thank you :)
You can't, not without doing work yourself. There are no names of functions present at runtime in general, and certainly not of functions you haven't called.
C is not a dynamic language, names are only used when compiling/linking.
See the related question "Regular expressions in C".
Try parsing the header files using FILE, and use the linked question as a guide to check whether the function exists or not.
I tried to make a little sample about what I assume the questioner is looking for (eval.c):
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <assert.h>
/* mapping function names to function pointers and number of parameters */
struct Entry {
const char *name; /* function name */
double (*pFunc)(); /* function pointer */
int nArgs; /* number of arguments */
} table[] = {
#define REGISTER(FUNC, N_ARGS) { #FUNC, &FUNC, N_ARGS }
REGISTER(atan2, 2),
REGISTER(pow, 2),
REGISTER(modf, 2), /* caution: modf is double modf(double, double *), so calling it with two doubles is undefined */
REGISTER(sin, 1),
REGISTER(cos, 1)
#undef REGISTER
};
/* let compiler count the number of entries */
enum { sizeTable = sizeof table / sizeof *table };
void printUsage(const char *argv0)
{
int i;
printf(
"Usage:\n"
" %s FUNC\n"
" where FUNC must be one of:\n", argv0);
for (i = 0; i < sizeTable; ++i) printf(" - %s\n", table[i].name);
}
int main(int argc, char **argv)
{
int i;
char *func;
struct Entry *pEntry;
/* read command line argument */
if (argc <= 1) {
fprintf(stderr, "ERROR: Missing function argument!\n");
printUsage(argv[0]);
return -1;
}
func = argv[1];
/* find function by name */
for (i = 0; i < sizeTable && strcmp(func, table[i].name) != 0; ++i);
if (i >= sizeTable) {
fprintf(stderr, "ERROR! Unknown function '%s'!\n", func);
printUsage(argv[0]);
return -1;
}
/* perform found function on all (standard) input */
pEntry = table + i;
for (;;) { /* endless loop (bail out at EOF or error) */
switch (pEntry->nArgs) {
case 1: {
double arg1, result;
/* get one argument */
if (scanf("%lf", &arg1) != 1) {
int error;
if (error = !feof(stdin)) fprintf(stderr, "Input ERROR!\n");
return error; /* bail out at EOF or error */
}
/* compute */
result = (*pEntry->pFunc)(arg1);
/* output */
printf("%s(%f): %f\n", pEntry->name, arg1, result);
} break;
case 2: {
double arg1, arg2, result;
/* get two arguments */
if (scanf("%lf %lf", &arg1, &arg2) != 2) {
int error;
if (error = !feof(stdin)) fprintf(stderr, "Input ERROR!\n");
return error; /* bail out at EOF or error */
}
/* compute */
result = (*pEntry->pFunc)(arg1, arg2);
/* output */
printf("%s(%f, %f): %f\n", pEntry->name, arg1, arg2, result);
} break;
default: /* should never happen */
fprintf(stderr,
"ERROR! Functions with %d arguments not yet implemented!\n",
pEntry->nArgs);
assert(0);
return -1; /* bail out at error */
}
}
}
I compiled and tested this with gcc in cygwin on Windows (64 bit):
$ gcc -std=c11 -o eval eval.c
$ ./eval
ERROR: Missing function argument!
Usage:
./eval FUNC
where FUNC must be one of:
- atan2
- pow
- modf
- sin
- cos
$ echo "1 2 3 4 5 6 7 8 9 10" | ./eval pow
pow(1.000000, 2.000000): 1.000000
pow(3.000000, 4.000000): 81.000000
pow(5.000000, 6.000000): 15625.000000
pow(7.000000, 8.000000): 5764801.000000
pow(9.000000, 10.000000): 3486784401.000000
$ echo "1 2 3 4 5 6 7 8 9 10" | ./eval sin
sin(1.000000): 0.841471
sin(2.000000): 0.909297
sin(3.000000): 0.141120
sin(4.000000): -0.756802
sin(5.000000): -0.958924
sin(6.000000): -0.279415
sin(7.000000): 0.656987
sin(8.000000): 0.989358
sin(9.000000): 0.412118
sin(10.000000): -0.544021
The usage of this application: the name of the function to apply is provided as command line argument. The values (to apply function to) are provided via standard input. In the sample session, I used echo and a pipe (|) to redirect the output of echo to the input of eval. (If eval is called stand-alone the numbers may be typed in by keyboard.)
Notes:
The table does the actual mapping of strings to function pointers. To solve that issue about the number of parameters, I considered this in struct Entry also.
The REGISTER macro is a trick to use the identifier as string constant also. The #FUNC is a stringize macro-operation (a typical C trick to prevent errors due to typos).
The sizeTable is another trick to prevent redundant definitions. I let the compiler count the number of entries. Thus, new entries may be added and it still will work without any other editing.
The actual trick is to provide a function pointer where the arguments are "left out" (an unprototyped declaration, double (*pFunc)()). When it is called, the arguments actually passed are used and it works (assuming, of course, the table initialization has been implemented carefully). Note that this relies on a feature that was long marked obsolescent and no longer exists in C23, where an empty parameter list means (void). It would also be a pain to do this in C++ because the functions with distinct number of arguments would need an appropriate function pointer with matching signature - horrible casts would be necessary. (Try to compile this with g++ -std=c++11 -c eval.c to see what I mean.)
For a production solution, I would sort the entries by name (lexicographically) and apply a binary search (or even use hashing to be faster and more sophisticated). For this sample, I wanted to keep it simple.
math.h provides a lot of functions in "float flavor" also. These cannot be added to this sample without additional effort. To support arguments other than double:
some type info would have to be added to the table entries
the type info would have to be considered somehow in the switch statement of the evaluation.
...not to mention functions whose argument types differ from each other (or whose return type differs). math.h does provide such functions: modf(), for example, takes a double and a double *.
Btw. this will work for non-math.h functions also...
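A variant of the table that keeps full prototypes, and therefore does not depend on unprototyped function pointers, could store the pointer in a union selected by the arity. The sketch below is an assumption-laden illustration, not the answer's code: square and average are local stand-ins so the block has no library dependency, and math.h functions with matching signatures (sin, pow, atan2, fmod) would slot into the same table.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-ins; replace with sin, pow, etc. from math.h as needed. */
static double square(double x) { return x * x; }
static double average(double a, double b) { return (a + b) / 2.0; }

enum Arity { UNARY = 1, BINARY = 2 };

struct MathEntry {
    const char *name;
    enum Arity arity;
    union {
        double (*unary)(double);
        double (*binary)(double, double);
    } fn;
};

/* Designated initializers select the union member matching the prototype. */
static const struct MathEntry math_table[] = {
    { "average", BINARY, { .binary = average } },
    { "square",  UNARY,  { .unary  = square  } },
};

static const struct MathEntry *find_entry(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof math_table / sizeof math_table[0]; ++i)
        if (strcmp(name, math_table[i].name) == 0)
            return &math_table[i];
    return NULL;
}

/* Calls the named function; y is ignored for unary functions.
 * Returns 0 on success, -1 if the name is unknown. */
static int call_by_name(const char *name, double x, double y, double *result)
{
    const struct MathEntry *e = find_entry(name);
    if (e == NULL)
        return -1;
    *result = (e->arity == UNARY) ? e->fn.unary(x) : e->fn.binary(x, y);
    return 0;
}
```

Because each pointer keeps its real type, the compiler rejects an entry whose signature does not match the chosen union member, which would have caught the modf case.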

Converting C #define constants into variables

I have a device driver where the access credentials for the wifi router are hard-coded using #define macros in C. The code looks like:
#define SSID "XXXX-XXX"
#define AUTH AAAA
#define PSK "YYyy00"
and the function that use these user defined constants is defined as:
router_connect((char *)SSID, sizeof(SSID), AUTH, (char *)PSK, CH_ALL);
I'd like to change these constants, currently defined in a .h file, into strings that I can read from an SD card and pass to the function call, rather than having them hard-coded in main.h. I'd like to be able to change them instead of having fixed values compiled into the program. After all, they are user-defined strings; as long as the values are passed correctly between the devices, they don't need to be hard-coded in the main code.
So I wrote them in the main.c as:
extern char SSID[] = "XXXX-XXX";
extern uint8 AUTH = AAAA;
extern char PSK[] = "YYyy00";
and in the SD card I have a file setup.txt where I entered the three strings as below in three lines:
XXXX-XXX;
AAAA;
YYyy00;
I wrote a little routine to read each line from the SD card and assign the strings to the variables
SSID, AUTH, PSK
char line[20];
int line_count;
FRESULT my_res;
FIL myfil;
my_res = f_open(&myfil, "setup.txt", FA_READ);
if (my_res != FR_OK) {
printf("file: setup.txt is not found. \n\r");
/* deal with errors */;
}
for (line_count = 1; line_count < 4;) {
f_gets(line, sizeof(line), &myfil);
if (line_count == 1) {
strcpy(SSID, line);
printf("SSID: %s %s \n\r", SSID, line);
}
if (line_count == 2) {
strcpy(AUTH, line);
printf("AUTH: %s %s \n\r", AUTH, line);
}
if (line_count == 3) {
strcpy(PSK, line);
printf("PSK: %s %s \n\r", PSK, line);
}
line_count++;
}
My intent was to read these strings from the SD card before the function "router_connect" is called.
I can do that with no problem: when I print out the strings I read from the SD card, they look exactly like the hard-coded ones, but I find that the "router_connect" function does not like the parameters I'm passing. The device driver works when I hard-code the values in the #define statements, but for some reason the values are not passed correctly to the function. Can you please advise whether I'm passing the parameters to the function call incorrectly, and what the right way to achieve this would be? Thanks.
Firstly, it would be nice to see how exactly the function doesn't like those parameters. Secondly, I see that you read the AUTH value as a char array, and what the function wants is a uint8. How do you pass it? Maybe parsing AUTH into an uint8 would do the trick.
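A sketch of what that parsing could look like is shown below. Both helper names are hypothetical. It assumes the file stores AUTH as a number (a symbolic name such as AAAA cannot be resolved at run time), and that each line may carry the trailing ';' and line ending shown in setup.txt, which would otherwise end up inside the SSID and PSK strings.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Cuts the line at the first ';', carriage return, or newline, in place. */
static void trim_line(char *line)
{
    line[strcspn(line, ";\r\n")] = '\0';
}

/* Parses a decimal or 0x-prefixed number in the range 0..255.
 * Returns 0 on success, -1 on malformed or out-of-range input. */
static int parse_uint8(const char *text, uint8_t *out)
{
    char *end;
    unsigned long value;

    errno = 0;
    value = strtoul(text, &end, 0);
    if (end == text || *end != '\0' || errno != 0 || value > UINT8_MAX)
        return -1;
    *out = (uint8_t)value;
    return 0;
}
```

With the macros, sizeof(SSID) was the string length plus the terminating NUL. With a destination buffer it is the buffer's capacity instead, so strlen(ssid) + 1 is probably the closer equivalent for the length argument, assuming router_connect expects what the macro version passed. The destination buffers also need to be large enough for the longest value that can be read from the file.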

Compiling/Matching POSIX Regular Expressions in C

I'm trying to match the following items in the string pcode:
u followed by a 1 or 2 digit number
phaseu
phasep
x (surrounded by non-word chars)
y (surrounded by non-word chars)
z (surrounded by non-word chars)
I've tried to implement a regex match using the POSIX regex functions (shown below), but have two problems:
The compiled pattern seems to have no subpatterns (i.e. compiled.re_nsub == 0).
The pattern doesn't find matches in the string " u0", which it really should!
I'm confident that the regex string itself is working—in that it works in python and TextMate—my problem lies with the compilation, etc. in C. Any help with getting that working would be much appreciated.
Thanks in advance for your answers.
if(idata=tb_find(deftb,pdata)){
MESSAGE("Global variable!\n");
char pattern[80] = "((u[0-9]{1,2})|(phaseu)|(phasep)|[\\W]+([xyz])[\\W]+)";
MESSAGE("Pattern = \"%s\"\n",pattern);
regex_t compiled;
if(regcomp(&compiled, pattern, 0) == 0){
MESSAGE("Compiled regular expression \"%s\".\n", pattern);
}
int nsub = compiled.re_nsub;
MESSAGE("nsub = %d.\n",nsub);
regmatch_t matchptr[nsub];
int err;
if(err = regexec (&compiled, pcode, nsub, matchptr, 0)){
if(err == REG_NOMATCH){
MESSAGE("Regular expression did not match.\n");
}else if(err == REG_ESPACE){
MESSAGE("Ran out of memory.\n");
}
}
regfree(&compiled);
}
It seems you intend to use something resembling the "extended" POSIX regex syntax. POSIX defines two different regex syntaxes, a "basic" (read "obsolete") syntax and the "extended" syntax. To use the extended syntax, you need to add the REG_EXTENDED flag for regcomp:
...
if(regcomp(&compiled, pattern, REG_EXTENDED) == 0){
...
Without this flag, regcomp will use the "basic" regex syntax. There are some important differences, such as:
No support for the | operator
The brackets for submatches need to be escaped, \( and \)
It should be also noted that the POSIX extended regex syntax is not 1:1 compatible with Python's regex (don't know about TextMate). In particular, I'm afraid this part of your regexp does not work in POSIX, or at least is not portable:
[\\W]
\W matches non-word characters (anything other than letters, digits, and underscore). The POSIX way to specify a non-word character is:
[^[:alnum:]_]
Your whole regexp for POSIX should then look like this in C:
char *pattern = "((u[0-9]{1,2})|(phaseu)|(phasep)|[^[:alnum:]_]+([xyz])[^[:alnum:]_]+)";
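For completeness, here is a minimal sketch of compiling and matching with REG_EXTENDED. It uses a non-word bracket expression, [^[:alnum:]_], in place of \W, and the names PCODE_PATTERN and match_pcode are invented for the example. One detail the question's code misses: the match array needs re_nsub + 1 elements, because element 0 holds the whole match.

```c
#include <regex.h>
#include <stddef.h>

#define PCODE_PATTERN \
    "((u[0-9]{1,2})|(phaseu)|(phasep)|[^[:alnum:]_]+([xyz])[^[:alnum:]_]+)"

/* Matches pcode against PCODE_PATTERN and fills groups[0..ngroups-1].
 * Returns the number of parenthesized subexpressions (re_nsub) on a
 * match, or -1 if there is no match or the pattern fails to compile. */
static int match_pcode(const char *pcode, regmatch_t groups[], size_t ngroups)
{
    regex_t compiled;
    int result = -1;

    if (regcomp(&compiled, PCODE_PATTERN, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&compiled, pcode, ngroups, groups, 0) == 0)
        result = (int)compiled.re_nsub;
    regfree(&compiled);
    return result;
}
```

The pattern has five subexpressions, so the caller would pass an array of at least six regmatch_t elements. Groups that did not take part in the match have rm_so set to -1.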

Resources