Why are these C macros not written as functions?
I'm studying the code of the netstat tool (Linux), which AFAIK mostly reads the /proc/net/tcp file and does a pretty-printing of it. (My focus is on the -t mode right now.)
I'm a bit puzzled by the coding style the authors have chosen:
static int tcp_info(void)
{
INFO_GUTS6(_PATH_PROCNET_TCP, _PATH_PROCNET_TCP6, "AF INET (tcp)", tcp_do_one);
}
where
#define INFO_GUTS6(file,file6,name,proc) \
char buffer[8192]; \
int rc = 0; \
int lnr = 0; \
if (!flag_arg || flag_inet) { \
INFO_GUTS1(file,name,proc) \
} \
if (!flag_arg || flag_inet6) { \
INFO_GUTS2(file6,proc) \
} \
INFO_GUTS3
where
#define INFO_GUTS3 \
return rc;
and
#if HAVE_AFINET6
#define INFO_GUTS2(file,proc) \
lnr = 0; \
procinfo = fopen((file), "r"); \
if (procinfo != NULL) { \
do { \
if (fgets(buffer, sizeof(buffer), procinfo)) \
(proc)(lnr++, buffer); \
} while (!feof(procinfo)); \
fclose(procinfo); \
}
#else
#define INFO_GUTS2(file,proc)
#endif
etc.
Clearly, my coding sense is tilting and says "those should be functions". I don't see any benefit these macros bring here; they kill readability, etc.
Is anybody around familiar with this code who can shed some light on what "INFO_GUTS" is about here, and whether there was (or still is) a reason for such an odd coding style?
In case you're curious about their use, the full dependency graph goes like this:
#              /---> INFO_GUTS1 <---\
# INFO_GUTS --*----> INFO_GUTS2 <----*---- INFO_GUTS6
#      ^       \---> INFO_GUTS3 <---/          ^
#      |                                       |
# unix_info()         igmp_info(), tcp_info(), udp_info(), raw_info()
Your sense that "those macros should be functions" seems correct to me; I'd prefer to see them as functions.
It would be interesting to know how often the macros are used; the more often they're used, the bigger the space saving from making them real functions instead of macros. The macros are quite big and themselves call (inherently slow) I/O functions, so there isn't going to be a speed-up from using a macro.
And these days, if you want inline substitution of functions, you can use inline functions in C (as well as in C++).
You can also argue that INFO_GUTS2 should use a straightforward while loop instead of the do ... while loop; the end-of-input check would then happen in just one place:
while (fgets(buffer, sizeof(buffer), procinfo))
(*proc)(lnr++, buffer);
As it is, if there is an error (as opposed to EOF) on the channel, the code would probably go into an infinite loop; the fgets() would fail, but the feof() would return false (because it hasn't reached EOF; it has encountered an error - see ferror()), and so the loop would continue. Not a particularly plausible problem; if the file opens, you will seldom get an error. But a possible problem.
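For illustration, here is a minimal sketch (not netstat's actual code) of what INFO_GUTS2 could look like as an ordinary function, using the while loop suggested above; the callback signature is inferred from how the macro invokes (proc)(lnr++, buffer), and the function name is invented:

#include <stdio.h>

/* Hypothetical function replacement for INFO_GUTS2: read one /proc
 * file line by line and hand each line to the callback. */
static int info_guts_one_file(const char *file, void (*proc)(int, char *))
{
    char buffer[8192];
    int lnr = 0;
    FILE *procinfo = fopen(file, "r");

    if (procinfo == NULL)
        return 0;   /* the macro also silently skips a missing file */

    /* fgets() returns NULL on both EOF and error, so this loop
     * terminates in either case -- unlike the feof() version. */
    while (fgets(buffer, sizeof(buffer), procinfo))
        (*proc)(lnr++, buffer);

    fclose(procinfo);
    return 0;
}

static void print_line(int lnr, char *line)
{
    printf("%4d: %s", lnr, line);
}

int main(void)
{
    return info_guts_one_file("/proc/net/tcp", print_line);
}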
There is no reason why. The person who wrote the code was likely very confused about code optimizations in general, and the concept of inlining in particular. Since the compiler is most likely GCC, there are several ways to achieve function inlining, if inlining was even necessary for this function, which I very much doubt.
Inlining a function containing file I/O calls would be the same thing as shaving an elephant to reduce its weight...
It reads as someone's terrible idea for implementing optional IPv6 support. You would have to walk through the history to confirm, but the archive only seems to go back to 1.46, and the implied damage is at 1.20+.
I found a git archive going back to 1.24, and it is still there. Older code looks doubtful.
Neither the BusyBox nor the BSD code includes such messy code. So it appeared in the Linux version and suffered major bit rot.
Macros generate code: when a macro is invoked, its whole definition is expanded at the place of the call. If, say, INFO_GUTS6 were a function, it wouldn't be able to declare, e.g., the buffer variable that is subsequently used by the code following the macro invocation. The example you pasted is actually very neat :-)
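To see what this answer means, here is a tiny sketch (macro and names invented for illustration): a macro can introduce locals into the enclosing function's scope, which no function call could do:

#include <stdio.h>

/* DECLARE_STATE is a made-up macro mimicking what INFO_GUTS6 does:
 * it declares variables in the *caller's* scope. */
#define DECLARE_STATE \
    char buffer[8192]; \
    int rc = 0;

static int demo(void)
{
    DECLARE_STATE            /* expands to the two declarations above */
    buffer[0] = '\0';        /* later code can use the macro's locals */
    printf("%s", buffer);
    return rc;
}

int main(void)
{
    return demo();
}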
Related
What are the benefits of using macros instead of functions in C?
First Code:

#include <stdio.h>

int area(int a, int b) {
    int area1 = a * b;
    return area1;
}

int main() {
    int l1 = 10, l2 = 5, area2;
    area2 = area(l1, l2);
    printf("Area of rectangle is: %d", area2);
    return 0;
}

Second Code:

#include <stdio.h>

// macro with parameter
#define AREA(l, b) (l * b)

int main() {
    int l1 = 10, l2 = 5, area;
    area = AREA(l1, l2);
    printf("Area of rectangle is: %d", area);
    return 0;
}

Same output from both codes:

Area of rectangle is: 50

My Question: So obviously, macros in C language are the same as functions, but macros take less space (fewer lines) than functions. Is this the only benefit of using macros instead of functions? Because they look roughly the same.
A case where macros are useful is when you combine them with __FILE__ and __LINE__. I have a concrete example in the Bismon software project. In its file cmacros_BM.h I define

// only used by FATAL_BM macro
extern void fatal_stop_at_BM (const char *, int) __attribute__((noreturn));

#define FATAL_AT_BIS_BM(Fil,Lin,Fmt,...) do {                \
    fprintf(stderr, "BM FATAL:%s:%d: <%s>\n " Fmt "\n\n",    \
            Fil, Lin, __func__, ##__VA_ARGS__);              \
    fatal_stop_at_BM(Fil,Lin); } while(0)

#define FATAL_AT_BM(Fil,Lin,Fmt,...) FATAL_AT_BIS_BM(Fil,Lin,Fmt,##__VA_ARGS__)

#define FATAL_BM(Fmt,...) FATAL_AT_BM(__FILE__,__LINE__,Fmt,##__VA_ARGS__)

and fatal errors are calling something like (example from file user_BM.c)

FILE *fil = fopen (contributors_filepath_BM, "r+");
if (!fil)
    FATAL_BM ("find_contributor_BM cannot open contributors file %s : %m",
              contributors_filepath_BM);

When that fopen fails, the fatal error message shows the source file and line number of that FATAL_BM macro invocation. The fatal_stop_at_BM function is defined in file main_BM.c.

Notice also that some of your C files could be generated by programs like GNU bison, GNU m4, ANTLR, SWIG, and that preprocessor symbols are also used by GNU autoconf. Study also the source code of the Linux kernel; it uses macros extensively. Most importantly, read the documentation of your C compiler (e.g. GCC). Many C compilers can show you the preprocessed form of your C code.

Your

// macro with parameter
#define AREA(l, b) (l * b)

is wrong and should be

#define AREA(l, b) ((l) * (b))

if you want AREA(x+2,y-3) to work as expected. For performance reasons, you could have defined your function as

inline int area(int a, int b) { return a*b; }

See also:
- the Modern C book;
- this C reference website;
- the documentation of GNU cpp;
- the documentation of the GCC compiler (to be invoked as gcc -Wall -Wextra -g);
- how to write your own GCC plugins;
- some recent draft C standard like n1570;
- examples of macro usage in the source code of free software projects like GNU make or GNU findutils;
- Nils Weller's simple C compiler and its source code;
- the Frama-C static analyzer (open source);
- the Clang static analyzer (open source);
- the Tiny C compiler;
- this DRAFT report on Bismon;
- the CHARIOT European project;
- the DECODER European project;
- various ACM SIGPLAN conference papers mentioning C;
- some chapters of the book Artificial Beings: the Conscience of a Conscious Machine (ISBN-13: 978-1848211018), describing a software (CAIA, symbolic artificial intelligence) generating all the half million lines of its C code (and the blog of the same author, the late Jacques Pitrat);
- some chapters of the Dragon book (explaining compilers);
- the book A Retargetable C Compiler (ISBN-13: 978-0805316704).
Macros are most definitely not the same as functions. Macros are text substitutions¹; they are not called like functions. The problem with your AREA macro is that it won't behave well if you pass an expression like AREA(l1+x,l2) - that will expand to (l1+x * l2), which won't do what you want. Arguments to macros are not evaluated; they are expanded in place. Macros and function-like macros are useful for creating symbolic constants, simplifying repeated blocks of text, and implementing crude template-like behavior.

¹ Strictly speaking they are token substitutions, but the principle is the same.
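A minimal program (my addition, not part of the original answer) makes that expansion pitfall visible:

#include <stdio.h>

#define AREA(l, b) (l * b)            /* buggy: parameters unparenthesized */
#define AREA_FIXED(l, b) ((l) * (b))  /* corrected version */

int main(void)
{
    int l1 = 10, l2 = 5, x = 2;

    /* Expands to (l1 + x * l2) = 10 + 2*5 = 20, not (10+2)*5 = 60. */
    printf("buggy: %d\n", AREA(l1 + x, l2));

    /* Expands to ((l1 + x) * (l2)) = 60, as intended. */
    printf("fixed: %d\n", AREA_FIXED(l1 + x, l2));
    return 0;
}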
I agree with @Eric Postpischil. Old debuggers don't support macros and don't know anything about them, since macros are expanded away before compilation; this makes debugging harder if you use an old debugger. Is this the only benefit of using macros instead of functions? No. Macros are function-like and can be used to define only very simple things such as simple formulas, but it's not recommended to define a C function with them, because they try to put everything linear and flat; that's why you may face software design issues, and they make debugging harder. So this is not always a benefit. In your case, though, I think it's not a big or serious problem.
Figure out function parameter count at compile time
I have a C library (with C headers) which exists in two different versions. One of them has a function that looks like this:

int test(char * a, char * b, char * c, bool d, int e);

And the other version looks like this:

int test(char * a, char * b, char * c, bool d);

(for which e is not given as a function parameter but is hard-coded in the function itself). The library and its headers do not define or include any way to check for the library version, so I can't just use an #if or #ifdef to check for a version number. Is there any way I can write a C program that can be compiled with both versions of this library, depending on which one is installed when the program is compiled? That way contributors who want to compile my program are free to use either version of the library, and the tool could be compiled with either. So, to clarify, I'm looking for something like this (or similar):

#if HAS_ARGUMENT_COUNT(test, 5)
    test("a", "b", "c", true, 20);
#elif HAS_ARGUMENT_COUNT(test, 4)
    test("a", "b", "c", true);
#else
#error "wrong argument count"
#endif

Is there any way to do that in C? I was unable to figure out a way. The library would be libogc ( https://github.com/devkitPro/libogc ), which changed its definition of if_config a while ago, and I'd like to make my program work with both the old and the new version. I was unable to find any version identifier in the library. At the moment I'm using a modified version of GCC 8.3.
This should be done at the configure stage, using an Autoconf (or CMake, or whatever) test step -- basically, attempting to compile a small program which uses the five-parameter signature, and seeing if it compiles successfully -- to determine which version of the library is in use. That can be used to set a preprocessor macro which you can use in an #if block in your code.
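As a minimal sketch of such a probe (the header name "library.h" and the macro name HAVE_TEST_5ARGS are invented here for illustration; neither appears in the question), the configure step would try to compile something like this, and define the macro only on success:

/* conftest.c - hypothetical configure-time probe.
 * If the installed header declares the four-parameter test(), passing a
 * fifth argument violates the prototype and this file fails to compile;
 * the configure step then leaves HAVE_TEST_5ARGS undefined. */
#include <stdbool.h>
#include "library.h"

int main(void)
{
    return test("a", "b", "c", true, 20);
}

The program itself would then branch on the result:

#ifdef HAVE_TEST_5ARGS
    test("a", "b", "c", true, 20);
#else
    test("a", "b", "c", true);
#endif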
I think there's no way to do this at the preprocessing stage (at least not without some external scripts). On the other hand, there is a way to detect a function's signature at compile time if you're using C11: _Generic. But remember: you can't use this in a macro like #if, because primary expressions aren't evaluated at the preprocessing stage, so you can't dynamically choose to call the function with signature 1 or 2 in that stage.

#define WEIRD_LIB_FUNC_TYPE(T) _Generic(&(T), \
    int (*)(char *, char *, char *, bool, int): 1, \
    int (*)(char *, char *, char *, bool): 2, \
    default: 0)

printf("test's signature: %d\n", WEIRD_LIB_FUNC_TYPE(test));
// will print 1 if 'test' expects the extra argument, or 2 otherwise

I'm sorry if this does not answer your question. If you really can't detect the version from the "stock" library header file, there are workarounds where you can #ifdef something that's only present in a specific version of that library. This is just horrible library design.

Update: after reading the comments, I should clarify for future readers that it isn't possible at the preprocessing stage, but it is still possible at compile time. You'd just have to conditionally cast the function call based on my snippet above.

typedef int (*TYPE_A)(char *, char *, char *, bool, int);
typedef int (*TYPE_B)(char *, char *, char *, bool);

int newtest(char *a, char *b, char *c, bool d, int e)
{
    void (*func)(void) = (void (*)(void))&test;
    if (_Generic(&test, TYPE_A: 1, TYPE_B: 2, default: 0) == 1) {
        return ((TYPE_A)func)(a, b, c, d, e);
    }
    return ((TYPE_B)func)(a, b, c, d);
}

This indeed works, although it might be controversial to cast a function this way. The upside is, as @pizzapants184 said, the condition will be optimized away, because the _Generic call is evaluated at compile time.
I don't see any way to do that with standard C. If you are compiling with gcc, a very, very ugly way can be using gcc -aux-info in a command and passing the number of parameters with -D:

#!/bin/sh
gcc -aux-info output.info demo.c
COUNT=`grep "extern int foo" output.info | tr -dc "," | wc -m`
rm output.info
gcc -o demo demo.c -DCOUNT="$COUNT + 1"
./demo

This snippet

#include <stdio.h>

int foo(int a, int b, int c);

#ifndef COUNT
#define COUNT 0
#endif

int main(void)
{
    printf("foo has %d parameters\n", COUNT);
    return 0;
}

outputs

foo has 3 parameters
Attempting to support compiling code with multiple versions of a static library serves no useful purpose. Update your code to use the latest release and stop making life more difficult than it needs to be.
In Dennis Ritchie's original C language, a function could be passed any number of arguments, regardless of the number of parameters it expected, provided that the function didn't access any parameters beyond those that were passed to it. Even on platforms whose normal calling convention wouldn't be able to accommodate this flexibility, C compilers would generally use a different calling convention that could support it, unless functions were marked with qualifiers like pascal to indicate that they should use the ordinary calling convention. Thus, something like the following would have had fully defined behavior in Ritchie's original C language:

int addTwoOrThree(count, x, y, z)
int count, x, y, z;
{
    if (count == 3)
        return x+y+z;
    else
        return x+y;
}

int test()
{
    return addTwoOrThree(2, 10, 20) + addTwoOrThree(3, 1, 2, 3);
}

Because there are some platforms where it would be impractical to support such flexibility by default, the C Standard does not require that compilers meaningfully process any calls to functions which have more or fewer arguments than expected, except that functions which have been declared with a ... parameter will "expect" any number of arguments that is at least as large as the number of actual specified parameters. It is thus rare for code to be written that would exploit the flexibility that was present in Ritchie's language. Nonetheless, many implementations will still accept code written to support that pattern if the function being called is in a separate compilation unit from the callers, and it is declared but not prototyped within the compilation units that call it.
You don't. The tools you're working with are statically linked and don't support versioning. You can get around it using all kinds of tricks and tips that have been mentioned, but at the end of the day they are ugly patchworks of something that makes no sense in this context (toolkit/code environment). You design your code for the version of the toolkit you have installed; it's a hard requirement. I also don't understand why you would want to design your GameCube/Wii code to allow building on different versions: the toolkit is constantly changing to fix bugs, assumptions, etc. If you want your code to use an old version that potentially has bugs or does things wrong, that is on you. I think you should realize what kind of botch work you're dealing with if you need or want to do this with a constantly evolving toolkit. I also think, but this is because I know you and your relationship with DevKitPro, that you're asking this because you have an older version installed and your CI builds won't work because they use a newer version (from Docker). It's either that, or you have multiple versions installed on your machine for a different project you build (but won't update the source for, for some odd reason).
If your compiler is a recent GCC, e.g. some GCC 10 in November 2020, you might write your own GCC plugin to check the signature in your header files (and emit appropriate and related C preprocessor #define-s and/or #ifdef, à la GNU autoconf). Your plugin could (for example) fill some sqlite database and you would later generate some #include-d header file. You then would set up your build automation (e.g. your Makefile) to use that GCC plugin and the data it has computed when needed. For a single function, such an approach is overkill. For some large project, it could make sense, in particular if you also decide to also code some project-specific coding rules validator in your GCC plugin. Writing a GCC plugin could take weeks of your time, and you may need to patch your plugin source code when you would switch to a future GCC 11. See also this draft report and the European CHARIOT and DECODER projects (funding the work described in that report). BTW, you might ask the authors of that library to add some versioning metadata. Inspiration might come from libonion or Glib or libgccjit. BTW, as rightly commented in this issue, you should not use an unmaintained old version of some opensource library. Use the one that is worked on. I'd like to make my program work with both the old and the new version. Why? making your program work with the old (unmaintained) version of libogc is adding burden to both you and them. I don't understand why you would depend upon some old unmaintained library, if you can avoid doing that. PS. You could of course write a plugin for GCC 8. I do recommend switching to GCC 10: it did improve.
I'm not sure this solves your specific problem, or helps you at all, but here's a preprocessor contraption, due to Laurent Deniau, that counts the number of arguments passed to a function at compile time. Meaning, something like args_count(a,b,c) evaluates (at compile time) to the literal constant 3, and something like args_count(__VA_ARGS__) (within a variadic macro) evaluates (at compile time) to the number of arguments passed to the macro. This allows you, for instance, to call variadic functions without specifying the number of arguments, because the preprocessor does it for you.

So, if you have a variadic function

void function_backend(int N, ...){
    // do stuff
}

where you (typically) HAVE to pass the number of arguments N, you can automate that process by writing a "frontend" variadic macro

#define function_frontend(...) function_backend(args_count(__VA_ARGS__), __VA_ARGS__)

And now you can call function_frontend() with as many arguments as you want. I made a YouTube tutorial about this.

#include <stdint.h>
#include <stdarg.h>
#include <stdio.h>

#define m_args_idim__get_arg100( \
arg00,arg01,arg02,arg03,arg04,arg05,arg06,arg07,arg08,arg09,arg0a,arg0b,arg0c,arg0d,arg0e,arg0f, \
arg10,arg11,arg12,arg13,arg14,arg15,arg16,arg17,arg18,arg19,arg1a,arg1b,arg1c,arg1d,arg1e,arg1f, \
arg20,arg21,arg22,arg23,arg24,arg25,arg26,arg27,arg28,arg29,arg2a,arg2b,arg2c,arg2d,arg2e,arg2f, \
arg30,arg31,arg32,arg33,arg34,arg35,arg36,arg37,arg38,arg39,arg3a,arg3b,arg3c,arg3d,arg3e,arg3f, \
arg40,arg41,arg42,arg43,arg44,arg45,arg46,arg47,arg48,arg49,arg4a,arg4b,arg4c,arg4d,arg4e,arg4f, \
arg50,arg51,arg52,arg53,arg54,arg55,arg56,arg57,arg58,arg59,arg5a,arg5b,arg5c,arg5d,arg5e,arg5f, \
arg60,arg61,arg62,arg63,arg64,arg65,arg66,arg67,arg68,arg69,arg6a,arg6b,arg6c,arg6d,arg6e,arg6f, \
arg70,arg71,arg72,arg73,arg74,arg75,arg76,arg77,arg78,arg79,arg7a,arg7b,arg7c,arg7d,arg7e,arg7f, \
arg80,arg81,arg82,arg83,arg84,arg85,arg86,arg87,arg88,arg89,arg8a,arg8b,arg8c,arg8d,arg8e,arg8f, \
arg90,arg91,arg92,arg93,arg94,arg95,arg96,arg97,arg98,arg99,arg9a,arg9b,arg9c,arg9d,arg9e,arg9f, \
arga0,arga1,arga2,arga3,arga4,arga5,arga6,arga7,arga8,arga9,argaa,argab,argac,argad,argae,argaf, \
argb0,argb1,argb2,argb3,argb4,argb5,argb6,argb7,argb8,argb9,argba,argbb,argbc,argbd,argbe,argbf, \
argc0,argc1,argc2,argc3,argc4,argc5,argc6,argc7,argc8,argc9,argca,argcb,argcc,argcd,argce,argcf, \
argd0,argd1,argd2,argd3,argd4,argd5,argd6,argd7,argd8,argd9,argda,argdb,argdc,argdd,argde,argdf, \
arge0,arge1,arge2,arge3,arge4,arge5,arge6,arge7,arge8,arge9,argea,argeb,argec,arged,argee,argef, \
argf0,argf1,argf2,argf3,argf4,argf5,argf6,argf7,argf8,argf9,argfa,argfb,argfc,argfd,argfe,argff, \
arg100, ...) arg100

#define m_args_idim(...) \
m_args_idim__get_arg100(, ##__VA_ARGS__, \
0xff,0xfe,0xfd,0xfc,0xfb,0xfa,0xf9,0xf8,0xf7,0xf6,0xf5,0xf4,0xf3,0xf2,0xf1,0xf0, \
0xef,0xee,0xed,0xec,0xeb,0xea,0xe9,0xe8,0xe7,0xe6,0xe5,0xe4,0xe3,0xe2,0xe1,0xe0, \
0xdf,0xde,0xdd,0xdc,0xdb,0xda,0xd9,0xd8,0xd7,0xd6,0xd5,0xd4,0xd3,0xd2,0xd1,0xd0, \
0xcf,0xce,0xcd,0xcc,0xcb,0xca,0xc9,0xc8,0xc7,0xc6,0xc5,0xc4,0xc3,0xc2,0xc1,0xc0, \
0xbf,0xbe,0xbd,0xbc,0xbb,0xba,0xb9,0xb8,0xb7,0xb6,0xb5,0xb4,0xb3,0xb2,0xb1,0xb0, \
0xaf,0xae,0xad,0xac,0xab,0xaa,0xa9,0xa8,0xa7,0xa6,0xa5,0xa4,0xa3,0xa2,0xa1,0xa0, \
0x9f,0x9e,0x9d,0x9c,0x9b,0x9a,0x99,0x98,0x97,0x96,0x95,0x94,0x93,0x92,0x91,0x90, \
0x8f,0x8e,0x8d,0x8c,0x8b,0x8a,0x89,0x88,0x87,0x86,0x85,0x84,0x83,0x82,0x81,0x80, \
0x7f,0x7e,0x7d,0x7c,0x7b,0x7a,0x79,0x78,0x77,0x76,0x75,0x74,0x73,0x72,0x71,0x70, \
0x6f,0x6e,0x6d,0x6c,0x6b,0x6a,0x69,0x68,0x67,0x66,0x65,0x64,0x63,0x62,0x61,0x60, \
0x5f,0x5e,0x5d,0x5c,0x5b,0x5a,0x59,0x58,0x57,0x56,0x55,0x54,0x53,0x52,0x51,0x50, \
0x4f,0x4e,0x4d,0x4c,0x4b,0x4a,0x49,0x48,0x47,0x46,0x45,0x44,0x43,0x42,0x41,0x40, \
0x3f,0x3e,0x3d,0x3c,0x3b,0x3a,0x39,0x38,0x37,0x36,0x35,0x34,0x33,0x32,0x31,0x30, \
0x2f,0x2e,0x2d,0x2c,0x2b,0x2a,0x29,0x28,0x27,0x26,0x25,0x24,0x23,0x22,0x21,0x20, \
0x1f,0x1e,0x1d,0x1c,0x1b,0x1a,0x19,0x18,0x17,0x16,0x15,0x14,0x13,0x12,0x11,0x10, \
0x0f,0x0e,0x0d,0x0c,0x0b,0x0a,0x09,0x08,0x07,0x06,0x05,0x04,0x03,0x02,0x01,0x00, \
)

typedef struct{ int32_t x0,x1; }ivec2;

int32_t max0__ivec2(int32_t nelems, ...){
    // The largest component 0 in a list of 2D integer vectors
    int32_t max = ~(1ll<<31) + 1;  // Assuming two's complement
    va_list args;
    va_start(args, nelems);
    for(int i=0; i<nelems; ++i){
        ivec2 a = va_arg(args, ivec2);
        max = max > a.x0 ? max : a.x0;
    }
    va_end(args);
    return max;
}

#define max0_ivec2(...) max0__ivec2(m_args_idim(__VA_ARGS__), __VA_ARGS__)

int main(){
    int32_t max = max0_ivec2(((ivec2){0,1}), ((ivec2){2,3}), ((ivec2){4,5}), ((ivec2){6,7}));
    printf("%d\n", max);
}
How to force a crash in C, is dereferencing a null pointer a (fairly) portable way?
I'm writing my own test-runner for my current project. One feature (that's probably quite common with test-runners) is that every testcase is executed in a child process, so the test-runner can properly detect and report a crashing testcase.

I want to also test the test-runner itself, so one testcase has to force a crash. I know "crashing" is not covered by the C standard and might just happen as a result of undefined behavior. So this question is more about the behavior of real-world implementations.

My first attempt was to just dereference a null pointer:

int c = *((int *)0);

This worked in a debug build on GNU/Linux and Windows, but failed to crash in a release build because the unused variable c was optimized out, so I added

printf("%d", c); // to prevent optimizing away the crash

and thought I was settled. However, trying my code with clang instead of gcc revealed a surprise during compilation:

[CC]   obj/x86_64-pc-linux-gnu/release/src/test/test/test_s.o
src/test/test/test.c:34:13: warning: indirection of non-volatile null pointer
      will be deleted, not trap [-Wnull-dereference]
    int c = *((int *)0);
            ^~~~~~~~~~~
src/test/test/test.c:34:13: note: consider using __builtin_trap() or qualifying
      pointer with 'volatile'
1 warning generated.

And indeed, the clang-compiled testcase didn't crash.

So, I followed the advice of the warning and now my testcase looks like this:

PT_TESTMETHOD(test_expected_crash)
{
    PT_Test_expectCrash();

    // crash intentionally
    int *volatile nptr = 0;
    int c = *nptr;
    printf("%d", c); // to prevent optimizing away the crash
}

This solved my immediate problem: the testcase "works" (aka crashes) with both gcc and clang. I guess because dereferencing the null pointer is undefined behavior, clang is free to compile my first code into something that doesn't crash. The volatile qualifier removes the ability to be sure at compile time that this really will dereference null.

Now my questions are:

- Does this final code guarantee that the null dereference actually happens at runtime?
- Is dereferencing null indeed a fairly portable way of crashing on most platforms?
I wouldn't rely on that method as being robust if I were you. Can't you use abort(), which is part of the C standard and is guaranteed to cause an abnormal program termination event?
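For reference, a minimal sketch of that approach:

#include <stdlib.h>

int main(void)
{
    abort();    /* raises SIGABRT: abnormal termination per the C standard */
    return 0;   /* never reached */
}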
The answer referring to abort() was great; I really didn't think of that, and it's indeed a perfectly portable way of forcing an abnormal program termination. Trying it with my code, I came across the fact that msvcrt (Microsoft's C runtime) implements abort() in a special, chatty way; it outputs the following to stderr:

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

That's not so nice; at the least, it unnecessarily clutters the output of a complete test run. So I had a look at __builtin_trap(), which is also referenced in clang's warning. It turns out this gives me exactly what I was looking for:

LLVM code generator translates __builtin_trap() to a trap instruction if it is supported by the target ISA. Otherwise, the builtin is translated into a call to abort.

It's also available in gcc starting with version 4.2.4:

This function causes the program to exit abnormally. GCC implements this function by using a target-dependent mechanism (such as intentionally executing an illegal instruction) or by calling abort.

As this does something similar to a real crash, I prefer it over a simple abort(). For the fallback, it's still an option to try your own illegal operation like the null pointer dereference, but just add a call to abort() in case the program somehow makes it there without crashing.

So, all in all, the solution looks like this, using the much more handy __has_builtin() macro where available and falling back to a minimum GCC version check. (Note that clang defines __GNUC__ as a low 4.2.x version, so __has_builtin() must be checked first or clang would wrongly take the fallback path.)

#undef HAVE_BUILTIN_TRAP
#ifdef __has_builtin
/* clang (and recent gcc): ask the compiler directly */
# if __has_builtin(__builtin_trap)
#  define HAVE_BUILTIN_TRAP
# endif
#elif defined(__GNUC__)
/* older gcc without __has_builtin: check the version */
# define GCC_VERSION (__GNUC__ * 10000 \
     + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
# if GCC_VERSION > 40203
#  define HAVE_BUILTIN_TRAP
# endif
#endif

#ifdef HAVE_BUILTIN_TRAP
# define crashMe() __builtin_trap()
#else
# include <stdio.h>
# define crashMe() do { \
      int *volatile iptr = 0; \
      int i = *iptr; \
      printf("%d", i); \
      abort(); } while (0)
#endif

// [...]

PT_TESTMETHOD(test_expected_crash)
{
    PT_Test_expectCrash();

    // crash intentionally
    crashMe();
}
You can also write to memory instead of reading it:

*((int *)0) = 0;
No, dereferencing a NULL pointer is not a portable way of crashing a program. It is undefined behavior, which means just that: you have no guarantees what will happen.

As it happens, for the most part, under any of the three main OSes used today on desktop computers, that being MacOS, Linux and Windows NT (*), dereferencing a NULL pointer will immediately crash your program.

That said: "The worst possible result of undefined behavior is for it to do what you were expecting."

I purposely put a star beside Windows NT, because under Windows 95/98/ME, I can craft a program that has the following source:

int main()
{
    int *pointer = NULL;
    int i = *pointer;
    return 0;
}

that will run without crashing. Compile it as a TINY mode .COM file under 16-bit DOS, and you'll be just fine. Ditto running the same source with just about any C compiler under CP/M. Ditto running that on some embedded systems. I've not tested it on an Arduino, but I would not want to bet either way on the outcome. I do know for certain that, were a C compiler available for the 8051 systems I cut my teeth on, that program would run fine on those.
The program below should work. It might cause some collateral damage, though.

#include <string.h>

void crashme( char *str)
{
    char *omg;
    for(omg=strtok(str, "" ); omg ; omg=strtok(NULL, "") ) {
        strcat(omg , "wtf");
    }
    *omg =0; // always NUL-terminate a NULL string !!!
}

int main(void)
{
    char buff[20];
    // crashme( "WTF" ); // works!
    // crashme( NULL ); // works, too
    crashme( buff ); // Maybe a bit too slow ...
    return 0;
}
If statement with ZERO as condition
I am reading the Linux sources and noticed statements like

if (0) {
    ....
}

What is this magic about? Example: http://lxr.free-electrons.com/source/arch/x86/include/asm/percpu.h#L132
In this particular macro you're referring to:

132         if (0) {                                                \
133                 pao_T__ pao_tmp__;                              \
134                 pao_tmp__ = (val);                              \
135                 (void)pao_tmp__;                                \
136         }                                                       \

the if (0) { ... } block is a way of "using" val without actually using it. The body of this block of code will be evaluated by the compiler, but no code will actually be generated, as an if (0) should always be eliminated - it can never run.

Note that this is a macro. As such, var and val may be of any type - the preprocessor doesn't care. pao_T__ is typedefed to typeof(var). As Andy Shevchenko pointed out, this block of code exists to ensure that val and var are type-compatible, by creating a variable of the same type as var, and assigning val to it. If the types weren't compatible, this assignment would generate a compiler error.

In general, many of the Linux kernel header files should be considered black magic. They are an interesting example of the meta programming that one can do with the C preprocessor, usually for the sake of performance.
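A minimal standalone sketch of the same trick (the macro and all names are invented here for illustration, using GCC's __typeof__ extension much as the kernel uses typeof):

#include <stdio.h>

/* "Use" val in a dead if (0) block: the compiler type-checks the
 * assignment, but emits no code for it. */
#define CHECK_ASSIGN(var, val)          \
    do {                                \
        if (0) {                        \
            __typeof__(var) tmp__;      \
            tmp__ = (val);              \
            (void)tmp__;                \
        }                               \
    } while (0)

int main(void)
{
    int x = 0;
    CHECK_ASSIGN(x, 42);            /* fine: int assigned from int */
    /* CHECK_ASSIGN(x, (void *)0); would provoke a diagnostic:
     * assignment makes integer from pointer */
    printf("%d\n", x);
    return 0;
}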
How do I compile NetHack in Windows 7?
I like NetHack and I want to dink around with the source a bit for fun. Before I do that I would like to be able to get it to compile out of the box, but I'm having no small amount of difficulty getting that to happen. I downloaded the source code from here and I followed the instructions here, but it didn't work. I ended up getting the following:

C:\nethack-3.4.3\src>mingw32-make -f Makefile.gcc install
creating directory o
gcc -c -mms-bitfields -I../include -g -DWIN32CON -oo/makedefs.o ../util/makedefs.c
gcc -c -mms-bitfields -I../include -g -DWIN32CON -DDLB -oo/monst.o ../src/monst.c
gcc -c -mms-bitfields -I../include -g -DWIN32CON -DDLB -oo/objects.o ../src/objects.c
..\util\makedefs -v
Makefile.gcc:655: recipe for target '../include/date.h' failed
mingw32-make: *** [../include/date.h] Error -1073741819

I looked at the line it was talking about, but it didn't really tell me anything. I did notice that the date.h file being created in the include directory was always empty, but that doesn't help me very much either.

I read the Install.nt README and the directions seemed pretty clear-cut. However, since I didn't change anything, I don't know why it would fail to compile... I consider myself to be a competent programmer, but I know next to nothing when it comes to makefiles and compiling C code into an executable application, so I'm pretty well lost here.

I downloaded and installed the MinGW... everything, by which I mean that there is nothing left uninstalled when I run the MinGW installer.

What am I doing wrong here?

EDIT: As date.h was being mentioned:

#
# date.h should be remade every time any of the source or include
# files is modified.
#
$(INCL)/date.h $(OPTIONS_FILE): $(U)makedefs.exe
	$(subst /,\,$(U)makedefs -v)

I did notice it seems to be making some kind of call to OPTIONS_FILE, which seems to be commented out. I will uncomment it and see what happens.

#$(OPTIONS_FILE): $(U)makedefs.exe
#$(subst /,\,$(U)makedefs -v)

EDIT 2: That didn't work. Is it possible that I have to manually create/update the date.h file? If so, what do I put into it? Sounds like a question for Google...

EDIT 3: I found this for a much older version and tried to change it up, but it didn't work either...

EDIT 4: Someone mentioned makedefs, which seems to be the thing crashing.
I found the C function that appears to be causing the problem:

void do_date()
{
	long clocktim = 0;
	char *c, cbuf[60], buf[BUFSZ];
	const char *ul_sfx;

	filename[0]='\0';
#ifdef FILE_PREFIX
	Strcat(filename,file_prefix);
#endif
	Sprintf(eos(filename), INCLUDE_TEMPLATE, DATE_FILE);
	if (!(ofp = fopen(filename, WRTMODE))) {
		perror(filename);
		exit(EXIT_FAILURE);
	}
	Fprintf(ofp,"/*\tSCCS Id: @(#)date.h\t3.4\t2002/02/03 */\n\n");
	Fprintf(ofp,Dont_Edit_Code);
#ifdef KR1ED
	(void) time(&clocktim);
	Strcpy(cbuf, ctime(&clocktim));
#else
	(void) time((time_t *)&clocktim);
	Strcpy(cbuf, ctime((time_t *)&clocktim));
#endif
	for (c = cbuf; *c; c++)
		if (*c == '\n') break;
	*c = '\0';	/* strip off the '\n' */
	Fprintf(ofp,"#define BUILD_DATE \"%s\"\n", cbuf);
	Fprintf(ofp,"#define BUILD_TIME (%ldL)\n", clocktim);
	Fprintf(ofp,"\n");
#ifdef NHSTDC
	ul_sfx = "UL";
#else
	ul_sfx = "L";
#endif
	Fprintf(ofp,"#define VERSION_NUMBER 0x%08lx%s\n", version.incarnation, ul_sfx);
	Fprintf(ofp,"#define VERSION_FEATURES 0x%08lx%s\n", version.feature_set, ul_sfx);
#ifdef IGNORED_FEATURES
	Fprintf(ofp,"#define IGNORED_FEATURES 0x%08lx%s\n",
			(unsigned long) IGNORED_FEATURES, ul_sfx);
#endif
	Fprintf(ofp,"#define VERSION_SANITY1 0x%08lx%s\n", version.entity_count, ul_sfx);
	Fprintf(ofp,"#define VERSION_SANITY2 0x%08lx%s\n", version.struct_sizes, ul_sfx);
	Fprintf(ofp,"\n");
	Fprintf(ofp,"#define VERSION_STRING \"%s\"\n", version_string(buf));
	Fprintf(ofp,"#define VERSION_ID \\\n \"%s\"\n",
			version_id_string(buf, cbuf));
	Fprintf(ofp,"\n");
#ifdef AMIGA
	{
	struct tm *tm = localtime((time_t *) &clocktim);
	Fprintf(ofp,"#define AMIGA_VERSION_STRING ");
	Fprintf(ofp,"\"\\0$VER: NetHack %d.%d.%d (%d.%d.%d)\"\n",
			VERSION_MAJOR, VERSION_MINOR, PATCHLEVEL,
			tm->tm_mday, tm->tm_mon+1, tm->tm_year+1900);
	}
#endif
	Fclose(ofp);
	return;
}

Also, I should mention that when it gets to this point in the compile process, makedefs immediately crashes (the original post shows a crash-dialog screenshot here). So we've narrowed down the problem (I think?) to the makedefs helper program that is breaking things, so now I guess the next step would be to find out why?

EDIT 5: It's been suggested that a special parameter should be used when compiling makedefs.c. I've taken a look at the Makefile to find out where the compile takes place, and I think I've found where that is happening, but I don't really know what's going on here.

$(U)makedefs.exe: $(MAKEOBJS)
	@$(link) $(LFLAGSU) -o$@ $(MAKEOBJS)

$(O)makedefs.o: $(CONFIG_H) $(INCL)/monattk.h $(INCL)/monflag.h \
		$(INCL)/objclass.h $(INCL)/monsym.h $(INCL)/qtext.h \
		$(INCL)/patchlevel.h $(U)makedefs.c $(O)obj.tag
	$(cc) $(CFLAGSU) -o$@ $(U)makedefs.c

I know that $(*) is a variable, or the Makefile equivalent of a variable. $(U) points to $(UTIL)/, and $(UTIL) points to ../util. $(MAKEOBJS) points to $(O)makedefs.o $(O)monst.o $(O)objects.o. $(O) points to $(OBJ)/, which points to o, so that would make $(O)makedefs.o the same as o/makedefs.o, which makes sense considering the behavior I've observed on semi-successful runs (several files are compiled before the big freeze). Anyway, $(link) points to gcc. $(LFLAGSU) points to $(LFLAGSBASEC), which points to $(linkdebug), which points to -g. $(CONFIG_H) points to a large number of header files:

CONFIG_H = $(INCL)/config.h $(INCL)/config1.h $(INCL)/tradstdc.h \
	$(INCL)/global.h $(INCL)/coord.h $(INCL)/vmsconf.h \
	$(INCL)/system.h $(INCL)/unixconf.h $(INCL)/os2conf.h \
	$(INCL)/micro.h $(INCL)/pcconf.h $(INCL)/tosconf.h \
	$(INCL)/amiconf.h $(INCL)/macconf.h $(INCL)/beconf.h \
	$(INCL)/ntconf.h $(INCL)/nhlan.h

$(INCL) points to ../include.
$(CFLAGSU) points to $(CFLAGSBASE) $(WINPFLAG). $(CFLAGSBASE) points to -c $(cflags) -I$(INCL) $(WINPINC) $(cdebug). $(cflags) points to -mms-bitfields. $(WINPINC) points to -I$(WIN32). $(WIN32) points to ../win/win32. $(cdebug) points to -g. $(WINPFLAG) points to -DTILES -DMSWIN_GRAPHICS -D_WIN32_IE=0x0400.

. . . And there it is. I think that's what I need to modify to make this work with what was mentioned by RossRidge: -D_USE_32BIT_TIME_T. However, since I've come this far, I do want to find out what some of this stuff means. When looking at the first line I see $(U)makedefs.exe :. To me that appears to be a declaration of the target for the compiled output file? Is that correct? Also, what is the meaning of the @ before the $(link) $(LFLAGSU) and after the -o$? And what is the meaning of the $ after the -o? Anyway, I want to try what I figured out and see if it works at all.

... Aaaand adding -D_USE_32BIT_TIME_T to WINPFLAG didn't work.

FINAL(ish) EDIT: Turns out RossRidge was correct in his suggestion to use the -D_USE_32BIT_TIME_T flag. My mistake was putting it in the wrong place. If you take a look at the Makefile.gcc that comes in the box, look at line 165 (which is in an IF statement). You want to tack -D_USE_32BIT_TIME_T onto the end of that. BUT you will also want to tack it onto the end of line 176, which is on the ELSE end of that IF statement. So that entire block would look something like this instead (not a huge change, but still significant enough to make it crash if you don't do it and you're running under my situation):

################################################
#                                              #
# Nothing below here should have to be changed.#
#                                              #
################################################

ifeq "$(GRAPHICAL)" "Y"
WINPORT = $(O)tile.o $(O)mhaskyn.o $(O)mhdlg.o \
	$(O)mhfont.o $(O)mhinput.o $(O)mhmain.o $(O)mhmap.o \
	$(O)mhmenu.o $(O)mhmsgwnd.o $(O)mhrip.o $(O)mhsplash.o \
	$(O)mhstatus.o $(O)mhtext.o $(O)mswproc.o $(O)winhack.o
WINPFLAG = -DTILES -DMSWIN_GRAPHICS -D_WIN32_IE=0x0400 -D_USE_32BIT_TIME_T
NHRES = $(O)winres.o
WINPINC = -I$(WIN32)
WINPHDR = $(WIN32)/mhaskyn.h $(WIN32)/mhdlg.h $(WIN32)/mhfont.h \
	$(WIN32)/mhinput.h $(WIN32)/mhmain.h $(WIN32)/mhmap.h \
	$(WIN32)/mhmenu.h $(WIN32)/mhmsg.h $(WIN32)/mhmsgwnd.h \
	$(WIN32)/mhrip.h $(WIN32)/mhstatus.h \
	$(WIN32)/mhtext.h $(WIN32)/resource.h $(WIN32)/winMS.h
WINPLIBS = -lcomctl32 -lwinmm
else
WINPORT = $(O)nttty.o
WINPFLAG = -DWIN32CON -D_USE_32BIT_TIME_T
WINPHDR =
NHRES = $(O)console.o
WINPINC =
WINPLIBS = -lwinmm
endif
(I don't know if I deserve credit for the answer, since I wouldn't have been aware of what the problem was without Harry Johnston's and indiv's comments, but I'll try to expand the comments into a full answer.)

As indiv explained, the reason why makedefs.exe crashes is that ctime returns NULL. Normally you wouldn't expect ctime to do this, so we need to check the documentation to find out under what circumstances it will return an error. Since MinGW is being used to do the compiling, we need to look at Microsoft's Visual C++ documentation. This is because MinGW doesn't have its own C runtime; it just uses Microsoft's. Looking at the Visual Studio C Run-Time Library Reference entry for ctime, we find:

Return Value
A pointer to the character string result. NULL will be returned if:
- time represents a date before midnight, January 1, 1970, UTC.
- If you use _ctime32 or _wctime32 and time represents a date after 03:14:07 January 19, 2038.
- If you use _ctime64 or _wctime64 and time represents a date after 23:59:59, December 31, 3000, UTC.

Now it's pretty safe to assume the original poster hasn't set his system clock to a time far into the future or long in the past. So why would ctime be using the wrong time?

Harry Johnston pointed out that the code was using long instead of time_t to store time values. This isn't too surprising. NetHack is really old code, and originally Unix stored its time in long values; using time_t for time values came later. NetHack would have had to deal with old systems that didn't have time_t for a significant chunk of its active development period.

That explains why the NetHack source is using the wrong type, but it doesn't quite explain why it's passing the wrong value to ctime. The fact that we don't see a description of the return value for ctime itself, just _ctime32 and _ctime64, gives us a clue. If time_t is a 64-bit type, then using long instead will be a problem. On Windows, long is only 32 bits, so it would mean ctime is being passed a number that's one part time value, one part random bits. Reading on in the documentation confirms this is the case, and gives us a possible solution:

ctime is an inline function which evaluates to _ctime64 and time_t is equivalent to __time64_t. If you need to force the compiler to interpret time_t as the old 32-bit time_t, you can define _USE_32BIT_TIME_T. Doing this will cause ctime to evaluate to _ctime32. This is not recommended because your application may fail after January 18, 2038, and it is not allowed on 64-bit platforms.

Now, since defining _USE_32BIT_TIME_T only affects how the C headers are compiled, and since MinGW supplies its own C headers, it's possible that MinGW doesn't support this. A quick check of MinGW's time.h reveals that it does, so the simple solution is to use the -D_USE_32BIT_TIME_T compiler option to define this macro.
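To see the mismatch in isolation, here is a deliberately buggy minimal sketch (my addition, not from the original answer) of what makedefs' do_date() does, assuming a target where long is 32 bits and time_t is 64 bits (as with MinGW on the modern Microsoft CRT):

#include <stdio.h>
#include <time.h>

int main(void)
{
    long clocktim = 0;   /* 32 bits on Windows: too small for a 64-bit time_t */

    /* The cast silences the compiler, but time() writes a 64-bit time_t
     * through a pointer to a 32-bit long: undefined behavior that
     * clobbers adjacent memory. */
    (void) time((time_t *)&clocktim);

    /* Depending on what the surrounding bytes hold when ctime() reads
     * 64 bits back, it may see an out-of-range value and return NULL,
     * as it did in makedefs, where the NULL then reached Strcpy(). */
    char *s = ctime((time_t *)&clocktim);
    if (s == NULL)
        puts("ctime returned NULL");
    else
        fputs(s, stdout);
    return 0;
}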