Suppose my source code contains a call to a function that takes a variable number of arguments. I want to run some kind of static analysis on the source to find the types of the arguments actually passed to the function. For example, if my function call is:
foo(a, b, c)
I want to find the data type of a, b and c and store this information.
You pretty well have to do the parse-and-build-a-symbol-table part of compiling the program.
Which means running the preprocessor, and lexing as well.
That's the bad news.
The good news is that you don't have to do much of the hard stuff. There is no need to build an AST, and every part of the code can be a no-op except typedefs; struct, union, and enum definitions; variable and function declarations and definitions; and the analysis of the function call arguments.
On further thought prompted by Chris' comments: You do have to be able to analyze the types of expressions and handle the va-arg promotions, as well.
It is still a smaller project than writing a whole compiler, but should be approached with some thought.
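As a rough illustration of the va-arg promotion step mentioned above, the sketch below maps the declared type of an argument to the type that is actually passed in the variable part of a call. The BasicType enum and the functions promote_vararg and promote_call_args are hypothetical names invented for this example, and the mapping assumes a typical platform where int can represent every value of the smaller integer types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified set of argument types a checker might track. */
typedef enum {
    T_CHAR, T_SHORT, T_INT, T_LONG, T_FLOAT, T_DOUBLE, T_POINTER
} BasicType;

/* Default argument promotions: small integer types become int,
 * float becomes double, everything else is passed unchanged. */
BasicType promote_vararg(BasicType declared)
{
    switch (declared) {
    case T_CHAR:
    case T_SHORT:
        return T_INT;
    case T_FLOAT:
        return T_DOUBLE;
    default:
        return declared;
    }
}

/* Store in out[i] the promoted type of each variable argument of a call. */
void promote_call_args(const BasicType *declared, size_t count, BasicType *out)
{
    assert(declared != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = promote_vararg(declared[i]);
    }
}
```

A real tool would first have to compute the declared type of each argument expression from the symbol table; this only shows the final promotion step.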
If this is in C++, you can hack together some RTTI using typeid etc.
http://en.wikipedia.org/wiki/Run-time_type_information
In C, when a function is declared like void main();, trying to input an argument to it (as the first and the only argument) doesn't cause a compilation error. To prevent this, the function can be declared like void main(void);. By the way, I think this also applies to Objective C and not to C++. By Objective C I mean functions outside classes. Why is this? I imagine it's something like Fortran, where variables whose names start with i, j, k, l, m or n are implicitly of integer type (unless you add an implicit none).
Edit: Does Objective C allow this for greater compatibility with C, or for a reason similar to C's own reason for having this?
Note: I've kept the mistake in the question so that answers and comments wouldn't need to be changed.
Another note: As pointed out by @Steve Summit and @matt, Objective-C is a strict superset of C, which means that all C code is also valid Objective-C code and thus has to show this behavior regarding functions.
Because function prototypes were not a part of pre-standard C, functions could be declared only with empty parentheses:
extern double sin();
All existing code used that sort of notation. The standard would have failed had such code been made invalid, or made to mean “zero arguments”.
So, in standard C, a function declaration like that means “takes an undefined list of zero or more arguments”. The standard does specify that all functions with a variable argument list must have a prototype in scope, and the prototype will end with , ...). So, a function declared with an empty argument list is not a variadic function (whereas printf() is variadic).
Because the compiler is not told about the number and types of the arguments, it cannot complain when the function is called, regardless of the arguments in the call.
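For contrast with an empty parameter list, here is a minimal sketch of a genuinely variadic function whose prototype ends with ", ...)". The name sum_ints and its convention of passing the argument count first are assumptions made for this example only.

```c
#include <assert.h>
#include <stdarg.h>

/* The prototype ends with ", ..." so every caller sees that it is variadic. */
int sum_ints(int count, ...);

int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    assert(count >= 0);
    va_start(args, count);
    for (int i = 0; i < count; i++) {
        /* The compiler cannot check this: the caller must really pass ints. */
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}
```

The fixed parameter is still checked against the prototype; only the arguments matching the ellipsis are left unchecked.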
In early (pre-ANSI) C, a correct match of function arguments between a function's definition and its calls was not checked by the compiler.
I believe this was done for two reasons:
It made the compiler considerably simpler
C was always designed for separate compilation, and checking consistency across translation units (that is, across multiple source files) is a much harder problem.
So, in those early days, making sure that a function's call(s) matched its definition was the responsibility of the programmer, or of a separate program, lint.
The lax checking of function arguments also made varargs functions like printf possible.
At any rate, in the original C, when you wrote
extern int f();
, you were not saying "f is a function accepting no arguments and returning int". You were simply saying "f is a function returning int". You weren't saying anything about the arguments.
Basically, early C's type system didn't even have a way of recording the parameters expected by a function. And that was especially true when separate compilation came into play, because the linker resolved external symbols based pretty much on their names only.
C++ changed this, of course, by introducing function prototypes. In C++, when you say extern int f();, you are declaring a function that explicitly takes 0 arguments. (Also a scheme of "name mangling" was devised, which among other things let the linker do some consistency checking at link time.)
Now, this was all somewhat of a deficiency in old C, and the biggest change that ANSI C introduced was to adopt C++'s function prototype notation into C. It was slightly different, though: to maintain compatibility, in C saying extern int f(); had to be interpreted as meaning "function returning int and taking unspecified arguments". If you wanted to explicitly say that a function took no arguments, you had to (and still have to) say extern int f(void);.
There was also a new ... notation to explicitly mark a function as taking variable arguments, like printf, and the process of getting rid of "implicit int" in declarations was begun.
All in all it was a significant improvement, although there are still a few holes. In particular, there's still some responsibility placed on the programmer, namely to ensure that accurate function prototypes are always in scope, so that the compiler can check them. See also this question.
Two additional notes: You asked about Objective C, but I don't know anything about that language, so I can't address that point. And you said that for a function without a prototype, "trying to input an argument to it (as the first and the only argument) doesn't cause a compilation error", but in fact, you can pass any number of arguments to such a function, without error.
I have code that gives me an error. Implicit declaration of isNumericFloat.
I want to know if the function:
isNumericFloat()
is a built-in function in C?
No, it's not a "built-in" C function. [1]
This function is used somewhere in your code, and it's not part of the standard library. In fact, because it uses camel case, which is not very common in C code, it looks like a function written by a programmer who doesn't usually write C. That's a subjective reason, of course, but C programmers would more commonly choose is_numeric_float().
You need to search your code to see if you can find its definition, but in the meantime you can provide a prototype, like
int isNumericFloat(float value); // I don't really know what arguments it takes
// but you can surely infer them from the code
before it's ever called in the code. If you do so, one of these two things will happen:
If there is a definition for the function somewhere, it will compile fine.
If there is no definition, the linker will tell you that there is/are undefined reference/s to it in the code.
[1] Strictly speaking, there are no built-in functions in C. There is the standard library (declared in headers such as stdlib.h and stdio.h), and I mean that it's not part of that library.
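To make the prototype-then-definition pattern concrete, here is a small sketch. The real signature of isNumericFloat is unknown, so the const char * parameter, the strtof-based body, and the caller count_numeric_fields are all assumptions made up for this example, not the code from the question.

```c
#include <assert.h>
#include <stdlib.h>

/* Prototype: tells the compiler a definition exists somewhere.
 * The parameter type is an assumption made for this sketch. */
int isNumericFloat(const char *text);

/* A caller can use the function before its definition appears. */
int count_numeric_fields(const char *const *fields, int count)
{
    int numeric = 0;

    assert(fields != NULL || count == 0);
    for (int i = 0; i < count; i++) {
        numeric += isNumericFloat(fields[i]);
    }
    return numeric;
}

/* Stand-in definition; without one, the linker reports an undefined reference. */
int isNumericFloat(const char *text)
{
    char *end = NULL;

    if (text == NULL || *text == '\0') {
        return 0;
    }
    (void)strtof(text, &end);
    /* Numeric only if something was parsed and nothing is left over. */
    return end != text && *end == '\0';
}
```

With the prototype in place the implicit-declaration diagnostic goes away, and any remaining problem shows up at link time instead.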
I am currently in a situation where I need to call a lot of function pointers that have been extracted at runtime. The problem is that the arguments are unknown at compile time.
But at runtime I receive data that lets me know the arguments of the function, and I can even store the arguments in a char* array. The problem is that I don't have a function pointer type to cast it to.
In high-level languages, I know there are functions like "InvokeMethode(String name,Byte[] args)" that interpret the byte array as arguments. Since reflection does not exist in C, I have no hope of seeing this with a function pointer.
One solution that I have in mind (and it's really bad) is to create function pointer types at compile time, with helper functions that cast the pointer to the right type in a hardcoded way, like this:
void callFunc64Bits(void* funcPtr,long long args);
void callFuncVoid(void* funcPtr);
The problem is that I will have to create something like 100 functions like this to cast the pointer correctly.
Is there a way to do it more efficiently?
Thank you very much!
This is a hard problem without, unfortunately, good or easy answers.
See this former SO question: Run-time parameters in gcc (inverse va_args/varargs)
See this C FAQ question: http://c-faq.com/varargs/invvarargs.html
See this collection of "wacky ideas" by the C FAQ list maintainer: http://c-faq.com/varargs/wacky.html
Addendum: see this former SO question: How to call functions by their pointers passing multiple arguments in C?
...which mentions "libffi": http://sourceware.org/libffi/
I'm trying to make some improvements to an interpreter for microcontrollers that I'm working on. For executing built-in functions I currently have something like this (albeit a bit faster):
function executeBuiltin(functionName, functionArgs) {
  if (functionName=="foo") foo(getIntFromArg(functionArgs[0]));
  if (functionName=="bar") bar(getIntFromArg(functionArgs[0]),getBoolFromArg(functionArgs[1]),getFloatFromArg(functionArgs[2]));
  if (functionName=="baz") baz();
  ...
}
But it is for an embedded device (ARM) with very limited resources, and I need to cut down on the code size drastically. What I'd like to do is to have a general-purpose function for calling other functions with different arguments - something like this:
function executeBuiltin(functionName, functionArgs) {
  functionData = fast_lookup(functionName);
  call_with_args(functionData.functionPointer, functionData.functionArgumentTypes, functionArgs);
}
So I want to be able to call a standard C function and pass it whatever arguments it needs (which could all be of different types). For this, I need a call_with_args function.
I want to avoid re-writing every function to take argc+argv. Ideally each function that was called would be an entirely standard C function.
There's a discussion about this here - but has anything changed since 1993 when that post was written? Especially as I'm running on ARM where arguments are in registers rather than on the stack. Even if it's not in standard C, is there anything GCC specific that can be done?
UPDATE: It seems that despite behaviour being 'undefined' according to the spec, it looks like because of the way C calls work, you can pass more arguments to a function than it is expecting and everything will be fine, so you can unpack all the arguments into an array of uint32s, and can then just pass each uint32 to the function.
That makes writing 'nice' code for calls much easier, and it appears to work pretty well (on 32 bit platforms). The only problem seems to be when passing 64 bit numbers and compiling for 64bit x86 as it seems to do something particularly strange in that case.
Would it be possible to do at compile time with macros?
Something along the lines of:
https://www.redhat.com/archives/libvir-list/2014-March/msg00730.html
If this has to happen at runtime, perhaps GCC's __builtin_apply_args() could be leveraged.
From this document, section 5.5, Parameter Passing, it seems that parameters are passed both in registers and on the stack, as on most of today's platforms.
By "non-standard C" I mean packing the parameters and calling the function, following the documentation, with some asm(). However, you still need minimal information about the signature of the function being called (I mean, how many bits for each argument to be passed).
From this point of view, I would prefer to prepare an array of function names, an array of function pointers, and an array of enumerated function signatures (expressed as the number of bits of each argument; you don't need to differentiate void* from char*, for example), with a switch/case on the last one. So I have reported two answers here.
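The table-plus-switch idea can be sketched roughly as follows. Everything here is hypothetical: the Signature enum, the Builtin table, the three example functions, and call_builtin are invented for illustration, and the arguments are assumed to be already unpacked into 32-bit integers. The point is that there is one switch case per distinct signature rather than one per function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One enumerator per distinct argument layout, not per function. */
typedef enum { SIG_NONE, SIG_I32, SIG_I32_I32 } Signature;

/* Storage type only; the pointer is cast back to its real type before the call. */
typedef int32_t (*GenericFn)(void);

typedef struct {
    const char *name;
    GenericFn   fn;
    Signature   sig;
} Builtin;

/* Hypothetical built-ins, written as ordinary C functions. */
static int32_t answer(void)              { return 42; }
static int32_t negate(int32_t x)         { return -x; }
static int32_t add(int32_t a, int32_t b) { return a + b; }

static const Builtin builtins[] = {
    { "answer", (GenericFn)answer, SIG_NONE    },
    { "negate", (GenericFn)negate, SIG_I32     },
    { "add",    (GenericFn)add,    SIG_I32_I32 },
};

/* Look the name up, then switch on the signature to cast and call.
 * Returns 0 and stores the result in *result, or -1 if the name is unknown. */
int call_builtin(const char *name, const int32_t *args, int32_t *result)
{
    assert(name != NULL && result != NULL);
    for (size_t i = 0; i < sizeof builtins / sizeof builtins[0]; i++) {
        const Builtin *b = &builtins[i];
        if (strcmp(b->name, name) != 0) {
            continue;
        }
        switch (b->sig) {
        case SIG_NONE:
            *result = b->fn();
            return 0;
        case SIG_I32:
            *result = ((int32_t (*)(int32_t))b->fn)(args[0]);
            return 0;
        case SIG_I32_I32:
            *result = ((int32_t (*)(int32_t, int32_t))b->fn)(args[0], args[1]);
            return 0;
        }
    }
    return -1;
}
```

Adding a new built-in with an existing signature only costs one table row, which is where the code-size saving would come from.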
You can do a very simple serialization to pass arbitrary arguments. Create an array and memcpy sizeof(arg) bytes into it for each passed argument.
Or you can create structs for function arguments.
Every function takes a char* or a void*. Then you either pass a pointer to a struct holding that function's parameters, or you define a set of macros or functions to encode and decode arbitrary data in an array and pass the pointer to that array.
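Both variants might look roughly like this. ScaleArgs, scale_packed, pack_bytes, and scale_from_bytes are hypothetical names for this sketch; the byte-array variant assumes that the caller and the callee agree on the order and types of the packed values.

```c
#include <assert.h>
#include <string.h>

/* Variant 1: one struct per function, passed through a void pointer. */
typedef struct {
    int   count;
    float scale;
} ScaleArgs;

float scale_packed(const void *raw)
{
    const ScaleArgs *args = raw;

    assert(args != NULL);
    return (float)args->count * args->scale;
}

/* Variant 2: copy the bytes of each argument into a plain byte array.
 * Returns the offset at which the next argument should be written. */
size_t pack_bytes(unsigned char *buffer, size_t offset, const void *arg, size_t size)
{
    assert(buffer != NULL && arg != NULL);
    memcpy(buffer + offset, arg, size);
    return offset + size;
}

float scale_from_bytes(const unsigned char *buffer)
{
    int   count;
    float scale;

    assert(buffer != NULL);
    /* Decode in the same order in which the caller packed the values. */
    memcpy(&count, buffer, sizeof count);
    memcpy(&scale, buffer + sizeof count, sizeof scale);
    return (float)count * scale;
}
```

Using memcpy for decoding avoids alignment problems that a direct pointer cast into the byte array could cause on some targets.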
I'm following a guide to learn curses, and all of the C code within prototypes functions before main(), then defines them afterward. In my C++ learnings, I had heard about function prototyping but never done it, and as far as I know it doesn't make too much of a difference on how the code is compiled. Is it a programmer's personal choice more than anything else? If so, why was it included in C at all?
Function prototyping originally wasn't included in C. When you called a function, the compiler just took your word for it that it would exist and took the type of arguments you provided. If you got the argument order, number, or type wrong, too bad – your code would fail, possibly in mysterious ways, at runtime.
Later versions of C added function prototyping to address these problems. With a prototype in scope, your arguments are implicitly converted to the declared types where possible, or flagged as incompatible with the prototype, and the compiler can report an error for the wrong number or types of arguments. This had the side effect of enabling varargs functions and the special argument handling they require.
Note that, in C (and unlike in C++), a function declared foo_t func() is not the same as a function declared as foo_t func(void). The latter is prototyped to have no arguments. The former declares a function without a prototype.
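A minimal sketch of the conversion that a prototype makes possible, with made-up function names (mean_of, mean_of_seven_over_two, ticks). Because the prototype is visible at the call site, the integer argument is converted to double before the call, so the division is done in floating point.

```c
#include <assert.h>

double mean_of(double total, int count);   /* prototype: arguments are checked and converted */
int    ticks(void);                        /* prototyped to take no arguments at all */

double mean_of_seven_over_two(void)
{
    /* The int constant 7 is converted to 7.0 because the prototype is in scope. */
    return mean_of(7, 2);
}

double mean_of(double total, int count)
{
    assert(count > 0);
    return total / count;
}

int ticks(void)
{
    return 0;
}
```

Without a prototype in scope, the 7 would be passed as a plain int and the callee would misread it, which is the kind of runtime failure described above.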
In C, prototyping is needed so that the compiler knows you have a function called x() before it has reached its definition; that way y() knows that x() exists. The short answer is that C is compiled top-down, so a function needs to be declared beforehand.
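void x(void);   /* prototypes: tell the compiler that x() and y() exist */
void y(void);

int main(void) {
    return 0;
}

void y(void) {
    x();        /* allowed: x() was declared above, although it is defined below */
}

void x(void) {
    /* ... more code ... maybe even y(); */
}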
I was under the impression that it was so customers could have access to the .h file for libraries and see what functions were available to them, without having to see the implementation (which would be in another file).
It is also useful for seeing what a function returns and what parameters it takes.
Function prototyping is a remnant from the olden days of compiler writing. It used to be considered horribly inefficient for a compiler to have to make multiple passes over a source file to compile it.
In C, in certain contexts, referring to a function in one manner is syntactically equivalent to referring to a variable: consider taking a pointer to a function versus taking a pointer to a variable. In the compiler's intermediate representation, the two are semantically distinct, but syntactically, whether an identifier is a variable, a function name, or an invalid identifier cannot be determined from the context.
Since it's not determinable from the context, without function prototypes, the compiler would need to make an extra pass over each one of your source files each time one of them compiles. This would add an extra O(n) factor for any compilation (that is, if compilation were O(m), it would now be O(m*n)), where n is the number of files in your project. In large projects, where compilation is already on the order of hours, having a two-pass compiler is highly undesirable.
Forward declaring all your functions would allow the compiler to build a table of functions as it scanned the file, and be able to determine when it encountered an identifier whether it referred to a function or a variable.
As a result of this, C (and by extension, C++) compilers can be extremely efficient in compilation.
It allows you to have a situation in which, say, an iterator class is defined in a separate .h file that includes the parent container class. Since you've included the parent header in the iterator header, the container can't have a method like "getIterator()", because the return type would have to be the iterator class. That would require including the iterator header inside the parent header, creating a cycle of inclusions (one includes the other, which includes the first again, and so on).
If you put the iterator class prototype inside the parent container, you can have such a method without including the iterator header. It only works because you're simply saying that such an object exists and will be defined.
There are ways of getting around it, like having a precompiled header, but in my opinion that is less elegant and comes with a slew of disadvantages. Of course, this is C++, not C. However, in practice you might have a situation in which you'd like to arrange code in this fashion, classes aside.
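The same arrangement can be sketched in plain C with a forward-declared struct instead of a class. The Container and Iterator types and the two functions below are hypothetical; in a real project the first part would live in the container header and the second in the iterator header.

```c
#include <assert.h>
#include <stddef.h>

/* Container side: only the promise that the iterator type exists. */
struct Iterator;

struct Container {
    int    items[4];
    size_t count;
};

/* Legal with the incomplete type because only a pointer to it is used. */
void container_begin(const struct Container *c, struct Iterator *out);

/* Iterator side: the full definition, which refers back to the container. */
struct Iterator {
    const struct Container *owner;
    size_t                  index;
};

void container_begin(const struct Container *c, struct Iterator *out)
{
    assert(c != NULL && out != NULL);
    out->owner = c;
    out->index = 0;
}

/* Returns 1 and stores the next item, or 0 when the iterator is exhausted. */
int iterator_next(struct Iterator *it, int *value)
{
    assert(it != NULL && value != NULL);
    if (it->index >= it->owner->count) {
        return 0;
    }
    *value = it->owner->items[it->index++];
    return 1;
}
```

Because the container side never needs the size or members of the iterator, neither header has to include the other.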