Is it possible with Zephir to include an external library? - c

I have some code in C which does some hardware access. This code is ready and well tested. Now I want to implement a web interface for controlling this hardware, so I came across PHP extension development with Zephir.
My question is: is it possible with Zephir to include an external library, or to link against it? If so, how can I do it?

Yes, it's possible and there are two approaches for working with C-code.
By wrapping C-code in CBLOCKs
You can embed c-code in tags, like so: %{ // c-code }%.
This feature is undocumented, but exists in the tests.
https://github.com/phalcon/zephir/blob/master/test/cblock.zep
https://github.com/phalcon/zephir/blob/c47ebdb71b18f7d8b182f4da4a9c77f734ee9a71/test/cblock.zep#L16
https://github.com/phalcon/zephir/blob/c47ebdb71b18f7d8b182f4da4a9c77f734ee9a71/ext/test/cblock.c
%{
// include a header
#include "headers/functions.h"
// c implementation of fibonacci
static long fibonacci(long n) {
if (n < 2) return n;
else return fibonacci(n - 2) + fibonacci(n - 1);
}
}%
It looks a bit ugly, but it works ;) A bit more elegant, but also more work, are custom optimizers:
By writing a custom optimizer
An ‘optimizer’ works like an interceptor for function calls. An
‘optimizer’ replaces the call for the function in the PHP userland by
direct C-calls which are faster and have a lower overhead improving
performance.
It's possible to write an optimizer with a clean interface, allowing Zephir to know the parameter types passed to the C function and the data type returned.
Manual: https://docs.zephir-lang.com/en/latest/optimizers.html
Example (call to fibonacci c-func): https://github.com/phalcon/zephir/pull/21#issuecomment-26178522

Related

Removing functions included from a header from scope of the next files

In my project we heavily use a C header which provides an API to communicate with external software. Long story short, in our project bugs show up more often in calls to the functions defined in those headers (it is old and ugly legacy code).
I would like to implement an indirection on the calling of those functions, so I could include some profiling before calling the actual implementation.
Because I'm not the only person working on this project, I would like to make those wrappers in such a way that if someone uses the original implementations directly it causes a compile error.
If those headers were C++ sources, I would be able to simply make a namespace, wrap the included files in it, and implement my functions using it (the other developers would be able to use the original implementation using the :: operator, but just not being able to call it directly is enough encapsulation for me). However, the headers are C sources (which I have to include with the extern "C" directive), so namespaces won't help me AFAIK.
I tried to play around with defines, but with no luck, like this:
#define my_func api_func
#define api_func NULL
What I wanted with the above code is to have my_func translated to api_func during preprocessing, while making a direct call to api_func give a compile error. That won't work, because it will actually cause my_func to be translated to NULL too.
So, basically, I would like to make a wrapper, and make sure the only way to access the API is through this wrapper (unless the other developers make some workaround, but this is inevitable).
Please note that I need to wrap hundreds of functions, which show up spread in the whole code several times.
My wrapper will necessarily have to include those C headers, but I would like them to go out of scope outside my wrapper's file, so that they are unavailable to every other file that includes my wrapper. I guess this is not possible in C/C++.
You have several options, none of them wonderful.
If you have the sources of the legacy software, so that you can recompile it, you can just change the names of the API functions to make room for the wrapper functions. If you additionally make the original functions static and put the wrappers in the same source files, then you can ensure that the originals are called only via the wrappers. Example:
static int api_func_real(int arg);
int api_func(int arg) {
// ... instrumentation ...
int result = api_func_real(arg);
// ... instrumentation ...
return result;
}
static int api_func_real(int arg) {
// ...
}
The preprocessor can help you with that, but I hesitate to recommend specifics without any details to work with.
If you do not have sources for the legacy software, or if you are otherwise unwilling to modify it, then you need to make all the callers call your wrappers instead of the original functions. In this case you can modify the headers, or include an additional header beforehand that uses #define to change each of the original function names. That header must not be included in the source files containing the API function implementations, nor in those providing the wrapper function implementations. Each define would be of the form:
#define api_func api_func_wrapper
You would then implement the various api_func_wrapper() functions.
Among the ways those cases differ is that if you change the legacy function names, then internal calls among those functions will go through the wrappers bearing the original names (unless you change the calls, too), but if you implement wrappers with new names then they will be used only when called explicitly, which will not happen for internal calls within the legacy code (unless, again, you modify those calls).
You can do something like
[your wrapper's include file]
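int origFunc1 (int x);
int origFunc2 (int x, int y);
/* The wrappers are declared for everyone, so redirected calls see a prototype */
int wrappedFunc1(int x);
int wrappedFunc2(int x, int y);
#ifndef WRAPPER_IMPL
/* Clients only: redirect the original names to the wrappers */
#define origFunc1 wrappedFunc1
#define origFunc2 wrappedFunc2
#endif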
[your wrapper implementation]
#define WRAPPER_IMPL
#include <stdio.h>
#include "wrapper.h"
int wrappedFunc1 (int x) {
printf("Wrapper1 called\n");
return origFunc1(x);
}
Your wrapper's C file obviously needs to #define WRAPPER_IMPL before including the header.
That is neither nice nor clean (and if someone wants to cheat, they could simply define WRAPPER_IMPL), but it is at least a workable approach.
There are two ways to wrap or override C functions in Linux:
Using LD_PRELOAD:
There is a shell environment variable in Linux called LD_PRELOAD,
which can be set to a path of a shared library,
and that library will be loaded before any other library (including glibc).
Using 'ld --wrap=symbol':
This tells the linker to use a wrapper function for symbol.
Any undefined reference to symbol will be resolved to __wrap_symbol, and any reference to __real_symbol will be resolved to the original symbol.
A complete writeup can be found at:
http://samanbarghi.com/blog/2014/09/05/how-to-wrap-a-system-call-libc-function-in-linux/

programmatically mocking a function

Is there any way to programmatically mock a function for an embedded C application running on Linux? In the example below I want to mock main to call someBlah instead of someFunc at run time.
#include <stdio.h>
void someFunc( void )
{
printf("%s():%d\n",__func__,__LINE__);
}
void someBlah( void )
{
printf("%s():%d\n",__func__,__LINE__);
}
int main(void)
{
someFunc();
}
The program will be executing from RAM in Linux, so the text segment should be modifiable. I know GDB works on a similar concept, where breakpoint code locations are replaced by trap instructions.
Sure, just make a table of function pointers.
#define BLAH 0
#define FOO 1
void (*table_function[])(void) = {someBlah, someFoo};
If they all have the same interface and return type, you can just switch them by switching table entries.
Then you call a function by performing
table_function[BLAH]();
If you want to swap a function, just say
table_function[BLAH] = otherBlah;
Also: don't do this unless you are writing some kind of JIT-compiling environment or a VM. Usually you don't need such constructs, and if you do need them you are probably having a bad architecture day.
Although if you're experienced in OO design you can design polymorphic constructs in C that way (ignore this if that doesn't make sense).
You could always make some part of the text segment modifiable by an appropriate call to mprotect and overwrite some code with your own (e.g. by generating machine code with libjit, GNU lightning, ... or manually).
But using function pointers is a cleaner way of doing that.
If the functions are inside a shared library, you could even overwrite its Procedure Linkage Table (see also the ABI spec, which depends upon the architecture - here is one for ARM)
There are a few mocking frameworks for C.
At work, we've had some success with cgreen but we did have to make changes to its internals. Luckily, it's quite small, and so relatively easy to extend. An alternative that looks good, but I haven't worked with, is a combination of Unity and CMock.
On the general topic of unit testing embedded C code, I highly recommend Test Driven Development for Embedded C.
Another way I have done this is:
#include <stdio.h>
#define DEBUG
void someFunc( void )
{
#ifndef DEBUG
printf("%s():%d\n",__func__,__LINE__);
#else
printf("%s():%d\n",__func__,__LINE__);
#endif
}
int main(void)
{
someFunc();
}
Take a look at CMocka. There is an article about mocking on LWN: Unit testing with mock objects in C

How to do dependency injection in C?

I'm looking for a good technical solution to doing DI in C.
I have seen some of the DI questions here already, but I haven't seen one with any actual examples or concrete implementation suggestions.
So, let's say we have the following situation:
We have a set of modules in c; we want to refactor those modules so that we can use DI to run unit tests and so on.
Each module effectively consists of a set of c functions:
module_function(...);
Modules depend on each other, i.e. typically you may have a call such as:
int module1_doit(int x) {
int y = module2_dosomethingelse(x);
y += 2;
return(y);
}
What is the correct approach to DI for this?
Possible solutions seem to be:
(1) Using function pointers for all module functions, and when invoking a function do this (or similar):
int y = modules->module2->dosomethingelse(x);
(2) Compile multiple libraries (mock, std, etc.) with the same symbols and dynamically link in the correct implementation.
(2) seems to be the correct way of doing it, but it is difficult to configure and annoyingly forces you to build multiple binaries for each unit test.
(1) seems like it might work, but at some point your DI controller is going to get stuck in a situation where you need to dynamically invoke a generic factory function (void (*factory)(...), say) with a number of other modules that need to be injected at runtime?
Is there another, better way of doing this in c?
What's the 'right' way of doing it?
I don't see any problem with using DI in C. See:
http://devmethodologies.blogspot.com/2012/07/dependency-injection.html
I've concluded that there is no 'right' way of doing this in C. It's always going to be more difficult and tedious than in other languages. I think it's important, however, not to obfuscate your code for the sake of unit tests, though. Making everything a function pointer in C may sound good, but I think it just makes the code horrific to debug in the end.
My latest approach has been to keep things simple. I don't change any code in C modules other than a small #ifdef UNIT_TESTING at the top of a file for externing and memory allocation tracking. I then take the module and compile it with all dependencies removed so that it fails link. Once I've reviewed the unresolved symbols to make sure they are what I want, I run a script that parses these dependencies and generates stub prototypes for all the symbols. These all get dumped in the unit test file. YMMV depending on how complex your external dependencies are.
If I need to mock a dependency in one instance, use the real one in another, or stub it in yet another, then I end up with three unit test modules for the one module under test. Having multiple binaries may not be ideal, but it's the only real option with C. They all get run at the same time, though, so it's not really a problem for me.
This is a perfect use-case for Ceedling.
Ceedling is a sort of umbrella project that brings together (among other things) Unity and CMock, which together can automate a lot of the work you're describing.
In general, Ceedling/Unity/CMock are a set of Ruby scripts that scan through your code and auto-generate mocks based on your module header files, as well as test runners that find and run all the tests.
A separate test runner binary is generated for each test suite, linking in the appropriate mock and real implementations as you request in your test suite implementation.
I was initially hesitant to bring in ruby as a dependency to our build system for testing, and it seemed like a lot of complexity and magic, but after trying it out and writing some tests using the auto-generated mocking code I was hooked.
A little late to the party on this but this has been a recent topic where I work.
The two main ways that I've seen it done are using function pointers, or moving all dependencies to a specific C file.
A good example of the latter is FATFS.
http://elm-chan.org/fsw/ff/en/appnote.html
The author of fatfs provides the bulk of the library functions and relegates certain specific dependencies for the user of the library to write (e.g. serial peripheral interface functions).
Function pointers are another useful tool, and using typedefs helps to keep the code from getting too ugly.
Here's some simplified snippets from my Analog to Digital Converter (ADC) code:
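typedef void (*adc_callback_t)(void);
// Currently registered callback; NULL means none has been set
static adc_callback_t ADC_callBack = NULL;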
bool ADC_CallBackSet(adc_callback_t callBack)
{
bool err = false;
if (NULL == ADC_callBack)
{
ADC_callBack = callBack;
}
else
{
err = true;
}
return err;
}
// When the ADC data is ready, this interrupt gets called
bool ADC_ISR(void)
{
// Clear the ADC interrupt flag
ADIF = 0;
// Call the callback function if set
if (NULL != ADC_callBack)
{
ADC_callBack();
}
return true; // handled
}
// Elsewhere
void FOO_Initialize(void)
{
ADC_CallBackSet(FOO_AdcCallback);
// Initialize other FOO stuff
}
void FOO_AdcCallback(void)
{
ADC_RESULT_T rawSample = ADC_ResultGet();
FOO_globalVar += rawSample;
}
Foo's interrupt behavior is now injected into the ADC's interrupt service routine.
You can take it a step further and pass a function pointer into FOO_Initialize so all dependency issues are managed by the application.
//dependency_injection.h
typedef void (*DI_Callback)(void);
typedef bool (*DI_CallbackSetter)(DI_Callback);
// foo.c
bool FOO_Initialize(DI_CallbackSetter CallbackSet)
{
bool err = CallbackSet(FOO_AdcCallback);
// Initialize other FOO stuff
return err;
}
There are two approaches that you can use. Whether you really want to or not, as Rafe points out, is up to you.
First: Create the "dynamically" injected method in a static library. Link against the library and simply substitute it during tests. Voila, the method is replaced.
Second: Simply provide compile-time replacements based on preprocessing:
#ifndef YOUR_FLAG
/* normal method versions */
#else
/* changed method versions */
#endif
/* methods that have no substitute */

How to enforce interface contracts (in C) at compile time?

Background:
We're modeling the firmware for a new embedded system. Currently the firmware is being modeled in UML, but the code generation capabilities of the UML modeling tool will not be used.
Target language will be C (C99, to be specific).
Low power (i.e. performance, quick execution) and correctness are important, but correctness is the top priority, above everything else, including code size and execution speed.
In modeling the system, we've identified a set of well-defined components. Each component has its own interface, and many of the components interact with many of the components.
Most components in the model will be individual tasks (threads) under a real-time operating system (RTOS), although some components are nothing more than libraries. Tasks communicate with one another entirely via message passing / queue posting. Interaction with libraries will be in the form of synchronous function calls.
Because advice/recommendations might depend on scale, I'll provide some information. There are maybe around 12-15 components right now, might grow to ~20? Not 100s of components. Let's say on average, each component interacts with 25% of the other components.
In the component diagram, there are ports/connectors used to represent interfaces between components, i.e. one component provides what the other component requires. So far so good.
Here's the rub: there are many cases where we don't want "Component A" to have access to all of "Component B's" interface, i.e. we want to restrict Component A to a subset of the interface that Component B provides.
Question / problem:
Is there a systematic, fairly straightforward way to enforce -- preferably at compile time -- the interface contracts defined on the component diagram?
Obviously, compile-time solutions are preferable to run-time solutions (earlier detection, better performance, probably smaller code).
For example, suppose a library component "B" provides functions X(), Y() and Z(), but I only want component "A" to be able to call function Z(), not X() and Y(). Similarly, even though component "A" might be capable of receiving and handling a whole slew of different messages through its message queue, we don't want any component to be able to send any message to any component.
The best I could come up with is to have different header files for each component-component interface, and to only expose (via the header file) the parts of the interface that the component is allowed to use. Obviously this could result in a lot of header files. This would also mean that message passing between components wouldn't be done directly with the OS API, but rather through function calls, each of which builds & sends a specific (allowed) message. For synchronous calls/libraries, only the allowed subset of the API would be exposed.
For this exercise, you can assume people will be well-behaved. In other words, don't worry about people cheating & cutting & pasting function prototypes directly, or including header files that they're not allowed to. They won't directly post a message from "A" to "B" if it's not permitted, and so on...
Maybe there is a way to enforce contracts with compile-time assertions. Maybe there is a more elegant way to check/enforce this at run-time, even if it incurs some overhead.
Code will have to compile & lint cleanly, so the "function prototype firewall" approach is OK, but it just seems there might be a more idiomatic way to do this.
The idea with the headers is sound, but, depending on the interlacing between your components, it might be cleaner to divide the interface of each component into a number of sub-categories with their own header files instead of providing a header file for each component-component-connection.
The sub-categories need not necessarily be disjoint, but make sure (via preprocessor directives) that you can mix categories without getting re-definitions; this can be achieved in a systematic fashion by creating a header-file for each type or function declaration with its own inclusion guard, and then building the sub-category headers from these atomic blocks.
#ifdef FOO_H_
/* I considered allowing you to include this multiple times (probably indirectly)
and have a new set of `#define`s switched on each time, but the interaction
between that and the FOO_H_ got confusing. I don't doubt that there is a good
way to accomplish that, but I decided not to worry with it right now. */
#warning foo.h included more than one time
#else /* FOO_H_ */
#define FOO_H_
#include <message.h>
#ifdef FOO_COMPONENT_A
int foo_func1(int x);
static inline int foo_func2(message_t * msg) {
return msg_send(foo, msg);
}
...
#else /* FOO_COMPONENT_A */
/* Doing this will hopefully cause your compiler to spit out a message with
an error that will provide a hint as to why using this function name is
wrong. You might want to play around with your compiler (and maybe a few
others) to see if there is a better illegal code for the body of the
macros. */
#define foo_func1(x) ("foo_func1"=NULL)
#define foo_func2(x) ("foo_func2"=NULL)
...
#endif /* FOO_COMPONENT_A */
#ifdef FOO_COMPONENT_B
int foo_func3(int x);
#else /* FOO_COMPONENT_B */
#define foo_func3(x) ("foo_func3"=NULL)
#endif /* FOO_COMPONENT_B */
#endif /* FOO_H_ */
You should consider creating a mini-language and a simple tool to generate header files along the lines of what nategoose proposed in his answer.
To generate the header in that answer, something like this (call it foo.comp):
[COMPONENT_A]
int foo_func1(int x);
static inline int foo_func2(message_t * msg) {
return msg_send(foo, msg);
}
[COMPONENT_B]
int foo_func3(int x);
(and extending the example to give an interface usable by multiple components):
[COMPONENT_B, COMPONENT_C]
int foo_func4(void);
This would be straightforward to parse and generate the header file. If your interfaces (I especially suspect the message passing might be) are even more boilerplate than I've assumed above, you can simplify the language somewhat.
The advantages here are:
A bit of syntactic sugar to make the maintenance easier.
You can change the protection scheme by changing the tool if you discover a better method later. There will be fewer places to change, which means you're more likely to be able to make the change. (For example, you might later find an alternative to the "illegal macro code" that nategoose proposes.)

C library naming conventions

Introduction
Hello folks, I recently learned to program in C! (This was a huge step for me, since C++ was the first language I had contact with, and it scared me off for nearly 10 years.) Coming from a mostly OO background (Java + C#), this was a very nice paradigm shift.
I love C. It's such a beautiful language. What surprised me the most is the high degree of modularity and code reusability C supports. Of course it's not as high as in an OO language, but it is still far beyond my expectations for an imperative language.
Question
How do I prevent naming conflicts between the client code and my C library code? In Java there are packages, in C# there are namespaces. Imagine I write a C library, which offers the operation "add". It is very likely, that the client already uses an operation called like that - what do I do?
I'm especially looking for a client friendly solution. For example, I wouldn't like to prefix all my api operations like "myuniquelibname_add" at all. What are the common solutions to this in the C world? Do you put all api operations in a struct, so the client can choose its own prefix?
I'm very much looking forward to the insights I get through your answers!
EDIT (modified question)
Dear Answerers, thank you for your answers! I now see that prefixes are the only way to safely avoid naming conflicts. So I would like to modify my question: what possibilities do I have to let the client choose their own prefix?
The answer Unwind posted, is one way. It doesn't use prefixes in the normal sense, but one has to prefix every api call by "api->". What further solutions are there (like using a #define for example)?
EDIT 2 (status update)
It all boils down to one of two approaches:
Using a struct
Using #define (note: there are many ways to use #define to achieve what I want)
I will not accept any answer, because I think that there is no correct answer. The solution one chooses depends on the particular case and one's own preferences. I will try out all the approaches you mentioned to find out which suits me best in which situation. Feel free to post arguments for or against certain approaches in the comments of the corresponding answers.
Finally, I would like to especially thank:
Unwind - for his sophisticated answer including a full implementation of the "struct-method"
Christoph - for his good answer and pointing me to Namespaces in C
All others - for Your great input
If someone finds it appropriate to close this question (as no further insights to expect), he/she should feel free to do so - I can not decide this, as I'm no C guru.
I'm no C guru, but from the libraries I have used, it is quite common to use a prefix to separate functions.
For example, SDL will use SDL, OpenGL will use gl, etc...
The struct way that Ken mentions would look something like this:
struct MyCoolApi
{
int (*add)(int x, int y);
};
struct MyCoolApi * my_cool_api_initialize(void);
Then clients would do:
#include <stdio.h>
#include <stdlib.h>
#include "mycoolapi.h"
int main(void)
{
struct MyCoolApi *api;
if((api = my_cool_api_initialize()) != NULL)
{
int sum = api->add(3, 39);
printf("The cool API considers 3 + 39 to be %d\n", sum);
}
return EXIT_SUCCESS;
}
This still has "namespace-issues"; the struct name (called the "struct tag") needs to be unique, and you can't declare nested structs that are useful by themselves. It works well for collecting functions though, and is a technique you see quite often in C.
UPDATE: Here's how the implementation side could look, this was requested in a comment:
#include "mycoolapi.h"
/* Note: This does **not** pollute the global namespace,
* since the function is static.
*/
static int add(int x, int y)
{
return x + y;
}
struct MyCoolApi * my_cool_api_initialize(void)
{
/* Since we don't need to do anything at initialize,
* just keep a const struct ready and return it.
*/
static const struct MyCoolApi the_api = {
add
};
return &the_api;
}
It's a shame you got scared off by C++, as it has namespaces to deal with precisely this problem. In C, you are pretty much limited to using prefixes - you certainly can't "put api operations in a struct".
Edit: In response to your second question regarding allowing users to specify their own prefix, I would avoid it like the plague. 99.9% of users will be happy with whatever prefix you provide (assuming it isn't too silly) and will be very UNHAPPY at the hoops (macros, structs, whatever) they will have to jump through to satisfy the remaining 0.1%.
As a library user, you can easily define your own shortened namespaces via the preprocessor; the result will look a bit strange, but it works:
#define ns(NAME) my_cool_namespace_ ## NAME
makes it possible to write
ns(foo)(42)
instead of
my_cool_namespace_foo(42)
As a library author, you can provide shortened names as described here.
If you follow unwind's advice and create an API structure, you should make the function pointers compile-time constants to make inlining possible, i.e. in your .h file, use the following code:
// canonical name
extern int my_cool_api_add(int x, int y);
// API structure
struct my_cool_api
{
int (*add)(int x, int y);
};
typedef const struct my_cool_api *MyCoolApi;
// define in header to make inlining possible
static MyCoolApi my_cool_api_initialize(void)
{
static const struct my_cool_api the_api = { my_cool_api_add };
return &the_api;
}
Unfortunately, there's no sure way to avoid name clashes in C. Since it lacks namespaces, you're left with prefixing the names of global functions and variables. Most libraries pick some short and "unique" prefix (unique is in quotes for obvious reasons), and hope that no clashes occur.
One thing to note is that most of the code of a library can be statically declared - meaning that it won't clash with similarly named functions in other files. But exported functions indeed have to be carefully prefixed.
If you expose functions under such common names, a client cannot include your library's header files together with other header files that declare the same names. In that case you can add the following to the header file, before the function prototype; it does not affect client usage.
#define add myuniquelibname_add
Please note this is a quick fix solution and should be the last option.
For a really huge example of the struct method, take a look at the Linux kernel; 30-odd million lines of C in that style.
Prefixes are the only choice at the C level.
On some platforms (those that support separate namespaces for linkers, like Windows, OS X and some commercial Unices, but not Linux and FreeBSD) you can work around conflicts by putting the code in a library and exporting only the symbols you really need (and, for example, aliasing in the import library in case there are conflicts in exported symbols).

Resources