As they say, you learn coding techniques from others' code. I've been trying to understand a couple of free stacks, and they all have one thing in common: a structure of function pointers. I have the following questions about this architecture.
Is there any specific reason behind such an architecture?
Does function call via function pointer help in any optimization?
Example:
void do_Command1(void)
{
// Do something
}
void do_Command2(void)
{
// Do something
}
Option 1: Direct execution of above functions
void do_Func(void)
{
do_Command1();
do_Command2();
}
Option 2: Indirect execution of above functions via function pointers
// Create structure for function pointers
typedef struct
{
void (*pDo_Command1)(void);
void (*pDo_Command2)(void);
}EXECUTE_FUNC_STRUCT;
// Update structure instance with functions address
EXECUTE_FUNC_STRUCT ExecFunc = {
do_Command1,
do_Command2,
};
void do_Func(void)
{
EXECUTE_FUNC_STRUCT *pExecFunc; // Create structure pointer
pExecFunc = &ExecFunc; // Assign structure instance address to the structure pointer
pExecFunc->pDo_Command1(); // Execute command 1 function via structure pointer
pExecFunc->pDo_Command2(); // Execute command 2 function via structure pointer
}
While Option 1 is easy to understand and implement, why do we need to use Option 2?
Option 1 doesn't allow you to change the behavior without changing the code - it will always execute the same functions in the same order every time the program is executed. Which, sometimes, is the right answer.
Option 2 gives you the flexibility to execute different functions, or to execute do_Command2 before do_Command1, based on decisions made at runtime (say, after reading a configuration file, or based on the result of another operation).
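A minimal sketch of that runtime flexibility, under stated assumptions: the `reverse` flag stands in for a value read from a configuration file, and `g_log`, `run_commands` and `last_run_order` are hypothetical names added only to make the execution order observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical log: each command appends a tag so the order is visible. */
static char g_log[16];

static void do_Command1(void) { strcat(g_log, "1"); }
static void do_Command2(void) { strcat(g_log, "2"); }

typedef void (*command_fn)(void);

/* Run both commands in an order chosen at run time.
   `reverse` is an assumption standing in for a config-file setting. */
void run_commands(int reverse)
{
    command_fn table[2];
    table[0] = reverse ? do_Command2 : do_Command1;
    table[1] = reverse ? do_Command1 : do_Command2;

    g_log[0] = '\0';
    for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i) {
        assert(table[i] != NULL);
        table[i]();               /* indirect call through the pointer */
    }
}

/* Returns the tags of the most recent run, in execution order. */
const char *last_run_order(void) { return g_log; }
```

The calling loop never changes; only the contents of the table do, which is the property Option 1 cannot offer.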
Real-world example from personal experience - I was working on an application that would read data files generated from Labview-driven instruments and load them into a database. There were four different instruments, and for each instrument there were two types of files, one for calibration and the other containing actual data. The file naming convention was such that I could select the parsing routine based on the file name. Now, I could have written my code such that:
void parse ( const char *fileName )
{
if ( fileTypeIs( fileName, "GRA" ) && fileExtIs( fileName, "DAT" ) )
parseGraDat( fileName );
else if ( fileTypeIs( fileName, "GRA" ) && fileExtIs ( fileName, "CAL" ) )
parseGraCal( fileName );
else if ( fileTypeIs( fileName, "SON" ) && fileExtIs ( fileName, "DAT" ) )
parseSonDat( fileName );
// etc.
}
and that would have worked just fine. However, at the time, there was a possibility that new instruments would be added later and that there may be additional file types for the instruments. So, I decided that instead of a long if-else chain, I would use a lookup table. That way, if I did have to add new parsing routines, all I had to do was write the new routine and add an entry for it to the lookup table - I didn't have to modify any of the main program logic. The table looked something like this:
struct lut {
const char *type;
const char *ext;
void (*parseFunc)( const char * );
} LUT[] = { {"GRA", "DAT", parseGraDat },
{"GRA", "CAL", parseGraCal },
{"SON", "DAT", parseSonDat },
{"SON", "CAL", parseSonCal },
// etc.
};
Then I had a function that would take the file name, search the lookup table, and return the appropriate parsing function (or NULL if the filename wasn't recognized):
void (*parse)(const char *) = findParseFunc( LUT, fileName );
if ( parse )
parse( fileName );
else
log( ERROR, "No parsing function for %s", fileName );
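The lookup function itself is not shown above. The following is one way it could look, not the original implementation. It assumes that file names start with the instrument type and end in ".EXT" (for example "GRA_0042.DAT"), that the table ends with a sentinel entry whose type is NULL, and that the parsers are empty stubs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*parse_fn)(const char *);

struct lut {
    const char *type;      /* instrument prefix, e.g. "GRA" */
    const char *ext;       /* file extension, e.g. "DAT"    */
    parse_fn    parseFunc;
};

/* Stub parsers, present only so the table has something to point at. */
void parseGraDat(const char *fileName) { (void)fileName; }
void parseSonCal(const char *fileName) { (void)fileName; }

/* Assumed naming convention: the name begins with the instrument type. */
static int fileTypeIs(const char *fileName, const char *type)
{
    return strncmp(fileName, type, strlen(type)) == 0;
}

/* Assumed naming convention: the extension follows the last dot. */
static int fileExtIs(const char *fileName, const char *ext)
{
    const char *dot = strrchr(fileName, '.');
    return dot != NULL && strcmp(dot + 1, ext) == 0;
}

/* Sentinel-terminated table (the terminator is an assumption of this sketch). */
const struct lut LUT[] = {
    { "GRA", "DAT", parseGraDat },
    { "SON", "CAL", parseSonCal },
    { NULL,  NULL,  NULL        }
};

/* Linear search; returns NULL when no entry matches the file name. */
parse_fn findParseFunc(const struct lut *table, const char *fileName)
{
    assert(table != NULL && fileName != NULL);
    for (; table->type != NULL; ++table) {
        if (fileTypeIs(fileName, table->type) && fileExtIs(fileName, table->ext))
            return table->parseFunc;
    }
    return NULL;
}
```

Adding a new instrument then means writing one parser and adding one table row; findParseFunc stays untouched.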
Again, there's no reason I couldn't have used the if-else chain, and in retrospect it's probably what I should have done for that particular app [1]. But it's a really powerful technique for writing code that needs to be flexible and responsive.
[1] I suffer from a tendency towards premature generalization: I write code to solve what I think will be issues five years from now instead of the issue today, and I wind up with code that is more complex than necessary.
Best explained via Example.
Example 1:
Let's say you want to implement a Shape "class" with a draw() method in C; you would need a function pointer to do that.
struct Shape {
void (*draw)(struct Shape*);
};
void draw(struct Shape* s) {
s->draw(s);
}
void draw_rect(struct Shape *s) {}
void draw_ellipse(struct Shape *s) {}
int main()
{
struct Shape rect = { .draw = draw_rect };
struct Shape ellipse = { .draw = draw_ellipse };
struct Shape *shapes[] = { &rect, &ellipse };
for (int i=0; i < 2; ++i)
draw(shapes[i]);
}
Example 2:
FILE *file = fopen(...);
FILE *mem = fmemopen(...); /* POSIX */
Without function pointers, there would be no way to implement a common interface for file and memory streams.
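A small sketch of what that common interface buys the caller. `count_lines` and `count_lines_in_buffer` are hypothetical helpers; `fmemopen` is the POSIX call mentioned above. How a given C library dispatches between file and memory streams internally is implementation-specific and not shown here.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes fmemopen() */
#include <assert.h>
#include <stdio.h>

/* Counts newline characters. Works on any FILE*, whatever backs it. */
long count_lines(FILE *stream)
{
    assert(stream != NULL);
    long lines = 0;
    int c;
    while ((c = fgetc(stream)) != EOF) {
        if (c == '\n')
            ++lines;
    }
    return lines;
}

/* Runs the same routine over an in-memory buffer.
   Returns -1 if the memory stream cannot be opened. */
long count_lines_in_buffer(char *buf, size_t len)
{
    FILE *mem = fmemopen(buf, len, "r");
    if (mem == NULL)
        return -1;
    long lines = count_lines(mem);
    fclose(mem);
    return lines;
}
```

count_lines contains no knowledge of where the bytes come from, so the same function could just as well be handed a stream returned by fopen.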
Addendum
Well, there is another way. Based on the Shape example:
enum ShapeId {
SHAPE_RECT,
SHAPE_ELLIPSE
};
struct Shape {
enum ShapeId id;
};
void draw(struct Shape *s)
{
switch (s->id) {
case SHAPE_RECT: draw_rect(s); break;
case SHAPE_ELLIPSE: draw_ellipse(s); break;
}
}
The advantage of this switch-based version is that the compiler can see every call target and may inline the functions, avoiding the overhead of an indirect call. The cost is that draw() must be edited whenever a new shape is added.
"Everything in computer science can be solved with one more level of indirection."
The struct-of-function-pointers "pattern", let's call it, permits runtime choices. SQLite uses it all over the place, for example, for portability. If you provide a "file system" meeting its required semantics, then you can run SQLite on it, with POSIX nowhere in sight.
GnuCOBOL uses the same idea for indexed files. Cobol defines ISAM semantics, whereby a program can read a record from a file by specifying a key. The underlying name-value store can be provided by several (configurable) libraries, which all provide the same functionality, but use different names for their "read a record" function. By wrapping these up as function pointers, the Cobol runtime support library can use any of those key-value systems, or even more than one at the same time (for different files, of course).
Related
I've tried multiple sources for solutions to this problem. They all require either modifying the source code, an architecture-specific exploit such as writing in a jmp instruction to detour the function, or using a macro and including the C file. The first one is extremely annoying to deal with, the second is usually not possible due to page protections, and the third introduces a lot of problems with linking multiple files containing different mocks and unit tests for the same source file. Is there any better method of doing this?
You can use function pointers in your nominal code. At init, you assign them the nominal implementations in your application. In your unit tests, you assign the function pointers to the mock implementations instead. Function pointers are a common way to implement an interface in C.
Here is a gist of how that could be done:
#include <stdio.h>
typedef struct {
void (*method)(void);
} interface;
// Calls whatever implementation the interface currently points at
void run(interface *itf)
{
itf->method();
}
void methodImpl()
{
printf("nominal code");
}
void methodMock()
{
printf("mock code");
}
void do_run()
{
interface itf;
itf.method = methodImpl;
run(&itf);
}
void test_run()
{
interface itf;
itf.method = methodMock;
run(&itf);
}
Does memmove work on file pointer data?
I am trying to remove a line from a C file. I am trying to use memmove to make this more efficient than the internet's recommendation to create a duplicate file and overwrite it. I have debugged and I can't figure out why this code isn't working. I am asking for input. The logic is a for loop. Inside the loop, I have logic to do a memmove but it doesn't seem effective.
int RemoveRow(int iRowNum)
{
char sReplaceLineStart[m_MaxSizeRow]={0};
char sTemp[m_MaxSizeRow] ={0};
size_t RemovalLength = 0;
GoToBeginningOfFile();
for(int i =0;i<m_iNumberOfRows;i++)
{
if(i == iRowNum)
{
// Line to remove
fgets(m_sRemovalRow,m_MaxSizeRow,pFile);
if(m_sRemovalRow == NULL)
{
// We're removing the last line
// just make it null
memset(m_sRemovalRow,0,sizeof(m_MaxSizeRow));
}
}
else if(i==iRowNum+1)
{
// replace removal line with this.
RemovalLength+=strlen(sTemp);
fgets(sReplaceLineStart, m_MaxSizeRow, pFile);
}
else if(i>iRowNum) {
// start line to replace with
RemovalLength+=strlen(sTemp);
fgets(sTemp, m_MaxSizeRow, pFile);
}
else
{
// were trying to get to the removal line
fgets(m_sCurrentRow, m_MaxSizeRow, pFile);
printf("(not at del row yet)iRow(%d)<iRowNum(%d) %s\n",
i,
m_iNumberOfRows,
m_sCurrentRow);
}
}
{
memmove(m_sRemovalRow,
sReplaceLineStart,
RemovalLength);
}
return 1;
}
FILE is a so-called opaque type, meaning that the application programmer is purposely locked out of its internals as per design - private encapsulation.
Generally one would create an opaque type using the concept of forward declaration, like this:
// stdio.h
typedef struct FILE FILE;
And then inside the private library:
// stdio.c - not accessible by the application programmer
struct FILE
{
// internals
};
Since FILE was forward declared and we only have access to the header, FILE is now an incomplete type, meaning we can't declare an instance of that type, access its members nor pass it to sizeof etc. We can only access it through the API which does know the internals. Since C allows us to declare a pointer to an incomplete type, the API will use FILE* like fopen does.
However, the implementation of the standard library isn't required to implement FILE like this - the option is simply there. So depending on the implementation of the standard library, we may or may not be able to create an instance of a FILE object and perhaps even access its internals. But that's all in the realm of non-standard language extensions and such code would be non-portable.
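To make the forward-declaration idea concrete, here is a sketch with a hypothetical `Counter` type. Both halves are shown in one listing for brevity; in a real project the first half would live in a header and the second in a separate source file, so that callers only ever see the incomplete type.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- counter.h (hypothetical public header) ---- */
typedef struct Counter Counter;      /* incomplete type: size and members unknown */
Counter *counter_create(void);
void     counter_increment(Counter *c);
int      counter_value(const Counter *c);
void     counter_destroy(Counter *c);

/* ---- counter.c (hypothetical private implementation) ---- */
struct Counter {
    int value;
};

/* Returns NULL if the allocation fails. */
Counter *counter_create(void)
{
    return calloc(1, sizeof(Counter));
}

void counter_increment(Counter *c)
{
    assert(c != NULL);
    ++c->value;
}

int counter_value(const Counter *c)
{
    assert(c != NULL);
    return c->value;
}

void counter_destroy(Counter *c)
{
    free(c);
}
```

Code that includes only the header part can hold and pass a Counter*, but cannot declare a Counter, read c->value, or apply sizeof to it, which is the same position the application programmer is in with an opaque FILE.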
What is the intention behind passing a handle to an object as a pointer-to-pointer rather than a pointer? For example, in the following code:
FT_Library library;
FT_Error error = FT_Init_FreeType( &library );
where
typedef struct FT_LibraryRec_ *FT_Library;
so &library is the address of a handle to an FT_LibraryRec_, of type struct FT_LibraryRec_ **
It's a way to emulate pass by reference in C, which otherwise only has pass by value.
The 'C' library function FT_Init_FreeType has two outputs, the error code and/or the library handle (which is a pointer).
In C++ we'd more naturally either:
return an object which encapsulated the success or failure of the call and the library handle, or
return one output - the library handle, and throw an exception on failure.
C APIs are generally not implemented this way.
It is not unusual for a C Library function to return a success code, and to be passed the addresses of in/out variables to be conditionally mutated, as per the case above.
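The shape of such an API can be sketched as follows. Every name here (`Library`, `Library_Init`, `Library_Done`, the `ERR_*` codes) is hypothetical and only mirrors the pattern; this is not FreeType's implementation.

```c
#include <assert.h>
#include <stdlib.h>

typedef int Error;
enum { ERR_OK = 0, ERR_INVALID_ARGUMENT = 1, ERR_OUT_OF_MEMORY = 2 };

typedef struct LibraryRec_ *Library;   /* the handle is a pointer to a hidden struct */

struct LibraryRec_ {
    int version;
};

/* Two outputs: the return value carries the error code, and the handle is
   written through the pointer-to-handle `out`. */
Error Library_Init(Library *out)
{
    if (out == NULL)
        return ERR_INVALID_ARGUMENT;
    *out = NULL;                        /* defined value even on failure */

    Library lib = malloc(sizeof *lib);
    if (lib == NULL)
        return ERR_OUT_OF_MEMORY;

    lib->version = 1;
    *out = lib;
    return ERR_OK;
}

int Library_Version(Library lib)
{
    assert(lib != NULL);
    return lib->version;
}

void Library_Done(Library lib)
{
    free(lib);
}
```

Had Library_Init taken the handle by value, the assignment to it would have changed only the function's local copy, and the caller's variable would still be uninitialized.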
The approach hides the implementation. It speeds up compilation of your code. It allows the library to upgrade its data structures without breaking existing code that uses them. Finally, it ensures that the address of the object never changes and that you don’t copy these objects.
Here’s how the version with a single pointer might be implemented:
struct FT_Struct
{
// Some fields/properties go here, e.g.
int field1;
char* field2;
};
FT_Error Init( FT_Struct* p )
{
p->field1 = 11;
p->field2 = static_cast<char*>( malloc( 100 ) ); // C++ needs the cast from void*
if( nullptr == p->field2 )
return E_OUTOFMEMORY;
return S_OK;
}
Or C++ equivalent, without any pointers:
class FT_Struct
{
int field1;
std::vector<char> field2;
public:
FT_Struct() :
field1( 11 )
{
field2.resize( 100 );
}
};
As a user of the library, you have to include struct/class FT_Struct definition. Libraries can be very complex so this will slow down compilation of your code.
If the library is dynamic i.e. *.dll on windows, *.so on linux or *.dylib on osx, you upgrade the library and if the new version changes memory layout of the struct/class, old applications will crash.
Because of the way C++ works, objects are passed by value, i.e. you normally expect them to be movable and copiable, which is not necessarily what library author wants to support.
Now consider the following function instead:
FT_Error Init( FT_Struct** pp )
{
try
{
*pp = new FT_Struct();
return S_OK;
}
catch( std::exception& ex )
{
return E_FAIL;
}
}
As a user of the library, you no longer need to know what’s inside FT_Struct or even what size it is. You don’t need to #include the implementation details, i.e. compilation will be faster.
This plays nicely with dynamic libraries, library author can change memory layout however they please, as long as the C API is stable, old apps will continue to work.
The API guarantees you won’t copy or move the values, you can’t copy structures of unknown lengths.
I want to use nftw to traverse a directory structure in C.
However, given what I want to do, I don't see a way around using a global variable.
The textbook examples of using (n)ftw all involve doing something like printing out a filename. I want, instead, to take the pathname and file checksum and place those in a data structure. But I don't see a good way to do that, given the limits on what can be passed to nftw.
The solution I'm using involves a global variable. The function called by nftw can then access that variable and add the required data.
Is there any reasonable way to do this without using a global variable?
Here's the exchange in previous post on stackoverflow in which someone suggested I post this as a follow-up.
Using ftw can be really, really bad. Internally it will save the function pointer that you use; if another thread then does something else, it will overwrite the function pointer.
Horror scenario:
thread 1: count billions of files
thread 2: delete some files
thread 1: ---oops, it is now deleting billions of
files instead of counting them.
In short. You are better off using fts_open.
If you still want to use nftw then my suggestion is to put the "global" type in a namespace and mark it as "thread_local". You should be able to adjust this to your needs.
/* in some cpp file */
namespace {
thread_local size_t gTotalBytes{0}; // thread local makes this thread safe
int GetSize(const char* path, const struct stat* statPtr, int currentFlag, struct FTW* internalFtwUsage) {
gTotalBytes+= statPtr->st_size;
return 0; // nftw continues
}
} // namespace
size_t RecursiveFolderDiskUsed(const std::string& startPath) {
const int flags = FTW_DEPTH | FTW_MOUNT | FTW_PHYS;
const int maxFileDescriptorsToUse = 1024; // or whatever
const int result = nftw(startPath.c_str(), GetSize, maxFileDescriptorsToUse , flags);
// log or something if result== -1
return gTotalBytes;
}
No. nftw doesn't offer any user parameter that could be passed to the function, so you have to use global (or static) variables in C.
GCC offers an extension "nested function" which should capture the variables of their enclosing scopes, so they could be used like this:
void f()
{
int i = 0;
int fn(const char *,
const struct stat *, int, struct FTW *) {
i++;
return 0;
};
nftw("path", fn, 10, 0);
}
The data is best given static linkage (i.e. file-scope) in a separate module that includes only functions required to access the data, including the function passed to nftw(). That way the data is not visible globally and all access is controlled. It may be that the function that calls nftw() is also part of this module, enabling the function passed to nftw() to also be static, and thus invisible externally.
In other words, you should do what you are probably doing already, but use separate compilation and static linkage judiciously to make the data only visible via access functions. Data with static linkage is accessible by any function within the same translation unit, and you avoid the problems associated with global variables by only including functions in that translation unit that are creators, maintainers or accessors of that data.
The general pattern is:
datamodule.h
#if !defined DATAMODULE_INCLUDE
#define DATAMODULE_INCLUDE
<type> create_data( <args>) ;
<type> get_data( <args> ) ;
#endif
datamodule.c
#include "datamodule.h"
static <type> my_data ;
static int nftwfunc(const char *filename, const struct stat *statptr, int fileflags, struct FTW *pfwt)
{
// update/add to my_data
...
}
<type> create_data( const char* path, <other args>)
{
...
ret = nftw( path, nftwfunc, fd_limit, flags);
...
}
<type> get_data( <args> )
{
// Get requested data from my_data and return it to caller
}
I have to write code in C where the user has to have flexibility in choosing any existing DB, writing to files, or implementing their own storage mechanism. I need wrapper functions that redirect to the right functions corresponding to the storage mechanism selected at runtime or compile time. Say my storage options are FLATFILE and SQLDB and my wrapper function is insert(value). So, if I select FLATFILE as my storage, when I call the wrapper function insert(value), it should in turn call the function that writes to a file. If I choose SQLDB, insert(value) should call the function that inserts the values in the database.
I know I can somehow use a structure of function pointers to do wrapper functions, but I have no idea how.
Does anyone know of any docs, links, examples, etc I could refer to, to understand and implement something like this? Any pointers will be appreciated. Thanks!
#define BACKEND_FLATFILE 0
#define BACKEND_SQLDB 1
void insert_flatfile(const t_value *v) {
...
}
void insert_sqldb(const t_value *v) {
...
}
void (*insert_functions[]) (const t_value *) = {
insert_flatfile,
insert_sqldb,
};
void insert_wrapper(t_value *v, int backend) {
insert_functions[backend](v);
}
Besides, the different functions for one backend should be stuffed into a struct and you should create an array of such structs instead of one array per wrapper function.
You can use a simple version such as:
struct backend {
int (*insert)(...);
int (*remove)(...);
...
};
static struct backend db_backend = { db_insert, db_remove, ... };
static struct backend other_backend = { other_insert, other_remove, ... };
const struct backend *get_backend(enum backend_type type)
{
switch (type)
{
case DB_BACKEND:
return &db_backend;
case DB_OTHER:
return &other_backend;
...
}
}
All of the above can be hidden inside a C file, with get_backend and the enumeration being public. Then you can use it like this:
const struct backend *b = get_backend(DB_BACKEND);
b->insert(...);
b->remove(...);
Many details are missing, of course (many people like using typedef, for example). This is a basic setup, you can also create wrapper functions if you don't like the b->insert(...) syntax or if you want to set the back end once and then use insert() and remove() in the code. This is also useful if you already have some code that calls insert() directly and you want to direct the call to the right back end.
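The "set the back end once, then call insert() and remove()" variant could be sketched like this. The backends here are hypothetical stand-ins that only count stored items, and the wrapper is named `remove_value` because a free function called `remove` would clash with the one declared in <stdio.h>.

```c
#include <assert.h>
#include <stddef.h>

struct backend {
    int (*insert)(int value);
    int (*remove)(int value);
};

enum backend_type { BACKEND_FLATFILE, BACKEND_SQLDB };

/* Stand-in backends: each returns its item count after the operation. */
static int flatfile_count;
static int sqldb_count;

static int flatfile_insert(int value) { (void)value; return ++flatfile_count; }
static int flatfile_remove(int value) { (void)value; return --flatfile_count; }
static int sqldb_insert(int value)    { (void)value; return ++sqldb_count; }
static int sqldb_remove(int value)    { (void)value; return --sqldb_count; }

/* One struct per backend, indexed by the enumeration. */
static const struct backend backends[] = {
    [BACKEND_FLATFILE] = { flatfile_insert, flatfile_remove },
    [BACKEND_SQLDB]    = { sqldb_insert,    sqldb_remove    },
};

/* Currently selected backend; flat file is the default in this sketch. */
static const struct backend *current = &backends[BACKEND_FLATFILE];

void set_backend(enum backend_type type)
{
    assert((size_t)type < sizeof backends / sizeof backends[0]);
    current = &backends[type];
}

/* Wrappers: callers never see the function pointers. */
int insert(int value)       { return current->insert(value); }
int remove_value(int value) { return current->remove(value); }
```

Existing code that already calls insert() keeps compiling unchanged; only the one-time set_backend() call decides where the data goes.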
If you want a more elaborate solution, have a look at http://www.cs.rit.edu/~ats/books/ooc.pdf. You don't have to implement every last detail from it, but it can give you a few ideas.