Modular program design - c

My question is unfortunately badly formed, as I'm not entirely certain what to call what I'm trying to do. My apologies for that.
It came up since I'm trying to code a really basic browser that I want to implement in C and I was thinking about how best to go about it. The basic idea being something like libcurl (for network interaction) -> libxml2 (parse HTML) -> UI and then some manner of getting libcurl to accept GET or POST requests from the UI (haven't gotten to this point yet).
However, this approach is severely limited: if, say, I want to check whether the response is a PDF and send it off to libpoppler before handing it to libxml2, I'll have to recode my entire program flow. Further, if I want to reuse part of my program (say, the libcurl -> pdftohtml -> libxml2 part) and send its output to another program (for example w3m instead of my UI), I again don't see how I would manage that.
I could instead simply write a Perl or Python wrapper for curl, libxml2, etc, or do something along the lines of "curl example.com | parser | UI". However doing it in Perl or Python still seems like I'll have to recode my program logic every time I want to do something new, and piping everything seems inelegant. I would also like to do this in C if possible.
So my question is: what does one call this idea? I've been driving myself crazy trying to figure out how to search for a solution to a problem that I can't name. I know it has something to do with modularity, but I don't know what specifically, and modularity is a very broad term. Secondly, and optionally, if anybody could point me in the direction of a solution I would appreciate that as well, although it's not as important as what it's called.
Thanks to all who read this. :)

First, I suggest you take a look at http://www.amazon.com/Interfaces-Implementations-Techniques-Creating-Reusable/dp/0201498413. Second, most browsers are asynchronous, so you are going to need an event library like libuv or libev. Also, most modern websites require JavaScript to function properly, but adding a JavaScript engine to your browser would greatly complicate the project. I also don't see any mention of how you plan to parse the HTTP being sent to and from your browser; I suggest https://github.com/joyent/http-parser.
As for your question on control flow, I would have a function that parses the response from the server and uses switch() to handle the various types of data being sent to your browser. The HTTP header has a Content-Type field that states the content type, so your browser can call different functions based on what the content type is.
Also take a look at function pointers, both in "Polymorphism (in C)" and in "How do function pointers in C work?". Function pointers could be a more elegant way to solve your problem than having large switch statements throughout your code. With function pointers, a single call site in your program can behave differently depending on which function the pointer currently refers to.
I will try to explain below with a browser as an example.
So let's say your browser just got back an HTTP response from some server. The HTTP response looks something like this in C.
struct http_res
{
    struct http_header *header;
    struct http_body *body;
    int (*decode_body)(char **);
};
So first your HTTP parser will parse the HTTP header and figure out whether it's a valid response, whether there's content, and so on. If there is content, the parser will check its type and, depending on whether it's HTML, JavaScript, CSS, or whatever, set the function pointer to point at the right function to decode the HTTP body.
static int decode_javascript(char **body)
{
    /* Whatever it takes to parse javascript from http. */
    return 0;
}
static int decode_html(char **body)
{
    /* Whatever it takes to parse html from http. */
    return 0;
}
static int decode_css(char **body)
{
    /* Whatever it takes to parse css from http. */
    return 0;
}
int parse_http_header(struct http_res *http)
{
    /* ... lots of other code to figure out content type. ... */
    switch(body_content_type)
    {
        case BCT_JAVASCRIPT:
            http->decode_body = &decode_javascript;
            break;
        case BCT_HTML:
            http->decode_body = &decode_html;
            break;
        case BCT_CSS:
            http->decode_body = &decode_css;
            break;
        default:
            printf("Error can't parse body type.\n");
            return -1;
    }
    return 0;
}
Now when we pass the HTTP response to another part of the browser, that function can call decode_body() through the HTTP response object and it will end up with a decoded body it can understand, without knowing what it's decoding.
int next_function(struct http_res *res)
{
    char *decoded_body;
    int rtrn;
    /* Now we can decode the http body without knowing anything about
       it. We just call decode_body() and end up with a buffer with the
       decoded data in it. */
    rtrn = res->decode_body(&decoded_body);
    if(rtrn < 0)
    {
        printf("Can't decode body.\n");
        return -1;
    }
    return 0;
}
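As a variation on the switch above, the handlers can also live in a lookup table keyed by the Content-Type string, so that supporting a new type (say, PDF) means adding one row rather than editing the control flow. This is only a sketch: handler_table, find_decoder, and the stub decoders are hypothetical names, and the decoders do no real work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as the decode_body member of struct http_res. */
typedef int (*decode_fn)(char **body);

/* Stub decoders; real ones would hand the body to libxml2, poppler, etc. */
static int decode_html(char **body) { (void)body; return 0; }
static int decode_css(char **body)  { (void)body; return 0; }
static int decode_pdf(char **body)  { (void)body; return 0; }

struct handler {
    const char *content_type; /* value of the Content-Type header */
    decode_fn decode;
};

/* Adding support for a new type is one new row here. */
static const struct handler handler_table[] = {
    { "text/html",       decode_html },
    { "text/css",        decode_css  },
    { "application/pdf", decode_pdf  },
};

/* Returns the decoder registered for content_type, or NULL if none. */
static decode_fn find_decoder(const char *content_type)
{
    size_t i;
    assert(content_type != NULL);
    for (i = 0; i < sizeof handler_table / sizeof handler_table[0]; i++) {
        if (strcmp(handler_table[i].content_type, content_type) == 0)
            return handler_table[i].decode;
    }
    return NULL;
}
```

parse_http_header() could then set http->decode_body = find_decoder(type) and report an error when the result is NULL. Note that the sketch matches the string exactly, so a header value carrying parameters such as "; charset=utf-8" would need to be trimmed first.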
To make your program really modular, at least in C, you would put the various parts of your browser (the HTTP parser, event library, JavaScript engine, HTML parser, and so on) in separate shared libraries. Then you would define interfaces between the libraries, so that you can swap any one of them for a different implementation without having to change your program; you would simply link a different library at run time. Take a look at Robert Martin ("Uncle Bob"), who talks about this extensively. This talk is good but lacks slides: https://www.youtube.com/watch?v=asLUTiJJqdE (starts at 8:20). This one is also interesting, and it has slides: https://www.youtube.com/watch?v=WpkDN78P884 .
And finally, nothing about C, Perl, or Python means you will have to recode your program logic. You have to design your program so that the modules do not know about each other. Each module knows only about the interface, and if you connect two modules that both "speak" the same interface, you have created a modular system. It's just like how the internet works: the various computers on the internet do not need to know what the other computer is, what it's doing, or its operating system. All they need to know is TCP/IP, and they can communicate with all the other devices on the internet.
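One way to picture "modules that only know the interface" is a pipeline in which every stage has the same function signature, so stages can be reordered or swapped without the others noticing. A minimal sketch follows; stage_fn, run_pipeline, and the two toy stages are hypothetical names that merely stand in for things like a fetcher, a PDF-to-HTML converter, or a parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* The interface: read in, write a NUL-terminated result to out. 0 = ok. */
typedef int (*stage_fn)(const char *in, char *out, size_t out_size);

/* Toy stage: upper-cases its input. */
static int stage_upper(const char *in, char *out, size_t out_size)
{
    size_t i, n = strlen(in);
    if (n + 1 > out_size) return -1;
    for (i = 0; i < n; i++) out[i] = (char)toupper((unsigned char)in[i]);
    out[n] = '\0';
    return 0;
}

/* Toy stage: wraps its input in square brackets. */
static int stage_bracket(const char *in, char *out, size_t out_size)
{
    size_t n = strlen(in);
    if (n + 3 > out_size) return -1;
    out[0] = '[';
    memcpy(out + 1, in, n);
    out[n + 1] = ']';
    out[n + 2] = '\0';
    return 0;
}

/* Runs the stages in order, feeding each one the previous stage's output.
   The final result is copied to out. Intermediate results must fit in
   256 bytes, including the terminator. */
static int run_pipeline(const stage_fn *stages, size_t count,
                        const char *input, char *out, size_t out_size)
{
    char a[256], b[256];
    char *cur = a, *next = b, *tmp;
    size_t i;

    assert(stages != NULL && input != NULL && out != NULL);
    if (strlen(input) + 1 > sizeof a) return -1;
    strcpy(cur, input);
    for (i = 0; i < count; i++) {
        if (stages[i](cur, next, sizeof a) != 0) return -1;
        tmp = cur; cur = next; next = tmp;
    }
    if (strlen(cur) + 1 > out_size) return -1;
    strcpy(out, cur);
    return 0;
}
```

Replacing stage_bracket with any other function of the same signature changes the behaviour without touching run_pipeline, which is the property described above.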

Related

LLVM Loop Simplify Pass

I am probably misunderstanding some basic concept of how LLVM and its passes work; anyhow, here is my question:
I am currently working on a pass where I extend the runOnModule (https://llvm.org/doxygen/classllvm_1_1ModulePass.html) function. I would like to run LoopSimplify on the IR first, but I do not understand how to do that. There is a run(Function &F, FunctionAnalysisManager &AM) function, as described on https://llvm.org/doxygen/classllvm_1_1LoopSimplifyPass.html, and as far as I understand it I can call it on every function in my module. But for that I need an instance of that class (LoopSimplify) to call it on, which I do not know where to get, and also a FunctionAnalysisManager. What are they for, and what do they need to look like? It is not as if I can just feed it some empty constructs, right?
I want to do this for the following guarantee:
"Loop pre-header insertion guarantees that there is a single, non-critical
entry edge from outside of the loop to the loop header. This simplifies a
number of analyses and transformations, such as LICM." as described in https://llvm.org/doxygen/LoopSimplify_8h_source.html.
While I support the advice to integrate your pass with the pass manager, there is also a way to force LoopSimplify to run: make your pass require it. Many of the passes that ship with LLVM do this, for example Scalar/LoopVersioningLICM.cpp.
// This header includes LoopSimplifyID as an extern
#include "llvm/Transforms/Utils.h"
...
void YourPass::getAnalysisUsage(AnalysisUsage& AU) const {
    AU.addRequiredID(LoopSimplifyID);
}
Doing so forces LoopSimplify to run before your pass, with no need to invoke it yourself. However, if you need to interface with this or another pass, you can request its analysis:
getAnalysis<LoopSimplifyPass>(F); // Where F is a function&

How to insert print for each function of C language for debugging?

I am studying and debugging a piece of software that contains thousands of functions. I plan to add a printf() at the entry and exit point of each function, but doing that by hand will take a lot of time.
Is there a tool or script to do this?
I could use '__cyg_profile_func_enter', but it only gives me the function's address, so I would have to run another script to get the function name. I would also like to get the values of each function's input parameters.
You should give AOP (Aspect-Oriented Programming) a try. Personally I've only tried it with Java and Spring AOP, but there's an API for C too: AspectC (https://sites.google.com/a/gapp.msrg.utoronto.ca/aspectc/home). From what I've seen, it's not the only one.
From what I've read about this library, you can add a pointcut before compiling with AspectC:
// before means it's a before function aspect
// call means it's processed when a function is called
// args(...) means it applies to any function with any arguments
// this->funcName is the name of the function handled by AspectC
before(): call(args(...)) {
    printf("Entering %s\n", this->funcName);
}
(not tried by myself but extracted from the reference page https://sites.google.com/a/gapp.msrg.utoronto.ca/aspectc/tutorial)
This is only a basic overview of what can be done, and you still have to deal with the compilation (documented on the page linked above), but it looks like it could help you. Maybe give it a try with a simple proof of concept.
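Where a separate weaver is not an option, a lower-tech sketch of the same entry/exit logging can be written in plain C99 with __func__ and a pair of macros. This is not AspectC and not the author's method: TRACE_ENTER, TRACE_EXIT, and trace_depth are hypothetical names, and the approach still needs two lines added per function, but it does print function names and argument values.

```c
#include <assert.h>
#include <stdio.h>

/* Current call depth, used only to indent the log. */
static int trace_depth = 0;

/* Hypothetical helpers: log entry with a printf-style description of the
   arguments, and log exit. __func__ (C99) names the enclosing function. */
#define TRACE_ENTER(...)                                              \
    do {                                                              \
        fprintf(stderr, "%*s-> %s(", trace_depth * 2, "", __func__);  \
        fprintf(stderr, __VA_ARGS__);                                 \
        fprintf(stderr, ")\n");                                       \
        trace_depth++;                                                \
    } while (0)

#define TRACE_EXIT()                                                  \
    do {                                                              \
        assert(trace_depth > 0);                                      \
        trace_depth--;                                                \
        fprintf(stderr, "%*s<- %s\n", trace_depth * 2, "", __func__); \
    } while (0)

static int add(int a, int b)
{
    int sum;
    TRACE_ENTER("a=%d, b=%d", a, b);
    sum = a + b;
    TRACE_EXIT();
    return sum;
}

static int twice_sum(int a, int b)
{
    int result;
    TRACE_ENTER("a=%d, b=%d", a, b);
    result = 2 * add(a, b);
    TRACE_EXIT();
    return result;
}
```

Calling twice_sum(2, 3) logs the entry into twice_sum, then the nested entry and exit of add indented by two spaces, then the exit from twice_sum, all on stderr. A function with several return statements needs a TRACE_EXIT() before each one, which is the main weakness compared with a weaver or compiler instrumentation.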

How to avoid collision of enumerated values?

I'm building a front-end library. The back-end produces a number of error codes that are enumerated:
enum backerr {BACK_ERR1, BACK_ERR2, BACK_ERR3};
My front-end produces a number of additional error codes:
enum fronterr {FRONT_ERR1, FRONT_ERR2, FRONT_ERR3};
For convenience I would like to have a single error-code-returning function that returns either front-end or back-end errors, depending on which one occurred.
Is there any way to do this without the values of the two sets of error codes colliding, given that we cannot know the values used by the back end?
If you don't know what the back end may generate then, no, there is no way to reliably select your own error codes so that they don't clash.
So you have a couple of options (at least).
The first is useful if the back end somehow publishes the error ranges, such as in a header file. To be honest, it should be doing this since there's no other way for a program to distinguish the different error codes and/or types.
If they are published, it's a simple matter for you to discover the highest and select your own codes to leave plenty of room for the back end to expand. For example, if the back end uses 1..100, you start yours at 1000. The chance of any system suddenly reporting ten times as many errors as the previous version is a slim one.
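As a small sketch of that first option, assuming the back end's header publishes its codes and an upper bound (BACK_ERR_MAX, FRONT_ERR_BASE, and is_front_error are hypothetical names, and the values 1..100 and 1000 are simply the ones from the example above):

```c
#include <assert.h>

/* Assumed to come from the back end's published header. */
enum backerr { BACK_ERR1 = 1, BACK_ERR2, BACK_ERR3, BACK_ERR_MAX = 100 };

/* Front-end codes start well above the back end's published range. */
enum fronterr {
    FRONT_ERR_BASE = 1000,
    FRONT_ERR1 = FRONT_ERR_BASE,
    FRONT_ERR2,
    FRONT_ERR3
};

/* C11: fails the build if a future back end grows into the front-end range. */
static_assert(BACK_ERR_MAX < FRONT_ERR_BASE, "error code ranges overlap");

/* Returns 1 if code belongs to the front end, 0 otherwise. */
static int is_front_error(int code)
{
    return code >= FRONT_ERR_BASE;
}
```

The compile-time check only helps if the back end actually publishes something like BACK_ERR_MAX; without it, the gap between the ranges is the only protection.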
A second way is if you want real separation with zero possibility of conflict.
There's nothing to stop you returning a structure similar to:
struct sFrontError {
    enum fronterr errorCode;
    enum backerr backendCode;
};
and using that for your errors. Then your enumeration for the front end becomes:
enum fronterr {FRONT_OK, FRONT_BACK, FRONT_ERR1, FRONT_ERR2, FRONT_ERR3};
and you can evaluate it as follows:
If errorCode is FRONT_OK, there is no error.
If errorCode is FRONT_BACK, the error came from the back end, and you can find its code in backendCode.
Otherwise, it's a front end error and the code in errorCode fully specifies it.
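The three rules above could be coded roughly as follows. The enum error_source type and the classify function are hypothetical helpers added for illustration; the enumerations and the structure are the ones from this answer.

```c
#include <assert.h>
#include <stddef.h>

enum backerr  { BACK_ERR1, BACK_ERR2, BACK_ERR3 };
enum fronterr { FRONT_OK, FRONT_BACK, FRONT_ERR1, FRONT_ERR2, FRONT_ERR3 };

struct sFrontError {
    enum fronterr errorCode;
    enum backerr  backendCode; /* meaningful only when errorCode == FRONT_BACK */
};

/* Hypothetical: tells the caller which side the error came from. */
enum error_source { SRC_NONE, SRC_BACK, SRC_FRONT };

/* Applies the three rules: stores the relevant code in *code and
   returns where it came from. */
static enum error_source classify(struct sFrontError err, int *code)
{
    assert(code != NULL);
    switch (err.errorCode) {
    case FRONT_OK:
        *code = 0;
        return SRC_NONE;
    case FRONT_BACK:
        *code = (int)err.backendCode;
        return SRC_BACK;
    default:
        *code = (int)err.errorCode;
        return SRC_FRONT;
    }
}
```

Because the back-end code travels in its own member, its numeric value can overlap with any front-end value without ambiguity.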
If the backend exposes an exhaustive list of its error codes you can easily create a true superset of them with your own frontend error codes being a disjoint subset.
/* in backend.h */
enum backend_error
{
    BACK_ERR_1,
    BACK_ERR_2,
    BACK_ERR_3,
};
/* in frontend.h */
#include <backend.h>
enum frontend_error
{
    FRONT_ERR_1 = BACK_ERR_1,
    FRONT_ERR_2 = BACK_ERR_2,
    FRONT_ERR_3 = BACK_ERR_3,
    FRONT_ERR_4,
    FRONT_ERR_5,
};
This method doesn't force you to make any assumptions on the values of the backend error codes but if a future version of the backend defines additional error codes, you might be hosed. Another downside is that your header file #includes the backend's header so you are polluting the namespace.
If your users never call into the backend directly, that is, you are providing abstractions for all backend functionality, you can define your own error codes altogether and have a function that maps backend error codes to your own. Since it is not required for this function to be the identity function, you can always make this work even in face of future changes to the backend. It can also be implemented in your own implementation file to keep the backend namespace out of your user's picture.
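A sketch of such a mapping function is shown below. The front-end categories (FRONT_ERR_IO, FRONT_ERR_PROTOCOL, FRONT_ERR_UNKNOWN) and the name map_backend_error are made up for illustration, and the back-end enumeration is repeated here only so the example stands alone; in practice it would come from the back end's header, included only in your implementation file.

```c
#include <assert.h>

/* Assumed contents of the back end's header; the values are not relied upon. */
enum backend_error { BACK_ERR_1, BACK_ERR_2, BACK_ERR_3 };

/* The front end's own public codes, independent of the back end's numbering. */
enum frontend_error {
    FRONT_OK,
    FRONT_ERR_IO,        /* hypothetical categories */
    FRONT_ERR_PROTOCOL,
    FRONT_ERR_UNKNOWN    /* catch-all for back-end codes added later */
};

/* C11: callers may rely on "no error" being zero, e.g. if (err) ... */
static_assert(FRONT_OK == 0, "FRONT_OK must be zero");

/* Translates a back-end code; the mapping need not be one-to-one. */
static enum frontend_error map_backend_error(enum backend_error e)
{
    switch (e) {
    case BACK_ERR_1: return FRONT_ERR_IO;
    case BACK_ERR_2: return FRONT_ERR_PROTOCOL;
    case BACK_ERR_3: return FRONT_ERR_PROTOCOL;
    default:         return FRONT_ERR_UNKNOWN;
    }
}
```

The default branch is what keeps this working when a newer back end introduces codes the front end has never seen.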

Automatically Generate C Code From Header

I want to generate empty implementations of procedures defined in a header file. Ideally they should return NULL for pointers, 0 for integers, etc, and, in an ideal world, also print to stderr which function was called.
The motivation for this is the need to implement a wrapper that adapts a subset of a complex, existing API (the header file) to another library. Only a small number of the procedures in the API need to be delegated, but it's not clear which ones. So I hope to use an iterative approach, where I run against this auto-generated wrapper, see what is called, implement that with delegation, and repeat.
I've seen "Automatically generate C++ file from header?" but the answers appear to be C++ specific.
So, for people that need the question spelled out in simple terms, how can I automate the generation of such an implementation given the header file? I would prefer an existing tool - my current best guess at a simple solution is using pycparser.
Update: thanks, guys. Both are good answers. I've also posted my current hack.
So, I'm going to mark the EA suggestion as the "answer" because I think it's probably the best idea in general, although I think the cmock suggestion would work very well in a TDD approach where the library development is driven by test failures, and I may end up trying that. But for now I need a quicker and dirtier approach that works interactively (the library in question is a dynamically loaded plugin for another, interactive, program, and I am trying to reverse engineer the sequence of API calls...).
What I ended up doing was writing a Python script that calls pycparser. I'll include it here in case it helps others, but it is not at all general (it assumes all functions return int, for example, and has a hack to avoid function declarations inside typedefs).
from pycparser import parse_file
from pycparser.c_ast import NodeVisitor

class AncestorVisitor(NodeVisitor):
    def __init__(self):
        self.current = None
        self.ancestors = []

    def visit(self, node):
        # Remember the chain of nodes above the one being visited.
        if self.current:
            self.ancestors.append(self.current)
        self.current = node
        try:
            return super(AncestorVisitor, self).visit(node)
        finally:
            if self.ancestors:
                self.ancestors.pop(-1)

class FunctionVisitor(AncestorVisitor):
    def visit_FuncDecl(self, node):
        if len(self.ancestors) < 3:  # avoid typedefs
            # Print the signature: return type, name, parameter list.
            print(node.type.type.names[0], node.type.declname, '(', end=' ')
            first = True
            for param in node.args.params:
                if first: first = False
                else: print(',', end=' ')
                print(param.type.type.names[0], param.type.declname, end=' ')
            print(')')
            # Print a body that logs the function name and returns 0.
            print('{fprintf(stderr, "%s\\n"); return 0;}' % node.type.declname)

print('#include "myheader.h"')
print('#include <stdio.h>')
ast = parse_file('myheader.h', use_cpp=True)
FunctionVisitor().visit(ast)
UML modeling tools are capable of generating a default implementation in the language of your choice. Generally they also support importing source code (including C headers). You can try to import your headers and generate source code from them. I personally have experience with Enterprise Architect, and it supports both of these operations.
Caveat: this is an unresearched answer as I haven't had any experience with it myself.
I think you might have some luck with a mocking framework designed for unit testing. An example of such a framework is: cmock
The project page suggests it will generate code from a header. You could then take the code and tweak it.

Using Windows Forms and VC++ with Unmanaged Static Libraries

I am currently trying to write a UI for a Data Acquisition System in Visual Studio C++ 2010, and I am having a lot of trouble interfacing the third-party libraries I am using with Windows Forms. The two libraries I am using are DAQX, a C library for a Data Acquisition System, and VITCam, a C++ library for a 1394 High Speed Camera. It's extremely frustrating trying to work with these libraries and any UI library that VS has to offer, as none of the function arguments ever get along.
DAQX uses Windows types like WORD and DWORD, in normal C fashion, and when I'm writing a normal program with no UI involved it works fine, but Windows Forms seems to hate it any time I want to make a simple DWORD array inside the class.
VITCam is even worse. I can open the camera fine, but I am completely lost when it comes to putting the image on the screen. I haven't found an equivalent, easy-to-follow way of putting it on the screen that matches what the documentation shows:
CDC* pDC=GetDC(); // obtain the device context for your window...
// move the image data
::SetDIBitsToDevice(pDC->m_hDC,0,0,
(int) (MyCam.GetDispBuf()->bmiHeader.biWidth),
(int) (MyCam.GetDispBuf()->bmiHeader.biHeight),
0,0,0,(WORD) (WORD) MyCam.GetDispBuf()->bmiHeader.biHeight,
MyCam.GetDispPixels(),MyCam.GetDispBuf(),
DIB_RGB_COLORS);
I can barely follow it as is. So, without too much blathering: how do most people work with static unmanaged libraries that were not developed with Windows Forms in mind? I've tried MFC, as the VITCam documentation mentions it, but it makes very little sense to me and isn't as intuitive as Windows Forms feels.
Edit:
This is the error message I get when trying to use a normal (at least to me) array.
Error 1 error C4368: cannot define 'buffer' as a member of managed 'WirelessHeadImpact::Form1': mixed types are not supported
And it points to this line:
private:
WORD buffer[BUFFSIZE*CHANCOUNT];
What I had before was this:
static array<WORD>^ _buffer;
And within a function I create the former array, pass it to the function, then return the latter after looping through and updating the array.
WORD buffer[BUFFSIZE*CHANCOUNT];
DWORD scansCollected = 0;
while (total_scans < SCANS) {
daqAdcTransferBufData(_handle, buffer, BUFFSIZE, DabtmWait, &scansCollected);
if (scansCollected > 0) {
for (WORD i=0;i<scansCollected;i++) {
_buffer[i] = buffer[i];
}
Mixed type support was removed in Visual C++ 2005. If you want to associate a DWORD array with a managed class, use new (not gcnew) to allocate the array on the native heap and store the pointer to the array in the class.
By the way, you cannot pass the address of an object on the managed heap to a native function without pinning the object; otherwise the GC is free to move the object at any time. If you want to pass a managed value to a native function, make sure you pass it by value or that the object is pinned.
It helps readers if you post the actual error message you are getting, so they don't have to guess it from your question.
