How to avoid collision of enumerated values? - c

I'm building a front-end library. The back-end produces a number of error codes that are enumerated:
enum backerr {BACK_ERR1, BACK_ERR2, BACK_ERR3};
My front-end produces a number of additional error codes:
enum fronterr {FRONT_ERR1, FRONT_ERR2, FRONT_ERR3};
For convenience I would like to have a single error-code-returning function that returns either a front-end or a back-end error, depending on which one occurred.
Is there any way to do this without the values of the two error enumerations colliding, considering that we cannot know the values used by the back end?

If you don't know what the back end may generate then, no, there is no way to reliably select your own error codes so that they don't clash.
So you have a couple of options (at least).
The first is useful if the back end somehow publishes the error ranges, such as in a header file. To be honest, it should be doing this since there's no other way for a program to distinguish the different error codes and/or types.
If they are published, it's a simple matter for you to discover the highest and select your own codes to leave plenty of room for the back end to expand. For example, if the back end uses 1..100, you start yours at 1000. The chance of any system suddenly reporting ten times as many errors as the previous version is a slim one.
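For instance, assuming the back end's codes are published as occupying 1..100 (the offset of 1000 below is an arbitrary choice for this sketch), your enumeration could simply start higher:

/* hypothetical frontend.h */
enum fronterr {
    FRONT_ERR1 = 1000,   /* leaves 1..999 free for the back end to grow into */
    FRONT_ERR2,          /* 1001 */
    FRONT_ERR3           /* 1002 */
};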
A second option is useful if you want real separation, with zero possibility of conflict.
There's nothing to stop you returning a structure similar to:
struct sFrontError {
    enum fronterr errorCode;
    enum backerr  backendCode;
};
and using that for your errors. Then your enumeration for the front end becomes:
enum fronterr {FRONT_OK, FRONT_BACK, FRONT_ERR1, FRONT_ERR2, FRONT_ERR3};
and you can evaluate it as follows:
If errorCode is FRONT_OK, there is no error.
If errorCode is FRONT_BACK, the error came from the back end, and you can find its code in backendCode.
Otherwise, it's a front end error and the code in errorCode fully specifies it.
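A minimal sketch of a caller's check, where doSomething() and the two handler functions are hypothetical stand-ins for your own code:

void example(void)
{
    struct sFrontError err = doSomething();

    if (err.errorCode == FRONT_OK) {
        /* success, nothing to do */
    } else if (err.errorCode == FRONT_BACK) {
        handleBackendError(err.backendCode);   /* back-end error */
    } else {
        handleFrontendError(err.errorCode);    /* front-end error */
    }
}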

If the backend exposes an exhaustive list of its error codes, you can easily create a true superset of them, keeping the codes that belong only to your frontend disjoint from the backend's values.
/* in backend.h */
enum backend_error
{
    BACK_ERR_1,
    BACK_ERR_2,
    BACK_ERR_3,
};

/* in frontend.h */
#include <backend.h>

enum frontend_error
{
    FRONT_ERR_1 = BACK_ERR_1,
    FRONT_ERR_2 = BACK_ERR_2,
    FRONT_ERR_3 = BACK_ERR_3,
    FRONT_ERR_4,
    FRONT_ERR_5,
};
This method doesn't force you to make any assumptions about the values of the backend error codes, but if a future version of the backend defines additional error codes, you might be hosed. Another downside is that your header file #includes the backend's header, so you are polluting your users' namespace.
If your users never call into the backend directly, that is, if you are providing abstractions for all backend functionality, you can define your own error codes altogether and have a function that maps backend error codes to your own. Since this function is not required to be the identity function, you can always make this work, even in the face of future changes to the backend. It can also be implemented in your own implementation file to keep the backend namespace out of your users' sight.
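A rough sketch of that mapping function, with invented names (map_backend_error and FRONT_ERR_BACKEND_UNKNOWN are not from the original post, and this frontend_error is defined independently of the backend's values); only this .c file includes backend.h, so the backend's identifiers never reach your users:

/* frontend.c -- the only file that includes the backend's header */
#include <backend.h>
#include "frontend.h"   /* declares enum frontend_error and map_backend_error() */

enum frontend_error map_backend_error(enum backend_error e)
{
    switch (e) {
    case BACK_ERR_1: return FRONT_ERR_1;
    case BACK_ERR_2: return FRONT_ERR_2;
    case BACK_ERR_3: return FRONT_ERR_3;
    default:         return FRONT_ERR_BACKEND_UNKNOWN;  /* future backend codes land here */
    }
}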

Modular program design

My question is unfortunately badly formed, as I'm not entirely certain what to call what I'm trying to do. My apologies for that.
It came up since I'm trying to code a really basic browser that I want to implement in C and I was thinking about how best to go about it. The basic idea being something like libcurl (for network interaction) -> libxml2 (parse HTML) -> UI and then some manner of getting libcurl to accept GET or POST requests from the UI (haven't gotten to this point yet).
However, this approach is severely limited: if I, say, want to check whether the document is a PDF and send it off to libpoppler before handing it to libxml2, I'll have to recode my entire program flow. Further, if I want to reuse parts of my program (say, the libcurl -> pdftohtml -> libxml2 part) and send the output to another program (for example w3m instead of my UI), I again don't see how I will manage that.
I could instead simply write a Perl or Python wrapper for curl, libxml2, etc., or do something along the lines of "curl example.com | parser | UI". However, doing it in Perl or Python still seems like I'll have to recode my program logic every time I want to do something new, and piping everything seems inelegant. I would also like to do this in C if possible.
So my question is: what does one call this idea? I've been driving myself crazy trying to figure out how to search for a solution to a problem that I can't name. I know it has something to do with modularity, but I don't know what specifically, and modularity is a very broad term. Secondly, and optionally, if anybody could point me in the direction of a solution I would appreciate that as well, although it's not as important as knowing what it's called.
Thanks to all who read this. :)
First, I suggest you take a look at C Interfaces and Implementations (http://www.amazon.com/Interfaces-Implementations-Techniques-Creating-Reusable/dp/0201498413). Second, most browsers are asynchronous, so you are going to need an event library like libuv or libev. Also, most modern websites require JavaScript to function properly, but adding a JavaScript engine to your browser would greatly complicate the project. I also don't see any mention of how you plan on parsing the HTTP being sent to and from your browser; I suggest https://github.com/joyent/http-parser.
As for your question on control flow, I would have a function that parses the response from the server and uses switch() to handle the various types of data being sent to your browser. There is a field in the HTTP header that gives the content type, so your browser can call different functions based on what the content type is.
Also take a look at function pointers, both here Polymorphism (in C) and here How do function pointers in C work?. Function pointers can be a more elegant way to solve your problem than having large switch statements throughout your code. With function pointers you can have a single call in your program that behaves differently depending on which function has been assigned to the pointer.
I will try to explain below with a browser as an example.
So let's say your browser just got back an HTTP response from some server. The response looks something like this in C:
struct http_res
{
    struct http_header *header;
    struct http_body *body;
    int (*decode_body)(char **);
};
So first your HTTP parser will parse the HTTP header and figure out whether it's a valid response, whether there's content, and so on. If there is content, the parser will check its type and, based on whether it's HTML, JavaScript, CSS, or whatever, set the function pointer to the right function for decoding the HTTP body.
static int decode_javascript(char **body)
{
    /* Whatever it takes to parse javascript from http. */
    return 0;
}

static int decode_html(char **body)
{
    /* Whatever it takes to parse html from http. */
    return 0;
}

static int decode_css(char **body)
{
    /* Whatever it takes to parse css from http. */
    return 0;
}

int parse_http_header(struct http_res *http)
{
    /* ... lots of other code to figure out content type. ... */

    switch(body_content_type)
    {
        case BCT_JAVASCRIPT:
            http->decode_body = &decode_javascript;
            break;

        case BCT_HTML:
            http->decode_body = &decode_html;
            break;

        case BCT_CSS:
            http->decode_body = &decode_css;
            break;

        default:
            printf("Error can't parse body type.\n");
            return -1;
    }
    return 0;
}
Now when we pass the HTTP response to another part of the browser, that code can call decode_body() on the response object and end up with a decoded body it can understand, without knowing what it's decoding.
int next_function(struct http_res *res)
{
    char *decoded_body;
    int rtrn;

    /* Now we can decode the http body without knowing anything about
       it. We just call decode_body() and end up with a buffer with the
       decoded data in it. */
    rtrn = res->decode_body(&decoded_body);
    if(rtrn < 0)
    {
        printf("Can't decode body.\n");
        return -1;
    }

    return 0;
}
To make your program really modular, at least in C, you would stick the various parts of your browser in different shared libraries: the HTTP parser, event library, JavaScript engine, HTML parser, and so on. Then you would create interfaces between the libraries, and you would be able to swap each library for a different one without having to change your program; you would just link a different library at run time. Take a look at Dr. Robert Martin (Uncle Bob); he talks about this extensively. This talk is good but it lacks slides (starts at 8:20): https://www.youtube.com/watch?v=asLUTiJJqdE . This one is also interesting, and it has slides: https://www.youtube.com/watch?v=WpkDN78P884 .
And finally, nothing about C, Perl, or Python means you will have to recode your program logic. You will have to design your program so that the modules do not know about each other. Each module knows only about the interface, and if you connect two modules that both "speak" the same interface, you will have created a modular system. It's just like how the internet works: the various computers on the internet do not need to know what the other computer is, what it's doing, or what its operating system is; all they need to know is TCP/IP, and they can communicate with every other device on the internet.
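As a rough illustration of such an interface (all of the names here are invented for the sketch), each content-handling library could be wrapped behind the same small struct of function pointers, and the rest of the browser would only ever talk to the struct:

/* Hypothetical "content handler" interface; an HTML module (libxml2),
   a PDF module (libpoppler), and so on would each provide one of these. */
struct content_handler {
    const char *content_type;                         /* e.g. "text/html" */
    int  (*init)(void);
    int  (*render)(const char *input, char **output);
    void (*cleanup)(void);
};

Adding PDF support then means registering another content_handler, not rewriting the program flow.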

Why is the windows return code called HRESULT?

The standard return type for functions in Windows C/C++ APIs is called HRESULT.
What does the H mean?
"Result handle", as stated at MSDN's Error Handling in COM.
The documentation only says:
The return value of COM functions and methods is an HRESULT, which is not a handle to an object, but is a 32-bit value with several fields encoded in a single 32-bit ULONG variable.
Which seems to indicate that it stands for "handle", but is misused in this case.
Hex Result.
HRESULTs are listed in the form 0x80070005. They are numbers returned by COM/OLE calls to indicate various types of success or failure. The code itself is composed of a bit-field structure, for those who want to delve into the details.
Details of the bit-field structure can be found at the Microsoft Dev Center topic Structure of COM Error Codes and at MSDN's HRESULT Structure.
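As a rough, self-contained illustration of that layout (written out by hand rather than using the winerror.h helper macros), the severity, facility, and code fields of a value such as 0x80070005 can be pulled apart like this:

#include <stdio.h>

int main(void)
{
    unsigned long hr = 0x80070005UL;                      /* E_ACCESSDENIED */

    unsigned severity = (unsigned)((hr >> 31) & 0x1);     /* 1 = failure */
    unsigned facility = (unsigned)((hr >> 16) & 0x7FF);   /* 7 = FACILITY_WIN32 */
    unsigned code     = (unsigned)(hr & 0xFFFF);          /* 5 = ERROR_ACCESS_DENIED */

    printf("severity=%u facility=%u code=%u\n", severity, facility, code);
    return 0;
}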
The H prefix in Windows data types generally designates handle types[1] (such as HBRUSH or HWND). The documentation seems to be in agreement, sort of:
The HRESULT (for result handle) is a way of returning success, warning, and error values. HRESULTs are really not handles to anything; they are only values with several fields encoded in the value.
In other words: Result handles are really not handles to anything. Clearly, things cannot possibly have been designed to be this confusing. There must be something else going on here.
Luckily, historian Raymond Chen is incessantly conserving this kind of knowledge. In the entry aptly titled Why does HRESULT begin with H when it’s not a handle to anything? he writes:
As I understand it, in the old days it really was a handle to an object that contained rich error information. For example, if the error was a cascade error, it had a link to the previous error. From the result handle, you could extract the full history of the error, from its origination, through all the functions that propagated or transformed it, until it finally reached you.
The document concludes with the following:
The COM team decided that the cost/benefit simply wasn’t worth it, so the HRESULT turned into a simple number. But the name stuck.
In summary: HRESULT values used to be handle types, but aren't handle types any more. All of the information is now encoded in the value itself.
Bonus reading:
Handle types losing their reference semantics over time is not without precedent. What is the difference between HINSTANCE and HMODULE? covers another prominent example.
[1] Handle types store values where the actual value isn't meaningful by itself; it serves as a reference to other data that's private to the implementation.

Querying for objects with angularFireCollection?

I used the implicit method for retrieving data objects:
setData = function(segment){
    var url = 'https://myFireBase.firebaseio.com/';
    var rawData = angularFire(url+segment,$rootScope,'data',{});
    rawData.then(function(data){
        // sorting and adjusting data, and then broadcasting and/or assigning
    });
};
This code is located inside a service that gets called from different locations; by later development stages there will probably be around 100-150 such calls, which is why I moved out of the controllers and into a service. But now Firebase data binding would obviously overwrite the different segments, so I turned back to the explicit method, to have the different Firebase references only send their data to the site instead of data-binding and overwriting each other:
var rawData = angularFireCollection(url+segment);
And right there I discovered why I chose the implicit method in the first place: there's an argument for the type, so I could tell Firebase whether I'm expecting a string, an array, an object, etc. I even looked at angularfire.js and saw that if the argument is not given, it falls back to identifying the value as an array by default.
Now, I'm definitely going to move to the explicit method (that is, if no salvation comes with Angular 2.0), and reconstructing my Firebase JSON to fit the array-only policy is not that big of a deal, but surely there's an option to explicitly ask for objects, or am I missing something?
I'm not totally clear on what the question is - with angularFireCollection, you can certainly retrieve objects just fine. For example, in the bundled chat app (https://github.com/firebase/angularFire/blob/gh-pages/examples/chat/app.js#L5):
$scope.messages = angularFireCollection(new Firebase(url).limit(50));
Each message is stored as an object, with its own unique key as generated by push().
I'm also curious about what problems you found while using the implicit method as your app grew. We're really looking to address problems like these for the next iteration of angularFire!

TCL/C - when is setFromAnyProc() called

I am creating a new Tcl_ObjType, so I need to define the four functions setFromAnyProc, updateStringProc, dupIntRepProc and freeIntRepProc. When it comes to testing my code, I see something interesting/mysterious.
In my testing code, when I do the following:
Tcl_GetString(p_New_Tcl_obj);
updateStringProc() for the new Tcl object is called; I can see it in gdb. This is expected.
The weird thing is when I do the following testing code:
Tcl_SetStringObj(p_New_Tcl_obj, p_str, strlen(p_str));
I expect setFromAnyProc() to be called, but it is not!
I am confused. Why is it not called?
The setFromAnyProc is not nearly as useful as you might think. Its role is to convert a value[*] from something with a populated bytes field into something with a populated bytes field and a valid internalRep and typePtr. It's called when something wants a generic conversion to a particular format, and it is in particular the core of the Tcl_ConvertToType function. You probably won't have used that; Tcl itself certainly doesn't!
This is because it turns out that the point when you want to do the conversion is in a type-specific accessor or manipulator function (examples from Tcl's API include Tcl_GetIntFromObj and Tcl_ListObjAppendElement, which are respectively an accessor for the int type[**] and a manipulator for the list type). At that point, you're in code that has to know the full details of the internals of that specific type, so using a generic conversion is not really all that useful: you can do the conversion directly if necessary (or factor that out to a conversion function).
Tcl_SetStringObj works by throwing away the internal representation of your object (with the freeIntRepProc callback), disposing of the old bytes string representation (through Tcl_InvalidateStringRep, or rather its internal analog) and then installing the new bytes you've supplied.
I find that I can leave the setFromAnyProc field of a Tcl_ObjType set to NULL with no problems.
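For illustration, a type definition along these lines works fine with the field left NULL (the callback names here are just placeholders for your own functions):

static Tcl_ObjType myType = {
    "myType",          /* name */
    MyFreeIntRep,      /* freeIntRepProc */
    MyDupIntRep,       /* dupIntRepProc */
    MyUpdateString,    /* updateStringProc */
    NULL               /* setFromAnyProc -- Tcl never requires it */
};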
[*] The Tcl_Obj type is mis-named for historic reasons. It's a value. Tcl_Value was taken for something else that's now obsolete and virtually unused.
[**] Integers are actually represented by a cluster of internal types, depending on the number of bits required. You don't need to know the details if you're just using them, as the accessor functions completely hide the complexity.

Is it a bad idea to mix bool and ret codes

I have some programs which make heavy use of libraries with enumerations of error codes.
The kind where 0 (the first value of the enum) is success and 1 is failure. In some cases I have my own helper functions that return a bool indicating error; in other cases I bubble up the error enumeration. Unfortunately, sometimes I mistake one for the other and things fail.
What would you recommend? Am I missing some warnings on gcc which would warn in these cases?
P.S. it feels weird to return an error code which is totally unrelated to my code, although I guess I could return -1 or some other invalid value.
Is it a bad idea? No, you should do what makes sense rather than following some abstract rule (the likes of which almost never cater for all situations you're going to encounter anyway).
One way I avoid trouble is to ensure that all boolean-returning functions read like proper English, examples being isEmpty(), userFlaggedExit() or hasContent(). This is distinct from my normal verb-noun constructs like updateTables(), deleteAccount() or crashProgram().
For a function that returns a boolean indicating the success or failure of an operation that would normally follow the verb-noun construct, I tend to use something like deleteAccountWorked() or successfulTableUpdate().
In all those boolean-returning cases, I can construct an easily readable if statement:
if (isEmpty (list)) ...
if (deleteAccountWorked (user)) ...
And so on.
For non-boolean-returning functions, I still follow the convention that 0 is okay and all other values are errors of some sort. The use of intelligent function names usually means it's obvious as to which is which.
But keep in mind, that's my solution. It may or may not work for other people.
In the parts of the application that you control, and the parts that make up your external API, I would say: choose one type of error handling and stick to it. Which type matters less than being consistent. Otherwise people working on your code will not know what to expect, and even you yourself will scratch your head when you get back to the code in a year or so ;)
If you standardize on a scheme where zero means success (no error), you can mix and match enum and bool returns if you construct your tests like this:
err = some_func();
if (!err) ...
Since the first enum value evaluates to zero and is also the success case, this matches perfectly with bool error returns.
However, in general it is better to return an int (or enum), since this allows the set of error codes to be expanded without modifying the calling code.
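For example (the names below are invented for the sketch), calling code written against the 0-is-success convention keeps working even if the library later adds more error values:

enum db_err { DB_OK, DB_ERR_CONN, DB_ERR_TIMEOUT };  /* a DB_ERR_DISK_FULL could be added later */

enum db_err db_query(const char *q);                 /* hypothetical library call */

void example(void)
{
    if (db_query("select ...") != DB_OK) {
        /* handle failure -- still correct after new codes appear */
    }
}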
I wouldn't say that it's a bad practice.
There's no need to create tons of enums if you just need to return true/false and you don't have other options (and true and false are explanatory enough).
Also, if your functions are named well, you will make fewer "mistakes".
For example, IsBlaBla is expected to return true or false. If you have [Do|On]Reload, a reload could fail for many reasons, so an enum would be expected. The same goes for IsConnected versus Connect, etc.
IMHO function naming helps here.
E.g. for functions that return a boolean value, is_foo_bar(...), or for functions that return success or an error code, do_foo_bar(...).
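A couple of hypothetical signatures following that convention:

#include <stdbool.h>

struct connection;                                     /* opaque type, just for the sketch */

bool is_connection_open(const struct connection *c);   /* answers a yes/no question */
int  do_reconnect(struct connection *c);               /* 0 == success, otherwise an error code */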
