What is the difference between Either<string, A> and Either<Error, A> - fp-ts

I'm new to fp-ts, I saw some people use Either<string, A> and others use Either<Error, A> in different articles. I want to know the difference between the two, and how should I choose which one to use? Thanks!

Either is a tool to help you return one of two possible values from a function. Traditionally it is used to signal to the call site that a process can fail. Failures are returned in the Left variant, whereas successful values are returned in the Right variant.
So with Either<string, A> the function is promising to return the Right<A> value when it succeeds or a Left<string> value when it fails. In that case, the failure is most likely a string containing a message describing the failure. Either<Error, A> is similar, except that the author of that function is using the existing Error class to contain information about the failure.
It's up to the designer of the function's API to decide what value it makes sense to return on failure, but neither is necessarily wrong.
Incidentally, the Left value being failure and the Right value being success is only convention. It's perfectly valid to have a function that returns Either<number, string> and return number in some situations and string in others.

Related

What problem do Kleisli arrows solve in fp-ts?

I'm learning functional programming by using fp-ts lib and noticed that some functions end with K, like:
chainTaskEitherK
chainEitherK
fromTaskK
I also read the explanation of what a K suffix means in the fp-ts documentation, but unfortunately I can't say that its one example gives me a solid idea of how to use it on the battlefield.
I would like to know exactly what problem they solve and what the code would look like without them (to see the benefit).
Please, consider that I'm a newbie in this topic.
First (going off the example in the link you shared), I think a Kleisli arrow is just a name for a type that looks like:
<A, B>(value: A) => F<B>
where F is some Functor type. They call it a constructor in the doc you linked, which might be more precise? My understanding is that it's just a function that takes some non-Functor value (a string in the parse example) and returns a result wrapped in some Functor.
These helpers you've listed are there for when you already have a Kleisli arrow and you want to use it with some other Functor values. In the example they have
declare function parse(s: string): Either<Error, number>;
and they have an IOEither value that would probably come from user input. They want to combine the two, basically run parse on the input if it's a Right and end up with a function using parse with the signature:
declare function parseForIO(s: string): IOEither<Error, number>;
This is so the return type can be compatible with our input type (so we can use chain on the IOEither to compose our larger function).
fromEitherK therefore wraps the base parse function in some logic that transforms the resulting regular Either into an IOEither. chainEitherK does that plus a chain, to save some of the boilerplate.
Basically, it's solving a compatibility issue when the return value from your Kleisli arrows doesn't match the value you need when chaining things together.
In addition to @Souperman's explanation, I want to share my investigation of this topic.
Let's take the already familiar example from the fp-ts documentation.
We have an input variable of type IOEither
const input: IE.IOEither<Error, string> = IE.right('foo')
and a function which takes a plain string and returns an E.Either
function parse(s: string): E.Either<Error, number> {
  // implementation
}
If we want to make this code work together in an fp-ts style, we need to introduce a pipe. pipe is a function which passes our data through the functions listed inside it.
So, instead of doing this (imperative style)
const input: IE.IOEither<Error, string> = IE.right('foo')
const value = input()
let result: E.Either<Error, number>
if (value._tag === 'Right') {
  result = parse(value.right) // where value.right is our 'foo'
}
We can do this
pipe(
  input,
  IE.chain(inputValue => parse(inputValue))
  //       ~~~~~~~~~~~~~~~~~ <- Error is shown
)
Error message
Type 'Either<Error, number>' is not assignable to type 'IOEither<Error, unknown>'.
Unfortunately, fp-ts cannot implicitly jump between types (e.g. from Either to IOEither). In our example, we started with an input variable of type IOEither (shortened to IE) and continued with IE.chain, whose callback tries to return an Either value.
To make it work we can introduce a function which helps us convert between these types.
pipe(
  input,
  IE.chain(inputValue => IE.fromEitherK(parse)(inputValue))
)
Now our chain function explicitly knows that the parse function was converted from returning an Either to returning an IOEither by fromEitherK.
At this point we can see that fromEitherK is a helper function that expects a Kleisli function as its argument and returns a new function with a new return type.
To make it clearer: we wouldn't need the K suffix if, for example, instead of the parse function we had a plain Either value (called parsed here).
The code would look like
pipe(
  input,
  IE.chain(inputValue => IE.fromEither(parsed)) // I know this is useless code, but it shows its purpose
)
Returning to our example, we can make the code more readable.
Instead of this
pipe(
  input,
  IE.chain(inputValue => IE.fromEitherK(parse)(inputValue))
)
We can do this
pipe(
  input,
  IE.chain(IE.fromEitherK(parse))
)
And even more
pipe(
  input,
  IE.chainEitherK(parse)
)
Summary
As far as I understand, Kleisli arrows are functions which take an argument and return a result wrapped in a container (like the parse function).

How to deal with a missing key in a hashmap

What might be a proper error or return code for when a key is asked for that cannot be returned?
void hash_delete(hash_table* table, const char* key)
{
    hash_item* item = hash_get(table, key);
    if (item == NULL)
        ; // what error to raise?
    else
        delete_hash_item_internal(item);
}
My thoughts were either to make the function return a bool instead (1=found, 0=not found), or do an exit(). What do you think would be a proper way to handle this?
You have several possibilities, as you are the one designing the function. However, calling exit() is not a good choice, as it always forces the same drastic behavior. It is better to leave the caller the right to decide whether to stop the program, continue, log something, etc. (as has already been pointed out in the comments to the OP).
So I would return a different value depending on whether the key exists. You can choose between the following:
1. Return a simple bool. Return true when the key is found (deleted). Return false when the key is not found. This is the simplest and easiest behavior to understand.
2. Return a char*. Return NULL when the key is found (deleted). Return the key itself when it is not found. This allows further action to be taken directly with the result. It may be useful, for example, if the key is obtained from a function and there is no need to store it in a variable unless this function fails.
3. Return a hash_item*. Return the item variable when the key is found. Return NULL when the key is not found. This is one of the most typical behaviors in deleting functions, and it lets the caller use the delete function as a combined get+delete, avoiding a separate call. Note that this method may have some issues depending on how the hash_table is implemented. For example, if the table stores pointers to hash_item, it might contain "valid" NULL values, and a NULL return could then mean either that no entry was found or that the entry found contains NULL. Returning the item may also be problematic if delete_hash_item_internal() frees the memory behind the pointer stored in the table (which is the pointer you would return), or something similar.
If I had to choose, I would prefer the last option (if it is possible given the issues I mentioned). If it is not possible, I would use the first one for the sake of simplicity. But in the end it is up to you as the designer to decide which one is better, taking into account the rest of the code you will need and how you want the function to be used.
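As a rough illustration of the simplest option (returning a bool), here is a minimal sketch. Only the names hash_table, hash_item, hash_get and hash_delete come from the question; the struct layouts, the fixed TABLE_CAPACITY, the hash_put helper and the linear-scan lookup are assumptions invented for this example, standing in for a real hashing implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TABLE_CAPACITY 8  /* hypothetical fixed size, for illustration only */

typedef struct {
    const char *key;  /* NULL marks an empty slot */
    int value;
} hash_item;

typedef struct {
    hash_item items[TABLE_CAPACITY];
} hash_table;

/* Stand-in lookup: a linear scan instead of real hashing. */
hash_item *hash_get(hash_table *table, const char *key)
{
    for (size_t i = 0; i < TABLE_CAPACITY; i++) {
        if (table->items[i].key != NULL &&
            strcmp(table->items[i].key, key) == 0)
            return &table->items[i];
    }
    return NULL;
}

/* Stand-in insert (no duplicate check): returns false when the table is full. */
bool hash_put(hash_table *table, const char *key, int value)
{
    for (size_t i = 0; i < TABLE_CAPACITY; i++) {
        if (table->items[i].key == NULL) {
            table->items[i].key = key;
            table->items[i].value = value;
            return true;
        }
    }
    return false;
}

/* Returns true if the key was found and removed, false if it was absent. */
bool hash_delete(hash_table *table, const char *key)
{
    hash_item *item = hash_get(table, key);
    if (item == NULL)
        return false;   /* the caller decides whether this is an error */
    item->key = NULL;   /* mark the slot as empty */
    return true;
}
```

The caller can then write something like if (!hash_delete(&table, "k")) { /* log, ignore, or abort */ } and keep the decision about how serious a missing key is.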

Should C API return value or error code?

I'm writing my first C library and I'm not sure which way to go. For example, a function to retrieve a string value from some data store could look like:
int get_value(void * store, char ** result);
or
char * get_value(void * store, int * error);
I'm having a hard time coming up with any objective reason to prefer one over the other, but then again, I don't write C that much. The return-error-code style will look more consistent when multiple output parameters are present; however, the return-value style could be a bit easier to use? Not sure.
Is there a general consensus on which style is better and why, or is it just personal preference?
There tend not to be good, "hard" answers to style-based questions like this. What follows are my opinions; others will disagree.
Having a function simply return its return value usually makes it easier for the caller -- as long as the caller is interested in getting answers, not necessarily in optimal error handling.
Having all functions return success/failure codes -- returning any other data via "result" parameters -- makes for clean and consistent error handling, but tends to be less convenient for callers. You're always having to declare extra variables (of the proper type) to hold return values. You can't necessarily write things like a = f(g());.
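A minimal sketch of this status-code style, using the first signature from the question. The struct store layout and the STORE_* codes are hypothetical, made up only so the example is complete.

```c
#include <stddef.h>

/* Hypothetical status codes; 0 means success. */
enum { STORE_OK = 0, STORE_ERR_ARG = 1, STORE_ERR_EMPTY = 2 };

/* Hypothetical store that holds at most one string. */
struct store {
    char *value;  /* NULL when nothing is stored */
};

/* Returns a status code; on success the value is written to *result. */
int get_value(void *store, char **result)
{
    struct store *s = store;
    if (s == NULL || result == NULL)
        return STORE_ERR_ARG;
    if (s->value == NULL)
        return STORE_ERR_EMPTY;
    *result = s->value;
    return STORE_OK;
}
```

The inconvenience described above shows up at the call site: the caller has to declare a char * variable first and pass its address, so the call cannot be nested inside another expression.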
Returning ordinary values ordinarily, and indicating errors via an "out of band" ordinary return value, is a popular technique -- the canonical example is the Standard C getchar function -- but it can feel rather ad-hoc and error-prone.
Returning values via the return value, and error codes via a "return" parameter, is unusual. I can see the attraction, but I can't say I've ever used that technique, or would. If there needs to be an error return distinct from the return value, the "C way" (though of course this is generally a pretty bad idea, and now pretty strongly deprecated) is to use some kind of globalish variable, à la errno.
If you want to pay any heed to the original "spirit of C", it was very much for programmer convenience, and was not too worried about rigid consistency, and was generally okay with healthy dollops of inconsistency and ad-hocciness. So using out-of-band error returns is fine.
If you want to pay heed to modern usage, it seems to be increasingly slanted towards conformity and correctness, meaning that consistent error return schemes are a good thing, even if they're less convenient. So having the return value be a success/failure code, and data returned by a result parameter, is fine.
If you want to have the return value be the return value, for convenience, and if errors are unusual, but for those callers who care you want to give them a way of getting fine-grained error information, a good compromise is sometimes to have a separate function to fetch the details of the most-recent error. This can still lead to the same kinds of race conditions as a global variable, but if your library uses some kind of "descriptors" or "handles", such that you can arrange to have this error-details function return the details of the most recent operation on a particular handle, it can work pretty well.
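The per-handle variant of that compromise could be sketched roughly as follows. All names here (kv_handle, kv_get, kv_last_error, the KV_* codes) are hypothetical, and the "store" holds a single key/value pair just to keep the example short.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes; 0 means the last operation succeeded. */
typedef enum { KV_OK = 0, KV_ERR_NO_KEY, KV_ERR_BAD_ARG } kv_error;

/* Hypothetical handle: one key/value pair plus the last error seen. */
typedef struct {
    const char *key;
    const char *value;
    kv_error last_error;
} kv_handle;

/* Returns the value directly, or NULL on failure.
   The reason for a failure is recorded in the handle, not in a global. */
const char *kv_get(kv_handle *h, const char *key)
{
    if (h == NULL)
        return NULL;
    if (key == NULL) {
        h->last_error = KV_ERR_BAD_ARG;
        return NULL;
    }
    if (h->key == NULL || strcmp(h->key, key) != 0) {
        h->last_error = KV_ERR_NO_KEY;
        return NULL;
    }
    h->last_error = KV_OK;
    return h->value;
}

/* Separate function for callers who want the details. */
kv_error kv_last_error(const kv_handle *h)
{
    return h->last_error;
}
```

Because the error lives in the handle, two threads working on two different handles do not overwrite each other's error state, which is the advantage over an errno-style global.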

Why is the Windows return code called HRESULT?

The standard return type for functions in Windows C/C++ APIs is called HRESULT.
What does the H mean?
Result handle as stated here at MSDN Error Handling in COM
The documentation only says:
The return value of COM functions and methods is an HRESULT, which is not a handle to an object, but is a 32-bit value with several fields encoded in a single 32-bit ULONG variable.
Which seems to indicate that it stands for "handle", but is misused in this case.
Hex Result.
HRESULTs are listed in the form 0x80070005. They are numbers returned by COM/OLE calls to indicate various types of SUCCESS or FAILURE. The code itself is composed of a bit-field structure, for those who want to delve into the details.
Details of the bit field structure can be found here at Microsoft Dev Center's topic Structure of COM Error Codes and here at MSDN HRESULT Structure.
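To make the bit-field idea concrete, here is a small portable sketch that splits a value such as 0x80070005 into its main fields. It assumes the commonly documented layout (severity in bit 31, an 11-bit facility starting at bit 16, a 16-bit code in the low bits); the hr_* function names are made up for this example. On Windows, the system headers provide their own macros for this, which this sketch deliberately avoids so that it compiles anywhere.

```c
#include <stdint.h>

/* Bit 31: 1 means failure, 0 means success. */
uint32_t hr_severity(uint32_t hr)
{
    return (hr >> 31) & 0x1u;
}

/* Bits 16..26: which subsystem produced the code (assumed 11-bit field). */
uint32_t hr_facility(uint32_t hr)
{
    return (hr >> 16) & 0x7FFu;
}

/* Bits 0..15: the error code within that facility. */
uint32_t hr_code(uint32_t hr)
{
    return hr & 0xFFFFu;
}
```

For 0x80070005 this gives severity 1 (failure), facility 7 and code 5.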
The H-prefix in Windows data types generally designates handle types¹ (such as HBRUSH or HWND). The documentation seems to be in agreement, sort of:
The HRESULT (for result handle) is a way of returning success, warning, and error values. HRESULTs are really not handles to anything; they are only values with several fields encoded in the value.
In other words: Result handles are really not handles to anything. Clearly, things cannot possibly have been designed to be this confusing. There must be something else going on here.
Luckily, historian Raymond Chen is incessantly conserving this kind of knowledge. In the entry aptly titled Why does HRESULT begin with H when it’s not a handle to anything? he writes:
As I understand it, in the old days it really was a handle to an object that contained rich error information. For example, if the error was a cascade error, it had a link to the previous error. From the result handle, you could extract the full history of the error, from its origination, through all the functions that propagated or transformed it, until it finally reached you.
The document concludes with the following:
The COM team decided that the cost/benefit simply wasn’t worth it, so the HRESULT turned into a simple number. But the name stuck.
In summary: HRESULT values used to be handle types, but aren't handle types any more. The entire information is now encoded in the value itself.
Bonus reading:
Handle types losing their reference semantics over time is not without precedent. What is the difference between HINSTANCE and HMODULE? covers another prominent example.
¹ Handle types store values where the actual value isn't meaningful by itself; it serves as a reference to other data that's private to the implementation.

Is it a bad idea to mix bool and ret codes

I have some programs which make heavy use of libraries with enumerations of error codes.
The kind where 0 (the first value of the enum) is success and 1 is failure. In some cases I have my own helper functions that return a bool indicating error; in other cases I bubble up the error enumeration. Unfortunately, sometimes I mistake one for the other and things fail.
What would you recommend? Am I missing some warnings on gcc which would warn in these cases?
P.S. it feels weird to return an error code which is totally unrelated to my code, although I guess I could return -1 or some other invalid value.
Is it a bad idea? No, you should do what makes sense rather than following some abstract rule (the likes of which almost never cater for all situations you're going to encounter anyway).
One way I avoid trouble is to ensure that all boolean-returning functions read like proper English, examples being isEmpty(), userFlaggedExit() or hasContent(). This is distinct from my normal verb-noun constructs like updateTables(), deleteAccount() or crashProgram().
For a function which returns a boolean indicating success or failure of a function which would normally follow that verb-noun construct, I tend to use something like deleteAccountWorked() or successfulTableUpdate().
In all those boolean-returning cases, I can construct an easily readable if statement:
if (isEmpty (list)) ...
if (deleteAccountWorked (user)) ...
And so on.
For non-boolean-returning functions, I still follow the convention that 0 is okay and all other values are errors of some sort. The use of intelligent function names usually means it's obvious as to which is which.
But keep in mind, that's my solution. It may or may not work for other people.
In the parts of the application that you control, and the parts that make up your external API, I would say: choose one type of error handling and stick to it. Which type is less important, but be consistent. Otherwise people working on your code will not know what to expect, and even you yourself will scratch your head when you get back to the code in a year or so ;)
If standardizing on a zero == success scheme (where a bool return means "an error occurred"), you can mix and match both enum and bool if you construct your tests like this:
err = some_func();
if (!err) ...
Since the first enum value is zero and is also the success case, it matches bool error returns perfectly.
However, in general it is better to return an int (or enum) since this allows for the expansion of the error codes returned without modification of calling code.
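A small sketch of how the two kinds of return value can share one test, assuming the bool helper returns true on error so that its polarity matches the enum. Every name below (lib_status, lib_scale, helper_failed, run) is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical library enum: the first value (0) is success. */
typedef enum { LIB_OK = 0, LIB_ERR_RANGE, LIB_ERR_TOO_BIG } lib_status;

/* Hypothetical library call that reports failure through the enum. */
lib_status lib_scale(int x, int *out)
{
    if (x < 0)
        return LIB_ERR_RANGE;
    *out = x * 2;
    return LIB_OK;
}

/* Hypothetical helper: true means an error occurred, false (0) means fine. */
bool helper_failed(int value)
{
    return value > 100;
}

/* Both checks use the same "!err" shape because 0 always means "no error". */
lib_status run(int x, int *out)
{
    lib_status err = lib_scale(x, out);
    if (!err) {                     /* enum: 0 == LIB_OK */
        if (!helper_failed(*out))   /* bool: false == no error */
            return LIB_OK;
        err = LIB_ERR_TOO_BIG;      /* map the bool failure onto the enum */
    }
    return err;
}
```

The name helper_failed also follows the naming advice from the other answers: it reads as a question whose "yes" answer is obviously the error case.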
I wouldn't say that it's a bad practice.
There's no need to create tons of enums if you just need to return true/false and you don't have other options (and true and false are explanatory enough).
Also, if your functions are named well, you will make fewer "mistakes".
For example, IsBlaBla is expected to return a bool. If you have [Do|On]Reload, a reload could fail for many reasons, so an enum would be expected. The same goes for IsConnected and Connect, etc.
IMHO function naming helps here.
E.g. for functions that return a boolean value, is_foo_bar(...), or for functions that return success or an error code, do_foo_bar(...).

Resources