Why is "explicitness" considered a Good Thing? - theory

I often hear people praise languages, frameworks, constructs, etc. for being "explicit". I'm trying to understand this logic. The purpose of a language, framework, etc. is to hide complexity. If it makes you specify all kinds of details explicitly, then it's not hiding much complexity, only moving it around. What's so great about explicitness and how do you make a language/framework/API "explicit" while still making it serve its purpose of hiding complexity?

Whether you should be explicit or implicit depends on the situation. You are correct that you are often trying to hide complexity, and having certain things done for you automatically behind the scenes is good: encapsulation, etc.
Sometimes though frameworks or constructs hide things from us that they should not, and this makes things less clear. Sometimes certain information or settings are hidden from us and hence we don't know what's happening. Assumptions are made that we don't understand and can't determine. Behaviors happen that we can't predict.
Encapsulation: good. Hiding: bad. Making the right call takes experience. Where logic belongs, it should be explicit.
Example: I once removed about 90 lines of code from a series of a dozen code behind pages; data access code, business logic, etc., that did not belong there. I moved them to base pages and the key business object. This was good (encapsulation, separation of concerns, code organization, decoupling, etc.).
I then excitedly realized that I could remove the last line of code from many of these pages, moving it to the base page. It was a line that took a parameter from the URL and passed it to the business object. Good, right? Well, no, this was bad (I was hiding). This logic belonged here, even though it was almost the same line on every page. It linked the UI intention with the business object. It needed to be explicit. Otherwise I was hiding, not encapsulating. With that line, someone looking at that page would know what that page did and why; without it, it would be a pain to determine what was going on.

I believe that explicit refers to knowing exactly what it is doing when you use it. That is different from knowing exactly how it's done, which is the complex part.

It's not so much that explicit is good (certainly the closely-related verbose is bad) as that when implicit goes wrong, it's so hard to tell WTF is going on.
Hack C++ for a decade or two and you'll understand exactly what I mean.

It is about expressing intentions. The reader can't tell if the default was left by mistake or by design. Being explicit removes that doubt.

Code is harder to read than to write. In nontrivial applications, a given piece of code will also be read more often than it is written. Therefore, we should write our code to make it as easy on the reader as possible. Code that does a lot of stuff that isn't obvious is not easy to read (or rather, it's hard to understand when you read it). Ergo, explicitness is considered a good thing.

Relying on default behaviour hides important details from people who aren't intimately familiar with the language/framework/whatever.
Consider how Perl code which relies extensively on shorthands is difficult to understand for people who don't know Perl.

Being explicit vs. implicit is all about what you hide, and what you show.
Ideally, you expose concepts that either the user cares about, or has to care about (whether they want to or not).
The advantage of being explicit is that it's easier to track down and find out what's going on, especially in case of failure. For instance, if I want to do logging, I can have an API that requires explicit initialization with a directory for the log. Or, I can use a default.
If I give an explicit directory, and it fails, I'll know why. If I use an implicit path, and it fails, I will have no idea of what has gone wrong, why, or where to look to fix it.
Implicit behavior is almost always a result of hiding information from the consumer. Sometimes that's the right thing to do, such as when you know in your environment there's only one "answer". However, it's best to know when you're hiding information and why, and ensure that you're letting your consumers work closer to their level of intent, and without trying to hide items of essential complexity.
Frequently implicit behavior is a result of "self-configuring" objects that look at their environment and try to guess the correct behavior. I'd avoid this pattern in general.
One rule I'd probably follow overall is that, for a given API, any operation should either be explicit, or implicit, but never a combination. Either make the operation something the user has to do, or make it something they don't have to think about. It's when you mix those two that you will run into the biggest problems.
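The logging example above can be sketched in C. This is only an illustration under stated assumptions: the `logger` type, `logger_init`, and the `app.log` file name are hypothetical and do not come from any real library. The point is that the caller must name the directory, so a failure can be traced to a value the caller supplied.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical logger handle; the names are illustrative only. */
typedef struct {
    char path[256];   /* full path of the log file actually in use */
} logger;

/* Explicit initialization: the caller must say where the log goes.
   Returns 0 on success, -1 if the directory is missing or the path is too long. */
int logger_init(logger *lg, const char *directory)
{
    assert(lg != NULL);
    if (directory == NULL || strlen(directory) == 0)
        return -1;    /* no silent fallback to a default location */

    int written = snprintf(lg->path, sizeof lg->path, "%s/app.log", directory);
    if (written < 0 || (size_t)written >= sizeof lg->path)
        return -1;    /* the path would have been truncated */
    return 0;
}
```

A caller would write something like `logger_init(&lg, "/var/log/myapp")` and check the result, which makes the chosen location visible at the call site.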

Frameworks, etc., can be both explicit and hide complexity by offering the right abstractions for the job to be done.
Being explicit allows others to inspect and understand what is meant by the original developer.
Hiding complexity is not equivalent to being implicit. Implicitness results in code that only the person who wrote it can understand, because trying to work out what goes on under the hood is akin to reverse engineering.
Explicit code has a theoretical chance of being proved correct. Implicit code never stands a chance in this respect.
Explicit code is maintainable, implicit code is not - this links to providing correct comments and choosing your identifiers with care.

An "explicit" language allows the computer to find bugs in software that a less-explicit language does not.
For example, C++ has the const keyword for variables whose values should never change. If a program tries to change these variables, the compiler can state that the code is likely wrong.
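C has the same `const` qualifier, so the idea can be sketched in C as well. The function name `sum_values` is made up for this illustration. The qualifier on the parameter states explicitly that the function only reads the array, and the commented-out line shows the kind of write a compiler would reject.

```c
#include <assert.h>
#include <stddef.h>

/* The const qualifier states explicitly that this function only reads the array. */
long sum_values(const int *values, size_t count)
{
    assert(values != NULL || count == 0);
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
        /* values[i] = 0;   would not compile: assignment to a read-only location */
    }
    return total;
}
```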

Good abstraction doesn't hide complexities, it takes decisions that are best left to the compiler off of your plate.
Consider garbage collection: The complexities of releasing resources are delegated to a garbage collector which is (presumably) better qualified to make a decision than you, the programmer. Not only does it take the decision off your hands, but it makes a better decision than you would have yourself.
Explicitness is (sometimes) good because it ensures that certain decisions, which in some cases are better left to the programmer, are not automatically made by a less qualified agent. A good example is when you're declaring a floating point variable in a C-style language and initializing it with a whole-number value:
double i = 5.0;
if instead you were to declare it as
var i = 5;
the compiler would rightfully assume you want an int and operations later on would be truncated.
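The `var` declaration above is C#-style; the same truncation can be shown in plain C by choosing the type explicitly. A minimal sketch, with function names invented for the illustration:

```c
/* Integer operands: the division is done in int and the fraction is discarded. */
double half_from_int(void)
{
    int i = 5;
    return i / 2;      /* 5 / 2 is 2 in int arithmetic, then converted to 2.0 */
}

/* Floating point operand: 2 is converted to 2.0 before dividing. */
double half_from_double(void)
{
    double i = 5.0;
    return i / 2;      /* 5.0 / 2.0 is 2.5 */
}
```

The only difference between the two functions is the declared type of `i`, which is exactly the decision the answer argues should stay with the programmer.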

Explicitness is desirable in the context of making it clear to the reader of your code what you intended to do.
There are many examples, but it's all about leaving no doubt about your intent.
e.g. These are not very explicit:
while (condition);
int MyFunction()
bool isActive; // In C# we know this is initialised to 0 (false)
a = b??c;
double a = 5;
double angle = 1.57;
but these are:
while (condition)
/* this loop does nothing but wait */ ;
private int MyFunction()
bool isActive = false; // Now you know I really meant this to default to false
if (b != null) a = b; else a = c;
double a = 5.0;
double angleRadians = 1.57;
The latter cases leave no room for misinterpretation. The former might lead to bugs when someone fails to read them carefully, or doesn't clearly understand a less readable syntax for doing something, or mixes up integer and float types.

In some cases the opposite is "magic" - as in "then a miracle occurs".
When a developer's reading code trying to understand or debug what's going on, explicitness can be a virtue.

The purpose of frameworks moving things around is to remove duplication in code and allow easier editing of chunks without breaking the whole thing.
When you have only one way of doing something, like say SUM(x,y);
We know exactly what this is going to do; there is no reason to ever rewrite it, and although you can if you must, it's highly unlikely.
The opposite of that is frameworks like .NET, which provide very complex functions that you will often need to rewrite if you're doing anything but the obvious simple example.

Related

Is it good to use functions as much as possible?

When I read open source code (Linux C code), I see that a lot of functions are used instead of performing all operations in main(), for example:
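void function1(void);
void function2(void);
void function3(void);
void function4(void);
int main(void) {
    function1();
    return 0;
}
void function1(void) {
    // do something
    function2();
}
void function2(void) {
    function3();
    // do something
    function4();
}
void function3(void) {
    // do something
}
void function4(void) {
    // do something
}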
Could you tell me what are the pros and cons of using functions as much as possible?
easy to add/remove functions (or new operations)
readability of the code
source efficiency(?) as the variables in the functions will be destroyed (unless dynamic allocation is done)
would the nested function slow the code flow?
Easy to add/remove functions (or new operations)
Definitely - it's also easy to see where the context for an operation starts and finishes. It's much easier to see that way than by some arbitrary range of lines in the source.
Readability of the code
You can overdo it. There are cases where having a function or not having it does not make a difference in linecount, but does in readability - and it depends on a person whether it's positive or not.
For example, if you did lots of set-bit operations, would you make:
some_variable = some_variable | (1 << bit_position)
a function? Would it help?
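For comparison, here is what the function version might look like. The name `set_bit` is hypothetical, and the sketch assumes `bit_position` is smaller than the width of `unsigned int`.

```c
/* Hypothetical helper: returns value with the bit at bit_position set.
   Assumes bit_position is less than the width of unsigned int. */
static inline unsigned int set_bit(unsigned int value, unsigned int bit_position)
{
    return value | (1u << bit_position);
}
```

The call site becomes `some_variable = set_bit(some_variable, bit_position);`, which is about the same length as the original expression, so whether it helps is mostly a matter of taste.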
Source efficiency(?) due to the variables in the functions being destroyed (unless dynamic allocation is done)
If the source is reasonable (as in, you're not reusing variable names past their real context), then it shouldn't matter. Compiler should know exactly where the value usage stops and where it can be ignored / destroyed.
Would the nested function slow the code flow?
In some cases where address aliasing cannot be properly determined it could. But it shouldn't matter in practice in most programs. By the time it starts to matter, you're probably going to be going through your application with a profiler and spotting problematic hotspots anyway.
Compilers are quite good these days at inlining functions though. You can trust them to do at least a decent job at getting rid of all cases where calling overhead is comparable to function length itself. (and many other cases)
This practice becomes really important as the amount of code you write increases. Separating code into functions improves code hygiene and makes it easier to read. I read somewhere that there is little point in code that is readable only by you (though I assume that is acceptable in some situations). If you want your code to live on, it must be maintainable, and, put simply, splitting it into functions is what makes it maintainable. Also imagine a code base that exceeds 100k lines, which is quite common, with all of it in the main function. That would be an absolute nightmare to maintain. Dividing the code into functions also creates degrees of separation, so that many developers can work on different parts of the code base. So the short answer is yes, it is good to use functions when necessary.
Functions should help you structure your code. The basic idea is that when you identify some place in the code which does something that can be described in a coherent, self-contained way, you should think about putting it into a function.
Pros:
Code reuse. If you perform some sequence of operations many times, why not write it once and use it many times?
Readability: it's much easier to understand strlen(st) than while (st[i++] != 0);
Correctness: the code in the previous line is actually buggy. If it is scattered around, you may not even notice the bug, and if you fix it in one place, it will remain somewhere else. But given this code inside a function named strlen, you will know what it should do, and you can fix it once.
Efficiency: sometimes, in certain situations, compilers may do a better job when compiling a code inside a function. You probably won't know it in advance, though.
Cons:
Splitting a code into functions just because it is A Good Thing is not a good idea. If you find it hard to give the function a good name (in your mother language, not only in C) it is suspicious. doThisAndThat() is probably two functions, not one. part1() is simply wrong.
Function call may cost you in execution time and stack memory. This is not as severe as it sounds, most of the time you should not care about it, but it's there.
When abused, it may lead to many functions doing partial work and delegating other parts from here to there. Too many arguments may impede readability too.
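To make the correctness point concrete: in `while (st[i++] != 0);` the post-increment also runs on the iteration that finds the terminator, so `i` ends up one past the string length. That is presumably the bug the answer has in mind. A minimal corrected sketch, named `string_length` here to avoid clashing with the standard `strlen`:

```c
#include <stddef.h>

/* Counts the characters before the terminating '\0'.
   The index is advanced only after a non-terminator has been seen. */
size_t string_length(const char *st)
{
    size_t i = 0;
    while (st[i] != '\0')
        i++;
    return i;
}
```

Once the loop lives in one named function, the fix is made in one place and every caller benefits.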
There are basically two types of functions: functions that do a sequence of operations (these are called "procedures" in some contexts), and functions that do some form of calculation. These two types are often mixed in a single function, but it helps to remember this distinction.
There is another distinction between kinds of functions: those that keep state (like strtok), those that may have side effects (like printf), and those that are "pure" (like sin). Functions like strtok are essentially a special case of a different construct, called an object in object-oriented programming.
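The difference between hidden state, explicit state, and pure functions can be sketched as follows. All names are invented for the illustration; the first function keeps its state in static storage in the style of strtok, the second receives the same state from the caller, which is roughly what an object does.

```c
/* Stateful in the style of strtok: the counter lives in hidden static storage. */
int next_id_hidden(void)
{
    static int last_id = 0;
    return ++last_id;
}

/* The same behaviour with the state made explicit, closer to an object. */
typedef struct {
    int last_id;
} id_generator;

int next_id(id_generator *gen)
{
    return ++gen->last_id;
}

/* Pure: the result depends only on the argument and nothing is modified. */
int square(int x)
{
    return x * x;
}
```

With the explicit version, two independent generators can coexist, which the hidden-state version cannot offer.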
You should use functions that perform one logical task each, at a level of abstraction that makes the function of each function easy to logically verify. For instance:
void create_ui() {
    create_window();
    show_window();
}
void create_window() {
    create_border();
    create_menu_bar();
    create_body();
}
void create_menu_bar() {
    for(int i = 0; i < N_MENUS; i++) {
        create_menu(menus[i]);
    }
    assemble_menus();
}
void create_menu(arg) {
    ...
}
Now, as far as creating a UI is concerned, this isn't quite the way one would do it (you would probably want to pass in and return various components), but the logical structure is what I'm trying to emphasize. Break your task down into a few subtasks, and make each subtask its own function.
Don't try to avoid functions for optimization. If it's reasonable to do so, the compiler will inline them for you; if not, the overhead is still quite minimal. The gain in readability you get from this is a great deal more important than any speed you might get from putting everything in a monolithic function.
As for your title question, "as much as possible," no. Within reason, enough to see what each function does at a comfortable level of abstraction, no less and no more.
One condition you can use: if part of the code will be reused or rewritten, then put it in a function.
I guess I think of functions like legos. You have hundreds of small pieces that you can put together into a whole. As a result of all of those well designed generic, small pieces you can make anything. If you had a single lego that looked like an entire house you couldn't then use it to build a plane, or train. Similarly, one huge piece of code is not so useful.
Functions are your bricks that you use when you design your project. Well chosen separation of functionality into small, easily testable, self contained "functions" makes building and looking after your whole project easy. Their benefits WAYYYYYYY out-weigh any possible efficiency issues you may think are there.
To be honest, the art of coding any sizeable project is in how you break it down into smaller pieces, so functions are key to that.

Compiler behavior?

I am reviewing some source code and I was wondering if the following was thread safe? I have heard of compiler or CPU instruction/read reordering (would it have something to do with branch prediction?) and the Data->unsafe_variable variable below can be modified at any time by another thread.
My question is: depending on how the compiler/CPU reorder read/writes, would it be possible that the below code would allow the Data->unsafe_variable to be fetched twice? (see 2nd snippet)
Note: I do not worry about the first access, any data can be there as long as it does not pass the 'if', I am just concerned by the possibility that the data would be fetched another time after the 'if'. I was also wondering if the cast into volatile here would help preventing a double fetch?
int function(void* Data) {
    // Data is allocated on the heap
    // What it contains at this point is not important
    size_t _varSize = ((volatile DATA *)Data)->unsafe_variable;
    if (_varSize > x * y)
    {
        return FALSE;
    }
    // I do not want Data->unsafe_variable to be fetched again once this point is reached,
    // I want to use the value "supposedly" stored in _varSize
    // Would any compiler/CPU reordering allow it to be double fetched?
    size_t size = _varSize - t * q;
    function_xy(size);
    return TRUE;
}
Basically I do not want the program to behave like this for security reasons:
_varSize = ((volatile DATA *)Data)->unsafe_variable;
if (_varSize > x * y)
{
    return FALSE;
}
size_t size = ((volatile DATA *)Data)->unsafe_variable - t * q;
function_xy(size);
I am simplifying here, and they cannot use a mutex. However, would it be safer to use _ReadWriteBarrier() or MemoryBarrier() after the first line instead of a volatile cast? (VS compiler)
Edit: Giving slightly more context to the code.
The code is broken for many reasons. I'll just point out one of the more subtle ones as others have pointed out the more obvious ones. The object is not volatile. Casting a pointer to a pointer to a volatile object doesn't make the object volatile, it just lies to the compiler.
But there's a much bigger point -- you are going about this totally the wrong way. You are supposed to be checking whether the code is correct, that is, whether it is guaranteed to work. You aren't clever enough, nobody is, to think of every possible way the system might fail to do what you assume it will do. So instead, just don't make those assumptions.
Thinking about things like CPU read re-ordering is totally wrong. You should expect the CPU to do what, and only what, it is required to do. You should definitely not think about specific mechanisms by which it might fail, but only whether it is guaranteed to work.
What you are doing is like trying to figure out if an employee is guaranteed to show up for work by checking if he had his flu shot, checking if he is still alive, and so on. You can't check for, or even think of, every possible way he might fail to show up. So if you find that you have to check those kinds of things, then it's not guaranteed and relying on it is broken. Period.
You cannot make reliable code by saying "the CPU doesn't do anything that can break this, so it's okay". You can make reliable code by saying "I make sure my code doesn't rely on anything that isn't guaranteed by the relevant standards."
You are provided with all the tools you need to do the job, including memory barriers, atomic operations, mutexes, and so on. Please use them.
You are not clever enough to think of every way something not guaranteed to work might fail. And you have a plethora of things that are guaranteed to work. Fix this code, and if possible, have a talk with the person who wrote it about using proper synchronization.
This sounds a bit ranty, and I apologize for that. But I've seen too much code that used "tricks" like this that worked perfectly on the test machines but then broke when a new CPU came out, a new compiler, or a new version of the OS. Fixing code like this can be an incredible pain because these hacks hide the actual synchronization requirements. The right answer is almost always to code clearly and precisely what you actually want, rather than to assume that you'll get it because you don't know of any reason you won't.
This is valuable advice from painful experience.
The standard(s) are clear. If any thread may be modifying the object, all accesses, in all threads, must be synchronized, or you have undefined behavior.
The only portable solution for C++ is C++11 atomics, which is available in upcoming VS 2012.
As for C, I do not know if recent C standards bring some portable facilities, as I am not following that, but since you are using Visual Studio, it does not matter anyway, as Microsoft is not implementing recent C standards.
Still, if you know you are developing for Visual Studio, you can rely on guarantees provided by this compiler, which apply to both C and C++. Some of them are implicit (accessing volatile variables implies also some memory barriers applied), some are explicit, like using _MemoryBarrier intrinsic.
The whole topic of the memory model is discussed in depth in Lockless Programming Considerations for Xbox 360 and Microsoft Windows, this should give you a good overview. Beware: the topic you are entering is full of hard topics and nasty surprises.
Note: Relying on volatile is not portable, but if you are using old C / C++ standards, there is no portable solution anyway, so be prepared to reimplement this for a different platform should the need ever arise. When writing portable threaded code, volatile is considered almost useless:
For multi-threaded programming, there are two key issues that volatile is often mistakenly thought to address:
atomicity
memory consistency, i.e. the order of a thread's operations as seen by another thread.
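For readers whose toolchain does provide C11 atomics, the "read once, then use only the local copy" intent of the question can be expressed with `<stdatomic.h>`. This is a hedged sketch, not a drop-in fix for the original code: the `shared_data` layout, the function name `checked_size`, and the parameters `limit` and `reserved` (standing in for `x * y` and `t * q`) are assumptions, and the extra lower-bound check is added here so the subtraction cannot wrap around.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout standing in for the question's DATA structure. */
typedef struct {
    _Atomic size_t unsafe_variable;   /* may be written by another thread */
} shared_data;

/* Reads the shared field exactly once, validates the local copy, and
   reports the derived size through out_size. */
bool checked_size(shared_data *data, size_t limit, size_t reserved, size_t *out_size)
{
    /* One atomic load; every later use refers to the local copy. */
    size_t var_size = atomic_load(&data->unsafe_variable);

    if (var_size > limit || var_size < reserved)
        return false;                 /* too large, or the subtraction would wrap */

    *out_size = var_size - reserved;
    return true;
}
```

The writer thread would have to use `atomic_store` on the same field; mixing atomic and plain accesses to one object brings back the undefined behavior the answers warn about.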

Is it okay to use functions to stay organized in C?

I'm a relatively new C programmer, and I've noticed that many conventions from other higher-level OOP languages don't exactly hold true on C.
Is it okay to use short functions to have your coding stay organized (even though it will likely be called only once)? An example of this would be 10-15 lines in something like void init_file(void), then calling it first in main().
I would have to say, not only is it OK, but it's generally encouraged. Just don't overly fragment the train of thought by creating myriads of tiny functions. Try to ensure that each function performs a single cohesive, well... function, with a clean interface (too many parameters can be a hint that the function is performing work which is not sufficiently separate from its caller).
Furthermore, well-named functions can serve to replace comments that would otherwise be needed. As well as providing re-use, functions can also (or instead) provide a means to organize the code and break it down into smaller units which can be more readily understood. Using functions in this way is very much like creating packages and classes/modules, though at a more fine-grained level.
Yes. Please. Don't write long functions. Write short ones that do one thing and do it well. The fact that they may only be called once is fine. One benefit is that if you name your function well, you can avoid writing comments that will get out of sync with the code over time.
If I can take the liberty to do some quoting from Code Complete:
(These reason details have been abbreviated and in spots paraphrased, for the full explanation see the complete text.)
Valid Reasons to Create a Routine
Note the reasons overlap and are not intended to be independent of each other.
Reduce complexity - The single most important reason to create a routine is to reduce a program's complexity (hide away details so you don't need to think about them).
Introduce an intermediate, understandable abstraction - Putting a section of code into a well-named routine is one of the best ways to document its purpose.
Avoid duplicate code - The most popular reason for creating a routine. Saves space and is easier to maintain (only have to check and/or modify one place).
Hide sequences - It's a good idea to hide the order in which events happen to be processed.
Hide pointer operations - Pointer operations tend to be hard to read and error prone. Isolating them into routines shifts focus to the intent of the operation instead of the mechanics of pointer manipulation.
Improve portability - Use routines to isolate nonportable capabilities.
Simplify complicated boolean tests - Putting complicated boolean tests into a function makes the code more readable because the details of the test are out of the way and a descriptive function name summarizes the purpose of the tests.
Improve performance - You can optimize the code in one place instead of several.
To ensure all routines are small? - No. With so many good reasons for putting code into a routine, this one is unnecessary. (This is the one thrown into the list to make sure you are paying attention!)
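As an illustration of the "Simplify complicated boolean tests" reason in the list above, here is a minimal sketch. The `visitor` record, its fields, and the age threshold are invented for the example.

```c
#include <stdbool.h>

/* Hypothetical record used only for this illustration. */
typedef struct {
    int  age;
    bool has_ticket;
    bool is_banned;
} visitor;

/* The name summarizes the purpose; the details stay out of the caller's way. */
bool may_enter(const visitor *v)
{
    return v->has_ticket && !v->is_banned && v->age >= 18;
}
```

A caller then reads `if (may_enter(&v))` instead of repeating the three-part condition, and a change to the rule is made in one place.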
And one final quote from the text (Chapter 7: High-Quality Routines)
One of the strongest mental blocks to creating effective routines is a reluctance to create a simple routine for a simple purpose. Constructing a whole routine to contain two or three lines of code might seem like overkill, but experience shows how helpful a good small routine can be.
If a group of statements can be thought of as a thing - then make them a function
I think it is more than OK, I would recommend it! Short, easy-to-prove-correct functions with well-thought-out names lead to code which is more self-documenting than long complex functions.
Any compiler worth using will be able to inline these calls to generate efficient code if needed.
Functions are absolutely necessary to stay organized. You need to first design the problem, and then depending on the different functionality you need to split them into functions. Some segment of code which is used multiple times, probably needs to be written in a function.
I think you should first think about the problem at hand, break it down into components, and try writing a function for each component. While writing a function, see whether some code segments do the same thing and, if so, move them into a sub-function; a sub-module is also a candidate for another function. At some point this breaking down should stop, and where is up to you. In general, avoid both too many overly large functions and too many overly small ones.
When constructing a function, please design it to have high cohesion and low coupling.
EDIT1::
You might also want to consider separate modules. For example, if you need a stack or a queue for some application, make it a separate module whose functions can be called from other functions. This way you avoid re-coding commonly used modules by programming them as a group of functions stored separately.
Yes
I follow a few guidelines:
DRY (aka DIE)
Keep Cyclomatic Complexity low
Functions should fit in a Terminal window
Each one of these principles at some point will require that a function be broken up, although I suppose #2 could imply that two functions with straight-line code should be combined. It's somewhat more common to do what is called method extraction than actually splitting a function into a top and bottom half, because the usual reason is to extract common code to be called more than once.
#1 is quite useful as a decision aid. It's the same thing as saying, as I do, "never copy code".
#2 gives you a good reason to break up a function even if there is no repeated code. If the decision logic passes a certain complexity threshold, we break it up into more functions that make fewer decisions.
It is indeed a good practice to refactor code into functions, irrespective of the language being used. Even if your code is short, it will make it more readable.
If your function is quite short, you can consider inlining it.
IBM Publib article on inlining

How bad is it to abandon THE rule in C (aka: return 0 on success)?

in a current project I dared to do away with the old 0 rule, i.e. returning 0 on success of a function. How is this seen in the community? The logic that I am imposing on the code (and therefore on the co-workers and all subsequent maintenance programmers) is:
>0: for any kind of success/fulfillment, that is, a positive outcome
==0: for signalling no progress or busy or unfinished, which is zero information about the outcome
<0: for any kind of error/infeasibility, that is, a negative outcome
Sitting in between a lot of hardware units with unpredictable response times in a realtime system, many of the functions need to convey exactly this ternary logic so I decided it being legitimate to throw the minimalistic standard return logic away, at the cost of a few WTF's on the programmers side.
Opinions?
PS: on a side note, the Roman empire collapsed because the Romans with their number system lacking the 0, never knew when their C functions succeeded!
"Your program should follow an existing convention if an existing convention makes sense for it."
Source: The GNU C Library
By deviating from such a widely known convention, you are creating a high level of technical debt. Every single programmer that works on the code will have to ask the same questions, every consumer of a function will need to be aware of the deviation from the standard.
http://en.wikipedia.org/wiki/Exit_status
I think you're overstating the status of this mythical "rule". Much more often, it's that a function returns a nonnegative value on success indicating a result of some sort (number of bytes written/read/converted, current position, size, next character value, etc.), and that negative values, which otherwise would make no sense for the interface, are reserved for signalling error conditions. On the other hand, some functions need to return unsigned results, but zero never makes sense as a valid result, and then zero is used to signal errors.
In short, do whatever makes sense in the application or library you are developing, but aim for consistency. And I mean consistency with external code too, not just your own code. If you're using third-party or library code that follows a particular convention and your code is designed to be closely coupled to that third-party code, it might make sense to follow that code's conventions so that other programmers working on the project don't get unwanted surprises.
And finally, as others have said, whatever your convention, document it!
It is fine as long as you document it well.
I think it ultimately depends on the customers of your code.
In my last system we used more or less the same coding system as yours, with "0" meaning "I did nothing at all" (e.g. calling Init() twice on an object). This worked perfectly well and everybody who worked on that system knew this was the convention.
However, if you are writing an API that can be sold to external customers, or writing a module that will be plugged into an existing, "standard-RC" system, I would advise you to stick to the 0-on-success rule, in order to avoid future confusion and possible pitfalls for other developers.
And as per your PS, when in Rome, do as the Romans do :-)
I think you should follow the Principle Of Least Astonishment
The POLA states that, when two elements of an interface conflict, or are ambiguous, the behaviour should be that which will least surprise the user; in particular a programmer should try to think of the behavior that will least surprise someone who uses the program, rather than that behavior that is natural from knowing the inner workings of the program.
If your code is for internal consumption only, you may get away with it, though. So it really depends on the people your code will impact :)
There is nothing wrong with doing it that way, assuming you document it in a way that ensures others know what you're doing.
However, as an alternative, it might be worth exploring the option to return an enumerated type defining the codes. Something like:
enum returnCode {
    SUCCESS, FAILURE, NO_CHANGE
};
That way, it's much more obvious what your code is doing, self-documenting even. But might not be an option, depending on your code base.
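One way to combine the enumerated type with the question's ternary convention is to give the enumerators explicit values, so callers that only test the sign keep working. This is a sketch under assumptions: the enumerator names and the `poll_device` function with its tick arguments are hypothetical.

```c
/* Hypothetical status codes that keep the question's sign convention. */
typedef enum {
    STATUS_ERROR   = -1,  /* < 0: error or infeasible      */
    STATUS_PENDING =  0,  /* == 0: busy, no outcome yet    */
    STATUS_DONE    =  1   /* > 0: success                  */
} status;

/* Hypothetical poll of a device whose answer is ready at tick ready_at. */
status poll_device(int now, int ready_at)
{
    if (now < 0 || ready_at < 0)
        return STATUS_ERROR;
    return now >= ready_at ? STATUS_DONE : STATUS_PENDING;
}
```

Callers can then compare against the names (`== STATUS_PENDING`) where readability matters and still use `< 0` or `> 0` checks where the sign alone is enough.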
It is a convention only. I have worked with many APIs that abandon the principle when they want to convey more information to the caller. As long as you're consistent with this approach, any experienced programmer will quickly pick up the standard. What is hard is when each function uses a different approach, as with the Win32 API.
In my opinion (and that's the opinion of someone who tends to do out-of-band error messaging thanks to working in Java), I'd say it is acceptable if your functions are of a kind that require strict return-value processing anyway.
So if the return value of your method has to be inspected at all points where it's called, then such a non-standard solution might be acceptable.
If, however, the return value might be ignored or just checked for success at some points, then the non-standard solution causes real problems (for example, you can no longer use the if(!myFunction()) ohNoesError(); idiom).
What is your problem? It is just a convention, not a law. If your logic makes more sense for your application, then it is fine, as long as it is well documented and consistent.
On Unix, exit status is unsigned, so this approach won't work if you ever have to run your program there, and this will confuse all your Unix programmers to no end. (I looked it up just now to make sure, and discovered to my surprise that Windows uses a signed exit status.) So I guess it will probably only mostly confuse your Windows programmers. :-)
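To make the Unix point concrete, here is a small sketch, assuming a POSIX system. The helper name `observed_exit_status` is made up for this example. It forks a child that exits with a given code and reports what the parent actually sees through `waitpid()`; only the low 8 bits of the code survive, so a negative code does not come back negative.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits with `code` and returns
   the exit status the parent observes, or -1 if fork/waitpid fails. */
int observed_exit_status(int code) {
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);              /* child: exit right away with the code */

    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    assert(WIFEXITED(status));    /* the child only ever calls _exit() */
    return WEXITSTATUS(status);   /* low 8 bits of the child's code */
}
```

With this helper, a child that exits with -1 is observed by the parent as 255, and one that exits with 256 is observed as 0, which would look like success.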
I'd find another method to pass status between processes. There are many to choose from, some quite simple. You say "at the cost of a few WTF's on the programmers side" as if that's a small cost, but it sounds like a huge cost to me. Re-using an int in C is a minuscule benefit to be gained from confusing other programmers.
You need to go on a case by case basis. Think about the API and what you need to return. If your function only needs to return success or failure, I'd say give it an explicit type of bool (C99 has a bool type now) and return true for success and false for failure. That way things like:
if (!doSomething())
{
// failure processing
}
read naturally.
In many cases, however, you want to return some data value, in which case some specific unused or unlikely to be used value must be used as the failure case. For example the Unix system call open() has to return a file descriptor. 0 is a valid file descriptor as is theoretically any positive number (up to the maximum a process is allowed), so -1 is chosen as the failure case.
In other cases, you need to return a pointer. NULL is an obvious choice for failure of pointer-returning functions. This is because it is highly unlikely to be valid and on most systems can't even be dereferenced.
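Both sentinel conventions can be seen side by side in a short sketch, assuming a POSIX system. `open()` and `fopen()` are the real library calls; the wrapper names `can_open_fd` and `can_open_stream` are invented for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if `path` can be opened for reading via open(), else 0.
   open() signals failure with -1 because 0 is a valid descriptor. */
int can_open_fd(const char *path) {
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return 0;
    close(fd);
    return 1;
}

/* Returns 1 if `path` can be opened for reading via fopen(), else 0.
   fopen() returns a pointer, so NULL is its failure value. */
int can_open_stream(const char *path) {
    assert(path != NULL);
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}
```

Note that neither failure check can be written as a plain `if (!result)` test against zero without thinking: for `open()` the comparison has to be against -1, which is exactly the kind of per-function knowledge the convention discussion is about.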
One of the most important considerations is whether the caller and the called function or program will be updated by the same person at any given time. If you are maintaining an API where a function will return the value to a caller written by someone who may not even have access to your source code, or when it is the return code from a program that will be called from a script, only violate conventions for very strong reasons.
You are talking about passing information across a boundary between different layers of abstraction. Violating the convention ties both the caller and the callee to a different protocol, increasing the coupling between them. If the different convention is fundamental to what you are communicating, you can do it. If, on the other hand, it is exposing the internals of the callee to the caller, consider whether you can hide the information.

What projects cannot be done in C?

I would like to know what projects cannot be done in C. I know programming can be quicker and more intuitive in other languages. But I would like to know what features are missing in C that would prevent a project from being completed well.
For example, very few web-frameworks exist in C.
C, like many other languages, is Turing Complete.
So simple answer is: none.
However, C++ Template Meta Programming meets the same criterion, so "it is possible" is not a good criterion to choose tools.
The very first C compiler?
A working solution to the halting problem
Alright, here's one: you cannot write an x86 boot sector in C. This is one of those things that has to be written in ASM.
There are none.
Different languages give you different ways to say things. For some classes of problems a given language may be more expressive and/or concise. Are there projects for which you should pick something other than C? Yes, of course. But to say you can't do it well in C is misleading. It would be better to ask which language is the best choice for the problem at hand, and whether the gains are worth using something unfamiliar.
Anything can be done in virtually any language.
That said there is a level of practicality. As your system's complexity increases, you need better tools to manage it.
The problems are still solvable, but you start to need more people and much more effort in design. I'm not saying other languages don't benefit from design, I'm saying that the same level and attention to detail may not be required.
Since we programmers are Human (I am at least) we have troubles in one area or another. My biggest is memory. If I can visualize my code as objects, manipulating large modules in my head becomes easier, and my brain can handle larger projects.
Of course, it's even possible to write good OO code in C; the patterns were developed in C by manually managing dispatch tables (tables of pointers with some pointers updated to point to different methods). The same is true of all programming constructs from higher-level languages: they can be done in any language, but...
If you were to implement objects in C, every single class you wrote would have a large amount of boilerplate overhead. If you made some form of exception handling, you would expose more boilerplate.
Higher-level languages abstract this boilerplate out of your code and into the system, simplifying what you have to think about and debug. A dispatch table in C could take a lot of debugging, but in C++ it isn't going to fail, because the code generated by a working compiler is going to be bug-free and hidden; you never see it.
I guess I'd say that's the biggest (only?) difference between low-level and higher-level languages: how much boilerplate you hide. The latest batch of dynamic languages is really into hiding loop constructs within the language, so more things look like:
directory.forEachFile(print file.name); // Not any real language
In C, even if you isolated part of the looping inside a function, setting up the function pointers and stuff would still take lines of non-obvious code that is not solving part of your primary problem.
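As a rough illustration, a C counterpart of the one-liner could be sketched like this, assuming a POSIX system with `<dirent.h>`. The names `for_each_file`, `file_fn`, `print_name`, and `count_name` are invented for this example, and the sketch visits every directory entry, not only regular files.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Callback type: called once per directory entry name. The ctx pointer
   stands in for the state a closure would capture in other languages. */
typedef void (*file_fn)(const char *name, void *ctx);

/* Calls fn for every entry in `path` except "." and "..".
   Returns the number of entries visited, or -1 if the directory
   could not be opened. */
int for_each_file(const char *path, file_fn fn, void *ctx) {
    assert(path != NULL && fn != NULL);
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        fn(entry->d_name, ctx);
        count++;
    }
    closedir(dir);
    return count;
}

/* Example callback: prints each name, ignoring the context. */
void print_name(const char *name, void *ctx) {
    (void)ctx;
    puts(name);
}

/* Example callback: counts entries through an int passed as ctx. */
void count_name(const char *name, void *ctx) {
    (void)name;
    ++*(int *)ctx;
}
```

The call site is then `for_each_file(".", print_name, NULL);`, which is close to the one-liner, but the typedef, the `void *ctx` plumbing, and the separately defined callback are all code that has nothing to do with the actual task of printing file names.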
There is not a single algorithm that cannot be written with C.
Depends on how much you want to invest (time/money/energy) to make it happen. Otherwise, I'd say there aren't any. It is just easier sometimes to use something else.
OS kernels have been written in C, and everything runs on top of them, so you can write everything in C.
A boot sector that needs ASM :-) , but I don't think you meant that.

Resources