When I read open source code (Linux C code), I see that many functions are used instead of performing all operations in main(), for example:
int main(void) {
    function1();
    return 0;
}

void function1() {
    // do something
    function2();
}

void function2() {
    function3();
    // do something
    function4();
}

void function3() {
    // do something
}

void function4() {
    // do something
}
Could you tell me the pros and cons of using functions as much as possible?
easy to add/remove functions (or new operations)
readability of the code
source efficiency(?) as the variables in the functions will be destroyed (unless dynamic allocation is done)
would the nested function slow the code flow?
Easy to add/remove functions (or new operations)
Definitely - it's also easy to see where the context for an operation starts and finishes. That's much easier to see than some arbitrary range of lines in the source.
Readability of the code
You can overdo it. There are cases where having a function or not makes no difference in line count, but does in readability - and whether that's a positive depends on the person.
For example, if you did lots of set-bit operations, would you make:
some_variable = some_variable | (1 << bit_position);
a function? Would it help?
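For concreteness, here is a minimal sketch of what that wrapper might look like (set_bit is a hypothetical name); whether it reads better than the raw expression is exactly the judgment call in question:

#include <stdint.h>

/* Hypothetical helper: returns value with the given bit set. */
static inline uint32_t set_bit(uint32_t value, unsigned bit_position) {
    return value | (1u << bit_position);
}

/* Usage: flags = set_bit(flags, 3); versus flags |= 1u << 3; */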
Source efficiency(?) due to the variables in the functions being destroyed (unless dynamic allocation is done)
If the source is reasonable (as in, you're not reusing variable names past their real context), then it shouldn't matter. The compiler should know exactly where a value's usage stops and where it can be ignored or destroyed.
Would the nested function slow the code flow?
In some cases where address aliasing cannot be properly determined, it could. But it shouldn't matter in practice in most programs. By the time it starts to matter, you're probably going through your application with a profiler and spotting problematic hotspots anyway.
Compilers are quite good at inlining functions these days. You can trust them to do at least a decent job of eliminating the cases where the calling overhead is comparable to the length of the function itself (and many other cases besides).
This practice becomes really important as the amount of code you write increases. Separating code out into functions improves code hygiene and makes it easier to read. I read somewhere that there is little point to code that is readable only by you (in some situations that's okay, I'm assuming). If you want your code to live on, it must be maintainable, and maintainability is created, in the simplest sense, by writing functions. Also, imagine a code-base that exceeds 100k lines - that's quite common - and imagine having it all in main(). That would be an absolute nightmare to maintain. Dividing the code into functions creates degrees of separation so that many developers can work on different parts of the code-base. So the short answer is yes: it is good to use functions when necessary.
Functions should help you structure your code. The basic idea is that when you identify some place in the code which does something that can be described in a coherent, self-contained way, you should think about putting it into a function.
Pros:
Code reuse. If you perform some sequence of operations many times, why not write it once and use it many times?
Readability: it's much easier to understand strlen(st) than while (st[i++] != 0);
Correctness: the code in the previous line is actually buggy. If it is scattered around, you may not even notice the bug, and if you fix it in one place, it will remain broken everywhere else. But with this code inside a function named strlen, you know what it should do, and you can fix it once (see the sketch after this list).
Efficiency: sometimes, in certain situations, compilers may do a better job when compiling code inside a function. You probably won't know it in advance, though.
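To illustrate the correctness point above, a minimal sketch of that loop fixed once behind a descriptive name (my_strlen is a hypothetical stand-in for the real strlen):

#include <stddef.h>

/* while (st[i++] != 0); leaves i one past the length; wrapped in a
   function, the fix lives in exactly one place. */
size_t my_strlen(const char *st) {
    size_t i = 0;
    while (st[i] != '\0')
        i++;
    return i;
}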
Cons:
Splitting code into functions just because it is A Good Thing is not a good idea. If you find it hard to give the function a good name (in your mother tongue, not only in C), that's suspicious. doThisAndThat() is probably two functions, not one. part1() is simply wrong.
A function call may cost you execution time and stack memory. This is not as severe as it sounds - most of the time you should not care about it - but it's there.
When abused, it may lead to many functions doing partial work and delegating the rest from here to there. Too many arguments may impede readability too.
There are basically two kinds of functions: functions that perform a sequence of operations (these are called "procedures" in some contexts), and functions that perform some form of calculation. The two are often mixed in a single function, but it helps to remember the distinction.
There is another distinction between kinds of functions: those that keep state (like strtok), those that may have side effects (like printf), and those that are "pure" (like sin). Functions like strtok are essentially a special case of a different construct, called an Object in Object-Oriented Programming.
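A minimal sketch of the three kinds, with hypothetical names:

#include <stdio.h>

/* Keeps state between calls (like strtok). */
int next_id(void) {
    static int id = 0;   /* hidden state survives across calls */
    return ++id;
}

/* Has a side effect (like printf). */
void log_value(double v) {
    printf("value = %f\n", v);
}

/* "Pure" (like sin): the result depends only on its argument. */
double half(double x) {
    return x / 2.0;
}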
You should use functions that each perform one logical task, at a level of abstraction that makes the purpose of each function easy to verify logically. For instance:
void create_ui() {
    create_window();
    show_window();
}

void create_window() {
    create_border();
    create_menu_bar();
    create_body();
}

void create_menu_bar() {
    for (int i = 0; i < N_MENUS; i++) {
        create_menu(menus[i]);
    }
    assemble_menus();
}

void create_menu(menu_t *menu) {   /* menu_t: illustrative type */
    ...
}
Now, as far as creating a UI is concerned, this isn't quite the way one would do it (you would probably want to pass in and return various components), but the logical structure is what I'm trying to emphasize. Break your task down into a few subtasks, and make each subtask its own function.
Don't try to avoid functions for optimization's sake. If inlining is reasonable, the compiler will do it for you; if not, the overhead is still quite minimal. The readability you gain is a great deal more important than any speed you might get by putting everything into one monolithic function.
As for your title question, "as much as possible," no. Within reason, enough to see what each function does at a comfortable level of abstraction, no less and no more.
One rule of thumb you can use: if part of the code will be reused or rewritten, put it in a function.
I guess I think of functions like legos. You have hundreds of small pieces that you can put together into a whole. As a result of all of those well designed generic, small pieces you can make anything. If you had a single lego that looked like an entire house you couldn't then use it to build a plane, or train. Similarly, one huge piece of code is not so useful.
Functions are the bricks you use when you design your project. A well-chosen separation of functionality into small, easily testable, self-contained functions makes building and looking after your whole project easy. Their benefits WAYYYYYYY outweigh any possible efficiency issues you may think are there.
To be honest, the art of coding any sizeable project is in how you break it down into smaller pieces, so functions are key to that.
Related
I am working on an embedded architecture where ASM is predominant. I would like to refactor most of our legacy ASM code into C in order to increase readability and modularity.
I am still puzzling over minor details which cause my hopes to vanish. The real problem is far more complex than the following example, but I would like to share it as an entry point to the discussion.
My goal is to find an optimal workaround.
Here is the original example (do not worry about what the code does; I wrote this randomly just to illustrate the issue I would like to discuss):
int foo;
int bar;
int tmp;
int sum;

void do_something() {
    tmp = bar;
    bar = foo + bar;
    foo = foo + tmp;
}

void compute_sum() {
    for (tmp = 1; tmp < 3; tmp++)
        sum += foo * sum + bar * sum;
}

void a_function() {
    compute_sum();
    do_something();
}
With this dummy code, anyone would immediately remove all the global variables and replace them with local ones:
void do_something(int *a, int *b) {
    int tmp = *b;
    *b = *a + *b;
    *a = *a + tmp;
}

void compute_sum(int *sum, int *foo, int *bar) {
    int tmp;
    for (tmp = 1; tmp < 3; tmp++)
        *sum += *foo * *sum + *bar * *sum;
}

void a_function(int *sum, int *foo, int *bar) {
    compute_sum(sum, foo, bar);
    do_something(foo, bar);
}
Unfortunately this rework is worse than the original code, because all the parameters are pushed onto the stack, which leads to latency and larger code size.
The everything-global solution is both the best and the ugliest solution, especially when the source code is about 300k lines long with almost 3000 global variables.
Here we are not facing a compiler problem but a structural issue. Writing beautiful, portable, readable, modular and robust code will never pass the ultimate performance test, because compilers are dumb, even in 2015.
An alternative is to prefer inline functions. Unfortunately these functions have to live in a header file, which is also ugly.
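For reference, the usual shape of that workaround is a static inline function in a header; a minimal sketch using the do_something example from above (the header name is hypothetical). Every translation unit gets its own inlinable copy, so there is no extern call for the compiler to pessimize:

/* ops.h - hypothetical header */
#ifndef OPS_H
#define OPS_H

static inline void do_something_inline(int *a, int *b) {
    int tmp = *b;
    *b = *a + *b;
    *a = *a + tmp;
}

#endif /* OPS_H */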
A compiler cannot see beyond the file it is working on. When a function is marked extern, this irrevocably leads to performance issues, because the compiler cannot make any assumptions about the external declarations.
Going the other way, the linker could do the job and ask the compiler to rebuild object files by giving additional information to the compiler. Unfortunately not many compilers offer such features, and when they do, they considerably slow down the build process.
I eventually came across this dilemma:
1. Keep the code ugly to preserve performance:
   - everything global
   - functions without parameters (same as procedures)
   - everything kept in the same file
2. Follow standards and write clean code:
   - think in modules
   - write small but numerous functions with well-defined parameters
   - write small but numerous source files
What should I do when the target architecture has limited resources? Going back to assembly is my last option.
Additional Information
I am working on a SHARC architecture, which is a quite powerful Harvard CISC architecture. Unfortunately one code instruction takes 48 bits while a long takes only 32 bits. Given that, it is better to keep two versions of a variable rather than evaluate the second value on the fly:
The optimized example:
int foo;
int bar;
int half_foo;    /* cached copy of foo >> 1, kept up to date elsewhere */

void example_a() {
    write(foo);
    write(half_foo + bar);
}
The bad one:
void example_a(int foo, int bar) {
    write(foo);
    write(bar + (foo >> 1));    /* foo >> 1 recomputed on the fly */
}
Ugly C code is still a lot more readable than assembler. In addition, it's likely that you'll net some unexpected free optimizations.
"A compiler cannot see beyond the file it is working on. When a function is marked extern, this irrevocably leads to performance issues, because the compiler cannot make any assumptions about the external declarations."
False and false. Have you tried "Whole Program Optimization" yet? The benefits of inline functions, without having to organize into headers. Not that putting things in headers is necessarily ugly, if you organize the headers.
In your VisualDSP++ compiler, this is enabled by the -ipa switch.
The ccts compiler has a capability called interprocedural analysis (IPA), a mechanism that allows the compiler to optimize across translation units instead of within just one translation unit. This capability effectively allows the compiler to see all of the source files that are used in a final link at compilation time and make use of that information when optimizing. All of the -ipa optimizations are invoked after the initial link, whereupon a special program called the prelinker reinvokes the compiler to perform the new optimizations.
I'm used to working in performance-critical core/kernel-type areas with very tight requirements, where it is often wise to take the optimizer and standard library performance with a grain of salt (for example, not getting too excited about the speed of malloc or auto-generated vectorization).
However, I've never had requirements so tight that the number of instructions, or the cost of pushing a few more arguments onto the stack, was a considerable concern. If it is indeed a major concern for the target system and performance tests are failing, one thing to note is that performance tests modeled at a micro level of granularity tend to leave you obsessed with the smallest of micro-efficiencies.
Micro-Efficiency Performance Tests
At a former workplace we made the mistake of writing all kinds of superficial micro-level tests, timing something as basic as reading one 32-bit float from a file. Meanwhile, we made optimizations that significantly sped up the broad, real-world test cases associated with reading and parsing the contents of entire files while, at the same time, some of those uber-micro tests actually got slower for no apparent reason (they weren't even directly modified, but changes to the code around them may have had some indirect impact relating to dynamic factors like caches, paging, etc., or merely how the optimizer treated such code).
So the micro-level world can get a bit more chaotic when you work with a higher-level language than assembly. The performance of the teeny things can shift under your feet a bit, but you have to ask yourself what's more important: a slight decrease in the performance of reading one 32-bit float from a file, or having real-world operations that read from entire files go significantly faster. Modeling your performance tests and profiling sessions at a higher level will give you room to selectively and productively optimize the parts that really matter. There you have many ways to skin a cat.
Run a profiler on an ultra-granular operation executed a million times in a row and you will have already backed yourself into an assembly-type micro-corner, just by the nature of how you are profiling the code. So you really want to zoom out a bit and test things at a coarser level, so that you can act like a disciplined sniper and home in on the micro-efficiency of very select parts, dispatching the leaders behind the inefficiencies rather than trying to be a hero taking out every little insignificant foot soldier that might be a performance obstacle.
Optimizing Linker
One of your misconceptions is that only the compiler can act as an optimizer. Linkers can perform a variety of optimizations when linking object files together, including inlining code. So there should rarely, if ever, be a need to jam everything into a single object file as an optimization. I'd try looking more into the settings of your linker if you find otherwise.
Interface Design
With these things aside, the key to a maintainable, large-scale codebase lies more in interface (i.e., header files) than implementation (source files). If you have a car with an engine that goes a thousand miles per hour, you might peer under the hood and find that there are little fire-breathing demons dancing around to allow that to happen. Perhaps there was a pact involved with demons to get such speed. But you don't have to expose that fact to the people driving the car. You can still give them a nice set of intuitive, safe controls to drive that beast.
So you might have a system that makes uninlined function calls 'expensive', but expensive relative to what? If you are calling a function that sorts a million elements, the relative cost of pushing a few small arguments to the stack like pointers and integers should be absolutely trivial no matter what kind of hardware you're dealing with. Inside the function, you might do all sorts of profiler-assisted things to boost performance like macros to forcefully inline code no matter what, perhaps even some inlined assembly, but the key to keeping that code from cascading its complexity throughout your system is to keep all that demon code hidden away from the people who are using your sort function and to make sure it's well-tested so that people don't have to constantly pop the hood trying to figure out the source of a malfunction.
Ignoring that 'relative to what?' question and only focusing on absolutes is also what leads to the micro-profiling which can be more counter-productive than helpful.
So I'd suggest looking at this more from a public interface design level, because behind an interface, if you look behind the curtains/under the hood, you might find all kinds of evil things going on to get that needed edge in performance in hotspot areas shown in a profiler. But you shouldn't need to pop the hood very often if your interfaces are well-designed and well-tested.
Globals become a bigger problem the wider their scope. If you have globals defined statically with internal linkage inside a source file that no one else can access, then those are actually rather 'local' globals. If thread-safety isn't a concern (if it is, you should avoid mutable globals as much as possible), then you might have a number of performance-critical areas in your codebase where, if you peer under the hood, you find a lot of file-scope static variables used to mitigate the overhead of function calls. That's still a whole lot easier to maintain than assembly, especially when the visibility of such globals is reduced by smaller and smaller source files dedicated to more singular, clear responsibilities.
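A minimal sketch of such 'local' globals, reusing the earlier foo/bar example (the file split is hypothetical):

/* inverter_math.c - internal linkage keeps these invisible to other files. */
static int foo;
static int bar;

void do_something(void) {   /* the only sanctioned way to touch the state */
    int tmp = bar;
    bar = foo + bar;
    foo = foo + tmp;
}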
I have designed/written/tested/documented many many real time embedded systems.
Both 'soft' real time and 'hard' real time.
I can tell you from hard earned experience that the algorithm used to implement the application is the place to make the biggest gains in speed.
Little stuff like a function call versus inlined code is trivial unless performed thousands (or even hundreds of thousands) of times.
I'm a relatively new C programmer, and I've noticed that many conventions from other, higher-level OOP languages don't exactly hold true in C.
Is it okay to use short functions to keep your code organized, even though a function will likely be called only once? An example of this would be 10-15 lines in something like void init_file(void), called first in main().
I would have to say that not only is it OK, it's generally encouraged. Just don't overly fragment the train of thought by creating myriads of tiny functions. Try to ensure that each function performs a single cohesive, well... function, with a clean interface (too many parameters can be a hint that a function is performing work which is not sufficiently separate from its caller).
Furthermore, well-named functions can serve to replace comments that would otherwise be needed. As well as providing re-use, functions can also (or instead) provide a means to organize the code and break it down into smaller units which can be more readily understood. Using functions in this way is very much like creating packages and classes/modules, though at a more fine-grained level.
Yes. Please. Don't write long functions. Write short ones that do one thing and do it well. The fact that they may only be called once is fine. One benefit is that if you name your function well, you can avoid writing comments that will get out of sync with the code over time.
If I can take the liberty to do some quoting from Code Complete:
(These reason details have been abbreviated and in spots paraphrased, for the full explanation see the complete text.)
Valid Reasons to Create a Routine
Note the reasons overlap and are not intended to be independent of each other.
Reduce complexity - The single most important reason to create a routine is to reduce a program's complexity (hide away details so you don't need to think about them).
Introduce an intermediate, understandable abstraction - Putting a section of code into a well-named routine is one of the best ways to document its purpose.
Avoid duplicate code - The most popular reason for creating a routine. Saves space and is easier to maintain (only have to check and/or modify one place).
Hide sequences - It's a good idea to hide the order in which events happen to be processed.
Hide pointer operations - Pointer operations tend to be hard to read and error prone. Isolating them into routines shifts focus to the intent of the operation instead of the mechanics of pointer manipulation.
Improve portability - Use routines to isolate nonportable capabilities.
Simplify complicated boolean tests - Putting complicated boolean tests into a function makes the code more readable because the details of the test are out of the way and a descriptive function name summarizes the purpose of the tests (a minimal sketch follows this list).
Improve performance - You can optimize the code in one place instead of several.
To ensure all routines are small? - No. With so many good reasons for putting code into a routine, this one is unnecessary. (This is the one thrown into the list to make sure you are paying attention!)
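As a minimal sketch of the 'simplify complicated boolean tests' item (all names here are hypothetical):

#include <stdbool.h>
#include <stddef.h>

struct document { int page_count; bool is_locked; };

/* The call site now reads as intent:
   if (document_is_printable(doc)) print_document(doc); */
static bool document_is_printable(const struct document *doc) {
    return doc != NULL && doc->page_count > 0 && !doc->is_locked;
}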
And one final quote from the text (Chapter 7: High-Quality Routines):
"One of the strongest mental blocks to creating effective routines is a reluctance to create a simple routine for a simple purpose. Constructing a whole routine to contain two or three lines of code might seem like overkill, but experience shows how helpful a good small routine can be."
If a group of statements can be thought of as a thing - then make them a function
I think it is more than OK - I would recommend it! Short, easy-to-verify functions with well-thought-out names lead to code which is more self-documenting than long, complex functions.
Any compiler worth using will be able to inline these calls to generate efficient code if needed.
Functions are absolutely necessary for staying organized. You need to first design the problem and then, depending on the functionality you need, split it into functions. A segment of code which is used multiple times probably needs to be written as a function.
First think about the problem at hand, break it down into components, and try writing a function for each component. When writing a function, check whether some code segment does the same thing elsewhere; if so, break it into a sub-function, and a sub-module is also a candidate for another function. But at some point this decomposition should stop, and where it stops is up to you. In general, do not make functions too big, and do not make too many that are too small.
When constructing functions, design for high cohesion and low coupling.
EDIT:
You might also want to consider separate modules. For example, if you need a stack or queue for some application, make it a separate module whose functions can be called from other functions. This way you avoid re-coding commonly used modules by programming them once as a group of functions stored separately.
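A minimal sketch of such a module (names hypothetical; bounds checking omitted for brevity):

/* stack.h - callers see only this interface */
#ifndef STACK_H
#define STACK_H
#include <stdbool.h>

void stack_push(int value);
int  stack_pop(void);
bool stack_is_empty(void);

#endif /* STACK_H */

/* stack.c - the state is file-scope static, hidden from callers */
#include "stack.h"

static int items[64];
static int top = 0;

void stack_push(int value) { items[top++] = value; }
int  stack_pop(void)       { return items[--top]; }
bool stack_is_empty(void)  { return top == 0; }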
Yes
I follow a few guidelines:
DRY (Don't Repeat Yourself, aka DIE: Duplication Is Evil)
Keep Cyclomatic Complexity low
Functions should fit in a Terminal window
Each one of these principles at some point will require that a function be broken up, although I suppose #2 could imply that two functions with straight-line code should be combined. It's somewhat more common to do what is called method extraction than actually splitting a function into a top and bottom half, because the usual reason is to extract common code to be called more than once.
#1 is quite useful as a decision aid. It's the same thing as saying, as I do, "never copy code".
#2 gives you a good reason to break up a function even if there is no repeated code. If the decision logic passes a certain complexity threshold, we break it up into more functions that make fewer decisions.
It is indeed a good practice to refactor code into functions, irrespective of the language being used. Even if your code is short, it will make it more readable.
If your function is quite short, you can consider inlining it.
IBM Publib article on inlining
As a beginner, I read everywhere to avoid excessive use of global variables. Well, how do I do so? My low skill level fails me. I end up passing tons of structures, and it is harder to read than using globals. Any tips on working through this problem of application structure design?
Depending on what your variables are doing, global scope might be the best scope. (Think flags to signal that an interrupt has arrived, and should be handled at a convenient time in the middle of a compute loop.)
Small utility programs can often feel much cleaner by using global variables (I'm thinking especially of small language parsers); but this makes it much harder to integrate the small utility programs into larger programs in the future. There are always trade-offs.
But chances are good the "correct" data organization will not feel quite so cumbersome. If you post code here, someone may be able to suggest cleaner layout, but the real problems come when code grows beyond easily-understood small samples.
I have a LOT of favorite programming style books, but I think the best I know of to address this situation is The Elements of Programming Style, by Kernighan and Plauger. It's quite old, and difficult to find, but short, sweet, and well worth finding used somewhere.
It's not as short, and it's not as sweet, but Code Complete, 2nd Edition, is still well worth finding. It's much more detailed, provides much more code, and covers much more of the diversity involved in designing software. It's excellent, but might be more intimidating.
There's nothing like studying the masters: the code in Advanced Programming in the Unix Environment, 2nd Edition is phenomenal, well worth every hour of study.
And, of course, there's always experience, but that takes time to acquire. Learning lessons from your own mistakes tends to stick much stronger than learning lessons from other people's mistakes. So keep at it. :)
I'd suggest Structured Design by Yourdon and Constantine. An old book by computer standards (it has examples involving tapes!) but very sound on the problems you are having.
Here are two options that you could use to improve your situation:
For read-only structures, have functions that can control access to the data with a const pointer:
struct my_struct;
const struct my_struct *GetMyStruct(void);
Limit the exposure of a global structure by declaring it static. This way it will only have file scope:
static struct my_struct myStructInstance;
If your program is the sort of "small" project where global variables don't feel so bad, but you think you might need to integrate it into a larger project in the future, a very simple solution is to add a single context-pointer argument to each function and store all your "global" variables in there. If you always name it the same thing, you can even do stuff like:
#define current_filename context->current_filename
#define option_flags context->option_flags
etc., and your code will look virtually identical to how it would have looked with globals, except that you'll be able to have multiple instances of it in a single program, integrate it into a library, and so on with minimal fuss. Just keep those defines in a private header used by your source modules, not in the public interface header.
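A minimal sketch of the pattern (the struct fields and parse_line are hypothetical):

#include <stdio.h>

struct context {
    const char *current_filename;
    unsigned    option_flags;
};

/* In the private header: */
#define current_filename context->current_filename
#define option_flags     context->option_flags

/* Reads like code written against globals, but takes a context. */
static int parse_line(struct context *context, const char *line) {
    if (option_flags & 0x1)            /* expands to context->option_flags */
        printf("%s: %s\n", current_filename, line);
    return 0;
}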
@PeterK The problem is that a structure is always presented in C books as a container that can be declared/passed many times to different functions. That confused me: I never thought to use one as a simple single-instance global container (which may make my code more readable).
I am writing a 3-phase motor control application to control one motor.
Based on everything you wrote, please check whether my current ideas for solving the problem are right:
Pack global information into structures grouped by function, e.g. sInverterState, sButtonsState, sInverterParameters, etc.
If I write a menu UI, I can use static variables in the C file and not worry about passing structs, since I have only one LCD. I don't want to make it look like GTK++.
Writing reentrant code is not for me yet, and it's overkill for this purpose.
Get a proper education in the IT field.
I may end up with lots of globals, but at least they'll be nicely packed and readable.
I often hear people praise languages, frameworks, constructs, etc. for being "explicit". I'm trying to understand this logic. The purpose of a language, framework, etc. is to hide complexity. If it makes you specify all kinds of details explicitly, then it's not hiding much complexity, only moving it around. What's so great about explicitness and how do you make a language/framework/API "explicit" while still making it serve its purpose of hiding complexity?
Whether you should be explicit or implicit depends on the situation. You are correct that often you are trying to hide complexity, and having certain things done for you automatically behind the scenes is good: encapsulation, etc.
Sometimes, though, frameworks or constructs hide things from us that they should not, and this makes things less clear. Sometimes certain information or settings are hidden from us, so we don't know what's happening. Assumptions are made that we don't understand and can't discover. Behaviors happen that we can't predict.
Encapsulation: good. Hiding: bad. Making the right call takes experience. Where logic belongs, it should be explicit.
Example: I once removed about 90 lines of code from a series of a dozen code-behind pages; data access code, business logic, etc., that did not belong there. I moved them to base pages and the key business object. This was good (encapsulation, separation of concerns, code organization, decoupling, etc.).
I then excitedly realized that I could remove the last line of code from many of these pages, moving it to the base page. It was a line that took a parameter from the URL and passed it to the business object. Good, right? Well, no, this was bad (I was hiding). This logic belonged there, even though it was almost the same line on every page. It linked the UI intention to the business object. It needed to be explicit. Otherwise I was hiding, not encapsulating. With that line, someone looking at a page would know what it did and why; without it, it would be a pain to determine what was going on.
I believe that explicit refers to knowing exactly what it is doing when you use it. That is different from knowing exactly how it's done, which is the complex part.
It's not so much that explicit is good (certainly the closely-related verbose is bad) as that when implicit goes wrong, it's so hard to tell WTF is going on.
Hack C++ for a decade or two and you'll understand exactly what I mean.
It is about expressing intentions. The reader can't tell if the default was left by mistake or by design. Being explicit removes that doubt.
Code is harder to read than to write. In nontrivial applications, a given piece of code will also be read more often than it is written. Therefore, we should write our code to make it as easy on the reader as possible. Code that does a lot of stuff that isn't obvious is not easy to read (or rather, it's hard to understand when you read it). Ergo, explicitness is considered a good thing.
Relying on default behaviour hides important details from people who aren't intimately familiar with the language/framework/whatever.
Consider how Perl code which relies extensively on shorthands is difficult to understand for people who don't know Perl.
Being explicit vs. implicit is all about what you hide, and what you show.
Ideally, you expose concepts that either the user cares about, or has to care about (whether they want to or not).
The advantage of being explicit is that it's easier to track down and find out what's going on, especially in case of failure. For instance, if I want to do logging, I can have an API that requires explicit initialization with a directory for the log. Or, I can use a default.
If I give an explicit directory, and it fails, I'll know why. If I use an implicit path, and it fails, I will have no idea of what has gone wrong, why, or where to look to fix it.
Implicit behavior is almost always a result of hiding information from the consumer. Sometimes that's the right thing to do, such as when you know in your environment there's only one "answer". However, it's best to know when you're hiding information and why, and ensure that you're letting your consumers work closer to their level of intent, and without trying to hide items of essential complexity.
Frequently implicit behavior is a result of "self-configuring" objects that look at their environment and try to guess the correct behavior. I'd avoid this pattern in general.
One rule I'd probably follow overall is that, for a given API, any operation should either be explicit, or implicit, but never a combination. Either make the operation something the user has to do, or make it something they don't have to think about. It's when you mix those two that you will run into the biggest problems.
Frameworks, etc., can be both explicit and hide complexity by offering the right abstractions for the job to be done.
Being explicit allows others to inspect and understand what is meant by the original developer.
Hiding complexity is not equivalent to being implicit. Implicitness results in code that is only understandable by the person who wrote it, since trying to understand what goes on under the hood is akin to reverse engineering in this case.
Explicit code has a theoretical chance of being proved correct. Implicit code never stands a chance in this respect.
Explicit code is maintainable; implicit code is not. This ties in with providing correct comments and choosing your identifiers with care.
An "explicit" language allows the computer to find bugs in software that a less-explicit language does not.
For example, C++ has the const keyword for variables whose values should never change. If a program tries to change such a variable, the compiler can report that the code is likely wrong.
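A minimal sketch of the idea (the function is hypothetical):

#include <stdio.h>

void print_total(const double *total) {
    /* *total = 0.0;  <-- uncommenting this is a compile error:
       assignment of read-only location. Without const, this
       likely-unintended write would compile silently. */
    printf("total = %f\n", *total);
}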
Good abstraction doesn't hide complexities, it takes decisions that are best left to the compiler off of your plate.
Consider garbage collection: The complexities of releasing resources are delegated to a garbage collector which is (presumably) better qualified to make a decision than you, the programmer. Not only does it take the decision off your hands, but it makes a better decision than you would have yourself.
Explicitness is (sometimes) good because it ensures that certain decisions which are better left to the programmer are not made automatically by a less qualified agent. A good example is declaring a floating-point variable in a C-style language and initializing it with an integer:
double i = 5.0;
if instead you were to declare it as
var i = 5;
the compiler would rightfully assume you want an int, and operations later on would be truncated.
Explicitness is desirable in the context of making it clear to the reader of your code what you intended to do.
There are many examples, but it's all about leaving no doubt about your intent.
e.g. These are not very explicit:
while (condition);
int MyFunction()
bool isActive; // In C# we know this is initialised to 0 (false)
a = b??c;
double a = 5;
double angle = 1.57;
but these are:
while (condition)
/* this loop does nothing but wait */ ;
private int MyFunction()
bool isActive = false; // Now you know I really meant this to default to false
if (b != null) a = b; else a = c;
double a = 5.0;
double angleDegrees = 1.57;
The latter cases leave no room for misinterpretation. The former might lead to bugs when someone fails to read them carefully, or doesn't clearly understand a less readable syntax for doing something, or mixes up integer and float types.
In some cases the opposite is "magic" - as in "then a miracle occurs".
When a developer's reading code trying to understand or debug what's going on, explicitness can be a virtue.
The purpose of a framework moving things around is to remove duplication in code and to allow easier editing of chunks without breaking the whole thing.
When you have only one way of doing something, like say SUM(x, y);
we know exactly what it is going to do, there is no reason to ever rewrite it, and if you must, you can, but it's highly unlikely.
The opposite of that is a platform like .NET, which provides very complex functions that you often need to rewrite if you're doing anything but the obvious simple example.
I am writing an academic project about extremely long functions in the Linux kernel.
For that purpose, I am looking for examples of real-life functions that are extremely long (a few hundred lines of code) that you don't consider bad programming (i.e., they wouldn't benefit from decomposition or the use of a dispatch table).
Have you ever written or seen such a code? Can you post or link to it, and give explanation of why is it so long?
I have been getting amazing help from the community here - any idea that will be taken into the project will be properly credited.
Thanks,
Udi
The longest functions that I have ever written all have one thing in common: a very large switch statement. There are times when you have to switch on a long list of items, and it would only make things harder to understand if you tried to refactor some of the options into separate functions. Large switch statements make the cyclomatic complexity go through the roof, but they are often better than the alternative implementations.
It was the last one before I got fired.
A previous job: An extremely long case statement, IIRC 1000+ lines. This was long before objects. Each option was only a few lines long. Breaking it up would have made it less clear. There were actually a pair of such routines doing different things to the same underlying set of data types.
Sorry, I don't have the code anymore and it isn't mine to post, anyway.
The longest function that I didn't see as being horrible was the key method of a custom CPU VM. As with #epotter's answer, this involved a big switch statement. In fact, I'd say that most of the methods I find resist being cleanly broken down or improved in readability involve switch statements.
Unfortunately, you won't often find this type of subroutine checked in or posted somewhere if it's auto-generated during a build step using some sort of code generator.
So look for projects that have C generated from another language.
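To make the VM example above concrete, here is a minimal sketch of such a dispatch loop (the opcodes and names are hypothetical). Each case is tiny, but the switch grows with the instruction set and resists being split into functions:

enum opcode { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

int run(const int *code) {
    int stack[64], sp = 0;
    for (int pc = 0; ; pc++) {
        switch (code[pc]) {
        case OP_PUSH: stack[sp++] = code[++pc];         break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}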
Besides performance, bear in mind that the size of the call stack in kernel space is 8K (please verify the size). Also, as far as I know, code in the kernel is fairly specific; if some code is unlikely to be reused in the future, why bother making it a function, considering the function-call overhead?
I could imagine that when speed is important (such as when holding some sort of lock in the kernel) you would not want to break up a function, because of the overhead of making a function call. When compiled, parameters have to be pushed onto the stack and data has to be popped off before returning. Therefore you may keep a large function for efficiency reasons.