I'm building a minishell in C, and have come to a roadblock that seems easily fixable by using global variables (3, to be exact). The reason I think globals are necessary is that the alternative would be to pass these variables to almost every single function in my program.
The variables are mainargc, mainargv, and shiftedMArgV. The first two are the number of arguments and the argument list passed to main, respectively. The variable shiftedMArgV is the argument list as well, but it may have been shifted. I'm trying to create builtin functions shift and unshift, which would make shiftedMArgV point to different arguments.
So, would it be stupid to just make these global? Otherwise I will have to revise a very large amount of code, and I'm not sure I'd be losing anything by making them global.
If I do make them global, would doing so from the main header file be foolish?
Thanks for the help guys, if you need any clarification just ask.
As an alternative to global variables, consider 'global functions':
extern int msh_mainArgC(void);
extern char **msh_mainArgV(void);
extern char **msh_shiftedArgV(void);
The implementations of those functions are trivial, but it allows you control over the access to the memory. And if you need to do something fancy, you can change the implementation of the functions. (I chose to capitalize the C and V to make the difference more visible; when only the last character of an 8-12 letter name is different, it is harder to spot the difference.)
There'd be an implementation file that would define these functions. In that file, there'd be static variables storing the relevant information, and functions to set and otherwise manipulate the variables. In principle, if you slap enough const qualifiers around, you could ensure that the calling code cannot modify the data except via the functions designed to do so (or by using casts to remove the const-ness).
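A minimal sketch of such an implementation file, assuming a setter called once from main() (the helpers msh_set_args and msh_shift are invented for illustration, not part of the question):
/* msh_args.c -- owns the argument state; nothing else touches these directly */
static int mainArgC;
static char **mainArgV;
static char **shiftedArgV;

/* Hypothetical helper: called once from main() to record the arguments. */
void msh_set_args(int argc, char **argv)
{
    mainArgC = argc;
    mainArgV = argv;
    shiftedArgV = argv;
}

int msh_mainArgC(void) { return mainArgC; }
char **msh_mainArgV(void) { return mainArgV; }
char **msh_shiftedArgV(void) { return shiftedArgV; }

/* A 'shift' builtin could then adjust the private pointer. */
void msh_shift(void)
{
    if (shiftedArgV != NULL && *shiftedArgV != NULL)
        shiftedArgV++;
}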
Whether this is worthwhile for you is debatable. But it might be. It is a more nearly 'object-oriented' style of operation. It is an alternative to consider and then discard, rather than something to leave unconsidered.
Note that your subsystems that use these functions might have one function that collects the global values and then passes them down to its subordinate functions. This saves the subordinates from having to know where the values came from; they just operate on them correctly. If there are global variables, you have to worry about aliasing: is a function passed values (copies of the global variables) while it also accesses the global variables directly? With the functions, you don't have to worry about that in the same way.
I would say that it is not stupid, but that you should proceed with a certain caution.
The reason globals are usually avoided is not that they should never be used, but rather that their usage has frequently led programmers to crash and burn. Through experience one learns the difference between when it is the right time and when it is the wrong time.
If you have thought deeply about the problem you are trying to solve and considered the code you've wrote to solve this problem and also considered the future of this code (i.e. are you compromising maintainability) and feel that a global is either unavoidable or better represents the coded solution, then you should go with the global.
Later, you may crash and burn, but that experience will help you later discern what a better choice might have been. Conversely, if you feel as though not using the globals may lead to crashage and burnage, then this is your prior experience saying you should use them. You should trust such instincts.
Dijkstra has a paper in which he discusses the harm the goto statement may cause, but his discussion also, in my opinion, explains some of our difficulties with globals. It may be worth a read.
Globals are OK as long as they are really global in a logical sense, and not just a means to make your life easier. For example, globals can describe the environment in which your program executes or, in other words, attributes that are relevant at the system level of your app.
Pretty much all complex software I ever worked with had a set of well defined globals. There's nothing wrong with that. They range from just a handful to about a dozen. In the latter case they're usually grouped logically in structs.
Globals are usually exposed in a header file as externs, and then defined in a source file. Just remember that globals are shared between threads and thus must be protected, unless it makes more sense to declare them with thread local storage.
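For example, a sketch of that arrangement (the names g_verbose and g_config are made up for illustration):
/* app_globals.h */
struct config { int max_jobs; const char *log_path; };
extern int g_verbose;           /* logging level shared across the program */
extern struct config g_config;  /* process-wide settings */

/* app_globals.c */
#include "app_globals.h"
int g_verbose = 0;
struct config g_config = { 1, "/tmp/app.log" };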
For a shell, you have a lot more state than this. You have the state of remapped file descriptors (which you need to track for various reasons), trap dispositions, set options, the environment and shell variables, ...
In general, global variables are the wrong solution. Instead, all of the state should be kept in a context structure of some sort, a pointer to which is passed around everywhere. This is good program design, and usually it allows you to have multiple instances of the same code running in the same process (e.g. multiple interpreters, multiple video decoders, etc.).
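A minimal sketch of that pattern, with invented names for illustration:
/* All interpreter state lives in one structure. */
struct shell_ctx {
    int argc;
    char **argv;
    char **shifted_argv;
    int last_status;
};

/* Every function receives the context instead of reaching for globals. */
int run_command(struct shell_ctx *ctx, const char *line)
{
    (void)line;              /* parsing omitted in this sketch */
    return ctx->last_status;
}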
With that said, the shell is a very special case, because it also deals with a lot of global state you can't keep in a structure: signal dispositions, file descriptors and mappings, child processes, process groups, controlling terminal, etc. It may be possible to abstract a lot of this with an extra layer so that you can emulate the required behavior while keeping clean contexts that could exist in multiplicity within a single process, but that's a much more difficult task than writing a traditional shell. As such, I would give yourself some leeway to write your shell "the lazy way" using global variables. If this is a learning exercise, though, try to carefully identify each time you're introducing global variables or state, why you're doing it, and how you might be able to implement the program differently without having global state. This will be very useful to you in the future.
I am working with embedded systems (mostly ARM Cortex-M3/M4) using the C language, and I was wondering about the advantages/disadvantages of dividing a task into many functions, i.e. going from:
void Handle_Something(void)
{
// do Task-1
// do Task-2
// do Task-3
//etc.
}
to
void Handle_Something(void)
{
Handle_Task1();
Handle_Task2();
// etc.
}
How can these two approaches be compared with respect to stack usage and overall processing speed, and which is safer/better, and for what reason? (You can assume this is outside of an ISR.)
From what I know, memory on the stack is allocated/deallocated for local variables in each call/return cycle, so dividing the task seems reasonable in terms of memory usage, but when doing this I sometimes get hard faults from different sources (mostly bus or undefined-instruction errors) whose cause I haven't been able to figure out.
Also, execution speed is crucial for many applications in my field, so I do need to know which method provides faster responses.
I would appreciate some enlightenment. Thanks everybody in advance.
This is what's known as "premature optimization".
In the old days, when compilers were horrible, they couldn't inline functions by themselves, so the keyword inline was added to C (similar non-standard versions also existed before 1999). It was used to tell a bad compiler how it should generate code better.
Nowadays this is mostly history. Compilers are better than programmers at determining what and when to inline. They may however struggle when a called function is located in a different "translation unit" (basically, in a different .c file). In your case I take it that doesn't apply, and Handle_Task1() etc. can be regarded as functions in the same file.
With the above in mind:
How can these two approaches be compared with respect to stack usage and overall processing speed?
They are to be regarded as identical. They use the same stack space and take the same time to execute.
Unless you have a bad, older compiler - in which case function calls always take extra space and execution time. Since you are working with modern MCUs, this should not be the case, or you desperately need to get a better compiler.
As a rule of thumb, it is always better practice to split up larger functions in several smaller, for the sake of readability and maintenance. Even in hard real-time systems, there exist very few cases where function call overhead is an actual bottleneck even when bad compilers are used.
Memory on the stack isn't allocated/deallocated from some complex memory pool. The stack pointer is simply increased/decreased. That operation is basically free in all but the tightest/smallest loops imaginable (and those will probably be optimized by the compiler anyway).
Don't group together functions because they could reuse variables e.g. don't create a bunch of int tempInt; long tempLong; variables you use throughout your entire program. A variable should serve only a single purpose and its scope should be kept as tight as possible. Also see: is it good or bad to reuse the variables?
Expanding on that, keeping the scope of all variables as local as possible might even cause your compiler to keep the variables in a cpu register only. A shortly used variable might actually never be allocated!
Try to limit functions to a single purpose and try to avoid side effects: if you avoid global variables, a function becomes easier to test, optimize and understand, since each time you call it with the exact same set of arguments it will perform the exact same action. Have a look at: Why are global variables bad, in a single threaded, non-os, embedded application
Each solution has advantages and disadvantages.
The first approach lets the code execute faster (a priori), because the generated assembly won't contain the instructions related to jumping into and out of functions. However, you have to take readability into account: mixing different kinds of functionality in the same function (or creating large functions) is not a good idea from a coding-guidelines point of view.
The second solution could be easier to understand, because each function contains a single simple task, and it is easier to document (that is, you don't have to explain different "purposes" within the same function). As I said, this solution is slower, because your "scheduler" contains jumps; nevertheless, you can declare the simple tasks as inline, so you can still split the code into several simple tasks with proper documentation while the compiler generates essentially the same assembly as the first approach, that is, avoiding the jumps.
Another point is the use of memory. If your simple tasks are called from several different parts of the code, the first solution and the second solution with inline are worse (in terms of code size) than the second solution without inline, because the function body is duplicated at every place it is called from.
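For instance, a hedged sketch of the inlined variant, reusing the names from the question:
static inline void Handle_Task1(void)
{
    /* do Task-1 */
}

static inline void Handle_Task2(void)
{
    /* do Task-2 */
}

void Handle_Something(void)
{
    /* with optimization enabled, a modern compiler will typically inline these calls */
    Handle_Task1();
    Handle_Task2();
}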
Working with modules is always more efficient in terms of error handling, debugging and re-reading. Considering some heavyweight libraries (SLAM, PCL, etc.) as functions: they are used as external functions and they don't cause a significant loss of performance (to be honest, sometimes it's almost impossible to embed such large functions directly into your code). You may face slightly higher stack use, as @Colin commented.
I'm a relatively new C programmer, and I've noticed that many conventions from other higher-level OOP languages don't exactly hold true on C.
Is it okay to use short functions to have your coding stay organized (even though it will likely be called only once)? An example of this would be 10-15 lines in something like void init_file(void), then calling it first in main().
I would have to say, not only is it OK, but it's generally encouraged. Just don't overly fragment the train of thought by creating myriads of tiny functions. Try to ensure that each function performs a single cohesive, well... function, with a clean interface (too many parameters can be a hint that the function is performing work which is not sufficiently separate from its caller).
Furthermore, well-named functions can serve to replace comments that would otherwise be needed. As well as providing re-use, functions can also (or instead) provide a means to organize the code and break it down into smaller units which can be more readily understood. Using functions in this way is very much like creating packages and classes/modules, though at a more fine-grained level.
Yes. Please. Don't write long functions. Write short ones that do one thing and do it well. The fact that they may only be called once is fine. One benefit is that if you name your function well, you can avoid writing comments that will get out of sync with the code over time.
If I can take the liberty to do some quoting from Code Complete:
(These reason details have been abbreviated and in spots paraphrased, for the full explanation see the complete text.)
Valid Reasons to Create a Routine
Note the reasons overlap and are not intended to be independent of each other.
Reduce complexity - The single most important reason to create a routine is to reduce a program's complexity (hide away details so you don't need to think about them).
Introduce an intermediate, understandable abstraction - Putting a section of code into a well-named routine is one of the best ways to document its purpose.
Avoid duplicate code - The most popular reason for creating a routine. Saves space and is easier to maintain (only have to check and/or modify one place).
Hide sequences - It's a good idea to hide the order in which events happen to be processed.
Hide pointer operations - Pointer operations tend to be hard to read and error prone. Isolating them into routines shifts focus to the intent of the operation instead of the mechanics of pointer manipulation.
Improve portability - Use routines to isolate nonportable capabilities.
Simplify complicated boolean tests - Putting complicated boolean tests into a function makes the code more readable because the details of the test are out of the way and a descriptive function name summarizes the purpose of the tests.
Improve performance - You can optimize the code in one place instead of several.
To ensure all routines are small? - No. With so many good reasons for putting code into a routine, this one is unnecessary. (This is the one thrown into the list to make sure you are paying attention!)
And one final quote from the text (Chapter 7: High-Quality Routines)
One of the strongest mental blocks to creating effective routines is a reluctance to create a simple routine for a simple purpose. Constructing a whole routine to contain two or three lines of code might seem like overkill, but experience shows how helpful a good small routine can be.
If a group of statements can be thought of as a thing - then make them a function
I think it is more than OK, I would recommend it! Short, easy-to-prove-correct functions with well-thought-out names lead to code which is more self-documenting than long, complex functions.
Any compiler worth using will be able to inline these calls to generate efficient code if needed.
Functions are absolutely necessary to stay organized. You first need to design a solution to the problem, and then split it into functions according to the different pieces of functionality. Any segment of code which is used multiple times probably needs to be written as a function.
First think about the problem you have at hand, break it down into components, and for each component try writing a function. When writing a function, see if some code segments do the same thing; if so, break them out into a sub-function, and if there is a sub-module, it is also a candidate for another function. At some point this decomposition should stop, and where exactly is up to you. Generally, do not write functions that are too big, nor a swarm of functions that are too small.
When constructing a function, aim for a design with high cohesion and low coupling.
EDIT 1:
You might also want to consider separate modules. For example, if you need a stack or a queue for some application, make it a separate module whose functions can be called from other functions. This way you avoid re-coding commonly used facilities, by programming them as a group of functions stored separately.
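For example, a tiny stack module kept in its own files might look like this (a sketch; the names are made up):
/* stack.h */
#define STACK_MAX 64
struct stack { int items[STACK_MAX]; int top; };
void stack_init(struct stack *s);
int stack_push(struct stack *s, int value);  /* 0 on success, -1 when full */
int stack_pop(struct stack *s, int *value);  /* 0 on success, -1 when empty */

/* stack.c */
#include "stack.h"

void stack_init(struct stack *s) { s->top = 0; }

int stack_push(struct stack *s, int value)
{
    if (s->top == STACK_MAX) return -1;
    s->items[s->top++] = value;
    return 0;
}

int stack_pop(struct stack *s, int *value)
{
    if (s->top == 0) return -1;
    *value = s->items[--s->top];
    return 0;
}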
Yes
I follow a few guidelines:
DRY (aka DIE)
Keep Cyclomatic Complexity low
Functions should fit in a Terminal window
Each one of these principles at some point will require that a function be broken up, although I suppose #2 could imply that two functions with straight-line code should be combined. It's somewhat more common to do what is called method extraction than actually splitting a function into a top and bottom half, because the usual reason is to extract common code to be called more than once.
#1 is quite useful as a decision aid. It's the same thing as saying, as I do, "never copy code".
#2 gives you a good reason to break up a function even if there is no repeated code. If the decision logic passes a certain complexity threshold, we break it up into more functions that make fewer decisions.
It is indeed a good practice to refactor code into functions, irrespective of the language being used. Even if your code is short, it will make it more readable.
If your function is quite short, you can consider inlining it.
IBM Publib article on inlining
As a beginner, I read everywhere to avoid excess use of global variables. Well, how do I do that? My low skill fails me. I end up passing tons of structures, and it is harder to read than using globals. Any tips on working through this problem of application structure design?
Depending on what your variables are doing, global scope might be the best scope. (Think flags to signal that an interrupt has arrived, and should be handled at a convenient time in the middle of a compute loop.)
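A common example is the signal-flag idiom (this sketch is illustrative, not code from the question):
#include <signal.h>

static volatile sig_atomic_t got_sigint = 0;  /* set by the handler, read by the main loop */

static void on_sigint(int sig)
{
    (void)sig;
    got_sigint = 1;
}

int main(void)
{
    signal(SIGINT, on_sigint);
    while (!got_sigint) {
        /* main compute loop; checks the flag at a convenient point */
    }
    return 0;
}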
Small utility programs can often feel much cleaner by using global variables (I'm thinking especially of small language parsers); but this makes it much harder to integrate the small utility programs into larger programs in the future. There are always trade-offs.
But chances are good the "correct" data organization will not feel quite so cumbersome. If you post code here, someone may be able to suggest cleaner layout, but the real problems come when code grows beyond easily-understood small samples.
I have a LOT of favorite programming style books, but I think the best I know of to address this situation is The Elements of Programming Style, by Kernighan and Plauger. It's quite old, and difficult to find, but short, sweet, and well worth finding used somewhere.
It's not as short and it's not as sweet, but Code Complete, 2nd Edition is still well worth finding. It's much more detailed, provides much more code, and covers much more of the diversity involved in designing software. It's excellent, but might be more intimidating.
There's nothing like studying the masters: the code in Advanced Programming in the Unix Environment, 2nd Edition is phenomenal, well worth every hour of study.
And, of course, there's always experience, but that takes time to acquire. Learning lessons from your own mistakes tends to stick much stronger than learning lessons from other people's mistakes. So keep at it. :)
I'd suggest Structured Design by Yourdon and Constantine. An old book by computer standards (it has examples involving tapes!) but very sound on the problems you are having.
Here are two options that you could use to improve your situation:
For read-only structures, have functions that can control access to the data with a const pointer:
struct my_struct;
const struct my_struct *GetMyStruct(void);
Limit the exposure of a global structure by declaring it static. This way it will only have file scope:
static struct my_struct myStructInstance;
If your program is the sort of "small" project where global variables don't feel so bad, but you think you might need to integrate it into a larger project in the future, a very simple solution is to add a single context pointer argument to each function and store all your "global" variables in there. If you always name it the same thing, you can even do stuff like:
#define current_filename context->current_filename
#define option_flags context->option_flags
etc. and your code will look virtually identical to how it would have looked with globals, except that you'll be able to have multiple instances of it in a single program, integrate it into a library, and so on with minimal fuss. Just keep those defines in a private header used by your source modules, not the public interface header.
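A sketch of that arrangement (the structure members and macro names are only examples):
/* context.h -- private header included by your source modules */
struct context {
    const char *current_filename;
    unsigned option_flags;
};

#define current_filename (context->current_filename)
#define option_flags     (context->option_flags)

/* Each function takes the context as its first argument, named 'context'
   so the macros expand correctly. */
void open_input(struct context *context)
{
    if (option_flags & 0x1u)           /* expands to context->option_flags */
        current_filename = "input.txt";
}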
@PeterK The problem is that a structure is always presented in C books as a container that can be declared/passed many times to different functions, and that is what may have confused me; I never thought of using it as a simple single-instance global container (and that may actually make my code more readable).
I am writing 3 phase motor control application to control 1 motor.
Based on everything you wrote, please check whether my current ideas for solving the problem are right:
Pack related global information into structures according to function, e.g. sInverterState, sButtonsState, sInverterParameters, etc.
If I write the menu UI I can use static variables in the C file and not worry about passing structs around when I have only 1 LCD. I don't want to make it look like GTK++.
Writing reentrant code is not for me yet, and it's overkill for this purpose.
Get a proper education in the IT field.
I may end up with lots of globals but at least they are nicely packed and readable.
I've just been going back over a bit of C studying using Ivor Horton's Beginning C book. I got to the bit about declaring constants which seems to get mixed up with variables in the same sentence.
Just to clarify, what is the difference in specifying constants and variables in C, and really, when do you need to use a constant instead of a variable? I know folks say to use a constant when the information doesn't change during program execution but I can't really think of a time when a variable couldn't be used instead.
A variable, as you can guess from the name, varies over time. If it doesn't vary, nothing is lost by making it a constant. When you tell the compiler that the value will not change, the compiler can do a whole bunch of optimizations, like directly inlining the value and never allocating any space for the constant on the stack.
However, you cannot always count on your compiler to be smart enough to be able to correctly determine if a value will change once set. In any situation where the compiler is incapable of determining this with 100% confidence, the compiler will err on the side of safety and assume it could change. This can result in various performance impacts like avoiding inlining, not optimizing certain loops, creating object code that is not as parallelism-friendly.
Because of this, and since readability is also important, you should strive to use an explicit constant whenever possible and leave variables for things that can actually change.
As to why constants are used instead of literal numbers:
1) It makes code more readable. Everyone knows what 3.14 is (hopefully), not everyone knows that 3.07 is the income tax rate in PA. This is an example of domain-specific knowledge, and not everyone maintaining your code in the future (e.g., a tax software) will know it.
2) It saves work when you make a change. Going and changing every 3.07 to 3.18 if the tax rate changes in the future will be annoying. You always want to minimize changes and ideally make a single change. The more concurrent changes you have to make, the higher the risk that you will forget something, leading to errors.
3) You avoid risky errors. Imagine that there were two states with an income tax rate of 3.07, and then one of them changes to 3.18 while the other stays at 3.07. By just doing a blind find-and-replace, you could end up with severe errors. Of course, many integer or string constant values are more common than "3.07". For example, the number 7 could represent the number of days in the week, or something else entirely. In large programs, it is very difficult to determine what each literal value means.
4) In the case of string text, it is common to use symbolic names for strings to allow the string pools to change quickly in the case of supporting multiple languages.
Note that in addition to variables and "constant variables", there are also some languages with enumerations. An enumeration actually allows you to define a type for a small group of constants (e.g., return values), so using them will provide type safety.
For example, if I have an enumeration for the days of the weeks and for the months, I will be warned if I assign a month into a day. If I just use integer constants, there will be no warning when day 3 is assigned to month 3. You always want type safety, and it improves readability. Enumerations are also better for defining order. Imagine that you have constants for the days of the week, and now you want your week to start on Monday rather than Sunday.
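A brief sketch in C of that distinction (illustrative only):
enum weekday { MON, TUE, WED, THU, FRI, SAT, SUN };
enum month   { JAN = 1, FEB, MAR, APR, MAY, JUN,
               JUL, AUG, SEP, OCT, NOV, DEC };

enum weekday d = WED;
enum month   m = MAR;
/* d = m;  -- a C compiler can warn about this (e.g. with -Wenum-conversion),
   whereas with plain int constants "day 3 = month 3" passes silently. */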
Using constants is more a way of defensive programming, to protect yourself from yourself, from accidentally changing the value somewhere in the code when you're coding at 2 a.m. or before having drunk your coffee.
Technically, yes, you can use a variable instead.
Constants have several advantages over variables.
Constants provide some level of guarantee that code can't change the underlying value. This is not of much importance for a smaller project, but matters on a larger project with multiple components written by multiple authors.
Constants also provide a strong hint to the compiler for optimization. Since the compiler knows the value can't change, it doesn't need to load the value from memory and can optimize the code to work for only the exact value of the constant (for instance, the compiler can use shifts for multiplication/division if the const is a power of 2.)
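For instance, a small illustration of that point (the names are invented):
/* The value is visible to the compiler, so the division below
   can typically be compiled as a right shift (hash >> 6). */
static const unsigned BUCKETS = 64u;

unsigned bucket_of(unsigned hash)
{
    return hash / BUCKETS;
}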
Constants are also inherently static - you can declare the constant and its value in a header file, and not have to worry about defining it in exactly one place.
For one, performance optimization.
More importantly, this is for human readers. Remember that your target audience is not only the compiler. It helps to express yourself in code, and avoid comments.
const int spaceTimeDimensions = 4;
if (gpsSatellitesAvailable >= spaceTimeDimensions)
Good();
For a low-level language like C, constants allow for several compilation optimizations.
For a programming language in general, you don't really need them. High-level dynamic languages such as Ruby and JavaScript don't have them (or at least not in a true constant sense). Variables are used instead, just as you suggested.
A constant is appropriate when you just want to share memory whose value never changes.
The const keyword is often used for function parameters, particularly pointers, to suggest that the memory the pointer points to will not be modified by the function. Look at the declaration of strcpy for instance:
char *strcpy(char *dest, const char *src);
Otherwise, for example, a declaration such as
const int my_magic_no = 54321;
might be preferred over:
#define MY_MAGIC_NO 54321
for type safety reasons.
It's a really easy way to trap a certain class of errors. If you declare a variable const, and accidentally try to modify it, the compiler will call you on it.
Constants are very useful when declaring and initializing variables for any purpose, such as at the start of a loop, for checking the condition within an if-else statement, etc.
For more reference, feel free to read either of the following articles:
Constants in C Programming Language
Variables in C Language
Not using const can mean that in a team project one member declares int FORTY_TWO = 42 and another member sets FORTY_TWO = 41 somewhere else. Then the end of the world happens and you also lose the answer to life. With const, none of this can ever happen. Also, a const may be stored in a different region of memory than normal variables, which can be more efficient.
The Problem
I'm working on a large C project (C99) that makes heavy use of global variables (I know, I know). The program works fairly well, but it was originally designed to run once and exit.
As such, it relies on its global/static memory being initialized to 0 (or whatever value it was declared with), and during runtime it modifies these variables (as most programs do).
However, instead of exiting on completion, I want to run the program again. I want to make a parent program that has control and visibility into this large program. Having complete visibility into the running program is very important.
The solution needs to work on macOS, Linux, and Windows.
I've considered:
1. Forking it
Make a small wrapper program that serves as the "shell", and execute the large program as needed.
Pros
OS does the hard work of resetting the memory to the correct values
Guaranteed to operate as intended
Cons
Lost visibility into the program
Can't inspect memory of executing program from wrapper during runtime, harder to tweak settings before launching, harder to collect runtime information
Need to implement a system to get internal data in/out of the program, potentially touching a lot of code
Unified experience harder (sharing a GUI window, etc)
2. Identify critical structures manually
Peruse the source, run the program multiple times, and wait for the program to blow up on a sanity check or a bad memory access.
Pros
Easy to do
Easy to start
High visibility, code sharing, and unification
Cons
Does not catch every case, very patchwork
Time consuming
3. Refactor
Collect all globals into a single structure to memset, create initializers for variables that are initialized with a value, and handle statics on a case-by-case basis (a rough sketch follows after the cons below).
Pros
Conceptually easy, sledgehammer approach
High visibility, code sharing, and unification
Cons
Very time consuming, codebase large, would touch pretty much everything
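A rough sketch of what that refactor might look like (the names are invented for illustration):
/* globals.h */
struct app_globals {
    int frame_count;
    char *output_path;
    double gain;            /* a variable declared with a non-zero default */
};
extern struct app_globals G;

/* globals.c */
#include <string.h>
#include "globals.h"

struct app_globals G = { 0, NULL, 1.0 };

void globals_reset(void)
{
    memset(&G, 0, sizeof G);  /* back to the all-zero state of a fresh process */
    G.gain = 1.0;             /* re-apply non-zero initializers by hand */
}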
4. Magic wand
Tell the OS to reinitialize global/static memory. If I need to save a value, I'll store it locally and then rewrite it when it's done.
Pros
Mostly perfect :)
Cons
Doesn't exist (?)
Very black magic
Probably not cross platform
May anger 3rd-party libs
What I am doing now
I am going with option 2 right now, just making my way through the code, leaning on the program to crash and point me in the right direction.
I'd say this method has gotten me about 80% of the way there. I've identified and reinitialized enough things that the program, more or less, can be rerun. It's not as widespread as I thought, and it gives me a lot of hope.
Occasionally, strange things happen or it doesn't operate as intended, but it also doesn't crash. This makes tracking it down more difficult.
I just need something to get me that last 20%. Maybe some sort of static analysis tool, or something to help me go through the source and see where globals are touched.
To easily detect the global and static variables you can try CppDepend and execute a CQLinq query like this one:
from f in Fields where f.IsGlobal || f.IsStatic
select f
You can also modify the query if you want the variables used by a specific function or in a specific file.