My question is about coding practice in general, but I'm trying to understand it with a simple example. Let's say I have a few lines of code:
int main(void) {
int input_1 = 10;
int input_2 = 10;
/* some stuff */
return 0;
}
After reading about design principles (I am not sure whether they are common to all programming languages; I hope they are generic), I learned that the above is valid C code but dirty code, because it does not follow the DRY (Don't Repeat Yourself) principle: the magic number 10 is repeated.
My first doubt is: does the C standard say anything about coding best practices? I read the spec but could not find a clear answer.
I modified the code as below to avoid the label "dirty code":
int main(void) { /* I'm not 100 percent sure that this is not dirty code ? */
const int value = 10; /*assigning 10 to const variable*/
int input_1 = value;
int input_2 = value;
/* some stuff */
return 0;
}
Is the modified version correct, or can I do something better? Finally, if these design principles are so widely recommended, why don't compilers produce any warning?
This is more about avoiding magic numbers. Your 10 should have some semantic meaning if you claim it's "the same 10". Then you should do something like
#define FROBNUM 10 // use a name here that explains the meaning of the number
int main(void) {
int input_1 = FROBNUM;
int input_2 = FROBNUM;
/* some stuff */
return 0;
}
Introducing a const is unnecessary; macros solve this problem nicely. DRY is addressed here: the macro definition is the single source of the concrete value.
If, on the other hand, there is no semantic relationship between the two 10 values, #define two macros instead. This isn't "repeating yourself" if they indeed have different meanings. Don't misunderstand DRY here.
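A minimal sketch of that distinction, assuming two made-up quantities (MAX_RETRIES, DEFAULT_TIMEOUT_S and both functions are hypothetical names, not from your code). The constants happen to share the value 10 today, but each can change without touching the other:

```c
/* Hypothetical names: two unrelated quantities that merely share a value. */
#define MAX_RETRIES       10  /* how many times an operation may be attempted */
#define DEFAULT_TIMEOUT_S 10  /* how long to wait, in seconds */

/* Returns 1 while another attempt is still allowed, 0 otherwise. */
int may_retry(int attempts_so_far)
{
    return attempts_so_far < MAX_RETRIES;
}

/* Returns the default timeout converted to milliseconds. */
long timeout_ms(void)
{
    return DEFAULT_TIMEOUT_S * 1000L;
}
```

Changing the retry limit to 3 is then a one-line edit that cannot accidentally change the timeout.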
Side note about your version with const. It has two flaws:
1. The name value isn't semantic at all, so nothing is gained; the number is still magic.
2. With this declaration, you introduce a new object of automatic storage duration and type int, which you don't really need. A good compiler would optimize it away, but better not to rely on that -- that's why a macro fits better here.
DRY mostly refers to there being one single source of truth. Certain business rules or reusable code patterns should only be expressed once, especially if they may be altered in the future. Examples include code to calculate shipping fees or tax rates, which you want to code exactly once and alter exactly in one place if they change; or the instantiation of a database adapter which you can alter in exactly one place when the database details change.
DRY does not mean that you must reduce every line of code which looks similar to another line of code into one single line.
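As a rough sketch of the "single source of truth" idea, assuming a made-up flat tax rule (TAX_RATE_PERCENT, tax_for and gross_for are hypothetical names): the rate and the calculation each exist exactly once, and everything else goes through them.

```c
/* Hypothetical business rule: the rate is stated in exactly one place. */
#define TAX_RATE_PERCENT 20

/* Amounts are in cents to keep the arithmetic in integers. */
long tax_for(long net_cents)
{
    return net_cents * TAX_RATE_PERCENT / 100;
}

/* Reuses tax_for() instead of repeating the rate or the formula. */
long gross_for(long net_cents)
{
    return net_cents + tax_for(net_cents);
}
```

If the rate changes, only the #define changes; if the rule itself changes, only tax_for() does.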
I'm developing code for an arduino based board, and I'm using VSCode, as I find it better than the Arduino IDE.
Now, in some parts of the code, I like to group certain statements together, to organise the code better. In C# (using Visual Studio) I would use #region NAME to do this. The C variant of this is #pragma region, however, I find this clutters the code, and isn't quite as clean as I would want it.
Instead I thought of using curly braces {}, to achieve something similar, but to my understanding the compiler uses them to declare scope right? So would using them like this:
char *data;
{
free(data);
}
produce any odd behaviour? From what I've tried the compiler doesn't seem to mind, but maybe I just haven't tried enough cases.
So, I guess what I want to know is: Would using curly braces in this way be detrimental to general coding in C?
The compound statement forms a block scope.
So for example this code snippet
int x;
int y = 10;
x = y;
is not equivalent to
int x;
{
int y = 10;
}
x = y;
In the last case the compiler will issue an error that the identifier y is not declared.
Also, redundant braces make code less readable and more confusing.
Arduino code is not C; it is C++.
The best way of organizing the code in C++ is to use classes and structures.
In C# region is used to collapse and expand the code in the editor when outlining and has no effect on the code execution. Arduino IDE does not have any of those fancy editor features.
A compound statement in C or C++ is something completely different from #region in C#. It creates a new scope that affects the compilation and execution of the code.
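To illustrate how a block changes meaning rather than just layout, here is a small sketch (scope_demo is a made-up name). The inner x is a separate object that shadows the outer one and stops existing at the closing brace:

```c
/* Returns 12: the outer x (1) in the tens place, the inner x (2) in the units. */
int scope_demo(void)
{
    int x = 1;
    int seen_inside;
    {
        int x = 2;        /* a new object; it shadows the outer x */
        seen_inside = x;  /* reads the inner x, i.e. 2 */
    }                     /* the inner x no longer exists here */
    return x * 10 + seen_inside;  /* x is the outer one again, still 1 */
}
```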
Using blocks to group statements and set them off is fine for suitable purposes. One situation I encountered from time to time was when a matrix needed separate processing for its first and last rows, due to border effects. Then the code could look like:
{
int row = 0;
// Code for first row.
}
for (int row = 1; row < N-1; ++row)
{
// Code for middle rows.
}
{
int row = N-1;
// Code for last row.
}
Often the code was largely similar for the three cases, and using a block for each of them made that similarity more visually apparent to the reader. At the same time, having the same level of indentation for each of the cases made the differences easier to see.
Similarly, blocks can organize sections of a function that are semi-repetitive but do not involve loops (and are not involved enough to deserve functions of their own, or that use so many variables that passing them parameters would be a mess).
Blocks are well defined by the C standard and will not produce any “odd behavior.”
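As a rough sketch of the first/middle/last row pattern described above (the sizes N and M, the function smooth_rows and the smoothing rule itself are made up for illustration): each output cell is the sum of a cell and its vertical neighbours, and the border rows have only one neighbour.

```c
#define N 4  /* number of rows (hypothetical) */
#define M 3  /* number of columns (hypothetical) */

/* Sums each cell with its vertical neighbours; border rows are special. */
void smooth_rows(int in[N][M], int out[N][M])
{
    {
        int row = 0;
        /* First row: no row above. */
        for (int col = 0; col < M; ++col)
            out[row][col] = in[row][col] + in[row + 1][col];
    }
    for (int row = 1; row < N - 1; ++row)
    {
        /* Middle rows: neighbours on both sides. */
        for (int col = 0; col < M; ++col)
            out[row][col] = in[row - 1][col] + in[row][col] + in[row + 1][col];
    }
    {
        int row = N - 1;
        /* Last row: no row below. */
        for (int col = 0; col < M; ++col)
            out[row][col] = in[row - 1][col] + in[row][col];
    }
}
```

Because each block declares its own row, the three bodies can use the same variable name without interfering with each other.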
I have several enums that serve as type constants. For example:
enum item_type {
street,
town,
lake,
border,
...
}
The enum values are used in code to designate object types, and are written out to disk as part of data files. This mostly works well, but there is one drawback:
There is no way to remove an enum member (because it is no longer used) without changing the integer values of all subsequent members. So any such change would make the code incompatible with existing data files.
Is there some good technique for avoiding this problem? Maybe some preprocessor trick?
The only solution I can think of is to explicitly set all the integer values. While that would work, it is hard to read and manage for big enums.
Note: This problem comes from the source code of Navit, which uses several such "type enums" (though they are actually hidden behind some macros).
If you want to remove items very rarely, you could do something like
enum item_type {
street,
town,
//lake,
border = town+2,
...
}
i.e. only explicitly assign a value to the item immediately following the one you remove.
Since compatibility is very important to you, it'd be more reliable to just bite the bullet and explicitly number all items
enum item_type {
street = 0,
town = 1,
//lake = 2,
border = 3,
...
}
I ended up declaring a macro UNUSED, which expands to UNUSED_<linenumber>. Unused enum values can then simply be replaced by UNUSED. The macro expands to a unique identifier on each line where it is used; otherwise the compiler would complain about duplicate enum entries if it were used multiple times inside one enum.
This is slightly ugly if you have many "gaps". Still, I chose this solution over simonc's solution because it is easy to read (it keeps the regular enum values free of visual clutter like three=zero+2) and does not require magic numbers.
Admittedly, this only makes sense if the gaps are few and far between. For large gaps simonc's solution looks better.
Complete example:
#include <stdio.h>
#define UNUSED UNUSED_P(__LINE__)
#define UNUSED_P(x) UNUSED_P2(x)
#define UNUSED_P2(x) UNUSED_##x
enum e {
zero,
UNUSED,
UNUSED,
three,
};
int main(void){
printf("int value of 'three': %d\n",three);
return 0;
}
The double replacement is adapted from, among others, this question: c++ - How, exactly, does the double-stringize trick work?
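To show why the extra level of macro is needed, here is a small sketch (PASTE_DIRECT, PASTE_EXPANDED, STR, STR2 and the two functions are hypothetical names used only for this demonstration). An argument that is an operand of ## is pasted as written, so __LINE__ only turns into a number if it passes through one more macro first:

```c
/* Hypothetical names, for illustration only. */
#define PASTE_DIRECT(x)   UNUSED_##x       /* pastes the argument as written */
#define PASTE_EXPANDED(x) PASTE_DIRECT(x)  /* expands the argument first */

#define STR2(x) #x
#define STR(x)  STR2(x)                    /* stringizes after expansion */

/* Yields "UNUSED___LINE__": __LINE__ was pasted before it could expand. */
const char *direct_name(void)   { return STR(PASTE_DIRECT(__LINE__)); }

/* Yields "UNUSED_" followed by the number of the line this is written on. */
const char *expanded_name(void) { return STR(PASTE_EXPANDED(__LINE__)); }
```

With the direct version, every use would produce the same identifier UNUSED___LINE__, which is exactly the duplicate-enumerator problem the indirection avoids.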
An Example
Suppose we have a text to write that can be converted to uppercase or lowercase, and printed left-aligned, centered, or right-aligned.
Specific case implementation (too many functions)
void writeInUpperCaseAndCentered(char *str) { /* ... */ }
void writeInLowerCaseAndCentered(char *str) { /* ... */ }
void writeInUpperCaseAndLeft(char *str) { /* ... */ }
and so on...
vs
Many Argument function (bad readability and even hard to code without a nice autocompletion IDE)
void write(char *str, int toUpper, int centered) { /* ... */ }
vs
Context dependent (hard to reuse, hard to code, use of ugly globals, and sometimes even impossible to "detect" a context)
void writeComplex(char *str)
{
// analyze str and perhaps some global variables and
// (under who knows what rules) put it center/left/right and upper/lowercase
}
Perhaps there are other options (they are welcome).
The question is:
Is there any good practice or experience/academic advice for this (recurrent) trilemma?
EDIT:
What I usually do is combine the "specific case" implementation with an internal (I mean, not in the header) general many-argument function, implementing only the cases actually used and hiding the ugly code. But I don't know whether there is a better way. This kind of thing makes me realize why OOP was invented.
I'd avoid your first option because, as you say, the number of functions you end up having to implement (though possibly only as macros) can grow out of control. The count doubles when you decide to add italic support, and doubles again for underline.
I'd probably avoid the second option as well. Again, consider what happens when you find it necessary to add support for italics or underlines. Now you need to add another parameter to the function, find all of the places where you called the function, and update those calls. In short, it's annoying, though once again you could probably simplify the process with appropriate use of macros.
That leaves the third option. You can actually get some of the benefits of the other alternatives with this using bitflags. For example:
#define WRITE_FORMAT_LEFT 1
#define WRITE_FORMAT_RIGHT 2
#define WRITE_FORMAT_CENTER 4
#define WRITE_FORMAT_BOLD 8
#define WRITE_FORMAT_ITALIC 16
....
void write(char *string, unsigned int format)
{
if (format & WRITE_FORMAT_LEFT)
{
// write left
}
...
}
EDIT: To answer Greg S.
I think that the biggest improvement is that if I decide, at this point, to add support for underlined text, it takes two steps:
1. Add #define WRITE_FORMAT_UNDERLINE 32 to the header.
2. Add the support for underlines in write().
At this point I can call write(..., ... | WRITE_FORMAT_UNDERLINE) wherever I like. More to the point, I don't need to modify pre-existing calls to write, which I would have to do if I added a parameter to its signature.
Another potential benefit is that it allows you to do something like the following:
#define WRITE_ALERT_FORMAT (WRITE_FORMAT_CENTER | \
WRITE_FORMAT_BOLD | \
WRITE_FORMAT_ITALIC)
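To make the bitflag idea concrete, here is a minimal runnable sketch. The helper format_line, its parameters and the padding rule are assumptions made for this illustration; only the flag names come from the code above. It pads a string inside a field of a given width according to the alignment bit and ignores bits it does not know about:

```c
#include <stdio.h>
#include <string.h>

#define WRITE_FORMAT_LEFT   1u
#define WRITE_FORMAT_RIGHT  2u
#define WRITE_FORMAT_CENTER 4u
#define WRITE_FORMAT_BOLD   8u  /* accepted but not acted on in this sketch */

/* Hypothetical helper: writes str into out (size bytes), padded on the
   left according to the alignment flag. Returns the number of pad spaces. */
int format_line(char *out, size_t size, const char *str, int width,
                unsigned int format)
{
    int len = (int)strlen(str);
    int pad = 0;  /* LEFT, or no alignment flag: no padding */

    if (len < width) {
        if (format & WRITE_FORMAT_RIGHT)
            pad = width - len;
        else if (format & WRITE_FORMAT_CENTER)
            pad = (width - len) / 2;
    }
    /* "%*s" with an empty string prints exactly pad spaces. */
    snprintf(out, size, "%*s%s", pad, "", str);
    return pad;
}
```

A call such as format_line(buf, sizeof buf, "hi", 6, WRITE_FORMAT_CENTER | WRITE_FORMAT_BOLD) keeps compiling and behaving the same when further flags are added later, which is the point made about not having to touch existing calls.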
I prefer the argument way, because there's going to be some code that all the different scenarios need to use. Making a function out of each scenario will produce code duplication, which is bad.
Instead of using an argument for each different case (toUpper, centered etc..), use a struct. If you need to add more cases then you only need to alter the struct:
typedef struct {
int toUpper;
int centered;
// etc...
} cases;
void write(char *str, cases c) { /* ... */ }
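A small sketch of how this could look in use, assuming a made-up helper apply_cases that only acts on toUpper (the centered member is carried along but ignored here to keep the example short):

```c
#include <ctype.h>
#include <stddef.h>

typedef struct {
    int toUpper;
    int centered;
    /* further options can be added here later */
} cases;

/* Hypothetical helper: copies str into out (size bytes), upper-casing it
   when c.toUpper is set. Returns the number of characters copied. */
size_t apply_cases(char *out, size_t size, const char *str, cases c)
{
    size_t i = 0;

    if (size == 0)
        return 0;
    for (; str[i] != '\0' && i + 1 < size; ++i)
        out[i] = c.toUpper ? (char)toupper((unsigned char)str[i]) : str[i];
    out[i] = '\0';
    return i;
}
```

With C99 designated initializers, a caller can write cases c = { .toUpper = 1 }; and every member not mentioned is zero, so adding a new member to the struct does not force existing callers to change.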
I'd go for a combination of methods 1 and 2.
Code a method (A) that has all the arguments you need/can think of right now and a "bare" version (B) with no extra arguments. This version can call the first method with the default values. If your language supports it add default arguments. I'd also recommend that you use meaningful names for your arguments and, where possible, enumerations rather than magic numbers or a series of true/false flags. This will make it far easier to read your code and what values are actually being passed without having to look up the method definition.
This gives you a limited set of methods to maintain and 90% of your usages will be the basic method.
If you need to extend the functionality later add a new method with the new arguments and modify (A) to call this. You might want to modify (B) to call this as well, but it's not necessary.
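Since C has no default arguments, the (A)/(B) split could be sketched like this (write_aligned, write_plain and the alignment enum are hypothetical names chosen for the illustration):

```c
#include <stdio.h>
#include <string.h>

enum alignment { ALIGN_LEFT, ALIGN_CENTER, ALIGN_RIGHT };

/* (A) Full version: every option is an explicit, named argument.
   Returns what snprintf returns: the length of the formatted text. */
int write_aligned(char *out, size_t size, const char *str, int width,
                  enum alignment align)
{
    int len = (int)strlen(str);
    int pad = 0;

    if (len < width) {
        if (align == ALIGN_RIGHT)
            pad = width - len;
        else if (align == ALIGN_CENTER)
            pad = (width - len) / 2;
    }
    return snprintf(out, size, "%*s%s", pad, "", str);
}

/* (B) Bare version: forwards to (A) with the default values. */
int write_plain(char *out, size_t size, const char *str)
{
    return write_aligned(out, size, str, 0, ALIGN_LEFT);
}
```

A call like write_aligned(buf, sizeof buf, "hi", 6, ALIGN_RIGHT) also reads better than a series of true/false flags, which is the point about enumerations.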
I've run into exactly this situation a number of times -- my preference is none of the above, but instead to use a single formatter object. I can supply it with the number of arguments necessary to specify a particular format.
One major advantage of this is that I can create objects that specify logical formats instead of physical formats. This allows, for example, something like:
Format title = {upper_case, centered, bold};
Format body = {lower_case, left, normal};
write(title, "This is the title");
write(body, "This is some plain text");
Decoupling the logical format from the physical format gives you roughly the same kind of capabilities as a style sheet. If you want to change all your titles from italic to bold-face, change your body style from left justified to fully justified, etc., it becomes relatively easy to do that. With your current code, you're likely to end up searching through all your code and examining "by hand" to figure out whether a particular lower-case, left-justified item is body-text that you want to re-format, or a foot-note that you want to leave alone...
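In plain C, the idea could be sketched roughly as follows. The Format struct layout, the enum names and the render function are assumptions for this illustration, and render applies only the case conversion to stay short:

```c
#include <ctype.h>
#include <stddef.h>

enum text_case { lower_case, upper_case };
enum alignment { left, centered };
enum weight    { normal, bold };

typedef struct {
    enum text_case text_case;
    enum alignment alignment;
    enum weight    weight;
} Format;

/* Logical formats: defined once, referred to by role everywhere else. */
const Format title = { upper_case, centered, bold };
const Format body  = { lower_case, left, normal };

/* Hypothetical renderer: copies text into out (size bytes), converting
   the case as the format asks. Alignment and weight are ignored here. */
void render(Format f, const char *text, char *out, size_t size)
{
    size_t i = 0;

    if (size == 0)
        return;
    for (; text[i] != '\0' && i + 1 < size; ++i) {
        unsigned char ch = (unsigned char)text[i];
        out[i] = (char)(f.text_case == upper_case ? toupper(ch) : tolower(ch));
    }
    out[i] = '\0';
}
```

Changing how all titles look then means editing the one definition of title, not every call site.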
As you already mentioned, one striking point is readability: writeInUpperCaseAndCentered("Foobar!") is much easier to understand than write("Foobar!", true, true), although you could eliminate that problem by using enumerations. On the other hand, having arguments avoids awkward constructions like:
if(foo)
writeInUpperCaseAndCentered("Foobar!");
else if(bar)
writeInLowerCaseAndCentered("Foobar!");
else
...
In my humble opinion, this is a very strong argument (no pun intended) for the argument way.
I suggest more cohesive functions as opposed to superfunctions that can do all kinds of things unless a superfunction is really called for (printf would have been quite awkward if it only printed one type at a time). Signature redundancy should generally not be considered redundant code. Technically speaking it is more code, but you should focus more on eliminating logical redundancies in your code. The result is code that's much easier to maintain with very concise, well-defined behavior. Think of this as the ideal when it seems redundant to write/use multiple functions.
public void Method() {
}
or
public void Method()
{
}
Besides personal preference, is there any benefit of one style over another? I used to swear by the second style, though I now use the first for work and personal projects.
As for readability, imagine code inside those methods: if/else and so on.
Google C++ Style Guide suggests
Return type on the same line as function name, parameters on the same line if they fit.
Functions look like this:
ReturnType ClassName::FunctionName(Type par_name1, Type par_name2) {
DoSomething();
...
}
WebKit Coding Style Guidelines suggests
Function definitions: place each brace on its own line.
Right:
int main()
{
...
}
Wrong:
int main() {
...
}
They suggest braces-on-same-line for everything else, though.
GNU Coding Standards suggests
It is important to put the open-brace that starts the body of a C function in column one, so that they will start a defun. Several tools look for open-braces in column one to find the beginnings of C functions. These tools will not work on code not formatted that way.
Avoid putting open-brace, open-parenthesis or open-bracket in column one when they are inside a function, so that they won't start a defun. The open-brace that starts a struct body can go in column one if you find it useful to treat that definition as a defun.
It is also important for function definitions to start the name of the function in column one. This helps people to search for function definitions, and may also help certain tools recognize them. Thus, using Standard C syntax, the format is this:
static char *
concat (char *s1, char *s2)
{
...
}
or, if you want to use traditional C syntax, format the definition like this:
static char *
concat (s1, s2) /* Name starts in column one here */
char *s1, *s2;
{ /* Open brace in column one here */
...
}
As you can see, everybody has their own opinions. Personally, I prefer the Perl-ish braces-on-same-line-except-for-else, but as long as everybody working on the code can cooperate, it really doesn't matter.
I think it is completely subjective, however, I think it is important to establish code standards for your team and have everyone use the same style. That being said I like the second one (and have made my team use it) because it seems easier to read when it is not your code.
In the old days we used to use the first style (K & R style) because screens were smaller and code was often printed onto this stuff called paper.
These days we have big screens, and the second method (ANSI style) makes it easier to see if your brackets match up.
The first one is smaller in terms of number of lines (maybe that is why development books, Java ones in particular, tend to use that syntax).
The second one is, IMHO, easier to read, as you always have two aligned brackets.
Anyway both of them are widely used, it's a matter of your personal preferences.
I use the if statement as something to reason on in this highly emotive subject.
if (cond) {
//code
}
Just ask what the else statement looks like. The logical extension of the above is:
if (cond) {
//code
} else {
//more code
}
Is that readable? I don't think so, and it's just plain ugly too.
More lines != less readable. Hence I'd go with your latter option.
Personally I find the second one more readable (aligned curlys).
It's always easiest for a team to go with the defaults, and since Visual Studio and I agree on this, that's my argument. ;-)
Your lines of code count will be considerably less with the first option. :)
I have a legacy C code base at work, and I find a lot of function implementations in the style below.
char *DoStuff(char *inPtr, char *outPtr, char *error, long *amount)
{
*error = 0;
*amount = 0;
// Read bytes from inPtr and decode them as a long storing in amount
// before returning as a formatted string in outPtr.
return (outPtr);
}
Using DoStuff:
myOutPtr = DoStuff(myInPtr, myOutPtr, myError, &myAmount);
I find that pretty obtuse and when I need to implement a similar function I end up doing:
long NewDoStuff(char *inPtr, char *error)
{
long amount = 0;
*error = 0;
// Read bytes from inPtr and decode them as a long storing in amount.
return amount;
}
Using NewDoStuff:
myAmount = NewDoStuff(myInPtr, myError);
myOutPtr += sprintf(myOutPtr, "%ld", myAmount); /* myAmount is a long, so %ld */
I can't help but wonder whether there is something I'm missing with the top example. Is there a good reason to use that type of approach?
One advantage is that if you have many, many calls to these functions in your code, it will quickly become tedious to have to repeat the sprintf calls over and over again.
Also, returning the out pointer makes it possible for you to do things like:
DoOtherStuff(DoStuff(myInPtr, myOutPtr, myError, &myAmount), &myOther);
With your new approach, the equivalent code is quite a lot more verbose:
myAmount = NewDoStuff(myInPtr, myError);
myOutPtr += sprintf(myOutPtr, "%ld", myAmount);
myOther = DoOtherStuff(myInPtr, myError);
myOutPtr += sprintf(myOutPtr, "%d", myOther);
It is the C standard library style. The return value is there to aid chaining of function calls.
Also, DoStuff is cleaner IMO. And you really should be using snprintf. With DoStuff, a change in the internals of buffer management does not affect your code. However, this is no longer true with NewDoStuff.
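A minimal sketch of what a bounded append with snprintf might look like, assuming the caller tracks the end of the buffer (append_long and its parameters are hypothetical, not part of the code in the question):

```c
#include <stdio.h>

/* Hypothetical helper: appends value in decimal at *pos without writing
   past end. Returns 0 on success and advances *pos; returns -1 if the
   text did not fit, leaving *pos where it was. */
int append_long(char **pos, char *end, long value)
{
    size_t room = (size_t)(end - *pos);
    int n = snprintf(*pos, room, "%ld", value);

    if (n < 0 || (size_t)n >= room)
        return -1;  /* output error, or the text was truncated */
    *pos += n;
    return 0;
}
```

Keeping the size bookkeeping inside one function like this also means the buffer handling can change later without touching every caller.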
The code you presented is a little unclear (for example, why are you adding the result of sprintf to myOutPtr?).
However, in general, what you seem to be describing is the breakdown of one function that does two things into a function that does one thing plus code that does something else (the concatenation).
Separating responsibilities into two functions is a good idea. However, you would want a separate function for the concatenation and formatting; as written, it's really not clear.
In addition, every time you break a function call into multiple calls, you are creating code replication. Code replication is never a good idea, so you would need a function to do that, and you will end up (this being C) with something that looks like your original DoStuff.
So I am not sure that there is much you can do about this. One of the limitations of non-OOP languages is that you have to send huge amounts of parameters (unless you used structs). You might not be able to avoid the giant interface.
If you wind up having to do the sprintf call after every call to NewDoStuff, then you are repeating yourself (and therefore violating the DRY principle). When you realize that you need to format it differently you will need to change it in every location instead of just the one.
As a rule of thumb, if the interface to one of my functions exceeds 110 columns, I look strongly at using a structure (and at whether I'm taking the best approach). What I don't (ever) want to do is take a function that does 5 things and break it into 5 functions, unless some functionality within the function is not only useful, but needed on its own.
I would favor the first function, but I'm also quite accustomed to the standard C style.