Can clang-format break my code? - c

As clang-format is a tool to only reformat code, is it possible that such formatting can break working code or at least change how it works? Is there some kind of contract that it will/can not change how code works?
We have a lot of code that we want to format with clang-format. This means, many lines of code will change. Not having to review every single line of code that only changed due to a clang-format would be a big simplification of this process.
I would say that clang-format will not change how code works. On the other hand I am not 100% sure, if this can be guaranteed.

Short answer: YES.
The clang-format tool has a -sort-includes option. Changing the order of #include directives can definitely change the behavior of existing code, and may break existing code.
Since the corresponding SortIncludes option is set to true by several of the built-in styles, it might not be obvious that clang-format is going to reorder your includes.
MyStruct.h:
struct MyStruct {
uint8_t value;
};
original.c:
#include <stdint.h>
#include <stddef.h>
#include "MyStruct.h"
int main (int argc, char **argv) {
struct MyStruct s = { 0 };
return s.value;
}
Now let's say we run clang-format -style=llvm original.c > restyled.c.
restyled.c:
#include "MyStruct.h"
#include <stddef.h>
#include <stdint.h>
int main(int argc, char **argv) {
struct MyStruct s = {0};
return s.value;
}
Due to the reordering of the header files, I get the following error when compiling restyled.c:
In file included from restyled.c:1:
./MyStruct.h:2:5: error: unknown type name 'uint8_t'
uint8_t value;
^
1 error generated.
However, this issue should be easy to work around. It's unlikely that you have order-dependent includes like this, but if you do, you can fix the problem by putting a blank line between groups of headers that require a specific order, since apparently clang-format only sorts groups of #include directives with no non-#include lines in between.
fixed-original.c:
#include <stdint.h>
#include <stddef.h>

#include "MyStruct.h"
int main (int argc, char **argv) {
struct MyStruct s = { 0 };
return s.value;
}
fixed-restyled.c:
#include <stddef.h>
#include <stdint.h>

#include "MyStruct.h"
int main(int argc, char **argv) {
struct MyStruct s = {0};
return s.value;
}
Note that stdint.h and stddef.h were still reordered since their includes are still "grouped", but that the new blank line prevented MyStruct.h from being moved before the standard library includes.
However...
If reordering your #include directives breaks your code, you should probably do one of the following anyway:
Explicitly include the dependencies for each header in the header file. In my example, I'd need to include stdint.h in MyStruct.h.
Add a comment line between the include groups that explicitly states the ordering dependency. Remember that any non-#include line should break up a group, so comment lines work as well. The comment line in the following code also prevents clang-format from including MyStruct.h before the standard library headers.
alternate-original.c:
#include <stdint.h>
#include <stddef.h>
// must come after stdint.h
#include "MyStruct.h"
int main (int argc, char **argv) {
struct MyStruct s = { 0 };
return s.value;
}
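The first option above (making the header self-contained) might look like the following. This is only a sketch: the include guard name MYSTRUCT_H is an assumption, since the original MyStruct.h shows no guard.

```c
/* MyStruct.h made self-contained (sketch; the guard name MYSTRUCT_H is assumed) */
#ifndef MYSTRUCT_H
#define MYSTRUCT_H

#include <stdint.h> /* the header pulls in its own dependency for uint8_t */

struct MyStruct {
    uint8_t value;
};

#endif /* MYSTRUCT_H */
```

With a header like this, the position of #include "MyStruct.h" relative to the standard headers should no longer matter, so sorting the includes becomes harmless.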

It certainly can change how your code works, because a C program can observe some properties of its own source text. I am thinking of the __LINE__ macro, though there may be other ways as well.
Consider 1.c:
#include <stdio.h>
int main(){printf("%d\n", __LINE__);}
Then:
> clang 1.c -o 1.exe & 1.exe
2
Now do some clang-format:
> clang-format -style=Chromium 1.c >2.c
And 2.c is:
#include <stdio.h>
int main() {
printf("%d\n", __LINE__);
}
And, of course, output has changed:
> clang 2.c -o 2.exe & 2.exe
3
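To isolate the effect, here is a small sketch with two hypothetical helper functions (the names are made up for illustration). Both contain the same statements; only the line layout differs, which is exactly the kind of change a formatter makes.

```c
/* Hypothetical helpers: same tokens, different line layout. */

// clang-format off
/* Both reads of __LINE__ share one physical line, so the difference is 0. */
int line_delta_same_line(void) { int a = __LINE__; int b = __LINE__; return b - a; }
// clang-format on

/* Once the statements sit on separate lines, the difference becomes 1. */
int line_delta_split(void) {
    int a = __LINE__;
    int b = __LINE__;
    return b - a;
}
```

Code that depends on __LINE__ in this way is rare, but it shows that "whitespace only" changes are not always invisible to the program.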

Provided clang-format changes only whitespace characters (which holds when options such as include sorting are off), you can check that the files before and after clang-formatting are identical up to whitespace. On Linux/BSD/OS X you can use diff and tr for that:
$ diff --ignore-all-space <(tr '\n' ' ' < 2.c ) <(tr '\n' ' ' < 1.c)
1.c:
#include <stdio.h>
int main() {printf("Hello, world!\n"); return 0;}
2.c:
#include <stdio.h>
int main() {
printf("Hello, world!\n");
return 0;
}
The output of the diff command is empty, meaning that files 1.c and 2.c are identical up to whitespace.
As Karoly mentioned in his comment, strictly speaking you still have to check whitespace that matters, e.g. inside string literals. But in the real world I believe this test is more than enough.

clang-format reformatted ASM code in a project because we effectively did this:
#define ASM _asm
ASM {
...
}

yes
it will not break the working flow
The VS Code C/C++ extension has a config switch for this:
"C_Cpp.clang_format_sortIncludes": false,
but it did not work for me, and I don't know why.
My version is ms-vscode.cpptools-0.13.1.
My workaround, to keep a stable workflow, is to wrap the code in these comments:
// clang-format off
...here is your code
// clang-format on
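As an illustration of these markers in a real file, here is a sketch with a hand-aligned table. The array and function names are hypothetical. Only the region between the two comments is left alone; the rest of the file is still formatted.

```c
// clang-format off
/* Hand-aligned on purpose; the formatter should leave this region untouched. */
const int identity3[3][3] = {
    { 1, 0, 0 },
    { 0, 1, 0 },
    { 0, 0, 1 },
};
// clang-format on

/* Sum of the diagonal of a 3x3 matrix (hypothetical helper, formatted normally). */
int trace3(const int m[3][3]) {
    int sum = 0;
    for (int i = 0; i < 3; ++i) {
        sum += m[i][i];
    }
    return sum;
}
```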

It can break your code, if you use special constructs in your code and your settings for formatting.
Inline Assembler
If you normally compile your code with gcc and make use of gcc-style inline assembler, clang-format will very likely break the naming of register variables, as it sees the %-character as an operator.
asm_movq(%[val2], %%mm0)
will be reformatted as
asm_movq(% [val2], % % mm0)
which will no longer compile.
Constructing a Path in a macro
If you build up a path using macros without using strings, clang-format again will see the '/' character as an operator and will put spaces around it.
Boost e.g. uses a construct like this:
# define AUX778076_PREPROCESSED_HEADER \
BOOST_MPL_CFG_COMPILER_DIR/BOOST_MPL_PREPROCESSED_HEADER
to construct a path to a header file. The '/' is not an operator here, but as it is not inside a string, clang-format treats it as an operator and puts spaces around it, creating a different path.
The include of the header file will obviously fail.
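The effect can be reproduced without Boost. The sketch below is a simplification under stated assumptions: STRINGIZE is a hypothetical one-level macro and the path gcc/vector.hpp is invented. It turns the same path into a string literal twice, once as written and once with the spaces a formatter would add around '/'.

```c
/* Hypothetical one-level stringizing macro; its argument is not macro-expanded. */
#define STRINGIZE(x) #x

// clang-format off
/* As written by hand: no whitespace between the tokens. */
const char *path_as_written(void) { return STRINGIZE(gcc/vector.hpp); }
// clang-format on

/* After a formatter treats '/' as a binary operator. */
const char *path_after_spacing(void) { return STRINGIZE(gcc / vector.hpp); }
```

The first function yields "gcc/vector.hpp" and the second "gcc / vector.hpp", because stringizing keeps whitespace between tokens as a single space. A header with the second name will normally not exist.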
Conclusion
Yes, clang-format can break your code. If you are using very specific constructs that are edge cases or outside of the language standard or simply extensions of your very specific compiler (which is not clang), then you will need to check the changes made by clang-format. Otherwise you risk getting hidden errors.

I imagine it would not, given that it is built on clang's static analysis and therefore has knowledge of the structure of the code itself, rather than being a dumb source code formatter that operates on the text alone (one of the boons of being able to use a compiler library). Given that the formatter uses the same parser and lexer as the compiler itself, I'd feel safe enough that it wouldn't have any issue spitting out code that behaves the same as what you feed it.
You can see the source code for the C++ formatter here: http://clang.llvm.org/doxygen/Format_8cpp_source.html


When should I use preprocessor directives over if statements

I am sorry if this sounds like a dumb question, I am learning C, and I was wondering: when should I prioritize this syntax, for example
#include <stdio.h>
#define ALIVE 1
int main(void) {
#if ALIVE
printf("Alive");
#else
printf("Unalived");
#endif
}
Over this syntax (example):
#include <stdio.h>
#define ALIVE 1
int main(void) {
if (ALIVE)
printf("Alive");
else
printf("Unalived");
}
Thank you for spending time reading my question, I hope this isn't a dumb question and I wish you a nice day.
For starters, and to see the main difference between the two programs you show, let's see how they will look after preprocessing.
The first one:
#include <stdio.h>
#define ALIVE 1
int main(void) {
#if ALIVE
printf("Alive");
#else
printf("Unalived");
#endif
}
will expand to
#include <stdio.h>
int main(void) {
printf("Alive");
}
The second one:
#include <stdio.h>
#define ALIVE 1
int main(void) {
if (ALIVE)
printf("Alive");
else
printf("Unalived");
}
will expand to:
#include <stdio.h>
int main(void) {
if (1)
printf("Alive");
else
printf("Unalived");
}
[Examples above skip the actual inclusion of the header file]
While this doesn't directly answer your question, it hints at what preprocessor conditional compilation is for.
The main use is to conditionally give the compiler different code depending on macros. It is mostly used for portability, when creating programs that need to be built for different systems (for example Linux and Windows).
When using the preprocessor to do conditional compilation, whole parts of the code, that would otherwise be invalid on the target system, could simply be omitted and the compiler won't even see it.
If you use the standard C if statement, then both branches of the condition must be valid code that the compiler can build.
As a rule of thumb, you should always prefer if (...) over #if, and should only use #if where if will not work:
when there's something in the if that is syntactically incorrect when the condition is false, so it won't compile at all
when you need to do this at the global scope, where statements (like if) are not allowed
In your example, the if version is much better.
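The cases listed above can be sketched in one small file. Apart from ALIVE, which comes from the question, every name here is invented for illustration.

```c
#define ALIVE 1

/* File scope: statements such as `if` are not allowed here, so #if is needed. */
#if ALIVE
const char *const state_name = "Alive";
#else
const char *const state_name = "Unalived";
#endif

/* Inside a function an ordinary if works; both branches are compiled and checked. */
const char *describe(int alive) {
    if (alive)
        return "Alive";
    else
        return "Unalived";
}

/* The disabled branch calls a function that is declared nowhere.
 * With #if the compiler never sees it; with a plain if it would not compile. */
int answer(void) {
#if ALIVE
    return 42;
#else
    return function_that_does_not_exist_here();
#endif
}
```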

C header file is causing warning "ISO C requires a translation unit to contain at least one declaration"

Using Qt Creator I made these plain C files just to test my understanding:
main.c
#include <stdio.h>
#include "linked.h"
int main()
{
printf("Hello World!\n");
printf("%d", linked());
return 0;
}
linked.h
#ifndef LINKED_H_
#define LINKED_H_
int linked(void);
#endif // LINKED_H
linked.c
int linked()
{
return 5;
}
The IDE shows a warning on the line of linked.h in-between #define LINKED_H_ and int linked(void); which reads
ISO C requires a translation unit to contain at least one declaration
My best guess about what this means is that any header or other C file, if it is in a project, should get used in the main file at least once somewhere. I've tried searching the warning but if this has been answered elsewhere, I'm not able to understand the answer. It seems to me I've used the linked function and so it shouldn't give me this warning. Can anyone explain what's going on?
The program compiles and runs exactly as expected.
I think the issue is that you don't #include "linked.h" from linked.c. The current linked.c file doesn't have any declarations; it only has one function definition.
To fix this, add this line to linked.c:
#include "linked.h"
I don't know why it says this is an issue with linked.h, but it seems to be quite a coincidence that the line number you pointed out just happens to be the line number of the end of linked.c.
Of course, that may be all this is; a coincidence. So, if that doesn't work, try putting some sort of external declaration in this file. The easiest way to do that is to include a standard header, such as stdio.h. I would still advise you to #include "linked.h" from inside linked.c, though.
add a header
#ifndef LINKED_H_
#define LINKED_H_
#include <stdio.h>
int linked(void);
#endif // LINKED_H
The way you wrote the code, you need to use:
extern int linked(void);
(notice the additional "extern"). That might help with the issue.
Also, the code in linked.c should be:
int linked(void)
{
return 5;
}
(Notice the "parameter" - "void").
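Putting the two suggestions together, the declaration and the definition would look roughly like this. It is a single-file sketch; in the real project the first part lives in linked.h and linked.c starts with #include "linked.h".

```c
/* linked.h (sketch): the prototype that every user of linked() sees */
int linked(void);

/* linked.c (sketch): in the real file this part would begin with
 * #include "linked.h", so the compiler can check that the definition
 * below agrees with the prototype. */
int linked(void)
{
    return 5;
}
```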
According to IBM, you need some declaration in the header file, but you do have one. Perhaps LINKED_H_ is defined elsewhere, or the compiler sees that the preprocessor condition could result in an empty translation unit.
Perhaps this header file will work for you:
linked.h
#ifndef LINKED_H_
#define LINKED_H_
int linked(void);
#endif // LINKED_H
char __allowLinkedHToBeIsoCCompliant = 1;

Include from preprocessor macro

I am trying to include a file whose name is constructed from preprocessor macros, but I am running into a wall, apparently because of the rules regarding tokens. I used the answer here as a reference: Concatenate string in C #include filename, but my case differs in that there are decimal points in the define I am using to construct my include. This is what I have currently; it does not get through the preprocessor stage:
main.c:
#include <stdio.h>
#include <stdlib.h>
#define VERSION 1.1.0
#define STRINGIFY(arg) #arg
#define INCLUDE_HELPER(arg) STRINGIFY(other_ ##arg.h)
#define INCLUDE_THIS(arg) INCLUDE_HELPER(arg)
#include INCLUDE_THIS(VERSION)
int main(int argc, char **argv) {
printf(INCLUDE_THIS(VERSION));
fflush(stdout);
#if defined (SUCCESS)
printf("\nSUCCESS!\n");
#endif
return EXIT_SUCCESS;
}
other_1.1.0.h:
#define SUCCESS
Were I to use #define VERSION 1_1_0 and rename the header accordingly, it would work, but that is not viable for me because I have no control over the names of the header files the actual project uses. As far as I can tell, 1.1.0 is not a valid preprocessor token.
EDIT:
After a bit more digging through the documentation, I see that 1.1.0 is a valid preprocessing number; it is the resulting concatenation of other_1.1.0 that is invalid. Regardless, the issue of not being able to construct the include remains.
It's easy once you stop thinking about token concatenation. Stringification works with any sequence of tokens, so there is no need to force its argument into being a single token. You do need an extra indirection so that the argument is expanded, but that's normal.
The only trick is to write the sequence without whitespace, which is what ID is for:
#define STRINGIFY(arg) STRINGIFY_(arg)
#define STRINGIFY_(arg) #arg
#define ID(x) x
#define VERSION 1.1.0
#include STRINGIFY(ID(other_)VERSION.h)
See https://stackoverflow.com/a/32077478/1566221 for a longer explanation.
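To see what the macros above produce without needing the header on disk, the same expression can be used as an ordinary string literal. This is only a sketch: the function name is hypothetical, and the exact whitespace behaviour is what GCC and Clang do in practice rather than something I would claim for every preprocessor.

```c
#define STRINGIFY(arg) STRINGIFY_(arg)
#define STRINGIFY_(arg) #arg
#define ID(x) x
#define VERSION 1.1.0

/* Hypothetical helper: returns the file name that
 * #include STRINGIFY(ID(other_)VERSION.h) would ask for. */
const char *version_header_name(void)
{
    return STRINGIFY(ID(other_)VERSION.h);
}
```

ID(other_) is needed because writing other_VERSION directly would be read as a single identifier, so VERSION would never be expanded.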
With some experimentation, I came up with a solution that, while not ideal, could be workable.
#define VERSION _1.1.0
#define STRINGIFY(arg) #arg
#define INCLUDE_HELPER(arg) STRINGIFY(other ##arg.h)
#define INCLUDE_THIS(arg) INCLUDE_HELPER(arg)
#include INCLUDE_THIS(VERSION)
Rather than pasting other_ and 1.1.0 together, I am pasting other and _1.1.0. I am not sure why this is acceptable as the resulting token is the same, but there it is.
I would still prefer to have a solution that allows me to just define the version number without the underscore, so I will hold off on accepting this answer in case someone can come up with a more elegant solution (and works for people who don't happen to need an underscore anyways)
If you are passing -DVERSION=1.1.0 as a compile-line parameter, rather than hard-wiring it in the source code, then there's nothing to stop you passing a second define using make or the shell to do the concatenation. For example, in a makefile, you might have:
VERSION = 1.1.0
VERSION_HEADER = other_${VERSION}.h
CFLAGS += -DVERSION=${VERSION} -DVERSION_HEADER=${VERSION_HEADER}
and then:
#include <stdio.h>
#include <stdlib.h>
#define STRINGIFY(arg) #arg
#define INCLUDE_HELPER(arg) STRINGIFY(arg)
#define INCLUDE_THIS(arg) INCLUDE_HELPER(arg)
#include INCLUDE_THIS(VERSION_HEADER)
int main(void)
{
printf("%s\n", INCLUDE_THIS(VERSION));
#if defined (SUCCESS)
printf("SUCCESS!\n");
#endif
return EXIT_SUCCESS;
}
which is basically your code with the #define VERSION line removed, and using the stringified version of VERSION_HEADER instead of trying to construct the header name in the source code. You might want to use:
#ifndef VERSION
#define VERSION 1.1.0
#endif
#ifndef VERSION_HEADER
#define VERSION_HEADER other_1.1.0.h
#endif
for some suitable default fallback version in case the person running the compilation doesn't specify the information on the command line. Or you might use #error You did not set -DVERSION=x.y.z on the command line instead of setting the default value.
When compiled (source file hdr59.c):
$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror -DVERSION=1.1.0 \
> -DVERSION_HEADER=other_1.1.0.h hdr59.c -o hdr59
$ ./hdr59
1.1.0
SUCCESS!
$
I would put the three lines of macro and the #include line into a separate small header so that it can be included when the version header is needed. If the default setting is required too, then that adds to the importance of putting the code into a separate header for reuse. The program's source code might contain:
#include "other_version.h"
and that header would arrange to include the correct file, more or less as shown.
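A rough sketch of what such a wrapper header could contain is shown below. So that the block compiles on its own, it exposes the computed name through a hypothetical function instead of performing the #include; the real header would end with #include INCLUDE_THIS(VERSION_HEADER).

```c
/* other_version.h (sketch) */
#ifndef OTHER_VERSION_H /* guard name is an assumption */
#define OTHER_VERSION_H

/* Fallback defaults when nothing is passed with -D on the command line. */
#ifndef VERSION
#define VERSION 1.1.0
#endif
#ifndef VERSION_HEADER
#define VERSION_HEADER other_1.1.0.h
#endif

#define STRINGIFY(arg) #arg
#define INCLUDE_HELPER(arg) STRINGIFY(arg)
#define INCLUDE_THIS(arg) INCLUDE_HELPER(arg)

/* The real header would now do: #include INCLUDE_THIS(VERSION_HEADER) */

#endif /* OTHER_VERSION_H */

/* Hypothetical helper, only here to show which file name gets selected. */
const char *selected_header(void)
{
    return INCLUDE_THIS(VERSION_HEADER);
}
```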

C error: conflicting types for function and previous declaration was here (not duplicate)

Apologies for the dumb question. I checked all similar questions for the same error on stackoverflow, but it didn't help me understand why this error is happening in the following code.
I have one additional header file and a source file, which is included in the main file, and when I compile, I am getting the following error. I am trying to pass the char** argv from the main() to another function defined in another header file.
#include "include/Process.h"
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv) {
if (argc < 2) {
printf("Please provide a path to file\n");
return (EXIT_FAILURE);
}
Process(argv);
return (EXIT_SUCCESS);
}
Process.h:
#pragma once
extern void Process(char** path);
Process.c:
#include <stdio.h>
#include "../include/Process.h"
#include <stdlib.h>
#include <sys/stat.h>
#include <syslog.h>
#include <sys/types.h>
#include <unistd.h>
void Process(char** path) {
printf("%s\n", path[1]);
}
It gets compiled but the warning is
./src/Process.c:22:6: error: conflicting types for ‘Process’
void Process(char** path) {
^
./include/Process.h:17:6: note: previous declaration of ‘Process’ was here
extern void Process(char** path);
^
However, the warning disappears when I change the type of path from char** to char* and pass argv[1] instead of argv.
I am clueless as to why this is happening. Following this similar post, I tried adding a forward declaration for char** path above extern void Process(char** path); in the Process.h file, but it didn't help either.
Why is this error thrown when using char** path?
Why does it disappear when I use char* path?
So far, I am able to see the program running, even with this warning. Is it safe to ignore this warning? If not, what could be the possible effects it can have during runtime?
Using gcc version 4.9.2 (Ubuntu 4.9.2-10ubuntu13)
Thanks.
Try putting your custom includes after the system includes.
It is possible that the custom include defines a macro which interferes with the system includes. To minimize the risk of this, I always put the Standard C includes first, then any OS includes, then third-party libraries, and then my own.
In theory the custom include shouldn't do this, and the system includes should only use reserved names, but in practice this doesn't always happen.

Header Files in C

I have been reading about C for a while now and decided to write a little add program, nothing fancy at all. My understanding of C headers is that they are "interfaces" (as in Java and other languages), but where you can also define variables that either have set values or not.
So I wrote this:
#include <stdio.h>
#include <stdlib.h>
#include "sample.h"
int main(int argc, char** argv) {
printf("hello World\n");
add(12, 18);
return (EXIT_SUCCESS);
}
int add(int a, int b){
int value = a+b;
printf("value = %d\n", value);
return 0;
}
It has a header file that looks like such:
#ifndef SAMPLE_H_GUARD
#define SAMPLE_H_GUARD
int add(int a, int b);
#endif
I thought header files (and this is where I am lost on their definition) were supposed to define the use of add, so all I would have to do is call add. From my understanding, I define the rules of add and then implement the functionality of add.
Also, a lot of the material I have read shows one header file for multiple C files, whereas a lot of projects today have one header per C file, meaning Sample.h belongs to Sample.c and nothing else.
Can someone shed some light on this?
Could I have done this like so:
main.c
#include <stdio.h>
#include <stdlib.h>
#include "sample.h"
int main(int argc, char** argv) {
printf("hello World\n");
add(12, 18);
return (EXIT_SUCCESS);
}
add.c
#include <stdio.h>
#include <stdlib.h>
#include "sample.h"
int add(int a, int b){
int value = a+b;
printf("value = %d\n", value);
return 0;
}
sample.h
#ifndef SAMPLE_H_GUARD
#define SAMPLE_H_GUARD
int add(int a, int b);
#endif
I believe the book I was reading, C Programming Language, had a calculator example split up like this. My question is: how does C know where add is defined? It knows the rules for it based on the header file, I think, but not where the actual implementation is.
Their example, where they split up the files like this, does not have something like #include "add.c"; all they do is include the header file in the files that either implement or use this functionality.
Note: obviously the calculator example and my example are going to be different but fundamentally the same, for those who have the book. I am just lost on how to use header files effectively and efficiently.
A header file in C would declare the function add for those modules that need it, but not define the function. The function is still to be defined in its own module (e.g., in your case, add.c).
So in general, to make a function foo available to several modules, you would normally:
Choose a header file (maybe its own if there are other associated defines, etc.) to declare foo. For example, perhaps foo.h would have void foo(...);
In some module, perhaps foo.c, you would define the complete function foo.
In any module that wants to call foo, you would #include "foo.h" (or whatever header you used) and call the function.
When you compile/link the code, you would make sure all modules, including foo.o or whatever module has foo defined in it, were present.
A declaration, given in the header file, provides (of course) the function name, the function return type as well as listing all the parameters and their types. This is all the compiler needs to know to figure out how to call the function from the calling module. At link time, addresses are all resolved so that the modules then know exactly where the function is in its own particular module.
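Applied to the question's add example, the split described above can be sketched in a single listing. The file boundaries are marked with comments. As a deliberate variation, add returns the sum here instead of printing it, and use_add is a hypothetical stand-in for the caller in main.c.

```c
/* ---- sample.h: declaration only ---- */
#ifndef SAMPLE_H_GUARD
#define SAMPLE_H_GUARD
int add(int a, int b);
#endif

/* ---- add.c: the single definition (would start with #include "sample.h") ---- */
int add(int a, int b)
{
    return a + b;
}

/* ---- main.c: a caller (would also start with #include "sample.h") ---- */
int use_add(void)
{
    /* The compiler only needs the declaration here; the linker
     * later connects this call to the definition in add.o. */
    return add(12, 18);
}
```

With separate files, both main.c and add.c are passed to the compiler (or their object files to the linker); nothing ever needs to #include "add.c".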
My understanding of C headers is that they are "interfaces" (such as like java and other languages) but where you can also define variable that either have set values or not..
This is not correct. You should not define variables in a header. You can, but it will typically cause a multiple-definition error at link time as soon as the header is included in more than one source file.
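A common way to share a variable through a header is to declare it with extern in the header and define it in exactly one .c file. The sketch below shows both parts in one listing; the file names and identifiers are hypothetical.

```c
/* ---- counter.h (hypothetical): declarations only, safe to include anywhere ---- */
extern int shared_counter; /* declaration: no storage is created here */
int bump_counter(void);

/* ---- counter.c (hypothetical): the one and only definition ---- */
int shared_counter = 0;

/* Increment the shared counter and return its new value. */
int bump_counter(void)
{
    return ++shared_counter;
}
```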
Could I have done this like so:
Regarding your code: both variants are correct. The compiler reads declarations from headers, but headers themselves are optional. You can split your code into as many .h and .c files as you want. The compiler creates an object file for each .c file. All .h files included in a .c file are basically pasted into that file before compilation, i.e. in the preprocessing phase. The linker then comes into the picture and combines the object files to produce the executable.
Please don't hesitate if something is not clear in my answer.
