Understanding the processed hello world source code in C [duplicate] - c

I have a hello world program in C with the following source code:
#include <stdio.h>
#define MESSAGE "Hello, world!"
int main()
{
puts(MESSAGE);
return 0;
}
Now if we preprocess the source code with gcc, the output begins with:
# 1 "hello-world.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "hello-world.c"
# 1 "/usr/include/stdio.h" 1 3 4
# 28 "/usr/include/stdio.h" 3 4
# 1 "/usr/include/features.h" 1 3 4
# 361 "/usr/include/features.h" 3 4
# 1 "/usr/include/sys/cdefs.h" 1 3 4
# 365 "/usr/include/sys/cdefs.h" 3 4
# 1 "/usr/include/bits/wordsize.h" 1 3 4
# 366 "/usr/include/sys/cdefs.h" 2 3 4
# 362 "/usr/include/features.h" 2 3 4
# 385 "/usr/include/features.h" 3 4
# 1 "/usr/include/gnu/stubs.h" 1 3 4
My question is: # 1 is apparently repeated several times. What does this mean? And what do # 28, # 365 and # 385 mean?

Those are source line numbers in the given files. For example, # 28 "/usr/include/stdio.h" 3 4 means that the line following it originated at line 28 of stdio.h.
You can read more about GCC's preprocessor output in the "Preprocessor Output" section of the GCC cpp manual. The format of the lines you've shown is:
# linenum filename flags

Related

C Preprocessor output [duplicate]

For gcc -E sample.c -o sample.i with the following input C program,
#include <stdio.h>
int main() {
printf("hello world\n");
return 0;
}
sample.i contains the following lines beginning with # symbols, and I wonder what exactly these lines mean.
# 1 "sample.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "sample.c"
# 1 "/usr/include/stdio.h" 1 3 4
# 27 "/usr/include/stdio.h" 3 4
...
These lines are markers that help a person identify how the preprocessor handled the various #include <...> directives and other items.
Reading these lines is the equivalent of reading the preprocessor's log messages as it encounters the directives and processes them.
# 1 "sample.c"
Start on line one of the input "sample.c"
# 1 "<built-in>"
Process the compiler's built-in preprocessor definitions (an implementation detail), presented as a fake "file".
# 1 "<command-line>"
Process the definitions given on the command line (again an implementation detail), presented as a fake "file".
# 1 "/usr/include/stdc-predef.h" 1 3 4
Include the stdc-predef.h file, starting at its line 1. Flag 1 says this is the start of a new file, flag 3 says it is a system header (so certain warnings are suppressed), and flag 4 says the text should be treated as wrapped in an implicit extern "C" block.
# 1 "<command-line>" 2
Return from the command line "fake" file.
# 1 "sample.c"
Back in sample.c.
# 1 "/usr/include/stdio.h" 1 3 4
Now starting the file "stdio.h": it is a system header, so suppress the permitted warnings and treat the symbols in the file as C symbols.
# 27 "/usr/include/stdio.h" 3 4
And so on...
The documentation is in the "Preprocessor Output" section of the GCC cpp manual.

r - Constructing a row index based on arrays of starting point and length [duplicate]

I have a table as below
product=c("a","b","c")
min=c(1,5,3)
max=c(1,7,7)
dd=data.frame(product,min,max)
> dd
  product min max
1       a   1   1
2       b   5   7
3       c   3   7
I want to create a table like the one below, with one row for each value between min and max (inclusive) for each product.
product mm
a        1
b        5
b        6
b        7
c        3
c        4
c        5
c        6
c        7
How can I do this in R? Is there a package that would give quick results?
Try
library(data.table)
setDT(dd)[, list(mm=min:max), by = product]
#    product mm
# 1:       a  1
# 2:       b  5
# 3:       b  6
# 4:       b  7
# 5:       c  3
# 6:       c  4
# 7:       c  5
# 8:       c  6
# 9:       c  7
Or a faster option would be seq.int(min, max, 1L), as suggested by @David Arenburg:
setDT(dd)[, list(mm = seq.int(min, max, 1L)), by = product]
Benchmarks
library(stringi)
set.seed(24)
product <- unique(stri_rand_strings(1e5,4))
min1 <- sample(1:10, length(product), replace=TRUE)
max1 <- sample(11:15, length(product), replace=TRUE)
dd <- data.frame(product, min1, max1)
dd2 <- copy(dd)
josilber <- function() {
  res1 <- data.frame(product = rep(dd$product, dd$max1 - dd$min1 + 1),
                     mm = unlist(mapply(seq, dd$min1, dd$max1)))
}
akrun <- function() {
  as.data.table(dd2)[, list(mm = seq.int(min1, max1, 1L)), by = product]
}
Ananda <- function() {
  stack(lapply(split(dd[-1], dd[1]), function(x) seq(x[[1]], x[[2]])))
}
jiber <- function() {
  res <- by(dd[, -1], dd[, 1], function(x) seq(x$min1, x$max1))
  res <- as.data.frame(unlist(res))
  data.frame(product = gsub("[0-9]", "", rownames(res)), mm = res[, 1])
}
system.time(akrun())
# user system elapsed
# 0.129 0.001 0.129
system.time(josilber())
# user system elapsed
# 0.762 0.002 0.764
system.time(Ananda())
# user system elapsed
#45.449 0.191 45.636
system.time(jiber())
# user system elapsed
# 48.013 8.218 56.291
library(microbenchmark)
microbenchmark(josilber(), akrun(), times=20L, unit='relative')
#Unit: relative
#       expr     min       lq     mean   median       uq      max neval cld
# josilber() 6.39757 6.713236 5.570836 5.901037 5.603639 3.970663    20   b
#    akrun() 1.00000 1.000000 1.000000 1.000000 1.000000 1.000000    20  a
With base R, you could do something like:
data.frame(product=rep(dd$product, dd$max-dd$min+1),
mm=unlist(mapply(seq, dd$min, dd$max)))
#   product mm
# 1       a  1
# 2       b  5
# 3       b  6
# 4       b  7
# 5       c  3
# 6       c  4
# 7       c  5
# 8       c  6
# 9       c  7
You could also consider split + lapply + stack:
stack(lapply(split(dd[-1], dd[1]), function(x) seq(x[[1]], x[[2]])))
##   values ind
## 1      1   a
## 2      5   b
## 3      6   b
## 4      7   b
## 5      3   c
## 6      4   c
## 7      5   c
## 8      6   c
## 9      7   c
Just another approach using R base functions
> res <- by(dd[,-1], dd[,1], function(x) seq(x$min, x$max) )
> res <- as.data.frame(unlist(res))
> data.frame(product=gsub("[0-9]", "", rownames(res)), mm=res[,1])
  product mm
1       a  1
2       b  5
3       b  6
4       b  7
5       c  3
6       c  4
7       c  5
8       c  6
9       c  7

What do the numbers mean in the preprocessed .i files when compiling C with gcc?

I am trying to understand the compilation process. We can see the preprocessor's intermediate file by using:
gcc -E hello.c -o hello.i
or
cpp hello.c > hello.i
I roughly know what the preprocessor does, but I have difficulties understanding the numbers in some of the lines. For example:
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "hello.c"
# 1 "/usr/include/stdio.h" 1 3 4
# 27 "/usr/include/stdio.h" 3 4
# 1 "/usr/include/features.h" 1 3 4
# 374 "/usr/include/features.h" 3 4
The numbers can help a debugger display line numbers, so my guess is that the first column is the line number within the file named in the second column. But what do the numbers after the file name mean?
The numbers following the filename are flags:
1: This indicates the start of a new file.
2: This indicates returning to a file (after having included another file).
3: This indicates that the following text comes from a system header file, so certain warnings should be suppressed.
4: This indicates that the following text should be treated as being wrapped in an implicit extern "C" block.
Source: https://gcc.gnu.org/onlinedocs/cpp/Preprocessor-Output.html

What does “# 1 "/usr/include/stdio.h" 1 3 4” mean in "gcc -E" output? [duplicate]

See the following example:
$ cat foo.c
#include <stdio.h>
int main()
{
return 0;
}
$ gcc -E foo.c | head
# 1 "foo.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "foo.c"
# 1 "/usr/include/stdio.h" 1 3 4
# 36 "/usr/include/stdio.h" 3 4
# 1 "/usr/include/sys/feature_tests.h" 1 3 4
# 30 "/usr/include/sys/feature_tests.h" 3 4
#pragma ident "%Z%%M% %I% %E% SMI"
$
I tried Google but I don't know what keywords I should use for searching. Any links to the documentation?
They are linemarkers for identifying which source file and line a particular line of code came from. They can be used to generate more accurate diagnostic messages, for example. Documentation links:
Preprocessor Output
Line Control
You can use the -P option to omit them if you'd like.

What files could have been included?

I would like to be able to get a list of all possible files included in a C source file.
I understand there are complications with other # directives (for instance, an #ifdef could either prevent an include or cause an extra include). All I'm looking for is a list of files that may have been included.
Is there a tool that already does this?
The files I'm compiling are only going to .o, and the standard C libraries are not included. I know that sounds wonky, but we have our reasons.
The reason I want to be able to do this is I want to have a list of files which may have contributed something to the .o, so I can check to see if they have changed.
Quoting the man page for gcc:
-M Instead of outputting the result of preprocessing, output a rule
suitable for make describing the dependencies of the main source
file. The preprocessor outputs one make rule containing the object
file name for that source file, a colon, and the names of all the
included files, including those coming from -include or -imacros
command line options.
This basically does what you want. There are several other related options (all starting with -M) that give you different variants of this output.
my syntax is rusty, but ...
grep -ir "#include " *.c
might work ...
If you use gcc, you can inspect the preprocessor output:
[~]> gcc -E /usr/include/cups/dir.h|grep "#"
# 1 "/usr/include/cups/dir.h"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "/usr/include/cups/dir.h"
# 26 "/usr/include/cups/dir.h"
# 1 "/usr/include/sys/stat.h" 1 3 4
# 73 "/usr/include/sys/stat.h" 3 4
# 1 "/usr/include/sys/_types.h" 1 3 4
# 32 "/usr/include/sys/_types.h" 3 4
# 1 "/usr/include/sys/cdefs.h" 1 3 4
# 33 "/usr/include/sys/_types.h" 2 3 4
# 1 "/usr/include/machine/_types.h" 1 3 4
# 34 "/usr/include/machine/_types.h" 3 4
# 1 "/usr/include/i386/_types.h" 1 3 4
# 37 "/usr/include/i386/_types.h" 3 4
# 70 "/usr/include/i386/_types.h" 3 4
# 35 "/usr/include/machine/_types.h" 2 3 4
# 34 "/usr/include/sys/_types.h" 2 3 4
# 58 "/usr/include/sys/_types.h" 3 4
# 94 "/usr/include/sys/_types.h" 3 4
# 74 "/usr/include/sys/stat.h" 2 3 4
# 1 "/usr/include/sys/_structs.h" 1 3 4
# 88 "/usr/include/sys/_structs.h" 3 4
# 79 "/usr/include/sys/stat.h" 2 3 4
# 152 "/usr/include/sys/stat.h" 3 4
# 228 "/usr/include/sys/stat.h" 3 4
# 248 "/usr/include/sys/stat.h" 3 4
# 422 "/usr/include/sys/stat.h" 3 4
# 27 "/usr/include/cups/dir.h" 2
# 42 "/usr/include/cups/dir.h"
You could do the preprocessor step only. Most compilers allow this.
Of course that would require some busywork reading the resulting file.
All POSSIBLE files? No way to do that.
Alas, simply grepping the source for #include is not guaranteed to be enough, because someone may have committed ...
#define tricksy(foo,bar) <foo##bar>
#define precious tricksy(ios, tream)
#include precious
int main(int, char **)
{
std::cout << "Hobbits!" << std::endl;
return 0;
}
... though you would be able to tell by inspection that something nonstandard was going on with the #include precious because of the missing <> or "".
A non-perverse example would be token-pasting different library root directories depending on command-line definitions.
