C preprocessors - for homework [closed] - c

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 10 years ago.
Assignment:
You are required to implement a C preprocessor. The preprocessor is to be implemented as a command-line tool, the input to which is a C source file (.c extension) and the output is the preprocessed file (.i extension). The tool also takes several options.
$ cppr <options> file.c
On successful processing, file .i is produced.
<options> may be:
Preprocessor options-
-Aassertion -C -dD -dM -dN -Dmacro[=defn] -E -H
-idirafter dir -include file -imacros file
-iprefix file -iwithprefix dir -M -MD -MM -MMD
-nostdinc -P -Umacro -undef
Directory options-
-Bprefix -Idir -I-
Implement any two of the above. This has to be decided during requirements phase.
These are the options defined by the GCC compiler. Refer to the manpage of GCC to understand the options.
You must implement the following features at a minimum:
Stripping off of comments
#ifdef and #endif
#define for constants (not macros)

It is not easy to answer not knowing what exactly you don't understand, but I'll try anyway, using my very limited C experience.
What is a preprocessor?
A preprocessor is a program that does some kind of processing on the code file before it is compiled. You can, for example, define a symbolic constant with a preprocessor directive:
#define PI 3.14159
Then you can use this value with a meaningful name across your code:
area = r * r * PI;
...
circumference = 2 * r * PI;
What the preprocessor does here is replace all occurrences of PI with the numeric value you specified:
area = r * r * 3.14159;
...
circumference = 2 * r * 3.14159;
You can also include code depending on whether or not a constant has already been defined somewhere else in your code (this is typically used in projects with multiple files):
#define WINDOWS
...
#ifdef WINDOWS
/* do Windows-specific stuff here */
#endif
The lines between #ifdef and #endif will only be included if the constant WINDOWS is defined before.
I hope that by now you have some idea about what your program should do.
Tips on implementing the "minimum features"
Here I'm going to give you some ideas on how to write the minimum features your professor requires. These are just off the top of my head, so please think about them first.
Stripping off of comments
While reading the input, look for "/*". When you encounter it, stop writing to the output; when you find the closing "*/", start writing again. Use a flag to indicate whether you are inside a comment. C99 provides bool in <stdbool.h>; in older C, use an int holding 0 or 1 or, better, two symbolic constants like INSIDE_COMMENT and OUTSIDE_COMMENT. Keep in mind that "/*" inside a string literal does not start a comment.
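A minimal sketch of that flag-based approach, assuming the whole input is already in memory as a string. The function name strip_comments and the buffer contract are made up for illustration, and string literals and // comments are deliberately not handled here:

```c
#include <stddef.h>

/* The two states suggested above. */
enum { OUTSIDE_COMMENT, INSIDE_COMMENT };

/* Hypothetical helper: copies src to dst with block comments removed.
   dst must have room for at least strlen(src) + 1 bytes.
   Assumption: no string or character literals contain comment markers. */
void strip_comments(const char *src, char *dst)
{
    int state = OUTSIDE_COMMENT;
    size_t i = 0, j = 0;

    while (src[i] != '\0') {
        if (state == OUTSIDE_COMMENT && src[i] == '/' && src[i + 1] == '*') {
            state = INSIDE_COMMENT;
            i += 2;
        } else if (state == INSIDE_COMMENT && src[i] == '*' && src[i + 1] == '/') {
            state = OUTSIDE_COMMENT;
            dst[j++] = ' ';   /* standard C replaces each comment with one space */
            i += 2;
        } else {
            if (state == OUTSIDE_COMMENT)
                dst[j++] = src[i];
            i++;
        }
    }
    dst[j] = '\0';
}
```

Calling strip_comments("x/*a*/y", out) leaves "x y" in out. Emitting one space per comment keeps tokens on either side from being glued together.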
#define for constants (not macros)
If you encounter a line beginning with #, you should not write it out. When you find a #define directive, store the symbolic name and the value somewhere (both as strings); from then on, look for the name in the input and write out the value each time it is found. Match whole identifiers rather than a fixed number of characters: read a complete name (letters, digits and underscores) from the input, and if it equals a known constant name, write out the value instead. If you want a fixed-size buffer for names, note that C99 only guarantees 63 significant characters in a macro name (C89 guaranteed 31), so choose a limit at least that large.
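One way the lookup-and-replace step could look, as a rough sketch. The names add_define and substitute, the table layout, and the size limits are all assumptions made for this example; string literals and numeric tokens such as 3PI get no special treatment:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEFINES 64   /* arbitrary limits chosen for this sketch */
#define MAX_TEXT    64

struct define { char name[MAX_TEXT]; char value[MAX_TEXT]; };

static struct define defines[MAX_DEFINES];
static size_t define_count = 0;

/* Records one "#define name value" pair; returns 0 on success, -1 if it does not fit. */
int add_define(const char *name, const char *value)
{
    if (define_count == MAX_DEFINES ||
        strlen(name) >= MAX_TEXT || strlen(value) >= MAX_TEXT)
        return -1;
    strcpy(defines[define_count].name, name);
    strcpy(defines[define_count].value, value);
    define_count++;
    return 0;
}

/* Returns the stored value for the identifier name[0..len), or NULL if unknown. */
static const char *lookup(const char *name, size_t len)
{
    for (size_t i = 0; i < define_count; i++)
        if (strlen(defines[i].name) == len &&
            strncmp(defines[i].name, name, len) == 0)
            return defines[i].value;
    return NULL;
}

/* Copies line to out (capacity cap >= 1), replacing whole identifiers that are defined. */
void substitute(const char *line, char *out, size_t cap)
{
    size_t j = 0;

    while (*line != '\0' && j + 1 < cap) {
        if (isalpha((unsigned char)*line) || *line == '_') {
            size_t len = 1;
            while (isalnum((unsigned char)line[len]) || line[len] == '_')
                len++;                       /* read the complete identifier */
            const char *value = lookup(line, len);
            const char *text = value ? value : line;
            size_t n = value ? strlen(value) : len;
            while (n-- > 0 && j + 1 < cap)
                out[j++] = *text++;
            line += len;
        } else {
            out[j++] = *line++;
        }
    }
    out[j] = '\0';
}
```

After add_define("PI", "3.14159"), the line "x = PIE + PI;" becomes "x = PIE + 3.14159;". PIE is left alone because the comparison is on the whole identifier, not on a prefix.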
#ifdef and #endif
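Keep a flag that indicates whether you are inside an #ifdef block that is being skipped, much like with comments. When you find #ifdef, check whether the named constant is among the ones you have stored, and write or suppress the following lines accordingly until the matching #endif. If you want to support nested #ifdef blocks, a depth counter works better than a single flag.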
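A possible line-by-line sketch of this, with nesting handled by a depth counter. keep_line and is_defined are hypothetical names; is_defined is a stand-in for a real lookup in the stored #define names and here treats only WINDOWS as defined. Other directives (#else, #ifndef, #define) are outside this sketch:

```c
#include <string.h>

/* Hypothetical stand-in for the #define table: only WINDOWS counts as defined. */
static int is_defined(const char *name)
{
    return strcmp(name, "WINDOWS") == 0;
}

static int depth = 0;        /* number of currently open #ifdef blocks */
static int skip_depth = 0;   /* depth at which skipping started; 0 means not skipping */

/* Returns 1 if the line should be written to the output, 0 otherwise.
   #ifdef and #endif lines themselves are never written. */
int keep_line(const char *line)
{
    line += strspn(line, " \t");

    if (strncmp(line, "#ifdef", 6) == 0 && (line[6] == ' ' || line[6] == '\t')) {
        char name[64];
        const char *start = line + 6 + strspn(line + 6, " \t");
        size_t len = strcspn(start, " \t\r\n");

        if (len >= sizeof name)
            len = sizeof name - 1;
        memcpy(name, start, len);
        name[len] = '\0';

        depth++;
        if (skip_depth == 0 && !is_defined(name))
            skip_depth = depth;          /* skip until the matching #endif */
        return 0;
    }
    if (strncmp(line, "#endif", 6) == 0) {
        if (depth > 0) {
            if (skip_depth == depth)
                skip_depth = 0;          /* the skipped block ends here */
            depth--;
        }
        return 0;
    }
    return skip_depth == 0;
}
```

Remembering the depth at which skipping began means an inner #endif cannot switch output back on too early.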
I hope this helps.
EDIT: also read the comment by gs!

Here's the gcc documentation on preprocessor options, which might be of some help to you. It's fairly long but most of it deals with options that you don't need to bother with, so you can look through and pick out the relevant sections.

Your C textbook should describe what the standard C preprocessor does, but you can also try man cpp.
Then write a program to perform a limited subset of these tasks (i.e. process #ifdef / #endif pairs, and simple #defines).
Your program should parse its command line, accept at least two of the options listed above, and handle them in the way explained in the gcc manpage.
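If -Dmacro[=defn] ends up being one of the two options you pick, the argument handling could look roughly like this. parse_define_option is a made-up name; the only behavior taken from GCC is that -DNAME with no "=" defines NAME as 1:

```c
#include <string.h>

/* Hypothetical helper: splits "-DNAME" or "-DNAME=VALUE" into name and value.
   Returns 0 on success, -1 if arg is not a usable -D option or a buffer is too small. */
int parse_define_option(const char *arg,
                        char *name, size_t name_cap,
                        char *value, size_t value_cap)
{
    if (strncmp(arg, "-D", 2) != 0 || arg[2] == '\0' || arg[2] == '=')
        return -1;
    arg += 2;

    const char *eq = strchr(arg, '=');
    size_t name_len = eq ? (size_t)(eq - arg) : strlen(arg);
    const char *val = eq ? eq + 1 : "1";   /* GCC defines a bare -DNAME as 1 */

    if (name_len >= name_cap || strlen(val) >= value_cap)
        return -1;
    memcpy(name, arg, name_len);
    name[name_len] = '\0';
    strcpy(value, val);
    return 0;
}
```

The resulting name and value can then be stored exactly as if a #define line had been read from the source file.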


Elegantly adding m4 macro processor into gcc compilation chain? [duplicate]

Could you please give me an example of writing a custom gcc preprocessor?
My goal is to replace SID("foo") alike macros with appropriate CRC32 computed values. For any other macro I'd like to use the standard cpp preprocessor.
It looks like it's possible to achieve this goal using -no-integrated-cpp -B options, however I can't find any simple example of their usage.
Warning: dangerous and ugly hack. Close your eyes now. You can hook in your own preprocessor by adding the '-no-integrated-cpp' and '-B' switches to the gcc command line. '-no-integrated-cpp' makes gcc run preprocessing as a separate step, and '-B' makes it search the given path for its subprograms before it uses its internal search path. The invocations of the preprocessor can be identified by the 'cc1', 'cc1plus' or 'cc1obj' programs (these are the C, C++ and Objective-C compilers) being invoked with the '-E' option. When there is no '-E' option, pass all the parameters to the original program. When there is such an option, you can do your own preprocessing and pass the manipulated file to the original compiler.
It looks like this:
> cat cc1
#!/bin/sh
echo "My own special preprocessor -- $*"
/usr/lib/gcc/i486-linux-gnu/4.3/cc1 "$@"
exit $?
> chmod 755 cc1
> gcc -no-integrated-cpp -B$PWD x.c
My own special preprocessor -- -E -quiet x.c -mtune=generic -o /tmp/cc68tIbc.i
My own special preprocessor -- -fpreprocessed /tmp/cc68tIbc.i -quiet -dumpbase x.c -mtune=generic -auxbase x -o /tmp/cc0WGHdh.s
This example calls the original preprocessor, but prints an additional message and the parameters. You can replace the script by your own preprocessor.
The bad hack is over. You can open your eyes now.
One way is to use a program transformation system, to "rewrite" just the SID macro invocation to what you want before you do the compilation, leaving the rest of the preprocessor handling to the compiler itself.
Our DMS Software Reengineering Toolkit is such a system; it can be applied to many languages, including C and specifically the GCC 2/3/4 series of compilers.
To implement this idea using DMS, you would run DMS with its C front end over your source code before the compilation step. DMS can parse the code without expanding the preprocessor directives, build abstract syntax trees representing it, carry out transformations on the ASTs, and then spit out the result as compilable C text.
The specific transformation rule you would use is:
rule replace_SID_invocation(s:STRING):expression->expression
= "SID(\s)" -> ComputeCRC32(s);
where ComputeCRC32 is custom code that does what it says. (DMS includes a CRC32 implementation, so the custom code for this is pretty short.)
DMS is kind of a big hammer for this task. You could use Perl to implement something pretty similar. The difference with Perl (or some other string match/replace hack) is the risk that a) it might find the pattern someplace where you don't want a replacement, e.g.
... QSID("foo")... // this isn't a SID invocation
which you can probably fix by coding your pattern match carefully, b) fail to match a SID call found in surprising circumstances:
... SID ( /* master login id */ "Joel" ) ... // need to account for formatting and whitespace
and c) fail to handle the various kinds of escape characters that show up in the literal string itself:
... SID("f\no\072") ... // need to handle all of GCC's weird escapes
DMS's C front end handles all the escapes for you; the ComputeCRC32 function above would see the string containing the actual intended characters, not the raw text you see in the source code.
So it's really a matter of whether you care about the dark-corner cases, or if you think you may have more special processing to do.
Given the way you've described the problem, I'd be sorely tempted to go the Perl route first and simply outlaw the funny cases. If you can't do this, then the big hammer makes sense.
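Whichever route you take, the replacement value itself has to come from a CRC32 routine. The question does not say which CRC32 variant is wanted, so the sketch below assumes the common one used by zlib and PNG (reflected polynomial 0xEDB88320); crc32_bytes is a made-up name:

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, assuming the zlib/PNG variant (reflected polynomial 0xEDB88320).
   Slow but short; a table-driven version would be the usual optimisation. */
uint32_t crc32_bytes(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

The standard check value for this variant is 0xCBF43926 for the nine ASCII bytes "123456789", which is a handy sanity test. Note that the bytes fed in should be the characters of the string after escape processing, which is exactly the dark corner described above.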

C -- Print To Screen Without #include <stdio.h>? [closed]

Closed 11 years ago.
Is there a way to have a C source file print to the screen without including <stdio.h>?
Here's my situation: I was asked to programmatically handle 1000 C source files, each of which implements several numerical functions in C (these functions are supposed to work on data that is in memory, without any I/O). The origin of these source files is unclear, and hence I'd like to make sure there will be no harm to my machine when I compile and run these source files.
Is there a way to find out if a C source file is potentially harmful? I thought of asking the developers to avoid any #include statements whatsoever, but I do need just printf -- as I'd like them to include an output of their calculations within main().
Any ideas?
Sure, add the prototype for printf at the top of your source file. As long as you're linking to the CRT libraries, you can use the function without including stdio.h.
printf prototype
int printf ( const char * format, ... );
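Put together, a source file using this trick might look like the sketch below. report is a hypothetical helper name; the only thing taken from the C library is printf itself, declared by hand with the same prototype as above:

```c
/* Hand-written declaration standing in for #include <stdio.h>. */
int printf(const char *format, ...);

/* Hypothetical helper: prints one result and passes on printf's return value
   (the number of characters written, or a negative value on error). */
int report(double value)
{
    return printf("result = %f\n", value);
}
```

This compiles and links because the declaration matches the one in the standard header. That is also why banning #include does nothing for safety: any other library function can be declared the same way.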
There are, though they are probably a bit larger than the scope of the format of SO. In essence you leverage assembler calls in C. The blog KSplice touches on the subject ( with code and examples ) here.
Is there a way to find out if a C source file is potentially harmful?
No, there is none. A malicious source file could possibly do anything it wanted by defining its own prototypes, or by using inline assembly -- #include is just a compile-time convenience.
I would like to clarify why we need printf and stdio.h, to maybe make the concept clearer. C is a portable language. You can compile C for Linux, Mac OS X, and Windows. In each, producing output normally boils down to a system call or, in embedded systems, dealing directly with a frame buffer or UART device.
So of course it is possible, but do you want to do it? That depends on why. If you are coding against a specific platform and don't have printf(), then you may have to look into invoking a system call directly for that platform or writing some platform-specific assembly code. It all depends on your use case.
Sure, put the necessary function prototypes in your program.
If you mean by not using printf, then you have several options - you can use fwrite, or you could dispense with streams and use write, or you could invoke operating system I/O services directly, or perhaps you could talk to the display hardware directly, or many other things.
If you want a better answer, perhaps explain why you want to not include stdio.h
This is silly but still:
#include <string.h>
int main() {
puts ("hi");
return 0;
}
and Output:
$ gcc -o try try.c
$ ./try
hi

Multi-pass C preprocessor [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
Is it remotely sane to apply the C preprocessor to the same codebase multiple times (specifically, twice in sequence?)
For instance, having declarations such as the following:
##define DECLARE(FILE) # define DECLARATIONS \
# include FILE \
# undef DECLARATIONS
Have you ever seen such an idiom before? If so, what codebase? Can you link it? What sort of patterns would be followed to compile a project doing something like this? Can the CPP as it stands be made to do this, or do I need to write a meta-preprocessor to “hide” the single-hash declarations while processing the double-hash declarations, and so on?
I think when you need multiple CPP passes, you might want to consider m4 or some other sophisticated macro system/code generator. I think it will be hard to do what you want, and since you are going to be changing your build process for this anyway, look at other templating or macro systems.
Oh wow, why would you want to do this? I am sure GCC could be coerced into doing something like this with some clever make tricks (use the -E flag for GCC) but I can't imagine anyone being able to maintain it later.
Google threw this up, so here's a four-years-late use case for multiple (pre)compilation passes.
The largest benefit to multiple-pass compilation that I can see comes from optionally preprocessing the file. Specifically, when one would like to see the preprocessed source without including the very large standard headers at the top. E.g.,
#ifdef PRECOMPILATION
#ifdef TMPINCLUDE
#error "This stunt assumes TMPINCLUDE isn't already defined"
#endif
#define TMPINCLUDE #include <stdlib.h>
TMPINCLUDE
#undef TMPINCLUDE
#else
#include <stdlib.h>
#endif
This will compile as normal in the absence of PRECOMPILATION, but if compiled as gcc -E -P -DPRECOMPILATION or similar, will translate into a source file containing all your code, post expansion, and the #include statement at the top. So it's still valid code and can also be compiled from the already-preprocessed file.
Macros are unpopular in the C and C++ world. I would like to release a plausibly useful library to the wider world, but it's very heavily based on macros to reduce code duplication. Using an either-one-or-two pass compilation model means I can use the library directly, macros and all, in my own work, but can also release a sanitised version which only uses the preprocessor to include standard libraries.
Whether that is remotely sane or not is rather subjective.

Compiling in constants at runtime

I have a program for some scientific simulation stuff, and as such needs to run quickly.
When I started out, I was somewhat lazy, and decided to allow inputting constants later; and just used #define macros for them all.
The problem is that when I tried changing that, it got a lot slower. For example, changing
#define WIDTH 8
//..... code
to
#define WIDTH width
int width;
//... main() {
width=atoi(argv[1]);
//...... code
resulted in something that used to take 2 seconds taking 2.8. That's just for one of about a dozen constants, and I can't really afford even that. Also, there is probably some compiled-away math with these.
So my question is if I can have some way (bash script?) of compiling the constants I want to use into the program at runtime. It's ok if any machine that needs to run this has to have a compiler on it. It currently compiles with a standard (quite simple) Makefile.
--This also allows for -march=native, which should help a little.
I suppose my question also is if there's a better way of doing it entirely...
At least if I understand your question correctly, what I'd probably do would be something like:
#ifndef WIDTH
#define WIDTH 8
#endif
(and likewise for the other constants you want to be able to modify). Then, in your makefile(s), add options that pass the correct definitions to the compiler when/if necessary, so if you wanted to change the WIDTH, you'd have something like:
cflags=-DWIDTH=12
and when you compile the file, this would be used as the definition for WIDTH, but if you didn't define a value in the makefile, the default in the source file would be used.
The difference is that with the macro being just an integer literal, the compiler is able to often calculate a bunch of the math at compile time. A trivial example is if you had:
int x = WIDTH * 3;
the compiler would actually emit:
int x = 24;
no multiply there. If you change WIDTH to a variable, it can't do that, because it could be any value. So there is almost certainly going to be some difference in speed (how much depends on the circumstance and it is often so little that it doesn't matter).
I recommend making what needs to be variables variables and then profiling to find the hot spots in the code. Almost always, it's the algorithm that slows you down the most. Once you find out which blocks of code you are spending the most time in, then you can figure out ways to make that part faster.
The only real solution would be to have a separate header file with the constants, which a script generates before compiling the program. Or, if there aren't too many, just pass them directly to gcc. This of course sacrifices up-front speed for runtime speed. I do wonder: if a difference of 0.8 seconds in runtime is unaffordable, how is compiling the program (which will surely take more than a second) affordable?
The script could be something as simple as this:
#!/bin/sh
echo "#define WIDTH $1" > constants.h
echo "#define HEIGHT $2" >> constants.h
gcc prog.c -o prog && ./prog
where prog.c includes constants.h, or something like this (with no extra header):
#!/bin/sh
gcc -DWIDTH=$1 -DHEIGHT=$2 prog.c -o prog && ./prog
You could store the relevant defines into a separate header file constants.h:
#ifndef CONSTANTS_H
#define CONSTANTS_H
#define WIDTH 8
...other defines...
#endif
If you take care that the header is included only once, then you can even omit the include guards and have a small file with only the relevant stuff. I would go this way if the program is used by others who need to change the constants. If you're the only one using it, then Jerry's method is just fine.
EDIT:
Reading your comment, this separate header could be easily generated with a little tool from the makefile before the compilation.

Listing C Constants/Macros

Is there a way to make the GNU C Preprocessor, cpp (or some other tool) list all available macros and their values at a given point in a C file?
I'm looking for system-specific macros while porting a program that's already unix savvy and loading a sparse bunch of unix system files.
Just wondering if there's an easier way than going hunting for definitions.
I don't know about a certain spot in a file, but using:
$ touch emptyfile
$ cpp -dM emptyfile
Dumps all the default ones. Doing the same for a C file with some #include and #define lines in it includes all those as well. I guess you could truncate your file to the spot you care about and then do the same?
From the man page:
-dCHARS
CHARS is a sequence of one or more of the following characters, and must not be preceded by a space. Other characters are interpreted by the compiler proper, or reserved for future versions of GCC, and so are silently ignored. If you specify characters whose behavior conflicts, the result is undefined.
M
Instead of the normal output, generate a list of #define directives for all the macros defined during the execution of the preprocessor, including predefined macros. This gives you a way of finding out what is predefined in your version of the preprocessor. Assuming you have no file foo.h, the command
touch foo.h; cpp -dM foo.h
will show all the predefined macros.
If you use -dM without the -E option, -dM is interpreted as a synonym for -fdump-rtl-mach.
D
Like M except in two respects: it does not include the predefined macros, and it outputs both the #define directives and the result of preprocessing. Both kinds of output go to the standard output file.
N
Like D, but emit only the macro names, not their expansions.
I
Output #include directives in addition to the result of preprocessing.
With gcc, you can use the -dD option together with -E (for example, gcc -E -dD file.c) to dump all the macro definitions to stdout along with the preprocessed source.
Why not consult the section on Predefined-macros? Do you need this for building a project or some such thing?
To show "their values at a given point in a C file", there are two predefined macros that identify a given point in a C file, __FILE__ and __LINE__, and they are useful for tracing a point of failure. Consider this sample code in a file called foo.c:
if (!(ptr = malloc(20))){
fprintf(stderr, "Whoops! Malloc Failed in %s at line %d\n", __FILE__, __LINE__);
}
If that code logic was used several times in this file, and the call to malloc was failing, you would get this output:
Whoops! Malloc Failed in foo.c at line 25
The line number would be different depending on where in the source, that logic is used. This sample serves the purpose in showing where that macro could be used...
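If the same check is needed in many places, it can be wrapped once so that every call site reports its own location. This is only a sketch; checked_malloc and CHECKED_MALLOC are names invented for the example:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reports the caller's file and line if the allocation fails. */
static void *checked_malloc(size_t size, const char *file, int line)
{
    void *ptr = malloc(size);

    if (ptr == NULL)
        fprintf(stderr, "Whoops! Malloc Failed in %s at line %d\n", file, line);
    return ptr;
}

/* __FILE__ and __LINE__ are expanded where the macro is used,
   so the message points at the call site rather than at this definition. */
#define CHECKED_MALLOC(size) checked_malloc((size), __FILE__, __LINE__)
```

A call such as ptr = CHECKED_MALLOC(20); then behaves like the snippet above without repeating the fprintf each time. The caller still has to check the returned pointer for NULL.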
Here is a link to a page with an overview of command-line options to list predefined macros for most compilers (gcc, clang...).
