Is environ available on Windows? - c

Is environ variable (as of POSIX) available (at least for reading) in major Windows C compilers?
I know that execve is available on Windows: https://en.wikipedia.org/wiki/Exec_(system_call)
But I am unsure if environ is available, too.

environ should be available, but is deprecated and you should use the more secure methods.
The execXX() calls are available, but fork() isn't, so effectively the exec functions are rendered useless.
You can use CreateProcessA for similar effect, and have the ability to set up environments and pipes cleanly.
Just to acknowledge #eryksun 's concerns: You do need to consider which character set you are using before using any Microsoft "A" file (and other O/S) APIs. It is simplest if you can do all your code using 16bit unicode, as that is the underlying type for NT, Windows 7, Windows 10. On unix and mac, you can make assumptions that utf-8 is the 8-bit character set of choice, but that has yet to happen for windows, and of course "backward compatibility". If you are using any of the "unix-like" M/S API, you should already be making the same design decisions, though, so should already have an answer.

The following program will print the environment variables.
#include <stdio.h>
int main(int argc, char *argv[], char *env[]){
int e = 0;
while (env[e] != NULL) {
printf("%s\n", env[e++]);
}
}

EDIT: I was wrong; looks like the MSVC runtime library does include support for environ (though deprecated) after all. I will leave my previous answer below if anyone is interested in alternative methods.
Not that I'm aware of, but, if you want to access the environment-variables on Windows, you have some options:
Declare main or wmain with the following signature:
int (w)main(int argc, char/wchar_t *argv[], char/wchar_t *envp[])
This is defined in the C Standard as a pointer to the environment block, if applicable:
§ J.5.1:
In a hosted environment, the main function receives a third argument, char *envp[],
that points to a null-terminated array of pointers to char, each of which points to a string
that provides information about the environment for this execution of the program
(5.1.2.2.1).
Use the Windows API function GetEnvironmentVariable(A|W) to get an individual environment variable, or GetEnvironmentStrings to get the entire environment array.
The standard C function getenv.

Related

Why can C main function be coded with or without parameters?

Just wondering why this
int main(void){}
compiles and links
and so does this:
int main(int argc, char **argv){}
Why isn't it required to be one or the other?
gcc will even compile and link with one argument:
int main(int argc){}
but issue this warning with -Wall:
smallest_1.5.c:3:1: warning: ‘main’ takes only zero or two arguments [-Wmain]
I am not asking this as in "how come they allow this?" but as in "how does the caller and the linker handle multiple possibilities for main?"
I am taking a Linux point of view below.
The main function is very special in the standard definition (for hosted C11 implementations). It is also explicitly known by recent compilers (both GCC & Clang/LLVM....) which have specific code to handle main (and to give you this warning). BTW, GCC (with help from GNU libc headers thru function attributes) has also special code for printf. And you could add your own customization to GCC using MELT for your own function attributes.
For the linker, main is often a usual symbol, but it is called from crt0 (compile your code using gcc -v to understand what that really means). BTW, the ld(1) linker (and ELF files, e.g. executables or object files) has no notion of types or function signatures and deals only with names (This is why C++ compilers do some name mangling).
And the ABI and the calling conventions are so defined that passing unused arguments to a function (like main or even open(2)...) does not do any harm (several arguments get passed in registers). Read the x86-64 System V ABI for details.
See also the references in this answer.
At last, you really should practically define your main as int main(int argc, char**argv) and nothing else, and you hopefully should handle program arguments thru them (at least --help & --version as mandated by GNU coding standards). On Linux, I hate programs (and I curse their programmers) not doing that (so please handle --help & --version).
Because the calling code can, for example, pass arguments in registers or on the stack. The two argument main uses them, while the zero argument main does nothing with them. It's that simple. Linking does not even enter the picture.
If you are worried about stack adjustments in the called code, the main function just needs to make sure the stack pointer is the same when it returns (and often even this is of no importance, e.g. when the ABI states that the caller is responsible for stack management).
Making it work has to do with the binary format of the executable and the OS's loader. The linker doesn't care (well it cares a little: it needs to mark the entry point) and the only caller routine is the loader.
The loader for any system must know how to bring supported binary format into memory and branch into the entry point. This varies slightly by system and binary format.
If you have a question about a particular OS/binary format, you may want to clarify.
The short answer: if you don't use the parameters, then you can declare main without parameters, in two ways:
int main(void)
or
int main()
The first means main is a function with no parameters. The second means main is a function with any number of parameters.
Since you don't access the parameters, both will be fine. Any compiler having "special" code to check the parameters of main is wrong. (But: main must return a value.)
The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:
int main(void) { /* ... */ }
or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):
int main(int argc, char *argv[]) { /* ... */ }
or equivalent; or in some other implementation-defined manner.
Regarding the parameters:
The first counts the arguments supplied to the program and the second is an array of pointers to the strings which are those arguments. These arguments are passed to the program by the command line interpreter.
So, the two possibilities are handled as:
If no parameters are declared: no parameters are expected as input.
If there are parameters in main() ,they should:
argc is greater than zero.
argv[argc] is a null pointer.
argv[0] through to argv[argc-1] are pointers to strings whose meaning will be determined by the program.
argv[0] will be a string containing the program's name or a null string if that is not available. Remaining elements of argv represent the arguments supplied to the program. In cases where there is only support for single-case characters, the contents of these strings will be supplied to the program in lower-case.
In memory:
they will be placed on the stack just above the return address and the saved base pointer (just as any other stack frame).
At machine level:
they will be passed in registers, depending on the implementation.

Usage of SafeStr in C

I am reading about using of safe strings at following location
https://www.securecoding.cert.org/confluence/pages/viewpage.action?pageId=5111861
It is mentioned as below.
SafeStr strings, when used properly, can eliminate many of these errors and provide backward compatibility to legacy code as well.
My question is what does author mean by "provide backward compatibility to legacy code as well." ? Request to explain with example.
Thanks for your time and help
It means that functions from the standard libc (and others) which expects plain, null terminated char arrays, will work even on those SafeStrs. This is probably achieved by putting a control structure at a negative offset (or some other trick) from the start of the string.
Examples: strcmp() printf() etc can be used directly on the strings returned by SafeStr.
In contrast, there are also other string libraries for C which are very "smart" and dynamic, but these strings can not be sent without conversion to "old school" functions.
From that page:
The library is based on the safestr_t type which is completely
compatible with char *. This allows casting of safestr_t structures to
char *.
That's some backward compatibility with all the existing code that takes char * or const char * pointers.

Limit on the number of arguments to main in C

Is there a limit on the number of arguments that we pass to main() in C? As you all know, it is defined as int main(int argc, char *argv[]).
When I call the program, I can pass arguments like so:
$ prog.exe arg1 arg2 arg3.....argn
Is there an upper bound in the number of argument that we may supply to main() in this way?
According to the POSIX spec for exec, there is a macro ARG_MAX defined in <limits.h> which defines the maximum number of bytes for the arguments + environment variables.
But since C doesn't define anything about that, no, there isn't an inherent cross-platform limit. You have to consult your OS manual if it doesn't define that macro.
No, there is no limit imposed by the ISO C99 standard. If you're using the "blessed" main form (of which there are two):
int main (int argc, char *argv[]);
then you will be limited to the maximum size of a signed integer (implementation-dependent but guaranteed to be at least 215-1 or 32,767).
Of course, you could even have more than that since the standard specifically allows for non-blessed main forms (for example, one that takes a long as the count).
The standard mandates how the arguments are stored and things like argv[argc] having to be NULL, but it does not directly limit the quantity.
Of course, there will be a limit in practice but this will depend entirely on the implementation and environment. However, if you have to ask, then you're probably doing something wrong.
Most tools would place a truly large number of arguments into a response file (say args.txt) then pass a single argument like:
my_prog #args.txt
which gets around arbitrary limits on argument quantity and size.
I wouldn't think so. While there may not be a theoretical limit, the computer probably can't handle 1.5 million arguments. Is there any particular reason you need to know this? I wouldn't recommend using command line arguments for thing other than options, file parameters, ect...
There is no limit explicit in C itself. This is an example of a behavior not defined in the language but rather the implementation. Remember that the language itself is different than it's implementation, subsequent libraries, IDE's, etc.

Is main() a pre-defined function in C? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
main() in C, C++, Java, C#
I'm new to programming in general, and C in particular. Every example I've looked at has a "main" function - is this pre-defined in some way, such that the name takes on a special meaning to the compiler or runtime... or is it merely a common idiom among C programmers (like using "foo" and "bar" for arbitrary variable names).
No, you need to define main in your program. Since it's called from the run-time, however, the interface your main must provide is pre-defined (must return an int, must take either zero arguments or two, the first an int, and the second a char ** or, equivalently, char *[]). The C and C++ standards do specify that a function with external linkage named main acts as the entry point for a program1.
At least as the term is normally used, a predefined function would be one such as sin or printf that's in the standard library so you can use it without having to write it yourself.
1If you want to get technical, that's only true for a "hosted" implementation -- i.e., the kind most of us use most of the time that produces programs that run on an operating system. A "free-standing" implementation (one produces program that run directly on the "bare metal" with no operating system under it) is free to define the entry point(s) as it sees fit. A free-standing implementation can also leave out most of the normal run-time library, providing only a handful of headers (e.g., <stddef.h>) and virtually no standard library functions.
Yes, main is a predefined function in the general sense of the the word "defined". In other words, the C language standard specifies that the function called at program startup shall be named main. It is not merely a convention used by programmers as we have with foo or bar.
The fine print: from the perspective of the technical meaning of the word "defined" in the context of C programming, no the main function is not "predefined" -- the compiler or C library do not supply a predefined function named main. You need to define your own implementation of the main function (and, obviously, you should name it main).
There is typically a piece of code that normal C programs are linked to which does:
extern int main(int argc, char * argv[], char * envp[]);
FILE * stdin;
FILE * stdout;
FILE * stderr;
/* ** setup argv ** */
...
/* ** setup envp ** */
...
/* ** setup stdio ** */
stdin = fdopen(0, "r");
stdout = fdopen(1, "w");
stderr = fdopen(2, "w");
int rc;
rc = main(argc, argv, envp); // envp may not be present here with some systems
exit(rc);
Note that this code is C, not C++, so it expects main to be a C function.
Also note that my code does no error checking and leaves out a lot of other system dependent stuff that probably happens. It also ignores some things that happen with C++, objective C, and various other languages that may be linked to it (notably constructor and destructor calling, and possibly having main be within a C++ try/catch block).
Anyway, this code knows that main is a function and takes arguments. If your main looks like:
int main(void) {
Then it still gets passed arguments, but they are ignored.
This code specially linked so that it is called when the program starts up.
You are completely free to write your own version of this code on many architectures, but it relies on intimate knowledge of how the operating system starts a new program as well as the C (and C++ and possibly Objective C) run time. It is likely to require some assembly programming and or use of compiler extensions to properly build.
The C compiler driver (the command you usually call when you call the compiler) passes the object file containing all of this (often called crt0.0, for C Run Time ...) along with the rest of your program to the linker, unless told not to.
When building an operating system kernel or an embedded program you often do not want to use the standard crt*.o file. You also may not want to use it if you are building a normal application in another programming language, or have some other non-standard requirements.
No, or you couldn't define one.
It's not predefined, but its meaning as an entry point, if it is present, is defined.

Access command line arguments without using char **argv in main

Is there any way to access the command line arguments, without using the argument to main? I need to access it in another function, and I would prefer not passing it in.
I need a solution that only necessarily works on Mac OS and Linux with GCC.
I don't know how to do it on MacOS, but I suspect the trick I will describe here can be ported to MacOS with a bit of cross-reading.
On linux you can use the so called ".init_array" section of the ELF binary, to register a function which gets called during program initilization (before main() is called). This function has the same signature as the normal main() function, execept it returns "void".
Thus, you can use this function to remember or process argc, argv[] and evp[].
Here is some code you can use:
static void my_cool_main(int argc, char* argv[], char* envp[])
{
// your code goes here
}
__attribute__((section(".init_array"))) void (* p_my_cool_main)(int,char*[],char*[]) = &my_cool_main;
PS: This code can also be put in a library, so it should fit your case.
It even works, when your prgram is run with valgrind - valgrind does not fork a new process, and this results in /proc/self/cmdline showing the original valgrind command-line.
PPS: Keep in mind that during this very early program execution many subsystem are not yet fully initialized - I tried libc I/O routines, they seem to work, but don't rely on it - even gloval variables might not yet be constructed, etc...
In Linux, you can open /proc/self/cmdline (assuming that /proc is present) and parse manually (this is only required if you need argc/argv before main() - e.g. in a global constructor - as otherwise it's better to pass them via global vars).
More solutions are available here: http://blog.linuxgamepublishing.com/2009/10/12/argv-and-argc-and-just-how-to-get-them/
Yeah, it's gross and unportable, but if you are solving practical problems you may not care.
You can copy them into global variables if you want.
I do not think you should do it as the C runtime will prepare the arguments and pass it into the main via int argc, char **argv, do not attempt to manipulate the behaviour by hacking it up as it would largely be unportable or possibly undefined behaviour!! Stick to the rules and you will have portability...no other way of doing it other than breaking it...
You can. Most platforms provide global variables __argc and __argv. But again, I support zneak's comment.
P.S. Use boost::program_options to parse them. Please do not do it any other way in C++.
Is there some reason why passing a pointer to space that is already consumed is so bad? You won't be getting any real savings out of eliminating the argument to the function in question and you could set off an interesting display of fireworks. Skirting around main()'s call stack with creative hackery usually ends up in undefined behavior, or reliance on compiler specific behavior. Both are bad for functionality and portability respectively.
Keep in mind the arguments in question are pointers to arguments, they are going to consume space no matter what you do. The convenience of an index of them is as cheap as sizeof(int), I don't see any reason not to use it.
It sounds like you are optimizing rather aggressively and prematurely, or you are stuck with having to add features into code that you really don't want to mess with. In either case, doing things conventionally will save both time and trouble.

Resources