How to call same file as different name for different method? - c

I was recently at a presentation where one of the speakers said he'd used a single CGI file, written in C, that the webserver calls under different names. Depending on the name it is called by, the CGI file runs a different method.
How can I have a single C file execute different functions within when it is called by different names? Also how do I re-direct the calls for differently named files back to this single file?
Is this possible or was he just full of himself?

If you create the executable with different names but with the same code base, you can take a different branch of the code based on the name of the executable used to invoke the program.
Simple example file:
#include <stdio.h>
#include <string.h>
int main1(int argc, char** argv)
{
    printf("Came to main1.\n");
    return 0;
}
int main2(int argc, char** argv)
{
    printf("Came to main2.\n");
    return 0;
}
int main(int argc, char** argv)
{
    // If the program was invoked using main1, go to main1
    if (strstr(argv[0], "main1") != NULL )
    {
        return main1(argc-1, argv+1);
    }
    // If the program was invoked using main2, go to main2
    if (strstr(argv[0], "main2") != NULL )
    {
        return main2(argc-1, argv+1);
    }
    // Don't know what to do.
    return -1;
}
Create two different executables from the file.
cc test-262.c -o main1
cc test-262.c -o main2
Then, invoke the program by using the two different executables:
./main1
Output:
Came to main1.
and...
./main2
Output:
Came to main2.

Unix filesystems support the concept of hard and soft links. To create them just type:
ln origfile newfile
to create a hard link, or:
ln -s origfile newfile
to create a soft link.
Soft links are just a special kind of file that contains the path of another file. Most operations on the link transparently result in operating on the target file.
Hard links are lower level. In effect, all files are a link from the pathname to the content. In Unix you can link more than one pathname to the same content. In effect, there's no "original" and "links", all are links. When you delete a file, you're just removing a link, and when the link count goes to zero, the content is removed.
Many Unix utilities use this trick. The shell passes the name used to invoke the executable as the 0th argument of the command line (argv[0]), so the program can inspect it like any other argument.

When a CGI script is being called by a web server, it receives a considerable amount of information in its environment to let it know how it was called, including:
SCRIPT_NAME, the path to the script from the document root
SCRIPT_FILENAME, the filesystem path to the script (usually the same as argv[0])
REQUEST_URI, the path that was requested by the browser (usually similar to SCRIPT_NAME in the absence of URL rewriting)
QUERY_STRING and PATH_INFO, which contain URL parameters following the script's name
HTTP_*, which contain most of the HTTP headers that were passed in the request
Point is, the script gets a lot of information about how it was called. It could be using any of those to make its decision.

It's possible and actually pretty common.
The first element in the argv array passed to the main function is the "name" of the executable. This can be the full path, or just the last component of the path, or, if the executable is started with an exec* function call, an arbitrary string. (POSIX also allows it to be a null string, but in practice that's pretty rare.)
So there is nothing stopping the executable from looking at argv[0] (having first checked to make sure that argc > 0) and parsing it.
The most typical way to introduce a different name for the executable is to create a filesystem link with the alternate name (either a hard or a soft link, though soft links are easier to maintain).
For CGIs, it is not even necessary to examine argv[0], since there are various useful environment variables, including (at least): SCRIPT_NAME.

Related

What is the entry point for git?

I was browsing through the git source code, and I was wondering where the entry point file is? I have gone through a couple files, that I thought would be it but could not find a main function.
I could be wrong, but I believe the entrypoint is main() in common-main.c.
int main(int argc, const char **argv)
{
    /*
     * Always open file descriptors 0/1/2 to avoid clobbering files
     * in die(). It also avoids messing up when the pipes are dup'ed
     * onto stdin/stdout/stderr in the child processes we spawn.
     */
    sanitize_stdfds();
    git_setup_gettext();
    git_extract_argv0_path(argv[0]);
    restore_sigpipe_to_default();
    return cmd_main(argc, argv);
}
At the end you can see it returns cmd_main(argc, argv). There are a number of definitions of cmd_main(), but I believe the one returned here is the one defined in git.c, which is a bit long to post here in its entirety, but is excerpted below:
int cmd_main(int argc, const char **argv)
{
    const char *cmd;
    cmd = argv[0];
    if (!cmd)
        cmd = "git-help";
    else {
        const char *slash = find_last_dir_sep(cmd);
        if (slash)
            cmd = slash + 1;
    }
    /*
     * "git-xxxx" is the same as "git xxxx", but we obviously:
     *
     * - cannot take flags in between the "git" and the "xxxx".
     * - cannot execute it externally (since it would just do
     *   the same thing over again)
     *
     * So we just directly call the builtin handler, and die if
     * that one cannot handle it.
     */
    if (skip_prefix(cmd, "git-", &cmd)) {
        argv[0] = cmd;
        handle_builtin(argc, argv);
        die("cannot handle %s as a builtin", cmd);
    }
handle_builtin() is also defined in git.c.
Perhaps it's best to address the misunderstanding. Git is a way of collecting, recording, and archiving changes to a project directory. This is the purpose of a Version Control System, and git is perhaps one of the more recognizable ones.
Sometimes they also provide build automation, but often the best tools focus on the fewest responsibilities. In the case of git, it mostly focuses on commits to a repository in order to preserve different states of the directory it is initialized to. It doesn't build the program, so the entry points are unaffected.
For C projects, the entry point will always be the same one defined by the compiler. Generally this is a function called main, but there are ways to redefine or hide this entry point. Arduino, for example, uses setup as the entry point and then calls loop.
The comment left by @larks is an easy way to find the entry point when you're not sure. A simple search from a git repo's root directory hunts for the word main in the C files located there:
grep main *.c
The Windows equivalent is FINDSTR, but recent updates to Windows 10 have greatly improved compatibility with Bash commands. grep is usable in the version I'm running. So is ls, though I'm not sure whether it has been there all along.
Some git projects include multiple languages, and many languages related to C (and predecessors) use the same entry point name. Looking only in file extensions of .c is a good way to find the entry point of the C components, assuming the code is of high enough quality that you'd want to run it in the first place.
There are definitely ways to interfere with how well the extension filters out other languages, but their use implies very haphazard coding practice.

How to open files from a NaCl Dev Environment application?

I'm trying to get a simple command line application to run in the NaCl Development Environment. But I don't understand why it doesn't want to open files:
#include <stdio.h>
#include <ppapi_simple/ps_main.h>
int my_main (int argc, char ** argv) {
    FILE * f = fopen ("out.txt","w");
    if (f) {
        fputs ("output to the file", f);
        fclose(f);
    } else {
        puts("could not open file");
    }
    return 0;
}
PPAPI_SIMPLE_REGISTER_MAIN(my_main)
Running:
bash.nmf-4.3$ gcc -I"$NACL_SDK_ROOT/include" test.c -lppapi_simple -lnacl_io -lppapi
bash.nmf-4.3$ ./a.out
could not open file
bash.nmf-4.3$
It's clearly possible for an application to open files in arbitrary locations within the dev environment - I'm using nano to edit the test code! But the naclports version of nano doesn't look like it's been changed in ways that are immediately connected to file manipulation..?
Lua is another app that appears to have only been modified very slightly. It falls somewhere in between, in that it can run test files but only if they're placed in /mnt/html5, and won't load them from the home folder. My test program shows no difference in behaviour if I change it to look in /mnt/html5 though.
NB. my goal here is to build a terminal application I can use within the dev environment alongside Lua and nano and so on, not a browser-based app - I assume that makes some difference to the file handling rules.
Programs run in the NaCl Dev Environment currently need to be linked with -lcli_main (which in turn depends on -lnacl_spawn) for an entry point which understands how to communicate with the javascript "kernel" in naclprocess.js. They need this to know what current working directory they were run from, as well as to hear about mounted file systems.
Programs linked against just ppapi_simple can be run, but will not setup all the mount points the dev environment may expect.
There is a linker script in the dev env that simplifies linking a command line program: -lmingn. For example, the test program from the question can be compiled with:
gcc test.c -o test -lmingn
NOTE: This linker script had a recently resolved issue, a new version with the fix was published to the store on 5/5/2015.
In the near future, we have plans to simplify things further, by allowing main to be the entry point.
Thanks for pointing out the lua port lacks the new entry point!
I've filed an issue and will look into fixing it soon:
https://code.google.com/p/naclports/issues/detail?id=215
I found a solution to this, although I don't fully understand what it's doing. It turns out that the small changes made to nano are important, because they cause some other functions elsewhere in the NaCl libraries to get pulled in that correctly set up the environment for file handling.
If the above file is changed to:
#include <stdio.h>
int nacl_main (int argc, char ** argv) {
    FILE * f = fopen ("out.txt","w");
    if (f) {
        fputs ("output to the file", f);
        fclose(f);
    } else {
        puts("could not open file");
    }
    return 0;
}
...and compiled with two more libraries:
gcc -I"$NACL_SDK_ROOT/include" test.c -lppapi_simple -lnacl_io -lppapi -lcli_main -lnacl_spawn
...then it will work as expected and write the file.
Instead of registering our own not-main function with PPAPI_SIMPLE_REGISTER_MAIN, pulling in cli_main causes it to do so with an internal function that sets some things up, presumably including what is needed for file writing to work, and expects to then be able to call nacl_main, which is left to the program to define with external visibility (several layers of fake-main stacking going on). This is why the changes to nano look so minimal.
nacl_spawn needs to be linked because cli_main uses it for ...something.

Relative paths in C

I have a C program that uses some resources located in the same directory as the executable. When I execute the program from a random working directory (not the directory where the program is located) the resources don't load, because the relative path I use in the code is not the path where the executable is. How can I solve this nicely?
Pass the path of the directory that contains the resources to the program as an argument and either:
change the current directory of the process to the directory (chdir() on Unix and SetCurrentDirectory() on Windows), or
construct absolute paths to the resources
If it is Windows, as the comment on the question suggests, you can obtain the path of the exe using GetModuleFileName(), extract the directory from it and avoid having to provide an argument to the program. Then either of two options listed would allow the program to be executed from anywhere and still locate its resources.
For anyone happening upon this old question in the future as I just did:
The program (at least on Linux) receives the command it was called by as the first element of main's argument list, argv[0].
e.g.
In this example we will drill down a couple of directories to get to our program, resulting in the following call command: user@PC:~$ ./foo/bar/awesome_program.x86_64.
The program (code below) will print ./foo/bar/awesome_program.x86_64.
Since we have that string as a variable, it should be rather simple to construct relative paths from it, only replacing the end of that string with paths relative to the executable.
working code:
#include <stdio.h>
int main (int argc, char **argv)
{
    printf("calling path: %s\n", argv[0]);
    return 0;
}

How to remove multiple files in C using wildcards?

Is there any way in C to remove (using remove()) multiple files using a * (wildcards)?
I have a set of files that all start with Index. For example: Index1.txt, Index-39.txt etc.
They all start with Index but I don't know what text follows. There are also other files in the same directory so deleting all files won't work.
I know you can read the directory, iterate over each file name, read the first 5 chars, compare, and delete if it matches, but is there an easier way? (This is what I currently do, by the way.)
This is standard C, since the code runs on Linux and Windows.
As you point out, you could use opendir, readdir, and closedir to access the directory contents, a function of your own to match (or transform the wildcards into a regex and use a regex library), and unlink to delete.
There isn't a standard way to do this easier. There are likely to be libraries, but they won't be more efficient than what you're doing. Typically a file finding function takes a callback where you provide the matching and action part of the code. All you'd be saving is the loop.
If you don't mind being platform-specific, you could use the system() call:
system("del index*.txt"); // DOS
system("rm index*.txt"); // unix
The system() call is part of the standard C library (declared in <stdlib.h>, or <cstdlib> in C++).
Is this all the program does? If so, let the command line do the wildcard expansion for you:
#include <stdio.h>
int main(int argc, char* argv[])
{
    /* Stop before argv[0], which is the program's own name. */
    while (--argc > 0)
        remove(argv[argc]);
    return 0;
}
On Windows, you need to link against 'setargv.obj', included in the VC standard lib directory.

Determine UID that last modified a file in Linux?

I'm writing a program that will be monitoring select files and directories for changes. Some of the files are world writeable, some owner, some group.
What I need to do is be able to figure out the last person to modify (not just access) a file. Somehow I thought this would be simple, given that we know the inode of the file .. however I can not seem to find any way of obtaining this. I thought there was a practical way of correlating any given inode to the uid last accessing it.
I think I've squeezed google for all its going to give me on the topic.
Any help is appreciated. I'm writing the program in C.
Edit:
I need to be able to do this after the PID of whatever program modified the file is long gone.
If you are on a 2.6 kernel, you can take advantage of the kernel's auditd daemon. Check this URL out. It might give you some hints on how to accomplish what you are trying to do. I'm sure there is an API you could use in C.
To my knowledge, this information is not stored by any of the common filesystems, but you should be able to hook into inotify and keep an audit trail of which processes touch which files.
Okay, using straight old standard Linux with normal file systems, you're not going to be able to do it. That information isn't stored anywhere (see man lstat for what is stored.)
As @pablo suggests, you can do this with security auditing turned on. The link he notes is a good start, but the gist of it is this:
you turn on the audit daemon, which enables auditing from the kernel
you configure the rules file to capture what you want
you search the audit files for the events you want.
The difficulty here is that if you start auditing all file operations for all files, the audit is going to get big.
So what is the actual need you want to fill?
Very basic, but it works:
You can easily write a little C program that does what you want.
This example retrieves the UID of the owner of a file, directory, or link; look through the other fields of struct stat to find the properties that you want.
Compile with:
gcc -x c my-prog.c -o my-prog
Then:
./my-prog /etc
A lot of other information can be obtained like this.
It's not very robust, but I know how to use it, and I do extra checking in a bash shell :-)
[ -x /etc ] && ./my-prog /etc
Source code:
/*
 * Retrieve the UID of a file's owner.
 * Source code: my-prog.c
 */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
int main(int argc, char **argv) {
    struct stat buffer;
    if (argc < 2) {
        fprintf(stderr, "usage: %s PATH\n", argc > 0 ? argv[0] : "my-prog");
        return 1;
    }
    /* stat() returns 0 on success and -1 on failure */
    if (stat(argv[1], &buffer) != 0) {
        perror(argv[1]);
        return 1;
    }
    printf("%lu\n", (unsigned long)buffer.st_uid);
    return 0;
}
