How to use C to implement a wildcard function? - c

I am new to C, and I want to implement a wildcard feature in C. For example, I am writing a photo processing program named myphoto, and I want to use it like this: myphoto ./photos/*.png, so that myphoto processes all the PNG files in the directory one by one.
I would like to solve this as simply as possible, without using regular expressions. I came up with the idea of using an exec function to run a command, but the exec functions only return an int, not a char*.
So how can I solve this problem? Thanks!

This is operating-system specific. I'm giving a POSIX and Linux point of view (on Windows it is different, and I don't know it).
Notice that if you write the program myprog.c, compile it into myprog, and then run
myprog photos/*.png
the main function in myprog.c receives an array of strings: declare int main(int argc, char **argv), and the argv array then holds argc argument strings. The expansion is done by the shell before it starts your myprog executable. See execve(2).
On Linux and POSIX systems, read glob(7); you may want to use glob(3), fnmatch(3), and/or wordexp(3). These functions are useful mostly when some data (e.g. a line in a file) contains photos/*.jpeg and your program wants to "glob" that itself. You don't need to "glob" the arguments of main; your shell has already done that.
Read Advanced Linux Programming

Related

Asymmetry between main and launching a process

A C main method has the signature
int main(int argc, char** argv) {
}
It will get an array of command-line parameters. But functions that launch an application, e.g. CreateProcess or ShellExecute, accept only 2 parameters for this: one for the application to launch and one for the parameters. Why are the parameters not specified as an array, too? Why does every application that launches other applications have to deal with escaping command-line parameters, e.g., when invoking a compare tool with 2 arbitrary file names that might contain spaces or quotes?
On very few systems does program execution actually start at the main (or WinMain) function. Instead, the compiler tells the linker to use a special entry function, which usually doesn't take any arguments in the C sense of the word.
The command-line arguments (if any) may be passed through special registers at the assembly level, or they need to be fetched using OS-specific functions (like GetCommandLine in the Windows API).
On Windows, the GetCommandLine function does indeed get the command line as a single string. Just like it was passed to e.g. CreateProcess.
For a Windows console program, the special "entry" function does some other initialization (like setting up stdin etc.), and then calls GetCommandLine to get the command-line arguments, which it then parses into an array suitable for the main function, which is then called.
In the POSIX world (where e.g. Linux and macOS live), the exec family of functions does indeed take an array of arguments, or a variable-argument list that is turned into such an array.

How to hide the system call passed in the system() function from htop

Consider this C snippet:
snprintf(buf, sizeof(buf), "<LONG PROCESS WITH PARAMETERS HAVING SENSITIVE INFO>");
system(buf);
Now on compiling and executing this, the "sensitive" parameters of the process can be seen on programs like htop
And I don't want that.
I would like to know if there's a way to hide everything passed in system() such that htop will only show the name of the compiled executable (i.e htop just displays a.out all the time)
In all the Unix-like systems I've used, including many Linux variants, it's possible for a program to overwrite its command-line arguments "from inside". So in C we might use, for example, strncpy() just to blank the values of argv[1], argv[2], etc. Of course, you need to have processed or copied these arguments first, and you need to be careful not to write outside the bounds of each argv string.
I don't think anything about Unix guarantees the portability or continued applicability of this approach, but I have been using it for at least twenty years. It conceals the command from casual uses of ps, etc., and also from /proc/NN/cmdline, but it won't stop the shell storing the command line somewhere (e.g., in a shell history file). So it only prevents casual snooping.
A better approach is not to get into the situation in the first place -- have the program take its input from files (which could be encrypted), or environment variables, or use certificates. Or almost anything, in fact, except the command line.

Trying to call C program from Ruby script

I am trying to call a C program from my Ruby script, passing it an argument (a file object), and then store some values the C program returns.
The idea is that my Ruby script lets me easily cycle through the files and folders of a parent folder, but it is far too slow to process all the files in that folder efficiently. Hence the C program, which I want to call to process each file.
My problem is that I can't find a way to call that C program from Ruby (or to pass it the file argument; I'm not even sure that is possible, as I don't know whether Ruby file objects and C streams are "compatible").
Thank you in advance for your help !
You say you are trying to call a program so I assume you are not trying to statically or dynamically load a library and call a function. (If you are trying to load a library to call a function then look to the DL::Importer module.)
As for calling an external program from Ruby and receiving its result (from stdout, in this case), regardless of whether it was written in C or not, an easy way to do it is:
value = `program arg1 arg2 ...`
e.g. if the program you want to call compresses a given file and outputs the compressed size.
size = `mycompressionprogram filename.txt`
puts "compressed result is: #{size}"
Note those are back ticks " ` ".
So this is one easy way to code your computationally heavy stuff in C and wrap it up in a Ruby script.
One simple traditional way for a Ruby process to interact with unrelated C code is popen, which will allow your Ruby process to invoke the (compiled) code as a separate process, passing your choice of arguments into the traditional space the operating system allocates for that (accessible in argv in your process's int main(int argc, char** argv)), and then interacting with its standard input and standard output over a pipe. However, this technique launches another process and requires that you serialize/deserialize any ongoing interprocess communication so that it can run over the pipe, which may be an impediment.
So you can also write the C code as a Ruby extension, which will allow you to return values more readily, and moreover avoids the overhead associated with having a separate process involved. However, note that if you perform extensive work with Ruby objects in your C code you may still incur the performance penalties you'd hoped to avoid. The canonical document on how to write Ruby extensions is README.EXT.

How to include the command "wget" on my C source code?

I need to write a program that crawls websites, and I already have an algorithm and some parts of the code. The problem is that I do not know how to call wget from my source code. Our student assistant hinted that some kind of keyword or function should be used before the wget (system, I think, or something like that, but I'm not so sure).
When not to use system:
1.) when you want to distribute the program to different environments where the program you call via system may not be available
2.) in a security-relevant environment, where you have to make sure that the program you call is really the program you want it to be
3.) when the thing you want to do can easily be accomplished in 10-20 lines of C code
4.) in performance-critical applications
So you should virtually never use system.
Instead, to accomplish the same thing, you could use libcurl, as David suggested (his answer seems to be gone...), or do some socket programming (it's C, after all).
In a real-world scenario, I'd probably just default to writing the crawler in a different language. Web requests and complex string processing are not necessarily C's strong suit, and most definitely not very convenient to do in C :)
You can use the system() function.
In your case (possibly):
system("/bin/wget");
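But if you really want to call wget with parameters, you should use execl(). Note that the first argument after the path is argv[0], and the list must end with a null pointer:
execl("/bin/wget", "wget", "http://anyadress.com/file", (char *)NULL);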
Whenever you want to run shell commands from your C program, you can use system("shell command"). In your case:
system("wget");
Note: wget is an executable whose location is normally in the PATH variable, so there is no need to specify the path explicitly.
-- Example --
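#include <stdio.h>
#include <stdlib.h>
#define BUFFLEN 2500
int main(void)
{
char web_address[] = "www.google.com";
char command[BUFFLEN];
/* Build the command line; the single quotes stop the shell from
   interpreting characters such as & or ? in the address. */
snprintf(command, sizeof(command), "wget '%s'", web_address);
system(command);
return 0;
}
The system function is used to execute a shell command. See man system.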

Using '__progname' instead of argv[0]

In the C / Unix environment I work in, I see some developers using __progname instead of argv[0] for usage messages. Is there some advantage to this? What's the difference between __progname and argv[0]? Is it portable?
__progname isn't standard and therefore not portable; prefer argv[0]. I suppose __progname could look up a string resource to get a name that doesn't depend on the filename you ran it as. But argv[0] gives you the name the program was actually run as, which I would find more useful.
Using __progname allows you to alter the contents of the argv[] array while still maintaining the program name. Some of the common tools such as getopt() modify argv[] as they process the arguments.
For portability, you can copy argv[0] (e.g. with strcpy or strncpy) into your own progname buffer when your program starts.
There is also a GNU extension for this, so that one can access the program invocation name from outside of main() without saving it manually. One might be better off doing it manually, however; thus making it portable as opposed to relying on the GNU extension. Nevertheless, I here provide an excerpt from the available documentation.
From the on-line GNU C Library manual (accessed today):
"Many programs that don't read input from the terminal are designed to exit if any system call fails. By convention, the error message from such a program should start with the program's name, sans directories. You can find that name in the variable program_invocation_short_name; the full file name is stored the variable program_invocation_name.
Variable: char * program_invocation_name
This variable's value is the name that was used to invoke the program running in the current process. It is the same as argv[0]. Note that this is not necessarily a useful file name; often it contains no directory names.
Variable: char * program_invocation_short_name
This variable's value is the name that was used to invoke the program running in the current process, with directory names removed. (That is to say, it is the same as program_invocation_name minus everything up to the last slash, if any.)
The library initialization code sets up both of these variables before calling main.
Portability Note: These two variables are GNU extensions. If you want your program to work with non-GNU libraries, you must save the value of argv[0] in main, and then strip off the directory names yourself. We added these extensions to make it possible to write self-contained error-reporting subroutines that require no explicit cooperation from main."
I see at least two potential problems with argv[0].
First, argv[0] or argv itself may be NULL if the execve() caller was evil or careless enough. Calling execve("foobar", NULL, NULL) is usually an easy and fun way to prove to an overconfident programmer that his code is not sig11-proof.
It must also be noted that argv is not accessible outside of main(), while __progname is usually defined as a global variable that you can use from within your usage() function or even before main() is called (for example from non-standard GCC constructors).
It's a BSDism, and definitely not portable.
__progname is just argv[0], and examples in other replies here show the weaknesses of using it. Although not portable either, I'm using readlink on /proc/self/exe (Linux, Android), and reading the contents of /proc/self/exefile (QNX).
If your program was run using, for instance, a symbolic link, argv[0] will contain the name of that link.
I'm guessing that __progname will contain the name of the actual program file.
In any case, argv[0] is defined by the C standard. __progname is not.

Resources