Strange behavior of argv when passing string containing "!!!!" - c

I have written a small program that takes some input parameters from *argv[] and prints them. In almost all use cases my code works perfectly fine. A problem only arises when I use more than one exclamation mark at the end of the string I want to pass as an argument ...
This works:
./program -m "Hello, world!"
This does NOT work:
./program -m "Hello, world!!!!"
^^ If I do this, the program output is either twice that string, or the command I entered previous to ./program.
However, what I absolutely don't understand: The following, oddly enough, DOES work:
./program -m 'Hello, world!!!!'
^^ The output is exactly ...
Hello, world!!!!
... just as desired.
So, my questions are:
Why does this strange behavior occur when using multiple exclamation marks in a string?
As far as I know, in C you use "" for strings and '' for single chars. So why do I get the desired result when using '', but not when using "" as I should (in my understanding)?
Is there a mistake in my code or what do I need to change to be able to enter any string (no matter if, what, and how many punctuation marks are used) and get exactly that string printed?
The relevant parts of my code:
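// this is a simplified example that, in essence, does the same
// as my (significantly longer) code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char* argv[]) {
    if (argc < 3) return 1; // expects: ./program -m <message>
    char *msg = (char *)calloc(1024, sizeof(char));
    if (msg == NULL) return 1;
    printf("%s", strncat(msg, argv[2], 1023)); // argv[1] is "-m"; the append is capped at the buffer size
    free(msg);
    return 0;
}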
I already tried copying the content of argv[2] into a char* buffer first and appending a '\0' to it, which didn't change anything.

This is not related to your code but to the shell that starts it.
In Bash and several other interactive shells (csh, zsh), !! is shorthand for the last command that was run. When you use double quotes, the shell still performs history expansion (along with variable substitution, etc.) within the string, so when you put !! inside a double-quoted string, the shell substitutes the last command run.
What this means for your program is that all this happens before your program is executed, so there's not much the program can do except check if the string that is passed in is valid.
In contrast, when you use single quotes the shell does not do any substitutions and the string is passed to the program unmodified.
So you need to use single quotes to pass this string. Your users would need to know this if they don't want any substitution to happen. The alternative is to create a wrapper shell script that prompts the user for the string to pass in, then the script would subsequently call your program with the proper arguments.
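To see what the shell actually delivered, the program can print every argument it received. The following is only a minimal sketch; the helper name dump_args is made up for illustration and is not part of the question's code.

```c
#include <stdio.h>

/* Hypothetical helper: writes every argv entry to `out`, one per line,
 * wrapped in brackets so substituted text or stray spaces stand out.
 * Returns the number of entries written. */
int dump_args(FILE *out, int argc, char *argv[])
{
    int i;

    for (i = 0; i < argc; i++)
        fprintf(out, "argv[%d] = [%s]\n", i, argv[i]);
    return i;
}
```

Calling dump_args(stdout, argc, argv) at the top of main and running the program once with double quotes and once with single quotes should make the difference visible: with single quotes, argv[2] arrives exactly as typed.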

The shell does expansion in double-quoted strings. And if you read the Bash manual page (assuming you use Bash, which is the default on most Linux distributions) then if you look at the History Expansion section you will see that !! means
Refer to the previous command.
So !!!! in your double-quoted string will expand to the previous command, twice.
Such expansion is not made for single-quoted strings.
So the problem is not within your program, it's due to the environment (the shell) calling your program.

In addition to the supplied answers, remember that echo is your friend in the shell. If you prefix your command with "echo ", you will see what the shell actually passes to your program.
echo ./program -m "Hello, world!!!!"
This would have shown you some strangeness and might have helped steer you in the right direction.

Related

Trying to get an asterisk * as input to main from command line

I'm trying to send input from the command line to my main function. The input is then sent to the functions checkNum etc.
int main(int argc, char *argv[])
{
int x = checkNum(argv[1]);
int y = checkNum(argv[3]);
int o = checkOP(argv[2]);
…
}
It is supposed to be a calculator so for example in the command line when I write:
program.exe 4 + 2
and it will give me the answer 6 (code for this is not included).
The problem is when I want to multiply and I type for example
program.exe 3 * 4
It seems like it creates a pointer (or something, not quite sure) instead of giving me the char pointer to the char '*'.
The question is can I get the input '*' to behave the same way as when I type '+'?
Edit: Writing "*" in the command line works. Is there a way where I only need to type *?
The code is running on Windows, which seems to be part of the problem.
As @JohnBollinger wrote in the comments, you should use
/path/to/program 3 '*' 4
the way it's written at the moment.
But some explanation is clearly required. The shell parses the command line before passing it to your program. An unquoted * expands to the names of the files in the current directory (UNIX) or something similar (Windows), passed as separate arguments. This is not what you need. You cannot fix it within your program, because by then it is too late. (On UNIX you could make sure you are in an empty directory, but that probably doesn't help.)
Another way around this is to quote the entire argument (and rewrite your program appropriately), i.e.
/path/to/program '3 * 4'
in which case you would need to use strtok_r or strsep to step through the (single) argument passed, separating it on the space(s).
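A rough sketch of the strtok_r approach, assuming the expression arrives as one argument of the form "number operator number" separated by spaces. The name split_expr is invented for illustration, and strtol is used without validating the digits.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits a single argument such as "3 * 4" on spaces.
 * Modifies `expr` in place (strtok_r writes NUL bytes into it).
 * Returns 0 on success, -1 unless there are exactly three tokens and the
 * middle one is a single character. */
int split_expr(char *expr, long *lhs, char *op, long *rhs)
{
    char *save = NULL;
    char *a = strtok_r(expr, " ", &save);
    char *o = a ? strtok_r(NULL, " ", &save) : NULL;
    char *b = o ? strtok_r(NULL, " ", &save) : NULL;

    if (b == NULL || strlen(o) != 1)
        return -1;
    if (strtok_r(NULL, " ", &save) != NULL)
        return -1;                  /* more than three tokens */

    *lhs = strtol(a, NULL, 10);
    *op = o[0];
    *rhs = strtol(b, NULL, 10);
    return 0;
}
```

Since the strings in argv are modifiable, main could pass argv[1] to split_expr directly after checking argc.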
How the shell handles the command-line arguments is outside the scope and control of your program. There is nothing you can put in the program to tell the shell to avoid performing any of its normal command-handling behavior.
I suggest, however, that instead of relying on the shell for word splitting, you make your program expect the whole expression as a single argument, and for it to parse the expression. That will not relieve you of the need for quotes, but it will make the resulting commands look more natural:
program.exe 3+4
program.exe "3 + 4"
program.exe "4*5"
That will also help if you expand your program to handle more complex expressions, such as those containing parentheses (which are also significant to the shell).
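Parsing the whole expression from one argument could look roughly like the sketch below. It assumes integer operands and the four operators + - * /, with optional whitespace around the operator; parse_expr is a hypothetical name, not something from the question.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses "<int><op><int>" with optional whitespace,
 * so "3+4", "3 + 4" and "4*5" are all accepted.
 * Returns 0 on success, -1 on malformed input. */
int parse_expr(const char *s, long *lhs, char *op, long *rhs)
{
    char *end;

    *lhs = strtol(s, &end, 10);
    if (end == s)
        return -1;                  /* no leading number */

    while (isspace((unsigned char)*end))
        end++;
    if (*end == '\0' || strchr("+-*/", *end) == NULL)
        return -1;                  /* missing or unknown operator */
    *op = *end++;

    s = end;
    *rhs = strtol(s, &end, 10);     /* strtol skips leading whitespace */
    if (end == s)
        return -1;                  /* no second number */

    while (isspace((unsigned char)*end))
        end++;
    return (*end == '\0') ? 0 : -1; /* reject trailing characters */
}
```

The quotes on the command line are still needed for inputs like "4*5", because the shell sees the * before the program does.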
You can turn off the shell globbing if you don't want to use single quote (') or double quote (").
Do
# set -o noglob
or
# set -f
(both are equivalent)
to turn off shell globbing. The shell will then not expand any globs, including *.

C - secure execution of system() or exec() with environment variables

I have two strings, both of which can be set by the user, e.g.
char *command = "vim $VAR";
char *myVar = "/tmp/something";
I want to execute *command using *myVar for $VAR.
I tried concatenating them as an environment variable (e.g. (pseudo-code) system("VAR=" + *myVar + "; " + *command)), but the user controls myVar, so this would be very insecure and buggy.
I considered tokenizing on spaces to directly replace $var and passing the results to exec(), but it's too awkward to worry about tokenizing shell command arguments correctly.
I think the solution is to emulate system() with exec by doing something like exec("sh", "-c", command, "--argument", "VAR", myVar), but I can't see anything in the sh/dash/bash man pages to permit setting environment variables in this way.
Edit: I just saw execvpe() which has an argument for setting environment variables from key=value strings. Would this be safe to use with untrusted input for the value?
How do I do this safely?
You can perform some string replacement on the value of myVar — put it inside single quotes, and replace all single quotes (the character ') by the four-character string '\''. Fiddly but safe if you don't make an implementation mistake. If possible, use a library that does it for you.
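The single-quote replacement described above might be sketched as follows. The function name shell_quote is an assumption made for this example; a vetted library routine is preferable where one is available.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'ed copy of `s` wrapped in single
 * quotes, with every embedded ' replaced by the four characters '\''
 * (close quote, escaped quote, reopen quote).
 * Returns NULL on allocation failure. The caller frees the result. */
char *shell_quote(const char *s)
{
    size_t len = 2;                 /* opening and closing quote */
    const char *p;
    char *out, *q;

    for (p = s; *p != '\0'; p++)
        len += (*p == '\'') ? 4 : 1;

    out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    q = out;
    *q++ = '\'';
    for (p = s; *p != '\0'; p++) {
        if (*p == '\'') {
            memcpy(q, "'\\''", 4);
            q += 4;
        } else {
            *q++ = *p;
        }
    }
    *q++ = '\'';
    *q = '\0';
    return out;
}
```

The quoted result can then be spliced into a command string in place of the raw value; inside single quotes the shell treats every character literally.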
If your program is single-threaded, I recommend a different solution that doesn't involve fiddly quoting. You talk of setting environment variables… Well, just do it: make VAR an environment variable.
setenv("VAR", myVar, 1);
system(command);
unsetenv("VAR");
I've omitted error checking, and I assume that VAR isn't needed elsewhere in your program (if it is, this solution becomes more tedious because you need to remember the old value).
If you want fine control over the environment in which the command runs, you can reimplement system on top of fork, execve (or execvpe) and waitpid, or on top of posix_spawn (or posix_spawnp) and waitpid. It's more effort but you gain flexibility.
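A minimal sketch of the fork, exec and waitpid variant, in which the variable is set only in the child so the parent's environment stays untouched. The helper name run_with_var and the /bin/sh path are assumptions for this example, and calling setenv after fork is only reasonable in a single-threaded program.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `command` through /bin/sh with the environment
 * variable `name` set to `value` in the child only. Returns the command's
 * exit status, or -1 if it could not be started or was killed by a signal. */
int run_with_var(const char *command, const char *name, const char *value)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: changes here never reach the parent's environment. */
        if (setenv(name, value, 1) != 0)
            _exit(127);
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                 /* only reached if execl failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The value never passes through shell parsing on its way in, but the command itself still has to reference it as "$VAR" in double quotes, for example run_with_var("vim \"$VAR\"", "VAR", myVar).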
Note that whatever solution you adopt other than doing string replacement to "vim $VAR" inside the C program, the command will need to be vim "$VAR" and not vim $VAR. This is because in shell syntax, $VAR means “the value of the variable VAR” only if it's inside double quotes — otherwise, $VAR means “take the value of VAR, split it into words, and expand each word as a file name wildcard pattern”.
You need to quote the string contained in myVar; this may mean escaping naughty characters (e.g. with a backslash).
You could use g_shell_quote from GLib.
So as Ben pointed out, command is probably loaded at runtime.
I think the best approach is to tokenize command, rather than to tokenize myVar. You can then find which word in command is $VAR and replace that with the value of myVar. Then you can use posix_spawnp as per below.
If you really want command to be an arbitrary shell command, then your only option is to escape myVar before assigning it to an environment variable. Otherwise the shell will expand spaces and other special characters in it regardless of how you set it.
Third option is to make sure command is vim "$VAR" instead of vim $VAR. In that case you can assign it to environment using setenv, then call system, and then unset it after.
Old answer in case command is static:
It looks like what you actually want to do is
#include <spawn.h>
#include <sys/wait.h>
extern char **environ; /* declared as a pointer, not an array */
posix_spawnp(NULL, "vim", NULL, NULL, (char*[]){"vim", myVar, NULL}, environ);
wait(NULL);
i.e. exec vim directly without any shell, with myVar as the first argument.
How about:
fork (if emulating system or spawn; skip this if doing exec)
setenv("VAR", myVar, 1) in the forked child
exec sh with the arguments "-c" and command

Bash and Double-Quotes passing to argv

I have re-purposed this example to keep it simple, but what I am trying to do is get a nested double-quote string as a single argv value when the bash shell executes it.
Here is the script example:
set -x
command1="key1=value1 \"key2=value2 key3=value3\""
command2="keyA=valueA keyB=valueB keyC=valueC"
echo $command1
echo $command2
the output is:
++ command1='key1=value1 "key2=value2 key3=value3"'
++ command2='keyA=valueA keyB=valueB keyC=valueC'
++ echo key1=value1 '"key2=value2' 'key3=value3"'
key1=value1 "key2=value2 key3=value3"
++ echo keyA=valueA keyB=valueB keyC=valueC
keyA=valueA keyB=valueB keyC=valueC
I did test as well, that when you do everything on the command line, the nested quote message IS set as a single argv value. i.e.
prog.exe argument1 "argument2 argument3"
argv[0] = prog.exe
argv[1] = argument1
argv[2] = argument2 argument3
Using the above example:
command1="key1=value1 \"key2=value2 key3=value3\""
The error is that my argv is coming back like:
arg[1] = echo
arg[2] = key1=value1
arg[3] = "key2=value2
arg[4] = key3=value3"
where I really want my argv[3] value to be "key2=value2 key3=value3"
I noticed that debug (set -x) shows a single-quote at the points where my arguments get broken which kinda indicates that it is thinking about the arguments at these break point...just not sure.
Any idea what is really going on here? How can I change the script?
Thanks in advance.
What is happening is that your nested quotes are literal and not parsed into separate arguments by the shell. The best way to handle this using bash is to use an array instead of a string:
args=('key1=value1' 'key2=value2 key3=value3')
prog.exe "${args[@]}"
The Bash FAQ50 has some more examples and use cases for dynamic commands.
A kind of crazy "answer" is to set IFS to double quote like this (save/restore original IFS):
SAVED_IFS=$IFS
IFS=$'\"'
prog.exe $command1
IFS=$SAVED_IFS
It kind of illustrates word splitting, which occurs on unquoted arguments but does not affect variables or text inside ".." quotes. Text inside double quotes (after various expansions) is passed to the program as a single argument. However, a bare variable $command1 (unquoted) undergoes word splitting, which does not care about " inside the variable (it is treated as a literal character). A stupid IFS hack forces word splitting to happen at ". Also beware of trailing whitespace at the end of argv[1], which appears because of word splitting at the " boundary.
jordanm's answer is much better for production use than mine :) The array expansion is quoted, i.e. each array element is expanded as an individual string and no word splitting occurs afterwards. This is essential. If it were unquoted, like ${args[@]}, it would be word split into three arguments instead of two.

how to check if there's a string in the command line

I have this command line:> write_strings "Hello World!" a.txt b.txt dir/a.txt.
All the elements (command, string, file names) go into an array of char pointers. How can I take an element and check whether it's a string or a file name?
I don't need the exact code lines, just the idea. The program should return an error if there's no string.
You can use an API such as stat or access to check if the file pointed to by a path exists. There is no fundamental distinction between filepaths and regular strings when they are passed to your process.
If you're using the standard main(int argc, char *argv[]) convention, you can loop through argv, checking each one to see if it's a file via one of the previously-mentioned system calls.
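A small sketch of the stat-based check, assuming "is a file" means "names an existing regular file". The helper name is_regular_file is invented for this example.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if `path` names an existing regular file,
 * 0 otherwise (including when stat fails, e.g. because nothing exists
 * at that path). */
int is_regular_file(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;
    return S_ISREG(st.st_mode) ? 1 : 0;
}
```

Looping over argv[1] through argv[argc - 1] and calling this on each entry would tell you which arguments currently exist as files, but it cannot tell you what the user meant by an argument.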
Every string that can be passed on a command line is a potential pathname, since the only restriction in both cases is that there can't be any NULs.
A program with a command line syntax in which a specific argument might or might not be used as a pathname (depending on some vague definition of "filename-ish strings" or even a file existence test) is a bad design. Each argument should have a meaning defined by its order in the argument list, or by being associated with an option like -m msg or -o outputfile.
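An option-based layout could be handled with POSIX getopt, roughly as sketched below. The option letters follow the -m msg and -o outputfile examples above; the function name parse_options and the rule that -m is mandatory are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <unistd.h>

/* Hypothetical option layout: -m <message> is required, -o <file> is
 * optional, and the remaining arguments are treated as file names.
 * Returns the index of the first non-option argument, or -1 on error. */
int parse_options(int argc, char *argv[], const char **msg,
                  const char **outfile)
{
    int c;

    *msg = NULL;
    *outfile = NULL;
    optind = 1;                     /* restart scanning if called again */

    while ((c = getopt(argc, argv, "m:o:")) != -1) {
        switch (c) {
        case 'm': *msg = optarg;     break;
        case 'o': *outfile = optarg; break;
        default:  return -1;        /* unknown option or missing value */
        }
    }
    return (*msg != NULL) ? optind : -1;
}
```

With this layout, a call such as write_strings -m "Hello World!" a.txt b.txt is unambiguous: the message is whatever follows -m, and a missing -m is reported as an error without guessing which argument looks like a file name.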
A well-behaved unix program will let the user create a file called Hello world! if he wants.
Regardless of how meaningful the program might or might not be: you can inspect the individual characters of each argument by looping through them via argv[i][j]. If every argument contains ".txt", then in your context they are all file names and no plain string was given, so the program can return its error.

command injection in C programming

I was implementing an echo command using the system() function. The argument for the echo command comes from a command line argument. But when I use ';' in the argument, it shows the directory listing.
What should I do to avoid it? Is it because of command injection in my program?
update: code added from comment
#include<string.h>
#include<stdio.h>
#include<stdlib.h>
int main(int argc, char **argv) {
char cmd[50] = "echo ";
strcat(cmd,argv[1]);
system(cmd);
}
I could compile the code, but when executing it, if I give the command line argument as e.g. './a.out hello;ls ', then the directory listing happens.
Why are you trying to use shell access (which is exactly what system() provides), and then attempting to restrict it?
If you need for some reason to use 'echo', please build your own execve() parameters and launch /bin/echo directly. This way you limit the damage to the tasks 'echo' can do.
When attempting to run your program with the command ./a.out hello;ls, you are actually providing the shell with two separate commands that it executes in sequence. First the shell runs a.out with the command line parameter "hello" in argv[1], which prints it out using echo. Then your program exits, and the shell runs the next command, ls, and displays the directory listing.
If you want to pass that string to the program as a command line parameter, you need to escape the special shell character ;, so the shell does not parse it before giving it to your program. To escape a character, precede it with a \.
Try running the command with ./a.out hello\;ls, and change the program to print argv[1] with printf instead of passing it to echo through system().
[can't respond to other answers yet, so reposting the question]
"Is it possible to get the argument with ';' without using '\' in the command line argument? Is it possible for me to include a '\' from my program after getting argv?"
No, it is not possible. The interpretation of ";" is done by the shell before getting to your program, so unless you escape at the call, your program will never be aware of the ";". i.e.
PROG1 parms ; PROG2
will cause the shell (which is interpreting what you type) to do the following:
start PROG1 and pass it parms.
once PROG1 is done, start PROG2
There are a number of special characters which the shell will take over by default and your program will never see: * for wildcards, | for pipes, & for parallel execution, etc... None of these will be seen by the program being run, they just tell the shell to do special things.
Alternatively to using the "\", you can enclose your parameter in single or double quotes (which are different, but for your example will both work). i.e.:
./a.out "hello;ls"
./a.out 'hello;ls'
Note that these work with the printf approach. If you call system(), you are in effect telling C to start a shell to run what you pass in, so the input is once again subject to shell interpretation.
system() is very difficult to use in a secure manner. It's much easier to just use one of the exec* functions.
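A sketch of the exec-based alternative for the echo example, assuming a POSIX system where echo lives at /bin/echo. The helper name run_echo is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs /bin/echo with `text` as its single argument.
 * No shell is involved, so characters such as ; | & * are passed through
 * as ordinary text. Returns echo's exit status, or -1 on failure. */
int run_echo(const char *text)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* execv does not modify the strings; the cast only satisfies
         * its char *const[] parameter type. */
        char *const args[] = { "echo", (char *)text, NULL };
        execv("/bin/echo", args);
        _exit(127);                 /* only reached if execv failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called as run_echo(argv[1]) after ./a.out 'hello;ls', this prints hello;ls and runs nothing else, and there is no fixed-size command buffer to overflow. Some echo implementations treat an argument beginning with - as an option, so printf remains the simpler choice for plain output.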

Resources