Parsing shell-like commands in C

I want to parse user input commands in my C (just C) program. Sample commands:
add node ID
add arc ID from ID to ID
print
exit
and so on. Then I want to do some validation with IDs and forward them to specified functions. Functions and validations are of course ready. It's all about parsing and matching functions...
I've made it with many ifs and strtoks, but I'm sure it's not the best way... Any ideas (libs)?

I think what you want is something like this:
while (1)
{
char *line = malloc(128); // we need to be able to increase the pointer
char *origLine = line;
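if (!fgets(line, 128, stdin)) { free(origLine); break; } // stop at EOF or on a read error
char command[20] = ""; // stays empty if the line contains no command
sscanf(line, "%19s ", command); // %19s leaves room for the terminating '\0'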
line = strchr(line, ' ');
printf("The Command is: %s\n", command);
unsigned argumentsCount = 0;
char **arguments = malloc(sizeof(char *));
while (1)
{
char arg[20];
if (line && (sscanf(++line, "%19s", arg) == 1)) // %19s: arg holds 19 characters plus '\0'
{
arguments[argumentsCount] = malloc(sizeof(char) * 20);
strncpy(arguments[argumentsCount], arg, 20);
argumentsCount++;
arguments = realloc(arguments, sizeof(char *) * (argumentsCount + 1)); // parenthesized: room for one more pointer
line = strchr(line, ' ');
}
else {
break;
}
}
for (int i = 0; i < argumentsCount; i++) {
printf("Argument %i is: %s\n", i, arguments[i]);
}
for (int i = 0; i < argumentsCount; i++) {
free(arguments[i]);
}
free(arguments);
free(origLine);
}
You can do what you wish with 'command' and 'arguments' just before you free it all.

It depends on how complicated your command language is. It might be worth going to the trouble of womping up a simple recursive descent parser if you have more than a couple of commands, or if each command can take multiple forms, such as your add command.
I've done a couple of RDPs by hand for some projects in the past. It's a bit of work, but it allows you to handle some fairly complex commands that wouldn't be straightforward to parse otherwise. You could also use a parser generator like lex/yacc or flex/bison, although that may be overkill for what you are doing.
Otherwise, it's basically what you've described: strtok and a bunch of nested if statements.
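To make the hand-written approach a bit more concrete, here is a minimal sketch for the sample commands in the question. It is a small top-down matcher over strtok tokens rather than a full recursive descent parser. The names command_kind, parse_id, expect and parse_command are made up for illustration, and it assumes IDs are non-negative integers, which the question does not actually state.

```c
#include <stdlib.h>
#include <string.h>

enum command_kind { CMD_INVALID, CMD_ADD_NODE, CMD_ADD_ARC, CMD_PRINT, CMD_EXIT };

struct command {
    enum command_kind kind;
    long ids[3]; /* node: ids[0]; arc: ids[0] from ids[1] to ids[2] */
};

/* Converts a token to an ID (assumed here to be a non-negative integer).
   Returns 1 on success, 0 if the token is missing or not a number. */
static int parse_id(const char *token, long *out) {
    char *end;
    if (token == NULL || *token == '\0') {
        return 0;
    }
    *out = strtol(token, &end, 10);
    return *end == '\0' && *out >= 0;
}

/* Returns 1 if the token is present and equals the expected keyword. */
static int expect(const char *token, const char *keyword) {
    return token != NULL && strcmp(token, keyword) == 0;
}

/* Parses one input line; the line is modified in place by strtok. */
struct command parse_command(char *line) {
    struct command cmd = { CMD_INVALID, { 0, 0, 0 } };
    const char *delims = " \t\r\n";
    char *tok = strtok(line, delims);

    if (expect(tok, "print")) {
        if (strtok(NULL, delims) == NULL) cmd.kind = CMD_PRINT;
    } else if (expect(tok, "exit")) {
        if (strtok(NULL, delims) == NULL) cmd.kind = CMD_EXIT;
    } else if (expect(tok, "add")) {
        tok = strtok(NULL, delims);
        if (expect(tok, "node")) {
            /* add node ID */
            if (parse_id(strtok(NULL, delims), &cmd.ids[0])
                && strtok(NULL, delims) == NULL) {
                cmd.kind = CMD_ADD_NODE;
            }
        } else if (expect(tok, "arc")) {
            /* add arc ID from ID to ID */
            if (parse_id(strtok(NULL, delims), &cmd.ids[0])
                && expect(strtok(NULL, delims), "from")
                && parse_id(strtok(NULL, delims), &cmd.ids[1])
                && expect(strtok(NULL, delims), "to")
                && parse_id(strtok(NULL, delims), &cmd.ids[2])
                && strtok(NULL, delims) == NULL) {
                cmd.kind = CMD_ADD_ARC;
            }
        }
    }
    return cmd;
}
```

The caller can then switch on cmd.kind and hand cmd.ids to the validation and handler functions that already exist. Anything that does not match one of the four forms, including trailing junk, comes back as CMD_INVALID.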

I just wanted to add something to Richard Ross's reply: check the value returned by malloc and realloc. Ignoring a failed allocation may lead to hard-to-find crashes in your program.
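As a rough illustration of that advice, the sketch below grows an array of argument strings and only replaces the caller's pointer once realloc has succeeded. The helper name append_argument is invented for this example and is not part of the code above.

```c
#include <stdlib.h>

/* Appends item to a growable array of strings.
   Returns 0 on success; on failure returns -1 and leaves the array untouched. */
int append_argument(char ***arguments, size_t *count, char *item) {
    /* Keep the result in a temporary: if realloc fails, *arguments is
       still valid and can still be used or freed by the caller. */
    char **grown = realloc(*arguments, (*count + 1) * sizeof **arguments);
    if (grown == NULL) {
        return -1;
    }
    grown[*count] = item;
    *arguments = grown;
    (*count)++;
    return 0;
}
```

Assigning the result of realloc straight back to the original pointer would lose the old block on failure, so the temporary is the important part here.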

All your command line parameters will be stored in an array of strings called argv.
You can access those values using argv[0], argv[1] ... argv[n].

Related

`execvp()` seems to not be completing the path search

TL;DR -- what sorts of things might cause an execvp call to not fully function/ search the path properly?
I'm on the tail end of building a rudimentary shell with some quality of life features that I've added over time, e.g. history, aliases, and completions. I built those features on top of a functional shell that had a working $PATH search for execution, e.g. typing in "ls -la" produced the desired behavior. As you might imagine, I accomplished this just using execvp. (This is written in C if it's not already clear)
I have not changed any of my tokenizing logic and have ensured that the file name is correct; in particular, execvp was producing the desired behavior before I had added these features to my REPL. echo "hello" still produces a tokenized char **xyz and the first token is indeed echo, null-terminated, and so on. That is, my call still looks like, with variables filled-in, ... execvp("echo", argv); after which I call perror, which should only trigger when something has gone awry. Each time I just run the above command, though, since I've added in these features, it returns a failure with the no such file or directory --- before I added these features in though, the behavior was as desired. I'll note, though, that running /bin/echo "hello" runs as expected. Examples are WLOG.
I'm not sure where I should even start looking for errors, and my Google-fu has been mostly fruitless: any suggestions?
I'm initially going to omit code because it totals to several hundred lines and a MWE would not be particularly minimal in addition to my desires to keep this general rather than very particular to my code, though I'm not sure what's causing this. My repository is public and up-to-date, and I'm happy to post any code here.
EDIT:
I knew I wasn't explicitly editing the PATH variable, etc., but this block of code was the problem:
// Grab $PATH from env
char *pathvar = getenv("PATH");
if (pathvar) {
char *path;
int i;
// tokenize on colon to get paths
// then use that immediately to
// scandir, and add everything in
// there to the completions system
path = strtok(pathvar, ":");
while (path) {
struct dirent **fListTemp;
int num_files = scandir(path, &fListTemp, NULL, alphasort);
// only adding the names that are completely composed of
// lower case letters; completions are done using a naive
// Trie Node structure that only supports lowercase letters
// for now... e.g. g++ does not work, and the '+' leads to
// a seg-fault. Same holds for . and ..
for (i = 0; i < num_files; i++) {
char *curr = fListTemp[i]->d_name;
if (strcmp(curr, ".")==0 || strcmp(curr, "..")==0){
continue;
} else if (notalpha(curr)) {
continue;
} else {
str_tolower(curr);
tn_insert(completions, curr);
}
}
for (i = 0; i < num_files; i++) {
free(fListTemp[i]);
}
free (fListTemp);
path = strtok(NULL, ":");
}
} else {
fprintf(stderr, "{wsh # init} -- $PATH variable could not be found?");
}
Note that
The getenv() function returns a pointer to the value in the
environment, or NULL if there is no match.
so, since strtok overwrites each delimiter it finds with a '\0' in the string it is given, my original code was indeed tampering with the PATH variable. The solution I came up with quickly was just to create a copy of that string and use that to parse through the PATH instead:
// Grab $PATH from env
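char *pathvar = getenv("PATH");
// strtok writes into its argument, so tokenize a heap copy (needs <stdlib.h> and <string.h>)
char *pathvar_cpy = pathvar ? malloc(strlen(pathvar) + 1) : NULL;
if (pathvar_cpy) {
char *path;
int i;
strcpy(pathvar_cpy, pathvar);
path = strtok(pathvar_cpy, ":");
while (path) {
// Scan directory
struct dirent **fListTemp;
int num_files = scandir(path, &fListTemp, NULL, alphasort);
if (num_files < 0) {
// directory missing or unreadable: fListTemp was not set, so skip it
path = strtok(NULL, ":");
continue;
}
for (i = 0; i < num_files; i++) {
char *curr = fListTemp[i]->d_name;
if (strcmp(curr, ".")==0 || strcmp(curr, "..")==0){
continue;
} else if (notalpha(curr)) {
continue;
} else {
str_tolower(curr);
tn_insert(completions, curr);
}
}
for (i = 0; i < num_files; i++) {
free(fListTemp[i]);
}
free (fListTemp);
path = strtok(NULL, ":");
}
free(pathvar_cpy); // the copy is no longer needed once every directory has been scanned
} else {
fprintf(stderr, "{wsh # init} -- $PATH variable could not be found?\n");
}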
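For anyone who wants to see the effect in isolation: strtok replaces the first ':' with '\0', so the original PATH string presumably ended at its first directory afterwards, which would explain why execvp could no longer find echo. The sketch below tokenizes a private copy instead. The function name count_path_entries is made up for this illustration and is not part of the shell's code.

```c
#include <stdlib.h>
#include <string.h>

/* Counts the ':'-separated entries of a PATH-style string without
   modifying it. Returns -1 if the temporary copy cannot be allocated. */
int count_path_entries(const char *pathvar) {
    char *copy;
    char *dir;
    int count = 0;

    if (pathvar == NULL) {
        return 0;
    }
    copy = malloc(strlen(pathvar) + 1);
    if (copy == NULL) {
        return -1;
    }
    strcpy(copy, pathvar);

    /* strtok writes '\0' over each ':' it finds, but only in the copy. */
    for (dir = strtok(copy, ":"); dir != NULL; dir = strtok(NULL, ":")) {
        count++;
    }
    free(copy);
    return count;
}
```

Calling strtok directly on a string such as "/usr/local/bin:/usr/bin:/bin" leaves that buffer reading as just "/usr/local/bin", which is the same thing that happened to the environment string in the original code.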

File I/O Extraction with structures in C

The task is to read in a .txt file named by a command line argument. The file lists every airport in the state of Florida as unstructured text (what follows is only a snippet of the full file). Some data must be ignored, such as ASO ORL PR A 0 18400: anything that does not pertain to the structured variables within airPdata.
The assignment is asking for the site number, locID, fieldname, city, state, latitude, longitude, and if there is a control tower or not.
INPUT
03406.20*H 2FD7 AIR ORLANDO ORLANDO FL ASO ORL PR 28-26-08.0210N 081-28-23.2590W PR NON-NPIAS N A 0 18400
03406.18*H 32FL MEYER- INC ORLANDO FL ASO ORL PR 28-30-05.0120N 081-22-06.2490W PR NON-NPAS N 0 0
OUTPUT
Site# LocID Airport Name City ST Latitude Longitude Control Tower
------------------------------------------------------------------------
03406.20*H 2FD7 AIR ORLANDO ORLANDO FL 28-26-08.0210N 081-28-23.2590W N
03406.18*H 32FL MEYER ORLANDO FL 28-30.05.0120N 081-26-39.2560W N
etc.. etc. etc.. etc.. .. etc.. etc.. ..
etc.. etc. etc.. etc.. .. etc.. etc.. ..
my code so far looks like
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct airPdata{
char *siteNumber;
char *locID;
char *fieldName;
char *city;
char *state;
char *latitude;
char *longitude;
char controlTower;
} airPdata;
int main (int argc, char* argv[])
{
char text[1000];
FILE *fp;
char firstwords[200];
// argv[1] only exists when a file name was actually passed
if (argc < 2 || strcmp(argv[1], "orlando5.txt") != 0)
{
printf("File name is incorrect\n");
return 1;
}
fp = fopen(argv[1], "r");
if (fp == NULL)
{
perror("Error opening the file");
return 1;
}
while (fgets(text, sizeof(text), fp) != NULL)
{
printf("%s", text);
}
fflush(stdout);
fclose(fp); // only reached when the file was opened successfully
return 0;
}
So far I'm able to read the whole file and output the unstructured input to the command line.
The next thing I tried to figure out is how to extract the strings piece by piece and store them in the variables within the structure. Currently I'm stuck at this phase. I've looked up information on strcpy and other string library functions, data extraction methods, and ETL; I'm just not sure which function to use within my code.
I've done something very similar to this in Java using substrings. If there is a way to take substrings of the massive string of text and set rules for which substring goes in which variable, that would potentially work. For example, LocID is never more than 4 characters long, so any letter/number combination that is four characters long could be stored in airPdata.locID.
After the variables are stored within the structures, I think I have to use strtok to organize them in the list under Site#, LocID, etc. However, that's only my best guess at how to approach this problem; I'm pretty lost.
I don't know what the format is. It can't be space-separated, since some of the fields have spaces in them. It doesn't look fixed-width. Because you mentioned strtok I'm going to assume it's tab-separated.
You can use strsep for that. strtok has a lot of problems that strsep solves, but strsep isn't standard C. I'm going to assume this is some assignment requiring standard C, so I'll begrudgingly use strtok.
The basic thing to do is to read each line, and then split it into columns with strtok or strsep.
char line[1024];
while (fgets(line, sizeof(line), fp) != NULL) {
char *column;
int col_num = 0;
for( column = strtok(line, "\t");
column;
column = strtok(NULL, "\t") )
{
col_num++;
printf("%d: %s\n", col_num, column);
}
}
fclose(fp);
strtok is funny. It keeps its own internal state of where it is in the string. The first time you call it, you pass it the string you're looking at. To get the rest of the fields, you call it with NULL and it will keep reading through that string. So that's why there's that funny for loop that looks like it's repeating itself.
Global state is dangerous and very error prone. strsep and strtok_r fix this. If you're being told to use strtok, find a better resource to learn from.
Now that we have each column and its position, we can do what we like with it. I'm going to use a switch to choose only the columns we want.
for( column = strtok(line, "\t");
column;
column = strtok(NULL, "\t") )
{
col_num++;
switch( col_num ) {
case 1:
case 2:
case 3:
case 4:
case 5:
case 9:
case 10:
case 13:
printf("%s\t", column);
break;
default:
break;
}
}
puts("");
You can do whatever you like with the columns at this point. You can print them immediately, or put them in a list, or a structure.
Just remember that column is pointing to memory in line and line will be overwritten. If you want to store column, you'll have to copy it first. You can do that with strdup but *sigh* that isn't standard C. strcpy is really easy to use wrong. If you're stuck with standard C, write your own strdup.
char *mystrdup( const char *src ) {
// strlen, not sizeof: sizeof(src) is only the size of the pointer
char *dst = malloc( strlen(src) + 1 );
if( dst != NULL ) {
strcpy( dst, src );
}
return dst;
}
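Putting the pieces together, here is a rough sketch of filling the question's airPdata struct from one line. It assumes the file really is tab-separated and that the wanted fields sit in columns 1 to 5, 9, 10 and 13, as in the switch above. The names copy_string and parse_airport_line are invented for this example.

```c
#include <stdlib.h>
#include <string.h>

typedef struct airPdata {
    char *siteNumber;
    char *locID;
    char *fieldName;
    char *city;
    char *state;
    char *latitude;
    char *longitude;
    char controlTower;
} airPdata;

/* Standard C stand-in for strdup: heap copy of src, or NULL on failure. */
static char *copy_string(const char *src) {
    char *dst = malloc(strlen(src) + 1);
    if (dst != NULL) {
        strcpy(dst, src);
    }
    return dst;
}

/* Splits one tab-separated line (modified in place by strtok) and copies
   the wanted columns into *out. Returns the number of columns seen. */
int parse_airport_line(char *line, airPdata *out) {
    const char *delims = "\t\r\n"; /* also strips the line ending */
    char *column;
    int col_num = 0;

    *out = (airPdata){0}; /* all pointers start out as NULL */
    for (column = strtok(line, delims); column != NULL;
         column = strtok(NULL, delims)) {
        col_num++;
        switch (col_num) {
        case 1:  out->siteNumber   = copy_string(column); break;
        case 2:  out->locID        = copy_string(column); break;
        case 3:  out->fieldName    = copy_string(column); break;
        case 4:  out->city         = copy_string(column); break;
        case 5:  out->state        = copy_string(column); break;
        case 9:  out->latitude     = copy_string(column); break;
        case 10: out->longitude    = copy_string(column); break;
        case 13: out->controlTower = column[0];           break;
        default: break; /* columns we were told to ignore */
        }
    }
    return col_num;
}
```

Each string member is a separate allocation, so the caller has to free them when the record is no longer needed. One caveat: strtok treats consecutive tabs as a single separator, so an empty field would shift the column numbers.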

fscanf not saving the data to struct?

I have an array of structs and they get saved into a file. Currently there are two lines in the file:
a a 1
b b 2
I am trying to read in the file and have the data saved to the struct:
typedef struct book{
char number[11];//10 numbers
char first[21]; //20 char first/last name
char last[21];
} info;
info info1[500]
into num = 0;
pRead = fopen("phone_book.dat", "r");
if ( pRead == NULL ){
printf("\nFile cannot be opened\n");
}
else{
while ( !feof(pRead) ) {
fscanf(pRead, "%s%s%s", info1[num].first, info1[num].last, info1[num].number);
printf{"%s%s%s",info1[num].first, info1[num].last, info1[num].number); //this prints statement works fine
num++;
}
}
//if I add a print statement after all that I get windows directory and junk code.
This makes me think that the items are not being saved into the struct. Any help would be great. Thanks!
EDIT: Okay so it does save it fine but when I pass it to my function it gives me garbage code.
When I call it:
sho(num, book);
My show function:
void sho (int nume, info* info2){
printf("\n\n\nfirst after passed= %s\n\n\n", info2[0].first); //i put 0 to see the first entry
}
I think you meant int num = 0;, instead of into.
printf{... is a syntax error, printf(... instead.
Check the result of fscanf, if it isn't 3 it hasn't read all 3 strings.
Don't use (f)scanf to read strings, at least not without specifying the maximum length:
fscanf(pRead, "%20s%20s%10s", ...);
But, better yet, use fgets instead:
fgets(info1[num].first, sizeof info1[num].first, pRead);
fgets(info1[num].last, sizeof info1[num].last, pRead);
fgets(info1[num].number, sizeof info1[num].number, pRead);
(and check the result of fgets, of course)
Make sure num doesn't go higher than 499, or you'll overflow info1:
while(num < 500 && !feof(pRead)){.
1.-For better error handling, recommend using fgets(), using widths in your sscanf(), validating sscanf() results.
2.-OP usage of feof(pRead) is easy to misuse - suggest fgets().
char buffer[sizeof(info)*2];
while ((num < 500) && (fgets(buffer, sizeof buffer, pRead) != NULL)) {
char sentinel; // look for extra trailing non-whitespace.
if (sscanf(buffer, "%20s%20s%10s %c", info1[num].first,
info1[num].last, info1[num].number, &sentinel) != 3) {
// Handle_Error
printf("Error <%s>\n",buffer);
continue;
}
printf("%s %s %s\n", info1[num].first, info1[num].last, info1[num].number);
num++;
}
BTW: using %s does not work well should a space exist within a first name or within a last name.
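The per-line check from the loop above can also be pulled out into a small function, which makes it easier to try on its own. This is only a sketch: parse_entry is a made-up name, and the struct is copied from the question.

```c
#include <stdio.h>

typedef struct book {
    char number[11]; /* 10 digits plus '\0' */
    char first[21];  /* 20 characters plus '\0' */
    char last[21];
} info;

/* Parses "first last number" from one line of text.
   Returns 1 if exactly three fields were found, 0 otherwise. */
int parse_entry(const char *buffer, info *out) {
    char sentinel; /* only assigned if extra non-whitespace follows */
    int fields = sscanf(buffer, "%20s%20s%10s %c",
                        out->first, out->last, out->number, &sentinel);
    return fields == 3;
}
```

A line with too few fields makes sscanf return less than 3, and a line with trailing junk (or a number longer than 10 characters) makes it return 4, so both are rejected.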

bilingual program in console application in C

I have been trying to implement a way to make my program bilingual: the user could choose whether the program should display French or English (in my case).
I have done a lot of research and googling but I still cannot find a good example of how to do that :/
I read about gettext, but since this is for a school project we are not allowed to use external libraries (and I must admit I have nooo idea how to make it work even though I tried !)
Someone also suggested using arrays, one for each language. I could definitely make this work but I find the solution super ugly.
Another way I thought of is to have two different files, with one sentence per line, so that I could retrieve the right line for the right language when I need it. I think I could make this work but it also doesn't seem like the most elegant solution.
At last, a friend said I could use a DLL for that. I have looked into that and it indeed seems to be one of the best ways I could find... the problem is that most resources I could find on the matter were written for C# and C++, and I still have no idea how I would implement it in C :/
I can grasp the idea behind it, but have no idea how to handle it in C (at all ! I do not know how to create the DLL, call it, retrieve the right stuff from it or anything >_<)
Could someone point me to some useful resources that I could use, or write a piece of code to explain the way things work or should be done ?
It would be seriously awesome !
Thanks a lot in advance !
(Btw, I use visual studio 2012 and code in C) ^^
If you can't use a third party lib then write your own one! No need for a dll.
The basic idea is to have a file for each locale which contains a mapping (key=value) for text resources.
The name of the file could be something like
resources_<locale>.txt
where <locale> could be something like en, fr, de etc.
When your program starts, it first reads the resource file for the specified locale.
Preferably you will store each key/value pair in a simple struct.
Your read function reads all key/value pairs into a hash table, which offers very good access speed. An alternative would be to sort the array containing the key/value pairs by key and then use binary search on lookup (not the best option, but far better than iterating over all entries each time).
Then you'll have to write a function get_text which takes as its argument the key of the text resource to be looked up and returns the corresponding text as read for the specified locale. You have to handle keys which have no mapping; the simplest way would be to return the key itself.
Here is some sample code (using qsort and bsearch):
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define DEFAULT_LOCALE "en"
#define NULL_ARG "[NULL]"
typedef struct localized_text {
char* key;
char* value;
} localized_text_t;
localized_text_t* localized_text_resources = NULL;
int counter = 0;
char* get_text(char*);
void read_localized_text_resources(char*);
char* read_line(FILE*);
void free_localized_text_resources();
int compare_keys(const void*, const void*);
void print_localized_text_resources();
int main(int argc, char** argv)
{
argv++;
argc--;
char* locale = DEFAULT_LOCALE;
if(! *argv) {
printf("No locale provided, default to %s\n", locale);
} else {
locale = *argv;
printf("Locale provided is %s\n", locale);
}
read_localized_text_resources(locale);
printf("\n%s, %s!\n", get_text("HELLO"), get_text("WORLD"));
printf("\n%s\n", get_text("foo"));
free_localized_text_resources();
return 0;
}
char* get_text(char* key)
{
char* text = NULL_ARG;
if(key) {
text = key;
localized_text_t tmp;
tmp.key = key;
localized_text_t* result = bsearch(&tmp, localized_text_resources, counter, sizeof(localized_text_t), compare_keys);
if(result) {
text = result->value;
}
}
return text;
}
void read_localized_text_resources(char* locale)
{
if(locale) {
char localized_text_resources_file_name[64];
sprintf(localized_text_resources_file_name, "resources_%s.txt", locale);
printf("Read localized text resources from file %s\n", localized_text_resources_file_name);
FILE* localized_text_resources_file = fopen(localized_text_resources_file_name, "r");
if(! localized_text_resources_file) {
perror(localized_text_resources_file_name);
exit(1);
}
int size = 10;
localized_text_resources = malloc(size * sizeof(localized_text_t));
if(! localized_text_resources) {
perror("Unable to allocate memory for text resources");
exit(1); // nothing can be stored without the array
}
char* line;
while((line = read_line(localized_text_resources_file))) {
if(strlen(line) > 0) {
if(counter == size) {
size += 10;
localized_text_resources = realloc(localized_text_resources, size * sizeof(localized_text_t));
}
localized_text_resources[counter].key = line;
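while(*line && *line != '=') { // stop at the end of the line if there is no '='
line++;
}
if(*line == '=') { // split "key=value" in place
*line = '\0';
line++;
}
localized_text_resources[counter].value = line; // empty string if the line had no '='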
counter++;
}
}
fclose(localized_text_resources_file);
qsort(localized_text_resources, counter, sizeof(localized_text_t), compare_keys);
// print_localized_text_resources();
printf("%d text resource(s) found in file %s\n", counter, localized_text_resources_file_name);
}
}
char* read_line(FILE* p_file)
{
int len = 10, i = 0, c = 0;
char* line = NULL;
if(p_file) {
line = malloc(len * sizeof(char));
c = fgetc(p_file);
while(c != EOF) {
if(i + 1 == len) { // grow early so the terminating '\0' below always fits
len += 10;
line = realloc(line, len * sizeof(char));
}
line[i++] = c;
c = fgetc(p_file);
if(c == '\n' || c == '\r') {
break;
}
}
line[i] = '\0';
while(c == '\n' || c == '\r') {
c = fgetc(p_file);
}
if(c != EOF) {
ungetc(c, p_file);
}
if(strlen(line) == 0 && c == EOF) {
free(line);
line = NULL;
}
}
return line;
}
void free_localized_text_resources()
{
if(localized_text_resources) {
while(counter--) {
free(localized_text_resources[counter].key);
}
free(localized_text_resources);
}
}
int compare_keys(const void* e1, const void* e2)
{
return strcmp(((localized_text_t*) e1)->key, ((localized_text_t*) e2)->key);
}
void print_localized_text_resources()
{
int i = 0;
for(; i < counter; i++) {
printf("Key=%s value=%s\n", localized_text_resources[i].key, localized_text_resources[i].value);
}
}
Used with the following resource files
resources_en.txt
WORLD=World
HELLO=Hello
resources_de.txt
HELLO=Hallo
WORLD=Welt
resources_fr.txt
HELLO=Hello
WORLD=Monde
run
(1) out.exe /* default */
(2) out.exe en
(3) out.exe de
(4) out.exe fr
output
(1) Hello, World!
(2) Hello, World!
(3) Hallo, Welt!
(4) Hello, Monde!
gettext is the obvious answer but it seems it's not possible in your case. Hmmm. If you really, really need a custom solution... throwing out a wild idea here...
1: Create a custom multilingual string type. The upside is that you can easily add new languages afterwards, if you want. The downside you'll see in #4.
//Terrible name, change it
typedef struct
{
char *french;
char *english;
} MyString;
2: Define your strings as needed.
MyString s;
s.french = "Bonjour!";
s.english = "Hello!";
3: Utility enum and function
enum
{
ENGLISH,
FRENCH
};
char* getLanguageString(MyString *myStr, int language)
{
switch(language)
{
case ENGLISH:
return myStr->english;
break;
case FRENCH:
return myStr->french;
break;
default:
//How you handle other values is up to you; this sketch falls back to English
return myStr->english;
}
}
4: Create wrapper functions instead of using plain old C standard functions. For instance, instead of printf :
//Function should use the variable arguments and allow a custom format, too
int myPrintf(const char *format, MyString *myStr, int language, ...)
{
return printf(format, getLanguageString(myStr, language));
}
That part is the painful one : you'll need to override every function you use strings with to handle custom strings. You could also specify a global, default language variable to use when one isn't specified.
Again : gettext is much, much better. Implement this only if you really need to.
The main idea of making programs translatable is to use some kind of id in every place where you use text. Then, before displaying the text, you look it up by that id in the table for the appropriate language.
Example:
instead of writing
printf("%s","Hello world");
You write
printf("%s",myGetText(HELLO_WORLD));
Often instead of id the native-language string itself is used. e.g.:
printf("%s",myGetText("Hello world"));
Finally, the myGetText function is usually implemented as a Macro, e.g.:
printf("%s", tr("Hello world"));
This macro could be used by an external parser (like in gettext) for identifying texts to be translated in source code and store them as list in a file.
myGetText could be implemented as follows (this sketch uses C++ std::map for brevity, so it is not plain C):
std::map<std::string, std::map<std::string, std::string> > LangTextTab;
std::string GlobalVarLang="en"; //change to de for obtaining texts in German
void readLanguagesFromFile()
{
LangTextTab["de"]["Hello"]="Hallo";
LangTextTab["de"]["Bye"]="Auf Wiedersehen";
LangTextTab["en"]["Hello"]="Hello";
LangTextTab["en"]["Bye"]="Bye";
}
const char * myGetText( const char* origText )
{
return LangTextTab[GlobalVarLang][origText ].c_str();
}
Please consider the code as pseudo-code. I haven't compiled it. Many issues are still to mention: unicode, thread-safety, etc...
I hope however the example will give you the idea how to start.
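Since the question is about plain C, the id-based lookup described above could look roughly like the sketch below. All names here (language, text_id, text_table, set_language, my_get_text) are made up for illustration, and the strings are hard-coded rather than read from a file.

```c
enum language { LANG_EN, LANG_FR, LANG_COUNT };
enum text_id  { TXT_HELLO, TXT_BYE, TXT_COUNT };

/* One row per language, one column per text id. */
static const char *const text_table[LANG_COUNT][TXT_COUNT] = {
    /* LANG_EN */ { "Hello",   "Bye" },
    /* LANG_FR */ { "Bonjour", "Au revoir" }
};

static enum language current_language = LANG_EN;

/* Selects the language used by my_get_text. Returns 0 on success,
   -1 if lang is not a known language. */
int set_language(enum language lang) {
    if ((int)lang < 0 || (int)lang >= LANG_COUNT) {
        return -1;
    }
    current_language = lang;
    return 0;
}

/* Returns the text for id in the current language, or "" for an
   unknown id. */
const char *my_get_text(enum text_id id) {
    if ((int)id < 0 || (int)id >= TXT_COUNT) {
        return "";
    }
    return text_table[current_language][id];
}
```

With that in place, printf("%s\n", my_get_text(TXT_HELLO)); prints the greeting in whichever language was last selected with set_language. Adding a language means adding one enum value and one row to the table.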

Valgrind C: How to input string from stdio

This is the code:
char* inputString(){
int n = 5;
int size = n;
char* const_str = (char*)malloc((n+1)*sizeof(char));
char* substring = (char*)malloc((n+n)*sizeof(char)); /*here*/
char*p;
while((fgets(const_str,n,stdin)!=NULL)&&(strchr(const_str,'\n')==NULL)){
strcat(substring,const_str);
size += n;
substring = (char*)realloc(substring,size*sizeof(char)); /*here*/
}
strcat(substring,const_str);
size += n;
substring = (char*)realloc(substring,size*sizeof(char)); /*here*/
/*
printf("<%s> is \n",const_str);
printf("%s is \n",substring);
printf("%d is \n",size);
*/
if ((p=strchr(substring,'\n'))!=NULL){
p[0]='\0';
}
if(feof(stdin)){
changeToFull();
}
return substring;
}
and it does not work under valgrind.
I guess that I have a memory leak here, but I can't see any good way to rewrite this function for valgrind.
Please help!
I haven't tried it, but I found this on a question on SO:
--input-fd=<number> [default: 0, stdin]
Specify the file descriptor to use for reading input from the
user. This is used whenever valgrind needs to prompt the user
for a decision.
Original question here: making valgrind able to read user input when c++ needs it
EDIT:
So for your case, you may try:
mkfifo /tmp/abcd
exec 3</tmp/abcd
valgrind_command...... --input-fd=3
and in another terminal, use
cat > /tmp/abcd
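Separately from how input reaches valgrind: the function in the question calls strcat on a buffer that malloc returned without initializing it, and it never frees const_str, and valgrind will likely report both. A minimal sketch of a line reader that avoids these two problems might look like this. The name read_whole_line is made up, and the question's changeToFull() call is left out because its purpose isn't shown.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads one line of any length from fp. Returns a heap string without
   the trailing '\n', or NULL at end of input or on allocation failure.
   The caller frees the result. */
char *read_whole_line(FILE *fp) {
    char chunk[16]; /* fixed-size piece, lives on the stack */
    size_t len = 0; /* characters stored in result so far */
    char *result = NULL;

    while (fgets(chunk, sizeof chunk, fp) != NULL) {
        size_t chunk_len = strlen(chunk);
        char *grown = realloc(result, len + chunk_len + 1);
        if (grown == NULL) {
            free(result);
            return NULL;
        }
        result = grown;
        /* Copy the chunk including its '\0', so result is always a
           valid string and strcat is never needed. */
        memcpy(result + len, chunk, chunk_len + 1);
        len += chunk_len;
        if (len > 0 && result[len - 1] == '\n') {
            result[len - 1] = '\0'; /* drop the newline, line complete */
            break;
        }
    }
    return result;
}
```

Because the temporary chunk is a local array, there is nothing extra to free, and the only allocation that escapes is the returned string.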
