Why are the values in my dynamic 2D char array being overwritten? - c

I'm trying to build a process logger in C on Linux, but I'm having trouble getting it right. I'd like it to have 3 columns: USER, PID, COMMAND. I'm using the output of ps aux and trying to dynamically append it to an array. That is, for every line ps aux outputs, I want to add a row to my array.
This is my code. (To keep the output short, I only grep for sublime. But this could be anything.)
#define _BSD_SOURCE
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char** processes = NULL;
char* substr = NULL;
int n_spaces = 0;
int columns = 1;
char line[1024];
FILE *p;
p = popen("ps -eo user,pid,command --sort %cpu | grep sublime", "r");
if(!p)
{
fprintf(stderr, "Error");
exit(1);
}
while(fgets(line, sizeof(line) - 1, p))
{
puts(line);
substr = strtok(line, " ");
while(substr != NULL)
{
processes = realloc(processes, sizeof(char*) * ++n_spaces);
if(processes == NULL)
exit(-1);
processes[n_spaces - 1] = substr;
// first column user, second PID, third all the rest
if(columns < 2)//if user and PID are already in array, don't split anymore
{
substr = strtok(NULL, " ");
columns++;
}
else
{
substr = strtok(NULL, "");
}
}
columns = 1;
}
pclose(p);
for(int i = 0; i < (n_spaces); i++)
printf("processes[%d] = %s\n", i, processes[i]);
free(processes);
return 0;
}
The output of the for loop at the end looks like this.
processes[0] = user
processes[1] = 7194
processes[2] = /opt/sublime_text/plugin_host 27184
processes[3] = user
processes[4] = 7194
processes[5] = /opt/sublime_text/plugin_host 27184
processes[6] = user
processes[7] = 27194
processes[8] = /opt/sublime_text/plugin_host 27184
processes[9] = user
processes[10] = 27194
processes[11] = /opt/sublime_text/plugin_host 27184
But, from the puts(line) I get that the array should actually contain this:
user 5016 sh -c ps -eo user,pid,command --sort %cpu | grep sublime
user 5018 grep sublime
user 27184 /opt/sublime_text/sublime_text
user 27194 /opt/sublime_text/plugin_host 27184
So, apparently all the values are being overwritten and I can't figure out why... (Also, I don't get where the value 7194 in processes[1] = 7194 and processes[4] = 7194 comes from).
What am I doing wrong here? Is it somehow possible to make the output look like the output of puts(line)?
Any help would be appreciated!

The return value of strtok is a pointer into the string that you tokenise. (The token has been made null-terminated by overwriting the first separator after it with '\0'.)
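When you read a new line with fgets, you overwrite the contents of line. All tokens, not just the ones from the last parse, point into that same buffer, so they now show whatever the latest line put there. (The pointers to the previous tokens remain valid, but the content at these locations changes.) This most likely also explains the 7194: ps right-aligns the PID column, so a pointer saved for a four-digit PID lands one character into the five-digit 27194 of the last line.
There are several ways to fix this.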
You could make the tokens you save arrays of chars and strcpy the parsed contents.
You could duplicate the parsed tokens with strdup (POSIX, and part of standard C only since C23), which allocates memory for the strings on the heap. Each copy must be freed when you are done with it.
You could read an array of lines, so that the tokens are really unique.
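As a rough sketch of the strdup option, something like the following could replace the inner loop of the question. The helper name append_fields and its return convention are made up for illustration; the column logic mirrors the question (user, PID, then the rest of the line as one token).

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split `line` into at most three fields (user, pid,
 * rest of line) and append a private copy of each to *rows.
 * Returns the new element count, or (size_t)-1 on allocation failure. */
size_t append_fields(char ***rows, size_t count, char *line)
{
    assert(rows != NULL && line != NULL);
    line[strcspn(line, "\n")] = '\0';           /* drop trailing newline */

    int column = 1;
    char *tok = strtok(line, " ");
    while (tok != NULL) {
        char **tmp = realloc(*rows, (count + 1) * sizeof **rows);
        if (tmp == NULL)
            return (size_t)-1;
        *rows = tmp;

        (*rows)[count] = strdup(tok);           /* copy, don't alias `line` */
        if ((*rows)[count] == NULL)
            return (size_t)-1;
        count++;

        /* after user and PID, take the remainder as one token */
        tok = strtok(NULL, ++column < 3 ? " " : "");
    }
    return count;
}
```

Because every element is now its own allocation, the cleanup at the end has to free(processes[i]) for each i before free(processes).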

strtok returns a pointer into the string it processes. That string is always stored in the variable line, so all the pointers in your array point into the same buffer, which is overwritten on every fgets.
As @Lundin wrote in a comment, there is no two-dimensional array here, just a one-dimensional array of pointers.
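If the goal really is one row per process with three columns, an array of structs that own their storage gives that layout directly. This is only a sketch: the struct name proc_row, the field sizes and the function parse_row are assumptions, not something from the question.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record type: one row per process, storage owned by the row. */
struct proc_row {
    char user[33];
    int  pid;
    char command[256];
};

/* Parse one "USER PID COMMAND..." line into *row.
 * Returns 1 on success, 0 if the line does not have all three fields. */
int parse_row(const char *line, struct proc_row *row)
{
    assert(line != NULL && row != NULL);
    /* %32s and %255[^\n] bound the writes; the spaces in the format
     * skip any column padding that ps inserts */
    return sscanf(line, "%32s %d %255[^\n]",
                  row->user, &row->pid, row->command) == 3;
}
```

Each fgets line can then be parsed into rows[n] of a realloc'd struct proc_row array, and nothing in the array refers back to line.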

Related

fgets loop only works properly if string is malloc'd again at the end of the loop

I have to read in comma separated lines from a file, break down the args between that commas, and handle them accordingly. I have the following code set up to do exactly what I need:
char *string = malloc(MAX_INPUT);
char * current_parent;
char * current_child;
// while there are still lines to read from the file
while(fgets(string, BUFF_SIZE, file) != NULL){
// get the parent of a line and its first child
current_parent = trim(strtok(string, ","));
if(strcmp(current_parent, "") != 0){
current_child = trim(strtok(NULL, ","));
if(current_child == NULL || strcmp(current_child, "") == 0){
tree = add_child(tree, current_parent, "");
}
else{
while(current_child != NULL){
tree = add_child(tree, current_parent, current_child);
current_child = trim(strtok(NULL, ","));
}
}
current_parent = 0;
current_child = 0;
}
string = (char *)malloc(MAX_INPUT);
}
// close the file
free(string);
fclose(file);
Example of a file:
john 1, sam 2
sam 2, ben 4, frances, sam 3
ben 4, sam 4, ben 5, nancy 2, holly
ben 5, john 2, sam 5
Whatever is before the first comma is the parent's name, and all other strings after that are its children's names.
My problem is that for whatever reason it only reads in the data properly if I malloc the string all over again at the end of the loop. In addition to this, if I try to free the memory before malloc'ing again, it doesn't work - resulting in memory leaks. I've even tried setting up the loop to just use a fixed sized string, but it still then reads in this jumbled data at the end. I've also tried manually emptying string at the end of each loop. I have a feeling that I may be overlooking something really simple but this has been driving me crazy. Let me know if I can provide anything else that would be of help. Thanks.
Edit: Not sure if this is important for anything but MAX_INPUT and BUFF_SIZE are both 1024

How to pass an array of char's to a function in C

I'm making my own command prompt (school project) and I'm trying to keep track of the last 10 commands the user uses. So I have an array:
char* history[10];
From my understanding this means I have an array of pointers, which point to strings. My problem with this is that I have another variable, input, that holds the user's input. But whenever the user inputs something new, the value of input changes, meaning all of the strings in my array change to the user's new input.
I'm wondering how I can get around this?
I tried changing my array to the following:
char *history[10][MAX] //Where MAX = 256
That way I could use strcpy instead, but I was unable to figure out how to pass an array of arrays into a function, and then use strcpy to copy the string into the array of arrays.
Here is my current method:
char* updateHistory(char *history[], char command[], int histIndex) {
history[histIndex] = command;
return *history;
}
Any help on another solution or how to get my solution working?
Your array of pointers needs to point to heap-allocated memory. It sounds as if every entry currently points to the same buffer, and that buffer changes.
So something like this should work:
#define MAX_HISTORY 10
char* history[MAX_HISTORY];
if (fgets(input, sizeof(input), stdin) != NULL)
{
input[strcspn(input, "\n")] = '\0'; // remove \n if present (safe when there is none)
history[n++] = strdup(input); // allocates and copies the input string
if ( n == MAX_HISTORY )
{
// throw away the oldest and move pointers forward one step
}
}
strdup is conceptually the same as
malloc() + strcpy()
so when you move the pointers forward, and when you want to clear the history, you need to free() what the pointers point to.
Alternatively, if you do not want to use the heap, you could keep the history in one big buffer,
char history[MAX_HISTORY][MAX_CMD_LEN];
but then you would either need to shift more data, which is less elegant and less efficient, or need a more elaborate indexing system to keep track of the contents.
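The "throw away the oldest and move pointers forward one step" part could look roughly like this. The function names history_add and history_clear are invented for the sketch; it assumes the entries were all created by strdup, so free() is the right way to release them.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_HISTORY 10

/* Append a private copy of `input` to history[0..*n-1].
 * When the array is full, the oldest entry is freed and the remaining
 * pointers are shifted down one slot. Returns 0 on success, -1 on failure. */
int history_add(char *history[MAX_HISTORY], size_t *n, const char *input)
{
    assert(history != NULL && n != NULL && input != NULL);
    assert(*n <= MAX_HISTORY);

    char *copy = strdup(input);
    if (copy == NULL)
        return -1;

    if (*n == MAX_HISTORY) {
        free(history[0]);                                    /* drop oldest */
        memmove(&history[0], &history[1],
                (MAX_HISTORY - 1) * sizeof history[0]);
        (*n)--;
    }
    history[(*n)++] = copy;
    return 0;
}

/* Release every stored string. */
void history_clear(char *history[MAX_HISTORY], size_t *n)
{
    assert(history != NULL && n != NULL);
    for (size_t i = 0; i < *n; i++)
        free(history[i]);
    *n = 0;
}
```

Only pointers are moved when the history is full, which is why shifting is cheap here compared with shifting whole strings.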
"meaning all of the strings in my array change to the user's new input."
This probably happens because command always refers to the same single buffer. Every time updateHistory executes its first line, it stores that same address, so all the pointers in your array of pointers end up pointing to the same memory location: command.
To fix this, you need to allocate your array of pointers like this (for example you can do this outside your function):
char *history[10];
for ( i=0; i < 10; i++ )
{
history[i] = malloc(MAXLEN);
}
Then to copy string (this could go inside your function):
strcpy(history[i], command);
Also, don't forget to free each pointer in the array at the end.
While you are free to allocate space on the heap with malloc or calloc, if you are limiting your history to a reasonable size, a simple 2D statically declared character array can work equally well. For example:
#include <stdio.h>
#include <string.h>
/* constants for max pointers, max chars */
enum {MAXP = 10, MAXC = 256};
int main (void) {
char history[MAXP][MAXC] = {{0}};
char buf[MAXC] = {0};
size_t i, n = 0;
while (printf ("prompt > ") && fgets (buf, MAXC, stdin)) {
size_t buflen = strcspn (buf, "\n"); /* length up to the newline, if any */
buf[buflen] = 0; /* strip newline */
/* check buflen to prevent saving empty strings */
if (!buflen) continue;
strncpy (history[n++], buf, buflen);
if (n == MAXP) /* handle full history */
break;
}
for (i = 0; i < n; i++)
printf (" history[%zu] : %s\n", i, history[i]);
return 0;
}
Example Use/Output
$ ./bin/fgets_static2d_hist
prompt > ls -al
prompt > mv a b/foo.txt
prompt > rsync ~/tmp/xfer hostb:~/tmp
prompt > du -hcs
prompt > cat /proc/cpuinfo
prompt > grep buflen *
prompt > ls -rt debug
prompt > gcc -Wall -Wextra -Ofast -o bin/fgets_static2d_hist fgets_static2d_hist.c
prompt > objdump obj/fgets_static2d.obj
prompt > source-highlight -i fgets_static2d.c -o fgets_static2d.html
history[0] : ls -al
history[1] : mv a b/foo.txt
history[2] : rsync ~/tmp/xfer hostb:~/tmp
history[3] : du -hcs
history[4] : cat /proc/cpuinfo
history[5] : grep buflen *
history[6] : ls -rt debug
history[7] : gcc -Wall -Wextra -Ofast -o bin/fgets_static2d_hist fgets_static2d_hist.c
history[8] : objdump obj/fgets_static2d.obj
history[9] : source-highlight -i fgets_static2d.c -o fgets_static2d.html
The benefit you get from a statically declared array is automatic memory management of your array storage and a slight benefit in efficiency from the memory being allocated from the stack. Either will do, it is just a matter of how much information you are managing.
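If the history should keep accepting commands once it is full, one option that avoids shifting any strings is to treat the 2D array as a ring buffer. This is a sketch only; the struct history wrapper and the function names are assumptions layered on top of the MAXP/MAXC constants used above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { MAXP = 10, MAXC = 256 };

/* Hypothetical ring buffer: `next` is the slot the next command goes into,
 * `count` is how many slots hold valid commands (at most MAXP). */
struct history {
    char   slot[MAXP][MAXC];
    size_t next;
    size_t count;
};

/* Store a copy of cmd, overwriting the oldest entry once the buffer is full. */
void history_push(struct history *h, const char *cmd)
{
    assert(h != NULL && cmd != NULL);
    snprintf(h->slot[h->next], MAXC, "%s", cmd);   /* truncates, always terminates */
    h->next = (h->next + 1) % MAXP;
    if (h->count < MAXP)
        h->count++;
}

/* Return the i-th stored command, oldest first (requires i < h->count). */
const char *history_get(const struct history *h, size_t i)
{
    assert(h != NULL && i < h->count);
    size_t oldest = (h->next + MAXP - h->count) % MAXP;
    return h->slot[(oldest + i) % MAXP];
}
```

A zero-initialized struct history h = {0}; is an empty history, and the print loop becomes for (i = 0; i < h.count; i++) with history_get(&h, i).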
When you want to pass an array of pointers to a function, pass the array name itself. No '&' is needed, because the array decays to a pointer to its first element.
For example:
This is the array you have declared: char* history[10];
This is the function you have used:
char* updateHistory(char *history[], char command[], int histIndex) {
history[histIndex] = command;
return *history;
}
So, in the body of main(), call it like this:
int main(void)
{
updateHistory(history, command, histIndex);
}
(&history has type char *(*)[10], which does not match the char *history[] parameter.) Note that this only gets the array into the function correctly; the function still stores the pointer command rather than a copy of the string.
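For the array-of-arrays version the question asks about, the parameter has to carry the inner dimension. A minimal sketch, assuming the storage is declared as char history[HIST][MAX] (without the extra * from the question) and using an invented function name:

```c
#include <assert.h>
#include <string.h>

#define HIST 10
#define MAX  256

/* The parameter type is "pointer to an array of MAX chars", which is what
 * a char[HIST][MAX] argument decays to when it is passed. */
void update_history(char history[][MAX], const char *command, int index)
{
    assert(history != NULL && command != NULL);
    assert(index >= 0 && index < HIST);
    strncpy(history[index], command, MAX - 1);
    history[index][MAX - 1] = '\0';      /* strncpy may not terminate */
}
```

The call is simply update_history(history, input, histIndex); since each row holds its own copy, later changes to input no longer affect earlier entries.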

Breaking a string in C with multiple spaces

Ok, so my code currently splits a single string like this: "hello world" into:
hello
world
But when I have multiple spaces in between, before or after within the string, my code doesn't behave. It takes that space and counts it as a word/number to be analyzed. For example, if I put in two spaces in between hello and world my code would produce:
hello
(a space character)
world
The space is actually counted as a word/token.
int counter = 0;
int index = strcur->current_index;
char *string = strcur->myString;
char token_buffer = string[index];
while(strcur->current_index <= strcur->end_index)
{
counter = 0;
token_buffer = string[counter+index];
while(!is_delimiter(token_buffer) && (index+counter)<=strcur->end_index)//delimiters are: '\0','\n','\r',' '
{
counter++;
token_buffer = string[index+counter];
}
char *output_token = malloc(counter+1);
strncpy(output_token,string+index,counter);
printf("%s \n", output_token);
TKProcessing(output_token);
//update information
counter++;
strcur->current_index += counter;
index += counter;
}
I can see the problem area in my loop, but I'm a bit stumped as to how to fix this. Any help would be much appreciated.
From a coding standpoint, if you want to know how to do this without a library as an exercise: your inner loop stops when it runs into the first delimiter. On the next pass of the outer loop you are sitting on the second delimiter, so you never enter the inner while loop and you print an empty token. To skip the whole run of delimiters, you can put this at the update step:
//update information
while(is_delimiter(token_buffer) && (index+counter)<=strcur->end_index)
{
counter++;
token_buffer = string[index+counter];
}
Use the standard C library function strtok()
rather than redeveloping such a standard function.
Here's the related manual page.
You can use it as follows in your case:
#include <string.h>
char *token;
token = strtok (string, " \r\n");
while (token != NULL)
{
// do something with the current token
token = strtok (NULL, " \r\n");
}
As you can observe, each subsequent call to strtok with NULL as the first argument returns a char* pointing to the next token. Runs of consecutive delimiters are skipped, so no empty tokens are produced.
If you're working on a threaded program, you might use the POSIX strtok_r() function instead.
The first call is the same as for strtok(), plus a char ** argument in which strtok_r() keeps its position; subsequent calls pass NULL as the first argument and the same saveptr:
#include <string.h>
char *token;
char *saveptr;
token = strtok_r(string, " \r\n", &saveptr);
while (token != NULL)
{
// do something with the current token
token = strtok_r(NULL, " \r\n", &saveptr);
}
Just put the token-processing logic inside an if(counter > 0){...}, so that malloc happens only when there was a real token, like this:
if(counter > 0){ // there is a real word, not just delimiters
char *output_token = malloc(counter+1);
strncpy(output_token,string+index,counter);
output_token[counter] = '\0'; // strncpy does not terminate the copy here
printf("%s \n", output_token);
TKProcessing(output_token);
}
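For anyone doing this by hand without modifying the input string, the same skip-then-scan idea can be written with strspn and strcspn. The function name next_token and the DELIMS set are assumptions for this sketch; the caller owns (and must free) each returned token.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DELIMS " \r\n"

/* Return a heap copy of the next token in *cursor and advance *cursor past it.
 * Returns NULL when no token is left (or on allocation failure). */
char *next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *start = *cursor + strspn(*cursor, DELIMS); /* skip delimiter run */
    size_t len = strcspn(start, DELIMS);                   /* token length */
    if (len == 0)
        return NULL;

    char *token = malloc(len + 1);
    if (token == NULL)
        return NULL;
    memcpy(token, start, len);
    token[len] = '\0';
    *cursor = start + len;
    return token;
}
```

Leading, trailing and repeated spaces all fall into the strspn skip, so they never show up as tokens.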

Parsing commands shell-like in C

I want to parse user input commands in my C (just C) program. Sample commands:
add node ID
add arc ID from ID to ID
print
exit
and so on. Then I want to do some validation with IDs and forward them to specified functions. Functions and validations are of course ready. It's all about parsing and matching functions...
I've made it with many ifs and strtoks, but I'm sure it's not the best way... Any ideas (libs)?
I think what you want is something like this:
while (1)
{
char *line = malloc(128); // we need to be able to increase the pointer
char *origLine = line;
if (fgets(line, 128, stdin) == NULL) { // stop on EOF or read error
free(origLine);
break;
}
char command[20] = ""; // stays empty if the line has no word
sscanf(line, "%19s", command); // %19s leaves room for the terminating '\0'
line = strchr(line, ' ');
printf("The Command is: %s\n", command);
unsigned argumentsCount = 0;
char **arguments = malloc(sizeof(char *));
while (1)
{
char arg[20];
if (line && (sscanf(++line, "%19s", arg) == 1)) // %19s: at most 19 chars plus '\0'
{
arguments[argumentsCount] = malloc(sizeof(char) * 20);
strncpy(arguments[argumentsCount], arg, 20);
argumentsCount++;
arguments = realloc(arguments, sizeof(char *) * (argumentsCount + 1)); // room for the next pointer
line = strchr(line, ' ');
}
else {
break;
}
}
for (int i = 0; i < argumentsCount; i++) {
printf("Argument %i is: %s\n", i, arguments[i]);
}
for (int i = 0; i < argumentsCount; i++) {
free(arguments[i]);
}
free(arguments);
free(origLine);
}
You can do what you wish with 'command' and 'arguments' just before you free it all.
It depends on how complicated your command language is. It might be worth going to the trouble of womping up a simple recursive descent parser if you have more than a couple of commands, or if each command can take multiple forms, such as your add command.
I've done a couple of RDPs by hand for some projects in the past. It's a bit of work, but it allows you to handle some fairly complex commands that wouldn't be straightforward to parse otherwise. You could also use a parser generator like lex/yacc or flex/bison, although that may be overkill for what you are doing.
Otherwise, it's basically what you've described: strtok and a bunch of nested if statements.
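A middle ground between nested ifs and a full parser is a table that maps the first word to a handler function. The sketch below is an assumption-heavy illustration: the handler names, their return codes and the -2 "unknown command" value are all made up, and real handlers would do the ID validation mentioned in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 8

typedef int (*handler_fn)(int argc, char **argv);

/* Hypothetical handlers; real ones would validate IDs and do the work. */
static int cmd_add(int argc, char **argv)   { (void)argv; return argc >= 3 ? 0 : -1; }
static int cmd_print(int argc, char **argv) { (void)argc; (void)argv; return 0; }

static const struct { const char *name; handler_fn fn; } commands[] = {
    { "add",   cmd_add   },
    { "print", cmd_print },
};

/* Split `line` in place on whitespace, look up the first word in the table
 * and call its handler. Returns the handler's result, or -2 if unknown. */
int dispatch(char *line)
{
    assert(line != NULL);
    char *argv[MAX_ARGS];
    int argc = 0;
    for (char *tok = strtok(line, " \t\r\n");
         tok != NULL && argc < MAX_ARGS;
         tok = strtok(NULL, " \t\r\n"))
        argv[argc++] = tok;

    if (argc == 0)
        return -2;
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++)
        if (strcmp(argv[0], commands[i].name) == 0)
            return commands[i].fn(argc, argv);
    return -2;
}
```

Adding a command then means adding one table row and one function, and sub-forms such as "add node" versus "add arc" can be decided inside the handler from argv[1].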
I just wanted to add something to Richard Ross's reply: check the values returned by malloc and realloc. Not checking them may lead to hard-to-find crashes in your program.
All your command line parameters will be stored into a array of strings called argv.
You can access those values using argv[0], argv[1] ... argv[n].

Shell program pipes C

I am trying to run a small shell program and the first step to make sure my code is running properly is to make sure I get the correct command and parameters:
//Split the command and store each string in parameter[]
cp = (strtok(command, hash)); //Get the initial string (the command)
parameter[0] = (char*) malloc(strlen(cp)+ 1); //Allocate some space to the first element in the array
strncpy(parameter[0], cp, strlen(cp)+ 1);
for(i = 1; i < MAX_ARG; i++)
{
cp = strtok(NULL, hash); //Check for each string in the array
parameter[i] = (char*) malloc(strlen(cp)+ 1);
strncpy(parameter[i], cp, strlen(cp)+ 1); //Store the result string in an indexed off array
if(parameter[i] == NULL)
{
break;
}
if(strcmp(parameter[i], "|") == 0)
{
cp = strtok(NULL, hash);
parameter2[0] = (char*) malloc(strlen(cp)+ 1);
strncpy(parameter2[0], cp, strlen(cp)+ 1);
//Find the second set of commands and parameters
for (j = 1; j < MAX_ARG; j++)
{
cp = strtok(NULL, hash);
if (strlen(cp) == NULL)
{
break;
}
parameter2[j] = (char*) malloc(strlen(cp)+ 1);
strncpy(parameter2[j], cp, strlen(cp)+ 1);
}
break;
}
I am having a problem when I compare cp and NULL: my program crashes. What I want is to exit the loop once the entries for the second set of parameters have finished (which is what I tried doing with the if(strlen(cp) == NULL)).
I may have misunderstood the question, but your program won't ever see the pipe character, |.
The shell processes the entire command line, and your program will only be given its share of the command line, so to speak.
Example:
cat file1 file2 | sed s/frog/bat/
In the above example, cat is invoked with only two arguments, file1, and file2. Also, sed is invoked with only a single argument: s/frog/bat/.
Let's look at your code:
parameter[0] = malloc(255);
Since strtok() carves up the original command array, you don't have to allocate extra space with malloc(); you could simply point the parameter[n] pointers to the relevant sections of the original command string. However, once you move beyond space-separated commands (in a real shell, the | symbol does not have to be surrounded by spaces, but it does in yours), then you will probably need to copy the parts of the command string around, so this is not completely wrong.
You should check for success of memory allocation.
cp = strtok(command, " "); //Get the initial string (the command)
strncpy(parameter[0], cp, 50);
You allocated 255 characters; you copy at most 50. It might be better to wait until you have the parameter isolated, and then duplicate it, allocating just the space that is needed. Note that if the (path leading to the) command name is 50 characters or more, you won't have a null-terminated string: the space allocated by malloc() is not zeroed and strncpy() does not write a trailing zero on an overlong string.
for (i = 1; i < MAX_ARG; i++)
It is not clear that you should have an upper limit on the number of arguments that is as simple as this. There is an upper limit, but it is normally on the total length of all the arguments.
{
parameter[i] = malloc(255);
Similar comments about memory allocation - and checking.
cp = strtok(NULL, " ");
parameter[i] = cp;
Oops! There goes the memory. Sorry about the leak.
if (strcmp(parameter[i], "|") == 0)
I think it might be better to do this comparison before copying... Also, you don't want the pipe in the argument list of either command; it is a notation to the shell, not part of the command's argument lists. You should also ensure that the first command's argument list is terminated with a NULL pointer, especially since i is set just below to MAX_ARG so you won't know how many arguments were specified.
{
i = MAX_ARG;
cp = strtok(NULL, " ");
parameter2[0] = malloc(255);
strncpy(parameter2[0], cp, 50);
This feels odd; you isolate the command and then process its arguments separately. Setting i = MAX_ARG seems funny too since your next action is to break the loop.
break;
}
if(parameter[i] == NULL)
{
break;
}
}
//Find the second set of commands and parameter
//strncpy(parameter2[0], cp, 50);
for (j = 1; j < MAX_ARG; j++)
{
parameter2[j] = malloc(255);
cp = strtok(NULL, " ");
parameter2[j] = cp;
}
You should probably only enter this loop if you found a pipe. Then this code leaks memory like the other one does (so you're consistent - and consistency is important; but so is correctness).
You need to review your code to ensure it handles 'no pipe symbol' properly, and 'pipe but no following command'. At some point, you should consider multi-stage pipelines (three, four, ... commands). Generalizing your code to handle that is possible.
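One way those cases could be handled is sketched below, under a few assumptions: the function name split_pipeline and its return codes are invented, only a single "|" surrounded by spaces is supported, and the argument pointers refer into the original command string instead of malloc'd copies. The key point for the crash in the question is that the strtok() result is tested against NULL before anything else touches it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARG 16

/* Split `command` in place into two NULL-terminated argument vectors,
 * left and right of a standalone "|" token. Each array must have room for
 * MAX_ARG + 1 pointers. Returns 1 if a pipe was found, 0 if not, and -1 for
 * a missing command on either side, a second pipe, or too many arguments. */
int split_pipeline(char *command, char *left[], char *right[])
{
    assert(command != NULL && left != NULL && right != NULL);
    char **current = left;
    int n = 0, saw_pipe = 0;

    left[0] = right[0] = NULL;
    for (char *cp = strtok(command, " \n"); cp != NULL; cp = strtok(NULL, " \n")) {
        if (strcmp(cp, "|") == 0) {
            if (saw_pipe || n == 0)
                return -1;              /* second pipe, or nothing before it */
            saw_pipe = 1;
            current = right;
            n = 0;
            continue;                   /* the pipe itself is not an argument */
        }
        if (n == MAX_ARG)
            return -1;
        current[n++] = cp;              /* points into `command`; no malloc */
        current[n] = NULL;              /* keep the vector terminated */
    }
    if (saw_pipe && n == 0)
        return -1;                      /* pipe with no following command */
    return saw_pipe;
}
```

Both vectors come back NULL-terminated, which is the form the exec family expects; generalizing to longer pipelines would mean an array of such vectors rather than two fixed ones.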
When writing code for Bash or an equivalent shell, I frequently use notations such as this script, which I used a number of times today.
ct find /vobs/somevob \
-branch 'brtype(dev.branch)' \
-version 'created_since(2011-10-11T00:00-00:00)' \
-print |
grep -v '/0$' |
xargs ct des -fmt '%u %d %Vn %En\n' |
grep '^jleffler ' |
sort -k 4 |
awk '{ printf "%-8s %s %-25s %s\n", $1, $2, $3, $4; }'
It doesn't much matter what it does (but it finds all checkins I made since 11th October on a particular branch in ClearCase); it's the notation that I was using that is important. (Yes, it could probably be optimized - it wasn't worth doing so.) Equally, this is not necessarily what you need to deal with now - but it does give you an inkling of where you need to go.
