I am having some trouble using system calls in C. I am trying to use read to read some input from stdin, then use strtok to load the values in an array, but I can't seem to do that right (I keep on getting segfault).
Here is the code I began with:
void read_input()
{
char* c;
read(0, c, 128);
printf("%s", c);
}
So, this works fine, and I can print the output. However, I have tried several things next and they haven't worked. I have tried:
Creating an array char arr[128], then using different variations of strcpy, strncpy, and memcpy to copy c into arr, but they haven't worked, and I get a segfault.
Actually, that's all I have tried. I am not sure how I am supposed to copy c into arr so I can use strtok. Can anyone explain?
Thanks
Edit:
Okay, this is my new code:
void
read_input()
{
char arr[129];
int r = read(0, arr, 129);
printf("%s", arr);
arr[r] = '\0';
char* pch;
pch = strtok(arr, " \n");
while(pch != NULL)
{
printf("%s\n", pch);
pch = strtok(NULL, " \n");
}
}
I am trying to read from stdin an input like "hi my name is john". This is the output I get from the printfs:
hi my name is john
�����hi
my
name
is
john
Why does the first token look like that? I noticed that if I dont add arr[r] = '\0', then "john" will look similar to "hi". Is there something I need to do for the first character, similar to the last?
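You could read directly into an array:
void read_input()
{
char arr[129]; // 128 bytes of data + 1 for the NUL terminator
ssize_t r = read(0, arr, 128);
if (r < 0)
return; // read failed
arr[r] = '\0'; // read() does not terminate the buffer
printf("%s", arr);
}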
Or the dynamic memory allocation route:
char *arr = malloc((sizeof(*arr) * 128) + 1);
if (arr == NULL)
{
// handle error
}
ssize_t r = read(0, arr, 128);
if (r < 0)
r = 0; // read failed; treat it as empty input
arr[r] = '\0';
printf("%s", arr);
free(arr);
Your second program doesn't work because you put the NUL terminator after the call to printf. You need to do it before printf:
char arr[129]; // 128 + 1 for the NUL terminator
int r = read(0, arr, 128);
arr[r] = '\0';
printf("%s\n", arr); // \n for better readability of the output
You also need one more byte for the NUL terminator, hence the 129.
printf with the %s format specifier prints NUL-terminated strings. Because you don't write the NUL before calling printf, it displays every byte in the buffer until it happens to encounter a NUL, hence the output you get:
hi my name is john
�����
...
This output may vary, it depends on the previous content of the arr buffer.
The first version is wrong because the pointer c is never initialized, so it does not point to valid memory. The program may appear to work correctly in this case, but this is so-called "undefined behaviour"; look that term up and keep in mind that "undefined behaviour" includes "apparently working fine".
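The points made in the answers above (a real buffer, one spare byte, terminate before use) can be combined into a small helper. This is only a sketch under stated assumptions: read_text and print_words are hypothetical names, not library functions, and read_text performs a single read() call, so any input beyond the buffer size is left unread.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: read at most cap - 1 bytes from fd into buf and
   NUL-terminate the result. Returns the byte count, or -1 on error. */
ssize_t read_text(int fd, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;
    ssize_t r = read(fd, buf, cap - 1);
    if (r < 0) {
        buf[0] = '\0';          /* leave a valid empty string behind */
        return -1;
    }
    buf[r] = '\0';              /* terminate BEFORE any printf or strtok */
    return r;
}

/* Hypothetical helper: print each space- or newline-separated word
   on its own line. Modifies buf, because strtok writes NULs into it. */
void print_words(char *buf)
{
    for (char *tok = strtok(buf, " \n"); tok != NULL; tok = strtok(NULL, " \n"))
        printf("%s\n", tok);
}
```

With char arr[129], a call to read_text(0, arr, sizeof arr) followed by print_words(arr) would follow the same steps as the corrected second program.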
I have this code, which reads multiple files and prints certain values. After reading a few files, my while loop stops with a segmentation fault ...
Here is my code
int main () {
const char s[2] = ",";
const char s2[2] = ":";
char var1[] = "fiftyTwoWeekHigh\"";
char *fiftyhigh;
char *fiftyhigh2;
char *fiftyhigh_token;
char *fiftyhigh2_token;
char var2[] = "fiftyTwoWeekLow\"";
char *fiftylow;
char *fiftylow2;
char *fiftylow_token;
char *fiftylow2_token;
char var3[] = "regularMarketPrice\"";
char *price;
char *price2;
char *price_token;
char *price2_token;
FILE *fp;
char* data = "./data/";
char* json = ".json";
char line[MAX_LINES];
char line2[MAX_LINES];
int len;
char* fichier = "./data/indices.txt";
fp = fopen(fichier, "r");
if (fp == NULL){
printf("Impossible d'ouvrir le fichier %s", fichier);
return 1;
}
while (fgets(line, sizeof(line), fp) != NULL) {
char fname[10000];
len = strlen(line);
if (line[len-1] == '\n') {
line[len-1] = 0;
}
int ret = snprintf(fname, sizeof(fname), "%s%s%s", data, line, json);
if (ret < 0) {
abort();
}
printf("%s\n", fname);
FILE* f = fopen(fname, "r");
while ( fgets( line2, MAX_LINES, f ) != NULL ) {
fiftyhigh = strstr(line2, var1);
fiftyhigh_token = strtok(fiftyhigh, s);
fiftyhigh2 = strstr(fiftyhigh_token, s2);
fiftyhigh2_token = strtok(fiftyhigh2, s2);
printf("%s\n", fiftyhigh2_token);
fiftylow = strstr(line2, var2);
fiftylow_token = strtok(fiftylow, s);
fiftylow2 = strstr(fiftylow_token, s2);
fiftylow2_token = strtok(fiftylow2, s2);
printf("%s\n", fiftylow2_token);
price = strstr(line2, var3);
price_token = strtok(price, s);
price2 = strstr(price_token, s2);
price2_token = strtok(price2, s2);
printf("%s\n", price2_token);
//printf("\n%s\t%s\t%s\t%s\t%s", line, calculcx(fiftyhigh2_token, price2_token, fiftylow2_token), "DIV-1", price2_token, "test");
}
fclose(f);
}
fclose(fp);
return 0;
}
and the output is :
./data/k.json
13.59
5.31
8.7
./data/BCE.json
60.14
46.03
56.74
./data/BNS.json
80.16
46.38
78.73
./data/BLU.json
16.68
2.7
Segmentation fault
It looks like my program stops because it can't reach certain data in a certain file. Is there a way to allocate more memory? My MAX_LINES is already set to 6000.
I'm assuming that the lines in your file look something like this:
{"fiftyTwoWeekLow":32,"fiftyTwoWeekHigh":100, ... }
In other words it's some kind of JSON format. I'm assuming that the line starts with '{' so each line is a JSON object.
You read that line into line2, which now contains:
{"fiftyTwoWeekLow":32,"fiftyTwoWeekHigh":100, ... }\0
Note the \0 at the end that terminates the string. Note also that "fiftyTwoWeekLow" comes first, which turns out to be really important.
Now let's trace through the code here:
fiftyhigh = strstr(line2, var1);
fiftyhigh_token = strtok(fiftyhigh, s);
First you call strstr to find the position of "fiftyTwoWeekHigh". This will return a pointer to the position of that field name in the line. Then you call strtok to find the comma that separates this value from the next. I think that this is where things start to go wrong. After the call to strtok, line2 looks like this:
{"fiftyTwoWeekLow":32,"fiftyTwoWeekHigh":100\0 ... }\0
Note that strtok has modified the string: the comma has been replaced with \0. That's so you can use the returned pointer fiftyhigh_token as a string without seeing all the stuff that came after the comma.
fiftyhigh2 = strstr(fiftyhigh_token, s2);
fiftyhigh2_token = strtok(fiftyhigh2, s2);
printf("%s\n", fiftyhigh2_token);
Next you look for the colon and then call strtok with a pointer to the colon. Since the delimiter you're passing to strtok is the colon, strtok skips the leading colon and returns the next token. Because the string we're looking at ends after "100" and contains no more colons, that token is the rest of the string, in other words, the number.
So you've gotten your number, but probably not in the way you expected? There was really no point in the second call to strtok since (assuming the JSON was well-formed) the position of "100" was just fiftyhigh2+1.
Now we try to find "fiftyTwoWeekLow:"
fiftylow = strstr(line2, var2);
fiftylow_token = strtok(fiftylow, s);
fiftylow2 = strstr(fiftylow_token, s2);
fiftylow2_token = strtok(fiftylow2, s2);
printf("%s\n", fiftylow2_token);
This is basically the same process, and after you call strtok, line2 looks like this:
{"fiftyTwoWeekLow":32\0"fiftyTwoWeekHigh":100\0 ... }\0
Note that you're only able to find "fiftyTwoWeekLow" because it comes before "fiftyTwoWeekHigh" in the line. If it had come after, then you'd have been unable to find it due to the \0 added after "fiftyTwoWeekHigh" earlier. In that case, strstr would have returned NULL, which would cause strtok to return NULL, and then you'd definitely have gotten a seg fault after passing NULL to strstr.
So the code is really sensitive to the order in which the fields appear in the line, and it's probably failing because some of your lines have the fields in a different order. Or maybe some fields are just missing from some lines, which would have the same effect.
If you're parsing JSON, you should really use a library designed for that purpose. But if you really want to use strtok then you should:
Read line2.
Call strtok(line2, ",") once, then repeatedly call strtok(NULL, ",") in a loop until it returns null. This will break up the line into tokens that each look like "someField":100.
Isolate the field name and value from each of these tokens (just call strchr(token, ':') to find the value). Do not call strtok here, because it will change the internal state of strtok and you won't be able to use strtok(NULL, ",") to continue processing the line.
Test the field name, and depending on its value, set an appropriate variable. In other words, if it's the "fiftyTwoWeekLow" field, set a variable called fiftyTwoWeekLow. You don't have to bother to strip off the quotes, just include them in the string you're comparing with.
Once you've processed all the tokens (strtok returns NULL), do something with the variables you set.
You may be able to pass ",{}" as the delimiter to strtok in order to get rid of any open and close curly braces that surround the line. Or you could look for them in each token and ignore them if they appear.
You could also pass "\"{},:" as the delimiter to strtok. This would cause strtok to emit an alternating sequence of field names and values. You could call strtok once to get the field name, again to get the value, then test the field name and do something with the value.
Using strtok is a pretty primitive way of parsing JSON, but it will work as long as your JSON only contains simple field names and numbers and doesn't include any strings that themselves contain delimiter characters.
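The steps listed above might be sketched as follows. This assumes each line is a flat object of "name":number pairs with no spaces around the names, as in the example line; struct quote and parse_quote are hypothetical names introduced for illustration, and nested objects or string values containing commas are not handled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the three values the question prints. */
struct quote {
    double high, low, price;
};

/* Split line on commas, braces and newlines, then split each token at
   its colon. Modifies line. Returns how many of the three fields were
   found, so the caller can detect lines with missing fields. */
int parse_quote(char *line, struct quote *q)
{
    int found = 0;
    for (char *tok = strtok(line, ",{}\n"); tok != NULL;
         tok = strtok(NULL, ",{}\n")) {
        char *colon = strchr(tok, ':');   /* strchr, not strtok: keeps strtok's state */
        if (colon == NULL)
            continue;                     /* not a name:value pair */
        *colon = '\0';                    /* tok is now just the quoted field name */
        double value = strtod(colon + 1, NULL);
        if (strcmp(tok, "\"fiftyTwoWeekHigh\"") == 0) {
            q->high = value;
            found++;
        } else if (strcmp(tok, "\"fiftyTwoWeekLow\"") == 0) {
            q->low = value;
            found++;
        } else if (strcmp(tok, "\"regularMarketPrice\"") == 0) {
            q->price = value;
            found++;
        }
    }
    return found;
}
```

Because every token is examined in turn, the order of the fields in the line no longer matters, and a return value below 3 signals a line that should be skipped rather than printed.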
Did you mean '\0' ?
if (line[len-1] == '\n') {
line[len-1] = 0;
}
I advise you to use gdb to see where the segfault occurs and why.
I don't think you have to allocate much more memory. The segfault may happen because there is no more data to find and you still print the result.
Use if(price2_token!=NULL) printf("%s\n", price2_token); for example.
It seems like such a silly thing to ask, but I seriously don't know why this is happening. It could be that it's almost 5 am and I'm still doing this, but...
It should print -CA, but when I compile and run it, it prints
-
CA?
instead of -CA, there isn't a '\n' anywhere in sight.
Can you guys think of anything logical that would explain it?
int main(int argc, char* argv[]){
int check = 0;
char *thing = (char*)malloc(2 * sizeof(char));
strcpy(thing, "CA");
some code..
do{
more code...
if(condition== 1) {
more code....
if(check == 0) {
printf("-");
check++;
}
if (some conditon != NULL){
printf("%s\n",thing);
}while(condition)
return 0;
}
You didn't allocate enough space for your string. Every string has a null terminator, so a 2-character string needs 3 bytes in the array. Your strcpy() is writing outside the bounds of the thing array when it copies the null byte, which results in undefined behavior.
Use
char *thing = malloc(3);
You can also use strdup(), which makes a copy of a string in dynamic memory, automatically allocating enough space based on the length of the original string.
char *thing = strdup("CA");
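For reference, here is roughly what such a copy involves when written by hand. It is a sketch of the idea rather than the actual strdup implementation, and copy_string is a hypothetical name; the key detail is the + 1 for the terminating null byte.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap-allocated copy of src,
   or NULL if the allocation fails. The caller must free() it. */
char *copy_string(const char *src)
{
    size_t len = strlen(src);        /* length WITHOUT the null terminator */
    char *dst = malloc(len + 1);     /* + 1 leaves room for the terminator */
    if (dst != NULL)
        memcpy(dst, src, len + 1);   /* copies the characters and the '\0' */
    return dst;
}
```

For "CA" this allocates 3 bytes, which matches the malloc(3) suggested above.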
printf prints a null-terminated string to stdout. If the string is not null-terminated, printf will go on printing garbage until it meets a null terminator,
so you should add one to your character array:
char thing[3] = {'C','A',0};
now printf will print
-
CA
I am in the process of teaching myself C. I have the following code that prints a string char by char forwards and backwards:
#include<stdio.h>
#include<string.h>
main(){
char *str;
fgets(str, 100, stdin);
//printf("%i", strlen(str));
int i;
for(i = 0; i < strlen(str) - 1; i++){
printf("%c", str[i]);
}
for(i = strlen(str); i > -1; i--){
printf("%c", str[i]);
}
}
When run, it gives me the following output (assuming I typed "hello"):
cello
ollec
In addition, if I uncomment the 7th line of code, I get the following output (assuming I typed "hello"):
6 ♠
For the life of me, I cannot figure out what I am doing that is causing the first character in the output to change. In the second example, I know that the string length would be 6 because 'h' + 'e' + 'l' + 'l' + 'o' + '\0' = 6. That is fine, but where is the spade symbol coming from? Why is it only printing one of them?
It is pretty obvious to me that I have some kind of fundamental misunderstanding of what is happening under the hood here and I cant find any examples of this elsewhere. Can anyone explain what is going wrong here?
You never allocate memory for the string. Instead of
char *str;
use
char str[100];
so that you have enough space for the up to 100 characters you read in there with the fgets call.
In this code:
char *str;
fgets(str, 100, stdin);
str points to an effectively random location. Then you tell fgets to read characters and put them where str is pointing. This causes undefined behaviour; the symptoms you are seeing probably occur because str happened to point to memory whose first byte was being used for another purpose, while the following bytes weren't.
Instead you need to allocate memory:
char str[100]; // allocate 100 bytes
fgets(str, 100, stdin);
Pointers only point at memory which already is allocated somewhere; they do not "own" or "contain" any memory.
You should preallocate space for your string, otherwise you are writing to who knows where, which is bad.
char str[100]; //I must be big enough to hold anything fgets might pass me
You should also be sure to only access parts of the string which contain characters:
for(i = strlen(str)-1; i > -1; i--){
printf("%c", str[i]);
}
Note that the character at strlen(str) is \0, the string-terminating null character. So you can access this space, but trying to print it or otherwise treating it like a standard letter is going to lead to issues at some point.
Your str is a pointer to char, but you don't have any actual character buffer for it to point to. You need a character array instead:
char str[100];
Only then can fgets have somewhere to store the data it reads.
Then on your reverse-printing loop, your indices are wrong:
for(i = strlen(str); i > -1; i--){
With the above, you try to print str[i] for i = strlen(str), but that's one past the end of the valid string data. Change to:
for(i = strlen(str) - 1; i > -1; i--){
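Putting both fixes together (a real buffer and correct indices) might look like the sketch below. The function names are hypothetical. Note that strlen returns size_t, an unsigned type, so the sketch counts upwards and mirrors the index instead of testing an unsigned counter against -1.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write src reversed into dst, whose capacity cap
   includes the NUL. If src is too long, only its first cap - 1
   characters are reversed. */
void reverse_into(const char *src, char *dst, size_t cap)
{
    if (cap == 0)
        return;
    size_t len = strlen(src);
    if (len > cap - 1)
        len = cap - 1;
    for (size_t i = 0; i < len; i++)
        dst[i] = src[len - 1 - i];   /* last valid index is len - 1, not len */
    dst[len] = '\0';
}

/* Hypothetical helper: read one line from in, drop the newline, and
   print it forwards and backwards. Returns -1 if nothing was read. */
int echo_both_ways(FILE *in)
{
    char str[100];                   /* real storage for fgets to fill */
    char rev[100];
    if (fgets(str, sizeof str, in) == NULL)
        return -1;
    str[strcspn(str, "\n")] = '\0';  /* strip the newline fgets keeps */
    reverse_into(str, rev, sizeof rev);
    printf("%s\n%s\n", str, rev);
    return 0;
}
```

Calling echo_both_ways(stdin) and typing hello would then print hello followed by olleh.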
The issue is that you are not allocating your
char *str
what you need to do is either
1)
char *str = malloc(sizeof(char) * 100);
and then when you are no longer using it:
free(str)
2)
char str[100];
I seem to have some trouble getting my string to terminate with a \0. I'm not sure if this the problem, so I decided to make a post.
First of all, I declared my strings as:
char *input2[5];
Later in the program, I added this line of code to convert all remaining unused slots to become \0, changing them all to become null terminators. Could've done with a for loop, but yea.
while (c != 4) {
input2[c] = '\0';
c++;
}
In Eclipse when in debug mode, I see that the empty slots now contain 0x0, not \0. Are these the same things? The other string where I declared it as
char input[15] = "";
shows \000 when in debug mode though.
My problem is that I am getting segmentation faults (on a Debian VM; it works on my Ubuntu 12.04 machine, though). My GUESS is that because the string hasn't really been terminated, the compiler doesn't know when it stops and thus continues to try to access memory in the array when it is clearly already out of bounds.
Edit: I will try to answer all other questions soon, but when I change my string declaration to the other suggested one, my program crashes. There is a strtok() function, used to chop my fgets input into strings and then putting them into my input2 array.
So,
input1[0] = 'l'
input1[1] = 's'
input1[2] = '\n'
input2[0] = "ls".
This is a shell simulating program with fork and execvp. I will post more code soon.
Regarding the suggestion:
char *input2[5]; This is a perfectly legal declaration, but it
defined input2 as an array of pointers. To contain a string, it needs
to be an array of char.
I will try that change again. I did try that earlier, but I remember it giving me another run-time error (seg fault?). I think it is because of the way I implemented my strtok() function though. I will check it out again. Thanks!
EDIT 2: I added a response below to update my progress so far. Thanks for all the help!
Your code should rather look like this:
char input2[5];
for (int c=0; c < 4; c++) {
input2[c] = '\0';
}
0x0 and \0 are different representations of the same value, 0.
Response 1:
Thanks for all the answers!
I made some changes from the responses, but I reverted the char suggestion (or correct string declaration) because like someone pointed out, I have a strtok function. Strtok requires me to send in a char *, so I reverted back to what I originally had (char * input[5]). I posted my code up to strtok below. My problem is that the program works fine in my Ubuntu 12.04, but gives me a segfault error when I try to run it on the Debian VM.
I am pretty confused as I originally thought the error was because the compiler was trying to access an array index that is already out of bound. That doesn't seem like the problem because a lot of people mentioned that 0x0 is just another way of writing \000. I have posted my debug window's variable section below. Everything seems right though as far as I can see.. hmm..
input2[0], input[0], and input[1] are the focus points.
Here is my code up to the strtok function. The rest is just fork and then execvp call:
int flag = 0;
int i = 0;
int status;
char *s; //for strchr, strtok
char input[15] = "";
char *input2[5];
//char input2[5];
//Prompt
printf("Please enter prompt:\n");
//Reads in input
fgets(input, 100, stdin);
//Remove \n
int len = strlen(input);
if (len > 0 && input[len-1] == '\n')
input[len-1] = ' ';
//At end of string (numb of args), add \0
//Check for & via strchr
s = strchr (input, '&');
if (s != NULL) { //If there is a &
printf("'&' detected. Program not waiting.\n");
//printf ("'&' Found at %s\n", s);
flag = 1;
}
//Now for strtok
input2[i] = strtok(input, " "); //strtok: returns a pointer to the last token found in string, so must declare
//input2 as char * I believe
while(input2[i] != NULL)
{
input2[++i] = strtok( NULL, " ");
}
if (flag == 1) {
i = i - 1; //Removes & from total number of arguments
}
//Sets null terminator for unused slots. (Is this step necessary? Does the C compiler know when to stop?)
int c = i;
while (c < 5) {
input2[c] = '\0';
c++;
}
Q: Why didn't you declare your string char input[5];? Do you really need the extra level of indirection?
Q: while (c < 4) is safer. And be sure to initialize "c"!
And yes, "0x0" in the debugger and '\0' in your source code are "the same thing".
SUGGESTED CHANGE:
char input2[5];
...
c = 0;
while (c < 4) {
input2[c] = '\0';
c++;
}
This will almost certainly fix your segmentation violation.
char *input2[5];
This is a perfectly legal declaration, but it defines input2 as an array of pointers. To contain a string, it needs to be an array of char.
while (c != 4) {
input2[c] = '\0';
c++;
}
Again, this is legal, but since input2 is an array of pointers, input2[c] is a pointer (of type char*). The rules for null pointer constants are such that '\0' is a valid null pointer constant. The assignment is equivalent to:
input2[c] = NULL;
I don't know what you're trying to do with input2. If you pass it to a function expecting a char* that points to a string, your code won't compile -- or at least you'll get a warning.
But if you want input2 to hold a string, it needs to be defined as:
char input2[5];
It's just unfortunate that the error you made happens to be one that a C compiler doesn't necessarily diagnose. (There are too many different flavors of "zero" in C, and they're often quietly interchangeable.)
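For the shell use case described in the question (tokens from strtok collected for execvp), an array of char* is the usual shape, but the loop that fills it needs a bound. In the posted code, fgets(input, 100, stdin) may write up to 100 bytes into a 15-byte array, and the strtok loop stores into input2 without checking i against 5; either can write out of bounds, though whether that explains the Debian crash is an assumption. A minimal sketch, with split_args as a hypothetical name:

```c
#include <string.h>

/* Hypothetical helper: split line into at most cap - 1 arguments.
   argv[count] is always set to NULL, which is the terminator execvp
   expects. Modifies line; extra tokens beyond the limit are dropped.
   Returns the number of arguments stored. */
size_t split_args(char *line, char *argv[], size_t cap)
{
    size_t n = 0;
    if (cap == 0)
        return 0;
    for (char *tok = strtok(line, " \t\n");
         tok != NULL && n + 1 < cap;      /* keep one slot for the NULL */
         tok = strtok(NULL, " \t\n"))
        argv[n++] = tok;
    argv[n] = NULL;                       /* a null pointer, not a '\0' character */
    return n;
}
```

The input buffer would be read with fgets(input, sizeof input, stdin) so that the size passed to fgets always matches the array. Only the single slot after the last argument needs to be NULL; the remaining slots can stay untouched.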
I'm trying to separate the following string into three separate variables, i.e., a, b and c.:
" mov/1/1/1,0 STR{7}, r7"
each need to hold a different segment of the string, e.g:
a = "mov/1/1/1,0"
b = "STR{7}"
c = "r7"
There may be a space or a tab between the parts; this is what makes this part of the code trickier.
I tried to use strtok, for the string manipulation, but it didn't work out.
char command[50] = " mov/1/1/1,0 STR{7}, r7";
char a[10], b[10], c[10];
char * ptr = strtok(command, "\t");
strcpy(a, ptr);
ptr = strtok(NULL, "\t");
strcpy(b, ptr);
ptr = strtok(NULL, ", ");
strcpy(c, ptr);
but this gets things really messy as the variables a, b and c get to hold more values than they should, which leads the program to crash.
Input may vary from:
" mov/1/1/1,0 STR{7}, r7"
"jsr /0,0 PRTSTR"
"mov/1/1/0,0 STRADD{5}, LASTCHAR {r3} "
in which the values of a,b and c change to different part of the given string.
I was told it is safer to use sscanf than strtok for that kind of matter, but I'm not sure why and how it could assist me.
I would be more than glad to hear your opinion!
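This should do the trick, assuming a, b and c are each enlarged to at least 20 bytes (the 10-byte arrays in the question are too small for "mov/1/1/1,0"):
sscanf(command, "%19s %19s %19s", a, b, c);
A trailing comma, as in "STR{7},", stays attached to its field and has to be removed afterwards. From the scanf man page, %s skips leading whitespace, whether spaces or tabs: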
s : Matches a sequence of non-white-space characters; the next pointer
must be a pointer to character array that is long enough to hold the
input sequence and the terminating null byte ('\0'), which is added
automatically. The input string stops at white space or at the
maximum field width, whichever occurs first.
As you may know, sscanf() is used the same way as scanf(); the difference is that sscanf scans from a string, while scanf scans from standard input.
In this problem you can specify scanf, with a set of characters to "always skip", as done in this link.
Since you have different set of constraints for scanning all the three strings, you can specify, using %*[^...], these constraints, before every %s inside sscanf().
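As a concrete illustration of the sscanf route, here is a sketch that uses plain %s with field widths rather than scansets. FIELD_SIZE, split3 and strip_trailing_commas are hypothetical names, and the 20-byte field size is an assumption. Two properties make this approach less fragile than the strtok attempt in the question: sscanf does not modify the input string, and the width in %19s stops it from writing past the end of each buffer.

```c
#include <stdio.h>
#include <string.h>

#define FIELD_SIZE 20   /* assumed size; the %19s widths below are FIELD_SIZE - 1 */

/* Remove any commas at the end of s. */
void strip_trailing_commas(char *s)
{
    size_t len = strlen(s);
    while (len > 0 && s[len - 1] == ',')
        s[--len] = '\0';
}

/* Split command into three whitespace-separated fields.
   Returns sscanf's result, which is 3 when all fields were read. */
int split3(const char *command,
           char a[FIELD_SIZE], char b[FIELD_SIZE], char c[FIELD_SIZE])
{
    /* %s skips leading spaces and tabs, then reads up to 19 non-blank chars */
    int n = sscanf(command, "%19s %19s %19s", a, b, c);
    if (n == 3) {
        strip_trailing_commas(a);
        strip_trailing_commas(b);
        strip_trailing_commas(c);
    }
    return n;
}
```

Checking that the return value is 3 before using a, b and c covers lines with fewer than three fields, which the unchecked strcpy calls in the question would not survive.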
I have reservations about using strtok(), but this code using it seems to do what you need. As I noted in a comment, the sample string "jsr /0,0 PRTSTR" throws a spanner in the works; it has a significant comma in the second field, whereas in the other two example strings, the comma in the second field is not significant. If you need to remove trailing commas, you can do that after the space-based splitting, as shown in this code. The second loop tests the zap_trailing_commas() function to ensure that it behaves under degenerate cases, zapping trailing commas but not underflowing the start of the buffer or anything horrid.
#include <stdio.h>
#include <string.h>
static void zap_trailing_commas(char *str)
{
size_t len = strlen(str);
while (len-- > 0 && str[len] == ',')
str[len] = '\0';
}
static void splitter(char *command)
{
char a[20], b[20], c[20];
char *ptr = strtok(command, " \t");
strcpy(a, ptr);
zap_trailing_commas(a);
ptr = strtok(NULL, " \t");
strcpy(b, ptr);
zap_trailing_commas(b);
ptr = strtok(NULL, " \t");
strcpy(c, ptr);
zap_trailing_commas(c);
printf("<<%s>> <<%s>> <<%s>>\n", a, b, c);
}
int main(void)
{
char data[][50] =
{
" mov/1/1/1,0 STR{7}, r7",
"jsr /0,0 PRTSTR",
"mov/1/1/0,0 STRADD{5}, LASTCHAR {r3} ",
};
for (size_t i = 0; i < sizeof(data)/sizeof(data[0]); i++)
splitter(data[i]);
char commas[][10] = { "X,,,", "X,,", "X,", "X" };
for (size_t i = 0; i < sizeof(commas)/sizeof(commas[0]); i++)
{
printf("<<%s>> ", commas[i]);
zap_trailing_commas(&commas[i][1]);
printf("<<%s>>\n", commas[i]);
}
return 0;
}
Sample output:
<<mov/1/1/1,0>> <<STR{7}>> <<r7>>
<<jsr>> <</0,0>> <<PRTSTR>>
<<mov/1/1/0,0>> <<STRADD{5}>> <<LASTCHAR>>
<<X,,,>> <<X>>
<<X,,>> <<X>>
<<X,>> <<X>>
<<X>> <<X>>
I also tested a variant with commas in place of the X's and that left the single comma alone.