How can I parse a string using C?

Given below is my string
char test[1000] =
    "$GPGSA,A,3,14,20,22,25,31,32,,,,,,,2.4,1.4,1.9*3A\n"
    "$GPGSV,4,1,16,31,76,060,35,14,28,070,34,20,32,309,32,32,61,309,32*72\n"
    "$GPGSV,4,2,16,25,21,053,29,24,37,258,29,23,14,277,27,12,,,21*44\n"
    "$GPGSV,4,3,16,22,13,133,20,11,20,272,,16,11,161,,30,,,*4F\n"
    "$GPGSV,4,4,16,29,,,,28,,,,27,,,,26,,,*7E\n"
    "$GPGGA,150427.8,4001.022852,N,10505.269674,W,1,06,1.4,1559.6,M,-21.0,M,,*53\n"
    "$PQXFI,150427.8,4001.022852,N,10505.269674,W,1559.6,35.12,25.46,2.05*4A\n"
    "$GPVTG,nan,T,nan,M,0.0,N,0.0,K,A*23\n"
    "$GPRMC,150427.8,A,4001.022852,N,10505.269674,W,0.0,,280611,,,A*50";
I want to get the string
"$GPGGA,150427.8,4001.022852,N,10505.269674,W,1,06,1.4,1559.6,M,-21.0,M,,*53\n"
from the big string above, using C.
Please help me out.

You say which line you want, but you didn't say why. If you say what it is about this line that makes it the line you are after, then I could comment on how you'd find it.
But basically, you'll probably want to separate the string into lines. You can use strtok() to break on \n. You can then examine the lines, one at a time.
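As a rough sketch of that idea (the helper name find_line_with_prefix and the choice of matching on a line prefix are my assumptions, not something the question specifies), you could split on \n with strtok() and test each line in turn:

```c
#include <string.h>

/* Hypothetical helper: returns the first line of `text` that starts with
 * `prefix`, or NULL if no line does. strtok() overwrites each '\n' with
 * '\0', so `text` must be a writable buffer and is modified in place. */
char *find_line_with_prefix(char *text, const char *prefix)
{
    size_t prefix_len = strlen(prefix);

    for (char *line = strtok(text, "\n"); line != NULL;
         line = strtok(NULL, "\n")) {
        if (strncmp(line, prefix, prefix_len) == 0)
            return line;
    }
    return NULL;
}
```

Calling find_line_with_prefix(test, "$GPGGA,") on a writable buffer would return a pointer to that sentence. Note that the returned line no longer carries its trailing \n, because strtok() replaced it.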

This looks like GPS data to me and is used (and parsed) in many applications.
http://mbed.org/users/todotani/notebook/gps-nmea-parser/
http://www.edaboard.com/thread204021.html
You might be able to save yourself some time by re-using some other open source parsers.

The strstr() function, part of the standard C library (declared in <string.h>), finds the first occurrence of a substring within a string.
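A minimal sketch of how that might look, assuming you want a copy of the sentence including its trailing \n (copy_sentence is a made-up name, and the output buffer handling is just one reasonable choice):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the first occurrence of `tag` in `text` and
 * copies everything from there up to and including the next '\n' (or up to
 * the end of the string) into `out`. Returns the number of characters
 * copied, or 0 if the tag is not found or the result does not fit. */
size_t copy_sentence(const char *text, const char *tag,
                     char *out, size_t out_size)
{
    const char *start = strstr(text, tag);
    if (start == NULL)
        return 0;

    const char *end = strchr(start, '\n');
    size_t len = (end != NULL) ? (size_t)(end - start) + 1 : strlen(start);
    if (len >= out_size)
        return 0;                /* no room for the text plus '\0' */

    memcpy(out, start, len);
    out[len] = '\0';
    return len;
}
```

Unlike the strtok() approach, this leaves the original string untouched. One caveat: strstr() matches anywhere, so the tag should be specific enough (for example "$GPGGA,") that it cannot appear in the middle of another line.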

Related

Check if a string has only whitespace characters in C

I am implementing a shell in C11, and I want to check if the input has the correct syntax before doing a system call to execute the command. One of the possible inputs that I want to guard against is a string made up of only white-space characters. What is an efficient way to check if a string contains only white spaces, tabs or any other white-space characters?
The solution must be in C11, and preferably use only standard libraries. The string is read from the command line using readline() from readline.h, and it is saved in a char array (char[]). So far, the only solution that I've thought of is to loop over the array, and check each individual char with isspace(). Is there a more efficient way?
"So far, the only solution that I've thought of is to loop over the array, and check each individual char with isspace()."
That sounds about right!
"Is there a more efficient way?"
Not really. You need to check each character if you want to be sure only space is present. There could be some trick involving bitmasks to detect non-space characters in a faster way (like strlen() does to find a NUL terminator), but I would definitely not advise it.
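For reference, the plain loop could look something like this (is_blank is just an illustrative name; treating the empty string as blank is my assumption about what a shell would want):

```c
#include <ctype.h>
#include <stdbool.h>

/* Returns true if `s` is empty or consists only of whitespace characters
 * as classified by isspace(). The cast to unsigned char avoids undefined
 * behaviour when plain char is signed and holds a negative value. */
bool is_blank(const char *s)
{
    for (; *s != '\0'; s++) {
        if (!isspace((unsigned char)*s))
            return false;
    }
    return true;
}
```

The loop stops at the first non-space character, so for typical command lines it usually inspects only a character or two.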
You could use strspn() or strcspn() and check the returned value, but that would most likely be slower: those functions are meant to work on arbitrary accept/reject strings and typically need to build lookup tables first, while isspace() is optimized for its purpose using a pre-built lookup table, and will most probably also get inlined by the compiler with proper optimization flags. Other than this, vectorization of the code seems like the only way to speed things up further. Compile with -O3 -march=native -ftree-vectorize (see also this post) and run some benchmarks.
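If you want to try the strspn() variant anyway, for benchmarking or just for brevity, a sketch might be the following. The function name is made up, and the whitespace set is the one isspace() accepts in the default "C" locale, which is an assumption about your locale:

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if `s` contains nothing but whitespace. strspn() yields the
 * length of the leading run of characters taken from `whitespace`; if that
 * run reaches the terminator, the whole string was whitespace. */
bool is_blank_strspn(const char *s)
{
    static const char whitespace[] = " \t\n\v\f\r";
    return s[strspn(s, whitespace)] == '\0';
}
```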
"loop over the array, and check each individual char with isspace()" --> Yes go with that.
The time to do that is trivial compared to readline().
I'm going to provide an alternative solution to your problem: use strtok. It splits a string into substrings based on a specific set of ignored delimiters. With an empty string, you'd just get no tokens at all.
If you need more complicated matching than that for your shell (e.g., to handle quoted arguments), you're best off writing a small tokenizer/lexer. The strtok method is basically to just look for any of the delimiters you've specified, temporarily replace them with \0, returning the substring up to that point, putting the old character back, and repeating until it reaches the end of the string.
Edit:
As the busybee points out in the comment below, strtok does not put back the character that it replaces with \0. The above paragraph was worded poorly, but my intent was to explain how to implement your own simple tokenizer/lexer if you needed to, not to explain exactly how strtok works down to the smallest detail.
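A small sketch of the strtok idea for a shell, under the assumption that arguments are separated by plain whitespace (split_args and its parameters are illustrative names only):

```c
#include <stddef.h>
#include <string.h>

/* Splits `line` in place into whitespace-separated tokens and stores up to
 * `max_args` pointers to them in `argv`. Returns the number of tokens
 * stored; 0 means the line was empty or contained only whitespace.
 * strtok() writes '\0' over delimiters, so `line` must be writable. */
size_t split_args(char *line, char *argv[], size_t max_args)
{
    static const char delims[] = " \t\n\v\f\r";
    size_t argc = 0;

    for (char *tok = strtok(line, delims);
         tok != NULL && argc < max_args;
         tok = strtok(NULL, delims))
        argv[argc++] = tok;

    return argc;
}
```

A return value of 0 is the "only whitespace" case from the question, so the same call both validates the input and produces the argument list.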

Reading text from a file in c

So I am trying to write a program that reads random lines of text from an input file. I can open the file but I don't know how to read characters yet, let alone character strings (only numbers so far). I am trying to make it so it can read in random lines of text and then I can manipulate them (e.g., print them in any order).
And is it possible for the program to recognize spaces (or even better periods) in between words in the input file? For example could I make it stop reading after the end of a sentence?
I am not so much looking for someone to write the code for me or anything. I am using this project as kind of a learning exercise, so if anyone could tell me what topics in C to study to make this possible, that would be great!
Thanks!
Reading your file with fscanf() is a good start. Have a look at the man page which will tell you how to read this either one variable at a time or many variables at a time. If your file is more free form than that, you may want to read the whole thing into memory and then process into tokens (perhaps using strtok or strtok_r), or use a combination of fscanf then process strings you have read with strtok or strtok_r. That should give you a start.
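To illustrate the fscanf() route for the "stop at the end of a sentence" part, here is a hedged sketch. The name read_sentence and the 256-byte limit are my own choices, and it assumes sentences end with a period:

```c
#include <stdio.h>

#define SENTENCE_MAX 256

/* Reads the next sentence from `fp` into `buf`: leading whitespace is
 * skipped, then up to 255 characters that are not '.' are stored.
 * Returns 1 if a sentence was read, 0 at end of file or if the next
 * character is a period (an empty sentence). */
int read_sentence(FILE *fp, char buf[SENTENCE_MAX])
{
    if (fscanf(fp, " %255[^.]", buf) != 1)
        return 0;

    /* Consume the period that ended the sentence, if there is one. */
    int c = fgetc(fp);
    if (c != '.' && c != EOF)
        ungetc(c, fp);
    return 1;
}
```

Calling it in a loop until it returns 0 gives you one sentence per call. A sentence that spans several lines keeps its embedded newlines, and one longer than 255 characters is returned in pieces.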

Standard (or convenient) method to read and write tabular data to a text file in c

This might sound rather awkward, but I want to ask if there is a commonly practiced way of storing tabular data in a text file to be read and written in C.
In Python, for example, you can load a full text file into a list with f.readlines(), then go through all the lines and split each one on a specific character or sequence of characters (a delimiter).
How do you approach this problem in C?
Pretty much the same way you would in any other language. Pick a field separator (e.g., a tab character), open the text file for reading and parse each line.
Of course, in C it will never be as easy as it is in Python, but approaches are similar.
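Roughly, the C version of "read a line, split on the delimiter" could look like this. It is only a sketch: read_row, MAX_LINE and MAX_COLS are invented for the example, and it assumes tab-separated fields with lines shorter than the buffer:

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 256
#define MAX_COLS 8

/* Reads the next line of `fp` into `line` and splits it on tabs.
 * Each cols[i] points into `line`, so the fields stay valid only until
 * the buffer is reused. Returns the number of columns found, or -1 at
 * end of file. */
int read_row(FILE *fp, char line[MAX_LINE], char *cols[MAX_COLS])
{
    if (fgets(line, MAX_LINE, fp) == NULL)
        return -1;

    int n = 0;
    for (char *tok = strtok(line, "\t\n");
         tok != NULL && n < MAX_COLS;
         tok = strtok(NULL, "\t\n"))
        cols[n++] = tok;
    return n;
}
```

Keep in mind that strtok() treats a run of consecutive delimiters as one, so empty cells silently disappear. If your table can have empty cells, you need a different splitter.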
Whoa. I am a bit baffled by the other answers which make me feel like I'm on Mainframes.stackexchange.com instead of stackoverflow.com
Why don't you pick a modern data format like JSON or XML and follow best practices for the data format of your choice?
If you want a good JSON reader/writer for C, I've used Jansson, and it's very easy and fast.
If you want a good XML reader/writer for C, I've used miniXML and it's also easy and fast. Also has SAX and DOM support depending on how you want to read in the XML.
Obviously there are a wealth of other libraries available as well.
Please don't give the next guy to come along and support your program some wacky custom file format to deal with.
I find getline() and strtok() to be quite convenient (getline() was a GNU extension, standardized in POSIX.1-2008).
There's a handful of mechanisms, but there's a reason why scripting languages have become so popular over the last twenty years -- some of the tasks that seem simple in scripting languages are ponderous in C.
You could use flex and bison to write a parser for your tables. This really only works if the format is very well defined and "static". They're amazing tools that can do more than you might suspect, but it is very heavy machinery for what could be done simply with a split() in a scripting language.
You could read individual fields using getdelim(3). However, this was only standardized with POSIX.1-2008, so it is far from ubiquitous. (Every Linux machine with glibc should have it.)
You could read lines with fgets(3) and discover the split locations using strchr(3).
You could read lines with fgets(3) and use strtok(3) to tokenize strings.
You can use scanf(3) to perform input and scanning in one go; it seems from the questions here that scanf(3) is difficult to use correctly.
You could use a character-at-a-time parsing approach: read a character using getc(3), inspect it, do something with it, and iterate until there are no more characters.
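As an illustration of the strchr(3) option from the list above, here is one way the splitting might be written once a line has been read with fgets(3). The function name is hypothetical, and the delimiter is assumed to be a single non-NUL character:

```c
#include <stddef.h>
#include <string.h>

/* Splits `line` in place on every occurrence of `delim`, keeping empty
 * fields. Stores up to `max_fields` pointers in `fields` and returns how
 * many were stored; anything beyond `max_fields` fields is dropped. */
size_t split_keep_empty(char *line, char delim,
                        char *fields[], size_t max_fields)
{
    size_t n = 0;
    char *start = line;

    while (n < max_fields) {
        fields[n++] = start;
        char *sep = strchr(start, delim);
        if (sep == NULL)
            break;               /* last field on the line */
        *sep = '\0';             /* terminate the current field */
        start = sep + 1;
    }
    return n;
}
```

The practical difference from strtok(3) is that two adjacent delimiters produce an empty field instead of being merged, which matters for tables where a cell may be blank.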

How can I parse text input and convert strings to integers?

I have an input file containing the following data:
1 1Apple 2Orange 10Kiwi
2 30Apple 4Orange 1Kiwi
and so on. I have to read this data from the file and work on it, but I don't know how to retrieve the data. I want to store the 1 (of 1Apple) as an integer and then Apple as a string.
I thought of reading the whole 1Apple as a string and then doing something with the stoi function.
Or I could read the whole thing character by character, and if the ASCII value of a character lies between 48 and 57, I would accumulate it into an integer and save the rest as a string. Which one should I do? Also, how do I check the ASCII value of a char? (Should I convert the char to int and then compare, or is there a built-in function?)
You could use fscanf(), but only if your input format is never going to change. Otherwise, you should probably use fgets() and then separate the number from the string yourself, with checks like the ones you suggested.
There is one easy right way to do this with standard C library facilities, one rather more difficult right way, and a whole lot of wrong ways. This is the easy right way:
Read an entire line into a char[] buffer using fgets.
Extract numbers from this line using strtol or strtoul.
It is very important to understand why the easier-looking alternatives (*scanf and atoi) should never be used. You might write less code initially, but once you start thinking about how to handle even slightly malformed input, you will discover that you should have used strtol.
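Here is a sketch of the easy right way applied to the question's data, for a line that has already been read with fgets. Everything named here (struct item, parse_item, parse_row, NAME_MAX_LEN) is invented for the example, and it assumes the first number on each line is a row id followed by count-plus-name pairs:

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 32

struct item {
    long count;
    char name[NAME_MAX_LEN];
};

/* Parses one "<count><name>" token such as "10Kiwi" starting at *pp
 * (strtol skips leading whitespace). On success stores the result in
 * `out`, advances *pp past the token and returns 1; otherwise returns 0. */
int parse_item(const char **pp, struct item *out)
{
    char *end;
    errno = 0;
    long value = strtol(*pp, &end, 10);
    if (end == *pp || errno == ERANGE)
        return 0;                /* no digits, or number out of range */

    size_t len = 0;
    while (end[len] != '\0' && !isspace((unsigned char)end[len]))
        len++;
    if (len == 0 || len >= NAME_MAX_LEN)
        return 0;                /* missing or over-long name */

    memcpy(out->name, end, len);
    out->name[len] = '\0';
    out->count = value;
    *pp = end + len;
    return 1;
}

/* Parses "<id> <count><name> ..." from one line. Returns the number of
 * items stored in `items`, or -1 if the line does not start with an id. */
int parse_row(const char *line, long *id, struct item items[], int max_items)
{
    char *end;
    errno = 0;
    *id = strtol(line, &end, 10);
    if (end == line || errno == ERANGE)
        return -1;

    const char *p = end;
    int n = 0;
    while (n < max_items && parse_item(&p, &items[n]))
        n++;
    return n;
}
```

The end pointer that strtol hands back is what makes this work: it tells you exactly where the number stopped, so the name starts right there, and a comparison of end against the start pointer tells you when no digits were found at all.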
The "rather more difficult right way" is to use lex and yacc. They are much more complicated but also much more powerful. You shouldn't need them for this problem.

Parsing a document, C

I need to parse a document in C. I was about to use the strtok function, but I don't know if it's the best method or if just a token system is enough (searching for \n, space, etc.).
The structure of each line of the document is: element \n element "x".
thanks :-)
A token system is fine; strtok is just an implementation of that. However, you're better off using strtok_r, which does not keep any internal state outside the control of your program.
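To show where that matters, here is a sketch with two nested tokenizing loops, lines on the outside and words on the inside, which plain strtok cannot do because both loops would share its single hidden state. The function name count_words is illustrative, and strtok_r is POSIX rather than ISO C, hence the feature-test macro:

```c
#define _POSIX_C_SOURCE 200809L  /* asks the headers to declare strtok_r */
#include <stddef.h>
#include <string.h>

/* Counts the words in a multi-line document. Each strtok_r loop keeps its
 * position in its own state variable, so the inner loop does not disturb
 * the outer one. `text` is modified in place. */
size_t count_words(char *text)
{
    size_t words = 0;
    char *line_state;

    for (char *line = strtok_r(text, "\n", &line_state);
         line != NULL;
         line = strtok_r(NULL, "\n", &line_state)) {
        char *word_state;
        for (char *word = strtok_r(line, " \t", &word_state);
             word != NULL;
             word = strtok_r(NULL, " \t", &word_state))
            words++;
    }
    return words;
}
```

On platforms without strtok_r (some Windows toolchains, for instance), you would need the platform's equivalent or your own small tokenizer.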
I don't remember the details, but I saw in several sources that strtok was an unsafe piece of work. You'd be better off rolling your own, if you ask me.

Resources