hey guys!
Is there any way of directly accessing a cell in a .csv file using C?
E.g. I want to sum up a column using C; how do I do it?
It's probably easiest to use the scanf family for this, but it depends a little on how your data is organized. If you have three columns of numeric data and you want to sum up the third column, you could loop over a statement like this (file is a FILE* opened with fopen, and you keep looping for as long as fscanf returns 1, i.e. until end of file or a line that doesn't match):
int n; fscanf(file, "%*d,%*d,%d", &n);
and sum up the ns. If you have other kinds of data in your file, you need to specify your format string accordingly. If different lines have different kinds of data, you'll probably need to search the string for separators instead and pick the third interval.
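A minimal sketch of that loop, assuming every line holds exactly three integers separated by commas. The function name sum_third_column is made up for illustration.

```c
#include <stdio.h>

/* Sum the third column of a file whose lines look like "1,2,3".
   Stops at end of file or at the first line that does not match. */
long sum_third_column(FILE *file)
{
    long sum = 0;
    int n;

    /* fscanf returns the number of values it stored; the two %*d
       fields are read but not stored, so a good line yields 1. */
    while (fscanf(file, "%*d,%*d,%d", &n) == 1)
        sum += n;

    return sum;
}
```

Testing the return value of fscanf, rather than calling feof first, also stops the loop cleanly on a malformed line.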
That said, it's probably easier not to use C at all; e.g. Perl or awk will probably do a better job :) but I suppose that's not an option.
If you have to use C: read the entire line into memory, count "," characters until you reach your desired column, read the value and add it to the sum, then go to the next line.
When you reach your value, you can use sscanf to read it.
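Something along these lines might work as a sketch of the read-a-line, count-the-commas approach. It assumes plain fields (no quoted fields with embedded commas) and lines shorter than 1024 bytes; csv_field and sum_column are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

/* Read the number in the given zero-based column of one CSV line.
   Returns 1 on success, 0 if the column is missing or not numeric. */
int csv_field(const char *line, int column, double *out)
{
    const char *p = line;

    while (column-- > 0) {
        p = strchr(p, ',');     /* find the next separator */
        if (p == NULL)
            return 0;           /* the line has too few columns */
        p++;                    /* step past the comma */
    }
    return sscanf(p, "%lf", out) == 1;
}

/* Sum one column over a whole file, skipping lines that do not parse. */
double sum_column(FILE *file, int column)
{
    char line[1024];            /* assumed maximum line length */
    double sum = 0.0, value;

    while (fgets(line, sizeof line, file) != NULL)
        if (csv_field(line, column, &value))
            sum += value;
    return sum;
}
```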
You might want to start by looking at RFC 4180: Common Format and MIME Type for Comma-Separated Values (CSV) Files, and then looking for implementations of the same. (Be aware though, that the notion of comma separated values predates the RFC and there are many implementations that do not comply with that document.)
I find ccsv, and not many others in plain C. There are quite a few C++ implementations, and most of them are probably readily adapted to C.
I am a beginner C programmer and I am currently working on a project to implement the Viola-Jones object detection algorithm in C. I would like to know how I can store the data in a 2-dimensional array to a file that can be easily ported and accessed by different program files (e.g. main.c, header_file.h etc.)
Thank you in advance.
There's not quite enough detail to be sure what you're looking for, but the basic structure of what you want to do is going to look something like this:
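/* rows, cols and yourvalue are whatever your program already has */
FILE *yourfilehandle = fopen("file.csv", "w");   /* open file.csv for writing */
if (yourfilehandle != NULL)
{
    for (int i = 0; i < rows; i++)           /* one dimension of the array */
    {
        for (int j = 0; j < cols; j++)       /* the other dimension */
        {
            fprintf(yourfilehandle, "%d,", yourvalue[i][j]);
        }
        fprintf(yourfilehandle, "\n");
    }
    fclose(yourfilehandle);                  /* close your file */
}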
As has been suggested by others, this will leave you with a .CSV file, which is a pretty good choice, as it's easy to read in and parse, and you can open your file in Notepad or Excel and view it with no problems.
This is assuming you really meant to do this with C file I/O, which is a perfectly valid way of doing things, although some feel it's a bit dated.
Note this leaves an extraneous comma at the end of each line. If that bugs you, it's easy enough to adjust the pre- and post-conditions so you only get commas where you want them. Hint: print the first entry of each row on its own just before the inner for loop, then have the inner loop cover the remaining entries and print the comma before each one. Harder to explain than to do, probably.
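One way to avoid the trailing comma, as a sketch: print the separator before every entry except the first in each row. The function name write_csv and the flat row-major layout are assumptions made for this example.

```c
#include <stdio.h>

/* Write a rows x cols int matrix as CSV with no trailing comma.
   The matrix is passed as a flat row-major array.
   Returns 0 on success, -1 on a write error. */
int write_csv(FILE *fp, const int *values, int rows, int cols)
{
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            /* comma before every entry except the first in the row */
            if (fprintf(fp, "%s%d", j ? "," : "", values[i * cols + j]) < 0)
                return -1;
        }
        if (fputc('\n', fp) == EOF)
            return -1;
    }
    return 0;
}
```

A true 2-D array such as int m[2][3] can be passed as &m[0][0].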
Here is a reference for C-style file I/O, and here is a tutorial.
Without knowing anything about what type of data you're storing, I would say to store this as a matrix. You'll need to choose a delimiter to separate your elements (tab and comma are common choices, giving 'tsv' and 'csv' files respectively; a plain space also works for numeric data) and then something to mark the end of a row (a newline is a good choice here).
So your saved file might look something like:
10 162 1 5
7 1 4 12
9 2 2 0
You can also define your format as having some metadata in the first line -- the number of rows and columns may be useful if you want to pre-allocate memory, along with other information like character encoding. Start simple and add as necessary!
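As a sketch of the metadata idea, suppose the first line of the file holds the row and column counts (e.g. "3 4") and the whitespace-separated entries follow. That layout and the name read_matrix are assumptions for illustration only.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read a matrix whose first line is "<rows> <cols>", followed by
   rows*cols whitespace-separated numbers.
   Returns a malloc'd row-major array (caller frees), or NULL on error. */
double *read_matrix(FILE *fp, int *rows, int *cols)
{
    if (fscanf(fp, "%d %d", rows, cols) != 2 || *rows <= 0 || *cols <= 0)
        return NULL;

    /* the header lets us allocate everything in one go */
    size_t count = (size_t)*rows * (size_t)*cols;
    double *data = malloc(count * sizeof *data);
    if (data == NULL)
        return NULL;

    for (size_t k = 0; k < count; k++) {
        if (fscanf(fp, "%lf", &data[k]) != 1) {
            free(data);         /* file ended early or held a non-number */
            return NULL;
        }
    }
    return data;
}
```

Element (i, j) is then data[i * cols + j].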
I have a large sparse matrix stored in Compressed Row Storage (CRS) format. This is basically three arrays: an array containing the Values, an array for Column Index, and a final array containing the Row Pointers. E.g. http://web.eecs.utk.edu/~dongarra/etemplates/node373.html
I want to write this information into a text (.txt) file, which is intended to be read and put into three arrays using C. I currently plan to do this by writing all the entries in the Value array in one long line separated by commas. E.g. 5.6,10,456,78.2,... etc. Then do the same for the other two arrays.
My C code will then read the first line and put all the values into an array labeled "Value", and so on.
Question
Is this "correct"? Or is there a standard way of putting CRS data into text files?
No standard format that I'm aware of. You decide on a format that makes your life easy.
First, consider that if you want to look at one of these text files, you'll be instantly put off by the long lines. Some text editors might simply hate you. There's nothing wrong with splitting lines up.
Second, consider writing out the number of elements in each array (well, I suppose there's only two different array lengths for the three arrays) at the beginning of the file. This will let you preallocate your arrays. If you have all array lengths at hand, you have the option of doing a single memory allocation.
Finally, consider writing out some sensible tag names: some kind of header that identifies your file as being in the correct format, then something to denote the start of each array. It's kind of a sanity thing that lets your code detect problems with the file. It might just be one character, but it's something.
Now... call me a grungy old programmer, but I'd probably just write the whole lot in binary. Especially if it's floating point data, I wouldn't want to deal with the loss of precision you get when you write out numbers as text (or the space they can consume when you write them with full precision). Binary files are easy to write and quick to read. You just have to be careful if you're going to be using them across platforms with different endian order.
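To make the three suggestions concrete, here is one possible layout, sketched under my own assumptions: a "CRS1" tag plus the two array lengths on the first line, then the sections "VAL", "COL" and "ROW" with one entry per line. None of these names are a standard; they are invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Write the three CRS arrays as tagged text, one entry per line.
   Returns 0 on success, -1 if the stream reported an error. */
int write_crs(FILE *fp, const double *val, const int *col,
              const int *row_ptr, int nnz, int nrows)
{
    int k;

    /* header: format tag, then the two array lengths */
    fprintf(fp, "CRS1 %d %d\n", nnz, nrows + 1);

    fprintf(fp, "VAL\n");
    for (k = 0; k < nnz; k++)
        fprintf(fp, "%.17g\n", val[k]);  /* 17 digits round-trip an IEEE double */

    fprintf(fp, "COL\n");
    for (k = 0; k < nnz; k++)
        fprintf(fp, "%d\n", col[k]);

    fprintf(fp, "ROW\n");
    for (k = 0; k <= nrows; k++)
        fprintf(fp, "%d\n", row_ptr[k]);

    return ferror(fp) ? -1 : 0;
}

/* Check the tag and fetch the two lengths so the caller can allocate.
   Returns 0 on success, -1 if the header is missing or wrong. */
int read_crs_header(FILE *fp, int *nnz, int *nptr)
{
    char tag[8];

    if (fscanf(fp, "%7s %d %d", tag, nnz, nptr) != 3)
        return -1;
    return strcmp(tag, "CRS1") == 0 ? 0 : -1;
}
```

After the header, each section can be read with a loop of fscanf calls of the matching length.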
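A rough sketch of the binary route for one array of doubles, with the element count stored first as a 64-bit integer. The function names are hypothetical, the file must be opened in binary mode ("wb" / "rb"), and the result is only readable on machines with the same endianness and double format.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Write the element count, then the raw doubles. 0 on success, -1 on error. */
int write_doubles(FILE *fp, const double *a, uint64_t n)
{
    if (fwrite(&n, sizeof n, 1, fp) != 1)
        return -1;
    return fwrite(a, sizeof *a, (size_t)n, fp) == (size_t)n ? 0 : -1;
}

/* Read the count, allocate once, then read the doubles.
   Returns a malloc'd array (caller frees) or NULL on error.
   This sketch trusts the count stored in the file. */
double *read_doubles(FILE *fp, uint64_t *n)
{
    if (fread(n, sizeof *n, 1, fp) != 1)
        return NULL;

    /* allocate at least one element so an empty array is not an error */
    double *a = malloc((size_t)(*n ? *n : 1) * sizeof *a);
    if (a == NULL)
        return NULL;

    if (fread(a, sizeof *a, (size_t)*n, fp) != (size_t)*n) {
        free(a);
        return NULL;
    }
    return a;
}
```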
That's my 2 cents worth.. Hope it's useful to you.
If you want to stick to some widely-used standards, have a look at the Matrix Market. This is a repository with many matrices arising in a variety of engineering and science problems. You can find software libraries to save and read the matrices as well.
I am making a text based game and want the user to be able to save. When they save all the variables will be saved to a text file.
I can't figure out how to take them out of the file and assign them to specific variables and pointers.
The file will look somewhat like this:
jesse
hello
yes
rifle
0
1
3
20
Is there any way I can specify what line I want to take out with fscanf? Or do I have to take a different approach?
There is no way to specify which line to read, because a file stream in C does not treat lines specially: a newline is simply another character. To read a specific line, you have to read forward from the start of the file with fgetc (or fgets), counting each '\n' you pass in a variable that holds the current line number, until you reach the line you want.
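A small sketch of that counting idea, using fgets to consume one line at a time instead of single characters. It assumes no line is longer than the buffer (a longer line would be counted more than once); read_line_n is a made-up name.

```c
#include <stdio.h>
#include <string.h>

/* Copy line number `target` (1-based) into buf, without its newline.
   Returns 1 if the line exists, 0 otherwise. */
int read_line_n(FILE *fp, int target, char *buf, size_t size)
{
    rewind(fp);                              /* always count from the start */
    for (int current = 1; fgets(buf, (int)size, fp) != NULL; current++) {
        if (current == target) {
            buf[strcspn(buf, "\n")] = '\0';  /* strip the trailing newline */
            return 1;
        }
    }
    return 0;                                /* file has fewer lines */
}
```

For a save file that is always read in full, reading the lines in order and assigning each to its variable is simpler than looking lines up by number.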
One way around this would be to have information at a fixed offset. For example, say you are storing player information then if you make player information a fixed size X and have the constituent data at fixed indexes into each structure, you can just fseek to the right location straight away.
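The fixed-offset idea could look roughly like this. The struct layout is purely hypothetical, the file has to be opened in binary mode, and the raw bytes are only portable between builds with the same struct layout and endianness.

```c
#include <stdio.h>

struct player {             /* hypothetical fixed-size record */
    char name[32];
    char weapon[16];
    int  level;
    int  health;
};

/* Write record number `index` (0-based). 0 on success, -1 on error. */
int save_player(FILE *fp, long index, const struct player *p)
{
    /* every record has the same size, so its offset is index * size */
    if (fseek(fp, index * (long)sizeof *p, SEEK_SET) != 0)
        return -1;
    return fwrite(p, sizeof *p, 1, fp) == 1 ? 0 : -1;
}

/* Read record number `index` (0-based). 0 on success, -1 on error. */
int load_player(FILE *fp, long index, struct player *p)
{
    if (fseek(fp, index * (long)sizeof *p, SEEK_SET) != 0)
        return -1;
    return fread(p, sizeof *p, 1, fp) == 1 ? 0 : -1;
}
```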
However, if you have structured data, it may be more suitable to use a format which is able to represent these structures inherently such as XML or JSON.
Although I can't tell exactly what you want, I'd make a few suggestions:
Use a SQLite file instead of a text file. This way you can use SQL to get exactly what you want. Shortcut for you: http://www.sqlite.org/
If you still want to use a text file, make it comma-separated rather than space-separated. That is the more common practice.
I have created a simple settings reader for my C program; it might be useful to you as an example of how to parse text files:
https://codereview.stackexchange.com/questions/8620/coding-style-in-c
This might sound rather awkward, but I want to ask if there is a commonly practiced way of storing tabular data in a text file to be read and written in C.
In Python, for example, you can load a full text file into an array with f.readlines, then go through all the lines and split each line on a specific character or sequence of characters (a delimiter).
How do you approach this problem in C?
Pretty much the same way you would in any other language. Pick a field separator (e.g. the tab character), open the text file for reading and parse each line.
Of course, in C it will never be as easy as it is in Python, but approaches are similar.
Whoa. I am a bit baffled by the other answers which make me feel like I'm on Mainframes.stackexchange.com instead of stackoverflow.com
Why don't you pick a modern data format like JSON or XML and follow best practices for the data format of your choice?
If you want a good JSON reader/writer for C, I've used Jansson, and it's very easy and fast.
If you want a good XML reader/writer for C, I've used miniXML and it's also easy and fast. It also has SAX *and* DOM support, depending on how you want to read in the XML.
Obviously there are a wealth of other libraries available as well.
Please don't give the next guy to come along and support your program some wacky custom file format to deal with.
I find getline() and strtok() to be quite convenient (getline was a GNU extension, standardized in POSIX.1-2008).
There's a handful of mechanisms, but there's a reason why scripting languages have become so popular over the last twenty years -- some of the tasks that seem simple in scripting languages are ponderous in C.
You could use flex and bison to write a parser for your tables. This really only works if the format is very well defined and "static". They're amazing tools that can do more than you might suspect, but it is very heavy machinery for what could be done simply with a split() in a scripting language.
You could read individual fields using getdelim(3). However, this was only standardized with POSIX.1-2008, so it is far from ubiquitous. (Every Linux machine with glibc should have it.)
You could read lines with fgets(3) and discover the split locations using strchr(3).
You could read lines with fgets(3) and use strtok(3) to tokenize strings.
You can use scanf(3) to perform input and scanning in one go; it seems from the questions here that scanf(3) is difficult to use correctly.
You could use a character-at-a-time parsing approach: read a character using getc(3), inspect it, do something with it, and repeat until there are no more characters.
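As an illustration of the strtok(3) option from the list above, here is a minimal sketch that splits one line (for instance one read with fgets) into fields in place. split_fields is a hypothetical helper, not a library function.

```c
#include <string.h>

/* Split `line` in place on any character in `delims`, storing up to
   `max` field pointers in `fields`. Returns the number of fields.
   strtok treats a run of delimiters as one, so empty fields vanish. */
int split_fields(char *line, const char *delims, char **fields, int max)
{
    int n = 0;

    for (char *tok = strtok(line, delims);
         tok != NULL && n < max;
         tok = strtok(NULL, delims))
        fields[n++] = tok;      /* points into the caller's buffer */

    return n;
}
```

Because empty fields are dropped, data with possibly empty cells is better served by the strchr(3) approach; strtok also keeps hidden state, so it is not safe to use from two threads at once.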
I have a scientific application for which I want to input initial values at run time. I have an option to get them from the command line, or to get them from an input file. Either of these options is fed to a generic parser that uses strtod to return a linked list of initial values for each simulation run. I either use the command-line argument or getline() to read the values.
The question is, should I be rolling my own parser, or should I be using a parser-generator or some library? What is the standard method? This is the only data I will read at run time, and everything else is set at compile time (except for output files and a few other totally simple things).
Thanks,
Joel
Also check out strtof() for floats, strtod() for doubles.
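For what it's worth, a strtod loop with basic error checking might look like the sketch below. It assumes the values are separated by commas or whitespace and stores them in a plain array rather than a linked list; parse_doubles is an invented name.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Parse comma- or whitespace-separated doubles from `text` into `out`,
   stopping after `max` values. Returns the number parsed, or -1 if a
   token is not a number or is out of range. */
int parse_doubles(const char *text, double *out, int max)
{
    const char *p = text;
    int n = 0;

    while (n < max) {
        while (*p == ',' || isspace((unsigned char)*p))
            p++;                        /* skip separators */
        if (*p == '\0')
            break;                      /* end of input */

        char *end;
        errno = 0;
        double v = strtod(p, &end);
        if (end == p || errno == ERANGE)
            return -1;                  /* nothing consumed, or out of range */

        out[n++] = v;
        p = end;                        /* continue after the number */
    }
    return n;
}
```

Checking that end moved is what distinguishes a real 0.0 from "no number here", which atof cannot do.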
sscanf is probably the standard way to parse them.
However, there are some problems with sscanf, especially if you are parsing user input.
And, of course, atof.
In general, I prefer to have data inputs come from a file (e.g. the initial conditions for the run, the total number of timesteps, etc), and flag inputs come from the command line (e.g. the input file name, the output file name, etc). This allows the files to be archived and used again, and allows comments to be embedded in the file to help explain the inputs.
If the input file has a regular format:
For parsing, read in a full line from the file, and use sscanf to "parse" the line into variables.
If the input file has an irregular format:
Fix the file format so that it is regular (if that is an option).
If not, then strtof and strtod are the best options.
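For the regular-format case, a sketch of the "read a line, then sscanf it" step could look like this. The "name value" layout and the '#' comment marker are assumptions chosen for the example, as is the name parse_setting.

```c
#include <stdio.h>

/* Parse one input line of the assumed form "<name> <value>",
   e.g. "timesteps 1000". Lines that are blank or start with '#'
   are treated as comments.
   Returns 1 if a setting was parsed, 0 for a blank or comment line,
   and -1 for a malformed line. */
int parse_setting(const char *line, char name[32], double *value)
{
    char first;

    /* " %c" skips leading whitespace and grabs the first real character */
    if (sscanf(line, " %c", &first) != 1 || first == '#')
        return 0;

    /* %31s leaves room for the terminating '\0' in name[32] */
    return sscanf(line, "%31s %lf", name, value) == 2 ? 1 : -1;
}
```

Each line would typically come from fgets, and the return value tells the caller whether to store the setting, skip the line, or report an error with the line number.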