Reading in Specific data from text file (C)

I am trying to read specific data from a variable file in fmt format. The data needed are the values of a, b and c, as well as the FFT coefficients (width, height, depth), which are (25, 300, 300) in this case.
For example, from the file below I would want to assign the variables:
a = 2.467
b = 30.000
c = 30.000
width = 25
height = 300
depth = 300.
These values will change, however, as the input file changes.
Currently, the only way I can think of to read these in is by their position in the text file. I do not like this, however, as it is prone to bugs if the layout of the text file changes slightly. Can anyone suggest an alternative method (is there something similar to the Python re module in C)?
Please see an example text file below:
BEGIN header
Real Lattice(A) Lattice parameters(A) Cell Angles
2.4675850 0.0000000 0.0000000 a = 2.467585 alpha = 90.000000
0.0000000 30.0000000 0.0000000 b = 30.000000 beta = 90.000000
0.0000000 0.0000000 30.0000000 c = 30.000000 gamma = 90.000000
1 ! nspins
25 300 300 ! fine FFT grid along <a,b,c>
END header: data is "<a b c> pot" in units of Hartrees

You should first specify and formalize the actual file format of your input (a single example is not enough). You might use, at least for documentation purposes, some EBNF notation (I would guess, but am not sure, that BEGIN and Lattice are important in it, but the fmt wiki page doesn't mention them).
"An example would be from this file"
That is the wrong approach. You need to know the general file format your program will be able to handle, and that is part of your software design. So specify it first.
Then you can use the usual parsing techniques. Read also about lexical analysis. Perhaps a parser generator like GNU bison could be helpful, or perhaps a simple recursive descent parser could be enough. If your input format is line-oriented, you could read the lines one by one (e.g. with POSIX getline) and parse each of them.
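As a rough illustration of the line-by-line approach, here is a minimal sketch that scans each header line for the labels rather than relying on column positions. It assumes the layout of the single example in the question (labels written as " a =", and the grid sizes on the line containing "FFT grid"); the struct and function names are made up for this sketch, and a real parser should follow the formal format specification once you have one.

```c
#include <stdio.h>
#include <string.h>

/* Values wanted from the header. Struct and field names are illustrative. */
struct header {
    double a, b, c;
    int width, height, depth;
};

/* Look for `key` in `line` and read the number that follows it.
   Returns 1 if a number was stored in *out, 0 otherwise. */
static int find_value(const char *line, const char *key, double *out)
{
    const char *p = strstr(line, key);
    return p != NULL && sscanf(p + strlen(key), "%lf", out) == 1;
}

/* Update *h from one header line; returns how many fields were set.
   Assumption: labels appear as " a =", " b =", " c =" (the leading space
   keeps " a =" from matching the end of "alpha =") and the grid line
   contains the text "FFT grid". */
int parse_header_line(const char *line, struct header *h)
{
    int n = 0;
    n += find_value(line, " a =", &h->a);
    n += find_value(line, " b =", &h->b);
    n += find_value(line, " c =", &h->c);
    if (strstr(line, "FFT grid") != NULL &&
        sscanf(line, "%d %d %d", &h->width, &h->height, &h->depth) == 3)
        n += 3;
    return n;
}

/* Read lines until "END header"; returns 0 if all six values were found. */
int read_header(FILE *fp, struct header *h)
{
    char line[512];
    int found = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (strstr(line, "END header") != NULL)
            break;
        found += parse_header_line(line, h);
    }
    return found == 6 ? 0 : -1;
}
```

Calling read_header() on a file opened with fopen() fills the struct regardless of which line each label appears on, as long as the labels themselves keep the assumed spelling.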
Reading the Dragon Book is worthwhile.
"Is there something similar to the python re module in C"
POSIX has <regex.h>; see regcomp(3). Look also into pcre2. I am not sure it is relevant here.
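For comparison, this is roughly what the <regex.h> route could look like for pulling out a "name = number" pair. It is only a sketch: the function name regex_get_value is hypothetical, the pattern is one guess at what the labels look like, and `name` is assumed to contain no regex metacharacters.

```c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>

/* Extract the number following "<name> =" from `line`.
   Returns 0 and stores the value in *out on success, -1 otherwise.
   Assumption: `name` is a plain word with no regex metacharacters. */
int regex_get_value(const char *line, const char *name, double *out)
{
    char pattern[128];
    regex_t re;
    regmatch_t m[3];   /* m[0]: whole match, m[1]: prefix, m[2]: number */
    int len, rc = -1;

    /* The name must start the line or follow whitespace, so that "a"
       does not match the tail of "alpha". */
    len = snprintf(pattern, sizeof pattern,
                   "(^|[[:space:]])%s[[:space:]]*=[[:space:]]*([-+0-9.eE]+)",
                   name);
    if (len < 0 || (size_t)len >= sizeof pattern)
        return -1;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, line, 3, m, 0) == 0) {
        *out = strtod(line + m[2].rm_so, NULL);
        rc = 0;
    }
    regfree(&re);
    return rc;
}
```

Compiling the pattern on every call keeps the sketch short; if this runs on many lines, compiling once per label and reusing the regex_t would be the more usual design.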

Related

Creating an initialization file in pari-gp

This question is definitely a stupid question. But, coming from C, I'm having trouble adding header files or an "initialization" file to my pari-gp code. That is to say, I have code that takes an hour to run in order to make one vector, and I can use that vector once it is initialized, but I want to save this vector to a file so that I can access it after it has been computed once.
Here's the code without a header file, which takes about an hour to run (given the series precision and numerical precision, which are both set to 100).
\p 100
\ps 100
Phi_Inv(w,l,{n=100}) =
{
  my(out = 0);
  for(i=0,n,
    out = w*exp(out)/(exp(l*(n+1-i))+w)
  );
  out;
}
beta_init(n) = {
  beta_taylor = vector(100,i,polcoef(Phi_Inv(w,l,n),i-1,w));
  print(beta_taylor);
}
Rather than the brutal assignment of beta_taylor; and the caveman like print(beta_taylor), how can I write this to an initialization file I can package with the script. That is; an X mb file with all the coefficients neatly packed together. And if the file is lost, just run the code (which will take an hour) to write the initialization file again.
I mean, how would I properly do #include test.h where test.h is just a very long list of Taylor series values. So that I can just include this file, and write beta_taylor[i] for the i'th function. Such that it's as simple as including variables, like in C. I know I'm missing something simple and it's frustrating--making me feel stupid.
I'm mostly just asking about the syntax to go about and do this. I think I know how; but I imagine it's not the best way.
Any help or suggestions are greatly (And I really mean it, Greatly) appreciated.
To make a long story short: how do I save beta_taylor to a file that we load when we initialize the program, so that if the file is deleted we can regenerate it by running the code for an hour?
Regards

So you want to serialize your vector of numbers to a file and read it back in later?
writebin() to the rescue. Something like
beta_init(n) = {
  beta_taylor = vector(100,i,polcoef(Phi_Inv(w,l,n),i-1,w));
  writebin("beta_taylor.dat", beta_taylor);
}
Run the function in one gp session, and then in another session, load the result with beta_taylor=read("beta_taylor.dat").
Compiling your code with gp2c before running it to calculate the numbers will speed things up, if you're not already doing that, btw. gp2c-run makes it easy by compiling a file and starting a new gp session with the resulting shared library already loaded. You might also look into whether the parallel operations can be used here to speed up the initial computation; reading the documentation for parvector(), I don't think they can be, though, because of that mysterious l variable used in beta_init() that I don't see you define anywhere, but you might be able to rephrase your equation with hardcoded constants or something.

An initialization for Pari/GP, at program start:
(file gprc.txt in directory of gp.exe)
lines = 25
colors = "brightfg"
histfile = "gp_history.txt"
breakloop = 0
help = "# perl\\perl gphelp.pl -detex -ch 10 -cb 11 -cu 12"
prompt = "gp >"
prompt_cont="gpc>"
datadir = "u://paritty/syntax/_v11/"
path = "u://paritty/syntax/_v11/"
primelimit = 100 000 000
parisizemax = 1 000 000 000
read "__init.gp"
echo = 0
The file __init.gp contains my generally used functions; commands to read precomputed data-vectors can of course be included there. If no path is indicated it will be searched in the directory given in the path= statement.

Parsing Testing Data into C Program

I'm developing a module that will be run on an Embedded ARM chip to run an attitude controller, which is written in C. We have a MATLAB simulation, with a bunch of low-level functions that I'd like to be able to make unit tests for with data generated by the MATLAB program.
Each function is reasonably complex, and I'd like to calculate the error between the Matlab output and the C output for validation purposes. Each function has the same Inputs and Outputs between the two implementations, so they should match (to an allowable tolerance).
Are there any good existing file formats that could be useful? The types of test data would be:
<Test Input 1> <Test Input 2> <Test input 3> <Expected Output 1> <Expected output 2>
Where inputs and outputs are arbitrary single floats, arrays or matrices. I have considered XML because there are some nice parsers, but that's about all I know.
Any suggestions or direction?

An easy way is to use the CSV file format: it is easy to handle from C, and you can later open the files in OpenOffice or Excel just by giving them the *.csv suffix.
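To give an idea of how little C is needed for the simple case, here is a minimal sketch that parses one line of comma-separated numbers with strtod. It assumes purely numeric fields with no quoting or embedded commas (full CSV is more involved), and the function name is made up for this example.

```c
#include <stdlib.h>

/* Parse up to `max` comma-separated numbers from `line` into `out`.
   Returns the number of values stored, or -1 if a field is not a number.
   Assumption: numeric fields only, no quoted strings. */
int parse_csv_doubles(const char *line, double *out, int max)
{
    int n = 0;
    const char *p = line;

    while (n < max) {
        char *end;
        double v = strtod(p, &end);   /* skips leading whitespace itself */
        if (end == p)
            return -1;                /* nothing numeric at this position */
        out[n++] = v;
        p = end;
        while (*p == ' ' || *p == '\t')
            p++;
        if (*p != ',')
            break;                    /* end of line or unexpected text */
        p++;                          /* step over the comma */
    }
    return n;
}
```

Each line read with fgets() can be passed to this function, with one line per test case holding the inputs followed by the expected outputs.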

It sounds like you want to run these unit tests from C? Have you considered running them in MATLAB instead? If so, you would be able to leverage the MATLAB Unit Test Framework and parameterized testing to encode the actual and expected values (using the "sequential" ParameterCombination attribute in your MATLAB test). This would require that you create MEX wrappers for your C code so that you can invoke it from MATLAB, but other than that extra step this could be quite seamless. Also, have you looked into using MATLAB Coder for this?
The MATLAB Unit Test would look something like this:
classdef Times2Test < matlab.unittest.TestCase
    properties(TestParameter)
        input = {1,2,3};
        expectedResult = {2,4,6};
    end
    methods(Test, ParameterCombination='sequential')
        function testMATLABSimulation(testCase, input, expectedResult)
            actualResult = times2(input);
            testCase.verifyEqual(actualResult, expectedResult, ...
                'RelTol', 1e-6);
        end
        function testCAlgorithm(testCase, input, expectedResult)
            % Must expose to MATLAB by compiling C code to Mex
            actualResult = times2Mex(input);
            testCase.verifyEqual(actualResult, expectedResult, ...
                'RelTol', 1e-6);
        end
    end
end

Since each function has the same input, there is no reason not to create input files in the simplest form: just numbers!
You know exactly the type and amount of numbers you want to read, so just read them using fscanf.
The file could look like:
12.3 100 200.3
1 2 3
4 5 6
7 8 9
The first row is the arbitrary float numbers, you read each one into a variable.
The next 9 are a matrix, so you read them in a loop into a 3x3 matrix, etc...
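Reading the file shown above could be sketched as follows. The struct layout (three scalars followed by a 3x3 matrix) simply mirrors that example, and the names are illustrative rather than anything prescribed.

```c
#include <stdio.h>

/* One test case in the layout shown above: three scalars, then a 3x3 matrix. */
struct test_case {
    double scalars[3];
    double matrix[3][3];
};

/* Returns 0 on success, -1 if the stream ends early or holds a non-number. */
int read_test_case(FILE *fp, struct test_case *tc)
{
    for (int i = 0; i < 3; i++)
        if (fscanf(fp, "%lf", &tc->scalars[i]) != 1)
            return -1;
    /* "%lf" skips spaces and newlines, so the row breaks need no handling. */
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++)
            if (fscanf(fp, "%lf", &tc->matrix[r][c]) != 1)
                return -1;
    return 0;
}
```

Checking the return value of every fscanf call is what catches a truncated or malformed test file early, instead of silently comparing against garbage.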

There is one bit in your question which is kind of an eyebrow raiser:
"inputs and outputs are arbitrary single floats, arrays or matrices". This is going to add some complexity, but maybe there is no way around that.
An XML file format is a good choice because it gives you a lot of flexibility, and you can import/export your tests in an editor to help you make sense of them.
But perhaps an even better choice is a JSON file format. It offers the same flexibility as XML files but is not as heavyweight. There are open source libraries available to work with JSON in C, and I'm sure MATLAB can export data in this format as well.

Import ASCII table in C as array

I have a series of data saved as a table in an ASCII file, for example:
1 100 2.345
2 342 8.233
3 65 89.23
I have just returned to C after a few years of working in Python and wondered whether there is already a library that can do this import, something like numpy.loadtxt() in Python, for example to output a float or a double array. I remember that in the past I had to write a program myself to do this job in C (C99, for example); is there any standard package that will do the import? How about saving the results to an ASCII file? I can write a program myself for both of these, but I don't want to repeat what other people have definitely done before me!

For machine generated files with a regular format, fscanf() can work flawlessly:
int index;
int x;
double y;
while (fscanf(infile, "%d %d %lf\n", &index, &x, &y) == 3) {
/* ... */
}
If you want to be able to handle files with any number of columns and just have the files fed into a data structure that you can later search or manipulate, then it is probably better to use a program or script to generate the same table in CSV or XML format. Then, use a library like libcsv or Mini-XML to parse the file for you.
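If the table is purely numeric and only the number of columns is known in advance, a rough stand-in for numpy.loadtxt() can be built on the same fscanf idea. This is a sketch under those assumptions (no header lines, no comments, every value stored as a double); the function name load_table is made up here.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read whitespace-separated numbers into a malloc'd row-major array with
   `ncols` values per row. On success the row count is stored in *nrows and
   the caller must free() the result. Returns NULL on allocation failure or
   if the number of values is not a multiple of ncols. */
double *load_table(FILE *fp, int ncols, size_t *nrows)
{
    size_t cap = 64, n = 0;
    double *data, v;

    if (ncols <= 0)
        return NULL;
    data = malloc(cap * sizeof *data);
    if (data == NULL)
        return NULL;

    while (fscanf(fp, "%lf", &v) == 1) {
        if (n == cap) {               /* grow the buffer geometrically */
            double *tmp = realloc(data, 2 * cap * sizeof *data);
            if (tmp == NULL) {
                free(data);
                return NULL;
            }
            data = tmp;
            cap *= 2;
        }
        data[n++] = v;
    }
    if (n % (size_t)ncols != 0) {     /* incomplete final row */
        free(data);
        return NULL;
    }
    *nrows = n / (size_t)ncols;
    return data;
}
```

Element (r, c) is then data[r * ncols + c]. Writing a table back out is the mirror image: a nested loop over rows and columns with fprintf(fp, "%g ", ...) and a newline after each row.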

I wrote my own library to do this (with the help of the kind people here who answered my questions very promptly), you can see it here. It does the job of turning a table of data (with an unknown number of headers, columns and rows) to a C array that can be used in the program. I would appreciate any suggestions. Thank you.

How to plot data by c program?

I am a mechanical engineer who has only limited knowledge of C programming. I wrote some code in order to run simulations, and I want to visualize the simulation results. At the moment I am using Dev-C++ for writing my code. With the fopen and fprintf functions I generate a .dat file which contains the results. Then I open the GNUPLOT program and import my .dat file to plot the results. This takes time, and I have to wait until the end of the simulation. Is there an easy way to connect my plotter with Dev-C++, so that my plotter starts plotting data during the simulation? Is there any library for this?

Since you already know gnuplot, the simplest thing to do may be to just call gnuplot from your program and pipe the data to it:
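FILE *gnuplot = popen("gnuplot", "w");
if (gnuplot == NULL) {
    perror("popen");
    exit(EXIT_FAILURE);
}
fprintf(gnuplot, "plot '-'\n");       /* '-' means the data follows inline */
for (int i = 0; i < count; i++)
    fprintf(gnuplot, "%g %g\n", x[i], y[i]);
fprintf(gnuplot, "e\n");              /* 'e' ends the inline data */
fflush(gnuplot);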

I've been using PLPlot for plotting from C and have found it both effective and easy. It's cross platform, open source, and supports a rich array of plot capabilities. I'd recommend having a look at the examples to get started.

OK, one solution, since you are writing out to a file, would be to make a system() call each time you write to the file, and call gnuplot.
But that means you would have to change the filename each time, and I fear that if you do that it won't look correct, since you are sending small amounts of data each time.
http://www.gnu.org/software/libc/manual/html_node/System-Calls.html
I have never used gnuplot, but if you look at this page (http://gnuplot-tricks.blogspot.com/) you may find some tricks that would enable this to work well.
But, I think, unless you are going to plot everything yourself, and skip gnuplot, then you may just need to wait, as you are.
You may find that the C interface to gnuplot helps you:
http://ndevilla.free.fr/gnuplot/

pbPlots is very easy to use and works with all C compilers.
Download pbPlots.c/h and supportLib.c/h from the GitHub page and include them in your build.
Here is a complete example that will produce a plot in a PNG-file.
#include "pbPlots.h"
#include "supportLib.h"
int main(){
    double x [] = {-1, 0, 1, 2, 3, 4, 5, 6};
    double y [] = {-5, -4, -3, -2, 1, 0, 1, 2};
    RGBABitmapImageReference *imageRef = CreateRGBABitmapImageReference();
    DrawScatterPlot(imageRef, 600, 400, x, 8, y, 8);
    size_t length;
    double *pngData = ConvertToPNG(&length, imageRef->image);
    WriteToFile(pngData, length, "plot.png");
    return 0;
}
There are a lot more options available; these are described on the GitHub page.

There is an extensive library called DISLIN for scientific purposes. It is available even for Fortran.
You can check the examples on the official website.
You can obtain it freely from the DISLIN homepage.

I had a wee look around to see what other people have done regarding real-time plotting in gnuplot, and found this:
http://users.softlab.ntua.gr/~ttsiod/gnuplotStreaming.html
There's a little Perl script that can drive the plotting, and you just pipe your information in.
Since your data is being written to file, you may want to tail -f yourdata.dat and pipe that into the real-time plotter.
Also, because you're using the stdio file calls, you'd need to flush regularly (by calling fflush).
Obviously your simulation would be running in the background or in another shell. That way you could break out of the plotting at any time without interrupting the simulation.
Hope that's of some use to you.
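On the simulation side, the flushing advice might look like the following sketch: each sample is written and flushed immediately, so anything following the file (tail -f, or a plotter fed by it) sees the line right away. The function name and the two-column "time value" layout are assumptions for illustration.

```c
#include <stdio.h>

/* Append one "time value" line to the results file and flush it, so that a
   reader following the file sees the sample immediately.
   Returns 0 on success, -1 on a write error. */
int write_sample(FILE *out, double t, double value)
{
    if (fprintf(out, "%g %g\n", t, value) < 0)
        return -1;
    return fflush(out) == 0 ? 0 : -1;
}
```

Flushing on every sample costs some throughput; if the simulation produces samples very quickly, flushing every N samples is a reasonable compromise.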

Retrieving Global Variable Values from Command Line

In one particular project, we're trying to embed version information into shared object files. We'd like to be able to use some standard Linux tool to parse the shared object to determine the version for automated testing.
Currently I have "const int plugin_version = 14;". I can use 'nm' and 'objdump' and verify that it's there:
00000000000dcfbc r plugin_version
I can't, however, seem to be able to get the value of that variable easily from command line. I figured there'd be a POSIX tool for showing the initialized values for globals. I have contemplated using a format for the variable as the information itself, ie, plugin_version_14, but that seems like a huge hack. Embedding the information in the filename unfortunately is NOT an option. Any other suggestions welcome.

You could embed it as a string
"MAGIC MARKER STRING VERSION: 4.56 END OF MAGIC", then just look for "MAGIC MARKER STRING" in the file and extract the version information that comes after it.
If you make it a standard, you could easily make a command line tool to find these embedded strings in all your software.
If you also require it to be an int, a little macro magic will construct both the int and the magic string, to make sure they are never out of sync.
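The macro magic could be sketched like this, assuming the version is an integer literal defined in one place. The macro names PLUGIN_VERSION, STR and STR_ and the marker variable name are illustrative, not part of any existing convention.

```c
/* Single source of truth for the version number. */
#define PLUGIN_VERSION 14

/* Two-step stringification, so that PLUGIN_VERSION is expanded to 14
   before the # operator turns it into the string "14". */
#define STR_(x) #x
#define STR(x) STR_(x)

/* The int for code that needs a number... */
const int plugin_version = PLUGIN_VERSION;

/* ...and the marker string for tools that scan the binary. Adjacent string
   literals are concatenated by the compiler. */
const char plugin_version_marker[] =
    "MAGIC MARKER STRING VERSION: " STR(PLUGIN_VERSION) " END OF MAGIC";
```

With this in the shared object, something like strings plugin.so | grep "MAGIC MARKER STRING" should show the marker line, where plugin.so stands for your library.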

There's a couple of options I think.
My first instinct is to make sure the version information lives in its own section in the ELF file. You can use objdump -s -j <name of section> /bin/whatever.
This rather relies on objdump being available of course.
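With GCC or Clang on an ELF platform, putting the variable in its own section could be sketched as below. The section attribute is a compiler extension rather than standard C, and the section name .plugin_version is just an example.

```c
/* GCC/Clang extension (not standard C): place the variable in a dedicated
   ELF section. "used" asks the compiler to keep it even if nothing in the
   program references it. The section name is an arbitrary example. */
const int plugin_version
    __attribute__((section(".plugin_version"), used)) = 14;
```

objdump -s -j .plugin_version plugin.so (with plugin.so standing for your library) then prints a hex dump of just that section, which a test script can parse.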
Alternatively you can do what Keith suggested, and just use 'strings', along with a magical marker string. This feels a little hackish, but should work quite well.
Finally, why don't you just add a --version command line option? You can then store the version information however you like, and trivially retrieve it using the one tool which is certain to be installed on any system which has your software.

A terrible hack that I've used in the past is to embed the version information in a variable name, so nm will show:
00000000000dcfbc r plugin_version_14

Why not write your own tool to get that version in C/C++? You could use dlopen, then dlsym, to get the symbol and print its value to standard output. This way you also verify that the symbol is actually there. It looks like 20 to 30 lines of code to me and about 20 minutes of your life :)
I know that the question is about command line, but writing such a tool yourself should be easy (especially if such a command line tool does not exist).

If the binary is not stripped, you could use gdb to print the variable. (I just tried to script gdb, but it seems to refuse to work if stdin is not a tty; maybe expect will do the job?)

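If you can accept using Python, this might help:
import struct
import subprocess
import sys

if __name__ == '__main__':
    so = sys.argv[1]
    sym = sys.argv[2]
    # nm prints "<address> <type> <name>"; take the address of the first match
    addr = subprocess.check_output('nm %s | grep %s' % (so, sym), shell=True)
    addr = int(addr.split()[0], 16)
    # This works only when the symbol's address equals its offset in the file
    with open(so, 'rb') as so_file:
        so_file.seek(addr)
        data = so_file.read(4)
    # '@i' unpacks one int in native size and byte order
    print(struct.unpack('@i', data)[0])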
Disclaimer: This script doesn't do any error checking (if you like it I'm sure you can come up with some ;)). It also assumes you're reading a 4-byte native int value.
$ cat global.c
const int plugin_version = 14;
$ python readsym.py global.so plugin_version
14