Import ASCII table in C as array - c

I have a series of data saved as a table in an ASCII file, for example:
1 100 2.345
2 342 8.233
3 65 89.23
I have just returned to C after a few years of working in Python and wondered whether there is already a library that can do this import, something like numpy.loadtxt() in Python, for example to produce a float or double array. I remember that in the past I had to write a program myself to do this job in C (C99, for example). Is there any standard package that will do the import? How about saving the results to an ASCII file? I can write a program myself for both of these, but I don't want to repeat what other people have surely done before me!

For machine generated files with a regular format, fscanf() can work flawlessly:
int index;
int x;
double y;
while (fscanf(infile, "%d %d %lf\n", &index, &x, &y) == 3) {
/* ... */
}
If you want to be able to handle files with any number of columns and just have the files fed into a data structure that you can later search or manipulate, then it is probably better to use a program or script to generate the same table in CSV or XML format. Then, use a library like libcsv or Mini-XML to parse the file for you.
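For plain whitespace-separated numbers, a loader in the spirit of numpy.loadtxt() can also be sketched with only the standard library. The following is a minimal sketch, not a library API: load_table() and save_table() are hypothetical names, lines are assumed to be shorter than 4096 characters, and header or comment lines are not handled (they are reported as errors).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a standard function): reads a whitespace-separated
 * numeric table from `in` into a newly allocated row-major array.
 * The column count is taken from the first non-blank line.
 * Returns NULL on allocation failure, on a malformed or ragged row,
 * and also for an empty file (in which case *rows is 0). */
double *load_table(FILE *in, size_t *rows, size_t *cols)
{
    char line[4096];                 /* assumption: lines fit in this buffer */
    double *data = NULL;
    size_t used = 0, cap = 0, ncols = 0, nrows = 0;

    *rows = *cols = 0;
    while (fgets(line, sizeof line, in)) {
        size_t n = 0;
        char *p = line, *end;

        for (;;) {
            double v = strtod(p, &end);
            if (end == p)
                break;               /* no further number on this line */
            if (used == cap) {       /* grow the buffer geometrically */
                size_t ncap = cap ? cap * 2 : 64;
                double *tmp = realloc(data, ncap * sizeof *tmp);
                if (!tmp) { free(data); return NULL; }
                data = tmp;
                cap = ncap;
            }
            data[used++] = v;
            n++;
            p = end;
        }
        /* anything left other than whitespace is a parse error */
        if (p[strspn(p, " \t\r\n")] != '\0') { free(data); return NULL; }
        if (n == 0)
            continue;                /* skip blank lines */
        if (ncols == 0)
            ncols = n;
        else if (n != ncols) { free(data); return NULL; }
        nrows++;
    }
    *rows = nrows;
    *cols = ncols;
    return data;
}

/* Writes the table back as text; %.17g keeps enough digits for a double
 * to survive the round trip. Returns 0 on success, -1 on a write error. */
int save_table(FILE *out, const double *data, size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            if (fprintf(out, "%.17g%c", data[r * cols + c],
                        c + 1 == cols ? '\n' : ' ') < 0)
                return -1;
    return 0;
}
```

Element (r, c) of the result is data[r * cols + c], and the caller releases the array with free().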

I wrote my own library to do this (with the help of the kind people here who answered my questions very promptly), you can see it here. It does the job of turning a table of data (with an unknown number of headers, columns and rows) to a C array that can be used in the program. I would appreciate any suggestions. Thank you.

Related

Creating an initialization file in pari-gp

This is definitely a stupid question, but coming from C, I'm having trouble adding header files or an "initialization" file to my pari-gp code. That is to say, I have a 1hr compile of code to make one vector, and I can use that vector once it is initialized, but I want to make a file of this vector so that I can access it after it has been computed once.
Here's the code without a header file; which takes about an hour to compile (given the series precision/numerical precision which are set to 100).
\p 100
\ps 100
Phi_Inv(w,l,{n=100}) =
{
my(out = 0);
for(i=0,n,
out = w*exp(out)/(exp(l*(n+1-i))+w)
);
out;
}
beta_init(n) = {
beta_taylor = vector(100,i,polcoef(Phi_Inv(w,l,n),i-1,w));
print(beta_taylor);
}
Rather than the brutal assignment of beta_taylor; and the caveman like print(beta_taylor), how can I write this to an initialization file I can package with the script. That is; an X mb file with all the coefficients neatly packed together. And if the file is lost, just run the code (which will take an hour) to write the initialization file again.
I mean, how would I properly do #include test.h where test.h is just a very long list of Taylor series values. So that I can just include this file, and write beta_taylor[i] for the i'th function. Such that it's as simple as including variables, like in C. I know I'm missing something simple and it's frustrating--making me feel stupid.
I'm mostly just asking about the syntax to go about and do this. I think I know how; but I imagine it's not the best way.
Any help or suggestions are greatly (And I really mean it, Greatly) appreciated.
To make a long story short; how do I save beta_taylor as a file we load when we initialize the program, and if the file is deleted we can save the program again by running the code for an hour?
Regards
So you want to serialize your vector of numbers to a file and read it back in later?
writebin() to the rescue. Something like
beta_init(n) = {
beta_taylor = vector(100,i,polcoef(Phi_Inv(w,l,n),i-1,w));
writebin("beta_taylor.dat", beta_taylor);
}
Run the function in one gp session, and then in another session, beta_taylor=read("beta_taylor.dat").
Compiling your code first with gp2c before running it to calculate the numbers will speed things up if you're not already doing that, btw. gp2c-run makes it easy by compiling a file and starting a new gp session with the resulting shared library already loaded. You might also look into whether the parallel operations can be used here to speed up the initial computation; reading the documentation for parvector(), I don't think they can be, though, because of that mysterious l variable used in beta_init() that I don't see you define anywhere. You might be able to rephrase your equation with hardcoded constants or something.
An initialization for Pari/GP, at program start:
(file gprc.txt in directory of gp.exe)
lines = 25
colors = "brightfg"
histfile = "gp_history.txt"
breakloop = 0
help = "# perl\\perl gphelp.pl -detex -ch 10 -cb 11 -cu 12"
prompt = "gp >"
prompt_cont="gpc>"
datadir = "u://paritty/syntax/_v11/"
path = "u://paritty/syntax/_v11/"
primelimit = 100 000 000
parisizemax = 1 000 000 000
read "__init.gp"
echo = 0
The file __init.gp contains my generally used functions; commands to read precomputed data-vectors can of course be included there. If no path is indicated it will be searched in the directory given in the path= statement.

Reading in Specific data from text file (C)

I am trying to read in specific data from a variable file in fmt format. The data needed is the value of a,b and c as well as the fft coefficients (width,height,depth) (25,300,300) in this case.
An example would be from this file to assign the variables:
a = 2.467
b = 30.000
c = 30.000
width = 25
height = 300
depth = 300.
The values of these will change however as the input file changes.
Currently the only way I can think of to read these in, is from their position in the text file. I do not like this however as it is prone to bugs if the text file changes slightly in layout. Can anyone suggest an alternate method (Is there something similar to the python re module in C)?
Please see an example text file below:
BEGIN header
Real Lattice(A) Lattice parameters(A) Cell Angles
2.4675850 0.0000000 0.0000000 a = 2.467585 alpha = 90.000000
0.0000000 30.0000000 0.0000000 b = 30.000000 beta = 90.000000
0.0000000 0.0000000 30.0000000 c = 30.000000 gamma = 90.000000
1 ! nspins
25 300 300 ! fine FFT grid along <a,b,c>
END header: data is "<a b c> pot" in units of Hartrees
You should first specify and formalize the actual file format of your input (a single example is not enough). You might use, at least for documentation purposes, some EBNF notation (I could guess, but am not sure, that BEGIN and Lattice are important in it, but the fmt wikipage doesn't mention them).
An example would be from this file
That is a wrong approach. You need to know the general file format your program will be able to handle and that is part of your software design. So better specify it first.
Then you'll use usual parsing techniques. Read also about lexical analysis. Perhaps a parser generator like GNU bison could be helpful, or perhaps a simple recursive descent parser could be enough. Maybe your input format cares about lines, then you could read them one by one (e.g. with POSIX getline) and parse each of them.
Reading the Dragon Book is worthwhile.
Is there something similar to the python re module in C
POSIX has <regex.h>; see regcomp(3) ; Look also into pcre2. I am not sure it is relevant here.
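As a rough illustration of the regcomp(3) route, the sketch below searches a line for "name = number" wherever it appears, instead of relying on column positions. The helper names find_param() and find_fft_grid() are made up for this example, and it assumes that the labels (a, b, c) and the "FFT grid" comment text are stable parts of the format, which only a real format specification can confirm.

```c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: looks for "<name> = <number>" in `line` and stores
 * the number in *out. Returns 1 on a match, 0 otherwise.
 * `name` is pasted into the pattern verbatim, so it must not contain
 * regex metacharacters. Exponent notation (1e-3) is not matched. */
int find_param(const char *line, const char *name, double *out)
{
    char pattern[128];
    regex_t re;
    regmatch_t m[3];
    int found = 0;

    /* (^|[^[:alnum:]_]) stops "a" from matching the tail of "alpha" or "beta" */
    int len = snprintf(pattern, sizeof pattern,
        "(^|[^[:alnum:]_])%s[[:space:]]*=[[:space:]]*([-+]?[0-9]+\\.?[0-9]*)",
        name);
    if (len < 0 || (size_t)len >= sizeof pattern)
        return 0;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;
    if (regexec(&re, line, 3, m, 0) == 0) {
        /* m[2] is the parenthesised number */
        *out = strtod(line + m[2].rm_so, NULL);
        found = 1;
    }
    regfree(&re);
    return found;
}

/* Hypothetical helper: recognises the grid line by its comment text rather
 * than by its position in the file, then reads the three leading integers. */
int find_fft_grid(const char *line, int *w, int *h, int *d)
{
    return strstr(line, "FFT grid") != NULL &&
           sscanf(line, "%d %d %d", w, h, d) == 3;
}
```

Both helpers would be called on each line obtained from fgets() or getline(), keeping whichever values are found. Compiling the pattern on every call is wasteful; a real program would compile it once.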

Parsing Testing Data into C Program

I'm developing a module that will be run on an Embedded ARM chip to run an attitude controller, which is written in C. We have a MATLAB simulation, with a bunch of low-level functions that I'd like to be able to make unit tests for with data generated by the MATLAB program.
Each function is reasonably complex, and I'd like to calculate the error between the Matlab output and the C output for validation purposes. Each function has the same Inputs and Outputs between the two implementations, so they should match (to an allowable tolerance).
Are there any good existing file formats that could be useful? The types of test data would be:
<Test Input 1> <Test Input 2> <Test input 3> <Expected Output 1> <Expected output 2>
Where inputs and outputs are arbitrary single floats, arrays or matrices. I have considered XML because there are some nice parsers, but thats about all i know.
Any suggestions or direction?
An easy way is to use the CSV file format:
It is easy to handle from C (see here).
You can open the files in OpenOffice/Excel later just by changing the file suffix to *.csv.
See more about CSV files here.
It sounds like you want to run these unit tests from C? Have you considered running them in MATLAB instead? If so, you would be able to leverage the MATLAB Unit Test Framework and parameterized testing to encode the actual and expected values (using the "sequential" ParameterCombination attribute in your MATLAB test). This would require that you create MEX wrappers for your C code so that you can invoke them from MATLAB, but other than that extra step this could be quite seamless. Also, have you looked into using MATLAB Coder for this?
The MATLAB Unit Test would look something like this:
classdef Times2Test < matlab.unittest.TestCase
properties(TestParameter)
input = {1,2,3};
expectedResult = {2,4,6};
end
methods(Test, ParameterCombination='sequential')
function testMATLABSimulation(testCase, input, expectedResult)
actualResult = times2(input);
testCase.verifyEqual(actualResult, expectedResult, ...
'RelTol', 1e-6);
end
function testCAlgorithm(testCase, input, expectedResult)
% Must expose to MATLAB by compiling C code to Mex
actualResult = times2Mex(input);
testCase.verifyEqual(actualResult, expectedResult, ...
'RelTol', 1e-6);
end
end
end
Since each function has the same input, there is no reason not to create input files in the most simple form - just numbers!
You know exactly the type and amount of numbers you want to read, so just read them using fscanf
The file could look like:
12.3 100 200.3
1 2 3
4 5 6
7 8 9
The first row is the arbitrary float numbers, you read each one into a variable.
The next 9 are a matrix, so you read them in a loop into a 3x3 matrix, etc...
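A minimal sketch of that reading loop, for exactly the layout shown above (three scalars followed by a 3x3 matrix), could look like this. The TestCase struct and both function names are illustrative only, and the comparison uses a purely relative tolerance, so an expected value of exactly 0 would need an absolute tolerance instead.

```c
#include <stdio.h>

/* One test case in the layout shown above: three scalars, then a 3x3 matrix.
 * The struct and function names are illustrative, not from any library. */
typedef struct {
    double scalars[3];
    double matrix[3][3];
} TestCase;

/* Returns 1 if a complete test case was read from `in`, 0 otherwise.
 * "%lf" skips leading whitespace, including the line breaks. */
int read_test_case(FILE *in, TestCase *tc)
{
    for (int i = 0; i < 3; i++)
        if (fscanf(in, "%lf", &tc->scalars[i]) != 1)
            return 0;
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++)
            if (fscanf(in, "%lf", &tc->matrix[r][c]) != 1)
                return 0;
    return 1;
}

/* Relative-tolerance check: |actual - expected| <= rel_tol * |expected|. */
int nearly_equal(double actual, double expected, double rel_tol)
{
    double diff = actual - expected;
    double mag = expected < 0 ? -expected : expected;
    if (diff < 0)
        diff = -diff;
    return diff <= rel_tol * mag;
}
```

A unit test would then call read_test_case() in a loop, feed the inputs to the C function under test, and compare each output against the expected value from MATLAB with nearly_equal().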
There is one bit in your question which is kind of an eyebrow raiser:
"inputs and outputs are arbitrary single floats, arrays or matrices". This is going to add some complexity but maybe there is no way around that.
An XML file format is a good choice because it gives you a lot of flexibility and you can import/export your tests in an editor to help you make sense of them.
But perhaps an even better choice is JSON. It offers the same flexibility as XML but is not as heavyweight. There are open source libraries available to work with it in C, and I'm sure MATLAB can export data in this format as well.

Want to check values are present or not in file [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Here I have a file which contains some information, and I want to check whether the values of some tags are present in the file. If they are present, I want to retrieve those values.
The number of tags is fixed, the length of a tag will not be more than 16, and the length of a tag's value is almost fixed; it will not be more than 10 bytes.
From the file below I want to check whether the KERNEL tag value is present, whether the FS tag value is present, and so on.
I want to check whether each value is present after the : (colon).
My file contains text like this.
KERNEL:2.31
FS:3.4
XLOADER:1.1
UBOOT:2.2
or like this
KERNEL:2.31
FS:
XLOADER:1.3
UBOOT:
I am using this code
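#include <stdio.h>
int main(void) {
    FILE *infile = fopen("example.txt", "r");
    char buffer[256];
    char value[128];
    if (infile == NULL) {
        perror("example.txt");
        return 1;
    }
    while (fgets(buffer, sizeof(buffer), infile)) {
        /* %127s leaves room for the terminating null in value */
        if (1 == sscanf(buffer, "KERNEL:%127s", value)) {
            printf("Value = %s\n", value);
        }
    }
    fclose(infile);
    return 0;
}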
But with this code I have to call this function 4 times for the different tags, such as KERNEL, FS, etc.
This code produces output like this:
Value = 2.31
I read each line from the data file and then want to parse the value of every tag (that is, identify whether the value is present or not).
Is this a good way to do it? Can anybody suggest a better one?
Your question does not make it clear what context the file is in (or did not when I started answering). Presumably, it is a text file that the application can find by some means.
You've not specified whether the contents (tags) are supposed to be fixed, whether the file can contain other information, whether there's a comment convention in place, whether blank lines are allowed, what happens if a tag is missing altogether, what happens if there's an unexpected tag in the file, what happens if a tag is repeated (with the same version, with a different version). Are the tags case-sensitive; are leading blanks allowed before the tag; before the version; after the version? What characters are allowed in the version number? These are details that matter.
Let's assume that the list of names is fixed. Let's assume that tags are not longer than 7 (8 with terminal null). Let's assume that version numbers are not longer than 15 characters in total (16 allowing for terminal null). Let's assume that you need to keep a record of which tags you found and the version that you found.
In that case, you will end up with a data structure a bit like this:
typedef struct VersionInfo
{
char tag[8];
char version[16];
} VersionInfo;
static VersionInfo version_data[] =
{
{ "KERNEL", "" },
{ "FS", "" },
{ "XLOADER", "" },
{ "UBOOT", "" },
};
Clearly, with that basic structure available, you can write a function to read each line from the data file, discarding any comment or blank lines (if that's appropriate). You can probably use fgets() for this; the expected lines are short. You should probably complain about long lines, and decide whether to ignore them or stop processing. You can look for each of the tags in the version_data array, and when you find one, note whether it has been found before or not, and then find the version information on the line and copy it into the corresponding part of the version_data array.
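A minimal sketch of such a reading function follows. It makes several assumptions beyond what the question states: the struct gets an extra found flag, unknown tags and unparsable lines are silently skipped, a repeated tag simply overwrites the earlier version, and read_versions() is a made-up name.

```c
#include <stdio.h>
#include <string.h>

typedef struct VersionInfo {
    char tag[8];
    char version[16];
    int  found;          /* extra field (an assumption): tag was seen */
} VersionInfo;

static VersionInfo version_data[] = {
    { "KERNEL",  "", 0 },
    { "FS",      "", 0 },
    { "XLOADER", "", 0 },
    { "UBOOT",   "", 0 },
};
enum { NUM_TAGS = sizeof version_data / sizeof version_data[0] };

/* Hypothetical helper: fills version_data from `in`.
 * Returns the number of known tags that carry a non-empty version. */
int read_versions(FILE *in)
{
    char line[64];
    int with_value = 0;

    while (fgets(line, sizeof line, in)) {
        char tag[8], version[16];

        version[0] = '\0';
        /* Returns 1 for "FS:" (tag only) and 2 for "FS:3.4" */
        if (sscanf(line, "%7[A-Z]:%15[0-9.]", tag, version) < 1)
            continue;                    /* blank or unrecognised line */
        for (int i = 0; i < NUM_TAGS; i++) {
            if (strcmp(tag, version_data[i].tag) == 0) {
                version_data[i].found = 1;
                strcpy(version_data[i].version, version);
                break;
            }
        }
    }
    for (int i = 0; i < NUM_TAGS; i++)
        if (version_data[i].found && version_data[i].version[0] != '\0')
            with_value++;
    return with_value;
}
```

After the call, a tag with found set but an empty version string is one that appeared in the file without a value, which is the case the question asks about.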
You might extend the structure with an 'expected version' field as well as the actual version field. This will allow you to decide what to do based on the versions you find. You might have ranges of allowable versions, based on what was known when the program was compiled, for example. You might allow the program that was compiled with 3.0.3 of something to run with 3.0.4 and later, and maybe you're even willing to work with 3.0.0 and later, but not any version 2.
Note that version comparison is a fine art in its own right. For example, the versions 3.0.3, 3.1.2, 3.6, 3.6.1, 3.6.1.2, and 3.10.0 should probably be treated as being in ascending order of version number. Using strcmp() won't achieve that (it will place 3.10.0 ahead of all the 3.6 versions).
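One way to get that ordering is to compare the dotted components numerically instead of character by character. The sketch below is only one possible policy: version_cmp() is a made-up name, missing components are treated as 0 (so "3.6" equals "3.6.0"), and the input is assumed to contain nothing but digits and dots.

```c
#include <stdlib.h>

/* Compares dotted numeric versions component by component.
 * Returns <0, 0 or >0 in the manner of strcmp().
 * Assumption: missing components count as 0, so "3.6" == "3.6.0". */
int version_cmp(const char *a, const char *b)
{
    while (*a || *b) {
        char *end_a, *end_b;
        /* strtoul() yields 0 and consumes nothing on an empty string */
        unsigned long na = strtoul(a, &end_a, 10);
        unsigned long nb = strtoul(b, &end_b, 10);
        const char *next_a, *next_b;

        if (na != nb)
            return na < nb ? -1 : 1;
        /* step over the separating dot, if there is one */
        next_a = (*end_a == '.') ? end_a + 1 : end_a;
        next_b = (*end_b == '.') ? end_b + 1 : end_b;
        if (next_a == a && next_b == b)
            break;                  /* unexpected characters: stop here */
        a = next_a;
        b = next_b;
    }
    return 0;
}
```

With this function, 3.6 compares lower than 3.10.0, whereas strcmp() orders them the other way round.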
I see that your example code uses sscanf() with a fixed string. That probably ties you to a fixed order and is certainly not as flexible as the data structure allows you to be. If using sscanf(), I'd be expecting to use a format such as the one below (a plain %7s would not stop at the colon, so the tag is scanned with %7[^:]):
if (sscanf(buffer, "%7[^:]:%15s", tag, version) != 2)
...something up with the input line...
...process tag and version that you did find...
You can refine the scans with the character class notations:
if (sscanf(buffer, "%7[A-Z]:%15[0-9.]", tag, version) != 2)
This only accepts upper-case letters in the tag and only accepts digits and dots in the version information (but is quite happy with version "..0...0...0" which you probably shouldn't accept as valid).
Given your samples, the first approach I'd try is:
Read a line
Remove all whitespace
If the last character on the line is a colon <do something>
Else
Do something else

Debugging in c for log file

I have written down a program in c and I am trying to create a log file of it.
The problem I am facing is that while printing the outputs of each line in the file I want to write some distinctive feature such as the time of execution of that line or even the line number in the code.
Is there any way I can get to know any of these two.
I don't mind if you suggest some other way to get a distinctive feature. All I want is that looking at the log file the user gets to know that a certain part of the code was getting executed.
Thanks
I am working on linux and thus using the GCC compiler....
I have made a header file and, for testing purposes, I am writing __LINE__ in it. What I want is that when I call this header's function from a program, the line number of the call site gets printed. Instead I get the line number of the printf statement in the header file.
What do I need to do to get the line number in the calling file?
This is just a test format given below :-
new.h
void print()
{
printf("Line number is %d",__LINE__);
}
actual file
#include "new.h"
int main()
{
print();
}
I want the line number printed to be that of the actual file and not that of new.h, which is what happens now.
Most C compilers provide some macros to identify each line, function, etc. With GCC, for example, you can use __LINE__, __FUNCTION__, and so on. Check your compiler documentation for details. To get a timestamp, you'll need to let us know what system you're working on.
If you want the actual date and time the function was executed, try asctime(). There is a good reference on how it's done here.
This will output something akin to Sat May 20 17:36:17 2000. If you want the time in seconds since the program started, have a variable such as time_t startTime = time(NULL) which holds the program start time in seconds since the Unix Epoch. Then, simply print time(NULL) - startTime to get the number of seconds since program start.
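Put together, a timestamped log line could be produced roughly as follows. This is a sketch with made-up names (log_init(), format_log_line()); it uses strftime() rather than asctime() so that the timestamp has no trailing newline, and difftime() rather than a raw subtraction because time_t is not guaranteed to be an integer count of seconds.

```c
#include <stdio.h>
#include <time.h>

static time_t start_time;        /* set once, at program start */

/* Hypothetical helper: remembers when the program started. */
void log_init(void)
{
    start_time = time(NULL);
}

/* Hypothetical helper: writes "[YYYY-MM-DD HH:MM:SS] +<elapsed>s <msg>"
 * into buf. Returns the snprintf() result, or -1 if the time could not
 * be formatted. */
int format_log_line(char *buf, size_t size, const char *msg)
{
    char stamp[32];
    time_t now = time(NULL);
    struct tm *tm = localtime(&now);

    if (tm == NULL ||
        strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", tm) == 0)
        return -1;
    return snprintf(buf, size, "[%s] +%.0fs %s",
                    stamp, difftime(now, start_time), msg);
}
```

The formatted line can then be written to the log file with fputs() or fprintf(). The resolution is whole seconds; finer timing needs an OS-specific clock.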
In GCC you can get line number as "__LINE__". Filename - "__FILE__".
If you want to calculate execution time, just remember the time at the start, get the time at the end, and subtract them.
The line number can be obtained by the preprocessor macro __LINE__. The file is __FILE__. As for time, use the relevant OS library.
Or, use a logging library that supports these.
Use __FILE__ and __LINE__ to get the current file and line number.
Edit: based on your edited question. Here's a simple way to do it to start.
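new.h
#include <stdio.h>
/* The macro expands at the call site, so __LINE__ is the caller's line */
#define PRINT() print(__LINE__)
/* static, because the function is defined in a header */
static void print(int line)
{
    printf("Line number is %d\n", line);
}
actual file
#include "new.h"
int main(void)
{
    PRINT();
    return 0;
}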
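The same macro trick extends to the file and function name. The sketch below uses made-up names (log_at(), LOG_TO(), LOG()) and relies on __func__, which is standard since C99; the output format is just one possibility.

```c
#include <stdio.h>

/* Shared implementation; callers normally go through the macros below. */
static void log_at(FILE *out, const char *file, int line,
                   const char *func, const char *msg)
{
    fprintf(out, "%s:%d (%s): %s\n", file, line, func, msg);
}

/* The macros expand where they are used, so __FILE__, __LINE__ and
 * __func__ describe the caller rather than this header. */
#define LOG_TO(out, msg) log_at((out), __FILE__, __LINE__, __func__, (msg))
#define LOG(msg)         LOG_TO(stderr, (msg))
```

Calling LOG("entering main loop") from line 12 of main() in actual.c would then write a line of the form actual.c:12 (main): entering main loop to stderr, and LOG_TO() sends the same line to a log file opened with fopen().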
