HDF5 attributes: Strings are only one character long

I am writing code to produce HDFs, and I'm having a problem with attributes. These attributes are all variable length strings, read in from a text file, but I'm capping them quite generously at 256 characters.
My code compiles and runs with no errors. When I open the file in HDFView, the attributes all have the correct names, but only the first character in the string shows up.
I have started to code for accepting attributes as either single strings, or arrays of strings. I haven't finished that part, so right now it doesn't do anything with the string arrays.
Here is the input file:
year 2013
julian_date 23
start_time_utc 13:54:03
end_time_utc 14:32:05
pixels_per_degree 1000
latitude_corners 34.988644 35.503284 35.960529 36.364529
longitude_corners -119.571877 -118.467979 -120.158424 -119.004395
And here is the code snippet:
#define FIELDSTRINGLENGTH 256
/*...*/
while(fgets(line, ATTRSTRINGLENGTH, topattributefile)!=NULL) {
//parse line into individual words
field=strtok(line," \n");
strncpy(attributename, field, FIELDSTRINGLENGTH);
numfields=0;
field=strtok(NULL," \n");
while(field!=NULL) {
strncpy(attributevalue[numfields++], field, FIELDSTRINGLENGTH);
field=strtok(NULL," \n");
}
if(numfields==0) {printf("ERROR: Attribute %s had no value; skipping\n", attributename);}
else if(numfields>1) {
if(verboseflag) {printf("Making array of %d attributes with name %s:\n",
numfields, attributename);}
for(i=0;i<numfields;i++) {
if(verboseflag) {printf("\t%d: %s\n", i, attributevalue[i]);}
}
}
else {
printf("Making single attribute: %s: %s\n",
attributename, attributevalue[0]);}
//make single attribute
attrdataspaceid = H5Screate(H5S_SCALAR);
attrdatatypeid = H5Tcopy(H5T_C_S1);
status = H5Tset_size(attrdatatypeid, FIELDSTRINGLENGTH);
status = H5Tset_strpad(attrdatatypeid, H5T_STR_NULLTERM);
attributeid = H5Acreate2(fileid, attributename, attrdatatypeid, attrdataspaceid, H5P_DEFAULT, H5P_DEFAULT);
status = H5Awrite(attributeid, H5T_C_S1, attributevalue[0]);
}
}
Here is the stdout of the relevant snippet:
Making top level attributes...
Making single attribute: year: 2013
Making single attribute: julian_date: 23
Making single attribute: start_time_utc: 13:54:03
Making single attribute: end_time_utc: 14:32:05
Making single attribute: pixels_per_degree: 1000
Making array of 4 attributes with name latitude_corners:
0: 34.988644
1: 35.503284
2: 35.960529
3: 36.364529
Making array of 4 attributes with name longitude_corners:
0: -119.571877
1: -118.467979
2: -120.158424
3: -119.004395
Finished making top level attributes.
Finally, here is the metadata of the HDF as read in HDFView.
XXXX.XXXXXX.XXXXXX.hdf (0)
Group size = 1
Number of attributes = 5
year = 2
julian_date = 2
start_time_utc = 1
end_time_utc = 1
pixels_per_degree = 1
Anything strike you as odd here?

Your error is in your call to H5Awrite().
status = H5Awrite(attributeid, H5T_C_S1, attributevalue[0]);
This writes only a single character, because the second argument is the in-memory datatype and H5T_C_S1 is defined as:
One-byte, null-terminated string of eight-bit characters.
I normally create a derived type and set its size, then call H5Awrite() with that type, since the write call takes the size of the data from this in-memory type. In your code, that means passing attrdatatypeid instead of H5T_C_S1.
Here's a simple example.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <hdf5.h>
#define FILENAME "att_str.h5"
int main(){
hid_t fid;
hid_t att;
hid_t ds;
hid_t type;
herr_t status;
char x[] = "lah lah lah";
size_t len = 0; /* strlen() returns size_t */
fid = H5Fcreate(FILENAME, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
type = H5Tcopy(H5T_C_S1);
len = strlen(x);
status = H5Tset_size(type, len);
ds = H5Screate(H5S_SCALAR);
att = H5Acreate(fid, "test", type, ds, H5P_DEFAULT, H5P_DEFAULT);
status = H5Awrite(att, type, x);
status = H5Aclose(att);
status = H5Tclose(type);
status = H5Sclose(ds);
status = H5Fclose(fid);
return(EXIT_SUCCESS);
}
Once it is compiled and run the generated file contains:
HDF5 "att_str.h5" {
GROUP "/" {
ATTRIBUTE "test" {
DATATYPE H5T_STRING {
STRSIZE 11;
STRPAD H5T_STR_NULLTERM;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SCALAR
DATA {
(0): "lah lah lah"
}
}
}
}
Notice how I create a derived type that is the length of string to be written:
len = strlen(x);
status = H5Tset_size(type, len);
The HDF5 group does have an example of writing variable length attributes.

Related

Can't transmit a string through a MariaDB Connector/C prepared statement

I'm using "MariaDB Connector/C" for my homework, but I got a problem: I always get an empty string when I pass in a string parameter, the db table is:
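MariaDB none@(none):test> SELECT * FROM t3
+---+-----+
| a | b   |
+---+-----+
| 0 | abc |
| 1 | bcd |
| 2 | af  |
+---+-----+
3 rows in set
Time: 0.010s
MariaDB none@(none):test> DESC t3
+-------+----------+------+-----+---------+-------+
| Field | Type     | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| a     | int(11)  | NO   | PRI |         |       |
| b     | char(10) | YES  |     |         |       |
+-------+----------+------+-----+---------+-------+
2 rows in set
Time: 0.011s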
And the code I use to test:
#include <mysql/mysql.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
MYSQL *mysql;
mysql = mysql_init(NULL);
if (!mysql_real_connect(mysql,NULL , "none", "linux", "test", 0,"/tmp/mariadb.sock",0)){
printf( "Error connecting to database: %s",mysql_error(mysql));
} else
printf("Connected...\n");
if(mysql_real_query(mysql,"SET CHARACTER SET utf8",(unsigned int)sizeof("SET CHARACTER SET utf8"))){
printf("Failed to set Encode!\n");
}
char query_stmt_2[]="select * from t3 where b=?";
MYSQL_STMT *stmt2 = mysql_stmt_init(mysql);
if(mysql_stmt_prepare(stmt2, query_stmt_2, -1))
{
printf("STMT2 prepare failed.\n");
}
MYSQL_BIND instr_bind;
char instr[50]="abc";
my_bool in_is_null = 0;
my_bool in_error = 0;
instr_bind.buffer_type = MYSQL_TYPE_STRING;
instr_bind.buffer = &instr[0];
char in_ind = STMT_INDICATOR_NTS;
instr_bind.u.indicator = &in_ind;
unsigned long instr_len=sizeof(instr);
// instr_bind.length = &instr_len;
// instr_bind.buffer_length=instr_len;
instr_bind.is_null = &in_is_null;
instr_bind.error = &in_error;
MYSQL_BIND out_bind[2];
memset(out_bind, 0, sizeof(out_bind));
int out_int[2];
char outstr[50];
my_bool out_int_is_null[2]={0,0};
my_bool out_int_error[2]={0,0};
unsigned long out_int_length[2]={0,0};
out_bind[0].buffer = out_int+0;
out_bind[0].buffer_type = MYSQL_TYPE_LONG;
out_bind[0].is_null = out_int_is_null+0;
out_bind[0].error = out_int_error+0;
out_bind[0].length = out_int_length+0;
out_bind[1].buffer = outstr;
out_bind[1].buffer_type = MYSQL_TYPE_STRING;
out_bind[1].buffer_length = 50;
out_bind[1].is_null = out_int_is_null+1;
out_bind[1].error = out_int_error+1;
out_bind[1].length = out_int_length+1;
if(mysql_stmt_bind_param(stmt2, &instr_bind) ||
mysql_stmt_bind_result(stmt2, out_bind)){
printf("Bind error\n");
}
if(mysql_stmt_execute(stmt2))
{
printf("Exec error: %s",mysql_stmt_error(stmt2));
}
if(mysql_stmt_store_result(stmt2)){
printf("Store result error!\n");
printf("%s\n",mysql_stmt_error(stmt2));
}
while(!mysql_stmt_fetch(stmt2))
{
printf("%d\t%s\n", out_int[0], outstr);
}
mysql_stmt_close(stmt2);
end:
mysql_close(mysql);
}
I only got an empty result:
❯ ./Exec/test/stmt_test
Connected...
I have been in trouble with this for two days, and tomorrow is the deadline, I'm very anxious. Can you help? Thanks a lot!

1) General
- Avoid "it was hard to write, so it should be hard to read" code.
- Declare variables at the beginning of the function, not in the middle of the code (-Wdeclaration-after-statement).
- Don't use C++ comments in C.
- Set the character set with the API function mysql_set_character_set().
- Write proper error handling that includes the mysql_error()/mysql_stmt_error() results, and don't continue executing subsequent code after an error.
- Always initialize MYSQL_BIND (instr_bind is never zeroed in your code).
2) Input bind buffer
- u.indicator is used for bulk operations and doesn't make sense here.
- bind.is_null is not required, since you specified a valid buffer address.
- buffer_length is not set (the assignment is commented out).
3) Output bind buffer
- Always bind output parameters after mysql_stmt_execute(), since mysql_stmt_prepare() can't always determine the number of parameters, e.g. when calling a stored procedure. In this case mysql_stmt_bind_param will return an error.
- Binding an error indicator doesn't make much sense without setting MYSQL_REPORT_DATA_TRUNCATION (mysql_optionsv).
For some examples of how to deal with prepared statements, check the file ps.c in the MariaDB Connector/C unit tests.

Reading file containing multiple data formats

I'm supposed to read a file in C with a structure that looks like this
A:
1
2
3
4
B:
1 1
2 2
3 3
4 4
C:
1 1 1
2 2 2
3 3 3
4 4 4
The file is always separated into three parts and each part starts with an identifier (A:, B:,..).
Identifier is followed by unspecified number of rows containing data. But in each part the format of the data is different. Also it's not just integers but that's not important in this question.
I don't have a problem reading the file. My question is what would be an optimal way to read such a file? It can contain thousands of rows or even more parts than just three. The result should be for example string arrays each containing rows from a different part of the file.
I didn't post any code because I don't need/want you to post any code either. Idea is good enough for me.

You could read the file line by line and check every time if a new section starts. If this is the case, you allocate new memory for a new section and read all the following lines to the data structure for that new section.
For dynamic memory allocation, you will need some counters so you know how many lines per section and how many sections in total you have read.
To illustrate the idea (no complete code):
typedef struct {
int count;
char **lines;
} tSection;
int section_counter = 0;
tSection *sections = NULL;
tSection *current_section = NULL;
char line[MAXLINE];
while (fgets(line, MAXLINE, file)) {
if (isalpha(line[0])) { // if line is identifier, start new section
sections = realloc(sections, sizeof(tSection)*(section_counter+1));
current_section = &sections[section_counter];
current_section->lines = NULL;
current_section->count = 0;
section_counter++;
}
else { // if line contains data, add new line to structure of current section
current_section->lines = realloc(current_section->lines, sizeof(char*)*(current_section->count+1));
current_section->lines[current_section->count] = malloc(sizeof(char)*MAXLINE);
strcpy(current_section->lines[current_section->count], line);
current_section->count++;
}
}
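A more complete sketch of the same idea follows. It is only one way to do it and makes a few assumptions that are not in the question: a header is any line that ends with ':', lines are shorter than MAXLINE, and the names Section, read_sections and free_sections are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXLINE 256  /* assumed upper bound on line length */

typedef struct {
    char   name[MAXLINE];  /* identifier without the trailing ':' */
    char **lines;          /* data rows belonging to this section */
    size_t count;          /* number of rows stored in lines */
} Section;

/* Releases everything allocated by read_sections(). */
void free_sections(Section *sections, size_t nsections)
{
    size_t i, j;
    for (i = 0; i < nsections; i++) {
        for (j = 0; j < sections[i].count; j++)
            free(sections[i].lines[j]);
        free(sections[i].lines);
    }
    free(sections);
}

/* Reads all sections from fp. Returns a heap-allocated array and stores
 * its length in *nsections. Returns NULL on allocation failure or when
 * the input contains no header. Rows longer than MAXLINE-1 characters
 * are split by fgets() into several rows. */
Section *read_sections(FILE *fp, size_t *nsections)
{
    Section *sections = NULL;
    size_t n = 0;
    char line[MAXLINE];

    *nsections = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        size_t len;

        line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
        len = strlen(line);
        if (len == 0)
            continue;                         /* ignore blank lines */

        if (line[len - 1] == ':') {           /* header: start a new section */
            Section *grown = realloc(sections, (n + 1) * sizeof *grown);
            if (grown == NULL)
                goto fail;
            sections = grown;
            line[len - 1] = '\0';
            strcpy(sections[n].name, line);
            sections[n].lines = NULL;
            sections[n].count = 0;
            n++;
        } else if (n > 0) {                   /* data row: append to current section */
            Section *cur = &sections[n - 1];
            char **grown = realloc(cur->lines, (cur->count + 1) * sizeof *grown);
            char *copy = malloc(len + 1);
            if (grown != NULL)
                cur->lines = grown;           /* keep the table valid even on failure */
            if (grown == NULL || copy == NULL) {
                free(copy);
                goto fail;
            }
            memcpy(copy, line, len + 1);
            cur->lines[cur->count++] = copy;
        }                                     /* rows before the first header are skipped */
    }
    *nsections = n;
    return sections;

fail:
    free_sections(sections, n);
    return NULL;
}
```

Each row is stored with exactly the memory it needs rather than MAXLINE bytes, which matters once the file grows to thousands of rows. Growing the arrays one element at a time keeps the sketch short; doubling the capacity instead would reduce the number of realloc() calls.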

If each section in the file has a fixed format and the section header has a fixed format, you can use fscanf and a state-machine-based approach. For example, in the code below, the function readsec reads a section based on the parameters passed to it. The arguments to readsec depend on which state it is in.
#include <stdio.h>
void readsec(FILE* f, const char* fmt, int c, char sec) {
printf("\nReading section %c\n",sec);
int data[3];
int ret=0;
while ((ret=fscanf(f, fmt, &data[0], &data[1], &data[2]))!=EOF){
if (ret!=c) {
return;
}
// processData(sec, c, &data); <-- process read data based on section
}
}
int main() {
FILE * f = fopen("file","r");
if (f == NULL) { perror("fopen"); return 1; } // stop if the file cannot be opened
int ret = 0;
char sect = 0;
while ((ret=fscanf(f, "%c%*c\n", &sect))!=EOF){
switch (sect) {
case 'A':
readsec(f, "%d", 1, 'A');break;
case 'B':
readsec(f, "%d %d", 2, 'B');break;
case 'C':
readsec(f, "%d %d %d", 3, 'C');break;
default:break;
}
}
return 0;
}

GCC C and passing integer to PostgreSQL

I know there is probably plenty written about this, but after several days of searching I am unable to find out how to pass an integer and a char in one go to PostgreSQL from C under Linux.
In PHP it is as easy as 1, 2, 3, but in C using libpq it seems to be something out of the ordinary.
I had a look at PQexecParams, but it does not seem to help. The examples on the net are not helping either, and it seems to be an impossible mission.
Would someone be kind enough to translate this simple PHP statement to C and show me how to pass multiple variables of different types in one INSERT query?
col1 is INT
col2 is CHAR
$int1 = 1;
$char1 = 'text';
$query = "INSERT INTO table (col1, col2) values ('$int1',$char1)";
$result = ibase_query($query);
This would show what I am trying to do (please mind the code is very wrong):
void insert_CommsDb(PGconn *conn, PGresult *pgres, int csrv0) { const char * params[1];
params[0] = csrv0;
pgres = PQexecParams(conn, "INSERT INTO comms_db (srv0::int) values ($1)",
1,
NULL,
params,
1,
NULL,
0);
if (PQresultStatus(pgres) != PGRES_COMMAND_OK)
{
fprintf(stderr, "INSERT failed: %s", PQerrorMessage(conn));
exit_nicely(conn,pgres);
}
PQclear(pgres);
}

https://www.postgresql.org/docs/current/static/libpq-exec.html
As @joop commented above:
If the paramTypes argument is NULL, all the params are assumed to be strings.
So, you should transform your int argument to a string.
void insert_CommsDb(PGconn *conn, int csrv0)
{
PGresult *pgres;
const char *params[1];
char buff[12]; // room for a 32-bit int: sign, 10 digits, terminating NUL
snprintf(buff, sizeof buff, "%d", csrv0);
params[0] = buff;
pgres = PQexecParams(conn
, "INSERT INTO comms_db (srv0) values ($1)" // The query (we don't need the cast here)
, 1 // number of params
, NULL // array with types, or NULL
, params // array with parameter values
, NULL // array with parameter lengths
, NULL // array with per-param flags indicating binary/non binary
, 0 // set to 1 if we want BINARY results, 0 for txt
);
if (PQresultStatus(pgres) != PGRES_COMMAND_OK)
{
fprintf(stderr, "INSERT failed: %s", PQerrorMessage(conn));
exit_nicely(conn,pgres);
}
PQclear(pgres);
}
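The integer-to-text step can be isolated into a small helper that also reports when the buffer is too small. This is only a sketch in plain C with no libpq calls, and int_to_param is a name made up for this example.

```c
#include <stdio.h>

/* Formats value as decimal text into buf. Returns 0 on success, or -1 if
 * the text plus its terminating NUL does not fit in size bytes.
 * int_to_param is a hypothetical helper, not part of libpq. */
int int_to_param(int value, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%d", value);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

For an INSERT with an integer and a text column, the formatted buffer and the string can then sit side by side in the same array, for example const char *params[2] = { buf, "text" }, with the parameter count passed to PQexecParams set to 2.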

wildplasser's answer shows the way in general.
Since you explicitly asked about several parameters, I'll add an example for that.
If you are not happy to convert integers to strings, the alternative would be to use the external binary format of the data type in question. That requires inside knowledge and probably reading the PostgreSQL source. For some data types, it can also depend on the hardware.
PGresult *res;
PGconn *conn;
Oid types[2];
char * values[2];
int lengths[2], formats[2];
int arg0;
/* connect to the database */
/*
* The first argument is in binary format.
* Apart from having to use the "external binary
* format" for the data, we have to specify
* type and length.
*/
arg0 = htonl(42); /* external binary format: network byte order */
types[0] = 23; /* OID of "int4" */
values[0] = (char *) &arg0;
lengths[0] = sizeof(int);
formats[0] = 1;
/* second argument is in text format */
types[1] = 0;
values[1] = "something";
lengths[1] = 0;
formats[1] = 0;
res = PQexecParams(
conn,
"INSERT INTO mytab (col1, col2) values ($1, $2)",
2,
types,
(const char * const *)values,
lengths,
formats,
0 /* results in text format */
);
I'd recommend that you use the text format for most data types.
The notable exception is bytea, where it usually is an advantage to use the binary format, as it saves space and CPU power. In this case, the external binary format is simply the bytes.

Visual Studio's C++ compiler does not accept htonl(42):
arg0 = htonl(42); /* external binary format: network byte order */
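One possible cause is a missing declaration: on Windows htonl() comes from winsock2.h and needs Ws2_32.lib at link time, whereas on POSIX systems it is declared in arpa/inet.h. If you would rather not depend on either, the four bytes can be produced by hand. The sketch below assumes the parameter is an int4, and encode_int4_be is a name made up for this example.

```c
#include <stdint.h>

/* Stores value in buf as four bytes, most significant byte first
 * (network byte order), regardless of the host's endianness.
 * encode_int4_be is a hypothetical helper, not part of libpq. */
void encode_int4_be(int32_t value, unsigned char buf[4])
{
    uint32_t u = (uint32_t)value;   /* well-defined for negative values too */

    buf[0] = (unsigned char)((u >> 24) & 0xFF);
    buf[1] = (unsigned char)((u >> 16) & 0xFF);
    buf[2] = (unsigned char)((u >> 8) & 0xFF);
    buf[3] = (unsigned char)(u & 0xFF);
}
```

In the snippet above, values[0] would then point at the four-byte buffer and lengths[0] would be 4.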

Pointers getting junk values outside of functions, but regular values inside them

Full disclosure: This is my first time doing any significant programming in C, and my first post on Stack Overflow.
I'm working on code that will eventually be used with Bison to implement a small subset of the Scheme/Racket language. All of this code is in a single C file. I have three structs: Binding, Lambda, and SymbolEntry. I'm not using the Lambda struct yet, it's just there for completeness. I also have a symbol table that holds symbol entries. printSymbolTable() does exactly what the name implies:
typedef struct
{
char* name;
char* value;
} Binding;
typedef struct
{
int numBindings;
Binding** bindings;
char* functionBody;
} Lambda;
typedef struct
{
Binding* binding;
Lambda* function;
} SymbolEntry;
SymbolEntry* symbolTable = NULL;
int numSymbols = 0;
void printSymbolTable()
{
if (symbolTable)
{
int i = 0;
for (i; i < numSymbols; i++)
{
printf("\tsymbolTable[%i]: %s = %s\n", i, symbolTable[i].binding->name, symbolTable[i].binding->value);
}
}
}
I'm currently trying to work out the logic for defining and looking up variables. The 2 relevant functions:
// Takes a name and an exprssion and stores the result in the symbol table
void defineVar(char* name, char* expr)
{
printf("\nSetting %s = %s\n", name, expr);
printf("Previous number of symbols: %i\n", numSymbols);
Binding props;
props.name = name;
props.value = expr;
SymbolEntry entry;
entry.binding = &props;
entry.function = NULL;
symbolTable = realloc(symbolTable, sizeof(SymbolEntry) * ++numSymbols);
if (!symbolTable)
{
printf("Memory allocation failed. Exiting.\n");
exit(1);
}
symbolTable[numSymbols - 1] = entry;
printf("New number of symbols: %i\n", numSymbols);
printf("defineVar result:\n");
printSymbolTable();
}
// Test storing and looking up at least 4 variables, including one that is undefined
void testVars()
{
printf("Variable tests\n");
defineVar("foo", "0");
printf("After returning from defineVar:\n");
printSymbolTable();
defineVar("bar", "20");
printf("After returning from defineVar:\n");
printSymbolTable();
}
main() calls testVars(). I get no warnings or errors when compiling, and the program executes successfully. However, this is the result:
Variable tests
Setting foo = 0
Previous number of symbols: 0
New number of symbols: 1
defineVar result:
symbolTable[0]: foo = 0
After returning from defineVar:
symbolTable[0]: 1�I��^H��H���PTI��# = �E
Setting bar = 20
Previous number of symbols: 1
New number of symbols: 2
defineVar result:
symbolTable[0]: bar = 20
symbolTable[1]: bar = 20
After returning from defineVar:
symbolTable[0]: 1�I��^H��H���PTI��# = �E
symbolTable[1]: 1�I��^H��H���PTI��# = �E���
Not only am I getting junk values when outside of the defineVar() function, but the call to define bar shows incorrect non-junk values as well. I'm not sure what I'm doing wrong, but I assume it's probably something with realloc(). However, a similar strategy worked when parsing a string into individual tokens, so that's what I was trying to emulate. What am I doing wrong?

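Because entry.binding points to props, a variable local to defineVar(). (I haven't read further, so there may be more.) The stack frame is discarded when the function returns and is soon overwritten, so the pointer stored in the table dangles. It also explains the "bar = 20" shown for both entries: during the second call, props occupies the same stack address the first entry still points to.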
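One way to fix it is to give each Binding its own heap storage, so it outlives the call. The sketch below is an illustration under some assumptions, not the only fix: it also copies the name and value strings (not strictly needed for string literals, but they will not stay literals once Bison hands you tokens), it uses a void pointer in place of the Lambda pointer, and copyString is a helper made up for this example.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *name;
    char *value;
} Binding;

typedef struct {
    Binding *binding;   /* heap-allocated, so it outlives defineVar() */
    void    *function;  /* stands in for the question's Lambda*, unused here */
} SymbolEntry;

static SymbolEntry *symbolTable = NULL;
static int numSymbols = 0;

/* Hypothetical helper: returns a heap copy of s, or NULL on failure. */
static char *copyString(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Adds name = expr to the table. Returns 0 on success, or -1 on
 * allocation failure, in which case no entry is added. */
int defineVar(const char *name, const char *expr)
{
    Binding *props = malloc(sizeof *props);
    SymbolEntry *grown;

    if (props == NULL)
        return -1;
    props->name = copyString(name);
    props->value = copyString(expr);

    grown = realloc(symbolTable, sizeof *grown * (size_t)(numSymbols + 1));
    if (grown != NULL)
        symbolTable = grown;        /* the old pointer may no longer be valid */
    if (grown == NULL || props->name == NULL || props->value == NULL) {
        free(props->name);
        free(props->value);
        free(props);
        return -1;
    }

    symbolTable[numSymbols].binding = props;
    symbolTable[numSymbols].function = NULL;
    numSymbols++;
    return 0;
}
```

An alternative that avoids the extra allocation is to store the Binding by value inside SymbolEntry instead of through a pointer; then the struct assignment into symbolTable copies it.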

ns_parserr: Message too long; error message when using BIND resolver library function ns_parserr()

The code below is supposed to print out the TXT Resource Records I have in my zone file.
When I execute the code with only BLOCK 1 (BLOCK 2 not present), I get the name, Type, Class, TTL and Data Length for each of the 3 TXT RRs I have.
But when I execute the code with only BLOCK 2, I only get the answer for the first TXT RR and then an error message: ns_parserr: Message too long.
Can somebody please help me with this problem?
Thanks in advance.
int rrnum; /* resource record number */
ns_rr rr; /* expanded resource record */
for(rrnum = 0; rrnum < ns_msg_count(handle, ns_s_an); rrnum++)
{
//from section ns_s_an(ANSWER) take out answer number rrnum and put it in rr
if (ns_parserr(&handle, ns_s_an, rrnum, &rr)) {
fprintf(stderr, "ns_parserr: %s\n", strerror(errno));
}
if (ns_rr_type(rr)==ns_t_txt){
//BLOCK 1
char *cp;
cp=(char *)ns_rr_name(rr);
printf("CP->%s\n",(char *)cp);
int i1=ns_rr_type(rr);
printf("Type->%d\n", i1);
int i2=ns_rr_class(rr);
printf("Class->%d\n", i2);
int i3=ns_rr_ttl(rr);
printf("TTL->%d\n", i3);
int i4=ns_rr_rdlen(rr);
printf("Data Length->%d\n\n", i4);
//BLOCK 2
u_char const *rdata=ns_rr_rdata(rr);
printf("Data->%s\n",(u_char *)rdata);
char *rdatatemp;
rdatatemp=(char *)rdata;
int len=strlen(rdata);
printf("%d\n",len);
rdatatemp[strlen(rdata)-2]='\0';
printf("Data->%s\n",(u_char *)rdatatemp);
}
}
This is the result I get with the two blocks on:
vanco@vanco-laptop:~/Desktop$ gcc d2ip.c /usr/lib/libresolv.a
vanco@vanco-laptop:~/Desktop$ ./a.out www.example.com
CP->www.example.com
Type->16
Class->1
TTL->10800
Data Length->41
Data->(ver=dgw1 pre=8 id=0 name=www.example.com�
43
Data->(ver=dgw1 pre=8 id=0 name=www.example.com
ns_parserr: Message too long

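You're modifying the memory pointed to by the u_char const * that ns_rr_rdata(rr) returns, circumventing the const with a cast:
u_char const *rdata=ns_rr_rdata(rr);
...
char *rdatatemp;
rdatatemp=(char *)rdata;
...
rdatatemp[strlen(rdata)-2]='\0';
That pointer refers to the message buffer that handle is parsing, so writing '\0' into it corrupts the data ns_parserr() needs for the next record. The RDATA is also not NUL-terminated, so strlen(rdata) can run past the end of the record (it returned 43 for a record with a data length of 41).
You need to allocate a new char array and copy from rdata.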
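A sketch of such a copy follows. It relies on the TXT RDATA layout from RFC 1035, one or more character-strings that each start with a length byte, and it handles only the first one. The leading '(' in the question's output is consistent with that: '(' is byte value 40, and 40 plus the length byte gives the reported data length of 41. The name copy_txt_string is made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Copies the first character-string of a TXT record's RDATA into a new
 * NUL-terminated buffer, leaving the resolver's message untouched.
 * Returns NULL if the RDATA is malformed or allocation fails.
 * The caller frees the result. copy_txt_string is a hypothetical helper. */
char *copy_txt_string(const unsigned char *rdata, size_t rdlen)
{
    size_t len;
    char *copy;

    if (rdata == NULL || rdlen == 0)
        return NULL;
    len = rdata[0];                 /* first byte holds the string length */
    if (len > rdlen - 1)
        return NULL;                /* length byte points past the record */

    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, rdata + 1, len);   /* skip the length byte */
    copy[len] = '\0';
    return copy;
}
```

In the loop from the question it would be called as copy_txt_string(ns_rr_rdata(rr), ns_rr_rdlen(rr)), and the result printed and then freed.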
