How to parse an XML file in C on Linux

I'm writing a weather program, so I need to parse an XML file.
libxml is already installed on my system.
However, I don't know how to extract a number from the file.
Here is part of my XML:
<tmx>-999.0</tmx>
<tmn>-999.0</tmn>
<sky>2</sky>
<pty>0</pty>
I need the number on the last line, the value of 'pty'.
Thank you very much for helping me.

If you only need to extract a few values, you might not need libxml at all. As long as you don't care about validating the input, you could load the file (or part of it) into memory and search it as a string.
char* xmlChunk; // must point to a writable, NUL-terminated copy of your data
// find the start tag and the '>' that closes it
char* tagStartBegin = strstr(xmlChunk,"<pty");
char* tagStartEnd = tagStartBegin ? strstr(tagStartBegin,">") : NULL;
// get the end tag
char* tagEndBegin = tagStartEnd ? strstr(tagStartEnd,"</pty>") : NULL;
if (tagEndBegin != NULL)
{
    char* value = tagStartEnd+1;
    //end the string here
    *tagEndBegin = '\0';
    // parse your value
    doSomeThing(value);
}
You might need to adapt this code for Unicode input, if any.
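The same idea can be sketched as a reusable function that leaves the input untouched and converts the text with strtol. The name tag_value_long is hypothetical, and the sketch assumes the whole document is already in one NUL-terminated string and that the tag holds a plain integer. It is string searching, not real XML parsing.

```c
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: find <tag ...>NUMBER</tag> in xml and store the
 * integer in *out. Returns 0 on success, -1 if the tag is missing or its
 * content is not a plain integer. */
static int tag_value_long(const char *xml, const char *tag, long *out)
{
    char open[64], close[64];
    if (strlen(tag) + 4 > sizeof(open))
        return -1;
    snprintf(open, sizeof(open), "<%s", tag);
    snprintf(close, sizeof(close), "</%s>", tag);

    size_t open_len = strlen(open);
    const char *start = xml;
    while ((start = strstr(start, open)) != NULL) {
        char next = start[open_len];
        if (next == '>' || isspace((unsigned char)next))
            break;              /* exact tag name, not just a prefix */
        start += open_len;
    }
    if (start == NULL)
        return -1;

    const char *gt = strchr(start, '>');
    if (gt == NULL)
        return -1;
    const char *value = gt + 1;
    const char *end = strstr(value, close);
    if (end == NULL)
        return -1;

    char *stop;
    errno = 0;
    long n = strtol(value, &stop, 10);
    if (stop == value || errno == ERANGE)
        return -1;              /* no digits, or out of range */
    while (stop < end && isspace((unsigned char)*stop))
        stop++;
    if (stop != end)
        return -1;              /* something other than an integer */
    *out = n;
    return 0;
}
```

With the XML from the question, tag_value_long(xml, "pty", &v) stores 0 in v. A value such as -999.0 is rejected because it is not an integer; strtod would be the function to reach for there.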

Related

Need help parsing a "|" separated line from a file

I have to parse a file that would look something like this
String|OtherString|1234|0
String2|OtherString2|4321|1
...
So, I need to go through every line of the file and take each separate token of each line.
FILE *fp=fopen("test1.txt","r");
int c;
char str1[500];
char str2[500];
int num1=0;
int num2;
while((c=fgetc(fp))!=EOF){
fscanf(fp, "%s|%s|%d|%d", &str1[0], &str2[0], &num1, &num2);
}
fclose(fp);
There's more to it, but these are the sections relevant to my question. fscanf isn't working, presumably because I've written it wrong. What's supposed to happen is that str1[500] should be set to String, in this case, str2 to OtherString, etc. It seems as though fscanf isn't doing anything, however. Would greatly appreciate some help.
EDIT: I am not adamant about using fgetc or fscanf; these are just what I have at the moment. I'd use anything that lets me do what I have to.
strtok() in a loop will work for you. The following is a bare bones example, with very little error handling etc, but illustrates the concept...
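char strArray[4][80];
char *tok = NULL;
char *dup = strdup(origLine);
int i = 0;
if(dup)
{
    tok = strtok(dup, "|\n");
    while(tok && i < 4) // do not write past the four rows of strArray
    {
        strncpy(strArray[i], tok, sizeof strArray[i] - 1);
        strArray[i][sizeof strArray[i] - 1] = '\0';
        tok = strtok(NULL, "|\n");
        i++;
    }
    free(dup);
}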
If reading from a file, put this loop inside another while loop that reads the file line by line. Useful functions for this include fopen(), fgets() and fclose(). Code that reads data from a file should also consider determining the number of records (lines) up front and using that to create a properly sized container for the parsed results. But that is a topic for another question.
Note: fgetc() is not suggested here as it reads one char per loop, and would be less efficient than using fgets() for reading lines from a file when used in conjunction with strtok().
Note also that, in general, the more consistently a file is formatted (number of fields, content of fields, etc.), the less complicated the parser needs to be. The inverse is also true: a less consistently formatted input file requires a more complex parser. For example, line data entered by a human typically needs a more complicated parser than a computer-generated set of uniform lines.
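Putting the pieces together, a minimal sketch of the fgets() plus strtok() approach for the sample file might look like this. The names struct record, parse_record and count_records are hypothetical, and the sketch assumes every line fits in the buffer and that no field is empty (strtok skips empty fields).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct record {
    char str1[80];
    char str2[80];
    int  num1;
    int  num2;
};

/* Split one "a|b|1|0" line into a record. Modifies line, because strtok
 * writes NUL bytes into it. Returns 0 on success, -1 if a field is missing. */
static int parse_record(char *line, struct record *rec)
{
    char *fields[4];
    char *tok = strtok(line, "|\n");
    for (int i = 0; i < 4; i++) {
        if (tok == NULL)
            return -1;
        fields[i] = tok;
        tok = strtok(NULL, "|\n");
    }
    snprintf(rec->str1, sizeof rec->str1, "%s", fields[0]);
    snprintf(rec->str2, sizeof rec->str2, "%s", fields[1]);
    rec->num1 = atoi(fields[2]);
    rec->num2 = atoi(fields[3]);
    return 0;
}

/* Read every line of fp and count the records that parse cleanly. */
static int count_records(FILE *fp)
{
    char line[1024];
    struct record rec;
    int n = 0;
    while (fgets(line, sizeof line, fp) != NULL)
        if (parse_record(line, &rec) == 0)
            n++;
    return n;
}
```

In a real program the loop in count_records would store or process each record instead of only counting it. atoi does no error checking; strtol is the safer choice if the numeric fields may be malformed.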

sscanf to get segment of string surrounded by two fixed strings

I'm trying to remove the extension of a file name (I know it is .txt) using sscanf(). I've tried many format strings that I thought might work, but with no success. The main problem is that I can't understand sscanf()'s documentation, so I don't get how to use the %[*][width][modifiers]type specifier syntax. I've tried telling it that the end must be ".txt", and saving the initial string in one variable and a %4c corresponding to the extension in another, but again, I can't make it work.
I know this has been asked before here: sscanf: get first and last token in a string but as I said... I don't understand its solution.
The part of my code that does that:
sscanf(fileName,"the_sender_is_%s%*[.txt]", sender);
The input file name is, for example: "the_sender_is_Monika.txt"
In sender I should have
Monika
but whatever I try gives me
Monika.txt
When you use
sscanf(fileName,"the_sender_is_%s%*[.txt]", sender);
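%s reads every non-whitespace character it can, including ".txt", before %*[.txt] is processed, so nothing is left for the scanset to match.
Use a scanset that stops at the first '.' instead: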
sscanf(fileName,"the_sender_is_%[^.]", sender);
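As a small sketch of that call with the return value checked and a field width added so the buffer cannot overflow: the function name extract_sender and the 64-byte buffer size are assumptions made for illustration.

```c
#include <stdio.h>

/* Copies the part between the fixed prefix and the first '.' into sender.
 * The width 63 matches the 64-byte buffer assumed by this sketch.
 * Returns 1 if a name was found, 0 otherwise. */
static int extract_sender(const char *fileName, char sender[64])
{
    return sscanf(fileName, "the_sender_is_%63[^.]", sender) == 1;
}
```

For "the_sender_is_Monika.txt" this leaves Monika in sender. A name that does not start with the prefix makes sscanf stop before the conversion, so the function returns 0 and sender is not written.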
While sscanf() is powerful, it is not the universal tool. There are limits on what you can do with it, and you're hitting them. A moderate approximation to the task would be:
char body[32];
char tail[5];
if (sscanf("longish-name-without-dots.txt", "%31[^.]%4s", body, tail) != 2)
…oops — can't happen with the constant string, but maybe with a variable one…
This gets you longish-name-without-dots into body and .txt into tail. But it won't work all that well if there are dots in the name part before the extension.
You're probably looking for:
const char *file = "longish-name.with.dots-before.txt";
char *dot = strrchr(file, '.');
if (dot == NULL)
…oops — can't happen with the literal, but maybe with a variable…
strcpy(tail, dot); // Beware buffer overflow
memcpy(body, file, dot - file);
body[dot - file] = '\0';
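A bounds-checked version of the strrchr() approach could look like the sketch below. The helper name split_extension and its size parameters are hypothetical; the logic is the same as above, with the buffer overflow mentioned in the comment ruled out.

```c
#include <string.h>

/* Hypothetical helper: copy the part of file before the last '.' into body
 * and the extension (including the dot) into tail. Returns 0 on success,
 * -1 if there is no dot or either buffer is too small. */
static int split_extension(const char *file,
                           char *body, size_t body_size,
                           char *tail, size_t tail_size)
{
    const char *dot = strrchr(file, '.');
    if (dot == NULL)
        return -1;
    size_t body_len = (size_t)(dot - file);
    size_t tail_len = strlen(dot);
    if (body_len >= body_size || tail_len >= tail_size)
        return -1;
    memcpy(body, file, body_len);
    body[body_len] = '\0';
    memcpy(tail, dot, tail_len + 1);   /* includes the terminating NUL */
    return 0;
}
```

Note that strrchr() finds the last dot anywhere in the string, so a path such as dir.d/file would be split at the directory name. The sketch assumes a bare file name.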

I can't delete file using SHFileOperation

I want to delete a file to the recycle bin. I'm using this code:
SHFILEOPSTRUCT FileOp;
FileOp.hwnd = NULL;
FileOp.wFunc=FO_DELETE;
FileOp.pFrom= lpFileName; //it's my value \\?\C:\WorkFolder\qweqw.docx
FileOp.pTo = NULL;
FileOp.fFlags=FOF_ALLOWUNDO|FOF_NOCONFIRMATION;
FileOp.hNameMappings=NULL;
int t_res = SHFileOperation(&FileOp); // t_res = 124
return t_res;
What am I doing wrong? Thanks in advance.
What is t_res? It should give the error code and suggest the reason.
Note that pFrom takes a list of files, not a single file, so you should terminate the buffer with two zeros. See this excerpt from the MSDN documentation:
Although this member is declared as a single null-terminated string,
it is actually a buffer that can hold multiple null-delimited file
names. Each file name is terminated by a single NULL character. The
last file name is terminated with a double NULL character ("\0\0") to
indicate the end of the buffer.
The error code 124 is 0x7C in hexadecimal, which according to the documentation is:
DE_INVALIDFILES 0x7C The path in the source or destination or both was invalid.
You don't mention any analysis of this, so my suggestion would be to dig into how the filename is represented. Is it the proper encoding?
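To illustrate the double-NUL layout described in the excerpt, here is a minimal sketch in portable C that builds such a buffer from an ordinary string. The helper name make_double_nul is hypothetical, and the sketch assumes narrow (char) strings; a build that uses the wide-character API would need the wchar_t equivalent.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of path followed by two NUL
 * bytes, the layout the quoted documentation describes for pFrom.
 * The caller frees the result. Returns NULL on allocation failure. */
static char *make_double_nul(const char *path)
{
    size_t len = strlen(path);
    char *buf = malloc(len + 2);
    if (buf == NULL)
        return NULL;
    memcpy(buf, path, len);
    buf[len] = '\0';       /* terminates the single file name */
    buf[len + 1] = '\0';   /* terminates the list */
    return buf;
}
```

The result would then be assigned to FileOp.pFrom in place of lpFileName and freed after SHFileOperation returns.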

C file streams, inserting at the beginning

Is there a simple way to insert something at the beginning of a text file using file streams? Because the only way I can think of, is to load a file into a buffer, write text-to-append and then write the buffer. I want to know if it is possible to do it without my buffer.
No, it's not possible. You'd have to rewrite the file to insert text at the beginning.
EDIT: You could avoid reading the whole file into memory if you used a temporary file, i.e.:
1. Write the value you want inserted at the beginning of the new file
2. Read X bytes from the old file
3. Write those X bytes to the new file
4. Repeat steps 2 and 3 until you are done reading the old file
5. Copy the new file to the old file.
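The steps above can be sketched roughly as follows. The function name prepend_to_file and the ".tmp" suffix for the temporary file are assumptions of this sketch, and it finishes with rename() instead of copying the data back, which relies on POSIX behaviour (rename replaces an existing target).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: put text in front of the contents of path by writing
 * a temporary file next to it and renaming it over the original.
 * Returns 0 on success, -1 on any I/O error. */
static int prepend_to_file(const char *path, const char *text)
{
    char tmp_path[4096];
    int n = snprintf(tmp_path, sizeof tmp_path, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp_path)
        return -1;

    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(tmp_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* the inserted text goes first */
    size_t text_len = strlen(text);
    int ok = fwrite(text, 1, text_len, out) == text_len;

    /* then the old contents, one chunk at a time */
    char buf[4096];
    size_t got;
    while (ok && (got = fread(buf, 1, sizeof buf, in)) > 0)
        ok = fwrite(buf, 1, got, out) == got;
    if (ferror(in))
        ok = 0;
    fclose(in);
    if (fclose(out) != 0)
        ok = 0;

    /* replace the old file with the new one */
    if (!ok || rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

Only one 4096-byte chunk is held in memory at a time, at the cost of temporarily needing disk space for a second copy of the file. An existing file named path.tmp would be overwritten.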
There is no simple way, because the actual operation is not simple. When the file is stored on the disk, there are no empty available bytes before the beginning of the file, so you can't just put data there. There isn't an ideal generic solution to this -- usually, it means copying all of the rest of the data to move it to make room.
Thus, C makes you decide how you want to solve that problem.
Just wanted to counter some of the more absolute claims in here:
There is no way to append data to the beginning of a file.
Incorrect, there are, given certain constraints.
When the file is stored on the disk, there are no empty available bytes before the beginning of the file, so you can't just put data there.
This may be the case when dealing at the abstraction level of files as byte streams. However, file systems most often store files as a sequence of blocks, and some file systems allow a bit more free access at that level.
Linux 4.1+ (XFS) and 4.2+ (XFS, ext4) allow you to insert holes into files using fallocate, given certain offset/length constraints:
Typically, offset and len must be a multiple of the filesystem logical block size, which varies according to the filesystem type and configuration.
Examples on StackExchange sites can be found by web searching for 'fallocate prepend to file'.
There is no way to append data to the beginning of a file.
The questioner also says that the only way they thought of solving the problem was by reading the whole file into memory and writing it out again. Here are other methods.
Write a placeholder of zeros of a known length. You can rewind the file and write over this data, so long as you do not exceed the placeholder size.
A simplistic example is writing an unsigned int at the start that represents the count of lines that follow. You cannot fill it in until you reach the end, at which point you rewind the file and overwrite the placeholder with the correct value.
Note: Some C implementations on some platforms insist that you move the file position to the end of the file before closing it for this to work correctly.
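A minimal sketch of the placeholder technique, assuming a seekable stream opened for writing and a text header of fixed width. The function name write_with_count and the ten-digit header format are choices made for this illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: write a fixed-width line count at the start of the
 * file, then the lines, then go back and fill in the real count.
 * lines is a NULL-terminated array. Returns 0 on success, -1 on error. */
static int write_with_count(FILE *fp, const char *const *lines)
{
    unsigned count = 0;
    if (fprintf(fp, "%010u\n", 0u) < 0)      /* placeholder: ten digits */
        return -1;
    for (; lines[count] != NULL; count++)
        if (fprintf(fp, "%s\n", lines[count]) < 0)
            return -1;
    rewind(fp);                              /* back to the placeholder */
    if (fprintf(fp, "%010u", count) < 0)     /* same width, so nothing shifts */
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)        /* leave the position at the end */
        return -1;
    return 0;
}
```

The header is zero-padded so that the overwrite is exactly as long as the placeholder. Ten digits are enough for a 32-bit unsigned value.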
Write the new data to a new file and then, using file streams, append the old data to the new file. Delete the old file and then rename the new file to the old file name. Do not copy the new file back; it is a waste of time.
All methods have trade-offs of disk size versus memory and CPU usage. It all depends on your application requirements.
Not strictly a solution, but if you're adding a short string to a long file, you can make a looping buffer of the same length you want to add, and sort of roll the extra characters out:
// fp must be open for update ("r+b") and positioned at the start of the file
// also serves as the buffer; the null char gives the extra slot at the beginning
char insRay[] = "[inserted text]";
printf("inserting:%s size of buffer:%zu\n", insRay, sizeof(insRay));
// indices to read in and out to the file
size_t iRead = sizeof(insRay) - 1;
size_t iWrite = 0;
// save the input, so we only read once
int in = '\0';
do {
    in = fgetc(fp);
    if (in != EOF) {
        // step back so the write below replaces the char just read
        fseek(fp, -1L, SEEK_CUR);
        // preserve og file char
        insRay[iRead] = (char)in;
        // loop
        iRead++;
        if (iRead == sizeof(insRay))
            iRead = 0;
        // insert or replace chars
        fputc(insRay[iWrite], fp);
        // a seek is required between a write and the next read
        fseek(fp, 0L, SEEK_CUR);
        // loop
        iWrite++;
        if (iWrite == sizeof(insRay))
            iWrite = 0;
    }
} while (in != EOF);
// add buffer to the end of file, minus the slot that held the null char
for (size_t i = 0; i < sizeof(insRay) - 1; i++) {
    fputc(insRay[iWrite], fp);
    iWrite++;
    if (iWrite == sizeof(insRay))
        iWrite = 0;
}

Sending structured data over named pipe (Linux)

I am using a named pipe for IPC on a Debian system. I will be sending some data as a set of strings from a bash script to a background process written in C.
The data I want to send is four strings, e.g. accountid, firstname, surname, description. Currently I am sending the data as a single space-separated string from my bash script.
echo "accountid firstname surname description" >$pipe
In the background process I read the pipe data into the char array 'datain' like this:
res = read(pipe_fd, datain, BUFFER_SIZE);
Then I just iterate over the array looking for spaces, e.g.:
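char* p = datain;
char accountid[80];
char firstname[80];
size_t i = 0;
// extract the accountid
while(*p != ' ' && *p != '\0' && i < sizeof(accountid) - 1)
{
    accountid[i++] = *p;
    ++p;
}
accountid[i] = '\0';
if(*p == ' ')
    ++p;
// extract the firstname
i = 0;
while(*p != ' ' && *p != '\0' && i < sizeof(firstname) - 1)
{
    firstname[i++] = *p;
    ++p;
}
firstname[i] = '\0';
etc....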
This method seems a bit crude. My programming skills are not that good, so I was wondering if there is a better strategy for transferring this set of data over a named pipe in Linux.
Thanks
A pipe (named or not) is a stream of bytes. If you were using the same language on both sides, there might be a better way of sending structured data. In your situation, a manual encoding and decoding, like you're doing, is by far the easiest solution.
Don't use spaces to separate fields that may contain spaces, such as people's names. Use :, like /etc/passwd.
In C, read is hard to use, because you have to decide on a buffer size in advance and you have to call it in a loop because it may return less than the buffer size on a whim. The functions from stdio.h (that operate on a FILE* rather than a file descriptor) are easier to use but still require work to handle long lines. If you don't care about portability outside Linux, use getline:
FILE *pipe = fdopen(fd, "r");
char *line = NULL;
size_t line_length = 0;
ssize_t nread = getline(&line, &line_length, pipe); // -1 on EOF or error; free(line) when done
Then use strchr to locate the :s in the line. (Don't be tempted to use strtok: it treats a run of consecutive delimiters as one, so it silently skips empty fields.)
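A minimal sketch of splitting such a line with strchr, keeping empty fields. The helper name split_fields is hypothetical; it assumes the line is writable (as the buffer returned by getline is) and lets the last field keep any further colons, which suits a free-text description.

```c
#include <string.h>

/* Hypothetical helper: split line in place on ':' into at most max_fields
 * pointers. Empty fields are kept as empty strings. The last field keeps
 * any remaining colons. A trailing newline is removed.
 * Returns the number of fields found. */
static size_t split_fields(char *line, char **fields, size_t max_fields)
{
    size_t n = 0;
    line[strcspn(line, "\n")] = '\0';
    while (n < max_fields) {
        fields[n++] = line;
        if (n == max_fields)
            break;                 /* the rest stays in the last field */
        char *colon = strchr(line, ':');
        if (colon == NULL)
            break;
        *colon = '\0';
        line = colon + 1;
    }
    return n;
}
```

On the sending side the bash script would then write something like accountid:firstname:surname:description to the pipe, and the reader checks that split_fields returned 4 before using the fields.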
Since it's 2010, you might want to encode your data in JSON or XML, both of which are readily available as libraries for C and almost any other language.

Resources