Appending data in existing XML using libxml2 - c

Well, I am trying to append data using C and the libxml2 module, but I am facing a lot of problems as I am fairly new to this.
My code is designed to first fetch me an Element Node from the XML file based on the user input and then grab the parent of that child node and append another child in it.
XML FILE:
<policyList>
<policySecurity>
<policyName>AutoAdd</policyName>
<deviceName>PA-722</deviceName>
<status>ACTIVE</status>
<srcZone>any</srcZone>
<dstZone>any</dstZone>
<srcAddr>5.5.5.5</srcAddr>
<dstAddr>5.5.5.4</dstAddr>
<srcUser>any</srcUser>
<application>any</application>
<service>htds</service>
<urlCategory>any</urlCategory>
<action>deny</action>
</policySecurity>
<policySecurity>
<policyName>Test-1</policyName>
<deviceName>PA-710</deviceName>
<status>ACTIVE</status>
<srcZone>any</srcZone>
<dstZone>any</dstZone>
<srcAddr>192.168.1.23</srcAddr>
<dstAddr>8.8.8.8</dstAddr>
<srcUser>vivek</srcUser>
<application>any</application>
<service>http</service>
<urlCategory>any</urlCategory>
<action>deny</action>
</policySecurity>
</policyList>
C CODE:
int main(){
xmlDocPtr pDoc = xmlReadFile("/var/www/db/db_policy.xml", NULL, XML_PARSE_NOBLANKS | XML_PARSE_NOERROR | XML_PARSE_NOWARNING | XML_PARSE_NONET);
if (pDoc == NULL)
{
fprintf(stderr, "Document not parsed successfully.\n");
return 0;
}
root_element = xmlDocGetRootElement(pDoc);
if (root_element == NULL)
{
fprintf(stderr, "empty document\n");
xmlFreeDoc(pDoc);
return 0;
}
printf("Root Node is %s\n", root_element->name);
xmlChar* srcaddr = "5.5.5.5";
xmlChar *xpath = (xmlChar*) "//srcAddr";
xmlNodeSetPtr nodeset;
xmlXPathObjectPtr result;
int i;
xmlChar *keyword;
xmlXPathContextPtr context;
xmlNodePtr resdev;
xmlChar* resd;
context = xmlXPathNewContext(pDoc);
if (context == NULL) {
printf("Error in xmlXPathNewContext\n");
}
result = xmlXPathEvalExpression(xpath, context);
xmlXPathFreeContext(context);
if (result == NULL) {
printf("Error in xmlXPathEvalExpression\n");
}
if(xmlXPathNodeSetIsEmpty(result->nodesetval)){
xmlXPathFreeObject(result);
printf("No result\n");
};
if (result) {
nodeset = result->nodesetval;
for (i=0; i < nodeset->nodeNr; i++) {
keyword = xmlNodeListGetString(pDoc, nodeset->nodeTab[i]->xmlChildrenNode, 1);
printf("keyword: %s\n", keyword);
if(strcmp(keyword, srcaddr) == 0){
xmlNodePtr pNode = xmlNewNode(0, (xmlChar*)"service");
xmlNodeSetContent(pNode, (xmlChar*)"nonser");
xmlAddSibling(result, pNode);
printf("added");
}
xmlFree(keyword);
}
xmlXPathFreeObject (result);
}
xmlFreeDoc(pDoc);
xmlCleanupParser();
return (1);
}
On running this code, it gets compiled and executed (with a few warnings, but nothing that hinders execution), but it does not add anything to my XML file.

I think this topic is old but I just had a similar problem. So, I am just sharing for those who still have similar problems.
> On running this code, it gets compiled and executed (with a few warnings, but nothing that hinders execution), but it does not add anything to my XML file.
First of all: in my opinion, warnings in C are much worse than errors because they let you run the wrong code. So my very first piece of advice is not to ignore the warnings (although I am not in a position to advise anyone, but anyway).
Second: When I was running this code, I saw a warning which makes sense:
> warning: passing argument 1 of ‘xmlAddSibling’ from incompatible
> pointer type [-Wincompatible-pointer-types]
>
> note: expected ‘xmlNodePtr {aka struct _xmlNode *}’ but argument is of
> type ‘xmlXPathObjectPtr {aka struct _xmlXPathObject *}’
If you check xmlAddSibling at http://www.xmlsoft.org/html/libxml-tree.html you can see:
xmlNodePtr xmlAddSibling (xmlNodePtr cur, xmlNodePtr elem)
This means both arguments must be of type xmlNodePtr. However, "result" has the type xmlXPathObjectPtr, so the pointer types are completely different. What you really want to do is add a child to the parent of the node whose text matched in the comparison (if(strcmp(keyword, srcaddr) == 0)).
So your way of finding the matching node is correct, but there are two problems. First, you never updated "result" (assuming you imagined "result" to be the parent, which it is not): the for loop walks "nodeset->nodeTab[i]" and never stores anything in "result". Second, even if you had assigned "nodeset->nodeTab[i]" to "result", the pointer types would still differ (as discussed above). So you have to call xmlAddSibling on the correct node and with the correct pointer type. As you can see below, "nodeTab" is an array of "xmlNodePtr", which is the type we were looking for, and "nodeset->nodeTab[i]" is the matched srcAddr node; adding a sibling to it appends the new node under the same parent.
Structure xmlNodeSet
struct _xmlNodeSet {
int nodeNr : number of nodes in the set
int nodeMax : size of the array as allocated
xmlNodePtr * nodeTab : array of nodes in no particular order
}
So you should change the:
xmlAddSibling(result, pNode);
to:
xmlAddSibling(nodeset->nodeTab[i], pNode);
Finally: you didn't save the changes. So, save it by adding
xmlSaveFileEnc("note.xml", pDoc, "UTF-8");
before
xmlFreeDoc(pDoc);
With these changes, I was able to run your code with your XML file and with no warnings.

Your commands modify the DOM representation of the XML in memory, but you missed writing it back to the file. So adding the following line should solve your problem:
...
}
// write back to file:
xmlSaveFileEnc("/var/www/db/db_policy.xml", pDoc, "UTF-8");
xmlFreeDoc(pDoc);
xmlCleanupParser();
return (1);

Related

Problem constructing an appendNode function for a linked list in C

source code from tbaMUD
In file "handler.c", we have this "obj_to_room" function, which takes any object dropped in the room and adds it to a linked list. This creates a stack of objects, with the first object at the bottom (head) of the stack, the second object stacked on top of the first, and so on. This is the default behavior for linked lists in C.
In file "act.informative.c" we have the "look_at_room" function. That calls the "list_obj_to_char" function, which uses a "for loop" to read the list/stack.
When using a "for loop" to read the list/stack/node, it does so from top (tail) to bottom (head). This, too, is the default behavior in C. Therefore, objects dropped in the room are displayed with the most recently dropped object at the top of the list and the first object dropped at the bottom.
That's what causes this issue:
www.tbamud.com/forum/2-general/5530-has-anyone-else-noticed
My goal is to invert the order of objects in that linked list. There are a few hacks I might pull off, but that's just what they would be - hacks, not exactly proper and certainly not elegant. I think the best solution is coding a function using "appendNode" to add objects at the tail (top) of the list instead of its head (bottom).
Toward that end, I need to change this:
/* put an object in a room */
void obj_to_room(struct obj_data *object, room_rnum room)
{
if (!object || room == NOWHERE || room > top_of_world)
log("SYSERR: Illegal value(s) passed to obj_to_room. (Room #%d/%d, obj %p)",
room, top_of_world, (void *)object);
else {
object->next_content = world[room].contents;
world[room].contents = object;
IN_ROOM(object) = room;
object->carried_by = NULL;
if (ROOM_FLAGGED(room, ROOM_HOUSE))
SET_BIT_AR(ROOM_FLAGS(room), ROOM_HOUSE_CRASH);
}
}
to something like this:
/*put an object in a room */
void obj_to_room(struct obj_data *object, room_rnum room)
{
if (!object || room == NOWHERE || room > top_of_world)
{
log("SYSERR: Illegal value(s) passed to obj_to_room. (Room #%d/%d, obj %p)",
room, top_of_world, (void*) object);
}
else
{
/*function to add objects at the tail of the list instead of its head*/
/*everything hinges on this single line and I probably have it all kinds of wrong*/
/*struct node* appendNode(struct node** head, int key)*/
struct world[room].contents* appendNode(struct world[room].contents** object, room_rnum room)
{
/* special case for length 0*/
if (object == NULL)
{
*object = world[room].contents;
}
else
{
/* locate the last node */
while (object->next_content != NULL)
{
object = object->next_content;
}
object->next_content = world[room].contents;
world[room].contents = object;
IN_ROOM(object) = room;
object->carried_by = NULL;
if (ROOM_FLAGGED(room, ROOM_HOUSE))
SET_BIT_AR(ROOM_FLAGS(room), ROOM_HOUSE_CRASH);
}
}
}
}
Problem 1
Although I'm familiar with multiple programming languages, C is not one of them. When it comes to the idiosyncrasies and technical fine points of the language, I know nothing. That makes reading C code challenging and writing it even more so.
Problem 2
I understand the format should be:
struct node* appendNode(struct node** head, int key)
I think the head and int key are correct, but I'm unable to identify the node in the original code. So I used my best guess.
It's not surprising that attempting to compile this code yields:
handler.c: In function ‘obj_to_room’:
handler.c:681:19: error: expected identifier or ‘(’ before ‘[’ token
681 | struct world[room].contents* appendNode(struct world[room].contents** object, room_rnum room)
| ^
make[1]: *** [<builtin>: handler.o] Error 1
Ok, I suspect there's all sorts of things wrong with that line, but I don't know how to fix it. I'm hoping that some brilliant coder will be kind enough to help out.
This does the trick.
/* put an object in a room */
void obj_to_room(struct obj_data* object, room_rnum room)
{
if (!object || room == NOWHERE || room > top_of_world) {
log("SYSERR: Illegal value(s) passed to obj_to_room. (Room #%d/%d, obj %p)", room, top_of_world, (void *)object);
}
else {
if (world[room].contents == NULL) { // here, we have an empty list.
world[room].contents = object; // Just add it.
}
else {
struct obj_data* i = world[room].contents; // define a temporary pointer
while (i->next_content != NULL) {
i = i->next_content; // find the first without a next_content
}
i->next_content = object; // add it at the end
}
object->next_content = NULL; // end of the linked list
IN_ROOM(object) = room;
object->carried_by = NULL;
if (ROOM_FLAGGED(room, ROOM_HOUSE)) {
SET_BIT_AR(ROOM_FLAGS(room), ROOM_HOUSE_CRASH);
}
}
}

C Programming - fprintf and printf in a while loop don't work

I'm getting a strange problem with a while loop inside of a function.
I have to look for the extreme vertices of a .ply model. All the data is stored in a linked list. When I'm done creating the list, I call the findExtremeVertex function, which modifies 6 global variables (leftVertex, rightVertex, downwardVertex, upwardVertex, backVertex and frontVertex).
To see if the values are right (the models I use are a bit too big to check every single line by hand to find the maximum of every vertex) I decided to print every change in the max-min values but, when I try to print them to a file, the file is empty. Why is that? Also, when I saw that the file was empty, I tried to print something directly to the console, but that didn't work either.
Here's the code of the function:
void findExtremeVertex(Vertex *vertex){
FILE *modelInfoFile;
int i = 0;
///Giving data to direction-vertices pointers
leftVertex = malloc(sizeof(Vertex));
rightVertex = malloc(sizeof(Vertex));
upwardVertex = malloc(sizeof(Vertex));
downwardVertex = malloc(sizeof(Vertex));
frontVertex = malloc(sizeof(Vertex));
backVertex = malloc(sizeof(Vertex));
///Giving the direction-vertices the values of the parameter
leftVertex = vertex;
rightVertex = vertex;
upwardVertex = vertex;
downwardVertex = vertex;
frontVertex = vertex;
backVertex = vertex;
///Opening file
modelInfoFile = fopen(us2, "w");
if(modelInfoFile == NULL){
printf("Error in file opening. Exiting.");
exit(EXIT_FAILURE);
}
///Scrolling the list
while(vertex->prev != NULL){
vertex = vertex->prev;
///If the given element of the list is more to the right than the global variable,
///I assign the values of the element to the global variable
if(vertex->vertexCoordinates.x > rightVertex->vertexCoordinates.x){
rightVertex = vertex;
}
/**
I'm omitting the other if constructs because are basically
the same, but the syntax is correct
**/
///Printing in file the cycle information
fprintf(modelInfoFile, "********** CYCLE %d **********\n\n", i);
fprintf(modelInfoFile, "Vertex sx\n");
fprintf(modelInfoFile, "%1.4f %1.4f %1.4f %1.4f %1.4f %1.4f\n\n", leftVertex->vertexCoordinates.x,
leftVertex->vertexCoordinates.y,
leftVertex->vertexCoordinates.z,
leftVertex->vertexNormals.x,
leftVertex->vertexNormals.y,
leftVertex->vertexNormals.z);
/**
Again, I'm omitting some repetitions but the syntax is correct
**/
}
}
I call this function from another function, but there's no segmentation fault signal, the compiler doesn't tell me anything, and the program doesn't crash. I have no clue what the error is, except for the fact that the file where I print the information about the cycles is empty. What am I doing wrong?
There are many problems in your code.
You malloc() 6 variables and never use any of them, and you don't check if malloc() succeeded.
You never call fclose() or fflush() so maybe you are seeing the file before the data is flushed to the disk.
You reassign all the *Vertex variables, after they are malloc()ed, to the same pointer vertex, which means that you are causing a memory leak and that you are using 6 variables for a single pointer.
All the *Vertex variables are not declared inside the function which means that they are in the global scope, that is very likely a bad design choice. Given the code you posted it's not possible to tell whether or not global variables are the right choice, but 99% of the time they are a bad choice and there is a much more elegant and safe way to do things.
The missing fclose()/fflush() call (the second point above) is likely the reason why your program is behaving as it is.
The code
leftVertex = vertex;
rightVertex = vertex;
upwardVertex = vertex;
downwardVertex = vertex;
frontVertex = vertex;
backVertex = vertex;
sets the pointer value but not the actual value. You malloc space, get a pointer to that space, and then throw that pointer away by setting it to the pointer vertex.
Did you mean to use the following?
*leftVertex = *vertex;
*rightVertex = *vertex;
*upwardVertex = *vertex;
*downwardVertex = *vertex;
*frontVertex = *vertex;
*backVertex = *vertex;
///Scrolling the list
while(vertex->prev != NULL){
vertex = vertex->prev;
And what happens if vertex itself is NULL when the function is called?
The loop condition dereferences vertex->prev without first checking whether vertex is NULL.
///Opening file
if(modelInfoFile == NULL){
printf("Error in file opening. Exiting.");
exit(EXIT_FAILURE);
}
I don't see you opening the file.
if((modelInfoFile=fopen(filename,"w")) == NULL){
Should work.
EDIT
In your while loop you change
vertex = vertex->prev;
But in fprintf you write the value of leftVertex->vertexCoordinates.x to the file.
So how do you expect the output in the file to be correct?

C standard binary trees

I'm pretty much of a noob in regards to C programming.
Been trying for a few days to create a binary tree from expressions of the form:
A(B,C(D,$))
where each letter is a node.
'(' goes down a level in my tree (to the right).
',' goes to the left-side branch of my tree
'$' inserts a NULL node.
')' means going up a level.
This is what I came up with after 2-3 days of coding:
#define SUCCESS 0
typedef struct BinaryTree
{
char info;
BinaryTree *left,*right,*father;
}BinaryTree;
int create(BinaryTree*nodeBT, const char *expression)
{
nodeBT *aux;
nodeBT *root;
nodeBT *parent;
nodeBT=(BinaryTree*) malloc (sizeof(BinaryTree));
nodeBT->info=*expression;
nodeBT->right=nodeBT->left=NULL;
nodeBT->father = NULL;
++expression;
parent=nodeBT;
root=nodeBT;
while (*expression)
{if (isalpha (*expression))
{aux=(BinaryTree*) malloc (sizeof(BinaryTree));
aux->info=*expression;
aux->dr=nodeBT->st=NULL;
aux->father= parent;
nodeBT=aux;}
if (*expression== '(')
{parent=nodeBT;
nodeBT=nodeBT->dr;}
if (*expression== ',')
{nodeBT=nodeBT->father;
nodeBT=nodeBT->dr;}
if (*expression== ')')
{nodeBT=nodeBT->father;
parent= nodeBT->nodeBT;}
if (*expression== '$')
++expression;
++expression;
}
nodeBT=root;
return SUCCESS;
}
At the end, while trying to access the newly created tree, I keep getting "memory unreadable 0xCCCCCC". And I haven't got the slightest hint where I'm getting it wrong.
Any ideas?
Several problems:
You haven't shown us the definition of type nodeBT, but you've declared aux, root, and parent to be pointers to that type.
You then assign aux to point to a BinaryTree even though it's declared to point to a nodeBT.
You assign to aux->dr, which isn't part of BinaryTree, so I can't just assume you typed nodeBT where you meant BinaryTree. You assign to nodeBT->st, that is not a part of BinaryTree either.
You try to return the parsed tree by assigning nodeBT=root. The problem is that C is a “call-by-value” language. This implies that when your create function assigns to nodeBT, it is only changing its local variable's value. The caller of create doesn't see that change. So the caller doesn't receive the root node. That's probably why you're getting your “memory unreadable” error; the caller is accessing some random memory, not the memory containing the root node.
Your code will actually be much easier to understand if you write your parser using a standard technique called “recursive descent”. Here's how.
Let's write a function that parses one node from the expression string. Naively, it should have a signature like this:
BinaryTree *nodeFromExpression(char const *expression) {
To parse a node, we first need to get the node's info:
char info = expression[0];
Next, we need to see if the node should have children.
BinaryTree *leftChild = NULL;
BinaryTree *rightChild = NULL;
if (expression[1] == '(') {
If it should have children, we need to parse them. This is where we put the “recursive” in “recursive descent”: we just call nodeFromExpression again to parse each child. To parse the left child, we need to skip the first two characters in expression, since those were the info and the ( of the current node:
leftChild = nodeFromExpression(expression + 2);
But how much do we skip to parse the right child? We need to skip all the characters that we used while parsing the left child…
rightChild = nodeFromExpression(expression + ???
We don't know how many characters that was! It turns out we need to make nodeFromExpression return not just the node it parsed, but also some indication of how many characters it consumed. So we need to change the signature of nodeFromExpression to allow that. And what if we run into an error while parsing? Let's define a structure that nodeFromExpression can use to return the node it parsed, the number of characters it consumed, and the error it encountered (if there was one):
typedef struct {
BinaryTree *node;
char const *error;
int offset;
} ParseResult;
We'll say that if error is non-null, then node is null and offset is the offset in the string where we found the error. Otherwise, offset is just past the last character consumed to parse node.
So, starting over, we'll make nodeFromExpression return a ParseResult. It will take the entire expression string as input, and it will take the offset in that string at which to start parsing:
ParseResult nodeFromExpression(char const *expression, int offset) {
Now that we have a way to report errors, let's do some error checking:
if (!expression[offset]) {
return (ParseResult){
.error = "end of string where info expected",
.offset = offset
};
}
char info = expression[offset++];
I didn't mention this the first time through, but we should handle your $ token for NULL here:
if (info == '$') {
return (ParseResult){
.node = NULL,
.offset = offset
};
}
Now we can get back to parsing the children.
BinaryTree *leftChild = NULL;
BinaryTree *rightChild = NULL;
if (expression[offset] == '(') {
++offset; /* consume the '(' before parsing the left child */
So, to parse the left child, we just call ourselves recursively again. If the recursive call gets an error, we return the same result:
ParseResult leftResult = nodeFromExpression(expression, offset);
if (leftResult.error)
return leftResult;
OK, we parsed the left child successfully. Now we need to check for, and consume, the comma between the children:
offset = leftResult.offset;
if (expression[offset] != ',') {
return (ParseResult){
.error = "comma expected",
.offset = offset
};
}
++offset;
Now we can recursively call nodeFromExpression to parse the right child:
ParseResult rightResult = nodeFromExpression(expression, offset);
The error case now is a bit more complex if we don't want to leak memory. We need to free the left child before returning the error:
if (rightResult.error) {
free(leftResult.node);
return rightResult;
}
Note that free does nothing if you pass it NULL, so we don't need to check for that explicitly.
Now we need to check for, and consume, the ) after the children:
offset = rightResult.offset;
if (expression[offset] != ')') {
free(leftResult.node);
free(rightResult.node);
return (ParseResult){
.error = "right parenthesis expected",
.offset = offset
};
}
++offset;
We need to set our local leftChild and rightChild variables while the leftResult and rightResult variables are still in scope:
leftChild = leftResult.node;
rightChild = rightResult.node;
}
We've parsed both children, if we needed to, so now we're ready to construct the node we need to return:
BinaryTree *node = (BinaryTree *)calloc(1, sizeof *node);
node->info = info;
node->left = leftChild;
node->right = rightChild;
We have one last thing to do: we need to set the father pointers of the children:
if (leftChild) {
leftChild->father = node;
}
if (rightChild) {
rightChild->father = node;
}
Finally, we can return a successful ParseResult:
return (ParseResult){
.node = node,
.offset = offset
};
}
I've put all the code in this gist for easy copy'n'paste.
UPDATE
If your compiler doesn't like the (ParseResult){ ... } syntax, you should look for a better compiler. That syntax has been standard since 1999 (§6.5.2.5 Compound Literals). While you're looking for a better compiler, you can work around it like this.
First, add two static functions:
static ParseResult ParseResultMakeWithNode(BinaryTree *node, int offset) {
ParseResult result;
memset(&result, 0, sizeof result);
result.node = node;
result.offset = offset;
return result;
}
static ParseResult ParseResultMakeWithError(char const *error, int offset) {
ParseResult result;
memset(&result, 0, sizeof result);
result.error = error;
result.offset = offset;
return result;
}
Then, replace the problematic syntax with calls to these functions. Examples:
if (!expression[offset]) {
return ParseResultMakeWithError("end of string where info expected",
offset);
}
if (info == '$') {
return ParseResultMakeWithNode(NULL, offset);
}

Pointer to Pointer

I am having a lot of trouble with this piece of code (I am not good at pointers :P). So here is the code.
printf("\n Enter the file name along with its extensions that you want to delete:-");
scanf("%s",fileName);
deletefile_1_arg=fileName;
printf("test\n");
result_5 = deletefile_1(&deletefile_1_arg, clnt);
if (result_5 == (int *) NULL) {
clnt_perror (clnt, "call failed");
}
else
{
printf("\n File is deleted sucessfully");
goto Menu2;
}
break;
Function that is getting called is as following.
int *
deletefile_1_svc(char **argp, struct svc_req *rqstp)
{
static int result;
printf("test2\n");
printf("%s",**argp);
if(remove(**argp));
{
printf("\nFile Has Been Deleted");
result=1;
}
return &result;
}
I am getting test2 on the console, but it does not print the value of argp or remove that particular file. I am not sure what I am doing wrong. Please help me.
argp is a pointer to a pointer to char, and **argp dereferences it twice, yielding a single char where printf and remove expect a char *. Try changing your code to:
printf("%s", *argp);
You would also need to change your remove call to:
remove(*argp);
I always found drawing pictures helped understand pointers. Use boxes for memory addresses and a label for the box is the variable name. If the variable is a pointer, then the contents of the box is the address of another box (draw line to the other box).
You are using pointers when you don't need to. Your "deletefile_1_svc" function doesn't modify the value of "argp" at all, so it doesn't need a pointer-to-pointer. Plus, your "result" doesn't need to be returned as a pointer since it is simply a numeric value. You also don't explicitly initialize result (as a static it starts at zero) or reset it on each call (it is static, so it remembers the last value assigned to it).
int
deletefile_1_svc(const char *argp, struct svc_req *rqstp)
{
int result = 0; /* Initial value => failure */
if (remove (argp) == 0)
{
result = 1; /* 1 => success */
}
return result;
}
To call the function use:
result_5 = deletefile_1_svc(fileName, clnt);
if (result_5 == 0)
// Failed
else
// Success
That will make the code simpler and less prone to bugs.

How can libxml2 be used to parse data from XML?

I have looked around at the libxml2 code samples and I am confused on how to piece them all together.
What are the steps needed when using libxml2 to just parse or extract data from an XML file?
I would like to get hold of, and possibly store information for, certain attributes. How is this done?
I believe you first need to create a Parse tree. Maybe this article can help, look through the section which says How to Parse a Tree with Libxml2.
libxml2 provides various examples showing basic usage.
http://xmlsoft.org/examples/index.html
For your stated goals, tree1.c would probably be most relevant.
> tree1.c: Navigates a tree to print element names
> Parse a file to a tree, use xmlDocGetRootElement() to get the root element, then walk the document and print all the element name in document order.
http://xmlsoft.org/examples/tree1.c
Once you have an xmlNode struct for an element, the "properties" member is a linked list of attributes. Each xmlAttr object has a "name" and "children" object (which are the name/value for that attribute, respectively), and a "next" member which points to the next attribute (or null for the last one).
http://xmlsoft.org/html/libxml-tree.html#xmlNode
http://xmlsoft.org/html/libxml-tree.html#xmlAttr
I found these two resources helpful when I was learning to use libxml2 to build an RSS feed parser.
Tutorial with SAX interface
Tutorial using the DOM Tree (code example for getting an attribute value included)
Here is the complete process for extracting XML/HTML data from a file on the Windows platform.
First, download the pre-compiled .dll from http://xmlsoft.org/sources/win32/
Also download its dependencies, iconv.dll and zlib1.dll, from the same page
Extract all .zip files into the same directory, e.g. D:\demo\
Copy iconv.dll, zlib1.dll and libxml2.dll into the c:\windows\system32 directory
Create a libxml_test.cpp file and copy the following code into it.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <libxml/HTMLparser.h>
void traverse_dom_trees(xmlNode * a_node)
{
xmlNode *cur_node = NULL;
if(NULL == a_node)
{
//printf("Invalid argument a_node %p\n", a_node);
return;
}
for (cur_node = a_node; cur_node; cur_node = cur_node->next)
{
if (cur_node->type == XML_ELEMENT_NODE)
{
/* Check for if current node should be exclude or not */
printf("Node type: Element, name: %s\n", cur_node->name);
}
else if(cur_node->type == XML_TEXT_NODE)
{
/* Process the text node here; strlen() returns size_t, so cast it for %d */
printf("node type: Text, node content: %s, content length %d\n", (char *)cur_node->content, (int)strlen((char *)cur_node->content));
}
traverse_dom_trees(cur_node->children);
}
}
int main(int argc, char **argv)
{
htmlDocPtr doc;
xmlNode *roo_element = NULL;
if (argc != 2)
{
printf("\nInvalid argument\n");
return(1);
}
/* Macro to check API for match with the DLL we are using */
LIBXML_TEST_VERSION
doc = htmlReadFile(argv[1], NULL, HTML_PARSE_NOBLANKS | HTML_PARSE_NOERROR | HTML_PARSE_NOWARNING | HTML_PARSE_NONET);
if (doc == NULL)
{
fprintf(stderr, "Document not parsed successfully.\n");
return 0;
}
roo_element = xmlDocGetRootElement(doc);
if (roo_element == NULL)
{
fprintf(stderr, "empty document\n");
xmlFreeDoc(doc);
return 0;
}
printf("Root Node is %s\n", roo_element->name);
traverse_dom_trees(roo_element);
xmlFreeDoc(doc); // free document
xmlCleanupParser(); // Free globals
return 0;
}
Open the Visual Studio Command Prompt
Go to the D:\demo directory
Execute the command cl libxml_test.cpp /I".\libxml2-2.7.8.win32\include" /I".\iconv-1.9.2.win32\include" /link libxml2-2.7.8.win32\lib\libxml2.lib
Run the binary using the command libxml_test.exe test.html (here test.html may be any valid HTML file)
You can refer to this answer.
There, they store the data in a structure and use it further by passing the structure's address to a function.
You can find detailed C code there.

Resources