Appending a big number of nodes to an xml tree - c

I'm using libxml via C, to work with xml file creation and parsing. Until recently everything worked smoothly, but a case emerged where a single tree has a subnode, lets call it S, with approximately 200,000 children. This case works surprisingly slow and I suspect the function :
xmlNewChild(/**/);
which I'm using to build the tree, has to iterate over every child of S to add one more child. Since a function that also accepts a hint (a pointer to the last added function) doesn't seem to exist, is there a better way to build the tree (maybe a batch build method) ? In case such numbers are insignificant and I should search for deficiencies elsewhere, please let me know.

Yeah, rather than keeping the entire XML in memory with xmlTree, you may want to use a combination of libxml's xmlReader and xmlWriter APIs. They're both streaming, so it won't have to keep the entire document in memory and won't have any scaling problems based on the number of elements.
Examples of both xmlReader and xmlWriter can be found here:
http://www.xmlsoft.org/examples/index.html

Related

I have MDLAsset created from an SCNScene. How do I extract MDLMeshs, MDLCamera(s), and MDLLights?

I am struggling trying to traverse an MDLAsset instance created by loading an SCNScene file (.scn).
I want to identify and extract the MDLMeshs as well as camera(s) and lights. I see no direct way to do that.
For example I see this instance method on MDLAsset:
func childObjects(of objectClass: Swift.AnyClass) -> [MDLObject]
Is this what I use?
I have carefully labeled things in the SceneKit modeler. Can I not refer to those which would be ideal. Surely, there is a dictionary of ids/labels that I can get access to. What am I missing here?
UPDATE 0
I had to resort to pouring over the scene graph in the Xcode debugger due to the complete lack of Apple documentation. Sigh ...
A few things. I see the MDLMesh and MDLSubmesh that is what I am after. What is the traversal approach to get it? Similarly for lights, and camera.
I also need to know the layout of the vertex descriptors so I can sync with my shaders. Can I force a specifc vertex layout on the parsed SCNScene?
MDLObject has a name (because of its conformance to the MDLNamed protocol), and also a path, which is the slash-separated concatenation of the names of its ancestors, but unfortunately, these don't contain the names of their SceneKit counterparts.
If you know you need to iterate through the entire hierarchy of an asset, you may be better off explicitly recursing through it yourself (by first iterating over the top-level objects of the asset, then recursively enumerating their children), since using childObjects(of:) repeatedly will wind up internally iterating over the entire hierarchy to collect all the objects of the specified type.
Beware that even though MDLAsset and MDLObjectContainerComponent conform to NSFastEnumeration, enumerating over them in Swift can be a little painful, and you might want to manually extend them to conform to Sequence to make your work a little easier.
To get all cameras,
[asset childObjectsOfClass:[MDLCamera class]]
Similarly, to get all MDLObjects,
[asset childObjectsOfClass:[MDLObjects class]]
Etc.
MDLSubmeshes aren't MDLObjects, so you traverse those on the MDLMesh.
There presently isn't a way to impose a vertex descriptor on MDL objects created from SCN objects, but that would be useful.
One thing you can do is to impose a new vertex descriptor on an existing MDL object by setting a mesh's vertexDescriptor property. See the MDLMesh.h header for some discussion.

Solve maze using queue data structure?

I'm taking a class in data structures and was given the assignment to find the shortest path through a maze using C and implementing the queue data structure. However, I can't really wrap my head around how to use a queue here.
I know the idea is to count every possible move from the start position, and when you hit the target, you're supposed to trace back to the initial position. This is what I don't understand. Because if I use a queue and delete all the moves that leads up to the target, I have no data to use to do the trace back, and if I don't delete the moves that lead to the target (i.e. saving all the possible moves and deleting them when I actually do the trace back), I might as well be using a stack.
I know there's something I don't quite get, but I can't figure out what it is. How would I utilize the queue data structure in this case?
What your professor is trying to get you to use is called "breadth-first search". The queue comes in for deciding which spaces to explore next. When you are looking at the possible paths to take, you enqueue all the paths you have yet to explore. Instead of continuing down the path you're on (which would be "depth-first search"), you dequeue the next spot you need to check, which will take you back to one of the positions you were considering earlier.
The actual implementation is up to you, I'd recommend looking for examples of breadth-first search online.

Generate MEL to rebuild a model in Maya

I made a 3D tree in Maya and need MEL code to run in a for loop and generate as many trees as I want. Is there any way to convert a model made in Maya into MEL code that would rebuild the tree?
I can't just duplicate it because the script needs to generate the tree from scratch. Unfortunately, I cleared my history because it was messy, so I am looking for a way to generate the MEL code given just the geometry.
It is slightly unfortunate that you deleted the history. If you had not then you could have just built a script out if the history tree. But all is not lost, still i will explain how you can do this in case you do remake the history at some point.
The complexity of your history is never really a issue, at least when modeling. I would advice to not delete any history until your sure its of no use. Having a 1000 node history is not in any way slowing you down since most of it is inactive. Its only a problem once you enter animation even then its not necessarily a problem at all. Since the history deletion is a bit destructive its a good idea to save the scene when you are deleting big areas of history. So rule of thumb do not delete history and use the freeze transformation operation unless its really necessary (and it may be you never actually do this).
Another side remark: Copying stuff is still valid, unless like in this case there is a artificial limitation. The only time when this limitation is valid is if this is homework (in which case you may want to demonstrate that your willing to follow instruction). Doing a copy does not detract in any way the end result. Copying is mel (and all maya ascii files are mel of sorts).
Method 1: (when history is cleared)
You can reduce the problem to a normal rigging operation. What you do is you rig the tree that you built with bones, clusters and possibly blend shapes. Then just randomize the channels of your rig on each copy operation. The neat thing here is that the script can be easily extended with different base trees, or just about anything, just by making new separate rigs.
The bonus here is that you could now also easily animate the tree being blown by the wind etc.
This might feel like something outside the scope of the assignment. However a good MEL programmer does not really separate tasks. Using a node to do something is just as valid, if not more valid, as writing everything in code. Something like this demonstrates really thorough understanding of maya use and maya programming. On the other hand not all users are as enlightened (most are not).
Everything in Maya is or at least should be a rig. (the auxiliary to this is that your mel should strive to build a rig or you should use API to make nodes)
Method 2: (when history is cleared)
Chunk your result into pieces. Then randomize the chunk positions with L-system like rule based generator. It may seem counter intuitive to use a L-system with hand modeled pieces when the entire thing could be generated with a L-system.
But clunking allows for a very simple l-system to be built. The end result also retains the artistic integrity of the original in some ways that might be much more pleasing.
A slightly different version of this and method 3 is to make 2-3 overlapping set of branches in the model and then randomly delete branches for variation.
Method 3: (when history is cleared)
Cheat, well i would actually argue that in CG there's no such thing as cheating. Just copy the same tree about, alter its rotation ans scale. And randomize new shaders to your model (swap different leaf textures different color etc). When combined with even slight amount of rigginng approach can totally hide the fact that there is just one tree. Copies dont have to be all that exact they might be shaped with a lattice or something. This is actually pretty efficient, most people wouldn't notice.
Method 4: (when history is not cleared)
When you model something manually maya records this quite efficiently. The history can be turned into a script with little or no effort. The best kept secret. If you save this kind of scene as maya ascii then the maya ascii file is (almost) just mel and you can repeat this by adding variables to the stream to instrument it to mel. Beware tough that not all tools build meaningful history so point tweaks should be done on clusters instead on manual point tweaking that ends up in the shapes tweak array.
It is also possible to build the mel automatically with a mel script that checks for connections and values in nodes to reproduce the thing in code. The good thing is that you can readily introduce the instrumentation. Here's a really old version of code as giveaway (you need to handle the shaders and inputComponents for example):
/* jooConvertNodeNetworkToMelLadder.mel 0.0
Authors: Janne 'Joojaa' Ojala
Testing: Has known bugs, code deprecated, no bug reporting
Licence: Creative Commons Attribution-ShareAlike 3.0 Unported
About:
A deprecated early version of code generation it has some bugs and
is nowhere near perfect but you can have this code for the heck of
it. Main thing missing is that it does not check for selection
sets for nodes so you need to make those manually.
Install Instructions:
copy jooConvertNodeNetworkToMelLadder.mel to your maya script
directory
Usage:
Select node and run jooConvertNodeNetworkToMelLadder from mel
commandline.
Deprecated at 31.12.2006
*/
proc string generateAttrib(string $node, string $createNode,
string $attrib,string $varName){
string $return="";
if (getAttr($node + "." + $attrib) !=
getAttr($createNode + "." + $attrib))
$return = (" setAttr (" +
$varName + "+\"." + $attrib+"\") "+
getAttr($node + "." + $attrib) + ";\n");
return $return;
}
proc string doOneNode(string $nodeOrig,string $connections[]){
string $hist[]=`listConnections -c 1 -p 1 $nodeOrig`;
string $return="";
$type=`nodeType $nodeOrig`;
$varName=("$"+$nodeOrig);
$createNode=`createNode $type`;
$return=(" "+$varName+"=`createNode -n "+$nodeOrig+" "+$type+"`;\n");
if ($type != "mesh") {
for ($attrib in `listAttr -settable -multi -w -scalar $createNode`){
$return += generateAttrib($nodeOrig, $createNode,
$attrib, $varName);
}
}
delete $createNode;
for ($i=0;$i< size($hist);$i=$i+2){
if (size(`connectionInfo -dfs $hist[$i]`)){
string $node,$port,$type,$nodeOrig,$portOrig;
$node=`match "[^.]*" $hist[$i+1]`;
$port=`match "[.].*$" $hist[$i+1]`;
$nodeOrig=`match "[^.]*" $hist[$i]`;
$portOrig=`match "[.].*$" $hist[$i]`;
$connections[size($connections)]=
(" connectAttr -f ($" +
$nodeOrig + "+\"" + $portOrig + "\") ($" +
$node+"+\""+$port+"\");");
}
}
return $return;
}
global proc jooConvertNodeNetworkToMelLadder()
{
print "\n// jooConvertNodeNetworkToMelLadder result:\n{\n";
string $a[]={};
for ($node in `listHistory`)
print(`doOneNode $node $a`);
print $a;
print "}\n";
}
But this is all I have time for, hope this helps.

Read & Process in memory XML data in a streaming manner in C

Original question below, update regarding solution, if someone has a similar problem:
For a fast regex I found http://re2c.org/ ; for xml parsing http://expat.sourceforge.net/
Is there an xml library I can use to parse xml from memory (and not from file) in a streaming manner in c?
Currently I have:
libxml2 ; XMLReader seems to only be possible to use with a filehandle and not in-memory
rapidxml is c++ and does not seem to expose a c interface
Requirements:
I need to process the individual xml nodes without having the whole xml (400GB uncompressed, and "only" 29GB as original .bz2 file) in memory ( bzip'd file gets read in and decompressed piecewise, and I would pass those uncompressed pieces to be consumed by the xml parser )
It does not need to very fast, but I would prefer an efficient solution
I (most probably) don't need the path of an extracted node, so it would be fine to just discard them as soon as they have been processed by my callback (if I would need the path contrary to what I think right now, I could then still track it myself)
This is part of me trying to solve my own problem posted here (and no, it's not the same question): How to efficiently parse large bz2 xml file in C
Ideally I'd like to be able to feed the library a certain amount of bytes at a time and have a function called whenever a node is completed.
Thank you very much
Here's some pseudo c code (way shorter than actual c code) for a better understanding
// extracted data gets put here
strm.next_out = buffer_ptr;
while( bytes_processed_total < filesize ) {
// extracts up to amount of data set in strm.avail_in
BZ2_bzDecompress( strm );
bytes_processed = strm.next_out - buffer_ptr;
bytes_processed_total += bytes_processed;
// here I would like to pass bytes_processed of buffer_ptr to xmlreader
}
About the data I want to parse: http://wiki.openstreetmap.org/wiki/OSM_XML
At the moment I only need certain <node ...> nodes from this, which have subnode <tag k="place" v="country|county|city|town|village"> (the '|' means at least one of those in this context, in the file it's of course only "country" etc without the '|')
xmlReaderForMemory from libxml2 seems a good one to me (but haven't used it so, I may be wrong)
the char * buffer needs to point to a valid XML document (that can be a part of your entire XML file). This can be extracted reading in chuncks your file but obtaining a valid XML fragment.
What's the structure of your XML file ? A root containing subsequent similar nodes or a fully fledged tree ?
If I had an XML like this:
<root>
<node>...</node>
<node>...</node>
<node>...</node>
</root>
I'd read starting from the opening <node> till the closing </node> and then parse it with the xmlReaderForMemory function, do what I need to do, then go on with the next <node> node.
Ofc if your <node> content is too complex/long, you may have to go deep some levels:
<node>
<subnode>....</subnode>
<subnode>....</subnode>
<subnode>....</subnode>
<subnode>....</subnode>
</node>
And read from the file until you have the entire <subnode> node (but keeping track that you're in a <node>.
I know it's ugly, but is a viable way. Or you can try to use a sax parser (dunno if some C implementation exists).
Sax parsing fires events on each node start and node end, so you can do nothing untill you find your nodes and process just them.
Another viable way can be using some external tools to filter the whole XML (XQuery or XPath processors) in order to extract just your interesting nodes from the whole file, obtain a smaller doc and then work on it.
EDIT: Zorba was a good XQuery framework, with command line preprocessor, may be a good place to look at
EDIT2: well since you have this dimensions, one alternative solution can be manage the file as a text file, so read and uncompress in chunks and then matching something like:
<yourNode>.*</yourNode>
with regexp.
If you're on a Linux/Unix you should have POSIX regexp library. Check
this question on S.O. for further insights.

Eclipse - How do I view java arrays / collections better in debugger

Viewing/Searching java arrays and collections in the Eclipse Java debugger is tedious and time-consuming.
I tried this promising plugin (in alpha as of Aug 2012)
http://www.cvast.tuwien.ac.at/projects/visualdebugging/ArrayExplorer
But it freezes Eclipse for simple arrays beyond a few hundred elements.
I do use Detail formatters, but that still needs clicking on each element to see the values.
Are there any better ways to view this array/collection data?
Use the 'Expressions' tab.
There you can type in any number of expressions and have them evaluated in the current scope.
ie: collection.size(), collection.getValueAt(i), ect...
Eclipse > Preferences > Java > Debug >Detail Formatter
This may be close to what you are looking for. It is another tedious work to setup but once done you can see the value of objects in Expressions window.
Here is link to start
override toString method of your class and you will be able to see what you want to see. i'm attaching example to show you exactly that.
Even though i could not find a way to see them in nice table/array, i found a halfway workaround.
The solution is to define a static method in a throwaway class that takes the array as input and returns a string of concatenated values that one wants to quickly glance at. it could include the array index and newlines to view results formatted nicely. It can be fine tuned to print out only certain array indices to reduce clutter.
This static method can then be used in the watch area.

Resources