XQuery - Help Needed - union

I am supposed to write an XQuery-based SQL script to merge data from 3 different XML files into one unified format.
The output should be of the following format:
<Courses>
<Course school="NYU">
<Number>30144</Number>
<Title>C‐PAC II</Title>
<Instructor>Lewis</Instructor>
</Course>
...
<Course school="Harvard">
<Number>4949</Number>
<Title>Computer Science 50. Introduction to Computer Science I</Title>
<Instructor>Michael D. Smith</Instructor>
</Course>
...
</Courses>
I had written the following script to achieve the above:
SELECT catalog.query('
<Courses>
{
(<Course school = "NYU">
{
for $x in (/nyu/Course)
return
<Number> {$x/CallNo/text()} </Number>
<Title> {$x/Name/text()} </Title>
<Instructor> {$x/Instructor/text()} </Instructor>
}
</Course>)
union
(<Course school = "Harvard">
{
for $y in (/harvard/Course)
return
<Number> {$y/Number/text()} </Number>
<Title> {$y/Title/text()} </Title>
<Instructor> {$y/Instructor/text()} </Instructor>
}
</Course>)
union
(<Course scool = "Umich">
{
for $z in (/umich/Course)
return
<Number> {$z/@catalognumber} </Number>
<Title> {$z/name/text()} </Title>
}
</Course>)
}
</Courses>
')
from catalogs
Can anyone please tell me where I have gone wrong?

XQuery operators work on sequences of nodes, not on files. In standard XQuery the syntax would be something like this:
doc("umich.xml")/umich/Course union doc("harvard.xml")/harvard/Course union doc("nyu.xml")/nyu/Course
Use the doc function to return nodes from files, and the union operator (lowercase, since XQuery keywords are case-sensitive) to combine them into the result sequence.
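The target format has one Course element per source course, so the Course element has to be built inside the loop over courses rather than once per school. As a rough illustration of that shape outside XQuery, here is a minimal Python sketch. The three XML strings are hypothetical stand-ins for the real files (the Umich values in particular are invented), and add_course and merge are illustrative names, not part of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the three source files; element names follow the question.
NYU = ("<nyu><Course><CallNo>30144</CallNo><Name>C-PAC II</Name>"
       "<Instructor>Lewis</Instructor></Course></nyu>")
HARVARD = ("<harvard><Course><Number>4949</Number>"
           "<Title>Computer Science 50. Introduction to Computer Science I</Title>"
           "<Instructor>Michael D. Smith</Instructor></Course></harvard>")
UMICH = '<umich><Course catalognumber="101"><name>Sample Course</name></Course></umich>'


def add_course(parent, school, number, title, instructor=None):
    """Append one unified <Course> element to parent."""
    course = ET.SubElement(parent, "Course", school=school)
    ET.SubElement(course, "Number").text = number
    ET.SubElement(course, "Title").text = title
    if instructor is not None:
        ET.SubElement(course, "Instructor").text = instructor


def merge(nyu_xml, harvard_xml, umich_xml):
    """Build <Courses> with one <Course> per source course, school by school."""
    courses = ET.Element("Courses")
    for c in ET.fromstring(nyu_xml).iter("Course"):
        add_course(courses, "NYU", c.findtext("CallNo"),
                   c.findtext("Name"), c.findtext("Instructor"))
    for c in ET.fromstring(harvard_xml).iter("Course"):
        add_course(courses, "Harvard", c.findtext("Number"),
                   c.findtext("Title"), c.findtext("Instructor"))
    for c in ET.fromstring(umich_xml).iter("Course"):
        # Umich keeps the number in an attribute and has no instructor element.
        add_course(courses, "Umich", c.get("catalognumber"), c.findtext("name"))
    return courses


merged = merge(NYU, HARVARD, UMICH)
print(ET.tostring(merged, encoding="unicode"))
```

Concatenating the three per-school loops plays the role of the union in the answer above; each loop only maps that school's element names onto the shared Number/Title/Instructor layout.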

Related

groovy XPath with multiple conditions and loops

New guy here.
I need your opinion on this scenario: a dynamic unit-of-measure conversion based on the XML below. The Groovy program should calculate the EachesConversion element by evaluating every CorrespondingQuantity and Quantity field.
e.g. 1 XPX = 126 XCS and 1 XCS = 12 EA, so 1 XPX = 1512 EA
Please advise.
<QuantityConversion>
<Quantity unitCode="XCS">1.0</Quantity>
<CorrespondingQuantity unitCode="EA">12.0</CorrespondingQuantity>
***<EachesConversion>12.0</EachesConversion>***
</QuantityConversion>
<QuantityConversion>
<Quantity unitCode="XPX">1.0</Quantity>
<CorrespondingQuantity unitCode="XCS">126.0</CorrespondingQuantity>
***<EachesConversion>1512</EachesConversion>***
</QuantityConversion>
This is what I tried:
How do I use a condition in the find?
findAll { it.CorrespondingQuantity.@unitCode == 'EA' && it.Quantity.@unitCode == 'XCS' }*.value()
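One way to think about the calculation is as a chain: each QuantityConversion says how many of the corresponding unit fit into one source unit, and the chain is followed until it reaches EA. The sketch below is in Python rather than Groovy, purely to illustrate the arithmetic. The wrapping Conversions root element and the function name eaches_per_unit are assumptions made for the example, and the EachesConversion elements are left out because they are the values being computed.

```python
import xml.etree.ElementTree as ET

# The question's data, wrapped in a hypothetical <Conversions> root so it parses.
SAMPLE = """<Conversions>
  <QuantityConversion>
    <Quantity unitCode="XCS">1.0</Quantity>
    <CorrespondingQuantity unitCode="EA">12.0</CorrespondingQuantity>
  </QuantityConversion>
  <QuantityConversion>
    <Quantity unitCode="XPX">1.0</Quantity>
    <CorrespondingQuantity unitCode="XCS">126.0</CorrespondingQuantity>
  </QuantityConversion>
</Conversions>"""


def eaches_per_unit(root, base="EA"):
    """Return {unit: number of base units in one unit} by chaining conversions."""
    # unit -> (corresponding unit, corresponding units per one source unit)
    steps = {}
    for conv in root.iter("QuantityConversion"):
        qty = conv.find("Quantity")
        corr = conv.find("CorrespondingQuantity")
        factor = float(corr.text) / float(qty.text)
        steps[qty.get("unitCode")] = (corr.get("unitCode"), factor)

    def resolve(unit, seen=()):
        if unit == base:
            return 1.0
        if unit in seen or unit not in steps:
            raise ValueError("no conversion path from %s to %s" % (unit, base))
        target, factor = steps[unit]
        return factor * resolve(target, seen + (unit,))

    return {unit: resolve(unit) for unit in steps}


result = eaches_per_unit(ET.fromstring(SAMPLE))
print(result)  # {'XCS': 12.0, 'XPX': 1512.0}
```

The seen tuple only guards against circular definitions; with the two conversions shown it is never triggered.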

Ruby: Extract elements from deeply nested JSON structure based on criteria

I want to extract every marketId from every market whose marketName is 'Moneyline'. I tried a few combinations of .map, .reject, and/or .select but can't narrow it down, as the complicated structure is confusing me.
There are many markets in events, and there are many events as well. A sample of the structure (tried to edit it for brevity):
{"currencyCode"=>"GBP",
"eventTypes"=>[
{"eventTypeId"=>6423,
"eventNodes"=>[
{"eventId"=>28017227,
"event"=>
{"eventName"=>"Philadelphia @ Seattle"
},
"marketNodes"=>[
{"marketId"=>"1.128274650",
"description"=>
{"marketName"=>"Moneyline"}
},
{"marketId"=>"1.128274625",
"description"=>
{"marketName"=>"Winning Margin"}
}}}]},
{"eventId"=>28018251,
"event"=>
{"eventName"=>"Arkansas @ Mississippi State"
},
"marketNodes"=>[
{"marketId"=>"1.128299882",
"description"=>
{"marketName"=>"Under/Over 60.5pts"}
},
{"marketId"=>"1.128299881",
"description"=>
{"marketName"=>"Moneyline"}
}}}]},
{"eventId"=> etc....
Tried all kinds of things, for example,
markets = json["eventTypes"].first["eventNodes"].map {|e| e["marketNodes"].map { |e| e["marketId"] } if (e["marketNodes"].map {|e| e["marketName"] == 'Moneyline'})}
markets.flatten
# => yields every marketId not every marketId with marketName of 'Moneyline'
Getting a simple array with every marketId from Moneyline markets with no other information is sufficient. Using Rails methods is fine too if preferred.
Sorry if my editing messed up the syntax. Here's the source. It looks like this only with => instead of : after parsing the JSON.
Thank you!
I love nested maps and selects :D
require 'json'
hash = JSON.parse(File.read('data.json'))
moneyline_market_ids = hash["eventTypes"].map{|type|
type["eventNodes"].map{|node|
node["marketNodes"].select{|market|
market["description"]["marketName"] == 'Moneyline'
}.map{|market| market["marketId"]}
}
}.flatten
puts moneyline_market_ids.join(', ')
#=> 1.128255531, 1.128272164, 1.128255516, 1.128272159, 1.128278718, 1.128272176, 1.128272174, 1.128272169, 1.128272148, 1.128272146, 1.128255464, 1.128255448, 1.128272157, 1.128272155, 1.128255499, 1.128272153, 1.128255484, 1.128272150, 1.128255748, 1.128272185, 1.128278720, 1.128272183, 1.128272178, 1.128255729, 1.128360712, 1.128255371, 1.128255433, 1.128255418, 1.128255403, 1.128255387
Just for fun, here's another possible answer, this time with regexen. It is shorter, but it relies on marketId and marketName always appearing in matching order, so it might break depending on your input data. It reads the JSON data directly as a String:
json = File.read('data.json')
market_ids = json.scan(/(?<="marketId":")[\d\.]+/)
market_names = json.scan(/(?<="marketName":")[^"]+/)
moneyline_market_ids = market_ids.zip(market_names).select{|id,name| name=="Moneyline"}.map{|id,_| id}
puts moneyline_market_ids.join(', ')
#=> 1.128255531, 1.128272164, 1.128255516, 1.128272159, 1.128278718, 1.128272176, 1.128272174, 1.128272169, 1.128272148, 1.128272146, 1.128255464, 1.128255448, 1.128272157, 1.128272155, 1.128255499, 1.128272153, 1.128255484, 1.128272150, 1.128255748, 1.128272185, 1.128278720, 1.128272183, 1.128272178, 1.128255729, 1.128360712, 1.128255371, 1.128255433, 1.128255418, 1.128255403, 1.128255387
It outputs the same result as the other answer.
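The nested traversal can be checked against the two sample events shown in the question, where the expected result is easy to read off by eye. The sketch below does that in Python instead of Ruby, as an illustration only. The data literal is the question's excerpt with its brackets balanced, and moneyline_ids is a name made up for this example.

```python
# The two events from the question's excerpt, with brackets balanced.
data = {
    "currencyCode": "GBP",
    "eventTypes": [
        {"eventTypeId": 6423,
         "eventNodes": [
             {"eventId": 28017227,
              "event": {"eventName": "Philadelphia @ Seattle"},
              "marketNodes": [
                  {"marketId": "1.128274650",
                   "description": {"marketName": "Moneyline"}},
                  {"marketId": "1.128274625",
                   "description": {"marketName": "Winning Margin"}},
              ]},
             {"eventId": 28018251,
              "event": {"eventName": "Arkansas @ Mississippi State"},
              "marketNodes": [
                  {"marketId": "1.128299882",
                   "description": {"marketName": "Under/Over 60.5pts"}},
                  {"marketId": "1.128299881",
                   "description": {"marketName": "Moneyline"}},
              ]},
         ]},
    ],
}


def moneyline_ids(payload, name="Moneyline"):
    """Walk eventTypes -> eventNodes -> marketNodes and keep matching ids."""
    return [
        market["marketId"]
        for event_type in payload["eventTypes"]
        for node in event_type["eventNodes"]
        for market in node["marketNodes"]
        if market["description"]["marketName"] == name
    ]


ids = moneyline_ids(data)
print(", ".join(ids))  # 1.128274650, 1.128299881
```

The filter looks at market["description"]["marketName"], which is the step the attempt in the question skipped: there the name was looked up directly on the market node, one level too high.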

StAX Parser : Duplicated Node name and specific comments

I'm trying to parse an XML file with the StAX parser, but I face two problems:
First: two nodes have the same name.
Second: I need to read the exact comment that precedes the values.
<database>
<!-- 2015-03-10 01:29:00 EET / 130 --> <row><v> 2.74 </v><v> 1.63 </v></row>
<!-- 2015-03-10 01:30:00 EET / 170 --> <row><v> 5.33 </v><v> 1.68 </v></row>
<!-- 2015-03-10 01:31:00 EET / 180 --> <row><v> 7.62 </v><v> 1.83 </v></row>
</database>
I want to collect the data like that:
Date:2015-03-10 01:29:00
V1: 2.74
V2:1.63
I was using a DOM parser before, and it was easy to deal with duplicate node names and comments. Unfortunately I have to use StAX now, and I don't know how to solve those problems :(
The first issue: two nodes have the same name
<v> 2.74 </v><v> 1.63 </v>
There is no issue with StAX, if you follow the events you will get in order:
startElement ( v )
characters ( 2.74 )
endElement ( v )
startElement ( v )
characters ( 1.63 )
endElement ( v )
So it is up to you to keep a minimum of context information in your code, so that you know whether you are starting a <v> element for the first or the second time.
The second issue: read the comments
There is no issue here either: StAX parsing triggers events for comments as well, so you can simply get the comment as a String through the API and extract the expected value yourself, for instance:
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
XMLStreamReader streamReader = inputFactory.createXMLStreamReader(inputStream);
while (streamReader.hasNext()) {
int event = streamReader.next();
if(event == XMLStreamConstants.COMMENT) {
String aDateStringVal = streamReader.getText();
// + extract your date value from the comment string
}
}
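Both points come down to carrying a little state between events: remember the last comment seen, and count the v elements inside the current row. The sketch below shows that bookkeeping in Python with the standard library's pull parser rather than StAX, so it illustrates the pattern and is not Java code to copy. It assumes Python 3.8 or later (where comment events are available) and a properly closed database element; read_rows and the DATE pattern are names invented for the example.

```python
import re
import xml.etree.ElementTree as ET

SAMPLE = """<database>
<!-- 2015-03-10 01:29:00 EET / 130 --> <row><v> 2.74 </v><v> 1.63 </v></row>
<!-- 2015-03-10 01:30:00 EET / 170 --> <row><v> 5.33 </v><v> 1.68 </v></row>
<!-- 2015-03-10 01:31:00 EET / 180 --> <row><v> 7.62 </v><v> 1.83 </v></row>
</database>"""

# Assumed comment layout: the timestamp comes first, as in the sample.
DATE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def read_rows(xml_text):
    """Return [(date, [v1, v2, ...]), ...], pairing each row with the comment before it."""
    parser = ET.XMLPullParser(events=("start", "end", "comment"))
    parser.feed(xml_text)
    parser.close()

    rows, date, values = [], None, []
    for event, node in parser.read_events():
        if event == "comment":
            match = DATE.search(node.text)
            date = match.group(0) if match else None
        elif event == "start" and node.tag == "row":
            values = []  # a new row starts: reset the per-row context
        elif event == "end" and node.tag == "v":
            values.append(float(node.text))  # position in the list tells V1 from V2
        elif event == "end" and node.tag == "row":
            rows.append((date, values))
    return rows


rows = read_rows(SAMPLE)
for date, values in rows:
    print("Date:", date)
    for index, value in enumerate(values, start=1):
        print("V%d: %s" % (index, value))
```

In the Java loop above, the same state would live in a couple of local variables updated in the COMMENT, START_ELEMENT and END_ELEMENT branches.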

Extract Text from Array - perl

I am trying to extract the Interface from an array created from an SNMP Query.
I want to create an array like THIS:
my @array = ( "Gig 11/8",
"Gig 10/1",
"Gig 10/4",
"Gig 10/2");
It currently looks like THIS:
my @array =
( "orem-g13ap-01 Gig 11/8 166 T AIR-LAP11 Gig 0",
"orem-g15ap-06 Gig 10/1 127 T AIR-LAP11 Gig 0",
"orem-g15ap-05 Gig 10/4 168 T AIR-LAP11 Gig 0",
"orem-g13ap-03 Gig 10/2 132 T AIR-LAP11 Gig 0");
I am doing THIS:
foreach $ints (@array) {
@gig = substr("$ints", 17, 9);
print("Interface: @gig");
}
Sure it works, but the hostname [orem-g15ap-01] doesn't always stay the same length, it varies depending on the site. I need to extract the word "Gig" plus the next 6 characters. I have no idea what is the best way of doing this.
I am a novice at Perl, but I am trying. Thanks.
# "I need to extract the word "Gig" plus the next 6 characters."
# This looks like a fixed-width file format, so consider using unpack.
foreach ( @lines ) {
my( $orem, $gig, $rest ) = unpack 'a17 a9 a*';
print "[$gig]\n";
}
If it's not a fixed-width format, then you need to find out what the file spec is and then maybe use a regular expression, something like:
my( $orem, $gig, $rest ) = m/(\S+)\s+(.{9})(.*)/;
But this will not work in the general case without a proper file spec.
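Stuff like that is what Perl is made for. Regular expressions are the way to go. Read perldoc perlre.
foreach my $ints (@array) {
    # capture "Gig" plus the next 6 characters (a substitution of the match with itself would change nothing)
    my ($gig) = $ints =~ /(Gig.{6})/;
    print "Interface: $gig\n" if defined $gig;
}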
So you want the second and third field.
my @array = map { /^\S+\s+(\S+\s\S+)/s } @source;
This one is like ikegami's, but if you know what the thing you want looks like, then by all means specify that. Because the match runs in list context, any string that does not match the spec returns an empty list, so it is simply dropped from the result.
my @results = map { m!(\bGig\s+\d+/\d+)! } @array;
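The pattern in that last answer can be tried against the four sample lines from the question. The sketch below does so in Python rather than Perl, only to show what the pattern selects; interfaces is a hypothetical helper name. Note that the trailing "Gig 0" on each line is not picked up, because the pattern requires a slash between two numbers.

```python
import re

lines = [
    "orem-g13ap-01 Gig 11/8 166 T AIR-LAP11 Gig 0",
    "orem-g15ap-06 Gig 10/1 127 T AIR-LAP11 Gig 0",
    "orem-g15ap-05 Gig 10/4 168 T AIR-LAP11 Gig 0",
    "orem-g13ap-03 Gig 10/2 132 T AIR-LAP11 Gig 0",
]

# "Gig", whitespace, then slot/port; independent of the hostname's length.
INTERFACE = re.compile(r"\bGig\s+\d+/\d+")


def interfaces(rows):
    """Return the first interface found on each line, skipping lines without one."""
    found = []
    for row in rows:
        match = INTERFACE.search(row)
        if match:
            found.append(match.group(0))
    return found


result = interfaces(lines)
print(result)  # ['Gig 11/8', 'Gig 10/1', 'Gig 10/4', 'Gig 10/2']
```

Describing the shape of the interface name, instead of counting characters, is what makes the extraction survive hostnames of different lengths.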

ANTLR3 C Target - parser return 'misses' out root element

I'm trying to use the ANTLR3 C Target to make sense of an AST, but am running into some difficulties.
I have a simple SQL-like grammar file:
grammar sql;
options
{
language = C;
output=AST;
ASTLabelType=pANTLR3_BASE_TREE;
}
sql : VERB fields;
fields : FIELD (',' FIELD)*;
VERB : 'SELECT' | 'UPDATE' | 'INSERT';
FIELD : CHAR+;
fragment
CHAR : 'a'..'z';
and this works as expected within ANTLRWorks.
In my C code I have:
const char pInput[] = "SELECT one,two,three";
pANTLR3_INPUT_STREAM pNewStrm = antlr3NewAsciiStringInPlaceStream((pANTLR3_UINT8) pInput,sizeof(pInput),NULL);
psqlLexer lex = sqlLexerNew (pNewStrm);
pANTLR3_COMMON_TOKEN_STREAM tstream = antlr3CommonTokenStreamSourceNew(ANTLR3_SIZE_HINT,
TOKENSOURCE(lex));
psqlParser ps = sqlParserNew( tstream );
sqlParser_sql_return ret = ps->sql(ps);
pANTLR3_BASE_TREE pTree = ret.tree;
cout << "Tree: " << pTree->toStringTree(pTree)->chars << endl;
ParseSubTree(0,pTree);
This outputs a flat tree structure when you use ->getChildCount and ->children->get to recurse through the tree.
void ParseSubTree(int level,pANTLR3_BASE_TREE pTree)
{
ANTLR3_UINT32 childcount = pTree->getChildCount(pTree);
for (int i=0;i<childcount;i++)
{
pANTLR3_BASE_TREE pChild = (pANTLR3_BASE_TREE) pTree->children->get(pTree->children,i);
for (int j=0;j<level;j++)
{
std::cout << " - ";
}
std::cout <<
pChild->getText(pChild)->chars <<
std::endl;
int f=pChild->getChildCount(pChild);
if (f>0)
{
ParseSubTree(level+1,pChild);
}
}
}
Program output:
Tree: SELECT one , two , three
SELECT
one
,
two
,
three
Now, if I alter the grammar file:
sql : VERB ^fields;
.. the call to ParseSubTree only displays the child nodes of fields.
Program output:
Tree: (SELECT one , two , three)
one
,
two
,
three
My question is: why, in the second case, does ANTLR give just the child nodes (in effect missing out the SELECT token)?
I'd be very grateful if anybody can give me any pointers for making sense of the tree returned by Antlr.
Useful Information:
AntlrWorks 1.4.2,
Antlr C Target 3.3,
MSVC 10
Placing output=AST; in the options section will not produce an actual AST; it only causes ANTLR to create CommonTree tokens instead of CommonTokens (or, in your case, the equivalent C structs).
If you use output=AST;, the next step is to put tree operators, or rewrite rules inside your parser rules that give shape to your AST.
See this previous Q&A to find out how to create a proper AST.
For example, the following grammar (with rewrite rules):
options {
output=AST;
// ...
}
sql // make VERB the root
: VERB fields -> ^(VERB fields)
;
fields // omit the commas from the AST
: FIELD (',' FIELD)* -> FIELD+
;
VERB : 'SELECT' | 'UPDATE' | 'INSERT';
FIELD : CHAR+;
SPACE : ' ' {$channel=HIDDEN;};
fragment CHAR : 'a'..'z';
will parse the following input:
UPDATE field, foo , bar
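into the following AST, written here in ANTLR's string-tree notation (root first, then its children):
(UPDATE field foo bar)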
I think it is important to realize that the tree you see in ANTLRWorks is not the AST. The ".tree" in your code is the AST, but it may look different from what you expect. In order to shape the AST, you need to specify the root nodes, either with the ^ operator in strategic places or with rewrite rules.
You can read more here
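Reading the ParseSubTree function from the question, it prints the text of each child but never the text of the node it was called on. With a flat result the top node is a nil placeholder, so nothing is lost; once SELECT becomes the root, the root itself is the node that never gets printed. The sketch below models that with a tiny stand-in tree in Python. Node, children_only and with_root are hypothetical names for illustration and have nothing to do with the ANTLR C runtime API.

```python
class Node:
    """Minimal stand-in for a tree node; text=None models ANTLR's nil root."""

    def __init__(self, text, children=()):
        self.text = text
        self.children = list(children)


def children_only(node, level=0, out=None):
    """Mirror of the question's ParseSubTree: visits children, never the node itself."""
    out = [] if out is None else out
    for child in node.children:
        out.append(" - " * level + child.text)
        children_only(child, level + 1, out)
    return out


def with_root(node, level=0, out=None):
    """Emit the node's own text first (unless it is a nil root), then recurse."""
    out = [] if out is None else out
    if node.text is not None:
        out.append(" - " * level + node.text)
        level += 1
    for child in node.children:
        with_root(child, level, out)
    return out


fields = ["one", ",", "two", ",", "three"]
flat = Node(None, [Node("SELECT")] + [Node(f) for f in fields])  # no tree operators
rooted = Node("SELECT", [Node(f) for f in fields])               # SELECT made the root

print(children_only(flat))    # SELECT appears: it is a child of the nil root
print(children_only(rooted))  # SELECT is missing: it is the root, which is never printed
print(with_root(rooted))      # SELECT first, then its indented children
```

In the C code the equivalent change would be to print the text of pTree itself before looping over its children, skipping that step when the node is the nil root.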
