How to select rows in cassandra for indexing in solr

How to select rows in cassandra for indexing in solr - solr

I have rows in cassasndra, how would I go on about quering those rows in order to index them for example in solr. What query or what way should i use in order to query all those rows in cassandra once?

Please find below the example to integrate Cassandra and Solr:
CREATE TABLE tutor (
id int,
name text,
org text,
dep text,
sal text,
place text,
PRIMARY KEY ((org),name)
)
cqlsh:test> select * FROM tutor;
org | name | dep | id | place | sal
------+------+------+----+---------+------
org1 | abc | dep1 | 1 | sanjose | 4500
org1 | bbb | dep1 | 2 | sanjose | 5500
org2 | ccc | dep1 | 3 | sanjose | 5500
org2 | ddd | dep2 | 4 | sanjose | 5500
org2 | eee | dep3 | 5 | sanjose | 4500
org2 | fff | dep4 | 6 | sanjose | 7500
Requirements for SOLR and Cassandra integration:
SOLR version:
solr 4.9.0
Lib/Jar:
Cassandra :
cassandra-all-1.2.5.jar
libthrift-0.6.0.jar
cassandra-thrift-1.2.5.jar
Daata Import Handler:
solr-dataimporthandler-4.9.0.jar
solr-dataimporthandler-extras-4.9.0.jar
MySql:
mysql-connector-java-5.1.31-bin.jar
In Solor: Following files to be update:
• dataconfig.xml
• schema.xml
• solorconfig.xml
• dataconfig.xml
Here we have to update the JDBC connector for CASSANDRA.
< dataConfig>
< dataSource type="JdbcDataSource"
driver="org.apache.cassandra.cql.jdbc.CassandraDriver"
url="jdbc:cassandra://10.234.31.231:9160/test"
autoCommit="true"/>
< document name="content">
< entity name="test"
query="SELECT id,org,name,dep,place,sal from tutor" autoCommit="true">
< field column="id" name="id" />
< field column="org" name="org" />
< field column="name" name="name" />
< field column="dep" name="dep" />
< field column="place" name="place" />
< field column="sal" name="sal" />
< entity>
< document>
< dataConfig>
schema.xml
< field name="id" type="string" indexed="true" stored="true" required="true" />
< field name="org" type="string" indexed="true" stored="true" required="true" />
< field name="dep" type="string" indexed="true" stored="true" required="true" />
< field name="place" type="string" indexed="true" stored="true" required="true" />
< field name="sal" type="string" indexed="true" stored="true" required="true" />
Solorconfig.xml
< ! - - Add your library Path - →
< lib dir="/home/solr/lib" regex="solr-dataimporthandler-.*.jar" />
< lib dir="/home/solr/lib" regex="cassandra-jdbc-.*.jar" />
< lib dir="/home/solr/lib" regex="cassandra-all-.*.jar" />
< lib dir="/home/solr/lib" regex="cassandra-thrift-.*.jar" />
< lib dir="/home/solr/lib" regex="libthrift-.*.jar" />
. . . . . . . . . . .
. . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
< requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
< lst name="defaults">
< str name="config">dataconfigCassandra.xml
< /lst>

Not sure how is your setup (which language you are using), but probably the best is to use a Cassandra client library and write an applications to query all the entries (rows) of your cassandra "column family" (tables) and then write the data you want to index from each row in Solr. Here you have a nice overview of several clients for cassandra: http://www.datastax.com/download/clientdrivers.
To perform the "read all entries" you can perform the following "native query":
select * from columnfamilyname;
This will depend very much on the client library you use... but I imagine most of the clients allow you to do such "native queries" (like the ones you perform in Cassandra cassandra-cli or cqlsh).
Be careful how big is your DB to perform this query... In that case if you if they are indexed/ordered (by key) you can perform a query such as: select * from columnfamily where indexkey > 101 limit 100 allow filtering.
After you create the "initial index" in Solr, most probably you should also use an update method that keeps solr index updated with new entries of the Cassandra DB.

Related

How to read XML HTTP Response in SQL Server

I need help with reading some nodes in the below XML, it's an HTTP XML response which I store in a variable and need to read the values.
I want to read all the nodes that have the v31,v11,v12 namespace prefix into a table using XQuery
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Header xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd" xmlns:v1="http://xmlns.fl.com/InfoRequest/V1" xmlns:cor="http://soa.mic.co.af/coredata_1" xmlns:v3="http://xmlns.fl.com/RequestHeader/V3" xmlns:v2="http://xmlns.fl.com/ParameterType/V2">
<cor:SOATransactionID>96514584-be43-40f7-9335-b17f6d6669bd</cor:SOATransactionID>
</soapenv:Header>
<soapenv:Body xmlns:v1="http://xmlns.fl.com/InfoRequest/V1" xmlns:cor="http://soa.fl.co.af/coredata_1" xmlns:v3="http://xmlns.fl.com/RequestHeader/V3" xmlns:v2="http://xmlns.fl.com/ParameterType/V2">
<v11:InfoResponse xmlns:v11="http://xmlns.fl.com/InfoResponse/V1">
<v31:ResponseHeader xmlns:v31="http://xmlns.fl.com/ResponseHeader/V3">
<v31:GeneralResponse>
<v31:correlationID>AD-nXfJhT5ZMVEmKkut</v31:correlationID>
<v31:status>OK</v31:status>
<v31:code>Successfully-001</v31:code>
<v31:description>successfully.</v31:description>
</v31:GeneralResponse>
</v31:ResponseHeader>
<v11:responseBody>
<v11:msisdn>+149012098788</v11:msisdn>
<v11:customer>
<v12:name xmlns:v12="http://xmlns.fl.com/CustomerType/V1">FL Company</v12:name>
<v12:clientCode xmlns:v12="http://xmlns.fl.com/CustomerType/V1">5690001212</v12:clientCode>
<v12:firstName xmlns:v12="http://xmlns.fl.com/CustomerType/V1">FL Company</v12:firstName>
<v12:lastName xmlns:v12="http://xmlns.fl.com/CustomerType/V1">RA</v12:lastName>
<v12:birthDate xmlns:v12="http://xmlns.fl.com/CustomerType/V1">1997-08-07</v12:birthDate>
<v12:gender xmlns:v12="http://xmlns.fl.com/CustomerType/V1">Female</v12:gender>
<v12:address xmlns:v12="http://xmlns.fl.com/CustomerType/V1">example#fl.com</v12:address>
<v12:email xmlns:v12="http://xmlns.fl.com/CustomerType/V1">example#fl.com</v12:email>
<v12:nationality xmlns:v12="http://xmlns.fl.com/CustomerType/V1">233</v12:nationality>
<v12:clientGrade xmlns:v12="http://xmlns.fl.com/CustomerType/V1">CustomerLevel2</v12:clientGrade>
<v12:clientType xmlns:v12="http://xmlns.fl.com/CustomerType/V1">Company</v12:clientType>
<v12:clientState xmlns:v12="http://xmlns.fl.com/CustomerType/V1">Unspecified</v12:clientState>
<v12:admissionDate xmlns:v12="http://xmlns.fl.com/CustomerType/V1">2022-07-16T08:19:20</v12:admissionDate>
<v12:additionalProperties xmlns:v12="http://xmlns.fl.com/CustomerType/V1">
<v2:ParameterType>
<v2:parameterName>Value</v2:parameterName>
<v2:parameterValue>0.0</v2:parameterValue>
</v2:ParameterType>
</v12:additionalProperties>
</v11:customer>
<v11:subscriber>
<v12:imsi xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">990423434534</v12:imsi>
<v12:clientCode xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">7870559304234</v12:clientCode>
<v12:brandId xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">120</v12:brandId>
<v12:language xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">English</v12:language>
<v12:smsLanguage xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">English</v12:smsLanguage>
<v12:ussdLanguage xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">English</v12:ussdLanguage>
<v12:customerType xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">Foreign</v12:customerType>
<v12:initialCredit xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">30</v12:initialCredit>
<v12:mainProductID xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">990877687</v12:mainProductID>
</v11:subscriber>
</v11:responseBody>
</v11:InfoResponse>
</soapenv:Body>
</soapenv:Envelope>

Please try the following.
The XML sample has lots of namespaces. You should be disciplined with them as well as with XPath expressions.
SQL
DECLARE #xml XML =
N'<?xml version="1.0"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Header xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd" xmlns:v1="http://xmlns.fl.com/InfoRequest/V1" xmlns:cor="http://soa.mic.co.af/coredata_1" xmlns:v3="http://xmlns.fl.com/RequestHeader/V3"
xmlns:v2="http://xmlns.fl.com/ParameterType/V2">
<cor:SOATransactionID>96514584-be43-40f7-9335-b17f6d6669bd</cor:SOATransactionID>
</soapenv:Header>
<soapenv:Body xmlns:v1="http://xmlns.fl.com/InfoRequest/V1" xmlns:cor="http://soa.fl.co.af/coredata_1" xmlns:v3="http://xmlns.fl.com/RequestHeader/V3" xmlns:v2="http://xmlns.fl.com/ParameterType/V2">
<v11:InfoResponse xmlns:v11="http://xmlns.fl.com/InfoResponse/V1">
<v31:ResponseHeader xmlns:v31="http://xmlns.fl.com/ResponseHeader/V3">
<v31:GeneralResponse>
<v31:correlationID>AD-nXfJhT5ZMVEmKkut</v31:correlationID>
<v31:status>OK</v31:status>
<v31:code>Successfully-001</v31:code>
<v31:description>successfully.</v31:description>
</v31:GeneralResponse>
</v31:ResponseHeader>
<v11:responseBody>
<v11:msisdn>+149012098788</v11:msisdn>
<v11:customer>
<v12:name xmlns:v12="http://xmlns.fl.com/CustomerType/V1">FL Company</v12:name>
<v12:clientCode xmlns:v12="http://xmlns.fl.com/CustomerType/V1">5690001212</v12:clientCode>
<v12:firstName xmlns:v12="http://xmlns.fl.com/CustomerType/V1">FL Company</v12:firstName>
<v12:lastName xmlns:v12="http://xmlns.fl.com/CustomerType/V1">RA</v12:lastName>
<v12:birthDate xmlns:v12="http://xmlns.fl.com/CustomerType/V1">1997-08-07</v12:birthDate>
<v12:gender xmlns:v12="http://xmlns.fl.com/CustomerType/V1">Female</v12:gender>
<v12:address xmlns:v12="http://xmlns.fl.com/CustomerType/V1">example#fl.com</v12:address>
<v12:email xmlns:v12="http://xmlns.fl.com/CustomerType/V1">example#fl.com</v12:email>
<v12:nationality xmlns:v12="http://xmlns.fl.com/CustomerType/V1">233</v12:nationality>
<v12:clientGrade xmlns:v12="http://xmlns.fl.com/CustomerType/V1">CustomerLevel2</v12:clientGrade>
<v12:clientType xmlns:v12="http://xmlns.fl.com/CustomerType/V1">Company</v12:clientType>
<v12:clientState xmlns:v12="http://xmlns.fl.com/CustomerType/V1">Unspecified</v12:clientState>
<v12:admissionDate xmlns:v12="http://xmlns.fl.com/CustomerType/V1">2022-07-16T08:19:20</v12:admissionDate>
<v12:additionalProperties xmlns:v12="http://xmlns.fl.com/CustomerType/V1">
<v2:ParameterType>
<v2:parameterName>Value</v2:parameterName>
<v2:parameterValue>0.0</v2:parameterValue>
</v2:ParameterType>
</v12:additionalProperties>
</v11:customer>
<v11:subscriber>
<v12:imsi xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">990423434534</v12:imsi>
<v12:clientCode xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">7870559304234</v12:clientCode>
<v12:brandId xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">120</v12:brandId>
<v12:language xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">English</v12:language>
<v12:smsLanguage xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">English</v12:smsLanguage>
<v12:ussdLanguage xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">English</v12:ussdLanguage>
<v12:customerType xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">Foreign</v12:customerType>
<v12:initialCredit xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">30</v12:initialCredit>
<v12:mainProductID xmlns:v12="http://xmlns.fl.com/SubscriberType/V1">990877687</v12:mainProductID>
</v11:subscriber>
</v11:responseBody>
</v11:InfoResponse>
</soapenv:Body>
</soapenv:Envelope>';
;WITH XMLNAMESPACES(DEFAULT 'http://schemas.xmlsoap.org/soap/envelope/'
, 'http://xmlns.fl.com/InfoResponse/V1' AS v11
, 'http://xmlns.fl.com/CustomerType/V1' AS v12
, 'http://xmlns.fl.com/SubscriberType/V1' AS v121)
SELECT r.value('(v11:msisdn/text())[1]','varchar(50)') AS msisdn
, c.value('(v12:name/text())[1]','varchar(50)') AS custname
, c.value('(v12:clientCode/text())[1]','varchar(50)') AS clientCode
, s.value('(v121:customerType/text())[1]','varchar(50)') AS customerType
, s.value('(v121:initialCredit/text())[1]','INT') AS initialCredit
FROM #XML.nodes('/Envelope/Body/v11:InfoResponse/v11:responseBody') AS t1(r)
CROSS APPLY t1.r.nodes('v11:customer') AS t2(c)
CROSS APPLY t1.r.nodes('v11:subscriber') AS t3(s);
Output
+---------------+------------+------------+--------------+---------------+
| msisdn | custname | clientCode | customerType | initialCredit |
+---------------+------------+------------+--------------+---------------+
| +149012098788 | FL Company | 5690001212 | Foreign | 30 |
+---------------+------------+------------+--------------+---------------+

Scala: Delete empty array values from a Spark DataFrame

I'm a new learner of Scala. Now given a DataFrame named df as follows:
+-------+-------+-------+-------+
|Column1|Column2|Column3|Column4|
+-------+-------+-------+-------+
| [null]| [0.0]| [0.0]| [null]|
| [IND1]| [5.0]| [6.0]| [A]|
| [IND2]| [7.0]| [8.0]| [B]|
| []| []| []| []|
+-------+-------+-------+-------+
I'd like to delete rows if all columns is an empty array (4th row).
For example I might expect the result is:
+-------+-------+-------+-------+
|Column1|Column2|Column3|Column4|
+-------+-------+-------+-------+
| [null]| [0.0]| [0.0]| [null]|
| [IND1]| [5.0]| [6.0]| [A]|
| [IND2]| [7.0]| [8.0]| [B]|
+-------+-------+-------+-------+
I'm trying to use isNotNull (like val temp=df.filter(col("Column1").isNotNull && col("Column2").isNotNull && col("Column3").isNotNull && col("Column4").isNotNull).show()
) but still show all rows.
I found python solution of using a Hive UDF from link, but I had hard time trying to convert to a valid scala code. I would like use scala command similar to the following code:
val query = "SELECT * FROM targetDf WHERE {0}".format(" AND ".join("SIZE({0}) > 0".format(c) for c in ["Column1", "Column2", "Column3","Column4"]))
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
sqlContext.sql(query)
Any help would be appreciated. Thank you.

Using the isNotNull or isNull will not work because it is looking for a 'null' value in the DataFrame. Your example DF does not contain null values but empty values, there is a difference there.
One option: You could create a new column that has the length of of the array and filter for if the array is zero.
val dfFil = df
.withColumn("arrayLengthColOne", size($"Column1"))
.withColumn("arrayLengthColTwo", size($"Column2"))
.withColumn("arrayLengthColThree", size($"Column3"))
.withColumn("arrayLengthColFour", size($"Column4"))
.filter($"arrayLengthColOne" =!= 0 && $"arrayLengthColTwo" =!= 0
&& $"arrayLengthColThree" =!= 0 && $"arrayLengthColFour" =!= 0)
.drop("arrayLengthColOne", "arrayLengthColTwo", "arrayLengthColThree", "arrayLengthColFour")
Original DF:
+-------+-------+-------+-------+
|Column1|Column2|Column3|Column4|
+-------+-------+-------+-------+
| [A]| [B]| [C]| [d]|
| []| []| []| []|
+-------+-------+-------+-------+
New DF:
+-------+-------+-------+-------+
|Column1|Column2|Column3|Column4|
+-------+-------+-------+-------+
| [A]| [B]| [C]| [d]|
+-------+-------+-------+-------+
You could also create a function that will map across all the columns and do it.

Another approach (in addition to accepted answer) would be using Datasets.
For example, by having a case class:
case class MyClass(col1: Seq[String],
col2: Seq[Double],
col3: Seq[Double],
col4: Seq[String]) {
def isEmpty: Boolean = ...
}
You can represent your source as a typed structure:
import spark.implicits._ // needed to provide an implicit encoder/data mapper
val originalSource: DataFrame = ... // provide your source
val source: Dataset[MyClass] = originalSource.as[MyClass] // convert/map it to Dataset
So you could do filtering like following:
source.filter(element => !element.isEmpty) // calling class's instance method

How to import XML to SQL Server when base nodes repeat

I am importing generated XML files into a SQL Server database. It's been working fine, but now I need additional information from a node that repeats, and I can't figure it out. I am using VBScript with SQLXMLBulkLoad.SQLXMLBulkload.4.0 to import the files. There are many <file> nodes in each file, and here a simplified as an example:
<?xml version="1.0" encoding="UTF-8"?>
<samp_catalog version='1.0'>
<file>
<name>123.wav</name>
<xyz>
<a_samp>
<a_one>1</a_one>
<a_two>2</a_two>
<a_three>3</a_three>
</a_samp>
<b_samp version='2.0'>
<b_four>sample</b_four>
<b_five>sample</b_five>
<b_six>sample</b_six>
</b_samp>
<c_samp version='1.0'>
<c_entry>
<total_size>0</total_size>
<sig>sample</sig>
<type>SAMPLE</type>
<data_size>256</data_size>
<data encoding='string'></data>
</c_entry>
</c_samp>
<c_samp version='1.0'>
<c_entry>
<total_size>0</total_size>
<sig>sample</sig>
<type>PHONE</type>
<data_size>256</data_size>
<data encoding='string'>15555551212</data>
</c_entry>
</c_samp>
<c_samp version='1.0'>
<c_entry>
<total_size>0</total_size>
<sig>sample</sig>
<type>OTHER_SAMPLE</type>
<data_size>256</data_size>
<data encoding='string'></data>
</c_entry>
</c_samp>
</xyz>
</file>
</samp_catalog>
Without the repeating c_samp node, the schema was easy:
<xsd:schema xmlns:xsd = "http://www.w3.org/2001/XMLSchema" xmlns:sql = "urn:schemas-microsoft-com:mapping-schema">
<xsd:element name = "samp_catalog" sql:is-constant = "1">
<xsd:complexType>
<xsd:sequence>
<xsd:element name = "file" sql:relation = "files" maxOccurs = "unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name = "name" type = "xsd:string" sql:field = "name" />
<xsd:element name = "xyz" sql:is-constant = "1">
<xsd:complexType>
<xsd:sequence>
<xsd:element name = "a_samp" sql:is-constant = "1">
<xsd:complexType>
<xsd:sequence>
<xsd:element name = "a_one" type = "xsd:integer" sql:field = "a_one" />
<xsd:element name = "a_two" type = "xsd:integer" sql:field = "a_two" />
<xsd:element name = "a_three" type = "xsd:integer" sql:field = "a_three" />
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:element name = "b_samp" sql:is-constant = "1">
<xsd:complexType>
<xsd:sequence>
<xsd:element name = "b_four" type = "xsd:string" sql:field = "b_four" />
<xsd:element name = "b_five" type = "xsd:string" sql:field = "b_five" />
<xsd:element name = "b_six" type = "xsd:string" sql:field = "b_six" />
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:schema>
The important one is the second data element (of type PHONE), but importing them all is acceptable as well. I have tried defining all 3 c_samps in the schema and defining a sql:field for only the specified data element and defining it for all elements, however the import always results in NULL for everything under a c_samp element.
How can I get the repeating nodes to import into the database?

I would write a stored procedure in SQL-Server and pass over the XML as-is. Use SQL Server's abilities to read the XML without a schema.
The following code will extract all information from the XML given.
Just do something like
CREATE dbo.ImportMyXML(#xml XML)
AS
BEGIN
--Code here
END
GO
Within the BEGIN END you write this query (out-commented the INSERT INTO to test the query's result:
--INSERT INTO YourTargetTable(col1,col2,col3,...)
SELECT #xml.value(N'(/samp_catalog/file/name/text())[1]',N'nvarchar(max)') AS name
,a_samp.value(N'(*/text())[1]',N'nvarchar(max)') AS a_one
,a_samp.value(N'(*/text())[2]',N'nvarchar(max)') AS a_two
,a_samp.value(N'(*/text())[3]',N'nvarchar(max)') AS a_three
,b_samp.value(N'#version',N'nvarchar(max)') AS b_version
,b_samp.value(N'(*/text())[1]',N'nvarchar(max)') AS b_one
,b_samp.value(N'(*/text())[2]',N'nvarchar(max)') AS b_two
,b_samp.value(N'(*/text())[3]',N'nvarchar(max)') AS b_three
,NodesWith_c_entry.value(N'(../#version)[1]',N'nvarchar(max)') AS c_version
,NodesWith_c_entry.value(N'(total_size/text())[1]','int') AS c_total_size
,NodesWith_c_entry.value(N'(sig/text())[1]','nvarchar(max)') AS c_sig
,NodesWith_c_entry.value(N'(type/text())[1]','nvarchar(max)') AS c_type
,NodesWith_c_entry.value(N'(data_size/text())[1]','int') AS c_data_size
,NodesWith_c_entry.value(N'(data/#encoding)[1]','nvarchar(max)') AS c_data_encoding
,NodesWith_c_entry.value(N'(data/text())[1]','nvarchar(max)') AS c_data_text
FROM #xml.nodes(N'samp_catalog/file/xyz') AS A(xyz)
OUTER APPLY xyz.nodes('a_samp') AS B(a_samp)
OUTER APPLY xyz.nodes('b_samp') AS C(b_samp)
OUTER APPLY xyZ.nodes('*/c_entry') AS D(NodesWith_c_entry);
With the XML given the result is this:
+---------+-------+-------+---------+-----------+--------+--------+---------+-----------+--------------+--------+--------------+-------------+-----------------+-------------+
| name | a_one | a_two | a_three | b_version | b_one | b_two | b_three | c_version | c_total_size | c_sig | c_type | c_data_size | c_data_encoding | c_data_text |
+---------+-------+-------+---------+-----------+--------+--------+---------+-----------+--------------+--------+--------------+-------------+-----------------+-------------+
| 123.wav | 1 | 2 | 3 | 2.0 | sample | sample | sample | 1.0 | 0 | sample | SAMPLE | 256 | string | NULL |
+---------+-------+-------+---------+-----------+--------+--------+---------+-----------+--------------+--------+--------------+-------------+-----------------+-------------+
| 123.wav | 1 | 2 | 3 | 2.0 | sample | sample | sample | 1.0 | 0 | sample | PHONE | 256 | string | 15555551212 |
+---------+-------+-------+---------+-----------+--------+--------+---------+-----------+--------------+--------+--------------+-------------+-----------------+-------------+
| 123.wav | 1 | 2 | 3 | 2.0 | sample | sample | sample | 1.0 | 0 | sample | OTHER_SAMPLE | 256 | string | NULL |
+---------+-------+-------+---------+-----------+--------+--------+---------+-----------+--------------+--------+--------------+-------------+-----------------+-------------+

So far, I haven't found an answer for what I was looking to do, but here is a work around I'm using for now. In VBScript, after loading the file with Microsoft.XMLDOM I loop through the file and find all type nodes whose text value is PHONE and then create a new element and append it to b_samp with its sibling data's text value.
VBScript:
Set nodes = xmlFile.selectNodes("//type")
For Each node in nodes
If node.text = "PHONE" Then
Set phoneNode = xmlFile.CreateElement("phone")
For Each child in node.ParentNode.ChildNodes
If child.nodeName = "data" Then phoneNode.text = child.text
Next
For Each child in node.ParentNode.ParentNode.ParentNode.ChildNodes
If child.nodeName = "b_samp" Then child.appendChild phoneNode
Next
End If
Next
Then save the file with a new name to leave the original intact, and import it with a new schema adding this line right after the b_six element:
<xsd:element name = "phone" type = "xsd:string" sql:field = "phone" />

Count unique rows of column in Array of object

Using
Ruby 1.9.3-p194
Rails 3.2.8
Here's what I need.
Count the different human resources (human_resource_id) and divide this by the total number of assignments (assignment_id).
So, the answer for the dummy-data as given below should be:
1.5 assignments per human resource
But I just don't know where to go anymore.
Here's what I tried:
Table name: Assignments
id | human_resource_id | assignment_id | assignment_start_date | assignment_expected_end_date
80101780 | 20200132 | 80101780 | 2012-10-25 | 2012-10-31
80101300 | 20200132 | 80101300 | 2012-07-07 | 2012-07-31
80101308 | 21100066 | 80101308 | 2012-07-09 | 2012-07-17
At first I need to make a selection for the period I need to 'look' at. This is always from max a year ago.
a = Assignment.find(:all, :conditions => { :assignment_expected_end_date => (DateTime.now - 1.year)..DateTimenow })
=> [
#<Assignment id: 80101780, human_resource_id: "20200132", assignment_id: "80101780", assignment_start_date: "2012-10-25", assignment_expected_end_date: "2012-10-31">,
#<Assignment id: 80101300, human_resource_id: "20200132", assignment_id: "80101300", assignment_start_date: "2012-07-07", assignment_expected_end_date: "2012-07-31">,
#<Assignment id: 80101308, human_resource_id: "21100066", assignment_id: "80101308", assignment_start_date: "2012-07-09", assignment_expected_end_date: "2012-07-17">
]
foo = a.group_by(&:human_resource_id)
Now I got a beautiful 'Array of hash of object' and I just don't know what to do next.
Can someone help me?

You can try to execute the request in SQL :
ActiveRecord::Base.connection.select_value('SELECT count(distinct human_resource_id) / count(distinct assignment_id) AS ratio FROM assignments');

You could do something like
human_resource_count = assignments.collect{|a| a.human_resource_id}.uniq.count
assignment_count = assignments.collect{|a| a.assignment_id}.uniq.count
result = human_resource_count/assignment_count

retrieving multiple rows from database and display using struts2

I am developing a photo album web application in which I am using two tables USER_INFORMATION and USER_PHOTO. USER_INFORMATION contains only user record which is one record, USER_PHOTO contains more than one photo form a single user. I want to get the number of user information along with their userimage path and store it in to a Java Pojo variable like a list so that I can display it using display tag in struts 2.
The table goes like this.
USER_PHOTO USER_INFORMATION
======================= =================================
| IMAGEPATH | USER_ID | | USER_NAME | AGE | ADDRESS |ID |
======================= =================================
|xyz | 1 | | abs | 34 | sdas | 1 |
|sdas | 1 | | asddd | 22 | asda | 2 |
|qwewq | 2 | | sadl | 121 | asd | 3 |
| asaa | 1 | ==================================
| 121 | 3 |
=======================
i some what manage to get the user information correctly by using HashSet and display tag the code goes like this...
public ArrayList loadData() throws ClassNotFoundException, SQLException {
ArrayList<UserData> userList = new ArrayList<UserData>();
try {
String name;
String fatherName;
int Id;
int age;
String address;
String query = "SELECT NAME,FATHERNAME,AGE,ADDRESS,ID FROM USER_INFORMATION ,USER_PHOTO WHERE ID=USER_ID";
ps = con.prepareStatement(query);
ResultSet rs = ps.executeQuery();
while (rs.next()) {
name = rs.getString(1);
fatherName = rs.getString(2);
age = rs.getInt(3);
address = rs.getString(4);
//Id = rs.getInt(5);
UserData list = new UserData();
list.setName(name);
list.setFatherName(fatherName);
list.setAge(age);
list.setAddress(address);
//list.setPhot(Id);
userList.add(list);
}
ps.close();
con.close();
} catch (Exception e) {
e.printStackTrace();
}
ArrayList al = new ArrayList();
HashSet hs = new HashSet();
hs.addAll(userList);
al.clear();
al.addAll(hs);
//return (userList);
return al;
and my display tag goes like this in jsp...
<display:table id="data" name="sessionScope.UserForm.userList" requestURI="/userAction.do" pagesize="15" >
<display:column property="name" title="NAME" sortable="true" />
<display:column property="fatherName" title="User Name" sortable="true" />
<display:column property="age" title="AGE" sortable="true" />
<display:column property="address" title="ADDRESS" sortable="true" />
</display:table>
as u can see i am displaying only user information not his photo uploaded i want to display his uploaded filename also. i some what created another table to store his photo information. if i include Imagepath attribute in my sql query than their will be duplicate rows means if user have uploaded 5 photos than display tag will display five records with same information with different ImagePath.
so any can tell me hoe make it to display only one record of user information with his multiple ImagePath so that it not display user information repeatedly.
Thanks In Advance

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

How to select rows in cassandra for indexing in solr - solr

I have rows in cassasndra, how would I go on about quering those rows in order to index them for example in solr. What query or what way should i use in order to query all those rows in cassandra once?

Related

How to read XML HTTP Response in SQL Server

Scala: Delete empty array values from a Spark DataFrame

How to import XML to SQL Server when base nodes repeat

Count unique rows of column in Array of object

retrieving multiple rows from database and display using struts2

Categories

Resources