Import SPSS Statistics tree model into SPSS Modeler?

I am trying to use SPSS Modeler to test a decision tree model built in SPSS Statistics, but I can't find any straightforward way to do it (only XML export, which I cannot import later). I also tried to rebuild the model in Modeler using the same dataset and settings (CHAID, stopping rule of minimum 100 records in the parent branch and 50 in the child branch, etc.), but the results are completely different. I am using 3 input variables that in SPSS Statistics produce a 343-node tree, whereas in Modeler only one of them is included, in a model of 3 nodes. Is there any way to import/export the models, or to rebuild them with the same settings?

You can import the Statistics CHAID model's .xml into Modeler as a model nugget via the "Import PMML" choice (right-click in the model tab; it is the lowest choice).

You can apply a Tree model directly in Statistics, and the rules can be saved in a number of ways. I don't know of a direct way of using the XML from Statistics Trees in Modeler, though.

Related

How to link the Tag Items to TestCaseId in Team Foundation Server 2017?

We are trying to create a report that links the Tag Items to their TestCaseIds.
However, the tables in the TFSWarehouse are empty. How would we link these sets of data?
It was very interesting - I did a search on every field of every table in the TFS database and found that the Tags were actually embedded as XML in the rows. This is an unusual way of storing data: the XML rows can be readily 'read' by a browser, but they are very difficult to SELECT and perform JOINs on.

Are incremental adds possible in neo4j?

I have a quick question. I have a database of one million nodes and 4 million relationships, all created in neo4j with the import CSV command. After testing the graph database and analyzing the queries according to my needs, I now want to write a PHP program where the data is loaded automatically and I get the results at the end (according to my query). Here is the question: since my data will update every 15 minutes, does neo4j have the ability to do incremental adds, i.e. to show which new relationships or nodes were added in that specific window? I was thinking of using the time to see which data was created in that window; correct me if I am wrong. I only want to see the new additions, because I don't want neo4j to waste time on calculations over already existing nodes/relationships. Is there any other way to do that?
Thanks in advance.
You could add a property storing the date/time at which the nodes are added, and then query for everything since the last date/time. I'm not 100% sure about the index performance of that, though.
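As a rough sketch of that idea using the official Neo4j Java driver (the label, property name, and connection details here are made up for illustration, not part of the suggestion above):
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;
import static org.neo4j.driver.Values.parameters;

public class IncrementalLoad {
    public static void main(String[] args) {
        // Look back over the last 15-minute import window
        long since = System.currentTimeMillis() - 15 * 60 * 1000;
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {
            // Stamp nodes that have no timestamp yet with the server clock (ms since epoch)
            session.run("MATCH (n:Item) WHERE n.created_at IS NULL "
                      + "SET n.created_at = timestamp()");
            // Read back only the nodes added since the last run
            Result result = session.run(
                    "MATCH (n:Item) WHERE n.created_at > $since RETURN n",
                    parameters("since", since));
            while (result.hasNext()) {
                System.out.println(result.next().get("n"));
            }
        }
    }
}
An index on the timestamp property would be worth adding before querying on it.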
However, if all you care about is showing the most recently imported, you could have a boolean value with an index:
CREATE INDEX ON :Label(recently_added)
Then when you import your data you can unset all of the current ones and set the new ones like this:
MATCH (n:Label {recently_added: true})
REMOVE n.recently_added
Second query:
MATCH (n:Label {id: {id_list}})
SET n.recently_added = true
That assumes you have some sort of unique identifier on the nodes which you can use to set the ones you just added.

Is it possible to import a database table to Drools Guvnor as a decision table?

We will be having data fed into this database table regularly, and I was wondering if it is possible to import this data into Drools Guvnor on a regular basis?
If you want to maintain rules in a database table, then you should be looking at rule templates:
http://docs.jboss.org/drools/release/6.0.1.Final/drools-docs/html_single/index.html#d0e4969
Rule templates provide a relatively simple mechanism for merging DRL with data.
FWIW - The documentation for this in the manual is poor, so here is a hint on the kind of thing you need to do:
To generate rules from a combination of database data and a template, you will need to import org.drools.template.jdbc.ResultSetGenerator. This class can be used to generate DRL code from a database query result set and a template.
import java.io.FileInputStream;
import java.sql.ResultSet;
import org.drools.template.jdbc.ResultSetGenerator;

// Get results from your DB query...
ResultSet resultSet = preparedStmt.executeQuery();
// Generate the DRL from the result set and the template...
ResultSetGenerator resultSetGenerator = new ResultSetGenerator();
String drl = resultSetGenerator.compile(resultSet,
        new FileInputStream("path/to/template.drt"));
Then you create a package through the API and add that generated DRL to it.
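For example, a minimal sketch of that step using the Drools 6 KnowledgeBuilder API (treat this as an approximation; the exact classes vary between Drools versions, and the knowledge-base wiring is my assumption rather than part of the answer above):
import java.nio.charset.StandardCharsets;
import java.util.Collection;
import org.kie.api.io.ResourceType;
import org.kie.internal.builder.KnowledgeBuilder;
import org.kie.internal.builder.KnowledgeBuilderFactory;
import org.kie.internal.definition.KnowledgePackage;
import org.kie.internal.io.ResourceFactory;

// Compile the generated DRL string into knowledge packages
KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();
kbuilder.add(ResourceFactory.newByteArrayResource(
        drl.getBytes(StandardCharsets.UTF_8)), ResourceType.DRL);
if (kbuilder.hasErrors()) {
    throw new IllegalStateException(kbuilder.getErrors().toString());
}
// These packages can then be added to a knowledge base
Collection<KnowledgePackage> packages = kbuilder.getKnowledgePackages();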

Insert data from another DB in tables

I'm having an issue here. Let me explain.
So I was about done with the migration of this project and I decided to run the test suite to make sure the logic was still working as expected. Unfortunately, it didn't... but that's not the issue.
At the end of the suite, there was a nice script that executes a delete on the data in 5 tables of our development database. That would be fine if there were also a script to actually populate the database...
The good side is that we still have plenty of data in the production environment, so I'm looking for a way, and possibly a tool, to extract the data from these 5 particular tables in production and insert it into the dev environment. There are all sorts of primary and foreign keys between these tables, maybe auto-increment fields (and also A LOT of data), which is why I don't want to do it manually.
Our database is DB2 v9, if it makes any difference. I'm also working with SQuirreL; there might be a plugin, but I haven't found one yet.
Thanks
This is sort of a shot in the dark, as I've never used DB2, but from previous experience my intuition immediately says "try CSV". I'm willing to bet my grandmother you can import/export CSV files in your software (why did I just start thinking of George from Seinfeld?).
This should also leave you with FKs and IDs intact. You might have to reset your auto-increment value to whatever is appropriate, if need be; that, of course, would be done after the import.
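For instance, a rough JDBC sketch of that post-import step against DB2 (the table, column, and connection details are invented for illustration; the RESTART value would come from MAX(id) + 1 on the imported data):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ResetIdentity {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:db2://devhost:50000/DEVDB", "user", "password");
             Statement stmt = conn.createStatement()) {
            // Restart the identity column above the highest imported key
            stmt.executeUpdate(
                "ALTER TABLE my_table ALTER COLUMN id RESTART WITH 1001");
        }
    }
}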
In addition, CSV files are plain text and very easily manipulated should any quirks rear their heads.
Best of luck to you!
Building on Arve's answer, DB2 has a built-in command for importing CSV files:
IMPORT FROM 'my_csv_file.csv'
OF del
INSERT INTO my_table
You can specify a list of columns if they are not in the default order:
IMPORT FROM 'my_csv_file.csv'
OF del
-- 1st, 2nd, 3rd column in CSV
METHOD P(1, 2, 3)
INSERT INTO my_table
(foo_col, bar_col, baz_col)
And you can also specify a different delimiter if it's not comma-delimited. For example, the following specifies a file delimited by |:
IMPORT FROM 'my_csv_file.csv'
OF del
MODIFIED BY COLDEL|
-- 1st, 2nd, 3rd column in CSV
METHOD P(1, 2, 3)
INSERT INTO my_table
(foo_col, bar_col, baz_col)
There are a lot more options. The official documentation is a bit hairy:
DB2 Info Center | IMPORT command
Do you have access to the emulator? There's a function in the emulator that allows you to import CSV into tables directly.
Frank.
Personally, I am not aware of any automated tools that can "capture" a smaller subset of your production data into a test suite, but in my day, I was able to use QMF and some generic queries to do just that. It does require forward planning / analysis of your table structures, parent-child dependencies, referential integrity and other things.
It did take some initial work to do, but once it was done, I was able to use, and re-use these tools to extract several different views of production data for my testing purposes.
If this appeals to you, read on.
At a high level, you could do this:
Determine what the key column names are.
Create a "keys" table for them.
Write several queries to look for your test conditions and populate the keys_table.
Once you are satisfied that keys_table has a satisfactory subset of keys, then you can use your created tools to strip out the data for you.
Write a generic query that joins the keys_table with that of your production tables and export the data into flat files.
Write a proc to do all the extractions / populations for you automatically.
If you have access to QMF (and you probably do in a DB2 shop), you may be able to do something like this:
Determine all of the tables that you need.
Determine the primary indexes for those tables.
Determine any referential integrity requirements for those tables.
Determine Parent - Child relationships between all the tables.
For the lowest-level child table (typically the one with the most indexes), note all the columns used to identify a unique key.
With the above information, you can create a generic query to strip out a smaller subsection of production data, for #5. In other words, you can create a series of specific queries and populate a small Key table that you create.
In QMF, you can create a generic query like this:
select t.*
from &t_tbl t
, &k_tbl k
where &cond
order by 1, 2, 3
In the proc, you simply pass the table name, keys, and conditions variables. Once the data is captured, you EXPORT the data under some filename.
You can create an EXPORT_TABLE proc that would look something like this:
run query1 (&&t_tbl = students_table , &&k_tbl = my_test_keys ,
+ &&cond = (t.stud_id = k.stud_id and t.course_id = k.course_id)
export data to studenttable
run query1 (&&t_tbl = course_table , &&k_tbl = my_test_keys ,
+ &&cond = (t.cour_id = k.cour_id
+ and t.cour_dt between '2009-01-01' and '2010-02-02')
export data to coursetable
.....
This could capture all the data as needed.
You can then create an IMPORT_TEST proc to do the opposite:
import data from studenttable
save data as student_table (replace = yes
import data from coursetable
save data as course_table (replace = yes
....
It may take a while to create, but at least you would then have a reusable tool to extract your data.
Hope that helps.

How can I automate exporting of tables into proper XML files from MSSQL or Access?

We have a customer requesting data in XML format. Normally this is not required, as we usually just hand off an Access database or CSV files and that is sufficient. However, in this case I need to automate the exporting of proper XML from a dozen tables.
If I can do it out of SQL Server 2005, that would be preferred. However, I can't for the life of me find a way to do this. I can dump out raw XML data, but this is just a tag per row with attribute values. We need something that represents the structure of the tables. Access has an export-to-XML feature that meets our needs, but I'm not sure how it can be automated. It doesn't appear to be available in any way through SQL, so I'm trying to track down the necessary code to export the XML through a macro or VBScript.
Any suggestions?
Look into using FOR XML AUTO. Depending on your requirements, you might need to use EXPLICIT.
As a quick example:
SELECT *
FROM Customers
INNER JOIN Orders ON Orders.CustID = Customers.CustID
FOR XML AUTO
This will generate a nested XML document with the orders inside the customers. You could then use SSIS to export that to a file pretty easily, I would think. I haven't tried it myself, though.
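If SSIS turns out to be more than you need, a small client program can spool the query result to a file instead. Here is a rough sketch over JDBC (the connection string and file name are placeholders; note that FOR XML results come back as a one-column rowset of XML text chunks):
import java.io.FileWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SpoolXml {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:sqlserver://localhost;databaseName=Sales;integratedSecurity=true";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "SELECT * FROM Customers "
               + "INNER JOIN Orders ON Orders.CustID = Customers.CustID "
               + "FOR XML AUTO");
             FileWriter out = new FileWriter("customers.xml")) {
            // Concatenate the XML chunks into one file
            while (rs.next()) {
                out.write(rs.getString(1));
            }
        }
    }
}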
If you want a document instead of a fragment, you'll probably need a two-part solution. However, both parts could be done in SQL Server.
It looks from the comments on Tom's entry like you found the ELEMENTS argument, so you're getting the fields as child elements rather than attributes. You'll still end up with a fragment, though, because you won't get a root node.
There are different ways you could handle this. SQL Server provides a method for using XSLT to transform XML documents, so you could create an XSL stylesheet to wrap the result of your query in a root element. You could also add anything else the customer's schema requires (assuming they have one).
If you wanted to leave some fields as attributes and make others elements, you could also use XSLT to move those fields, so you might end up with something like this:
<customer id="204">
<firstname>John</firstname>
<lastname>Public</lastname>
</customer>
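If it turns out to be easier to run the stylesheet outside the database, the transform itself is small in most languages. For instance, a minimal sketch using Java's standard javax.xml.transform API (the file names are made up):
import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class WrapFragment {
    public static void main(String[] args) throws Exception {
        // wrap-root.xsl adds the root element (plus anything else
        // the customer's schema requires) around the FOR XML fragment
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new File("wrap-root.xsl")));
        t.transform(new StreamSource(new File("fragment.xml")),
                new StreamResult(new File("document.xml")));
    }
}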
There's an outline here of a macro used to export data from an Access db to an XML file, which may be of some use to you.
Const acExportTable = 0
Set objAccess = CreateObject("Access.Application")
objAccess.OpenCurrentDatabase "C:\Scripts\Test.mdb"
'Export the table "Inventory" to test.xml
objAccess.ExportXML acExportTable,"Inventory","c:\scripts\test.xml"
The easiest way to do this that I can think of would be to create a small app to do it for you. You could do it as a basic WinForms app and then just make use of a LINQ to SQL .dbml class to represent your database. Most of the time you can just serialize those objects using the XmlSerializer class. Occasionally it is more difficult than that, depending on the complexity of your database. Check out this post for some detailed info on LINQ to SQL and XML serialization:
http://www.west-wind.com/Weblog/posts/147218.aspx
Hope that helps.
