Renaming a Dataset using a list

Renaming a Dataset using a list - database

I'm new at programming, and I've been banging my head trying to figure this out for work.
I am trying to pull roughly 300 mysql tables into Matlab into my workspace.
I have attached the following code, which is designed to pull one table (I plan to loop this code though the 300 mysql tables when it is working).
The code successfully works to import single table into the workspace as a new dataset.
My problem arise when I try to rename this new dataset with the name of the original mysql table.
Please see code below for this part where I screw up (%Assign data to output variable)
I have a list of all the 300 tables names, and I plan to store them in a list called 'name'... Hence, name(1)... is this the right approach?
For example, the original mysql table was called 'options_20020208'.
After I run the script, I need the new dataset that Matlab imports to be called 'options_20020208' as well.
Any ideas here?
%Define Query
name = 'options_20020208'
%Set preferences with setdbprefs.
setdbprefs('DataReturnFormat', 'dataset');
setdbprefs('NullNumberRead', 'NaN');
setdbprefs('NullStringRead', 'null');
%Make connection to database.
conn = database('', 'root', 'password', 'Vendor', 'MYSQL', 'Server', 'localhost', 'PortNumber', 3306);
%Read data from database.
curs = exec(conn, [['SELECT ',name,'.UnderlyingSymbol , ']...
, [name,'.UnderlyingPrice , ']...
, [name,'.Expiration , ']...
, ['FROM ','PriceMatrix.',name,' ']...
]);
curs = fetch(curs);
close(curs);
%Assign data to output variable
name(1) = curs.Data;
%Close database connection.
close(conn);
%Clear variables
clear curs conn

If you have defined a variable name, name(1) means "the first element of variable name" (in this case just "o"). Regardless of the dimensions of the variable it returns a single value (i.e. even if X is some 5-D monstrosity, X(50) returns only the value of the 50th element). name(1) = data means "set the first element of variable name to be equal to data" and will cause an error if data is not of the right size, and either an error or unexpected behaviour if it's not the right type.
For example, try this at the command line:
name = 'options_20020208';
name(1) = 1
Now, technically what you want can be done, although I don't recommend it. If you have all the names in some sort of 300 x (length) variable then over a loop of n = 1:300 it would be something like this (where name is your list of variable names):
eval([name(n,:) ' = curs.Data;'])
This will return 300 variables named 'options_20020208' or similar each containing one set of curs.Data. However, there are better ways of storing data in your workspace that will make further operations on your data easier, for example you could use structures:
myStruct(n).name = name(n,:);
myStruct(n).Data = curs.Data;
If you wanted to do some analysis and then save out all this data in some format, for example, it's going to be much easier to loop over the structure, and set the filename to [myStruct(n).name,'.csv'] and the file contents to mystruct(n).AdjustedData, etc., then to deal with 300 named variables.

Related

How to find a MoveTo destination filled by database?

I could need some help with a Anylogic Model.
Model (short): Manufacturing scenario with orders move in a individual route. The workplaces (WP) are dynamical created by simulation start. Their names, quantity and other parameters are stored in a database (excel Import). Also the orders are created according to an import. The Agent population "order" has a collection routing which contains the Workplaces it has to stop in the specific order.
Target: I want a moveTo block in main which finds the next destination of the agent order.
Problem and solution paths:
I set the destination Type to agent and in the Agent field I typed a function agent.getDestination(). This function is in order which returns the next entry of the collection WP destinationName = routing.get(i). With this I get a Datatype error (while run not compiling). I quess it's because the database does not save the entrys as WP Type but only String.
Is there a possiblity to create a collection with agents from an Excel?
After this I tried to use the same getDestination as String an so find via findFirst the WP matching the returned name and return it as WP. WP targetWP = findFirst(wps, w->w.name == destinationName);
Of corse wps (the population of Workplaces) couldn't be found.
How can I search the population?
Maybe with an Agentlink?
I think it is not that difficult but can't find an answer or a solution. As you can tell I'm a beginner... Hope the description is good an someone can help me or give me a hint :)
Thanks

Is there a possiblity to create a collection with agents from an Excel?
Not directly using the collection's properties and, as you've seen, you can't have database (DB) column types which are agent types.1
But this is relatively simple to do directly via Java code (and you can use the Insert Database Query wizard to construct the skeleton code for you).
After this I tried to use the same getDestination as String an so find via findFirst the WP matching the returned name and return it as WP
Yes, this is one approach. If your order details are in Excel/the database, they are presumably referring to workplaces via some String ID (which will be a parameter of the workplace agents you've created from a separate Excel worksheet/database table). You need to use the Java equals method to compare strings though, not == (which is for comparing numbers or whether two objects are the same object).
I want a moveTo block in main which finds the next destination of the agent order
So the general overall solution is
Create a population of Workplace agents (let's say called workplaces in Main) from the DB, each with a String parameter id or similar mapped from a DB column.
Create a population of Order agents (let's say called orders in Main) from the DB and then, in their on-startup action, set up their collection of workplace IDs (type ArrayList, element class String; let's say called workplaceIDsList) using data from another DB table.
Order probably also needs a working variable storing the next index in the list that it needs to go to (so let's say an int variable nextWorkplaceIndex which starts at 0).
Write a function in Main called getWorkplaceByID that has a single String argument id and returns a Workplace. This gets the workplace from the population that matches the ID; a one-line way similar to yours is findFirst(workplaces, w -> w.id.equals(id)).
The MoveTo block (which I presume is in Main) needs to move the Order to an agent defined by getWorkplaceByID(agent.workplaceIDsList.get(nextWorkplaceIndex++)). (The ++ bit increments the index after evaluating the expression so it is ready for the next workplace to go to.)
For populating the collection, you'd have two tables, something like the below (assuming using strings as IDs for workplaces and orders):
orders table: columns for parameters of your orders (including some String id column) other than the workplace-list. (Create one Order agent per row.)
order_workplaces table: columns order_id, sequence_num and workplace_id (so with multiple rows specifying the sequence of workplace IDs for an order ID).
In the On startup action of Order, set up the skeleton query code via the Insert Database Query wizard as below (where we want to loop through all rows for this order's ID and do something --- we'll change the skeleton code to add entries to the collection instead of just printing stuff via traceln like the skeleton code does).
Then we edit the skeleton code to look like the below. (Note we add an orderBy clause to the initial query so we ensure we get the rows in ascending sequence number order.)
List<Tuple> rows = selectFrom(order_workplaces)
.where(order_workplaces.order_id.eq(id))
.orderBy(order_workplaces.sequence_num.asc())
.list();
for (Tuple row : rows) {
workplaceIDsList.add(row.get(order_workplaces.workplace_id));
}
1 The AnyLogic database is a normal relational database --- HSQLDB in fact --- and databases only understand their own specific data types like VARCHAR, with AnyLogic and the libraries it uses translating these to Java types like String. In the user interface, AnyLogic makes it look like you set the column types as int, String, etc. but these are really the Java types that the columns' contents will ultimately be translated into.
AnyLogic does support columns which have option list types (and the special Code type column for columns containing executable Java code) but these are special cases using special logic under the covers to translate the column data (which is ultimately still a string of characters) into the appropriate option list instance or (for Code columns) into compiled-on-the-fly-and-then-executed Java).

Welcome to Stack Overflow :) To create a Population via Excel Import you have to create a method and call Code like this. You also need an empty Population.
int n = excelFile.getLastRowNum(YOUR_SHEET_NAME);
for(int i = FIRST_ROW; i <= n; i++){
String name = excelFile.getCellStringValue(YOUR_SHEET_NAME, i, 1);
double SEC_PARAMETER_TO_READ= excelFile.getCellNumericValue(YOUR_SHEET_NAME, i, 2);
WP workplace = add_wps(name, SEC_PARAMETER_TO_READ);
}
Now if you want to get a workplace by name, you have to create a method similar to your try.
Functionbody:
WP workplaceToFind = wps.findFirst(w -> w.name.equals(destinationName));
if(workplaceToFind != null){
//do what ever you want
}

ALTERing a database table inside a test

For a seemingly good reason, I need to alter something in a table that was created from a fixture while a test runs.
Of many things I tried, here's what I have right now. Before I even get to the ALTER query, first I want to make sure I have access to that database where the temporary tables sit.
public function testFailingUpdateValidation()
{
// i will run SQL against a connection, lets get it's exact name
// from one of it's tables; first get the instance of that table
// (the setup was blindly copied and edited from some docs)
$config = TableRegistry::getTableLocator()->exists('External/Products')
? []
: ['className' => 'External/Products'];
$psTable = TableRegistry::getTableLocator()
->get('External/Products', $config);
// get the connection name to work with from the table
$connectionName = $psTable->getConnection()->configName();
// get the connection instance to run the query
$db = \Cake\Datasource\ConnectionManager::get($connectionName);
// run the query
$statement = $db->query('SHOW TABLES;');
$statement->execute();
debug($statement->fetchAll());
}
The output is empty.
########## DEBUG ##########
[]
###########################
I'm expecting it to have at least a name of that table I got the connection name from. Once I have the working connection, I'd run an ALTER query on a specific column.
How do I do that? Please help
Thought I'd clarify what I'm up to, if that's needed
This is a part of an integrated code test. The methods involved are already tested individually. The code inside the method I'm testing takes a bunch of values from one database table and copies them to another, then it immediately validates whether values in the two tables match. I need to simulate a case when they don't.
The best I could come up with is to change the schema of the table column in order to have it store values with low precision, and cause my code test to fail when it tries to match (validate) the values.

The problem was I didn't load the fixture for the table I wanted to ALTER. I know it's kind of obvious, but I'm still confused because then why didn't SHOW TABLES return the other tables that did have their fixtures loaded? It showed me no tables so I thought it was not working at all, and didn't even get to loading that exact fixture. Once I did, it now shows me the table of that fixture, and the others.
Anyway, here's the working code. Hope it helps someone figuring this out.
// get the table instance
$psTable = TableRegistry::getTableLocator()->get('External/Products');
// get the table's corresponding connection name
$connectionName = $psTable->getConnection()->configName();
// get the underlying connection instance by it's name
$connection = \Cake\Datasource\ConnectionManager::get($connectionName);
// get schemas available within connection
$schemaCollection = $connection->getSchemaCollection();
// describe the table schema before ALTERating it
$schemaDescription = $schemaCollection->describe($psTable->getTable());
// make sure the price column is initially DECIMAL
$this->assertEquals('decimal', $schemaDescription->getColumnType('price'));
// keep the initial column data for future reference
$originalColumn = $schemaDescription->getColumn('price');
// ALTER the price column to be TINYINT(2), as opposed to DECIMAL
$sql = sprintf(
'ALTER TABLE %s '.
'CHANGE COLUMN price price TINYINT(2) UNSIGNED NOT NULL',
$psTable->getTable()
);
$statement = $connection->query($sql);
$statement->closeCursor();
// describe the table schema after ALTERations
$schemaDescription = $schemaCollection->describe($psTable->getTable());
// make sure the price column is now TINYINT(2)
$this->assertEquals('tinyinteger', $schemaDescription->getColumnType('price'));
Note: after this test was done, I had to restore the column back to its initial type using the data from $originalColumn above.

SQL - trying to find a set of data which only has a certain set of values but not anything else in a few columns

Hopefully an easy problem for an experienced SQL person. I have an application which uses SQL Server, and I cannot perform this query in the application, so I'm hoping to back-door it, but I need help.
I have a table with a large list of emails and all its metadata. I'm trying to find email that is only between parties of this one company and flag them.
What I did was search where companyName.com is in To and From and marked a TagField as 1 (I did this through my application's front end).
Now what I need to do is search where any other possible values, ignoring companyName.com exist in To and From where I've already flagged them as 1 in TagField. From will usually just have one value, but To could have multiple, all formatted differently, but all separated by a semi-colon (I will probably have to apply this same search to CC and BCC columns, too).
Any thoughts?

Replace the ; with the empty string. Then check to see if the length changed. If there's one email address, there shouldn't be a ';'. You could also use the same technique to replace the company name with the empty string. Anything left would be the other companies.
select email_id, to_email
from yourtable
where TagField = 1 and len(to_email) <> len(replace(to_email,';',''))
This solution is based on the following thread
Number of times a particular character appears in a string

So I went an entirely different route and exported my data to a CSV and used Python to get to where I needed. Here's the code I used in case anybody needs it. What this returned for me was a list of DocIDs (unique identifiers that were in the CSV) where ever there was an email address in the To field that wasn't from one specific domain. I went into the original CSV and made sure all instances of this domain name were in all lowercase, too.
import csv
import tkinter as tk
from tkinter import filedialog
root = tk.Tk()
root.withdraw()
file_path = filedialog.askopenfilename()
sub = "domainname"
def findMultipleTo(dict):
for row in reader:
if row['To'].find(";") != -1:
toArray = row['To'].split(';')
newTo = [s for s in toArray if sub not in s]
row['To'] = newTo
else:
row['To'] = 'empty'
with open('location\\newCSV-BCCFieldSearch.csv', 'a') as f:
if row['To'] != "empty" and row['To'] != []:
print(row['DocID'], row['To'], file = f)
else:
pass
with open(file_path) as csvfile:
reader = csv.DictReader(csvfile)
findMultipleTo(reader)

How to do fetch pages when querying universe database using .net sdk and sql

I am connecting to a universe database (from rocket software) using their .net driver. I would like to fetch data on demand on user request per page i.e. do pagination. With other databases we could use (offset fetch) but universe db doesn't seem to support it. It does not recognize keyword offset, something like
SELECT NAME, AGE FROM CONTACTS WHERE AGE > 25 offset 5 sample 5 does not work. I does not recognize those keywords and there is no good documentation :-(
Note: Although it is traditionally a multi-value database, the one I am using does not use multi value types but the structure is normalized.

This is certainly one of the shortcomings of this platform. I have worked through this in the past with the something similar to the following subroutine. I had to remove a bunch of stuff for brevity but this compiles so it must work completely bug free, right?
Caveats: You need to have #SELECT DICT item in each file you want to use this with containing all of the columns you want to return.
Multivalues get a little tricky. I had flattened the data I was using this with so I did not run into that problem, but this does not do UNNESTs.
Also you might want to add a value saying how many records there are total and possibly work out some kind of token passing and list saving to cut down on executing the query each time you run it but that gets much, much deeper than the basic question at hand.
SUBROUTINE SQLSelectWithOffset(TableName,UVWithClause,Starting,Offset)
***********************************************************************
* PROGRAM ID: SQLSelectWithOffset
*
* PROGRAM TITLE: SQLSelectWithOffset
*
* DESCRIPTION: Universe doesn't support sql commands using starting and offset
* which makes life hard when you want all of a file
* but you choke on the size. Tokens allow for the selectlist to be saved
* TableName = UV FIle to select on. If this is blank program will return the number of records remaining
* UVWithClause = Your critera, WITH or BY criteria you want in a sort select.
* Starting = Holds you place in line
* Offest = How many records to return
************************************************************************
$INCLUDE UNIVERSE.INCLUDE ODBC.H
RETURN.LIST = ""
IF Starting = "" or Starting < 1 THEN
Starting = 1
END
GOSUB GET.MASTER.LIST
FOR X=Starting TO Offset
ID = EXTRACT(FULL.LIST,X,0,0)
IF ID = "" THEN CONTINUE
RETURN.LIST<-1> = ID
NEXT X
SELECT RETURN.LIST TO 9
SQLSTMT ="SELECT * FROM ":TableName:" SLIST 9"
ST=SQLExecDirect(#HSTMT, SQLSTMT)
RETURN
GET.MASTER.LIST:
STMT = "SSELECT ":TableName
IF UVWithClause NE "" THEN
STMT := " ":UVWithClause
END
EXECUTE "CLEARSELECT"
EXECUTE STMT
READLIST FULL.LIST ELSE FULL.LIST = ""
RETURN
END
Good luck, please only use this information for good!

How to Improve Performance and speed

I have written this program for connecting and fetching the data into file, but this program is so slow in fetching . is there is any way to improve the performance and faster way to load the data into the file . iam targeting around 100,000 to million of records so thats why iam worried about performance and also can i use array fetch size and batch size as we can do in java.
import java.sql as sql
import java.lang as lang
def main():
driver, url, user, passwd = ('oracle.jdbc.driver.OracleDriver','jdbc:oracle:thin:#localhost:1521:xe','odi_temp','odi_temp')
##### Register Driver
lang.Class.forName(driver)
##### Create a Connection Object
myCon = sql.DriverManager.getConnection(url, user, passwd)
f = open('c:/test_porgram.txt', 'w')
try:
##### Create a Statement
myStmt = myCon.createStatement()
##### Run a Select Query and get a Result Set
myRs = myStmt.executeQuery("select emp_id ,first_name,last_name,date_of_join from src_sales_12")
##### Loop over the Result Set and print the result in a file
while (myRs.next()):
print >> f , "%s,%s,%s,%s" %(myRs.getString("EMP_ID"),myRs.getString("FIRST_NAME"),myRs.getString("LAST_NAME"),myRs.getString("DATE_OF_JOIN") )
finally:
myCon.close()
f.close()
### Entry Point of the program
if __name__ == '__main__':
main()

Unless you're on the finest, finest gear for the DB and file server, or the worst gear running the script, this application is I/O bound. After the select has returned from the DB, the actual movement of the data will dominate more than any inefficiencies in Jython, Java, or this code.
You CPU is basically unconscious during this process, you're simply not doing enough data transformation. You could write a process that is slower than the I/O, but this isn't one of them.
You could write this in C and I doubt you'd see a substantial difference.

Can't you just use the Oracle command-line SQL client to directly export the results of that query into a CSV file?

You might use getString with hardcoded indices instead of the column name (in your print statement) so the program doesn't have to look up the names over and over. Also, I don't know enough about Jython/Python file output to say whether this is enabled by default or not, but you should try to make sure your output is buffered.
EDIT:
Code requested (I make no claims about the correctness of this code):
print >> f , "%s,%s,%s,%s" %(myRs.getString(0),myRs.getString(1),myRs.getString(2),myRs.getString(3) )
or
myRs = myStmt.executeQuery("select emp_id ,first_name,last_name,date_of_join from src_sales_12")
hasFirst = myRs.next()
if (hasFirst):
empIdIdx = myRs.findColumn("EMP_ID")
fNameIdx = myRs.findColumn("FIRST_NAME")
lNameIdx = myRs.findColumn("LAST_NAME")
dojIdx = myRs.findColumn("DATE_OF_JOIN")
print >> f , "%s,%s,%s,%s" %(myRs.getString(empIdIdx),myRs.getString(fNameIdx),myRs.getString(lNameIdx),myRs.getString(dojIdx) )
##### Loop over the Result Set and print the result in a file
while (myRs.next()):
print >> f , "%s,%s,%s,%s" %(myRs.getString(empIdIdx),myRs.getString(fNameIdx),myRs.getString(lNameIdx),myRs.getString(dojIdx) )

if you just want to fetch data into files ,you can try database tools(for example , "load","export").

You may also find that if you do the construction of the string which goes into the file in the SQL select statement, you will get better performance.
So your SQL select should be SELECT EMP_ID || ',' || FIRST_NAME || ',' || LAST_NAME || ',' || DATE_OF_JOIN MY_DATA ... (depending on what database and separator is)
then in your java code you just get the one string empData = myRs.findColumn("EMP_DATA") and write that to a file. We have seen significant performance benefits doing this.
The other thing you may see benefit from is changing the JDBC connection to use a larger read buffer - rather than 30 rows at a time in the fetch, fetch 5000 rows.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Renaming a Dataset using a list - database

Related

How to find a MoveTo destination filled by database?

ALTERing a database table inside a test

SQL - trying to find a set of data which only has a certain set of values but not anything else in a few columns

How to do fetch pages when querying universe database using .net sdk and sql

How to Improve Performance and speed

Categories

Resources