Snowflake- creating external table by using pattern - snowflake-cloud-data-platform

I am trying to create the external table(xyz) in snowflake by using pattern to load historical file from stage, there are multiple files and using following pattern to load the file name started with below
201802242300_5d80272d1abcd32cc7a981da083ed498.gz. ( Feb 24th 2018 file)
Create external table xyz
(
samplecol1 varchar as (value:samplecol1::varchar),
samplecol2 varchar as (value:samplecol2::varchar),
date as to_date(substr(metadata$filename,1,8),yyyymmdd)
)
partition by (date)
location = #snowflakestage.largetable
pattern='.*/20180224.*[_].*.gz'
file_format = (type = 'JSON');
it's executing successfully but not loading any data. Is my pattern right to pick the file name listed above?

A good way to test patterns is via the LIST command as it takes the same PATTERN option.
thus for you:
LIST #snowflakestage.largetable pattern='.*/20180224.*[_].*.gz'
for example using the CitiBike example data, there are not parque files
so if you try load all files, you get errors.
create stage citibike.public.citibike_trips
url = 's3://snowflake-workshop-lab/citibike-trips';
list #citibike_trips;
name
s3://snowflake-workshop-lab/citibike-trips-parquet/2022/01/08/data_01a19496-0601-8b21-003d-9b03003c624a_3106_4_0.snappy.parquet
s3://snowflake-workshop-lab/citibike-trips-parquet/2022/01/09/data_01a19496-0601-8b21-003d-9b03003c624a_1906_6_0.snappy.parquet
s3://snowflake-workshop-lab/citibike-trips-parquet/2022/01/10/data_01a19496-0601-8b21-003d-9b03003c624a_2206_6_0.snappy.parquet
s3://snowflake-workshop-lab/citibike-trips/json/2013-06-01/data_01a304b5-0601-4bbe-0045-e8030021523e_005_7_0.json.gz
s3://snowflake-workshop-lab/citibike-trips/json/2013-06-01/data_01a304b5-0601-4bbe-0045-e8030021523e_005_7_1.json.gz
s3://snowflake-workshop-lab/citibike-trips/json/2013-06-01/data_01a304b5-0601-4bbe-0045-e8030021523e_005_7_2.json.gz
So I played around till I found this pattern worked for the files I wanted.
list #citibike_trips pattern = '.*trips_.*csv.gz';

Related

Loading data in Snowflake using bind variables

We're building dynamic data loading statements for Snowflake using the Python interface.
We want to create a stage at query runtime, and use that stage in a subsequent statement. Table and stage names are dynamic using bind variable.
Yet, it doens't seem like we can find the correct syntax as we tried everything on https://docs.snowflake.com/en/user-guide/python-connector-api.html
COPY INTO IDENTIFIER( %(table_name)s )(SRC, LOAD_TIME, ROW_HASH)
FROM (SELECT t.$1, CURRENT_TIMESTAMP(0), MD5(t.$1) FROM "'%(stage_name)s'" t)
PURGE = TRUE;
Is this even possible? Does it work for anyone?
Your code does not create stage as you mentioned, and you don't need create a stage, instead use table stage or user stage. The SQL below uses table stage.
You also need to change your syntax a little and use more pythonic way : f-strings
sql = f"""COPY INTO {table_name} (SRC, LOAD_TIME, ROW_HASH)
FROM (SELECT t.$1, CURRENT_TIMESTAMP(0), MD5(t.$1) FROM #%{table_name} t)
PURGE = TRUE"""

How to find a MoveTo destination filled by database?

I could need some help with a Anylogic Model.
Model (short): Manufacturing scenario with orders move in a individual route. The workplaces (WP) are dynamical created by simulation start. Their names, quantity and other parameters are stored in a database (excel Import). Also the orders are created according to an import. The Agent population "order" has a collection routing which contains the Workplaces it has to stop in the specific order.
Target: I want a moveTo block in main which finds the next destination of the agent order.
Problem and solution paths:
I set the destination Type to agent and in the Agent field I typed a function agent.getDestination(). This function is in order which returns the next entry of the collection WP destinationName = routing.get(i). With this I get a Datatype error (while run not compiling). I quess it's because the database does not save the entrys as WP Type but only String.
Is there a possiblity to create a collection with agents from an Excel?
After this I tried to use the same getDestination as String an so find via findFirst the WP matching the returned name and return it as WP. WP targetWP = findFirst(wps, w->w.name == destinationName);
Of corse wps (the population of Workplaces) couldn't be found.
How can I search the population?
Maybe with an Agentlink?
I think it is not that difficult but can't find an answer or a solution. As you can tell I'm a beginner... Hope the description is good an someone can help me or give me a hint :)
Thanks
Is there a possiblity to create a collection with agents from an Excel?
Not directly using the collection's properties and, as you've seen, you can't have database (DB) column types which are agent types.1
But this is relatively simple to do directly via Java code (and you can use the Insert Database Query wizard to construct the skeleton code for you).
After this I tried to use the same getDestination as String an so find via findFirst the WP matching the returned name and return it as WP
Yes, this is one approach. If your order details are in Excel/the database, they are presumably referring to workplaces via some String ID (which will be a parameter of the workplace agents you've created from a separate Excel worksheet/database table). You need to use the Java equals method to compare strings though, not == (which is for comparing numbers or whether two objects are the same object).
I want a moveTo block in main which finds the next destination of the agent order
So the general overall solution is
Create a population of Workplace agents (let's say called workplaces in Main) from the DB, each with a String parameter id or similar mapped from a DB column.
Create a population of Order agents (let's say called orders in Main) from the DB and then, in their on-startup action, set up their collection of workplace IDs (type ArrayList, element class String; let's say called workplaceIDsList) using data from another DB table.
Order probably also needs a working variable storing the next index in the list that it needs to go to (so let's say an int variable nextWorkplaceIndex which starts at 0).
Write a function in Main called getWorkplaceByID that has a single String argument id and returns a Workplace. This gets the workplace from the population that matches the ID; a one-line way similar to yours is findFirst(workplaces, w -> w.id.equals(id)).
The MoveTo block (which I presume is in Main) needs to move the Order to an agent defined by getWorkplaceByID(agent.workplaceIDsList.get(nextWorkplaceIndex++)). (The ++ bit increments the index after evaluating the expression so it is ready for the next workplace to go to.)
For populating the collection, you'd have two tables, something like the below (assuming using strings as IDs for workplaces and orders):
orders table: columns for parameters of your orders (including some String id column) other than the workplace-list. (Create one Order agent per row.)
order_workplaces table: columns order_id, sequence_num and workplace_id (so with multiple rows specifying the sequence of workplace IDs for an order ID).
In the On startup action of Order, set up the skeleton query code via the Insert Database Query wizard as below (where we want to loop through all rows for this order's ID and do something --- we'll change the skeleton code to add entries to the collection instead of just printing stuff via traceln like the skeleton code does).
Then we edit the skeleton code to look like the below. (Note we add an orderBy clause to the initial query so we ensure we get the rows in ascending sequence number order.)
List<Tuple> rows = selectFrom(order_workplaces)
.where(order_workplaces.order_id.eq(id))
.orderBy(order_workplaces.sequence_num.asc())
.list();
for (Tuple row : rows) {
workplaceIDsList.add(row.get(order_workplaces.workplace_id));
}
1 The AnyLogic database is a normal relational database --- HSQLDB in fact --- and databases only understand their own specific data types like VARCHAR, with AnyLogic and the libraries it uses translating these to Java types like String. In the user interface, AnyLogic makes it look like you set the column types as int, String, etc. but these are really the Java types that the columns' contents will ultimately be translated into.
AnyLogic does support columns which have option list types (and the special Code type column for columns containing executable Java code) but these are special cases using special logic under the covers to translate the column data (which is ultimately still a string of characters) into the appropriate option list instance or (for Code columns) into compiled-on-the-fly-and-then-executed Java).
Welcome to Stack Overflow :) To create a Population via Excel Import you have to create a method and call Code like this. You also need an empty Population.
int n = excelFile.getLastRowNum(YOUR_SHEET_NAME);
for(int i = FIRST_ROW; i <= n; i++){
String name = excelFile.getCellStringValue(YOUR_SHEET_NAME, i, 1);
double SEC_PARAMETER_TO_READ= excelFile.getCellNumericValue(YOUR_SHEET_NAME, i, 2);
WP workplace = add_wps(name, SEC_PARAMETER_TO_READ);
}
Now if you want to get a workplace by name, you have to create a method similar to your try.
Functionbody:
WP workplaceToFind = wps.findFirst(w -> w.name.equals(destinationName));
if(workplaceToFind != null){
//do what ever you want
}

Syncronizing data between two django servers

I have a central Django server containing all of my information in a database. I want to have a second Django server that contains a subset of that information in a second database. I need a bulletproof way to selectively sync data between the two.
The secondary Django will need to pull its subset of data from the primary at certain times. The subset will have to be filtered by certain fields.
The secondary Django will have to occasionally push its data to the primary.
Ideally, the two-way sync would keep the most recently modified objects for each model.
I was thinking something along the lines of having using TimeStampedModel (from django-extensions) or adding my own DateTimeField(auto_now=True) so that every object stores its last modified time. Then, maybe a mechanism to dump the data from one DB and load it in to the other such that only the more recently modified objects are kept.
Possibilities I am considering are django's dumpdata, django-extensions dumpscript, django-test-utils makefixture or maybe django-fixture magic. There's a lot to think about, so I'm not sure which road to proceed down.
Here is my solution, which fits all of my requirements:
Implement natural keys and unique constraints on all models
Allows for a unique way to refer to each object without using primary key IDs
Sublcass each model from TimeStampedModel in django-extensions
Adds automatically updated created and modified fields
Create a Django management command for exporting, which filters a subset of data and serializes it with natural keys
baz = Baz.objects.filter(foo=bar)
yaz = Yaz.objects.filter(foo=bar)
objects = [baz, yaz]
flat_objects = list(itertools.chain.from_iterable(objects))
data = serializers.serialize("json", flat_objects, indent=3, use_natural_keys=True)
print(data)
Create a Django management command for importing, which reads in the serialized file and iterates through the objects as follows:
If the object does not exist in the database (by natural key), create it
If the object exists, check the modified timestamps
If the imported object is newer, update the fields
If the imported object is older, do not update (but print a warning)
Code sample:
# Open the file
with open(args[0]) as data_file:
json_str = data_file.read()
# Deserialize and iterate
for obj in serializers.deserialize("json", json_str, indent=3, use_natural_keys=True):
# Get model info
model_class = obj.object.__class__
natural_key = obj.object.natural_key()
manager = model_class._default_manager
# Delete PK value
obj.object.pk = None
try:
# Get the existing object
existing_obj = model_class.objects.get_by_natural_key(*natural_key)
# Check the timestamps
date_existing = existing_obj.modified
date_imported = obj.object.modified
if date_imported > date_existing:
# Update fields
for field in obj.object._meta.fields:
if field.editable and not field.primary_key:
imported_val = getattr(obj.object, field.name)
existing_val = getattr(existing_obj, field.name)
if existing_val != imported_val:
setattr(existing_obj, field.name, imported_val)
except ObjectDoesNotExist:
obj.save()
The workflow for this is to first call python manage.py exportTool > data.json, then on another django instance (or the same), call python manage.py importTool data.json.

Searching through a document using DXL

A friend of mine with little experience in (Telelogic Doors) DXL was given a
problem to search through a document for objects with possible matches of string.
The problem was :
We have 2 attributes: Severity and Likelihood
Please see the table below for their values:
Edit added (Sample):
A sample document looks as follows
2) So now if I have a combination like Severity = Negligible AND Likelihood = Improbable, then I want to parse through the document and then try to find all the objects that have these values and display the total number of objects.
3) Then I move to the next combination ex: Severity = Minor Injury AND Likelihood = Unlikely and then display the total objects for this combination.
4) So now I go through all the 25 combinations and display the total for each combination.
Trouble is I have no experience of DXL .
I know how to do it in C/C++ but not in DXL.
Need a DXL based solution to above.
Must you do this in DXL? It may be much easier to do this another way. For example, depending on how the documents are structured, you might be able to create a view categorized by severity and likelihood, and then present totals for each category.
Or you could export the data and calculate the totals easily using a spreadsheet.
UPDATE:
DXL is merely a XML format that applies to Domino. So once you have a database in DXL format, you can parse it like any other XML document using C/C++ if you're most comfortable with that. The key to this task, then, is to get the database into a DXL format.
With the Lotus Notes C/C++ API you can create a DXLExport from a NotesSession object, and call into the DXLExporter class to perform the export (excuse me if I'm messing up the object names - I'm used to LotusScript mainly).
Another option that could work for you is to use this DXLExporter Wizard for Domino 8.5. That will take the work out of creating the DXL and you can focus on parsing it instead.
This has nothing to do with Domino DXL, but everything with Telelogic Doors eXtension Language. The documentation: http://publib.boulder.ibm.com/infocenter/rsdp/v1r0m0/index.jsp?topic=/com.ibm.help.download.doors.doc/topics/doors_version9_1.html
Suggestion: remove the lotus-notes tag.
Amitd, the most straight-forward path is to create a view with object heading, object text, severity, likelihood, and any other relevant attributes; then perfom the basic export to Excel. Once in Excel, we'll manipulate the data as required.
Open the exported document, and sort by Severity, then Likelihood. Create the aggregate calculations using the Excel built-in functions: COUNTIF, and Data > GROUP and Data > SUBTOTAL options. You may then sort on the aggregate totals, or filter on the attributes for different combinations.
FYI - DXL is DOORS eXtension Language--nothing to do with Domino.
I know I am a bit late to the party here but what you are looking for is this:
First save your input file as a csv.
Module m = current
Object o
Stream infile = read("PathToYourCsvFile")
string inline = ""
infile >> inline
while(!end(infile))
{
string Severity = ""
string Likelihood = ""
// Here do some code to get the values from the line in the csv
// If you still are interested I can add this in with an update later.
for o in m do
{
if((o."Severity" "" == Severity) && (o."Likelihood" "" == Likelihood))
{
count++;
}
}
infoBox "Severity = " Severity " and Likelihood = " Likelihood " MATCHES: " count ""
}
This will pop up a box after each line is calculated with the number of matches found. You can easily have it just popup one box at the end with all of the matches if you wanted. Again if you are still interested, give me some more info on what you want for inputs and outputs and I can give you more accurate code.

Extracting SQLite database into text fields from specific row

I am trying to connect to a SQLite database and have a method that specifies a specific row from the database (the first column in the database is “ID” and is a primary key) then extract the information from a few other columns in that row and display them in text fields.
This will be used for a simple Trivia game I am making; I will later make a random method that will choose the row at random.
I have been struggling with this problem for several weeks and I have been through loads of tutorials but all of them deal with displaying the data in a table view, I want to display it simply on text fields in a View based app. I am fairly confused at this point so any help starting from loading the database to displaying the data in the text fields would be GREATLY APPRECIATED!
Thanks!
Link to libsqlite3.dylib (and import <sqlite3.h>) to access the power of SQLite. There are a number of lightweight Objective-C front ends and I suggest you pick one. In this example, I use fmdb (https://github.com/ccgus/fmdb) to read the names of people out of a previously created database:
NSString* docsdir = [NSSearchPathForDirectoriesInDomains(
NSDocumentDirectory, NSUserDomainMask, YES) lastObject];
NSString* dbpath = [docsdir stringByAppendingPathComponent:#"people.db"];
FMDatabase* db = [FMDatabase databaseWithPath:dbpath];
if (![db open]) {
NSLog(#"Ooops");
return;
}
FMResultSet *rs = [db executeQuery:#"select * from people"];
while ([rs next]) {
NSLog(#"%# %#",
[rs stringForColumn:#"firstname"],
[rs stringForColumn:#"lastname"]);
}
[db close];
/* output:
Snidely Whiplash
Dudley Doright
*/
That illustrates talking to the database; knowing SQL is up to you (and is a different topic). You can include a previously constructed SQLite file in your app bundle, but you can't write to it there; the solution is to copy it from your app bundle into another location, such as the Documents directory, before you start working with it.
Finally, to put strings into text fields (UITextField), set their text property. So for example instead of the while loop shown above, where I log the database results, I could use those results to set text field values:
myTextField.text = [rs stringForColumn:#"firstname"];
myOtherTextField.text = [rs stringForColumn:#"lastname"];

Resources