python-select specific excel files from folder to merge - xlsx

I'm new to Python and would like some help with the code I am working on. I have a folder from which I only want specific files, for example:
report1-AM.xlsx
report1-PM.xlsx
report1-all day.xlsx
report2-AM.xlsx
report2-PM.xlsx
report2-all day.xlsx
and so on, about 45 reports in total.
I want to pick up only the reports named report(#)-all day.xlsx.
This is what I am working with:
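import pandas as pd

df = pd.DataFrame()
for file in files:
    if file.endswith('.xlsx'):
        # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
        df = pd.concat([df, pd.read_excel(file)], ignore_index=True)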
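One way to keep only the "all day" workbooks is to filter on the file name before reading anything. This is a minimal sketch, assuming the names really follow the report(#)-all day.xlsx shape from the listing above. The helper names select_all_day and merge_all_day are made up for illustration, and reading .xlsx files with pandas assumes an Excel engine such as openpyxl is installed.

```python
import re
from pathlib import Path

# Matches names such as "report1-all day.xlsx" or "report12-all day.xlsx".
ALL_DAY = re.compile(r"^report\d+-all day\.xlsx$", re.IGNORECASE)


def select_all_day(names):
    """Return only the names that look like report<N>-all day.xlsx."""
    return [name for name in names if ALL_DAY.match(Path(name).name)]


def merge_all_day(folder):
    """Read every matching workbook in `folder` and stack the rows."""
    import pandas as pd  # imported here so the filter above works without pandas

    paths = sorted(str(p) for p in Path(folder).iterdir())
    frames = [pd.read_excel(p) for p in select_all_day(paths)]
    # pd.concat raises ValueError if no file matched and the list is empty.
    return pd.concat(frames, ignore_index=True)


names = ["report1-AM.xlsx", "report1-PM.xlsx", "report1-all day.xlsx",
         "report2-AM.xlsx", "report2-all day.xlsx"]
print(select_all_day(names))
```

If the real names vary more than this (different case, extra spaces), loosening the regular expression is probably easier than chaining several endswith checks.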

Snowflake- creating external table by using pattern

I am trying to create an external table (xyz) in Snowflake, using a pattern to load historical files from a stage. There are multiple files, and I am using the pattern below to load files whose names start like this one:
201802242300_5d80272d1abcd32cc7a981da083ed498.gz (the Feb 24th 2018 file)
Create external table xyz
(
    samplecol1 varchar as (value:samplecol1::varchar),
    samplecol2 varchar as (value:samplecol2::varchar),
    date as to_date(substr(metadata$filename,1,8),yyyymmdd)
)
partition by (date)
location = @snowflakestage.largetable
pattern='.*/20180224.*[_].*.gz'
file_format = (type = 'JSON');
It executes successfully but does not load any data. Is my pattern right to pick up the file name listed above?
A good way to test patterns is via the LIST command as it takes the same PATTERN option.
So for you:
LIST @snowflakestage.largetable pattern='.*/20180224.*[_].*.gz'
For example, the CitiBike sample data stage contains Parquet files alongside the JSON files, so if you try to load all files, you get errors.
create stage citibike.public.citibike_trips
url = 's3://snowflake-workshop-lab/citibike-trips';
list @citibike_trips;
name
s3://snowflake-workshop-lab/citibike-trips-parquet/2022/01/08/data_01a19496-0601-8b21-003d-9b03003c624a_3106_4_0.snappy.parquet
s3://snowflake-workshop-lab/citibike-trips-parquet/2022/01/09/data_01a19496-0601-8b21-003d-9b03003c624a_1906_6_0.snappy.parquet
s3://snowflake-workshop-lab/citibike-trips-parquet/2022/01/10/data_01a19496-0601-8b21-003d-9b03003c624a_2206_6_0.snappy.parquet
s3://snowflake-workshop-lab/citibike-trips/json/2013-06-01/data_01a304b5-0601-4bbe-0045-e8030021523e_005_7_0.json.gz
s3://snowflake-workshop-lab/citibike-trips/json/2013-06-01/data_01a304b5-0601-4bbe-0045-e8030021523e_005_7_1.json.gz
s3://snowflake-workshop-lab/citibike-trips/json/2013-06-01/data_01a304b5-0601-4bbe-0045-e8030021523e_005_7_2.json.gz
So I played around until I found a pattern that worked for the files I wanted.
list @citibike_trips pattern = '.*trips_.*csv.gz';
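The pattern from the question can also be sanity-checked offline before running LIST. This is a rough sketch, assuming the pattern has to match the whole file path (as far as I know that is how Snowflake applies it) and that Python's re module is close enough to Snowflake's regex flavor for a pattern this simple. The "history/" prefix is a hypothetical directory, not something from the question.

```python
import re

# The pattern from the question, unchanged.
PATTERN = r".*/20180224.*[_].*.gz"


def matches(pattern, path):
    """True if the pattern matches the entire path, not just a part of it."""
    return re.fullmatch(pattern, path) is not None


candidates = [
    # File sitting directly at the stage location: no "/" in the relative path.
    "201802242300_5d80272d1abcd32cc7a981da083ed498.gz",
    # Same file under a hypothetical sub-directory.
    "history/201802242300_5d80272d1abcd32cc7a981da083ed498.gz",
]

for path in candidates:
    print(matches(PATTERN, path), path)
```

In this sketch the leading '.*/' only matches when there is a "/" somewhere before the date, so whether the real pattern matches depends on what the paths printed by LIST actually look like.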

How to turn tensorflow record into a variable?

Let's say I have converted 1000 JPEG images of dogs into a TensorFlow record file (Python version, not C++).
Let's say this TensorFlow record file has the following path:
path = "/Users/Bill/Desktop/work/tensorflowproject1"
data = path + '/train.tfrecords'
Now the file path to this TFRecord file is stored in the string "data".
This is where it gets sort of tricky:
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
This is what the sample on the Google website runs for the "mnist" example they provide.
But how would I get a variable that represents the training set directly, NOT just the file path?
I want a dataset within TensorFlow on which I can then run something like this:
train_df = df.sample(frac = .6, random_state = 0).sort_index()
test_df = df.drop(train_df.index)
to split up the data. I have done this a lot with arrays or dataframes, but never with images or a TensorFlow record file.
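A 60/40 split like the pandas one above can be sketched with the tf.data API. This is only an outline under some assumptions: a TensorFlow version that provides tf.data.TFRecordDataset, a record count known up front (1000, from the question), and the helper names split_sizes and load_and_split, which are made up here. The records come back as serialized strings, so decoding them into images still needs the feature names that were used when the file was written, and those are not given in the question.

```python
def split_sizes(n_records, train_frac=0.6):
    """Return (n_train, n_test) for the requested training fraction."""
    n_train = round(n_records * train_frac)
    return n_train, n_records - n_train


def load_and_split(tfrecord_path, n_records, train_frac=0.6, seed=0):
    """Return (train, test) datasets of serialized records from one file."""
    import tensorflow as tf  # imported lazily; only needed for this function

    n_train, _ = split_sizes(n_records, train_frac)
    records = tf.data.TFRecordDataset(tfrecord_path)
    # Shuffle once with a fixed seed. reshuffle_each_iteration=False keeps the
    # order stable, so take() and skip() see the same order and do not overlap.
    records = records.shuffle(n_records, seed=seed,
                              reshuffle_each_iteration=False)
    return records.take(n_train), records.skip(n_train)


print(split_sizes(1000))
```

With the path from the question, the call would look like train, test = load_and_split(data, 1000).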

Creating a Neo4j Graph Database Using LOAD CSV

I have a CSV file containing data that I want to convert into a graph database using Neo4j. The columns in the file are in the following format:
Person1 | Person2 | Points
The ids in Person1 and Person2 are repeated, so I am using a MERGE statement instead. But I am not getting the correct results.
For a sample dataset the output seems to be correct, but when I import my dataset consisting of 2M rows, it somehow doesn't create the relationships.
Here is the Cypher code that I am currently using:
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:C:/Users/yogi/Documents/Neo4j/default.graphdb/sample.csv" AS csvline
MERGE (p1:Person {id:toInt(csvline.id1)})
MERGE (p2:Person {id:toInt(csvline.id2)})
CREATE (p1)-[:points{count:toInt(csvline.c)}]->(p2)
Some things you should check:
Are you using an index? CREATE INDEX ON :Person(id) should be run before the import.
Depending on the Neo4j version you're using, the statement might be subject to an "eager pipe", which basically prevents the periodic commit. For more on the eager pipe, see http://www.markhneedham.com/blog/2014/10/23/neo4j-cypher-avoiding-the-eager/

Neo4j Failed to load csv from local disk (windows 7)

I am new to Neo4j and have been trying to load a CSV file from my local disk, but without success.
LOAD CSV WITH HEADERS FROM "file:C:/Neo4j/Persons.csv" AS csvLine
CREATE (p:Person { id: toInt(csvLine.id), name: csvLine.name })
I am getting the following response and error
Couldn't load the external resource at: file:C:/Neo4j/Persons.csv
Neo.TransientError.Statement.ExternalResourceFailure
Can you verify that the file is in the C:/Neo4j/ directory?
This may not be the perfect solution, but it might work for you.
You should try this one:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS
FROM "file:C:/Neo4j/Persons.csv" AS csvLine
CREATE (:PERSONS {id:csvLine.id, name:csvLine.name})
But pay attention to the header of your Persons.csv file.
Suppose your file has this header:
id | name |
Then you must add this clause to the Cypher statement, before the CREATE clause:
FIELDTERMINATOR "|"
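To see why the field terminator matters, it can help to look at how the header row splits under each delimiter. This is a small sketch in Python rather than Cypher. The sample text is made-up file content modeled on the header above, and header_fields is a hypothetical helper, not part of Neo4j.

```python
import csv
import io


def header_fields(text, delimiter=","):
    """Return the stripped, non-empty header names from the first CSV row."""
    first_row = next(csv.reader(io.StringIO(text), delimiter=delimiter))
    return [name.strip() for name in first_row if name.strip()]


# Hypothetical file contents using "|" as the separator.
sample = "id | name |\n1 | Alice |\n"

print(header_fields(sample, delimiter=","))  # whole header read as one field
print(header_fields(sample, delimiter="|"))  # header split into two fields
```

With the default comma the whole header is read as a single column, which would explain properties such as csvLine.id coming back empty when the terminator does not match the file.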

Extracting SQLite database into text fields from specific row

I am trying to connect to a SQLite database and have a method that specifies a specific row from the database (the first column in the database is “ID” and is a primary key) then extract the information from a few other columns in that row and display them in text fields.
This will be used for a simple Trivia game I am making; I will later make a random method that will choose the row at random.
I have been struggling with this problem for several weeks and I have been through loads of tutorials, but all of them deal with displaying the data in a table view. I want to display it simply in text fields in a View-based app. I am fairly confused at this point, so any help, starting from loading the database to displaying the data in the text fields, would be GREATLY APPRECIATED!
Thanks!
Link to libsqlite3.dylib (and import <sqlite3.h>) to access the power of SQLite. There are a number of lightweight Objective-C front ends and I suggest you pick one. In this example, I use fmdb (https://github.com/ccgus/fmdb) to read the names of people out of a previously created database:
NSString* docsdir = [NSSearchPathForDirectoriesInDomains(
    NSDocumentDirectory, NSUserDomainMask, YES) lastObject];
NSString* dbpath = [docsdir stringByAppendingPathComponent:@"people.db"];
FMDatabase* db = [FMDatabase databaseWithPath:dbpath];
if (![db open]) {
    NSLog(@"Ooops");
    return;
}
FMResultSet *rs = [db executeQuery:@"select * from people"];
while ([rs next]) {
    NSLog(@"%@ %@",
        [rs stringForColumn:@"firstname"],
        [rs stringForColumn:@"lastname"]);
}
[db close];
/* output:
Snidely Whiplash
Dudley Doright
*/
That illustrates talking to the database; knowing SQL is up to you (and is a different topic). You can include a previously constructed SQLite file in your app bundle, but you can't write to it there; the solution is to copy it from your app bundle into another location, such as the Documents directory, before you start working with it.
Finally, to put strings into text fields (UITextField), set their text property. So for example instead of the while loop shown above, where I log the database results, I could use those results to set text field values:
myTextField.text = [rs stringForColumn:@"firstname"];
myOtherTextField.text = [rs stringForColumn:@"lastname"];
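The SQL side of "pick one row by its ID, or one at random" is the same whatever the front end. Here is a minimal sketch using Python's built-in sqlite3 module instead of Objective-C, just to show the queries. The trivia table, its question and answer columns, and the two helper functions are assumptions made for illustration, not taken from the question.

```python
import sqlite3


def fetch_row(conn, row_id):
    """Return (question, answer) for the given primary key, or None."""
    cur = conn.execute(
        "SELECT question, answer FROM trivia WHERE id = ?", (row_id,))
    return cur.fetchone()


def fetch_random_row(conn):
    """Return (id, question, answer) for one randomly chosen row."""
    cur = conn.execute(
        "SELECT id, question, answer FROM trivia ORDER BY RANDOM() LIMIT 1")
    return cur.fetchone()


# Throwaway in-memory database with a hypothetical trivia table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trivia (id INTEGER PRIMARY KEY, question TEXT, answer TEXT)")
conn.executemany(
    "INSERT INTO trivia (id, question, answer) VALUES (?, ?, ?)",
    [(1, "Capital of France?", "Paris"), (2, "2 + 2?", "4")])

print(fetch_row(conn, 2))
```

The same two SELECT statements should work as the query string passed to executeQuery: in the fmdb example above, with the column values then assigned to the text fields.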
