Compiling a multi-sheet Excel spreadsheet into Python with Pycel

TL;DR - I'm trying to compile this multi-sheet spreadsheet from Pycel's tests/fixtures directory into Python code using Pycel (I'm particularly interested in capturing inter-sheet dependencies).
The Problem
The helpful example in example/example.py in the Pycel repository runs perfectly and produces a handy .gexf graph. However, I can't work out how to compile a multi-sheet spreadsheet using Pycel (in my use case there are terrifying inter-sheet dependencies!).
@Stephen Rauch mentioned in this Stack Overflow discussion that "Pycel does not really deal with sheets. It works with cells. Those cells can come from any sheet in the workbook". However, I couldn't find any detail on this in the Pycel repo or in prior questions.
What I tried
I tried compiling the spreadsheet included in tests/fixtures/excelcompiler.xlsx, as it is multi-sheet. It only seems to compile Sheet1: the resulting plot is empty, and Sheet1 is empty.
import pycel
excel = pycel.ExcelCompiler('excelcompiler.xlsx')
excel.plot_graph()
Is it possible to use Pycel to compile dependencies between sheets?
I'm currently trying to compile this Excel spreadsheet tool [1] into Python for modelling purposes [2].
[1] For more details see The Irish National Building Energy Rating Assessment
[2] Running Dublin-wide simulations of the energy/carbon impacts of different building-policy decisions

TL;DR
Pycel compiles multi-sheet Excel spreadsheets easily. pycel.ExcelCompiler is lazy, so it doesn't actually compile Excel into Python until explicitly asked to evaluate some cell, e.g. via excel.evaluate('Result!E16').
In detail
I tried:
import pycel
excel = pycel.ExcelCompiler(filename='deap.xlsx')
# for viewing in gephi (quicker runtime)
excel.export_to_gexf('deap.gexf')
# or via matplotlib
excel.plot_graph()
And my result was an empty graph.
I then tried (as in the example at pycel):
import pycel
excel = pycel.ExcelCompiler(filename='deap.xlsx')
# evaluating a cell is what triggers compilation of it and its precedents
print(f"TotalDeliveredEnergy is {excel.evaluate('Result!E16')}")
# for viewing in gephi (quicker runtime)
excel.export_to_gexf('deap.gexf')
# or via matplotlib
excel.plot_graph()
And this worked!
In other words, my issue was that I was trying to plot the Pycel dependency graph before any cell had been evaluated - Pycel is lazy and so only compiles Excel to Python when called upon with excel.evaluate.
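To illustrate why the graph stays empty until a cell is evaluated, here is a toy model of lazy compilation. It is not Pycel's implementation: the LazyWorkbook class, its cells layout and its graph attribute are hypothetical and only mimic the behaviour described above, using made-up cell addresses on two sheets.

```python
# Toy model of lazy compilation; NOT Pycel's real implementation.
class LazyWorkbook:
    def __init__(self, cells):
        # cells maps "Sheet!A1" to a constant, or to (function, [precedent addresses])
        self.cells = cells
        # address -> list of precedents; stays empty until evaluate() is called
        self.graph = {}

    def evaluate(self, address):
        cell = self.cells[address]
        if not isinstance(cell, tuple):
            # constant cell: record it as a node with no precedents
            self.graph.setdefault(address, [])
            return cell
        func, precedents = cell
        # record the dependency edges, then evaluate the precedents recursively
        self.graph[address] = list(precedents)
        return func(*(self.evaluate(p) for p in precedents))


workbook = LazyWorkbook({
    "Inputs!B2": 120.0,
    "Inputs!B3": 30.0,
    "Result!E16": (lambda a, b: a + b, ["Inputs!B2", "Inputs!B3"]),
})

edges_before = dict(workbook.graph)       # nothing has been compiled yet
total = workbook.evaluate("Result!E16")   # pulls in cells from the other sheet
edges_after = dict(workbook.graph)
```

Only the cells reachable from the evaluated address end up in the graph, which is why evaluating a final output cell is a convenient way to capture the inter-sheet dependencies that feed it.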

Related

Can I improve excel to sql powerquery performance? I'm using named range parameters instead of cell values?

This is my first post so I hope I get my question across OK.
I have designed a product configurator app in excel. Using a series of drop down menus and user forms I can build a basic front elevation of what my product will look like whilst also deriving a list of sub assembly names that will make up the product. This sub assembly list can be viewed on a hidden worksheet. There could be over 200 of these required to make up the product.
(e.g. My product could be made up of up to 16 Sections. Each Section could have 15 different sub assemblies. So my full product could have a combination of ~240 different sub assemblies, depending on the specific requirements.)
I have named cells on the sub assembly worksheet where the sub assembly is stored. This sub assembly name is dynamic depending on selections in the configurator.
(e.g.
Cell name - ARE_Section01
Dynamic sub assembly name in cell -
AU_ARE_36_36_GA11_AL3_X_13_3__ANSI_61_
or
AU_ARE_22_36_GA11_GA11_R_13_3__ANSI_61_
or
AU_ARE_22_36_GA11_GA11_X_13_3__ANSI_61_
or
etc.)
In order to build a top level bill of materials I am pulling information from an SQL Server where there is a table for every possible sub assembly (over 700 of them). Using PowerQuery, I have a query for each of the 15 sub assemblies mentioned above for each of the 16 sections. In these queries I am using a parameter that looks at the relevant named cell value rather than a hard coded value. This means I will only have 240 queries rather than 700+.
(e.g.
let
Source = Sql.Database("x_BJMCC\SQLEXPRESS", "AMX"),
dbo_AU_ARE_22_36_GA11_GA11_R_13_3__ANSI_61_ = Source{[Schema="dbo",Item=GetValue("ARE_Section01")]}[Data]
in
dbo_AU_ARE_22_36_GA11_GA11_R_13_3__ANSI_61_
)
I then have a query for each of the 16 Sections which appends all the relevant sub assemblies. I then have a top level Panel query that appends all Section queries.
I hope I have explained this properly so far.....
My problem is this...
When I hard code the assembly names into the queries the information gets pulled very quickly from SQL. But when I use the parameters (as above) it's a lot slower.
Has anyone got any tips on how I can improve performance?
I guess the problem is with your GetValue function, which probably breaks query folding. For example, the fnGetParameter function by Ken Puls doesn't break query folding.

pgadmin4 - Download Query result as CSV

I wrote a query using the query tool in pgadmin 4. Now I want to download the results as a csv. I've got two problems with that.
The 'Download as CSV'-button does not work sometimes. Especially when the result contains 1000+ rows.
When I finally have a csv and I want to open it, this message is all I see:
"'ascii' codec can't encode character u'\xbb' in position 26: ordinal not in range(128)"
Since I'm fairly new to all of this, could someone enlighten me to what is wrong?
On your questions:
The broken CSV download was a known bug that was fixed in pgAdmin v1.5 (Bug summary at the login-required https://redmine.postgresql.org/issues/2253; the gist is that there were multiple issues with exporting JSON data and Unicode). If you're not on that version, try updating and see whether you continue to have the issue.
You didn't specify where you're seeing that message regarding encoding, but the character referenced in the error is a "Right-Pointing Double Angle Quotation Mark" (») (http://www.codetable.net/hex/bb).
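The error itself is easy to reproduce. The sketch below is only an illustration of the message, not of what pgAdmin does internally: the sample string is made up, and the point is simply that the ASCII codec cannot represent » while UTF-8 can.

```python
# Hypothetical sample text containing the character from the error message (U+00BB).
text = "Home \u00bb Products"

try:
    text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    # ASCII only covers code points 0-127, so U+00BB cannot be encoded
    ascii_error = exc

# UTF-8 encodes the same character as the two bytes C2 BB
utf8_bytes = text.encode("utf-8")
```

If the message comes from a script or tool that reads the CSV, opening or writing the file with an explicit UTF-8 encoding is usually the first thing to try.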

Google Sheets ARRAYFORMULA() different results from Excel{}?

I'm getting some odd results when using an ARRAYFORMULA() function in Google Sheets. Comparing the same formula in Excel, I get a correct answer in Excel and an incorrect answer in Google Sheets.
Here is a shared Google Sheet with the error and a screenshot of the result from Excel
The result should be 12, meaning that there are 12 months where Bob works in at least one location.
Any ideas would be much appreciated! TIA!
Google sheets has a lot of different functions. Use this instead:
=count(UNIQUE(filter(A2:A22,B2:B22=E4)))
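For readers more used to code than to sheet formulas, the three nested steps can be sketched in Python. The data here is invented, and the sketch assumes column A holds numeric month values, column B holds names, and E4 holds the name to look for; COUNT only counts numeric values, so this assumption matters.

```python
# Hypothetical rows standing in for A2:B22: (month number, person)
rows = [
    (1, "Bob"), (1, "Bob"),   # Bob works at two locations in month 1
    (2, "Bob"), (2, "Alice"),
    (3, "Alice"),
]
target = "Bob"  # stands in for the value in E4

# FILTER(A2:A22, B2:B22 = E4): keep the months on rows that match the name
filtered = [month for month, name in rows if name == target]

# UNIQUE(...): drop the duplicate months
unique_months = set(filtered)

# COUNT(...): how many distinct months remain
month_count = len(unique_months)
```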

How to pull a selected HTML-formatted chunk of a Google spreadsheet using a URL

I have a tree farm.
I have a Google spreadsheet that has my inventory in the form that I took it.
I have pivot table that summarizes that sheet.
How can I run a query from the Jack Pine description page on my website that pulls the appropriate blob off the pivot table on the spreadsheet?
Here's what I've done so far:
Create a new spreadsheet that does an importrange() from the individual sheet with my pivot table.
Share to the world, published to the web. Using another browser where I am not logged in with my google ID I can see the file, and it is view only.
https://docs.google.com/spreadsheets/d/13pXb7Kek010B6s8Ez3h6yX4qF92MgvV4uMk71dJhe3o/edit#gid=0
I'm basing this on this article: https://blog.ouseful.info/2009/05/18/using-google-spreadsheets-as-a-databace-with-the-google-visualisation-api-query-language/
Now, in a query (split line for reading convenience)
https://spreadsheets.google.com/d/
13pXb7Kek010B6s8Ez3h6yX4qF92MgvV4uMk71dJhe3o/tq?
tqx=out.html&tq=select+*+where+B+contains+%27Pine,%20Jack%27
And I get the following message:
google.visualization.Query.setResponse({"version":"0.6","status":"error","errors":[{"reason":"access_denied","message":"Access denied","detailed_message":"Access denied"}]});
Obviously I'm missing something here. How do I troubleshoot this?
Google has changed something. This answer no longer works
Added Sunday.
The following now will fetch the entire sheet:
https://docs.google.com/spreadsheets/d/
13pXb7Kek010B6s8Ez3h6yX4qF92MgvV4uMk71dJhe3o/
edit?tqx=out.html&tq=select+A,B,C,+where+A+starts+with+%27Pine%27#gid=0
But while it fetches, the select statement returns the entire sheet, or rather the query is ignored.
(I originally had %20's for all the +'s, but Google rewrote them, or my browser does.)
This method
https://docs.google.com/spreadsheets/d/
13pXb7Kek010B6s8Ez3h6yX4qF92MgvV4uMk71dJhe3o/
gviz/tq?tq=select%20A,B,C%20where%20A%20contains%20'Pine'#gid=0
returns a file json.txt. I don't read JSON, but sliding over the brackets and punctuation the content is there.
Note the difference around gviz/tq...
Google rewrites the URL removing tq? from it.
I cannot leave the tqx=out.html in place. I get no JSON file and a 'file unavailable error.'
It turns out that what I need is tqx=out:html, with a colon, not a period.
Found the information in a table labeled "Request Format" in the document
https://developers.google.com/chart/interactive/docs/dev/implementing_data_source
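Putting the pieces together, the working URL can be assembled programmatically. This is a minimal sketch based only on the URL shape that worked above (the gviz/tq path, tq for the query and tqx=out:html for the output format); build_gviz_url is a hypothetical helper name and SPREADSHEET_ID is a placeholder.

```python
from urllib.parse import quote, urlencode


def build_gviz_url(spreadsheet_id, query, out="html"):
    """Build a gviz/tq URL; note the colon in tqx=out:html."""
    params = urlencode(
        {"tqx": f"out:{out}", "tq": query},
        safe=":,",         # keep the colon and commas readable
        quote_via=quote,   # encode spaces as %20 rather than +
    )
    return f"https://docs.google.com/spreadsheets/d/{spreadsheet_id}/gviz/tq?{params}"


url = build_gviz_url("SPREADSHEET_ID", "select A,B,C where A contains 'Pine'")
```

Letting urlencode do the escaping avoids hand-editing %20 and %27 sequences, which is where the period-for-colon slip is easy to miss.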

Read from excel file in C

I want to read from an excel file in C. The excel 2007 file contains about 6000 rows and 2 columns. I want to store the contents in a 2-D array in C. If there exists a C library or any other method then please let me know.
Excel 2007 stores the data in a bunch of files, most of them in XML, all crammed together into a zip file. If you want to look at the contents, you can rename your .xlsx to whatever.zip and then open it and look at the files inside.
Assuming your Excel file just contains raw data, and all you care about is reading it (i.e., you do not need/want to update its contents and get Excel to open it again), reading the data is actually pretty easy. Inside the zip file, you're looking for the subdirectory xl\worksheets\, which will contain a number of .xml files, one for each worksheet from Excel (e.g., a default workbook will have three worksheets named sheet1.xml, sheet2.xml and sheet3.xml).
Inside of those, you're looking for the <sheetData> tag. Inside of that, you'll have <row> tags (one for each row of data), and inside of them <c> tags with an attribute r=RC where RC is replaced by the normal row/column notation (e.g., "A1"). The <c> tag will have a nested <v> tag where you'll find the value for that cell.
I do feel obliged to add a warning though: while reading really simple data can indeed be just this easy, life can get a lot more complex in a hurry if you decide to do anything more than read simple rows/columns of numbers.
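The layout described above can be checked quickly before writing any C. The following is a sketch in Python rather than C, using only the standard library; read_cell_values is a hypothetical helper, and the tiny archive built at the end is a stand-in for a real workbook that contains just the one worksheet part. It only handles cells whose <v> holds the value directly; in real files, text cells usually carry t="s" and their <v> is an index into xl/sharedStrings.xml.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by the worksheet XML parts of an .xlsx file
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def read_cell_values(xlsx_file, sheet_path="xl/worksheets/sheet1.xml"):
    """Return {cell reference: raw text of its <v> element} for one worksheet."""
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read(sheet_path))
    values = {}
    # <sheetData> contains <row> elements, which contain <c> cells
    for cell in root.findall("m:sheetData/m:row/m:c", NS):
        value = cell.find("m:v", NS)
        if value is not None:
            values[cell.get("r")] = value.text
    return values


# Build a minimal stand-in archive in memory (two rows, two columns)
SHEET_XML = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<sheetData>"
    '<row r="1"><c r="A1"><v>1.5</v></c><c r="B1"><v>2</v></c></row>'
    '<row r="2"><c r="A2"><v>3</v></c><c r="B2"><v>4.25</v></c></row>'
    "</sheetData></worksheet>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("xl/worksheets/sheet1.xml", SHEET_XML)
buffer.seek(0)

cells = read_cell_values(buffer)
```

The same two steps (unzip, then walk row, c and v elements) carry over to C with a zip library and an XML parser of your choice.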
You have several choices:
1) Save your excel worksheet to a csv file and parse that.
2) Use the COM API (Windows proprietary and tricky)
3) See this link for a C++ class that you could modify.
Another C lib to read data from excel files can be found here.

Resources