Storing data in a spreadsheet instead of a database - google-app-engine

Is it possible to use a spreadsheet as a database to store data? I don't want to use any external database; I want to use only Python, Google Apps, and a spreadsheet.
For example: using Python and Google Apps I have developed a leave application form. On submit of that form I have to store the data in a spreadsheet instead of a database (MySQL or Oracle).
If it's possible, please give me some reference code.
Thanks in advance
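For reference, a minimal sketch of what appending a form submission to a Google spreadsheet from Python might look like, using the third-party gspread library (the library choice, credentials file and sheet name are assumptions, not part of the original question):

    import gspread

    # Authenticate with a Google service account; "creds.json" is a
    # hypothetical credentials file downloaded from the Google developer console.
    gc = gspread.service_account(filename="creds.json")

    # Open the spreadsheet by name and append one row per form submission.
    sheet = gc.open("Leave Applications").sheet1
    sheet.append_row(["Alice", "2015-06-01", "2015-06-03", "Annual leave"])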

Whilst storing data in a spreadsheet might seem like a good idea at first, you may come to regret it as your application grows. If your data structures become more complex - especially if you need relationships between tables - you'll be much better off with a database.
My advice would be to use a database and then use the xlwt module to export your data to spreadsheet format.
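As a rough sketch of that suggestion (the table layout and field names are invented for illustration), xlwt can dump database rows into an .xls file like this:

    import sqlite3
    import xlwt

    # Pull the leave applications out of a (hypothetical) SQLite database...
    conn = sqlite3.connect("leave.db")
    rows = conn.execute(
        "SELECT name, start_date, end_date, reason FROM leaves"
    ).fetchall()

    # ...and write them out to a spreadsheet with xlwt.
    wb = xlwt.Workbook()
    ws = wb.add_sheet("Leaves")
    for col, header in enumerate(["Name", "Start", "End", "Reason"]):
        ws.write(0, col, header)
    for r, row in enumerate(rows, start=1):
        for c, value in enumerate(row):
            ws.write(r, c, value)
    wb.save("leaves.xls")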

I would recommend you store this data internally in a flat file or database and transform the data into the spreadsheet. Doing it this way will let you adapt as things change. If you use a Google Apps spreadsheet and the API changes even slightly, you could lose data. If you store in an .xls spreadsheet, you might as well store in a flat file and export the data, as that will be easier than reading from and writing to a spreadsheet.
Why do you want to do it this way?

Related

Database from Excel Sheets

I have a problem statement.
A client of mine has three years of data in very complicated Excel sheets. They currently do data entry, reporting and basically everything in Excel, and these Excel files are shared around whenever there is a need to share the data.
They eventually want to move to a web-based solution.
The first agreed step of the solution is to build a routine and a database. The routine will import the data from the Excel files into the database. The database itself will need to be designed in such a way that the same database can be used at the back end for a web site.
The routine will be used as a one-time activity to import all the data for the last three years, and also on a periodic basis, so that they keep importing data until the web front end is ready or agreed upon.
The de facto method for me is to analyze the Excel sheets, design a normalized database, and write a service in Java that uses POI to read the Excel sheets, apply business logic and insert the data into the database. The problem here is that any time they change the Excel files or even the data format, the routine will need to be rewritten.
I am open to any approach - maybe an ETL tool.
Please recommend.
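For illustration, a minimal sketch of such an import routine - here in Python rather than the Java/POI route the question mentions; the sheet layout, column order and table schema are all assumptions:

    import sqlite3
    from openpyxl import load_workbook

    def import_sheet(xlsx_path, db_path="client.db"):
        # data_only=True reads the computed values of formula cells.
        wb = load_workbook(xlsx_path, data_only=True)
        ws = wb.active
        conn = sqlite3.connect(db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS entries (entry_date TEXT, item TEXT, amount REAL)"
        )
        # Skip the header row; stop at the first completely empty row.
        for row in ws.iter_rows(min_row=2, values_only=True):
            if all(v is None for v in row):
                break
            conn.execute("INSERT INTO entries VALUES (?, ?, ?)", row[:3])
        conn.commit()

The fragility the question describes lives in the hard-coded layout here; an ETL tool mostly just moves that mapping into configuration.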

Is using JSON data better than querying the database when there is no security issue for the data

For my new project I'm planning to use JSON data in a text file rather than fetching data from the database. My concept is to save a JSON file on the server whenever the admin creates a new entry in the database.
As there is no issue of security, will this approach make user access to the data faster, or shall I go with the usual database queries?
JSON is typically used as a way to format the data for the purpose of transporting it somewhere. Databases are typically used for storing data.
What you've described may be perfectly sensible, but you really need to say a little bit more about your project before the community can comment on your approach.
What's the pattern of access? Is it always read-only for the user, editable only by site administrator for example?
You shouldn't worry about performance early on. Worry more about ease of development, maintenance and reliability, you can always optimise afterwards.
You may want to look at http://www.mongodb.org/. MongoDB is a document-centric store that uses JSON as its storage format.
JSON in combination with jQuery is a great option for fast, smooth web page updating, but ultimately it still comes down to the same database query.
Just make sure your query is efficient. Use a stored proc.
JSON is just the way the data is sent from the server (a web controller in MVC, or code-behind in standard C#) to the client (jQuery or JavaScript).
Ultimately the database will be queried the same way.
You should stick with the classic method (database), because you'll face many problems with concurrency and with having too many files to handle.
I think you should go with the usual database queries.
If you use JSON files you'll have to keep them in sync with the DB (which means extra work) and you may face I/O problems if your site is very busy.
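A minimal sketch of the "export on save" idea the question describes (schema and file paths are assumptions); writing to a temporary file and atomically renaming it sidesteps the concurrency problem mentioned above:

    import json
    import os
    import sqlite3
    import tempfile

    def export_entries(db_path="site.db", out_path="entries.json"):
        conn = sqlite3.connect(db_path)
        conn.row_factory = sqlite3.Row
        rows = conn.execute("SELECT id, title, body FROM entries").fetchall()

        # Write to a temp file first, then atomically replace the old file,
        # so readers never see a partially written JSON document.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(out_path)))
        with os.fdopen(fd, "w") as f:
            json.dump([dict(r) for r in rows], f)
        os.replace(tmp, out_path)

The admin code would call export_entries() after each successful insert; that sync step is exactly the "extra work" noted above.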

Benefit of reading data from a mat file as opposed to a database

I've seen some code read large data from mat files instead of doing queries on a database. What are the benefits of doing this as opposed to using a database? Is it possible to easily move the mat file contents into a database and vice versa?
Reading data from a mat file is still a "database" of sorts: you are just reading your data from a file.
Eventually, you will have to implement queries yourself and take care of many other issues.
It is also not a scalable solution, which means it won't work well for a large amount of data.
Of course, if you have a small amount of data and only basic queries, the fuss of setting up a database and using SQL isn't worth it.
Regarding your second question, it really depends on the data you have there.
I agree with Andrey. It depends on the data and what you want to do with it. I created a small program in Matlab that queries a relatively small .mat database, but as the database and the number of users grew, performance went down.
In the light of this we decided to use a MySQL database. I created a small Java application that talks to the database and imported it into Matlab to move data between Matlab and MySQL. But I had to create specific queries for my data. If someone can offer a better solution I would be grateful.
Perhaps it wouldn't be such a bad idea to write a general script that moves data between Matlab .mat files and a SQL database. Store the data in a structure and use that to create the tables.
If you want to discuss something like this further via email I would be happy to. Maybe we can learn a thing or two from each other.
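On the second question, a rough sketch of pulling .mat contents into a database from Python using SciPy's loadmat (the table layout is an assumption, and it only handles plain numeric matrices):

    import sqlite3
    import numpy as np
    from scipy.io import loadmat

    # loadmat returns a dict mapping variable names to NumPy arrays.
    data = loadmat("measurements.mat")

    conn = sqlite3.connect("measurements.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples (var TEXT, i INTEGER, j INTEGER, value REAL)"
    )
    for name, arr in data.items():
        if name.startswith("__"):      # skip __header__, __version__, __globals__
            continue
        arr = np.atleast_2d(arr)       # assume numeric matrices for simplicity
        conn.executemany(
            "INSERT INTO samples VALUES (?, ?, ?, ?)",
            [(name, i, j, float(arr[i, j]))
             for i in range(arr.shape[0]) for j in range(arr.shape[1])],
        )
    conn.commit()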

Designing a generic unstructured data store

The project I have been given is to store and retrieve unstructured data from a third-party. This could be HR information – User, Pictures, CV, Voice mail etc or factory related stuff – Work items, parts lists, time sheets etc. Basically almost any type of data.
Some of these items may be linked, so a User may have a picture, for example. I don't need to examine the content of the data, as my storage solution will receive the data as XML and send it out as XML. It's down to the recipient to convert the XML back into a picture or sound file etc. The recipient may request all Users, so I need to be able to find User records and their related "child" items such as pictures, or the recipient may just want pictures.
My database is MS SQL and I have to stick with that. My question is: are there any patterns or existing solutions for handling unstructured data in this way?
I've done a bit of Googling and have found some sites that talk about this kind of problem, but they are more interested in drilling into the data to allow searches on its content. I don't need to know the content, just what type it is (picture, User, job sheet etc).
To those who have given their comments:
The problem I face is the linking of objects together. A User object may be added to the data store, and then at a later date the user's picture may be added. When the User is requested I will need to return both the User object and its associated Picture. The user may update their picture, so you can see I need to keep relationships between objects. That is what I was trying to get across in the second paragraph. The problem I have is that my solution must be very generic, as I should be able to store anything and link these objects according to the end users' requirements, e.g. User, Pictures and emails, or Work items, Parts lists etc. I see that Microsoft has developed Zentity, which looks like it may be useful, but I don't need to drill into the data contents, so it's probably overkill for what I need.
I have been using Microsoft Zentity since version 1, and whilst it is excellent at storing huge amounts of structured data and allowing (relatively) simple access to that data, if your data structure is likely to change then recreating the data model (and the regression testing) would probably remove the benefits of using such a system.
Another point worth noting is that Zentity requires FILESTREAM storage, so you would need to have the correct version of SQL Server installed (2008, I think) with FILESTREAM storage enabled.
Since you deal with XML, it's not unstructured data. Microsoft SQL Server 2005 or later has an XML column type that you can use.
Now, if you don't need to access XML nodes and you think you never will, go with plain varbinary(max). For your information, storing XML content in an XML-typed column lets you not only retrieve XML nodes directly through database queries, but also validate the XML data against schemas, which may be useful to ensure that the content you store is valid.
Don't forget to use FILESTREAM (SQL Server 2008 or later) if your XML data grows in size (2 MB+). This is probably your case, since voice mail or pictures can easily be larger than 2 MB, especially when they are Base64-encoded inside an XML file.
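A minimal sketch of that approach from Python via pyodbc (the connection string, table and column names are assumptions): a type column records what each item is, and the payload goes into an XML-typed column:

    import pyodbc

    # Connection string is an assumption; adjust driver/server/database to suit.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
        "DATABASE=Store;Trusted_Connection=yes"
    )
    cur = conn.cursor()
    cur.execute("""
        IF OBJECT_ID('dbo.Items') IS NULL
        CREATE TABLE dbo.Items (
            ItemId   INT IDENTITY PRIMARY KEY,
            ItemType NVARCHAR(50) NOT NULL,   -- 'User', 'Picture', 'WorkItem', ...
            Content  XML NOT NULL
        )
    """)
    # SQL Server implicitly converts the nvarchar parameter to XML,
    # checking that it is well-formed on the way in.
    cur.execute(
        "INSERT INTO dbo.Items (ItemType, Content) VALUES (?, ?)",
        "User", "<user><name>Alice</name></user>",
    )
    conn.commit()

Links between objects (User to Picture, say) would then be a second table of (ParentId, ChildId) pairs, which keeps the scheme generic.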
Since your data is quite free-form and changeable, your best bet is to put it on a plain old file system, not a relational database. By all means store some meta-information in SQL where it makes sense to search through structured data relationships, but if your main data content is not structured with data relationships then you're doing yourself a disservice by using an SQL database.
The filesystem is blindingly fast at looking up files and streaming them, especially if this is an intranet application. All you need to do is share a folder and apply sensible file permissions, and a large chunk of unnecessary development disappears. If you need to deliver this over the web, consider using WebDAV with IIS.
A reasonably clever file and directory naming convention, with a small piece of software you write to help people get to the right path, will, hands down, beat any SQL database for both access speed and sequential data streaming. Filesystem paths and file names will always beat any clever SQL index for data location speed. And plain old files are the ultimate unstructured, flexible data store.
Use SQL for what it's good for. Use files for what they are good for. Best tools for the job and all that...
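As an illustration of that naming-convention idea (the root share and layout are hypothetical), the "small piece of software" can be as thin as a path-mapping helper:

    from pathlib import Path

    STORE_ROOT = Path(r"\\fileserver\datastore")   # hypothetical share

    def item_path(item_type, item_id, child=None):
        """Map (type, id[, child type]) onto a predictable folder layout,
        e.g. User/1234/Picture under the store root."""
        path = STORE_ROOT / item_type / str(item_id)
        return path / child if child else path

    # item_path("User", 1234, "Picture") -> \\fileserver\datastore\User\1234\Picture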
You don't really need any pattern for this implementation. Store all your data in a BLOB entry. Read from it when required and then send it out again.
You would probably need to investigate other infrastructure aspects, like periodically cleaning up the DB to remove expired entries.
Maybe I'm not understanding the problem clearly.
So am I right if I say that all you need to store is a blob of XML with whatever binary information is contained within? Why can't you have a Users table and then a linked (foreign key) table with user objects in it, linked by userId?

Custom Database integration with MOSS 2007

Hopefully someone has been down this road before and can offer some sound advice as far as which direction I should take. I am currently involved in a project in which we will be utilizing a custom database to store data extracted from Excel files based on pre-established templates (to maintain consistency). We currently have a process (written in C# .NET 2008) that can extract the necessary data from the spreadsheets and import it into our custom database.
What I am primarily interested in is figuring out the best method for integrating that process with our portal. What I would like to do is let SharePoint keep track of the metadata about the spreadsheet itself and let the custom database keep track of the data contained within the spreadsheet. So, one thing I need is a way to link spreadsheets from SharePoint to the custom database and vice versa. As these spreadsheets will be updated periodically, I need a tried and true way of ensuring that the data remains synchronized between SharePoint and the custom database.
I am also interested in finding out how to use the data from the custom database to create reports within the SharePoint portal. Any and all information will be greatly appreciated.
I have actually written a similar system in SharePoint for a large Financial institution as well.
The way we approached it was to have an event receiver on the Document library. Whenever a file was uploaded or updated the event receiver was triggered and we parsed through the data using Aspose.Cells.
The key to matching data in the excel sheet with the data in the database was a small header in a hidden sheet that contained information about the reporting period and data type. You could also use the SharePoint Item's unique ID as a key or the file's full path. It all depends a bit on how the system will be used and your exact requirements.
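For illustration, reading such a hidden metadata sheet might look like this from Python with openpyxl (the sheet name and cell positions are assumptions; the system described above used Aspose.Cells from C#):

    from openpyxl import load_workbook

    # Assumed convention: a hidden sheet named "Meta" carries the reporting
    # period in A1 and the data type in A2 (names and cells are illustrative).
    wb = load_workbook("report.xlsx", data_only=True)
    meta = wb["Meta"]
    assert meta.sheet_state == "hidden"    # hidden from end users in Excel
    reporting_period = meta["A1"].value
    data_type = meta["A2"].value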
I think this might be awkward. The Business Data Catalog (BDC) functionality will enable you to tightly integrate with your database, but simultaneously trying to remain perpetually in sync with a separate spreadsheet might be tricky. I guess you could do it by catching the update events for the document library that handles the spreadsheets themselves and subsequently pushing the right info into your database. If you're going to do that, though, it's not clear to me why you can't choose just one or the other:
Spreadsheets in a document library, or
BDC integration with your database
If you go with #1, then you still have the ability to search within the documents themselves and updating them is painless. If you go with #2, you don't have to worry about sync'ing with an actual sheet after the initial load, and you could (for example) create forms as needed to allow people to modify the data.
Also, depending on your use case, you might benefit from the MOSS server-side Excel services. I think the "right" decision here might require more information about how you and your team expect to interact with these sheets and this data after it's initially uploaded into your SharePoint world.
So... I'm going to assume that you are leveraging Excel because it is an easy way to define, build, and test the math required. Your spreadsheet has a set of input data elements, a bunch of math, and then there are some output elements. Have you considered using Excel Services? In this scenario you would avoid running a batch process to generate your output elements. Instead, you can call Excel services directly in SharePoint and run through your calculations. More information: available online.
You can also surface information in SharePoint directly from the spreadsheet. For example, if you have a graph in the spreadsheet, you can link to that graph and expose it. When the data changes, so does the graph.
There are also some High Performance Computing (HPC) Excel options coming out from Microsoft in the near future. If your spreadsheet is really, really big then the Excel Services route might not work. There is some information available online (search for HPC excel - I can't post the link).
