Can we use JSON as a database?

I'm looking for fast and efficient data storage to build my PHP-based web site. I'm aware of MySQL. Can I use a JSON file in my server root directory instead of a MySQL database? If yes, what is the best way to do it?

You can use any single file, including a JSON file, like this:
1. Lock it somehow (look up PHP file locking; with flock() it can be as simple as passing the right flag, or using the locking variant of a function).
2. Read the data from the file and parse it into an internal data structure.
3. Optionally modify the data in the internal data structure.
4. If you modified the data, truncate the file to 0 length and write the new data to it.
5. Unlock the file as soon as you can; other requests may be waiting.
You can keep using the data in internal structures to render the page. Just remember it may be outdated as soon as you release the file lock, since another HTTP request can modify it.
Also, if you modify the data from a user's web form, remember that it may have been modified in between. For example: one person loads a page with user details for editing, another user deletes that user, then the editor tries to save the changed details, and should probably get an error instead of re-creating the deleted user.
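The lock, read, modify, truncate, write cycle above can be sketched as follows. This is a Python illustration of the same pattern (PHP's flock() behaves similarly); the file name and data layout are made up for the example:

```python
import fcntl, json, os, tempfile

def update_json(path, mutate):
    """Read-modify-write a JSON file under an exclusive lock.
    `mutate` receives the parsed data and returns the new data."""
    # Open read/write, creating the file if it does not exist yet.
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)       # block until we own the lock
        try:
            raw = f.read()
            data = json.loads(raw) if raw else {}   # parse current contents
            data = mutate(data)                     # modify in memory
            f.seek(0)
            f.truncate(0)                           # discard the old contents
            json.dump(data, f)                      # write the new data
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)           # release as soon as possible

# Usage: two sequential updates to a throwaway file.
path = os.path.join(tempfile.mkdtemp(), "users.json")
update_json(path, lambda d: {**d, "alice": {"age": 30}})
update_json(path, lambda d: {**d, "bob": {"age": 25}})
```

Every request that touches the file goes through this one function, so readers and writers never see a half-written file. An equivalent PHP version would wrap flock($fh, LOCK_EX) around the same read, truncate, and write steps.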
Note: this is very inefficient. If you are building a site where you expect more than, say, 10 simultaneous users, you have to use a more sophisticated scheme, or just use an existing database. Also, you can't have too much data, because parsing the JSON and generating the modified JSON takes time.
As long as you have just one user at a time, it will simply get slower as the amount of data grows. But as the user count increases, more users means both more requests and more data, so things degrade rapidly, and you very soon hit the limit where HTTP requests start to time out before the file becomes available for handling the request.
At that point, do not try to hack it to make it faster; instead pick an existing database (SQL, NoSQL, or file-based). If you start hacking together your own, you will just end up reinventing the wheel, usually poorly :-). Well, unless it is just a programming exercise, but even then it might be better to learn to use an existing framework instead.

I wrote an Object Document Mapper for use with JSON files called JSON ODM. This may be a bit late, but if it is still needed, it is open source under the MIT License.
It provides a query language and some GeoJSON tools.

The new version of IBM Informix, 12.10.xC2, now supports JSON.
See: http://pic.dhe.ibm.com/infocenter/informix/v121/topic/com.ibm.json.doc/ids_json_007.htm
The manual says it is compatible with MongoDB drivers.
About the Informix JSON compatibility
Applications that use the JSON-oriented query language, created by
MongoDB, can interact with data stored in Informix® databases. The
Informix database server also provides built-in JSON and BSON (binary
JSON) data types.
You can use MongoDB community drivers to insert, update, and query
JSON documents in Informix.
Not sure, but I believe you can use the Innovator-C edition (free for production) to test it and use it at no cost, even in a production environment.

One obvious case where you might prefer JSON (or another file format) over a database is when all your (relatively small) data is stored in the application cache.
When an application server (re)starts, an application reads data from file(s) and stores it in the data structure.
When data changes, an application updates file(s).
Advantage: no database.
Disadvantage: for a number of reasons, it can be used only for systems with relatively small data sets. For example, a very specific product site with several hundred products.
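A minimal sketch of this cache-in-memory, persist-to-file pattern, in Python with hypothetical product data:

```python
import json, os, tempfile, threading

class JsonCache:
    """All data lives in memory; every change is flushed to one JSON file."""
    def __init__(self, path):
        self.path = path
        self.lock = threading.Lock()
        try:
            with open(path) as f:        # warm the cache at (re)start
                self.data = json.load(f)
        except FileNotFoundError:
            self.data = {}

    def get(self, key, default=None):
        return self.data.get(key, default)

    def set(self, key, value):
        with self.lock:                  # serialize writers in this process
            self.data[key] = value
            with open(self.path, "w") as f:
                json.dump(self.data, f)  # persist the whole (small) dataset

# Usage with a throwaway file and a made-up product entry.
path = os.path.join(tempfile.mkdtemp(), "products.json")
cache = JsonCache(path)
cache.set("p1", {"name": "Widget", "price": 9})
restarted = JsonCache(path)              # simulates an app restart
```

Rewriting the entire file on every change is exactly why this only works for small data; with a few hundred records it is instant, with millions it is not.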


NoSQL Database - Saving JSON Files on Local Server instead of a Database Server

What are the disadvantages of saving json files on the virtual machine you paid for instead of saving it in a database like MongoDB (security concerns, efficiency,...)?
I have tried using both ways of storing data but still couldn't find any difference in performance.
It's probably faster to store the JSON simply as files in the filesystem of your virtual machine.
Unless you need to do one or more of the following:
Use a query language to extract parts of the JSON files.
Search the JSON files efficiently.
Allow clients on multiple virtual machines to access the same JSON data via an API.
Enforce access controls, so some clients can access some JSON data.
Make a consistent backup copy of the data.
Store more JSON data than can fit on a single virtual machine.
Enforce a schema so the JSON files are assured to have consistent structure.
Update JSON data in a way that is assured to succeed completely, or else make no change.
Run aggregate queries like COUNT(), MAX(), SUM() over a collection of JSON files.
You could do all these things by writing more code. It will take you years to develop that code to be bug-free and well-optimized.
By the end, what would you have developed?
You'd have developed a database management system.
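To illustrate the last few points: even a simple aggregate such as COUNT, MAX, or SUM over a directory of JSON files is code you must write and maintain yourself. A small Python sketch, with a made-up `price` field:

```python
import glob, json, os, tempfile

def aggregate(dirpath, field):
    """Hand-rolled COUNT/MAX/SUM over one numeric field in many JSON files."""
    values = []
    for path in glob.glob(os.path.join(dirpath, "*.json")):
        with open(path) as f:
            doc = json.load(f)
        if field in doc:                  # skip documents missing the field
            values.append(doc[field])
    return {"count": len(values),
            "max": max(values) if values else None,
            "sum": sum(values)}

# Usage: three fake product documents with a hypothetical "price" field.
docs_dir = tempfile.mkdtemp()
for i, price in enumerate([10, 25, 5]):
    with open(os.path.join(docs_dir, f"prod{i}.json"), "w") as f:
        json.dump({"id": i, "price": price}, f)
stats = aggregate(docs_dir, "price")
```

Note that this scans and re-parses every file on every query; a DBMS answers the same question from an index. Add schema checks, concurrent writers, and atomic updates, and you are well on your way to writing that database management system.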
Well, for small data you probably won't notice a difference. You may even find that data served from your VM takes less time to return, because you're not sending a request to another remote server.
But when your data grows, it will be hard to maintain. That's why we use a database management system to manage and process our data efficiently.
So if you are storing a small configuration file, you can use your filesystem for that; otherwise, I definitely recommend using a DBMS.

Is it possible to store an in-memory Jena Dataset as a triple-store?

Warning! This question is a catch: I have zero experience with RDF systems, so I couldn't express this as a single question. Feel free to skip the first two paragraphs.
What I'm trying to build, overall
I'm currently building a Spring app that will be the back-end for a system that will gather measurements.
I want to store the info in a triple-store instead of an RDBMS.
So, you may imagine a Spring Boot app with the addition of the Jena library.
The workflow of the system
Here is the methodology I'm planning to deploy:
1. Once the App is up and running it would either create or connect to an existing triple-store database.
2. A POST request reaches an app controller.
3. I use a SPARQL update to insert the new entry into the triple-store.
4. Other Controller/Service/DAO methods exist to serve GET requests for SELECT queries on the triple-store.
*The only reason I provided such a detailed view of my final goal is to avoid answers that would call my question an XY problem.
The actual problem
1. Does an org.apache.jena.query.Dataset represent an in-memory triple-store, or is this kind of Dataset a completely different data structure?
2. If a Dataset is indeed a triple-store, then how can I store this in-memory Dataset to retrieve it in a later session?
3. If indeed one can store a Dataset, then what are the options? Is the default storing a Dataset as a file with .tdb extension? If so then what is the method for that and under which class?
4. If I am correct in my guesses so far, would the assembler mechanism be sufficient to "retrieve" the triple-store from the stored file?
5. Do all triple-store databases follow this concept, of being stored in .tdb files?
org.apache.jena.query.Dataset is an interface - there are multiple implementations with different characteristics.
DatasetFactory makes datasets of various kinds. DatasetFactory.createTxnMem is an in-memory, transactional dataset. It can be initialized with the contents of files but updates do not change the files.
An in-memory dataset only exists for the JVM session.
If you want data and data changes to persist across sessions, you can use TDB for persistent storage. Try TDBFactory or TDB2Factory.
TDB (TDB1 or TDB2) are triplestore databases.
Fuseki is the triple store server. You can send SPARQL requests to Fuseki (query, update, bulk upload, ...)
You can start Fuseki with a TDB database (it creates if it does not exist)
fuseki-server --tdb2 --loc=DB /myData
".tdb" isn't a file extension Apache Jena uses. Databases are a directory of files.

Is using JSON data better than querying a database when there is no security issue for the data?

For my new project I'm planning to use JSON data stored in a text file rather than fetching the data from a database. My concept is to save a JSON file on the server whenever the admin creates a new entry in the database.
As there is no security issue, will this approach make user access to the data faster, or should I go with the usual database queries?
JSON is typically used as a way to format the data for the purpose of transporting it somewhere. Databases are typically used for storing data.
What you've described may be perfectly sensible, but you really need to say a little bit more about your project before the community can comment on your approach.
What's the pattern of access? Is it always read-only for the user, editable only by site administrator for example?
You shouldn't worry about performance early on. Worry more about ease of development, maintenance and reliability, you can always optimise afterwards.
You may want to look at http://www.mongodb.org/. MongoDB is a document-centric store that uses a JSON-like format (BSON) for storage.
JSON in combination with jQuery is a great option for fast, smooth page updates, but ultimately it still comes down to the same database query.
Just make sure your query is efficient. Use a stored procedure.
JSON is just the way the data is sent from the server (a web controller in MVC, or code-behind in standard C#) to the client (jQuery or JavaScript).
Ultimately the database will be queried the same way.
You should stick with the classic method (a database), because you'll face many problems with concurrency and with having too many files to handle.
I think you should go with the usual database query.
If you use JSON files, you'll have to keep them in sync with the DB (which means extra work) and you'll face I/O problems if your site is very busy.

No database on the company servers, alternatives to save data?

I've been struggling with this issue for a while. Our company servers lack any sort of database, i.e. no MySQL, MongoDB, etc in sight.
Since we can't install any for reasons beyond the scope of this question, I was wondering if there was any alternative to that that I could use to save data from a form. (We collect prospect data through a form on our site which then sends this data in the form of an email and is plugged in our internal database through email2DB...)
You could use an embedded library like SQLite.
You could also use indexed files like GDBM.
However, you should think about backup strategies. Serialization should perhaps also be a concern (using textual or portable data formats like XDR, ASN.1, JSON, YAML, ...).
But you might also try to discuss with your managers installing, e.g., a MySQL server on some machine. You don't need dedicated hardware for that; it can run (at least for development and testing) on a machine used for other things.
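SQLite in particular needs no server process or installation; the whole database is one ordinary file, which fits the "no database on the servers" constraint. A minimal Python sketch with a hypothetical prospects table:

```python
import os, sqlite3, tempfile

# SQLite is a full SQL engine in a single file: no server to install or run.
db_path = os.path.join(tempfile.mkdtemp(), "prospects.db")
conn = sqlite3.connect(db_path)

# Store form submissions in a plain table instead of emailing them around.
conn.execute("""CREATE TABLE IF NOT EXISTS prospects (
                    id INTEGER PRIMARY KEY,
                    name TEXT NOT NULL,
                    email TEXT NOT NULL)""")
conn.execute("INSERT INTO prospects (name, email) VALUES (?, ?)",
             ("Jane Doe", "jane@example.com"))
conn.commit()

rows = conn.execute("SELECT name, email FROM prospects").fetchall()
```

PHP ships with a comparable SQLite binding (PDO's sqlite driver), so the same approach works there without installing anything on the server; backing up is just copying the one database file.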
A text file? :)
Or perhaps TinySQL?
You can save it as a flat file. Flat files work great when you are just saving things like logs, or output from a webform. They quickly start to fail if you have any *-to-many relationships.
Do you have access to PHP?

Designing a generic unstructured data store

The project I have been given is to store and retrieve unstructured data from a third-party. This could be HR information – User, Pictures, CV, Voice mail etc or factory related stuff – Work items, parts lists, time sheets etc. Basically almost any type of data.
Some of these items may be linked, so a User may have a Picture, for example. I don't need to examine the content of the data, as my storage solution will receive the data as XML and send it out as XML. It's down to the recipient to convert the XML back into a picture or sound file, etc. The recipient may request all Users, so I need to be able to find User records and their related "child" items such as pictures, or the recipient may just want pictures.
My database is MS SQL and I have to stick with that. My question is, are there any patterns or existing solutions for handling unstructured data in this way.
I’ve done a bit of Googling and have found some sites that talk about this kind of problem but they are more interested in drilling into the data to allow searches on their content. I don’t need to know the content just what type it is (picture, User, Job Sheet etc).
To those who have given their comments:
The problem I face is the linking of objects together. A User object may be added to the data store, and at a later date the user's picture may be added. When the User is requested, I will need to return both the User object and its associated Picture. The user may update their picture, so you can see I need to keep relationships between objects. That is what I was trying to get across in the second paragraph. The problem I have is that my solution must be very generic: I should be able to store anything and link objects according to the end users' requirements, e.g. User, Pictures and emails, or Work items, Parts lists, etc. I see that Microsoft has developed Zentity, which looks like it may be useful, but I don't need to drill into the data contents, so it's probably overkill for what I need.
I have been using Microsoft Zentity since version 1, and whilst it is excellent at storing huge amounts of structured data and allowing (relatively) simple access to it, if your data structure is likely to change then recreating the 'data model' (and the regression testing) would probably remove the benefits of using such a system.
Another point worth noting is that Zentity requires filestream storage so you would need to have the correct version of SQL Server installed (2008 I think) and filestream storage enabled.
Since you deal with XML, your data is not unstructured. Microsoft SQL Server 2005 and later have an XML column type that you can use.
Now, if you don't need to access XML nodes and think you never will, go with plain varbinary(max). For your information, storing XML content in an XML-typed column lets you not only retrieve XML nodes directly through database queries, but also validate XML data against schemas, which may be useful to ensure that the content you store is valid.
Don't forget to use FILESTREAM (SQL Server 2008 or later) if your XML data grows in size (2 MB+). This is probably your case, since voice mail or pictures can easily be larger than 2 MB, especially when they are Base64-encoded inside an XML file.
Since your data is quite freeform and changeable, your best bet is to put it on a plain old file system, not a relational database. By all means store some meta-information in SQL where it makes sense to search through structured data relationships, but if your main data content has no such structure, you're doing yourself a disservice by using an SQL database.
The filesystem is blindingly fast at looking up files and streaming them, especially if this is an intranet application. All you need to do is share a folder and apply sensible file permissions, and a large chunk of unnecessary development disappears. If you need to deliver this over the web, consider using WebDAV with IIS.
A reasonably clever file and directory naming convention, plus a small piece of software you write to help people get to the right path, will hands-down beat any SQL database for both access speed and sequential data streaming. Filesystem paths and file names will always beat a clever SQL index for data-location speed. And plain old files are the ultimate unstructured, flexible data store.
Use SQL for what it's good for. Use files for what they are good for. Best tools for the job and all that...
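As a sketch of the naming-convention idea: derive a stable path from the object type and id, with hash-based fan-out directories so no single folder accumulates too many files. All names here are hypothetical:

```python
import hashlib, os

def object_path(root, obj_type, obj_id):
    """Deterministic path for an object: a folder per type, then two
    hash-based fan-out levels so directories stay a manageable size."""
    digest = hashlib.sha1(str(obj_id).encode()).hexdigest()
    return os.path.join(root, obj_type, digest[:2], digest[2:4],
                        str(obj_id) + ".xml")

# Usage: where would picture 1234 live under the share?
p = object_path("store", "picture", 1234)
```

Because the path is a pure function of (type, id), any client that knows the convention can locate an object without consulting an index, which is exactly the "data location speed" advantage described above.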
You don't really need any pattern for this implementation. Store all your data in a BLOB entry, read from it when required, and then send it out again.
You would probably need to investigate other infrastructure aspects, like periodically cleaning up the database to remove expired entries.
Maybe I'm not understanding the problem clearly.
So am I right in saying that all you need to store is a blob of XML with whatever binary information is contained within? Why can't you have a Users table and then a linked (foreign key) table of user objects, keyed by userId?
