Core Data vs. Database fundamental difference? - database

Can someone explain to me what the fundamental difference is between Core Data (apparently, a "data store") and a database like SQLite or MySQL?
I am working on writing an iPhone app, and needed a table of static data to display. I thought Core Data would be a good choice for this, so I got everything set up and functioning as far as the database (I'm sorry - data STORE) went, and then went to try to import my data (it was in an Excel file which I exported to CSV). I was thinking it should be a straightforward process like I have done in SQLite and other databases many times, but as it turned out after much research, the only "official" way to do this was to write a parser specifically for my data.
When I asked about this on the Apple Developer forums, the response I got was basically "What kind of idiot are you to think that you could possibly import data directly without having to write code to do it? Core data isn't a database- it's a data STORE!!" For the life of me, though, I can't see the distinction. In every way I have looked at it, core data behaves EXACTLY like a database, with a fancy way of accessing it and enough abstraction that it can use a variety of file formats for actually storing the data. In fact, I was eventually able to import my data using a simple SQLite .import command, so I really don't understand why the concept was so foreign to the responders to my original question.
So what am I missing here? What is so fundamentally different about a data store from a database that makes the concept of simple data importing completely alien to those who know the technology?

Core Data is not simply a means of persisting/storing data to and from disk, as SQL is. Core Data's true function is to provide the complete model layer for the Model-View-Controller app design that the Apple API uses. As such, Core Data is primarily an object-graph manager with persistence options tacked onto the side.
An object-graph is a collection of live objects in memory. In Core Data, these are the managed objects. They are called "managed" objects because the managed object context observes the objects constantly making sure they are in the states and relationships that the data model says they should be in.
Core Data does provide a persistence option, but exactly what that option is for any particular implementation is largely hidden. You can even use the same data model and managed objects with different persistence methods, sometimes in the same app.
The key difference with SQL is that SQL writes the actual data to disk whereas Core Data serializes live objects. When you look at a sqlite store in Core Data you are looking at objects that have been taken apart and "freeze dried". Obviously, "freeze drying" objects requires a rather specific data format in the sqlite store so the Core Data store uses its own custom schema that is largely the same regardless of details of the store.
That is why you can't just swap in any old SQL file and expect Core Data to import it. The SQL file holds rows, tables and columns of plain data, not the specialized tables, columns and rows used to reconstitute freeze-dried objects.
Since Core Data is first and foremost an object-graph manager, the only supported and reliable means of importing data is to create the object-graph. In the case of an SQL file, that means reading the SQL data using the SQL api and then generating managed objects from that data and then saving them to a persistent store.
That part is more work up front, but you save time when integrating the data into the rest of the app and when upgrading the data later, and you gain reliability and maintainability.
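The import pattern described above (read plain rows through the SQL API, build linked objects from them, then hand those objects to whatever store you use) can be sketched roughly as follows. This is only an analogy in Python, not Core Data's API: the `department` and `employee` tables, the two classes and `build_object_graph` are all hypothetical names made up for illustration.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Department:
    name: str
    employees: list = field(default_factory=list)


@dataclass
class Employee:
    name: str
    department: Department


def build_object_graph(conn):
    """Read plain rows via SQL and turn them into objects that reference each other."""
    departments = {}
    for dept_id, name in conn.execute("SELECT id, name FROM department ORDER BY id"):
        departments[dept_id] = Department(name)
    for name, dept_id in conn.execute(
        "SELECT name, department_id FROM employee ORDER BY id"
    ):
        dept = departments[dept_id]
        # The relationship is wired up in both directions, in memory.
        dept.employees.append(Employee(name, dept))
    return list(departments.values())


# Hypothetical source data: an ordinary SQLite database with plain tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
INSERT INTO department VALUES (1, 'Engineering');
INSERT INTO employee VALUES (1, 'Ada', 1), (2, 'Grace', 1);
""")
graph = build_object_graph(conn)
```

In Core Data the last step would be creating managed objects in a context and saving the context; here the graph simply stays in memory.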

A dictionary definition gives me:
Databases are data stores, but a data store isn't always a database.
The feature you expected isn't available in some databases either (though it is in most).
A data store can for example store non-relational data.

They should have just pointed you at the Wikipedia article on Core Data.
According to that article, "It allows data organised by the relational entity-attribute model to be serialised into XML, binary, or SQLite stores. The data can be manipulated using higher level objects representing entities and their relationships. Core Data manages the serialised version, providing object lifecycle and object graph management, including persistence. Core Data interfaces directly with SQLite, insulating the developer from the underlying SQL."
I guess it's the fact that "Core Data manages the serialised version" that means you can't import data directly. That is, you probably can't import data directly into SQLite in such a way that Core Data can manage it, although you probably can import data directly into SQLite in some way.

Core Data is not a data store; a data store is one part of Core Data. Core Data is more closely related to an Object Relational Mapping (ORM) tool. Core Data actually has the option of using SQLite for its datastore, but you can also choose XML files, a proprietary format, or write your own datastore.
I'm not sure how you were able to import your data with a SQL import. It shouldn't be compatible with Core Data, since Core Data creates a proprietary SQL database schema that contains a ton of metadata.

Maybe it's better to think of Core Data as an "object store" and a database as a "data store". Core Data is good when you have a variety of types of object, with relationships to each other. The familiar example is a company with employees, who have bosses and reports, belong to departments, are assigned to clients, projects, etc., have schedules, go to meetings. Employees can get reassigned, etc. Even the types of relationships defined vary from time to time. That's a more heavyweight process even with Core Data, but Core Data makes it easier than with a raw database.
If you just have "data", and not "objects", it's easier to use a database. For example if you just have a table of the elements with atomic weights, etc., you might want to just use a database.
For your application it sounds like you just have one table. It will be easy to just use SQLite, which is available, so use it if it's more convenient.
On the other hand, iOS SDK has some pre-built features that interact with Core Data. If you use SQLite you don't get those. So you might avoid custom code to import your data but have to write custom code to display your data. Tough luck. When creating software sometimes you have to write code. Weird, I know.

Related

How to store CQRS Read Models in SQL Server Table?

I'm looking into storing CQRS read models in SQL Server tables due to legacy system concerns (see approaches 2 & 3 of this question).
While I'd like to implement the read models using document database such as MongoDB, due to outside systems that can't be reworked at this time, I'm stuck with keeping everything in the rdbms for now.
Since I'm looking at storing records in a properly de-normalized way, what's the best way to actually store them when dealing with typical hierarchical data, such as the typical Customer / Order / LineItems /etc, that must all be displayed in the same view? [EDIT: What I'm thinking is that I put the data needed to query the model in separate fields, but the full object in a "object data field" with it]
Due to my legacy systems (mostly out of my control) I'm thinking that I'll add triggers to the legacy system tables or make sproc changes to keep my read models current, but how should I actually store the data itself?
I considered simply storing them as JSON in a field, or storing them as XML, as both can easily be serialized/deserialized from a .net application, and can reasonably easily be updated by triggers from other activities in the database. (Xpath/XQuery isn't so bad when you get used to it, and from another answer here, I found a JSON parser for T-SQL)
Is there a better approach? If not, should I use XML or JSON?
I would go with XML, as it has built-in support in SQL Server. In general I would avoid using any additional stuff written in T-SQL, as maintaining it can be a nightmare.
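The layout described in the question's edit (a few fields copied out for querying, plus the full object in one "object data" field) might look something like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in for SQL Server and JSON as the serialized form; the table name, column names and sample order are all made up. The same shape applies if the data column holds XML instead.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real SQL Server database
conn.execute("""
    CREATE TABLE customer_order_read_model (
        order_id      INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,   -- copied out so it can be queried and indexed
        order_total   REAL NOT NULL,   -- copied out so it can be queried and indexed
        object_data   TEXT NOT NULL    -- the full serialized object
    )
""")

# Hypothetical denormalized read model: customer, order and line items together.
order = {
    "order_id": 1001,
    "customer": {"name": "Acme Ltd"},
    "line_items": [
        {"sku": "A-1", "qty": 2, "price": 9.5},
        {"sku": "B-7", "qty": 1, "price": 20.0},
    ],
}
total = sum(item["qty"] * item["price"] for item in order["line_items"])

conn.execute(
    "INSERT INTO customer_order_read_model VALUES (?, ?, ?, ?)",
    (order["order_id"], order["customer"]["name"], total, json.dumps(order)),
)

# Query on the plain column, then deserialize the whole object for the view.
row = conn.execute(
    "SELECT object_data FROM customer_order_read_model WHERE customer_name = ?",
    ("Acme Ltd",),
).fetchone()
loaded = json.loads(row[0])
```

The view code only ever touches `loaded`; the extra columns exist so the database can filter and sort without parsing the serialized field.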

Is using JSON data better than querying a database when there is no security issue for the data?

For my new project I'm planning to use JSON data from a text file rather than fetching data from the database. My concept is to save a JSON file on the server whenever the admin creates a new entry in the database.
As there is no issue of security, will this approach make user access to the data faster, or shall I go with the usual database queries?
JSON is typically used as a way to format the data for the purpose of transporting it somewhere. Databases are typically used for storing data.
What you've described may be perfectly sensible, but you really need to say a little bit more about your project before the community can comment on your approach.
What's the pattern of access? Is it always read-only for the user, editable only by site administrator for example?
You shouldn't worry about performance early on. Worry more about ease of development, maintenance and reliability, you can always optimise afterwards.
You may want to look at http://www.mongodb.org/. MongoDB is a document-centric store that uses JSON as its storage format.
JSON in combination with jQuery is a great option for fast, smooth web page updates, but ultimately it still comes down to the same database query.
Just make sure your query is efficient. Use a stored proc.
JSON is just the way the data is sent from the server (a Web controller in MVC or code-behind in standard C#) to the client (jQuery or JavaScript).
Ultimately the database will be queried the same way.
You should stick with the classic method (database), because you'll face many problems with concurrency and with having too many files to handle.
I think you should go with usual database query.
If you use JSON files you'll have to keep them in sync with the DB (which means extra work) and you may face I/O problems (if your site is super busy).

Should I use messaging instead of a database

I am designing a system that will allow users to take data from one system and send to other systems. One of the destination systems has a sophisticated SOA (web services) and the other is a mainframe that accepts flat files for input.
I have created a database that has a PublishEvent table and PublishEventType table. There are also normalized tables that are specific to the type of event being published.
I also have an "interface" table that is a flattened out version of the normalized data tables. The end user has a process that puts data into the interface table. I am not sure of the exact process - I think it's some kind of reporting application that they can export results to a SQL table. I then use an SSIS package to take the data out of the interface table and put it into the normalized data structure and create new rows in the PublishEvent table. I use the flat table because when I first showed them the relational tables they seemed to be very confused.
I have a windows service that watches for new rows in the PublishEvent table. The windows service is extended with plug-ins (using the MEF framework). Which plug-in is called depends on the value of the PublishEventTypeID field in the PublishEvent row.
PublishEventTypeID 1 calls the plug-in that reads data from one set of tables and calls the SOA Web service. PublishEventTypeID 2 calls the plug-in that reads data from a different set of tables and creates the flat file to be sent to the mainframe.
This seems like I am implementing the "Database as IPC" anti-pattern. Should I change my design to use a messaging based system? Is the process of putting data into the flat table then into the normalized tables redundant?
EDIT: This is being developed in .NET 3.5
A MOM (message-oriented middleware) is probably the better solution, but you also have to take into account the following points:
Do you already have a message-based system in place as part of your customer's architecture? If not, introducing one may be overkill.
Do you have any experience with message-based systems? As Jason Plank correctly mentioned, you have to take into account specific patterns for these, such as ensuring the chronological order of messages, managing dead letter channels and so on (see this book for more).
You mentioned a mainframe system which apparently has limited options for interfacing. Who will take care of the layer that will transform "messages" (either DB or MOM based) into something that the mainframe can digest? Assuming it is you, would it be easier (for you) to do that by accessing the DB (maybe you have already worked on the problem in the past), or would the effort differ depending on whether you use a DB or a MOM?
To sum it up: if you are more confident going the DB route, maybe it's better to do that, even if, as you correctly suggested yourself, it is a bit of an "anti-pattern".
Some key items to keep in mind are:
Row order consistency - Does your data model depend on the order in which the data is generated? If so, does your scheme ensure that the pub and sub activity happens in the same order as the original data was created?
Do you have identity columns on either side? They are a problem since their value keeps changing based on the order the data is inserted. If Identity column is the sole primary key (surrogate key), a change in its value may make the data unusable.
How do you prove that you have not lost a record? This is the trickiest part of the solution, especially if you have millions of rows.
As for the architecture, you may want to check out the XMPP protocol - Smack for client (if Java) and eJabberD for Server.
Have a look at NServiceBus, MassTransit or Rhino Service Bus if you're using .NET.
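To make the comparison concrete, here is a rough sketch of the question's plug-in dispatch with a queue in place of the polled PublishEvent table. It is written in Python with the standard library's in-process `queue.Queue`, which is only a stand-in for a real message broker; the two handler functions and the payload strings are hypothetical.

```python
import queue


def send_to_web_service(payload):
    # Placeholder for the plug-in that calls the SOA web service.
    return f"web service <- {payload}"


def write_flat_file(payload):
    # Placeholder for the plug-in that writes the mainframe flat file.
    return f"flat file <- {payload}"


# Same idea as choosing a plug-in by PublishEventTypeID.
HANDLERS = {1: send_to_web_service, 2: write_flat_file}


def drain(events):
    """Process every queued event in arrival order and return the results."""
    results = []
    while True:
        try:
            type_id, payload = events.get_nowait()
        except queue.Empty:
            break
        results.append(HANDLERS[type_id](payload))
    return results


events = queue.Queue()
events.put((1, "customer 42"))
events.put((2, "invoice 7"))
results = drain(events)
```

The dispatch logic is the same either way. What a real messaging system adds is delivery, ordering and dead-letter handling, which is exactly the part the answers above warn about.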

Can you provide some advice on setting up my database?

I'm working on a MUD (Multi User Dungeon) in Python and am just now getting around to the point where I need to add some rooms, enemies, items, etc. I could hardcode all this in, but it seems like this is more of a job for a database.
However, I've never really done any work with databases before so I was wondering if you have any advice on how to set this up?
What format should I store the data in?
I was thinking of storing a Dictionary object in the database for each entity. In this way, I could then simply add new attributes to the database on the fly without altering the columns of the database. Does that sound reasonable?
Should I store all the information in the same database but in different tables, or different entities (enemies and rooms) in different databases?
I know this will be a can of worms, but what are some suggestions for a good database? Is MySQL a good choice?
1) There's almost never any reason to have data for the same application in different databases. Not unless you're a Fortune 500 size company (OK, I'm exaggerating).
2) Store the info in different tables.
As an example:
T1: Rooms
T2: Room common properties (applicable to every room), with a row per room
T3: Room unique properties (applicable to a minority of rooms), with a row per property per room - this makes it easy to add custom properties without adding new columns
T4: Room-Room connections
Having T2 AND T3 is important, as it lets you combine the efficiency and speed of the row-per-room idea where it's applicable with the flexibility/maintainability/space saving of an attribute-per-entity-per-row (or Object/Attribute/Value, as IIRC it's called in fancy terms) schema.
Good discussion is here
3) Implementation-wise, try to write something reusable, e.g. have generic "Get_room" methods which access the DB underneath, ideally via Transact-SQL or ANSI SQL, so you can survive a change of DB back-end fairly painlessly.
For initial work, you can use SQLite. Cheap, easy and SQL compatible (the best property of all). Install is pretty much nothing, and DB management can be done by freeware tools or even a Firefox plugin IIRC (all of the Firefox 3 data stores - history, bookmarks, places, etc. - are SQLite databases).
For later, either MySQL or Postgres (I don't do either one professionally so can't recommend one). IIRC at some point Sybase had free personal db server as well, but no idea if that's still the case.
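As a rough illustration of the T1-T4 layout and a generic "Get_room" accessor, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample rooms are all invented for the example; only the four-table split comes from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- T1: rooms
CREATE TABLE room (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- T2: properties every room has, one row per room
CREATE TABLE room_common (
    room_id INTEGER PRIMARY KEY REFERENCES room(id),
    description TEXT NOT NULL,
    is_dark INTEGER NOT NULL DEFAULT 0
);
-- T3: rare properties, one row per property per room
CREATE TABLE room_property (
    room_id INTEGER NOT NULL REFERENCES room(id),
    name TEXT NOT NULL,
    value TEXT NOT NULL,
    PRIMARY KEY (room_id, name)
);
-- T4: room-to-room connections
CREATE TABLE room_connection (
    from_room INTEGER NOT NULL REFERENCES room(id),
    direction TEXT NOT NULL,
    to_room INTEGER NOT NULL REFERENCES room(id),
    PRIMARY KEY (from_room, direction)
);
INSERT INTO room VALUES (1, 'Cellar'), (2, 'Hall');
INSERT INTO room_common VALUES (1, 'A damp cellar.', 1), (2, 'A wide hall.', 0);
INSERT INTO room_property VALUES (1, 'smell', 'mildew');
INSERT INTO room_connection VALUES (1, 'up', 2), (2, 'down', 1);
""")


def get_room(conn, room_id):
    """Assemble one room from all four tables; callers never see the SQL."""
    name, description = conn.execute(
        "SELECT r.name, c.description FROM room r "
        "JOIN room_common c ON c.room_id = r.id WHERE r.id = ?",
        (room_id,),
    ).fetchone()
    properties = dict(conn.execute(
        "SELECT name, value FROM room_property WHERE room_id = ?", (room_id,)))
    exits = dict(conn.execute(
        "SELECT direction, to_room FROM room_connection WHERE from_room = ?",
        (room_id,)))
    return {"name": name, "description": description,
            "properties": properties, "exits": exits}


cellar = get_room(conn, 1)
hall = get_room(conn, 2)
```

Because the game code only calls `get_room`, swapping SQLite for another back-end later should mostly mean changing the connection and perhaps a little SQL.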
This technique is called entity-attribute-value model. It's normally preferred to have DB schema that reflects the structure of the objects, and update the schema when your object structure changes. Such strict schema is easier to query and it's easier to make sure that the data is correct on the database level.
One database with multiple tables is the way to go.
If you want a database server, I'd recommend PostgreSQL. MySQL has some advantages, like easy replication, but PostgreSQL is generally nicer to work with. If you want something smaller that works directly with the application, SQLite is a good embedded database.
Storing an entire object (serialized/encoded) as a value in the database is bad for querying - I am sure that some queries in your mud will NOT need to know 100% of attributes, or may retrieve a list of object by a value of attributes.
"it seems like this is more of a job for a database"
True, although 'database' doesn't have to mean 'relational database'. Most existing MUDs store all data in memory, and read it in from flat-file saved in a plain-text data format. I'm not necessarily recommending this route, just pointing out that a traditional database is by no means necessary. If you do want to go the relational route, recent versions of Python come with sqlite which is a lightweight embedded relational database with good SQL support.
Using relational databases with your code can be awkward. Any change to a game logic class can require a parallel change to the database, and changes to the code that read and write to the database. For this reason good planning will help you a lot, but it's hard to plan a good database schema without experience. At least get your entity classes planned first, then build a database schema around it. Reading up on normalizing a database and understanding the principles there will help.
You may want to use an 'object-relational mapper' which can simplify a lot of this for you. Examples in Python include SQLObject, SQLAlchemy, and Autumn. These hide a lot of the complexities for you, but as a result can hide some of the important details too. I'd recommend using the database directly until you are more familiar with it, and consider using an ORM in the future.
"I was thinking of storing a Dictionary object in the database for each entity. In this way, I could then simply add new attributes to the database on the fly without altering the columns of the database. Does that sound reasonable?"
Unfortunately not - if you do that, you waste 99% of the capabilities of the database and are effectively using it as a glorified data store. However, if you don't need aforementioned database capabilities, this is a valid route if you use the right tool for the job. The standard shelve module is well worth looking at for this purpose.
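For what it's worth, the shelve route looks roughly like this. It is a minimal sketch: the key naming scheme (`room:...`, `enemy:...`) and the entity fields are invented for the example, and the temporary directory just keeps it self-contained.

```python
import os
import shelve
import tempfile

# Hypothetical location for the game's data file.
path = os.path.join(tempfile.mkdtemp(), "mud_entities")

with shelve.open(path) as db:
    db["room:cellar"] = {"name": "Cellar", "exits": {"up": "room:hall"}}

    goblin = {"name": "Goblin", "hp": 7}
    goblin["loot"] = ["rusty dagger"]  # a new attribute, no schema to alter
    # Assign the whole dict back; by default shelve does not notice
    # in-place changes to an object you fetched earlier.
    db["enemy:goblin"] = goblin

with shelve.open(path) as db:
    loaded = db["enemy:goblin"]
    keys = sorted(db.keys())
```

Note what is missing compared with a real database: there is no way to ask for "all enemies with hp below 10" other than loading every entry and checking it yourself.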
"Should I store all the information in the same database but in different tables or different entities (enemies and rooms) in different databases."
One database. One table in the database per entity type. That's the typical approach when using a relational database (eg. MySQL, SQL Server, SQLite, etc).
"I know this will be a can of worms, but what are some suggestions for a good database? Is MySQL a good choice?"
I would advise sticking with sqlite until you're more familiar with SQL. Otherwise, MySQL is a reasonable choice for a free game database, as is PostgreSQL.
One database. Each database table should refer to an actual data object.
For instance, create a table for all items, all creatures, all character classes, all treasures, etc.
Spend some time now and figure out how objects will relate to each other, as this will affect your database structure. For example, can a character have more than one character class? Can monsters have character classes? Can monsters carry items? Can rooms have more than one monster?
It seems pedantic, but you'll save yourself a whole lot of trouble early by figuring out what database objects "belong" to which other database objects.

Database recommendation

I'm writing a CAD (Computer-Aided Design) application. I'll need to ship a library of 3d objects with this product. These are simple objects made up of nothing more than 3d coordinates and there are going to be no more than about 300 of them.
I'm considering using a relational database for this purpose. But given my simple needs, I don't want anything complicated. So far, I'm leaning towards SQLite. It's small, runs within the client process and is claimed to be fast. Besides, I'm a poor guy and it's free.
But before I commit myself to SQLite, I just wish to ask your opinion whether it is a good choice given my requirements. Also is there any equivalent alternative that I should try as well before making a decision?
Edit:
I failed to mention earlier that the above-said CAD objects that I'll ship are not going to be immutable. I expect the user to edit them (change dimensions, colors etc.) and save back to the library. I also expect users to add their own newly-created objects. Kindly consider this in your answers.
(Thanks for the answers so far.)
The real thing to consider is what your program does with the data. Relational databases are designed to handle complex relationships between sets of data. However, they're not designed to perform complex calculations.
Also, the amount of data and relative simplicity of it suggests to me that you could simply use a flat file to store the coordinates and read them into memory when needed. This way you can design your data structures to more closely reflect how you're going to be using this data, rather than how you're going to store it.
Many languages provide a mechanism to write data structures to a file and read them back in again called serialization. Python's pickle is one such library, and I'm sure you can find one for whatever language you use. Basically, just design your classes or data structures as dictated by how they're used by your program and use one of these serialization libraries to populate the instances of that class or data structure.
edit: The requirement that the structures be mutable doesn't really affect much with regard to my answer - I still think that serialization and deserialization is the best solution to this problem. The fact that users need to be able to modify and save the structures necessitates a bit of planning to ensure that the files are updated completely and correctly, but ultimately I think you'll end up spending less time and effort with this approach than trying to marshall SQLite or another embedded database into doing this job for you.
The only case in which a database would be better is if you have a system where multiple users are interacting with and updating a central data repository, and for a case like that you'd be looking at a database server like MySQL, PostgreSQL, or SQL Server for both speed and concurrency.
You also commented that you're going to be using C# as your language. .NET has support for serialization built in so you should be good to go.
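Here is a minimal sketch of the serialization approach in Python, using the pickle module mentioned above (the same idea carries over to .NET's serializers). The library layout, a dict of shapes with a colour and a list of (x, y, z) vertices, is made up for the example. The temp-file-then-rename step is one way to make sure a user's save either completes fully or leaves the old file untouched.

```python
import os
import pickle
import tempfile


def save_library(shapes, path):
    """Write the whole library to a temp file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(shapes, f)
        os.replace(tmp_path, path)  # replaces the old file in a single step
    except BaseException:
        os.unlink(tmp_path)  # a failed save leaves the old library as it was
        raise


def load_library(path):
    # Only unpickle files your own application wrote.
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical shipped library containing one object.
library_path = os.path.join(tempfile.mkdtemp(), "library.pickle")
save_library(
    {"cube": {"color": "red",
              "vertices": [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]}},
    library_path,
)

# The user edits an object and saves it back.
library = load_library(library_path)
library["cube"]["color"] = "blue"
save_library(library, library_path)
reloaded = load_library(library_path)
```

With around 300 small objects, rewriting the whole file on every save should be cheap enough that nothing more elaborate is needed.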
I suggest you to consider using H2, it's really lightweight and fast.
When you say you'll have a library of 300 3D objects, I'll assume you mean objects for your code, not models that users will create.
I've read that object databases are well suited to help with CAD problems, because they're perfect for chasing down long reference chains that are characteristic of complex models. Perhaps something like db4o would be useful in your context.
How many objects are you shipping? Can you define each of these Objects and their coordinates in an xml file? So basically use a distinct xml file for each object? You can place these xml files in a directory. This can be a simple structure.
I would not use a SQL database. You can easily describe every 3D object with an XML file. Put these files in a directory and pack (zip) it all. If you need easy access to the metadata of the objects, you can generate an index file (with only the name or description) so that not all objects must be parsed and loaded into memory (nice if you have something like a library manager).
There are quick and easy SAX parsers available, and you can easily write an XML writer (or find some free code you can use for this).
Many similar applications use XML today. It's easy to parse and write, human readable, and doesn't need much space when zipped.
I have used SQLite; it's easy to use and easy to integrate with your own objects. But I would prefer a SQL database like SQLite for applications where you need good searching tools for a huge number of data records.
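As a sketch of how small the XML reading and writing code can be, here is a round trip using Python's standard xml.etree.ElementTree module. The element and attribute names (`object`, `vertex`, `x`, `y`, `z`) are an invented format for the example, not any standard.

```python
import xml.etree.ElementTree as ET


def shape_to_xml(name, vertices):
    """Serialize one object as a small XML document (one file per object)."""
    root = ET.Element("object", name=name)
    for x, y, z in vertices:
        ET.SubElement(root, "vertex", x=str(x), y=str(y), z=str(z))
    return ET.tostring(root, encoding="unicode")


def shape_from_xml(text):
    """Parse the document back into a name and a list of (x, y, z) tuples."""
    root = ET.fromstring(text)
    vertices = [
        (float(v.get("x")), float(v.get("y")), float(v.get("z")))
        for v in root.findall("vertex")
    ]
    return root.get("name"), vertices


xml_text = shape_to_xml("cube", [(0, 0, 0), (1, 0, 0.5)])
name, vertices = shape_from_xml(xml_text)
```

Each object's XML would go into its own file in the library directory; an index file listing just the names could be written the same way.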
For the specific requirement i.e. to provide a library of objects shipped with the application a database system is probably not the right answer.
First thing that springs to mind is that you probably want the file to be updatable, i.e. you need to be able to drop an updated file into the application without changing the rest of the application.
Second thing is that the data you're shipping is immutable - for this purpose therefore you don't need the capabilities of a relational db, just to be able to access a particular model with adequate efficiency.
For simplicity (sort of) an XML file would do nicely as you've got good structure. Using that as a basis you can then choose to compress it, encrypt it, embed it as a resource in an assembly (if one were playing in .NET) etc, etc.
Obviously if SQLite stores its data in a single file per database and if you have other reasons to need the capabilities of a db in your storage system then yes, but I'd want to think about the utility of the db to the app as a whole first.
SQL Server CE is free, has a small footprint (no service running), and is SQL Server compatible.
