DataSet, an old confusion

DataSet, an old confusion - dataset

When should I use DataSet instead of DataReader?
When I must use DataSet instead of DataReader?
When should I use data in a disconnected fashion?
When I must use data in a disconnected fashion?
N.B.
I am not asking which is better. I need to know the appropriate scenarios of the use of DataSet. I am programming in .net for couple of years but I never seriously needed it.

One scenario, When you want to pass data from one layer to another layer of your application you could use dataset. For more information Dataset and DataReader

A DataSet holds all of the needed data records in memory, whereas a DataReader reads records from a data connection one record at a time.
DataSets are commonly filled with data using DataReaders.
Use a DataReader when you need a high-performance, forward-only reader.
Use a DataSet when you need to do something that requires all of the data to be present at once, such as serialization or passing data between tiers. However, as others have pointed out, using List<T> rather than a DataSet object provides better separation of concerns between tiers.
See http://articles.sitepoint.com/article/dataset-datareader and http://msdn.microsoft.com/en-us/magazine/cc188717.aspx for more info on this.

Related

How do I bind a database to a ListView in WPF?

I have some data (about 1000 rows) that I want to display. I want the user to be able to filter and sort the data dynamicly. Also I have to be able to save and load the data. I am relativly new to databases and need your help deciding which approach is the best/most resonable:
1st variant:
This is my current approach. I am using a ListView/GridView that is bound to a ObservableCollection of RowItems. I can now use the View to sort and filter. Saving and loading is done via Serialisation.
2nd variant:
Use of a database to store the data. The data is then loaded into the same business objects as above.
3rd variant:
Bind the ListView directly to a database. I just found out that this is somehow possible and am still somewhat hazy on the details. Especially on the aspect of sorting and filtering (Could that be done via a database Query?)
4th variant:
Store data in database. Use ObservableCollection of bussines objects for binding. But filter (and sort?) via SQL query.
These are the ideas I have to fullfilling my requirements. I'd like to know which one is the best/easiest/best performed/etc. or if you suggest an other approach.enter code here

Peter, there are couple of points you need to consider:
Where possible never make tight integration between back end data and display. Makes it easier to change either later on. you can use NHibernate or other ORM to make it easier to represent and read/write data from database. That way you have layer/tier in between them (aka n-tier)
you will only need to make call to DB for reading the data, it is 1000 rows it is not too bad, you can read them in one go and keep it in memory. Any filtering/sorting/grouping that you can do it on the in memory DomainModel/Buisness Object without touching the library.
i would use mixture of variant 1) and 2).
I would use business object to represent your data from Database (NHibernate is quite handy). I would then perform all data filtering/sorting on the client without requering DB each time. Any writes to the database must go through some sort of Domain Model layer or ORM so that they are not tighly linked. That allows to me to chnage DB without effecting the GUI Front end

When is it a good idea to use a database

I am doing an information retrieval project in c++. What are the advantages of using a database to store terms, as opposed to storing it in a data structure such as a vector? More generally when is it a good idea to use a database rather than a data structure?

(Shawn): Whenever you want to keep the data beyond the length of the instance of the program. (persistence across time)
(Michael Kjörling): Whenever you want many instances of your program, either in the same computer or in many computers, like in a network or the Net, access and manipulate (share) the same data. (persistence across network space)
Whenever you have very big amount of data that do not fit into memory.
Whenever you have very complex data structures and you prefer to not rewrite code to manipulate them, e.g search, update them, when the db programmers have already written such code and probably much faster than the code you (or I)'ll write.

Whenever you want to keep the data beyond the length of the instance of the program?

In addition to Shawn pointing out persistence: whenever you want multiple instances of the program to be able to easily share data?
In-memory data structures are great, but they are not a replacement for persistence.

It really depends on the scope. For example if you're going to have multiple applications accessing the data then a database is better because you won't have to worry about file locks, etc. Also, you'd use a database when you need to do things like joining other data, sorting, etc... unless you like to implement Quicksort.

How to use Data aware controls "correctly"?

I would like to ask experienced users, if you prefer to use data aware controls to add, insert, delete and edit data in DB or you favor to do it manualy.
I developed some DB applications, in which for the sake of "user friendly policy" I run into complicated web of table events (afterinsert, afteredit, after... and beforeedit, beforeinsert, before...). After that it was a quite nasty work to debug the application.
Aware of this risk (later by another application) I tried to avoid this problem, so I paid increased attention to write code well, readable and comprehensive. It seemed everything all right from the beginning, but as I needed to handle some preprocessing stuff before sending and loading data etc, I run into the same problems again, "slowly and inevitably". Sometime I could not use dataaware controls anyway, and what seemed to be a "cool" feature of DAControl at the beginning it turned to an obstacle on the end. I "had to" write special routine for non-dataaware controls, in order to behave as dataaware. Then I asked myself, why on earth should I use dataaware controls? Is it better to found application architecture on non-dataaware controls? It requires more time to write bug-proof code, of course, but does it worth of it? I do not know...
I happened to me several times, like jinxed : paradise on the beginning hell on the end...
I do not know, if I use wrong method to write DB program, if there is some standard common practice how to proceed. Or if it is common problem to everybody?
Thanx for advices and your experiences

I've written applications that used data aware components against TTable style components and applications which used non-data aware components.
My preference these days is to use data aware components but with TClientDataSets rather than TTable style components.
Using a TClientDataSet I don't have to make my user interface structure mimic my database structure. It's flexible enough to fill it with the data from several tables and then when you are applying the updates back to the database you can manually add/delete/update records as you see fit.

The secret should be in DataSet parameter automation, you can create a control that glues datasets together in master-slave way, just by defining connections between them. Ofcourse such control should be fed with form parameters in some other generalized way. In this case calling form with entity identifier, all datasets will get filled in a proper order and will allow to update data in database automatically by provider.
Generally it is better to have DataSets being an exact representation of tables with optional calculated fields (fkInternalCalc sometimes works better as it updates with row change not field change) bound to data aware controls. Data aware controls are the most optimal approach, and less error prone. Like in every aspect, there are exceptions to that.
If you must write too many glue functions, the problem probably is in design pattern not in VCL.

A lot of the time I use data aware controls linked to an in-memory table (kbmMemTable) that is filled from a query.
The benefits I see are:
I have full control over all inserts/updates/posts/edits to the database.
No need to worry about a user leaving a record in update mode (potentially locking other users)
Did I mention full control over all inserts/updates/posts/edits?
Using the in-memory table is as easy as:
dataset.sql.add('select a.field,b.field from a,b');
dataset.open;
inMemoryTable.loadfromdataset(dataset);
inMemoryTable.checkpoint;
And then "resolving" back to the database, you are given access to the original and new data for each field in each record (similar in a way to a trigger) - you can easily transaction and resolve a whole edit back in milliseconds - even if it took the end user 30 mins to fill in the data aware controls.

Have you considered a O/R mapper for Delphi like tiOPF or hcOPF?
This will separate the business domain logic from the database layer. For big and legacy systems, it is even common to add another layer, the 'Anti Corruption Layer', which protects the model from changes in the database design.

DAL using typed dataset

Are there any performance issues in using Typed Data Sets as DAL? Is it a recommended approach? I am using it for listing purposes only (repeater). It has paging, sorting functionalities too.

It works through untyped DataSet and just incapsulates type casting.

A dataset includes a lot of information other than the data that you need in the list. Therefore, if you can read the data out in a different way it would give you less information to transfer and therefore better performance.
Having said that, depending on your app you may not notice the difference. Do whatever is easiest for you and then check the performance.

Store static data in an array, or in a database?

We always have some static data which can be stored in a file as an array or stored in a database table in our web based project. So which one should be preferred?
In my opinion, arrays have some advantages:
More flexible (it can be any structure, which specifies a really complex relation)
Better performance (it will be loaded in memory, which will have better read/write performance compared with a database's I/O operations)
But my colleague argued that he preferred DB approach, since it can keep a uniform data persistence interface, and be more flexible.
So which should be preferred? Or how can we choose? Or we should prefer one in some scenario and another in other scenarios? what are the scenarios?
EDIT:
Let me clarify something. Truly just as Benjamin made the change to the title, the data we want to store in an array(file) won't change so frequently, which means the code won't change the value of the array in the runtime. If the data change very frequently I will use DB undoubtedly. That's why I made such a post.
And sometimes it's hard to store some really complex relations like:
Task = {
"1" : {
"name" : "xx",
"requirement" : {
"level" : 5,
"money" : 100,
}
...
}
Just like the above code sample(a python dict or you can think it as an array), the requirement field is hard to store in DB(store a structure like pickled object directly in DB? not so good I think). So in such condition, I will prefer arrays.
So what's your idea? In such scenario, we should prefer arrays to DB, right?
Regards.

Lets be pragmatic/objetive:
Do you write to your data on runtime? Yes: Db, No: File
Do you update your data more than once per week? Yes: Db, No: File
It's a pain to release an updated data file? Yes: Db, No: File,
Do you read that data often? Yes: File/Cache, No: Db
It is a pain to update that data file and you need extra tools? Yes: db, No: File
For sure I've forgotten other points, but I guess the basics are there.

The "flexiable" array in a file is fraught with a zillion issues already delt with by using a DB. Unless you can prove that the DB is really going to way slower than using the other approach use a DB. Move on and start solving business problems.
Edit
Comment from OP asks what the issues with using a file might be, here are a handful (pause to take a deep breath).
Concurrency: You have to manage the situation where multiple requests may be trying to write back to the file. Not too hard but it becomes a bottleneck.
Performance: Yes modifying an in-memory array is quicker but how do you determine how much and when the array needs to be persisted to a file. Note that using a DB doesn't pre-clude the use of an appropriate in-memory cache. Writing a file back each time a small modification is made isn't going to perform that well.
Scalability: Really a function of the first two. In order to acheive any scalable goals you need to be able to quickly modify small bits of the data that is persisted. IWO if you don't use a DB you would end up writing one. If you find you need more than one webserver to support growing demand where are you going to store the file(s)? Now you've got file I/O over a network (ableit likely a very quick one).
Structure: Your code will be responsible for managing the structure of data, querying it etc if you use an array. How will you do that in way which acheives greater "flexibility" than using a DB? All manner of choices and complexity are needed here.
Reliability: You need to ensure the integrity of your persisted data. In the event of some failure your array/file code would need to ensure that data is at least not so corrupt that the application can continue.

Your colleague is correct, BUT there's where you need to put aside the comp sci textbook and be pragmatic. How often will you be accessing this data from your application? If it's fairly frequently then don't incur the costs of access overhead. Instead of reading from a flat file you could still gain the advantages of a db, but use a caching strategy in your application. Depending on your development language you could look at something like memcache or jtreecache.

It depends on what kind of data you are looking at, and whether or not it needs to be updated regularly.
I tend to keep most things (non-config data) in the database, even if the data isn't going to be repeating (e.g. thosands of rows). Databases will scale so much easier than a flat file, if your system starts to grow fast your flat file might become a burden to your system.

If the data doesn't change very oftern, and your programming in Java, why not use Spring to hold the values?
They can be injected into your bean, and changed easly.
but thats if you'r developing in Java.

Yeah I agree with your implied assessment that databases are overused and basic flat files may work in multitude of scenarios. If your application is read-only (and writes are done by the admin when app restarts) I would definitely go with the file. Even if application writes to the file, but only in append mode (vs random inserts/updates) in one thread, I would also use file. Anything else -- need a real database with random updates, queries, concurrency control etc.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight