I'm creating a windows forms application with NHibernate. It's an MDI application, so there is no limit to how many forms the user can have open at the same time (probably many).
For most forms I want to have an "OK" and a "Cancel" button. Both close the form, but "OK" also saves the modified data to the DB. The forms can be pretty complex, and the modifications are likely to touch a whole graph of objects, adding some, deleting some, and changing some more. It would be good if the changes could be automatically detected and persisted as needed, without the need to manually keep track of each of them.
What would be a good way to do this?
Extra information: I can make whatever DB schema I want. I'm using MSSQL 2008 and currently have decided for GUID primary keys (with guid.comb generator) and a TIMESTAMP column for optimistic concurrency.
I tried simply setting the FlushMode of an NHibernate ISession to Never, making all the modifications as needed, and then calling Flush() if the user clicked OK. But that didn't work.
A few possible solutions, assuming that you're using one ISession per form instance:
Call ISession.Clear if the user cancels.
Set the FlushMode to Commit and only initiate a transaction when the user clicks OK.
Stick with your original approach of using Manual FlushMode. It should work and the behavior you're seeing is not a bug.
Since you're using GUIDs for primary keys, you should not run into the issue of NH unexpectedly flushing a session because it needs the database to generate an identifier. You still need to be aware of scenarios where NH may flush before a select to ensure consistent results.
I don't think you need to worry about opening and closing the database connection. My understanding is that NH is very good about managing that.
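For illustration, a minimal sketch of options 1 and 3 together, assuming one ISession per form and two buttons; the class, field, and button names are illustrative, not from the question:

```csharp
using System;
using System.Windows.Forms;
using NHibernate;

public partial class EditForm : Form
{
    // Assumed to be opened in Form_Load with FlushMode.Never (a.k.a. Manual).
    private ISession session;

    private void btnOk_Click(object sender, EventArgs e)
    {
        using (var tx = session.BeginTransaction())
        {
            session.Flush();    // push the whole tracked change set in one go
            tx.Commit();
        }
        Close();
    }

    private void btnCancel_Click(object sender, EventArgs e)
    {
        session.Clear();        // discard all pending changes; nothing reaches the database
        Close();
    }
}
```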
This should help: NHibernate Best Practices with ASP.NET, 1.2nd Ed. (I know it's ASP.NET, but you should find the information useful and easily transferable to WinForms.)
In short, one should opt for a session-per-window architecture. Making it impossible to open two different windows for the same exact task makes sense here. Then you would open an NHibernate ISession in Form_Load(). At the end, if the user clicks OK, just BeginTransaction(), Flush() the session, and Commit() the transaction; otherwise roll it back.
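A hedged sketch of that flow, assuming a session-per-form design with an ISessionFactory handed to the form (EditForm, sessionFactory, and btnOk are illustrative names):

```csharp
using System;
using System.Windows.Forms;
using NHibernate;

public partial class EditForm : Form
{
    private readonly ISessionFactory sessionFactory;   // assumed to be supplied by the caller
    private ISession session;

    public EditForm(ISessionFactory sessionFactory)
    {
        InitializeComponent();
        this.sessionFactory = sessionFactory;
    }

    private void EditForm_Load(object sender, EventArgs e)
    {
        // One session per form; nothing is written until the transaction commits.
        session = sessionFactory.OpenSession();
        session.FlushMode = FlushMode.Commit;
        // ... load the object graph through this session and bind it to the controls
    }

    private void btnOk_Click(object sender, EventArgs e)
    {
        using (var tx = session.BeginTransaction())
        {
            tx.Commit();    // FlushMode.Commit flushes the whole change set here
        }                   // if Commit throws, disposing the transaction rolls it back
        DialogResult = DialogResult.OK;
        Close();
    }

    protected override void OnFormClosed(FormClosedEventArgs e)
    {
        // Cancel path: disposing an unflushed session simply discards the tracked changes.
        session?.Dispose();
        base.OnFormClosed(e);
    }
}
```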
Maybe if you wrap the whole thing in a transaction and commit the transaction when the user clicks OK?
We are running a fairly uncommon ERP system from a small IT business that doesn't allow us to modify data in bulk. We thought about doing a data update by exporting the data we wanted to change directly from the DB and using Excel VBA to update a batch of data across different tables. We now have the updated data in Excel, and it is supposed to be written back into the Oracle DB.
The vendor's support told us not to do so, because of all the triggers running in the background during a regular data update in their program. We are quite afraid of damaging the DB, so we are looking for the best way to do the data update without bypassing any trigger. To be more specific, there are several thousand changes we've made in different columns and tables, merged together in one Excel file. Now we have to be sure to insert the modified data into the DB while firing all the triggers the ERP software fires during a normal data update.
Is there anyone who knows a good way to do so?
I don't know what ERP system you are using, but I can relate some experiences from Oracle's E-Business Suite.
Nowadays, Oracle's ERP includes a robust set of APIs that will allow your custom programs to safely maintain ERP data. For example, if you want to modify a sales order, you use Oracle's API for that purpose and it makes sure all the necessary, related validations and logic are applied.
So, step #1 -- find out if your ERP system offers any APIs to allow you to safely update your data.
Back in the early days of Oracle's ERP, there were not so many APIs. In those days, when we needed to update a lot of tables and had no API available, the next approach would be to use some sort of data loader tool. The most popular was, in fact, called "Data Loader". What this would do is read your data from an Excel spreadsheet and send it to the ERP's user interface -- exactly as though it were being typed in by a user. Since the data went through the ERP's UI, all the necessary validations and logic would automatically be applied.
In really extreme cases, when there was no API and DataLoader was, for whatever reason, not practical, it was still sometimes deemed necessary and worth the risk to attempt our own direct update of the ERP tables. This is, in general, risky and a bad practice, but sometimes we do what we must.
In these cases, we would start a database trace going on a user's session as they keyed in a few updates via the ERP's user interface. Then, we would use the trace to figure out what validations and related logic we needed to apply during our custom direct updates. We would also analyze the source code of the ERP system (since we had it available in the case of Oracle's ERP). Then, we would test it extensively. And, after all that, it was still risky and also prone to break after upgrades. But, in general, it worked as a last resort.
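If step #1 does turn up a supported PL/SQL API, a hedged sketch of feeding the Excel rows through it from C# might look like this; the sheet name, the column names, and the procedure erp_api.update_item_price are hypothetical placeholders for whatever your ERP actually exposes:

```csharp
using System.Data;
using System.Data.OleDb;                    // reads the Excel workbook
using Oracle.ManagedDataAccess.Client;      // calls the ERP's PL/SQL API

class ErpPriceUpdate
{
    static void Main()
    {
        var changes = new DataTable();
        using (var xl = new OleDbConnection(
            @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\updates\prices.xlsx;" +
            "Extended Properties='Excel 12.0 Xml;HDR=YES'"))
        using (var da = new OleDbDataAdapter("SELECT * FROM [Updates$]", xl))
        {
            da.Fill(changes);               // one row per change from the spreadsheet
        }

        using (var db = new OracleConnection("User Id=...;Password=...;Data Source=ERP"))
        {
            db.Open();
            foreach (DataRow row in changes.Rows)
            {
                // Hypothetical API call: the ERP's own procedure runs its validations and triggers.
                using (var cmd = new OracleCommand("erp_api.update_item_price", db))
                {
                    cmd.CommandType = CommandType.StoredProcedure;
                    cmd.Parameters.Add("p_item_no", OracleDbType.Varchar2).Value = row["ItemNo"];
                    cmd.Parameters.Add("p_new_price", OracleDbType.Decimal).Value = row["NewPrice"];
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }
}
```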
No, my problem is that I need to do the work fast by adding some automation to my process. The work is already done in Excel, that's true, but the data needed the modification anyway. The only question is whether I put it in manually, copy-and-pasting into the DB through our ERP, or all at once through... I don't know what.
But I guess Mathew is right. There are validation processes in the ERP so we can't write it directly into the db.
I don't know; maybe you could contact me if you have an idea for bypassing the ERP in a non-risky manner.
This seems to be an issue that keeps coming back in every web application; you're improving the back-end code and need to alter a table in the database in order to do so. No problem doing manually on the development system, but when you deploy your updated code to production servers, they'll need to automatically alter the database tables too.
I've seen a variety of ways to handle these situations; each comes with its own benefits and problems. Roughly, I've come down to the following two possibilities:
Dedicated update script. Requires manually initiating the update. Requires all table alterations to be done in a predefined order (rigid release planning, no easy quick fixes on the database). Typically requires maintaining a separate updating process and some way to record and manage version numbers. Benefit is that it doesn't impact running code.
Checking table properties at runtime and altering them if needed. No manual interaction is required, and table alterations may happen in any order (so a quick fix on the database is easy to deploy). Another benefit is that the code is typically a lot easier to maintain. The obvious problem is that it checks table properties far more often than it needs to. (A minimal sketch of this approach follows.)
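For example, a minimal sketch of such a runtime check, assuming SQL Server and an illustrative table and column (the check is safe to run on every start-up because the ALTER only fires when the column is missing):

```csharp
using System.Data.SqlClient;

static class SchemaUpgrader
{
    // Adds the column only if it is missing, so this can run on every application start.
    public static void EnsureMiddleNameColumn(string connectionString)
    {
        const string sql = @"
            IF NOT EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS
                           WHERE TABLE_NAME = 'Customer' AND COLUMN_NAME = 'MiddleName')
                ALTER TABLE dbo.Customer ADD MiddleName NVARCHAR(100) NULL;";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}
```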
Are there any other general possibilities or ways of dealing with altering database tables upon application updates?
I'll share what I've seen work best. It's just expanding upon your first option.
The steps I've usually seen when updating schemas in production:
Take down the front end applications. This prevents any data from being written during a schema update. We don't want writes to fail because relationships are messed up or a table is suddenly out of sync with the application.
Potentially disconnect the database so no connections can be made. Sometimes there is code out there using your database you don't even know about!
Run the scripts as you described in your first option. It definitely takes careful planning. You're right that you need a pre-defined order to apply the changes. Also, I would note that you often need two sets of scripts: one for schema updates and one for data updates. As an example, if you want to add a field that is not nullable, you might add it as a nullable field first, run a script to put in a default value, and only then make the column non-nullable.
Have rollback scripts on hand. This is crucial because you might make all the changes you think you need (since it all worked great in development) and then discover the application doesn't work before you bring it back online. It's good to have an exit strategy so you aren't in that horrible place of "oh crap, we broke the application and we've been offline for hours and hours and what do we do?!"
Make sure you have backups ready to go in case (4) goes really bad.
Coordinate the application update with the database updates. Usually you do the database updates first and then roll out the new code.
(Optional) A lot of companies do partial roll outs to test. I've never done this, but if you have 5 application servers and 5 database servers, you can first roll out to 1 application/1 database server and see how it goes. Then if it's good you continue with the rest of the production machines.
It definitely takes time to find out what works best for you. From my experience doing lots of production database updates, there is no silver bullet. The most important thing is taking your time and being disciplined in tracking changes (versioning like you mentioned).
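To make the versioning point concrete, here is a minimal sketch of a script runner that records which numbered scripts have already been applied; the SchemaVersion table and the folder-of-scripts layout are assumptions for illustration, not something prescribed above:

```csharp
using System.Data.SqlClient;
using System.IO;
using System.Linq;

static class MigrationRunner
{
    // Applies 001_*.sql, 002_*.sql, ... in order, skipping scripts already recorded in SchemaVersion.
    // Note: scripts containing GO batch separators would need to be split before ExecuteNonQuery.
    public static void Run(string connectionString, string scriptFolder)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            Exec(conn, @"IF OBJECT_ID('dbo.SchemaVersion') IS NULL
                         CREATE TABLE dbo.SchemaVersion (ScriptName NVARCHAR(260) PRIMARY KEY,
                                                         AppliedOn  DATETIME NOT NULL DEFAULT GETDATE());");

            foreach (var path in Directory.GetFiles(scriptFolder, "*.sql").OrderBy(p => p))
            {
                var name = Path.GetFileName(path);
                using (var check = new SqlCommand(
                    "SELECT COUNT(*) FROM dbo.SchemaVersion WHERE ScriptName = @n", conn))
                {
                    check.Parameters.AddWithValue("@n", name);
                    if ((int)check.ExecuteScalar() > 0) continue;   // already applied
                }

                using (var tx = conn.BeginTransaction())
                {
                    Exec(conn, File.ReadAllText(path), tx);
                    Exec(conn, "INSERT INTO dbo.SchemaVersion (ScriptName) VALUES (@n)", tx, name);
                    tx.Commit();                                    // script + version row succeed together
                }
            }
        }
    }

    static void Exec(SqlConnection conn, string sql, SqlTransaction tx = null, string scriptName = null)
    {
        using (var cmd = new SqlCommand(sql, conn, tx))
        {
            if (scriptName != null) cmd.Parameters.AddWithValue("@n", scriptName);
            cmd.ExecuteNonQuery();
        }
    }
}
```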
I have a really odd user requirement. I have tried to explain to them that there are much better ways of supporting their business process, but they don't want to hear it. I am tempted to walk away, but I first want to see if maybe there is another way.
Is there any way that I can lock a whole database, as opposed to a row lock or table lock? I know I can perhaps put the database into single-user mode, but that means only one person can use it at a time. I would like many people to be able to read at a time, but only one person to be able to write to it at a time.
They are trying to do some really odd data migration.
What do you want to achieve?
Do you want to make the whole database read-only? You can definitely do that
Do you want to prevent any new clients from connecting to the database? You can definitely do that too
But there's really no concept of a "database lock" in the sense of only ever allowing one person to use the database; at least not in SQL Server, not that I'm aware of. What good would that do you, anyway?
If you want to do data migration out of this database, then setting the database into read-only mode (or creating a snapshot copy of it) will probably be sufficient and the easiest way to go.
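For reference, a minimal sketch of flipping a SQL Server database into read-only mode and back from C#; the database name MigrationDb is a placeholder, and the command is best run from a connection to master rather than to the database being altered:

```csharp
using System.Data.SqlClient;

static class DatabaseState
{
    // WITH ROLLBACK IMMEDIATE kicks out open writers so the state change doesn't wait forever.
    public static void SetReadOnly(string masterConnectionString, bool readOnly)
    {
        var sql = readOnly
            ? "ALTER DATABASE [MigrationDb] SET READ_ONLY  WITH ROLLBACK IMMEDIATE"
            : "ALTER DATABASE [MigrationDb] SET READ_WRITE WITH ROLLBACK IMMEDIATE";

        using (var conn = new SqlConnection(masterConnectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}
```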
UPDATE: for the scenario you mention (grab the data for people with laptops, and then re-synchronize), you should definitely check out ADO.NET Sync Services - that's exactly what it's made for!
Even if you can't use ADO.NET Sync Services, you should still be able to selectively and intelligently update your central database with the changes from laptops without locking the entire database. SQL Server has several methods to update rows even while the database is in use - there's really no need to completely lock the whole database just to update a few rows!
For instance: you should have a TIMESTAMP (or ROWVERSION) column on each of your data tables, which would easily allow you to see whether any changes have occurred at all. If the TIMESTAMP field (which is really just a counter - it has nothing to do with date or time) has not changed, the row has not changed and thus doesn't need to be considered for an update.
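A minimal sketch of using such a column during a sync pass, assuming an illustrative Customer table with a RowVer ROWVERSION column and that you stored the highest RowVer value from the previous sync:

```csharp
using System.Data;
using System.Data.SqlClient;

static class ChangeDetector
{
    // Returns only the rows whose ROWVERSION counter has moved past the last value we synced.
    public static DataTable GetChangedRows(string connectionString, byte[] lastSyncedVersion)
    {
        const string sql = @"
            SELECT CustomerId, Name, RowVer
            FROM   dbo.Customer
            WHERE  RowVer > @since
            ORDER  BY RowVer;";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.Add("@since", SqlDbType.Timestamp).Value = lastSyncedVersion;
            var changed = new DataTable();
            new SqlDataAdapter(cmd).Fill(changed);   // Fill opens and closes the connection itself
            return changed;
        }
    }
}
```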
Triggers seem like a simple solution for audit logging. Why should I use interceptors?
Database portability is one con of triggers...
What are the others?
The con of using anything except a trigger is that not all data changes take place through the GUI and therefore might not get logged. You have to consider that databases are changed from many sources, including data imports and set-based queries run from the query window (for instance, when someone is asked to update all prices by 10%). If you use another method, you had better make sure it captures every way data can be changed. If you use dynamic SQL at all, then all your tables are open to users making changes directly in the database, including fraudulent changes designed to steal from the company. Users committing fraud are one of the key things audit triggers are designed to catch.

If you think your audit solution is OK because it captures everything from the user interface and that is all it needs to capture, you are very, very wrong. I don't know how interceptors work, but you had better test with SSIS (or DTS) imports and queries from the query window before you conclude the solution will work. Also, even if it works from the GUI, remember there might be more than one GUI connecting to the database.
I think that the reason for using interceptors is twofold:
So that you don't tie yourself to a particular database; porting to a different DBMS becomes significantly easier.
So that your domain model doesn't bleed into other areas of your code, i.e. the database doesn't need to know whether a record has changed.
But all this depends on the context. If it is vital that every change to particular records is captured, then I think HLGEM is correct: triggers are the best way to handle that type of scenario.
I agree with HLGEM.
A good alternative that gives you the advantages of both triggers and DBMS portability is to use an auditing tool that:
Given an audit plan, generates the triggers for the appropriate DBMS
Pablo Javier
Another small issue is with triggers that do any DML. NHibernate uses the affected row count to determine success on a lot of its operations. If you are doing any inserts/updates/etc. inside your triggers, then you'll need to turn NOCOUNT on inside the triggers to prevent those spurious row counts from bubbling up.
Not that this, in any way, prevents you from making triggers work, but I've spent enough time refactoring away from this problem I thought it was worth mentioning. Interceptors, or EventListeners, are a simple, portable way to satisfy auditing requirements.
Plus, no more icky T-SQL code...
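For context, a minimal sketch of an NHibernate interceptor that logs updated properties; AuditLog.Write is a hypothetical sink (table, file, queue), and a production version would also handle inserts and deletes (or use event listeners instead):

```csharp
using NHibernate;
using NHibernate.Type;

// Registered via cfg.SetInterceptor(new AuditInterceptor()) or passed to OpenSession.
public class AuditInterceptor : EmptyInterceptor
{
    public override bool OnFlushDirty(object entity, object id,
        object[] currentState, object[] previousState,
        string[] propertyNames, IType[] types)
    {
        if (previousState == null) return false;   // nothing to compare against

        for (int i = 0; i < propertyNames.Length; i++)
        {
            if (!Equals(currentState[i], previousState[i]))
            {
                // Hypothetical audit sink: record entity, key, property, old and new value.
                AuditLog.Write(entity.GetType().Name, id, propertyNames[i],
                               previousState[i], currentState[i]);
            }
        }
        return false;   // we only observed the state, we did not modify it
    }
}
```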
Triggers aren't easily testable and they're actually quite difficult to write properly. And if your audit data is to be consumed by business users, it's often difficult to translate from database row-level operations back into the domain model.
I'm of the opinion that a database is really just the persistence area of an application. Of a single application. In other words, I don't think that other systems should use my database directly and so I think auditing to be done outside of the database (i.e. not with triggers).
Is it good practice to delegate data validation entirely to the database engine constraints?
Validating data from the application doesn't prevent invalid insertion from another software (possibly written in another language by another team). Using database constraints you reduce the points where you need to worry about invalid input data.
If you put validation in both the database and the application, maintenance becomes tedious, because you have to update code for who knows how many applications, increasing the probability of human error.
I just don't see this being done very much, looking at code from free software projects.
Validate at input time. Validate again before you put it in the database. And have database constraints to prevent bad input. And you can bet in spite of all that, bad data will still get into your database, so validate it again when you use it.
It seems like every day some web app gets hacked because they did all their validation in the form or, worse, in JavaScript, and people found a way to bypass it. You've got to guard against that.
Paranoid? Me? No, just experienced.
It's best to, where possible, have your validation rules specified in your database and use or write a framework that makes those rules bubble up into your front end. ASP.NET Dynamic Data helps with this and there are some commercial libraries out there that make it even easier.
This can be done both for simple input validation (like numbers or dates) and related data like that constrained by foreign keys.
In summary, the idea is to define the rules in one place (the database, most of the time) and have code in the other layers that enforces those rules.
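One hedged illustration of that "bubbling up", assuming SQL Server and a WinForms text box: read a column's maximum length from the schema metadata and apply it to the control, so the limit is defined only once (the table, column, and control are illustrative):

```csharp
using System;
using System.Data.SqlClient;
using System.Windows.Forms;

static class ValidationRules
{
    // Pulls a column's length limit from the schema so the UI never drifts out of sync with the DB.
    public static void ApplyMaxLength(TextBox textBox, string connectionString,
                                      string table, string column)
    {
        const string sql = @"
            SELECT CHARACTER_MAXIMUM_LENGTH
            FROM   INFORMATION_SCHEMA.COLUMNS
            WHERE  TABLE_NAME = @t AND COLUMN_NAME = @c;";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@t", table);
            cmd.Parameters.AddWithValue("@c", column);
            conn.Open();
            var result = cmd.ExecuteScalar();
            // CHARACTER_MAXIMUM_LENGTH is -1 for NVARCHAR(MAX); keep the control's default then.
            if (result != null && result != DBNull.Value && Convert.ToInt32(result) > 0)
                textBox.MaxLength = Convert.ToInt32(result);
        }
    }
}
```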
The disadvantage to leaving the logic to the database is then you increase the load on that particular server. Web and application servers are comparatively easy to scale outward, but a database requires special techniques. As a general rule, it's a good idea to put as much of the computational logic into the application layer and keep the interaction with the database as simple as possible.
With that said, it is possible that your application may not need to worry about such heavy scalability issues. If you are certain that database server load will not be a problem for the foreseeable future, then go ahead and put the constraints on the database. You are quite correct that this improves the organization and simplicity of your system as a whole by keeping validation logic in a central location.
There are other concerns than just SQL injection with input. You should take the most defensive stance possible whenever accepting user input. For example, a user might be able to enter a link to an image into a textbox, which is actually a PHP script that runs something nasty.
If you design your application well, you should not have to laboriously check all input. For example, you could use a Forms API which takes care of most of the work for you, and a database layer which does much the same.
This is a good resource for basic checking of vulnerabilities:
http://ha.ckers.org/xss.html
By the time the data gets to your database, it's far too late to provide meaningful validation for your users and applications. You also don't want your database doing all the validation, since that will slow things down quite a bit, and the database doesn't express the logic as clearly. Similarly, as you grow you'll be writing more application-level transactions to complement your database transactions.
I would say it's potentially a bad practice, depending on what happens when the query fails. For example, if your database could throw an error that was intelligently handled by an application, then you might be ok.
On the other hand, if you don't put any validation in your app, you might not have any bad data, but you may have users thinking they entered stuff that doesn't get saved.
Implement as much data validation as you can at the database end without compromising other goals. For example, if speed is an issue, you may want to consider not using foreign keys, etc. Furthermore, some data validation can only be performed on the application side, e.g., ensuring that email addresses have valid domains.
Another disadvantage of doing data validation in the database is that you often don't validate the same way in every case. In fact, it often depends on application logic (user roles), and sometimes you might want to bypass validation altogether (cron jobs and maintenance scripts).
I've found that doing validation in the application, rather than in the database, works well. Of course then, all the interaction needs to go through your application. If you have other applications that work with your data, your application will need to support some sort of API (hopefully REST).
I don't think there is one right answer, it depends on your use.
If you are going to have a very heavily used system, with the potential that the database performance might become a bottleneck, then you might want to move the responsibility for validation to the front-end where it is easier to scale with multiple servers.
If you have multiple applications interacting with the database, then you might not want to replicate and maintain the validation rules across multiple applications, so then the database might be the better place.
You might want a slicker input screen that doesn't just hit the user with validation warnings when they try to save a record; maybe you want to validate a field after data has been entered and it loses focus, or even as the user types, changing the font colour as validation fails or passes.
Also related to constraints are warnings about suspect data. In my application I have hard constraints in the database (e.g. someone can't start a job before their date of birth), but then in the front end I have warnings for data that is possibly correct but suspect (e.g. an eight-year-old starting a job).
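A hedged WinForms-style sketch of that split: the field is checked when it loses focus, genuinely invalid input blocks the user, and merely suspect values only trigger a visual warning (the control names, the errorProvider component, and the age threshold are all illustrative):

```csharp
using System;
using System.Drawing;
using System.Windows.Forms;

public partial class EmployeeForm : Form
{
    private DateTime dateOfBirth;   // assumed to have been loaded with the record

    private void txtStartDate_Validating(object sender, System.ComponentModel.CancelEventArgs e)
    {
        if (!DateTime.TryParse(txtStartDate.Text, out var startDate))
        {
            e.Cancel = true;                                     // hard failure: not even a date
            errorProvider.SetError(txtStartDate, "Enter a valid date.");
            return;
        }
        errorProvider.SetError(txtStartDate, string.Empty);

        // Suspect but not impossible: warn with colour and let the user carry on.
        var ageAtStart = (startDate - dateOfBirth).TotalDays / 365.25;
        txtStartDate.ForeColor = ageAtStart < 14 ? Color.DarkOrange : SystemColors.WindowText;
    }
}
```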