Web services and database concurrency - database

I'm building a .NET client application (C#, WinForms) that uses a web service for interaction with the database. The client will be run from remote locations using a WAN or VPN, hence the idea of using a web service rather than direct database access.
The issue I'm grappling with right now is how to handle database concurrency. That is, if two people from different locations update the same data, how do I deal with it? I'm considering using timestamps on each database record and having that as part of the update where clauses, but that means that the timestamps have to move back and forth through the web service interface, which seems kind of ugly.
What is the best way to approach this?

I don't think you want your web service to talk directly to the database. You probably want your service to interact with some type of business components, which in turn interact with a data access layer. Any concurrency exceptions can be passed from the DAL up to the business layer where they can be handled so that the web service never has to see the timestamps.
But if you are passing something like a data table up to the client and you want to avoid timestamps, you can do concurrency checking by comparing field by field. The Table Adapter wizards generate this type of concurrency checking by default if you ask for optimistic concurrency checking.
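The version check described in the question can be sketched as follows. This is a minimal illustration using Python and SQLite rather than .NET and SQL Server; the `customer` table, its integer `version` column (standing in for a timestamp/rowversion) and the `update_customer` helper are assumptions made for the example, not part of the original posts.

```python
import sqlite3

def update_customer(conn, customer_id, new_name, expected_version):
    """Update the row only if it still has the version the client originally read."""
    cur = conn.execute(
        "UPDATE customer SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, customer_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when another writer changed the row first
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Alice', 1)")
conn.commit()

# Two clients both read version 1; the first write succeeds, the second is rejected.
first = update_customer(conn, 1, "Alice A.", expected_version=1)
second = update_customer(conn, 1, "Alice B.", expected_version=1)
```

The business layer can turn a `False` result into a concurrency error for the caller, so only a small version token (or nothing at all, if the check is done field by field) has to cross the service interface.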

If your collisions occur infrequently enough that they can be resolved manually, a simple solution is to add an update trigger that copies a row's pre-update values to an audit table. This way the most recent write is the "winner", but no data is ever lost to an overwrite, and an administrator can restore an earlier row state or even combine them.
This technique has its downsides, and is not a very good solution where frequent overwrites are common.
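A rough sketch of the audit-trigger idea, again using Python with SQLite for illustration (trigger syntax differs on SQL Server). The `product` and `product_audit` tables and the trigger name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL);
CREATE TABLE product_audit (
    audit_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    product_id INTEGER,
    old_name   TEXT,
    old_price  REAL,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- Copy the pre-update values of the row into the audit table on every update.
CREATE TRIGGER product_keep_history BEFORE UPDATE ON product
BEGIN
    INSERT INTO product_audit (product_id, old_name, old_price)
    VALUES (OLD.id, OLD.name, OLD.price);
END;
INSERT INTO product VALUES (1, 'Widget', 9.99);
""")

# Two successive overwrites: the last write wins, earlier states stay recoverable.
conn.execute("UPDATE product SET price = 10.49 WHERE id = 1")
conn.execute("UPDATE product SET price = 11.00 WHERE id = 1")
conn.commit()

history = conn.execute(
    "SELECT old_price FROM product_audit ORDER BY audit_id"
).fetchall()
current = conn.execute("SELECT price FROM product WHERE id = 1").fetchone()[0]
```

Here `history` holds the two overwritten prices in order, which is what an administrator would use to restore or merge an earlier state.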

Also, this is slightly off-topic, but using web services isn't necessarily the way to go just because the clients will be remoting into the network. ASP.NET web services are XML-based and very verbose. If your client application can count on being always connected, you'd be better off not using web services.

Related

Data Migration from Legacy Data Structure to New Data Structure

Ok So here is the problem we are facing.
Currently:
We have a ton of Legacy Applications that have direct database access
The data structure in the database is not normalized
The current process / structure is used by almost all applications
What we are trying to implement:
Move all functionality to a RESTful service so no application has direct database access
Implement a normalized data structure
The problem we are having is how to implement this migration not only with the Applications but with the Database as well.
Our current solution is to:
Identify all the CRUD functionality and implement this in the new Web Service
Create the new Applications to replace the Legacy Apps
Point the New Applications to the new Web Service ( Still Pointing to the Old Data Structure )
Migrate the data in the databases to the new Structure
Point the New Applications to the new Web Service ( Point to new Data Structure )
But as we are discussing this process we are looking at having to rewrite the New Web Service twice. Once for the Old Data Structure and Once for the New Data Structure, As currently we could not represent the old Data Structure to fit the new Data Structure for the new Web Service.
I wanted to know if anyone has faced any challenges like this and how did you overcome these types of issues/implementation and such.
EDIT: More explanation of synchronization using bi-directional triggers; updates for syntax, language and clarity.
Preamble
I have faced similar problems on a data model upgrade on a large web application I worked on for 7 years, so I feel your pain. From this experience, I would propose something a bit different - but hopefully one that will be a lot easier to implement. But first, an observation:
Value to the organisation is the data - data will long outlive all your current applications. The business will constantly invent new ways of getting value out of the data it has captured which will engender new reports, applications and ways of doing business.
So getting the new data structure right should be your most important goal. Don't trade getting the structure right against other short term development goals, especially:
Operational goals such as rolling out a new service
Report performance (use materialized views, triggers or batch jobs instead)
This structure will change over time so your architecture must allow for frequent additions and infrequent normalizations to it. This means that your data structure and any shared APIs to it (including RESTful services) must be properly versioned.
Why RESTful web services?
You mention that you will "Move all functionality to a RESTful service so no application has direct database access". I need to ask a very important question with respect to the legacy apps: Why is this important and what value has it brought?
I ask because:
You lose ACID transactions (each call is a single transaction unless you implement some horrifically complicated WS-* standards)
Performance degrades: Direct database connections will be faster (no web server work and translations to do) and have less latency (typically 1ms rather than 50-100ms) which will visibly reduce responsiveness in applications written for direct DB connections
The database structure is not abstracted from the RESTful service, because you acknowledge that with the database normalization you have to rewrite the web services and rewrite the applications calling them.
And the other cross-cutting concerns are unchanged:
Manageability: Direct database connections can be monitored and managed with many generic tools
Security: direct connections are more secure than web services that your developers will write,
Authorization: The database permission model is very advanced and as fine-grained as you could want
Scalability: The web service is a (only?) direct-connected database application and so scales only as much as the database
You can migrate the database and keep the legacy applications running by maintaining a legacy RESTful API. But what if we can keep the legacy apps without introducing a 'legacy' RESTful service?
Database versioning
Presumably the majority of the 'legacy' applications use SQL to directly access data tables; you may have a number of database views as well.
One approach to the data migration is that the new database (with the new normalized structure in a new schema) presents the old structure as views to the legacy applications, typically from a different schema.
This is actually quite easy to implement, but solves only reporting and read-only functionality. What about legacy application DML? DML can be solved using:
Updatable views for simple transformations
Introducing stored procedures where updatable views are not possible (eg "CALL insert_emp(?, ?, ?)" rather than "INSERT INTO EMP (col1, col2, col3) VALUES (?, ?, ?)").
Have a 'legacy' table that synchronizes with the new database with triggers and DB links.
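As a rough illustration of the first option, the sketch below presents a flat "legacy" shape as a view over two normalized tables and makes it writable with INSTEAD OF triggers. It uses Python with SQLite purely so that it runs standalone; the `person`, `address` and `legacy_emp` names are hypothetical, and trigger and view syntax will differ on other database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- New, normalized structure
CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE address (person_id INTEGER REFERENCES person(id), city TEXT NOT NULL);

-- Old, flat structure presented to legacy applications as a view
CREATE VIEW legacy_emp AS
    SELECT p.id AS emp_id, p.name AS emp_name, a.city AS emp_city
    FROM person p JOIN address a ON a.person_id = p.id;

-- Legacy INSERT statements are redirected to the normalized tables
CREATE TRIGGER legacy_emp_insert INSTEAD OF INSERT ON legacy_emp
BEGIN
    INSERT INTO person (id, name) VALUES (NEW.emp_id, NEW.emp_name);
    INSERT INTO address (person_id, city) VALUES (NEW.emp_id, NEW.emp_city);
END;

-- Legacy UPDATE statements likewise
CREATE TRIGGER legacy_emp_update INSTEAD OF UPDATE ON legacy_emp
BEGIN
    UPDATE person  SET name = NEW.emp_name WHERE id = OLD.emp_id;
    UPDATE address SET city = NEW.emp_city WHERE person_id = OLD.emp_id;
END;
""")

# Unchanged "legacy" SQL, now running against the view
conn.execute("INSERT INTO legacy_emp (emp_id, emp_name, emp_city) VALUES (1, 'Ann', 'Leeds')")
conn.execute("UPDATE legacy_emp SET emp_city = 'York' WHERE emp_id = 1")
conn.commit()

legacy_rows = conn.execute("SELECT emp_id, emp_name, emp_city FROM legacy_emp").fetchall()
new_rows = conn.execute("SELECT city FROM address WHERE person_id = 1").fetchall()
```

There is only one copy of the data here, which is the main attraction compared with a synchronized legacy table.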
Having a legacy-format table with bi-directional synchronization to the new format table(s) using triggers is a brute-force solution and relatively ugly.
You end up with identical data in two different schemas (or databases) and the possibility of data going out-of-sync if the synchronization code has bugs - and then you have the classic issues of the "two master" problem. As such, treat this as a last resort, for example when:
The fundamental structure has changed (for example, changing the cardinality of a relation), or
The translation to the legacy format is a complex function (eg if the legacy column is the square of the new-format column value and is set to "4", an updatable view cannot determine if the correct value is +2 or -2).
When such changes are required in your data, there will be some significant change in code and logic somewhere. You could implement in a compatibility layer (advantage: no change to legacy code) or change the legacy app (advantage: data layer is clean). This is a technical decision by the engineering team.
Creating a compatibility database of the legacy structure using the approaches outlined above minimizes changes to legacy applications (in some cases, the legacy application continues without any code change at all). This greatly reduces development and testing costs (for which there is no net functional gain to the business), and greatly reduces rollout risk.
It also allows you to concentrate on the real value to the organisation:
The new database structure
New RESTful web services
New applications (potentially built using the RESTful web services)
Positive aspect of web services
Please don't read the above as a diatribe against web services, especially RESTful web services. When used for the right reason, such as for enabling web applications or integration between disparate systems, this is a good architectural solution. However, it might not be the best solution for managing your legacy apps during the data migration.
What it seems like you ought to do is define a new data model ("normalized") and build a mapping from the normalized model back to the legacy model. Then you can replace legacy direct calls with calls on the normalized one at your leisure. This breaks no code.
In parallel, you need to define what amounts to a (centralized) legacy db API, and map it to your normalized model. Now, at your leisure, replace the original legacy db calls with calls on the legacy db API. This breaks no code.
Once the original calls are completely replaced, you can switch the data model over to the real normalized one. This should break no code, since everything is now going against the legacy db API or the normalized db API.
Finally, you can replace the legacy db API calls and related code, with revised code that uses the normalized data API. This requires careful recoding.
To speed all this up, you want an automated code transformation tool to implement the code replacements.
This document seems to have a good overview: http://se-pubs.dbs.uni-leipzig.de/files/Cleve2006CotransformationsinDatabaseApplicationsEvolution.pdf
This seems like a very messy situation, and I don't think there's a "clean" solution. I've been through similar situations a couple of times - they weren't much fun.
Firstly, the effort of changing your client apps is going to be significant - if the underlying domain changes (by introducing the concept of an address that is separate from a person, for instance), the client apps also change - it's not just a change in the way you access the data. The best way to avoid this pain is to write your API layer to reflect the business domain model of the future, and glue your old database schema into that; if there are new concepts you cannot reflect using the old data (e.g. "get /app/addresses/addressID"), throw a NotImplemented error. Where you can reflect the new model with the old data, wire it together as best you can, and then re-factor under the covers.
Secondly, that means you need to build versioning into your API as a first-class concern - so you can tell clients that in version 1, features x, y and z throw "NotImplemented" exceptions. Each version should be backwards compatible, but add new features. That way, you can refactor features in version 1 as long as you don't break the service, and implement feature x in version 1.1, feature y in version 1.2 etc. Ideally, have a roadmap for your versions, and notify the client app owners if you're going to stop supporting a version, or release a breaking change.
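A minimal sketch of what versioned features with "NotImplemented" responses might look like. Everything here is hypothetical: the feature names, the version numbers and the stub handlers are only there to show the dispatch idea, not a real web framework.

```python
# Hypothetical table: the first API version in which each feature works.
AVAILABLE_SINCE = {
    "get_person": (1, 0),   # can be answered from the old schema
    "get_address": (1, 1),  # needs a concept the old schema cannot express
}

# Stub handlers; real ones would query the old or new data structure.
HANDLERS = {
    "get_person": lambda person_id: {"id": person_id, "source": "legacy schema"},
    "get_address": lambda address_id: {"id": address_id, "source": "new schema"},
}

class VersionedApi:
    def __init__(self, version):
        self.version = version  # e.g. (1, 0)

    def call(self, feature, *args):
        since = AVAILABLE_SINCE.get(feature)
        if since is None or self.version < since:
            raise NotImplementedError(
                f"{feature} is not available in version {self.version}"
            )
        return HANDLERS[feature](*args)

api_1_0 = VersionedApi((1, 0))
api_1_1 = VersionedApi((1, 1))
```

With this shape, `api_1_0.call("get_address", 3)` raises `NotImplementedError`, while the same call on `api_1_1` succeeds, and `get_person` works in both, so version 1.1 stays backwards compatible with 1.0.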
Thirdly, a set of automated integration tests for your API is the best investment you can make - they confirm that you've not broken features as you refactor.
Hope this is of some use - I don't think there's a single, straightforward answer to your question.

Is WCF recommended to use with WPF and MVVM to retrieve data from SQL Server?

I am building a desktop application that will be used on local network, with SQL Server as database.
This application would have around 50 users top at the same time. In what particular scenario would I need to use WCF service? Is it recommended to create a WCF service on the server computer where database would reside, so we connect to this server through WCF service, instead of connecting to the database directly? What is the recommended way to connect to SQL Server data and why?
Edit: Let me explain in more detail. I have used WCF RIA Services before, so I know how they work. Let's assume that WCF services work in the same way. The question was directed toward why we would use WCF instead of directly connecting to the database. I didn't want to specify my current application requirements, since I would get a specific answer for a specific requirement. My goal was to understand in general why and when you would use one instead of the other. And I have received satisfying answers so far.
It appears to me that general consensus is to use WCF only if there would be a demand of another type of application, which would use web access to get data from service. Also, if I understood correctly, from security point of view, there is no difference between the two.
There would be a statistical app in the future that uses the web to provide read-only statistics to users, and naturally some service will be required for this task (the application has no specific client in mind; it will be offered to lots of clients). Since I need a demo application done very rapidly for particular clients, I am thinking of skipping the service part for now and doing proper layering (WPF->VM->Model->EF), so that later I could just insert a service between the model and EF. I guess it should not take too much time to get the WPF app running with the inserted layer. I am also postponing the service for the following reason: since HTML5 is (going to be) the main technology for the web, and there is a possibility that SL will be abandoned as a technology (which I have been using), the logical decision would be to choose HTML5 over SL. But since I am totally unfamiliar with HTML5 and its requirements, I am not sure if a WCF service is the best choice for it, and this is also one of the reasons to postpone the decision of choosing the service type (along with the requirement to make the desktop demo app as fast as possible).
I think a better way to consider the question is whether you should abstract your database and data access layer from the application using a service interface. You could use WCF and SOAP or you could use a REST-based HTTP service; the choice of technology is secondary to whether the current or future requirements of your application indicate that an additional layer of abstraction is warranted.
Reasons you might consider using a service interface instead of directly connecting to the SQL database include but are not limited to:
Ease of supporting multiple operating systems/client UIs
Ability to evolve the data/service interface separately from your database schema
Isolate application from changes to database schema or location (you don't have to redeploy change to application, only change internals of the services it is calling)
If data could be used by other systems, you have a standard means of allowing these systems to interface with the data your application is managing
Reduced SQL database connection security concerns (only service identity connects to database, allowing you to use a variety of authentication/authorization strategies on the client side)
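One way to keep the decision open is to have the application depend on a small interface and choose the implementation later. The sketch below is an assumption-laden illustration in Python (the thread is about .NET): `CustomerStore`, `DirectSqlStore`, `ServiceStore` and the fake transport are all hypothetical names.

```python
import sqlite3
from typing import Callable, Optional, Protocol

class CustomerStore(Protocol):
    """What the application needs, independent of where the data lives."""
    def get_name(self, customer_id: int) -> Optional[str]: ...

class DirectSqlStore:
    """Implementation that talks to the database directly."""
    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def get_name(self, customer_id: int) -> Optional[str]:
        row = self.conn.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None

class ServiceStore:
    """Implementation that forwards requests to a service via a transport callable."""
    def __init__(self, send: Callable[[dict], Optional[str]]) -> None:
        self.send = send

    def get_name(self, customer_id: int) -> Optional[str]:
        return self.send({"op": "get_name", "id": customer_id})

def greeting(store: CustomerStore, customer_id: int) -> str:
    """Application code: unaware of which implementation it was given."""
    name = store.get_name(customer_id)
    return f"Hello, {name}" if name else "Unknown customer"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Alice')")

direct = DirectSqlStore(conn)
# Fake transport for the example; a real one would make a network call.
via_service = ServiceStore(lambda request: direct.get_name(request["id"]))
```

`greeting` returns the same result with either store, which is the property that lets a service layer be inserted later without touching the UI code.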
The trade-off you are looking at is the time/cost/complexity of implementing a service interface versus the flexibility and maintainability benefits you will gain. You should evaluate the needs of your application and your customer before you make a decision on whether to connect directly to your data store using ADO.NET or use a service layer.
You should take a look at the Microsoft Service Layer Guidelines as they cover a lot of the considerations to take into account.
Unless you need to create a reusable service, I can't think of a reason to add a WCF layer, unless you are just looking for a reason to do it. I think you can just go with some sort of ORM like EF or NHibernate and be happy.
The main reason for WCF is security. If the client connects directly to the DB then the client must be given rights on tables. The client can hack into the connection and use T-SQL directly. You must expose port 1433 to the network in a single-tier application. With WCF there is no direct access from the client to SQL. It is not just more secure in general, but you can also have more granular security. .NET service code can enforce row-level security. A table only has column-level security. If this is a business on a private network and you don't expect anyone would try to hack into your DB, then having the client connect directly to SQL Server is easier to build. With a server-side service, the other factor is that a change to server-side code is made in one spot, so you don't have to update 50 devices.
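To make the row-level point concrete, here is a small sketch of a service method that filters rows by the caller's identity. It is written in Python with SQLite for illustration only; the `orders` table and the `list_orders` function are hypothetical, and authentication of `caller` is assumed to have happened elsewhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, owner TEXT, total REAL);
INSERT INTO orders VALUES (1, 'alice', 10.0), (2, 'bob', 25.0), (3, 'alice', 5.0);
""")

def list_orders(conn, caller):
    """Service method: the client supplies no SQL, only its (already authenticated) identity."""
    return conn.execute(
        "SELECT id, total FROM orders WHERE owner = ? ORDER BY id",
        (caller,),
    ).fetchall()

alice_orders = list_orders(conn, "alice")
bob_orders = list_orders(conn, "bob")
```

Because the filter lives in server-side code, a client cannot widen it, whereas a client holding table-level rights could read every row.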

Web application vs. web services vs. classic application

Please I need help.
I have project in which I need application which communicates with local DB server and simultaneously with central remote DB server to complete some task(read stock quotas from local server create order and then write order to central orders DB,...).
So, I don't know which architecture and technology to use for this.
Web application, .NET WinForms client applications on each computer, or web services based central application with client applications?
What are the general differences between these approaches?
Thanks
If you don't want to expose your database directly to the clients, I'd recommend having a web service layer in between. Depending on the sensitivity of your data and the security level of your network, I'd recommend either a web service approach (where you can manage the encryption of data yourself, and without need for expensive ssl certificates) or a web interface (which might be easier to construct, but with limitations in security).
I agree with Tomas that a web service layer might be good. However, when it comes to choosing between webforms or winforms I don't think your question includes enough information to make the choice.
I'd say that if you want a powerful and feature-rich user interface and want to make development easy, WinForms is probably the way to go. But if you need it to be usable from a varied array of clients and want easier maintenance and deployment, a web app might be best.
First, focus on the exact relationship between these databases. What does "local" mean? Right there on the user's desktop? Shared between all the users in their office? Presumably the local quotes (you do mean stock quotes and not quotas?) could potentially be a little out of date relative to the central order server's view of the world. Does that matter? I place an order for 100 X at price 78.34; the real price may be different. What is the intended behaviour?
My guess is that there is at least some business logic and so we need to decide where that runs. One (thick client) approach is to put that logic on the desktop, the desktop app then might write directly to the central DB. I don't tend to do this for several reasons:
Every client desktop gets a database connection. Scaling is not good, eventually the database gets unhappy when the number of users gets very large.
If we need a slightly different app, perhaps exposed to a different set of users via the Web or whatever, we end up reproducing that business logic.
An alternative approach (thin or browser-based) keeps the UI on the desktop, but puts the logic on the server. The client can then invoke some kind of service. There are lots of possible ways of doing that; a simple Web Service or REST service will do the job. I hope it's clear that this service-based approach addresses my two points above.
By symmetry I would treat the local databases in the same way, wrap them in services. However it's possible that some more complex relationship between the databases exists and in which case you might need the local service layer to interact with the central service layer.
I'm touting the general principle of Don't Repeat Yourself: implement each piece of business logic once.

Using a web service to secure a database

There are some rumors floating around that the team at my company will soon be using web services for all future application development. The architecture is supposed to be something like this:
Application --> Web Service --> Database
The stated reasoning behind it is security. This sounds like a huge waste of time for little if any benefit. My question is, in what ways does a web service make your data more secure than a database? I would think that if an attacker wanted to get all your data and had already gotten onto the app server, it would be fairly trivial to figure out how the application is getting its data.
Please keep in mind that these web services would be purely for data, and would have little if any business/validation logic, and would also be outside the application developers control (at least that's the way it's worked with all previous applications that have used web services).
If it's true that there will be no business logic or validation on the web services, then there is only a limited security benefit to adding the additional layer of abstraction. I say limited because the interface between your application and the database is still more limited than if they were directly talking to each other.
If you add validation and business logic to the equation, there is a significant security benefit, as anyone who has access to the application account can only do to the database what the application is able to do. Additionally, this is a better design because it reduces coupling between your application and implementation details of how the data is stored in the database. If you wanted to change the database schema, you only need to update the web services, and not entire applications.
One important thing about Web Services is interoperability so that different applications from different platforms later can utilize the services and data. Your company will benefit a lot by doing so. And you are right about the security, it is definitely one of the good reasons to use web service rather than expose a public endpoint of the database, it is dangerous!
Web services make your data accessible. For example, your data can be accessed within a browser by JavaScript. There is no way to access the database on the server directly from JavaScript.
All in all, go for it, that is the right approach.
The security argument is questionable; authenticating to a web service is no different than authenticating to the database.
There are legitimate reasons for moving db operations to web services and SOA in general, but security isn't one of them.
If you use a web service, hopefully you will also be using some kind of queue when sending the data to the database. With a web service and queue combination, the security benefit is a lower chance of lost data. Without it, if you send data to the database and it never gets there, it has nowhere to go; it just disappears.
You are correct, though: if someone wants to break into your system, a web service isn't going to help. If anything it might make things worse: if you make the web service public and they find its name, they can query your DB through the web service, and any security features on your servers will just think it is your applications getting the information.

How do you keep two related, but separate, systems in sync with each other?

My current development project has two aspects to it. First, there is a public website where external users can submit and update information for various purposes. This information is then saved to a local SQL Server at the colo facility.
The second aspect is an internal application which employees use to manage those same records (conceptually) and provide status updates, approvals, etc. This application is hosted within the corporate firewall with its own local SQL Server database.
The two networks are connected by a hardware VPN solution, which is decent, but obviously not the speediest thing in the world.
The two databases are similar, and share many of the same tables, but they are not 100% the same. Many of the tables on both sides are very specific to either the internal or external application.
So the question is: when a user updates their information or submits a record on the public website, how do you transfer that data to the internal application's database so it can be managed by the internal staff? And vice versa... how do you push updates made by the staff back out to the website?
It is worth mentioning that the more "real time" these updates occur, the better. Not that it has to be instant, just reasonably quick.
So far, I have thought about using the following types of approaches:
Bi-directional replication
Web service interfaces on both sides with code to sync the changes as they are made (in real time).
Web service interfaces on both sides with code to asynchronously sync the changes (using a queueing mechanism).
Any advice? Has anyone run into this problem before? Did you come up with a solution that worked well for you?
This is a pretty common integration scenario, I believe. Personally, I think an asynchronous messaging solution using a queue is ideal.
You should be able to achieve near real time synchronization without the overhead or complexity of something like replication.
Synchronous web services are not ideal because your code will have to be very sophisticated to handle failure scenarios. What happens when one system is restarted while the other continues to publish changes? Does the sending system get timeouts? What does it do with those? Unless you are prepared to lose data, you'll want some sort of transactional queue (like MSMQ) to receive the change notices and take care of making sure they get to the other system. If either system is down, the changes (passed as messages) will just accumulate and as soon as a connection can be established the re-starting server will process all the queued messages and catch up, making system integrity much, much easier to achieve.
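The "changes accumulate while the other side is down" behaviour can be sketched with a simple outbox table. This is not MSMQ or any real service bus; it is a hypothetical Python and SQLite illustration in which `save_with_message`, `drain_outbox` and the table names are all made up for the example.

```python
import json
import sqlite3

def save_with_message(conn, record_id, status):
    """Write the change and its outgoing message in one local transaction."""
    with conn:  # both inserts commit together, or both roll back
        conn.execute(
            "INSERT OR REPLACE INTO record (id, status) VALUES (?, ?)",
            (record_id, status),
        )
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"id": record_id, "status": status}),),
        )

def drain_outbox(conn, deliver):
    """Deliver queued messages in order; stop at the first failure and keep the rest."""
    sent = 0
    rows = conn.execute("SELECT seq, payload FROM outbox ORDER BY seq").fetchall()
    for seq, payload in rows:
        try:
            deliver(json.loads(payload))
        except ConnectionError:
            break  # other system unreachable: leave remaining messages queued
        with conn:
            conn.execute("DELETE FROM outbox WHERE seq = ?", (seq,))
        sent += 1
    return sent

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE record (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT);
""")

def offline(message):
    raise ConnectionError("remote system is down")

received = []  # stands in for the other system once it is back up

save_with_message(conn, 1, "submitted")
save_with_message(conn, 2, "approved")
sent_while_down = drain_outbox(conn, offline)
sent_after_restart = drain_outbox(conn, received.append)
```

Note that a message is deleted only after delivery succeeds, so a crash between the two steps can resend it; the receiving side should tolerate duplicates. A transactional queue product handles this bookkeeping for you.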
There are some open source tools that can really make this easy for you if you are using .NET (especially if you want to use MSMQ).
nServiceBus by Udi Dahan
Mass Transit by Dru Sellers and Chris Patterson
There are commercial products also, and if you are considering a commercial option see here for a list of options on .NET. Of course, WCF can do async messaging using MSMQ bindings, but a tool like nServiceBus or MassTransit will give you a very simple Send/Receive or Pub/Sub API that will make your requirement a very straightforward job.
If you're using Java, there are any number of open source service bus implementations that will make this kind of bi-directional, asynchronous messaging a snap, like Mule or maybe just ActiveMQ.
You may also want to consider reading Udi Dahan's blog, listening to some of his podcasts. Here are some more good resources to get you started.
I'm mid-way through a similar project except I have multiple sites that need to keep in sync over slow connections (dial-up in some cases).
Firstly you need to track changes, if you can use SQL 2008 (even the Express version is enough if the 2Gb limit isn't a problem) this will ease the pain greatly, just turn on Change Tracking on the database and each table. We're using SQL Server 2008 at the head office with the extended schema and SQL Express 2008 at each site with a sub-set of data and limited schema.
Secondly you need to synchronize your changes; Sync Services does the trick nicely and supports using a WCF gateway into the main database. In this example you will need to use the Sync using SQL Express Client sample as a starting point; note that it's based on SQL 2005, so you'll need to update it to take advantage of the Change Tracking features in 2008. By default Sync Services uses SQL CE on the clients, which I'm sure isn't enough in your case. You'll need a service that runs on your web server and periodically (could be as often as every 10 seconds if you want) runs the Synchronize() method. This will tell your main database about changes made locally and then ask the server for all changes made there. You can set up the get and apply SQL code to call stored procedures, and you can add event handlers to handle conflicts (e.g. Client Update vs Server Update) and resolve them accordingly at each end.
We have a shop as a client, with three stores connected to the same VPN
Two of the shops have a computer running as a "server" for that shop and the third one has the "master database"
To synchronize everything to the master we don't have the best solution, but it works: there is a dedicated PC running an application that checks the timestamp of every record in every table of the two stores, and if it is different from the last time it synchronized, it copies the changes
Note that this works both ways. I.e. if you update a product in the master database, this change will propagate to the other two shops. If you have a new order in one of the shops, it will be transmitted to the "master".
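A bare-bones sketch of this kind of timestamp polling, in Python with SQLite. The `product` table, the integer `modified` column (standing in for a real timestamp) and the `push_changes` helper are assumptions for illustration; conflict handling, deletes and clock differences are deliberately left out.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, modified INTEGER)"
    )
    return conn

def push_changes(source, target, last_sync):
    """Copy rows modified after last_sync from source to target.

    Returns the new high-water mark to remember for the next run.
    """
    rows = source.execute(
        "SELECT id, name, modified FROM product WHERE modified > ? ORDER BY modified",
        (last_sync,),
    ).fetchall()
    with target:
        target.executemany(
            "INSERT OR REPLACE INTO product (id, name, modified) VALUES (?, ?, ?)",
            rows,
        )
    return max((row[2] for row in rows), default=last_sync)

shop, master = make_db(), make_db()
with shop:
    shop.execute("INSERT INTO product VALUES (1, 'Old item', 5)")   # before last sync
    shop.execute("INSERT INTO product VALUES (2, 'New item', 12)")  # after last sync
with master:
    master.execute("INSERT INTO product VALUES (3, 'Head office item', 15)")

# Run once per direction, each with its own remembered high-water mark.
shop_mark = push_changes(shop, master, last_sync=10)
master_mark = push_changes(master, shop, last_sync=10)
```

Running it in both directions gives the two-way propagation described above. A real job would store one high-water mark per direction and per table, and would need a rule for rows changed on both sides between runs.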
With some optimizations you can have all the shops synchronize in around 20 minutes
Recently I have had a lot of success with SQL Server Service Broker which offers reliable, persisted asynchronous messaging out of the box with very little implementation pain.
It is quick to set up and as you learn more you can use some of the more advanced features.
Unknown to most, it is also part of the desktop editions so it can be used as a workstation messaging system
If you have existing T-SQL skills they can be leveraged as all the code to read and write messages is done in SQL
It is blindingly fast
It is a vastly under-hyped part of SQL Server and well worth a look.
I'd say just have a job that copies the data in the pub database input table into a private database pending table. Then once you update the data on the private side have it replicated to the public side. If you don't have any of the replicated data on the public side updated it should be a fairly easy transactional replication solution.

Resources