Azure Mobile Service concurrency handling in SQL? - sql-server

I am implementing an Azure Mobile Service and I have a set of objects that can be accessed by multiple users, potentially at the same time. My problem is that I can't really find a lot of information on how to handle the potential concurrency issues that might result from this.
I know "Azure Tables" implements an E-Tag check. Is there an out-of-the-box solution for Azure Mobile Services for this one with SQL ? If not, what is the approach I should be going for here.
Should I just implement the E-Tag check by hand? I could include a GUID on the object that is regenerated every time the object is saved and checked when saving. That should be a relatively safe way to do it, right?

For conflicts between multiple clients, you would need to add a detection/resolution mechanism. You could use a "timestamp" (typically a sequentially incremented version number) in your table schema and check it in your update script. You could fail an update if the timestamp used by a client for reading is older than the current timestamp.
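In SQL terms this is the usual optimistic-concurrency pattern; a rough sketch with hypothetical table and column names (the update script would run something equivalent to this against the SQL database):

    -- the server bumps Version on every successful write
    UPDATE TodoItem
    SET    Text = @Text,
           Version = Version + 1
    WHERE  Id = @Id
      AND  Version = @ClientVersion;  -- the version the client originally read

    IF @@ROWCOUNT = 0
        RAISERROR('Conflict: the item was modified by another client.', 16, 1);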
If you want to use ETags via HTTP headers, you could use Custom APIs. We are looking into enabling CRUD scripts to also set headers, but that is not available today. Separately, we are looking into the offline scenarios as well.
(Prog Manager, Windows Azure Mobile Services)

Related

Azure Mobile Service and Concurrency on database actions?

I recently read up on Azure Tables, and that system has a built-in E-Tag check for handling concurrent actions. I assume that for Azure Mobile Services each of the insert and update methods etc. is atomic; however, I have been hard pressed to find any real information on concurrent data access. Should I want such a thing, is it up to me to implement it, or does Azure Mobile Services provide some kind of concurrency handling system?
The use case I am looking at is the most basic one:
User 1 gets object A
User 2 gets object A
User 2 saves object A
User 1 saves object A -> This should result in a fault
Is it up to me to implement this? And how should I go about it? My first instinct would be to manually add an E-tag field for the object that is checked by a server-side script. Is there a better approach?
My best guess is that because WAMS uses SQL tables, it relies on optimistic locking, so I think the E-Tag is the way to go.
The following articles should shed some light on SQL for Azure:
Windows Azure Storage and Concurrent Access
Best Practices for the Design of Large-Scale Services on Windows Azure Cloud Services
How to get most out of Windows Azure Tables

Listening for List updates using the SharePoint Client Object Model

I am looking for a decently efficient way to listen for List changes on a SharePoint site using only the Client Object Model. I understand how backwards this idea is, but I am trying to keep from having to push any libraries to the SharePoint servers on install. Everything is supposed to be drop and go on a local machine.
I've thought about a class that just loops on a timer and keeps querying the ClientContext from the date of the last successful query onward, but that seems horribly inefficient.
I know this is a client object model, but is there any way to get notifications from the server on changes from the client only?
I am afraid this is not possible using the client object model. If you would need to poll so often that the user experience suffers too much from the slow performance, you would need to catch the list changes on the server side: deploy a solution with a feature that registers an SPItemEventReceiver on your list.
I understand your reluctance to push server-side code to the SP farm; without it, you save yourself discussions and explanations with the customer's administrators. However, some tasks are more efficient, or even only feasible, when run on the server. You can consider Sandbox Solutions for such functionality. They are deployed not to SP by the farm administrator but to a site collection by a site collection administrator, through a friendly web UI. This requires fewer privileges, means more relaxed company policies to comply with, and may be better accepted by your customers. You can develop, test and even use your solution in your own site collection only, without affecting the entire farm. Microsoft even recommends that farm-wide solutions be designed with as much functionality as possible in sandboxed solutions, putting only the necessary minimum into a farm solution.
If deploying the entire application as a sandboxed solution is not possible, you could combine a sandboxed solution that gathers the changes with an external web site requesting the gathered data from the site collection, or in your case with a client-only application like the one you describe. (Sandboxed solutions have one big limitation: you cannot make a web request from inside the site collection to the outside; you can only access the site collection from outside.)
--- Ferda

Is there a way to prevent users from doing bulk entries in a PostgreSQL database

I have 4 new data entry users who are using a particular GUI to create/update/delete entries in our main database. The "GUI" client allows them to see database records on a map and make modifications there, which is fine and the preferred way of doing it.
But lately a lot of them have been accessing the database directly using pgAdmin and running bulk queries (i.e. update, insert, delete, etc.), which introduces a lot of problems, like people updating a lot of records without realizing it or making mistakes while setting values. It also affects our logging procedures, as we calculate averages and timestamps for reporting purposes, which are quite crucial to us.
So is there a way to prevent users from using pgAdmin (please remember a lot of these users work from home and we do not have access to their machines) and running SQL queries directly against the database?
We still have to give them access to certain tables and allow them to execute SQL as long as it comes through a certain client, but deny access to the same user when he/she tries to run a query directly against the database.
The only sane way to control access to your database is to convert your database access to a 3-tier structure. You should build a middleware layer (maybe a REST API or something similar) and use this API from your app. The database should be hidden behind this middleware, so no direct access is possible. From the database's point of view, there is no way to tell whether a connection comes from your app or from some other tool (pgAdmin, plain psql or some custom-built client). Your database should be accessible only from trusted hosts, and clients should not have access to those hosts.
This is only possible if you use a trick (which might get exploited too, but maybe your users are not smart enough).
In your client app, set some harmless parameter like geqo_pool_size = 1001 (if it is normally 1000).
Now write a trigger that checks this parameter and raises an error like "No access through PGAdmin" if it is not set the way your app sets it (and the username is not your admin username).
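A rough PL/pgSQL sketch of that idea (the table name map_entries and the role name db_admin are made up; the official client would issue SET geqo_pool_size = 1001 right after connecting):

    CREATE OR REPLACE FUNCTION block_unofficial_clients() RETURNS trigger AS $$
    BEGIN
        -- current_setting() returns the parameter value as text
        IF current_setting('geqo_pool_size') <> '1001'
           AND current_user <> 'db_admin' THEN
            RAISE EXCEPTION 'No access through PGAdmin';
        END IF;
        IF TG_OP = 'DELETE' THEN
            RETURN OLD;
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER enforce_official_client
        BEFORE INSERT OR UPDATE OR DELETE ON map_entries
        FOR EACH ROW EXECUTE PROCEDURE block_unofficial_clients();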
Alternative: create a temporary table and check for its existence.
I believe you should block direct access to the database and set up an application to which your clients (human and software ones) will be able to connect.
Let this application filter and pass only allowed commands.
Great care should be taken with the filtering - I would think carefully about whether raw SQL should be allowed at all. Personally, I would design a simplified API, which would make sure that a hypothetical client-attacker (in God we trust, all others we monitor) could not find a way to sneak in some dangerous modification.
I suppose that from a security standpoint your current approach is very unsafe.
You should study advanced pg_hba.conf settings.
This file is the key point for user authorization. The basic settings only cover simple authentication methods like passwords and lists of IP addresses, but there are more advanced options:
GSSAPI
Kerberos
SSPI
RADIUS server
any PAM method
So your official client can use a more advanced method, like something involving a third-tier API or some really complex authentication mechanism. Then, without using the application, it at least becomes difficult to redo these tasks - if the Kerberos key is encrypted inside your client, for example.
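For illustration only (addresses, database and role names are placeholders), pg_hba.conf entries along these lines would force the data-entry role through GSSAPI/Kerberos and reject plain logins from anywhere else:

    # TYPE  DATABASE  USER        ADDRESS          METHOD
    host    maindb    entry_user  192.168.1.0/24   gss
    host    maindb    entry_user  0.0.0.0/0        reject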
What you want to do is REVOKE your users' write access, then create a new role with write access, and as this role CREATE FUNCTION defined as SECURITY DEFINER, which updates the table only in the ways you allow, with integrity checks; then GRANT EXECUTE on this function to your users.
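A minimal sketch of that approach (table, function and role names are made up; the function must be created by the role that owns the write access):

    -- users keep read access but lose direct writes
    REVOKE INSERT, UPDATE, DELETE ON map_entries FROM entry_users;

    -- created by the writer role, so SECURITY DEFINER runs it with that role's rights
    CREATE FUNCTION update_entry(p_id integer, p_value text) RETURNS void AS $$
    BEGIN
        -- integrity checks go here (e.g. refuse suspicious bulk-style changes)
        UPDATE map_entries
        SET    value = p_value,
               modified_at = now()
        WHERE  id = p_id;
    END;
    $$ LANGUAGE plpgsql SECURITY DEFINER;

    GRANT EXECUTE ON FUNCTION update_entry(integer, text) TO entry_users;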
There is an answer on this topic on ServerFault which references a blog entry with a detailed description.
I believe that using middleware, as other answers suggest, is unnecessary overkill in your situation. The above solution does not require the users to change the way they access the database; it just restricts them to modifying the data only through the predefined server-side methods.

Setting up a Reporting Server to liberate resources from a webserver

Yay, first post on SO! (Good work Jeff et al.)
We're trying to solve a bottleneck in one of our web-applications that was introduced when we started allowing users to generate reports on-demand.
Our infrastructure is as follows:
1 server acting as a Webserver/DBServer (ColdFusion 7 and MSSQL 2005)
It's serving a web-application for our backend users and a frontend website. The reports are generated by the users from the backend so there's a level of security where the users have to log in (web based).
During peak hours, report generation brings the web application and frontend website down to unacceptable speeds, due to SQL Server using resources for the huge queries and ColdFusion afterwards generating multi-page PDFs.
We're not exactly sure what the best practice would be to remove some load, but restricting access to the reports isn't an option at the moment.
We've considered denormalizing data to other tables to simplify the most common queries, but that seems like it would just push the issue further.
So, we're thinking of getting a second server and using it as a "report server" with a replicated copy of our DB on which the queries would be run. This would fix one issue, but the second remains: generating PDFs is resource intensive.
We would like to offload that task to the reporting server as well, but since this is a secured web application we can't just fire an HTTP GET to create the PDFs: the user is logged in to the web application on server 1 and the PDF is displayed there, but it would be generated/fetched on server 2 without validating the user's credentials...
Anyone have experience with this? Thanks in advance Stack Overflow!!
"We would like to offload that task to the reporting server as well, but being in a secured web-application we can't just fire HTTP GET to create PDFs with the user logged in the web-application from server 1 and displaying it in the web-application but generating/fetching it on server 2 without validating the user's credential..."
Why can't you? You're using the world's easiest language for writing webservices. Here are my suggestions.
First, move the database to its own server, so that CF and SQL Server are on separate servers. The first reason to do this is performance; as already mentioned, having both CF and SQL on the same server isn't an ideal setup. The second reason is security: if someone is able to hack your webserver, they're right there to get your data. You should have a firewall in place between your CF and SQL servers to give you more security. The last reason is scalability: if you ever need to throw more resources at your database or cluster it, that's easier when it's on its own server.
Now for the webservices. What you can do is install CF on another server and write webservices to handle the generation of reports. Just lock down the new CF server to accept only SSL connections and pass the login credentials of the users to the webservice. Inside your webservice, authenticate the user before invoking the methods that generate the report.
Now for the PDFs themselves. One of the methods I've used in the past is generating a hash based on some of the parameters passed (user credentials and the generated SQL for the query); then, once the PDF is generated, you assign the hash to the name of the PDF and save it on disk. Now you have a simple caching system where you can look to see if the PDF already exists: if so, return it; otherwise generate it and cache it.
In closing, your problem is not something that others haven't seen before. You just need to do a little work and your application will be magnitudes faster.
The most basic best practice is to not have the web server and db server on the same hardware. I'd start with that.
You have to separate generating the PDF from doing the calculations; they are two separate steps.
What you can do is
1) Create a pre-calculated report table, with a daily job that populates it with all the calculated values for all your reports (sketched below).
2) When someone requests a PDF report, have the report run a simple select query against the pre-calculated values. That is much less database effort than calculating on the fly. You can use ColdFusion to generate the PDF if it needs the fancy PDF settings. Otherwise you may be able to get away with writing the raw PDF format (it's similar to HTML markup) as text, or use another library (cfx_pdf, a suitable Java library, etc.) to generate them.
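A rough T-SQL sketch of the pre-calculated table idea (table and column names are hypothetical):

    -- pre-calculated report table, rebuilt once a day by a scheduled job
    CREATE TABLE report_summary (
        customer_id  int   NOT NULL PRIMARY KEY,
        total_sales  money NOT NULL,
        order_count  int   NOT NULL
    );

    -- nightly job: do the expensive aggregation once
    TRUNCATE TABLE report_summary;
    INSERT INTO report_summary (customer_id, total_sales, order_count)
    SELECT customer_id, SUM(amount), COUNT(*)
    FROM orders
    GROUP BY customer_id;

    -- the on-demand PDF request then only needs a cheap lookup
    SELECT total_sales, order_count
    FROM report_summary
    WHERE customer_id = @customer_id;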
If the users don't need to download the report and only need to view/print it, could you get away with FlashPaper?
Another alternative is to build a report queue. Whether you put it on the second server or not, if you can get away with it, you could put report requests into a queue and email the reports to the users as they get processed.
You can then control the queue through a scheduled process that runs as regularly as you like and only creates a few reports at a time (a possible table layout is sketched below). I'm not sure if it's a suitable approach for your situation.
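A possible shape for such a queue table, with made-up names:

    CREATE TABLE report_queue (
        id            int IDENTITY PRIMARY KEY,
        requested_by  varchar(100) NOT NULL,
        report_type   varchar(50)  NOT NULL,
        requested_at  datetime     NOT NULL DEFAULT GETDATE(),
        processed_at  datetime     NULL
    );

    -- the scheduled task picks up only a handful of pending requests per run
    SELECT TOP 5 id, requested_by, report_type
    FROM report_queue
    WHERE processed_at IS NULL
    ORDER BY requested_at;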
As mentioned above, using stored procedures may also help, and make sure you have your indexes set correctly. I once had a 3-minute query that I brought down to 15 seconds because I had forgotten to declare additional indexes on the tables that were being heavily used.
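For example (hypothetical table and columns), indexing the columns the report queries filter and join on is often all it takes:

    CREATE INDEX IX_Orders_Customer_Date ON Orders (CustomerID, OrderDate);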
Let us know how it goes!
In addition to the advice to separate the web & DB servers, I'd try to:
a) move queries into stored procedures, if you're not using them yet;
b) generate reports with a scheduler and keep them cached in special tables in a ready-to-use state, so customers only select them with a few fast queries -- this should also decrease report-building time for customers (a rough sketch follows).
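A possible shape for (a) and (b) combined, with made-up names; the scheduler fills report_cache, and customers only ever call the cheap procedure:

    CREATE PROCEDURE GetSalesReport
        @region_id int
    AS
    BEGIN
        SELECT region_id, total_sales, order_count, generated_at
        FROM report_cache
        WHERE region_id = @region_id;
    END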
Hope this helps.

Web services and database concurrency

I'm building a .NET client application (C#, WinForms) that uses a web service for interaction with the database. The client will be run from remote locations using a WAN or VPN, hence the idea of using a web service rather than direct database access.
The issue I'm grappling with right now is how to handle database concurrency. That is, if two people from different locations update the same data, how do I deal with it? I'm considering using timestamps on each database record and having that as part of the update where clauses, but that means that the timestamps have to move back and forth through the web service interface, which seems kind of ugly.
What is the best way to approach this?
I don't think you want your web service to talk directly to the database. You probably want your service to interact with some type of business components, which in turn interact with a data access layer. Any concurrency exceptions can be passed from the DAL up to the business layer where they can be handled, so that the web service never has to see the timestamps.
But if you are passing something like a data table up to the client and you want to avoid timestamps, you can do concurrency checking by comparing field by field. The Table Adapter wizards generate this type of concurrency checking by default if you ask for optimistic concurrency checking.
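In SQL terms the field-by-field check boils down to something like this (hypothetical table and parameters):

    -- the update only succeeds if every column still holds the value the client read
    UPDATE Customers
    SET    Name  = @Name,
           Phone = @Phone
    WHERE  CustomerID = @CustomerID
      AND  Name  = @OriginalName
      AND  Phone = @OriginalPhone;

    IF @@ROWCOUNT = 0
        RAISERROR('The record was changed by another user.', 16, 1);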
If your collisions occur infrequently enough that they can be resolved manually, a simple solution is to add an update trigger that copies a row's pre-update values to an audit table. This way the most recent write is the "winner", but no data is ever lost to an overwrite, and an administrator can restore an earlier row state or even combine them.
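A rough T-SQL sketch of such a trigger (table names are made up):

    CREATE TRIGGER trg_Orders_AuditUpdate ON Orders
    AFTER UPDATE
    AS
    BEGIN
        -- "deleted" holds the row values as they were before the update
        INSERT INTO Orders_Audit (OrderID, Amount, Status, ChangedAt)
        SELECT OrderID, Amount, Status, GETDATE()
        FROM deleted;
    END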
This technique has its downsides, and is not a very good solution where frequent overwrites are common.
Also, this is slightly off-topic, but using web services isn't necessarily the way to go just because the clients will be remoting into the network. ASP.NET web services are XML-based and very verbose. If your client application can count on being always connected, you'd be better off not using web services.

Resources