I'm in charge of fixing part of an application that syncs data entities from a DB2 database running on an iSeries to a SQL Server database on Azure. The application seems to run this synchronization just fine when the iSeries and the SQL Server host are on the same local network. However, add in the change in latency when the database host is in Azure and this process slows to an unacceptable level.
I'm looking for ideas on how to proceed. Some additional information:
Some of our clients might have millions of records in some tables.
An iSeries platform is not virtualizable/containerizable as far as I know.
Internet connection speed will vary wildly. Highly dependent on the client.
The current solution uses an ODBC driver to get data from the iSeries. I've started looking at taking a different approach as far as retrieving the data including creating API's (doesn't seem to be the best idea for transferring lots of data) or using Azure Data Services, but, those are really pieces to a bigger puzzle.
Edit: I'm trying to determine what the bottleneck is and how to fix it. I forgot to add that the IT manager (we only have one person in IT... we are a small company) has setup a test environment using a local iSeries and an instance of our software in Azure. He tried analyzing the network traffic using Wireshark and told me we aren't using tons of bandwidth, but, there seems to be alot of TCP "chatter". His conclusion was we need to consider another way to get the data. I'm not convinced and that is part of the reason I'm here asking this question.
More details:
Our application uses data from an ERP system that runs on the iSeries.
Part of our application is the sync process in question.
The process has two components: a full data sync and a delta/changes data sync.
When these processes are run when both the iSeries and our application are on site, the performance isn't great, but, it is acceptable.
The sync process is written in C# and connects to DB2 using an ODBC driver.
We do some data manipulation and verification as part of the sync process. i.e., making sure parent records exist, foreign keys are appropriate, etc.
I've been using Access to rapid-prototype a DB. Now I'd like to do a small group online test so I split the DB and placed the back-end on Azure SQL Server, then re-linked. It's incredibly slow and I've been researching solutions for days without positive results. My local environment is Win10, Office2016 64bit and internet connection is fast and stable.
I have tried different ODBC drivers, including the SQL Native Client v11.
I've disabled auto-tuning level on the NIC.
I've recreated all queries from access on the server.
I've made sure that Tracing in ODBC is off.
But I enabled tracing temporarily to see what was happening. If I opened the front-end, logged in (Small user table), and did something on the first form (Add 1 record with 3 sub-records...and really...nothing fancy or heavy at all and this only takes 1 minute) then closed the DB, I see that the Tracing log file is 1.5MB.
So I created an empty Access file and an ODBC link to only the User table (12 columns, 20 records), and then monitored the tracing log file again. Opening access doesn't add anything to the log file, but opening this one, linked table made the log file grow to 255kb. Opening this table in access took 5 seconds.
Access sent about 800 requests to the server for opening just this one small table. If I paste all the User table data into a text file, it's only 2kb. SO why is it so slow?
Any ideas on this, and specifically other suggestions to get this working faster?
Kind regards,
Well, the reason why using Azure is slower than running Access connected to a local instance of SQL server is because, well slow is slow!
I mean, if you going to travel 30 miles, you have a choice to walk, or to take a car.
So here is the question you need to know:
Why is walking slower than driving a car?
Answer: Because you are travelling at a slower speed!
So why is using Azure slower the using an instance of SQL server running on your local computer or local network?
Answer:
Because the connection speed to Azure is about 100 times slower!
The idea here that you not going to take into account the DIFFERENCE in connection speed is the issue here. It is a disservice to the reading public that may conclude that such a setup (Access front end on a pc to Azure instance of SQL server) is not a viable setup.
So the first issue here is to make a note of your connection speed to the back end database.
A typical office local area network has a speed of 100mbits, or today most are 1gig – even the el-cheapo routers you purchase at Best Buy are now rated at 1gig (1000 mbits).
However, your typical high speed internet is rated at about 5, or 10 mbits. So that is 100 times slower. (Actually 1000/5 = 200 times slower!!!).
That means if something NOW takes 3 seconds on your office network with Access and SQL server? Well then a WAN (over the internet), then you need to multiple the time by the change in your connection speed (this is so simple – yet it seems to escape all!). So, if you lucky, you might have a 5 mbits speed rating for your internet. That means you go
1000 / 5 = 200
You now take the 200 and multiple the existing delay you have of say 3 seconds and you get 600 seconds (that is 10 minutes if you are wondering!). So you going from 3 seconds to 10 minutes!
This kind of comparison in speed would be like walking into a sports shop to purchase a rubber boat to cross the Atlantic. So not taking into account the change in internet speed and wondering why things are slow is the issue here.
You can most certainly use Access to Azure, but you have to realize two simple concepts.
a connection and test with a connection that is 50-200 times slower than your LAN is a test that going to run 50 to 200 times slower! The failure to mention and take into considering the MASSIVE DIFFERENCE in your speed connection of your LAN compared to a WAN is the simple issue here.
opening a form bound to a large table of data is going to case performance issues.
I was sitting at the bus stop talking to a 90 year old granny lady. I asked her the following:
Have you ever used an instant teller?
She said, why yes, I use them all the time.
I then asked here don’t you think it would be bad to have the teller machine download all the peoples accounts while you wait and THEN ask you for your account number?
The old lady stated, of course, that would be silly. I type in my account pin and the machine ONLY downloads my account information – this is practical and obvious.
In other words that old lady realised that downloading a bunch of data BEFORE you the user even types in or does anything is a waste of bandwidth.
So you never want to launch a form bound to a table and THEN ask the user what record to work on. Why have Access download large numbers of records into a form and THEN ask the user or allow the user to navigate to the required record?
Even when using Google, it does not download the whole internet into your web browser page and you then go ctrl+f to search the contents of that web page.
The same concepts should be applied to Access applications. A design that asks for what to work on and then launches a form bound to a table with a "where" clause will thus fix this issue.
So if you have a form (and even a sub form) that displays a customer invoice, you would FIRST ASK FOR the invoice number, and then simply launch that form using a where clause that restricts the form to the ONE invoice!
Keep in mind that you can STILL use that invoice form bound to a table of 1 million rows and ONLY THE ONE record will be pulled down the network connection *if one used the where clause.
So a typical internet connection has adequate speed to run a browser, and also has MORE than adequate bandwidth speed to pull down a few records. Access often gets a bad rap for poor performance, but that is ONLY DUE to Access developers IGNORING the obvious advice that downloading tons of things that you don’t yet need into a form will run slow.
So web based applications, or even desktop applications written in vb.net perform well with SQL Azure running in the cloud over that MUCH slower internet connection because those applications don’t launch forms bound to large datasets WITHOUT FIRST simply allowing the user to request what they need to see and view.
As for Access and using SharePoint? That setup can be VERY fast, and in fact MUCH faster than SQL Azure, MySQL or any traditional database system because when you using SharePoint tables and Access, then Access automatic syncs a copy of the data local. This setup means your application will continue to run WITHOUT ANY internet connection. The instant the connection is restored, then the data sync can resume.
This means that if you have a table with 15,000 rows and run a report on that data the report can run and launch in an instant with SharePoint back end since a local copy of the data exists in the front end at ALL TIMES! So this setup is VERY well suited an off line mode or in cases that you have a poor and slow internet connections since you as noted always have local copy of the data – only when a record is changed does a sync occur, and that sync can occur independent of Access. So you change one record – and it starts syncing with SharePoint.
However for larger data sets that have to be updated, then SQL server is far better since you can execute a sql update on 10,000 rows and ZERO network traffic and transfer of data need occur to update those 10,000 rows when using SQL server (a pass through query) and when using SharePoint, the 10,000 rows WILL transfer over the network since the local copy requires the rows to be updated. So that massive advantage of using SharePoint for the database backend does not exist for applications that have to update lots of rows or do lots of row update types of data processing.
So the key concepts and take away here:
The high speed internet connection you have is often 10-200 times slower than your typical cheap office (local) network. So that means a 2 second operation will now take 10-200 times longer.
The Access application needs to be optimized to avoid things like loading too many records into a form. So building search forms etc. That FIRST ASK the user what they need to work on is a basic and simple requirement for all good developers and that includes Access developers.
Access and SharePoint can be the BEST option, and such a setup allows the application to run EVEN WHEN there is no internet connection at all. If table sizes are below say 10,000 rows, then this setup can often be ideal. However for applications that have to update lots of rows and for data processing heavy applications this setup is poor since updates to any rows will case data syncing to occur over the network. This setup is also the cheapest, since a single office 365 account with SharePoint support for Access can be had for $6 per month, and that $6 account allows up to 500 free users and those 500 users can even use their Gmail or non-Microsoft account for this setup. And such access applications that do fit within the bounds of SharePoint tables tend to need far less changes and optimizing then using SQL server over the internet.
With SQL server, use of views, pass-tough query and in some cases writing store procedures allows updates and code to run WITHOUT using ANY bandwidth. So one can send a single update query to the server that updates 10,000 rows of data – the only network cost will be the “tiny” amount of bandwidth to send that sql statement.
So while bound forms can be used with SQL Azure running in the cloud, one needs to build software like those do for the web, or vb.net in which they FIRST ask the user what account or customer to work on and THEN launch the UI to display that given data.
So in access, you build a search form say like this:
So at the end of the day, it is important to ignore posts here that suggests Access to SQL in the cloud is not viable. Access with proper designs will work rather well over typical internet connections to SQL server running on Azure.
In fact I seen people use Access to SQL over a 56k modem!
One has to adopt sensible designs in which the data pulled for a given task is restricted – this is a hall mark of all developers – the only issue is Access does NOT enforce this approach while most other developer tools don’t let you hang yourself with things like bound forms to large tables! It not that Access is slow, but Access is slow when you make poor design decisions.
Access to SharePoint can be a real winner – especially for poor bandwidth, spotty bandwidth and even when the connection is lost, the application will continue to run and run faster than 99% of the cases if you were running the same application with a SQL back end. There is a BIG caveat here since only certain types of applications will work well with SharePoint tables. For me to explain the why, how, and when such applications are better is beyond a simple post here, but one simply needs to be aware that SharePoint can be incredible solution, but not for all applications and SQL server can and will be better choice. This SharePoint “better” choice can only be determined on a case by case evaluation of the given type of application in question.
The problem is simply that Azure SQL Database is not very fast running with small DTUs (Database Transaction Units) compared to, say, an in-house instance of SQL Server hosted on even a moderate modern server.
I've checked it out too, and it requires extremely careful design of queries and filtering - far from what you normally can get away with - to obtain acceptable overall speed. On the other hand, this is a giving experience that will bring focus to potential bottlenecks you otherwise wouldn't encounter before it might be too late.
OK, so after almost a week of trying to get this to work (Access front-end to SQL Server back-end on Azure), I've come to the conclusion that it's not a viable solution.
I've tried SQL Server, and setup a Sharepoint 2016 server on Azure, which also failed.
What has worked is using a product from Bullzipp called MS Access to MySQL to convert the access tables, then adding a MySQL DB on the server and importing the file generated by Bullzip. The only thing to note here is that Bullzip doesn't like the newer access formats (it wants an MDB file) so go to Access, create a new, empty file, but make sure you set its file type to MDB. then import your tables across and run Bullzip.
It's now working a hell of a lot faster than the SQL Server, but I am getting some write conflicts in Access, so I just need to go through the code and do whatever I need to so I can avoid those messages.
Using Access as a front to Azure SQL tables is the worst solution. But, sometimes you have to do it. I have a client who is adamant that she wants to keep her Access database. When she hired her very first employee, it became clear she needed to SQL tables behind the screens.
This was a bit of a nightmare. However, after redesigning some terrible table structures, creating views and many procs, I've been able to do it. I use local tables in some cases, and refill by pulling from a stored proc and inserting into the local table. I use linked tables for basic data edits, and do explicit save records almost constantly.
I also have a first-load module that opens all forms, goes to the last record, back to the first record, and then hides the form until needed. The load limps along for about 3
My only remaining issue is now that Azure will close connections after idle time of (I think) 30 or more minutes -- or maybe it's when the laptop sleeps? That kills the app and it has to be closed and re-opened.
I have a Microsoft Access .accdb database on a company server. If someone opens the database over the network, and runs a query, where does the query run? Does it:
run on the server (as it should, and as I thought it did), and only the results are passed over to the client through the slow network connection
or run on the client, which means the full 1.5 GB database is loaded over the network to the client's machine, where the query runs, and produces the result
If it is the latter (which would be truly horrible and baffling), is there a way around this? The weak link is always the network, can I have queries run at the server somehow?
(Reason for asking is the database is unbelievably slow when used over network.)
The query is processed on the client, but that does not mean that the entire 1.5 GB database needs to be pulled over the network before a particular query can be processed. Even a given table will not necessarily be retrieved in its entirety if the query can use indexes to determine the relevant rows in that table.
For more information, see the answers to the related questions:
ODBC access over network to *.mdb
C# program querying an Access database in a network folder takes longer than querying a local copy
It is the latter, the 1.5 GB database is loaded over the network
The "server" in your case is a server only in the sense that it serves the file, it is not a database engine.
You're in a bad spot:
The good thing about access is that it's easy to create forms and reports and things by people who are not developers. The bad is everything else about it. Particularly 2 things:
People wind up using it for small projects that grow and grow and grow, and wind up in your shoes.
It sucks for multiple users, and it really sucks over a network when it gets big
I always convert them to a web-based app with SQL server or something, but I'm a developer. That costs money to do, but that's what happens when you use a tool that does not scale.
I've to create a Access 2003 database and share it among 100 users, users won't be doing any modifications, only viewing several reports that are generated daily (and once) using a scheduled task on the host machine.
Would a simultaneous 100 users break the performances down in that context?
What would you advise me regarding this workflow?
Exclude:
Using a database server (sqlserver,...etc) is out of topic
I've already thought about outputting the reports into static html, but now I want to first evaluate the sharing of the whole database (because filtering capability might be needed)
I'd like to avoid replication
You have used the word "host". Remember, Access is not a true client-server engine: it merely provides access to the data; consumers pull the data down to their local machines, where their local Access runtime or local Access development version executes the query against the downloaded data. Entire "freight trains" of data can come down across the wire to the desktop.
Some years ago we had a large database that the customer wanted in Access (eventually moved it to Oracle). Some queries would eat up 90%-100% of available LAN bandwidth for 15-30 seconds, during which time other write operations to completely different databases on the LAN would time-out, and data corruption would result.
So the main concern of your scenario would be the effects of possibly severe degradation on other applications. It will depend on the size of your database and the nature of your queries behind the reports.
I'd recommend "canning" the reports if you can, so that each running of a report does not invoke the query that instantiates the data behind it.
EDIT: An alternative, if one is necessary, would be to have a web server running on the same machine as the Access "host" executing the queries, and serving the end-result reports out to the consumers' browsers as HTML. This would reduce bandwidth consumption. The LAN becomes "the cloud".
If you give each user their own copy of the front end and linked to the data source then you might get away with 100 users if the network is up to scratch. I have about 100 users mostly read only on an access DB but they are not all using it at the same time
You can automate the front end installation using the excellent autoFE updater
www.autofeupdater.com/
We have an application which needs to interact with 3 different databases
(SQL Server) to fetch the user details and display them on a web page. Is it a good option to use a linked server or should we copy the user details (via some daily job) to the application database?
Using a linked server will give you a round trip delay every time you query the data. If you only query the data once per day or per session this might be acceptable. If however you are issuing many queries to these servers you may find that the performance is so poor that your application is unusable.
You could use SQL replication to push (or pull) the data from each of the servers into a local copy on the application server. This will provide you with much better query performance (no round trip delay) while also ensuring that you have the latest data. There are lots of options with SQL replication you should be able to find something that suits your needs.
For more information on SQL Replication see http://technet.microsoft.com/en-us/library/ms151198.aspx
A linked server is only going to allow your databases to talk to each other. If the application is interacting with three discrete databases, then you simply need discrete connections. I would not recommend heavily using the linked servers unless you are moving a lot of data (since picking it up into the application and putting it into another database may take even longer).