How can I simulate network latency on my developer machine?

I am upsizing an MS Access 2003 app to a SQL Server backend. On my dev machine, SQL Server is local, so the performance is quite good. I want to test the performance with a remote SQL Server so I can account for the effects of network latency when I am redesigning the app. I am expecting that some of the queries that seem fast now will run quite slowly once deployed to production.
How can I slow down (or simulate the speed of a remote) SQL Server without using a virtual machine, or relocating SQL to another computer? Is there some kind of proxy or Windows utility that would do this for me?

I have not used it myself, but here's another SO question:
Network tools that simulate slow network connection
In one of the comments SQL Server has been mentioned explicitly.

You may be operating under a misconception. MS-Access supports so-called "heterogeneous joins" (i.e. tables from a variety of back-ends may be included in the same query, e.g. combining data from Oracle and SQLServer and Access and an Excel spreadsheet). To support this feature, Access applies the WHERE clause filter at the client except in situations where there's a "pass-through" query against an intelligent back-end. In SQL Server, the filtering occurs in the engine running on the server, so SQL Server typically sends much smaller datasets to the client.
The answer to your question also depends on what you mean by "remote". If you pit Access and SQL Server against each other on the same network, SQL Server running on the server will consume only a small fraction of the bandwidth that Access does, if the Access MDB file resides on a file server. (Of course, if the MDB resides on the local PC, no network bandwidth is consumed.) If you're comparing Access on a LAN versus SQL Server over broadband via the cloud, then you're comparing a nominal 100 Mbit/s pipe against DSL or cable bandwidth, i.e. against perhaps 20 Mbit/s nominal for high-speed cable: a fifth of the bandwidth at best, probably much less.
So you have to be more specific about what you're trying to compare.
Are you comparing Access clients on the local PC consuming an Access MDB residing on the file server against some other kind of client consuming data from a SQL Server residing on another server on the same network? Are you going to continue to use Access as the client? Will your queries be pass-through?

There is a software application for Windows that does that (it simulates low bandwidth, latency, and losses if necessary). It is not free, though. The trial version has a 30-second emulation limit. Here is the home page of that product: http://softperfect.com/products/connectionemulator/
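If a rough approximation is enough, a small TCP relay that sleeps before forwarding each chunk can add latency without any extra tooling. The following is only a sketch under assumptions, not a replacement for a real network emulator: the function names are made up for illustration, the delay is applied per chunk in each direction (so large transfers are also throttled), and it does not model bandwidth limits or packet loss.

```python
import socket
import threading
import time


def pump(src, dst, delay_s):
    """Copy bytes from src to dst, sleeping delay_s before forwarding each chunk."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            time.sleep(delay_s)  # artificial one-way delay for this chunk
            dst.sendall(data)
    except OSError:
        pass  # the peer closed or reset the connection
    finally:
        # Shut down both sides so the pump running in the other direction stops too.
        for s in (src, dst):
            try:
                s.shutdown(socket.SHUT_RDWR)
            except OSError:
                pass
            s.close()


def serve(listener, target_host, target_port, delay_s):
    """Accept clients and relay each one to the target with added delay."""
    while True:
        try:
            client, _ = listener.accept()
        except OSError:
            return  # the listener was closed
        try:
            upstream = socket.create_connection((target_host, target_port))
        except OSError:
            client.close()
            continue
        for a, b in ((client, upstream), (upstream, client)):
            threading.Thread(target=pump, args=(a, b, delay_s), daemon=True).start()


def start_delay_proxy(listen_port, target_host, target_port, delay_s):
    """Start the relay on 127.0.0.1:listen_port in a background thread.

    Returns the listening socket; close it to stop accepting new clients.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", listen_port))
    listener.listen()
    threading.Thread(
        target=serve,
        args=(listener, target_host, target_port, delay_s),
        daemon=True,
    ).start()
    return listener
```

Assuming the local SQL Server listens on TCP port 1433 (the default, but not guaranteed for named instances), calling start_delay_proxy(14330, "127.0.0.1", 1433, 0.05) and pointing the client at 127.0.0.1,14330 would add roughly 100 ms to every request/response round trip. The calling script has to stay alive for as long as the relay is needed, because the worker threads are daemon threads.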

@RedFilter: You should indicate which version of Access you are using. This document from 2006 shows that the story of what Access brings down to the client across the wire is more complicated than whether the query contains "Access-specific keywords".
http://msdn.microsoft.com/en-us/library/bb188204(SQL.90).aspx
But Access may be getting more and more sophisticated about using server resources with each newer version.
I'll stand by my simple advice: if you want to minimize bandwidth consumption, while still using Access as the GUI, pass-through queries do best, because then it is you, not Access, who will control the amount of data that comes down the wire.
I still think your initial question/approach is misguided: if your Access MDB file was located on the LAN in the first place (was it?) you don't need to simulate the effects of network latency. You need to sniff the SQL statements Access generates, rather than introducing some arbitrary and constant "network latency" factor. To compare an Access GUI using an MDB located on a LAN server against an upsized Access GUI going against a SQL Server back-end, you need to assess what data Access brings down across the wire to the client from the back-end server. Even "upsized" Access can be a hog at the bandwidth trough unless you use pass-through queries. But a properly written client for a SQL-Server back-end will always be far more parsimonious with network bandwidth than Access going against an MDB located on a LAN server, ceteris paribus.
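To make the "what comes down the wire" point concrete, here is a toy comparison of filtering on the client versus filtering on the server. It is a sketch only: Python's built-in sqlite3 stands in for the back end, the orders table and its columns are hypothetical, and it says nothing about what a particular Access version actually does for a given query.

```python
import sqlite3


def make_orders(n):
    """Build an in-memory table where every 100th row belongs to region 'EU'."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT)")
    db.executemany(
        "INSERT INTO orders (id, region) VALUES (?, ?)",
        [(i, "EU" if i % 100 == 0 else "US") for i in range(n)],
    )
    return db


def client_side_filter(db, region):
    """Fetch every row, then filter locally. Returns (matches, rows_transferred)."""
    rows = db.execute("SELECT id, region FROM orders").fetchall()
    matches = [r for r in rows if r[1] == region]
    return matches, len(rows)


def server_side_filter(db, region):
    """Let the engine apply the WHERE clause. Returns (matches, rows_transferred)."""
    rows = db.execute(
        "SELECT id, region FROM orders WHERE region = ?", (region,)
    ).fetchall()
    return rows, len(rows)
```

With make_orders(1000), both functions return the same 10 matching rows, but the client-side version pulls all 1000 rows to get them while the server-side version pulls only the 10. A pass-through query puts you in the second situation by construction.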

Related

Microsoft Access database - queries run on server or client?

I have a Microsoft Access .accdb database on a company server. If someone opens the database over the network, and runs a query, where does the query run? Does it:
run on the server (as it should, and as I thought it did), and only the results are passed over to the client through the slow network connection
or run on the client, which means the full 1.5 GB database is loaded over the network to the client's machine, where the query runs, and produces the result
If it is the latter (which would be truly horrible and baffling), is there a way around this? The weak link is always the network, can I have queries run at the server somehow?
(Reason for asking is the database is unbelievably slow when used over network.)
The query is processed on the client, but that does not mean that the entire 1.5 GB database needs to be pulled over the network before a particular query can be processed. Even a given table will not necessarily be retrieved in its entirety if the query can use indexes to determine the relevant rows in that table.
For more information, see the answers to the related questions:
ODBC access over network to *.mdb
C# program querying an Access database in a network folder takes longer than querying a local copy
It is the latter: the 1.5 GB database is loaded over the network.
The "server" in your case is a server only in the sense that it serves the file, it is not a database engine.
You're in a bad spot:
The good thing about Access is that it's easy for people who are not developers to create forms, reports, and the like. The bad is everything else about it. Particularly two things:
People wind up using it for small projects that grow and grow and grow, and wind up in your shoes.
It sucks for multiple users, and it really sucks over a network when it gets big.
I always convert them to a web-based app with SQL Server or something, but I'm a developer. That costs money to do, but that's what happens when you use a tool that does not scale.

SQL Server Table > MS Access Local Copy?

I'm looking for a little advice.
I have some SQL Server tables I need to move to local Access databases for some local production tasks - once per "job" setup, w/400 jobs this qtr, across a dozen users...
A little background:
I am currently using a DSN-less approach to avoid distribution issues
I can create temporary LINKS to the remote tables and run "make table" queries to populate the local tables, then drop the remote tables. Works as expected.
Performance here in US is decent - 10-15 seconds for ~40K records. Our India teams are seeing >5-10 minutes for the same datasets. Their internet connection is decent, not great and a variable I cannot control.
I am wondering if MS Access is adding some overhead here that can be avoided by a more direct approach, i.e., letting the server do all/most of the heavy lifting instead of Access?
I've tinkered with various combinations, with no clear improvement or success:
Parameterized stored procedures from Access
SQL Passthru queries from Access
ADO vs DAO
Any suggestions, or an overall approach to suggest? How about moving data as XML?
Note: I have Access 2007, 2010, and 2013 users.
Thanks!
It's not entirely clear but if the MSAccess database performing the dump is local and the SQL Server database is remote, across the internet, you are bound to bump into the physical limitations of the connection.
ODBC drivers are not meant to be used for data access beyond a LAN, there is too much latency.
When Access queries data, it doesn't open a stream; it fetches a block, waits for the data to be downloaded, then requests another batch. This is OK on a LAN but quickly degrades over long distances, especially when you consider that communication between the US and India probably has around 200 ms of latency. You can't do much about that, and it adds up very quickly if the communication protocol is chatty. All this comes on top of the connection's bandwidth, which is very likely way below what you would get on a LAN.
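A back-of-the-envelope model shows why a chatty fetch pattern hurts so much over a long link. This is a rough sketch: it assumes each batch costs one full round trip and that batches never overlap, and rows_per_fetch is a hypothetical figure, since the real batch size depends on the driver and on Access.

```python
def estimated_transfer_seconds(total_rows, rows_per_fetch, round_trip_s,
                               total_bytes, bandwidth_bytes_per_s):
    """Estimate transfer time as (round trips x latency) + (payload / bandwidth)."""
    fetches = -(-total_rows // rows_per_fetch)  # ceiling division
    latency_cost = fetches * round_trip_s
    bandwidth_cost = total_bytes / bandwidth_bytes_per_s
    return latency_cost + bandwidth_cost
```

For the ~40K records mentioned in the question, an assumed 100 rows per fetch, and 200 ms per round trip, that is 400 round trips, or about 80 seconds spent waiting on latency alone before bandwidth is even counted. On a LAN with a round trip of about 1 ms the same 400 fetches would cost well under a second.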
The better solution would be to perform the dump locally and then transmit the resulting Access file after it has been compacted and maybe zipped (using 7z for instance for better compression). This would most likely result in very small files that would be easy to move around in a few seconds.
The process could easily be automated. The easiest is maybe to perform this dump automatically every day and make it available on an FTP server or an internal website, ready for download.
You can also make it available on demand, maybe through an app running on a server and made available through RemoteApp using RDP services on a Windows 2008 server, or simply through a website or a shell.
You could also have a simple Windows service on your SQL Server that listens to requests from a remote client installed on the local machines everywhere. It would process the dump and send it to the client, which would then unpack it and replace the previously downloaded database.
Plenty of solutions for this, even though they would probably require some amount of work to automate reliably.
One final note: if you automate the data dump from SQL Server to Access, avoid using Access in an automated way. It's hard to debug and quite easy to break. Use an export tool instead that doesn't rely on having Access installed.
Renaud and all, thanks for taking time to provide your responses. As you note, performance across the internet is the bottleneck. The fetching of blocks (vs a contiguous DL) of data is exactly what I was hoping to avoid via an alternate approach.
Our workflow is evolving to better leverage both sides of the clock, where User1 in the US completes their day's efforts in the local DB and then sends JUST their updates back to the server (based on timestamps). User2 in India, who also has a local copy of the same DB, grabs just the updated records off the server at the start of his day. So, pretty efficient for day-to-day stuff.
The primary issue is the initial DL of the local DB tables from the server (huge multi-year DB) for the current "job". It should happen just once at the start of the effort (~1 wk long process). This is the piece that takes 5-10 minutes for India to accomplish.
We currently do move the DB back and forth via FTP - DAILY. It is used as a SINGLE shared DB and is a bit LARGE due to temp tables. I was hoping my new timestamped-based push-pull of just the changes daily would have been an overall plus. Seems to be, but the initial DL hurdle remains.
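The timestamp-based pull of just the changed records can be sketched roughly as follows. Everything here is illustrative: sqlite3 stands in for both SQL Server and the local Access database, and the records table, its columns, and the function names are hypothetical.

```python
import sqlite3

SCHEMA = "CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT, modified_at TEXT)"


def open_db():
    """Create an in-memory database with the hypothetical records table."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    return db


def pull_changes(source, target, last_sync):
    """Copy rows modified after last_sync from source to target.

    Timestamps are ISO-8601 strings, so string comparison matches time order.
    Returns the new watermark to store for the next pull.
    """
    rows = source.execute(
        "SELECT id, payload, modified_at FROM records "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_sync,),
    ).fetchall()
    # Upsert by primary key so changed rows overwrite the stale local copies.
    target.executemany(
        "INSERT OR REPLACE INTO records (id, payload, modified_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    return rows[-1][2] if rows else last_sync
```

Each side keeps the returned watermark and passes it to the next call, so only rows changed since the previous sync cross the wire. One caveat with the strict "greater than" comparison: a row written later with exactly the same timestamp as the watermark would be skipped, so a real implementation needs a tie-breaker or a small overlap window. The initial download is the case where last_sync predates everything, which is why it stays expensive.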

What mechanism to be used for asynchronous communication between two SQL Servers in the case?

We use a central SQL Server (2008 Standard edition) and several smaller, dedicated SQL Servers (Express editions). We need to implement some mechanism for transferring data asynchronously from the dedicated decentralized SQL Servers (bigger volume, see below) and back from the central SQL Server (few records, basically some notifications for the machines and possibly some optimization hints).
The dedicated SQL Servers are physically located near technology machines, and they collect, say, (datetime, temperature) rows at regular intervals (think of an interval of a few seconds). There are about 500 records for one job, but the next job follows immediately (the machine does not know it is a new job -- being quite stupid in that sense -- and simply collects the temperatures on and on).
The technology machines must be able to work without the central SQL Server, and the central SQL Server must also work when a machine is not accessible (i.e. its dedicated SQL engine cannot be reached because it is switched off with the machine). In other words, the solution need not be super fast, but it must be robust in the sense that no collected data is lost.
The basic idea is to move the collected data from the dedicated SQL Server (preprocessed to the normalized format with ID of the machine) to the well known table on the central SQL Server. Only the newer data should be sent to minimize the amount of the data. That transfer should be started by the dedicated SQL Server in regular intervals (say one hour) if the connection is OK. If the connection is not OK, the data will be sent after next hour, etc.
Another well-known table on the central SQL Server will be used to send notifications to the dedicated SQL Server engines. This way the dedicated engine can be told, for example, what data was already processed/archived on the central SQL Server (i.e. a hint about which records may already be deleted from the local database on the dedicated machine), or any other information hinted from the central server (just hints, nothing with real-time requirements). The hints will be collected by the dedicated SQL Server (i.e. this is also the machine's responsibility). In other words, the central SQL Server only processes its own well-known, local tables. It does not try to connect to the dedicated SQL Server machines.
The solution should use only the standard mechanisms -- SQL commands (via stored procedures), no external software. What kind of solution should I focus on?
Thanks,
Petr
[Edited later] The SQL Servers are on the same Local Area Network.
If you are willing to make a mental switch and stop thinking in terms of tables and rows and instead think in terms of data and messages, then Service Broker can handle all the communication, delivery and message processing. Instead of locally (on the Express machines) doing INSERT INTO LocalTable(datetime, temperature) VALUES (...) you think in terms of:
BEGIN CONVERSATION WITH CentralServer ...;
SEND ON conversation MESSAGE TYPE [Measurement] (<datetime...><temperature ...>)
See Using Service Broker instead of Replication or High Volume Contiguous Real Time ETL
Sounds like a job for merge replication.

Why is it not advisable to have the database and web server on the same machine?

Listening to Scott Hanselman's interview with the Stack Overflow team (part 1 and 2), he was adamant that the SQL server and application server should be on separate machines. Is this just to make sure that if one server is compromised, both systems aren't accessible? Do the security concerns outweigh the complexity of two servers (extra cost, dedicated network connection between the two, more maintenance, etc.), especially for a small application, where neither piece is using too much CPU or memory? Even with two servers, with one server compromised, an attacker could still do serious damage, either by deleting the database, or messing with the application code.
Why would this be such a big deal if performance isn't an issue?
Security. Your web server lives in a DMZ, accessible to the public internet and taking untrusted input from anonymous users. If your web server gets compromised, and you've followed least privilege rules in connecting to your DB, the maximum exposure is what your app can do through the database API. If you have a business tier in between, you have one more step between your attacker and your data. If, on the other hand, your database is on the same server, the attacker now has root access to your data and server.
Scalability. Keeping your web server stateless allows you to scale your web servers horizontally pretty much effortlessly. It is very difficult to horizontally scale a database server.
Performance. 2 boxes = 2 times the CPU, 2 times the RAM, and 2 times the spindles for disk access.
All that being said, I can certainly see reasonable cases where none of those points really matter.
It doesn't really matter (you can quite happily run your site with web/database on the same machine); separating them is just the easiest step in scaling.
It's exactly what StackOverflow did - starting with single machine running IIS/SQL Server, then when it started getting heavily loaded, a second server was bought and the SQL server was moved onto that.
If performance is not an issue, do not waste money buying/maintaining two servers.
On the other hand, referring to a different blogging Scott (Watermasyck, of Telligent): they found that most users could speed up their websites (using Telligent's Community Server) by putting the database on the same machine as the web site. However, in their customers' case, the DB and web server are usually the only applications on that machine, and the website isn't straining the machine that much. In that situation, the efficiency of not having to send data across the network more than made up for the increased strain.
Tom is correct on this. Some other reasons are that it isn't cost effective and that there are additional security risks.
Webservers have different hardware requirements than database servers. Database servers fare better with a lot of memory and a really fast disk array, while web servers only require enough memory to cache files and frequent DB requests (depending on your setup). Regarding cost effectiveness, the two servers won't necessarily be less expensive; however, the performance/cost ratio should be higher since you don't have two different applications competing for resources. For this reason, you're probably going to have to spend a lot more for one server that caters to both and offers performance equivalent to two specialized ones.
The security concern is that if the single machine is compromised, both webserver and database are vulnerable. With two servers, you have some breathing room as the 2nd server will still be secure (for a while at least).
Also, there are some scalability benefits since you may only have to maintain a few database servers that are used by a bunch of different web applications. This way you have less work to do applying upgrades or patches and doing performance tuning. I believe that there are server management tools for making these tasks easier though (in the single machine case).
I would think the big factor would be performance. Both the web server/app code and SQL Server would cache commonly requested data in memory and you're killing your cache performance by running them in the same memory space.
Security is a major concern. Ideally your database server should be sitting behind a firewall with only the ports required to perform data access opened. Your web application should be connecting to the database server with a SQL account that has just enough rights for the application to function and no more. For example you should remove rights that permit dropping of objects and most certainly you shouldn't be connecting using accounts such as 'sa'.
In the event that you lose the web server to a hijack (i.e. a full blown privilege escalation to administrator rights), the worst case scenario is that your application's database may be compromised but not the whole database server (as would be the case if the database server and web server were the same machine). If you've encrypted your database connection strings and the hacker isn't savvy enough to decrypt them then all you've lost is the web server.
One factor that hasn't been mentioned yet is load balancing. If you start off thinking of the web server and the database as separate machines, you optimize for fewer network round trips and also it gets easier to add a second web server or a second database engine as needs increase.
I agree with Daniel Earwicker - the security question is pretty much flawed.
If you have a single box setup with a webserver and only the database for that webserver on it, if that webserver is compromised you lose both the webserver and only the database for that specific application.
This is exactly the same as what happens if you lose the webserver on a 2-server setup. You lose the web server, and just the database for that specific application.
The argument that 'the rest of the DB server's integrity is maintained' where you have a 2-server setup is irrelevant, because in the first scenario, every other database server relating to every other application (if there are any) remain unaffected as well - being, as they are, hosted elsewhere.
Similarly, to the question posed by Kev 'what about all the other databases residing on the DB server? All you've lost is one database.'
if you were hosting an application and database on one server, you would only host databases on that server which related to that application. Therefore, you would not lose any additional databases in a single server setup when compared to a multiple server setup.
By contrast, in a 2 server setup, where the attacker had access to the Web Server, and by proxy, limited rights (in the best case scenario) to the database server, they could put the databases of every other application at risk by carrying out slow, memory intensive queries or maximising the available storage space on the database server. By separating the applications out into their own concerns, very much like virtualisation, you also isolate them for security purposes in a positive way.
I can speak from first hand experience that it is often a good idea to place the web server and database on different machines. If you have an application that is resource intensive, it can easily cause the CPU cycles on the machine to peak, essentially bringing the machine to a halt. However, if your application has limited use of the database, it would probably be no big deal to have them share a server.
Wow, no one brings up the fact that if you actually buy SQL Server at 5k bucks, you might want to use it for more than your web application. If you're using Express, maybe you don't care. I see SQL Servers run databases for 20 to 30 applications, so putting it on the webserver would not be smart.
Secondly, it depends on whom the server is for. I do work for financial companies and the govt, so we use a crazy pain-in-the-arse approach of using only sprocs and limiting ports from the webserver to SQL. If the web app gets hacked, the only thing the hacker can do is call sprocs, as the user account on the webserver is locked down to only see/call sprocs on the DB. So now the hacker has to figure out how to get into the DB. If it's on the web server, well, it's kind of easy to get to.
It depends on the application and the purpose. When high availability and performance are not critical, it's not bad to keep the DB and web server together, especially considering the performance gains: if the application makes a large number of database queries, a considerable amount of network load can be removed by keeping it all on the same system, keeping the response times low.
I listened to that podcast, and it was amusing, but the security argument made no sense to me. If you've compromised server A, and that server can access data on server B, then you instantly have access to the data on server B.
I think it's because the two machines usually would need to be optimized in different ways. Other than that I have no idea; we run all our applications with the server and database on the same machine (granted, we're not public facing) and we've had no problems.
I can't imagine that too many people care about one machine being compromised over both since the web application will usually have nearly unrestricted access to at the very least the data if not the schema inside the database.
Interested in what others might say.
Database licences are not cheap and are often charged per CPU, so by separating out your web servers you can reduce the cost of your database licences.
E.g. if you have one server with 8 CPUs doing both web and database, you will have to pay for an 8-CPU licence. However, if you have two servers with 4 CPUs each and run the database on only one of them, you will only have to pay for a 4-CPU licence.
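The licence arithmetic can be written out as a tiny sketch. The per-CPU price below is a made-up placeholder, not a real quote, and the model assumes a simple flat per-CPU charge.

```python
def licence_cost(db_server_cpus, price_per_cpu):
    """Per-CPU licensing: every CPU in the box that runs the database is charged."""
    return db_server_cpus * price_per_cpu


PRICE_PER_CPU = 5000  # hypothetical price, for illustration only

combined_box = licence_cost(8, PRICE_PER_CPU)  # one 8-CPU box running web + DB
split_boxes = licence_cost(4, PRICE_PER_CPU)   # only the 4-CPU DB box is licensed
```

Under these assumptions the combined box costs twice as much in licences as the split setup, because the CPUs that serve web traffic are licensed as well.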
An additional concern is that databases like to take up all the available memory and hold it in reserve for when they want to use it. You can force a database to limit its memory, but this can considerably slow data access.
Something not mentioned here, and the reason I am facing this, is zero-downtime deployments. Currently I have the DB and webserver on the same machine, and that makes updates a pain. If they are on separate machines, you can perform A/B releases.
I.e.:
The DNS currently points to WebServerA
Apply software updates to WebServerB
Change DNS to point to WebServerB
Work on WebServerA at leisure for the next round of updates.
This works because the state is stored in the DB, on a separate server.
Arguing that there is a real performance gain to be had by running a database server on a web server is a flawed argument.
Since Database servers take query strings and return result sets, the data actually flowing from data server to web server is relatively small, but the horsepower required to process the query and generate the result set is relatively large. Optimizing performance around the data transfer time therefore is optimizing around the wrong thing.
Regarding security, there are advantages to having the data server on a different box than the web server. Having such a setup is not the be all and end all of security, but it is a step in the right direction.
Regarding scalability, it is easy and relatively cheap to add web servers and put them into cluster to handle increased traffic. It is not so easy and cheap to add data servers and cluster them. Also, web servers and data servers have different hardware needs, so multiple boxes help out with scalability.
If you are starting small and have only one box, then a good way to go would be to use virtual machines. Running the web server and data server in different VMs on one host gives you all the gains of separate boxes for the price of one large box.
The operating system is another consideration. While your database may require larger memory spaces and therefore UNIX, your web server (or more specifically your app server, since you mention only two tiers) may be .NET-based and therefore require Windows.
OK, here is the thing: it is more secure to have your DB server installed on another machine and your application on the web server. You then connect your application to the DB with a web link. That's it.

Running SQL Server on the Web Server

Is it good, bad, or indifferent to run SQL Server on your webserver?
I'm using Server 2008 and SQL Server 2005, but I don't think that matters to this question.
For small sites, it doesn't make a bit of a difference.
As the load grows, though, this scales really badly, and quicker than you think:
Database servers are built on the premise they "own" the server. They trade memory for speed and they easily use all available RAM for internal caching.
Once resources start to be scarce, profiling becomes very difficult -- it is clear that IIS and SQL are both suffering, less clear where the bottleneck is. IIS needs CPU, SQL Server needs RAM or CPU etc etc
No matter how many layers you put in your code, it all runs on the same CPU, therefore a single layered application will run better in this context -- less overhead -- but it will not scale.
Security is really bad, usually you isolate SQL behind a firewall!
If you can afford it, it's probably better to shell out a few bucks and get a second server, maybe using PostgreSQL. One IIS server and one PostgreSQL server cost about as much as one IIS + SQL Server box because of licensing costs...
Larger shops would probably not consider this a best practice... However, if you aren't dealing with hundreds of requests per second, you're fine putting them both on one box.
In fact, for small apps, you will see better performance on the back-end because data does not have to go across the wire. It's all about scale.
Keep in mind that database servers eat memory. Here's one important lesson from the school of hard knocks: if you decide to run SQL Server 2005 on the same machine as your web server (and that is the setup you mentioned in your question), make sure you go into SQL Server Management Studio and do this:
Right click on the server instance and click properties
Select 'memory' from the list on the left
Change 'maximum server memory' to something your server can sustain.
If you don't do that, SQL Server will eventually take up all of your server's RAM and hang onto it indefinitely. This will cause your server to more or less sputter and die. If you are not aware of this, it can be very frustrating to troubleshoot.
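The Management Studio steps above also have a scriptable equivalent through sp_configure. The helper below is just a sketch that builds the T-SQL text so it can be reviewed and then run in a query window; the function name is made up, and the memory value is yours to choose based on what else runs on the box.

```python
def max_server_memory_script(max_mb):
    """Build the T-SQL that caps SQL Server's 'max server memory (MB)' setting."""
    if max_mb <= 0:
        raise ValueError("max_mb must be positive")
    return "\n".join([
        # 'max server memory (MB)' is an advanced option, so expose those first.
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'max server memory (MB)', {int(max_mb)};",
        "RECONFIGURE;",
    ])
```

For example, print(max_server_memory_script(2048)) produces a four-line script that caps the instance at 2048 MB. That figure is only an illustration, not a recommendation.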
I've done this quite a few times. It's not something you would do if you had the infrastructure of a large corporation and it does not scale, but it's fine for a lot of things.
It really comes down to how much work your webserver and your sql server are doing.
Without more information I doubt you are going to get any helpful answers.
If your web server is publicly accessible, this is a VERY bad idea from a security perspective.
Although it makes a lot of things more difficult from a routing, firewall, ports, authentication, etc. perspective, separation is good. When you have your database server running on the web server, if your web server is compromised, then your sql server is, too.
When you have them on separate boxes, you've raised the bar a little.
There's still a lot more work to be done to secure your web server AND your database server, but why make it easier than it needs to be?
I'd say it is best to run them on the same server until it becomes a problem. That way you'll save yourself some money and time upfront. Once the site becomes a success and requires some architectural changes, it should have already paid for itself.
Remember to back up :)
It will depend on the expected load of the server. For small sites, it is no problem at all (if correctly configured). For large sites, you might want to consider distributing the load over different servers: web server, file server, database server, etc.
I've seen this issue over and over again. The right answer is to put SQL Server on one machine and IIS (web server) on the other. Your money will go into the SQL Server machine because the right drive system and RAM must be purchased to support an efficient server, but the web server can be a much scaled-down and less expensive machine with just a mirrored drive set.
