Approach - Optimizing LINQ for performance - sql-server

My question is in several parts and could go long, so sorry about that.
I am using LINQ to pass data across to a Silverlight client.
Part 1: Connection pooling & optimization:
I want to know how I can ensure that LINQ establishes only one connection to the database and also follows connection pooling. In the past, I used to put the connection pooling property in the SQL connection string itself. So, if I put this setting in the file where the connection string is set when I add tables to the DBML file, will LINQ respect that?
LINQ (DBML) forms the core of my data access layer. I write classes that have methods which are basically LINQ queries.
I use the DataContext object in a "using" block in my code where I handle the data-related part.
Is this the reason why LINQ might be using multiple connections? If I have the DataContext as a class-level variable, will this ensure only one connection?
Part 2: Optimization
In the not-so-distant past, in ADO.NET days, I used to write stored procedures, execute them via a DataReader, loop through the DataReader, fill objects of my Model class, and pass that collection on for binding to a DataGrid.
In LINQ days, I do more or less the same as far as creating a collection of objects is concerned, but I execute the LINQ statements directly.
I can guess that SQL stored procedures will give me faster performance, but if I execute stored procedures through LINQ, will it be just as fast as in the older days, and is it the right approach?

Ohoh.... so many bad habits.
Part 2: Optimization
In the not-so-distant past, in ADO.NET days, I used to write stored procedures, execute them via a DataReader, loop through the DataReader, fill objects of my Model class and pass that collection on for binding to a DataGrid. In LINQ days, I do more or less the same as far as creating a collection of objects is concerned, but I execute the LINQ statements directly. I can guess that SQL stored procedures will give me faster performance, but
Basically, in the past you had some minor delusions. Stored procedures have had NO (!) performance gain for more than 10 years now. It is clearly spelled out in the documentation: SQL execution is just as fast for non-stored-procedure batches, and query plans are cached and reused for both.
SPs are only good (as in: saving time) if they avoid round trips (i.e. sending multiple batches of requests from the client to the server). And then the savings are not due to the fact that they are stored procedures, but due to the fact that round trips cost time.
Sadly, many programmers still have these delusions because they got them from other people who got them.... 10 years ago, when stored procedures had an intrinsic advantage. But that was SQL Server 6.5 time - since 7.0 this is history.
http://weblogs.asp.net/fbouma/archive/2003/11/18/38178.aspx
Basically you traded lost development time and lots of useless code for.... most likely no measurable advantage.
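For what it is worth, once you drag a procedure onto the DBML designer, calling it from LINQ to SQL is just another method on the DataContext and goes through the exact same ADO.NET pipeline as an ordinary query. A minimal sketch - the context, table and procedure names here are purely illustrative:

    // Sketch only: MyDataContext, Customers and GetCustomersByRegion are made-up names
    // for a DBML-generated context, a mapped table and a mapped stored procedure.
    using (var db = new MyDataContext())
    {
        // Ordinary LINQ query: translated to a parameterized batch and run through ADO.NET.
        var active = db.Customers.Where(c => c.IsActive).ToList();

        // Mapped stored procedure: also run through ADO.NET (a SqlCommand with CommandType.StoredProcedure).
        var western = db.GetCustomersByRegion("West").ToList();
    }

Both calls end up as SqlCommand executions on a pooled connection; neither path has an intrinsic speed advantage.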
Part 1: Connection pooling & optimization: I want to know how I can ensure that LINQ establishes only one connection to the database and also follows connection pooling.
You don't. Basically, don't try to be smarter than is good for you. What is wrong with multiple connections IF THEY MAKE SENSE?
For pooling, you should (with SQL Server) normally not have to put anything in the connection string at all. And yes, LINQ will not magically bypass the connection pool defined by the connection string. See, LINQ does not EVER talk to the database - it uses ADO.NET for that, and ADO.NET has not magically changed behavior just because some higher-level ORM is using it instead of you. If the connection string has pooling entries, ADO.NET still sees them and follows them.
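To make that concrete: the pooling keywords live in the plain ADO.NET connection string, and a DataContext that is handed that string simply passes it down. A rough sketch - server, database and context names are made up:

    // Illustrative only - the pooling keywords belong to ADO.NET (SqlClient), not to LINQ.
    string connStr = "Data Source=MYSERVER;Initial Catalog=MyDb;Integrated Security=True;" +
                     "Pooling=true;Min Pool Size=1;Max Pool Size=100";

    using (var db = new MyDataContext(connStr))
    {
        // The SqlConnection underneath is taken from, and returned to, the ADO.NET pool as usual.
        var orders = db.Orders.Where(o => o.CustomerId == 42).ToList();
    }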
Now, having only one database connection from a server pool is one thing: STUPID. It limits you to one transaction at a time and totally destroys performance once the load gets higher (i.e. multiple requests need to be handled at the same time).
I use the DataContext object in a "using" block in my code where I handle the data-related part. Is this the reason why LINQ might be using multiple connections? If I have the DataContext as a class-level variable, will this ensure only one connection?
Ah - it depends. It may make sense, it may not. You know, before thinking you have a problem, especially given the huge amount of information you give here (i.e. none), do the ONLY sensible thing: grab a profiler and MEASURE whether you have one. Opening / closing a connection 100 or 1000 times most likely will not even show up in the profiler. No problem = no reason to fix something.
That said, I don't like per-method connection opening - it normally points to a bad class design. Connections should be reused within a unit of work.
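In LINQ to SQL terms that usually just means one DataContext per unit of work (per request or per logical operation), rather than one per method or one global instance for the life of the app. A rough sketch, with made-up entity names:

    // One DataContext per unit of work: several operations share it, then it is disposed.
    public void ShipOrder(int orderId)
    {
        using (var db = new MyDataContext())
        {
            var order = db.Orders.Single(o => o.OrderId == orderId);
            order.Status = "Shipped";

            db.ShipmentLogs.InsertOnSubmit(
                new ShipmentLog { OrderId = orderId, ShippedOn = DateTime.Now });

            db.SubmitChanges();   // one connection and one transaction for the whole unit of work
        }
    }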

Related

How to compare Access SQL results to SQL Server results efficiently?

I am trying to convert my Access queries to SQL Server views. I can easily see how many records are returned in both cases, but it is getting difficult to make sure the records match, as I have to check them manually. Also, I don't check every single record, so there might be some conversion error that I did not notice.
Is there a way to check at least some segment of the result automatically?
In a way, you are asking the wrong question.
When you migrate the data to SQL Server, then at that point you assume all data made it to SQL Server. I suppose you could do a one-time check of the record counts.
At that point, you then re-link the tables (which likely pointed to an Access back end) so that they now point to SQL Server.
At this point, all of the existing queries in Access should and will work fine. ZERO need exists to test and check this after a migration.
For existing forms that are bound to a table (a linked table), you DO NOT NEED to create views on SQL Server - this is simply not done nor required.
The ONLY time you will convert an Access query to a view is if the Access query runs slow.
For a simple query, say on a table of 1 million invoices, you can continue to use the Access query (running against the linked tables). If you do a query such as
Select * from tblInvoices where invoiceNum = 12345
Then Access will ONLY pull the one record down the network pipe. So ZERO need exists, and no performance gain will occur, by converting this query to a view (do not do this!!! – you are simply wasting valuable time and money).
However, for a complex query with multiple joins and multiple tables (which in general is NOT used for a form), you do want to convert such queries to views, and then give the view the SAME name as the Access (client-side) query had.
So you are not, “out of the blue,” going to wind up with a large number of views on SQL Server after a migration.
You migrate the data, link the tables. At that point 99% of your application will work just fine as before. After a migration you DO NOT NEED to create any views.
Creating views is an “after the fact” step and occurs AFTER you have the application running. So views are NOT created until such time as you have the application up and running (and deployed).
To get your Access application to work with SQL Server you DO NOT create ANY views. You do not even create ONE view.
You migrate the data, link the tables, and now your application should work.
You ONLY start creating views for Access queries that run slow.
So for example, you might have a report that now runs slow. The reason is that while Access does a good job for most existing queries (which you don't have to change), for some you find that the Access client does a “less than ideal” job of running that query.
So you flip that query in Access into SQL view mode, cut + paste that SQL into the SQL manager (creating a view), and then modify the syntax of the query. For some SQL it might simply be less effort to bring up the query in the Access designer, bring up the SQL designer in SQL Server, and re-create the SQL from scratch in place of the cut + paste idea. Both approaches work fine – it depends on how “crazy” the SQL you are working with is as to which approach works better (try both – see what suits you best). Some like to cut + paste the SQL, and others like to simply look at the graphical designer in both cases (on two monitors) and work that way.
At this point, after you have created the SQL query (the view), you simply run that view and look at the record count. You then run the query that you have on the Access client, and if it returns the same number of records, then in 99.9% of cases you can be rather confident that the SQL query (the view) produces the same results as the client query. You now delete (or rename) the Access query, link to the view (giving it the same name as the Access query), and you are done.
The 2 or 20 places that used that Access client query will now be using the view. You should not have to worry about or modify one thing in Access after having done this.
Now that the view is working, you then re-deploy your Access front end to all workstations (I assume you have some means to update the Access application part – just like you did even when not using SQL Server).
At this point, users are all happy and working.
If in the next day or so you find another report that runs slow (most will not), then that Access query becomes a candidate for you to convert to a view.
So out of an application of say 200 queries in Access, you will in general likely only have to convert 2-10 of them to views. The VAST majority of the Access saved queries DO NOT AND SHOULD NOT be converted to views. It is a waste of time to convert an Access query that runs perfectly well and performs well to a view.
So your question suggests that you have to convert a “lot” of Access queries to views. This is NEVER the case nor is it a requirement.
So the number of times that you need to convert an Access query to a view is going to be quite small. Most forms are based on a table, or now a linked table to SQL Server. In these cases you do NOT want to convert to a view, and you don't need to.
So the answer here is simply: run the Access query, and then run the view. They are both working on the same data (you are using linked tables, and the Access queries will HIT those linked tables and work just fine). If the Access client using the linked tables to SQL Server returns the same number of records as your view on SQL Server, then as noted, you are 99.8% done, and you are now safe to rename the Access query and link to the view with the same name.
However, the number of times you do this after a migration is VERY low and VERY rare.
You don’t convert any existing Access saved query to a view unless running that query in Access takes excessive time.
As noted, for “not too complex” SQL queries (ones that don’t involve joins between too many tables), you will not see nor find a benefit from converting that Access query to a view. You KEEP and use the existing Access queries as they are.
You only want to spend time on the “few” queries that run slow; the rest do NOT need to be modified.
For an existing form that is bound to a table (or now a linked table), such forms are NOT to be changed to views – they are editing tables (or linked tables) directly, and REALLY REALLY need to remain as such. Attempting to convert such forms from a linked table to a view is a HUGE mess, something that will not help performance but, worse, will only introduce potentially 100’s of issues and bugs into your existing application, and you would be doing something that is NOT required.
Because you will create so VERY FEW views, an automated approach is not required. You simply make the one query, run it, and then run the existing Access query (client side). If they both return the same number of records, then it is a cold day in hell before you need further testing. You are going to be creating one view at a time, and creating these views over a "longer" period of time. You NEVER migrate Access and then migrate a large number of queries to SQL Server. That is NOT how a migration works.
You migrate, link the tables. At that point your application should work fine. You get it working (and have not yet created any views). You get the application deployed. ONLY after you have everything working just fine do you THEN start to introduce views into the mix and picture. As noted, because you are only ever going to introduce ONE view at a time into the application, and this will occur "over time" while users are using the existing and working application, there is little need for some automated approach and test. As I stated, with 200+ Access client queries, you should not need to convert more than about 5, maybe 10, to views - the rest are to remain as they are now, untouched and unmodified by you.

Effect of stored procedures on network traffic in Access/SQL setup

I am currently administering/developing an Access 2010 frontend/SQL backend database. We are trying to improve frontend performance, and one solution that has been suggested is pushing a lot of the VBA that is running the front end down into stored procedures on the server. I'm fairly proficient in VBA, but very new to SQL and network architecture. Everything I've turned up on google has been information about splitting the database, which is already done, rather than information about network loads resulting from running stored procedures vs running VBA.
What is the difference in network traffic between the current setup and pushing this action down to a stored procedure?
As a specific example, if I'm populating a form in the current setup, there are a few queries run to provide data to different elements on the form. With the current architecture, does Access retrieve the queried tables from the backend, query them client-side and then populate the data? How would that be different in terms of network traffic from, say, executing a SP when the form loads, and only transferring the data necessary for displaying the form?
The end goal is to reduce the chattiness between Access and SQL, and I'm mostly trying to figure out exactly what is happening where.
As a general rule, if you open a form with a where clause to restrict the form to one record, then using a bound form or adopting a stored procedure will NOT result in any difference or reduction in network traffic.
Any local Access query based on a table will simply request the one record. There is no “local” processing in this regard, EVEN with a linked table. Note the word “table” (singular) here.
Access does not and will not pull down a whole table unless you have forms and queries without any “where” clause to restrict the data pulled.
In other words, if you have a poorly designed form and you dump and change that design to something in which you now ONLY pull down the one record, then of course the new setup will result in reduced network traffic.
However, the above reduction is NOT DUE to adopting the stored procedure but ONLY due to adopting a design in which you restrict the records requested into the form.
So doing something poorly and then improving that process is NOT a justification to adopt stored procedures.
Thus, in the case of pulling records into a form, using a stored procedure will NOT improve performance. Worse, binding a form to a stored procedure results in a form that is READ ONLY anyway!
So stored procedures don’t necessarily increase performance or reduce network traffic when we are talking about loading a record into a form, in terms of response time or performance.
If you have to do large amounts of recordset processing, then of course adopting a stored procedure can save network traffic. So in place of some VBA code to process 100,000 payroll records, then yes, moving such code server side will help. However, processing 100,000 payroll records is NOT a common task and is NOT a user interface issue in most cases anyway. In other words, you don’t have a slow-loading form or a slow response time to load such forms - such types of processing are NOT done interactively by users waiting for a form to load.
SQL server is indeed a high performance system, and also a system that can scale to many users.
If you write your application in C++, or VB, or in your case with MS Access, in GENERAL the performance of all of these tools will BE THE SAME.
In other words...sql server is rather nice, and is a standard system used in the IT industry.
However, sql server will NOT solve your performance issues without efforts on your part. And, it turns out that MOST of those same efforts also make your non sql server Access applications run better.
In fact, we see many posts that mention moving the back-end data to SQL Server actually slowed things down. (And in fact on a single machine, Access JET (now called ACE) is actually FASTER THAN SQL Server - so with a single user on the same machine, Access is faster than SQL Server in most cases.)
A few things:
Having a table with 75k records is quite small. Let’s assume you have 12 users. With just a 100% file-based system (JET) and no SQL Server, the performance of that system should really scream.
I have some applications out there with 50, or 60 HIGHLY related tables. With 5 to 10 users on a network, response time is instant. I don't think any form load takes more than one second. Many of those 60+ tables are highly relational and in the 50 to 75k records range.
So, with my 5 users I see no reason why I can’t scale to 15 users with such small tables in the 75,000 record range. And this is without SQL server.
If the application did not perform with such small tables of only 75k records, then upsizing to SQL Server will do absolutely nothing to fix performance issues. In fact, in the SQL Server newsgroups you see weekly posts by people who find that upgrading to SQL actually slowed things down.
I have even seen some very cool numbers showing that some queries were actually MORE EFFICIENT in terms of network use with JET than with SQL Server.
My point here is that technology will NOT solve performance problems. However, good designs that make careful use of limited bandwidth resources are the key here. So, if the application was not written with good performance in mind, then you are kind of stuck with a poor design!
I mean, when using a JET file share, if you grab an invoice from the 75k record table, only the one record is transferred down the network pipe (and SQL Server will also only transfer one record). So, at this point, you really will NOT notice any performance difference by upgrading to SQL Server. There is no magic here. And adopting a SQL stored procedure will be an even GREATER waste of time!
And adopting a stored procedure in place of the above will NOT gain you performance either!
SQL Server is a robust and more scalable product than JET. And security, backup and a host of other reasons make SQL Server a good choice. However, SQL Server will NOT solve a performance problem when dealing with such small tables as 75k records.
Of course, when efforts are made to utilize sql server, then significant advances in performance can be realized.
I will give a few tips... these apply when using MS Access as a file share (without a server), or even ODBC to SQL Server:
** Ask the user what they need before you load a form!
The above is so simple, but so often I see the above concept ignored. For example, when you walk up to an instant teller machine, does it download every account number and THEN ASK YOU what you want to do?
In Access, it is downright silly to open up a form attached to a table WITHOUT FIRST asking the user what they want! So, if it is a customer invoice, get the invoice number, and then load up the form with that ONE record. How can one record be slow? When done editing the record, the form is closed, and you are back at the prompt, ready to do battle with the next customer.
You can read up on how this "flow" of a good user interface works here (and this applies to both JET, and sql server applications):
http://www.kallal.ca/Search/index.html
My only point here is to restrict the form to only the ONE record the user needs. You don't need, nor do you gain anything from, a stored procedure to accomplish this task. I am always dismayed at how often a developer builds a nice form, attaches it to a large table, opens it, throws this form attached to some huge table at the users, and then tells them to go have at it and have fun. Don't we have any kind of concern for those poor users? Often, the user will not even know how to search for something!
So prompting and asking the user also makes a HUGE leap forward in usability. And the big bonus is reduced network traffic too! Gosh - better, faster, and less network traffic! What more do we want?
** USE CAUTION with queries that require more than one linked table
JET has a really difficult time joining ODBC tables together. Often the Access data engine (JET/ACE) does a good job, but often such joins are slow. However, most forms for editing data are NOT based on a multi-table query (so again, a stored procedure will not speed up form load for editing data).
The simple solution for such multiple joins (for both forms and reports) is to build the query server side as a view, and then link to that view.
This view approach is MUCH less work than a stored procedure and results in the joins occurring server side. And the resulting views are updatable, as opposed to READ ONLY when you adopt stored procedures. And the performance of such views will again equal that of a stored procedure in THIS context.
So once again, adopting stored procedures DOES NOT help and is more expensive in developer cost than simply using a view. Really this just amounts to people suggesting that you rack up bills and use developer time to create something that yields nothing over a view except more billable hours.
I don't think it needs pointing out that if the query in question already runs well, then the above can be ignored, but just keep in mind that local queries with more than one table based on links to sql server can often run slow. So, just be aware of the above.
This view trick also applies well to combo boxes.
So one can continue to use bound forms on a linked table, but one simply needs to restrict the form to the ONE RECORD you need.
You can safely open up a single invoice form etc., but simply ENSURE you open such forms (OpenForm) restricting records via the "where" clause. No view or stored procedure is required here.
Bound forms are way less work than un-bound forms, and performance is generally just as good anyway when done right.
Avoid heavy loading of combo boxes. A combo box is good for about 100 entries. After that you are torturing the user (what, do they have to look through 100s of entries?). So keep things like combo boxes down to a minimum size. This is both faster and, MORE importantly, kinder to your users.
After all, at the end of the day what we really want is to treat users well. It seems that treating the users well, and reducing the bandwidth (amount of data) goes hand in hand.
So, better applications treat the users well and run faster! (this is good news!)
So, #1 tip is to reduce the data that you transfer into a form.
Using stored procedures is not required in the vast majority of cases and will not reduce bandwidth requirements any more than adopting where clauses and views does.

Database locked (or slow) with Linq To Entities and Stored Procedures

I'm using LINQ to Entities (L2E) ONLY to map all the stored procedures in my database, for easy translation into objects. My data is not really sensitive (so I'm considering isolation level "READ UNCOMMITTED" everywhere). I have several tables with millions of rows. I have the website and a bunch of scripts utilizing the same data model created using Entity Framework. I have indexed all tables (max 3 indexes per table) so that every filter I use is covered directly by an index. My scripts mainly consist of:
1) Getting data from the DB (~5 seconds)
2) Making API calls (1-3 seconds)
3) Adding the results to the database
I have READ_COMMITTED_SNAPSHOT and ALLOW_SNAPSHOT_ISOLATION set to ON.
Using this strategy, most of my queries are fixed and very fast (usually). I still have some queries used by my scripts that can run for up to 20 seconds, but they are not called that often.
The problem is that suddenly the whole database gets slow and all my queries return slowly (it can be over 10 seconds). Using SQL Profiler I have tried to find the issue.
As mentioned, I'm considering NOLOCK hints via "READ UNCOMMITTED"... Right now I'm going through each possible database call and adding indexes and/or caching tables to make the call faster.
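If I go the READ UNCOMMITTED route, I assume it would look roughly like the sketch below (the entity and procedure names here are made up, just to show what I mean):

    // Sketch: wrap the L2E call in a read-uncommitted transaction scope.
    var options = new System.Transactions.TransactionOptions
    {
        IsolationLevel = System.Transactions.IsolationLevel.ReadUncommitted
    };

    using (var scope = new System.Transactions.TransactionScope(
               System.Transactions.TransactionScopeOption.Required, options))
    using (var ctx = new MyEntities())
    {
        var rows = ctx.GetPendingItems().ToList();   // GetPendingItems = an imported stored procedure
        scope.Complete();
    }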
I have also considered removing L2E and accessing the database the "old way" to be sure that's not the issue. Is my data context locking my tables? It sure looks that way. I have experimented with having the context live across the API call to minimize the number of contexts created, but right now I create a new context for each call, since I thought the context was locking the database.
The problem is that I cannot guarantee that every single call is fast, for all eternity; otherwise the whole system gets slowed down.
When I restart SQL Server and rebuild the indexes it is really fast for a short period of time before everything gets slow again. Any pointers would be appreciated.
Have you considered checking whether there are any major waits on the server?
Review the following page on sys.dm_os_wait_stats (Transact-SQL) and see if you can get any insight into why the server is slow....
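One quick way to take a look (or just run the inner query directly in Management Studio) - a sketch only, with an illustrative connection string:

    // Dump the top waits accumulated since the last restart.
    using (var conn = new System.Data.SqlClient.SqlConnection(
               "Data Source=.;Initial Catalog=master;Integrated Security=True"))
    using (var cmd = new System.Data.SqlClient.SqlCommand(
               @"SELECT TOP 10 wait_type, wait_time_ms, waiting_tasks_count
                 FROM sys.dm_os_wait_stats
                 ORDER BY wait_time_ms DESC", conn))
    {
        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
                Console.WriteLine("{0}: {1} ms over {2} waits",
                    reader.GetString(0), reader.GetInt64(1), reader.GetInt64(2));
        }
    }

High PAGEIOLATCH_* or WRITELOG waits, for example, point in a very different direction than LCK_M_* (blocking) waits.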

How much performance do I lose by increasing the number of trips to SQL Server?

I have a web application where the web server and SQL Server 2008 database sit on different boxes in the same server farm.
If I take a monolithic stored procedure and break it up into several smaller stored procs, thus making the client code responsible for calls to multiple stored procedures instead of just one, am I going to notice a significant performance hit in my web application?
Additional Background Info:
I have a stored procedure with several hundred lines of code containing decision logic, update statements, and finally a select statement that returns a set of data to the client.
I need to insert a piece of functionality into my client code (in this sense, the client code is the ASP web server that is calling the database server) that calls a component DLL. However, the stored procedure updates a recordset and returns the updated data in the same call, and my code ideally needs to run after the decision logic and update statements, but before the data is returned to the client.
To get this functionality to work, I'm probably going to have to split the existing stored proc into at least two parts: one stored proc that updates the database and another that retrieves data from the database. I would then insert my new code between these stored proc calls.
When I look at this problem, I can't help but think that, from a code maintenance point of view, it would be much better to isolate all of my update and select statements into thin stored procs and leave the business logic to the client code. That way whenever I need to insert functionality or decision logic into my client code, all I need to do is change the client code instead of modifying a huge stored proc.
Although using thin stored procs might be better from a code maintenance point-of-view, how much performance pain will I experience by increasing the number of trips to the database? The net result to the data is the same, but I'm touching the database more frequently. How does this approach affect performance when the application is scaled up to handle demand?
I'm not one to place performance optimization above everything else, especially when it affects code maintenance, but I don't want to shoot myself in the foot and create headaches when the web application has to scale.
In general, as a rule of thumb, you should keep round trips to SQL Server to a minimum.
The "hit" on the server is very expensive; it's actually more expensive to divide the same operation into 3 parts than to do 1 hit and everything else on the server.
Regarding maintenance, you can call 1 stored proc from the client and have that proc call the other 2 procs.
I had an application with extreme search logic; that's how I implemented it.
some benchmarking results...
I had a client a while back whose servers were falling over and crumbling; when we checked for the problem, it was too many round trips to SQL Server. When we minimized them, the servers went back to normal.
It will affect it. We use a Weblogic server where all the business logic is in the AppServer connected to a DB/2 database. We mostly use entity beans in our project and for most business service calls make several trips to the DB with no visible side effects. (We do tune some queries to be multi-table when needed).
It really depends on your app. You are going to need to benchmark.
A well setup SQL Server on good hardware can process many thousands of transactions per second.
In fact breaking up a large stored procedure can be beneficial, because you can only have one cached query plan per batch. Breaking into several batches means they will each get their own query plan.
You should definitely err on the side of code-maintenance, but benchmark to be sure.
Given that the query plan landscape will change, you should also be prepared to update your indexes, perhaps creating different covering indexes.
In essence, this question is closely related to tight vs. loose coupling.
At the outset: You could always take the monolithic stored procedure and break it up into several smaller stored procs, that are all called by one stored procedure, thus making the client code only responsible for calling one stored procedure.
Unless the client will do something (change the data or provide status to user) I would probably not recommend moving multiple calls to the client, since you would be more tightly coupling the client to the order of operations for the stored procedure without a significant performance increase.
Either way, I would benchmark it and adjust from there.

Many connections vs. big data queries

Hello, I am creating a Windows application that will be installed on 10 computers, all of which will access the same database through Entity Framework.
I was wondering what's better:
Spread the queries into packets (i.e. load a contact, then attach the included navigation properties - DataContext.Contacts.Include("Phone")).
Load everything in one query rather than splitting it out into individual queries.
You name it.
BTW, I have a query whose trace string produced over 500 lines of SQL; I'm having doubts - maybe I should sacrifice user experience for performance, since performance is also a part of the user experience.
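To make the two options concrete, here is roughly what I mean - MyEntities, Contacts, Phones and City are names from an imaginary model:

    using (var ctx = new MyEntities())
    {
        // Option 1: several smaller queries (more round trips, smaller payloads each).
        var contacts = ctx.Contacts.Where(c => c.City == "Madrid").ToList();
        var phones   = ctx.Phones.Where(p => p.Contact.City == "Madrid").ToList();

        // Option 2: one chunkier query - eager-load the navigation property in a single round trip.
        var contactsWithPhones = ctx.Contacts.Include("Phone")
                                    .Where(c => c.City == "Madrid")
                                    .ToList();
    }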
You could put your SQL in stored procedures and write your Entity Framework logic to use the procedures instead of generating the SQL and sending it over the wire.
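If you go that route, a procedure imported into the model ends up as just a method on the context, so the calling code stays simple (a sketch; the context and function names are illustrative):

    using (var ctx = new MyEntities())
    {
        // GetContactsByCity = a function import mapped to a stored procedure.
        var contacts = ctx.GetContactsByCity("Madrid").ToList();
    }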
As with everything database related, it depends. Things like the connection type (LAN vs WAN), how you handle caching, database load level, type of database load (writes vs reads) etc, can all make a difference.
But in general, whenever you can reduce the number of round trips to the database that's a good thing. And remember: you can have more than one result set after executing a single SqlCommand.
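For example, a single command can bring back two result sets in one round trip (a sketch - the table names and connection string are illustrative):

    // Two result sets from one round trip: read the first, then move on to the next.
    using (var conn = new System.Data.SqlClient.SqlConnection(
               "Data Source=.;Initial Catalog=MyDb;Integrated Security=True"))
    using (var cmd = new System.Data.SqlClient.SqlCommand(
               "SELECT * FROM Contacts WHERE CityId = @city; " +
               "SELECT * FROM Phones WHERE CityId = @city;", conn))
    {
        cmd.Parameters.AddWithValue("@city", 42);
        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read()) { /* contacts */ }
            reader.NextResult();
            while (reader.Read()) { /* phones */ }
        }
    }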
Load everything in one query rather than splitting it out into individual queries.
This will normally be superior. You're usually better off writing chunkier queries than chatty ones. Fewer calls have less overhead - you need to obtain fewer connections, deal with less latency, etc.
Does the database server have to support other applications? For most business software applications, SQL server won't even break a sweat servicing ten clients - particularly performing basic entity lookups. It won't even really know you're there unless it's installed on a 486SX.
