I have a very simple UPDATE query in Access which should update the underlying SQL Server table. For whatever reason, Access does not pass the query to the server but handles it by itself, i.e., it runs an update for each row individually. Since the table is huge, this runs forever.
This is the query as generated with the editor.
UPDATE dbo_myTable
SET dbo_myTable.myColumn = 'A'
WHERE dbo_myTable.myOtherColumn = 123;
If I run the same query as pure SQL it takes only seconds - as expected.
UPDATE dbo.myTable
SET dbo.myTable.myColumn = 'A'
WHERE dbo.myTable.myOtherColumn = 123;
The problem is not the 'A' value. If I change it from 'A' to Null the problem remains.
Background:
My actual update query is more complicated and involves joins and multiple conditions. While debugging the speed issues I could break it down to the above simple query which is already slow.
I used the SQL Server Profiler to confirm my guess that Access issues a query for each row instead of passing the whole query to SQL Server.
Related:
I had a similar question a while ago: Force MS Access to send full query to SQL server. While the problem is the same - Access not passing the whole query - the solution has to be different, because here there are really no special commands whatsoever.
The syntax for update queries in Access is significantly different from that of SQL Server, especially regarding joins, so they can't be handed off to SQL Server.
One of the main differences is that in Access, an update query write-locks all included tables by default and can write to all of them, while in SQL Server you have a separate FROM clause and the query only write-locks and writes to a single table.
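For example, an update that joins two tables is written quite differently in the two dialects. The table and column names below are only illustrative, not taken from the question:
-- T-SQL: a separate FROM clause; only dbo.myTable is written to
UPDATE t
SET t.myColumn = 'A'
FROM dbo.myTable AS t
INNER JOIN dbo.myOtherTable AS o ON o.id = t.otherId
WHERE o.flag = 1;

-- Access/Jet SQL: the joined tables appear directly in the UPDATE clause,
-- and any of them could in principle be updated
UPDATE dbo_myTable INNER JOIN dbo_myOtherTable
ON dbo_myOtherTable.id = dbo_myTable.otherId
SET dbo_myTable.myColumn = 'A'
WHERE dbo_myOtherTable.flag = 1;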
Instead, use a pass-through query to execute the update directly on SQL Server if performance is an issue.
Perhaps you can fool Access into issuing a bulk update:
Sql = "UPDATE dbo.myTable SET dbo.myTable.myColumn = 'A' WHERE dbo.myTable.myOtherColumn = 123;"
CurrentDb.Execute Sql, dbQSPTBulk
The dbQSPTBulk option above is normally meant to be combined with dbQSQLPassThrough, but you don't have to. If you use the above, then only one UPDATE command is sent.
Credit: Albert Kallal
Related
I'm issuing a fairly simple update of a single varchar column against a remote linked server - like this:
UPDATE Hydrogen.CRM.dbo.Customers
SET EyeColor = 'Blue'
WHERE CustomerID = 619
And that works fine when it is written as an ad-hoc query.
Parameterized queries bad
When we do what we're supposed to do, and have our SqlCommand issue it as a parameterized query, the SQL ends up being (not strictly true, but close enough):
EXEC sp_executesql N'UPDATE [Hydrogen].[CRM].[dbo].[Customers]
SET [EyeColor] = @P1
WHERE [CustomerID] = @P5',
N'@P1 varchar(4),@P5 bigint',
'Blue',619
This parameterized form of the query ends up performing a remote scan against the linked server:
It creates a cursor on the linked server, and takes about 35 seconds to pull back 1.2M rows to the local server through a series of hundreds of sp_cursorfetch - each pulling down a few thousand rows.
Why, in the world, would the local SQL Server optimizer ever decide to pull back all 1.2M rows to the local server in order to update anything? And even if it was going to decide to pull back rows to the local server, why in the world would it do it using a cursor?
It only fails on varchar columns. If I try updating an INT column, it works fine. But this column is varchar - and it fails.
I tried parameterizing the column as nvarchar instead, and it's still bad.
Every answer I've seen is actually just a question:
"is the collation the same?"
"What if you change the column type?"
"Have you tried OPENQUERY?"
"Does the login have sysadmin role on the linked server?"
I already have my workaround: parameterized queries bad - use ad-hoc queries.
I was hoping for an explanation of the thing that makes no sense. And hopefully, once we have an explanation, we can fix it rather than work around it.
Of course I can't reproduce it anywhere except the customer's live environment. So it is going to require knowledge of SQL Server to come up with an explanation of what's happening.
Bonus Reading
Stack Overflow: Remote Query is slow when using variables vs literal
Stack Overflow: Slow query when connecting to linked server
https://dba.stackexchange.com/q/36893/2758
Stack Overflow: Parameter in linked-server query is converted from varchar to nvarchar, causing index scan and bad performance
Performance Issues when Updating Data with a SQL Server Linked Server
Update statements causing lots of calls to sp_cursorfetch?
Remote Scan on Linked Server - Fast SELECT/Slow UPDATE
I've been looking at logging procedure executions on our reporting database to a table, and aim to come up with a generic snippet of code that can be copied into any proc we want to log.
The above led me to play around with @@PROCID. The Microsoft documentation explains that this will provide the object ID of the proc, UDF, or trigger within which it is contained. That makes sense, but I'm also seeing it return a value when run directly from a new query window. I've not been able to relate this integer to an object ID in the database - I have no idea what this ID represents. I'm sysadmin on the server I'm trying this on, so there shouldn't be any permission restrictions.
I haven't managed to find anything online about this - the only search result which looked relevant is on a login restricted SAP support forum.
Use Master
select @@procid -- returns an integer
select object_name(@@procid) -- NULL
select * from sys.objects where object_id = @@ProcId -- 0 rows
While this isn't documented, the value corresponds to the objectid attribute of the cached query plan, as returned by sys.dm_exec_plan_attributes. The meaning of that is documented: "For plans of type "Adhoc" or "Prepared", it is an internal hash of the batch text."
To confirm, the following query returns the text of the query itself (and thus serves as a form of quine for SQL Server, albeit one that cheats as it inspects runtime values):
SELECT t.[text]
FROM sys.dm_exec_cached_plans p
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) t
CROSS APPLY sys.dm_exec_plan_attributes(p.plan_handle) a
WHERE a.attribute = 'objectid' AND a.value = @@PROCID
It depends on what tool you are using to submit the command. Many tools will create a temporary stored procedure containing your commands (using an ODBC prepared statement, for example) and then run that procedure.
Speculating, it may be that the tool is detecting that the statement is unchanged and is therefore re-using the previous prepared statement. In this case SQL Server would not be involved; it would be the client library.
Alternatively, it may be that the server is detecting that the SQL is unchanged, and the preserved procid is a consequence of the query-plan caching system. (SQL Server attempts to detect repeated ad-hoc statements and optimise by re-using the plans for them.)
Either way, you should consider this a curiosity, not something to rely on for correct operation of your system, as it may well change with updates to SQL Server or your client library.
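One quick experiment to separate the two theories, assuming the other answer here is right that the value is a hash of the batch text: run the identical batch twice and then a slightly altered copy, and see whether the value only changes when the text does.
select @@procid as procid_value -- run this batch and note the value
GO
select @@procid as procid_value -- identical batch text: the same value should come back
GO
select @@procid as altered_text -- changed batch text: expect a different value
GO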
About 5 times a year, one of our most critical tables has a specific column where all the values are replaced with NULL. We have run log explorers against this and we cannot see any login/hostname populated with the update; we can just see that the records were changed. We have searched all of our sprocs, functions, etc. for any update statement that touches this table on all databases on our server.
The table does have a foreign key constraint on this column. It is an integer value that is established during an update, but the update is identity-key specific. There is also an index on this field. Any suggestions on what could be causing this outside of a T-SQL update statement?
I would start by denying any client-side dynamic SQL if at all possible. It is much easier to audit stored procedures to make sure they execute the correct SQL, including a proper WHERE clause. Unless your SQL Server is terribly broken, the only way data is updated is because of the SQL you are running against it.
All stored procs, scripts, etc. should be audited before being allowed to run.
If you don't have the mojo to enforce no dynamic client SQL, add application logging that captures each client SQL statement before it is executed; a minimal sketch is shown below. Personally, I would have the logging routine throw an exception (after logging it) when a WHERE clause is missing, but at a minimum you should be able to figure out where data gets blown away next time by reviewing the log. Make sure your log captures enough information that you can trace it back to the exact source. Assign a unique "name" to each possible dynamic SQL statement executed, e.g., assign a 3-char code to each program and then number each possible call 1..nn within the program, so you can tell that the call at "abc123" blew up your data, as well as the exact SQL that was defective.
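As a rough illustration of the kind of log this implies, here is a minimal sketch of a log table plus the insert the client would issue before running each dynamic statement. The table name, columns, and the sample statement are assumptions for illustration only.
CREATE TABLE dbo.ClientSqlLog (
    LogId     int IDENTITY(1,1) PRIMARY KEY,
    CallCode  char(6)       NOT NULL,                 -- e.g. 'abc123': program code + call number
    SqlText   nvarchar(max) NOT NULL,                 -- the dynamic SQL about to be executed
    LoginName sysname       NOT NULL DEFAULT SUSER_SNAME(),
    HostName  sysname       NOT NULL DEFAULT HOST_NAME(),
    LoggedAt  datetime2     NOT NULL DEFAULT SYSDATETIME()
);

-- Called by the application immediately before executing the dynamic statement:
INSERT INTO dbo.ClientSqlLog (CallCode, SqlText)
VALUES ('abc123', N'UPDATE dbo.CriticalTable SET SomeColumn = NULL'); -- illustrative statement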
ADDED COMMENT
Thought of this later: you might be able to add or modify the update trigger on the SQL table to look at the number of rows updated and prevent the update if that number exceeds a threshold that makes sense for you. I did a little searching and found that someone has already written an article on this, as in this snippet:
CREATE TRIGGER [Purchasing].[uPreventWholeUpdate]
ON [Purchasing].[VendorContact]
FOR UPDATE AS
BEGIN
    DECLARE @Count int
    SET @Count = @@ROWCOUNT;

    IF @Count >= (SELECT SUM(row_count)
                  FROM sys.dm_db_partition_stats
                  WHERE OBJECT_ID = OBJECT_ID('Purchasing.VendorContact')
                    AND index_id = 1)
    BEGIN
        RAISERROR('Cannot update all rows', 16, 1)
        ROLLBACK TRANSACTION
        RETURN;
    END
END
Though this is not really the right fix, if you log this appropriately, I bet you can figure out what tried to screw up your data and fix it.
Best of luck
A transaction log explorer should be able to show who executed the command, when, and exactly what the command looked like.
Which log explorer do you use? If you are using ApexSQL Log, you need to enable the connection monitor feature in order to capture additional login details.
This might be like using a sledgehammer to drive in a thumb tack, but have you considered using SQL Server Auditing (provided you are using SQL Server Enterprise 2008 or greater)?
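For what it's worth, a minimal sketch of that approach might look like the following. The database name, table name, and file path are placeholders to adjust to your environment, and database audit specifications require Enterprise edition on SQL Server 2008.
USE master;
CREATE SERVER AUDIT CriticalTableAudit
    TO FILE (FILEPATH = 'C:\SQLAudit\');
ALTER SERVER AUDIT CriticalTableAudit WITH (STATE = ON);

USE YourDatabase;
CREATE DATABASE AUDIT SPECIFICATION CriticalTableUpdateAudit
    FOR SERVER AUDIT CriticalTableAudit
    ADD (UPDATE ON OBJECT::dbo.YourCriticalTable BY public)
    WITH (STATE = ON);

-- Later, read back who updated the table, when, and with what statement:
SELECT event_time, server_principal_name, statement
FROM sys.fn_get_audit_file('C:\SQLAudit\*.sqlaudit', DEFAULT, DEFAULT);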
I have the following query that runs in 16ms - 30ms.
<cfquery name="local.test1" datasource="imagecdn">
SELECT hash FROM jobs WHERE hash in(
'EBDA95630915EB80709C69089315399B',
'3617B8E6CF0C62ECBD3C48DDF8585466',
'D519A38F09FDA868A2FEF1C55C9FEE76',
'135F94C3774F7719CFF8FF3A275D2D05',
'D58FAE69C559273D8427673A08193789',
'2BD7276F209768F2FCA6635659D7922A',
'B1E3CFBFCCFF6F5B48A849A050E6D424',
'2288F5B8A797F5302E8CA24323617236',
'8951883E36B5D38A4643DFAA0396BF13',
'839210BD564E30BE1355D1A6D4EF7081',
'ED4A2CB0C28B608C29576819CF7BE19B',
'CB26925A4874945B810707D5FF0B91F2',
'33B2FC229F0CC797A02AD163CDBA0875',
'624986E7547DBAC0F47B3005CFDE0A16',
'6F692C289BD805CEE41EF59F83F16F4D',
'8551F0033C617BD9EADAAD6CEC4B3E9E',
'94C3C0A74C2DE085FF9F1BBF928821A4',
'28DC1A9D2A69C2EDF5E6C0E6368A0B3C'
)
</cfquery>
If I execute the same query but use cfqueryparam it runs in 500ms - 2000ms.
<cfset local.hashes = "[list of the same ids as above]">
<cfquery name="local.test2" datasource="imagecdn">
SELECT hash FROM jobs WHERE hash in(
<cfqueryparam cfsqltype="cf_sql_varchar" value="#local.hashes#" list="yes">
)
</cfquery>
The table has roughly 60,000 rows. The "hash" column is varchar(50) and has a unique non-clustered index, but is not the primary key. DB server is MSSQL 2008. The web server is running the latest version of CF9.
Any idea why the cfqueryparam causes the performance to bomb out? It behaves this way every single time, no matter how many times I refresh the page. If I pare the list down to only 2 or 3 hashes, it still performs poorly, at around 150-200ms. When I eliminate the cfqueryparam, the performance is as expected. In this situation there is the possibility of SQL injection, and thus using cfqueryparam would certainly be preferable, but it shouldn't take 100ms to find 2 records in an indexed column.
Edits:
We are using hashes generated by hash(), not UUIDs or GUIDs. The hash is generated by hash(SerializeJSON({ struct })), where the struct contains the plan for a set of operations to execute on an image. The purpose of this is that it allows us to know, before insert and before query, the exact unique id for that structure. These hashes act as an "index" of which structures have already been stored in the DB. In addition, with hashes the same structure will hash to the same result, which is not true for UUIDs and GUIDs.
The query is being executed on 5 different CF9 servers and all of them exhibit the same behavior. To me this rules out the idea that CF9 is caching something. All servers are connecting to the exact same DB, so if caching were occurring it would have to be at the DB level.
Your issue may be related to VARCHAR vs NVARCHAR. These 2 links may help:
Querying MS SQL Server G/UUIDs from ColdFusion and
nvarchar vs. varchar in SQL Server, BEWARE
What might be happening is that there is a setting in the ColdFusion Administrator that controls whether cfqueryparam sends varchars as Unicode or not. If that setting does not match the column type (in your case, if that setting is enabled), then MS SQL will not use that index.
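You can check this directly against the table from the question by comparing the two parameter types; this is only a sketch, and whether the mismatch actually degrades to a scan also depends on the column's collation:
DECLARE @hash_unicode nvarchar(50) = N'EBDA95630915EB80709C69089315399B';
DECLARE @hash_ansi    varchar(50)  =  'EBDA95630915EB80709C69089315399B';

-- The nvarchar parameter can force an implicit conversion of the varchar(50) hash column,
-- typically turning the unique index seek into a scan (compare the execution plans):
SELECT hash FROM jobs WHERE hash = @hash_unicode;

-- The varchar parameter matches the column type, so the index can be seeked directly:
SELECT hash FROM jobs WHERE hash = @hash_ansi;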
As Mark points out, it has probably got a bad execution plan in the cache. One of the advantages of cfqueryparam is that when you pass in different values it can reuse the cached plan it has for that statement. This is why, when you try it with a smaller list, you see no improvement. When you do not use cfqueryparam, SQL Server has to work out the execution plan each time. This is normally a bad thing, unless it has a suboptimal plan in the cache. Try clearing the cache as explained here: http://www.devx.com/tips/Tip/14401 - hopefully this will mean that the next time you run your statement with cfqueryparam it will cache the better plan.
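For reference, the blunt way to clear the cache on the database side is DBCC FREEPROCCACHE; a lighter-touch variant finds just the offending plan and evicts it by plan handle. Both are sketches, and clearing the whole cache affects every query on the instance, so use it with care:
-- Flush the entire plan cache:
DBCC FREEPROCCACHE;

-- Or locate the plan for just the problem statement, then pass its plan_handle
-- to DBCC FREEPROCCACHE(<plan_handle>) to evict only that plan:
SELECT cp.plan_handle, st.[text]
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE st.[text] LIKE '%FROM jobs WHERE hash in%';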
Make sense?
I don't think cfqueryparam is causing the issue. Given the big jump in execution time you mention, it may be that the index is not being used for your query when you use cfqueryparam. I created the same scenario on my development computer, but I got the same execution time with and without cfqueryparam. There may be some overhead in using a list, since in the first query you are passing it directly as text and in the second ColdFusion needs to build the query parameters from the provided list, but again, that should not amount to much. I suggest starting SQL Server Profiler and monitoring the queries executed on the server; that will give you a better idea of what is costing the extra 500 ms.
I have used them plenty of times, and I know the difference between a SQL query and a stored procedure:
A SQL query will be compiled every time it is executed.
Stored procedures are compiled only once, when they are
executed for the first time.
This is a general database question.
But one big doubt I have is this. For example:
there is one dynamic piece of work, that is, I pass the userid to the SP and the SP returns the username, password, and full details.
So in this scenario the query has to be executed again anyway, so what is the need for an SP instead of a plain SQL QUERY?
Please clear up this doubt.
Hi, thanks for all your updates, but I don't want the advantages or a comparison; just explain how the SP executes when we do dynamic work.
For example, if I pass userid 10, the SP reads the records for 10; if I pass 14, the SP again looks up the records for 14. A NORMAL SQL QUERY does the same work, only it is executed and fetched at that time, so why should I go for an SP?
Regards
Stored procedures, like the name says, are stored on the database server. They are transmitted to the server and compiled when you create them, and executed when you call them.
Simple SQL queries, on the other hand, are transmitted to the server and compiled each time you use them.
So transmitting a huge query (instead of a simple "execute procedure" command) and compiling it create overhead which can be avoided by using a stored procedure.
MySQL, like other RDBMSs, has a query cache. But this only avoids the compiling, and only if the query is exactly the same as a previously executed query, which means the cache is not used if you execute the same query twice with different values in a WHERE clause, for example.
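To make that concrete, here is a minimal T-SQL sketch of the kind of procedure the question describes; the procedure, table, and column names are illustrative only.
-- Created once; the server stores it and keeps a compiled plan for it:
CREATE PROCEDURE dbo.GetUserDetails
    @UserId int
AS
BEGIN
    SELECT UserName, Password, FullName, Email
    FROM dbo.Users
    WHERE UserId = @UserId;
END;

-- Each call only transmits this short command plus the parameter value;
-- the stored plan is reused whether you pass 10, 14, or any other id:
EXEC dbo.GetUserDetails @UserId = 10;
EXEC dbo.GetUserDetails @UserId = 14;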
I see no reason for a stored procedure simply to query for all user details.
Stored procedures are functional code that you execute on the database server. I can think of three reasons why you'd use them:
To create an interface for users that hides the schema details from clients.
Performance. Extensive calculations on a large data set might be done more efficiently on the database server.
Sometimes it can be difficult (or impossible, depending on your skill) to express what you think you need in a declarative, set-based language like SQL. That's when some people throw up their hands and write stored procs.
Only 1. would be justifiable from your question. I would recommend sticking with SQL.
UPDATE: The new information you provided still does not justify stored procedures in my opinion. A query that returns 14 records is routine.