Query when value trails with .b decreases speed by 30x-100x - sql-server

Consider table testTable, a table with six fields: one of them a UNIQUEIDENTIFIER, one a TIMESTAMP and four of them VARCHARs. Field Filename is VARCHAR.
This first query takes 1 minutes 38 seconds
Select top 1 * from testTable WHERE Filename = 'any.string.1512.b'
Either of these queries takes 1-3 seconds
Select top 1 * from testTable WHERE Filename = 'any.string.1512'
Select top 1 * from testTable WHERE Filename like 'cusip.realloc.1412.b%'
I have looked at the execution plan for all three and the only difference is that the last query (the LIKE statement) used 46% index seek\54% Key Lookup vs a 50\50 index\key lookup for the first two. As far as I can tell as soon as I no longer use the .b part of this search criterium, the queries go back to normal speed.
FileName has been indexed; table has been removed and recreated just in case. We have added indexes, removed indexes, check table, checked database, restart services, restart server, recreated the table. This field used to be VARCHAR(MAX) and I changed it to VARCHAR(100) to index it, but the problem was occurring before making this change.
Something else that I believe may be happening is that there might be something wrong with the end of the table. It will never complete a full:
Select * from testTable
I hoped it was a corrupted table but that wasn't the case. However when we attempt to generate a script in SSMS it fails to generate (no error given). I was able to recreate it by using another SQL client and generating the structure from SSMS and data copy from the other SQL client.
We are pretty stumped.

Related

How does SQL Server handle failed query to linked server?

I have a stored procedure that relies on a query to a linked server.
This stored procedure is roughly structured as follows:
-- Create local table var to stop query from needing round trips to linked server
DECLARE #duplicates TABLE (eid NVARCHAR(6))
INSERT INTO #duplicates(eid)
SELECT eid FROM [linked_server].[linked_database].[dbo].[linked_table]
WHERE es = 'String'
-- Update on my server using data from linked server
UPDATE [my_server].[my_database].[dbo].[my_table]
-- Many things, including
[status] = CASE
WHEN
eid IN (
SELECT eid FROM #duplicates
)
THEN 'String'
ELSE es
END
FROM [my_server].[another_database].[dbo].[view]
-- This view obscures sensitive information and shows only the data that I have permission to see
-- Many other things
The query itself is much more complex, but the key idea is building this temporary table from a linked server (because it takes the query 5 minutes to run if I don't, versus 3 seconds if I do).
I've recently had an issue where I ended up with updates to my table that failed to get checked against the linked server for duplicate information.
The logical chain of events is this:
Get all of the data from the original view
The original view contains maybe 3000 records, of which maybe 30 are
duplicates of the entity in question, but with 1 field having a
different value.
I then have to grab data from a different server to know which of
the duplicates is the correct one.
When the stored procedure runs, it updates each record.
ERROR STEP - when the stored procedure hits a duplicate record, it
updates my_table again - so es gets changed multiple times in a row.
The temp table was added after the fact when we realized incorrect es values were being introduced to my_table.
'my_database` does not contain the data needed to determine which is the correct tuple, hence the requirement for the linked server.
As far as I can tell, we had a temporary network interruption or a connection timeout that stopped my_server from getting the response back from linked_server, and it just passed an empty table to the rest of the procedure.
So, my question is - how can I guard against this happening?
I can't just check if the table is empty, because it could legitimately be empty. I need to definitively know if that initial SELECT from linked_server failed, if it timed out, or if it intentionally returned nothing.
without knowing the definition of the table you're querying you could get into an issue where your data is to long and you get a truncation error on your table.
Better make sure and substring it...
DECLARE #duplicates TABLE (eid NVARCHAR(6))
INSERT INTO #duplicates(eid)
SELECT SUBSTRING(eid,1,6) FROM [linked_server].[linked_database].[dbo].[linked_table]
WHERE es = 'String'
-- Update on my server using data from linked server
UPDATE [my_server].[my_database].[dbo].[my_table]
-- Many things, including
[status] = CASE
WHEN
eid IN (
SELECT eid FROM #duplicates
)
THEN 'String'
ELSE es
END
FROM [my_server].[another_database].[dbo].[view]
I had a similar problem where I needed to move data between servers, could not use a network connection so I ended up doing BCP out and BCP in. This is fast, clean and takes away the complexity of user authentication, drivers, trust domains. also it's repeatable and can be used for incremental loading.

SQL Server System.OutOfMemoryException

I am generating XML from 60 tables, and storing this xml in a table.
Table Name : Final_XML_Table
PK FK XML_Content (type xml)
1 1 "XML that I am generating from 60 tables"
When I am running below query , it gives memory exception :
Select * from Final_XML_Table
Things I have tried :
1. Results to text : I am getting only few lines from XML as text in output window
2. Results to file : I am getting only few lines from XML in file.
Please suggest, and also if there is any change , will I have to do this on server's SQL server as well while deployment.
I have also set XML_Data to unlimited :
This is not an answer, but to much for a comment...
The fact, that you are able to store the XML, shows clearly, that the XML is not to big for the database.
The fact that you get an out-of-memory exception with Select * from Final_XML_Table shows clearly, that SSMS has a problem on reading/displaying your XML.
You might try to check the length like here:
DECLARE #tbl TABLE (x XML);
INSERT INTO #tbl VALUES('<root><test>blah</test><test /><test2><x/></test2></root>');
SELECT * FROM #tbl; --This does not work for you
SELECT DATALENGTH(x) FROM #tbl; --This returns just "82" in this case
Might be, that due to a logical error in your XML's creation (a wrong join?) the XML contains multiple/repeated elements. You might try a query like this to get a count of nodes in order to check if this number is realistic:
SELECT x.value('count(//*)','int') FROM #tbl
For the exampe above this returns "5"
You might do the same with your original XML.
With a query like the following you can retrieve all node names of the first level, the second level and so on. You can check if this looks okay:
SELECT firstLevel.value('local-name(.)','varchar(max)') AS l1_node
,SecondLevel.value('local-name(.)','varchar(max)') AS l2_node
--add more
FROM #tbl
OUTER APPLY x.nodes('/*') AS A(firstLevel)
OUTER APPLY A.firstLevel.nodes('*') AS B(SecondLevel)
--add more
And - of course - you might open the ResourceMonitor to look at the actual usage of memory...
Come back with more details...
That error isn't a SQL Server error, it's from SSMS. It means that SSMS has run out of memory.
SSMS is only a 32bit application, so can only address 2GB of RAM. If it tries to address more than that, the error will occur. if you've had SSMS open and returned some very large datasets, that RAM is going to get used up.
In all honesty, if you're running a query like SELECT * FROM Final_XML_Table then I would hazard a guess that the dataset is huge. Add a WHERE clause, or don't return the dataset on screen. if you really need to view the data (all of it), export it to something else. But I very much doubt you need to look at every row, if you're returning around 2GB of data.

Insert from select or update from select with commit every 1M records

I've already seen a dozen such questions but most of them get answers that doesn't apply to my case.
First off - the database is am trying to get the data from has a very slow network and is connected to using VPN.
I am accessing it through a database link.
I have full write/read access on my schema tables but I don't have DBA rights so I can't create dumps and I don't have grants for creation new tables etc.
I've been trying to get the database locally and all is well except for one table.
It has 6.5 million records and 16 columns.
There was no problem getting 14 of them but the remaining two are Clobs with huge XML in them.
The data transfer is so slow it is painful.
I tried
insert based on select
insert all 14 then update the other 2
create table as
insert based on select conditional so I get only so many records and manually commit
The issue is mainly that the connection is lost before the transaction finishes (or power loss or VPN drops or random error etc) and all the GBs that have been downloaded are discarded.
As I said I tried putting conditionals so I get a few records but even this is a bit random and requires focus from me.
Something like :
Insert into TableA
Select * from TableA#DB_RemoteDB1
WHERE CREATION_DATE BETWEEN to_date('01-Jan-2016') AND to_date('31-DEC-2016')
Sometimes it works sometimes it doesn't. Just after a few GBs Toad is stuck running but when I look at its throughput it is 0KB/s or a few Bytes/s.
What I am looking for is a loop or a cursor that can be used to get maybe 100000 or a 1000000 at a time - commit it then go for the rest until it is done.
This is a one time operation that I am doing as we need the data locally for testing - so I don't care if it is inefficient as long as the data is brought in in chunks and a commit saves me from retrieving it again.
I can count already about 15GBs of failed downloads I've done over the last 3 days and my local table still has 0 records as all my attempts have failed.
Server: Oracle 11g
Local: Oracle 11g
Attempted Clients: Toad/Sql Dev/dbForge Studio
Thanks.
You could do something like:
begin
loop
insert into tablea
select * from tablea#DB_RemoteDB1 a_remote
where not exists (select null from tablea where id = a_remote.id)
and rownum <= 100000; -- or whatever number makes sense for you
exit when sql%rowcount = 0;
commit;
end loop;
end;
/
This assumes that there is a primary/unique key you can use to check if a row int he remote table already exists in the local one - in this example I've used a vague ID column, but replace that with your actual key column(s).
For each iteration of the loop it will identify rows in the remote table which do not exist in the local table - which may be slow, but you've said performance isn't a priority here - and then, via rownum, limit the number of rows being inserted to a manageable subset.
The loop then terminates when no rows are inserted, which means there are no rows left in the remote table that don't exist locally.
This should be restartable, due to the commit and where not exists check. This isn't usually a good approach - as it kind of breaks normal transaction handling - but as a one off and with your network issues/constraints it may be necessary.
Toad is right, using bulk collect would be (probably significantly) faster in general as the query isn't repeated each time around the loop:
declare
cursor l_cur is
select * from tablea#dblink3 a_remote
where not exists (select null from tablea where id = a_remote.id);
type t_tab is table of l_cur%rowtype;
l_tab t_tab;
begin
open l_cur;
loop
fetch l_cur bulk collect into l_tab limit 100000;
forall i in 1..l_tab.count
insert into tablea values l_tab(i);
commit;
exit when l_cur%notfound;
end loop;
close l_cur;
end;
/
This time you would change the limit 100000 to whatever number you think sensible. There is a trade-off here though, as the PL/SQL table will consume memory, so you may need to experiment a bit to pick that value - you could get errors or affect other users if it's too high. Lower is less of a problem here, except the bulk inserts become slightly less efficient.
But because you have a CLOB column (holding your XML) this won't work for you, as #BobC pointed out; the insert ... select is supported over a DB link, but the collection version will get an error from the fetch:
ORA-22992: cannot use LOB locators selected from remote tables
ORA-06512: at line 10
22992. 00000 - "cannot use LOB locators selected from remote tables"
*Cause: A remote LOB column cannot be referenced.
*Action: Remove references to LOBs in remote tables.

error when insert into linked server

I want to insert some data on the local server into a remote server, and used the following sql:
select * into linkservername.mydbname.dbo.test from localdbname.dbo.test
But it throws the following error
The object name 'linkservername.mydbname.dbo.test' contains more than the maximum number of prefixes. The maximum is 2.
How can I do that?
I don't think the new table created with the INTO clause supports 4 part names.
You would need to create the table first, then use INSERT..SELECT to populate it.
(See note in Arguments section on MSDN: reference)
The SELECT...INTO [new_table_name] statement supports a maximum of 2 prefixes: [database].[schema].[table]
NOTE: it is more performant to pull the data across the link using SELECT INTO vs. pushing it across using INSERT INTO:
SELECT INTO is minimally logged.
SELECT INTO does not implicitly start a distributed transaction, typically.
I say typically, in point #2, because in most scenarios a distributed transaction is not created implicitly when using SELECT INTO. If a profiler trace tells you SQL Server is still implicitly creating a distributed transaction, you can SELECT INTO a temp table first, to prevent the implicit distributed transaction, then move the data into your target table from the temp table.
Push vs. Pull Example
In this example we are copying data from [server_a] to [server_b] across a link. This example assumes query execution is possible from both servers:
Push
Instead of connecting to [server_a] and pushing the data to [server_b]:
INSERT INTO [server_b].[database].[schema].[table]
SELECT * FROM [database].[schema].[table]
Pull
Connect to [server_b] and pull the data from [server_a]:
SELECT * INTO [database].[schema].[table]
FROM [server_a].[database].[schema].[table]
I've been struggling with this for the last hour.
I now realise that using the syntax
SELECT orderid, orderdate, empid, custid
INTO [linkedserver].[database].[dbo].[table]
FROM Sales.Orders;
does not work with linked servers. You have to go onto your linked server and manually create the table first, then use the following syntax:
INSERT INTO [linkedserver].[database].[dbo].[table]
SELECT orderid, orderdate, empid, custid
FROM Sales.Orders
WHERE shipcountry = 'UK';
I've experienced the same issue and I've performed the following workaround:
If you are able to log on to remote server where you want to insert data with MSSQL or sqlcmd and rebuild your query vice-versa:
so from:
SELECT * INTO linkservername.mydbname.dbo.test
FROM localdbname.dbo.test
to the following:
SELECT * INTO localdbname.dbo.test
FROM linkservername.mydbname.dbo.test
In my situation it works well.
#2Toad: For sure INSERT INTO is better / more efficient. However for small queries and quick operation SELECT * INTO is more flexible because it creates the table on-the-fly and insert your data immediately, whereas INSERT INTO requires creating a table (auto-ident options and so on) before you carry out your insert operation.
I may be late to the party, but this was the first post I saw when I searched for the 4 part table name insert issue to a linked server. After reading this and a few more posts, I was able to accomplish this by using EXEC with the "AT" argument (for SQL2008+) so that the query is run from the linked server. For example, I had to insert 4M records to a pseudo-temp table on another server, and doing an INSERT-SELECT FROM statement took 10+ minutes. But changing it to the following SELECT-INTO statement, which allows the 4 part table name in the FROM clause, does it in mere seconds (less than 10 seconds in my case).
EXEC ('USE MyDatabase;
BEGIN TRY DROP TABLE TempID3 END TRY BEGIN CATCH END CATCH;
SELECT Field1, Field2, Field3
INTO TempID3
FROM SourceServer.SourceDatabase.dbo.SourceTable;') AT [DestinationServer]
GO
The query is run on DestinationServer, changes to right database, ensures the table does not already exist, and selects from the SourceServer. Minimally logged, and no fuss. This information may already out there somewhere, but I hope it helps anyone searching for similar issues.

SQL Server 2000: search through out database

Some how some records in my table are getting updated with value of xyz in a certain column. Out of hundred of stored procedures, functions, triggers, how can I determine which code is doing this action. Is there a way to search through the database some how through each and every script of the code?
Please help.
One approach is to check syscomments
Contains entries for each view, rule,
default, trigger, CHECK constraint,
DEFAULT constraint, and stored
procedure within the database. The
text column contains the original SQL
definition statements..
e.g. select text from syscomments
If you are having trouble finding that literal string, the values could be coming from a table, or they could be being concatenated within a routine.
Try this
Select text from syscomments
where CharIndex('x', text) > 0
and CharIndex('y', text) > 0
and CharIndex('z', text) > 0
That might help you either find the right routine, or further indicate that the values are coming from a table.
This is going to be nearly impossible to do in SQL Server 2000 because the update might very well be from a variable that has that value or a join to another table that has that value and not hard-coded into the stored proc, trigger etc. The update could also be coming from a DTS package, a job, a piece of dynamic code run by the app or even from query analyzer, so the code itself may not be recorded inthe datbase anywhere.
Perhaps a better approach might be to create an audit table for the table in question and have it record the user and the code from the spid that generated the change as well as the old and new values. You'll have to wait until it happens again, but then you would know exactly what changed the value and what value to put it back to if need be.
Alternatively you could run profiler on the system until it happens but profiler tends to hurt performance and is not usually a good idea to run on a production system. If it is happening very often, it might be an acceptable alternative.
Here's a hint as to how you might get some of the info you want for the eventual trigger code you write:
create table #temp (eventtype nvarchar (1000), parameters int, eventinfo nvarchar (4000), myspid int)
declare #myspid int
select #myspid =##spid
insert #temp (eventtype,parameters, eventinfo)
exec ('dbcc inputbuffer (##spid)')
update #temp
set myspid = #myspid
select hostname, program_name, eventinfo
from #temp t
join sysprocesses s on t.myspid = s.spid
WHERE spid = #myspid
You might use sql-profiler to trac the update of a given table / column.

Resources