SQL Server import wizard and temp tables - sql-server

I have a VERY long query, so to make it readable I divided it in two.
Select field_1, field_2
from table
into #temp;
and
Select field_1, field_2, field_1b
from #temp TMP
Inner Join table_2 ON TMP.field_2b = field_2
That works fine in SQL Server Management Studio.
Now I need to make a job that loads this data to another database. I have some import-export wizards projects that work without problem.
In this particular case I can't make the wizard work, it throws an error on #temp.
I tried
set fmtonly off
but I get timeout (the timeout value is set to 0)
Source is SQL Server 2014 (v12)
Destination is SQL Server 2016 (v13)
Any Idea of how can I make this work, my last resource is to make one query out of two, but like to try to maintain some order and readability if possible.
Thank you!

If you want to split your query into two purely for readability purposes, then there are alternative ways of formatting your query, such as a CTE:
WITH t1 AS (
SELECT field_1, field_2
FROM table
)
SELECT t1.field_1, t1.field_2, table_2.field_1b
FROM t1
INNER JOIN table_2 ON t1.field_2 = table_2.field_2b;
While I can't begin to speculate on performance (because I know nothing of your actual query or your underlying schema), this will probably improve performance as well because it removes the overhead of populating a temp table. In general, you shouldn't be compromising performance just to make your query 'readable'.

First and foremost, please post the error message. That is most often the most enlightening point of a question, and it can enable others to help you along.
Since you're using SSIS, please copy/paste what gets printed in the "output" window? That's where your error messages would go.
A few points
fmtonly has a rather different purpose, so if the difference between fmtonly on/off would be whether or not you get headers or headers and data. Please see the link below with the documentation for fmtonly:
https://learn.microsoft.com/en-us/sql/t-sql/statements/set-fmtonly-transact-sql?view=sql-server-2017
Have you tried alternatives to the temp table solution? That seems to be the most likely culprit. Here are some alternative ideas:
using a staging table (permanent table used for ETL); you would truncate it, populate it using the two queries mentioned in your question, pass it along to the destination server and voila!
Using a CTE. This can avoid the temp table issues, though they aren't the easiest to read.
Not breaking up the query for readability. This may not be fun, though it will eliminate the trouble.
I have other points which may help you along, though, without the error message, this is enough to give you a head start.

This is hard to identify problem when you are not posting how you loads the data to another database
But I suggest you use semi-permanent Temp table.
So you create real table at the start of job and drop the said table when the job is done. Remember you can have multiple steps on the job, it's doesn't have to be in the same query.

Related

Getting data of temp table while debugging

While debugging I am unable to watch temp table's value in sql server 2012.I am getting all of my variables value and even can print that but struggling with the temp tables .Is there any way to watch temp table's value?.
SQL Server provides the concept of temporary table which helps the developer in a great way. These tables can be created at runtime and can do the all kinds of operations that one normal table can do. But, based on the table types, the scope is limited. These tables are created inside tempdb database.
While debugging, you can pause the SP at some point, write the select statement in your SP before the DROP table statement, the # table is available for querying.
select * from #temp
I placed this code inside my stored procedure and I am able to see the temp table contents inside the "Locals" window.
INSERT INTO #temptable (columns) SELECT columns FROM sometable; -- populate your temp table
-- for debugging, comment in production
DECLARE #temptable XML = (SELECT * FROM #temptable FOR XML AUTO); -- now view #temptable in Locals window
This works on older SQL Server 2008 but newer versions would probably support a friendlier FOR JSON object. Credit: https://stackoverflow.com/a/6748570/1129926
I know this is old, I've been trying to make this work also where I can view temp table data as I debug my stored procedure. So far nothing works.
I've seen many links to methods on how to the do this, but ultimately they don't work the way a developer would want them to work. For example: suppose one has several processes in the Stored Procedure that updates and modifies data in the same temp table, there is no way to see update on the fly for each process in the SP.
This is a VERY common request, yet no one seems to have a solution other than don't use Stored Procedures for complex processing due how difficult they are to debug. If you're a .NET Core/EF 6 developer and have the correct PK,FK set for the database, one shouldn't really need to use Stored Procedures at all as it can all be handled by EF6 and debug code to view data results in your entities/models directly (usually in web API using models/entities).
Trying to retrieve the data from the tempdb is not possible even with the same connection (as has been suggested).
What is sometimes used is:
PRINT '#temptablename'
SELECT * FROM #temptablename
Dotted thruout the code, you can add a debug flag to the SP and selectively debug the output. NOT ideal at all, but works for many situations.
But this MUST already be in the Stored Procedure before execution (not during). And you must remember to remove the code prior to deployment to a production environment.
I'm surprised in 2022, we still have no solution to this other than don't use complex stored procedures or use .NET Core/EF 6 ... which in my humble opinion is the best approach for 2022 since SSMS and other tools like dbForge and RedGate can't accomplish this either.

inner join Vs scalar Function

Which of the following query is better... This is just an example, there are numerous situations, where I want the user name to be displayed instead of UserID
Select EmailDate, B.EmployeeName as [UserName], EmailSubject
from Trn_Misc_Email as A
inner join
Mst_Users as B on A.CreatedUserID = B.EmployeeLoginName
or
Select EmailDate, GetUserName(CreatedUserID) as [UserName], EmailSubject
from Trn_Misc_Email
If there is no performance benefit in using the First, I would prefer using the second... I would be having around 2000 records in User Table and 100k records in email table...
Thanks
A good question and great to be thinking about SQL performance, etc.
From a pure SQL point of view the first is better. In the first statement it is able to do everything in a single batch command with a join. In the second, for each row in trn_misc_email it is having to run a separate BATCH select to get the user name. This could cause a performance issue now, or in the future
It is also eaiser to read for anyone else coming onto the project as they can see what is happening. If you had the second one, you've then got to go and look in the function (I'm guessing that's what it is) to find out what that is doing.
So in reality two reasons to use the first reason.
The inline SQL JOIN will usually be better than the scalar UDF as it can be optimised better.
When testing it though be sure to use SQL Profiler to view the cost of both versions. SET STATISTICS IO ON doesn't report the cost for scalar UDFs in its figures which would make the scalar UDF version appear better than it actually is.
Scalar UDFs are very slow, but the inline ones are much faster, typically as fast as joins and subqueries
BTW, you query with function calls is equivalent to an outer join, not to an inner one.
To help you more, just a tip, in SQL server using the Managment studio you can evaluate the performance by Display Estimated execution plan. It shown how the indexs and join works and you can select the best way to use it.
Also you can use the DTA (Database Engine Tuning Advisor) for more info and optimization.

CTE vs multiple inserts

As we all know, CTEs were introduced in SQL Server 2005 and the crowd went wild.
I have a case where I'm inserting a whole lot of static data into a table. What I want to know is which of the following is faster and what other factors I should be aware of.
INSERT INTO MyTable (MyField) VALUES ('Hello')
INSERT INTO MyTable (MyField) VALUES ('World')
or
WITH MyCTE(Field1) AS (SELECT 'Hello' UNION SELECT 'World')
INSERT INTO MyTable (MyField) SELECT Field1 FROM MyCTE
I have an uncomfortable feeling the answer will be dependent on things like what triggers exist on MyTable...
(Also, I know and don't care that BULK INSERTing a CSV and any number of other methods are objectively faster and better ways of inserting static data. I specifically want to know the concerns I should be aware of with a CTE vs multiple inserts.)
Not sure which version of SQL Server you're using (2005 or 2008) - but no matter which version you use, I don't see any big benefit in using a CTE in this case for multiple inserts, quite honestly. CTE's are indeed great for a great many situation - but this is not one of them.
So basically, I would suggest you just use several INSERT statements.
In SQL Server 2008, you could simplify those by just specifying multiple values tuples:
INSERT INTO MyTable (MyField)
VALUES ('Hello'), ('World'), ('and outer space')
As always, your table structure, presence (or absence) of indices and triggers does indeed have a significant impact on your INSERT speed. If you need to load a lot of data, it's sometimes easier to turn these constraints and triggers OFF for the duration of the INSERT and then back on - but again: there's really no way to give you a clear indication whether that's the case in your specific situation or not - just too many variables we don't know about that play a siginificant role. Measure it, compare it, make a decision for yourself!

SQL Server query taking up 100% CPU and runs for hours

I have a query that has been running every day for a little over 2 years now and has typically taken less than 30 seconds to complete. All of a sudden, yesterday, the query started taking 3+ hours to complete and was using 100% CPU the entire time.
The SQL is:
SELECT
#id,
alpha.A, alpha.B, alpha.C,
beta.X, beta.Y, beta.Z,
alpha.P, alpha.Q
FROM
[DifferentDatabase].dbo.fnGetStuff(#id) beta
INNER JOIN vwSomeData alpha ON beta.id = alpha.id
alpha.id is a BIGINT type and beta.id is an INT type. dbo.fnGetStuff() is a simple SELECT statement with 2 INNER JOINs on tables in the same DB, using a WHERE id = #id. The function returns approximately 11000 results.
The view vwSomeData is a simple SELECT statement with two INNER JOINs that returns about 590000 results.
Both the view and the function will complete in less than 10 seconds when executed by themselves. Selecting the results of the function into a temporary table first and then joining on that makes the query finish in < 10 seconds.
How do I troubleshoot what's going on? I don't see any locks in the activity manager.
Look at the query plan. My guess is that there is a table scan or more in the execution plan. This will cause huge amounts of I/O for the few record you get in the result.
You could use the SQL Server Profiler tool to monitor what queries are running on SQL Server. It doesn't show the locks, but it can for instance also give you hints on how to improve your query by suggesting indexes.
If you've got a reasonably recent version of SQL Server Management Studio, it has a Database Tuning Adviser as well, under Tools. It takes a trace from the Profiler and makes some, sometimes highly useful, suggestions. Makes sure there's not too many queries - it takes a long time to build advice.
I'm not an expert on it, but have had some luck with it in the past.
Do you need to use a function? Can you re-write the entire thing into a stored procedure in which you pass in the #ID as a parameter.
Even if your table has indexes because you pass the #ID as a variable to the WHERE clause potentially greatly increasing the amount of time for the query to run.
The reason the indexes may not be used is because the Query Analyzer does not know the value of the variables when it selects an access method to perform the query. Because this is a batch file, only one pass is made of the Transact-SQL code, preventing the Query Optimizer from knowing what it needs to know in order to select an access method that uses the indexes.
You might want to consider an INDEX query hint if you cannot re-write the SQL.
it might also be possible, since this just started happening, that the INDEXes have become fragmented and might need to be rebuilt.
I've had similar problems with joining functions that return large datasets. I had to do what you've already suggested. Put the results in a temp table and join on that.
Look at the estimated plan, this will probably shed some light. Typically when query cost gets orders of magnitude more expensive it is because a loop or merge join is being used where a hash join is more appropriate. If you see a loop or merge join in the estimated plan, look at the number of rows it expects to process - is it far smaller than the number of rows you know will actually be in play? You can also specify a hint to use a hash join and see if it performs much better. If so, try updating statistics and see if it goes back to a hash join without a hint.
SELECT
#id,
alpha.A, alpha.B, alpha.C,
beta.X, beta.Y, beta.Z,
alpha.P, alpha.Q
FROM
[DifferentDatabase].dbo.fnGetStuff(#id) beta
INNER HASH JOIN vwSomeData alpha ON beta.id = alpha.id
-- having no idea what type of schema is in place and just trying to throw out ideas:
Like others have said... use Profiler and find the source of pain... but I'm thinking it is the function on the other database. Since that function might be a source of pain, have you thought about a little denormalization or anything on [DifferentDatabase]. I think you'll find a bit more scalability in joining to a more flattened table with indexes than a costly function.
Run this command:
SET SHOWPLAN_ALL ON
Then run your query. It will display the execution plan, look for a "SCAN" on an index or a table. That is most likely what is happening to your query now. If that is the case, try to figure out why it is not using indexes now (refresh statistics, etc)

Query hangs with INNER JOIN on datetime field

We've got a weird problem with joining tables from SQL Server 2005 and MS Access 2003.
There's a big table on the server and a rather small table locally in Access. The tables are joined via 3 fields, one of them a datetime field (containing a day; idea is to fetch additional data (daily) from the big server table to add data to the local table).
Up until the weekend this ran fine every day. Since yesterday we experienced strange non-time-outs in Access with this query. Non-time-out means that the query runs forever with rather high network transfer, but no timeout occurs. Access doesn't even show the progress bar. Server trace tells us that the same query is exectuted over and over on the SQL server without error but without result either. We've narrowed it down to the problem seemingly being accessing server table with a big table and either JOIN or WHERE containing a date, but we're not really able to narrow it down. We rebuilt indices already and are currently restoring backup data, but maybe someone here has any pointers of things we could try.
Thanks, Mike.
If you join a local table in Access to a linked table in SQL Server, and the query isn't really trivial according to specific limitations of joins to linked data, it's very likely that Access will pull the whole table from SQL Server and perform the join locally against the entire set. It's a known problem.
This doesn't directly address the question you ask, but how far are you from having all the data in one place (SQL Server)? IMHO you can expect the same type of performance problems to haunt you as long as you have some data in each system.
If it were all in SQL Server a pass-through query would optimize and use available indexes, etc.
Thanks for your quick answer!
The actual query is really huge; you won't be happy with it :)
However, we've narrowed it down to a simple:
SELECT * FROM server_table INNER JOIN access_table ON server_table.date = local_table.date;
If the server_table is a big table (hard to say, we've got 1.5 million rows in it; test tables with 10 rows or so have worked) and the local_table is a table with a single cell containing a date. This runs forever. It's not only slow, It just does nothing besides - it seems - causing network traffic and no time out (this is what I find so strange; normally you get a timeout, but this just keeps on running).
We've just found KB article 828169; seems to be our problem, we'll look into that. Thanks for your help!
Use the DATEDIFF function to compare the two dates as follows:
' DATEDIFF returns 0 if dates are identical based on datepart parameter, in this case d
WHERE DATEDIFF(d,Column,OtherColumn) = 0
DATEDIFF is optimized for use with dates. Comparing the result of the CONVERT function on both sides of the equal (=) sign might result in a table scan if either of the dates is NULL.
Hope this helps,
Bill
Try another syntax ? Something like:
SELECT * FROM BigServerTable b WHERE b.DateFld in (SELECT DISTINCT s.DateFld FROM SmallLocalTable s)
The strange thing in your problem description is "Up until the weekend this ran fine every day".
That would mean the problem is really somewhere else.
Did you try creating a new blank Access db and importing everything from the old one ?
Or just refreshing all your links ?
Please post the query that is doing this, just because you have indexes doesn't mean that they will be used. If your WHERE or JOIN clause is not sargable then the index will not be used
take this for example
WHERE CONVERT(varchar(49),Column,113) = CONVERT(varchar(49),OtherColumn,113)
that will not use an index
or this
WHERE YEAR(Column) = 2008
Functions on the left side of the operator (meaning on the column itself) will make the optimizer do an index scan instead of a seek because it doesn't know the outcome of that function
We rebuilt indices already and are currently restoring backup data, but maybe someone here has any pointers of things we could try.
Access can kill many good things....have you looked into blocking at all
run
exec sp_who2
look at the BlkBy column and see who is blocking what
Just an idea, but in SQL Server you can attach your Access database and use the table there. You could then create a view on the server to do the join all in SQL Server. The solution proposed in the Knowledge Base article seems problematic to me, as it's a kludge (if LIKE works, then = ought to, also).
If my suggestion works, I'd say that it's a more robust solution in terms of maintainability.

Resources