Excel - SQL Query - ## Temp Table - sql-server

I am trying to create a global temp table using the results from one query, which can then be selected as a table and manipulated further several times without having to reprocess the data over and over.
This works perfectly in SQL management studio, but when I try to add the table through an Excel query, the table can be referenced at that time, but it is not created in Temporary Tables in the tempdb database.
I have broken it down into a simple example.
If I run this in SQL Server Management Studio, the result of 1 is returned as expected, and the table ##testtable1 is created in Temporary Tables:
set nocount on;
select 1 as 'Val1', 2 as 'Val2' into ##testtable1
select Val1 from ##testtable1
I can then run another select on this table, even in a different session, as you'd expect. E.g.
Select Val2 from ##testtable1
If I don't drop ##testtable1, running the below in a query in Excel returns the result of 2 as you'd expect.
Select Val2 from ##testtable1
However, if I run the same Select... into ##testtable1 query directly in Excel, that correctly returns the result of 1, but the temp table is not created.
If I then try to run
Select Val2 from ##testtable1
As a separate query, it errors saying "Invalid object name '##testtable1'".
The table is not listed within Temporary Tables in SQL management studio.
It is as if it is performing a drop on the table after the query has finished executing, even though I am not calling a drop.
How can I resolve this?

Read up on global temp tables (GTTs). A GTT persists only as long as at least one session references it. In SSMS, if you close the session that created the GTT before using it in another session, the GTT is discarded. This is what is happening in Excel: it creates a connection, executes the query, and disconnects. Since no session is using the GTT when Excel disconnects, the GTT is discarded.
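The lifetime rule described above can be seen in miniature with Python's built-in sqlite3 module. This is only an analogy: SQLite's TEMP tables are private to one connection (closer to a SQL Server # table than a ## table), but they are dropped in the same way once the creating connection closes. The function name and the demo.db file name are made up for illustration.

```python
import os
import sqlite3
import tempfile


def temp_table_lifetime(db_path):
    """Return (seen_on_creating_connection, seen_after_reconnect)."""
    first = sqlite3.connect(db_path)
    # Rough analogue of: select 1 as Val1, 2 as Val2 into ##testtable1
    first.execute("CREATE TEMP TABLE testtable1 AS SELECT 1 AS Val1, 2 AS Val2")
    seen_before = first.execute("SELECT Val2 FROM testtable1").fetchone() == (2,)
    first.close()  # like Excel disconnecting once its query has finished

    second = sqlite3.connect(db_path)
    try:
        second.execute("SELECT Val2 FROM testtable1")
        seen_after = True
    except sqlite3.OperationalError:  # SQLite reports "no such table"
        seen_after = False
    finally:
        second.close()
    return seen_before, seen_after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        print(temp_table_lifetime(os.path.join(folder, "demo.db")))
```

The second connection plays the role of the separate Excel query: the table is gone by the time that connection asks for it.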
I would highly recommend you create a normal table rather than use a GTT. Because of their temporary nature and dependence on an active session, you may get inconsistent results when using a GTT. If you create a normal table instead, you can be certain it will still exist when you try to use it later.
The code to create/clean the table is pretty simple.
IF OBJECT_ID('db.schema.tablename') IS NOT NULL
TRUNCATE TABLE [tablename]
ELSE
CREATE TABLE [tablename]...
GO
You can change the truncate to a delete to clean up a specific set of data and place it at the start of each one of your queries.
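A runnable analogue of this create-or-clean pattern, sketched with Python's sqlite3 rather than SQL Server. The table name staging_results and its columns are hypothetical, and because SQLite has no TRUNCATE, a DELETE stands in for it.

```python
import os
import sqlite3
import tempfile


def reset_staging(conn):
    """Create the permanent table if it is missing, otherwise empty it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging_results (val1 INTEGER, val2 INTEGER)"
    )
    conn.execute("DELETE FROM staging_results")  # stand-in for TRUNCATE TABLE
    conn.commit()


def write_then_read(db_path):
    """Fill the table on one connection, then read it on a brand-new one."""
    writer = sqlite3.connect(db_path)
    reset_staging(writer)
    writer.execute("INSERT INTO staging_results VALUES (1, 2)")
    writer.commit()
    writer.close()  # the table outlives the connection that filled it

    reader = sqlite3.connect(db_path)
    rows = reader.execute("SELECT val1, val2 FROM staging_results").fetchall()
    reader.close()
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        print(write_then_read(os.path.join(folder, "demo.db")))
```

Running write_then_read twice against the same file still yields a single row, because each run cleans the table before refilling it.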

Is it possible you could use a view? Assuming that you are connecting to 5 DBs on the same server, you can union the data together in a view:
CREATE VIEW [dbo].[testView]
AS
SELECT *
FROM database1.dbo.myTable
UNION
SELECT *
FROM database2.dbo.myTable
Then in Excel:
Data > New Query > From Database > From SQL Server Database
Enter the DB server
Select the view from the appropriate DB - done :)
OR call the view however you are doing it (e.g. VBA etc.)
Equally, you could use a stored procedure and call that from VBA. Basically, anything that moves more of the complexity to the server side will make your life easier :D

You can absolutely do this. Notice how I'm building a temp table from SQL called 'TmpSql' ...this could be any query you want. Then I set it to recordset 1. Then I create another recordset 2, that goes and gets the temp table data.
Imagine you were looping on the first cn.Execute while TmpSql changes. This allows you to build a temporary table from many sources or changing variables. This is a powerful solution.
cn.open "Provider= ..."
sql = "Select t.* Into #TTable From (" & TmpSql & ") t "
Set rs1 = cn.Execute(sql)
GetTmp = "Select * From #TTable"
rs2.Open GetTmp, cn, adOpenDynamic, adLockBatchOptimistic
If Not rs2.EOF Then Call Sheets("Data").Range("A2").CopyFromRecordset(rs2)
rs2.Close
If rs1.State = adStateOpen Then rs1.Close ' SELECT ... INTO returns no rows, so rs1 may already be closed
cn.Close

Related

How to do an inner join rather than for each loop in SSIS?

On the ETL server I have a DW user table.
On the prod OLTP server I have the sales database. I want to pull the sales only for users that are present in the user table on the ETL server.
Presently I am using an Execute SQL Task to fetch the DW users into an SSIS System.Object variable, then using a Foreach Loop to go through each item (userid) in this variable and, via a Data Flow Task, fetch the OLTP sales for each user and dump them into the DW staging table. The Foreach Loop is taking a long time to run.
I want to be able to do an inner join so that the response is quicker, but I can't do this since the tables are on separate servers. Neither can I use a global temp table to make the inner join, for the same reason.
I tried collecting the DW users into a comma-separated string variable and then using it (via string_split) to query the OLTP database, but this also takes more time at the pre-execute phase (not sure why exactly), even for a small number of users.
I am also aware of the Lookup transform, but that too will result in all OLTP rows being brought to the DW ETL server to test the lookup condition.
Is there any alternate approach to be able to do an inner join by taking the list of users into the source?
Note: I do not have write permissions on the OLTP db.
Based on the comments, I think we can use a temporary table to solve this.
Can you help me understand this restriction? "Neither can I use a global temp table to make the inner join, for the same reason."
The restriction is that the OLTP server and the DW server are separate, so there can't be a global temp table common to both servers. Hope that makes sense.
The general pattern we're going to do is
Execute SQL Task to create a temporary table on the OLTP server
A Data Flow task to populate the new temporary table. Source = DW. Destination = OLTP. Ensure Delay Validation = True
Modify existing Data Flow. Modify source to be a query that uses the temporary table i.e. SELECT S.* FROM oltp.sales AS S WHERE EXISTS (SELECT * FROM #SalesPerson AS SP WHERE SP.UserId = S.UserId); Ensure Delay Validation = True
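The three steps above can be sketched end to end with Python's built-in sqlite3 module, using two in-memory databases as stand-ins for the DW and OLTP servers. This is an analogy, not SSIS: the Users and Sales tables, their columns, and the function names are hypothetical, and on SQL Server the approach assumes the OLTP login is allowed to create temp tables.

```python
import sqlite3


def filtered_sales(dw, oltp):
    """Copy the DW user ids into a temp table on OLTP, then filter sales there."""
    user_ids = dw.execute("SELECT UserId FROM Users").fetchall()

    # Step 1: create the temporary table on the OLTP connection
    oltp.execute("CREATE TEMP TABLE SalesPerson (UserId INTEGER PRIMARY KEY)")
    # Step 2: populate it with the rows read from the DW
    oltp.executemany("INSERT INTO SalesPerson (UserId) VALUES (?)", user_ids)
    # Step 3: let the source system do the filtering
    rows = oltp.execute(
        "SELECT S.SaleId, S.UserId FROM Sales AS S "
        "WHERE EXISTS (SELECT 1 FROM SalesPerson AS SP WHERE SP.UserId = S.UserId) "
        "ORDER BY S.SaleId"
    ).fetchall()
    oltp.execute("DROP TABLE SalesPerson")
    return rows


def demo():
    dw = sqlite3.connect(":memory:")    # stands in for the ETL/DW server
    oltp = sqlite3.connect(":memory:")  # stands in for the prod OLTP server
    dw.execute("CREATE TABLE Users (UserId INTEGER)")
    dw.executemany("INSERT INTO Users VALUES (?)", [(1,), (3,)])
    oltp.execute("CREATE TABLE Sales (SaleId INTEGER, UserId INTEGER)")
    oltp.executemany(
        "INSERT INTO Sales VALUES (?, ?)", [(10, 1), (11, 2), (12, 3), (13, 1)]
    )
    try:
        return filtered_sales(dw, oltp)
    finally:
        dw.close()
        oltp.close()


if __name__ == "__main__":
    print(demo())
```

Only the sales rows for users 1 and 3 come back, so the row for user 2 never leaves the OLTP side. All three steps run on the same OLTP connection, because the temp table would disappear if that connection were closed in between.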
A long form answer on using temporary tables (global to set the metadata, regular thereafter)
I don't use temp table in SSIS
Temporary tables live in tempdb. Your OLTP and DW connection managers likely do not point to tempdb. To be able to reference a temporary table, local or global, in SSIS you need to do one of two things. Either define an additional connection manager for the same server that points explicitly at tempdb so you can use the drop down in the source/destination components (technically accurate but dumb), or use an SSIS Variable to hold the name of the table and use the ~From Variable~ named option in the source/destination component (best option, maximum flexibility).
Soup to nuts example
I will use WideWorldImporters as my OLTP system and WideWorldImportersDW as my DW system.
One-time task
Open SQL Server Management Studio, SSMS, and connect to your OLTP system. Define a global temporary table with a unique name and the expected structure. Leave your connection open so the table structure remains intact during initial development.
I used the following statement.
DROP TABLE IF EXISTS ##SO_70530036;
CREATE TABLE ##SO_70530036(EmployeeId int NOT NULL);
Keep track of your query because we'll use it later on but as I advocate in my SSIS answers, perform the smallest task, test that it works and then go on to the next. It's the only way to debug.
Connection Managers
Define two OLE DB Connection Managers. WWI_DW points to the named instance DEV2019UTF8 and WWI_OLTP points to DEV2019EXPRESS. Right click on WWI_OLTP and select Properties. Find the property RetainSameConnection and flip that from the default of False to True. This ensures the same connection is used throughout the package. As temporary tables go out of scope when the connection goes away, closing and reopening a connection in a package will result in a fatal error.
These two databases are on different instances, so we can't cheat and directly commingle data.
Variables
Define 4 variables in SSIS, all of type String.
TempTableName - I used a value of ##SO_70530036 but use whatever value you specified in the One-time task section.
QuerySourceEmployees - This will be the query you run to generate the candidate set of data to go into the temporary table. I used SELECT TOP (3) E.[WWI Employee ID] AS EmployeeId FROM Dimension.Employee AS E WHERE E.[Is SalesPerson] = CAST(1 AS bit);
QueryDefineTables - Remember the drop/create statements from the one-time task? We're going to use the essence of them but use the expression builder to let us dynamically swap the table name. I clicked the ellipsis, ..., on the Expression section and used the following: "DROP TABLE IF EXISTS " + @[User::TempTableName] + "; CREATE TABLE " + @[User::TempTableName] + "( EmployeeId int NOT NULL);" You should be able to copy the Value from the row and paste it into SSMS to confirm it works.
QuerySales - This is the actual query you're going to use to pull your filtered set of sales data. Again, we'll use the Expression to allow us to dynamically reference the temporary table name. The prettified version of the expression would look something like
"SELECT
SI.InvoiceID
, SI.SalespersonPersonID
, SO.OrderID
, SOL.StockItemID
, SOL.Quantity
, SOL.OrderLineID
FROM
Sales.Invoices AS SI
INNER JOIN
Sales.Orders AS SO
ON SO.OrderID = SI.OrderID
INNER JOIN
Sales.OrderLines AS SOL
ON SO.OrderID = SOL.OrderID
WHERE
EXISTS (SELECT * FROM " + @[User::TempTableName] + " AS TT WHERE TT.EmployeeID = SI.SalespersonPersonID);"
Again, you should be able to pull the Value from the three queries and run them independently and verify they work.
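To check the generated statements outside the SSIS designer, the same string splicing can be mimicked in a few lines of Python. This is a hypothetical helper, not part of SSIS: build_queries and the name check are assumptions added for illustration, while the SQL text is copied from the expressions above.

```python
import re

# Accept only plain local (#name) or global (##name) temp table identifiers.
_TEMP_NAME = re.compile(r"^#{1,2}[A-Za-z_][A-Za-z0-9_]*$")


def build_queries(temp_table_name):
    """Splice the table name into each statement, as the SSIS expressions do."""
    if not _TEMP_NAME.match(temp_table_name):
        raise ValueError("not a simple temp table name: %r" % temp_table_name)
    define = (
        "DROP TABLE IF EXISTS {0}; "
        "CREATE TABLE {0}( EmployeeId int NOT NULL);"
    ).format(temp_table_name)
    sales = (
        "SELECT SI.InvoiceID, SI.SalespersonPersonID, SO.OrderID, "
        "SOL.StockItemID, SOL.Quantity, SOL.OrderLineID "
        "FROM Sales.Invoices AS SI "
        "INNER JOIN Sales.Orders AS SO ON SO.OrderID = SI.OrderID "
        "INNER JOIN Sales.OrderLines AS SOL ON SO.OrderID = SOL.OrderID "
        "WHERE EXISTS (SELECT * FROM {0} AS TT "
        "WHERE TT.EmployeeID = SI.SalespersonPersonID);"
    ).format(temp_table_name)
    return {"QueryDefineTables": define, "QuerySales": sales}


if __name__ == "__main__":
    for name, text in build_queries("##SO_70530036").items():
        print(name, "=>", text)
```

Switching between a global and a local temporary table then only means changing the one name passed in, which mirrors changing the TempTableName variable in the package.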
Execute SQL Task
Add an Execute SQL task to the Control Flow. I named mine SQL Create temporary table. My Connection Manager is WWI_OLTP, I changed the SQLSourceType to Variable, and the SourceVariable is User::QueryDefineTables.
Every time your package runs, the first thing it will do is create the temporary table. That is good because SSIS is a metadata driven ETL engine and the next two steps would fail if the table didn't exist.
Data Flow Task - Prime the pump
This data flow is where we'll transfer DW data back to the OLTP system so we can filter in the source system.
Drag a Data Flow Task onto the Control Flow. I named mine DFT Load Temp and before you click into it, right click on the Task and find the DelayValidation property and change this from the default of False to True. Normally, a package validates all metadata before actual execution begins as the idea is you want to know everything is good before any data starts moving. Since we're using temporary tables, we need to tell the execution engine "trust us, it'll be ready"
Double click inside the Data Flow Task.
Add an OLE DB Source. I named mine OLESRC SourceEmployees. I use the connection manager WWI_DW. I change the data access mode to SQL command from variable and then select my variable User::QuerySourceEmployees.
Add an OLE DB Destination. I named mine OLEDST TempTableName and double clicked to configure it. The Connection Manager is WWI_OLTP and again, since the table lives in tempdb, we can't select it from the drop down. Change the Data access mode to Table name or view name variable - fast load and then select your variable name User::TempTableName. Click the Mapping tab and ensure source columns map to destination columns.
Data Flow Task - Transfer data
Finally, we will pull our source data, nicely filtered against the data from our target system.
Add an OLE DB Source. I named it OLESRC QuerySales. The Connection Manager is WWI_OLTP. Data access mode again changes to SQL command from variable and the variable name is User::QuerySales
From here, do whatever else you need to do to make the magic happen.
Instead of having 270k rows with an unfiltered query, I have 67k as there are only 3 employees in the temporary table.
Reference package
But wait, there's more!
Close out Visual Studio, open it back up and try to touch something in the data flows. Suddenly, there are red Xs everywhere! Any time you close a data flow component, it fires a revalidate metadata operation and guess what, it can't do that as the connection to the temporary table is gone.
The package will run fine, it will not throw VS_NEEDSNEWMETADATA but editing/maintenance becomes a pain.
If you switched from global temporary table to local, switch the table name variable's value back to a global and then run the define statement in SSMS. Once that's done, then you can continue editing the package.
I assure you, the local temporary table does work once you have the metadata set and you use queries via variables for source/destination.
No need for the global temporary table hack, or the SET FMTONLY OFF hack (which no longer works).
Just specify the result set metadata in the SQL query with WITH RESULT SETS, e.g.:
EXEC ('
create table #t
(
ID INT,
Name VARCHAR(150),
Number VARCHAR(15)
)
insert into #t (Id, Name, Number)
select object_id, name, 12
from sys.objects
select * from #t
')
WITH RESULT SETS
(
(
ID INT,
Name VARCHAR(150),
Number VARCHAR(15)
)
)
If you need to parameterize the query, there's a bit of a catch because there are some limitations in how SSIS discovers parameters. SSIS runs sp_describe_undeclared_parameters, which doesn't really work with batches that call sp_executesql, because sp_executesql has a very unique way it handles parameters, one which you couldn't replicate with a user stored procedure.
So to parameterize the query you'll either need to pass the parameter values into the query using the "query from variable" and SSIS expressions, or push all this TSQL into a stored procedure.

How to speed up tables transfer between Access and SQL Server using VBA?

I am trying to move tables from Access to SQL Server programmatically.
I have some limitations in system permissions, i.e. I cannot use OPENDATASOURCE or OPENROWSET.
What I want to achieve is to transfer some tables from Access to SQL Server and then work on those tables through VBA (Excel)/Python and T-SQL.
The problem is the time required to move the tables.
My current process is:
I work with VBA macros, importing data from Excel and making some transformations in Access, to then import into SQL Server
destroy the table on the server: "DROP TABLE"
re-import the table with DoCmd.TransferDatabase
What I have noticed is that the operation seems to be done in batches of rows and not directly. It takes a minute and a half per 1,000 rows. The same operation in Access would have taken a few seconds.
I understand that importing in batches of 10 rows is specific to how SQL Server works, probably to allow more access to the data: Microsoft details
But in the above process I just want to copy the table from Access to SQL Server as fast as possible, as I would then avoid cross-platform links and perform operations only on SQL Server.
Which would be the fastest way to achieve this goal?
Why are functions like OPENDATASOURCE or OPENROWSET blocked? Do you work in a bank?
I can't say for sure which solution is the absolute fastest, but you may want to consider exporting all Access tables as separate CSV files (or Excel files), and then run a small script to load each of those files into SQL Server.
Here is some VBA code that saves separate tables as separate files.
Dim obj As AccessObject, dbs As Object
Set dbs = Application.CurrentData
For Each obj In dbs.AllTables
If Left(obj.Name, 4) <> "MSys" Then
DoCmd.TransferText acExportDelim, , obj.Name, obj.Name & ".csv", True
DoCmd.TransferSpreadsheet acExport, acSpreadsheetTypeExcel9, obj.Name, obj.Name & ".xls", True
End If
Next obj
Now, you can very easily, and very quickly, load CSV files into SQL Server using Bulk Insert.
--Create TestTable
USE TestData
GO
CREATE TABLE CSVTest
(ID INT,
FirstName VARCHAR(40),
LastName VARCHAR(40),
BirthDate SMALLDATETIME)
GO
BULK
INSERT CSVTest
FROM 'c:\csvtest.txt'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
GO
--Check the content of the table.
SELECT *
FROM CSVTest
GO
--Drop the table to clean up database.
DROP TABLE CSVTest
GO
https://blog.sqlauthority.com/2008/02/06/sql-server-import-csv-file-into-sql-server-using-bulk-insert-load-comma-delimited-file-into-sql-server/
Also, you may want to consider one of these options.
https://www.online-tech-tips.com/ms-office-tips/ms-access-to-sql-database/
https://support.office.com/en-us/article/move-access-data-to-a-sql-server-database-by-using-the-upsizing-wizard-5d74c0df-c8cd-4867-8d07-e6e759d72924
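Since the question mentions Python as one of the available tools, the "export to CSV, then load in bulk" idea can also be sketched there. The example below uses the built-in sqlite3 module as a stand-in for SQL Server, so it shows the shape of the approach (one batched insert inside one transaction instead of row-by-row round trips), not measured SQL Server performance. The load_csv helper and the sample rows are made up for illustration.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text):
    """Insert all CSV rows with one executemany call inside one transaction."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row holds the column names
    columns = ", ".join('"%s"' % name.replace('"', '""') for name in header)
    marks = ", ".join("?" for _ in header)
    quoted_table = '"%s"' % table.replace('"', '""')
    sql = "INSERT INTO %s (%s) VALUES (%s)" % (quoted_table, columns, marks)
    with conn:  # commit once at the end, not once per row
        cursor = conn.executemany(sql, reader)
    return cursor.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CSVTest "
        "(ID INT, FirstName VARCHAR(40), LastName VARCHAR(40), BirthDate TEXT)"
    )
    sample = (
        "ID,FirstName,LastName,BirthDate\n"
        "1,Ann,Example,1980-01-01\n"
        "2,Bob,Example,1985-06-15\n"
    )
    inserted = load_csv(conn, "CSVTest", sample)
    names = [r[0] for r in conn.execute("SELECT FirstName FROM CSVTest ORDER BY ID")]
    conn.close()
    return inserted, names


if __name__ == "__main__":
    print(demo())
```

In practice the text would come from the files written by the VBA export, for example by reading each .csv file and passing its contents to load_csv.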

Verifying Syntax of a Massive SQL Update Command

I'm new to SQL Server and am doing some cleanup of our transaction database. However, to accomplish the last step, I need to update a column in one table of one database with the value from another column in another table from another database.
I found a SQL update code snippet and re-wrote it for our own needs but would love someone to give it a once over before I hit the execute button since the update will literally affect hundreds of thousands of entries.
So here are the two databases:
Database 1: Movement
Table 1: ItemMovement
Column 1: LongDescription (datatype: text / up to 40 char)
Database 2: Item
Table 2: ItemRecord
Column 2: Description (datatype: text / up to 20 char)
Goal: set Column1 from db1 to the value of Column2 from db2.
Here is the code snippet:
update table1
set table1.longdescription = table2.description
from movement..itemmovement as table1
inner join item..itemrecord as table2 on table1.itemcode = table2.itemcode
where table1.longdescription <> table2.description
I added the last "where" line to prevent SQL from updating the column where it already matches the source table.
This should execute faster and just update the columns that have garbage. But as it stands, does this look like it will run? And lastly, is it a straightforward process, using SQL Server 2005 Express to just backup the entire Movement db before I execute? And if it messes up, just restore it?
Alternatively, is it even necessary to re-cast the tables as table1 and table 2? Is it valid to execute a SQL query like this:
update movement..itemmovement
set itemmovement.longdescription = itemrecord.description
from movement..itemmovement
inner join item..itemrecord on itemmovement.itemcode = itemrecord.itemcode
where itemmovement.longdescription <> itemrecord.description
Many thanks in advance!
You don't necessarily need to alias your tables, but I recommend you do, for faster typing and to reduce the chances of making a typo.
update m
set m.longdescription = i.description
from movement..itemmovement as m
inner join item..itemrecord as i on m.itemcode = i.itemcode
where m.longdescription <> i.description
In the above query I have shortened the alias using m for itemmovement and i for itemrecord.
When a large number of records are to be updated and there's a question whether it would succeed or not, always make a copy in a test database (residing on a test server) and try it out over there. In this case, one of the safest bets would be to create a new field first and call it longdescription_test. You can make it with SQL Server Management Studio Express (SSMS) or using the command below:
use movement;
alter table itemmovement add longdescription_test varchar(100);
The syntax here says alter table itemmovement and add a new column called longdescription_test with datatype of varchar(100). If you create a new column using SSMS, in the background, SSMS will run the same alter table statement to create a new column.
You can then execute
update m
set m.longdescription_test = i.description
from movement..itemmovement as m
inner join item..itemrecord as i on m.itemcode = i.itemcode
where m.longdescription <> i.description
Check data in longdescription_test randomly. You can actually do a spot check faster by running:
select * from movement..itemmovement
where longdescription <> longdescription_test
and longdescription_test is not null
If information in longdescription_test looks good, you can change your update statement to set m.longdescription = i.description and run the query again.
It is easier to just create a copy of your itemmovement table before you do the update. To make a copy, you can just do:
use movement;
select * into itemmovement_backup from itemmovement;
If update does not succeed as desired, you can truncate itemmovement and copy data back from itemmovement_backup.
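The backup-then-update routine can be rehearsed on toy data before touching the real tables. The sketch below uses Python's sqlite3 as a stand-in for SQL Server, so the UPDATE is written with a correlated subquery instead of T-SQL's UPDATE ... FROM. The sample rows and the update_with_backup helper are made up. It also counts the rows that should change beforehand, so the number can be compared with what the UPDATE reports.

```python
import sqlite3

# Rows whose description differs from the matching itemrecord row.
DIFFERS = (
    "EXISTS (SELECT 1 FROM itemrecord AS i "
    "WHERE i.itemcode = itemmovement.itemcode "
    "AND i.description <> itemmovement.longdescription)"
)


def update_with_backup(conn):
    """Back up the table, count the rows to change, update, return both counts."""
    conn.execute("DROP TABLE IF EXISTS itemmovement_backup")
    conn.execute("CREATE TABLE itemmovement_backup AS SELECT * FROM itemmovement")
    expected = conn.execute(
        "SELECT COUNT(*) FROM itemmovement WHERE " + DIFFERS
    ).fetchone()[0]
    updated = conn.execute(
        "UPDATE itemmovement SET longdescription = "
        "(SELECT i.description FROM itemrecord AS i "
        " WHERE i.itemcode = itemmovement.itemcode) "
        "WHERE " + DIFFERS
    ).rowcount
    conn.commit()
    return expected, updated


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE itemrecord (itemcode TEXT PRIMARY KEY, description TEXT)")
    conn.execute("CREATE TABLE itemmovement (itemcode TEXT, longdescription TEXT)")
    conn.executemany(
        "INSERT INTO itemrecord VALUES (?, ?)",
        [("A", "Nut"), ("B", "Bolt"), ("C", "Screw")],
    )
    conn.executemany(
        "INSERT INTO itemmovement VALUES (?, ?)",
        [("A", "garbage"), ("B", "Bolt"), ("C", "old")],
    )
    counts = update_with_backup(conn)
    current = conn.execute(
        "SELECT longdescription FROM itemmovement ORDER BY itemcode"
    ).fetchall()
    conn.close()
    return counts, current


if __name__ == "__main__":
    print(demo())
```

One thing the rehearsal makes easy to test: a `<>` comparison never matches a NULL, so rows whose longdescription is NULL would be skipped by this WHERE clause, in SQL Server as well as here.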
Zedfoxus provided a GREAT explanation of this and I appreciate it. It is an excellent reference for next time around. After reading over some syntax examples, I was confident enough to run the second SQL update query that I have in my OP. Luckily, the data here is not necessarily "live", so there was low risk of damaging anything, even during operating hours. Given the nature of the data, the update executed perfectly, updating all 345,000 entries!

Connection scoped temp tables across stored procedures

I'm working on a data virtualization solution. The user is able to write his own SQL queries as filters for a query I make. I would like to avoid running this filter query every time I select something from the database (it will likely be a complex series of joins).
My idea was to use a # temp table at script level and keep the connection alive. This #temp table would then be selected from but updated only when the user changes the filter. The idea is that I can actually use it from stored procedures and the table is scoped to that connection.
I got the idea from someone who suggested using dynamic SQL and ## global temp tables named with the connection process ID, so that each connection has a unique global temp table. This was to overcome the problem of sharing temp tables across stored procedures. But it seems a bit clumsy.
I did a quick test with the code below and it seemed to work fine:
-- Run script at connection open from some app
SELECT * INTO #test
FROM dataTable
-- Now we can use stored procedures with #test table
EXECUTE selectFromTempTable
EXECUTE updateTempTable @sqlFilterString
EXECUTE selectFromTempTable
The only real problem I can see is that the connection has to be kept alive for the duration, which can be a few hours. A single user can have multiple connections running at the same time. The number of users on a single database server would be at most about 20.
If it's a huge issue, I could make the application close and open connections as needed so each user only has 1 connection open at a time, and maybe even close it when not in use and reopen it when needed again, with the delay of having to wait for the query to run.
Would this be bad practice? Or would it kill any performance benefit from not running the filter query? This is on SQL Server 2008 and up.
I think I would create a permanent table, using the spid (process ID) as a key value. Each connection has its own process ID, so anyone can use it to identify their entries in the table:
create table filter(
spid int,
filternum int,
filterstring varchar(255),
<other cols> );
create unique index filterindx on filter(spid, filternum);
Then when a user creates filter entries:
delete from filter where spid = @@spid
insert into filter(spid, filternum, filterstring) select @@spid, 1, 'some sql thing'
insert into filter(spid, filternum, filterstring) select @@spid, 2, 'some other sql thing'
Then you can access each user's filter values by selecting where spid = @@spid etc
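The delete-then-insert routine for a keyed permanent table can be tried out with Python's sqlite3. This is a sketch under assumptions: SQLite has no counterpart to SQL Server's session id, so the id is passed in explicitly, and the values 51 and 52 as well as the helper names are arbitrary.

```python
import sqlite3


def set_filters(conn, session_id, filters):
    """Replace every filter row that belongs to one session."""
    with conn:  # the delete and the inserts commit together
        conn.execute("DELETE FROM filter WHERE spid = ?", (session_id,))
        conn.executemany(
            "INSERT INTO filter (spid, filternum, filterstring) VALUES (?, ?, ?)",
            [(session_id, n, text) for n, text in enumerate(filters, start=1)],
        )


def get_filters(conn, session_id):
    """Return one session's filter strings in filternum order."""
    rows = conn.execute(
        "SELECT filterstring FROM filter WHERE spid = ? ORDER BY filternum",
        (session_id,),
    )
    return [row[0] for row in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE filter (spid INT, filternum INT, filterstring VARCHAR(255))"
    )
    conn.execute("CREATE UNIQUE INDEX filterindx ON filter (spid, filternum)")
    set_filters(conn, 51, ["some sql thing", "some other sql thing"])
    set_filters(conn, 52, ["another session's filter"])
    set_filters(conn, 51, ["replacement filter"])  # touches only session 51
    result = get_filters(conn, 51), get_filters(conn, 52)
    conn.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Deleting a session's rows before inserting new ones also matters because SQL Server reuses session ids, so a fresh connection could otherwise find rows left behind by an earlier one.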

How to merge table from access to SQL Express?

I have one table named "Staff" in Access and also have this table (same name) in SQL Server 2008.
Both tables have thousands of records. I want to merge records from the Access table into the SQL table without affecting the existing records in SQL. Normally, I just export using the ODBC driver, and that works fine if the table doesn't exist in SQL Server. Please advise. Thanks.
A simple append query from the local access table to the linked sql server table should work just fine in this case.
So, just drop in the first (from) table into the query builder. Then change the query type to append, and you are prompted for the append table name.
From that point on, just drop in the columns you want (do not drop in the PK column, as they need not be used nor transferred in this case).
You can also type in the sql directly in the query builder. Either way, you will wind up with something like:
INSERT INTO dbo_custsql
( ADMINID, Amount, Notes, Status )
SELECT ADMINID, Amount, Notes, Status
FROM custsql1;
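If some Access rows may already exist on the server, the append can skip them by checking a business key first. The sketch below shows that variation with Python's sqlite3. It assumes a hypothetical StaffCode column that identifies a staff member in both tables, and the table names Staff_access and Staff_sql are stand-ins for the local and linked tables.

```python
import sqlite3


def append_new_staff(conn):
    """Append source rows whose StaffCode is not already in the target."""
    with conn:  # commit the append as one unit
        cursor = conn.execute(
            "INSERT INTO Staff_sql (StaffCode, FullName) "
            "SELECT a.StaffCode, a.FullName FROM Staff_access AS a "
            "WHERE NOT EXISTS "
            "(SELECT 1 FROM Staff_sql AS s WHERE s.StaffCode = a.StaffCode)"
        )
    return cursor.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Staff_sql (StaffCode TEXT, FullName TEXT)")
    conn.execute("CREATE TABLE Staff_access (StaffCode TEXT, FullName TEXT)")
    conn.execute("INSERT INTO Staff_sql VALUES ('S1', 'Ann')")
    conn.executemany(
        "INSERT INTO Staff_access VALUES (?, ?)",
        [("S1", "Ann (edited in Access)"), ("S2", "Bob")],
    )
    added = append_new_staff(conn)
    rows = conn.execute(
        "SELECT StaffCode, FullName FROM Staff_sql ORDER BY StaffCode"
    ).fetchall()
    conn.close()
    return added, rows


if __name__ == "__main__":
    print(demo())
```

The existing S1 row keeps its server-side value and only S2 is appended, which matches the requirement of not affecting records that are already in SQL Server.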
This may help: http://www.red-gate.com/products/sql-development/sql-compare/
Or you could write a simple program to read from each data set and do the comparison, adding, updating, and deleting, etc.

Resources