I'm in process of migrating data from DB2 to SQL Server using linked server and open query, like below:
--SET STATISTICS IO on
-- Number of records are: 18176484
select * INTO [DBName].[DBO].Table1
FROM OPENQUERY(DB2,
'Select * From OPERATIONS.Table1')
This query is taking 9 hrs and 17mins (number of record 18176484) to be inserted.
Is there any other way to insert records more quickly? Can I use "OpenRowSet" function to do the bulk insert? OR an SSIS package will increase the performance and will take less time? Please help
You probably want to export the data to a csv file such as this answer on StackOverflow:
EXPORT TO result.csv OF DEL MODIFIED BY NOCHARDEL SELECT col1, col2, coln FROM testtable;
(Exporting result of select statement to CSV format in DB2)
Once its a CSV file you can import it into SQL Server using either BCP or SSIS both of which are extremely fast especially if you use file lock on the target table.
Related
I have created a package is SSIS. It's working fine for first time insertion. When I am running the package through SQL Server agent jobs, I am getting duplicates inserted when the scheduled job is inserting data.
I don't have any idea about how to stop inserting multiple duplicate records.
I am expecting to remove duplicates insertion while running deployed package through SQL Server Jobs
There are 2 approaches to do that:
(1) using SQL Command
This option can be used if source and destination are on the same server
Since you are using ADO.NET source you can change the Data Access mode to SQL Command and select only data that not exists in the destination:
SELECT *
FROM SourceTable
WHERE NOT EXISTS(
SELECT 1
FROM DestinationTable
WHERE SourceTable.ID = DestinationColumn.ID)
(2) using Lookup Transformation
You can use a Lookup transformation to get the non-matching rows between Source and destination and ignore duplicates:
UNDERSTAND SSIS LOOKUP TRANSFORMATION WITH AN EXAMPLE STEP BY STEP
SSIS - only insert rows that do not exists
SSIS import data or insert data if no match
Implementing Lookup Logic in SQL Server Integration Services
In order to remove duplicates use SQL Task with the following query (assuming that you are not extracting million of rows and you want to remove duplicates on the extracted data, not destination) :
with cte as (
select field1,field2, row_number() over(partition by allfieldsfromPK order by allfieldsfromPK) as rownum)
delete from cte where rownum > 1
Then use a Data Flow Task and insert clean data into destination table.
In case you just want to not insert duplicates , a very good option is to use MERGE statement, a more performant alternative.
I am trying to move tables from access to SQL Server programmatically.
I have some limitation in the system permissions, ie: I cannot use OPENDATASOURCE or OPENROWSET.
What I want to achieve is to transfer some table from Access to SQL Server and then work on that tables through vba (excel)/python and T-SQL.
The problem is in the timing that it is required to move the tables.
My current process is:
I work with vba macros, importing data from excel and making same transformation in access, to then import into the SQL Server
destroy the table in the server: "DROP TABLE"
re-importing the table with DoCmd.TransferDatabase
What I have notice is that the operation seems to be done based on a batch of rows and not directly. It is taking 1 minutes and half each 1000 rows. The same operation on Access it would have taken few seconds.
I understood that it is a specific way of SQL Server to use import by batches of 10 rows, probably to have more access on data: Micorsoft details
But in the above process I just want a copy the table from access to the SQL as fast as possible as then I would avoid cross platform links and I will perform operation only on the SQL Server.
Which would be the faster way to achieve this goal?
Why are functions like OPENDATASOURCE or OPENROWSET are blocked? Do you work in a bank?
I can't say for sure which solution is the absoute fastest, but you may want to consider exporting all Access tables as separate CSV files (or Excel files), and then run a small script to load each of those files into SQL Server.
Here is some VBA code that saves separate tables as separate files.
Dim obj As AccessObject, dbs As Object
Set dbs = Application.CurrentData
For Each obj In dbs.AllTables
If Left(obj.Name, 4) <> "MSys" Then
DoCmd.TransferText acExportDelim, , obj.Name, obj.Name & ".csv", True
DoCmd.TransferSpreadsheet acExport, acSpreadsheetTypeExcel9, obj.Name, obj.Name & ".xls", True
End If
Next obj
Now, you can very easily, and very quickly, load CSV files into SQL Server using Bulk Insert.
Create TestTable
USE TestData
GO
CREATE TABLE CSVTest
(ID INT,
FirstName VARCHAR(40),
LastName VARCHAR(40),
BirthDate SMALLDATETIME)
GO
BULK
INSERT CSVTest
FROM 'c:\csvtest.txt'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
GO
--Check the content of the table.
SELECT *
FROM CSVTest
GO
--Drop the table to clean up database.
DROP TABLE CSVTest
GO
https://blog.sqlauthority.com/2008/02/06/sql-server-import-csv-file-into-sql-server-using-bulk-insert-load-comma-delimited-file-into-sql-server/
Also, you may want to consider one of these options.
https://www.online-tech-tips.com/ms-office-tips/ms-access-to-sql-database/
https://support.office.com/en-us/article/move-access-data-to-a-sql-server-database-by-using-the-upsizing-wizard-5d74c0df-c8cd-4867-8d07-e6e759d72924
I have one table in SQL server and 5 tables in Teradata.I want to join those 5 table in teradata with sql server table and store result in Teradata table.
I have sql server name but i dont know how to simultaneously run a query both on sql server and teradata.
i want to do this:
sql server table query
Select distinct store
from store_Desc
teradata tables:
select cmp_id,state,sde
from xyz
where store in (
select distinct store
from sql server table)
You can create a table (or a volatile table if you do not have write privileges) to do this. Export result from SQL Server as text or into the language of your choice.
CREATE VOLATILE TABLE store_table (
column_1 datatype_1,
column_2 datatype_2,
...
column_n datatype_n);
You may need to add ON COMMIT PRESERVE ROWS before the ; to the above depending on your transaction settings.
From a language you can loop the below or do an execute many.
INSERT INTO store_table VALUES(value_1, value_2, ..., value_n);
Or you can use the import from text using Teradata SQL Assistant by going to File and selecting Import. Then execute the below and navigate to your file.
INSERT INTO store_table VALUES(?, ?, ..., n);
Once you have inserted your data you can query it by simply referencing the table name.
SELECT cmp_id,state,sde
FROM xyz
WHERE store IN(
SELECT store
FROM store_table)
The DISTINCT function is most easily done on export from SQL Server to minimize the rows you need to upload.
EDIT:
If you are doing this many times you can do this with a script, here is a very simple example in Python:
import pyodbc
con_ss = pyodbc.connect('sql_server_odbc_connection_string...')
crs_ss = con_ss.cursor()
con_td = pyodbc.connect('teradata_odbc_connection_string...')
crs_td = con_td.cursor()
# pull data for sql server
data_ss = crs_ss.execute('''
SELECT distinct store AS store
from store_Desc
''').fetchall()
# create table in teradata
crs_td.execute('''
CREATE VOLATILE TABLE store_table (
store DEC(4, 0)
) PRIMARY INDEX (store)
ON COMMIT PRESERVE ROWS;''')
con_td.commit()
# insert values; you can also use an execute many, but this is easier to read...
for row in data_ss:
crs_td.execute('''INSERT INTO store_table VALUES(?)''', row)
con_td.commit()
# get final data
data_td = crs_td.execute('''SELECT cmp_id,state,sde
FROM xyz
WHERE store IN(
SELECT store
FROM store_table);''').fetchall()
# from here write to file or whatever you would like.
Is fetching data from the Sql Server through ODBC an option?
The best option may be to use Teradata Parallel Transporter (TPT) to fetch data from SQL Server using its ODBC operator (as a producer) combined with Load or Update operator as the consumer to insert it into an intermediate table on Teradata. You must then perform rest of the operations on Teradata. For the rest of the operations, you can use BTEQ/SQLA to store the results in the final Teradata table. You can also put the same SQL in TPT's DDL operator instead of BTEQ/SQLA and get it done in a single job script.
To allow use of tables residing on separate DB environments (in your case SQL-Server and Teradata) in a single select statement, Teradata has recently released Teradata Query Grid. But I'm not sure about exact level of support for SQL-Server and it will involve licensing hassle and quite a learning curve to do this simple job.
I have done several SSIS packages over the past few months to move data from a legacy database to a SQL Server database. It normally takes 10-20 minutes to process around 5 millions of records depending on the transformation.
The issue I am experiencing with one of my package is a very poor performance because one of the columns in my destination is of the SQL Server XML data type.
Data comes in like this: 5
A script creates a Unicode string like this: <XmlData><Value>5</Value></XmlData>
Destination is simply a column with XML data type
This is really slow. Any advice?
I did a SQL Trace and notice that in behind the scene SSIS is executing on each row a convert before the insert:
declare #p as xml
set #p=convert(xml,N'<XmlData><Value>5</Value></XmlData>')
Try using a temporary table to store the resulting 5 million records without the XML transformation and then use SQL Server itself to move them from tempDB to the final destination:
INSERT INTO final_destination (...)
SELECT cast(N'<XmlData><Value>5</Value></XmlData>' AS XML) AS batch_converted_xml, col1, col2, colX
FROM #tempTable
If 5.000.000 turns to be too much data for a single batch, you can do it in smaller batches (100k lines should work like a charm).
The record captured by the profiler looks like an OleDB transformation with one command per line.
How can I import CSV file data into SQL Server 2000 table? I need to insert data from CSV file to table twice a day. Table has more then 20 fields but I only need to insert value into 6 fields.
i face same problem before i can suggest start reading here. The author covers:"This is very common request recently – How to import CSV file into SQL Server? How to load CSV file into SQL Server Database Table? How to load comma delimited file into SQL Server? Let us see the solution in quick steps."
I need to insert data from CSV file to table twice a day.
Use DTS to perform the import, then schedule it.
For SQL 2000, I would use DTS. You can then shedule this as a job when your happy with it.
Below is a good Microsoft link explaining how to use it.
Data Transformation Services (DTS)
You describe two distinct problems:
the CSV import, and
the extraction of data into only those 6 fields.
So break your solution down into two steps:
import the CSV into a raw staging table, and
then insert into your six 'live' fields from that staging table.
There is a function for the first part, called BULK INSERT, the syntax looks like this:
BULK INSERT target_staging_table_in_database
FROM 'C:\Path_to\CSV_file.csv'
WITH
(
DATAFILETYPE = 'CHAR'
,FIRSTROW = 2
,FIELDTERMINATOR = ','
,ROWTERMINATOR = '\n'
);
Adjust to taste, and consult the docs for more options. You might also want to TRUNCATE or DELETE FROM your staging table before doing the bulk insert so you don't have any old data in there.
Once you get the information into the database, doing an UPDATE or INSERT into those six fields should be straightforward.
You can make of use SQL Server Integration services(SSIS). It's jusy one time task to create the Package. Next time onwards just run that package.
You can also try Bulk Insert as daniel explained.
You can also try Import export wizard in SQL Server 2000.