Importing a txt file into SQL Server with a where clause - sql-server

I have a .txt file which is 6.00 GB. It is tab-delimited, so when I load it into SQL Server the column delimiter is a tab.
I need to load that .txt file into the database, but I don't need all the rows from the 6.00 GB file. I need to be able to use a condition like
select *
into <my table>
where column5 in ('ab', 'cd')
but this is a text file, and I am not able to load it into the database with that condition.
Can anyone help me with this?

Have you tried the BULK INSERT command? Take a look at this solution:
--Create temporary table
CREATE TABLE #BulkTemporary
(
    Id int,
    Value varchar(10)
)

--BULK INSERT has no WHERE clause
BULK INSERT #BulkTemporary FROM 'D:\Temp\File.txt'
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n')

--Filter results
SELECT * INTO MyTable FROM #BulkTemporary WHERE Value IN ('Row2', 'Row3')

--Drop temporary table
DROP TABLE #BulkTemporary
Hope this helps.

Just do a BULK INSERT into a staging table and from there move the data you actually want into a production table. The WHERE clause is for doing something based on a specific condition inside SQL Server, not for loading data into SQL Server.
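If you would rather filter during the load itself instead of afterwards, an alternative is OPENROWSET(BULK ...), which reads the file through a format file and does accept a WHERE clause on the SELECT. A minimal sketch, assuming a format file D:\Temp\File.xml (hypothetical path) that describes the Id and Value columns:
-- Filters while reading the file; only matching rows are written to the table.
-- The format file path is an assumption; it must describe both columns.
INSERT INTO MyTable (Id, Value)
SELECT t.Id, t.Value
FROM OPENROWSET(BULK 'D:\Temp\File.txt',
                FORMATFILE = 'D:\Temp\File.xml') AS t
WHERE t.Value IN ('Row2', 'Row3');
Note this still scans the whole 6 GB file; it just avoids materializing the unwanted rows in a staging table.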

Related

Bulk Insert with dynamic mapping in SQL Server

Currently, I'm using the BULK INSERT statement to read the CSV files and add all rows to the SQL table.
BULK INSERT tablename
FROM 'D:\Import Files\file.csv'
WITH (
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a');
But now I have a dynamic mapping for each file, stored in a table (file field name = database field name).
Mapping Table:
FileId  FileFieldName  DBFieldName
1       Order-Id       orderid
1       Order-Date     orderdate
2       Id             orderid
2       Orderedon      orderdate
I want to map the file field names to the database fields and import the rows into the SQL table.
How can I achieve dynamic mapping with the BULK INSERT statement in SQL Server?
Write your dynamic mapping from the table out to an XML format file, then use this syntax for BULK INSERT:
BULK INSERT tablename
FROM 'D:\Import Files\file.csv'
WITH (FORMATFILE = 'D:\BCP\myFirstImport.xml');
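As a sketch, the format file generated for FileId 1 (Order-Id → orderid, Order-Date → orderdate) might look something like this; the terminators, field lengths, and SQL types here are assumptions to adapt to the actual file:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <RECORD>
    <!-- Physical layout of the file: two comma-separated fields per line -->
    <FIELD ID="1" xsi:type="CharTerm" TERMINATOR=","  MAX_LENGTH="12"/>
    <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\n" MAX_LENGTH="30"/>
  </RECORD>
  <ROW>
    <!-- Mapping of file fields to table columns, driven by your mapping table -->
    <COLUMN SOURCE="1" NAME="orderid"   xsi:type="SQLINT"/>
    <COLUMN SOURCE="2" NAME="orderdate" xsi:type="SQLDATE"/>
  </ROW>
</BCPFORMAT>
You would regenerate (or select) the appropriate format file per FileId before each load.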

Using CSV to update records in pre-existing SQL Server table

My SQL Server 2012 table is CUST_TABLE. It already has a lot of customer records (more than 10,000).
I have a CSV whose first column is the customer number, which is my primary key. The second column has email addresses. The first row of the CSV contains the column headings custnum and email. The CSV has 125 data rows. I am using SSMS and want to update just those 125 customer records and change their emails.
The only solution I found was to write UPDATE statements to change the data. Is there an easier way to do this, such as using the Import Data function (right-click the database, then hover over Tasks)? Thank you.
Read the CSV into a temp table, then update your table using the temp table.
For example:
USE yourdb;
GO
IF OBJECT_ID('tempdb.dbo.#tmp', 'U') IS NOT NULL
    DROP TABLE #tmp;
GO
CREATE TABLE #tmp (
    t_cust_nr NVARCHAR(MAX),
    t_email NVARCHAR(MAX)
);
SET NOCOUNT ON;
-- Read the csv, skip the first row
BULK INSERT #tmp
FROM 'C:\path\to\your.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
-- Trim whitespace
UPDATE #tmp
SET t_cust_nr = LTRIM(RTRIM(t_cust_nr)),
    t_email = LTRIM(RTRIM(t_email));
-- Add your update statement here...
-- You also might have to cast the t_cust_nr to a diff. data type if needed.
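-- For example, a sketch of that update, assuming the target columns are
-- CUST_TABLE.custnum (int) and CUST_TABLE.email (adjust names/types to your schema):
UPDATE c
SET c.email = t.t_email
FROM dbo.CUST_TABLE AS c
JOIN #tmp AS t
    ON c.custnum = CAST(t.t_cust_nr AS INT);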
SET NOCOUNT OFF;
DROP TABLE #tmp;

BULK INSERT not inserting properly from CSV

I am trying to use BULK INSERT to add rows to an existing table from a .csv file. For now I have a small file for testing purposes with the following formatting:
UserID,Username,Firstname,Middlename,Lastname,City,Email,JobTitle,Company,Manager,StartDate,EndDate
273,abc,dd,dd,dd,dd,dd,dd,dd,dd,dd,dd
274,dfg,dd,dd,dd,dd,dd,dd,dd,dd,dd,dd
275,hij,dd,dd,dd,dd,dd,dd,dd,dd,dd,dd
And this is what my query currently looks like:
BULK INSERT DB_NAME.dbo.Users
FROM 'C:\data.csv'
WITH
(
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)
When I execute this query it returns "1 row affected". I checked the table and noticed that all the data from the file was inserted as a single row.
What could be causing this? What I am trying to accomplish is to insert each line of the file as an individual row in the table.
The first column is actually an IDENTITY column, so in the file I just specified an integer even though it will be overwritten by the auto-generated ID, as I am not sure yet how to tell the query to start inserting from the second field.
There are more columns in the actual table than in the file, as not everything needs to be filled. Could that be causing it?
The problem is that you are loading data into the first column. To skip a column, create a view over your table with just the columns you want to load, and BULK INSERT into the view. See the example below (from MSDN: https://msdn.microsoft.com/en-us/library/ms179250.aspx):
USE AdventureWorks2012;
GO
CREATE VIEW v_myTestSkipCol AS
SELECT Col1, Col3
FROM myTestSkipCol;
GO
BULK INSERT v_myTestSkipCol
FROM 'C:\myTestSkipCol2.dat'
WITH (FORMATFILE = 'C:\myTestSkipCol2.xml');
GO
What I would recommend instead is to create a staging table which matches the file exactly, load the data into that, and then use an INSERT statement to copy it into your permanent table. This approach is much more robust and flexible. For example, after loading the staging table you can perform some data validation or cleanup before loading the permanent table. A minimal sketch of that approach for this file follows; the staging table name and varchar sizes are placeholders:
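-- Staging table matching the file layout exactly (all varchar for simplicity)
CREATE TABLE dbo.Users_Staging (
    UserID     varchar(10),  Username  varchar(50), Firstname varchar(50),
    Middlename varchar(50),  Lastname  varchar(50), City      varchar(50),
    Email      varchar(100), JobTitle  varchar(50), Company   varchar(50),
    Manager    varchar(50),  StartDate varchar(20), EndDate   varchar(20)
);

BULK INSERT dbo.Users_Staging
FROM 'C:\data.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- Copy everything except UserID; the IDENTITY column generates its own values
INSERT INTO DB_NAME.dbo.Users
    (Username, Firstname, Middlename, Lastname, City, Email,
     JobTitle, Company, Manager, StartDate, EndDate)
SELECT Username, Firstname, Middlename, Lastname, City, Email,
       JobTitle, Company, Manager, StartDate, EndDate
FROM dbo.Users_Staging;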

update a table in SSIS periodically

How do I do regular updates to a database table in SSIS? The table has foreign key constraints.
I have a package running every week, and I have to update the data in the table from a flat file. Most of the contents are the same, with some updated values and some new rows.
UPDATE: My data file contains updated contents (some rows missing, some rows added, some modified). The data file does not have the primary keys (I create the primary keys when I first bulk insert the data from the data file). On subsequent SSIS package runs, I need to update the table with the new data file contents.
e.g.
table
---------------------------------------------
1  Mango   $0.99
2  Apple   $0.59
3  Orange  $0.33

data file
---------------------------------------------
Mango   0.79
Kiwi    0.45
Banana  0.54
How would I update the table with data from the file. The table has foreign key constraints with other tables.
Another approach, which loads the data as a set instead of dealing with it row by row:
On the database side:
Create a staging table (e.g. StagingTable with [name], [price]).
Create a procedure (you may need to change the object names, and add transaction control and error handling; this is just a draft):
create procedure spLoadData
as
begin
    update DestinationTable
    set DestinationTable.Price = StagingTable.Price
    from DestinationTable
    join StagingTable
        on DestinationTable.Name = StagingTable.Name

    insert into DestinationTable (Name, Price)
    select Name, Price
    from StagingTable
    where not exists (select 1
                      from DestinationTable
                      where DestinationTable.Name = StagingTable.Name)
end
On the SSIS side:
Execute SQL Task that truncates the staging table (TRUNCATE TABLE [staging_table_name]).
Data Flow Task transferring from your flat file to the staging table.
Execute SQL Task calling the procedure you created (spLoadData).
Here are a few thoughts/steps:
Create a Flat File connection manager.
Add a Data Flow Task.
Create a Flat File Source with the connection manager just created.
Add as many Lookup transformations as you need to get FK values based on your source file values.
After all the above lookups, add one more Lookup transformation to get the existing values from the destination table.
Add a Conditional Split and compare the source values with the destination values.
If all columns match, then UPDATE; else INSERT.
Map the Conditional Split outputs accordingly to an OLE DB Command (UPDATE) / OLE DB Destination (INSERT).
Give it a try and let me know the results.
