SQL insert: ignore bad records

How can I write the INSERT statement below in SQL Server so that the good rows are still inserted when some rows are bad, instead of the whole statement failing and no rows being created at all?
CREATE TABLE sql_server_test_a
(
ID NVARCHAR(4000),
FIRST_NAME NVARCHAR(200),
LAST_NAME NVARCHAR(200),
Member_ID INT
);
INSERT INTO sql_server_test_a (ID, FIRST_NAME, LAST_NAME, Member_ID)
VALUES ('1', 'Paris', 'Hilton', twelve),
('2', 'Nicky', 'Hilton', 24);
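
A multi-row INSERT ... VALUES is a single statement, so it succeeds or fails as a whole. One possible workaround, sketched here under assumptions, is to load the raw values into an all-text staging table first and copy across only the rows that convert cleanly. The staging table #staging_a and its Member_ID width are hypothetical, the bad value is assumed to arrive as the text 'twelve', and TRY_CONVERT requires SQL Server 2012 or later.

```sql
-- Target table from the question; created only if it does not exist yet.
IF OBJECT_ID(N'sql_server_test_a') IS NULL
    CREATE TABLE sql_server_test_a
    (
        ID NVARCHAR(4000),
        FIRST_NAME NVARCHAR(200),
        LAST_NAME NVARCHAR(200),
        Member_ID INT
    );

-- Hypothetical staging table: every column is text, so no row is rejected on load.
CREATE TABLE #staging_a
(
    ID NVARCHAR(4000),
    FIRST_NAME NVARCHAR(200),
    LAST_NAME NVARCHAR(200),
    Member_ID NVARCHAR(50)
);

INSERT INTO #staging_a (ID, FIRST_NAME, LAST_NAME, Member_ID)
VALUES (N'1', N'Paris', N'Hilton', N'twelve'),
       (N'2', N'Nicky', N'Hilton', N'24');

-- TRY_CONVERT returns NULL instead of raising an error when the text is not an INT,
-- so the WHERE clause filters out the bad rows.
INSERT INTO sql_server_test_a (ID, FIRST_NAME, LAST_NAME, Member_ID)
SELECT ID, FIRST_NAME, LAST_NAME, TRY_CONVERT(INT, Member_ID)
FROM #staging_a
WHERE TRY_CONVERT(INT, Member_ID) IS NOT NULL;

DROP TABLE #staging_a;
```

Note that this filter also skips rows whose Member_ID is genuinely NULL; if those should be kept, the WHERE clause needs an extra condition.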

Related

SQL Server : alternatives to INTERSECT

I am trying to implement a name search feature via a SQL Server stored procedure.
Three tables are involved with the following definitions:
Employee table (other columns removed for brevity)
CREATE TABLE Payroll.Employee
(
EmployeeID INT NOT NULL IDENTITY(1,1),
EmployeeName NVARCHAR(50) NOT NULL ,
CONSTRAINT PK_Employee PRIMARY KEY CLUSTERED (EmployeeID),
);
Names table (single names stored with a unique KeyNameID)
CREATE TABLE Payroll.KeyName
(
KeyNameID INT NOT NULL IDENTITY(1,1),
KeyName NVARCHAR(20) NOT NULL ,
RwVersion ROWVERSION NOT NULL,
CONSTRAINT PK_KeyName PRIMARY KEY CLUSTERED (KeyNameID),
CONSTRAINT UC_KeyName UNIQUE (KeyName)
);
EmployeeName (employee's name stored using the id for each single name)
CREATE TABLE Payroll.EmployeeName
(
EmployeeNameID INT NOT NULL IDENTITY(1,1),
KeyNameID INT NOT NULL,
EmployeeID INT NOT NULL,
CONSTRAINT PK_EmployeeName PRIMARY KEY CLUSTERED (EmployeeNameID),
CONSTRAINT UC_EmployeeName UNIQUE (KeyNameID,EmployeeID)
);
The Employee table has the following rows:
INSERT INTO Employee (EmployeeID, EmployeeName) VALUES (1, 'ayub kassim');
INSERT INTO Employee (EmployeeID, EmployeeName) VALUES (2, 'eric yuda');
INSERT INTO Employee (EmployeeID, EmployeeName) VALUES (3, 'james kassim');
Each of the above names is split and stored in the KeyName table as follows:
INSERT INTO KeyName (KeyNameID, KeyName) VALUES (1, 'ayub');
INSERT INTO KeyName (KeyNameID, KeyName) VALUES (2, 'eric');
INSERT INTO KeyName (KeyNameID, KeyName) VALUES (3, 'james');
INSERT INTO KeyName (KeyNameID, KeyName) VALUES (4, 'kassim');
INSERT INTO KeyName (KeyNameID, KeyName) VALUES (5, 'yuda');
The KeyNameIDs are then used to identify each employee's single name as follows (two records per employee):
INSERT INTO EmployeeName (EmployeeNameID, KeyNameID, EmployeeID) VALUES (1, 1, 1);
INSERT INTO EmployeeName (EmployeeNameID, KeyNameID, EmployeeID) VALUES (2, 4, 1);
INSERT INTO EmployeeName (EmployeeNameID, KeyNameID, EmployeeID) VALUES (3, 2, 2);
INSERT INTO EmployeeName (EmployeeNameID, KeyNameID, EmployeeID) VALUES (4, 5, 2);
INSERT INTO EmployeeName (EmployeeNameID, KeyNameID, EmployeeID) VALUES (5, 3, 3);
INSERT INTO EmployeeName (EmployeeNameID, KeyNameID, EmployeeID) VALUES (6, 4, 3);
My search code is as follows:
CREATE PROCEDURE dbo.uspEmployeeSearch
@SearchString1 NVARCHAR(20),
@SearchString2 NVARCHAR(20)
AS
DECLARE @StringLength INT = LEN(@SearchString2)
SELECT @SearchString1 = RTRIM(@SearchString1) + '%'
SELECT @SearchString2 = RTRIM(@SearchString2) + '%'
SET NOCOUNT ON
IF @StringLength = 0
SELECT
EmployeeName,
taxpin
FROM
Employee
JOIN
EmployeeName ON Employee.EmployeeId = EmployeeName.EmployeeId
JOIN
KeyName ON KeyName.KeyNameId = EmployeeName.KeyNameId
WHERE
KeyName.KeyName LIKE @SearchString1;
ELSE
SELECT
EmployeeName,
taxpin
FROM
Employee
JOIN
EmployeeName ON Employee.EmployeeId = EmployeeName.EmployeeId
JOIN
KeyName ON KeyName.KeyNameId = EmployeeName.KeyNameId
WHERE
KeyName.KeyName LIKE @SearchString1
INTERSECT
SELECT
EmployeeName,
taxpin
FROM
Employee
JOIN
EmployeeName ON Employee.EmployeeId = EmployeeName.EmployeeId
JOIN
KeyName ON KeyName.KeyNameId = EmployeeName.KeyNameId
WHERE
KeyName.KeyName LIKE @SearchString2
The procedure expects two parameters. Where only one name is being searched, the second parameter would be a zero-length string.
So far it works with around 31% of the cost on SORT. The nested loops use 'index seek' which is good for me.
In short, if I am searching for 'ayub kassim' I only want employee records with the full name 'ayub kassim'. But if I search for 'kassim' I want all employee records with 'kassim' in the name.
My question is: is it possible to implement this stored procedure using JOINs only, without the INTERSECT operator?
For the record I do not want to use LIKE because I need a very fast procedure and my employee table could run into hundreds of thousands of records.
Thanks in advance for your help.
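
One JOIN-only formulation, offered as a sketch rather than a tuned solution, joins the EmployeeName and KeyName tables twice, once per search term. It assumes the Payroll tables and sample rows from the question already exist; the aliases and the widened NVARCHAR(21) variables are illustrative. DISTINCT stands in for the de-duplication that INTERSECT performs implicitly, so the plan may still contain a sort or hash step, and whether it beats the INTERSECT version needs checking against the actual execution plan.

```sql
-- Prefix patterns, built the same way the procedure builds them.
DECLARE @SearchString1 NVARCHAR(21) = N'ayub%';
DECLARE @SearchString2 NVARCHAR(21) = N'kassim%';

SELECT DISTINCT e.EmployeeID, e.EmployeeName
FROM Payroll.Employee AS e
-- First search term: the employee must have a name part matching it.
JOIN Payroll.EmployeeName AS en1 ON en1.EmployeeID = e.EmployeeID
JOIN Payroll.KeyName      AS k1  ON k1.KeyNameID   = en1.KeyNameID
-- Second search term: the same employee must also have a part matching it.
JOIN Payroll.EmployeeName AS en2 ON en2.EmployeeID = e.EmployeeID
JOIN Payroll.KeyName      AS k2  ON k2.KeyNameID   = en2.KeyNameID
WHERE k1.KeyName LIKE @SearchString1
  AND k2.KeyName LIKE @SearchString2;
```

With the sample rows above, only 'ayub kassim' has both a name part starting with 'ayub' and one starting with 'kassim', so that is the only employee this query should return.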

Stored procedure INSERT INTO with different column count

I have a stored procedure and I am trying to insert its output into a temporary table. However, the stored procedure returns only 3 of the table's columns (col1 through col3); I want to update a fourth column myself and have the last one auto-incremented.
DECLARE @customCol VARCHAR(12)
CREATE TABLE #table
(
col1 VARCHAR(50),
col2 INT,
col3 INT,
customCol VARCHAR(12),
rowNumber INT PRIMARY KEY IDENTITY
)
INSERT INTO #table (col1, col2, col3, customCol, rowNumber)
EXEC sp @var1, @var2
UPDATE #table
SET customCol = @customCol
WHERE rowNumber = (SELECT COUNT(*) FROM #table)
My issue is that whenever I try this, I get the error shown below
Column name or number of supplied values does not match table definition.
I understand that this is because the stored procedure returns only 3 columns and the other 2 values are missing. Any tips on how I can adjust my query to fix this problem?

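Just specify the three columns the procedure returns in the INSERT column list; customCol stays NULL until you update it, and rowNumber is generated by the IDENTITY property. (IDENTITY(1,1) is equivalent to plain IDENTITY, just more explicit.)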
--Create tables
CREATE TABLE #table (
col1 VARCHAR(50),
col2 INT,
col3 INT,
customCol VARCHAR(12),
rowNumber INT PRIMARY KEY IDENTITY(1,1)
)
--Insert into first temp table
INSERT INTO #table (col1, col2, col3) EXEC Sp @var1, @var2
UPDATE #table SET customCol = @customCol WHERE rowNumber = (SELECT MAX(rowNumber) FROM #table)

Output inserted row from CTE

I have a CTE and I need to populate it with the row that has just been inserted.
I tried using a temp table, but I am not sure how to create a temp table within a CTE and fill the CTE from it.
This is what I have tried:
WITH RESULT AS
(
DECLARE @INSERTOUTPUT1 TABLE
(
BOOKID INT,
BOOKTITLE NVARCHAR(50),
MODIFIEDDATE DATETIME
);
-- INSERT NEW ROW INTO BOOKS TABLE
INSERT INTO BOOKS
OUTPUT INSERTED.* INTO @INSERTOUTPUT1
VALUES(101, 'ONE HUNDRED YEARS OF SOLITUDE', GETDATE());
SELECT * FROM @INSERTOUTPUT1
)
SELECT * FROM RESULT
Below is the schema for the table:
DROP TABLE dbo.Books;
CREATE TABLE dbo.Books
(
BookID int NOT NULL PRIMARY KEY,
BookTitle nvarchar(50) NOT NULL,
ModifiedDate datetime NOT NULL
);

You can't have the DECLARE statement inside the CTE; it has to be a separate statement. I'm not sure what you wanted the CTE for, but there is no need for one here:
DECLARE @INSERTOUTPUT1 TABLE
(
BOOKID INT,
BOOKTITLE NVARCHAR(50),
MODIFIEDDATE DATETIME
);
INSERT INTO Books
OUTPUT INSERTED.* INTO @INSERTOUTPUT1
VALUES(101, 'ONE HUNDRED YEARS OF SOLITUDE', GETDATE());
SELECT *
FROM @INSERTOUTPUT1
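
If the inserted row only needs to be returned to the caller, a shorter variant is to use OUTPUT without INTO, which sends the inserted values back as a result set. This is a sketch that assumes the dbo.Books table from the question exists and has no enabled triggers (SQL Server rejects OUTPUT without INTO on a table with enabled triggers for that action).

```sql
-- Returns the newly inserted row directly, with no table variable and no CTE.
INSERT INTO dbo.Books (BookID, BookTitle, ModifiedDate)
OUTPUT INSERTED.BookID, INSERTED.BookTitle, INSERTED.ModifiedDate
VALUES (101, N'ONE HUNDRED YEARS OF SOLITUDE', GETDATE());
```

The table-variable version remains the better fit when the inserted rows have to be reused later in the same batch.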

TRIGGER AFTER INSERT SELECT MIN(COUNT) insert ID

I'm trying to create a trigger after an insert on the eventss table. The trigger should select the Bcoordinator_ID from the bookingCoordinator table where they have the minimum number of occurrences in the eventss table.
Here's my table data followed by the trigger. It doesn't like the minCount in the VALUES clause; I think it's looking for an int.
DROP TABLE eventsBooking
CREATE TABLE eventsBooking
(
EBK INT NOT NULL IDENTITY(100, 1),
booking_ID AS 'EBK'+CAST( ebk as varchar(10)) PERSISTED PRIMARY KEY,
bookingDate DATE,
Bcoordinator_ID VARCHAR (20),
eventss_ID VARCHAR (20) NOT NULL
)
INSERT INTO eventsBooking
VALUES ('2015-01-07 11:23:00', NULL, 'EVT100');
Eventss table:
CREATE TABLE eventss
(
EVT INT NOT NULL IDENTITY(100, 1),
eventss_ID AS 'EVT' + CAST(evt as varchar(10)) PERSISTED PRIMARY KEY,
eventsName varchar(50),
noOfStages SMALLINT,
noOfRounds SMALLINT,
eventsDate DATE,
entryFee DECIMAL (7,2),
venue_ID VARCHAR (20) NOT NULL,
judges_ID VARCHAR (20)
)
INSERT INTO eventss
VALUES ('Swimming Gala 2015', '3', '7', '2015-01-07 09:00:00', '35.00', 'VEN101', 'JUD100');
CREATE TABLE bookingCoordinator
(
BCO INT NOT NULL IDENTITY(100, 1),
Bcoordinator_ID AS 'BCO'+CAST( bco as varchar(10)) PERSISTED PRIMARY KEY,
forename varchar(20) NOT NULL,
familyName varchar(50)
)
INSERT INTO bookingCoordinator VALUES ('Steve', 'Wills');
Trigger:
CREATE TRIGGER TRGinsertJudge
ON [dbo].[eventss]
AFTER INSERT
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO dbo.eventsBooking (Bcoordinator_ID, bookingDate, Eventss_ID)
VALUES(minCount, getdate(), 100)
SELECT MIN(COUNT(Bcoordinator_ID)) AS minCount
FROM eventsBooking
END

You can't nest one aggregate directly inside another, i.e. MIN(COUNT(1)).
If you just want the Bcoordinator_ID with the fewest rows in eventsBooking, do this:
select top 1 bcoordinator_id
from eventsBooking
group by bcoordinator_id
order by count(1) asc
And you don't use VALUES() in an INSERT INTO ... SELECT statement.
Also, in your current code, since eventsBooking.Bcoordinator_ID is always NULL, you need to join to the actual bookingCoordinator table to return booking coordinators without any events booked. Count a column from eventsBooking rather than COUNT(1), so that a coordinator with no bookings counts as zero instead of one.
So your complete trigger statement should be
INSERT INTO dbo.eventsBooking (Bcoordinator_ID, bookingDate, Eventss_ID)
SELECT TOP 1
bookingCoordinator.Bcoordinator_ID, GETDATE(), 100
FROM bookingCoordinator
LEFT JOIN eventsBooking
ON bookingCoordinator.Bcoordinator_ID = eventsBooking.Bcoordinator_ID
GROUP BY bookingCoordinator.Bcoordinator_ID
ORDER BY COUNT(eventsBooking.Bcoordinator_ID) ASC
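
Put together as a whole trigger, the idea might look like the sketch below. It assumes the three tables from the question exist; the trigger name TRGinsertBooking is hypothetical, and the hard-coded 100 is replaced by the new event's eventss_ID read from the inserted pseudo-table, which is an assumption about the intent. For a multi-row insert, every new event in the batch receives the same coordinator, because the subquery is evaluated against the bookings as they stood before the batch.

```sql
CREATE TRIGGER TRGinsertBooking   -- hypothetical name
ON dbo.eventss
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO dbo.eventsBooking (Bcoordinator_ID, bookingDate, eventss_ID)
    SELECT
        -- Coordinator with the fewest existing bookings; zero-booking
        -- coordinators are included through the LEFT JOIN.
        (SELECT TOP (1) bc.Bcoordinator_ID
         FROM dbo.bookingCoordinator AS bc
         LEFT JOIN dbo.eventsBooking AS eb
             ON eb.Bcoordinator_ID = bc.Bcoordinator_ID
         GROUP BY bc.Bcoordinator_ID
         ORDER BY COUNT(eb.Bcoordinator_ID) ASC, bc.Bcoordinator_ID ASC),
        GETDATE(),
        i.eventss_ID              -- ID of the event that was just inserted
    FROM inserted AS i;
END
```

The second ORDER BY column only makes tie-breaking deterministic; it is not required for correctness.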

TSQL - Bringing Data Together from Different Sources ...refactoring PK and FKs

I have various offices and one central head office. Each office has its own SQL Server 2008 instance so each office has its own data set with its own set of IDs.
Each office has already imported data into the head office and stored the data on a set of STAGING_Tables that look like this.
DECLARE @STAGING_COUNTRY TABLE
(
Original_CountryID INT NOT NULL,
OfficeID VARCHAR(10) NOT NULL,
Data VARCHAR(200) NOT NULL
);
DECLARE @STAGING_CITY TABLE
(
Original_CityID INT NOT NULL,
Original_CountryID_FK INT NOT NULL,
OfficeID VARCHAR(10) NOT NULL,
OtherData VARCHAR(100) NOT NULL
);
STAGING_COUNTRY holds the original ID of each row (which of course will be duplicated across offices, since each office has ID = 1 for the first row of its Country table) and an OfficeID value; together, OfficeID and Original_CountryID make a unique value.
STAGING_CITY also holds the original ID of each row and the OfficeID that identifies each office, plus an FK to the country. At this point the FK still references Original_CountryID, which identifies the country only in conjunction with the OfficeID.
Let's add some dummy rows:
/* ADD DUMMY VALUES TO STAGING_COUNTRY */
INSERT INTO @STAGING_COUNTRY (Original_CountryID, OfficeID, Data)
VALUES (1, 'Office1', 'USA')
INSERT INTO @STAGING_COUNTRY (Original_CountryID, OfficeID, Data)
VALUES (2, 'Office1', 'Canada')
INSERT INTO @STAGING_COUNTRY (Original_CountryID, OfficeID, Data)
VALUES (3, 'Office1', 'Japan')
INSERT INTO @STAGING_COUNTRY (Original_CountryID, OfficeID, Data)
VALUES (1, 'Office2', 'USA')
INSERT INTO @STAGING_COUNTRY (Original_CountryID, OfficeID, Data)
VALUES (2, 'Office2', 'Italy')
INSERT INTO @STAGING_COUNTRY (Original_CountryID, OfficeID, Data)
VALUES (3, 'Office2', 'Canada')
INSERT INTO @STAGING_COUNTRY (Original_CountryID, OfficeID, Data)
VALUES (1, 'Office3', 'Canada')
INSERT INTO @STAGING_COUNTRY (Original_CountryID, OfficeID, Data)
VALUES (2, 'Office3', 'France')
INSERT INTO @STAGING_COUNTRY (Original_CountryID, OfficeID, Data)
VALUES (3, 'Office3', 'USA')
/* ADD DUMMY VALUES TO STAGING_CITY */
INSERT INTO @STAGING_CITY (Original_CityID, Original_CountryID_FK, OfficeID, OtherData)
VALUES (1, 1, 'Office1', 'New York')
INSERT INTO @STAGING_CITY (Original_CityID, Original_CountryID_FK, OfficeID, OtherData)
VALUES (2, 1, 'Office1', 'Vancouver')
INSERT INTO @STAGING_CITY (Original_CityID, Original_CountryID_FK, OfficeID, OtherData)
VALUES (3, 1, 'Office1', 'Tokyo')
INSERT INTO @STAGING_CITY (Original_CityID, Original_CountryID_FK, OfficeID, OtherData)
VALUES (1, 2, 'Office2', 'New York')
INSERT INTO @STAGING_CITY (Original_CityID, Original_CountryID_FK, OfficeID, OtherData)
VALUES (2, 2, 'Office2', 'Rome')
INSERT INTO @STAGING_CITY (Original_CityID, Original_CountryID_FK, OfficeID, OtherData)
VALUES (3, 2, 'Office2', 'Vancouver')
INSERT INTO @STAGING_CITY (Original_CityID, Original_CountryID_FK, OfficeID, OtherData)
VALUES (1, 3, 'Office3', 'Vancouver')
INSERT INTO @STAGING_CITY (Original_CityID, Original_CountryID_FK, OfficeID, OtherData)
VALUES (2, 3, 'Office3', 'Paris')
INSERT INTO @STAGING_CITY (Original_CityID, Original_CountryID_FK, OfficeID, OtherData)
VALUES (3, 3, 'Office3', 'New York')
The central head office wants to run reports from a central database that contains a copy of all the data from all offices. To optimize this reporting DB, we need to reshuffle the STAGING_Tables a bit and reorganize the data into FINAL_Tables that look like this:
DECLARE @FINAL_COUNTRY TABLE
(
CountryID INT IDENTITY PRIMARY KEY,
Original_CountryID INT NOT NULL,
OfficeID VARCHAR(10) NOT NULL,
Data VARCHAR(200) NOT NULL
);
DECLARE @FINAL_CITY TABLE
(
CityID INT IDENTITY PRIMARY KEY,
Original_CityID INT NOT NULL,
CountryID_FK INT NOT NULL,
OfficeID VARCHAR(10) NOT NULL,
OtherData VARCHAR(100) NOT NULL
);
PROBLEM:
The FINAL_COUNTRY and FINAL_CITY tables should be as optimized as possible for reporting purposes. These reports will be written in T-SQL stored procedures.
QUESTION:
What is the best way to reorganize the FINAL_Tables so that each record has a TRUE PK identifier (like in the original Office_Tables) and each FK is updated to point to the right newly created PK ...at the server level?
NOTE:
Please note that both staging & final tables are inside the same DB, on the server.
Also we still need to keep the OriginalIDs on the FINAL_Tables for other purposes.
GOALS:
The main goal here is to reorganize into a set of tables that can be easily indexed for performance purposes.
Please ask if you need more information.
Many thanks in advance.

This is probably just a partial answer. You may want to consider putting a generic IDENTITY id on each of your staging tables. Something like:
DECLARE @STAGING_COUNTRY TABLE
(
Stage_Country_id INT IDENTITY(1,1) NOT NULL,
Original_CountryID INT NOT NULL,
OfficeID VARCHAR(10) NOT NULL,
Data VARCHAR(200) NOT NULL
);
DECLARE @STAGING_CITY TABLE
(
Stage_City_id INT IDENTITY(1,1) NOT NULL,
Original_CityID INT NOT NULL,
Original_CountryID_FK INT NOT NULL,
OfficeID VARCHAR(10) NOT NULL,
OtherData VARCHAR(100) NOT NULL
);
Your final tables should not have the original_ids as you should only have 1 record per city / country in them.
Then I think you'd need some sort of cross reference tables to bridge your final tables to your stage tables. That would look like this:
DECLARE @COUNTRY_xref TABLE
(
country_xref_id INT IDENTITY(1,1) NOT NULL,
CountryID INT NOT NULL,
Stage_Country_id INT
);
DECLARE @CITY_xref TABLE
(
city_xref_id INT IDENTITY(1,1) NOT NULL,
CityID INT NOT NULL,
Stage_City_id INT NOT NULL
);
Are you also asking what the loading / conversion process would look like, or was this more about the schema?
Your final tables would probably look like this:
DECLARE @FINAL_COUNTRY TABLE
(
CountryID INT IDENTITY PRIMARY KEY,
Data VARCHAR(200) NOT NULL
);
DECLARE @FINAL_CITY TABLE
(
CityID INT IDENTITY PRIMARY KEY,
CountryID_FK INT NOT NULL,
OtherData VARCHAR(100) NOT NULL
);
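
For the question's own FINAL_Tables layout, which keeps the original IDs, the FK remap could be sketched roughly as follows. This is an illustration under assumptions, not a complete load process: it uses the question's table definitions, two hypothetical sample rows per staging table, and relies on the pair (OfficeID, Original_CountryID) being unique, as the question states.

```sql
DECLARE @STAGING_COUNTRY TABLE
    (Original_CountryID INT NOT NULL, OfficeID VARCHAR(10) NOT NULL, Data VARCHAR(200) NOT NULL);
DECLARE @STAGING_CITY TABLE
    (Original_CityID INT NOT NULL, Original_CountryID_FK INT NOT NULL,
     OfficeID VARCHAR(10) NOT NULL, OtherData VARCHAR(100) NOT NULL);
DECLARE @FINAL_COUNTRY TABLE
    (CountryID INT IDENTITY PRIMARY KEY, Original_CountryID INT NOT NULL,
     OfficeID VARCHAR(10) NOT NULL, Data VARCHAR(200) NOT NULL);
DECLARE @FINAL_CITY TABLE
    (CityID INT IDENTITY PRIMARY KEY, Original_CityID INT NOT NULL, CountryID_FK INT NOT NULL,
     OfficeID VARCHAR(10) NOT NULL, OtherData VARCHAR(100) NOT NULL);

-- Hypothetical sample rows: two offices that both use ID 1 for their first country and city.
INSERT INTO @STAGING_COUNTRY (Original_CountryID, OfficeID, Data)
VALUES (1, 'Office1', 'USA'), (1, 'Office2', 'USA');
INSERT INTO @STAGING_CITY (Original_CityID, Original_CountryID_FK, OfficeID, OtherData)
VALUES (1, 1, 'Office1', 'New York'), (1, 1, 'Office2', 'New York');

-- Step 1: copy the countries; IDENTITY assigns each row a new server-wide CountryID.
INSERT INTO @FINAL_COUNTRY (Original_CountryID, OfficeID, Data)
SELECT Original_CountryID, OfficeID, Data
FROM @STAGING_COUNTRY;

-- Step 2: copy the cities, translating the old FK into the new CountryID by
-- matching on the office plus the office-local country ID.
INSERT INTO @FINAL_CITY (Original_CityID, CountryID_FK, OfficeID, OtherData)
SELECT sc.Original_CityID, fc.CountryID, sc.OfficeID, sc.OtherData
FROM @STAGING_CITY AS sc
JOIN @FINAL_COUNTRY AS fc
    ON fc.OfficeID = sc.OfficeID
   AND fc.Original_CountryID = sc.Original_CountryID_FK;

SELECT * FROM @FINAL_CITY;
```

The inner join silently drops any city whose country is missing from staging; a LEFT JOIN with a check for NULL CountryID would surface such orphans instead. With permanent tables, an index on (OfficeID, Original_CountryID) in the final country table would support this lookup.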
