T-SQL trouble with conditional select into - sql-server

I'm running into some problems with my attempts to create a specific table. I have read some similar answers on this site already, but I'm not sure I'm understanding their answers and/or how it applies to my issue.
I have an existing table which contains ingredients for a recipe (table definition below):
Create Table Recipes.dbo.IngredientsByRecipeID
(
RecipeID int,
IngredientName varchar(max),
UnitTypeOfIngredient varchar(max),
QuantityOfIngredient decimal(6,3)
);
I'm trying to create and display the contents of a table which contains all ingredient names used in all recipes, in alphabetical order, and with all duplicate entries removed.
I've tried a few different variations on how to implement it, but I'm not sure now if I'm running into problems with my implementation or problems with how SQL Server works (I'm new to T-SQL and its ways).
My method for sorting the ingredients alphabetically into a new, duplicates-free table is as follows:
/*these three lines represent code which I've been trying to use to control the problems I keep getting with tables existing or not existing, I don't know if they are really helping or not. The first two only drop and create the table, the third is one attempt at a working solution for my needs*/
--drop table AlphabeticallySortedIngredientsList;
--create table AlphabeticallySortedIngredientsList (IngredientName varchar(max));
--select IngredientName into AlphabeticallySortedIngredientsList from IngredientsByRecipeID where ((select count(IngredientName) from AlphabeticallySortedIngredientsList) = 0) order by IngredientName asc;
--here is my latest attempt at getting a working solution for my needs
select IngredientName
into AlphabeticallySortedIngredientsList
from IngredientsByRecipeID
where not exists (select *
from AlphabeticallySortedIngredientsList
where IngredientName = AlphabeticallySortedIngredientsList.IngredientName)
order by IngredientName asc;
With each implementation I initially had some kind of syntax errors, but after (apparently) cleaning up those errors, I'm always left with one last error and while the error changes based on input, the same object is causing the errors. If I leave the 'drop table' line from above uncommented then this is the error I get:
Msg 208, Level 16, State 1, Line 5
Invalid object name 'AlphabeticallySortedIngredientsList'.
If I also uncomment the 'create table AlphabeticallySortedIngredientsList' line, then I get this error:
Msg 2714, Level 16, State 6, Line 9
There is already an object named 'AlphabeticallySortedIngredientsList' in the database.
And if I drop the table and then comment both lines out, I get the same error as above:
Msg 2714, Level 16, State 6, Line 9
There is already an object named 'AlphabeticallySortedIngredientsList' in the database.
I'm perplexed, as it seems that my 'select IngredientName into AlphabeticallySortedIngredientsList' line cannot make up its mind as to whether or not it is creating the table. If I drop the table before my solution runs, then it says it cannot be found. If I create the table before my solution runs, then it says the table already exists.
I'm guessing the problem is that the 'select' part of the statement is creating AlphabeticallySortedIngredientsList (given it doesn't exist) but then the conditional tries to look for a value in AlphabeticallySortedIngredientsList, which does not exist yet as the operation to create it hasn't occurred yet. Am I on the right track with this assessment, and if so, how can I fix the problem?
I need to preserve the original contents of IngredientsByRecipeID, for the record. AlphabeticallySortedIngredientsList only serves the purpose of presenting a clean and orderly list of all ingredients from IngredientsByRecipeID.
I should also mention that if I remove the conditional (as in, I comment out the 'where' clause), then the select into works without any errors. This is with both drop table and create table commented out, and with no table AlphabeticallySortedIngredientsList already existing in the database.
Maybe it would be better to just add them all to the table without deleting duplicates, and then go through and delete duplicates? My problem might be with trying to do it all in one step.

The SELECT/INTO syntax creates a new table. You cannot use it and also reference the table you are creating in the WHERE clause. What you want to do is use DISTINCT in the SELECT clause to remove duplicates before you create the table. If you need to use an existing table, then you want an INSERT statement whose values come from a SELECT. Note, if you need to keep the table ordered by name, it should have the name as the primary key (though only an ORDER BY on the query itself guarantees the order of the results). Since you want unique entries in this table, you probably want the primary key anyway.
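-- varchar(max) cannot be a key column and SELECT ... INTO copies nullability,
-- so cast to a bounded length (200 is an assumed size) and wrap in ISNULL to get a NOT NULL column.
SELECT DISTINCT ISNULL(CAST(IngredientName AS varchar(200)), '') AS IngredientName
INTO AlphabeticallySortedIngredientsList
FROM IngredientsByRecipeID
WHERE IngredientName IS NOT NULL;
ALTER TABLE AlphabeticallySortedIngredientsList
ADD CONSTRAINT PK_AlphabeticallySortedIngredientsList PRIMARY KEY CLUSTERED (IngredientName);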
Generally, I would not suggest keeping this table as you can always get the data from the existing table. Keeping the data in separate tables will force you to periodically update one or the other to keep them in sync. If you do opt to keep the names in a separate table, then you should probably add an Id column, then update the other (original) table with the Id of the name and remove the name column from it. Add a foreign key constraint on the name Id column so that you are required to add the name there and link it to the recipe.
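As a minimal sketch of the two options described in this answer (not necessarily what the asker ended up using): if the goal is only to display the list, a plain query is enough and no second table is needed; if AlphabeticallySortedIngredientsList already exists, an INSERT ... SELECT with NOT EXISTS adds only the names it does not yet contain. The aliases src and dst are illustrative.

```sql
-- Option 1: no second table at all; de-duplicate and sort at query time.
SELECT DISTINCT IngredientName
FROM IngredientsByRecipeID
ORDER BY IngredientName ASC;

-- Option 2: the target table already exists (created earlier with CREATE TABLE).
-- Insert only the names that are not in it yet.
INSERT INTO AlphabeticallySortedIngredientsList (IngredientName)
SELECT DISTINCT src.IngredientName
FROM IngredientsByRecipeID AS src
WHERE src.IngredientName IS NOT NULL
  AND NOT EXISTS (SELECT 1
                  FROM AlphabeticallySortedIngredientsList AS dst
                  WHERE dst.IngredientName = src.IngredientName);
```

The aliases matter: in the question's subquery, the unqualified IngredientName resolves to the inner table, so the condition compares that column with itself rather than with the outer row.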

Create table Ingredients
(ingredient varChar(50) primary key not null)
Insert Ingredients(Ingredient)
Select distinct IngredientName
From IngredientsByRecipeID

Yes you are on the right track. The SELECT ... INTO statement will create the table based on the SELECT expression. Trying to access it prematurely in the WHERE NOT EXISTS clause is causing the conflict.
Try using a SELECT DISTINCT or GROUP BY clause to remove duplicates without relying on WHERE NOT EXISTS.

Related

Duplicate Error when inserting unique non duplicate values T-SQL

Here is another error I can't get past:
I am trying to add (with an INSERT INTO statement) unique records from the WHID field
(ClientEpisode table) into the WHID field (EHREpisode table):
INSERT INTO
[WH].[Fact].EHREpisode ([WHID])
SELECT
[HP].[bhcms5].ClientEpisode.WHID
FROM
[HP].[bhcms5].ClientEpisode
Both of the WHID fields are unique (no duplicates) within each table and across the two tables, but I keep getting an error:
Here is the error message:
"Cannot insert duplicate key row in object 'Fact.EHREpisode' with unique index 'IX_EHREpisode'. The duplicate key value is (<NULL>, <NULL>).
The statement has been terminated."
Below are my tables structures:
EHREpisode:
ClientEpisode:
The error message seems to be saying something about NULL values being the culprit (your screen capture is hard to read). So, you may try excluding NULL values from being inserted:
INSERT INTO [WH].[Fact].EHREpisode ([WHID])
SELECT [HP].[bhcms5].ClientEpisode.WHID
FROM [HP].[bhcms5].ClientEpisode
WHERE [HP].[bhcms5].ClientEpisode.WHID IS NOT NULL;
As an alternative, consider using an index on WHID which ignores NULL values:
CREATE UNIQUE NONCLUSTERED INDEX idx_col1
ON [HP].[bhcms5].ClientEpisode (WHID)
WHERE WHID IS NOT NULL;
The error is very clear: there is a unique index called IX_EHREpisode, and its key columns would receive the value (NULL, NULL) more than once if your query ran.
You have provided us a set of SSMS GUI pictures that are not really relevant to this. Instead, the way to use the GUI would be:
Go to the "Object explorer" at the left of SSMS
Find the table "EHREpisode" you are interested in
Open "Indexes". You should find IX_EHREpisode here. Open it.
Inside you will see the columns included in this index. Two of them would both be left NULL by your query, for every inserted row.
Thus, you will either have to modify the index, or re-think your query.
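The same information can be read without the GUI. The following is a sketch that assumes it is run in the WH database and uses only the object names from the error message (Fact.EHREpisode, IX_EHREpisode); it lists the key columns of the index through the standard catalog views.

```sql
-- List the columns that make up the unique index named in the error message.
SELECT i.name            AS index_name,
       c.name            AS column_name,
       ic.key_ordinal,
       i.is_unique,
       i.has_filter,
       i.filter_definition
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
  ON ic.object_id = i.object_id
 AND ic.index_id  = i.index_id
JOIN sys.columns AS c
  ON c.object_id = ic.object_id
 AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'Fact.EHREpisode')
  AND i.name = N'IX_EHREpisode'
ORDER BY ic.key_ordinal;
```

If the columns returned are not among those populated by the INSERT, every new row gets NULL in all of them, and a unique index in SQL Server treats those NULL keys as equal to each other.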

Insert records from temp table where record is not already present fails

I am trying to insert from temporary table into regular one but since there is data in temp table sharing the same values for a primary key of the table I am inserting to, it fails with primary key constraint being violated. That is expected so I am working around it by inserting only the rows that have the primary key not already present in table I am inserting to.
I tried both the EXISTS and the NOT IN approach, I checked examples showcasing both, and confirmed both work in SQL Server 2014 in general, yet I am still getting the following error:
Violation of PRIMARY KEY constraint 'PK_dbo.InsuranceObjects'. Cannot
insert duplicate key in object 'dbo.InsuranceObjects'. The duplicate
key value is (3835fd7c-53b7-4127-b013-59323ea35375).
Here is the SQL in the NOT IN variant I tried:
print 'insert into InsuranceObjects'
INSERT INTO $(destinDB).InsuranceObjects
(
Id, Value, DefInsuranceObjectId
)
SELECT Id, InsuranceObjectsValue, DefInsuranceObjectId
FROM #vehicle v
WHERE v.Id NOT IN (SELECT Id FROM $(destinDB).InsuranceObjects) -- prevent error when running script multiple times over
GO
If not apparent:
Id is the primary key in question.
$(destinDB) is a command-line variable, different from a T-SQL variable. It lets me define the target database and instance at the script level, or even across multiple scripts. It's used in multiple variations throughout the code and has so far performed perfectly. The only downside is that you have to run in SQLCMD mode.
USE $(some database) is also used when creating all the temp tables, so that's not an issue.
I must be missing something completely obvious but it's driving me nuts that such a simple query fails. What is worse, when I try running select without insert part, it returns ALL the records from temp table despite me having confirmed there are duplicates that should fail the NOT IN part in where clause.
I suspect the issue is that you have duplicate ID values in your temp table. Please check the values there as it would cause the issue you are seeing.
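To check that suspicion, here is a sketch using the table and column names from the question. The NOT IN filter only compares against rows that are already in the target before the statement runs, so two rows in #vehicle with the same new Id both pass it. The first query lists such Ids; the second keeps one row per Id (which one is arbitrary in this sketch) and still skips Ids already present.

```sql
-- 1) Show Ids that occur more than once in the temp table.
SELECT v.Id, COUNT(*) AS Occurrences
FROM #vehicle AS v
GROUP BY v.Id
HAVING COUNT(*) > 1;

-- 2) Insert one row per Id, skipping Ids that already exist in the target.
WITH ranked AS (
    SELECT v.Id, v.InsuranceObjectsValue, v.DefInsuranceObjectId,
           ROW_NUMBER() OVER (PARTITION BY v.Id ORDER BY (SELECT NULL)) AS rn
    FROM #vehicle AS v
)
INSERT INTO $(destinDB).InsuranceObjects (Id, Value, DefInsuranceObjectId)
SELECT r.Id, r.InsuranceObjectsValue, r.DefInsuranceObjectId
FROM ranked AS r
WHERE r.rn = 1
  AND NOT EXISTS (SELECT 1
                  FROM $(destinDB).InsuranceObjects AS t
                  WHERE t.Id = r.Id);
```

If a specific duplicate should win, replace ORDER BY (SELECT NULL) with a real ordering column.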

SQL server trigger question

I am by no means a sql programmer and I am trying to accomplish something that I am pretty sure has been done a million times before.
I am trying to auto-generate a customer number in SQL every time a new customer is inserted, but the trigger (or stored procedure?) will only work if at least the first name, last name and another value called case number are entered. If any of these fields is missing, the system generates an error. If the criteria are met, the system generates and assigns a unique ID to that customer that begins with the letters GL- followed by a 5-digit number, so a customer John Doe would be GL-00001 and Jane Doe would be GL-00002.
I am sorry if I am asking too much but I am basically a select insert update guy and nothing more so thanks in advance for any help.
If I were in this situation, I would:
--Alter the table(s) so that first name, last name and case number are required (NOT NULL) columns. Handle your checks for required fields on the application side before submitting the record to the database.
--If it doesn't already exist, add an identity column to the customer table.
--Add a persisted computed column to the customer table that will format the identity column into the desired GL-00000 format.
/* Demo computed column for customer number */
create table #test (
id int identity,
customer_number as 'GL-' + left('00000', 5-len(cast(id as varchar(5)))) + cast(id as varchar(5)) persisted,
name char(20)
)
insert into #test (name) values ('Joe')
insert into #test (name) values ('BobbyS')
select * from #test
drop table #test
This should satisfy your requirements without the need to introduce the overhead of a trigger.
So what do you want to do? Generate a customer number even when these fields aren't populated?
Have you looked at the SQL for the trigger? You can do this in SSMS (SQL Server Management Studio) by going to the table in question in the Object Explorer, expanding the table and then expanding Triggers.
If you open up the trigger you'll see what it does to generate the customer number. If you are unsure how this code works, then post the code for the trigger.
If you are making changes to an existing system, I'd advise you to find out the implications of changing the way data is entered.
For example, other parts of the application may depend on all of the initial values being populated, so after changing the trigger to allow incomplete data to be added, you may in turn break something else.
You probably have a unique constraint and/or NOT NULL constraints set on the table.
Remove or disable these (for example with the SQL-Server Management Console in Design Mode) and then try again to insert the data. Keep in mind that you will probably not be able to re-enable the constraints after your insert, since the inserted data violates them. Only disable or remove the constraints if you are absolutely sure that they are unnecessary.
Here's example syntax (you need to know the constraint names):
--disable
ALTER TABLE customer NOCHECK CONSTRAINT your_constraint_name
--enable
ALTER TABLE customer CHECK CONSTRAINT your_constraint_name
Caution: If I were you, I'd rather try to insert dummy values for the not null columns like this:
insert into customers select afield , 1 as dummyvalue, 2 as dummyvalue from your datasource
A very easy way to do this would be to create a table of this sort of structure:
CustomerID of type int that is a primary key and set as identity
CustomerIDPrefix of type varchar(3) which stores GL- as a default value.
Then add your other fields and set them to NOT NULL.
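A minimal sketch of that structure, assuming hypothetical table, column and constraint names (dbo.Customer, FirstName, LastName, CaseNumber, DF_Customer_Prefix) and assumed column lengths; none of these come from the question:

```sql
CREATE TABLE dbo.Customer (
    CustomerID       int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CustomerIDPrefix varchar(3) NOT NULL
        CONSTRAINT DF_Customer_Prefix DEFAULT ('GL-'),
    FirstName        varchar(50) NOT NULL,   -- required fields are enforced by NOT NULL
    LastName         varchar(50) NOT NULL,
    CaseNumber       varchar(20) NOT NULL
);

INSERT INTO dbo.Customer (FirstName, LastName, CaseNumber)
VALUES ('John', 'Doe', 'C-100');

-- Build the display number at query time: prefix + zero-padded identity value.
SELECT CustomerIDPrefix + RIGHT('00000' + CAST(CustomerID AS varchar(10)), 5) AS CustomerNumber,
       FirstName, LastName
FROM dbo.Customer;
```

An INSERT that omits any of the three required columns fails with a NOT NULL error, which covers the "generate an error if a field is missing" requirement without a trigger. Note that NOT NULL still allows empty strings; a CHECK constraint would be needed to reject those.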
If that way is not acceptable and you do need to write a trigger check out these two articles:
http://msdn.microsoft.com/en-us/library/aa258254(SQL.80).aspx
http://www.kodyaz.com/articles/sql-trigger-example-in-sql-server-2008.aspx
Basically it is all about getting the logic right to check if the fields are blank. Experiment with a test database on your local machine. This will help you get it right.

In Oracle, is it possible to "insert" a column into a table?

When adding a column to an existing table, Oracle always puts the column at the end of the table. Is it possible to tell Oracle where it should appear in the table? If so, how?
The location of the column in the table should be unimportant (unless there are "page sizes" to consider, or whatever Oracle uses to actually store the data). What is more important to the consumer is how the results are called, i.e. the Select statement.
rename YOUR_ORIGINAL_TABLE to YOUR_NEW_TABLE;
create table YOUR_ORIGINAL_TABLE nologging /* or unrecoverable */
as
select Column1, Column2, NEW_COLUMN, Column3
from YOUR_NEW_TABLE;
Drop table YOUR_NEW_TABLE;
Select * From YOUR_ORIGINAL_TABLE; -- now you will see the new column in the middle of the table.
But why would you want to do it? It seems illogical. You should never assume column ordering; just use a named column list if column order is important.
Why does the order of the columns matter? You can always alter it in your select statement?
There's an advantage to adding new columns at the end of the table. If there's code that naively does a "SELECT *" and then parses the fields in order, you won't be breaking old code by adding new columns at the end. If you add new columns in the middle of the table, then old code may be broken.
At one job, I had a DBA who was super-anal about "Never do 'SELECT *'". He insisted that you always write out the specific fields.
What I normally do is:
Rename the old table.
Create the new table with columns in the right order.
Create the constraints for that new table.
Populate with data: Insert into new_table select * from renamed_table.
I don't think that this can be done without saving the data to a temporary table, dropping the table, and recreating it. On the other hand, it really shouldn't matter where the column is. As long as you specify the columns you are retrieving in your select statement, you can order them however you want.
Bear in mind that, under the tables, all the data in the table records are glued together. Adding a column to the end of a table [if it is nullable or (in later versions) not null with a default] just means a change to the table's metadata.
Adding a column in the middle would require re-writing every record in that table to add the appropriate value (or markers) for that column. In some cases, that might mean the records take up more room on the blocks and some records need to be migrated.
In short, it's a VAST amount of IO effort for a table of any real size.
You can always create a view over the table that has the columns in the preferred order and use that view in a DML statement just as you would the table
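A short sketch of the view approach in Oracle SQL, reusing the placeholder names from the earlier answer (YOUR_ORIGINAL_TABLE, Column1, Column2, Column3, NEW_COLUMN); the view name and the VARCHAR2(50) type are assumptions for illustration:

```sql
-- The new column is physically appended at the end of the table.
ALTER TABLE YOUR_ORIGINAL_TABLE ADD (NEW_COLUMN VARCHAR2(50));

-- The view presents the columns in the preferred order.
CREATE OR REPLACE VIEW YOUR_ORIGINAL_TABLE_V AS
SELECT Column1, Column2, NEW_COLUMN, Column3
FROM YOUR_ORIGINAL_TABLE;

-- SELECT * against the view now shows NEW_COLUMN in third position.
SELECT * FROM YOUR_ORIGINAL_TABLE_V;
```

Because the view selects from a single table without aggregation, inserts and updates through it should behave like DML on the table itself, and no data has to be rewritten.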
I don't believe so - SQL Server doesn't allow these either. The method I always have to use is:
Create new table that looks right (including additional column)
Begin transaction
select all data from old table into new one
Drop old table
Rename new table
Commit transaction.
Not exactly pretty, but gets the job done.
No, it's not possible via an "ALTER TABLE" statement. However, you could create a new table with the same definition as your current one, albeit with a different name, with the columns in the order you want them. Copy the data into the new table. Drop the old table. Rename the new table to match the old table name.
Tom Kyte has an article on this on AskTom
Apparently there's a trick involving marking the "after" columns INVISIBLE; when restored, they end up at the back.
CREATE TABLE yourtable (one NUMBER(5, 0), two NUMBER(5, 0), three NUMBER(5, 0), four NUMBER(5, 0));
ALTER TABLE yourtable ADD twopointfive NUMBER(5, 0);
ALTER TABLE yourtable MODIFY (three INVISIBLE, four INVISIBLE);
ALTER TABLE yourtable MODIFY (three VISIBLE, four VISIBLE);
https://oracle-base.com/articles/12c/invisible-columns-12cr1#invisible-columns-and-column-ordering
1) Ok so you can't do it directly. We don't need post after post saying the same thing, do we?
2) Ok so the order of columns in a table doesn't technically matter. But that's not the point, the original question simply asked if you could or couldn't be done. Don't presume that you know everybody else's requirements. Maybe they have a table with 100 columns that is currently being queried using "SELECT * ..." inside some monstrously hacked together query that they would just prefer not to try to untangle, let alone replace "*" with 100 column names. Or maybe they are just anal about the order of things and like to have related fields next to each other when browsing schema with, say SQL Developer. Maybe they are dealing with non-technical staff that won't know to look at the end of a list of 100 columns when, logically, it should be somewhere near the beginning.
Nothing is more irritating than asking an honest question and getting an answer that says: "you shouldn't be doing that". It's MY job, not YOURS! Please don't tell me how to do my job. Just help if you can. Thanks!
Ok... sorry for the rant. Now...at www.orafaq.com it suggests this workaround.
First suppose you have already run:
CREATE TABLE tab1 ( col1 NUMBER );
Now say you want to add a column named "col2", but you want them ordered "col2", "col1" when doing a "SELECT * FROM tab1;"
The suggestion is to run:
ALTER TABLE tab1 ADD (col2 DATE);
RENAME tab1 TO tab1_old;
CREATE TABLE tab1 AS SELECT 0 AS col1, col1 AS col2 FROM tab1_old;
I found this to be incredibly misleading. First of all, you're filling "col1" with zeros so, if you had any data, then you are losing it by doing this. Secondly, it's actually renaming "col1" to "col2" and fails to mention this. So, here's my example, hopefully it's a little clearer:
Suppose you have a table that was created with the following statement:
CREATE TABLE users (first_name varchar(25), last_name varchar(25));
Now say you want to insert middle_name in between first_name and last_name. Here's one way:
ALTER TABLE users ADD middle_name varchar(25);
RENAME users TO users_tmp;
CREATE TABLE users AS SELECT first_name, middle_name, last_name FROM users_tmp;
/* and for good measure... */
DROP TABLE users_tmp;
Note that middle_name will default to NULL (implied by the ALTER TABLE statement). You can alternatively set a different default value in the CREATE TABLE statement like so:
CREATE TABLE users AS SELECT first_name, 'some default value' AS middle_name, last_name FROM users_tmp;
This trick could come in handy if you're adding a date field with a default of sysdate, but you want all of the existing records to have some other (e.g. earlier) date value.

Detailed error message for violation of Primary Key constraint in sql2008?

I'm inserting a large amount of rows into an empty table with a primary key constraint on one column.
If there is a duplicate key error, is there any way to find out the value of the key (or row) that caused the error?
Validating the data prior to the insert is sadly not something I can do right now.
Using SQL 2008.
Thanks!
Doing the count(*) / group by thing is something I'm trying to avoid, this is an insert of hundreds of millions of rows from hundreds of different DB's (some of which are on remote servers)...I don't have the time or space to do the insert twice.
The data is supposed to be unique from the providers, but unfortunately their validation doesn't seem to work correctly 100% of the time and I'm trying to at least see where it's failing so I can help them troubleshoot.
Thank you!
There's not a way of doing it that won't slow your process down, but here's one way that will make it easier. You can add an instead-of trigger on that table for inserts and updates. The trigger will check each record before inserting it and make sure it won't cause a primary key violation. You can even create a second table to catch violations, and have a different primary key (like an identity field) on that one, and the trigger will insert the rows into your error-catching table.
Here's an example of how the trigger can work:
CREATE TRIGGER mytrigger ON sometable
INSTEAD OF INSERT
AS BEGIN
INSERT INTO sometable SELECT * FROM inserted WHERE ISNUMERIC(somefield) = 1;
INSERT INTO sometableRejects SELECT * FROM inserted WHERE ISNUMERIC(somefield) = 0;
END
In that example, I'm checking a field to make sure it's numeric before I insert the data into the table. You'll need to modify that code to check for primary key violations instead - for example, you might join the INSERTED table to your own existing table and only insert rows where you don't find a match.
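A sketch of what that modification could look like. The column names id (the primary key) and payload, the trigger name, and the layout of sometableRejects are hypothetical; the real tables will differ. Rows are rejected if their key already exists in the table or if they repeat a key within the same insert batch.

```sql
CREATE TRIGGER trg_sometable_pk_check ON dbo.sometable
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Number the rows per key so duplicates inside the same batch can be detected.
    SELECT i.id, i.payload,
           ROW_NUMBER() OVER (PARTITION BY i.id ORDER BY (SELECT NULL)) AS rn
    INTO #ranked
    FROM inserted AS i;

    -- Rejects first: repeats within the batch, or keys already in the table.
    -- (This must run before the accepted rows are inserted.)
    INSERT INTO dbo.sometableRejects (id, payload)
    SELECT r.id, r.payload
    FROM #ranked AS r
    WHERE r.rn > 1
       OR EXISTS (SELECT 1 FROM dbo.sometable AS t WHERE t.id = r.id);

    -- Accepted rows: first occurrence of a key that is not in the table yet.
    INSERT INTO dbo.sometable (id, payload)
    SELECT r.id, r.payload
    FROM #ranked AS r
    WHERE r.rn = 1
      AND NOT EXISTS (SELECT 1 FROM dbo.sometable AS t WHERE t.id = r.id);
END;
```

One caveat: BULK INSERT and bcp do not fire triggers unless the FIRE_TRIGGERS option is specified, so a bulk load needs that option for this to take effect.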
The solution would depend on how often this happens. If it's <10% of the time then I would do the following:
Insert the data
If error then do Bravax's revised solution (remove constraint, insert, find dup, report and kill dup, enable constraint).
This means it's only costing you on the few times an error occurs.
If this is happening more often then I'd look at sending the boys over to see the providers :-)
Revised:
Since you don't want to insert twice, could you:
Drop the primary key constraint.
Insert all data into the table
Find any duplicates, and remove them
Then re-add the primary key constraint
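These four steps might look like the following sketch. The names dbo.sometable, PK_sometable, id and dbo.sometable_dups are placeholders, since the question does not give the real ones.

```sql
-- 1) Drop the primary key so the load cannot fail on duplicates.
ALTER TABLE dbo.sometable DROP CONSTRAINT PK_sometable;

-- 2) ... run the inserts here ...

-- 3a) Keep a copy of every row whose key occurs more than once, for reporting.
SELECT s.*
INTO dbo.sometable_dups
FROM dbo.sometable AS s
WHERE s.id IN (SELECT id
               FROM dbo.sometable
               GROUP BY id
               HAVING COUNT(*) > 1);

-- 3b) Delete all but one row per key (which one survives is arbitrary here).
WITH ranked AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY id ORDER BY (SELECT NULL)) AS rn
    FROM dbo.sometable
)
DELETE FROM ranked
WHERE rn > 1;

-- 4) Re-create the primary key.
ALTER TABLE dbo.sometable
ADD CONSTRAINT PK_sometable PRIMARY KEY (id);
```

On a table with hundreds of millions of rows, steps 3 and 4 each scan the whole table, so this trades insert-time checking for a cost paid once after the load.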
Previous reply:
Insert the data into a duplicate of the table without the primary key constraint.
Then run a query on it to determine rows which have duplicate values for the primary key column.
select count(*), <Primary Key>
from table
group by <Primary Key>
having count(*) > 1
Use SSIS to import the data and have it check for this as part of the data flow. That is the best way to handle this. SSIS can send the bad records to a table (that you can later send to the vendor to help them clean up their act) and process the good ones.
I can't believe that SSIS does not easily address this "reality", because, let's face it, oftentimes you need and want to be able to:
See if a record exists with a certain unique or primary key
If it does not, insert it
If it does, either ignore it or update it.
I don't understand how they would let a product out the door without this capability built-in in an easy-to-use manner. Like, say, set an attribute of a component to automatically check this.
