I have a requirement where data from 150 tables, each with different columns, should be copied into another table that has all of these columns. I need a script that will do this automatically instead of my writing the inserts one by one.
Any suggestions?
You can get the column names and datatypes from either sys.columns or INFORMATION_SCHEMA.COLUMNS; then it's just a matter of de-duplicating the columns (by name) and resolving any conflicts between differing datatypes to create your destination table.
Once you have that, you can generate and execute all your INSERT statements.
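For example, here is a minimal sketch of generating those INSERT statements from the metadata. It assumes SQL Server 2017+ for STRING_AGG; the destination table name, the LIKE filter picking out the 150 source tables, and the assumption that the destination column names match the source column names are all made up for illustration:
SELECT 'INSERT INTO dbo.Destination (' + ca.cols + ') SELECT ' + ca.cols +
       ' FROM ' + QUOTENAME(t.TABLE_SCHEMA) + '.' + QUOTENAME(t.TABLE_NAME) + ';'
FROM INFORMATION_SCHEMA.TABLES AS t
CROSS APPLY (
    -- column list for this table; assumes the destination uses the same column names
    SELECT STRING_AGG(CONVERT(nvarchar(max), QUOTENAME(c.COLUMN_NAME)), ', ') AS cols
    FROM INFORMATION_SCHEMA.COLUMNS AS c
    WHERE c.TABLE_SCHEMA = t.TABLE_SCHEMA
      AND c.TABLE_NAME = t.TABLE_NAME
) AS ca
WHERE t.TABLE_TYPE = 'BASE TABLE'
  AND t.TABLE_NAME LIKE 'Source%';  -- hypothetical filter selecting the 150 tables
You can then paste the generated statements into a query window and run them, or execute each one with sp_executesql.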
Good luck.
I have a table A with 3000 records and 25 columns. I want a history table called Table A history holding all the inserts, updates and deletes, so I can look up the state on any given day. I usually use cursors. I thought about using triggers, but I was asked not to use them. Do you have any other suggestions? Many thanks!
If you're using T-SQL/SQL Server and you can't use triggers (which are the only sure way to capture every change), one option is a stored procedure, scheduled in a job to run every x amount of time, that uses a MERGE statement between the two tables to pick up new or changed records. I would not suggest this if you need every single change without question.
CREATE TABLE dbo.TableA (id INT, Column1 nvarchar(30))
CREATE TABLE dbo.TableA_History (id INT, Column1 nvarchar(30), TimeStamp DateTime)
(this code isn't production, just the general idea)
Put the following code inside a stored procedure and use a SQL Server Agent job with a schedule on it.
MERGE INTO dbo.TableA_History
USING dbo.TableA
ON TableA_History.id = TableA.id AND TableA_History.Column1 = TableA.Column1
WHEN NOT MATCHED BY TARGET THEN
INSERT (id, Column1, TimeStamp) VALUES (TableA.id, TableA.Column1, GETDATE());
So basically: if the record either doesn't exist or doesn't match (meaning a column changed), insert it into the history table.
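For completeness, a minimal sketch of that wrapper; the procedure name is made up, and you would schedule it with a SQL Server Agent job:
CREATE PROCEDURE dbo.usp_SyncTableA_History
AS
BEGIN
    SET NOCOUNT ON;

    -- same MERGE as above: copy rows that are new or have changed
    MERGE INTO dbo.TableA_History
    USING dbo.TableA
    ON TableA_History.id = TableA.id AND TableA_History.Column1 = TableA.Column1
    WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, Column1, TimeStamp) VALUES (TableA.id, TableA.Column1, GETDATE());
END;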
It is possible to create history without triggers in some cases, even if you are not on SQL Server 2016 and system-versioned tables are not available.
In some cases, when you can identify for sure which routines modify your table, you can create history using the OUTPUT INTO clause.
For example,
INSERT INTO [dbo].[MainTable]
OUTPUT inserted.[Column1] -- hypothetical column name; list the table's columns here
      ,...
      ,'I'                -- action flag: 'I' for insert
      ,GETUTCDATE()
      ,@CurrentUserID
INTO [dbo].[HistoryTable]
SELECT *
FROM ... ;
In routines, when you are using MERGE, I like that we can use $action:
Is available only for the MERGE statement. Specifies a column of type
nvarchar(10) in the OUTPUT clause in a MERGE statement that returns
one of three values for each row: 'INSERT', 'UPDATE', or 'DELETE',
according to the action that was performed on that row.
It's very handy that we can record which user is modifying the table. With triggers you need to use session context or a session variable to pass the user in. With system-versioned tables you need to add an additional column to the main table in order to log the user, as versioning only logs the current table columns (at least for now).
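For illustration, here is a hedged sketch of a MERGE that logs the action and the user into the history table; all table, column and variable names are made up:
DECLARE @CurrentUserID int = 42;  -- hypothetical: supplied by the calling application

MERGE INTO dbo.MainTable AS t
USING dbo.StagingTable AS s
    ON t.ID = s.ID
WHEN MATCHED THEN
    UPDATE SET t.Column1 = s.Column1
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ID, Column1) VALUES (s.ID, s.Column1)
OUTPUT inserted.ID
      ,inserted.Column1
      ,$action            -- 'INSERT', 'UPDATE' or 'DELETE'
      ,GETUTCDATE()
      ,@CurrentUserID
INTO dbo.HistoryTable (ID, Column1, Action, ModifiedAtUtc, ModifiedByUserID);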
So, basically, it depends on your data and application. If you have many sources of CRUD over the table, a trigger is the most secure way. If your table is very big and heavily used, MERGE is not a good fit, as it may cause blocking and harm performance.
In our databases we use all of these methods, depending on the situation:
triggers for legacy code
system-versioning for new development
direct OUTPUT into the history table, when we are sure the data is modified only by a given set of routines
I have database tables with no identity column. We have client data fetched from DB2 into SQL Server, and unfortunately the DB2 design doesn't include identity columns.
Now some data is inserted, updated and deleted at the source (DB2/SQL Server), and I want to load those changes into the destination (SQL Server) using some incremental-load concept.
I tried SSIS Lookups in a Data Flow task; however, it takes a huge amount of time simply to insert one new record. Note that in the Lookup Transformation Editor I'm mapping all available input columns to all available lookup columns, since there is no identity column; I think this is why it's so slow. A few of the tables have around 20 million records.
Is there a faster method or approach, especially when the table has no identity column? Would EXCEPT or a SQL MERGE help?
I'm open to any approach other than SSIS.
Lookup in SSIS takes some time, so you can use an Execute SQL Task and call merge procedures instead.
What you can do is use a merge procedure: match source to destination on the primary key, update matched records and insert new ones. You can also maintain a column that flags which records were updated. Something like:
MERGE Destination AS d
USING Source AS s
    ON s.PrimaryKey = d.PrimaryKey      -- hypothetical key column(s)
WHEN MATCHED THEN
    UPDATE SET d.Column1 = s.Column1,   -- copy the changed values across
               d.UpdatedRecord = 1      -- hypothetical flag marking updated rows
WHEN NOT MATCHED THEN
    INSERT (PrimaryKey, Column1, UpdatedRecord)
    VALUES (s.PrimaryKey, s.Column1, 0);
With the above query, new records are inserted, and with the help of the UpdatedRecord column you can tell which existing rows were updated in your destination table.
See the following links for more on MERGE procedures:
https://www.sqlservercentral.com/Forums/Topic1042053-392-1.aspx
https://msdn.microsoft.com/en-us/library/bb510625.aspx
If your source is a SQL query (from DB2, for instance), try adding a new column to it: a checksum value computed over the columns you expect to change or want to monitor for changes.
SELECT
BINARY_CHECKSUM(
Column1
,Column2
,Column3) AS ChecksumValue
,Column1
,Column2
,Column3
FROM #TEMP
You would have to add this column to your existing SQL Server table as well to be able to start comparing.
Once you have it, you can do the lookup on the checksum value rather than on the columns, since numeric lookups are a lot quicker than varchar comparisons over multiple columns. Since there is no key, you would then split the data between checksum matches (which should be the unchanged existing records) and non-matches. The non-matches could be new rows or updates, but the set you have to work with should be much smaller.
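Here is a hedged sketch of that split; the staging and destination names are made up, and note that BINARY_CHECKSUM can collide, so treat this as change detection rather than proof of equality:
-- rows whose checksum has no match in the destination: new or changed rows
SELECT s.*
INTO #NewOrChanged
FROM #SourceStaging AS s
LEFT JOIN dbo.DestTable AS d
    ON d.ChecksumValue = s.ChecksumValue
WHERE d.ChecksumValue IS NULL;  -- matched rows are assumed unchanged and skipped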
Good luck. HTH
Background: I am trying to write an AFTER UPDATE trigger which stores the changed values dynamically in another table. The trigger should be generic, easy to transfer to other tables, and keep working if I add columns later. (If my whole code is required to solve this, I'll update the question.)
While trying to do this, I ran into the following issue: I want to store the inserted table in a temporary table, which I do this way:
SELECT *
INTO #tempINSERTED
FROM INSERTED
But the original table contains both ntext and timestamp columns, which aren't allowed in temporary tables.
Another approach I tried was looping through the system view INFORMATION_SCHEMA.COLUMNS and building a SQL statement as a string, excluding the non-copyable columns, but that way I cannot access the inserted table; I already figured out that inserted is not accessible from sp_executesql.
So my question: is there a way to access the inserted table while excluding non-copyable columns such as ntext, text and image?
Thanks in advance
You want the triggers to run fast, so the better approach is to generate the CREATE TRIGGER code rather than looping through the fields inside the trigger itself. If the table schema changes, you will then need to regenerate the trigger.
For your #tempINSERTED table you can use nvarchar(max) in place of ntext, varchar(max) in place of text, and varbinary(max) in place of image. You can also use binary(8) or bigint in place of timestamp.
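A sketch of such a copy inside the generated AFTER trigger; the column names are made up. Note that text, ntext and image columns of inserted cannot be referenced at all in an AFTER trigger (per the CREATE TRIGGER documentation), so they have to be left out unless the base columns are first converted to the max types as suggested above:
SELECT id
      ,SomeColumn                           -- ordinary columns copy as-is
      ,CAST(RowVer AS binary(8)) AS RowVer  -- timestamp -> binary(8)
INTO #tempINSERTED
FROM inserted;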
I would suggest using a table variable instead of a #temp table, i.e.:
DECLARE @tempTable TABLE (
    fieldname int -- and so on
);
I have a situation where I need to change the order of the columns, and to add new columns, for an existing table in SQL Server 2008.
Existing order:
MemberName
MemberAddress
Member_ID(pk)
and I want this order:
Member_ID(pk)
MemberName
MemberAddress
I got the answer for this:
1- Go to SQL Server → Tools → Options → Designers → Table and Database Designers and unselect "Prevent saving changes that require table re-creation".
2- Open the table in design view, drag your columns up and down, and save your changes.
It is not possible with an ALTER statement. If you want the columns in a specific order, you will have to create a new table, use INSERT INTO newtable (col-x, col-a, col-b) SELECT col-x, col-a, col-b FROM oldtable to transfer the data from the old table to the new one, drop the old table, and rename the new table to the old table's name.
This is not necessarily recommended, because the order of the columns in the database table does not matter. When you use a SELECT statement, you can name the columns and have them returned in the order you desire.
If your table doesn't have any records, you can just drop and re-create it.
If it has records, you can do it in SQL Server Management Studio: right-click your table > click Design, then arrange the columns by dragging the fields into the order you want and click Save.
Best Regards
I tried this and don't see any way of doing it directly.
Here is my approach for it:
Right-click the table and script it for CREATE, and keep the script in a SQL query window.
EXEC sp_rename 'Employee', 'Employee1'; -- original table name is Employee
Execute the Employee CREATE script, making sure you arrange the columns in the way you need.
INSERT INTO TABLE2 SELECT ... FROM TABLE1; -- list the columns in the new order
-- e.g. INSERT INTO Employee SELECT Name, Company FROM Employee1
DROP TABLE Employee1;
Relying on column order is generally a bad idea in SQL. SQL is based on relational theory, where order is never guaranteed, by design. You should treat all your columns and rows as having no order and write your queries to provide the correct results:
For Columns:
Try not to use SELECT *; instead, specify the order of columns in the select list, as in: SELECT Member_ID, MemberName, MemberAddress FROM TableName. This guarantees the order and will ease maintenance if columns get added.
For Rows:
Row order in your result set is only guaranteed if you specify an ORDER BY clause.
If no ORDER BY clause is specified, the result set may differ, because the query plan or the layout of the database pages might have changed.
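For example, using the table from the question:
SELECT Member_ID, MemberName, MemberAddress
FROM TableName
ORDER BY Member_ID;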
Hope this helps...
This can be an issue when using Source Control and automated deployments to a shared development environment. Where I work we have a very large sample DB on our development tier to work with (a subset of our production data).
Recently I did some work to remove one column from a table and then add some extra ones on the end. I then had to undo the column removal, so I re-added it on the end, which means the table and all references are correct in the environment, but the Source Control automated deployment no longer works, because it complains about the table definition changing.
The real problem here is that the table + indexes are ~120GB and the environment only has ~60GB free so I'll need to either:
a) Rename the existing columns which are in the wrong order, add new columns in the right order, update the data then drop the old columns
OR
b) Rename the table, create a new table with the correct order, insert into the new table from the old, and delete from the old as I go along
The SSMS/TFS schema compare option of using a temp table won't work, because there isn't enough room on disk to do it.
I'm not trying to say this is the best way to go about things, or that column order really matters; just that I have a scenario where it is an issue, and I'm sharing the options I've thought of to fix it.
SQL query to move the id column into first position (note: this is MySQL syntax; SQL Server's ALTER TABLE has no equivalent option):
ALTER TABLE `student` CHANGE `id` `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT FIRST;
or, to place it after a specific column:
ALTER TABLE `student` CHANGE `id` `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT AFTER `column_name`;
What is the difference between using
SELECT ... INTO MyTable FROM...
and
INSERT INTO MyTable (...)
SELECT ... FROM ....
?
From BOL [ INSERT, SELECT...INTO ], I know that using SELECT...INTO will create the insertion table on the default file group if it doesn't already exist, and that the logging for this statement depends on the recovery model of the database.
Which statement is preferable?
Are there other performance implications?
What is a good use case for SELECT...INTO over INSERT INTO ...?
Edit: I already stated that I know that SELECT INTO... creates a table where one doesn't exist. What I want to know is: SQL includes this statement for a reason; what is it? Is it doing something different behind the scenes when inserting rows, or is it just syntactic sugar on top of a CREATE TABLE and INSERT INTO?
They do different things. Use INSERT when the table exists. Use SELECT INTO when it does not.
Yes. INSERT with no table hints is normally fully logged. SELECT INTO is minimally logged, assuming the database is in the simple or bulk-logged recovery model.
In my experience SELECT INTO is most commonly used with intermediate data sets, like #temp tables, or to copy out an entire table like for a backup. INSERT INTO is used when you insert into an existing table with a known structure.
EDIT
To address your edit: they do different things. If you are making a table and want to define the structure, use CREATE TABLE and INSERT. Example of an issue that can be created: you have a small table with a varchar(12) field, while your real data set will need up to 200 bytes. If you do SELECT INTO from your small table to make a new one, the new table inherits the varchar(12) definition, and the later INSERT of your real data will fail with a truncation error.
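A minimal sketch of that failure mode, with made-up names:
CREATE TABLE dbo.Small (Val varchar(12));
INSERT INTO dbo.Small VALUES ('twelve chars');

-- NewTable inherits varchar(12) from the source column
SELECT Val INTO dbo.NewTable FROM dbo.Small;

-- Fails: "String or binary data would be truncated."
INSERT INTO dbo.NewTable VALUES (REPLICATE('x', 200));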
Which statement is preferable? Depends on what you are doing.
Are there other performance implications? If the table is a permanent table, you can create indexes at the time of table creation, which has both positive and negative implications for performance. SELECT INTO does not recreate the indexes that exist on the source table, so subsequent use of the new table may be slower than it needs to be.
What is a good use case for SELECT...INTO over INSERT INTO ...? SELECT INTO is used when you may not know the table structure in advance. It is faster to write than CREATE TABLE plus an INSERT statement, so it is sometimes used to speed up development. It is often used for a quick temp table to test things, or for a backup table of a specific query, perhaps records you are going to delete (see the sketch below). It should be rare to see it used in production code that runs multiple times (except for temp tables), because it will fail if the table already exists.
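For instance, a quick throwaway backup before a delete; all names here are hypothetical:
-- keep a copy of the rows about to be removed
SELECT *
INTO dbo.Orders_CancelledBackup
FROM dbo.Orders
WHERE Status = 'Cancelled';

DELETE FROM dbo.Orders
WHERE Status = 'Cancelled';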
It is sometimes used inappropriately by people who don't know what they are doing, and they can cause havoc in the DB as a result. I strongly feel it is inappropriate to use SELECT INTO for anything other than a throwaway table (a temporary backup, a temp table that goes away at the end of the stored proc, etc.). Permanent tables need real thought as to their design, and SELECT INTO makes it easy to avoid thinking about anything even as basic as which columns and which datatypes to use.
In general, I prefer the use of a CREATE TABLE and an INSERT statement: you have more control, and it is better for repeatable processes. Further, if the table is a permanent table, it should be created from a separate CREATE TABLE script (one that is in source control), as creating permanent objects should not, in general, live in code that inserts, deletes, updates or selects from a table. Object changes should be handled separately from data changes, because objects have implications beyond the needs of a specific insert/update/select/delete. You need to consider the best data types, think about FK constraints, PKs and other constraints, consider auditing requirements, think about indexing, and so on.
Each statement has a distinct use case. They are not interchangeable.
SELECT...INTO MyTable... creates a new MyTable where one did not exist before.
INSERT INTO MyTable...SELECT... is used when MyTable already exists.
The primary difference is that SELECT INTO MyTable will create a new table called MyTable with the results, while INSERT INTO requires that MyTable already exists.
You would use SELECT INTO only in the case where the table didn't exist and you wanted to create it based on the results of your query. As such, these two statements really are not comparable. They do very different things.
In general, SELECT INTO is used more often for one off tasks, while INSERT INTO is used regularly to add rows to tables.
EDIT:
While you can use CREATE TABLE and INSERT INTO to accomplish what SELECT INTO does, with SELECT INTO you do not have to know the table definition beforehand. SELECT INTO is probably included in SQL because it makes tasks like ad hoc reporting or copying tables much easier.
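Side by side, assuming a hypothetical dbo.MyTable:
-- creates dbo.MyTable_Copy on the fly; fails if it already exists
SELECT * INTO dbo.MyTable_Copy FROM dbo.MyTable;

-- requires dbo.MyTable_Copy to exist already
INSERT INTO dbo.MyTable_Copy SELECT * FROM dbo.MyTable;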
Actually, SELECT ... INTO not only creates the table, it will fail if the table already exists, so basically the only time you would use it is when the table you are inserting into does not exist.
In regards to your EDIT:
I personally mainly use SELECT ... INTO when I am creating a temp table; that, to me, is the main use. I also use it when creating new tables with many columns that have structures similar to other tables, and then edit the result to save time.
I only want to cover the second point of the question, the one related to performance, because nobody else has covered it. SELECT INTO is a lot faster than INSERT INTO when it comes to tables with large data sets. I prefer SELECT INTO when I have to copy a very large table: INSERT INTO for a table with 10 million rows may take hours, while SELECT INTO does it in minutes. And as far as losing the indexes on the new table is concerned, you can recreate them afterwards and still save a lot of time compared to INSERT INTO.
SELECT INTO is typically used to generate temp tables or to copy another table (data and/or structure).
In day to day code you use INSERT because your tables should already exist to be read, UPDATEd, DELETEd, JOINed etc. Note: the INTO keyword is optional with INSERT
That is, applications won't normally create and drop tables as part of normal operations unless it is a temporary table for some scope limited and specific usage.
A table created by SELECT INTO will have no keys or indexes or constraints unlike a real, persisted, already existing table
The 2 aren't directly comparable because they have almost no overlap in usage
SELECT INTO creates a new table for you at run time and then inserts records into it from the source table. The newly created table has the same structure as the source table. If you try to use SELECT INTO with an existing table, it produces an error, because it tries to create a new table with the same name.
INSERT INTO requires the table to exist in your database before you insert rows into it.
The simple difference between SELECT INTO and INSERT INTO is:
--> SELECT INTO doesn't need an existing table. If you want to copy table A's data, you just type SELECT * INTO [tablename] FROM A. Here, tablename must be a new table; it will be created with the same structure as table A.
--> INSERT INTO does need an existing table: INSERT INTO [tablename] SELECT * FROM A;.
Here tablename is an existing table.
SELECT INTO is usually more popular for copying data, especially backup data.
You can use either as per your requirement; it is totally the developer's choice.
Performance-wise, SELECT INTO is generally the faster of the two, since it can be minimally logged (see the other answers above).
References :
https://www.w3schools.com/sql/sql_insert_into_select.asp
https://www.w3schools.com/sql/sql_select_into.asp
The other answers are all great/correct (the main difference is whether the DestTable exists already (INSERT), or doesn't exist yet (SELECT ... INTO))
You may prefer to use INSERT (instead of SELECT ... INTO), if you want to be able to COUNT(*) the rows that have been inserted so far.
Using SELECT COUNT(*) ... WITH (NOLOCK) is a simple/crude technique that may help you check the "progress" of the INSERT; helpful if it's a long-running insert, as seen in this answer.
[If you use...]
INSERT DestTable SELECT ... FROM SrcTable
...then your SELECT COUNT(*) from DestTable WITH (NOLOCK) query would work.
SELECT INTO for large data sets may be good only for a single user, on a single connection, doing a bulk-operation task. I do not recommend using
SELECT * INTO table
as this creates one big transaction and takes a schema lock to create the object, preventing other users from creating objects or accessing system objects until the SELECT INTO operation completes.
As a proof of concept, open two sessions. In the first session, try to
select into a temp table from a huge table
and in the second session, try to
create a temp table
and check the locks, the blocking, and how long the second session takes to create its temp table object. My recommendation: it is always good practice to use CREATE TABLE and INSERT statements, and if minimal logging is needed, to use trace flag 610.
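To observe the blocking from a third session while that experiment runs, a sketch against the standard locks DMV can show who is waiting:
SELECT request_session_id
      ,resource_type
      ,request_mode
      ,request_status
FROM sys.dm_tran_locks
WHERE request_status = 'WAIT';  -- sessions currently blocked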