Why is the NOT LOGGED INITIALLY option faster for data movement than the LOAD utility in DB2?
NOT LOGGED INITIALLY method:
db2 alter table tablename activate not logged initially
db2 insert into tablename select * from tbname
Load Utility:
db2 declare source cursor for select * from tbname
db2 load from source of cursor insert into tablename nonrecoverable
Based on your database size and performance question I am going to assume that you are using the Database Partitioning Feature (DPF) with DB2.
When you perform INSERT INTO ... SELECT, this occurs in parallel on all database partitions – each partition is working independently. With logging turned off, this will be quite fast (albeit dangerous – if there is a problem, the not-logged-initially table will have to be dropped and recreated).
When you use LOAD FROM CURSOR, all of the database partitions execute the SELECT statement and return the rows to the coordinator partition, which then feeds them into the LOAD utility. The LOAD utility then performs hash partitioning to send the data back to all of the database partitions again. As you can imagine, with a large volume of data, shipping all of this data back and forth can be quite inefficient.
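One practical note on the first method: NOT LOGGED INITIALLY only stays in effect for the unit of work in which the ALTER is issued, so from the CLP you need to suppress autocommit (the +c option) and commit explicitly at the end, roughly like this (table names as in the question):
db2 +c "alter table tablename activate not logged initially"
db2 +c "insert into tablename select * from tbname"
db2 "commit"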
I have a SQL Server instance which contains multiple databases. I have 2 tables which exist in all databases on the server:
Refresh Log
Detailed refresh log.
I want to union all the tables across all databases on the server so the final result will be 2 tables which are the union refresh log and detailed refresh log.
I need help to write the function which runs across all databases.
I'm also a little uncertain as to what you're hoping for: for example, whether you need the resulting output in two permanent tables or just need the result when queried. Of course, once you build your SELECT you can return it to the caller or put it into a table, so I'll leave that up to you.
If your databases are unchanging, then of course you can just write your query and maybe put it into a VIEW for convenience:
SELECT columns from database1.dbo.RefreshLog
UNION ALL
SELECT columns from database2.dbo.RefreshLog
...
and so on
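If you want this as a reusable object, a minimal sketch of wrapping that query in a view (the name dbo.AllRefreshLogs is made up, and the literal SourceDatabase column is just a convenience for telling the rows apart):
CREATE VIEW dbo.AllRefreshLogs
AS
SELECT 'database1' AS SourceDatabase, r.* FROM database1.dbo.RefreshLog AS r
UNION ALL
SELECT 'database2' AS SourceDatabase, r.* FROM database2.dbo.RefreshLog AS r;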
But if you're saying that your databases are themselves dynamic, in other words that databases may be created or dropped over the lifetime of your project, then you could consider using the "undocumented" procedure sp_msforeachdb to build up a list of databases, and then use THAT list to build your UNION query. Here's a quick script that captures the names of all databases that include a specific table ("Products" in the example):
IF object_id('tempdb..#DatabaseNames') IS NOT NULL
DROP TABLE #DatabaseNames
CREATE TABLE #DatabaseNames (DatabaseName SYSNAME)
execute sp_msforeachdb @command1=
N'IF EXISTS(SELECT * FROM [?].sys.tables WHERE Name = ''Products'')
INSERT #DatabaseNames VALUES(N''Database [?]'')'
SELECT * FROM #DatabaseNames
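From there, a rough sketch of stitching the captured list into a single UNION ALL query and executing it (this assumes the INSERT above stores just the bare database name, '?', rather than the 'Database [?]' label, and that each of those databases has dbo.RefreshLog):
DECLARE @sql nvarchar(max) = N'';

-- Concatenate one SELECT per database, separated by UNION ALL
SELECT @sql = @sql
    + CASE WHEN @sql = N'' THEN N'' ELSE N' UNION ALL ' END
    + N'SELECT * FROM ' + QUOTENAME(DatabaseName) + N'.dbo.RefreshLog'
FROM #DatabaseNames;

EXEC sp_executesql @sql;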
I have a database called mbt. I wanted to write some data from a temporary table to a real table.
--I used this query.
SELECT * INTO new_table FROM #tmp
When I ran the query, it returned the normal message:
(15813 row(s) affected)
After that I checked the tables in the mbt database, but I couldn't see 'new_table'.
How could such a thing happen, and where might the table have gone?
I may have forgotten to put a 'USE mbt' statement at the beginning of the query. Could that be the problem?
I'm using MS SQL Server 2014 (SP2) (KB3171021) - 12.0.5000.0 (X64).
ANSWER
It went to the master DB.
select 'master' as DatabaseName,
T.name collate database_default as TableName
from master.sys.tables as T
SELECT ... INTO creates a new table in whatever database your session is currently using. Since you did not run USE mbt first, the table was stored in the master database on your server.
Run the query below to find databases which have the object new_table:
sp_MSForEachDB 'Use [?] IF EXISTS (SELECT 1 FROM sys.objects WHERE name= ''new_table'')
SELECT DB_NAME()'
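Once you have found it (most likely in master, as above), a minimal sketch of moving the data back into mbt and cleaning up, assuming the stray table really is master.dbo.new_table:
USE mbt;
SELECT * INTO dbo.new_table FROM master.dbo.new_table;
DROP TABLE master.dbo.new_table;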
I had the same problem. What I did was rewrite the USE <database> statement and then refresh the database browser; after that I got the result. You can try it; maybe it will help you.
Always use the command "USE db_name" to make sure that you are querying the right database.
The command below will list all databases available on the SQL Server instance:
SELECT name FROM sys.databases;
If you are using a GUI tool to connect to the DB server, there is a possibility that at the time of connection you got connected to a different DB. If you then executed the query that created the table and inserted the records, those records were inserted into a new table in a different DB than mbt.
I am looking to set up a high availability architecture whereby two mirror databases exist (DB1 & DB2) that serve another database with views (DBV) on it. DB1 has the overnight ETL on it, whilst DBV looks at DB2 until the etl is complete on DB1, at which point its views switch to the underlying tables on DB1. Once the ETL is complete on DB1, DB2 is restored with DB1 data before the next day's ETL. The next day, DB1 and DB2 switch roles.
I am looking for a neater/more secure way of switching between the two views than running sp_executesql to run a dynamically built string. I will be looking to also do this on stored procedures from a staging database which need to have their scripts dynamically altered to use the correct database to run the ETL on. Essentially, I am looking to pass the USE statement dynamically and then execute the rest of the script outside of any dynamic statement.
I want to avoid sp_executesql for support reasons for other developers and also to get around any possible extensive concatenation of strings if the stored procedure/view gets particularly lengthy.
Any ideas / different approaches to high availability in this context would be welcome.
One option might be to create a copy of each view in DBV for both target databases - i.e.
some_schema.DB1_myview
some_schema.DB2_myview
and then use a synonym to expose the views under their final names.
CREATE SYNONYM some_schema.myview FOR some_schema.DB1_myview
Your switch process would then need only to drop and recreate the synonyms, rather than the views themselves. This would still need to be done with a dynamic SQL statement, but the complexity would be much lower.
A downside would be that there would be a risk of the definitions of the underlying views getting out of sync.
Edit
At the cost of more risk of getting out of sync, it would be possible to avoid dynamic SQL altogether by creating (for instance) a pair of stored procedures each of which generated the synonyms for one database or the other. Your switch code would then only need to work out which procedure to call.
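For instance, a rough sketch of that pair of procedures (the procedure names and the single myview synonym are illustrative; in practice each procedure would recreate every synonym):
CREATE PROCEDURE some_schema.usp_PointToDB1
AS
BEGIN
    IF OBJECT_ID('some_schema.myview', 'SN') IS NOT NULL
        DROP SYNONYM some_schema.myview;
    CREATE SYNONYM some_schema.myview FOR some_schema.DB1_myview;
END
GO
CREATE PROCEDURE some_schema.usp_PointToDB2
AS
BEGIN
    IF OBJECT_ID('some_schema.myview', 'SN') IS NOT NULL
        DROP SYNONYM some_schema.myview;
    CREATE SYNONYM some_schema.myview FOR some_schema.DB2_myview;
END
The nightly switch then only has to call whichever procedure points at the database that has just finished its ETL.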
Have you considered renaming the databases as you switch things around? I.e. the following prints 1 followed by 2, and nothing in DBV had to be modified:
create database DB1
go
use DB1
go
create table T (ID int not null);
insert into T(ID) values (1);
go
create database DB2
go
use DB2
go
create table T (ID int not null);
insert into T(ID) values (2);
go
create database DBV
go
use DBV
go
create view V
as
select ID
from DB1..T
go
select * from V
go
alter database DB1 modify name = DBt
go
alter database DB2 modify name = DB1
go
alter database DBt modify name = DB2
go
select * from V
Obviously better names than 1 and 2 may be used. This way, DB1 is always the one used for live and DB2 is used for any staging work.
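One caveat: the rename can fail if anything is still connected to the database being renamed, so the switch script would probably need to force exclusive access around each rename, roughly:
ALTER DATABASE DB1 SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
ALTER DATABASE DB1 MODIFY NAME = DBt;
ALTER DATABASE DBt SET MULTI_USER;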
We're currently working on the following process whose goal is to move data between 2 sets of database servers while maintaining FK's and handling the fact that the destination tables already have rows with overlapping identity column values:
Extract a set of rows from a "root" table and all of its child tables' FK-associated data, n levels deep, along with related rows that may reside in other databases on the same instance, from the source database server.
Place that extracted data set into a set of staging tables on the destination database server.
Rekey the data in the staging tables by reserving a block of identities for the destination tables and updating all related child staging tables (each of these staging tables will have the same schema as the source/destination table with the addition of a "lNewIdentityID" column).
Insert the data with its new identity into the destination tables in correct order (option SET IDENTITY_INSERT 'desttable' ON will be used obviously).
I'm struggling with the block reservation portion of this process (#3). Our system is pretty much a 24-hour system except for a short weekly maintenance window. Management needs this process to NOT have to wait each week for the maintenance window to migrate data between servers. That being said, I may have 100 insert transactions competing with our migration process while it is on #3. Below is my rough attempt at reserving the block of identities, but I'm worried that between "SET @newIdent..." and "DBCC CHECKIDENT..." an insert transaction will complete and the migration process won't have a "clean" block of identities in a known range that it can use to rekey the staging data.
I essentially need to lock the table, get the current identity, increase the identity, and then unlock the table. I don't know how to do that in T-SQL and am looking for ideas. Thank you.
IF EXISTS (SELECT TOP 1 1 FROM sys.procedures WHERE [name]='DataMigration_ReserveBlock')
DROP PROC DataMigration_ReserveBlock
GO
CREATE PROC DataMigration_ReserveBlock (
@tableName varchar(100),
@blockSize int
)
AS
BEGIN
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
DECLARE @newIdent bigint;
SET @newIdent = @blockSize + IDENT_CURRENT(@tableName);
DBCC CHECKIDENT (@tableName, RESEED, @newIdent);
SELECT @newIdent AS NewIdentity;
END
GO
DataMigration_ReserveBlock 'tblAddress', 1234
You could wrap it in a transaction
BEGIN TRANSACTION
...
COMMIT
It should be fast enough to not cause problems with your other insert processes. Though it would be a good idea to include try / catch logic so you could rollback if problems do occur.
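As a sketch of what that could look like applied to the proc above (assuming SQL Server 2012+ for THROW and a table name passed without a schema prefix), with an exclusive table lock added on top of the plain transaction so competing inserts can't consume identities between the IDENT_CURRENT read and the reseed:
ALTER PROC DataMigration_ReserveBlock (
    @tableName sysname,
    @blockSize int
)
AS
BEGIN
    SET XACT_ABORT ON;
    BEGIN TRY
        BEGIN TRANSACTION;
        -- Take an exclusive table lock so no other session can insert
        -- (and consume identities) until we commit.
        DECLARE @lockSql nvarchar(400) =
            N'DECLARE @dummy int; SELECT TOP (1) @dummy = 1 FROM '
            + QUOTENAME(@tableName) + N' WITH (TABLOCKX, HOLDLOCK);';
        EXEC sp_executesql @lockSql;
        DECLARE @newIdent bigint = @blockSize + IDENT_CURRENT(@tableName);
        DBCC CHECKIDENT (@tableName, RESEED, @newIdent);
        COMMIT;
        SELECT @newIdent AS NewIdentity;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK;
        THROW;
    END CATCH
END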
Another developer and I are discussing which type of table would be more appropriate for our task. It's basically going to be a cache that we're going to truncate at the end of the day. Personally, I don't see any reason to use anything other than a normal table for this, but he wants to use a global temp table.
Are there any advantages to one or the other?
Use a normal table in tempdb if this is just transient data that you can afford to lose on service restart, or a user database if the data is not that transient.
tempdb is slightly more efficient in terms of logging requirements.
Global temp tables get dropped once the connection that created the table is closed and no other connections are still referencing it.
Edit: Following @cyberkiwi's edit. BOL does explicitly say
Global temporary tables are visible to any user and any connection after they are created, and are deleted when all users that are referencing the table disconnect from the instance of SQL Server.
In my test, though, I wasn't able to get this behaviour.
Connection 1
CREATE TABLE ##T (i int)
INSERT INTO ##T values (1)
SET CONTEXT_INFO 0x01
Connection 2
INSERT INTO ##T VALUES(4)
WAITFOR DELAY '00:01'
INSERT INTO ##T VALUES(5)
Connection 3
SELECT OBJECT_ID('tempdb..##T')
declare @killspid varchar(10) = (select 'kill ' + cast(spid as varchar(5)) from sysprocesses where context_info=0x01)
exec (@killspid)
SELECT OBJECT_ID('tempdb..##T') /* NULL - but connection 2 is still running, let alone disconnected! */
Global temp table
-ve: As soon as the connection that created the table goes out of scope, it takes the table with it. This is damaging if you use connection pooling, which can swap connections constantly and possibly reset it
-ve: You need to keep checking whether the table already exists (e.g. after a restart) and create it if not (a sketch of that check appears at the end of this answer)
+ve: Simple logging in tempdb reduces I/O and CPU activity
Normal table
+ve: Normal logging keeps your cache together with your main db. If your "cache" is maintained but still mission critical, this keeps it consistent with the db
-ve: Following from the above, more logging
+ve: The table is always around, and for all connections
If the cache is something like a quick lookup summary for business/critical data, even if it is reset/truncated at the end of the day, I would prefer to keep it as a normal table in the db proper.
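As a rough illustration of the existence check mentioned above for the global temp table option (the name ##AppCache and its columns are made up for the example):
-- Recreate the shared cache table if it is not already there
-- (e.g. after a service restart or after the creating connection went away).
IF OBJECT_ID('tempdb..##AppCache') IS NULL
    CREATE TABLE ##AppCache
    (
        CacheKey   varchar(100) NOT NULL PRIMARY KEY,
        CacheValue varchar(4000) NULL,
        LoadedAt   datetime NOT NULL DEFAULT (GETDATE())
    );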