I am testing a backup utility against Oracle Database 19c, and I'm trying to generate archive logs so that the utility can delete them once they reach a certain size.
Is there a quick way to generate redo by artificially creating load, so that the redo logs are switched and archived?
One way to generate redo is the following.
Make sure the tablespace is in logging mode, or that the table was created with the logging option. If the tablespace was created without specifying a logging mode, the default is logging. The same applies to the table.
To avoid storage issues, truncate the table at the end of each outer loop iteration.
Make sure you have enough space for at least one loop iteration so that you don't run out of space.
In the example below, I put the tablespace in force logging mode, although it is not necessary. Then I create a test table with just three columns; you can use as many as you want, but remember that you need storage for at least one iteration.
I use dbms_random to generate random string values.
Example
alter tablespace users force logging; -- if the tablespace has nologging
create table x ( c1 number, c2 varchar2(50), c3 varchar2(50) ) logging tablespace users ; -- table in logging mode
declare
  num_loops pls_integer := 10; -- use as many iterations as you want
begin
  for r in 1 .. num_loops
  loop
    for h in 1 .. 100000 -- 100k rows per outer iteration, to avoid undo issues
    loop
      insert into x values ( h, dbms_random.string('X',50), dbms_random.string('X',50) );
    end loop;
    commit;
    execute immediate 'truncate table x reuse storage';
  end loop;
end;
/
I've already seen a dozen such questions, but most of them get answers that don't apply to my case.
First off, the database I am trying to get the data from is on a very slow network and is reached over a VPN.
I am accessing it through a database link.
I have full read/write access on my schema tables, but I don't have DBA rights, so I can't create dumps, and I don't have grants to create new tables etc.
I've been trying to copy the database locally and all is well except for one table.
It has 6.5 million records and 16 columns.
There was no problem getting 14 of them, but the remaining two are CLOBs with huge XML in them.
The data transfer is painfully slow.
I tried:
- insert based on select
- insert all 14 columns, then update the other 2
- create table as
- insert based on a conditional select, so I get only so many records and commit manually
The issue is mainly that the connection is lost before the transaction finishes (or power loss or VPN drops or random error etc) and all the GBs that have been downloaded are discarded.
As I said I tried putting conditionals so I get a few records but even this is a bit random and requires focus from me.
Something like :
Insert into TableA
Select * from TableA@DB_RemoteDB1
WHERE CREATION_DATE BETWEEN to_date('01-Jan-2016') AND to_date('31-DEC-2016')
Sometimes it works, sometimes it doesn't. After just a few GBs Toad appears to still be running, but when I look at its throughput it is 0 KB/s or a few bytes/s.
What I am looking for is a loop or a cursor that fetches maybe 100,000 or 1,000,000 rows at a time, commits them, and then goes for the rest until it is done.
This is a one time operation that I am doing as we need the data locally for testing - so I don't care if it is inefficient as long as the data is brought in in chunks and a commit saves me from retrieving it again.
I can count already about 15GBs of failed downloads I've done over the last 3 days and my local table still has 0 records as all my attempts have failed.
Server: Oracle 11g
Local: Oracle 11g
Attempted Clients: Toad/Sql Dev/dbForge Studio
Thanks.
You could do something like:
begin
  loop
    insert into tablea
    select * from tablea@DB_RemoteDB1 a_remote
    where not exists (select null from tablea where id = a_remote.id)
    and rownum <= 100000; -- or whatever number makes sense for you
    exit when sql%rowcount = 0;
    commit;
  end loop;
end;
/
This assumes that there is a primary/unique key you can use to check whether a row in the remote table already exists in the local one. In this example I've used a vague ID column, but replace that with your actual key column(s).
For each iteration of the loop it will identify rows in the remote table which do not exist in the local table - which may be slow, but you've said performance isn't a priority here - and then, via rownum, limit the number of rows being inserted to a manageable subset.
The loop then terminates when no rows are inserted, which means there are no rows left in the remote table that don't exist locally.
This should be restartable, due to the commit and where not exists check. This isn't usually a good approach - as it kind of breaks normal transaction handling - but as a one off and with your network issues/constraints it may be necessary.
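The restartable chunk-and-commit pattern described above can be tried out locally before pointing it at the real link. The following is only a minimal Python sketch, not the Oracle code itself: it uses the standard-library sqlite3 module, and remote_t, local_t, payload and copy_in_chunks are hypothetical names standing in for the remote table, the local table and the loop. LIMIT plays the role of rownum here.

```python
import sqlite3


def copy_in_chunks(conn, chunk_size):
    """Copy missing rows from remote_t to local_t, committing after each chunk.

    Safe to re-run: rows that are already present locally are skipped.
    """
    total = 0
    while True:
        cur = conn.execute(
            """INSERT INTO local_t (id, payload)
               SELECT r.id, r.payload
               FROM remote_t r
               WHERE NOT EXISTS (SELECT 1 FROM local_t l WHERE l.id = r.id)
               LIMIT ?""",
            (chunk_size,),
        )
        inserted = cur.rowcount
        if inserted == 0:
            break  # nothing left to copy
        conn.commit()  # keep this chunk even if a later one fails
        total += inserted
    return total


# Demo with an in-memory database (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE remote_t (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE local_t (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO remote_t VALUES (?, ?)",
    [(i, "xml %d" % i) for i in range(1, 11)],
)
# Pretend an earlier run committed three rows before the connection dropped.
conn.execute("INSERT INTO local_t SELECT id, payload FROM remote_t WHERE id <= 3")
conn.commit()

copied = copy_in_chunks(conn, 4)
local_count = conn.execute("SELECT COUNT(*) FROM local_t").fetchone()[0]
print(copied, local_count)
```

Re-running copy_in_chunks after an interruption only transfers the rows that are still missing, which is the property that matters on an unreliable connection.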
Toad is right, using bulk collect would be (probably significantly) faster in general as the query isn't repeated each time around the loop:
declare
  cursor l_cur is
    select * from tablea@dblink3 a_remote
    where not exists (select null from tablea where id = a_remote.id);
  type t_tab is table of l_cur%rowtype;
  l_tab t_tab;
begin
  open l_cur;
  loop
    fetch l_cur bulk collect into l_tab limit 100000;
    forall i in 1..l_tab.count
      insert into tablea values l_tab(i);
    commit;
    exit when l_cur%notfound;
  end loop;
  close l_cur;
end;
/
This time you would change the limit 100000 to whatever number you think sensible. There is a trade-off here though, as the PL/SQL table will consume memory, so you may need to experiment a bit to pick that value - you could get errors or affect other users if it's too high. Lower is less of a problem here, except the bulk inserts become slightly less efficient.
But because you have a CLOB column (holding your XML) this won't work for you, as @BobC pointed out; the insert ... select is supported over a DB link, but the collection version will get an error from the fetch:
ORA-22992: cannot use LOB locators selected from remote tables
ORA-06512: at line 10
22992. 00000 - "cannot use LOB locators selected from remote tables"
*Cause: A remote LOB column cannot be referenced.
*Action: Remove references to LOBs in remote tables.
It's been several years since I've worked with SQL and C# .NET so be gentle.
I'm jumping in to assist on a project that a coworker has been building. Something though seems quite out of whack.
I'm trying to provide straight reports on a particular Table in the database. It has 9 columns and approximately 1.6M rows last time I checked. This is big, but it's hardly large enough to create problems. However, when I run a simple Query using MS SQL Server Management Studio, it takes 11 seconds.
SELECT *
FROM [4.0Analytics].[dbo].[Measurement]
where VIN = 'JTHBJ46G482271076';
I tried creating an index on VIN but it times out.
"An exception occurred while executing a Transact-SQL statement or batch."
"Could not allocate space for object 'X' in database 'Your Database' because the 'PRIMARY' filegroup is full"
It seems however that it should be taking a lot less time in the first place even non-indexed so I'd like to find out what might be wrong there and then move onto the index time-out next. Unless 11 seconds is normal for a simple query when non-indexed?
As David Gugg has mentioned you do not have enough space left in your database.
Check if you have enough space left on the disk where your Primary File is located. If you have enough space on the disk use the following command and then try to create the index
USE [master]
GO
ALTER DATABASE [4.0Analytics]
MODIFY FILE ( NAME = N'Primary_File_Name'
, MAXSIZE = UNLIMITED
, FILEGROWTH = 10%
)
GO
-- This will allow your database to grow automatically if it runs out of space
-- provided you have space left on the disk
-- Now try to create the Index and it should let you create it.
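SELECT * is part of why it takes so long. No matter how many indexes you put on the table, a SELECT * cannot be covered by a narrow nonclustered index, so the optimizer will often fall back to a clustered index scan (if the table has a clustered primary key) or a table scan; at best it does an index seek plus a key lookup for every matching row.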
Try `Select <Column Names>` --<-- Only the columns you actually need
I would not recommend setting the data file autogrowth to a percentage [%]; it is better (best practice) to set it to grow by a fixed number of MB, for example:
USE [master]
GO
ALTER DATABASE [YourDataBaseName] MODIFY
FILE ( NAME = N'YouDataBaseFileName',
FILEGROWTH = 10240KB ,
MAXSIZE = UNLIMITED)
GO
The error you got during the index creation occurred because the data file could not grow any further (its MAXSIZE parameter is set to a limited value).
You can check this in either of two ways:
a. Object Explorer >>> Databases >>> right-click the database >>> Properties >>> "Files" page.
b. T-SQL:
select
  e.name as [FileName],
  e.growth,
  e.max_size,
  e.is_percent_growth
from sys.master_files e
where DB_NAME(e.database_id) = 'YourDatabaseName'
GO
I have 200K+ rows of data in an xls file and, as per the requirement, I need to update database tables (2 tables) using the xls data.
I know how to copy data from xls into a SQL Server table; however, I am struggling with the approach for updating the database tables.
I could not think of any approach other than writing a cursor, and I don't want to go with a cursor because updating 200K+ rows that way may eat up the transaction log and will take a long time to finish.
Can someone help me with what else could be done to accomplish this?
Use the following techniques.
1 - Import the data into a staging table. Using the import / export tool is one way to do the task. The staging table should be in a throw-away or staging database.
http://technet.microsoft.com/en-us/library/ms141209.aspx
2 - Make sure that the data types between the EXCEL data and TABLE data are the same.
3 - Make sure the existing target [TRG_TABLE] table has a primary key. Make sure the EXCEL data loaded into the [SRC_TABLE] table has the same key. You can add a non-clustered index to speed up the JOIN in the UPDATE statement.
4 - Add a [FLAG] column as INT NULL to the [TRG_TABLE] with an ALTER TABLE command.
5 - Make sure a full backup is done before and after the large UPDATE. You can also use a DATABASE SNAPSHOT. The key point is to have a roll back plan in place if needed.
-- Select correct db
USE [TRG_DB]
GO
-- Set to simple mode
ALTER DATABASE [TRG_DB] SET RECOVERY SIMPLE;
GO
-- Update in batches
DECLARE @VAR_ROWS INT = 1;
WHILE (@VAR_ROWS > 0)
BEGIN
  -- Update fields and flag on join
  UPDATE TOP (10000) T
  SET
    T.FLD1 = S.FLD1,
    -- ... Etc
    T.FLAG = 1
  FROM [TRG_TABLE] T JOIN [SRC_TABLE] S ON T.ID = S.ID
  WHERE T.[FLAG] IS NULL
  -- How many rows updated
  SET @VAR_ROWS = @@ROWCOUNT;
  -- In SIMPLE recovery, a checkpoint lets the inactive part of the log be reused
  CHECKPOINT;
END
-- Set to full mode
ALTER DATABASE [TRG_DB] SET RECOVERY FULL;
GO
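To see the flag-and-batch loop terminate on a small data set, here is a rough, runnable Python sketch using the standard-library sqlite3 module. It is only an illustration of the pattern, not T-SQL: trg_table, src_table and update_in_batches are hypothetical names, LIMIT inside a subquery stands in for UPDATE TOP (n), and the row counts are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trg_table (id INTEGER PRIMARY KEY, fld1 TEXT, flag INTEGER)")
conn.execute("CREATE TABLE src_table (id INTEGER PRIMARY KEY, fld1 TEXT)")
# 25 target rows, of which only ids 1..20 have a matching source row.
conn.executemany("INSERT INTO trg_table (id, fld1) VALUES (?, 'old')",
                 [(i,) for i in range(1, 26)])
conn.executemany("INSERT INTO src_table (id, fld1) VALUES (?, 'new')",
                 [(i,) for i in range(1, 21)])
conn.commit()


def update_in_batches(conn, batch_size):
    """Update unflagged, matching rows in batches; commit after each batch."""
    batches = 0
    while True:
        cur = conn.execute(
            """UPDATE trg_table
               SET fld1 = (SELECT s.fld1 FROM src_table s WHERE s.id = trg_table.id),
                   flag = 1
               WHERE id IN (
                   SELECT t.id
                   FROM trg_table t JOIN src_table s ON s.id = t.id
                   WHERE t.flag IS NULL
                   LIMIT ?
               )""",
            (batch_size,),
        )
        changed = cur.rowcount
        conn.commit()  # each batch is its own short transaction
        if changed == 0:
            break      # no unflagged matching rows remain
        batches += 1
    return batches


batches = update_in_batches(conn, 8)
flagged = conn.execute("SELECT COUNT(*) FROM trg_table WHERE flag = 1").fetchone()[0]
untouched = conn.execute("SELECT COUNT(*) FROM trg_table WHERE fld1 = 'old'").fetchone()[0]
print(batches, flagged, untouched)
```

The flag column is what makes the loop finish: every updated row drops out of the next batch's candidate set.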
In summary, I gave you all the tools to do the job. Just modify them for your particular occurrence.
PS: Here is working code from my blog on large deletes. Same logic applies.
http://craftydba.com/?p=3079
PPS: I did not check the sample code for syntax. That is left up for you.
I am using DB2 9.7 FP5 for LUW. I have a table with 2.5 million rows and I want to delete about 1 million rows and this delete operation is distributed across table. I am deleting data with 5 delete statements.
delete from tablename where tableky between range1 and range2
delete from tablename where tableky between range3 and range4
delete from tablename where tableky between range5 and range6
delete from tablename where tableky between range7 and range8
delete from tablename where tableky between range9 and range10
While doing this, the first 3 deletes work properly, but the 4th fails and DB2 hangs, doing nothing. Below is the process I followed; please help me with this:
1. Set following profile registry parameters: DB2_SKIPINSERTED,DB2_USE_ALTERNATE_PAGE_CLEANING,DB2_EVALUNCOMMITTED,DB2_SKIPDELETED,DB2_PARALLEL_IO
2.Alter bufferpools for automatic storage.
3. Turn off logging for tables (alter table tabname activate not logged initially) and delete records
4. Execute the script with +c to make sure logging is off
What are the best practices for deleting such a large amount of data? Why is it failing when it is deleting data of the same nature from the same table?
This is always a tricky task. The size of a transaction (e.g. for a safe rollback) is limited by the size of the transaction log. The transaction log is filled not only by your SQL commands but also by the commands of other users using the database at the same time.
I would suggest using one of, or a combination of, the following methods:
1. Commits
Commit often; in your case I would put one commit after each delete command.
2. Increase the size of transaction log
As I recall default db2 transaction log is not very big. The size of transaction log should be calculated/tuned for each db individually. Reference here and with more details here
3. Stored procedure
Write and call stored procedure which does deletes in blocks, e.g.:
-- USAGE - create: db2 -td# -vf del_blocks.sql
-- USAGE - call: db2 "call DEL_BLOCKS(<pk_from>, <pk_to>)"
drop PROCEDURE DEL_BLOCKS#
CREATE PROCEDURE DEL_BLOCKS(IN PK_FROM INTEGER, IN PK_TO INTEGER)
LANGUAGE SQL
BEGIN
  declare v_CNT_BLOCK bigint;
  set v_CNT_BLOCK = 0;
  -- WITH HOLD keeps the cursor open across the intermediate commits
  FOR r_cur as c_cur cursor with hold for
    select tableky from tablename
    where tableky between pk_from and pk_to
    for read only
  DO
    delete from tablename where tableky=r_cur.tableky;
    set v_CNT_BLOCK=v_CNT_BLOCK+1;
    -- commit after every 5000 deleted rows
    if v_CNT_BLOCK >= 5000 then
      set v_CNT_BLOCK = 0;
      commit;
    end if;
  END FOR;
  commit;
END#
4. Export + import with replace option
In some cases when I needed to purge very big tables or leave just small amount of records (and had no FK constraints), then I used export + import(replace). The replace import option is very destructive - it purges the whole table before import of new records starts (reference of db2 import command), so be sure what you're doing and make backup before. For such sensitive operations I create 3 scripts and run each separately: backup, export, import. Here is the script for export:
echo '===================== export started ';
values current time;
export to tablename.del of del
select * from tablename where (tableky between 1 and 1000
or tableky between 2000 and 3000
or tableky between 5000 and 7000
) ;
echo '===================== export finished ';
values current time;
Here is the import script:
echo '===================== import started ';
values current time;
import from tablename.del of del allow write access commitcount 2000
-- !!!! this is IMPORTANT and VERY VERY destructive option
replace
into tablename ;
echo '===================== import finished ';
5. Truncate command
Db2 in version 9.7 introduced TRUNCATE statement which:
deletes all of the rows from a table.
Basically:
TRUNCATE TABLE <tablename> IMMEDIATE
I have no experience with TRUNCATE in db2, but in some other engines the command is very fast and does not use the transaction log (at least not in the usual manner). Please check all details here or in the official documentation. Like solution 4, this method is very destructive: it purges the whole table, so be very careful before issuing the command. Take a table/db backup first so that the previous state can be restored.
Note about when to do this
When there are no other users on the db, or ensure this by locking the table.
Note about rollback
In a transactional db (like db2) a rollback can restore the db to the state it was in when the transaction started. With methods 1, 3 and 4 this can't be achieved, so if you need to be able to restore the original state, the only option that ensures this is method 2: increase the transaction log.
delete from ordpos where orderid in ((select orderid from ordpos where orderid not in (select id from ordhdr) fetch first 40000 rows only));
Hoping this will resolve your query :)
It's unlikely that DB2 is "hanging" – more likely it's in the process of doing a Rollback after the DELETE operation filled the transaction log.
Make sure that you are committing after each individual DELETE statement. If you are executing the script using the +c option for the DB2 CLP, then make sure you include an explicit COMMIT statement between each DELETE.
The best practice for deleting millions of rows is to commit in between the deletes. In your case you can commit after every delete statement.
A commit ends the transaction, so the transaction log space it was holding can be reused by the following delete operations.
Alternatively, instead of 5 delete statements, use a loop that deletes a limited number of rows per iteration and commits after each iteration. That way the transaction log never fills up, and your data still gets deleted.
Use something like this:
while (rows matching the condition remain)
delete from (select * from tablename where <condition> fetch first 50000 rows only);
commit;
end while
If SELECT ... WHERE ... FETCH FIRST 10 ROWS ONLY can pull in a small chunk of records, 10 at a time for example, then you can feed this as input into another script that will then delete these records. Rinse and repeat...
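The delete-a-chunk-then-commit loop suggested in these answers can be sketched in a runnable form. This is a Python illustration using the standard-library sqlite3 module rather than DB2; the table and column names follow the question, while delete_range_in_chunks and the sample key range are hypothetical. LIMIT stands in for FETCH FIRST n ROWS ONLY.

```python
import sqlite3


def delete_range_in_chunks(conn, key_from, key_to, chunk_size):
    """Delete rows with key_from <= tableky <= key_to, committing per chunk."""
    deleted = 0
    while True:
        cur = conn.execute(
            """DELETE FROM tablename
               WHERE tableky IN (
                   SELECT tableky FROM tablename
                   WHERE tableky BETWEEN ? AND ?
                   LIMIT ?
               )""",
            (key_from, key_to, chunk_size),
        )
        changed = cur.rowcount
        conn.commit()  # a short transaction per chunk keeps log usage bounded
        if changed == 0:
            break      # the range is empty now
        deleted += changed
    return deleted


# Demo with an in-memory table holding keys 1..100 (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (tableky INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO tablename VALUES (?)", [(i,) for i in range(1, 101)])
conn.commit()

deleted = delete_range_in_chunks(conn, 11, 60, 15)
remaining = conn.execute("SELECT COUNT(*) FROM tablename").fetchone()[0]
print(deleted, remaining)
```

As noted in the rollback remark earlier, the price of this approach is that a failure midway leaves the already committed chunks deleted.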
For the benefit of everyone, here is the link to my developerWorks article on the same problem. I tried different things and the one I shared on this article worked perfectly for me.
I have an application that uses incident numbers (amongst other types of numbers). These numbers are stored in a table called "Number_Setup", which contains the current value of the counter.
When the app generates a new incident, it reads the number_setup table and gets the required number counter row (counters can be reset daily, weekly, etc. and are stored as ints). It then increments the counter and updates the row with the new value.
The application is multiuser (approximately 100 users at any one time, as well as sql jobs that run and grab 100's of incident records and request incident numbers for each). The incident table has some duplicate incident numbers where they should not be duplicate.
A stored proc is used to retrieve the next counter.
SELECT @Counter = counter, @ShareId=share_id, @Id=id
FROM Number_Setup
WHERE LinkTo_ID=@LinkToId
AND Counter_Type='I'
IF isnull(@ShareId,0) > 0
BEGIN
  -- use parent counter
  SELECT @Counter = counter, @ID=id
  FROM Number_Setup
  WHERE Id=@ShareID
END
SELECT @NewCounter = @Counter + 1
UPDATE Number_Setup SET Counter = @NewCounter
WHERE id=@Id
I've now surrounded that block with a transaction, but I'm not entirely sure it will 100% fix the problem, as I think the SELECTs still only take shared locks, so the counter can be read by another session anyway.
Perhaps I can check that the counter hasn't been updated, in the update statement
UPDATE Number_Setup SET Counter = @NewCounter
WHERE Counter = @Counter
IF @@ERROR = 0 AND @@ROWCOUNT > 0
  COMMIT TRANSACTION
ELSE
  ROLLBACK TRANSACTION
I'm sure this is a common problem with invoice numbers in financial apps etc.
I cannot put the logic in code either and use locking at that level.
I've also looked at HOLDLOCK but I'm not sure of its application. Should it be put on the two SELECT statements?
How can I ensure no duplicates are created?
The trick is to do the counter update and read in a single atomic operation:
UPDATE Number_Setup SET Counter = Counter+1
OUTPUT INSERTED.Counter
WHERE id=@Id;
This, though, does not assign the new counter to @NewCounter, but instead returns it as a result set to the client. If you have to assign it, use an intermediate table variable to output the new counter INTO:
declare @NewCounter int;
declare @tabCounter table (NewCounter int);
UPDATE Number_Setup SET Counter = Counter+1
OUTPUT INSERTED.Counter INTO @tabCounter (NewCounter)
WHERE id=@Id;
SELECT @NewCounter = NewCounter FROM @tabCounter;
This solves the problem of making the Counter increment atomic. You still have other race conditions in your procedure, because LinkTo_Id and share_id can still be updated after the first select, so you can increment the counter of the wrong link-to item. That cannot be solved from this code sample alone, as it also depends on the code that actually updates share_id and/or LinkTo_Id.
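To make the difference concrete, here is a small deterministic Python sketch using the standard-library sqlite3 module. It is only a simulation under assumptions: the two "callers" are interleaved by hand on one connection instead of running as real concurrent sessions, number_setup holds two made-up rows, and next_number is a hypothetical helper whose read-back inside the same transaction stands in for the OUTPUT clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE number_setup (id INTEGER PRIMARY KEY, counter INTEGER)")
conn.executemany("INSERT INTO number_setup VALUES (?, ?)", [(1, 5), (2, 5)])
conn.commit()


def read_counter(conn, row_id):
    row = conn.execute(
        "SELECT counter FROM number_setup WHERE id = ?", (row_id,)
    ).fetchone()
    return row[0]


# Row 1: read-then-write. Both callers read before either one writes,
# so both compute the same "next" number.
seen_a = read_counter(conn, 1)
seen_b = read_counter(conn, 1)
conn.execute("UPDATE number_setup SET counter = ? WHERE id = 1", (seen_a + 1,))
conn.execute("UPDATE number_setup SET counter = ? WHERE id = 1", (seen_b + 1,))
conn.commit()
issued_read_then_write = [seen_a + 1, seen_b + 1]


# Row 2: the increment happens inside the UPDATE itself, so each call
# builds on the value left by the previous one.
def next_number(conn, row_id):
    conn.execute(
        "UPDATE number_setup SET counter = counter + 1 WHERE id = ?", (row_id,)
    )
    value = read_counter(conn, row_id)  # read back before committing
    conn.commit()
    return value


issued_atomic = [next_number(conn, 2), next_number(conn, 2)]
print(issued_read_then_write, issued_atomic)
```

The first list contains the same number twice, which is the duplicate incident number from the question; the second does not.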
BTW you should get into the habit of naming your fields with consistent case, and then using exactly that case in your T-SQL code. Your scripts run fine now only because your server has a case-insensitive collation; if you deploy on a server with a case-sensitive collation and your scripts don't match the exact case of the field/table names, errors will follow galore.
Have you tried using GUIDs instead of autoincrements as your unique identifier?
If you have the ability to modify your job that gets multiple records, I would change the thinking so that your counter is an identity column. Then when you get the next record you can just do an insert and get the @@IDENTITY of the table. That would ensure that you get the biggest number. You would also have to use DBCC CHECKIDENT with RESEED to reset the counter, instead of just updating the table, when you want to reset the identity. The only issue is that you'd have to do 100 or so inserts as part of your sql job to get a group of identities. That may be too much overhead, but using an identity column is a guaranteed way to get unique numbers.
I might be missing something, but it seems like you are trying to reinvent something that has already been solved by most databases.
Instead of reading and updating the 'Counter' column in the Number_Setup table, why don't you just use an autoincrementing primary key for your counter? You'll never have a duplicate value for a primary key.