Fully delete data from a ClickHouse DB to save disk space

In order to free up disk space, I've gotten rid of a number of old tables in a ClickHouse database via
DROP TABLE mydb.mytable
However, disk usage did not change at all. In particular I expected /var/lib/clickhouse/data/store to shrink.
What am I missing here? Is there an equivalent of PostgreSQL's VACUUM that I should be running in ClickHouse?

Databases using the Atomic engine keep dropped tables on disk for 8 minutes before actually deleting them.
You can force immediate removal with DROP TABLE mydb.mytable NO DELAY
https://kb.altinity.com/engines/altinity-kb-atomic-database-engine
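For reference, a minimal sketch of the immediate-drop variant, using the table name from the question (NO DELAY is the older spelling, SYNC the newer one):

-- drop the table and remove its data directory right away
-- instead of waiting out the ~8-minute grace period
DROP TABLE mydb.mytable NO DELAY;
-- newer ClickHouse releases spell the same thing as
DROP TABLE mydb.mytable SYNC;

If I remember the knob correctly, the 8-minute default comes from the database_atomic_delay_before_drop_table_sec server setting (480 seconds), which can be lowered if this comes up often.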

Related

PostgreSQL failed imports still claim hard disk space? Need to clear cache?

I have a PostgreSQL (10.0 on OS X) database with a single table for the moment. I have noticed something weird when importing a CSV file into that table.
When the import fails for various reasons (e.g. one extra row in the CSV file or too many characters in a column for a given row), no rows are added to the table, but PostgreSQL still claims that space on my hard disk.
Now, I have a very big CSV to import and it failed several times because the CSV was not compliant to begin with, so I had tons of failed imports that I fixed and tried again. What I've realized now is that my computer's storage has been reduced by 30-50 GB or so because of that, and my database is still empty.
Is that normal?
I suspect this is somewhere in my database cache. Is there a way for me to clear that cache or do I have to fully reinstall my database?
Thanks!
Inserting rows into the database will increase the table size.
Even if the COPY statement fails, the rows that have been inserted so far remain in the table, but they are dead rows since the transaction that inserted them failed.
In PostgreSQL, the SQL statement VACUUM will free that space. That typically does not shrink the table, but it makes the space available for future inserts.
Normally, this is done automatically in the background by the autovacuum daemon.
There are several possibilities:
You disabled autovacuum.
Autovacuum is not cleaning up the table fast enough, so the next load cannot reuse the space yet.
What you can do:
Run VACUUM (VERBOSE) on the table to remove the dead rows manually.
If you want to reduce the table size, run VACUUM (FULL) on the table. That will lock the table for the duration of the operation.
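As a rough sketch of those two options, with mytable standing in for the table the CSV was being loaded into:

-- check how many dead rows autovacuum still has to clean up
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'mytable';

-- reclaim the dead rows for reuse (does not shrink the file on disk)
VACUUM (VERBOSE) mytable;

-- rewrite the table and return the space to the OS;
-- takes an exclusive lock for the duration
VACUUM (FULL) mytable;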

Tablespace is not freed after dropping tables (Oracle 11g)

I have an Oracle 11g database with block size = 8192, so, if I'm correct, the maximum datafile size will be 32 GB.
I have a huge table containing around 10 million records. Data in this table is purged often. For purging we chose CTAS as the better option, since we delete the greater portion of the data each time.
Even though we drop the old table after the CTAS, the space it occupied is not being released for the new tables. I understand that a tablespace has an AUTOEXTEND option but no AUTOSHRINK, but the space occupied by dropped tables should still be available for new tables, which is not happening in this case.
I'm getting an Exception saying
ORA-01652: unable to extend temp segment by 8192 in tablespace
FYI, the only operations happening are CTAS + dropping the old table, nothing else. The first time this works fine, but when the same operation is done a second time, the exception arises.
I tried adding an additional datafile to the tablespace, but after a few more purge operations on the table this also fills up to 32 GB and the issue continues.
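For context, the purge cycle described above looks roughly like this (table, column, and tablespace names are made up for illustration):

-- keep only the rows to retain, then swap the tables
CREATE TABLE big_log_new AS
  SELECT * FROM big_log WHERE created_at >= SYSDATE - 30;
DROP TABLE big_log;
RENAME big_log_new TO big_log;

-- check how much free space the tablespace reports afterwards
SELECT tablespace_name, ROUND(SUM(bytes) / 1024 / 1024) AS free_mb
FROM dba_free_space
WHERE tablespace_name = 'LOG_TBS'
GROUP BY tablespace_name;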

DB2 - Reclaiming disk space used by dropped tables

I have an application that logs to a DB2 database. Each log is stored in a daily table, meaning that I have several tables, one for each day.
Since the application is running for quite some time, I dropped some of the older daily tables, but the disk space was not reclaimed.
I understand this is normal in DB2, so I googled and found out that the following command can be used to reclaim space:
db2 alter tablespace <table space> reduce max
Since the tablespace that stores the daily log tables is called USERSPACE1, I executed the following command successfully:
db2 alter tablespace userspace1 reduce max
Unfortunately, the disk space used by the DB2 instance is still the same...
I've read somewhere that the REORG command can be executed, but from what I've seen it is used to reorganize tables. Since I dropped the tables, how can I use REORG?
Is there any other way to do this?
Thanks
Reducing the size of a tablespace is very complex. The extents (sets of contiguous pages; the unit of tablespace allocation) belonging to a given table are not laid out sequentially. When you reorg a table, the rows are reorganized into pages, and the new pages are normally written at the end of the tablespace. Sometimes the high watermark will even increase, and your tablespace will get bigger.
You need to reorg all tables in the tablespace in order to "defrag" them. Then you have to perform a second reorg so the tables can move into the space that was just freed, since it should now be empty space in the tablespace.
However, there are many factors that influence how tables are organized in a tablespace: new extents are created (new rows, row overflow due to updates), and compression could be activated after the reorg.
What you can do is assign just a few tables, or even a single table, per tablespace; however, you will waste a lot of space (overhead, empty pages, etc.).
The command that you are using is an automated way to do this, but it does not always work as desired: http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.admin.dbobj.doc/doc/c0055392.html
If you want to see the distribution of the tables in your tablespace, you can use db2dart. That will give you an idea of which table to reorg (move).
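Putting that together, the manual route would look something like this from the command line (the schema and table names below are placeholders):

# find which tables still live in USERSPACE1
db2 "SELECT tabschema, tabname FROM syscat.tables WHERE tbspace = 'USERSPACE1'"
# reorg each of the reported tables so their extents are packed together
db2 "REORG TABLE mylogs.daily_log_2015_01_01"
# then ask DB2 again to lower the high watermark and release the storage
db2 "ALTER TABLESPACE userspace1 REDUCE MAX"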
Sorry guys,
The command that I mentioned in the original post works after all; the space was just reclaimed very slowly.
Thanks for the help

Dropping an unindexed table with over 1.7 Billion rows on live database (SQL Admin Nightmare)

A recent employee of our company had a stored procedure that went haywire and caused mass inserts into a debug table of his. The table is unindexed, is now at close to 1.7 billion rows, and is taking up so much space that the backup no longer fits on the backup drive (backups now reach close to 250 GB).
I haven't really seen anything like this, so I'm seeking advice from the MSSQL Gurus out here.
I know I could nibble away at the table, but since it's unindexed, DELETE FROM [TABLE] WHERE ID IN (SELECT TOP 10000 [ID] FROM [TABLE]) nearly locks up the server while searching for the rows.
I also don't want my log file to get massive; it's currently sitting at 480 GB on a 1 TB drive. If I delete this table, will I be able to shrink it back down? (My recovery model is simple.)
We could index the ID field on the table, but we only have around 9 hours of downtime a day, and during business hours we can't be locking up the database.
Just looking for advice here, and a point in the right direction.
Thanks.
You may want to consider TRUNCATE
MSDN reference: http://technet.microsoft.com/en-us/library/aa260621(v=sql.80).aspx
Removes all rows from a table without logging the individual row deletes.
Syntax:
TRUNCATE TABLE [YOUR_TABLE]
As @Rahul suggests in the comments, you could also use DROP TABLE [YOUR_TABLE] if you no longer plan to use the table in question. The TRUNCATE option simply empties the table but leaves it in place if you want to continue using it.
With regards to the space issue, both of these operations will be comparatively quick and the space will be reclaimed, but it won't happen instantly. When using TRUNCATE, the data still has to be deleted, but SQL Server will simply deallocate the data pages used by the table and use a background process to actually perform the clean up afterwards.
This post should provide some useful information.
One suggestion would be to take a backup of only that 1.7-billion-row table (probably to a tape drive or somewhere with enough space) and then drop the table with DROP TABLE table_name.
That way, if that debug table's data is ever needed in the future, you have a copy and can restore it from backup.
I would remove the logging for this table and launch a delete stored procedure that commits every 1000 rows.
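A minimal sketch of that batched-delete idea, with dbo.DebugTable standing in for the runaway debug table (under the SIMPLE recovery model each small batch's log space can be reused after a checkpoint):

-- delete in small batches so no single transaction has to log all 1.7 billion rows
WHILE 1 = 1
BEGIN
    DELETE TOP (1000) FROM dbo.DebugTable;
    IF @@ROWCOUNT = 0 BREAK;
END

That said, if the table can simply go, TRUNCATE or DROP as suggested above is far cheaper than deleting row by row.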

TSQL updating large table with other from TEMPDB causes enormous grow

I have a custom import tool which bulk-inserts the data into a temp table (421,776 rows). After that, the tool inserts unknown rows into the target table and updates existing rows based on a hash key (a combination of 2 columns). The target DB has nearly the same row count. The update query looks something like this (with about 20 of the updated columns omitted):
update targetTable set
    theDB.dbo.targetTable.code = temp.code,
    theDB.dbo.targetTable.name = temp.name
    -- ... the remaining ~20 updated columns omitted ...
from [tempDB].[dbo].[targettable] as temp
where theDB.dbo.targetTable.hash = temp.hash COLLATE SQL_Latin1_General_CP1_CI_AS
I know comparing nvarchar with a COLLATE is a bit bad, but it's not easy to avoid. Still, the hash column has its own unique index. Also, locally it works well, but on this server of mine tempdb keeps growing to 21 GB. Reindexing and shrinking won't help at all.
Just a side note for others who face tempdb problems. A good read is http://bradmcgehee.com/wp-content/uploads/presentations/Optimizing_tempdb_Performance_chicago.pdf
It looks like you're using tempdb explicitly, with data you've put there yourself. Is there a reason to use tempdb as if it were your own database?
The reason tempdb is growing is that you're explicitly putting data there. 420k rows doesn't sound heavy, but it's best to keep it within your own user DB.
I suggest changing your business logic to move away from [tempDB].[dbo].[targettable] to something in your own user database.
You can temporarily change the transaction logging from Full or Bulk-logged down to Simple. That will keep everything from being logged for a rollback.
Is this a Cartesian product, since there's no explicit join?
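To make that concrete, here is a sketch of the same update with the staging data kept in the user database and joined explicitly (targetTable_staging is a hypothetical staging table loaded by the import tool):

-- explicit join on the hash key instead of staging the data in tempdb
UPDATE tgt
SET tgt.code = stg.code,
    tgt.name = stg.name
FROM theDB.dbo.targetTable AS tgt
JOIN theDB.dbo.targetTable_staging AS stg
  ON tgt.hash = stg.hash COLLATE SQL_Latin1_General_CP1_CI_AS;

The explicit join also removes the doubt about an accidental Cartesian product raised above.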
