Best way to get the table space occupied by a Snowflake table

My current query to get the table space occupied by the tables in SAMPLE_DB is as below:
use role accountadmin;
use schema snowflake.INFORMATION_SCHEMA;
SELECT
    table_name,
    sum(active_bytes)
FROM "INFORMATION_SCHEMA".table_storage_metrics
WHERE table_catalog IN ('SAMPLE_DB')
GROUP BY table_name;
The question is: do I also need to add the sums of TIME_TRAVEL_BYTES and FAILSAFE_BYTES to get the total space for each table in SAMPLE_DB?

Yes - plus you will need to include RETAINED_FOR_CLONE_BYTES if that is relevant.
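As a rough sketch of what that answer describes, the query below adds the four storage columns together per table. It assumes the database is still called SAMPLE_DB and reads that database's own INFORMATION_SCHEMA; the column aliases are just illustrative names.

```sql
-- Sketch: total storage per table in SAMPLE_DB (database name taken from the question).
-- Grouping by schema as well keeps same-named tables in different schemas apart.
select table_schema,
       table_name,
       sum(active_bytes)             as active_bytes,
       sum(time_travel_bytes)        as time_travel_bytes,
       sum(failsafe_bytes)           as failsafe_bytes,
       sum(retained_for_clone_bytes) as retained_for_clone_bytes,
       sum(active_bytes
           + time_travel_bytes
           + failsafe_bytes
           + retained_for_clone_bytes) as total_bytes
from SAMPLE_DB.information_schema.table_storage_metrics
where table_catalog = 'SAMPLE_DB'
group by table_schema, table_name
order by total_bytes desc;
```

Keeping the individual columns next to the total makes it easy to see whether a table's footprint comes from live data or from retained history.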

Related

Snowflake show tables not accessed in last 20 days

I need to clean up my databases in Snowflake.
We have around 40 databases, and each database has more than 100 tables. Some are loaded every day; others are not loaded every day but are still used every day.
However, lots of tables have been added for testing and other purposes (by many developers and users).
Now we are working on cleaning up the unused tables.
We have the query_history table, which gives us information about queries run in the past. It has fields such as database, warehouse, and user, but not table.
I was wondering if there is any way to write a query that gives us the names of tables not used (by both DDL and DML) in the last 10 days.

select obj.value:objectName::string objName
, max(query_start_time) as QUERY_DATE_TIME
from snowflake.account_usage.access_history
, table(flatten(direct_objects_accessed)) obj
group by 1
order by QUERY_DATE_TIME desc;
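To turn the query above into a list of tables that have not been touched recently, one option is to join it against the table catalog. The following is only a sketch under a few assumptions: it relies on SNOWFLAKE.ACCOUNT_USAGE.TABLES for the list of existing tables, it assumes objectName holds the fully qualified DATABASE.SCHEMA.TABLE name so a simple concatenation matches it, and the CTE names are made up for illustration. It covers reads and DML writes recorded in ACCESS_HISTORY, not DDL-only changes.

```sql
with touched as (
    -- objects read by a query
    select r.value:objectName::string   as object_name,
           r.value:objectDomain::string as object_domain,
           ah.query_start_time
    from snowflake.account_usage.access_history ah,
         lateral flatten(input => ah.direct_objects_accessed) r
    union all
    -- objects written by a query
    select w.value:objectName::string,
           w.value:objectDomain::string,
           ah.query_start_time
    from snowflake.account_usage.access_history ah,
         lateral flatten(input => ah.objects_modified) w
),
last_touch as (
    select object_name,
           max(query_start_time) as last_touched
    from touched
    where object_domain = 'Table'
    group by object_name
)
-- existing base tables with no recorded access in the last 10 days
select t.table_catalog,
       t.table_schema,
       t.table_name,
       lt.last_touched
from snowflake.account_usage.tables t
left join last_touch lt
  on lt.object_name = t.table_catalog || '.' || t.table_schema || '.' || t.table_name
where t.deleted is null
  and t.table_type = 'BASE TABLE'
  and (lt.last_touched is null
       or lt.last_touched < dateadd('day', -10, current_timestamp()))
order by t.table_catalog, t.table_schema, t.table_name;
```

A NULL last_touched only means that no access was found in the history that is available, so it is worth checking such tables by hand before dropping them. If tables are mostly read through views, flattening base_objects_accessed instead of direct_objects_accessed may give a more accurate picture.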

The information schema has a tables view with a last_altered column. Would that work for you? It will not tell you when a table was last accessed, only when it was last altered. Other than this, there is no easy way to get this information from Snowflake at this time. I also needed this feature; I think we should request it.
select table_schema,
       table_name,
       last_altered
from information_schema.tables
where table_type = 'BASE TABLE'
  and last_altered < dateadd( 'DAY', -10, current_timestamp() )
order by table_schema,
         table_name;

Is there a way to figure out the tables a snowflake query is accessing?

I looked at the SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY view, but the only object information it provides is the database name. Is there a way in Snowflake to get the names of the tables a query is accessing? I am not looking for a solution that involves parsing the query string, as that is really complicated.

To see which tables a historical query accessed, you can go to the History tab, click on the query ID for the query, and then click on the profile. For queries that you are about to run, you can see which table(s) they will access by typing "explain" before the query. That produces a metadata result set listing the tables the query will read from, in addition to other information.
Edit: If the explain produces a very long result set and you want to filter it down to just the affected tables, you can do something like this:
-- Generate the explain metadata result set
explain select * from MY_VIEW;
-- Filter to just affected tables
select distinct "objects" as TABLE_NAME
from table(result_scan(last_query_id()))
where "operation" ilike '%table%' and "objects" is not null;

Does deleting the rows from table also release space of large objects in CLOB or BLOB columns?

The size of large objects is not included in the table size when I query dba_segments with table_name = abc. So does deleting rows from a table also release the space used by large objects in the CLOB or BLOB columns of that table?

If you delete rows that hold large LOB objects, Oracle does not automatically release the space back to the tablespace.
You need to reclaim that space for your tablespace with the following statement:
ALTER TABLE <YourTable> MODIFY LOB (<LobColumn>) (SHRINK SPACE);
You can calculate the total size of the LOB segment using the following query:
SELECT OWNER,SEGMENT_NAME,ROUND(SUM(BYTES)/1024/1024) "LOB size (mb)"
FROM DBA_SEGMENTS
WHERE SEGMENT_NAME IN
(
SELECT SEGMENT_NAME
FROM DBA_LOBS WHERE TABLE_NAME = '<YOURTABLE>'
AND OWNER = '<YOUROWNER>'
)
GROUP BY OWNER,SEGMENT_NAME;
If you calculate the size before and after deleting the records, you will find no difference. Once you execute the ALTER TABLE command above (after deleting the rows), the same query will show the reduced size.
Cheers!!

If the table is not big, you have sufficient space in the tablespace, and you do not have permission to run ALTER TABLE, then I would suggest the approach below.
After deleting the records:
-- Note: CREATE TABLE ... AS SELECT does not copy indexes, triggers or grants
CREATE TABLE NEW_LOB_TABLE AS SELECT * FROM ORIGINAL_LOB_TABLE;
DROP TABLE ORIGINAL_LOB_TABLE;
RENAME NEW_LOB_TABLE TO ORIGINAL_LOB_TABLE;

Check this answer: my database shrank from 40 to 10 (a quarter of its original size) after running the 'Vacuum' command in SQLManager.
https://stackoverflow.com/a/62749800/4571399

Using count function or create specific column for counting in sql

I am working on a groups project that has a Groups table and a GroupMembers table.
I can get the number of members for a group by using the count function:
SELECT COUNT(1) AS Counts FROM [Groups].[GroupMembers]
WHERE GroupId=Id;
Or I can add another column to the Groups table for counting, and every time a new member joins the group, this field is increased by one. Is it better to use the count function or to add another column for counting? In other words, what are the advantages and disadvantages of each method?

Creating a column to store the count is not recommended at all.
When you want the count for each group, you can use a simple SELECT query:
SELECT G.groupid,
Count(userid)
FROM groups G
LEFT OUTER JOIN groupmembers GM
ON G.groupid = GM.groupid
GROUP BY G.groupid
If you do want to add a new column, you will need a trigger on the GroupMembers table that updates the count column in the Groups table whenever a user is added to any group in GroupMembers.
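For illustration only, such a trigger might look roughly like the sketch below in SQL Server. It reuses the groups and groupmembers table names from the query above; the membercount column and the trigger name are hypothetical and would need to match your actual schema. It also handles removed members, which the count column would need in order to stay accurate.

```sql
-- Hypothetical: groups has an extra integer column membercount (default 0).
CREATE TRIGGER trg_groupmembers_count
ON groupmembers
AFTER INSERT, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Add newly inserted members, aggregated per group so that
    -- multi-row inserts are counted correctly.
    UPDATE G
    SET membercount = G.membercount + I.cnt
    FROM groups AS G
    JOIN (SELECT groupid, COUNT(*) AS cnt
          FROM inserted
          GROUP BY groupid) AS I
      ON I.groupid = G.groupid;

    -- Subtract removed members in the same way.
    UPDATE G
    SET membercount = G.membercount - D.cnt
    FROM groups AS G
    JOIN (SELECT groupid, COUNT(*) AS cnt
          FROM deleted
          GROUP BY groupid) AS D
      ON D.groupid = G.groupid;
END;
```

This sketch does not cover an UPDATE that moves a member from one group to another, which is one more reason the plain COUNT query is usually the simpler choice.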

It depends on your table engine. With MyISAM, an unfiltered COUNT(*) is much faster because the engine simply reads the row count it stores for the table, whereas InnoDB has to scan the table or an index to count the rows.
It is not recommended to store a count inside the table itself, so if this is something you're worried about, use the MyISAM engine if possible.
Storing the value in the table would needlessly require an extra UPDATE query on each new or lost membership.

Stored procedure?

I wrote a stored procedure that returns the most viewed photos. Here it is; can you check whether it is OK or whether any improvement is required?
create procedure sp_photos_selectmostviewedphotos
as
select * from photos order by views desc
Is this enough, or does it need any modification?

First just specify the columns you really need -> replace the star in your query.
Then create an index over the views column (SortOrder DESC).
The rest should be OK :)
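A minimal sketch of such an index in SQL Server, assuming the table and column are named Photos and Views as in the question (the index name is made up):

```sql
-- Hypothetical index name; sorted descending to match ORDER BY Views DESC.
CREATE NONCLUSTERED INDEX IX_Photos_Views
ON Photos (Views DESC);
```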

+1 to Greco, just to add:
I'd imagine you won't actually use ALL the records (the name indicates "most viewed photos"), so I'd stick in a TOP clause and only return however many records you actually need.
e.g.
SELECT TOP 10 Column1, Column2
FROM Photos
ORDER BY Views DESC
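Putting both suggestions together, the procedure could look something like this sketch. Column1 and Column2 are the placeholder column names from the example above, and the procedure name and the @Top parameter are hypothetical additions so that the caller can choose how many rows to return.

```sql
-- Sketch only: replace Column1, Column2 with the columns you actually need.
CREATE PROCEDURE usp_Photos_SelectMostViewed
    @Top INT = 10   -- hypothetical parameter: number of photos to return
AS
BEGIN
    SET NOCOUNT ON;

    SELECT TOP (@Top) Column1, Column2
    FROM Photos
    ORDER BY Views DESC;
END;
```

It could then be called with, for example, EXEC usp_Photos_SelectMostViewed @Top = 20;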
