ClickHouse - Do 10 billion rows need separate databases and tables?


I need to run detail queries against 10 billion rows stored in ClickHouse. Do I need to split the data across separate databases and tables?
The table engine I use is ReplicatedReplacingMergeTree.
The queries look like this:
select * from table_a where name = 'akkj';
select * from table_b where name = 'bttc';

No, you can store any number of rows in a single table.
My production:
SELECT count()
FROM fact_event_shard
┌───────count()─┐
│ 1415809324034 │
└───────────────┘
As for these queries:
select * from table_a where name = 'akkj';
select * from table_b where name = 'bttc';
They are a very bad idea. ClickHouse does not like such queries. You should use Cassandra, not ClickHouse.

Related

Is there a way to identify whether a column is not being used in a SQL Server database?

I have a big database with a lot of tables, and I would like to identify which columns are not referenced by any stored procedure or query, that is, which columns are not in use.

I'm not sure if this is what you're looking for, but in your DB under views there's a folder for System Views - three of which are the following. Looking in all_objects look for the table name, then use the object_id of that table to select from the other queries. There may be other meta-data here that is appropriate for your need.
SELECT *
FROM sys.all_objects;
SELECT *
FROM sys.all_columns
WHERE object_id = 981578535;
SELECT *
FROM sys.all_views
WHERE object_id = 981578535;

After some investigation I found a SQL Server feature called Query Store, which keeps executed queries on disk together with other runtime information. If you have a SQL Server 2016 instance, you can find it in the properties of the database. There you can change the number of days to capture, and then, in theory, search the captured query text for the columns in question. The technology is intended to give you more options for performance tuning.
You can find the information with this query:
SELECT TOP 10 qt.query_sql_text, q.query_id,
qt.query_text_id, p.plan_id, rs.last_execution_time
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs
ON p.plan_id = rs.plan_id
where qt.query_sql_text LIKE '%ColumnToFind%'
ORDER BY rs.last_execution_time;
Credits: Query Store

Database Index when SQL statement includes "IN" clause

I have a SQL statement that takes a very long time to execute, and I need to improve it somehow.
select * from table where ID=1 and GROUP in
(select group from groupteam where
department= 'marketing' )
My question: would creating an index on columns ID and GROUP help?
If not, should I create an index on the DEPARTMENT column of the second table?
Or should I create indexes on both tables?
The first table has 249003 rows.
The second table has 900 rows in total, and the subquery on it returns only 2 rows.
That is why I am surprised that the response is so slow.
Thank you

You can also use EXISTS, depending on your database like so:
select * from table t
where id = 1
and exists (
select 1 from groupteam
where department = 'marketing'
and group = t.group
)
Create a composite index or individual indexes on groupteam's department and group columns.
Create a composite index or individual indexes on table's id and group columns.
Run an explain/analyze, depending on your database, to review how the indexes are being used by your database engine.
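The steps above can be sketched end to end. This is only a minimal sketch: it uses Python's built-in sqlite3 module so that it runs anywhere, and the names items, grp, and the index names are hypothetical stand-ins (table and group are reserved words in most engines). The plan printed by your own database will look different, so treat the output as illustrative rather than as what your engine will choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER, grp TEXT, payload TEXT);
    CREATE TABLE groupteam (grp TEXT, department TEXT);
    -- Composite indexes on the filter column plus the correlation column.
    CREATE INDEX idx_items_id_grp ON items (id, grp);
    CREATE INDEX idx_groupteam_dept_grp ON groupteam (department, grp);
""")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", [
    (1, "g1", "a"), (1, "g2", "b"), (1, "g3", "c"), (2, "g1", "d"),
])
conn.executemany("INSERT INTO groupteam VALUES (?, ?)", [
    ("g1", "marketing"), ("g2", "marketing"), ("g3", "sales"),
])

QUERY = """
    SELECT t.id, t.grp, t.payload
    FROM items AS t
    WHERE t.id = 1
      AND EXISTS (
          SELECT 1 FROM groupteam AS g
          WHERE g.department = 'marketing' AND g.grp = t.grp
      )
    ORDER BY t.grp
"""

# Run the query, then ask the engine how it intends to execute it.
rows = conn.execute(QUERY).fetchall()
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
for step in plan:
    print(step)
```

Comparing the plan before and after creating each index is usually the quickest way to see whether the engine actually uses it.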

Try a join instead:
select * from table t
JOIN groupteam gt
ON t.group = gt.group
where ID=1 AND gt.department= 'marketing'
An index on table's group and id columns and an index on groupteam's group column would help too.

Perform Query and count rows on multiple identical table

I have multiple tables, one created per date, to store information for that date.
For example History3108, History0109, etc. All of these tables share the same schema. Sometimes I need to query multiple tables and get both the rows and the record count. What is the fastest way of doing this in Oracle and SQL Server?
Currently I do it like this:
When I need the count across multiple tables: select count(*) for each table and add the results.
When I need the records from multiple tables: select * from table1, select * from table2 (basically select * for each table).
Would it perform better if we ran all of the queries in one transaction?

With UNION ALL you can get records from multiple tables that have the same number of columns with compatible data types, as tables sharing one schema do. For example, if you want to see all records from multiple tables:
(select * from history3108)
union all
(select * from history0109)
union all
(select * from history0209)
/* [...] and so on */
and if you want to count all records from these tables:
select count(*) from (
(select * from history3108)
union all
(select * from history0109)
union all
(select * from history0209)
/* [...] and so on */
) t; -- SQL Server requires an alias for the derived table; Oracle accepts one too
Oracle Docs - The UNION [ALL], INTERSECT, MINUS Operators
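To compare the two counting approaches, here is a minimal sketch using Python's built-in sqlite3 module instead of Oracle or SQL Server, so the numbers are only a correctness check and say nothing about speed on those engines. The columns id and info are assumptions for illustration, and SQLite does not accept parentheses around each SELECT in a compound query, so they are omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
tables = ["history3108", "history0109", "history0209"]

# Create identical tables holding 1, 2 and 3 rows respectively.
for i, name in enumerate(tables, start=1):
    conn.execute(f"CREATE TABLE {name} (id INTEGER, info TEXT)")
    conn.executemany(
        f"INSERT INTO {name} VALUES (?, ?)",
        [(n, f"row {n}") for n in range(i)],
    )

# Approach 1: count over a UNION ALL of every table.
union_sql = " UNION ALL ".join(f"SELECT * FROM {name}" for name in tables)
total_via_union = conn.execute(
    f"SELECT count(*) FROM ({union_sql}) AS t"
).fetchone()[0]

# Approach 2: count each table separately and add the results.
total_via_sum = sum(
    conn.execute(f"SELECT count(*) FROM {name}").fetchone()[0]
    for name in tables
)

print(total_via_union, total_via_sum)
```

Both approaches return the same total; which one is faster depends on the engine and on the number of tables, so it is worth measuring on your own data.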

SQL Query question

No particular DBMS in mind, how would I do the following:
# There are many tables per one restaurant, many napkins per one table
# Pseudo SQL
SELECT RESTAURANT WHERE ID = X;
SELECT ALL TABLES WHERE RESTAURANT_ID = RESTAURANT.ID;
SELECT ALL NAPKINS WHERE TABLE_ID = TABLE.ID;
But, all in one query? I've used a JOIN to get all the tables in the same query as restaurant, but is it possible to get all napkins for each table as well, in the same query?

select * -- replace * with the columns you need...
from restaurant as r
inner join tables as t on t.restaurant_id = r.id
inner join napkins as n on n.table_id = t.id
where r.id = [restaurant id]

You would definitely end up repeating the table and restaurant information on every row, like:
Restaurant1 Table1 Napkin1
Restaurant1 Table1 Napkin2
Restaurant1 Table1 Napkin3
Restaurant1 Table2 Napkin4
Restaurant2 Table1 Napkin5
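If the repetition is a problem, the flat rows can be regrouped in the application after a single query. The following is a minimal sketch using Python's built-in sqlite3 module; the name columns, the sample data, and the helper napkins_by_table are assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE restaurant (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tables (id INTEGER PRIMARY KEY, restaurant_id INTEGER, name TEXT);
    CREATE TABLE napkins (id INTEGER PRIMARY KEY, table_id INTEGER, name TEXT);
    INSERT INTO restaurant VALUES (1, 'Restaurant1'), (2, 'Restaurant2');
    INSERT INTO tables VALUES (1, 1, 'Table1'), (2, 1, 'Table2'), (3, 2, 'Table1');
    INSERT INTO napkins VALUES (1, 1, 'Napkin1'), (2, 1, 'Napkin2'), (3, 1, 'Napkin3'),
                               (4, 2, 'Napkin4'), (5, 3, 'Napkin5');
""")

def napkins_by_table(restaurant_id):
    """Run one three-way join and regroup the flat rows per table."""
    rows = conn.execute("""
        SELECT t.name, n.name
        FROM restaurant AS r
        INNER JOIN tables AS t ON t.restaurant_id = r.id
        INNER JOIN napkins AS n ON n.table_id = t.id
        WHERE r.id = ?
        ORDER BY t.id, n.id
    """, (restaurant_id,)).fetchall()
    grouped = defaultdict(list)
    for table_name, napkin_name in rows:
        grouped[table_name].append(napkin_name)
    return dict(grouped)

result = napkins_by_table(1)
print(result)
```

Note that with inner joins a table that has no napkins does not appear in the result at all; a left join would be needed to keep it.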

It seems you want to return three separate results, not a single result with repeat values for RESTAURANT_N or TABLE_N.
In SQL, this is done with stored procedures which can return multiple result sets. The syntax for stored procedures varies among database products, therefore you should ask the question for specific products. In the stored procedure, there will be three select statements for the RESTAURANTS, TABLES and NAPKINS. The results of the three statements are returned in a bundle to the application, which can then use the results.

sql server 2005 - select records from tbl A contained WITHIN a text field of tbl B

I'm trying to work out a SQL Select in MS SQL 2005, to do the following:
TABLE_A contains a list of keywords... asparagus, beetroot, beans, egg plant etc (x200).
TABLE_B contains a record with some long free text (approx 4000 chars)...
I know which record within TABLE_B I am selecting (by ID).
However, I need to get a shortlist of records from TABLE_A that are contained WITHIN the text of the record in TABLE_B.
I'm wondering if SQL's CONTAINS function is useful... but maybe not.
This needs to be a super quick query.
Cheers

It will never be super quick because of the LIKE with a wildcard at each end. You cannot index it, and there are no whizzy tricks. However, because you have already filtered TableB down to one row, it should be acceptable. If you had a million rows in TableB, you could go for coffee while it ran.
SELECT
A.KeyWordColumn
FROM
TableA A
JOIN
TableB B ON B.BigTextColumn LIKE '%' + A.KeyWordColumn+ '%'
WHERE
B.ByID = @ID --or constant etc
CONTAINS can be used only if you have full-text indexing; it is not available in a normal SQL query like this one.

I would try this
select keyword from table_a, table_b
where table_b.text like '%' + keyword + '%'
and table_b.Id = '111'
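Either query can be tried on a small data set first. Below is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server, so string concatenation is written with || instead of +; the column names keyword and text follow the query above, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (keyword TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, text TEXT);
    INSERT INTO table_a VALUES ('asparagus'), ('beetroot'), ('beans'), ('egg plant');
    INSERT INTO table_b VALUES
        (111, 'Roast the beetroot, then add beans and a sliced egg plant.');
""")

# Filter table_b to one row first, then test every keyword against its text.
matches = [row[0] for row in conn.execute("""
    SELECT a.keyword
    FROM table_a AS a
    JOIN table_b AS b ON b.text LIKE '%' || a.keyword || '%'
    WHERE b.id = ?
    ORDER BY a.keyword
""", (111,))]

print(matches)
```

Because this is plain substring matching, a keyword such as beans would also match inside a longer word such as soybeans.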
