TDengine data loss issue

The issue is as follows. There are two clients:
client 1:
taos> create table stb(ts timestamp, c1 int, c2 float) tags(t1 int);
Query OK, 0 of 0 row(s) in database (0.002777s)
taos> insert into t1 using stb tags(1) values(now, 1, 1.2);
Query OK, 1 of 1 row(s) in database (0.011575s)
taos> select * from stb;
ts | c1 | c2 | t1 |
=============================================================================
2021-09-29 23:07:03.665 | 1 | 1.20000 | 1 |
Query OK, 1 row(s) in set (0.002033s)
client 2:
taos> select * from stb;
ts | c1 | c2 | t1 |
=============================================================================
2021-09-29 23:07:03.665 | 1 | 1.20000 | 1 |
Query OK, 1 row(s) in set (0.002254s)
Everything looks good so far, but then run the following steps:
client 1:
taos> alter table stb drop column c2;
Query OK, 0 of 0 row(s) in database (0.004846s)
taos> alter table stb add column c2 double;
Query OK, 0 of 0 row(s) in database (0.003483s)
client 2:
taos> insert into t1 values(now, 2, 2.2);
Query OK, 1 of 1 row(s) in database (0.001591s)
Now execute the query below in both clients:
client 1:
taos> select * from t1;
ts | c1 | c2 |
====================================================================
2021-09-29 23:07:03.665 | 1 | NULL |
2021-09-29 23:09:37.872 | 2 | NULL |
Query OK, 2 row(s) in set (0.002334s)
client 2:
taos> select * from t1;
ts | c1 | c2 |
====================================================================
2021-09-29 23:07:03.665 | 1 | 1.200000000 |
2021-09-29 23:09:37.872 | 2 | 2.200000000 |
Query OK, 2 row(s) in set (0.002121s)
Why does the same query return different results in different clients? Is there a data loss issue in this scenario?

The quick answer is yes: this is a known design issue in TDengine 2.x. It is therefore not recommended to drop a float column and then add back a double column with the same name.

Related

Does anybody know how to show a complete line in the taos shell in TDengine?

The query results shown in the taos shell are truncated. Does anybody know how to disable this truncation? I have tried \G, but the output format is different.
taos> select * from tb1;
ts | f1 | f2 |
=========================================================================
2022-03-31 08:50:44.398 | 1 | Hash Join (cost=230.47..71... |
Query OK, 1 row(s) in set (0.001446s)
taos> select * from tb1\G;
*************************** 1.row ***************************
ts: 2022-03-31 08:50:44.398
f1: 1
f2: Hash Join (cost=230.47..713.98 rows=101 width=488) (actual time=0.711..7.427 rows=100 loops=1)
Query OK, 1 row(s) in set (0.001340s)
You can use the following SQL:
taos> set max_binary_display_width 100;
taos> show stables;
name | created_time | columns | tags | tables |
==================================================================================================================================================================
meters1 | 2022-05-26 18:53:58.611 | 121 | 2 | 10000 |
Query OK, 1 row(s) in set (0.001390s)
It seems \G is OK and there's no difference in your output.
You can also refer to this FAQ.

How to update a specified field to NULL with the update option set to 2 in a TDengine database?

By using the update = 2 option when creating a database, we can update specified fields. How should I update such a field to NULL?
taos> create database db update 2;
Query OK, 0 of 0 row(s) in database (0.008446s)
taos> use db;
Database changed.
taos> create table tb(ts timestamp, c1 int, c2 nchar(20));
Query OK, 0 of 0 row(s) in database (0.024248s)
taos> insert into tb values(now, 1, "beijing");
Query OK, 1 of 1 row(s) in database (0.008139s)
taos> select * from tb;
ts | c1 | c2 |
=========================================================================
2022-02-07 14:54:54.189 | 1 | beijing |
Query OK, 1 row(s) in set (0.001694s)
taos> insert into tb values("2022-02-07 14:54:54.189", 2, NULL);
Query OK, 1 of 1 row(s) in database (0.000608s)
taos> select * from tb;
ts | c1 | c2 |
=========================================================================
2022-02-07 14:54:54.189 | 2 | beijing |
Query OK, 1 row(s) in set (0.005644s)
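update = 2 cannot achieve what you want: in that mode a NULL in the new row leaves the stored value unchanged, which is why c2 is still beijing above. Use update = 1 instead, where a row written with an existing timestamp replaces the whole row, NULL columns included.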
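To make the difference between the update modes concrete, here is a toy Python model of how a write to an existing timestamp could be merged under each mode. It is only a sketch based on the behaviour shown in the question and on the documented meaning of the update option in TDengine 2.x; it is not TDengine's implementation, and the names write_row and replay are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Column values of one row, without the timestamp. None stands for NULL.
Row = Tuple[Optional[object], ...]


def write_row(table: Dict[str, Row], ts: str, new: Row, update: int) -> None:
    """Toy model (assumption) of a write keyed by timestamp under an update mode."""
    old = table.get(ts)
    if old is None:
        table[ts] = new  # no row at this timestamp yet: plain insert
    elif update == 0:
        pass  # update 0: the later row with the same timestamp is discarded
    elif update == 1:
        table[ts] = new  # update 1: the whole row is replaced, NULLs included
    elif update == 2:
        # update 2: columns that are NULL in the new row keep the stored value
        table[ts] = tuple(o if n is None else n for o, n in zip(old, new))
    else:
        raise ValueError(f"unsupported update mode: {update}")


def replay(update: int) -> Row:
    """Replay the two inserts from the question under the given update mode."""
    table: Dict[str, Row] = {}
    ts = "2022-02-07 14:54:54.189"
    write_row(table, ts, (1, "beijing"), update)
    write_row(table, ts, (2, None), update)
    return table[ts]


if __name__ == "__main__":
    for mode in (0, 1, 2):
        print(mode, replay(mode))
```

In this model only mode 1 ends with c2 set to NULL; mode 2 reproduces the (2, beijing) row seen in the question.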

union all in TDengine

Recently I tried union all in TDengine and found an issue:
taos> select count(*) as count, loc from st where ts between 1600000000000 and 1600000000010 group by loc;
count | loc | loc |
==========================================================================================
10 | nchar0 | nchar0 |
10 | nchar1 | nchar1 |
10 | nchar2 | nchar2 |
10 | nchar3 | nchar3 |
10 | nchar4 | nchar4 |
10 | nchar5 | nchar5 |
Query OK, 6 row(s) in set (0.003831s)
taos> select count(*) as count, loc from st where ts between 1600000000020 and 1600000000030 group by loc;
Query OK, 0 row(s) in set (0.002620s)
taos> select count(*) as count, loc from st where ts between 1600000000000 and 1600000000010 group by loc
-> union all
-> select count(*) as count, loc from st where ts between 1600000000020 and 1600000000030 group by loc;
count | loc | loc |
==========================================================================================
10 | nchar0 | nchar0 |
10 | nchar1 | nchar1 |
10 | nchar2 | nchar2 |
10 | nchar3 | nchar3 |
10 | nchar4 | nchar4 |
10 | nchar5 | nchar5 |
Query OK, 6 row(s) in set (0.004686s)
taos> select count(*) as count, loc from st where ts between 1600000000020 and 1600000000030 group by loc
-> union all
-> select count(*) as count, loc from st where ts between 1600000000000 and 1600000000010 group by loc;
count | loc | loc |
==========================================================================================
Query OK, 0 row(s) in set (0.004371s)
Given the queries above, why are the results of query 3 and query 4 different? This confuses me.
Because TDengine's SQL is a SQL-like language, I guess its implementation is basically the same as standard SQL. In MySQL, the result table of UNION ALL takes its structure from the first table rather than the second. For example, the structure of Table 1 is:
CREATE TABLE `table_1` (
`col1` int(255) DEFAULT NULL,
`col2` int(255) DEFAULT NULL,
`col3` int(255) DEFAULT NULL,
`col4` int(255) DEFAULT NULL
) ENGINE = InnoDB CHARSET = latin1;
The structure of Table 2 is:
CREATE TABLE `table_2` (
`col4` int(255) DEFAULT NULL,
`col3` int(255) DEFAULT NULL,
`col2` int(255) DEFAULT NULL,
`col1` int(255) DEFAULT NULL
) ENGINE = InnoDB CHARSET = latin1;
The result of Table 1 UNION ALL Table 2 has the same structure as Table 1, and the result of Table 2 UNION ALL Table 1 has the same structure as Table 2.
So in this case, for query 4 the first result set is empty and TDengine may not recognize its structure, so the result table structure is also empty, which produces the inconsistent results.
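The "structure comes from the first SELECT" rule can be checked in any engine you have at hand. The sketch below uses Python's built-in sqlite3 module, so it demonstrates SQLite's behaviour only, not MySQL's or TDengine's. The table names mirror the example above, the function name union_all_demo is hypothetical, and table_1 is left empty on purpose to mimic the empty first branch of query 4.

```python
import sqlite3


def union_all_demo():
    """Run UNION ALL in both orders and return column names and rows for each."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE table_1 (col1 INTEGER, col2 INTEGER);
        CREATE TABLE table_2 (col4 INTEGER, col3 INTEGER);
        INSERT INTO table_2 VALUES (40, 30);
        """
    )
    # Empty branch first, like query 4 in the question.
    cur = conn.execute(
        "SELECT col1, col2 FROM table_1 UNION ALL SELECT col4, col3 FROM table_2"
    )
    names_first = [d[0] for d in cur.description]
    rows_first = cur.fetchall()

    # Non-empty branch first, like query 3 in the question.
    cur = conn.execute(
        "SELECT col4, col3 FROM table_2 UNION ALL SELECT col1, col2 FROM table_1"
    )
    names_second = [d[0] for d in cur.description]
    rows_second = cur.fetchall()
    conn.close()
    return names_first, rows_first, names_second, rows_second


if __name__ == "__main__":
    print(union_all_demo())
```

In SQLite the column names follow the first SELECT in both orders, and the row from table_2 is returned even when the first branch is empty. That is the behaviour one would normally expect from UNION ALL, which suggests the empty result of query 4 is specific to TDengine.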

Create SQL Server Select/Delete Query from value in other table

I have a master table named Master_Table and the columns and values in the master table are below:
| ID | Database | Schema | Table_name | Common_col | Value_ID |
+-------+------------+--------+-------------+------------+----------+
| 1 | Database_1 | Test1 | Test_Table1 | Test_ID | 1 |
| 2 | Database_2 | Test2 | Test_Table2 | Test_ID | 1 |
| 3 | Database_3 | Test3 | Test_Table3 | Test_ID2 | 2 |
I have another table, Value_Table, which consists of the values that need to be deleted.
| Value_ID | Common_col | Value |
+----------+------------+--------+
| 1 | Test_ID | 110 |
| 1 | Test_ID | 111 |
| 1 | Test_ID | 115 |
| 2 | Test_ID2 | 999 |
I need to build a query that generates a SQL statement to delete values from the table listed in Master_Table, using the database and schema given in the same row. The column to filter on is given in the Common_col column of Master_Table, and the values to delete are in the Value column of Value_Table.
The result of my query should be a statement like the ones below:
DELETE FROM Database_1.Test1.Test_Table1 WHERE Test_ID=110;
or
DELETE FROM Database_1.Test1.Test_Table1 WHERE Test_ID in (110,111,115);
These queries should run inside a loop so that I can delete all the rows from all the databases and tables listed in the master table.
Queries don't really create queries.
One way to do what you're saying, which could be useful if this is a one time thing or very occasional thing, is to use SSMS to generate query statements, then copy them to the clipboard, paste them into the window, and execute there.
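SELECT 'DELETE FROM ' + M.[Database] + '.' + M.[Schema] + '.' + M.Table_name
+ ' WHERE ' + M.Common_col
+ ' = '
+ convert(VARCHAR(10), V.Value) + ';'
FROM Master_Table M
INNER JOIN Value_Table V ON V.Value_ID = M.Value_ID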
This probably isn't what you want; it sounds more like you want to automate cleanup or something.
You can turn this into one big query if you don't mind repeating yourself a little:
DELETE T1
FROM Database_1.Test1.Test_Table1 T1
INNER JOIN Database_1.Test1.ValueTable VT ON
(VT.common_col = 'Test_ID' and T1.Test_ID=VT.Value) OR
(VT.common_col = 'Test_ID2' and T1.Test_ID2=VT.Value)
You can also use dynamic SQL combined with the first part ... but I hate dynamic SQL so I'm not going to put it in my answer.
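If generating the statements outside the database is acceptable, the same idea can be scripted. The following is a minimal Python sketch, assuming the rows of Master_Table and Value_Table have already been fetched as tuples in the column order shown in the question. The function name build_delete_statements and the sample constants are hypothetical, and the identifier check is a simple safeguard rather than full SQL Server quoting.

```python
import re
from collections import defaultdict

# Accept only plain identifiers so that table data cannot inject SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

# Sample rows copied from the question.
MASTER_ROWS = [
    (1, "Database_1", "Test1", "Test_Table1", "Test_ID", 1),
    (2, "Database_2", "Test2", "Test_Table2", "Test_ID", 1),
    (3, "Database_3", "Test3", "Test_Table3", "Test_ID2", 2),
]
VALUE_ROWS = [
    (1, "Test_ID", 110),
    (1, "Test_ID", 111),
    (1, "Test_ID", 115),
    (2, "Test_ID2", 999),
]


def build_delete_statements(master_rows, value_rows):
    """Return one DELETE ... IN (...) statement per master row that has values."""
    values_by_key = defaultdict(list)
    for value_id, common_col, value in value_rows:
        # int() rejects anything that is not a plain number.
        values_by_key[(value_id, common_col)].append(int(value))

    statements = []
    for _id, database, schema, table, common_col, value_id in master_rows:
        for name in (database, schema, table, common_col):
            if not _IDENT.match(name):
                raise ValueError(f"unsafe identifier: {name!r}")
        values = values_by_key.get((value_id, common_col))
        if not values:
            continue  # nothing to delete for this master row
        in_list = ",".join(str(v) for v in values)
        statements.append(
            f"DELETE FROM {database}.{schema}.{table} "
            f"WHERE {common_col} IN ({in_list});"
        )
    return statements


if __name__ == "__main__":
    for stmt in build_delete_statements(MASTER_ROWS, VALUE_ROWS):
        print(stmt)
```

The generated statements can be reviewed before being executed, which keeps the manual checkpoint of the copy-and-paste approach while removing the hand editing.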

Why am I getting an index scan for a covered query using aggregate function?

I have a query:
select min(timestamp) from table
This table has 60+ million rows, and every day I delete a few off the end. To determine whether there is any data old enough to delete, I run the query above. There is an index on timestamp ascending, containing only that one column, and the query plan in Oracle shows this as a full index scan. Should this not be the definition of a seek?
edit including plan:
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 2 | INDEX FULL SCAN (MIN/MAX)| NEVENTS_I2 | 1 | 8 | 4 (100)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 8 | | |
| 0 | SELECT STATEMENT | | 1 | 8 | 4 (0)| 00:00:01 |
Can you post the actual query plan? Are you sure that it is not doing a min/max index full scan? As you can see in this example, we're getting the MIN value from a 100,000 row table using a min/max index full scan with only a handful of consistent gets.
SQL> create table foo (
2 col1 date not null
3 );
Table created.
SQL> insert into foo
2 select sysdate + level
3 from dual
4 connect by level <= 100000;
100000 rows created.
SQL> create index idx_foo_col1
2 on foo( col1 );
Index created.
SQL> analyze table foo compute statistics for all indexed columns;
Table analyzed.
SQL> set autotrace on;
<<Note that I ran this statement once just to get the delayed block cleanout to
happen so that the consistent gets number wouldn't be skewed. You could run a
different query as well>>
1* select min(col1) from foo
SQL> /
MIN(COL1)
---------
02-FEB-11
Execution Plan
----------------------------------------------------------
Plan hash value: 817909383
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 7 | | |
| 2 | INDEX FULL SCAN (MIN/MAX)| IDX_FOO_COL1 | 1 | 7 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
532 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
At first I thought that the index would only be used if the column is declared NOT NULL. I tested with the following setup:
SQL> CREATE TABLE my_table (ts TIMESTAMP);
Table created
SQL> INSERT INTO my_table
2 SELECT systimestamp + ROWNUM * INTERVAL '1' SECOND
3 FROM dual CONNECT BY LEVEL <= 100000;
100000 rows inserted
SQL> CREATE INDEX ix ON my_table(ts);
Index created
SQL> EXPLAIN PLAN FOR SELECT MIN(ts) FROM my_table;
Explained
SQL> SELECT * FROM TABLE(dbms_xplan.display);
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 69 (2)| 00:00:0
| 1 | SORT AGGREGATE | | 1 | 13 | |
| 2 | INDEX FULL SCAN (MIN/MAX)| IX | 90958 | 1154K| |
--------------------------------------------------------------------------------
Here we notice that the index is used, but all rows from the index are read. If we specify that the column is not null we get a much better plan:
SQL> ALTER TABLE my_table MODIFY ts NOT NULL;
Table altered
SQL> EXPLAIN PLAN FOR SELECT MIN(ts) FROM my_table;
Explained
SQL> SELECT * FROM TABLE(dbms_xplan.display);
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:0
| 1 | SORT AGGREGATE | | 1 | 13 | |
| 2 | INDEX FULL SCAN (MIN/MAX)| IX | 90958 | 1154K| 2 (0)| 00:00:0
--------------------------------------------------------------------------------
In fact this is the same plan that is also used if we add a WHERE clause (Oracle will read a single row from the index):
SQL> EXPLAIN PLAN FOR SELECT MIN(ts) FROM my_table WHERE ts IS NOT NULL;
Explained
SQL> SELECT * FROM TABLE(dbms_xplan.display);
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:
| 1 | SORT AGGREGATE | | 1 | 13 | |
| 2 | FIRST ROW | | 90958 | 1154K| 2 (0)| 00:00:
| 3 | INDEX FULL SCAN (MIN/MAX)| IX | 90958 | 1154K| 2 (0)| 00:00:
--------------------------------------------------------------------------------
This last plan shows (line 2) that Oracle is indeed performing a "seek".
Just wanted to home in on the fact that an "INDEX FULL SCAN (MIN/MAX)" is simply not the same as an "INDEX FULL SCAN". An INDEX FULL SCAN really does scan the entire index (possibly with filtering). However, an INDEX FULL SCAN (MIN/MAX) or INDEX RANGE SCAN (MIN/MAX) only gets the smallest or largest leaf block (from the range), but can only be employed as long as the column is NOT NULL (which is a bit silly, and really a bug, since a NULL value is by definition neither the smallest nor the largest value). The (MIN/MAX) optimization is an implicit FIRST_ROWS action, and doesn't need the "WHERE ... IS NOT NULL" query condition to perform the optimization. Interestingly, the MIN/MAX optimization is normally not considered by the CBO for function-based indexes; that is another little bug.
