TDengine SQL issue - tdengine

I recently found that the SELECT syntax in TDengine SQL is slightly different from MySQL because of the super table (STable) concept, but I cannot give a reasonable explanation. Let me show an example:
Here is the STable structure:
describe mystable;
Field | Type | Length | Note |
=================================================================================
ts | TIMESTAMP | 8 | |
value | DOUBLE | 8 | |
tag1 | NCHAR | 16 | TAG |
tag2 | NCHAR | 43 | TAG |
tag3 | NCHAR | 29 | TAG |
tag4 | NCHAR | 10 | TAG |
tag5 | NCHAR | 2 | TAG |
When I execute:
select count(*) from mystable
count(*) |
========================
270419 |
And when counting a column, the result is the same, as expected:
select count(value) from mystable
count(value) |
========================
270419 |
However, when counting a tag, the result is different:
select count(tag1) from mystable
count(tag1) |
========================
13 |
So what does select count(tag_name) actually mean in TDengine SQL?

After getting a response from TDengine's R&D staff: select count(tag_name) returns the number of child tables belonging to this STable that have the specified tag set. We can list them with the following SQL:
select tbname, tag1 from mystable;

From my understanding, select count(*) or count(column) returns the total number of records across all child tables belonging to the same super table. Tags are different: although tags can be used to filter records, a tag value is stored once per child table rather than attached to each record, so select count(tag) returns the number of child tables that have that tag set, not the total number of records.
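To make this concrete, here is a minimal sketch of how such a schema is typically built (the child-table name, tag values, and inserted row below are made up; the column and tag definitions follow the describe output above):
CREATE STABLE mystable (ts TIMESTAMP, value DOUBLE) TAGS (tag1 NCHAR(16), tag2 NCHAR(43), tag3 NCHAR(29), tag4 NCHAR(10), tag5 NCHAR(2));
CREATE TABLE child_01 USING mystable TAGS ('dev-01', 'site-a', 'rack-1', 'zone-x', 'ok');
INSERT INTO child_01 VALUES (NOW, 12.5);
The tag values are supplied once, when the child table is created, so they exist per table rather than per row. Rows in a child table carry only ts and value, which is why count(*) sees all 270419 records while count(tag1) sees only the 13 child tables.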

Related

Column Index not reflecting in Explain Plan for predicates with "IN" Statement

I have a table (TAB1) with a column named IDENTIFIER, and the table has an index on this column. Whenever I query a single value using a simple WHERE clause, the explain plan shows that it is using the existing index on that column.
But when I have a list of values in another table, say a temporary table (TEMP_IDENTIFIER) containing all the identifiers I want to query, and I frame a query against the same table with an IN clause, I can see that the explain plan does not consider the index; instead it performs a full table scan on the table.
Ideally I would want the second query to use the existing index as well.
Please find both queries and their explain plans below.
Query 1
explain plan for
select * from schemaowner.TAB1
where IDENTIFIER = 'A';
Explain Plan
Plan hash value: 4172144893
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 51 | 12750 | 11 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TAB1 | 51 | 12750 | 11 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | COL_INDEX | 51 | | 4 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("IDENTIFIER"='A')
Query 2
explain plan for
select * from schemaowner.TAB1
where IDENTIFIER in (select IDENTIFIER from SCHEMAOWNER.temp_IDENTIFIER);
Explain Plan :
Plan hash value: 935676029
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3135K| 822M| | 74751 (1)| 00:14:58 |
|* 1 | HASH JOIN RIGHT SEMI| | 3135K| 822M| 2216K| 74751 (1)| 00:14:58 |
| 2 | TABLE ACCESS FULL | TEMP_IDENTIFIER | 61115 | 1492K| | 85 (2)| 00:00:02 |
| 3 | TABLE ACCESS FULL | TAB1 | 3745K| 893M| | 28028 (2)| 00:05:37 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("IDENTIFIER"="IDENTIFIER")
Note
-----
- dynamic sampling used for this statement (level=2)
That's the beauty of the optimizer. It has figured out (or costed) that a hash semi join is the most efficient method :)
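If you want to verify that, a hedged way to compare the two plans (the INDEX hint and DBMS_XPLAN are standard Oracle features; the alias t is introduced here only so the hint has something to reference) is to force the index and look at the cost the optimizer reports:
explain plan for
select /*+ INDEX(t COL_INDEX) */ *
from schemaowner.TAB1 t
where IDENTIFIER in (select IDENTIFIER from SCHEMAOWNER.temp_IDENTIFIER);
select * from table(dbms_xplan.display);
With roughly 61,000 driving values against a 3.7M-row table, that many individual index probes will usually cost out far higher than the single full scan feeding the hash semi join, which is exactly the trade-off shown in the plan above.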

SQL Server TPH (Table Per Hierarchy) auto increment multiple columns base on type

We currently use TPT (Table Per Type) in Entity Framework. This is very slow: we have about 20 tables, and when they are queried, Entity Framework generates some massive, convoluted SQL which performs badly.
Each table has an auto-increment integer column, which allows each type to have a number that is incremented per type. This is what the clients wanted. Now that we want to move to the more performant TPH, we need all these table columns moved into the one table.
How can we have the auto increment columns based on the type as in the results below?
e.g.
Current Job Task
| TaskId | TaskNumber |
-----------------------------
| 1234 | 1 |
| 2345 | 2 |
Current Work Task
| TaskId | TaskNumber |
-----------------------------
| 3244 | 1 |
| 3245 | 2 |
This is the TPH table structure we want. As you can see, we want the task number to increment based on the type of task.
| TaskId | Type | JobTaskNumber | WorkTaskNumber |
---------------------------------------------------------------
| 1234 | Job | 1 | null |
| 2345 | Job | 2 | null |
| 3244 | Work | null | 1 |
| 3245 | Work | null | 2 |
I am wondering whether to use a seeding table, but any solutions would be greatly appreciated.
Many thanks
Andrew
OK, so I did what I thought would work.
It's not a hugely nice approach, as we need about 20 seed tables. Each table has just an identity id defined as a BIGINT in SQL Server.
When we want to add and get a new incremented id, we just call this using Dapper to get the result:
INSERT INTO SeedMyTable DEFAULT VALUES; SELECT CAST(SCOPE_IDENTITY() AS BIGINT)
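For context, here is a minimal sketch of one of those seed tables and how the call above fits around it (the exact column definition is an assumption; the answer only states each seed table has an identity id defined as a BIGINT):
CREATE TABLE SeedMyTable (Id BIGINT IDENTITY(1,1) PRIMARY KEY);
-- each default-values insert burns one identity value, which becomes the next per-type task number
INSERT INTO SeedMyTable DEFAULT VALUES; SELECT CAST(SCOPE_IDENTITY() AS BIGINT)
The returned value is then written into JobTaskNumber or WorkTaskNumber, depending on the task's type, when the row is inserted into the single TPH table.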

SQL Server - Is it possible to define a table column as a table?

I know that this is possible in Oracle and I wonder if SQL Server also supports it (searched for answer without success).
It would greatly simplify my life in the current project if I could define a column of a table to be a table itself, something like:
Table A:
Column_1 Column_2
+----------+----------------------------------------+
| 1 | Columns_2_1 Column_2_2 |
| | +-------------+------------------+ |
| | | 'A' | 12345 | |
| | +-------------+------------------+ |
| | | 'B' | 777777 | |
| | +-------------+------------------+ |
| | | 'C' | 888888 | |
| | +-------------+------------------+ |
+----------+----------------------------------------+
| 2 | Columns_2_1 Column_2_2 |
| | +-------------+------------------+ |
| | | 'X' | 555555 | |
| | +-------------+------------------+ |
| | | 'Y' | 666666 | |
| | +-------------+------------------+ |
| | | 'Z' | 000001 | |
| | +-------------+------------------+ |
+----------+----------------------------------------+
Thanks in advance.
There is one option where you can store the data as XML:
Declare @YourTable table (ID int,XMLData xml)
Insert Into @YourTable values
(1,'<root><ID>1</ID><Active>1</Active><First_Name>John</First_Name><Last_Name>Smith</Last_Name><EMail>john.smith@email.com</EMail></root>')
,(2,'<root><ID>2</ID><Active>0</Active><First_Name>Jane</First_Name><Last_Name>Doe</Last_Name><EMail>jane.doe@email.com</EMail></root>')
Select ID
,Last_Name = XMLData.value('(root/Last_Name)[1]' ,'nvarchar(50)')
,First_Name = XMLData.value('(root/First_Name)[1]' ,'nvarchar(50)')
From @YourTable
Returns
ID Last_Name First_Name
1 Smith John
2 Doe Jane
Actually, for a normalized database we do not require such functionality, because if we need to store a table within a column we can instead create a child table and reference the parent table with a foreign key.
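A minimal sketch of that normalized alternative (the table and column names are hypothetical, mirroring the nested layout in the question):
CREATE TABLE TableA
(
    Column_1 int PRIMARY KEY
);
CREATE TABLE TableA_Detail
(
    Column_1   int NOT NULL REFERENCES TableA (Column_1), -- foreign key back to the parent row
    Column_2_1 char(1),
    Column_2_2 int
);
INSERT INTO TableA VALUES (1);
INSERT INTO TableA_Detail VALUES (1, 'A', 12345), (1, 'B', 777777), (1, 'C', 888888);
A join on Column_1 then returns exactly the nested rows shown for parent row 1.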
That said, if you still insist on such functionality, you can use SQL Server 2016, which supports JSON data, so you can store any associative list in JSON format.
Like:
DECLARE @json NVARCHAR(4000)
SET @json =
N'{
"info":{
"type":1,
"address":{
"town":"Bristol",
"county":"Avon",
"country":"England"
},
"tags":["Sport", "Water polo"]
},
"type":"Basic"
}'
SELECT
JSON_VALUE(@json, '$.type') as type,
JSON_VALUE(@json, '$.info.address.town') as town,
JSON_QUERY(@json, '$.info.tags') as tags
SELECT value
FROM OPENJSON(@json, '$.info.tags')
In older versions, this can be achieved through XML, as shown in the previous answer.
You can also make use of the "sql_variant" datatype to map your table.
Previously, I was also in search of such features as are available in Oracle. But after reading various articles and blogs from experts, I was convinced that such features make things more complex rather than helping.
Storing the data in the required format is not enough on its own; it is only worthwhile if the data is also efficiently readable.
Hope this helps you make your decision.

How to use COUNT() to parse individual words in a nvarchar field?

So my query:
SELECT Tags, COUNT(Tags) AS Listings
FROM Job
WHERE datepart(year, dateposted)=2013
GROUP BY Tags
ORDER BY Listings DESC
Outputs:
+----------------------+----------+
| Tags | Listings |
+----------------------+----------+
| java c++ | 41 |
| software development | 41 |
| java c++ c# | 31 |
| | 25 |
| sysadmin | 25 |
| see jd | 24 |
| java c++ ood | 23 |
| java | 23 |
+----------------------+----------+
I want it to come out like so:
+----------------------+----------+
| Tags | Listings |
+----------------------+----------+
| java | 118|
| c++ | 95 |
| ood | 23 |
| see | 24 |
| jd | 24 |
| software development | 41 |
| sysadmin | 25 |
| c# | 31 |
+----------------------+----------+
How can I count each individual word in the field instead of the entire field? The tags column is nvarchar.
First, your table structure is awful. Storing data in a list like that is going to cause you headaches similar to what you are trying to do right now.
The problem with a split function is that you have no idea what "software development" or other multi-word tags are: is that one tag or two?
I think the only way you will solve this is by creating a table with your tags or using a derived table similar to the following:
;with cte (tag) as
(
select 'java' union all
select 'c++' union all
select 'software development' union all
select 'sysadmin' union all
select 'ood' union all
select 'jd' union all
select 'see' union all
select 'c#'
)
select c.tag, count(j.tags) listings
from cte c
inner join job j
on j.tags like '%'+c.tag+'%'
group by c.tag
See SQL Fiddle with Demo. Using this you can get a result:
| TAG | LISTINGS |
| java | 9 |
| c++ | 10 |
| software development | 4 |
| sysadmin | 2 |
| ood | 6 |
| jd | 3 |
| see | 2 |
| c# | 1 |
The issue with the above, as was pointed out in the comments, is deciding what happens if you also have separate software and development tags: both will match the "software development" rows with the above query.
The best solution that you would have to this problem would be to store the tags in a separate table similar to:
create table tags
(
tag_id int,
tag_name varchar(50)
);
Then you could use a JOIN table to connect your jobs to the tag:
create table tag_job
(
job_id int,
tag_id int
);
Once you have a set up similar to this then it becomes much easier to query your data:
select t.tag_name,
count(tj.tag_id) listings
from tags t
inner join tag_job tj
on t.tag_id = tj.tag_id
group by t.tag_name
See demo
You will probably need to split out the individual words.
Here's a good series on splitters in SQL Server:
SqlServerCentral.com
I don't see how you will be able to differentiate "software development" as a single tag though. If you have a list of acceptable tags elsewhere, you could probably use that to perform a count.
If you have a list of Available Tags, here is one approach that doesn't require a split.
Sql Fiddle Example
There could be an issue with this approach if you have a tag that is contained in another, e.g. 'software' and 'software development'.
This is how I solved my issue.
SELECT TOP 50 Tags.s Tag, COUNT(Tags.s) AS Listings
FROM Job
CROSS APPLY [dbo].[SplitString](Tags,' ') Tags
WHERE NOT Job.Tags IS NULL and datepart(year,job.datecreated) = 2013
GROUP BY Tags.s
ORDER BY Listings DESC
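The query above relies on a table-valued function dbo.SplitString that isn't shown. As a rough sketch, on SQL Server 2016 or later it could simply wrap the built-in STRING_SPLIT and expose the column name s that the query expects; on older versions you would use one of the splitters from the SqlServerCentral series linked above:
CREATE FUNCTION dbo.SplitString (@input NVARCHAR(MAX), @delimiter NCHAR(1))
RETURNS TABLE
AS
RETURN
(
    -- STRING_SPLIT requires SQL Server 2016+ (compatibility level 130)
    SELECT s = value
    FROM STRING_SPLIT(@input, @delimiter)
);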

Rearranging and deduplicating SQL columns based on column data

Sorry, I know that's a rubbish title, but I couldn't think of a more concise way of describing the issue.
I have a (MSSQL 2008) table that contains telephone numbers:
| CustomerID | Tel1 | Tel2 | Tel3 | Tel4 | Tel5 | Tel6 |
| Cust001 | 01222222 | 012333333 | 07111111 | 07222222 | 01222222 | NULL |
| Cust002 | 07444444 | 015333333 | 07555555 | 07555555 | NULL | NULL |
| Cust003 | 01333333 | 017777777 | 07888888 | 07011111 | 016666666 | 013333 |
I'd like to:
Remove any duplicate phone numbers
Rearrange the telephone numbers so that anything beginning with "07" is the first phone number. If there are multiple 07's, they should be in the first fields. The order of the numbers apart from that doesn't really matter.
So, for example, after processing, the table would look like:
| CustomerID | Tel1 | Tel2 | Tel3 | Tel4 | Tel5 | Tel6 |
| Cust001 | 07111111 | 07222222 | 01222222 | 012333333 | NULL | NULL |
| Cust002 | 07444444 | 07555555 | 015333333 | NULL | NULL | NULL |
| Cust003 | 07888888 | 07011111 | 016666666 | 013333 | 01333333 | 017777777 |
I'm struggling to figure out how to efficiently achieve my goal (there are 600,000+ records in the table). Can anyone help?
I've created a fiddle if it'll help anyone play around with the scenario.
You can break the numbers up into individual rows using UNPIVOT, remove the duplicates with DISTINCT, reorder them based on the occurrence of the '07' prefix using ROW_NUMBER(), and finally recombine them using PIVOT to end up with the 6 Tel columns again.
select *
FROM
(
    select CustomerID,
           Tel,
           Col = 'Tel' + RIGHT(row_number() over (partition by CustomerID
                                                  order by case
                                                               when Tel like '07%' then 1
                                                               else 2
                                                           end), 10)
    FROM
    (
        -- DISTINCT drops duplicate numbers per customer before they are renumbered
        select distinct CustomerID, Tel
        from phonenumbers
        UNPIVOT (Tel for Seq in (Tel1,Tel2,Tel3,Tel4,Tel5,Tel6)) seqs
    ) U
) P
PIVOT (MAX(Tel) for Col IN (Tel1,Tel2,Tel3,Tel4,Tel5,Tel6)) V;
SQL Fiddle
Perhaps use a cursor to collect all the customer IDs and sort the fields: the traditional sorting technique we used to do in school C++... lol. I'd like to know if any other method is possible.
If you don't get any other answer, then this is the last resort; it will certainly take a long time to execute.
