In these two query results of TDengine, why is the number after the decimal point different? - tdengine

As you can see, the first SQL statement's result is 457445.014572144, while the second one is 457445.000000000.
taos> select time, price, vol, price*vol*100 as amount from sh600096 where time > '2022-12-15' limit 1;
time | price | vol | amount |
2022-12-15 09:30:24.000 | 23.95000 | 191 | 457445.014572144 |
Query OK, 10 row(s) in set (0.071000s)
taos> select 23.95*191*100 as amount;
amount |
457445.000000000 |
Query OK, 1 row(s) in set (0.008000s)
Below is my create table statement:
CREATE STABLE st_transaction_data (time TIMESTAMP, price FLOAT, vol INT, buyorsell TINYINT) TAGS (market TINYINT)
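The difference is consistent with price being a FLOAT (32-bit) column: 23.95 cannot be represented exactly and is stored as roughly 23.9500007629, so price*vol*100 evaluates to about 457445.01457, while the literal expression 23.95*191*100 is evaluated in double precision and displays as 457445.000000000. A minimal sketch of the same effect, written in SQL Server syntax purely for illustration (REAL standing in for TDengine's 32-bit FLOAT):
-- 23.95 stored as a 32-bit float becomes approximately 23.9500007629
SELECT CAST(CAST(23.95 AS REAL) AS FLOAT) * 191 * 100 AS from_float_column; -- about 457445.0145721
SELECT 23.95 * 191 * 100 AS from_literal;                                   -- 457445 exactly (decimal arithmetic here)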

Related

SQL Server find sum of values based on criteria within another table

I have a table consisting of ID, Year, Value
---------------------------------------
| ID | Year | Value |
---------------------------------------
| 1 | 2006 | 100 |
| 1 | 2007 | 200 |
| 1 | 2008 | 150 |
| 1 | 2009 | 250 |
| 2 | 2005 | 50 |
| 2 | 2006 | 75 |
| 2 | 2007 | 65 |
---------------------------------------
I then create a derived, aggregated table consisting of an ID, MinYear, and MaxYear
---------------------------------------
| ID | MinYear | MaxYear |
---------------------------------------
| 1 | 2006 | 2009 |
| 2 | 2005 | 2007 |
---------------------------------------
I then want to find the sum of Values between the MinYear and MaxYear for each ID in the aggregated table, but I am having trouble determining a proper query.
The final table should look something like this
----------------------------------------------------
| ID | MinYear | MaxYear | SumVal |
----------------------------------------------------
| 1 | 2006 | 2009 | 700 |
| 2 | 2005 | 2007 | 190 |
----------------------------------------------------
Right now I can perform all the joins to create the second table, but then I use a fast-forward cursor to iterate through each record of the second table, with the code inside the loop looking like the following:
DECLARE @curMin int
DECLARE @curMax int
DECLARE @curID int
FETCH NEXT FROM fastCursor INTO @curID, @curMin, @curMax
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT SUM(Value) FROM ValTable WHERE Year >= @curMin AND Year <= @curMax AND ID = @curID
GROUP BY ID
FETCH NEXT FROM fastCursor INTO @curID, @curMin, @curMax
END
Having found the sum of values between the specified years, I can connect it back to the second table and wind up with the desired result (the third table).
However, the second table is in reality roughly 4 million rows, so this iteration is extremely time consuming (generating roughly 300 results a minute) and presumably not the best solution.
My question is, is there a way to generate the third table's results without having to use a cursor/for loop?
During a GROUP BY, the sum will only be for the ID in question. Since the min year and max year come from the ID itself, you don't need to query twice. The query below should give you exactly what you need; if you have a different requirement, let me know.
SELECT ID, MIN(YEAR) as MinYear, MAX(YEAR) as MaxYear, SUM(VALUE) as SUMVALUE
FROM tablenameyoudidnotsay
GROUP BY ID
You could use a query like the one below, where TableA is your first table and TableB is the second one:
SELECT *,
       (SELECT SUM(Value) FROM TableA
        WHERE TableA.ID = TableB.ID
          AND TableA.Year BETWEEN TableB.MinYear AND TableB.MaxYear) AS SumValue
FROM TableB
You can put your criteria into a join and obtain the result all as one set, which should be faster:
SELECT b.Id, b.MinYear, b.MaxYear, sum(a.Value)
FROM Table2 b
JOIN Table1 a ON a.Id=b.Id AND b.MinYear <= a.Year AND b.MaxYear >= a.Year
GROUP BY b.Id, b.MinYear, b.MaxYear
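Putting the answers together against the names in the question's own cursor code (ValTable for the first table; the derived table built inline as a CTE), a single set-based statement replaces the loop entirely. This is just a sketch of the combined approach:
;WITH Agg AS (
    -- the "second table": one row per ID with its min and max year
    SELECT ID, MIN(Year) AS MinYear, MAX(Year) AS MaxYear
    FROM ValTable
    GROUP BY ID
)
SELECT a.ID, a.MinYear, a.MaxYear, SUM(v.Value) AS SumVal
FROM Agg a
JOIN ValTable v
  ON v.ID = a.ID
 AND v.Year BETWEEN a.MinYear AND a.MaxYear
GROUP BY a.ID, a.MinYear, a.MaxYear;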

SQL Server - display a second record below the first one with other data

I have a SQL table with the below data:
Id department Amount
1 Accounting 10000
2 Catering 5000
3 Cleaning 5000
I want to return the data as below:
Id department Amount
1 Accounting 10000
1 50%
2 Catering 5000
2 25%
3 Cleaning 5000
3 25%
This implies every record returns a second record just below it that displays its percentage of the total amount. I have tried to use PIVOT, but I still cannot position
the second row just below the first, related one.
Has anyone ever done something similar? I just need some guidelines.
create table #T(Id int, Dept varchar(10),Amount int)
insert into #T
values(1,'Accounting',10000),(2,'Catering',5000),(3,'Cleaning',5000)
declare @Totll float = (Select sum(Amount) from #T)
Select *
from #T
union
select Id,Convert(varchar(50), (Amount/@Totll)*100)+'%',0
from #T
order by Id,Amount desc
Use a CTE to calculate the total of the amounts.
Then use UNION ALL for your table and the query which calculates the percentages:
with cte as (select sum(amount) sumamount from tablename)
select id, department, amount
from tablename
union all
select id, concat(100 * amount / (select sumamount from cte), '%'), null
from tablename
order by id, amount desc
See the demo.
Results:
+----+------------+--------+
| id | department | amount |
+----+------------+--------+
| 1  | Accounting | 10000  |
| 1  | 50%        | null   |
| 2  | Catering   | 5000   |
| 2  | 25%        | null   |
| 3  | Cleaning   | 5000   |
| 3  | 25%        | null   |
+----+------------+--------+
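For reference, the same output can also be produced in a single pass with a window aggregate instead of a CTE or variable. This is only a sketch against the same hypothetical tablename and column names used above, and it assumes amount is an integer column so the division truncates (SQL Server 2012+ for CONCAT):
select id, department, amount
from tablename
union all
-- SUM() OVER () gives the grand total on every row, so no separate total query is needed
select id, concat(100 * amount / sum(amount) over (), '%'), null
from tablename
order by id, amount desc;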

Sum of DataLength values does not match table size

I am trying to find which user records in a table are taking up the most space. To this end, I am using the DATALENGTH function in SQL Server.
SELECT
UserName,
SUM(
ISNULL(DATALENGTH(columnA), 1) +
ISNULL(DATALENGTH(columnB), 1) +
....
ISNULL(DATALENGTH(columnZ), 1)
)/1000000 AS SizeInMegaBytes
FROM MyTable
GROUP BY UserName
ORDER BY SizeInMegaBytes DESC
Results:
+----------+-----------------+
| UserName | SizeInMegaBytes |
+----------+-----------------+
| User1 | 1700 |
+----------+-----------------+
| User2 | 1504 |
+----------+-----------------+
| .... | .... |
+----------+-----------------+
| User75 | 20 |
+----------+-----------------+
Total Size = 16,523 MB
The only problem is that the results don't match up with the size of the table. I use the built-in stored procedure to get the size of the table
sp_spaceused [MyTable]
Results:
+---------+-------+-------------+-------------+------------+-------------+
| name | rows | reserved | data | index_size | unused |
+---------+-------+-------------+-------------+------------+-------------+
| MyTable | 61477 | 59425416 KB | 42482152 KB | 62584 KB | 16880680 KB |
+---------+-------+-------------+-------------+------------+-------------+
The stored procedure shows the total data size as 42 GB yet the query of all the columns shows 16 GB. What could be taking up the extra space if I have accounted for the size of all the columns?
EDIT - I don't think my issue is the same as the duplicate mentioned, because here I am taking the SUM of all the grouped records while the previous question did not. There is such a large disparity between the SUM of the DATALENGTH function and the results of sp_spaceused (29 GB) that I don't think it could be accounted for by indexes or header information alone.
First, your math is suspect: 1 MB = 1024 KB = 1024 * 1024 bytes, not 1,000,000 bytes.
Second, there could be metadata and per-row overhead associated with the table and its records. Inspecting the table definition can give insight here.
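As a sketch of both points, the per-user query can divide by 1024 * 1024 to get true megabytes, and the total can be cross-checked against the pages the table actually uses; MyTable and the column names below are the hypothetical ones from the question:
-- Per-user size in MB using 1024 * 1024 bytes per MB, same ISNULL(DATALENGTH(...)) pattern as above
SELECT UserName,
       SUM(ISNULL(DATALENGTH(columnA), 1)
         + ISNULL(DATALENGTH(columnB), 1)
         + ISNULL(DATALENGTH(columnZ), 1)) / (1024.0 * 1024.0) AS SizeInMegaBytes
FROM MyTable
GROUP BY UserName
ORDER BY SizeInMegaBytes DESC;

-- Cross-check: pages actually used by the table (8 KB each), which include row and page overhead
SELECT SUM(used_page_count) * 8 / 1024.0 AS UsedMB
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('MyTable');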

Iterate through an SQL Server table and insert rows

A table (Table1) has the data below:
+-----------+-----------+-----------+---------+
| AccountNo | OldBranch | NewBranch | Balance |
+-----------+-----------+-----------+---------+
| 785321 | 10 | 20 | -200 |
| 785322 | 10 | 20 | 300 |
+-----------+-----------+-----------+---------+
Using the logic:
if the Balance is negative (i.e., < 0), then NewBranch has to be debited (Dr) and OldBranch has to be credited (Cr);
if the Balance is positive (i.e., > 0), then OldBranch has to be debited (Dr) and NewBranch has to be credited (Cr);
rows as below have to be inserted into another table (Table2):
+------------+------+--------+--------+
| Account NO | DrCr | Branch | Amount |
+------------+------+--------+--------+
| 785321 | Dr | 20 | 200 |
| 785321 | Cr | 10 | 200 |
| 785322 | Cr | 20 | 300 |
| 785322 | Dr | 10 | 300 |
+------------+------+--------+--------+
What are the possible solutions using a Cursor and otherwise?
Thanks,
You did not provide much in the way of details but something like this should be pretty close.
update nb
set Balance = Balance - ABS(t1.Balance)
from NewBranch nb
join Table1 t1 on t1.AccountNo = nb.AccountNo
where nb.Balance < 0
update ob
set Balance = Balance - ABS(t1.Balance)
from OldBranch ob
join Table1 t1 on t1.AccountNo = ob.AccountNo
where ob.Balance > 0
You absolutely don't need a cursor, just a set of insert statements:
INSERT INTO Table2 (AccountNo,DrCr,Branch,Amount)
SELECT AccountNo,'Dr',IIF(Balance<0,NewBranch,OldBranch),IIF(balance<0,-1*balance,balance) FROM Table1
UNION ALL
SELECT AccountNo,'Cr',IIF(Balance>0,NewBranch,OldBranch),IIF(balance<0,-1*balance,balance) FROM Table1
declare @t table (Accountno int,
OldBranch INT,
NewBranch int,
Balance int)
insert into @t (Accountno,
OldBranch,
NewBranch,
Balance)
values (785321,10,20,200),
(785322,10,20,300)
select Accountno,Y.CRDR,Y.Branch,Y.Amount from @t CROSS APPLY
(Select 'Dr' AS CRDR,OldBranch AS Branch,Balance As Amount
UNION ALL
Select 'Cr',NewBranch,Balance)y

SQL Server insertion performance

Let's suppose I have the following table with a clustered index on a column (say, a)
CREATE TABLE Tmp
(
a int,
constraint pk_a primary key clustered (a)
)
Then, let's assume that I have two sets of a very large number of rows to insert into the table.
1st set) values are sequentially increasing (i.e., {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ..., 999999997, 999999998, 999999999})
2nd set) values are sequentially decreasing (i.e., {999999999, 999999998, 999999997, ..., 3, 2, 1, 0})
Do you think there would be a performance difference between inserting the values in the first set and the second set? If so, why?
Thanks
SQL Server will generally try and sort large inserts into clustered index order prior to insert anyway.
If the source for the insert is a table variable, however, then it will not take account of the cardinality unless the statement is recompiled after the table variable is populated. Without this, it will assume the insert will only be one row.
The below script demonstrates three possible scenarios.
1. The insert source is already exactly in correct order.
2. The insert source is exactly in reversed order.
3. The insert source is exactly in reversed order, but OPTION (RECOMPILE) is used so SQL Server compiles a plan suited for inserting 1,000,000 rows.
Execution Plans
The third one has a sort operator to get the inserted values into clustered index order first.
/*Create three separate identical tables*/
CREATE TABLE Tmp1(a int primary key clustered (a))
CREATE TABLE Tmp2(a int primary key clustered (a))
CREATE TABLE Tmp3(a int primary key clustered (a))
DBCC FREEPROCCACHE;
GO
DECLARE @Source TABLE (N INT PRIMARY KEY (N ASC))
INSERT INTO @Source
SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT 0))
FROM sys.all_columns c1, sys.all_columns c2, sys.all_columns c3
SET STATISTICS TIME ON;
PRINT 'Tmp1'
INSERT INTO Tmp1
SELECT TOP (1000000) N
FROM @Source
ORDER BY N
PRINT 'Tmp2'
INSERT INTO Tmp2
SELECT TOP (1000000) 1000000 - N
FROM @Source
ORDER BY N
PRINT 'Tmp3'
INSERT INTO Tmp3
SELECT 1000000 - N
FROM @Source
ORDER BY N
OPTION (RECOMPILE)
SET STATISTICS TIME OFF;
Verify Results and clean up
SELECT object_name(object_id) AS name,
page_count,
avg_fragmentation_in_percent,
fragment_count,
avg_fragment_size_in_pages
FROM
sys.dm_db_index_physical_stats(db_id(), object_id('Tmp1'), 1, NULL, 'DETAILED')
WHERE index_level = 0
UNION ALL
SELECT object_name(object_id) AS name,
page_count,
avg_fragmentation_in_percent,
fragment_count,
avg_fragment_size_in_pages
FROM
sys.dm_db_index_physical_stats(db_id(), object_id('Tmp2'), 1, NULL, 'DETAILED')
WHERE index_level = 0
UNION ALL
SELECT object_name(object_id) AS name,
page_count,
avg_fragmentation_in_percent,
fragment_count,
avg_fragment_size_in_pages
FROM
sys.dm_db_index_physical_stats(db_id(), object_id('Tmp3'), 1, NULL, 'DETAILED')
WHERE index_level = 0
DROP TABLE Tmp1, Tmp2, Tmp3
STATISTICS TIME ON results
+------+----------+--------------+
| | CPU Time | Elapsed Time |
+------+----------+--------------+
| Tmp1 | 6718 ms | 6775 ms |
| Tmp2 | 7469 ms | 7240 ms |
| Tmp3 | 7813 ms | 9318 ms |
+------+----------+--------------+
Fragmentation Results
+------+------------+------------------------------+----------------+----------------------------+
| name | page_count | avg_fragmentation_in_percent | fragment_count | avg_fragment_size_in_pages |
+------+------------+------------------------------+----------------+----------------------------+
| Tmp1 | 3345 | 0.448430493 | 17 | 196.7647059 |
| Tmp2 | 3345 | 99.97010463 | 3345 | 1 |
| Tmp3 | 3345 | 0.418535127 | 16 | 209.0625 |
+------+------------+------------------------------+----------------+----------------------------+
Conclusion
In this case all three of them ended up using exactly the same number of pages. However, Tmp2 is 99.97% fragmented compared with only about 0.4% for the other two. The insert to Tmp3 took the longest, as it required an additional sort step first, but this one-time cost needs to be set against the benefit that future scans of the table gain from minimal fragmentation.
The reason why Tmp2 is so heavily fragmented can be seen from the below query
WITH T AS
(
SELECT TOP 3000 file_id, page_id, a
FROM Tmp2
CROSS APPLY sys.fn_PhysLocCracker(%%physloc%%)
ORDER BY a
)
SELECT file_id, page_id, MIN(a), MAX(a)
FROM T
group by file_id, page_id
ORDER BY MIN(a)
With zero logical fragmentation, the page holding the next highest key value would be the next highest page in the file, but here the pages are in exactly the opposite order of what they are supposed to be.
+---------+---------+--------+--------+
| file_id | page_id | Min(a) | Max(a) |
+---------+---------+--------+--------+
| 1 | 26827 | 0 | 143 |
| 1 | 26826 | 144 | 442 |
| 1 | 26825 | 443 | 741 |
| 1 | 26824 | 742 | 1040 |
| 1 | 26823 | 1041 | 1339 |
| 1 | 26822 | 1340 | 1638 |
| 1 | 26821 | 1639 | 1937 |
| 1 | 26820 | 1938 | 2236 |
| 1 | 26819 | 2237 | 2535 |
| 1 | 26818 | 2536 | 2834 |
| 1 | 26817 | 2835 | 2999 |
+---------+---------+--------+--------+
The rows arrived in descending order, so, for example, values 2834 down to 2536 were put into page 26818; then a new page was allocated for 2535, but this was page 26819 rather than page 26817.
One possible reason why the insert to Tmp2 took longer than Tmp1 is that, because the rows are being inserted in exactly reverse order on each page, every insert to Tmp2 means the slot array on the page needs to be rewritten, with all previous entries moved up to make room for the new arrival.
To answer this question, you only need to look up what effect clustering has on data and the manner in which it is logically ordered. By clustering ascending, higher numbers get added on to the end of the table; inserts will be very fast. When inserting in reverse, it will be inserted in between two other records (read up on page splitting); this will result in slower inserts. This actually has other negative effects as well (read up on fill factor).
It has to do with allocating pages sequentially, as is done for a clustered index. With the first set they would naturally cluster together, but with the second I think you would have to keep moving the page locations to keep them sequentially ascending. However, I really only understand SQL Server at a conceptual level, so you'd have to test.
