I've written a stored procedure that returns a two-column temp table: an ID column that is not unique (a value between 2 and 12 used to group values on), and a column holding the actual data values. I want to break this single data column out into basically one table of 11 columns, one column for each dataset.
I'd like to have this parsed out into columns by ID. An identity column is not necessary since the values will be unique to their own column. Something like:
Data2 Data3 Data4
102692... 103516.... 108408....
104114... 103476.... 108890....
and so on. I have tried using a WHILE loop through the datasets, but it's mainly getting these contained in one insert that is troubling me. I can't figure out how to say
While recordCount > 0
Begin
Insert into #tempTable(value1ID2,value1ID3,Value1ID4)
End
and then loop through value2ID2, value2ID3 etc.
If this isn't attainable that's fine, I'll have to figure out a workaround, but the main reason I'm trying to do this is for a Report Builder dataset for a line chart that will eventually share a date grouping.
Since you need to aggregate string values, you will need to use either the max or min aggregate function. The problem with that is that it will return a single row for each column. In order to return multiple rows, you will need to use a windowing function like row_number() to generate a unique value for each id/string combination. This will allow you to return multiple rows for each id:
select Data2 = [2], Data3 = [3], Data4 = [4]
from
(
select id, stringvalue,
row_number() over(partition by id order by stringvalue) seq
from yourtable
) d
pivot
(
max(stringvalue)
for id in ([2], [3], [4])
) piv
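SQLite (used here through Python's sqlite3) has no PIVOT operator, but the same result can be sketched with the row_number() trick plus conditional aggregation; the table name comes from the answer, the sample values are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yourtable (id INTEGER, stringvalue TEXT);
INSERT INTO yourtable VALUES
  (2,'102692'), (2,'104114'),
  (3,'103476'), (3,'103516'),
  (4,'108408'), (4,'108890');
""")

# row_number() numbers the values within each id; the conditional MAX()
# then spreads each id into its own column, one output row per seq.
rows = conn.execute("""
SELECT MAX(CASE WHEN id = 2 THEN stringvalue END) AS Data2,
       MAX(CASE WHEN id = 3 THEN stringvalue END) AS Data3,
       MAX(CASE WHEN id = 4 THEN stringvalue END) AS Data4
FROM (
    SELECT id, stringvalue,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY stringvalue) AS seq
    FROM yourtable
)
GROUP BY seq
ORDER BY seq
""").fetchall()
```

Each output row pairs the nth value of every id, which is exactly what the PIVOT above produces.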
Related
I need to calculate the difference of a column between two lines of a table. Is there any way I can do this directly in SQL? I'm using Microsoft SQL Server 2008.
I'm looking for something like this:
SELECT value - (previous.value) FROM table
Imagining that the "previous" variable references the latest selected row. Of course with a select like that I will end up with n-1 rows selected in a table with n rows; that's not a problem, actually it is exactly what I need.
Is that possible in some way?
Use the lag function:
SELECT value - lag(value) OVER (ORDER BY Id) FROM table
Sequences used for Ids can skip values, so Id-1 does not always work.
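SQL Server 2012+ and modern SQLite both support lag(); here is a minimal runnable sketch via Python's sqlite3, with made-up sample data, showing that gaps in Id don't matter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (Id INTEGER PRIMARY KEY, value INTEGER);
INSERT INTO t VALUES (1, 10), (3, 25), (7, 40);  -- note the gaps in Id
""")

# lag(value) fetches the value from the previous row in ORDER BY Id order,
# so gaps in Id do not matter. The first row has no predecessor -> NULL.
diffs = conn.execute(
    "SELECT value - LAG(value) OVER (ORDER BY Id) FROM t ORDER BY Id"
).fetchall()
```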
SQL has no built-in notion of order, so you need to order by some column for this to be meaningful. Something like this:
select t1.value - t2.value from table t1, table t2
where t1.primaryKey = t2.primaryKey - 1
If you know how to order things but not how to get the previous value given the current one (EG, you want to order alphabetically) then I don't know of a way to do that in standard SQL, but most SQL implementations will have extensions to do it.
Here is a way for SQL server that works if you can order rows such that each one is distinct:
select rank() OVER (ORDER BY id) as 'Rank', value into temp1 from t
select t1.value - t2.value from temp1 t1, temp1 t2
where t1.Rank = t2.Rank - 1
drop table temp1
If you need to break ties, you can add as many columns as necessary to the ORDER BY.
WITH CTE AS (
SELECT
rownum = ROW_NUMBER() OVER (ORDER BY columns_to_order_by),
value
FROM table
)
SELECT
cur.value - prev.value
FROM CTE cur
INNER JOIN CTE prev on prev.rownum = cur.rownum - 1
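The CTE/self-join above can be sketched end to end with Python's sqlite3 (table and data made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER, value INTEGER);
INSERT INTO t VALUES (10, 100), (20, 130), (30, 160);
""")

# ROW_NUMBER() produces a gapless sequence, so joining rownum to
# rownum - 1 pairs every row with its predecessor (n rows -> n-1 diffs).
diffs = conn.execute("""
WITH CTE AS (
    SELECT ROW_NUMBER() OVER (ORDER BY id) AS rownum, value FROM t
)
SELECT cur.value - prev.value
FROM CTE cur
JOIN CTE prev ON prev.rownum = cur.rownum - 1
ORDER BY cur.rownum
""").fetchall()
```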
Oracle, PostgreSQL, SQL Server and many more RDBMS engines have analytic functions called LAG and LEAD that do this very thing.
In SQL Server prior to 2012 you'd need to do the following:
SELECT value - (
SELECT TOP 1 value
FROM mytable m2
WHERE m2.col1 < m1.col1 OR (m2.col1 = m1.col1 AND m2.pk < m1.pk)
ORDER BY
col1 DESC, pk DESC
)
FROM mytable m1
ORDER BY
col1, pk
where COL1 is the column you are ordering by.
Having an index on (COL1, PK) will greatly improve this query.
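A runnable sketch of the pre-2012 correlated-subquery approach, translated to sqlite3 (LIMIT 1 standing in for TOP 1; note the descending sort inside the subquery, which is what picks the immediately preceding row rather than the first row):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (pk INTEGER PRIMARY KEY, col1 INTEGER, value INTEGER);
INSERT INTO mytable VALUES (1, 5, 100), (2, 5, 140), (3, 9, 200);
""")

# For each row, the correlated subquery picks the value of the row that
# sorts immediately before it on (col1, pk). Subtracting NULL (no
# predecessor) yields NULL for the first row.
diffs = conn.execute("""
SELECT m1.value - (
    SELECT m2.value FROM mytable m2
    WHERE m2.col1 < m1.col1 OR (m2.col1 = m1.col1 AND m2.pk < m1.pk)
    ORDER BY m2.col1 DESC, m2.pk DESC
    LIMIT 1
)
FROM mytable m1
ORDER BY m1.col1, m1.pk
""").fetchall()
```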
LEFT JOIN the table to itself, with the join condition worked out so the row matched in the joined version of the table is one row previous, for your particular definition of "previous".
Update: At first I was thinking you would want to keep all rows, with NULLs for the condition where there was no previous row. Reading it again, you just want those rows culled, so you should use an inner join rather than a left join.
Update:
Newer versions of Sql Server also have the LAG and LEAD Windowing functions that can be used for this, too.
select t2.col from (
select col,MAX(ID) id from
(
select ROW_NUMBER() over(PARTITION by col order by col) id ,col from testtab t1) as t1
group by col) as t2
The selected answer will only work if there are no gaps in the sequence. However if you are using an autogenerated id, there are likely to be gaps in the sequence due to inserts that were rolled back.
This method should work if you have gaps
create table #temp (value int, primaryKey int, tempid int identity)
insert into #temp (value, primaryKey)
select value, primaryKey from mytable order by primaryKey
select t1.value - t2.value from #temp t1
join #temp t2
on t1.tempid = t2.tempid - 1
Another way to refer to the previous row in an SQL query is to use a recursive common table expression (CTE):
CREATE TABLE t (counter INTEGER);
INSERT INTO t VALUES (1),(2),(3),(4),(5);
WITH cte(counter, previous, difference) AS (
-- Anchor query
SELECT MIN(counter), 0, MIN(counter)
FROM t
UNION ALL
-- Recursive query
SELECT t.counter, cte.counter, t.counter - cte.counter
FROM t JOIN cte ON cte.counter = t.counter - 1
)
SELECT counter, previous, difference
FROM cte
ORDER BY counter;
Result:

counter | previous | difference
--------|----------|-----------
1       | 0        | 1
2       | 1        | 1
3       | 2        | 1
4       | 3        | 1
5       | 4        | 1
The anchor query generates the first row of the common table expression cte where it sets cte.counter to column t.counter in the first row of table t, cte.previous to 0, and cte.difference to the first row of t.counter.
The recursive query joins each row of common table expression cte to the previous row of table t. In the recursive query, cte.counter refers to t.counter in each row of table t, cte.previous refers to cte.counter in the previous row of cte, and t.counter - cte.counter refers to the difference between these two columns.
Note that a recursive CTE is more flexible than the LAG and LEAD functions because a row can refer to any arbitrary result of a previous row. (A recursive function or process is one where the input of the process is the output of the previous iteration of that process, except the first input which is a constant.)
I tested this query at SQLite Online.
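The same example runs unchanged (apart from the RECURSIVE keyword, which SQLite requires) under Python's bundled sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (counter INTEGER);
INSERT INTO t VALUES (1),(2),(3),(4),(5);
""")

# Each recursive step joins t to the cte row produced for counter - 1,
# so every row can see (and subtract) the previous row's counter.
rows = conn.execute("""
WITH RECURSIVE cte(counter, previous, difference) AS (
    SELECT MIN(counter), 0, MIN(counter) FROM t
    UNION ALL
    SELECT t.counter, cte.counter, t.counter - cte.counter
    FROM t JOIN cte ON cte.counter = t.counter - 1
)
SELECT counter, previous, difference FROM cte ORDER BY counter
""").fetchall()
```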
You can use the following function to get the current row value and the previous row value:
SELECT value,
       MIN(value) OVER (ORDER BY id ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS value_prev
FROM table
Then you can just select value - value_prev from that select and get your answer.
Good day.
The question is almost the same as the topic subject.
So, if I have a query:
select t.* from mytable m,
json_table
(m.json_col,'$.arr[*]'
columns(...)
) t
where m.id = 1
should I bother with order of rows?
TIA,
Andrew.
As with many things to do with databases, there is a difference between the theoretical, the practical, and the practical at large scales:
In theory, a result set is an unordered set of rows unless you have specified an ORDER BY clause.
In practice, for small data sets, the query will be handled by a single process and will generate rows in the order the rows are read from the data file and then processed; which means that it will read a row from mytable and then process the JSON data in order and produce the rows in the same order as the array.
In practice at larger scales, the query may be handled by multiple processes on a parallel system (among other factors that may affect the order in which results are generated) where each process reads part of the data set and processes it and then the outputs are combined into a single result set. In this case, there is no guarantee which part of the parallel system will provide the next row and a consistent order cannot be guaranteed.
If you want to guarantee an order then use a FOR ORDINALITY column to capture the array order and then use an ORDER BY clause:
SELECT m.something,
t.*
FROM mytable m
CROSS APPLY JSON_TABLE(
m.json_col,
'$.arr[*]'
COLUMNS(
idx FOR ORDINALITY,
value NUMBER PATH '$'
)
) t
WHERE m.id = 1
ORDER BY m.something, t.idx
Which, for the sample data:
CREATE TABLE mytable (
id NUMBER,
something VARCHAR2(10),
json_col CLOB CHECK(json_col IS JSON)
);
INSERT INTO mytable(id, something, json_col)
SELECT 1, 'AAA', '{"arr":[3,2,1]}' FROM DUAL UNION ALL
SELECT 1, 'BBB', '{"arr":[17,2,42,9]}' FROM DUAL;
Outputs:

SOMETHING | IDX | VALUE
----------|-----|------
AAA       | 1   | 3
AAA       | 2   | 2
AAA       | 3   | 1
BBB       | 1   | 17
BBB       | 2   | 2
BBB       | 3   | 42
BBB       | 4   | 9
db<>fiddle here
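The FOR ORDINALITY idea carries over to other engines. Below is a minimal sketch using Python's bundled sqlite3 (assuming a build with the JSON1 extension): SQLite's json_each exposes the 0-based array index in its key column, which plays the same role as the idx column above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INTEGER, something TEXT, json_col TEXT);
INSERT INTO mytable VALUES
  (1, 'AAA', '{"arr":[3,2,1]}'),
  (1, 'BBB', '{"arr":[17,2,42,9]}');
""")

# json_each exposes the array index as "key" (0-based); ordering by it
# pins the output to the original array order regardless of how the
# engine happened to produce the rows.
rows = conn.execute("""
SELECT m.something, j.key + 1 AS idx, j.value
FROM mytable m, json_each(m.json_col, '$.arr') j
WHERE m.id = 1
ORDER BY m.something, j.key
""").fetchall()
```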
I have a table with some names in a row. For each row I want to generate a random name. I wrote the following query:
BEGIN transaction t1
Create table TestingName
(NameID int,
FirstName varchar(100),
LastName varchar(100)
)
INSERT INTO TestingName
SELECT 0,'SpongeBob','SquarePants'
UNION
SELECT 1, 'Bugs', 'Bunny'
UNION
SELECT 2, 'Homer', 'Simpson'
UNION
SELECT 3, 'Mickey', 'Mouse'
UNION
SELECT 4, 'Fred', 'Flintstone'
SELECT FirstName from TestingName
WHERE NameID = ABS(CHECKSUM(NEWID())) % 5
ROLLBACK Transaction t1
The problem is the "ABS(CHECKSUM(NEWID())) % 5" portion of this query sometime returns more than 1 row and sometimes returns 0 rows. I must be missing something but I can't see it.
If I change the query to
DECLARE @n int
set @n = ABS(CHECKSUM(NEWID())) % 5
SELECT FirstName from TestingName
WHERE NameID = @n
Then everything works and I get a random number per row.
If you take the query above and paste it into SQL management studio and run the first query a bunch of times you will see what I am attempting to describe.
The final update query will look like
Update TableWithABunchOfNames
set [FName] = (SELECT FirstName from TestingName
WHERE NameID = ABS(CHECKSUM(NEWID())) % 5)
This does not work because sometimes I get more than 1 row and sometimes I get no rows.
What am I missing?
The problem is that you are getting a different random value for each row. This query is probably doing a full table scan, and the where clause is evaluated for each row with a freshly generated random number.
So, you might get a sequence of random numbers where none of the ids match. Or a sequence where more than one matches. On average, you'll have one match, but you don't want "on average", you want a guarantee.
This is when you want rand(), which produces only one random number per query:
SELECT FirstName
from TestingName
WHERE NameID = floor(rand() * 5);
This should get you one value.
Why not use top 1?
Select top 1 firstName
From testingName
Order by newId()
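SQLite has no TOP or NEWID(), but the same shuffle-and-take-one idea can be sketched with ORDER BY random() LIMIT 1 (sample data from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TestingName (NameID INTEGER, FirstName TEXT, LastName TEXT);
INSERT INTO TestingName VALUES
  (0,'SpongeBob','SquarePants'), (1,'Bugs','Bunny'),
  (2,'Homer','Simpson'), (3,'Mickey','Mouse'), (4,'Fred','Flintstone');
""")

# ORDER BY random() LIMIT 1 is SQLite's equivalent of
# SELECT TOP 1 ... ORDER BY NEWID(): shuffle the rows, then keep one.
row = conn.execute(
    "SELECT FirstName FROM TestingName ORDER BY random() LIMIT 1"
).fetchone()
```

This always returns exactly one row, sidestepping the zero-or-many problem entirely.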
This worked for me:
WITH
CTE
AS
(
SELECT
ID
,FName
,CAST(5 * (CAST(CRYPT_GEN_RANDOM(4) as int) / 4294967295.0 + 0.5) AS int) AS rr
FROM
dbo.TableWithABunchOfNames
)
,CTE_ForUpdate
AS
(
SELECT
CTE.ID
, CTE.FName
, dbo.TestingName.FirstName AS RandomName
FROM
CTE
LEFT JOIN dbo.TestingName ON dbo.TestingName.NameID = CTE.rr
)
UPDATE CTE_ForUpdate
SET FName = RandomName
;
This solution depends on how smart optimizer is.
For example, if I use INNER JOIN instead of LEFT JOIN (which is the correct choice for this query), optimizer would move calculation of random numbers outside the join loop and end result would be not what we expect.
I created a table TestingName with 5 rows as in the question and a table TableWithABunchOfNames with 100 rows.
Here is the execution plan with LEFT JOIN. You can see the Compute scalar that calculates random numbers is done before the join loop. You can see that 100 rows were updated:
Here is the execution plan with INNER JOIN. You can see the Compute scalar that calculates random numbers is done after the join loop and with extra filter. This query may update not all rows in TableWithABunchOfNames and some rows in TableWithABunchOfNames may be updated several times. You can see that Filter left 102 rows and Stream aggregate left only 69 rows. It means that only 69 rows were eventually updated and also there were multiple matches for some rows (102 - 69 = 33).
To guarantee that the result is what you expect you should generate random number for each row in TableWithABunchOfNames and explicitly remember the result, i.e. materialize the CTE shown above. Then use this temporary result to join with the table TestingName.
You can add a column to TableWithABunchOfNames to store generated random numbers or save CTE to a temp table or table variable.
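A minimal sqlite3 sketch of that materialize-first approach (table names from the question; the picks temp table is a stand-in for the remembered random numbers):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TestingName (NameID INTEGER, FirstName TEXT);
INSERT INTO TestingName VALUES
  (0,'SpongeBob'), (1,'Bugs'), (2,'Homer'), (3,'Mickey'), (4,'Fred');
CREATE TABLE TableWithABunchOfNames (ID INTEGER PRIMARY KEY, FName TEXT);
INSERT INTO TableWithABunchOfNames (FName)
  SELECT NULL FROM TestingName a, TestingName b;  -- 25 placeholder rows
""")

# Materialize one random NameID per target row first, so the random
# draw cannot be re-evaluated (or moved) by the optimizer during the join.
conn.execute("""
CREATE TEMP TABLE picks AS
SELECT ID, abs(random()) % 5 AS rr FROM TableWithABunchOfNames
""")
conn.execute("""
UPDATE TableWithABunchOfNames
SET FName = (SELECT n.FirstName
             FROM picks p JOIN TestingName n ON n.NameID = p.rr
             WHERE p.ID = TableWithABunchOfNames.ID)
""")
updated = conn.execute(
    "SELECT COUNT(*) FROM TableWithABunchOfNames WHERE FName IS NOT NULL"
).fetchone()[0]
```

Because every row's random pick is fixed before the join, every row is updated exactly once.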
I have a table called MyHistory. It has about 1000 rows and the performance is crappy at best.
What I want to do is select rows showing the next row as a result. This is probably a bad example.
This is the MyHistory structure: ID int, DateTimeColumn datetime, ValueResult decimal(4,2).
my table has the following data
ID|DateTimeColumn|ValueResult
1|8/1/2005 1:01:29 PM|2
1|8/1/2006 1:01:29 PM|3
1|8/1/2007 1:01:29 PM|5
1|8/1/2008 1:01:29 PM|9
What I want to do is select out of this the following data
ID|DateTimeColumn|ValueResult|ChangeValue
1|8/1/2008 1:01:29 PM|9|4
1|8/1/2007 1:01:29 PM|5|2
1|8/1/2006 1:01:29 PM|3|1
1|8/1/2005 1:01:29 PM|2|
You'll notice that ID = ID and the datetime column is now descending. That's the easy part. But how do I make a self-referencing table (in order to calculate the difference in value) based on which datetime comes next?
Thanks!
So, the task is:
to order records by DateTimeColumn descending,
to set sequence number for each record to identify next record,
to calculate required difference in value.
This is one of many possible solutions:
-- Use CTE to make intermediate table with sequence numbers - ranks
;WITH a (rank, ID, DateTimeColumn, ValueResult) AS
(
select rank() OVER (ORDER BY m.DateTimeColumn DESC) as rank, ID, DateTimeColumn, ValueResult
from MyHistory m
)
-- Select all resulting columns
select a1.ID,
a1.DateTimeColumn,
a1.ValueResult,
a1.ValueResult - a2.ValueResult as ChangeValue -- Difference between current record and next one
from a a1
join a a2
on a2.rank = a1.rank + 1 -- linking next record to each one
I have a query that returns a number of rows, some of them repeated rows in which all column values are the same except one column; let's call it column X.
What I want to do is combine the column X values of all the repeated rows, separating the values with a ',' character.
The query I use:
SELECT App.ID,App.Name,Grp.ColumnX
FROM (
SELECT * FROM CustomersGeneralGroups AS CG WHERE CG.GeneralGroups_ID IN(1,2,3,4)
) AS GroupsCustomers
LEFT JOIN Appointments AS App ON GroupsCustomers.Customers_ID = App.CustomerID
INNER JOIN Groups AS Grp ON Grp.ID = GroupsCustomers.GeneralGroups_ID
WHERE App.AppointmentDateTimeStart > @startDate AND App.AppointmentDateTimeEnd < @endDate
The column which will differ is ColumnX, columns ID and Name will be same but ColumnX will be different.
Ex:
if the query will return rows like these:
ID Name ColumnX
1 test1 1
1 test1 2
1 test1 3
The result I want to be is:
ID Name ColumnX
1 test1 1,2,3
I don't mind if I have to do it with linq not sql.
I used GroupBy in linq but it merges the ColumnX values.
If you have this data loaded in objects, you can use LINQ methods to achieve this like so:
var groupedRecords = items
    .GroupBy(item => new { item.Id, item.Name })
    .Select(grouping => new
    {
        grouping.Key.Id,
        grouping.Key.Name,
        ColumnXValues = string.Join(",", grouping.Select(g => g.ColumnX))
    });
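If the collapse may happen on the SQL side instead of LINQ, string aggregation does it directly: GROUP_CONCAT in SQLite/MySQL, STRING_AGG in SQL Server 2017+. A minimal sqlite3 sketch with the example data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE results (ID INTEGER, Name TEXT, ColumnX INTEGER);
INSERT INTO results VALUES (1,'test1',1), (1,'test1',2), (1,'test1',3);
""")

# GROUP_CONCAT collapses the repeated rows, joining the ColumnX values
# with ',' (SQL Server 2017+ spells this STRING_AGG(ColumnX, ',')).
rows = conn.execute("""
SELECT ID, Name, GROUP_CONCAT(ColumnX, ',') AS ColumnXValues
FROM results
GROUP BY ID, Name
""").fetchall()
```

Note that GROUP_CONCAT's output order is not guaranteed without an explicit ordering, though it typically follows scan order for a simple query like this.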