Grouping results by test - sql-server

I have a table with this structure:
Test | Value | Shape
-----|-------|------
1    | 1,89  | 20
1    | 2,08  | 27
1    | 2,05  | 12
2    | 2,01  | 12
2    | 2,05  | 35
2    | 2,03  | 24
I need a column for each Test value, in this case, something like this:
Test 1        | Test 2
Value | Shape | Value | Shape
I tried to do this with PIVOT, but the results weren't good. Can someone help me?

There are a few different ways that you can get the result since you are using SQL Server. First, you will need to create a unique value that allows you to return multiple rows for each Test. I would apply a windowing function like row_number():
select test, value, shape,
row_number() over(partition by test
order by value) seq
from yourtable
This query will be used as the base for the rest of the process. It creates a unique sequence for each test so that, when you apply the aggregate function, you can return multiple rows.
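Run against the sample data above, this base query should produce:
Test | Value | Shape | seq
-----|-------|-------|----
1    | 1,89  | 20    | 1
1    | 2,05  | 12    | 2
1    | 2,08  | 27    | 3
2    | 2,01  | 12    | 1
2    | 2,03  | 24    | 2
2    | 2,05  | 35    | 3
Grouping on seq is what lets the aggregate collapse each test's rows side by side.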
You can get your final result using an aggregate function with a CASE expression:
select
max(case when test = 1 then value end) test1Value,
max(case when test = 1 then shape end) test1Shape,
max(case when test = 2 then value end) test2Value,
max(case when test = 2 then shape end) test2Shape
from
(
select test, value, shape,
row_number() over(partition by test
order by value) seq
from yourtable
) d
group by seq;
If you want to implement the PIVOT function, then you would first need to unpivot the multiple columns of Value and Shape and then apply the PIVOT. You will still use row_number() to generate the unique sequence needed to return multiple rows. The basic syntax will be:
;with cte as
(
-- get unique sequence
select test, value, shape,
row_number() over(partition by test
order by value) seq
from yourtable
)
select test1Value, test1Shape,
test2Value, test2Shape
from
(
-- unpivot the multiple columns
select t.seq,
col = 'test'+cast(test as varchar(10))
+ col,
val
from cte t
cross apply
(
select 'value', value union all
select 'shape', cast(shape as varchar(10))
) c (col, val)
) d
pivot
(
max(val)
for col in (test1Value, test1Shape,
test2Value, test2Shape)
) piv;
Both versions give a result:
| TEST1VALUE | TEST1SHAPE | TEST2VALUE | TEST2SHAPE |
|------------|------------|------------|------------|
| 1,89 | 20 | 2,01 | 12 |
| 2,05 | 12 | 2,03 | 24 |
| 2,08 | 27 | 2,05 | 35 |
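If you want to try these queries yourself, here is a minimal sketch of a setup matching the sample data, assuming value is stored as a string because of the comma decimal separator:
create table yourtable (test int, value varchar(10), shape int);

insert into yourtable values
  (1, '1,89', 20), (1, '2,08', 27), (1, '2,05', 12),
  (2, '2,01', 12), (2, '2,05', 35), (2, '2,03', 24);
Note that ordering a varchar column is lexicographic; it happens to match numeric order for this sample.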

Related

PostgreSQL - Filtering result set by array column

I have a function which returns a table. One of the columns happens to be a text array. Currently the values in this array column will only ever have at most 2 elements; however, the same row is sometimes returned twice with the array elements in the opposite order. I'm hoping to find a way to return only one of these rows and discard the other. To give an example, I run a function as
SELECT * FROM schema.function($1,$2,$3)
WHERE conditions...;
which returns me something like this
ID | ARRAY_COL | ...
1 | {'Good','Day'} | ...
2 | {'Day','Good'} | ...
3 | {'Stuck'} | ...
4 | {'with'} | ...
5 | {'array'} | ...
6 | {'filtering'} | ...
So in this example, I want to return the whole result set, except that I only want either row 1 or row 2, as they have the same elements in the array (albeit inverted with respect to each other). I'm aware this is probably a bit of a messy problem, but it's something I need to get to the bottom of. Ideally I would like to stick a WHERE clause at the end of my function call which forces the result set to ignore any array value that has the same elements as a previous row. Pseudo code might be something like
SELECT * FROM schema.function($1,$2,$3)
WHERE NOT array_col #> (any previous array_col value);
Any pointers in the right direction would be much appreciated, thanks.
I'm not sure this is the best solution, but it could work, especially in cases where you might have partially overlapping arrays.
The solution is in three steps:
1. Unnest the array_col column and order by id and the item value:
select
id,
unnest(array_col) array_col_val
from dataset
order by
id,
array_col_val
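For the sample data this would produce pairs like (1, 'Day'), (1, 'Good'), (2, 'Day'), (2, 'Good'), (3, 'Stuck'), and so on; ids 1 and 2 now yield identical ordered element lists.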
2. Regroup by id; now the rows with id 1 and id 2 have the same array_col value:
select id,
array_agg(array_col_val) array_col
from ordering
group by id
3. Select the min id, grouping by array_col:
select array_col, min(id) from regrouping group by array_col
Full statement:
with dataset as (
select 1 id, ARRAY['Good','Day'] array_col UNION ALL
select 2 id, ARRAY['Day','Good'] array_col UNION ALL
select 3 id, ARRAY['Stuck'] array_col UNION ALL
select 4 id, ARRAY['with'] array_col UNION ALL
select 5 id, ARRAY['array'] array_col UNION ALL
select 6 id, ARRAY['filtering'] array_col
)
, ordering as
(select id,
unnest(array_col) array_col_val
from dataset
order by id,
array_col_val)
, regrouping as
(
select id,
array_agg(array_col_val) array_col
from ordering
group by id
)
select array_col, min(id) from regrouping group by array_col;
Result
  array_col  | min
-------------+-----
 {Stuck}     |   3
 {array}     |   5
 {filtering} |   6
 {with}      |   4
 {Day,Good}  |   1
(5 rows)
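A more compact variant, sketched here assuming PostgreSQL 9.0+ (where array_agg accepts an ORDER BY inside the aggregate call), collapses the three steps into one query and makes the element ordering explicit instead of relying on the CTE's ORDER BY surviving the aggregation:
select min(id) as id, sorted_col as array_col
from (
  select id,
         -- sort each row's elements so {'Good','Day'} and {'Day','Good'} compare equal
         (select array_agg(v order by v) from unnest(array_col) v) as sorted_col
  from dataset
) s
group by sorted_col;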

sql selection of one value from several identical

I have the result of executing a query that collects data from several tables. It looks like this:
|Name|date |number|Id
|alex|01-01-2021 |1111 | 1
|mike|01-01-2021 |2222 | 2
|alex|02-01-2021 |1111 | 3
|alex|03-01-2021 |1111 | 4
|john|04-01-2021 |3333 | 5
I need to get the following result:
|Name|date |number| Id
|mike|01-01-2021|2222 | 2
|alex|any value |1111 | Any value
|john|04-01-2021|3333 | 5
I need to select just one of the repeated rows and show it. I have a large query with many columns; here I gave only a short version to explain the essence of the problem.
select Name,max(date) as date,number
from atable
group by Name, number
You may use this CTE and control which date (first or last) you get:
WITH data AS (
SELECT
Name,
date,
number,
row_number() OVER (PARTITION BY Name ORDER BY date) AS row_num
FROM test01
)
SELECT
Name,
date,
number
FROM data
WHERE row_num = 1
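To keep the latest date per Name instead, flip the sort direction in the window function; only the OVER clause changes:
row_number() OVER (PARTITION BY Name ORDER BY date DESC) AS row_num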

Using multiple row results on a formula, based by group

Is there a way to use the results from multiple rows in a formula, separately for each group?
I have the following formula:
result = (1st vfg) / (1 + (1st vfg / 2nd vfg) + (1st vfg / 3rd vfg) + ... + (1st vfg / nth vfg))
vfg = value from group
For example, the table below:
Group | Value
---------------
1 | 1000
1 | 280
1 | 280
2 | 1000
Note: I guarantee that there will be no zeros or NULLs in the Value column of the table above.
Should give me the following result:
Group | Result
---------------
1 | 122.85
2     | 1000 -> If there is only one value in the group, the result will be the value itself
You need a column that indicates the row order within a group (a timestamp, a sequence number, an identity column, etc.), because rows in a database table have no implicit order. Once you have that (called RowOrder below), you can use a CTE and window functions to solve the problem:
;WITH
cte AS
(
SELECT [Group]
, [Value]
, FIRST_VALUE([Value]) OVER (PARTITION BY [Group] ORDER BY RowOrder) AS FirstValue
, FIRST_VALUE([Value]) OVER (PARTITION BY [Group] ORDER BY RowOrder) * 1.0 / [Value] AS Subtotal -- * 1.0 avoids integer division if [Value] is an int
FROM MyTable
)
SELECT [Group]
, AVG(FirstValue) / SUM(Subtotal) AS Result
FROM cte
GROUP BY [Group]
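As a sanity check with the sample data: for group 1 the subtotals are 1000/1000 + 1000/280 + 1000/280 = 8.14, so the result is 1000 / 8.14 ≈ 122.8; group 2 has a single row, so its subtotal sum is 1 and the result is the value itself, 1000.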

TSQL Conditional Where or Group By?

I have a table like the following:
id | type | duedate
-------------------------
1 | original | 01/01/2017
1 | revised | 02/01/2017
2 | original | 03/01/2017
3 | original | 10/01/2017
3 | revised | 09/01/2017
Where there may be either one or two rows for each id. If there are two rows with same id, there would be one with type='original' and one with type='revised'. If there is one row for the id, type will always be 'original'.
What I want as a result are all the rows where type='revised', but if there is only one row for a particular id (thus type='original') then I want to include that row too. So desired output for the above would be:
id | type | duedate
1 | revised | 02/01/2017
2 | original | 03/01/2017
3 | revised | 09/01/2017
I do not know how to construct a WHERE clause that conditionally checks whether there are 1 or 2 rows for a given id, nor am I sure how to use GROUP BY, because the revised date could be greater than or less than the original date, so the aggregate functions MAX and MIN don't work. I thought about using CASE somehow, but I also do not know how to construct a conditional that chooses between two different rows of data (if there are two rows) and displays one of them rather than the other.
Any suggested approaches would be appreciated.
Thanks!
You can use row_number for this. Ordering by Type DESC works because 'revised' sorts after 'original', so the revised row gets RN = 1 when it exists, and the lone original row gets RN = 1 otherwise.
WITH T AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Type DESC) AS RN
FROM YourTable
)
SELECT *
FROM T
WHERE RN = 1
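With the sample rows, the CTE would assign:
id | type     | duedate    | RN
1  | revised  | 02/01/2017 | 1
1  | original | 01/01/2017 | 2
2  | original | 03/01/2017 | 1
3  | revised  | 09/01/2017 | 1
3  | original | 10/01/2017 | 2
so filtering on RN = 1 returns exactly the desired rows.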
Is something like this sufficient? It keeps every revised row, plus any row whose id appears only once:
SELECT *
FROM mytable m1
WHERE type='revised'
or 1=(SELECT COUNT(*) FROM mytable m2 WHERE m2.id=m1.id)
You could use a subquery that takes the MAX([type]). This works for [type] because we want 'revised' over 'original', and 'r' sorts after 'o' alphabetically. We can then INNER JOIN back to the same table on the matching conditions.
SELECT T2.*
FROM (
SELECT id, MAX([type]) AS [MAXtype]
FROM myTABLE
GROUP BY id
) AS dT INNER JOIN myTable T2 ON dT.id = T2.id AND dT.[MAXtype] = T2.[type]
ORDER BY T2.[id]
Gives output:
id type duedate
1 revised 2017-02-01
2 original 2017-03-01
3 revised 2017-09-01
Here is the sqlfiddle: http://sqlfiddle.com/#!6/14121f/6/0

How to convert JSON Array of Arrays to columns and rows

I'm pulling data from an API in JSON with a format like the example data below, where essentially every "row" is an array of values. The API doc defines the columns and their types in advance, so I know that col1 is, for example, a varchar, and that col2 is an int.
CREATE TEMP TABLE dat (data json);
INSERT INTO dat
VALUES ('{"COLUMNS":["col1","col2"],"DATA":[["a","1"],["b","2"]]}');
I want to transform this within PostgreSQL 9.3 such that I end up with:
col1 | col2
------------
a | 1
b | 2
Using json_array_elements I can get to:
SELECT json_array_elements(data->'DATA')
FROM dat
 json_array_elements
---------------------
 ["a","1"]
 ["b","2"]
but then I can't figure out how to convert the JSON array to a PostgreSQL array so I can perform something like unnest(ARRAY['a','1']).
General case for unknown columns
To get a result like
col1 | col2
------------
a | 1
b | 2
will require a bunch of dynamic SQL, because you don't know the types of the columns in advance, nor the column names.
You can unpack the json with something like:
SELECT
json_array_element_text(colnames, colno) AS colname,
json_array_element_text(colvalues, colno) AS colvalue,
rn,
idx,
colno
FROM (
SELECT
data -> 'COLUMNS' AS colnames,
d AS colvalues,
rn,
row_number() OVER () AS idx
FROM (
SELECT data, row_number() OVER () AS rn FROM dat
) numbered
cross join json_array_elements(numbered.data -> 'DATA') d
) elements
cross join generate_series(0, json_array_length(colnames) - 1) colno;
producing a result set like:
colname | colvalue | rn | idx | colno
---------+----------+----+-----+-------
col1 | a | 1 | 1 | 0
col2 | 1 | 1 | 1 | 1
col1 | b | 1 | 2 | 0
col2 | 2 | 1 | 2 | 1
(4 rows)
You can then use this as input to the crosstab function from the tablefunc module with something like:
SELECT * FROM crosstab('
SELECT
to_char(rn,''00000000'')||''_''||to_char(idx,''00000000'') AS rowid,
json_array_element_text(colnames, colno) AS colname,
json_array_element_text(colvalues, colno) AS colvalue
FROM (
SELECT
data -> ''COLUMNS'' AS colnames,
d AS colvalues,
rn,
row_number() OVER () AS idx
FROM (
SELECT data, row_number() OVER () AS rn FROM dat
) numbered
cross join json_array_elements(numbered.data -> ''DATA'') d
) elements
cross join generate_series(0, json_array_length(colnames) - 1) colno;
') results(rowid text, col1 text, col2 text);
producing:
rowid | col1 | col2
---------------------+------+------
00000001_ 00000001 | a | 1
00000001_ 00000002 | b | 2
(2 rows)
The column names are not retained here.
If you were on 9.4 you could avoid the row_number() calls and use WITH ORDINALITY, making it much cleaner.
Simplified with fixed, known columns
Since you apparently know the number of columns and their types in advance, the query can be considerably simplified:
SELECT
col1, col2
FROM (
SELECT
rn,
row_number() OVER () AS idx,
elem ->> 0 AS col1,
(elem ->> 1)::integer AS col2
FROM (
SELECT data, row_number() OVER () AS rn FROM dat
) numbered
cross join json_array_elements(numbered.data -> 'DATA') elem
ORDER BY 1, 2
) x;
result:
col1 | col2
------+------
a | 1
b | 2
(2 rows)
Using 9.4 WITH ORDINALITY
If you were using 9.4 you could keep it cleaner using WITH ORDINALITY:
SELECT
col1, col2
FROM (
SELECT
elem ->> 0 AS col1,
(elem ->> 1)::integer AS col2
FROM
dat
CROSS JOIN
json_array_elements(dat.data -> 'DATA') WITH ORDINALITY AS elements(elem, idx)
ORDER BY idx
) x;
This code worked fine for me; maybe it will be useful for someone:
select to_json(array_agg(t))
from (
select text, pronunciation,
(
select array_to_json(array_agg(row_to_json(d)))
from (
select part_of_speech, body
from definitions
where word_id=words.id
order by position asc
) d
) as definitions
from words
where text = 'autumn'
) t
Credits:
https://hashrocket.com/blog/posts/faster-json-generation-with-postgresql
