How to sort matching columns - database

I have a huge database to sort. Basically, I have data from 2 periods. Over the years, some people joined the database while others left.
Here is a screenshot:
[screenshot: Problem]
How can I get to this kind of result, where everything is sorted?
[screenshot: Final result]
Thanks

I think the usual way to solve this problem would be Google Apps Script.
Here is an option that uses only Google Sheets formulas; hopefully it helps somewhat.
Method 1:
=ArrayFormula({INDEX(SPLIT(FILTER(INDEX(QUERY({SPLIT("2005"&"#"&FILTER(A2:A&"/"&B2:B,A2:A<>""),"#");SPLIT("2015"&"#"&FILTER(C2:C&"/"&D2:D,C2:C<>""),"#")},"select Col2,count(Col2) group by Col2 pivot Col1"),0,1),INDEX(QUERY({SPLIT("2005"&"#"&FILTER(A2:A&"/"&B2:B,A2:A<>""),"#");SPLIT("2015"&"#"&FILTER(C2:C&"/"&D2:D,C2:C<>""),"#")},"select Col2,count(Col2) group by Col2 pivot Col1"),0,1)<>""),"/"),0,1),
IF(FILTER(INDEX(QUERY({SPLIT("2005"&"#"&FILTER(A2:A&"/"&B2:B,A2:A<>""),"#");SPLIT("2015"&"#"&FILTER(C2:C&"/"&D2:D,C2:C<>""),"#")},"select Col2,count(Col2) group by Col2 pivot Col1"),0,2),INDEX(QUERY({SPLIT("2005"&"#"&FILTER(A2:A&"/"&B2:B,A2:A<>""),"#");SPLIT("2015"&"#"&FILTER(C2:C&"/"&D2:D,C2:C<>""),"#")},"select Col2,count(Col2) group by Col2 pivot Col1"),0,1)<>"")<>"",INDEX(SPLIT(FILTER(INDEX(QUERY({SPLIT("2005"&"#"&FILTER(A2:A&"/"&B2:B,A2:A<>""),"#");SPLIT("2015"&"#"&FILTER(C2:C&"/"&D2:D,C2:C<>""),"#")},"select Col2,count(Col2) group by Col2 pivot Col1"),0,1),INDEX(QUERY({SPLIT("2005"&"#"&FILTER(A2:A&"/"&B2:B,A2:A<>""),"#");SPLIT("2015"&"#"&FILTER(C2:C&"/"&D2:D,C2:C<>""),"#")},"select Col2,count(Col2) group by Col2 pivot Col1"),0,1)<>""),"/"),0,2),),
IF(FILTER(INDEX(QUERY({SPLIT("2005"&"#"&FILTER(A2:A&"/"&B2:B,A2:A<>""),"#");SPLIT("2015"&"#"&FILTER(C2:C&"/"&D2:D,C2:C<>""),"#")},"select Col2,count(Col2) group by Col2 pivot Col1"),0,3),INDEX(QUERY({SPLIT("2005"&"#"&FILTER(A2:A&"/"&B2:B,A2:A<>""),"#");SPLIT("2015"&"#"&FILTER(C2:C&"/"&D2:D,C2:C<>""),"#")},"select Col2,count(Col2) group by Col2 pivot Col1"),0,1)<>"")<>"",INDEX(SPLIT(FILTER(INDEX(QUERY({SPLIT("2005"&"#"&FILTER(A2:A&"/"&B2:B,A2:A<>""),"#");SPLIT("2015"&"#"&FILTER(C2:C&"/"&D2:D,C2:C<>""),"#")},"select Col2,count(Col2) group by Col2 pivot Col1"),0,1),INDEX(QUERY({SPLIT("2005"&"#"&FILTER(A2:A&"/"&B2:B,A2:A<>""),"#");SPLIT("2015"&"#"&FILTER(C2:C&"/"&D2:D,C2:C<>""),"#")},"select Col2,count(Col2) group by Col2 pivot Col1"),0,1)<>""),"/"),0,2),)})
Method 2 (a more compact formula):
=QUERY(ArrayFormula({SPLIT("2005"&"|"&FILTER(A2:A&"|"&B2:B,A2:A<>""),"|");
SPLIT("2015"&"|"&FILTER(C2:C&"|"&D2:D,C2:C<>""),"|")}),
"select Col2,min(Col3) group by Col2,Col3 pivot Col1 order by Col2")

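The idea behind both formulas is a full outer join of the two periods on the name column, sorted by name, with a blank where a person is missing from one period. A minimal sketch of that logic in Python, assuming each period can be read as a name-to-value mapping; the sample names and values are hypothetical, since the original screenshots are not available:

```python
def align_periods(period_a, period_b):
    """Full outer join of two {name: value} mappings, sorted by name.

    A missing entry in either period is represented by an empty string.
    """
    names = sorted(set(period_a) | set(period_b))
    return [(name, period_a.get(name, ""), period_b.get(name, "")) for name in names]


# Hypothetical sample data standing in for the 2005 and 2015 columns.
data_2005 = {"Alice": 10, "Bob": 20, "Dan": 40}
data_2015 = {"Bob": 25, "Carol": 30, "Dan": 45}

aligned = align_periods(data_2005, data_2015)
for row in aligned:
    print(row)
```

People who left show a blank in the second period, and people who joined show a blank in the first.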
Related

SQL Server : return row from two rows based on the precedence of one over another

I have below table from which I want to return records based on the precedence of the column value.
col1  col2  Col3
------------------------
1     val1  Master
1     val1  Distributor
2     val2  Master
3     val3  Distributor
3     val3  Master

precedence  type
------------------------
1           Master
2           Distributor
Here type Master has precedence 1 and type Distributor has precedence 2. If the same Col1 and Col2 values appear for both types (Master and Distributor), I want to return only the Master row, since it takes precedence over Distributor.
Output:
The result I want to get is:
col1  col2  Col3
------------------------
1     val1  Master
2     val2  Master
3     val3  Master
Please someone help me write the SQL query for this output.
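You can use the WITH TIES option in concert with row_number(). With a single table, ordering by col3 desc works here only because 'Master' sorts after 'Distributor' alphabetically: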
Select top 1 with ties *
From YourTable
Order By row_number() over (partition by col1,col2 order by col3 desc)
If the precedence is stored in a second table:
Select top 1 with ties A.*
From YourTable A
Join SeqTable B on A.Col3=B.Type
Order By row_number() over (partition by col1,col2 order by B.precedence)
You can use the INNER JOIN and analytical function as follows:
select col1, col2, col3 from
(select t1.*,
row_number() over (partition by t1.col1, t1.col2 order by t2.precedence) as rn
from table1 t1 join table2 t2 on t1.col3 = t2.type) t
where rn = 1
This can probably be optimized. First I aggregate to find the minimum precedence for each (col1, col2) pair, then I join back to the precedence table to pull out the type, and finally I select the three fields.
SELECT
    aggregated.col1,
    aggregated.col2,
    precedence_table.type
FROM precedence_table
INNER JOIN (
    -- best (lowest) precedence per (col1, col2) pair
    SELECT
        my_table.col1,
        my_table.col2,
        MIN(precedence_table.precedence) AS top_precedence
    FROM my_table
    LEFT JOIN precedence_table
        ON my_table.col3 = precedence_table.type
    GROUP BY my_table.col1, my_table.col2
) AS aggregated
    ON precedence_table.precedence = aggregated.top_precedence
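To check the aggregate-then-join idea against the sample data from the question, here is a small sketch that runs it in SQLite through Python's sqlite3 module. SQLite is only a stand-in for SQL Server here, and the table names my_table and precedence_table are taken from the answer above rather than from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (col1 INTEGER, col2 TEXT, col3 TEXT);
    CREATE TABLE precedence_table (precedence INTEGER, type TEXT);
    INSERT INTO my_table VALUES
        (1, 'val1', 'Master'), (1, 'val1', 'Distributor'),
        (2, 'val2', 'Master'),
        (3, 'val3', 'Distributor'), (3, 'val3', 'Master');
    INSERT INTO precedence_table VALUES (1, 'Master'), (2, 'Distributor');
""")

# Step 1 (inner query): lowest precedence number per (col1, col2) pair.
# Step 2 (outer join): translate that number back into its type name.
query = """
    SELECT agg.col1, agg.col2, p.type
    FROM (
        SELECT t.col1, t.col2, MIN(p.precedence) AS best
        FROM my_table AS t
        JOIN precedence_table AS p ON t.col3 = p.type
        GROUP BY t.col1, t.col2
    ) AS agg
    JOIN precedence_table AS p ON p.precedence = agg.best
    ORDER BY agg.col1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each (col1, col2) pair comes back once, with the Master row winning wherever both types exist.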

Error while creating pivot on SQL Server

I am getting an error on the last line while creating a pivot table on SQL Server.
Following is the code:
SELECT
COL1, 'X'
FROM
(SELECT COL1, COL2
FROM TABLE_X
WHERE COL3 = 'B' AND COL4 = 'Activation') AS SourceTable
PIVOT
(COUNT(COL1)
FOR COL2 IN ('X')
) AS PivotTable
Error:
Incorrect syntax near 'X'.
Thanks in advance.
Column COL1 will not exist in the pivot result, since it is the aggregated column.
You can change this example to just:
SELECT
*
FROM
(SELECT COL1, COL2
FROM TABLE_X
WHERE COL3 = 'B' AND COL4 = 'Activation') AS SourceTable
PIVOT
(COUNT(COL1)
FOR COL2 IN ([X]) -- put the values in square brackets instead of single quote
) AS PivotTable
You should then get back a single column named X.
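For a single pivoted value, the same count can also be written as a conditional aggregate, which avoids the PIVOT syntax altogether. The sketch below runs that alternative in SQLite through Python's sqlite3 module, as a stand-in for SQL Server; the rows inserted into TABLE_X are hypothetical, since the question does not show any data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_X (COL1 INTEGER, COL2 TEXT, COL3 TEXT, COL4 TEXT);
    -- Hypothetical rows: two match the filter with COL2 = 'X'.
    INSERT INTO TABLE_X VALUES
        (1, 'X', 'B', 'Activation'),
        (2, 'X', 'B', 'Activation'),
        (3, 'Y', 'B', 'Activation'),
        (4, 'X', 'A', 'Activation');
""")

# COUNT ignores NULLs, so only rows where COL2 = 'X' are counted.
row = conn.execute("""
    SELECT COUNT(CASE WHEN COL2 = 'X' THEN COL1 END) AS X
    FROM TABLE_X
    WHERE COL3 = 'B' AND COL4 = 'Activation'
""").fetchone()
print(row)
```

As with the PIVOT version, the result is a single column named X.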

SQL: how to list values of a column that are not the 5 most occurring value of that same column?

I understand how to display the 5 most frequently occurring values of a column, like so:
select top 5 col1, count(col1)
from table1
group by col1
order by count(col1) desc;
However, how do I create a query that displays all other values of the same column that are not in the result of the above query?
I tried the following sub query:
select col1
from table1
where col1 not in
(select top 5 col1, count(col1)
from table1
group by col1
order by count(col1) desc);
However the query failed and I got the following error message:
Only one expression can be specified in the select list when the
subquery is not introduced with EXISTS.
For SQL Server 2012+ you can use OFFSET:
select col1, count(col1)
from table1
group by col1
order by count(col1) desc
offset 5 rows
You may want to add a tiebreaker to the ordering here to make it deterministic:
select col1, count(col1)
from table1
group by col1
order by count(col1) desc, col1
offset 5 rows
The problem is that a subquery used with NOT IN can return only one column:
(select top 5 col1, count(col1)..
You can remove count(col1) from the subquery, but the NOT IN clause can still fail when col1 in the subquery contains NULL values.
Try changing it like this:
with cte as
(
select top 5 col1
from table1
group by col1
order by count(col1) desc
)
select * from table1 A
where not exists (select 1 from cte B where A.col1 = B.col1)
Use OFFSET
select col1, count(col1)
from table1
group by col1
order by count(col1) desc
OFFSET 5 ROWS -- skip 5 rows, must use with order by
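To see the skip-the-top-5 behaviour on concrete data, here is a small sketch using SQLite through Python's sqlite3 module as a stand-in for SQL Server. Note the syntax difference: SQLite spells the clause LIMIT -1 OFFSET 5, where SQL Server uses OFFSET 5 ROWS. The sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 TEXT)")

# Hypothetical data: 'a' occurs 7 times, 'b' 6 times, ... 'g' once.
values = [
    (letter,)
    for count, letter in zip(range(7, 0, -1), "abcdefg")
    for _ in range(count)
]
conn.executemany("INSERT INTO table1 VALUES (?)", values)

# Order by frequency (col1 as tiebreaker), then skip the first 5 groups.
rest = conn.execute("""
    SELECT col1, COUNT(col1)
    FROM table1
    GROUP BY col1
    ORDER BY COUNT(col1) DESC, col1
    LIMIT -1 OFFSET 5
""").fetchall()
print(rest)
```

Only the values outside the five most frequent ones remain, each with its count.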

T-SQL Count Multi-Column Changes

I have data like this:
a,x,1
a,x,2
a,y,5
a,z,5
a,t,5
a,s,6
b,x1,11
b,x1,21
b,y1,51
b,z1,51
b,t1,51
I want to count value changes, but if the 2nd and 3rd field values do not both change, it does not count as a change. Both the 2nd and 3rd field values must change.
In my example above, going from the 1st row to the 2nd row is not a change, but going from the 2nd row to the 3rd row is, because both values (x and 2) change. Again, going from the 3rd row to the 4th row is a change.
I want the query to return:
a,3
b,2
Thank you.
Going by your description, the count for both a and b is actually 1, because there is only one place in each group where both values change from one row to the next.
CREATE TABLE #t
(
col1 VARCHAR(10),
col2 VARCHAR(10),
col3 INT
)
INSERT INTO #t
VALUES ('a','x',1),
('a','x',2),
('a','y',5),
('a','z',5),
('b','x1',11),
('b','x1',21),
('b','y1',51),
('b','z1',51),
('b','t1',51);
WITH cte
AS (SELECT Dense_rank()
OVER(
partition BY col1
ORDER BY col2) col1_rn,
Dense_rank()
OVER(
partition BY col1
ORDER BY col3) col2_rn,
*
FROM #t)
SELECT a.col1,
Count(1) AS [count]
FROM cte a
LEFT JOIN cte b
ON a.col1 = b.col1
AND a.col1_rn = b.col1_rn + 1
AND a.col2_rn = b.col2_rn + 1
WHERE b.col1_rn IS NOT NULL
GROUP BY a.col1
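As a cross-check of the "both fields must change" rule, independent of the query above, here is a minimal Python sketch. It assumes the rows are already in the intended order and grouped by the first field, and it counts a change only when both the 2nd and 3rd fields differ from the previous row in the same group. The data is the reduced sample from this answer, not the full list from the question.

```python
from itertools import groupby

# Same rows as the INSERT statement above.
rows = [
    ("a", "x", 1), ("a", "x", 2), ("a", "y", 5), ("a", "z", 5),
    ("b", "x1", 11), ("b", "x1", 21), ("b", "y1", 51),
    ("b", "z1", 51), ("b", "t1", 51),
]


def count_double_changes(rows):
    """Count, per first field, the consecutive row pairs where both other fields differ."""
    counts = {}
    for key, group in groupby(rows, key=lambda r: r[0]):
        group = list(group)
        counts[key] = sum(
            1
            for prev, cur in zip(group, group[1:])
            if prev[1] != cur[1] and prev[2] != cur[2]
        )
    return counts


changes = count_double_changes(rows)
print(changes)
```

Under this strict reading, each group has exactly one qualifying transition: (x, 2) to (y, 5) for a, and (x1, 21) to (y1, 51) for b.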

How to filter rows by values of one column?

I need to get several columns from a SQL query. Then I have to filter the result by the "distinct" values of one column, but the output must still contain all columns, not only the one whose values have to be distinct. Can anybody help me? An ORDER BY clause does not solve this for me.
A,B,C,D
E,F,G,H
I,J,C,L
M,N,Z,H
Above is a simple example of the rows returned. Please look at the 3rd column. Let's assume we don't know how many rows there are. I need to select only rows that have a distinct value in the 3rd column (C, G, Z), so one of the two "C" rows has to be filtered out; it does not matter which.
I've arbitrarily chosen to use col1 to break ties on col3. You can adjust the order by portion of the partition to suit your needs.
/* Set up test data */
declare @test table (
col1 char(1),
col2 char(1),
col3 char(1),
col4 char(1)
)
insert into @test
(col1, col2, col3, col4)
select 'A','B','C','D' union all
select 'E','F','G','H' union all
select 'I','J','C','L' union all
select 'M','N','Z','H'
/* Here's the query */
;with cteRowNumber as (
select col1, col2, col3, col4,
row_number() over (partition by col3 order by col1) as RowNumber
from @test
)
select col1, col2, col3, col4
from cteRowNumber
where RowNumber = 1
Returns
col1 col2 col3 col4
----------------------------
A B C D
E F G H
M N Z H
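For engines without window functions, the same "one row per col3 value" result can be sketched with a correlated subquery instead of row_number(). The example below runs in SQLite through Python's sqlite3 module, using the same test rows; it assumes col1 is unique within each col3 group, otherwise ties would return more than one row. The table name test is a stand-in for the table variable above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT);
    INSERT INTO test VALUES
        ('A', 'B', 'C', 'D'), ('E', 'F', 'G', 'H'),
        ('I', 'J', 'C', 'L'), ('M', 'N', 'Z', 'H');
""")

# Keep a row only if its col1 is the smallest col1 among rows sharing its col3.
rows = conn.execute("""
    SELECT t.col1, t.col2, t.col3, t.col4
    FROM test AS t
    WHERE t.col1 = (SELECT MIN(u.col1) FROM test AS u WHERE u.col3 = t.col3)
    ORDER BY t.col1
""").fetchall()
for row in rows:
    print(row)
```

The tie on col3 = 'C' is broken in favour of the row with the smaller col1, matching the choice made in the row_number() version.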
ROLLUP or CUBE could be helpful for your problem, since they can add subtotal rows for the GROUP BY columns while still returning the rows for each individual group.

Resources