Subquery in Snowflake with unknown matches, matches to columns

I have a few tables that I'm joining and one of them can have an unknown number of matches, up to 6. Each match should be returned as a row value in the initial query. For example:
SELECT a.ID, a.match1, a.match2, a.match3, a.match4, a.match5, a.match6
FROM table1 a, (SELECT ID, match FROM table2 WHERE a.ID = table2.ID) b
WHERE a.ID = b.ID
That's probably not the right syntax, but hopefully it shows what I need. The nested query may return anywhere from 1 to 6 matches. Each match should be the value for the corresponding column name, i.e. match1 = first match, match2 = second match, and so on.
Please let me know if I need to explain further. I know this isn't the optimal schema to use but it's what I was told to work with.

Something that SQL doesn't like is an unknown number of columns.
As a quick hack, you could aggregate all matches in an array, and then have a query around it transforming the matches into a predefined (large) number of columns.
Like this:
with data as (
select $1 id
from (values(1),(2))
), data2 as (
select $1 id, $2 match
from (values(1, 'a1'),(1, 'a2'),(2, 'b1'),(2, 'b2'),(2, 'b3'))
)
select id, matches[0], matches[1], matches[2], matches[3]
from (
select a.id, array_agg(match) matches
from data a
join data2 b
on a.id=b.id
group by 1
);
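An alternative to the array approach is to number the matches per ID and pivot them with conditional aggregation. The following is only a sketch, run against SQLite through Python's sqlite3 module so it can be tried anywhere; the table contents and the column name match_val are made up for illustration, and only three of the six columns are shown. ROW_NUMBER() and MAX(CASE ...) also exist in Snowflake, but this exact script has not been run there.

```python
import sqlite3

# In-memory SQLite database standing in for the two tables (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER, match_val TEXT);
    INSERT INTO table1 VALUES (1), (2);
    INSERT INTO table2 VALUES (1,'a1'), (1,'a2'), (2,'b1'), (2,'b2'), (2,'b3');
""")

# Number the matches within each id, then spread them over fixed columns.
# Window functions require SQLite 3.25 or newer.
rows = conn.execute("""
    SELECT a.id,
           MAX(CASE WHEN b.rn = 1 THEN b.match_val END) AS match1,
           MAX(CASE WHEN b.rn = 2 THEN b.match_val END) AS match2,
           MAX(CASE WHEN b.rn = 3 THEN b.match_val END) AS match3
    FROM table1 a
    LEFT JOIN (
        SELECT id, match_val,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY match_val) AS rn
        FROM table2
    ) b ON a.id = b.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

IDs with fewer matches than columns get NULL (None in Python) in the unused columns. The ORDER BY inside the window decides which match counts as "first", so it should be replaced with whatever ordering is meaningful for the data.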

Related

Sort the result according to the ARRAY elements?

I have the following query :
SELECT id,word FROM map
WHERE id::integer in (SELECT unnest(ary) FROM abc WHERE id = 11)
The problem is that the result comes back in random order.
What I want is for the result to come back in the order defined by the contents of the array "ary".
How do I do that?
I would unnest first and with that order given, would join the other tables on the id column:
SELECT
id,
word
FROM (
SELECT
unnest(ary) as id
FROM
abc
WHERE
id = 11
) a JOIN map
USING
(id)
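Note that a join by itself does not guarantee any output order; only an ORDER BY does. In PostgreSQL the element position can be obtained with unnest(...) WITH ORDINALITY and then used in ORDER BY. The sketch below shows the same idea (carry the position along and sort by it) using Python's sqlite3 module, because SQLite has no unnest; the wanted table, its pos column and the sample data are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE map (id INTEGER, word TEXT);
    INSERT INTO map VALUES (1,'one'), (2,'two'), (3,'three');
""")

# Hypothetical contents of abc.ary for id = 11.
ary = [3, 1, 2]

# Store each array element together with its 1-based position.
conn.execute("CREATE TEMP TABLE wanted (pos INTEGER, id INTEGER)")
conn.executemany("INSERT INTO wanted VALUES (?, ?)", list(enumerate(ary, start=1)))

# Join on id, then sort explicitly by the array position.
ordered = conn.execute("""
    SELECT m.id, m.word
    FROM wanted w
    JOIN map m ON m.id = w.id
    ORDER BY w.pos
""").fetchall()

print(ordered)
```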

Getting non-deterministic results from WITH RECURSIVE cte

I'm trying to create a recursive CTE that traverses all the records for a given ID, and does some operations between ordered records. Let's say I have customers at a bank who get charged a uniquely identifiable fee, and a customer can pay that fee in any number of installments:
WITH recursive payments (
id
, index
, fees_paid
, fees_owed
)
AS (
SELECT id
, index
, fees_paid
, fee_charged
FROM table
WHERE index = 1
UNION ALL
SELECT t.id
, t.index
, t.fees_paid
, p.fees_owed - p.fees_paid
FROM table t
JOIN payments p
ON t.id = p.id
AND t.index = p.index + 1
)
SELECT *
FROM payments
ORDER BY 1,2;
The join logic seems sound, but when I join the output of this query to the source table, I'm getting non-deterministic and incorrect results.
This is my first foray into Snowflake's recursive CTEs. What am I missing in the intermediate result logic that is leading to the non-determinism here?
I assume this is edited code, because in the anchor of your CTE you select a fourth column, fee_charged, which does not exist, and in the recursion you don't sum the fees paid. Basically, your logic seems rather strange.
So creating some random data, that has two different id streams to recurse over:
create or replace table data (id number, index number, val text);
insert into data
select * from values (1,1,'a'),(2,1,'b')
,(1,2,'c'), (2,2,'d')
,(1,3,'e'), (2,3,'f')
v(id, index, val);
Now altering your CTE just a little bit to concatenate the strings together:
WITH RECURSIVE payments AS
(
SELECT id
, index
, val
FROM data
WHERE index = 1
UNION ALL
SELECT t.id
, t.index
, p.val || t.val as val
FROM data t
JOIN payments p
ON t.id = p.id
AND t.index = p.index + 1
)
SELECT *
FROM payments
ORDER BY 1,2;
we get:
ID INDEX VAL
1 1 a
1 2 ac
1 3 ace
2 1 b
2 2 bd
2 3 bdf
Which is exactly what I would expect. So as for how this relates to your "it gets strange when I join to other stuff": either the output of your CTE is not what you expect it to be, or your join to other stuff is not working as you expect, or there is a bug in Snowflake.
Which all comes down to this: if the CTE results are exactly what you expect, write them to a table and join that to your other table, to rule out some form of CTE vs JOIN bug and to debug why your join is not working.
But if your CTE output is not what you expect, then let's help debug that.
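For the fee scenario in the question, a recursion that carries the remaining balance from one installment to the next might look like the sketch below. It runs on SQLite through Python's sqlite3 module rather than on Snowflake, and the table name fees, the column idx (index is a reserved word in SQLite) and the sample amounts are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fees (id INTEGER, idx INTEGER, fees_paid INTEGER, fee_charged INTEGER);
    -- fee_charged is only needed on the first installment of each id
    INSERT INTO fees VALUES
        (1, 1, 30, 100), (1, 2, 50, NULL), (1, 3, 20, NULL),
        (2, 1, 10, 60),  (2, 2, 20, NULL);
""")

balances = conn.execute("""
    WITH RECURSIVE payments (id, idx, fees_paid, remaining) AS (
        -- anchor: first installment, balance after paying it
        SELECT id, idx, fees_paid, fee_charged - fees_paid
        FROM fees
        WHERE idx = 1
        UNION ALL
        -- step: previous balance minus the current installment
        SELECT t.id, t.idx, t.fees_paid, p.remaining - t.fees_paid
        FROM fees t
        JOIN payments p
          ON t.id = p.id
         AND t.idx = p.idx + 1
    )
    SELECT * FROM payments ORDER BY 1, 2
""").fetchall()

for row in balances:
    print(row)
```

Each step subtracts the current row's payment from the balance carried by the previous row, which is one way to avoid mixing up the previous and the current installment in the recursive member.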

How to perform a join between a float type id column and a string containing multiple ids separated by commas

I have to join on the ids of two sets, where one set holds multiple ids in a single value and the other table has only one id per row. My query is:
Select * from(Select('1301,1303,1305,1307,1309,1311,1313,1315') IDs from market group by market.Segment)P join DST d on p.IDs = d.ID
One thing to note is that '1301,1303,1305,1307,1309,1311,1313,1315' is a value coming from a dynamic query, so I cannot manipulate this value (to 1301,1303,1305,1307,1309,1311,1313,1315).
In this query, d.ID is of float type. The query does not work.
My aim is to find any record in the DST table that has at least one id among the ids 1301,1303,1305,1307,1309,1311,1313,1315.
How can I do this?
You can use an IN clause:
select d.*
from DST d
where d.ID in (1301,1303,1305,1307,1309,1311,1313,1315);
If the ids arrive as a single string, you can split it (STRING_SPLIT is built in from SQL Server 2016 onward):
select d.*
from DST d cross apply
string_split('1301,1303,1305,1307,1309,1311,1313,1315', ',') ss
where ss.value = d.ID;
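If the comma-separated string reaches the database from application code, another option is to split it on the client and bind the values as parameters. This is a sketch only, using Python's sqlite3 module; the dst table, its name column and the shortened id list are hypothetical, and the same pattern would need the appropriate driver and placeholder style for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dst (id REAL, name TEXT);
    INSERT INTO dst VALUES (1301.0, 'a'), (1302.0, 'b'), (1305.0, 'c');
""")

# Value as delivered by the dynamic query (shortened for the example).
ids_csv = "1301,1303,1305"

# Split on the client and convert to float to match the column type.
ids = [float(part) for part in ids_csv.split(",")]

# One placeholder per id; the values themselves are bound, not concatenated.
placeholders = ",".join("?" * len(ids))
matches = conn.execute(
    f"SELECT id, name FROM dst WHERE id IN ({placeholders}) ORDER BY id",
    ids,
).fetchall()

print(matches)
```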

Count all max number values in different tables in SQL

I got an error when I tried to solve this problem. First I need to count the related rows across 2 tables, then in the WHERE condition I need to keep only the rows with the maximum count.
My code:
Select *
FROM (
select Operator.OperatoriausPavadinimas,
(
select count(*)
from Plan
where Plan.operatoriausID= Operator.operatoriausID
) as NumberOFPlans
from Operator
)a
where a.NumberOFPlans= Max(a.NumberOFPlans)
I get this error
Msg 147, Level 15, State 1, Line 19
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
I don't know how to solve this.
I need to get this: http://prntscr.com/p700w9
Update 1
The Plan table contains these values: http://prntscr.com/p7055l and
the Operator table contains these values: http://prntscr.com/p705k0.
Are you looking for... an aggregate query that joins both tables and returns the record that has the maximum count?
I suspect that this might phrase as follows:
SELECT TOP(1) o.OperatoriausPavadinimas, COUNT(*)
FROM Operatorius o
INNER JOIN Planas p ON p.operatoriausID = o.operatoriausID
GROUP BY o.OperatoriausPavadinimas
ORDER BY COUNT(*) DESC
If you want to allow ties, you can use TOP(1) WITH TIES.
You can use top with ties. Your query is a bit hard to follow, but I think you want:
select top (1) with ties o.OperatoriausPavadinimas, count(*)
from plan p join
operator o
on p.operatoriausID = o.operatoriausID
group by o.OperatoriausPavadinimas
order by count(*) desc;
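The original error comes from using MAX() directly in WHERE. In dialects without TOP ... WITH TIES, the same result can be obtained by comparing each group's count with the maximum count computed in a subquery. The sketch below runs on SQLite through Python's sqlite3 module; the simplified table and column names (operator, plans, operator_id, name) and the data are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE operator (operator_id INTEGER, name TEXT);
    CREATE TABLE plans (plan_id INTEGER, operator_id INTEGER);
    INSERT INTO operator VALUES (1,'A'), (2,'B'), (3,'C');
    INSERT INTO plans VALUES (1,1), (2,1), (3,2), (4,2), (5,3);
""")

# Keep every operator whose plan count equals the highest plan count.
top = conn.execute("""
    SELECT o.name, COUNT(*) AS number_of_plans
    FROM operator o
    JOIN plans p ON p.operator_id = o.operator_id
    GROUP BY o.name
    HAVING COUNT(*) = (
        SELECT MAX(c)
        FROM (SELECT COUNT(*) AS c FROM plans GROUP BY operator_id)
    )
    ORDER BY o.name
""").fetchall()

print(top)
```

Operators A and B both have two plans here, so both are returned, which matches the WITH TIES behaviour.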

Selecting Max with Lots of Other Items

Sorry for the poor title. I wasn't sure how to describe my problem. I've written a query that returns about 23,000 records. A lot of those records have similar information and I want to only select the records with the maximum of the field dbo.tblMsgsOnAir_Type8.fldBuddyLinkSigStrength. I've tried grouping by all of the other columns being selected, but it doesn't appear to work correctly. I don't fully understand SQL, especially the max and group functions. I can do simple max functions when I only want or need to select one thing. I don't understand how it works when I want to select a bunch of other data. Below is the query.
SELECT
dbo.tblmeterinfo.fldMeterSerialNumber AS "MOP_FNP_Meter",
dbo.tblMsgsOnAir_Type8.fldRBuddyId AS "MOP_FNP_FNID",
dbo.TBLMETERMAINT.fldmeterid AS "Meter_ID_Helped",
dbo.tblMsgsOnAir_Type8.fldCBuddyId AS "FNID_Helped",
dbo.fn_dt(dbo.tblMsgsOnAir_Type8.fldRBuddyToi) AS "TOI",
dbo.tblMsgsOnAir_Type8.fldBuddyLinkSigStrength AS "Sig_Str",
dbo.TBLSAWN_CIS_INFO.SML AS "Buddy_SML",
dbo.TBLMETERLIST.fldaddress AS "Buddy_Address",
dbo.TBLSAWNGISCOORD.X_COORD AS "X_Coord",
dbo.TBLSAWNGISCOORD.Y_COORD AS "Y_Coord"
FROM dbo.tblMsgsOnAir_Type8
LEFT OUTER JOIN dbo.TBLMETERLIST
ON (dbo.TBLMETERLIST.FLDREPID = dbo.tblMsgsOnAir_Type8.fldCBuddyId)
LEFT OUTER JOIN dbo.TBLMETERMAINT
ON (dbo.TBLMETERMAINT.FLDREPID = dbo.tblMsgsOnAir_Type8.fldCBuddyID)
LEFT OUTER JOIN dbo.TBLSAWN_CIS_INFO
ON (dbo.TBLSAWN_CIS_INFO.FLDREPID = dbo.tblMsgsOnAir_Type8.fldCBuddyId)
LEFT OUTER JOIN dbo.TBLSAWNGISCOORD
ON (dbo.TBLSAWNGISCOORD.SRV_MAP_LOC = dbo.TBLSAWN_CIS_INFO.SML)
LEFT OUTER JOIN dbo.tblmeterinfo
ON (dbo.tblmeterinfo.fldRepId = dbo.tblMsgsOnAir_Type8.fldRBuddyId)
WHERE dbo.tblMsgsOnAir_Type8.fldRBuddyId IN (SELECT
dbo.tblSAWN_FNPmap.Repid
FROM dbo.tblSAWN_FNPmap)
AND dbo.TBLMETERMAINT.fldmeterid IS NOT NULL
The query below is simple and does what I want, but doesn't get all of the other fields. This query only returns 617 records. I would like the above query to return 617 records, but include all of the other information I've selected.
SELECT
dbo.TBLMETERMAINT.fldmeterid AS "Meter_ID_Helped",
MAX(dbo.tblMsgsOnAir_Type8.fldBuddyLinkSigStrength) AS "Max_Sig"
FROM dbo.tblMsgsOnAir_Type8
LEFT OUTER JOIN dbo.TBLMETERMAINT
ON (dbo.TBLMETERMAINT.FLDREPID = dbo.tblMsgsOnAir_Type8.fldCBuddyID)
WHERE dbo.tblMsgsOnAir_Type8.fldRBuddyId IN (SELECT
dbo.tblSAWN_FNPmap.Repid
FROM dbo.tblSAWN_FNPmap)
AND dbo.TBLMETERMAINT.fldmeterid IS NOT NULL
GROUP BY dbo.TBLMETERMAINT.fldmeterid
Probably row_number() to the rescue. You can use it to find the best records in a set, with a grouping by some subset or other. Something like
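select *
from (
select ...., row_number() over (partition by id order by fldBuddyLinkSigStrength desc) as rn
from ....
) ranked
where rn = 1
So SQL Server assigns a row number within the groups. Each record will be sub-grouped by id, in this case, and given 1 if it has the best (highest) strength, 2 if it's next, etc. A window function cannot be used directly in WHERE, which is why the row number is computed in a derived table and filtered outside it.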
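As a runnable illustration of the row_number() idea, here is a small sketch on SQLite through Python's sqlite3 module. The msgs table and its columns are invented stand-ins for the real tables; the row number is computed in a derived table and filtered in the outer query, ordered descending so that the strongest signal gets number 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE msgs (meter_id INTEGER, buddy TEXT, sig INTEGER);
    INSERT INTO msgs VALUES
        (1, 'x', 10), (1, 'y', 30),
        (2, 'z', 5),  (2, 'w', 7);
""")

# Rank rows within each meter by signal strength, strongest first,
# then keep only the top-ranked row together with its other columns.
# Window functions require SQLite 3.25 or newer.
best = conn.execute("""
    SELECT meter_id, buddy, sig
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY meter_id ORDER BY sig DESC) AS rn
        FROM msgs
    ) ranked
    WHERE rn = 1
    ORDER BY meter_id
""").fetchall()

print(best)
```

If two rows tie for the highest strength, ROW_NUMBER() keeps one of them arbitrarily; RANK() would keep both.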
If you are getting duplicates have you tried using SELECT DISTINCT?
Basically how Max works is that it will select the highest value in the group.
So if you have a table:
ID | VALUE
1 | 10
1 | 7
1 | 9
2 | 6
2 | 8
And do
SELECT ID, MAX(VALUE)
FROM TABLE
GROUP BY ID
You'll get the max value per ID
ID | VALUE
1 | 10
2 | 8
If you want to get the Max while not grouping the result then you can do the group in a subselect
SELECT t.ID, t.VALUE, m.MAX_VALUE etc etc
FROM TABLE t
JOIN ( SELECT ID, MAX(VALUE) AS MAX_VALUE FROM TABLE GROUP BY ID) m ON m.ID = t.ID
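The join-back pattern can be tried with the small ID/VALUE table from this answer. The sketch uses Python's sqlite3 module; the table name readings and the column name val are placeholders chosen for the example. Adding a filter on the joined maximum keeps only the row that holds the maximum for each ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (id INTEGER, val INTEGER);
    INSERT INTO readings VALUES (1,10), (1,7), (1,9), (2,6), (2,8);
""")

# Every row, annotated with the maximum of its group.
with_max = conn.execute("""
    SELECT t.id, t.val, m.max_val
    FROM readings t
    JOIN (SELECT id, MAX(val) AS max_val FROM readings GROUP BY id) m
      ON m.id = t.id
    ORDER BY t.id, t.val
""").fetchall()

# Only the rows that actually hold the group maximum.
only_max = conn.execute("""
    SELECT t.id, t.val
    FROM readings t
    JOIN (SELECT id, MAX(val) AS max_val FROM readings GROUP BY id) m
      ON m.id = t.id AND m.max_val = t.val
    ORDER BY t.id
""").fetchall()

print(with_max)
print(only_max)
```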
Without knowing your table structures in more detail I can't be sure this is the best way, but here's something that should work. Use the 2nd query as the left side of a left join, to pick up the extra columns:
select a.*
from (<your 2nd query>) a
left join dbo.TBLMETERLIST
on (a.FLDREPID = dbo.tblMsgsOnAir_Type8.fldCBuddyId)
left join <next table> ...
and so on. You'll also have to left join on dbo.tblMsgsOnAir_Type8 in order to pick up the columns in that table, so that's one additional left join beyond what your first query does. By the way, it's a good idea to post code here laid out so it's readable; it makes it a lot easier for others to understand.
