I'm having problems comparing Postgres types, and would be grateful for some help. I am extracting valid document types from a configuration table that holds a pipe-separated string, as follows:
SELECT string_to_array(value,'|') as document_kinds
FROM company_configs
WHERE option = 'document_kinds'
This gives me an array of values, so
'doc1|doc2|doc3' becomes {doc1,doc2,doc3}.
Next I need to select the documents for a given person which match my document types:
SELECT * FROM people
JOIN documents ON ...
WHERE kind IN
(SELECT string_to_array(value,'|') as document_kinds
FROM company_configs
WHERE option = 'document_kinds')
The documents.kind column is 'character varying'.
My understanding is that string_to_array produces an array of text values, 'text[]'.
This query produces the error 'ERROR: operator does not exist: character varying = text[]'
If I cast 'kind' into text, with
SELECT * FROM people
JOIN documents ON ...
WHERE kind::text IN
(SELECT string_to_array(value,'|') as visa_document_kinds FROM languages_united.company_configs WHERE option = 'visa_document_kinds')
I get the error 'ERROR: operator does not exist: text = text[]'
I'm not sure how to compare the two, and would be grateful for any advice.
Thanks in advance
Dan
Postgres 9.4.1
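You can compare against any array element with the ANY operator, provided your sub-query returns exactly one row. Note the extra parentheses and the cast: a bare ANY (SELECT ...) is parsed as the sub-query form, which compares kind with each returned row (here a whole text[]) and raises the same "operator does not exist" error.
SELECT *
FROM people
JOIN documents ON ...
WHERE kind = ANY ((
SELECT string_to_array(value,'|') as document_kinds
FROM company_configs
WHERE option = 'document_kinds')::text[]);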
If the sub-query possibly returns multiple rows, you can use the regexp_split_to_table() function:
SELECT *
FROM people
JOIN documents ON ...
JOIN (
SELECT document_kinds
FROM company_configs,
regexp_split_to_table(value, '\|') as document_kinds
WHERE option = 'document_kinds') sub ON sub.document_kinds = kind;
(You will have to tweak this to match the rest of your query.)
I am moving a query from SQL Server to Snowflake. Part of the query creates a pivot table. The pivot table part works fine (I have run it in isolation, and it pulls numbers I expect).
However, the following parts of the query rely on the pivot table, and those parts fail. Some of the fields come back as string types. I believe the problem is that Snowflake is having trouble converting string data to numeric data. I have tried CAST and TRY_TO_DOUBLE/TRY_TO_NUMBER, but these just return 0.
I will put the code down below, and I appreciate any insight as to what I can do!
CREATE OR REPLACE TEMP TABLE ATTR_PIVOT_MONTHLY_RATES AS (
SELECT
Market,
Coverage_Mo,
ZEROIFNULL(TRY_TO_DOUBLE('Starting Membership')) AS Starting_Membership,
ZEROIFNULL(TRY_TO_DOUBLE('Member Adds')) AS Member_Adds,
ZEROIFNULL(TRY_TO_DOUBLE('Member Attrition')) AS Member_Attrition,
((ZEROIFNULL(CAST('Starting Membership' AS FLOAT))
+ ZEROIFNULL(CAST('Member Adds' AS FLOAT))
+ ZEROIFNULL(CAST('Member Attrition' AS FLOAT)))-ZEROIFNULL(CAST('Starting Membership' AS FLOAT)))
/ZEROIFNULL(CAST('Starting Membership' AS FLOAT)) AS "% Change"
FROM
(SELECT * FROM ATTR_PIVOT
WHERE 'Starting Membership' IS NOT NULL) PT)
I realize this is a VERY big question with a lot of moving parts... So my main question is: How can I successfully change the data type to numeric value, so that hopefully the formulas work in the second half of the query?
Thank you so much for reading through it all!
EDITED FOR SHORTENING THE QUERY WITH UNNEEDED SYNTAX
CAST(), TRY_TO_DOUBLE(), TRY_TO_NUMBER(). I have also put the fields (Starting Membership, Member Adds) in single and double quotation marks.
Unless you are quoting your field names in this post just to highlight them for some reason, the way you've written this query would indicate that you are trying to cast a string value to a number.
For example:
ZEROIFNULL(TRY_TO_DOUBLE('Starting Membership'))
This is simply trying to cast a string literal value of Starting Membership to a double. This will always be NULL. And then your ZEROIFNULL() function is turning your NULL into a 0 (zero).
Without seeing the rest of your query that defines the column names, I can't provide you with a correction, but try using field names, not quoted string values, in your query and see if that gives you what you need.
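To make the literal-versus-column distinction concrete, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for Snowflake, and the helper try_to_double is a hypothetical, rough imitation of TRY_TO_DOUBLE (it is not Snowflake's function). The one-column attr_pivot table and its value are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Snowflake; the table and its data are invented.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE attr_pivot ("Starting Membership" TEXT)')
conn.execute("INSERT INTO attr_pivot VALUES ('125')")

# Single quotes produce a string literal; double quotes refer to the column.
literal, column = conn.execute(
    """SELECT 'Starting Membership', "Starting Membership" FROM attr_pivot"""
).fetchone()


def try_to_double(text):
    """Hypothetical, rough stand-in for TRY_TO_DOUBLE: None if text is not numeric."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


print(repr(literal), try_to_double(literal))  # the literal text never converts
print(repr(column), try_to_double(column))    # the column value does
```

The single-quoted form hands the conversion the words "Starting Membership" themselves, so it can never succeed, while the double-quoted form hands it the stored value.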
Your first mistake is that all your single-quoted column names are being treated as strings/text/char.
For example, take your inner select:
with ATTR_PIVOT(id, studentname) as (
select * from values
(1, 'student_a'),
(1, 'student_b'),
(1, 'student_c'),
(2, 'student_z'),
(2, 'student_a')
)
SELECT *
FROM ATTR_PIVOT
WHERE 'Starting Membership' IS NOT NULL
There is no "Starting Membership" column, and we get all the rows:
ID|STUDENTNAME
1|student_a
1|student_b
1|student_c
2|student_z
2|student_a
So you need to change 'Starting Membership' -> "Starting Membership", and likewise for the other column names.
As Mike mentioned, the 0 results are because TRY_TO_DOUBLE always fails on the string literal, so the NULL is always turned into zero.
Now, with real "string" values, in real named columns:
with ATTR_PIVOT(Market, Coverage_Mo, "Starting Membership", "Member Adds", "Member Attrition") as (
select * from values
(1, 10 ,'student_a', '23', '150' )
)
SELECT
Market,
Coverage_Mo,
ZEROIFNULL(TRY_TO_DOUBLE("Starting Membership")) AS Starting_Membership,
ZEROIFNULL(TRY_TO_DOUBLE("Member Adds")) AS Member_Adds,
ZEROIFNULL(TRY_TO_DOUBLE("Member Attrition")) AS Member_Attrition
FROM ATTR_PIVOT
WHERE "Starting Membership" IS NOT NULL
we get what we would expect (STARTING_MEMBERSHIP is 0 only because the sample value 'student_a' is not numeric):
MARKET|COVERAGE_MO|STARTING_MEMBERSHIP|MEMBER_ADDS|MEMBER_ATTRITION
1|10|0|23|150
Table Variables:
Column Name|Type
name|varchar
value|int
name is the primary key for this table.
This table contains the stored variables and their values.
Table Expressions:
Column Name|Type
left_operand|varchar
operator|enum
right_operand|varchar
(left_operand, operator, right_operand) is the primary key for this table.
This table contains a boolean expression that should be evaluated.
operator is an enum that takes one of the values ('<', '>', '=')
The values of left_operand and right_operand are guaranteed to be in the Variables table.
Write an SQL query to evaluate the boolean expressions in Expressions table.
Return the result table in any order.
I am working on a SQL problem as shown in the above. I used MS SQL server and tried
SELECT
left_operand, operator, right_operand,
IIF(
(left_values > right_values AND operator = '>') OR
(left_values < right_values AND operator = '<' ) OR
(left_values = right_values AND operator = '='), 'true', 'false') as 'value'
FROM
(SELECT *,
IIF(left_operand = 'x', (SELECT value FROM Variables WHERE name='x')
, (SELECT value FROM Variables WHERE name='y')) as left_values,
IIF(right_operand = 'x', (SELECT value FROM Variables WHERE name='x')
, (SELECT value FROM Variables WHERE name='y')) as right_values
FROM Expressions) temp;
It works on the test set but fails when I submit it.
I think my logic is correct. Could anyone take a look and let me know what my problem is?
Thank you!
Your example code is more complicated than it needs to be, which is probably why it fails the check. In your FROM you use sub-selects where a plain inner join would be simpler. Also, your code only handles the variables x and y: any other variable name falls into the second branch and is treated as y. Here's the code I wrote in Postgres. It returns the comparison as a boolean in the select list, so other databases (SQL Server, for example) would need a CASE that yields 'true'/'false' instead.
SELECT e.left_operand, l.value as left_val, e.operator, e.right_operand, r.value as right_val,
CASE e.operator
WHEN '<' THEN
(l.value < r.value)
WHEN '=' THEN
(l.value = r.value)
WHEN '>' THEN
(l.value > r.value)
END as eval
FROM
expression as e
JOIN
variable as l on e.left_operand = l.name
JOIN
variable as r on e.right_operand = r.name
I also have a db-fiddle link for you to check out.
https://www.db-fiddle.com/f/fdnJVSUQHS9Vep4uDSe5ZP/0
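As a self-contained check of the join approach, here is a sketch that runs the same idea through Python's built-in sqlite3. SQLite is only a stand-in here, the sample rows (including the extra variable z) are made up, and the CASE returns 'true'/'false' strings rather than booleans so that the shape carries over to engines without a boolean type.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; the sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Variables (name TEXT PRIMARY KEY, value INTEGER);
    CREATE TABLE Expressions (left_operand TEXT, operator TEXT, right_operand TEXT);
    INSERT INTO Variables VALUES ('x', 66), ('y', 77), ('z', 77);
    INSERT INTO Expressions VALUES
        ('x', '>', 'y'), ('x', '<', 'y'), ('y', '=', 'z'), ('z', '>', 'x');
""")

# Join Variables twice, once per operand, then pick the comparison by operator.
rows = conn.execute("""
    SELECT e.left_operand, e.operator, e.right_operand,
           CASE
               WHEN e.operator = '<' AND l.value < r.value THEN 'true'
               WHEN e.operator = '=' AND l.value = r.value THEN 'true'
               WHEN e.operator = '>' AND l.value > r.value THEN 'true'
               ELSE 'false'
           END AS value
    FROM Expressions AS e
    JOIN Variables AS l ON e.left_operand = l.name
    JOIN Variables AS r ON e.right_operand = r.name
    ORDER BY e.rowid
""").fetchall()

for row in rows:
    print(row)
```

Because the operands are resolved by the join, a variable such as z needs no special handling, which is where the hard-coded x/y version breaks down.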
I need some help to improve part of my query. The query is returning the correct data, I just need to exclude some extra information that I don't need.
I believe that one of the main parts that will change is:
JOIN TBL_DATA_TYPE_RO_BODY TB ON TB.FK_ID_TBL_FILE_NAMES=VMI.ID_TBL_FILE_NAMES
For example, when there are 2 matching FK_ID_TBL_FILE_NAMES, this join returns 2 results from TBL_DATA_TYPE_RO_BODY.
The data that I have is (I excluded some extra columns):
If 2 or more rows have the same MAG for the same "ONLY_FILE_NAME", I should return only the first one (I don't care about the others). I believe that this is a simple case for GROUP BY, but I am having trouble doing the GROUP BY on the join.
My ideas:
Use select top (i.e. here)
Use first value (i.e. here)
What I have (note the 2 last lines):
Freq|Mag|Phase|Date|ONLY_FILE_NAME
1608039|767|3234|37:00.0|RO_Mass_Load_4b
1608039|781|3371|44:00.0|RO_Mass_Load_4b
1608039|788|3138|37:00.0|RO_Mass_Load_4b
1608039|797|3326|44:00.0|RO_Mass_Load_4b
1608039|808|3117|37:00.0|RO_Mass_Load_4b
1608039|808|3269|44:00.0|RO_Mass_Load_4b
What I would like to have (note the last line):
Freq|Mag|Phase|Date|ONLY_FILE_NAME
1608039|767|3234|37:00.0|RO_Mass_Load_4b
1608039|781|3371|44:00.0|RO_Mass_Load_4b
1608039|788|3138|37:00.0|RO_Mass_Load_4b
1608039|797|3326|44:00.0|RO_Mass_Load_4b
1608039|808|3117|37:00.0|RO_Mass_Load_4b
Note that the mag field is coming from my JOIN.
Ideas? Any help?
In case you want to see it, the whole code is:
SELECT TW.CURRENT_MEASUREMENT as Cycle_Current_Measurement,
TW.REF_MEASUREMENT as Cycle_Ref_Measurement,
CONVERT(REAL,TT.CURRENT_TEMP) as Cycle_Current_Temp,
CONVERT(REAL,TT.REF_TEMP) as Cycle_Ref_Temp,
TP.TYPE as Cycle_Type, TB.FREQUENCY as Freq,
TB.MAGNITUDE as Mag,
TB.PHASE as Phase,
VMI.TIME_FORMATTED as Date,
VMI.ID_TBL_FILE_NAMES as IdFileNames, VMI.ID_TBL_DATA_TYPE_RO_HEADER as IdHeader, VMI.*
FROM VW_MAIN_INFO VMI
JOIN TBL_DATA_TYPE_RO_BODY TB ON TB.FK_ID_TBL_FILE_NAMES=VMI.ID_TBL_FILE_NAMES
LEFT JOIN TBL_POINTS_AND_CYCLES TP ON VMI.ID_TBL_DATA_TYPE_RO_HEADER = TP.FK_ID_TBL_DATA_TYPE_RO_HEADER
LEFT JOIN TBL_POINTS_AND_MEASUREMENT TW ON VMI.ID_TBL_DATA_TYPE_RO_HEADER = TW.FK_ID_TBL_DATA_TYPE_RO_HEADER
LEFT JOIN TBL_POINTS_AND_TEMP TT ON VMI.ID_TBL_DATA_TYPE_RO_HEADER = TT.FK_ID_TBL_DATA_TYPE_RO_HEADER
Try something like this. The PARTITION BY works like a GROUP BY: it defines the groups within which row_number() counts up from 1. The ORDER BY tells row_number() which rows get the lower numbers, so in this example the earliest date in each group gets RID = 1. Then wrap that in a subquery and select only the rows with RID = 1.
select *
from (select RID = row_number() over (partition by tb.Magnitude order by vmi.time_formatted),
<your columns>
from ...<rest of your query>) a
where a.RID = 1
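Here is a runnable sketch of that pattern against the six sample rows from the question. It uses Python's built-in sqlite3 as a stand-in for SQL Server (window functions need SQLite 3.25 or newer), and the flat readings table is hypothetical, standing in for the joined result. One deliberate difference from the snippet above: it partitions by file name as well as magnitude, on the assumption that "first per MAG" is meant per ONLY_FILE_NAME.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; "readings" replaces the real join.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (freq INT, mag INT, phase INT, date TEXT, only_file_name TEXT)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?, ?)",
    [
        (1608039, 767, 3234, "37:00.0", "RO_Mass_Load_4b"),
        (1608039, 781, 3371, "44:00.0", "RO_Mass_Load_4b"),
        (1608039, 788, 3138, "37:00.0", "RO_Mass_Load_4b"),
        (1608039, 797, 3326, "44:00.0", "RO_Mass_Load_4b"),
        (1608039, 808, 3117, "37:00.0", "RO_Mass_Load_4b"),
        (1608039, 808, 3269, "44:00.0", "RO_Mass_Load_4b"),
    ],
)

# Number the rows inside each (file, mag) group by date, then keep number 1.
rows = conn.execute("""
    SELECT freq, mag, phase, date, only_file_name
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY only_file_name, mag
                   ORDER BY date
               ) AS rid
        FROM readings
    ) AS a
    WHERE a.rid = 1
    ORDER BY mag
""").fetchall()

for row in rows:
    print(row)
```

Only one of the two Mag = 808 rows survives, the one with the earlier date, which matches the desired output in the question.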
I want to slice a word eg: SMILE into :
S
M
I
L
E
I did it like this
SEL SUBSTR(EMP_NAME,1,1) FROM etlt5.employe where EMP_ID='28008'
UNION ALL
SEL SUBSTR(EMP_NAME,2,1) FROM etlt5.employe where EMP_ID='28008'
UNION ALL
SEL SUBSTR(EMP_NAME,3,1) FROM etlt5.employe where EMP_ID='28008'
I also tried it with a recursive query but got no final results. Is there a better way of doing this? This looks too much like a hardcoded one.
You could use STRTOK_SPLIT_TO_TABLE to do this. STRTOK_SPLIT_TO_TABLE splits a field by a delimiter and then takes each token (the text between delimiters) and puts it in its own record of a new derived table.
In your case you don't have a delimiter between the characters of "SMILE" so we can use some REGEXP_REPLACE magic to stick a comma between each letter, and then split that to a table:
WITH test (id, word) AS (SELECT 1, 'SMILE')
SELECT D.*
FROM TABLE (strtok_split_to_table(test.id, REGEXP_REPLACE(test.word, '([a-zA-Z])', ',\1'), ',')
RETURNS
( id integer
, rownum integer
, new_col varchar(100)character set unicode)
) as d
I've used this STRTOK_SPLIT_TO_TABLE(REGEXP_REPLACE()) before to split apart document numbers in order to determine a check digit, so it definitely has its uses.
May I ask why you want to do that?
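You need a table with a sequence from 1 to the max length of EMP_NAME (here number_table, with an integer column n), and you should stop at the length of each name:
select SUBSTR(EMP_NAME,n,1)
FROM etlt5.employe CROSS JOIN number_table
where EMP_ID='28008'
and n <= CHARACTER_LENGTH(EMP_NAME)
order by n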
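The number-table idea can be tried out with the sketch below. It uses Python's built-in sqlite3 as a stand-in for Teradata, so LENGTH replaces the Teradata length function, and both the employe row and the 10-row number_table (column n) are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Teradata; both tables are invented samples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employe (emp_id TEXT, emp_name TEXT);
    INSERT INTO employe VALUES ('28008', 'SMILE');
    CREATE TABLE number_table (n INTEGER PRIMARY KEY);
""")
# Fill the helper table with 1..10, enough for the longest name in this sample.
conn.executemany("INSERT INTO number_table VALUES (?)", [(i,) for i in range(1, 11)])

# One output row per character position that exists in the name.
letters = [
    row[0]
    for row in conn.execute("""
        SELECT SUBSTR(emp_name, n, 1)
        FROM employe CROSS JOIN number_table
        WHERE emp_id = '28008'
          AND n <= LENGTH(emp_name)
        ORDER BY n
    """)
]

print(letters)
```

The length filter keeps positions past the end of the name from producing empty rows, and the ORDER BY keeps the letters in their original order.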
Query 1:
SELECT ARRAY(select id from contacts where id = 0)::INT[],
ARRAY[]::INT[],
ARRAY(SELECT id FROM contacts WHERE id = 0)::INT[] = ARRAY[]::int[]
Produces this result:
int4 array ?column?
{} {} TRUE
Query 2:
SELECT (ARRAY(SELECT id FROM contacts WHERE id = 0)::INT[]
& ARRAY(select id from contacts where id = 0)::INT[]),
ARRAY[]::INT[],
(ARRAY(SELECT id FROM contacts WHERE id = 0)::INT[]
& ARRAY(SELECT id FROM contacts WHERE id = 0)::INT[]) = ARRAY[]::int[]
Produces a different result:
?column? array ?column?
{} {} FALSE
Why the difference?
Is there any other way to compare an empty integer array with the result of an intersection of two arrays like in the second query?
Standard PostgreSQL does not support the ARRAY intersection operator. You must have installed the additional module intarray.
Your question boils down to this:
The intersection of two empty integer arrays yields an empty integer array. Why does this query yield false?
SELECT ('{}'::int[] & '{}'::int[]) = '{}'::int[]
Or in other syntax, meaning the same:
SELECT (ARRAY[]::int[] & ARRAY[]::int[]) = ARRAY[]::int[]
While this yields true:
SELECT '{}'::int[] = '{}'::int[]
And yes, that is a very good question.
For what it's worth, I can explain the difference:
SELECT array_dims('{}'::int[])
<NULL>
SELECT array_dims('{}'::int[] & '{}'::int[])
[1:0]
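In other words, the first one is an empty array with no dimensions at all, while the second one is a one-dimensional array with zero elements (bounds [1:0]). Array equality also compares the dimensions, so the two are not considered equal.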
This can be very confusing. For instance see this thread about how to treat string_to_array() with empty output.
I am not sure the & operator does the right thing here.