Unnest multiple arrays in parallel - arrays

My last question, Passing an array to a stored procedure in Postgres, was a bit unclear. Now, to clarify my objective:
I want to create a Postgres stored procedure that accepts two input parameters. One will be a list of amounts, for instance (100, 40.5, 76), and the other will be a list of invoices ('01-2222-05','01-3333-04','01-4444-08'). I then want to use these two lists of numbers and characters and do something with them. For example, I want to take each amount from the array of numbers and assign it to the corresponding invoice.
Something like that in Oracle would look like this:
SOME_PACKAGE.SOME_PROCEDURE (
789,
SYSDATE,
SIMPLEARRAYTYPE ('01-2222-05','01-3333-04','01-4444-08'),
NUMBER_TABLE (100,40.5,76),
'EUR',
1,
P_CODE,
P_MESSAGE);
Of course, the two types SIMPLEARRAYTYPE and NUMBER_TABLE are defined earlier in DB.

You will love this new feature of Postgres 9.4:
unnest(anyarray, anyarray [, ...])
unnest() with the much anticipated (at least by me) capability to unnest multiple arrays in parallel cleanly. The manual:
expand multiple arrays (possibly of different types) to a set of rows. This is only allowed in the FROM clause;
It's a special implementation of the new ROWS FROM feature.
Your function can now just be:
CREATE OR REPLACE FUNCTION multi_unnest(_some_id int
, _amounts numeric[]
, _invoices text[])
RETURNS TABLE (some_id int, amount numeric, invoice text) AS
$func$
SELECT _some_id, u.* FROM unnest(_amounts, _invoices) u;
$func$ LANGUAGE sql;
Call:
SELECT * FROM multi_unnest(123, '{100, 40.5, 76}'::numeric[]
, '{01-2222-05,01-3333-04,01-4444-08}'::text[]);
Of course, the simple form can be replaced with plain SQL (no additional function):
SELECT 123 AS some_id, *
FROM unnest('{100, 40.5, 76}'::numeric[]
, '{01-2222-05,01-3333-04,01-4444-08}'::text[]) AS u(amount, invoice);
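Either way, the result should look something like this:
 some_id | amount | invoice
---------+--------+------------
     123 |    100 | 01-2222-05
     123 |   40.5 | 01-3333-04
     123 |     76 | 01-4444-08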
In earlier versions (Postgres 9.3-), you can use the less elegant and less safe form:
SELECT 123 AS some_id
, unnest('{100, 40.5, 76}'::numeric[]) AS amount
, unnest('{01-2222-05,01-3333-04,01-4444-08}'::text[]) AS invoice;
Caveats of the old shorthand form: besides it being non-standard to have a set-returning function in the SELECT list, the number of rows returned is the lowest common multiple of the number of elements of each array (with surprising results for unequal numbers). Details in these related answers:
Parallel unnest() and sort order in PostgreSQL
Is there something like a zip() function in PostgreSQL that combines two arrays?
This behavior has finally been sanitized with Postgres 10. Multiple set-returning functions in the SELECT list produce rows in "lock-step" now. See:
What is the expected behaviour for multiple set-returning functions in SELECT clause?
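For illustration, a minimal sketch: in Postgres 10 or later, this query produces three rows, with NULL padding the shorter array (before Postgres 10, it would have produced six rows, the lowest common multiple):
SELECT unnest('{1,2,3}'::int[]) AS a
     , unnest('{x,y}'::text[]) AS b;

 a | b
---+---
 1 | x
 2 | y
 3 |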

Arrays are declared by adding [] to the base data type. You declare them as parameters the same way you declare regular parameters.
The following function accepts an array of integers and an array of strings and returns some dummy text:
create function array_demo(p_data integer[], p_invoices text[])
returns text
as
$$
select p_data[1] || ' => ' || p_invoices[1];
$$
language sql;
select array_demo(array[1,2,3], array['one', 'two', 'three']);
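With those arguments, the call should return the first element of each array joined together:
 array_demo
-------------
 1 => one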
SQLFiddle demo: http://sqlfiddle.com/#!15/fdb8d/1

Related

How to add fields dynamically to snowflake's object_construct function

I have a large table of data in Snowflake that contains many fields with a name prefix of 'RAW_'. In order to make my table more manageable, I wish to condense all of these 'RAW_' fields into just one field called 'RAW_KEY_VALUE' that stores them as a key-value object.
It initially appeared that Snowflake's 'OBJECT_CONSTRUCT' function was going to be my perfect solution here. However, the issue with this function is that it requires a manual input/hard coding of the fields you wish to convert to a key-value object. This is problematic for me as I have anywhere from 90-120 fields I would need to manually place in this function. Additionally, these fields with a 'RAW_' prefix change all the time. It is therefore critical that I have a solution that allows me to dynamically add these fields and convert them to a key-value store. (I haven't tried creating a stored procedure for this yet but will if all else fails)
Here is a snippet of the data in question
create or replace table reviews(name varchar(50), acting_rating int, raw_comments varchar(50), raw_rating int, raw_co varchar(50));
insert into reviews values
('abc', 4, NULL, 1, 'NO'),
('xyz', 3, 'some', 1, 'haha'),
('lmn', 1, 'what', 4, NULL);
Below is the output I'm trying to achieve (using the manual input/hard coding approach with object_construct)
select
name ,
acting_rating ,
object_construct_keep_null ('raw_comments',raw_comments,'raw_rating',raw_rating,'raw_co',raw_co) as RAW_KEY_VALUE
from reviews;
The above produces my desired output.
Please let me know if there are any other ways to approach here. I think if I was able to work out a way to add the relevant fields to the object_construct function dynamically, that would solve my problem.
You can do this with a JS UDF and object_construct(*):
create or replace function obj_with_prefix(PREFIX string, A variant)
returns variant
language javascript
as $$
  let result = {};
  // copy only the keys whose names start with the given prefix
  for (key in A) {
    if (key.startsWith(PREFIX))
      result[key] = A[key];
  }
  return result
$$
;
Test:
with data(aa_1, aa_2, bb_1, aa_3) as (
select 1,2,3,4
)
select obj_with_prefix('AA', object_construct(*))
from data
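Applied to the reviews table from the question, it could look roughly like this (a sketch, assuming the UDF above has been created; OBJECT_CONSTRUCT_KEEP_NULL(*) is used so that NULL values such as raw_comments are preserved):
select
  name,
  acting_rating,
  obj_with_prefix('RAW_', object_construct_keep_null(*)) as raw_key_value
from reviews;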

Is it possible to pass an array to a prepared statement on Amazon Redshift?

I need to pass an array to a prepared statement defined on AWS Redshift to filter my query. Since this is not supported in Redshift, I was trying to find a workaround using a Python UDF. Something like:
-- a function to split a comma separated list of values (string is already validated)
CREATE FUNCTION split_str_to_ints (string char) RETURNS int[] IMMUTABLE as $$
def split_stoi(string):
    ints = [int(item) for item in string.split(',')]
    return ints
$$ LANGUAGE plpythonu;
-- a prepared statement to return orders filtered by company id
PREPARE get_orders (char) as
SELECT order.id,
order.company_id,
COUNT(order.id) AS order_count
FROM order
INNER JOIN company ON (order.company_id = company.company_id)
WHERE company.company_id IN (split_str_to_ints($1))
GROUP BY order.id, order.company_id
ORDER BY order_count DESC;
EXECUTE get_orders('1,2,3,4')
But when I want to define the function I get an error saying that array of integers is not a supported plpythonu UDF return type.
Is there any other way to pass a list of integers (or chars) to a prepared statement on Redshift?
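One workaround that stays in plain SQL (no UDF) is the classic delimited-string containment trick; this is only a sketch of the idea, it assumes the ids themselves contain no commas, and the order table name is quoted here because it is a reserved word:
PREPARE get_orders (varchar) AS
SELECT "order".id,
       "order".company_id,
       COUNT("order".id) AS order_count
FROM "order"
INNER JOIN company ON ("order".company_id = company.company_id)
WHERE POSITION(',' || company.company_id::varchar || ',' IN ',' || $1 || ',') > 0
GROUP BY "order".id, "order".company_id
ORDER BY order_count DESC;
EXECUTE get_orders('1,2,3,4');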

Array_agg in postgres selectively quotes

I have a complex database with keys and values stored in different tables. It is useful for me to aggregate them when pulling out the values for the application:
SELECT array_agg(key_name), array_agg(vals)
FROM (
SELECT
id,
key_name,
array_agg(value)::VARCHAR(255) AS vals
FROM factor_key_values
WHERE id=20
GROUP BY key_name, id
) f;
This particular query, in my case, gives the following invalid JSON:
-[ RECORD 1 ]-----------------------------------------------------------------------
array_agg | {"comparison method","field score","field value"}
array_agg | {"{\"text category\"}","{100,70,50,0,30}","{A,B,C,F,\"No Experience\"}"}
Notice that the array of varchars is only quoted if the string has a space. I have narrowed this down to the behaviour of ARRAY_AGG. For completeness here is an example:
BEGIN;
CREATE TABLE test (txt VARCHAR(255));
INSERT INTO test(txt) VALUES ('one'),('two'),('three'), ('four five');
SELECT array_agg(txt) FROM test;
The result will be:
{one,two,three,"four five"}
This is why my JSON is breaking. I can handle unquoted or quoted strings in the application code, but having a mix of both is nuts.
Is there any solution to this?
Can't you use json_agg?
select json_agg(txt) from test;
json_agg
--------------------------------------
["one", "two", "three", "four five"]
Unfortunately, this is the inconsistent standard that PostgreSQL uses for formatting arrays. See "Array Input and Output Syntax" for more information.
Clodoaldo's answer is probably what you want, but as an alternative, you could also build your own result:
SELECT '{'||array_to_string(array_agg(txt::text), ',')||'}' FROM test;
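Applied to the query from the question, the JSON aggregates can be combined into a single valid JSON document, roughly like this (a sketch, assuming Postgres 9.4+ for json_object_agg; table and column names are taken from the question):
SELECT json_object_agg(key_name, vals) AS key_values
FROM (
  SELECT key_name, json_agg(value) AS vals
  FROM   factor_key_values
  WHERE  id = 20
  GROUP  BY key_name
) f;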

Postgres function with jsonb parameters

I have seen a similar post here but my situation is slightly different from anything I've found so far. I am trying to call a Postgres function with parameters that I can use in the function logic for the jsonb query. Here is an example of the query I'm trying to recreate with parameters.
SELECT *
from edit_data
where ( "json_field"#>'{Attributes}' )::jsonb #>
'{"issue_description":"**my description**",
"reporter_email":"**user#generic.com**"}'::jsonb
I can run this query just fine in pgAdmin but all my attempts thus far to run this inside a function with parameters for "my description" and "user@generic.com" values have failed. Here is a simple example of the function I'm trying to create:
CREATE OR REPLACE FUNCTION get_Features(
p1 character varying,
p2 character varying)
RETURNS SETOF edit_metadata AS
$BODY$
SELECT * from edit_metadata where ("geo_json"#>'{Attributes}' )::jsonb @> '{"issue_description":**$p1**, "reporter_email":**$p2**}'::jsonb;
$BODY$
LANGUAGE sql VOLATILE
COST 100
ROWS 1000;
I know that the syntax is incorrect and I've been struggling with this for a day or two. Can anyone help me understand how to best deal with these double quotes around the value and leverage a parameter here?
TIA
You could use function json_build_object:
select json_build_object(
'issue_description', '**my description**',
'reporter_email', '**user@generic.com**');
And you get:
json_build_object
-----------------------------------------------------------------------------------------
{"issue_description" : "**my description**", "reporter_email" : "**user#generic.com**"}
(1 row)
That way there's no way you will input invalid syntax (no hassle with quoting strings) and you can swap the values with parameters.
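Folding that into the function from the question might look roughly like this (a sketch; json_build_object requires Postgres 9.4+, and the table/column names are taken from the question):
CREATE OR REPLACE FUNCTION get_Features(
    p1 character varying,
    p2 character varying)
  RETURNS SETOF edit_metadata AS
$BODY$
  SELECT *
  FROM   edit_metadata
  WHERE  ("geo_json" #> '{Attributes}')::jsonb @>
         json_build_object('issue_description', p1, 'reporter_email', p2)::jsonb;
$BODY$ LANGUAGE sql STABLE;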

SQL Server 2014 - XQuery - get comma-separated List

I have a database table in SQL Server 2014 with only an ID column (int) and a column xmldata of type XML.
This xmldata column contains for example:
<book>
<title>a nice Novel</title>
<author>Maria</author>
<author>Peter</author>
</book>
As expected, I have multiple books, therefore multiple rows with xmldata.
I now want to execute a query for all books where Peter is an author. I tried this in some XPath 2.0 testers and came to the conclusion that:
/book/author/concat(text(), if(position() != last())then ',' else '')
works.
If you try to port this into SQL Server 2014 Express, it looks like this (with correctly escaped syntax etc.):
SELECT id
FROM books
WHERE 'Peter' IN (xmldata.query('/book/author/concat(text(), if(position() != last())then '','' else '''')'))
SQL Server however does not seem to support a construction like /concat(...) because of:
The XQuery syntax '/function()' is not supported.
I am at a loss then, however, as to why /text() would work in:
SELECT id, xmldata.query('/book/author/text()')
FROM books
which it does.
My constraints:
I am bound to use SQL Server
I am bound to XPath or something else that can be "injected" like the statement above (if the structure of the XML or the database changes, the XPath above can be changed in isolation and the application logic that constructs the WHERE clause will not be touched) SEE EDIT
Is there a way to make this work?
regards,
BillDoor
EDIT:
My second constraint boils down to this:
An application constructs the WHERE clause as
expression <operator> value(s)
expression is stored in a database and is mapped by the xmlTag, e.g.:
| tokenname| querystring
| "author" | "xmldata.query(/book/author/text())"
The values are provided by the requesting user. So if the user asks for the author "Peter" with operator "EQUALS", the application constructs:
xmldata.query(/book/author/text()) = "Peter"
as where clause.
If the customer now decides that author needs to be nested in an <authors> element, I can simply change the expression in the construction database and the whole machine keeps running without any changes to the code, simply manageable.
So I need a way to achieve that
<xPath> <operator> "Peter"
or any other combination of these three isolated components (see above: "Peter" IN <xPath>...) gets me all of Peter's books, even if there are multiple unsorted authors.
This would not suffice either (it's not SQL Server syntax, but you get the idea):
WHERE xmldata.exist('/dossier/client[text() = "$1"]', "Peter") = 1;
because the operator is still nested in the expression, so I could not request <> "Peter".
I know this is strange, please don't question the concept as a whole - it has a history :/
EDIT: further clarification:
The filter-rules come into the app in an XML structure basically:
Operator: "EQ"
field: "name"
value "Peter"
evaluates to:
expression = lookupExpressionForField("name") --> "table2.xmldata.value('book/author/name[1]', 'varchar')"
operator = lookUpOperatorMapping("EQ") --> "="
value = FormatValues("Peter") --> "Peter" (if multiple values are passed, FormatValues constructs a comma-separated list)
the application then builds:
- constructClause(String expression,String operator,String value)
"table2.xmldata.value('book/author/name[1]', 'varchar')" + "=" + "Peter"
then constructs a Select statement with the result as WHERE clause.
It does not build it like this, unescaped and unfiltered for injection etc., but this is the basic idea.
I can influence how the input is translated, meaning I can implement the methods:
lookupExpressionForField(String field)
lookUpOperatorMapping(String operator)
Formatvalues(List<String> values) | Formatvalues(String value)
constructClause(String expression,String operator,String value)
However I choose to do it, I can change the parameter types and implement them freely. The less, the better, of course. So simply constructing a comma-separated list with XPath would be optimal (as if I could just tick "enable /function() syntax in XPath" somewhere in SQL Server and the /concat(if...) would work).
How about something like this:
SET NOCOUNT ON;
DECLARE @Books TABLE (ID INT NOT NULL IDENTITY(1, 1) PRIMARY KEY, BookInfo XML);
INSERT INTO @Books (BookInfo)
VALUES (N'<book>
<title>a nice Novel</title>
<author>Maria</author>
<author>Peter</author>
</book>');
INSERT INTO @Books (BookInfo)
VALUES (N'<book>
<title>another one</title>
<author>Bob</author>
</book>');
SELECT *
FROM @Books bk
WHERE bk.BookInfo.exist('/book/author[text() = "Peter"]') = 1;
This returns only the first "book" entry. From there you can extract any portion of the XML field using the "value" function.
The "exist" function returns a boolean / BIT. This will scan through all "author" nodes within "book", so there is no need to concat into a comma-separated list only for use in an IN list, which wouldn't work anyway ;-).
For more info on the "value" and "exist" functions, as well as the other functions for use with XML data, please see:
xml Data Type Methods
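For example, a sketch of pulling the title out of the matching rows with value() (the [1] makes the path a singleton, which value() requires; the NVARCHAR length is just an assumption):
SELECT bk.ID,
       bk.BookInfo.value('(/book/title/text())[1]', 'NVARCHAR(100)') AS BookTitle
FROM @Books bk
WHERE bk.BookInfo.exist('/book/author[text() = "Peter"]') = 1;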
