SQL Server Filtered Index WHERE Column = Column - sql-server

I was hoping to try using a filtered index on a table in SQL Server 2012 to see if it would improve query execution, but when trying to create it I get the following error:
Msg 10735, Level 15, State 1, Line 3
Incorrect WHERE clause for filtered index 'IX_SRReferralIn_Filtered' on table 'dbo.SRReferralIn'.
Below is the statement I am using. RowIdentifier and IDOrganisationVisibleTo are the columns in the CLUSTERED PRIMARY KEY
CREATE NONCLUSTERED INDEX IX_SRReferralIn_Filtered
ON dbo.SRReferralIn(RowIdentifier, IDOrganisationVisibleTo)
WHERE IDOrganisationVisibleTo = IDOrganisation;
Is the expression in the WHERE clause not supported?

No, this is not supported.
The grammar only allows comparisons with constants:
<filter_predicate> ::=
    <conjunct> [ AND <conjunct> ]
<conjunct> ::=
    <disjunct> | <comparison>
<disjunct> ::=
    column_name IN (constant ,...n)
<comparison> ::=
    column_name <comparison_op> constant
<comparison_op> ::=
    { IS | IS NOT | = | <> | != | > | >= | !> | < | <= | !< }
You could create an indexed view with this condition though.
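A minimal sketch of that workaround, assuming the view only needs the key columns (the view and index names here are made up):
CREATE VIEW dbo.vSRReferralIn_VisibleToOwner
WITH SCHEMABINDING
AS
SELECT RowIdentifier, IDOrganisationVisibleTo
FROM dbo.SRReferralIn
WHERE IDOrganisationVisibleTo = IDOrganisation;   -- a column-to-column predicate is allowed here
GO
-- The unique clustered index is what materialises the view and makes it an indexed view.
CREATE UNIQUE CLUSTERED INDEX IX_vSRReferralIn_VisibleToOwner
ON dbo.vSRReferralIn_VisibleToOwner (RowIdentifier, IDOrganisationVisibleTo);
Keep in mind that indexed views come with their own restrictions (SCHEMABINDING, deterministic expressions, and so on), and on SQL Server 2012 the optimizer only matches them automatically on Enterprise edition; on other editions you query the view directly with the NOEXPAND hint.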

Related

Do I really have to retype all the columns in a MERGE statement?

Suppose I have a table PRODUCTS with many columns, and that I want to insert/update a row using a MERGE statement. It is something along these lines:
MERGE INTO PRODUCTS AS Target
USING (VALUES(42, 'Foo', 'Bar', 0, 14, 200, NULL)) AS Source (ID, Name, Description, IsSpecialPrice, CategoryID, Price, SomeOtherField)
ON Target.ID = Source.ID
WHEN MATCHED THEN
-- update
WHEN NOT MATCHED BY TARGET THEN
-- insert
To write the UPDATE and INSERT "sub-statements" it seems I have to specify once again each and every column field. So -- update would be replaced by
UPDATE SET ID = Source.ID, Name = Source.Name, Description = Source.Description...
and -- insert by
INSERT (ID, Name, Description...) VALUES (Source.ID, Source.Name, Source.Description...)
This is very error-prone, hard to maintain, and apparently not really needed in the simple case where I just want to merge two "field sets" each representing a full table row. I appreciate that the update and insert statements could actually be anything (I've already used this in an unusual case in the past), but it would be great if there was a more concise way to represent the case where I just want "Target = Source" or "insert Source".
Does a better way to write the update and insert statements exist, or do I really need to specify the full column list every time?
You have to write the complete column lists.
You can check the documentation for MERGE here. Most SQL Server statement documentation starts with a syntax definition that shows you exactly what is allowed. For instance, the section for UPDATE is defined as:
<merge_matched>::=
    { UPDATE SET <set_clause> | DELETE }
<set_clause>::=
SET
  { column_name = { expression | DEFAULT | NULL }
  | { udt_column_name.{ { property_name = expression
                        | field_name = expression }
                        | method_name ( argument [ ,...n ] ) }
    }
  | column_name { .WRITE ( expression , @Offset , @Length ) }
  | @variable = expression
  | @variable = column = expression
  | column_name { += | -= | *= | /= | %= | &= | ^= | |= } expression
  | @variable { += | -= | *= | /= | %= | &= | ^= | |= } expression
  | @variable = column { += | -= | *= | /= | %= | &= | ^= | |= } expression
  } [ ,...n ]
As you can see, the only options in <set_clause> are individual column assignments. There is no "bulk" assignment option. Lower down in the documentation you'll find that the options for INSERT also require individual expressions (at least in the VALUES clause - you can omit the column names after the INSERT, but that's generally frowned upon).
SQL tends to favour verbose, explicit syntax.
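For reference, here is a sketch of the fully written-out statement for the example above; the column assignments are an assumption about what the insert/update should do:
MERGE INTO PRODUCTS AS Target
USING (VALUES (42, 'Foo', 'Bar', 0, 14, 200, NULL))
    AS Source (ID, Name, Description, IsSpecialPrice, CategoryID, Price, SomeOtherField)
ON Target.ID = Source.ID
WHEN MATCHED THEN
    -- every column except the join key, spelled out one by one
    UPDATE SET Name           = Source.Name,
               Description    = Source.Description,
               IsSpecialPrice = Source.IsSpecialPrice,
               CategoryID     = Source.CategoryID,
               Price          = Source.Price,
               SomeOtherField = Source.SomeOtherField
WHEN NOT MATCHED BY TARGET THEN
    -- the full column list again, both in the column list and in VALUES
    INSERT (ID, Name, Description, IsSpecialPrice, CategoryID, Price, SomeOtherField)
    VALUES (Source.ID, Source.Name, Source.Description, Source.IsSpecialPrice,
            Source.CategoryID, Source.Price, Source.SomeOtherField);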

Usage of " " inside a CONCAT statement in Excel

I'm working on data cleansing of a database and I'm currently in the process of changing upper-case names into proper case. Hence, I'm using Excel to build UPDATE statements like this:
|   | A    | B  | C                | D             |
|---|------|----|------------------|---------------|
| 1 | Name | id | Proper case name | SQL Statement |
| 2 | AAAA | 1  | Aaaa             | =CONCAT("UPDATE table SET Name = "'",C2,"'" WHERE id = ",B2,";") |
| 3 | BBBB | 2  | Bbbb             | =CONCAT("UPDATE table SET Name = "'",C3,"'" WHERE id = ",B3,";") |
The SQL statements should be something like this:
UPDATE table SET Name = 'Aaaa' WHERE id = 1
UPDATE table SET Name = 'Bbbb' WHERE id = 2
I'm finding it difficult to get the apostrophes around the name.
I think you need:
=CONCATENATE("UPDATE table SET Name = '",C2,"' WHERE id = ",B2,";")
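With C2 = Aaaa and B2 = 1, that evaluates to:
UPDATE table SET Name = 'Aaaa' WHERE id = 1;
The apostrophe has to sit inside the double-quoted text literal ("... = '"); the original formula closed the string just before it, which is why Excel would not accept it.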

postgres 9.6 - jsonb & indexing - poor performance

On a table (used by a Django model) I'm using a jsonb column, data, to store arbitrary data fetched from a web service:
abs=# \d data_importer_rawdata;
Table "public.data_importer_rawdata"
Column | Type | Collation | Nullable | Default
-----------------+--------------------------+-----------+----------+---------------------------------------------------
id | integer | | not null | nextval('data_importer_rawdata_id_seq'::regclass)
created | timestamp with time zone | | not null |
modified | timestamp with time zone | | not null |
entity_id | character varying(50)[] | | not null |
entity_id_key | character varying(50)[] | | not null |
service | character varying(100) | | not null |
data | jsonb | | not null |
data_hash | bigint | | not null |
content_type_id | integer | | not null |
last_update | timestamp with time zone | | |
Indexes:
"data_importer_rawdata_pkey" PRIMARY KEY, btree (id)
"data_importer_rawdata_entity_id_service_conten_5fcc60bd_uniq" UNIQUE CONSTRAINT, btree (entity_id, service, content_type_id)
"data_importer_rawdata_content_type_id_63138c35" btree (content_type_id)
"rawdata_data_idx" gin (data jsonb_path_ops)
"rawdata_entity_id_idx" btree (entity_id)
"rawdata_entity_id_key_idx" btree (entity_id_key)
"rawdata_service_idx" btree (service)
Foreign-key constraints:
"data_importer_rawdat_content_type_id_63138c35_fk_django_co" FOREIGN KEY (content_type_id) REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED
The table holds > 1M records.
However, despite various indexing strategies (I followed this blog post), performance is still poor:
abs=# EXPLAIN ANALYZE SELECT
"data_importer_rawdata"."id",
"data_importer_rawdata"."created",
"data_importer_rawdata"."modified",
"data_importer_rawdata"."entity_id",
"data_importer_rawdata"."entity_id_key",
"data_importer_rawdata"."service",
"data_importer_rawdata"."content_type_id",
"data_importer_rawdata"."data",
"data_importer_rawdata"."data_hash",
"data_importer_rawdata"."last_update"
FROM "data_importer_rawdata"
WHERE ("data_importer_rawdata"."data" -> 'object_id')
= '"b8a096da-ff83-47dc-8d22-289ddb46b1c1"';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Seq Scan on data_importer_rawdata (cost=0.00..142508.65 rows=5155 width=856) (actual time=933.902..8240.465 rows=2 loops=1)
Filter: ((data -> 'object_id'::text) = '"b8a096da-ff83-47dc-8d22-289ddb46b1c1"'::jsonb)
Rows Removed by Filter: 1030908
Planning time: 0.158 ms
Execution time: 8240.493 ms
I tried to drop "rawdata_data_idx" and use a BTree index on the single jsonb key object_id, but performance is pretty much the same:
abs=# drop index "rawdata_data_idx";
abs=# CREATE INDEX "rawdata_data_object_ididx"
ON "data_importer_rawdata" USING BTREE ((data->>'object_id'));
abs=# EXPLAIN ANALYZE SELECT
"data_importer_rawdata"."id",
"data_importer_rawdata"."created",
"data_importer_rawdata"."modified",
"data_importer_rawdata"."entity_id",
"data_importer_rawdata"."entity_id_key",
"data_importer_rawdata"."service",
"data_importer_rawdata"."content_type_id",
"data_importer_rawdata"."data",
"data_importer_rawdata"."data_hash",
"data_importer_rawdata"."last_update"
FROM "data_importer_rawdata"
WHERE ("data_importer_rawdata"."data" -> 'object_id')
= '"b8a096da-ff83-47dc-8d22-289ddb46b1c1"';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Seq Scan on data_importer_rawdata (cost=0.00..142508.65 rows=5155 width=856) (actual time=951.522..8318.851 rows=2 loops=1)
Filter: ((data -> 'object_id'::text) = '"b8a096da-ff83-47dc-8d22-289ddb46b1c1"'::jsonb)
Rows Removed by Filter: 1030908
Planning time: 0.311 ms
Execution time: 8318.878 ms
Any suggestions? I'm not sure whether this is typical performance for this kind of task.
Your query execution is slow because the index cannot be used.
To use the index, the expression in the condition must be the same as in the definition of the index, i.e.
WHERE "data_importer_rawdata"."data" ->> 'object_id'
= 'b8a096da-ff83-47dc-8d22-289ddb46b1c1'
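That matches the expression index created in the second attempt (note the text-returning ->> operator and the plain string literal). As a side note, the original GIN index with jsonb_path_ops only supports the containment operator @>, so a query that keeps that index instead would be phrased roughly like this:
WHERE "data_importer_rawdata"."data" @> '{"object_id": "b8a096da-ff83-47dc-8d22-289ddb46b1c1"}'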

Hive lateral view not working AWS Athena

I'm working on AWS CloudTrail log analysis, and I'm getting stuck extracting JSON from a row.
This is my table definition.
CREATE EXTERNAL TABLE cloudtrail_logs (
eventversion STRING,
eventName STRING,
awsRegion STRING,
requestParameters STRING,
elements STRING ,
additionalEventData STRING
)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://XXXXXX/CloudTrail'
If I run select elements from cl1 limit 1 it returns this result.
{"groupId":"sg-XXXX","ipPermissions":{"items":[{"ipProtocol":"tcp","fromPort":22,"toPort":22,"groups":{},"ipRanges":{"items":[{"cidrIp":"0.0.0.0/0"}]},"prefixListIds":{}}]}}
I need to show this result as virtual columns, like:
| groupId | ipProtocol | fromPort | toPort | ipRanges.items.cidrIp |
|---------|------------|----------|--------|-----------------------|
| -1      | 0          |          |        |                       |
I'm using AWS Athena; I tried LATERAL VIEW and get_json_object, but they are not working in Athena.
It's an external table.
select json_extract_scalar(i.item,'$.ipProtocol') as ipProtocol
,json_extract_scalar(i.item,'$.fromPort') as fromPort
,json_extract_scalar(i.item,'$.toPort') as toPort
from cloudtrail_logs
cross join unnest (cast(json_extract(elements,'$.ipPermissions.items')
as array(json))) as i (item)
;
ipProtocol | fromPort | toPort
------------+----------+--------
"tcp" | 22 | 22

Why is that Strange Behaviour with SETOF in PostgreSQL?

I have the following code snippets { CODE#1, CODE#2, CODE#3 } in my database.
CODE#1: the CREATE statement of a table "class_type", which is used like an ENUM in Java.
It contains data such as { "class-A", "class-B", "sports-A", "RED", "BLUE", ... }
Now I am trying to fetch these values using a stored procedure, written in two variants: CODE#2 and CODE#3.
Expected Output of CODE#2 and CODE#3 :
{
| classtype character varying |
| class-A |
| class-B |
| sports-A |
| RED |
| BLUE |
| ....... |
}
What did I find strange?
CODE#2 sometimes returns the expected output and sometimes the "unexpected output" shown below. What is the reason behind this?
CODE#3 works fine and produces the expected output every time.
Unexpected Output of CODE#2 :
{
| get_class_type_list character varying |
| class-A |
| class-B |
| sports-A |
| RED |
| BLUE |
| ....... |
}
The code snippets follow:
CODE#1
{
CREATE TABLE test.class_type
(
value character varying(80) NOT NULL,
is_active boolean NOT NULL DEFAULT true,
sort_order integer,
CONSTRAINT class_type_pkey PRIMARY KEY (value)
)
WITH (
OIDS=FALSE
);
}
CODE#2
{
CREATE OR REPLACE FUNCTION test.get_class_type_list()
RETURNS SETOF character varying AS
$BODY$
DECLARE
SQL VARCHAR;
BEGIN
RETURN QUERY
(SELECT value AS "classType"
FROM TEST.CLASS_TYPE);
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000;
}
CODE#3
{
CREATE OR REPLACE FUNCTION test.get_class_type_list()
RETURNS TABLE(classType character varying) AS
$BODY$
DECLARE
SQL VARCHAR;
BEGIN
RETURN QUERY
(SELECT value AS "classType"
FROM TEST.CLASS_TYPE);
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000;
}
SQL Fiddle Sample Code
Edited:
I want the column name returned by the function to be "classType", not the function name.
The only difference I see in the results is the column name (which is actually an alias). You have full control over column aliases in the calling context: sqlfiddle (and the reason it sometimes behaves differently is that RETURNS SETOF <primitive-type> is a special returning clause).
The rule of thumb is: if you have an alias in the function definition (such as OUT parameters, RETURNS TABLE & RETURNS SETOF <composite/row-type>, but not in the function body itself), PostgreSQL will use that unless there is an explicit alias at the call site. If you use RETURNS SETOF <primitive/simple-type> without OUT parameters, the default alias for that column is the function name.
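A minimal illustration of overriding the alias at the call site, so the SETOF variant (CODE#2) also shows the wanted column name:
SELECT * FROM test.get_class_type_list() AS t("classType");
-- or alias the single output column directly:
SELECT get_class_type_list AS "classType" FROM test.get_class_type_list();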
Note: the reason I didn't post this as an answer the first time is that, unfortunately, I couldn't find any reference for this in the docs.
