Remove the delimiters in column value - snowflake-cloud-data-platform

I have a value in Snowflake:
"<p>jeep is back under the tree</p>"
The column is of the VARIANT datatype in the table.
How do I get only jeep is back under the tree? Any suggestions please.
Thanks,

You can use the regexp_replace function to strip most HTML tags:
with T1 as
(
select '<p>jeep is back under the tree</p>'::variant as V
)
select regexp_replace(V, '<[^>]*>', '') as HTML_TAGS_STRIPPED from T1
;
Output:
HTML_TAGS_STRIPPED
jeep is back under the tree
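The same pattern works outside Snowflake; here is a quick Python sketch of the tag-stripping regex (like the SQL above, it is a simple approximation and does not handle attributes containing '>'):

```python
import re

def strip_html_tags(value: str) -> str:
    # Same pattern as the regexp_replace call above: a '<', then any
    # run of characters that are not '>', then the closing '>'.
    return re.sub(r'<[^>]*>', '', value)

print(strip_html_tags('<p>jeep is back under the tree</p>'))
```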

Related

Need help in using REGEXP_REPLACE WITH LISTAGG function

WITH CTE AS (
SELECT
ID,
X.A_NO::INT AS A_NO,
REGEXP_REPLACE(TEXT_COL,'\::abcd::0|:::xyza::',' ') as TEXT_COL
FROM table X
WHERE date = '2022-02-02'
AND T_ID IN('12345','56789')
ORDER BY A_NO
)
SELECT
ID,
LISTAGG (distinct TEXT_COL , ',') WITHIN GROUP (ORDER BY TEXT_COL) AS TEXT_COL_A
FROM CTE
GROUP BY ID;
When I run the above query, I'm getting results as mentioned below:
ID TEXT_COL_A
12345 ,abc_xyz_ecom_data
56789
The value of TEXT_COL_A in the second row is empty. I want to remove the leading comma in the first row and return the second row as NULL in the result. Can anyone guide me on how to achieve this?
The second problem can be tackled with IFNULL:
SELECT 'a' as a, IFNULL(a,'a was null') as b, null as c, IFNULL(c,'c was null') as d;
gives:
A    B    C       D
a    a    NULL    c was null
For the first problem, to remove characters from both ends use TRIM; to trim just the start or just the end, use LTRIM or RTRIM:
SELECT ',,a,,' as a, trim(a,','), ltrim(a,','), rtrim(a,',');
A        TRIM(A,',')    LTRIM(A,',')    RTRIM(A,',')
,,a,,    a              a,,             ,,a
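For comparison, Python's strip family has the same character-set semantics as TRIM/LTRIM/RTRIM (an illustration only, not part of the answer above):

```python
s = ',,a,,'

print(s.strip(','))   # both ends, like TRIM(a, ',')  -> 'a'
print(s.lstrip(','))  # left only, like LTRIM(a, ',') -> 'a,,'
print(s.rstrip(','))  # right only, like RTRIM(a, ',') -> ',,a'
```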

Why is TRY_PARSE so slow?

I have this query that basically returns (right now) only 10 rows as results:
select *
FROM Table1 as o
inner join Table2 as t on t.Field1 = o.Field2
where Code = 123456 and t.FakeData is not null
Now, if I want to parse the field FakeData (which, unfortunately, can contain different types of data, from DateTime to Surname, etc.; i.e. nvarchar(70)) for display and/or filtering:
select *, TRY_PARSE(t.FakeData as date USING 'en-GB') as RealDate
FROM Table1 as o
inner join Table2 as t on t.Field1 = o.Field2
where Code = 123456 and t.FakeData is not null
The query then takes 10x as long to execute.
Where am I going wrong? How can I speed it up?
I can't edit the database; I'm just a customer who reads the data.
The TSQL documentation for TRY_PARSE makes the following observation:
Keep in mind that there is a certain performance overhead in parsing the string value.
NB: I am assuming your typical date format would be dd/mm/yyyy.
The following is something of a shot in the dark that might help. By progressively assessing whether the nvarchar column is a candidate date, it is possible to reduce the number of calls to that function. Note that a data point established in one APPLY can then be referenced in a subsequent APPLY:
CREATE TABLE mytable(
FakeData NVARCHAR(60) NOT NULL
);
INSERT INTO mytable(FakeData) VALUES (N'oiwsuhd ouhw dcouhw oduch woidhc owihdc oiwhd cowihc');
INSERT INTO mytable(FakeData) VALUES (N'9603200-0297r2-0--824');
INSERT INTO mytable(FakeData) VALUES (N'12/03/1967');
INSERT INTO mytable(FakeData) VALUES (N'12/3/2012');
INSERT INTO mytable(FakeData) VALUES (N'3/3/1812');
INSERT INTO mytable(FakeData) VALUES (N'ohsw dciuh iuh pswiuh piwsuh cpiuwhs dcpiuhws ipdcu wsiu');
select
t.FakeData, oa3.RealDate
from mytable as t
outer apply (
select len(FakeData) as fd_len
) oa1
outer apply (
select case when oa1.fd_len > 10 then 0
when len(replace(FakeData,'/','')) + 2 = oa1.fd_len then 1
else 0
end as is_candidate
) oa2
outer apply (
select case when oa2.is_candidate = 1 then TRY_PARSE(t.FakeData as date USING 'en-GB') end as RealDate
) oa3
FakeData                                                   RealDate
oiwsuhd ouhw dcouhw oduch woidhc owihdc oiwhd cowihc       null
9603200-0297r2-0--824                                      null
12/03/1967                                                 1967-03-12
12/3/2012                                                  2012-03-12
3/3/1812                                                   1812-03-03
ohsw dciuh iuh pswiuh piwsuh cpiuwhs dcpiuhws ipdcu wsiu   null
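The same "cheap test before expensive parse" idea can be sketched in Python; the length and separator checks mirror the oa1/oa2 steps above (this is an illustration of the pattern, not the answer's exact logic):

```python
from datetime import date, datetime

def try_parse_date(value: str):
    # Cheap candidate checks first: a dd/mm/yyyy date is at most
    # 10 characters long and contains exactly two '/' separators.
    if len(value) > 10 or value.count('/') != 2:
        return None
    try:
        # en-GB style: day first
        return datetime.strptime(value, '%d/%m/%Y').date()
    except ValueError:
        return None
```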

Extracting all values between special characters in SQL

I have the following values in a SQL Server table:
But I need to build a query whose output looks like this:
I know that I should probably use a combination of SUBSTRING and CHARINDEX, but I have no idea how to do it.
Could you please help me with how the query should look?
Thank you!
Try the following; it may work.
SELECT
offerId,
cTypes
FROM yourTable AS mt
CROSS APPLY
EXPLODE(mt.contractTypes) AS dp(cTypes);
You can use the string_split function:
select t.offerid, trim(translate(tt.value, '[]"', ' ')) as contractTypes
from table t cross apply
string_split(t.contractTypes, ',') tt(value);
The data in each row in the contractTypes column is a valid JSON array, so you may use OPENJSON() with explicit schema (result is a table with columns defined in the WITH clause) to parse this array and get the expected results:
Table:
CREATE TABLE Data (
offerId int,
contractTypes varchar(1000)
)
INSERT INTO Data
(offerId, contractTypes)
VALUES
(1, '[ "Hlavni pracovni pomer" ]'),
(2, '[ "ÖCVS", "Staz", "Prahovne" ]')
Statement:
SELECT d.offerId, j.contractTypes
FROM Data d
OUTER APPLY OPENJSON(d.contractTypes) WITH (contractTypes varchar(100) '$') j
Result:
offerId contractTypes
1 Hlavni pracovni pomer
2 ÖCVS
2 Staz
2 Prahovne
As an additional option, if you want to return the position of the contract type in the contractTypes array, you may use OPENJSON() with default schema (result is a table with columns key, value and type and the value in the key column is the 0-based index of the element in the array):
SELECT
d.offerId,
CONVERT(int, j.[key]) + 1 AS contractId,
j.[value] AS contractType
FROM Data d
OUTER APPLY OPENJSON(d.contractTypes) j
ORDER BY CONVERT(int, j.[key])
Result:
offerId contractId contractType
1 1 Hlavni pracovni pomer
2 1 ÖCVS
2 2 Staz
2 3 Prahovne
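The OPENJSON behaviour can be mimicked in Python with json.loads plus enumerate, which makes the 0-based [key] to 1-based contractId conversion explicit (an illustration using the same sample data):

```python
import json

rows = [
    (1, '[ "Hlavni pracovni pomer" ]'),
    (2, '[ "ÖCVS", "Staz", "Prahovne" ]'),
]

result = []
for offer_id, contract_types in rows:
    # One output row per array element; idx is the 0-based position,
    # matching OPENJSON's [key] column.
    for idx, value in enumerate(json.loads(contract_types)):
        result.append((offer_id, idx + 1, value))

print(result)
```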

SQL to split a column values into rows in Netezza

I have data in the below way in a column. The data within the column is separated by two spaces.
4EG C6CC C6DE 6MM C6LL L3BC C3
I need to split it as below. I tried using REGEXP_SUBSTR to do it, but it looks like it's not in the SQL toolkit. Any suggestions?
1. 4EG
2. C6CC
3. C6DE
4. 6MM
5. C6LL
6. L3BC
7. C3
This has been answered here: http://nz2nz.blogspot.com/2016/09/netezza-transpose-delimited-string-into.html?m=1
Please note the comment at the bottom about the best-performing way to use array functions. I have measured the use of regexp_extract_all_sp() versus repeated regex matches, and the benefit can be quite large.
The examples from nz2nz.blogspot.com are hard to follow. I was able to piece together this method:
with
n_rows as (--update on your end
select row_number() over(partition by 1 order by some_field) as seq_num
from any_table_with_more_rows_than_delimited_values
)
, find_values as ( -- fake data
select 'A' as id, '10,20,30' as orig_values
union select 'B', '5,4,3,2,1'
)
select
id,
seq_num,
orig_values,
array_split(orig_values, ',') as array_list,
get_value_varchar(array_list, seq_num) as value
from
find_values
cross join n_rows
where
seq_num <= regexp_match_count(orig_values, ',') + 1 -- one row for each value in list
order by
id,
seq_num
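The cross-join-to-a-numbers-table trick above can be sketched in plain Python, where enumerate over split() plays the role of array_split/get_value_varchar and the numbers CTE (illustration only):

```python
rows = [('A', '10,20,30'), ('B', '5,4,3,2,1')]

exploded = [
    (row_id, seq + 1, value)  # seq_num is 1-based, as in the SQL
    for row_id, orig_values in rows
    for seq, value in enumerate(orig_values.split(','))
]

print(exploded)
```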

How to compare records in same SQL Server table

My requirement is to compare each column of a row with the same column in the previous row.
Compare row 2 with row 1
Compare row 3 with row 2
Also, if there is no difference, I need to make that column NULL. E.g.: request_status_id of row 3 is the same as that of row 2, so I need to update request_status_id of row 3 to NULL.
Is there a clean way to do this?
You can use the following UPDATE statement, which employs the LAG window function available from SQL Server 2012 onwards:
UPDATE m
SET request_status_id = NULL
FROM #mytable AS m
INNER JOIN (
SELECT payment_history_id, request_status_id,
LAG(request_status_id) OVER(ORDER BY payment_history_id) AS prevRequest_status_id
FROM #mytable ) t
ON m.payment_history_id = t.payment_history_id
WHERE t.request_status_id = t.prevRequest_status_id
EDIT:
It seems the requirement of the OP is to set every column of the table to NULL in case the previous value is the same as the current value. In this case the query becomes a bit more verbose. Here is an example with two columns being set; it can easily be expanded to incorporate any other column of the table:
UPDATE m
SET request_status_id = CASE WHEN t.request_status_id = t.prevRequest_status_id THEN NULL
ELSE T.request_status_id
END,
request_entity_id = CASE WHEN t.request_entity_id = t.prevRequest_entity_id THEN NULL
ELSE t.request_entity_id
END
FROM #mytable AS m
INNER JOIN (
SELECT payment_history_id, request_status_id, request_entity_id,
LAG(request_status_id) OVER(ORDER BY payment_history_id) AS prevRequest_status_id,
LAG(request_entity_id) OVER(ORDER BY payment_history_id) AS prevRequest_entity_id
FROM #mytable ) t
ON m.payment_history_id = t.payment_history_id
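The LAG-based null-out can also be expressed procedurally. A Python sketch (a hypothetical helper, assuming rows arrive already ordered by payment_history_id) makes the key detail explicit: each comparison uses the original previous row, not the already-nulled one:

```python
def null_repeats(rows, cols):
    # rows: list of dicts ordered by payment_history_id.
    # For each column in cols, set the value to None when it equals
    # the value in the (original) previous row.
    out, prev = [], None
    for row in rows:
        new = dict(row)
        if prev is not None:
            for col in cols:
                if row[col] == prev[col]:
                    new[col] = None
        out.append(new)
        prev = row  # keep the unmodified row for the next comparison
    return out
```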
