TSQL Split Function - Comma Delimited Parameter Values - sql-server

I've been trying to solve an issue where a sproc I'm using passes in user names with commas in them. The part before the comma is a position prefix, for example 'sel, MyName'. Our split function splits on commas, and the value passed in looks something like this: 'sel, MyName, sel, YourName'.
I cannot figure out how to keep the comma inside each name but still split on the comma between names, so I can run the query with where username in (select result from dbo.split(@namestosplit))
I've tried removing the comma and then putting it back, I've tried replacing it temporarily, and I've tried prefixing with the text (removing the prefix from the params passed in).

I figured out a way to do this.
select 'someprefix, ' + Item from Split(replace(@valueToSplit,'sel, ',''), ',')
This ends up forming a select like this:
row 1: sel, Some Person
row 2: sel, Another Person
I essentially removed the prefix so the names could be split, then concatenated it back on for the where ... in (...) piece.
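The same remove, split, re-add idea can be sketched outside the database. The Python below is only an illustration, not the T-SQL from the answer; the function name split_prefixed_names and the default prefix are assumptions, and it assumes every name carries the same prefix.

```python
def split_prefixed_names(value, prefix="sel, "):
    """Split 'sel, A, sel, B' into ['sel, A', 'sel, B']."""
    # Remove every occurrence of the prefix so only the separating commas remain.
    stripped = value.replace(prefix, "")
    # Split on the remaining commas and trim whitespace around each name.
    names = [part.strip() for part in stripped.split(",") if part.strip()]
    # Put the prefix back on each name.
    return [prefix + name for name in names]


print(split_prefixed_names("sel, Some Person, sel, Another Person"))
```

This only works while the names themselves contain no other commas, which is the same limitation the SQL version has.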

Related

Regular expression in snowflake

I have a requirement where the string from a column has a value "/Date(-34905600000)/". The value within the parentheses could be in any one of the following patterns:
"/Date(-34905600000)/"
"/Date(1407283200000)/"
"/Date(1636654411000+0000)/"
I need to extract everything inside the parentheses for examples 1 and 2, including the "-" if any. For the 3rd example, it should be only the digits inside the parentheses before the "+", i.e. 1636654411000.
I tried the following, but I am not getting the expected results: the output comes back with the parentheses included.
select REGEXP_substr("/Date(-34905600000)/", '\\([[:alnum:]\-]+\\)')
from table A;
select REGEXP_substr("/Date(-34905600000)/", '\\((.*?)\\)') from table A;
select REGEXP_substr("/Date(-34905600000)/", '[0-9]+') from table A;
Using regexp_replace() instead you could do:
regexp_replace(colA, '(\\/Date\\()([-0-9]*)(.*)', '\\2')
That splits the string into three substitution groups and then only keeps the second. I often end up doing regexp_replace() with substitution groups like this when regexp_substr() fails me.
If you want REGEXP_SUBSTR to return sub-matches, you need to use the 'e' option in <regex_parameters>. The <group_num> then selects which group is returned; it defaults to 1 when 'e' is given, which matches your first grouping, thus:
SELECT column1,
REGEXP_substr(column1, 'Date\\(([-+]?[0-9]+)',1,1,'e')
FROM VALUES
('"/Date(-34905600000)/"'),
('"/Date(1407283200000)/"'),
('"/Date(1636654411000+0000)/"');
gives:
COLUMN1 | REGEXP_SUBSTR(COLUMN1, 'DATE\(([-+]?[0-9]+)',1,1,'E')
"/Date(-34905600000)/" | -34905600000
"/Date(1407283200000)/" | 1407283200000
"/Date(1636654411000+0000)/" | 1636654411000
I am quite sure the regexp is greedy by default, but otherwise you can force the match to the timezone or paren with
'Date\\(([-+]?[0-9]+)[-+\\)]'
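To check the pattern against the three sample values without a Snowflake session, here is a small Python sketch. It uses Python's re module rather than Snowflake's regex engine, so treat it as an approximation; the helper name extract_epoch is hypothetical.

```python
import re

# Optional sign followed by digits, captured right after "Date(".
DATE_PATTERN = re.compile(r"Date\(([-+]?[0-9]+)")


def extract_epoch(text):
    """Return the captured number as a string, or None if there is no match."""
    match = DATE_PATTERN.search(text)
    return match.group(1) if match else None


samples = [
    '"/Date(-34905600000)/"',
    '"/Date(1407283200000)/"',
    '"/Date(1636654411000+0000)/"',
]
for sample in samples:
    print(sample, "->", extract_epoch(sample))
```

The digit class stops at the "+" of the timezone offset, so the third sample yields only the leading number.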

Extract string after first '/' using snowflake query

I have an input table in Snowflake with a column that contains data in the following patterns:
city, state/LOCATION/designation
city state/LOCATION/designation
city, state/LOCATION
I want to extract only the location and store it in another column. Can you help me do this?
You could use SPLIT_PART, as mentioned in a previous answer, but if you wanted to use regular expressions I would use REGEXP_SUBSTR, like this:
REGEXP_SUBSTR(YOUR_FIELD_HERE,'/([^/]+)',1,1,'e')
To break it down, briefly, it's looking for a slash and then takes all the non-slash characters that follow it, meaning it ends just before the next slash, or at the end of the string.
The 1,1,'e' correspond to: starting at the first character of the string, returning the 1st match, and extracting the substring (everything in the parentheses).
Snowflake documentation is here.
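The behaviour of that pattern on the three input shapes from the question can be sketched in Python. This uses Python's re module, not Snowflake, and the function name extract_location is an assumption made for the example.

```python
import re


def extract_location(value):
    """Return the text between the first slash and the next slash (or end of string)."""
    # A slash, then capture one or more characters that are not a slash.
    match = re.search(r"/([^/]+)", value)
    return match.group(1) if match else None


for row in [
    "city, state/LOCATION/designation",
    "city state/LOCATION/designation",
    "city, state/LOCATION",
]:
    print(row, "->", extract_location(row))
```

Because the capture stops at the next slash or at the end of the string, the two-part and three-part rows are handled by the same expression.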
There are several ways to do this:
A) using the SPLIT_PART function:
SELECT SPLIT_PART('city, state/LOCATION/designation', '/', 2);
Reference: SPLIT_PART
B) using the SPLIT_TO_TABLE tabular function:
SELECT t.VALUE
FROM TABLE(SPLIT_TO_TABLE('city, state/LOCATION/designation', '/')) AS t
WHERE t.INDEX = 2;
Reference: SPLIT_TO_TABLE
C) using REGEXP expressions:
SELECT REGEXP_REPLACE('city, state/LOCATION/designation', '(.*)/(.*)/(.*)', '\\2');
but this one doesn't work if you don't have a third term ('designation'). In that case you need to combine two calls and choose between them by counting the slashes:
SELECT IFF(REGEXP_COUNT('city, state/LOCATION', '/') = 1,
REGEXP_REPLACE('city, state/LOCATION','(.*)/(.*)','\\2'),
REGEXP_REPLACE('city, state/LOCATION','(.*)/(.*)/(.*)','\\2'));
Reference: REGEXP_REPLACE

How to remove last character of last row in tmap using Talend?

I am extracting two columns using textractjson and passing it to a tmap component where I am concatenating both of the columns as single one.
Also, I want a comma at the end of each row except the last row.
I am not sure how to do this.
Example:
There are two columns in tmap as input:
FirstName
LastName
In output, I have concatenated them to:
FirstName+LastName+","
The problem is I don't want the comma in the last row.
This depends on your job layout and input structure.
You seem to use tExtractJSON, which could mean you get the data from a REST call or out of a database.
Since there is no fixed row amount because of the JSON data structure, you wouldn't be able to use ((Integer)globalMap.get("tFileInputDelimited_1_NB_LINE")). Again, since we don't know your job layout, this depends on it.
If this is not the case, I would count the rows in a tJavaRow component first and then add a second tJavaRow where I'd concat the strings (wouldn't do that in the tMap), but omit the last comma. I'd be able to find the last row with the count I did first. This depends on your Java skills.
You may also concatenate all the rows in a global variable using a tJavaRow, with a comma at the end for each row.
Then, start a new subjob (onSubjobOk) then using a tJava, remove the last character (so the last comma).
Pretty simple, don't have to know how many rows from the input flow but supposing you want the result as a single string (here contained in a global variable).
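The accumulate-then-trim idea is independent of Talend, so here is a minimal Python sketch of the logic only. It is not Talend code: the function name join_rows and the sample rows are made up, and the local variable stands in for the global variable.

```python
def join_rows(rows):
    """Concatenate FirstName + LastName + ',' per row, then drop the final comma."""
    buffer = ""  # plays the role of the global variable
    for first_name, last_name in rows:
        # Every row, including the last one, gets a trailing comma here.
        buffer += first_name + last_name + ","
    # Equivalent of the later tJava step: remove the last character if there is one.
    return buffer[:-1] if buffer else buffer


print(join_rows([("John", "Doe"), ("Jane", "Roe")]))
```

The number of rows never has to be known in advance; only the finished string is touched.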
I might be wrong here. Basically, concatenate FirstName and LastName, create a record number column using the Numeric.sequence() function, and store the same sequence number in a context variable (it should end up holding the last value once all rows have passed through). In a tJavaRow component:
output_row.name = input_row.FirstName + input_row.LastName + ",";
output_row.record_number = Numeric.sequence("recNo", 1, 1);
// Reuse the value just generated; calling Numeric.sequence again would advance the counter.
context.lastNumber = output_row.record_number;
Then create a method in a custom Java routine:
// lastNumber is passed in because the job's context object is not in scope inside a routine.
public static String _nameChange(String name, Integer record_number, Integer lastNumber) {
    if (lastNumber != null && lastNumber.equals(record_number)) {
        // Last row: drop the trailing comma.
        return name.substring(0, name.length() - 1);
    }
    return name;
}
Now call the _nameChange method within the tMap component, passing context.lastNumber as the third argument. That trims the last character of the last row.
For reference, check this

How to create a Pipe delimited text file in stored procedure

I need to create a stored procedure that would create a pipe delimited text file based on user requirements.
The table that I will use has only 6 columns with names different from user required fields.
Also, the number of columns that user wants is 23. Some of them we do not have data for. I just need to display them in the text file.
I'm not sure how to display the data lined up under appropriate column while skipping other optional columns.
I think I would need something like this:
OptionalColumn 1|DataColumn 1|OptionalColumn 2|DataColumn 2
12/12/2015 Name 1
12/12/2015 Name 2
Or some other formatting for pipe delimited file.
How would I approach this?
Never done something like this.
You could probably concatenate your fields with a simple select like
SELECT '|' + YOUR_COLUMN_NAME
If you need more elaborate selections, then this Stack Overflow approach may give you ideas: "Comma Separated results in SQL".
The example creates a comma separated list, but the principle is the same. It allows for concatenation of data from multiple rows for a same id.
It seems like you might be able to just use concatenation if your output is always consistent:
SELECT 'col1|col2|col3|coln....'
UNION ALL
SELECT '' + '|' + realcol + '|' + '' + '|' + realcol2 + '...';
This produces a string with headers; each '' stands in for an optional column that has no data.
If you need to discover the structure dynamically, that's a different story and you'll have to use the system tables, but this might be a simple solution. Also, if you have a ton of rows, this is not a great approach. If your app is going to stream entries to a file, it may be best to have your app create the file format, not the database stored proc.
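If the application ends up writing the file, a sketch along these lines shows how empty optional columns can be kept in position. It is written in Python purely for illustration; the column names come from the question's example, and the function name write_pipe_delimited and the sample rows are assumptions.

```python
import csv
import io

# Assumed output layout: every column the user wants, in order.
OUTPUT_COLUMNS = ["OptionalColumn 1", "DataColumn 1", "OptionalColumn 2", "DataColumn 2"]


def write_pipe_delimited(rows, out):
    """Write a header line plus one line per row, using '|' as the delimiter."""
    writer = csv.DictWriter(
        out,
        fieldnames=OUTPUT_COLUMNS,
        delimiter="|",
        restval="",  # columns missing from a row are written as empty fields
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


buffer = io.StringIO()
write_pipe_delimited(
    [
        {"DataColumn 1": "12/12/2015", "DataColumn 2": "Name 1"},
        {"DataColumn 1": "12/12/2015", "DataColumn 2": "Name 2"},
    ],
    buffer,
)
print(buffer.getvalue())
```

Each data row only supplies the columns that exist in the table; the writer fills the rest with empty fields so the values stay lined up under the right headers.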

String manipulation in a column in an Oracle table

One of my table's columns contains names, for example "Obama, Barack" (with the double quotes). I was wondering if we can do something to make it appear as Barack Obama in the table. I think we can do it by declaring a variable, but I could not manage to find a solution.
And yes as this table contains the multiple transactions of the same person we also end up with having multiple rows of "Obama, Barack"... a data warehouse concept (fact tables).
What @Ben has said is correct. Having two columns, one for first name and one for last name, is the right design.
However if you wish to update the entire database as it is you could do...
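/* SQL Server syntax (CHARINDEX, SUBSTRING, LEN); in Oracle the equivalents are INSTR, SUBSTR and LENGTH */
/* This swaps the order, assuming values of the form "Last, First" including the double quotes */
UPDATE TableName SET NameColumn = SUBSTRING(NameColumn, CHARINDEX(',', NameColumn) + 2, LEN(NameColumn) - CHARINDEX(',', NameColumn) - 2) + ' ' + SUBSTRING(NameColumn, 2, CHARINDEX(',', NameColumn) - 2) WHERE NameColumn LIKE '"%, %"'
/* This removes any remaining quotes */
UPDATE TableName SET NameColumn = REPLACE(NameColumn, '"', '')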
Edit:- but as I can't see your data you may have to edit it slightly. But the theory is correct. See here http://www.technoreader.com/SQL-Server-String-Functions.aspx
From the question I assume you want to:
Remove the quotes
Remove the comma
Swap the names
So Regexp_replace is probably your best bet
UPDATE tablename
SET column_name = REGEXP_REPLACE( column_name, '^"(\w+), (\w+)"$', '\2 \1' )
So regexp_replace changes the column value only if it matches the pattern exactly. The parts of the expression are:
^" means it must start with a double quote
(\w+) means immediately followed by a string of 1 or more alphanumeric characters. This string is then saved as the variable \1 because it's the first set of ()
, means immediately followed by a comma and a space
(\w+) means immediately followed by a string of 1 or more alphanumeric characters. This string is then saved as the variable \2 because it's the second set of ()
"$ means immediately followed by a double quote which is the end of the string
\2 \1 is the replacement string, the second saved string followed by a space followed by the first saved string
So anything which does not exactly match these conditions will not be replaced. If you have any leading or trailing spaces, more than one space after the comma, or any of many other variations, the text will not be replaced.
A much more flexible (maybe too flexible) option could be:
UPDATE tablename
SET column_name = REGEXP_REPLACE( column_name, '^\W*(\w+)\W+(\w+)\W*$', '\2 \1' )
This is similar but effectively makes the quotes and the comma optional, and deals with any other leading or trailing punctuation or whitespace.
^\W* means must start with zero or more non-alphanumerics
(\w+)\W+(\w+) means two alphanumeric strings separated by one or more non-alphanumerics. The two strings are saved as described above
\W*$ means must then end with zero or more non-alphanumerics
More info on regexp in oracle is here
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/ap_posix.htm
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14251/adfns_regexp.htm
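To try both patterns on sample values before running an UPDATE, the following Python sketch applies them with re.sub. Python's regex engine is not Oracle's, so this only approximates REGEXP_REPLACE; the names STRICT, FLEXIBLE and swap_name are hypothetical.

```python
import re

# The strict pattern: quoted value, one comma, exactly one space.
STRICT = re.compile(r'^"(\w+), (\w+)"$')
# The flexible pattern: any non-word characters around and between the two names.
FLEXIBLE = re.compile(r"^\W*(\w+)\W+(\w+)\W*$")


def swap_name(value, pattern=STRICT):
    """Return 'First Last' if the value matches the pattern, else the value unchanged."""
    return pattern.sub(r"\2 \1", value)


print(swap_name('"Obama, Barack"'))             # strict pattern matches
print(swap_name('"Obama,  Barack"'))            # two spaces: left unchanged
print(swap_name(' Obama,  Barack ', FLEXIBLE))  # flexible pattern copes with the variation
```

As in the SQL versions, a value that does not match is returned as it was, which makes the strict pattern safe to run over mixed data.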
