Add a prefix to a string in a SQL server table - sql-server

Hope someone can help with this question. I have a table with image file names stored in it, and some are missing the directory prefix, e.g. without folder info: "some_imagname.jpg", and with the folder: "/photos/another_imagname.jpg".
I would like to run an update on the table so that all image names have the folder prefix added where it is missing.
It's a long story as to how this happened, but suffice to say, I really need to get this updated quite soon.
Many thanks
Hans

A simple update statement with concatenation using the + operator and a filter to exclude the records that are already prefixed:
UPDATE table
SET column = '/photos/' + column
WHERE column NOT LIKE '/photos/%'
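If you want to sanity-check the scope first, a quick preview of the rows that would change (the table and column names below are placeholders for your own):
SELECT image_name
FROM dbo.images
WHERE image_name NOT LIKE '/photos/%';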

With the little information given, I'd say:
UPDATE table
SET name_column = '/photos/' + name_column
WHERE name_column NOT LIKE '/photos%';
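Since this is a one-off fix on live data, it may be worth wrapping the update in a transaction and spot-checking the result before committing (a minimal sketch, again with placeholder table and column names):
BEGIN TRANSACTION;

UPDATE dbo.images
SET image_name = '/photos/' + image_name
WHERE image_name NOT LIKE '/photos/%';

-- Spot-check the data here, then:
COMMIT TRANSACTION;
-- or ROLLBACK TRANSACTION; if the result looks wrong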

Related

How to query multiple JSON document schemas in Snowflake?

Could anyone tell me how to change the Stored Procedure in the article below to recursively expand all the attributes of a json file (multiple JSON document schemas)?
https://support.snowflake.net/s/article/Automating-Snowflake-Semi-Structured-JSON-Data-Handling-part-2
Craig Warman's stored procedure posted in that blog is a great idea. I asked him if it was okay to refactor his code, and he agreed. I've used the refactored version in the field, so I know the SP and how it works well.
It may be possible to modify the SP to work on your JSON. It will depend on whether or not Snowflake types the JSON in your variant column. The way you have it structured, it may not type everything. You can check by running this SQL and seeing if the result set includes all the columns you need:
set VARIANT_TABLE = 'WEATHER';
set VARIANT_COLUMN = 'V';
with MAIN_TABLE as
(
    select * from identifier($VARIANT_TABLE) sample (1000 rows)
)
select distinct
    -- Strips bracket-enclosed array references (like "[0]") from the path and replaces
    -- non-alphanumeric characters with underscores to build a usable column name.
    REGEXP_REPLACE(REGEXP_REPLACE(f.path, '\\[(.+)\\]'), '[^a-zA-Z0-9]', '_') AS path_name,
    typeof(f.value) AS attribute_type,  -- The column datatype inferred by Snowflake.
    path_name AS alias_name             -- A column alias based on the path.
from
    MAIN_TABLE,
    LATERAL FLATTEN(identifier($VARIANT_COLUMN), RECURSIVE=>true) f
where TYPEOF(f.value) != 'OBJECT'
    AND NOT contains(f.path, '[');
Be sure to replace the variables with your own table and column names. If this picks up the type information for the columns in your JSON, then it's possible to modify this SP to do what you need. If it doesn't, but there's a way to modify the query so that it does pick up the columns, that would work too.
If it doesn't pick up the columns, then based on Craig's idea I decided to write type inference for non-variant data (such as strings from CSV log files without type information). Try the SQL above first and see what results you get.
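If the query does list the paths and types you expect, the SP's job is, roughly speaking, to turn each discovered path into a typed column expression. A hand-written equivalent for two hypothetical paths (the path names and types below are illustrative, not taken from your data) would look something like:
select
    GET_PATH(V, 'city.name')::STRING AS city_name,
    GET_PATH(V, 'main.temp')::FLOAT  AS main_temp
from WEATHER;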

SQL Filestream - tying up rows with File Stream files

Hopefully a simple question, to satisfy my curiosity more than anything. I have set up Filestream on our SQL Server. From what I have read, I was expecting the GUID in the database row to match the filestream file name in the file path... but they don't match. Is there any other wizardry going on that I am missing?
Database Table:
where column id is the GUID and FileData is the Filestream column.
When I then go to the location where these BLOBS are being stored, I expect to see these GUIDs as the filenames:
Filestream Files
I am just looking to understand the whole process of how Filestream works. I have done a bit of digging around, so if anyone is able to fill in those gaps for me, that would be great.
It is a bit confusing. But below is what I eventually figured out.
Notice the column circled in red above. The type and name may be different.
DECLARE @BaseDirectory AS varchar(200) = '\\server-name\Instance\DocumentPages\';

SELECT TOP (10)
    @BaseDirectory
        + SUBSTRING(CONVERT(varchar(200), file_stream.PathName(0)),
                    LEN('file_stream\') + CHARINDEX('file_stream\', CONVERT(varchar(200), file_stream.PathName(0)), 1),
                    36)
        + '.' + dp.file_type,
    d.DocumentID
FROM [Document].[dbo].[DocumentPages] AS dp
INNER JOIN [Document].[dbo].[Documents] AS d
    ON d.stream_id = dp.stream_id
ORDER BY d.DocumentID DESC;
This query uses a function called .PathName() directly on the column, which isn't how I've typically known SQL to work (usually, functions take the column value in as a parameter). But it does work!
Because our system is using Availability Groups, I had to find the base directory and ignore what SQL Server was returning.
You may also try file_stream.PathName(0), file_stream.PathName(1), or file_stream.PathName(2) to try and find the base path. And if the path matches, remove the SUBSTRING logic!
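If you want to see what each format returns in your environment, a quick comparison query (using the same file_stream column and table as in the query above) is:
SELECT TOP (1)
    file_stream.PathName(0) AS path_format_0,
    file_stream.PathName(1) AS path_format_1,
    file_stream.PathName(2) AS path_format_2
FROM [Document].[dbo].[DocumentPages];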
Hopefully Helpful!

How to update keys in a JSON array Postgresql

I am using PostgreSQL 9.4.1 and have run into an issue with a column that I need to update. The column is of type JSON with elements in the following format:
["a","b","c",...]
["a","c","d",...]
["c","d","e",...]
etc.
so that each element is a string. It is my understanding that each of these elements is considered a key of the JSON array (please correct me if I am a bit confused here; I haven't ever worked with JSON datatype columns before, so I'm still trying to get a grip on them). There is no actual pattern to these arrays, and their contents depend on user input from somewhere else. My goal is to update any of the arrays that contain a particular element (say "b", for the purpose of explaining my question more thoroughly) and replace the content "b" with, say, "b1". Meaning that:
["a","b","c",...]
would be updated to
["a","b1","c",...]
I have found a few ways listed on this site (I don't currently have the links, but I can find them again if necessary) to update VALUES for a particular KEY, but I haven't found a way mentioned to change the KEY itself. I have already found a way to target the specific rows of interest by doing something similar to:
SELECT *
FROM TableA
WHERE column::jsonb ?| array['b', /* other string elements of interest */]
Any suggestions would be greatly appreciated. Thanks for your help!
So I went ahead and gave that a check (because it looks like it should work, and it's more or less what I ended up doing), but I had already figured out how to do what I was trying to do! What I got was this:
UPDATE TableA
SET column = REPLACE(column::TEXT, 'b', 'b1')::JSON
WHERE column::JSONB ?| ARRAY['b']
And now that I think about it, I probably don't even need the last where condition because the replace won't affect anything that doesn't have 'b' in it. But that worked for me, and it looks like yours probably should too! Thanks for the help!
I wanted to rename a specific key in a JSON array column.
I tried this and it worked on PostgreSQL 9.4:
UPDATE Your_Table_Name
SET Your_Column_Name = replace(Your_Column_Name::TEXT, 'Key_Old_Name', 'Key_New_Name')::json
WHERE Your_Column_Name::jsonb ? 'Key_Old_Name'
Basically, the solution is to go over the list of json_array_elements and, based on the JSON value, replace the target value with the new one using a CASE expression. After that, the new JSON array needs to be rebuilt using array_agg() and to_json() (see the description of aggregate functions in the PostgreSQL documentation).
Possible query can be the following:
-- Sample DDL and JSON data
CREATE TABLE jsontest (data JSON);
INSERT INTO jsontest VALUES ('["a","b","c"]'::JSON);
-- QUERY
WITH result AS (
    SELECT to_json(                                 -- create the updated JSON structure
               array_agg(                           -- build the new array with element "b1"
                   CASE WHEN element::TEXT = '"b"'  -- process array elements to find "b"
                        THEN to_json('b1'::TEXT)
                        ELSE element
                   END)) AS new_json
    FROM jsontest,
         json_array_elements(jsontest.data) AS element
)
UPDATE jsontest SET data = result.new_json FROM result;
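One caveat: as written, the CTE aggregates elements from every row of jsontest into a single array, which is fine for the one-row sample but not for a real table. With multiple rows you would group by a key column; a sketch assuming a hypothetical primary key id:
WITH result AS (
    SELECT id,
           to_json(array_agg(
               CASE WHEN element::TEXT = '"b"'
                    THEN to_json('b1'::TEXT)
                    ELSE element
               END)) AS new_json
    FROM jsontest,
         json_array_elements(jsontest.data) AS element
    GROUP BY id
)
UPDATE jsontest
SET data = result.new_json
FROM result
WHERE jsontest.id = result.id;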

How to Dynamically render Table name and File name in pentaho DI

I have a requirement in which one source is a table and one source is a file, and I need to join them on a column. The problem is that I can do this for one table with one transformation, but I need to do it for multiple sets of files and tables, loading into another set of specific target files, using the same transformation.
Breaking down my requirement more specifically :
Source Table Source File Target File
VOICE_INCR_REVENUE_PROFILE_0 VoiceRevenue0 ProfileVoice0
VOICE_INCR_REVENUE_PROFILE_1 VoiceRevenue1 ProfileVoice1
VOICE_INCR_REVENUE_PROFILE_2 VoiceRevenue2 ProfileVoice2
VOICE_INCR_REVENUE_PROFILE_3 VoiceRevenue3 ProfileVoice3
VOICE_INCR_REVENUE_PROFILE_4 VoiceRevenue4 ProfileVoice4
VOICE_INCR_REVENUE_PROFILE_5 VoiceRevenue5 ProfileVoice5
VOICE_INCR_REVENUE_PROFILE_6 VoiceRevenue6 ProfileVoice6
VOICE_INCR_REVENUE_PROFILE_7 VoiceRevenue7 ProfileVoice7
VOICE_INCR_REVENUE_PROFILE_8 VoiceRevenue8 ProfileVoice8
VOICE_INCR_REVENUE_PROFILE_9 VoiceRevenue9 ProfileVoice9
The table and file names always correspond, i.e. VOICE_INCR_REVENUE_PROFILE_0 should always join with VoiceRevenue0 and the result should be stored in ProfileVoice0. There should be no mismatches in this case. I tried setting variables with the table names and file names, but a variable only takes one value at a time.
All table names and file names are constant. Is there any other way to get around this? Any help would be appreciated.
Try using the "Copy rows to result" step. It will store all the incoming rows (in your case, the table and file names) in memory, and for every row it will execute your transformation. In this way, you can read multiple filenames in one go.
Try reading this link. It's not the exact answer, but it's similar.
I have created a sample here. Please check if this is what is required.
In the first transformation, I read the table names and file names and loaded them into memory. After that I used the Get Variables step to read all the file and table names to generate the output. [Note: I have not used a table input as the source anywhere; instead I used TablesNames. You can replace that with your table input data.]
Hope it helps :)

Hacked SQL Server database need regex

A database belonging to a client of mine was hacked. I am in the process of trying to rebuild the data. The site is running classic ASP with a SQL Server database. I believe I have found where the weak point was for the hackers and have removed that entry point for now.
Every text column in the database was appended with some HTML markup and inline script/js tags.
Here is an example of a field:
all</title><script>
document.write("<style>.aq21{position:absolute;clip:rect(436px,auto,auto,436px);}</style>");
</script>
<div class=aq21>
<a href=http://samedaypaydayloansonlineelqmt.com >same day payday loans online</a>
<a href=http://samedaypaydayloan
This example was in the Users table in the UserRights column. The initial value was all, but then you can see the links that were appended.
I need to write a regex script that will search through all fields in each column of each table in the database and remove this extra markup.
Essentially, if I try to match </table>, then that string and everything that follows it can be replaced with a blank string.
All of these appended strings are the same for each field in the same column. However, there are multiple columns in each table.
This is what I have been doing so far, replacing the hacked part, but a nice regex would probably help me out, though my regex skills.... well suck.
UPDATE [databasename].[db].[databasetable]
set
UserRights = replace(UserRights,'</title><script>document.write("<style>.aq21{position:absolute;clip:rect(436px,auto,auto,436px);}</style>");</script><div class=aq21><a href=http://samedaypaydayloansonlineelqmt.com >same day payday loans online</a><a href=http://samedaypaydayloan','');
Any regex help and/or tips are appreciated.
This is what I ended up doing (big thanks to @Bohemian):
I went through each table and checked which column was affected. Then I ran the following script on each column:
UPDATE [tablename]
set columnname = substring(columnname, 1, charindex('/', columnname)-1)
where columnname like '%</%';
If the column had any markup of its own in it, then I ended up updating those records manually (luckily for me there were only a couple of records).
If anyone has any better solutions, please feel free to comment.
Thanks!
Since the bad stuff starts with a <, and that is an unusual character to typically find, I would use normal text functions, something like this:
update mytable set
    mycol = substring(mycol, 1, charindex('<', mycol) - 1)
where mycol like '%<%';
And methodically do this with every column of every table.
Note that I'm only guessing at the right function to use, since I'm unfamiliar with SQL Server, but you get the idea.
I welcome someone editing the SQL to improve it.
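If many tables and columns are affected, the per-column statements can be generated rather than typed by hand. A hedged sketch that builds one UPDATE per character column from INFORMATION_SCHEMA (review the generated statements before running them; text/ntext columns would need casting first):
SELECT 'UPDATE ' + QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME)
     + ' SET ' + QUOTENAME(COLUMN_NAME) + ' = LEFT(' + QUOTENAME(COLUMN_NAME)
     + ', CHARINDEX(''<'', ' + QUOTENAME(COLUMN_NAME) + ') - 1)'
     + ' WHERE ' + QUOTENAME(COLUMN_NAME) + ' LIKE ''%<%'';'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE IN ('varchar', 'nvarchar', 'char', 'nchar');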
