Iterative UPDATE loop in SQL Server - sql-server

I would really like to find some way to automate the issue I am facing:
A client has had a database attached to their front-end site for a few years now, and until now has been storing certain location information (i.e. county/state data) as numeric codes.
They now would like to replace these values with their corresponding nvarchar values (e.g. instead of having '8' in their County column, they want it to read 'Clermont County', and so on for upwards of 90 separate entries).
I have been provided with a 2-column Excel sheet, one column with the old numeric county code and one with the requested text equivalent. I have imported this into a temp table, but cannot find a fast way of iteratively matching and updating these values.
I don't really want to write a 90-line CASE WHEN expression and type out each county name manually; that opens the door to human error.
Is there something much simpler that I don't know about that I can do here?

I realize that it might be a bit late, but just in case someone else is searching and comes across this answer...
There are two ways to handle this: In Excel, or in SQL Server.
1. In Excel
Create a concatenated string in one of the available columns that meets your criteria, i.e.
=CONCATENATE("UPDATE some_table SET some_field = '",B2,"' WHERE some_field = ",A2)
You can then auto-fill this column all the way down the list, giving you 90 different UPDATE statements that you can copy and paste into a query window and run. Each one will look like:
UPDATE some_table SET some_field = 'MyCounty' WHERE some_field = X
Each one is specific to a single case, so you can run them sequentially and get the desired result, or...
2. In SQL Server
If you can import the data into a table, then all you need to do is write a simple UPDATE query with a JOIN that handles every case, i.e.
UPDATE T1
SET T1.County_Name = T2.Name
FROM Some_Table T1        -- the original table to be updated
INNER JOIN List_Table T2  -- the table imported from the Excel spreadsheet
    ON T1.CountyCode = T2.Code;
In this case, each row of your original Some_Table is joined to the imported data on its county code, and the name field is updated with the name for that same code in the imported data, which gives you the same result as the Excel option, minus a bit of typing.
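Before running the UPDATE, it can be worth dry-running the join to confirm that every code has a match; a minimal sketch, using the same hypothetical table and column names as above:
SELECT DISTINCT T1.CountyCode
FROM Some_Table T1
LEFT JOIN List_Table T2
    ON T1.CountyCode = T2.Code
WHERE T2.Code IS NULL;  -- any rows returned are codes missing from the imported list
If this returns no rows, the UPDATE will cover every record.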

Related

json_extract a data set that appears in the field multiple times

So I am working on a game server and I have a player inventory that is stored as a JSON field. I am trying to create a query that pulls two specific keys out of the field, but those keys repeat. For example, two of the keys are name: and amount:, and the same field repeats them as many times as there are items in the inventory table.
So, for example, here I query those two keys for a specific vehicle trunk identified by its plate in the database. What I get back is NULL, NULL.
SELECT json_extract(items, '$."amount"') AS amount, json_extract(items, '$."name"') AS name FROM trunkitems WHERE plate='6DV689SW';
What I need it to do is return an expanding table for just those two data points.
The proper solution is to use JSON_TABLE(), but this function is not implemented in MariaDB until version 10.6. You said you are currently using MariaDB 10.3, so you'll have to upgrade to use this solution.
SELECT j.amount, j.name
FROM trunkitems
CROSS JOIN JSON_TABLE(items, '$[*]' COLUMNS(
    amount INT PATH '$.amount',
    name VARCHAR(20) PATH '$.name'
)) AS j
WHERE j.name IS NOT NULL;
If you can't upgrade, then I recommend storing the data in normal rows and columns, and not using JSON.
In fact, I usually recommend avoiding JSON in MySQL or MariaDB in almost all cases. It just makes queries harder to write, maintain, and optimize, and it makes the data take more storage space.
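As a rough illustration of that normalized approach, a minimal sketch (table and column names are hypothetical, chosen to mirror the trunk example above):
CREATE TABLE trunk_items (
    plate  VARCHAR(10) NOT NULL,  -- vehicle the trunk belongs to
    name   VARCHAR(20) NOT NULL,  -- item name
    amount INT         NOT NULL,  -- item quantity
    PRIMARY KEY (plate, name)
);

-- The original query then becomes a plain SELECT, no JSON functions needed:
SELECT amount, name FROM trunk_items WHERE plate = '6DV689SW';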

SSIS - combine results only if key doesn't exist in first dataset

I am trying to combine two inventory sources with SSIS. The first contains inventory information from our new system, while the second contains legacy data. I am getting the data from the sources just fine.
Both data sets have the same columns, but I only want to get the results from the second data set if the ItemCode value for that record doesn't exist in the first data set.
Which transform would I need to use to achieve this?
Edit - here is what I have so far in my data flow.
I need to add a transform to the Extract Legacy Item Data source so that it will remove records whose item codes already exist in the Extract New Item Data source.
The two sources are on different servers, so I cannot solve this by amending the query. I would also like to avoid re-running the same query that the Extract New Item Data source already runs.
If both sources are SQL databases stored on the same server, you can use a SQL command as the source to achieve that:
SELECT Inventory2.*
FROM Inventory2
LEFT JOIN Inventory1
    ON Inventory2.ItemCode = Inventory1.ItemCode
WHERE Inventory1.ItemCode IS NULL
or, equivalently:
SELECT *
FROM Inventory2
WHERE NOT EXISTS (SELECT 1 FROM Inventory1 WHERE Inventory2.ItemCode = Inventory1.ItemCode)
An example of this is below. Using a SQL Server Destination will work fine; however, it only allows loading to a local SQL Server instance, something you may want to consider for the future. While a Lookup typically performs better, a Merge Join can be beneficial in certain circumstances, such as when many additional columns are introduced into the Data Flow, as may happen with your data sets. It looks like @Hadi has covered how to do this with a Lookup, so you may want to test both approaches in a non-production environment that mimics prod, then assess the results to determine the better option.
Start off by creating a staging table which is an exact clone of one of the tables. Either table will work, since they have the same definition. Make sure all columns in the staging table allow null values.
Add an Execute SQL Task that clears the staging table before the Data Flow Task, either by truncating it or by dropping and re-creating it.
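A minimal sketch of what that Execute SQL Task might run (the staging table name is hypothetical):
TRUNCATE TABLE dbo.ItemStaging;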
Since ItemCode is unique, sort on this column in each OLE DB Source. If you haven't already, change the Data Access Mode to SQL command in both OLE DB Sources and add an ORDER BY clause on ItemCode. Then tell SSIS the data is sorted: right-click each OLE DB Source and go to Show Advanced Editor > Input and Output Properties > OLE DB Source Output > Output Columns, select ItemCode, and set the SortKeyPosition property to 1 (assuming you sort ascending in the SQL statement).
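For example, each source query might look like this (the table and non-key column names are placeholders for your own):
SELECT ItemCode, ItemName, Quantity  -- plus any other columns needed downstream
FROM dbo.Inventory
ORDER BY ItemCode;  -- must match the SortKeyPosition set on ItemCode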
Next add a Merge Join to the Data Flow Task. A Merge Join requires both inputs to be sorted, which is why the sources were sorted above. You can do this either way, but for this example use the OLE DB Source whose rows should only be used when the ItemCode does not exist in the other set as the merge join's left input. Use a left outer join with the ItemCode column as the join key, connecting the columns by dragging a line from one to the other in the GUI. Add all the columns from the OLE DB Source that you want to use when the same ItemCode is in both data sets (from what I could tell this is Extract New Item Data; adjust if it isn't) by checking the check-box next to them in the Merge Join editor. Use an output alias prefix that will help you distinguish these, for example X_ItemCode for the matching rows.
After the Merge Join, add a Conditional Split. This will divide the records based on whether X_ItemCode was found. For the expression of the first output, use the ISNULL function to test whether there was a match from the left outer join. For example, ISNULL(X_ItemCode) != TRUE indicates that the ItemCode exists in both data sets. You can call this output Matching Rows. The default output will contain the non-matches; to make them easier to distinguish, you can rename the default output Non-Matching Rows.
Connect the Matching Rows output to a SQL Server Destination loading the staging table. In this mapping, only map the columns from the source you want to use when the ItemCode exists in both data sets, i.e. the X_-prefixed columns such as X_ItemCode.
Add another SQL Server Destination in the Data Flow, also loading the staging table, and connect the Non-Matching Rows output to it, with all the columns mapped from the rows that did not match, i.e. the ones without the X_ prefix in this example.
Back on the Control Flow in the package, add another Data Flow Task after this one. Use the staging table as the OLE DB Source and the destination table as the SQL Server Destination. Sorting isn't necessary here.
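If you would rather do that final step in an Execute SQL Task instead of a second Data Flow, a minimal sketch (table and column names are hypothetical):
INSERT INTO dbo.Items (ItemCode, ItemName, Quantity)
SELECT ItemCode, ItemName, Quantity
FROM dbo.ItemStaging;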
First of all, concerning your use of a SQL Server Destination, I suggest reading the following answer from the SSIS guru @billinkc:
Should SSIS packages and SQL database be on same server?
I will provide different methods to achieve that:
(1) Using Lookup transformation
Add a Data Flow Task, with the second (legacy) inventory as the source.
Add a Lookup transformation and select the first inventory source as the lookup table.
Map the source and lookup table on the ItemCode column.
In the Lookup transformation, select Redirect rows to no match output from the drop-down list.
Use the Lookup's no match output to get the desired rows (those not found in the first inventory source).
You can refer to the link below; it contains a step-by-step tutorial.
Helpful link
UNDERSTAND SSIS LOOKUP TRANSFORMATION WITH AN EXAMPLE STEP BY STEP
Old Versions of SSIS
If you are using an old version of SSIS, you will not find the Redirect rows to no match output drop-down list. Instead, go to the Lookup's Error Output, select the Redirect Row option for the No Match situation, and use the error output to get the desired rows.
(2) Using Linked Servers
On the second inventory's server, create a Linked Server so you can connect to the first server. You will then be able to use a SQL command that selects only the rows not found in the first source:
SELECT *
FROM Inventory2
WHERE NOT EXISTS (SELECT 1
                  FROM <Linked Server>.<database>.<schema>.Inventory1 Inv1
                  WHERE Inventory2.ItemCode = Inv1.ItemCode)
(3) Staging table + MERGE, MERGE JOIN, or UNION ALL transformation
On each source's SQL command, add a fixed-value column that contains the source ID (1 or 2), for example:
SELECT *, 1 AS SourceID FROM Inventory
You can combine both sources into one staging destination using one of the transformations listed above, then add a second Data Flow Task to import the distinct data from the staging table into the destination, deduplicating on the ItemCode column, for example:
SELECT *
FROM (
    SELECT *, ROW_NUMBER() OVER(PARTITION BY ItemCode ORDER BY SourceID) rn
    FROM StagingTable
) s
WHERE s.rn = 1
This will return all rows from SourceID = 1 plus only the new rows from SourceID = 2.
To learn more about Merge, Merge Join and UNION ALL transformation you can refer to one of the following links:
Learn SSIS : MERGE, MERGE JOIN and UNION ALL
SSIS Do I Union All or Merge??
Using the SSIS Merge Join
How to get unmatched data between two sources in SSIS Data Flow?
Note: check the answer provided by @userfl89; it contains very detailed information about using the Merge Join transformation and describes another approach that can help. You now have to test which approach best fits your needs. Good luck!

Substituting Column Values From Other Columns On The Fly

I want the contents of a query to land in an Excel sheet; the data flow is already set up in SSIS.
The query is trying to produce a set of values in one column using values from other columns in the same table.
Example:
CASE WHEN col_date1 > 1900 THEN col_date1
     WHEN col_date1 = '' AND col_date2 LIKE '18%' THEN col_date3
     ELSE col_date4
END
Usually the col_date values would be hardcoded strings, but in my case I want them to be values from the other columns, applied dynamically as the query runs and lands the output in Excel.
I don't think an UPDATE statement works here, because that would permanently change the contents of the table, which I'm trying to avoid. The changes should take effect on the fly and land in Excel while leaving the SQL Server table unchanged.
Perhaps a left join somehow, but I'm not sure that works either.
Any ideas?
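For reference, a plain SELECT that uses the CASE expression as a derived column behaves exactly this way: the substitution is computed in the result set without touching the stored data. A minimal sketch, reusing the hypothetical column names from the example above (the table name is a placeholder):
SELECT col_date1, col_date2,
       CASE WHEN col_date1 > 1900 THEN col_date1
            WHEN col_date1 = '' AND col_date2 LIKE '18%' THEN col_date3
            ELSE col_date4
       END AS derived_date  -- computed on the fly; the base table is unchanged
FROM dbo.your_table;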

Mule - Record cannot be mapped as it contains multiple columns with the same label

I need to do a join query to a MS SQL Server 2014 DB based on a column name value. The same query runs fine when querying the DB directly, but when running it through Mule I get an error. The query looks something like this:
SELECT * FROM sch.emple JOIN sch.dept on sch.emple.empid = sch.dept.empid;
The above query works fine when run directly against the MS SQL Server DB, but gives the following error through MuleSoft:
Record cannot be mapped as it contains multiple columns with the same label. Define column aliases to solve this problem (java.lang.IllegalArgumentException). Message payload is of type: String
Please help me out.
Specify the column list explicitly:
SELECT e.<col1>, e.<col2>, ...., d.<col1>,...
FROM sch.emple AS e
JOIN sch.dept AS d
ON e.empid = d.empid;
Remarks:
You can use aliases instead of schema.table_name.
SELECT * in production code is bad practice in 95% of cases.
The duplicated column is empid (there may be more). You can add an alias for each occurrence, e.g. e.empid AS emple_empid and d.empid AS dept_empid, or just specify e.empid once (see the sketch below).
To avoid specifying all columns manually, you can drag and drop them from Object Explorer into the query pane, as in Drag and Drop Column List into query window.
A second way is to use a plugin like Redgate SQL Prompt to expand SELECT *:
Image from: https://www.simple-talk.com/sql/sql-tools/sql-server-intellisense-vs.-red-gate-sql-prompt/
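As a quick sketch of that aliasing fix (empid comes from the query above; the other column names are hypothetical):
SELECT e.empid AS emple_empid,  -- aliased so the two empid columns no longer collide
       d.empid AS dept_empid,
       e.empname,               -- hypothetical non-colliding columns
       d.deptname
FROM sch.emple AS e
JOIN sch.dept AS d
    ON e.empid = d.empid;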
Addendum
But the same query works directly.
It works because you don't bind the columns. Please read carefully the link I provided about the SELECT * antipattern, especially this part:
Binding Problems
When you SELECT *, it's possible to retrieve two columns of the same name from two different tables. This can often crash your data consumer. Imagine a query that joins two tables, both of which contain a column called "ID". How would a consumer know which was which? SELECT * can also confuse views (at least in some versions of SQL Server) when underlying table structures change -- the view is not rebuilt, and the data which comes back can be nonsense. And the worst part of it is that you can take care to name your columns whatever you want, but the next guy who comes along might have no way of knowing that he has to worry about adding a column which will collide with your already-developed names.
by Dave Markle

How do you get an SSIS package to only insert new records when copying data between servers

I am copying some user data from one SqlServer to another. Call them Alpha and Beta. The SSIS package runs on Beta and it gets the rows on Alpha that meet a certain condition. The package then adds the rows to Beta's table. Pretty simple and that works great.
The problem is that I only want to add new rows into Beta. Normally I would just do something simple like....
INSERT INTO BetaPeople
SELECT * From AlphaPeople
where ID NOT IN (SELECT ID FROM BetaPeople)
But this doesn't work in an SSIS package. At least I don't know how, and that is the point of this question. How would one go about doing this across servers?
Your example seems simple; it looks like you are only adding new people, not looking for changed data in existing records. In that case, store the last transferred ID in the DB.
CREATE TABLE dbo.LAST (RW int, LastID int)
GO
INSERT INTO dbo.LAST (RW, LastID) VALUES (1, 0)
Now you can use this to record the ID of the last row transferred:
UPDATE dbo.LAST SET LastID = @myLastID WHERE RW = 1
In the OLE DB source, set the data access mode to SQL Command and use:
DECLARE @Last int
SET @Last = (SELECT LastID FROM dbo.LAST WHERE RW = 1)
SELECT * FROM AlphaPeople WHERE ID > @Last;
Note: I do assume that you are using ID int IDENTITY for your PK.
If you also have to monitor for data changes in existing records, then add a "last changed" column to every table and store the time of the last transfer.
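A rough sketch of that variant (the LastChanged and LastRunTime columns are hypothetical additions to the tables above):
DECLARE @LastRun datetime
SET @LastRun = (SELECT LastRunTime FROM dbo.LAST WHERE RW = 1)
SELECT * FROM AlphaPeople WHERE LastChanged > @LastRun;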
A different technique would involve setting up a linked server on Beta pointing to Alpha and running your example query without using SSIS. I would expect this to be slower and more resource-intensive than the SSIS solution.
INSERT INTO dbo.BetaPeople
SELECT * FROM [Alpha].[myDB].[dbo].[AlphaPeople]
WHERE ID NOT IN (SELECT ID FROM dbo.BetaPeople)
Add a Lookup between your source and destination.
Right-click the Lookup box to open the Lookup Transformation Editor.
Choose [Redirect rows to no match output].
Open Columns and map your key columns.
Add an entry with the table key in the lookup column, lookup operation as
Connect the Lookup box to the destination, choosing [Lookup No Match Output].
The simplest method I have used is as follows:
Query Alpha in a source task in a Data Flow and bring the records into the data flow.
Perform any needed transformations.
Before writing to the destination (Beta), perform a Lookup matching the ID column from Alpha to those in Beta. On the first page of the Lookup Transformation Editor, make sure you select "Redirect rows to no match output" from the dropdown list "Specify how to handle rows with no matching entries".
Link the Lookup to the destination. This will give you a prompt where you can specify that it is the unmatched rows that you want to insert.
This is the classical delta detection issue. The best solution is to use Change Data Capture, with or without SSIS. If what you are looking for is a once-in-a-lifetime activity, there is no need to go for SSIS; use other means, such as a linked server, and compare with the existing records.
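For reference, enabling Change Data Capture on the source comes down to two system procedures; a minimal sketch, using the table name from this question (the database name is hypothetical):
USE AlphaDB;  -- hypothetical name of the database on the source server
EXEC sys.sp_cdc_enable_db;        -- enable CDC at the database level
EXEC sys.sp_cdc_enable_table
     @source_schema = N'dbo',
     @source_name   = N'AlphaPeople',
     @role_name     = NULL;       -- NULL: no gating role required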
The following should solve the issue of loading changed and new records using SSIS:
Extract data from the source using a Data Flow.
Extract data from the target.
Match on the primary key, then split the records into matched and unmatched records from the source and matched records from the target; call them Matched_Source, Unmatched_Source, and Matched_Target.
Compare Matched_Source and Matched_Target and split Matched_Source into Changed and Unchanged.
Null-load (truncate) the TempChanged table.
Add the Changed records to TempChanged.
Execute a SQL script/stored proc to delete the records from the target whose primary key is in TempChanged, then add the records in TempChanged to the target (see the sketch after this list).
Add Unmatched_Source to the target.
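A minimal sketch of what that script might look like (table and key names are hypothetical):
DELETE t
FROM dbo.Target t
WHERE EXISTS (SELECT 1 FROM dbo.TempChanged c WHERE c.ID = t.ID);

INSERT INTO dbo.Target
SELECT * FROM dbo.TempChanged;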
Another solution would be to use a temporary table.
In the properties of Beta's connection manager, change RetainSameConnection to true (by default SSIS runs each query in its own connection, which would mean the temporary table would be dropped as soon as it had been created).
Create an Execute SQL Task using Beta's connection and use the following SQL to create your temporary table:
SELECT TOP 0 *
INTO ##beta_temp
FROM BetaPeople
Next create a Data Flow that pulls data from Alpha and loads it into ##beta_temp (you will need to run the SQL statement above in SSMS first so that Visual Studio can see the table at design time, and you will also need to set the DelayValidation property to true on the Data Flow Task).
Now you have both tables on the same server, and you can just use your example SQL modified to use the temporary table:
INSERT INTO BetaPeople
SELECT * FROM ##beta_temp
WHERE ID NOT IN (SELECT ID FROM BetaPeople)
