I am building an SSIS package that will populate data from an Excel Spreadsheet into our Database for Reporting.
The customer did not provide a separate column for the city and, unfortunately, cannot update their export file to add one, so I am trying to build a city column from the branch names.
I need an SSIS expression (or several) to use in a Derived Column Transformation to pull the city names out of the Branch Name column. The issue is that the spacing and placement of the names vary. I have tried TOKEN, SUBSTRING, RIGHT and LEFT combined with other expressions, and I always seem to cut something off.
Has anyone else run into this, and how can I fix it? (I am not familiar enough with C# to use a Script Component.)
Here is a Sample of the Data that I have.
Branch Name
JS OMAHA - 09
JS SIOUX FALLS - 48
JS DOWNINGTOWN - 53
JS ST PAUL - 70
JS BLOOMINGTON - 103
JS PITTSBURGH NORTH -149-
JS TINTON FALLS - 186
JS BLAINE - 337
JS ROCHESTER MN - 423
Do you have a list of valid cities sitting in a table? If so you can use a lookup transformation.
Let's say your list of cities is in a table called city.
On the General tab, pick No Cache.
On the Connection tab, pick the city table.
On the Columns tab, match the Branch Name column to the city column in your city table.
In the Advanced tab, tick Modify the SQL statement and change the end to where [Branch Name] Like '%' + ? + '%'
Now your lookup will find the closest match and pass it through as an extra column.
Another way is to load it all into a staging table and do an UPDATE, also using LIKE.
Whatever you do, it will help to have a list of valid cities in a table.
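To illustrate the matching idea behind the lookup and the UPDATE ... LIKE approach, here is a minimal Python sketch. It is not SSIS code: the function name find_city and the sample city list are assumptions made for illustration, and it simply picks the longest known city that appears inside the branch name.

```python
def find_city(branch_name, cities):
    """Return the longest known city contained in branch_name, or None."""
    name = branch_name.upper()
    # Same idea as: branch name LIKE '%' + city + '%'
    matches = [city for city in cities if city.upper() in name]
    # Prefer the longest match so a multi-word city beats a shorter partial one.
    return max(matches, key=len) if matches else None


# Hypothetical reference list; in practice this would be the city table.
cities = ["OMAHA", "SIOUX FALLS", "ST PAUL", "ROCHESTER"]

print(find_city("JS SIOUX FALLS - 48", cities))    # SIOUX FALLS
print(find_city("JS ROCHESTER MN - 423", cities))  # ROCHESTER
print(find_city("JS BLAINE - 337", cities))        # None (not in the list)
```

Rows that come back as None are the ones to review by hand or add to the city list.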
The other way is to make an assumption about the tokens in the data and use string functions in a derived column transformation to extract it out, but you can get some unexpected results.
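As a rough sketch of that token-based approach, the Python below assumes every branch name has the shape "prefix, city, dash, number, optional trailing dash", which holds for the sample rows in the question but may not hold for the full file. The pattern and the name extract_city are illustrative assumptions; the same logic could be ported to a Script Component or approximated with string functions.

```python
import re

# Assumed layout: <prefix> <city words> - <digits> with an optional trailing dash.
BRANCH_PATTERN = re.compile(r"^\S+\s+(.*?)\s*-\s*\d+\s*-?\s*$")


def extract_city(branch_name):
    """Return the text between the first token and the trailing '- NN', or None."""
    match = BRANCH_PATTERN.match(branch_name.strip())
    return match.group(1) if match else None


for name in ["JS OMAHA - 09", "JS ST PAUL - 70", "JS PITTSBURGH NORTH -149-"]:
    print(extract_city(name))
# OMAHA
# ST PAUL
# PITTSBURGH NORTH
```

Note that this keeps everything between the prefix and the number, so "JS ROCHESTER MN - 423" yields "ROCHESTER MN"; this is the kind of unexpected result that a list of valid cities would catch.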
I can expand further on these if you wish but I won't waste time if you're never going to return to the question.
Whilst you stated that you are not familiar with Script Components, they are the correct tool for the job. You will get much greater flexibility by using C# (or VB.NET) code to manipulate your strings. There are a number of good tutorials online that show how to use a Script Component, and lots of information about string manipulation in C#.
I really like Snowflake's new Snowsight web console. One minor issue is that all the numeric columns have commas , as thousands separator rather than just outputting the raw number.
For example I have a bunch of UNIX epochs stored in a column called created_time. For debugging purposes I'd like to quickly copy and paste them into a WHERE clause, but I have to manually remove the commas from 1,666,719,883,332 to be 1666719883332.
Sure, it's a minor thing, but doing it several dozen times a day is really starting to add up to minutes.
I realize I could cast the column to a VARCHAR, but I'd rather find a setting that I can turn off for this auto-thousand-separator default behavior.
Does anyone know a way to turn it off?
Here is an example:
create TABLE log (
CREATED_TIME NUMBER(38,0),
MSG VARCHAR(20000)
);
insert into log values (1666719883332, 'example');
select * From log;
which outputs
CREATED_TIME       MSG
1,666,719,883,332  example
Prepare to be amazed! The option to show/hide the 000 separator is on the left corner
I'd like to quickly copy and paste them into a WHERE clause, but I have to manually remove the commas from 1,666,719,883,332 to be 1666719883332.
The way I use it is a preview pane and Copy button:
This is my data table.
I'm writing this formula in OpenOffice, not Excel, which is why you will see ";" instead of ",".
My question: I'm trying to return the currency for each country's capital name. I managed to do it, but I'm unable to nest more than 42 conditions.
Is there another way, or another formula I can use?
Here is the formula I wrote, and it works:
=IF(D3="AMSTERDAM";"EUR";IF(D3="FRANKFURT";"EUR";IF(D3="OSLO";"NOK";IF(D3="COPENHAGEN";"MULTI";IF(D3="ALICANTE";"EUR";IF(D3="BARCELONA";"EUR";IF(D3="BERLIN TXL";"EUR";IF(D3="VILNIUS";"EUR";IF(D3="BRUSSELS";"EUR";IF(D3="CATANIA";"EUR";IF(D3="DUSSELDORF";"EUR";IF(D3="FARO";"EUR";IF(D3="GRAN CANARIA";"EUR";IF(D3="HELSINKI";"EUR";IF(D3="MALAGA";"EUR";IF(D3="MUNICH";"EUR";IF(D3="PARIS CDG";"EUR";IF(D3="RIGA";"EUR";IF(D3="SANTA CRUZ PALMA";"EUR";IF(D3="SEVILLA";"EUR";IF(D3="TENERIFE";"EUR";IF(D3="BUDAPEST";"HUF";IF(D3="ANTALYA";"TRY";IF(D3="GAZIPASA";"TRY";IF(D3="ISTANBUL";"TRY";IF(D3="BERGEN";"NOK";IF(D3="STAVANGER";"NOK";IF(D3="STAVANGER VIA ESBJERG";"NOK";IF(D3="LONDON CITY";"GBP";IF(D3="LONDON LHR";"GBP";IF(D3="LONDON STN";"GBP";IF(D3="MANCHESTER";"GBP";IF(D3="FUERTEVENTURA";"ISK";IF(D3="LANZAROTE";"ISK";IF(D3="PORTO SANTO";"ISK";IF(D3="GLASGOW";"SCP";IF(D3="GDANSK";"PLN";IF(D3="CLUJNAPOCA";"RON";IF(D3="STOCKHOLM";"SEK";IF(D3="PRAGUE";"CZK";""))))))))))))))))))))))))))))))))))))))))
I'd suggest you put a table in another section of your spreadsheet, then use VLOOKUP to match the currency to your city.
=VLOOKUP(D3;Currency_Table;2;FALSE)
This looks up D3 in the first column of the range named Currency_Table and returns the value from the second column of the matching row, which is your currency. The final FALSE requests an exact match.
Or if you want the formula to exist without dependency upon another table you could use something like:
=VLOOKUP(D3;{"AMSTERDAM"\,"EUR";"FRANKFURT"\,"EUR";"OSLO"\,"NOK"; etc...};2;FALSE)
NB: I've added an escape \ before each comma because, judging by your language settings, I'm assuming you are in a locale that uses , as the decimal separator, so you will probably need it in the array for this to work.
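The reason a lookup table scales better than nested IFs is that each new city is one more row rather than one more level of nesting. A minimal Python sketch of the same idea follows; the dictionary holds only a few of the pairs from the question, and the name currency_for is an assumption for illustration.

```python
# A few city -> currency pairs taken from the formula in the question.
CURRENCY_BY_CITY = {
    "AMSTERDAM": "EUR",
    "FRANKFURT": "EUR",
    "OSLO": "NOK",
    "BUDAPEST": "HUF",
    "PRAGUE": "CZK",
}


def currency_for(city):
    """Exact-match lookup; returns "" when the city is unknown, like the final "" in the IF chain."""
    return CURRENCY_BY_CITY.get(city, "")


print(currency_for("OSLO"))    # NOK
print(currency_for("PRAGUE"))  # CZK
print(repr(currency_for("ATLANTIS")))  # ''
```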
I have a column named body(ntext,null). Basically anything in the body of the message will come out as one string of text. See example:
Report Count SITE Type ACCOUNT NUMBER STMT CD COLL SCHEME Previously Touched Resi Aging 98 Cleveland - 609 Former 22449903 1 RQ-1 1160201
I want the result to look like this:
Report Count SITE Type ACCOUNT NUMBER STMT CD
98 Cleveland - 609 Former 22449903 1 RQ-1 1160201
How can I get this output? Would it be easier to do in Excel using VBA versus SQL?
I am not an expert in SQL. I am still learning.
You COULD try to get this out of SQL, but I think most would agree that SQL is not designed for extensive text formatting.
As a DBA, I would steer you towards making those fields discrete if possible using normalization or at the very least having a key/value pair table rather than a blob of text that represents both fields and data.
You could also consider a datatype of XML if you find that you need to store different fields and different responses for each row.
I am facing an issue regarding a significantly large database that I have to reorganize. There are two columns, one consists of the Service Code of an item and next is a column containing the Description of the relevant item. Below is an example:
TSB Trim Booklet
LMN Loading Manual
GLM Grain Loading Manual
etc.
There are a total of 170 different items.
The problem is this: in a different Excel file, there is a column of about 16,000 rows containing only the Descriptions of the items, in mixed order, without the 3-letter Service Code.
How can I link them quickly?
Assumptions: you want to take the service code from file 1 and apply it to the descriptions from file 2 and a single description always has the same service code.
Use the following formula in file 2 (the big one you want to add service codes to)
=INDEX([file1]Sheetname!$A:$A,MATCH([file2]Sheetname!A2,[file1]Sheetname!$B:$B,0))
Where
[file1]Sheetname!$A:$A
is the column with service codes in the file/sheet with both the code and the description
[file2]Sheetname!A2
is the cell with description in the file/sheet with just descriptions
and
[file1]Sheetname!$B:$B
is the column with descriptions in the file/sheet with both the code and the description
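For readers who think more easily in code, the INDEX/MATCH pair above amounts to a reverse lookup from description to service code. The Python sketch below uses the three sample rows from the question; the names code_by_description and attach_codes are assumptions for illustration. Unlike MATCH, a Python dictionary lookup is case-sensitive.

```python
# File 1: (service code, description) pairs from the question.
reference_rows = [
    ("TSB", "Trim Booklet"),
    ("LMN", "Loading Manual"),
    ("GLM", "Grain Loading Manual"),
]

# Invert the table so a description finds its code (the MATCH + INDEX step).
code_by_description = {description: code for code, description in reference_rows}


def attach_codes(descriptions):
    """Pair each description with its service code, or None if it is not in file 1."""
    return [(code_by_description.get(d), d) for d in descriptions]


# File 2: descriptions only, in arbitrary order.
print(attach_codes(["Loading Manual", "Trim Booklet", "Unknown Item"]))
# [('LMN', 'Loading Manual'), ('TSB', 'Trim Booklet'), (None, 'Unknown Item')]
```

The None entries correspond to the #N/A results the formula would return for descriptions that do not match exactly.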
I have a table containing postcodes but there is no validation built in to the entry form so there is no consistency in the way they are stored in the database, sample below:
ID Postcode
001742 B5
001745
001746
001748 DY3
001750
001751
001768 B276LL
001774 B339HY
001776 B339QY
001780 WR51DD
I want to use these postcodes to map the distance from a central point, but before I can do that I need to put them into a valid format and filter out any blank or incomplete postcodes.
I had considered using
left(postcode,3) + ' ' + right(postcode,3)
To correct the formatting but this wouldn't work for postcodes like 'M6 8HD'
My aim is to get the list of postcodes in a valid format, but I don't know how to account for the different lengths of postcode. Is there a way to do this in SQL Server?
As discussed in the comments, sometimes looking at a problem the other way around presents a far simpler solution.
You have a list of arbitrary input provided by users, which frequently doesn't contain the correct spacing. You also have a list of valid postcodes which are correctly spaced.
You're trying to solve the problem of finding the correct place to insert spaces into your arbitrary inputs to make them match the list of valid codes, and this is extremely difficult to do in practice.
However, performing the opposite task - removing the spaces from the valid postcodes - is remarkably easy to do. So that is what I'd suggest doing.
In our most recent round of data modelling, we have modelled addresses with two postcode columns - PostCode containing the postcode as provided from whatever sources, and PostCodeNoSpace, a computed column which strips whitespace characters from PostCode. We use the latter column for e.g. searches based on user input. You may want to do something similar with your list of Valid postcodes, if you're keeping it around permanently - so that you can perform easy matches/lookups and then translate those matches back into a version that has spaces - which is actually a solution to the original question posed!
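The idea can be sketched in a few lines of Python, assuming you have a list of valid, correctly spaced postcodes available. The sample postcodes and the names strip_spaces, build_lookup and normalise are illustrative assumptions; in SQL Server the equivalent would be a join on the space-stripped column.

```python
def strip_spaces(postcode):
    """Remove all whitespace and upper-case, like a PostCodeNoSpace column."""
    return "".join(postcode.split()).upper()


def build_lookup(valid_postcodes):
    """Map each space-stripped valid postcode back to its correctly spaced form."""
    return {strip_spaces(p): p for p in valid_postcodes}


def normalise(raw, lookup):
    """Return the correctly spaced postcode, or None for blank or incomplete input."""
    return lookup.get(strip_spaces(raw))


# Hypothetical slice of the valid postcode list.
lookup = build_lookup(["B27 6LL", "M6 8HD", "WR5 1DD"])

print(normalise("B276LL", lookup))  # B27 6LL
print(normalise("m68hd", lookup))   # M6 8HD
print(normalise("B5", lookup))      # None (incomplete)
print(normalise("", lookup))        # None (blank)
```

Because blanks and partial postcodes simply fail to match, the same step handles both the formatting and the filtering.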