Is it possible to pass columns through an SSIS script transformation? - sql-server

I have a source with 100+ columns.
I need to pass these through a script component transformation, which alters only a handful of the columns.
Is there a simple way to allow any columns I don't modify to simply pass through the transformation?
Currently, I have to pass them all in and assign them to the output with no changes.
This is a lot of work for 100 columns, and I'd rather not have to do it if possible!
FYI:
There is no unique key, so I cannot split out the records using a multicast and merge them after the script component.

You actually have to choose which columns you want included in your script component, as either read-only or read/write.
Anything you do not select as read/write simply passes through.
There are other things you can do with a script component, like adding an output column to your current data flow or even creating a separate data flow output.
In your case, you will want to select the handful of columns that you want to alter as read/write, then modify those columns in the script; the rest will just pass through.

Related

Where is the option to load CSV into Snowflake? I'm not seeing it

I'm testing out a trial version of Snowflake. I created a table and want to load a local CSV called "food" but I don't see any "load data" option as shown in tutorial videos.
What am I missing? Do I need to use a PUT command somewhere?
I don't think Snowsight has that option in the UI; it's available in the classic UI though. Go to the Databases tab and select a database, then go to the Tables tab and select a table; the option will be at the top.
If the classic UI is limiting you or you are already using Snowsight and don't want to switch back, then here is another way to upload a CSV file.
A prerequisite is that you have SnowSQL installed on your device (https://docs.snowflake.com/en/user-guide/snowsql-install-config.html).
Start SnowSQL and perform the following steps:
Use the database you want to upload the file to. You need various privileges for creating a stage, a file format, and a table. E.g. USE MY_TEST_DB;
Create the file format you want to use for uploading your CSV file. E.g.
CREATE FILE FORMAT "MY_TEST_DB"."PUBLIC".MY_FILE_FORMAT TYPE = 'CSV';
If you don't configure the RECORD_DELIMITER, the FIELD_DELIMITER, and other options, Snowflake uses defaults. I suggest you have a look at https://docs.snowflake.com/en/sql-reference/sql/create-file-format.html. Some of the auto-detection can make your life hard, and sometimes it is better to disable it.
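For example, a more explicit file format might look like this (the option values are illustrative assumptions, not recommendations):
CREATE FILE FORMAT "MY_TEST_DB"."PUBLIC".MY_FILE_FORMAT
  TYPE = 'CSV'
  FIELD_DELIMITER = ','                -- assumes comma-separated fields
  RECORD_DELIMITER = '\n'
  SKIP_HEADER = 1                      -- assumes one header row
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  EMPTY_FIELD_AS_NULL = TRUE;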
Create a stage using the previously created file format:
CREATE STAGE MY_STAGE file_format = "MY_TEST_DB"."PUBLIC".MY_FILE_FORMAT;
Now you can put your file to this stage
PUT file://<file_path>/file.csv @MY_STAGE;
You can find documentation for configuring the stage at https://docs.snowflake.com/en/sql-reference/sql/create-stage.html
You can check the upload with
SELECT d.$1, ..., d.$N FROM @MY_STAGE/file.csv d;
Then, create your table.
CREATE TABLE MY_TABLE (col1 varchar, ..., colN varchar);
Personally, I prefer creating a table with only varchar columns first and then creating a view or a table with the final types. I love the try_to_* functions in Snowflake (e.g. https://docs.snowflake.com/en/sql-reference/functions/try_to_decimal.html).
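As a sketch of that two-step pattern, assuming MY_TABLE was loaded as all-varchar and that col1/col2 are the columns to convert (the names and target types are illustrative):
CREATE VIEW MY_TABLE_TYPED AS
SELECT
  TRY_TO_DECIMAL(col1) AS col1,  -- returns NULL instead of failing on bad values
  TRY_TO_DATE(col2)    AS col2
FROM MY_TABLE;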
Then, copy the content from your stage to your table. If you want to transform your data at this point, you have to use an inner SELECT. If not, then the following command is enough:
COPY INTO MY_TABLE FROM @MY_STAGE/file.csv;
I suggest doing this without the inner SELECT because then the option ERROR_ON_COLUMN_COUNT_MISMATCH works.
Be aware that the schema of the table must match the format. As mentioned above, if you go with all columns as varchars first and then transform the columns of interest in a second step, you should be fine.
You can find documentation for copying the staged file into a table at https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html
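If you do need to transform during the load, the inner-SELECT variant looks roughly like this sketch (the column positions and functions are illustrative assumptions):
COPY INTO MY_TABLE FROM (
  SELECT
    d.$1,                  -- pass the first column through unchanged
    UPPER(d.$2),           -- example transformation
    TRY_TO_DECIMAL(d.$3)   -- example type conversion
  FROM @MY_STAGE/file.csv d
);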
You can check the dropped lines as follows:
SELECT error, line, character, rejected_record FROM table(validate("MY_TEST_DB"."MY_SCHEMA"."MY_CSV_TABLE", job_id=>'xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'))
Details can be found at https://docs.snowflake.com/en/sql-reference/functions/validate.html.
If you want to add those lines to your success table, you can copy the dropped lines to a new table and transform the data until the schema matches the schema of the success table. Then, you can UNION both tables.
You can see that it is quite a lot of work to load a simple CSV file into Snowflake. It becomes even more complicated when you take into account that every step can cause specific failures and that your file might contain erroneous lines. This is why my team and I are working at Datameer to make these types of tasks easier. We aim for a simple drag-and-drop solution that does most of the work for you. We would be happy if you would try it out here: https://www.datameer.com/upload-csv-to-snowflake/

Can I fix a lookup field in an Access database based upon a calculation or another field?

I am trying to create a database with field descriptions for a very large Excel file that I have at work. I have created 3 tables: a list of sheets, a list of variables (including a lookup field pointing at the list-of-sheets table, so that I can select the sheet to which the variable belongs), and a third table which specifies some validation rules.
In this third table, I want to see two lookup fields, one specifying the sheet in which the rule applies (say, 'Select Sheet'), and another specifying the variable (say, 'Select Variable'). I can point to the two different tables, but I want to do something a bit more nuanced than that. When I give a particular sheet name to 'Select Sheet', I want the lookup field for the variable ('Select Variable') to show me only those variables which exist in that sheet.
I know that there probably are solutions using forms, but this database is going to be very detailed and there are things to do afterwards, so I do not want to get into queries and forms before all data has been recorded in tables in a neat manner.
I have a good grasp of VBA in the context of Excel, and I am given to understand that I can extend the applications of Access using VBA. I am ready to do that, but first I want to see whether this is some functionality of Access that I am missing. Has anyone done anything similar before, and if so, did it take VBA to do it?

SSRS Dynamic Reports for Key Value Pairs

I need to use SSRS to create many different reports, and I have been trying to find the best way to easily create them as needed, and for users to navigate and use them for their needs.
To give you an idea of the two sets of data I am dealing with:
EDI file from our customer
Raw data output from hardware configuration
Now the EDI data is fairly consistent, so these columns are static.
The hardware data is usually a massive list of different configurations. I receive them in different flat file formats, and using SSIS or other tools I get the data into key-value pairs. In a report, I use a matrix to keep the EDI columns static; it matches with the hardware on serial number, and the hardware data pivots.
So that the report does not break, and so I don't give the user too much information, it matches up with another table where I specify which keys I want to be columns.
Here is a small example of one of my reports:
The green columns are EDI, while the orange ones are hardware.
My question is: is there a better way for me to be doing this? Some reports can get complicated, like needing totals for certain hardware (counting hard drive space, RAM totals, etc.), which is difficult to do dynamically.
I have tried creating reports in this fashion, with these parameters:
This way I can create the key columns per project, and the user can select which report they want to run. The default is All Data.
Is there a better way for me to create these reports? SSRS really doesn't seem to play well with dynamic pivots.
Is there a better tool that will handle these reports dynamically, or let users pick and choose what they want to see in a report?
I can't visualise your data, but if I understand correctly, you could have a dropdown list showing all the unique values that are in the column you are using in the column group. Set this to be multi-value and then simply have the WHERE clause read something like:
SELECT * FROM myTable WHERE myColumnGroupField IN (@myColumnChoiceParameter)
This way the user could select whichever columns they would like.
You could extend this by adding another parameter that has some preset groups of columns (I think you might have one of these already, if I understand correctly) that would set the default value of the main @myColumnChoiceParameter parameter.
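One way to wire that up is to drive the default values of @myColumnChoiceParameter from a second dataset keyed by the preset parameter. A sketch, where the table and parameter names are illustrative assumptions:
-- Default-values dataset for @myColumnChoiceParameter
SELECT ColumnName
FROM dbo.ReportColumnPresets          -- hypothetical preset-to-column mapping table
WHERE PresetName = @myPresetParameter;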
If you want something more flexible, then you might want to look at Power BI, but depending on how you intend to deploy, that might not be a simple option.
You cannot dynamically create columns in SSRS but you can control the visibility of the columns.
1) Create a table that contains the names of all the columns that you want to toggle, and include an 'All' entry.
2) Create a parameter that is based on this table and make sure multi-select is turned on (see the dataset sketch after this list).
3) Right-click on every column that you want to toggle, select visibility, and then create a condition that checks whether the user either selected All or selected that column from the parameter list.
4) Train users that by selecting and deselecting from the dropdown they control what's visible.
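For steps 1 and 2, the parameter's available values could come from a dataset as simple as this sketch (the table and column names are illustrative assumptions):
-- Available-values dataset for the column-toggle parameter
SELECT ColumnName FROM dbo.ToggleColumns   -- hypothetical list of toggleable columns
UNION ALL
SELECT 'All';                              -- the catch-all entry from step 1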

SSIS - how to use lookup to add extra columns within data flow

I have a csv file containing the following columns:
Email | Date | Location
I want to pretty much throw this straight into a database table. The snag is that the Location values in the file are strings - e.g. "Boston". The table I want to insert into has an integer column LocationId.
So, midway through my data flow, I need to do a database query to get the LocationId corresponding to the Location. eg:
SELECT Id AS LocationId FROM Locations WHERE Name = { location string from current csv row }
and add this to my current column set as a new value "LocationId".
I don't know how to do this - I tried a lookup, but this meant I had to put the lookup in a separate data flow - the columns from my csv file don't seem to be available.
I want to use caching, as the same Locations are repeated a lot and I don't want to run selects for each row when I don't need to.
In summary:
How can I, part-way through the data flow, stick in a lookup transformation (from a different source, sql), and merge the output with the csv-derived columns?
Is lookup the wrong transformation to use?
The Lookup transformation will work for you: it has caching, it persists all input columns to the output, and it adds columns from the query you use in the Lookup transformation.
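For example, the reference query inside the Lookup could be a sketch like this (the dbo schema is an assumption), with the csv Location column mapped to Name on the Columns tab and LocationId checked as a new output column:
SELECT Id AS LocationId, Name
FROM dbo.Locations;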
You can also use a Merge Join here; in some cases it is a better solution, however it brings additional overhead because it requires sorting for its inputs.
Check this:
Right-click on the Lookup transformation -> Show Advanced Editor -> go to Input and Output Properties.
Here you can add a new column or change the data type of existing columns.
For more info on how to use the Lookup transformation, click here.
Open the flat file connection manager, go to the Advanced tab.
Click "New" to add the new column and modify the properties.
Now go back to the Flat File Destination output, right click > Mappings > map the lookup column with the new one.

How to insert a row into a dataset using SSIS?

I'm trying to create an SSIS package that takes data from an XML data source and for each row inserts another row with some preset values. Any ideas? I'm thinking I could use a DataReader source to generate the preset values by doing the following:
SELECT 'foo' AS attribute1, 'bar' AS attribute2
The question is, how would I insert one row of this type for every row in the XML data source?
I'm not sure if I understand the question... My assumption is that you have n records coming into SSIS from your data source, and you want your output to have n * 2 records.
In order to do this, you can do the following:
1) Multicast to create multiple copies of your input data.
2) Derived Column transforms to set the "preset" values on the copies.
3) Sort.
4) Merge.
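In set terms, that flow produces the equivalent of this T-SQL sketch (the table and column names are illustrative assumptions):
SELECT attribute1, attribute2 FROM SourceRows   -- the original rows
UNION ALL
SELECT 'foo', 'bar' FROM SourceRows;            -- one preset row per source row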
Am I on the right track w/ what you're trying to accomplish?
I've never tried it, but it looks like you might be able to use a Derived Column transformation to do it: set the expression for attribute1 to "foo" and the expression for attribute2 to "bar".
You'd then transform the original data source, then only use the derived columns in your destination. If you still need the original source, you can Multicast it to create a duplicate.
At least I think this will work, based on the documentation. YMMV.
I would probably switch to using a Script Task and place your logic in there. You may still be able to leverage the file-reading and other objects in SSIS to save some code.
