snowflake : copy into statement - snowflake-cloud-data-platform

When I add VALIDATION_MODE = 'RETURN_ERRORS' FILES=('bad.csv') to my COPY INTO statement, I get error code 002302 with the message "SQL compilation error: Expression not supported within a VALIDATE expression." I have a column whose type is VARIANT.
Does anybody have an idea?
Thanks

The help notes:
VALIDATION_MODE does not support COPY statements that transform data during a load. If the parameter is specified, the COPY statement returns an error.
This implies you are doing some form of JSON/VARIANT data processing in the COPY statement's SELECT, which is valid without VALIDATION_MODE = 'RETURN_ERRORS'.
So can you copy the data into a staging table with zero JSON processing, to "just have raw data", and then use a second command to merge/insert it into a processed form?
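A minimal sketch of that two-step approach (the stage, table, and column names here are made up for illustration):

```sql
-- Step 1: plain COPY with no SELECT transformation, so
-- VALIDATION_MODE = 'RETURN_ERRORS' is allowed
COPY INTO raw_staging          -- hypothetical staging table of raw columns
FROM @my_stage
FILES = ('bad.csv')
FILE_FORMAT = (TYPE = 'CSV')
VALIDATION_MODE = 'RETURN_ERRORS';

-- Step 2 (after the real load): do the JSON/VARIANT processing separately
INSERT INTO target_table (id, doc)
SELECT raw_col1::int, PARSE_JSON(raw_col2)
FROM raw_staging;
```

Validation runs against the untransformed COPY, and the VARIANT work moves into an ordinary INSERT ... SELECT where it is not restricted.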

Related

SSISDB package errs out, says variable does not exist

I have an SSIS package that writes CSV files from a database, copies them to a couple of locations, and then emails a success message. The process is:
Retrieve the public file location from the database into a variable, @[User::varSQLCSVOutputFolder]
Loop through a list of database records:
Create a local CSV for each one.
Copy the local file to the location in @[User::varSQLCSVOutputFolder]
Send an email with MessageSource defined in @[User::varEmailBody].
@[User::varEmailBody] = "Files successfully saved to " + @[User::varCNNTargetCSVFolder]
@[User::varCNNTargetCSVFolder] = @[User::varSQLCSVOutputFolder]
@[User::varSQLCSVOutputFolder] loads from the database, value = \\server.domain.com\TEST\Output Files AM
(to confirm, @[User::varCNNTargetCSVFolder] is just a pass-through)
I can confirm the expressions flow through at design time. But when I execute it from SSISDB, I get the error
Error: An error occurred with the following error message:
Failed to lock variable "Files successfully saved to
\\server.domain.com\TEST\Output Files AM" for read access with error
0xC0010001 The variable cannot be found. This occurs when an attempt
is made to retrieve a variable from the Variables collection on a
container during execution of the package, and the variable is not
there. The variable name may have changed or the variable is not
being created.
I thought maybe it was a weird problem with escaping the backslashes, but I tried using a REPLACE() in the expression, with no luck. I do use the underlying variable @[User::varSQLCSVOutputFolder] repeatedly, but I have precedence constraints set up, so there should be no overlap... any other possibilities?
It seems to be reading the CONTENT of my variable as the NAME of the variable.
Okay, this was a fun one. I had an expression defined for MessageSource, BUT I had chosen Variable as the MessageSourceType instead of Direct Input, so SSIS treated the evaluated message text as the name of a variable to look up.

Large query as variable not evaluating in dataflow expression builder SSIS

I am querying a Caché database through an ADO.NET data source in my data flow in SSIS (SQL Server 2008 R2). I want to pass parameters to the query, but can only do this through the Expressions section of the data flow item. The query itself is over 4000 characters, so I can't use it in the Expressions section in its raw form (due to the 4000-character limitation).
I have tried using a script task to assign the query to a string variable [User::Query1], but when I click the Evaluate Expression button in the expression builder screen of the data flow, it returns nothing. I have the following expression for [ADO NET Source].[SqlCommand]:
#[User::Query1]
When running the package, I get an error saying that the SQL command has not been set correctly. Check the SQLCommand property.
I've set ValidateExternalMetadata to false, and I see the following errors in the execution results:
Error: The variable User::Query1 contains a string that exceeds the maximum allowed length of 4000 characters.
Error: Reading the variable "User::Query1" failed with error code 0xC0047100.
Error: The expression "#[User::Query1]" on property "[ADO NET Source].[SqlCommand]" cannot be evaluated. Modify the expression to be valid.
In my script task I have the entire query assigned to a string variable, and I then assign that string to the actual SSIS variable using the VB code below:
Dts.Variables("User::Query1").Value = sSql
MessageBox.Show(Dts.Variables("User::Query1").Value.ToString())
On the script task properties, I have [User::Query1] in ReadWriteVariables. I have also made sure that EvaluateAsExpression is set to true for [User::Query1].
In essence, I am trying to run the query using the expressions property of the dataflow as this will allow me to use dynamic parameter values.
Also, variables are limited to 4000 characters in SSIS; you could split the query into multiple sub-queries.
I have found a solution when you use an OLE DB connection. It is complex to prepare, but in the end it runs OK. First you need to define the variable with a query of the same structure that is under 4000 characters, so that the expression validates. When all is OK, you can redefine the variable with the >4000-character query in a script task.
The next link explains how to do it:
ssis more than 4000 chars
After SQL Server 2012 (I think), the variable size limitation no longer exists.
In SQL Server 2017, for example, you can put a large amount of data in a string variable (I tried with more than 20 MB) and it works fine. But when you try to do string operations on it in an expression, such as FINDSTRING, you get a DTS_E_EXPREVALSTRINGVARIABLETOOLONG error.
My solution was to use a script task, because in .NET you do not have this limitation.

SSIS script component *bug* – Column data type DT_DBDATE is not supported by the PipelineBuffer class

Does anyone know exactly why these types of issues happen with a script component, and why they can be "fixed" by deleting and re-adding the same code? Why would metadata change when you delete and re-add code? What happens inside the engine when this happens? What kind of issue could deleting a script component, re-adding it, copying the same code back in, and rewiring it ever fix?
I can reproduce at will with the following steps:
Took a working package with a script component and two output buffers. The script component has additional input and output columns set up for the second output buffer that are not yet populated by the source query (the OLE DB source SQL command); only one column in the second output buffer is populated from the source query.
Copied in a new source query with additional columns for the second output buffer.
Run the package. Get the error message Column data type DT_DBDATE is not supported by the PipelineBuffer class.
Comment out the two lines for the second output buffer, run the package, the package runs successfully:
RedactedOutputBuffer.AddRow();
RedactedOutputBuffer.RedactedColumnName = Row.RedactedColumnName;
Uncomment the same two lines. The package still works. So the package is now exactly the same as when it did not work.
Well, no, it's not really a bug; it's more that SSIS doesn't try to be clever and fit square pegs into round holes.
I mean, the error message is pretty clear, innit? The PipelineBuffer class doesn't have any methods to handle the DT_DBDATE data type, so it throws an UnsupportedBufferDataTypeException:
The exception that is thrown when assigning a value to a buffer column
that contains the incorrect data type.
Anyway, since you didn't post your full error stack it's hard to say exactly, but my guess is it tried to call SetDateTime (or GetDateTime) on your busted column. When you set your source query, it sets the pipeline buffer's data type to DT_DBDATE; but when you comment the lines out, let it run, and then uncomment them, it has converted the pipeline buffer's data type to DT_DBTIMESTAMP, which is compatible with SetDateTime (or whatever method the PipelineBuffer class is calling when it throws your error).
Anyway, this MSDN post should give you a little more flavor on what you're seeing, but the bottom line is make sure that the field coming out of your source query is correctly identified as the field type you want it to be in your destination. If that's a SQL Server datetime field, you either need to cast it as datetime in your source query or use a Data Conversion component to explicitly cast it for your script component.
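For example, assuming a hypothetical dbo.Orders table whose OrderDate column arrives as a date (mapped to DT_DBDATE), an explicit cast in the source query might look like:

```sql
-- Hypothetical source query: cast the date column to datetime
-- so the pipeline buffer is created as DT_DBTIMESTAMP instead of DT_DBDATE
SELECT
    OrderId,
    CAST(OrderDate AS datetime) AS OrderDate
FROM dbo.Orders;
```

The alternative is leaving the query alone and putting a Data Conversion component between the source and the script component to do the same cast inside the data flow.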

Wrong error in SSIS - [Execute SQL Task] Error: There is an invalid number of result bindings returned for the ResultSetType: "ResultSetType_Rowset"

I have an Execute SQL Task which returns only one row with one column - a number. I set my result set to Single row. Despite that, the task fails. Why? How do I fix it?
[Execute SQL Task] Error: There is an invalid number of result bindings returned for the
ResultSetType: "ResultSetType_Rowset".
Probably you haven't configured your result set parameters correctly. To configure them, click Result Set in the Execute SQL Task, then click Add. In the 'Result Set Name' column, enter the exact name of the column you are retrieving, or simply give it 0. In 'Variable Name', select the variable you created to map the returned data.
In my case the ResultSet was configured as "Full result set" when no results were being returned; the stored procedure I was calling was simply deleting data. So I set it to "None" and it worked.
Old question (but Google still finds it :-))
If you want to use a rowset as the result (usually to combine it with a ForEach Loop):
you have to create a variable of type "Object" (e.g. MyResultSet)
in the Execute SQL Task, the ResultSet property must be "Full result set" (it already is, if you are receiving the error message above)
on the Result Set pane (left list entry) you have to add a single entry, set its name to 0 and its variable to the Object-typed variable (MyResultSet)
in the ForEach Loop container you have to switch to the Collection pane (left list entry)
there, set the Enumerator property to "Foreach ADO Enumerator"
the (now visible) ADO object source variable has to be set to your Object variable (User::MyResultSet)
click the Variable Mappings entry in the left pane
create an entry for every column in the result set (and hope you did not use a SELECT * on a 100-column table :-))

SSIS Execute a Stored Procedure with the parameters from .CSV file SQL Server 2005

I'm learning SSIS and this seems like an easy task but I'm stuck.
I have a CSV file Orders.csv with this data:
ProductId,Quantity,CustomerId
1,1,104
2,1,105
3,2,106
I also have a stored procedure ssis_createorder that takes these input parameters:
@productid int
@quantity int
@customerid int
What I want to do is create an SSIS package that takes the .csv file as input and calls ssis_createorder once for each of the three rows in the .csv file (the first row contains the column names).
Here is what I have done so far.
I have created an SSIS package (Visual Studio 2005 & SQL Server 2005).
In Control Flow I have a Data Flow Task.
The Data Flow has a Flat File Source for my .csv file. All of the columns are mapped.
I have created a variable named orders of type Object. I also have variables CustomerId, ProductId, and Quantity of type Int32.
Next I have a Recordset Destination that assigns the contents of the .csv file to the variable orders. I'm not sure how to use this tool. I'm setting the VariableName (under Custom Properties) to User::orders. I think orders now holds an ADO recordset made up of the contents of the original .csv file.
Next I'm adding a ForEach Loop Container on the Control Flow tag and linking it to the Data Flow Task.
Inside of the ForEach Loop Container I'm setting the Enumerator to "ForEach ADO Enumerator". I'm setting "ADO object source variable" to User::orders". For Enumeration mode I'm selecting "Rows in the first table".
In the Variable Mapping tab I have User::ProductId index 0, User::Quantity index 1, User::CustomerId index 2. I'm not sure if this is correct.
Next I have a Script Task inside of the ForEach Loop Container.
I have ReadOnlyVariables set to ProductId.
In the Main method this is what I'm doing:
Dim sProductId As String = Dts.Variables("ProductId").Value.ToString()
MsgBox(sProductId) ' show the value itself, not the literal string "sProductId"
When I run the package my ForEach Loop Container turns Bright Red and I get the following error messages
Error: 0xC001F009 at MasterTest: The type of the value being assigned to variable "User::ProductId" differs from the current variable type. Variables may not change type during execution. Variable types are strict, except for variables of type Object.
Error: 0xC001C012 at Foreach Loop Container: ForEach Variable Mapping number 1 to variable "User::ProductId" cannot be applied.
Error: 0xC001F009 at MasterTest: The type of the value being assigned to variable "User::Quantity" differs from the current variable type. Variables may not change type during execution. Variable types are strict, except for variables of type Object.
Error: 0xC001C012 at Foreach Loop Container: ForEach Variable Mapping number 2 to variable "User::Quantity" cannot be applied.
Error: 0xC001F009 at MasterTest: The type of the value being assigned to variable "User::CustomerId" differs from the current variable type. Variables may not change type during execution. Variable types are strict, except for variables of type Object.
Error: 0xC001C012 at Foreach Loop Container: ForEach Variable Mapping number 3 to variable "User::CustomerId" cannot be applied.
Warning: 0x80019002 at MasterTest: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. The Execution method succeeded, but the number of errors raised (12) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.
SSIS package "Package.dtsx" finished: Failure.
Dts.TaskResult = Dts.Results.Success
Any help would be appreciated
One of my coworkers just gave me the answer.
You don't need the ForEach Loop Container or the Recordset Destination.
All you need is the Flat File Source and an OLE DB Command. Connect to your database and inside the OLE DB Command select the appropriate connection.
In the Component Properties enter the following SQLCommand:
exec ssis_createorder ?, ?, ?
The "?" characters are placeholders for the parameters.
Next under the Column Mappings tab map the .csv file columns to the stored procedure parameters.
You are finished go ahead and run the package.
Thanks Gary if you were on StackOverFlow I would give you an upvote and accept your answer.
If I understand correctly, what you want to do is execute a stored procedure 3 times for each row in the data source.
What if you just create a data flow with a flat file data source and pipe the data through 3 execute sql command tasks? Just map the columns in the data to the input params of your stored procedure.
Maybe I'm not seeing it correctly in your question and I'm thinking too simple, but in my experience you need to avoid using the foreach task in SSIS as much as possible.
I suspect that you need to look at your Data Flow task. It's likely that the values from the source CSV file are being interpreted as string values. You will probably need a Derived Column component or a Data Conversion component to convert your input values to the desired data types.
And I think @StephaneT's solution would be good for executing the SP.
I'm not sure if this answers your question, but I was looking to do this and achieved it using the BULK INSERT command. I created a staging table with all of the columns in the CSV file, and instead of a stored procedure I used an INSTEAD OF INSERT trigger to handle the logic of inserting the data into many tables.
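A rough sketch of that approach (the staging table, trigger, target table, and file path are all hypothetical, and dbo.Orders is assumed to exist):

```sql
-- Hypothetical staging table matching the CSV layout
CREATE TABLE dbo.OrdersStaging (
    ProductId  int,
    Quantity   int,
    CustomerId int
);
GO

-- INSTEAD OF INSERT trigger: reroute staged rows into the real table(s)
CREATE TRIGGER dbo.trg_OrdersStaging_Insert
ON dbo.OrdersStaging
INSTEAD OF INSERT
AS
BEGIN
    INSERT INTO dbo.Orders (ProductId, Quantity, CustomerId)
    SELECT ProductId, Quantity, CustomerId
    FROM inserted;
END;
GO

-- Load the CSV; FIRSTROW = 2 skips the header line
BULK INSERT dbo.OrdersStaging
FROM '\\server\share\Orders.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);
```

This trades the per-row stored procedure call for one set-based load, with the trigger fanning the rows out.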
