Snowflake PUT command silently failing from StreamSets when in code block

I have a StreamSets pipeline putting some files into an internal Snowflake stage.
I am using the Snowflake Execute component instead of the Snowflake File Uploader because I need to conditionally execute the PUT.
The PUT command on its own works, but if the PUT is inside a begin/end block, it silently fails. The other commands run fine, but the file is not uploaded.
The code below illustrates the problem. In reality I have an additional IF/THEN structure to conditionally execute the PUT, but even without that, the PUT fails if it is inside the begin/end block.
Interestingly, I can see the PUT statement in the Snowflake query log with no error, but the file is not there when I run a LIST command on the stage.
begin
PUT file://${record:value('/file_path')} @STAGE_STREAMSETS OVERWRITE = TRUE;
end;

Related

Insert status of SSIS package in SQL Table

My project consists of creating multiple subdirectories and copying files into them. I developed this part using a File System Task inside a Foreach Loop in SSIS.
The final part is inserting the status of the process into a SQL table. If the file was copied successfully, the Status column should be "Successful" and a reason column should say something like "File was copied successfully".
Is the error flow redirection (red arrow) available for the File System Task or the Foreach Loop? I have read somewhere that you can handle these status messages in event handlers and insert them into SQL. Could someone please provide or suggest a solution to this problem?
I would steer away from using event handlers. They are like hidden GOTOs: there is no indication in the control flow that they exist, and you have to go to another screen to see what they are doing.
It's much clearer to use the control flow to direct errors. Any arrow from any task or container can be double-clicked and configured. Change the constraint option to Value = Failure to make the arrow turn red.

Fail entire SSIS Package in case of Exception

I have an SSIS package with lots of containers and logic. I am running an additional task (which I want to run independently); let's say it acts as an event listener. When this separate task errors, I want to fail the entire package. I thought it would work that way by default, but to my surprise it does not.
I have tried setting both the FailPackageOnFailure and FailParentOnFailure properties on both the parent container and the child container, but it's not working.
I was about to ask exactly the content of your last comment.
Failure of one piece of a package won't make another, unconnected piece stop executing. Once the executing piece is done, the package will fail, but Sequence Container 3 has no way to know what's happening in Sequence Container 2.
Which, honestly, is what we want. If Sequence Container 3 is doing DML, you could leave your data in an unfortunate state if an unrelated failure elsewhere in the package suddenly made everything come to a screeching halt.
If you don't want Sequence Container 3 to run if Sequence Container 2 fails, then just run a precedence constraint from Sequence Container 2 to Sequence Container 3; #3 won't execute until #2 and the Execute SQL Task succeed.
I completely agree with Eric's answer. Let me explain why raising a flag on error won't work.
I redesigned the package so it includes the flag check.
Let's say we have a success flag as a user variable, which defaults to False.
Now we set this variable to True at the end of Sequence 2's execution, marking the success of all the other tasks in that sequence.
The second part is put into a For Loop which runs only once (if at all). It checks whether the success variable is true and only then runs the inner tasks.
The problem is, the success-variable check at the start of the For Loop will always see the initial value, which is False (because the loop runs in parallel with Sequence 2 and doesn't wait until Sequence 2 ends). Hence the second part is never executed. Now change the initial value of the success variable to True, run the package again, and play with disabling the error-prone tasks; you will see how it works.

executeOfflineCommand skips a command while executing from storage on Android

I have to execute "Start" and "Finish" commands in sequential order in my program and synchronize everything at the end. So I insert the offline commands in order first, assuming they will execute in the same order. I'm using a List with an Iterator for this.
The problem: in some strange scenarios the Finish command's execution is skipped in the middle, "Start" commands execute back to back, and all the wrong data is sent and mapped incorrectly.
As an action only gets an ID when its command executes at the server, I'm keeping temporary IDs (localID) to map the offline commands in storage. If I use another collection instead of List, will this get any better? It is hard to reproduce this on the simulator. Please review both scenarios and advise where these approaches can go wrong. Thanks.
I add the OfflineCommands to the List and save it in Storage. After that, the user can perform a delete operation in the app, so I retrieve the list and remove the commands that were deleted from storage; now I have a filtered list.
Don't synchronize.
That's nearly always a mistake in Codename One. Your code deals with the UI, so it should be on the EDT, and Display.getInstance().isEDT() should be true.
My guess is that one of the commands in the middle uses one of the following invokeAndBlock() derivatives:
addToQueueAndWait
Modal dialogs
Either of these triggers a second round of synchronization.
You can trace this by reproducing the issue and checking which command is in the list at each point in time. Then fix that command so it doesn't block in this way.
Another approach is to remove the list immediately when you start processing, which will prevent duplicate execution of commands.
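The "remove the list before processing" idea can be sketched in plain Java. This is a generic illustration, not the Codename One Storage API; the class and method names here are made up:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the command list kept in Storage.
class OfflineCommandQueue {
    private final Queue<String> pending = new ArrayDeque<>();

    synchronized void add(String command) {
        pending.add(command);
    }

    // Snapshot the queued commands and clear the queue *before* processing
    // starts. If processing is re-entered (e.g. via an invokeAndBlock-style
    // call), the second pass sees an empty queue and cannot run the same
    // commands twice.
    synchronized List<String> drain() {
        List<String> snapshot = new ArrayList<>(pending);
        pending.clear();
        return snapshot;
    }
}
```

In the real app, "clear the queue" would mean deleting the stored list from Storage before iterating over the snapshot.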

Camel idempotentConsumer always use PUT instead of GET

I am using the Camel idempotent consumer. Can someone please explain the logic behind the idempotentConsumer XML tag?
I received a file for the first time. All good: the idempotentConsumer block executed, and on the Infinispan server I see a PUT in the log.
I dropped a duplicate file. The idempotentConsumer identifies the duplicate, but on the Infinispan server I see a PUT in the log instead of a GET. Is this an issue on the server side or in the Camel client?
<idempotentConsumer messageIdRepositoryRef="infinispanRepo" >
<header>CamelFileAbsolutePath</header>
</idempotentConsumer>
No, this is working as designed. The Idempotent Consumer EIP will attempt to put the key into the cache with a fixed value of true; that is an atomic operation on Infinispan. The result of that put operation is then used to know whether there was a duplicate.
If you do two operations, a GET and then a PUT, it's no longer atomic and you can end up with problems.
See the code at:
https://github.com/apache/camel/blob/master/components/camel-infinispan/src/main/java/org/apache/camel/component/infinispan/processor/idempotent/InfinispanIdempotentRepository.java#L68
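The atomic check-and-insert pattern can be sketched with a plain ConcurrentMap. This is an illustration of the pattern only, not Camel's actual InfinispanIdempotentRepository code:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of an idempotent-repository "add": one atomic put,
// never a GET followed by a PUT. Names are illustrative.
class IdempotentSketch {
    private final ConcurrentMap<String, Boolean> cache = new ConcurrentHashMap<>();

    // Returns true if the key was newly added (first time seen),
    // false if it was already present (a duplicate).
    boolean add(String messageId) {
        // putIfAbsent is atomic: the check and the insert happen as one
        // operation, so two concurrent consumers cannot both conclude
        // "not a duplicate" for the same key.
        return cache.putIfAbsent(messageId, Boolean.TRUE) == null;
    }
}
```

This is why the server log shows a PUT even for duplicates: the duplicate is detected from the result of the put itself, not from a preceding GET.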

Running a script component before writing data to a flat file in SSIS

In my SSIS package I need to modify the flat file name that gets created at the last step of my data flow execution. Currently I have to route the input data through a Script Component and modify a variable that forms the Flat File connection manager's connection string. The actual data set that should be written is generated by a Merge Join component, and routing it through the Script Component just to adjust one user variable seems like overhead.
What is the best practice for this situation?
If the file name doesn't depend on anything in the data, then I would use a Script Task in the control flow rather than a Script Component in the data flow to set the value.
If the file name does depend on something in the data, a Script Component is probably the best way to get that information; however, the Script Component cannot update any ReadWrite variables outside of the PostExecute method (which will not happen until all the input rows have been processed); this means that the variable changes will not be reflected in the output file's name. In this case, I'd suggest using a File System Task to rename the file after the data flow completes.
Personally, I would use a DOS rename command afterwards: export to a fixed filename and rename it. To me it's simpler.
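The export-then-rename step can be sketched in Java. This is illustrative only; in SSIS you would typically use a File System Task or an Execute Process Task rather than Java, and the file names below are made up:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of "export to a fixed filename, rename afterwards".
class RenameAfterExport {
    // Renames the fixed-name export to its final name in the same directory.
    static Path renameExport(Path fixedName, String finalName) {
        try {
            Path target = fixedName.resolveSibling(finalName);
            // A same-directory move is a rename; REPLACE_EXISTING clears
            // any stale file left over from a previous run.
            return Files.move(fixedName, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The final name (e.g. one containing a date) can be computed after the data flow completes, so no variable needs to change while rows are still being written.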
