Asynchronous Put Commands with snowflake.connector.python Throws Error - snowflake-cloud-data-platform

When using the PUT command in a threaded process, I often receive the following traceback:
File "/data/pydig/venv/lib64/python3.6/site-packages/snowflake/connector/cursor.py", line 657, in execute
sf_file_transfer_agent.execute()
File "/data/pydig/venv/lib64/python3.6/site-packages/snowflake/connector/file_transfer_agent.py", line 347, in execute
self._parse_command()
File "/data/pydig/venv/lib64/python3.6/site-packages/snowflake/connector/file_transfer_agent.py", line 1038, in _parse_command
self._command_type = self._ret["data"]["command"]
KeyError: 'command'
It seems to be fairly benign, yet occurs randomly. The command itself seems to run successfully when looking at the stage. To combat this, I simply catch KeyErrors when PUTs occur, and retry several times. This allows processes to continue as expected, but leads to issues with subsequent COPY INTO statements. Because the initial PUT succeeds, I receive a LOAD_SKIPPED status from the COPY INTO. Effectively, the file is put and copied, but we lose information such as rows_parsed, rows_loaded, and errors_seen.
Please advise on workarounds for the initial traceback.
NOTE: An example output after running PUT/COPY INTO processes: SAMPLE OUTPUT
NOTE: I have found I can use the FORCE parameter with COPY INTO to bypass the LOAD_SKIPPED status, however, the initial error still persists, and this can cause duplication.

Related

How can I handle _popen() errors in C?

Good morning;
Right now, I'm writing a program which runs a Monte Carlo simulation of a physical process and then pipes the generated data to gnuplot to plot a graphical representation. The simulation and plotting work just fine, but I'd like to print an error message which informs the user that gnuplot is not installed. To do this, I've tried the following code:
#include <stdio.h>
#include <stdlib.h>

FILE *pipe_gnuplot;

int main()
{
    pipe_gnuplot = _popen("gnuplot -persist", "w");
    if (pipe_gnuplot == NULL)
    {
        printf("ERROR. INSTALL gnuplot FIRST!\n");
        exit(1);
    }
    return 0;
}
But, instead of printing my error message, "gnuplot is not recognized as an internal or external command, operable program or batch file" appears (the program runs on Windows). I don't understand what I'm doing wrong. According to the _popen documentation, NULL should be returned if the pipe opening fails. Can you help me manage this issue? Thanks in advance and sorry if the question is very basic.
Error handling of popen (or _popen) is difficult.
popen creates a pipe and a process. If this fails, you will get a NULL result, but this occurs only in rare cases. (no more system resources to create a pipe or process or wrong second argument)
popen passes your command line to a shell (UNIX) or to the command processor (Windows). I'm not sure if you would get a NULL result if the system cannot execute the shell or command processor respectively.
The command line will be parsed by the shell or command processor and errors are handled as if you entered the command manually, e.g. resulting in an error message and/or a non-zero exit code.
A successful popen means nothing more than that the system could successfully start the shell or command processor. There is no direct way to check for errors executing the command or to get the exit code of the command.
Generally I would avoid using popen if possible.
If you want to program specifically for Windows, check if you can get better error handling from Windows API functions like CreateProcess.
Otherwise you could wrap your command in a script that checks the result and prints specific messages you can read and parse to distinguish between success and error. (I don't recommend this approach.)
Just to piggy-back on @Bodo's answer, on a POSIX-compatible system you can use wait() to wait for a single child process to return, and obtain its exit status (which would typically be 127 if the command was not found).
Since you are on Windows you have _cwait(), but this does not appear to be compatible with how _popen is implemented, as it requires a handle to the child process, which _popen does not return or give any obvious access to.
Therefore, it seems the best thing to do is to essentially re-implement popen() manually by creating a pipe yourself and spawning the process with one of the spawn[lv][p][e] functions. In fact the docs for _pipe() give an example of how one might do this (although in your case you want to redirect the child process's stdin to the read end of your pipe, so that it receives what you write to the write end).
I have not tried writing an example though.
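For the POSIX case mentioned above, here is a minimal sketch of checking for exit status 127. It relies on pclose() returning the wait status of the shell that popen() started, so no separate wait() call is needed. The function name command_exit_status is hypothetical, and this is not expected to carry over to _popen/_pclose on Windows without changes.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical helper: runs `command` through the shell via popen() and
 * returns its exit status. Returns -1 if the pipe could not be created
 * or the shell did not exit normally. POSIX only. */
int command_exit_status(const char *command)
{
    FILE *pipe = popen(command, "r");
    if (pipe == NULL)
        return -1;

    /* Drain and discard whatever the command prints to stdout. */
    char buf[256];
    while (fgets(buf, sizeof buf, pipe) != NULL)
        ;

    /* pclose() waits for the shell and returns its wait status. */
    int status = pclose(pipe);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A caller could probe with something like command_exit_status("gnuplot --version 2>/dev/null") and treat a result of 127 as "not installed" before opening the real pipe. The probe reads from the pipe rather than writing to it, which avoids a SIGPIPE when the command is missing.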

ProFTPD Extended Log - Use a subset of command classes instead of whole command class

I am building a log parser for ProFTPD and have a question regarding the ExtendedLog config directive.
Official ProFTPD documentation has the following ExtendedLog spec:
ExtendedLog [ filename [[command-classes] format-nickname]]
There are several valid command-classes, but most of them are groups of commands. This is a problem for me: when a user uploads a large file, a WRITE entry appears in the extended log for each portion of the upload, so a single large file produces many WRITE entries. With many users and many uploads, this can fill up the log space fairly quickly. By contrast, the STOR command appears only once, at the end of the file upload.
I can't explicitly find WRITE as one of the commands in the write command class but I was wondering if there is a way to omit this specific WRITE command from log as I'm only interested in a portion of commands from the write command class. The commands that I'm particularly and only interested in logging are STOR, DELE and RMD.
Many thanks.
In the end I did not find any flags in ProFTPD that could handle this, so I implemented log rotation instead.
The log rotation restarts ProFTPD and sends an interrupt to the log parser. The log parser then detects the interrupt, reads the current log file, and stops processing. The log rotate program then empties the original log file.

How to disable timeout in LLDB?

In LLDB console, my process is stopped. I run thread step-in and eventually get:
Command timed out
How do I extend or disable this timeout?
In my case, this timeout is expected because the program requires external interaction before going to the next line.
thread step-in has no timeout. That wouldn't make any sense, as your last comment demonstrates.
The print command can take a timeout, but by default does not. If you run po, the object description printing part of that command is run with a timeout. And if you have any code-running variable formatters, they are also run with a timeout. lldb has removed most of the built-in code-running formatters, though there are a few of them still around and they could also be responsible for the timeout message. But other than printing, there aren't really that many things lldb does with a timeout...
Anyway, what you are probably seeing is that after the previous stop happened some code was being run to present locals or something similar and that command was what timed out.
If you can get this to happen reliably, then please file a bug with http://bugreporter.apple.com.

AppleScript: "file is already open" but "file wasn't open"

Because of my slightly obsessive personality, I've been losing most of my productive time to a single little problem.
I recently switched from Mac OS X Tiger to Yosemite (yes, it's a fairly large leap). I didn't think AppleScript had changed that much, but I encountered a problem I don't remember having in the old days. I had the following code, but with a valid filepath:
set my_filepath to (* replace with string of POSIX filepath, because typing
colons was too much work *)
set my_file to open for access POSIX file my_filepath with write permission
The rest of the code had an error which I resolved fairly easily, but the error stopped the script before the close access command, so of course AppleScript left the file reference open. When I tried to run the script again, I was informed of a syntax error: the file is already open. This was to be expected.
I ran into a problem trying to close the reference: no matter what I did, I received an error message stating that the file wasn't open. I tried close access POSIX file (* filepath string again *), close access file (* whatever that AppleScript filepath format is called *), et cetera. Eventually I solved the problem by restarting my computer, but that's not exactly an elegant solution. If no other solution presents itself, then so be it; however, for intellectual and practical reasons, I am not satisfied with rebooting to close access. Does anyone have insights regarding this issue?
I suspect I've overlooked something glaringly obvious.
Edit: Wait, no, my switch wasn't directly from Tiger; I had an intermediate stage in Snow Leopard, but I didn't do much scripting then. I have no idea if this is relevant.
Agreed that restarting is probably the easiest solution. One other idea though is the unix utility "lsof" to get a list of all open files. It returns a rather large list so you can combine that with "grep" to filter it for you. So next time try this from the Terminal and see if you get a result...
lsof +fg | grep -i 'filename'
If you get a result you will get a process id (PID) and you could potentially kill/quit the process which is holding the file open, and thus close the file. I never tried it for this situation but it might work.
Have you ever had the Trash refuse to empty because it says a file is open? That's when I use this approach and it works most of the time. I actually made an application called What's Keeping Me (found here) to help people with this one problem and it uses this code as the basis for the app. Maybe it will work in this situation too.
Good luck.
When I've had this problem, it's generally sufficient to quit Script Editor and reopen it; a full restart of the machine is likely excessive. If you're running this from the Script Menu rather than Script Editor, you might try turning off the Script Menu (from Script Editor) and turning it back on again. The point is that files are held by processes, and if you quit the process it should release any lingering file references.
I've gotten into the habit, when I use open for access, of using try blocks to catch file errors. e.g.:
set filepath to "/some/posix/path"
try
    set fp to open for access filepath
on error errstr number errnum
    -- The file may still be open from an earlier run: close it and retry.
    try
        close access filepath
        set fp to open for access filepath
    on error errstr number errnum
        -- Coerce the error number to text so the concatenation yields a string.
        display dialog (errnum as text) & ": " & errstr
    end try
end try
This will try to open the file, try to close it and reopen it if it encounters an error, and report the error if it runs into more problems.
An alternative (and what I usually do) is that you can also comment out the open for access line and just add in a close access my_file to fix it.

Why doesn't inotify update?

I'm writing an inotify watcher in C for a Minecraft server. Basically, it watches server.log, gets the latest line, parses it, and if it matches a regex, performs some actions.
The program works fine normally through "echo string matching the regex >> server.log", it parses and does what it should. However, when the string is written to the file automatically via Minecraft server, it doesn't work until I shut down the server or (sometimes) log out.
I would post code, but I'm wondering if it doesn't have something to do with ext4 flushing data to disk or something along those lines; a filesystem problem. It would be odd if that were the case though, because "tail -f server.log" updates whenever the file does.
Solved my own problem. It turned out the server was writing to the log file faster than the watcher could read from it; so the watcher was getting out of sync.
I fixed it by adding a check after it processes the event saying "if the number of lines currently in the log file is more than the recorded length of the log, reprocess the file until the two are equal."
Thanks for your help!
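A minimal sketch of that catch-up idea, with one swapped-in detail: it tracks a byte offset rather than the line count described above. The name catch_up and the callback parameter are hypothetical, and the sketch assumes log lines shorter than 1024 bytes.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: processes every complete line found after byte
 * `offset` in the file at `path`, calling `on_line` for each one when it
 * is not NULL. Returns the new offset to remember for the next event,
 * or -1 on error. A trailing partial line is left for the next call.
 * Assumes lines are shorter than 1024 bytes. */
long catch_up(const char *path, long offset, void (*on_line)(const char *line))
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fseek(fp, offset, SEEK_SET) != 0) {
        fclose(fp);
        return -1;
    }

    char line[1024];
    while (fgets(line, sizeof line, fp) != NULL) {
        size_t n = strlen(line);
        if (n == 0 || line[n - 1] != '\n')
            break;              /* incomplete line: wait for more data */
        if (on_line != NULL)
            on_line(line);
        offset += (long)n;
    }

    fclose(fp);
    return offset;
}
```

Calling this once per inotify event with the previously returned offset means a burst of several lines written between two events is still processed in full, which is the out-of-sync case described above.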
Presumably that is because you are watching for IN_CLOSE events, which may not occur until the server shuts down (and closes the log file handle). See man 7 inotify for valid mask parameters for the inotify_add_watch() call. I expect you'll want to use IN_MODIFY.
Your theory is more than likely correct: the log file is being buffered by the OS, and the log writer never flushes that buffer, so everything remains in the buffer until the file is closed or the buffer is full. A quick way to test this is to run the server to the point where you know it would have written events to the log, then forcibly kill it so that it cannot close the handle; if the log is empty, it is definitely the buffer. If you can get hold of the file handle/descriptor, you can use setbuf to remove buffering, at the cost of performance.
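As an illustration of the mask choice, here is a minimal Linux-only sketch that watches a file for writes. IN_MODIFY is the inotify flag reported when a file's contents change. The names watch_log and count_modify_events are hypothetical, and the watcher in the question may be structured quite differently.

```c
#include <errno.h>
#include <unistd.h>
#include <sys/inotify.h>

/* Hypothetical helper: starts watching `path` for writes. Returns a
 * non-blocking inotify file descriptor, or -1 on error. */
int watch_log(const char *path)
{
    int fd = inotify_init1(IN_NONBLOCK);
    if (fd == -1)
        return -1;
    if (inotify_add_watch(fd, path, IN_MODIFY) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: drains all pending events and returns how many
 * carried IN_MODIFY (0 if nothing is pending), or -1 on a read error. */
int count_modify_events(int fd)
{
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    int count = 0;

    for (;;) {
        ssize_t len = read(fd, buf, sizeof buf);
        if (len == -1)
            return errno == EAGAIN ? count : -1;
        if (len == 0)
            return count;

        /* Events are variable-length: step by header size plus name length. */
        for (char *p = buf; p < buf + len;) {
            const struct inotify_event *ev = (const struct inotify_event *)p;
            if (ev->mask & IN_MODIFY)
                count++;
            p += sizeof(struct inotify_event) + ev->len;
        }
    }
}
```

A real watcher would normally block in poll() or read() instead of polling a non-blocking descriptor; the non-blocking form is used here only to keep the sketch short. If the writer buffers its output, no IN_MODIFY event is generated until the data actually reaches the file.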

Resources