Camel SFTP Component: different behavior running on Windows and CentOS - apache-camel

I am using camel-sftp to retrieve some files from a remote SFTP server, and after retrieving them I want to delete them. When I run my route on Windows, the files get deleted and everything works as expected. However, when running on CentOS, the files get retrieved but never deleted, and they start piling up on the remote server.
I am using Camel 2.13.1 and Java 7. My consumer URI looks like this:
sftp://remoteUser@remoteHost?privateKeyFile=locationOfMyPrivateKey.key&binary=true&disconnect=true&delay=20s&delete=true&idempotent=false&include=.*.txt&useFixedDelay=false&maxMessagesPerPoll=10&eagerMaxMessagesPerPoll=false&sortBy=reverse:file:name
I have also been looking through the unresolved and resolved issues for Camel 2.13.2 and 2.13.3 and found only this ticket: https://issues.apache.org/jira/browse/CAMEL-7565, which is not exactly what is going on in my scenario. The other possibility is that the issue comes from the underlying library JSch, but camel-ftp is using almost the latest version, and the changelog of the latest version does not mention anything about this.
Last thing, if I use the sftp command in CentOS and I connect remotely to the SFTP Server, I have no issues removing files. This removes the idea that it could be a problem with the key.
Any ideas?
UPDATE
This is the log. It looks like it's not understanding the delete=true option even if it is set. Nothing in the log says 'deleting file', and I guess it is because the thread is not visiting the code on line 382 from SftpOperations.java. Not sure what's going on...
2014-12-15 20:36:28,431 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - GenericFileConsumer.processExchange (GenericFileConsumer.java:356) - Retrieving file: data/file1.txt from: Endpoint[sftp://remoteUser#remoteHost/data?binary=true&delay=20s&delete=true&disconnect=true&eagerMaxMessagesPerPoll=false&idempotent=false&include=.*.txt&maxMessagesPerPoll=10&passiveMode=true&privateKeyFile=locationOfMyPrivateKey.key&reconnectDelay=30000&soTimeout=50000&sortBy=file%3Aname&useFixedDelay=false]
2014-12-15 20:36:28,431 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.retrieveFile (SftpOperations.java:588) - retrieveFile(data/file1.txt)
2014-12-15 20:36:28,431 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.getCurrentDirectory (SftpOperations.java:472) - getCurrentDirectory()
2014-12-15 20:36:28,431 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.getCurrentDirectory (SftpOperations.java:475) - Current dir: /
2014-12-15 20:36:28,431 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.changeCurrentDirectory (SftpOperations.java:483) - changeCurrentDirectory(data)
2014-12-15 20:36:28,431 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.changeCurrentDirectory (SftpOperations.java:494) - Compacted path: data -> data using separator: /
2014-12-15 20:36:28,432 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.getCurrentDirectory (SftpOperations.java:472) - getCurrentDirectory()
2014-12-15 20:36:28,432 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.getCurrentDirectory (SftpOperations.java:475) - Current dir: /
2014-12-15 20:36:28,432 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.doChangeDirectory (SftpOperations.java:538) - Changing directory: data
2014-12-15 20:36:28,851 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - IOHelper.copy (IOHelper.java:191) - Copying InputStream: com.jcraft.jsch.ChannelSftp$2#308a9271 -> OutputStream: with buffer: 4096 and flush on each write false
2014-12-15 20:36:29,604 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.changeCurrentDirectory (SftpOperations.java:483) - changeCurrentDirectory(/)
2014-12-15 20:36:29,604 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.changeCurrentDirectory (SftpOperations.java:494) - Compacted path: / -> / using separator: /
2014-12-15 20:36:29,605 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.getCurrentDirectory (SftpOperations.java:472) - getCurrentDirectory()
2014-12-15 20:36:29,605 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.getCurrentDirectory (SftpOperations.java:475) - Current dir: /data
2014-12-15 20:36:29,605 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.getCurrentDirectory (SftpOperations.java:472) - getCurrentDirectory()
2014-12-15 20:36:29,605 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.getCurrentDirectory (SftpOperations.java:475) - Current dir: /data
2014-12-15 20:36:29,605 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - SftpOperations.doChangeDirectory (SftpOperations.java:538) - Changing directory: ..
2014-12-15 20:36:29,816 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - GenericFileConsumer.processExchange (GenericFileConsumer.java:386) - Retrieved file: data/file1.txt from: Endpoint[sftp://remoteUser#remoteHost/data?binary=true&delay=20s&delete=true&disconnect=true&eagerMaxMessagesPerPoll=false&idempotent=false&include=.*.txt&maxMessagesPerPoll=10&passiveMode=true&privateKeyFile=locationOfMyPrivateKey.key&reconnectDelay=30000&soTimeout=50000&sortBy=file%3Aname&useFixedDelay=false]
2014-12-15 20:36:29,817 DEBUG [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - GenericFileConsumer.processExchange (GenericFileConsumer.java:396) - About to process file: RemoteFile[file1.txt] using exchange: Exchange[file1.txt]
2014-12-15 20:36:29,817 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - DefaultUnitOfWork.<init> (DefaultUnitOfWork.java:77) - UnitOfWork created for ExchangeId: ID-my-machine-name-37091-1418675752051-0-23 with Exchange[file1.txt]
2014-12-15 20:36:29,817 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - EventHelper.doNotifyEvent (EventHelper.java:766) - Notification of event is disabled: ID-my-machine-name-37091-1418675752051-0-23 exchange created: Exchange[file1.txt]
2014-12-15 20:36:29,935 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - GenericFileConsumer$1.done (GenericFileConsumer.java:405) - Done processing file: RemoteFile[file1.txt] synchronously
2014-12-15 20:36:29,935 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - GenericFileConsumer.processExchange (GenericFileConsumer.java:316) - Processing file: RemoteFile[file2.txt]
2014-12-15 20:36:29,937 TRACE [Camel (mySFTPRoute) thread #0 - sftp://remoteUser#remoteHost] - GenericFileConsumer.processExchange (GenericFileConsumer.java:356) - Retrieving file: data/file2.txt from: Endpoint[sftp://remoteUser#remoteHost/data?binary=true&delay=20s&delete=true&disconnect=true&eagerMaxMessagesPerPoll=false&idempotent=false&include=.*.txt&maxMessagesPerPoll=10&passiveMode=true&privateKeyFile=locationOfMyPrivateKey.key&reconnectDelay=30000&soTimeout=50000&sortBy=file%3Aname&useFixedDelay=false]
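One thing visible in the log: the endpoint it prints carries options that the consumer URI quoted above does not (passiveMode, reconnectDelay, soTimeout) and a different sortBy value, so it may be worth confirming which configuration the running route really uses. Below is a minimal sketch, using only the JDK, that diffs the query options of two endpoint URIs. The class EndpointOptionDiff and its methods are hypothetical helpers written for this illustration, not part of Camel, and the check does not by itself explain why delete=true has no effect.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not a Camel API): compares the query options of two endpoint URIs. */
final class EndpointOptionDiff {

    /** Splits everything after '?' into a sorted name -> value map, URL-decoding the values. */
    static Map<String, String> options(String uri) {
        Map<String, String> result = new TreeMap<String, String>();
        int q = uri.indexOf('?');
        if (q < 0 || q == uri.length() - 1) {
            return result; // no query part
        }
        for (String pair : uri.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            result.put(name, decode(value));
        }
        return result;
    }

    /** Decodes escapes such as %3A; note that URLDecoder also turns '+' into a space. */
    private static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Returns one entry per option that is missing, extra, or different in the logged URI. */
    static Map<String, String> diff(String configured, String logged) {
        Map<String, String> a = options(configured);
        Map<String, String> b = options(logged);
        Map<String, String> report = new TreeMap<String, String>();
        for (Map.Entry<String, String> e : a.entrySet()) {
            String other = b.get(e.getKey());
            if (other == null) {
                report.put(e.getKey(), "only in configured URI: " + e.getValue());
            } else if (!other.equals(e.getValue())) {
                report.put(e.getKey(), e.getValue() + " -> " + other);
            }
        }
        for (Map.Entry<String, String> e : b.entrySet()) {
            if (!a.containsKey(e.getKey())) {
                report.put(e.getKey(), "only in logged URI: " + e.getValue());
            }
        }
        return report;
    }

    public static void main(String[] args) {
        // Both URIs are copied from this question (user and host are placeholders).
        String configured = "sftp://remoteUser@remoteHost?privateKeyFile=locationOfMyPrivateKey.key"
                + "&binary=true&disconnect=true&delay=20s&delete=true&idempotent=false&include=.*.txt"
                + "&useFixedDelay=false&maxMessagesPerPoll=10&eagerMaxMessagesPerPoll=false"
                + "&sortBy=reverse:file:name";
        String logged = "sftp://remoteUser@remoteHost/data?binary=true&delay=20s&delete=true"
                + "&disconnect=true&eagerMaxMessagesPerPoll=false&idempotent=false&include=.*.txt"
                + "&maxMessagesPerPoll=10&passiveMode=true&privateKeyFile=locationOfMyPrivateKey.key"
                + "&reconnectDelay=30000&soTimeout=50000&sortBy=file%3Aname&useFixedDelay=false";
        for (Map.Entry<String, String> e : diff(configured, logged).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

With the two URIs from this question, the sketch reports passiveMode, reconnectDelay and soTimeout as present only in the logged endpoint, and sortBy as changed from reverse:file:name to file:name. The delete option is identical in both.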

Related

Problems using libsrtp on EL9

I'm having some issues with enabling the res_srtp module in Asterisk. Every attempt results in this not-so-helpful error message:
WARNING[47044] res_srtp.c: Failed to initialize libsrtp
ERROR[47044] loader.c: *** Failed to load module res_srtp.so
ERROR[47044] asterisk.c: Module initialization failed. ASTERISK EXITING!
I'd like to fix this and get it running, so now we get into the programming side of things. I'm not a C programmer by trade, but I was able to use my limited gdb skills to trace the error back through libsrtp's crypto_kernel_init() function. The error occurs when trying to enable the AES-GCM-128 cipher.
I was only able to get as far as PK11_Encrypt() in the NSS library, where I was unable to step into this line of code. I think this is because (again, very limited C knowledge) it's a macro and not a true function?
crv = PK11_GETTAB(slot)->C_EncryptInit(session, &mech, symKey->objectID);
// returns 113
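For what it's worth, the two decimal values gdb shows here (the return value 113 and mechanism=4231 in the backtrace below) can be matched against the standard PKCS #11 constants once converted to hex. The following is a small lookup sketch, assuming the usual PKCS #11 numbering from pkcs11t.h; the class Pkcs11Codes is a hypothetical helper and the tables hold only a handful of constants, not the full set.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: names a few PKCS #11 constants (a tiny subset, for illustration). */
final class Pkcs11Codes {
    // Return codes (CKR_*) and mechanisms (CKM_*) are separate number spaces,
    // so they are kept in separate tables.
    private static final Map<Long, String> RETURN_CODES = new HashMap<Long, String>();
    private static final Map<Long, String> MECHANISMS = new HashMap<Long, String>();

    static {
        RETURN_CODES.put(0x00L, "CKR_OK");
        RETURN_CODES.put(0x70L, "CKR_MECHANISM_INVALID");
        RETURN_CODES.put(0x71L, "CKR_MECHANISM_PARAM_INVALID");
        MECHANISMS.put(0x1086L, "CKM_AES_CTR");
        MECHANISMS.put(0x1087L, "CKM_AES_GCM");
    }

    static String returnCodeName(long crv) {
        return lookup(RETURN_CODES, crv);
    }

    static String mechanismName(long mechanism) {
        return lookup(MECHANISMS, mechanism);
    }

    private static String lookup(Map<Long, String> table, long value) {
        String name = table.get(Long.valueOf(value));
        return name != null ? name : "unknown";
    }

    /** Formats a value in decimal and hex next to its symbolic name. */
    static String describe(String label, long value, String name) {
        return String.format("%s %d = 0x%X (%s)", label, value, value, name);
    }

    public static void main(String[] args) {
        System.out.println(describe("crv", 113, returnCodeName(113)));
        System.out.println(describe("mechanism", 4231, mechanismName(4231)));
    }
}
```

Under that numbering, 113 is 0x71 (CKR_MECHANISM_PARAM_INVALID) and 4231 is 0x1087 (CKM_AES_GCM), which would suggest the AES-GCM parameters passed to C_EncryptInit are being rejected rather than the mechanism itself being unavailable.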
So my question is either, how can I proceed with debugging this to a point where I can file a bug report with someone, or (preferably) has anyone got libsrtp working in this environment? There were very few other reports of similar problems, likely because EL9 is not in wide use yet.
My distro (AlmaLinux) is running NSS 3.71, and I've tried updating to NSS 3.79 with no change.
Here's the backtrace from where I was able to get to, if it's of any help.
#0 PK11_Encrypt (symKey=0x555555ea1a00, mechanism=mechanism@entry=4231, param=param@entry=0x7fffffffc9a0,
out=out@entry=0x7fffffffcb50 "\331\061\062%\370\204\006\345\245Y\tů\365&\232\206\247\251S\025\064\367\332.L0=\212\061\212r\034<\f\225\225h\tS/\317\016$I\246\265%\261j\355\365\252\r\346W\272c{9", outLen=outLen@entry=0x7fffffffca3c, maxLen=<optimized out>,
data=0x7fffffffcb50 "\331\061\062%\370\204\006\345\245Y\tů\365&\232\206\247\251S\025\064\367\332.L0=\212\061\212r\034<\f\225\225h\tS/\317\016$I\246\265%\261j\355\365\252\r\346W\272c{9", dataLen=60) at ../pk11wrap/pk11obj.c:972
#1 0x00007ffff456a455 in srtp_aes_gcm_nss_do_crypto (enc_len=0x7fffffffca3c,
buf=0x7fffffffcb50 "\331\061\062%\370\204\006\345\245Y\tů\365&\232\206\247\251S\025\064\367\332.L0=\212\061\212r\034<\f\225\225h\tS/\317\016$I\246\265%\261j\355\365\252\r\346W\272c{9", encrypt=1, cv=0x5555562a19c0) at crypto/cipher/aes_gcm_nss.c:297
#2 srtp_aes_gcm_nss_encrypt (cv=0x5555562a19c0, buf=<optimized out>, enc_len=0x7fffffffca3c) at crypto/cipher/aes_gcm_nss.c:345
#3 0x00007ffff456cb24 in srtp_cipher_type_test (ct=0x7ffff457a6c0 <srtp_aes_gcm_128>, test_data=0x7ffff457a420 <srtp_aes_gcm_test_case_0>)
at crypto/cipher/cipher.c:297
#4 0x00007ffff456d545 in srtp_cipher_type_test (ct=<optimized out>, test_data=<optimized out>) at crypto/cipher/cipher.c:605
#5 0x00007ffff456d58d in srtp_cipher_type_self_test (ct=<optimized out>) at crypto/cipher/cipher.c:613
#6 0x00007ffff457005d in srtp_crypto_kernel_do_load_cipher_type (replace=0, id=6, new_ct=0x7ffff457a6c0 <srtp_aes_gcm_128>)
at crypto/kernel/crypto_kernel.c:293
#7 srtp_crypto_kernel_load_cipher_type (new_ct=new_ct@entry=0x7ffff457a6c0 <srtp_aes_gcm_128>, id=id@entry=6) at crypto/kernel/crypto_kernel.c:343
#8 0x00007ffff457036e in srtp_crypto_kernel_init () at crypto/kernel/crypto_kernel.c:139
#9 srtp_crypto_kernel_init () at crypto/kernel/crypto_kernel.c:72
#10 0x00007ffff457040d in srtp_init () at srtp/srtp.c:2729
#11 0x00007ffff458718d in res_srtp_init () at /usr/src/debug/asterisk-16.28.0-0.el9.x86_64/res/res_srtp.c:1237
#12 load_module () at /usr/src/debug/asterisk-16.28.0-0.el9.x86_64/res/res_srtp.c:1272
#13 0x000055555566c4dc in start_resource.part.0.lto_priv.0 (mod=0x555555a469d0) at /usr/src/debug/asterisk-16.28.0-0.el9.x86_64/main/loader.c:1718
#14 0x0000555555665517 in start_resource (mod=0x555555a469d0) at /usr/src/debug/asterisk-16.28.0-0.el9.x86_64/main/loader.c:1692
#15 start_resource_attempt (mod=mod@entry=0x555555a469d0, count=count@entry=0x7fffffffce94)
at /usr/src/debug/asterisk-16.28.0-0.el9.x86_64/main/loader.c:1894
#16 0x000055555566862f in start_resource_list (mod_count=0x7fffffffce94, resources=0x7fffffffceb0)
at /usr/src/debug/asterisk-16.28.0-0.el9.x86_64/main/loader.c:1991
#17 load_resource_list (mod_count=<synthetic pointer>, load_order=0x7fffffffcea0) at /usr/src/debug/asterisk-16.28.0-0.el9.x86_64/main/loader.c:2173
#18 load_modules () at /usr/src/debug/asterisk-16.28.0-0.el9.x86_64/main/loader.c:2396
#19 0x000055555559e074 in asterisk_daemon (isroot=<optimized out>, rungroup=<optimized out>, runuser=<optimized out>)
at /usr/src/debug/asterisk-16.28.0-0.el9.x86_64/main/asterisk.c:4258
#20 main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/asterisk-16.28.0-0.el9.x86_64/main/asterisk.c:4025
After going through the debug steps a few more times, I tried searching the initial return value from srtp_kernel_init() which was srtp_err_status_cipher_fail; I ended up finding a resolved issue with the same symptoms for Fedora Linux:
The issue turns out to be an incompatibility with nss-3.63 shipped with F34. The attached patch adds NSS_PKCS11_2_0_COMPAT defines to 2 header files to enable the backward compatibility. The same fix is already in the upstream github repo and is targeted for the next libsrtp release.
I was able to apply their patch to the libsrtp 2.3.0 codebase, and the library is successfully loading now:
index 4d6031f..b1da343 100644
--- a/crypto/include/aes_gcm.h
+++ b/crypto/include/aes_gcm.h
@@ -66,6 +66,8 @@ typedef struct {
#ifdef NSS
+#define NSS_PKCS11_2_0_COMPAT 1
+
#include <nss.h>
#include <pk11pub.h>
index ad306dd..a57564f 100644
--- a/crypto/include/aes_icm_ext.h
+++ b/crypto/include/aes_icm_ext.h
@@ -65,6 +65,8 @@ typedef struct {
#ifdef NSS
+#define NSS_PKCS11_2_0_COMPAT 1
+
#include <nss.h>
#include <pk11pub.h>
I'm not sure the second hunk is needed, as ICM ciphers were loading fine, but I applied it anyway.

PHP7 MSSQL SqlServer Nginx Laravel Forge - Bad Gateway

I've been trying to get my Laravel Forge Ubuntu server to connect to a remote MSSQL server. I've finally got the server set up to where it can reach the remote database and make queries through terminal. However, when I try and use the connection within Laravel I'm getting a "502 Bad Gateway". I've done quite a bit of searching at this point and I'm still none the wiser on how to debug this.
Any help would be greatly appreciated! I've included my nginx and php-fpm logs edited for security.
var/log/nginx/XXX.co-error.log
2017/06/12 13:51:40 [error] 5682#5682: *84 recv() failed (104:
Connection reset by peer) while reading response header from upstream,
client: 000.000.000.000, server: XXX.co, request: "GET
/tools/labels/refreshsalesorderitems HTTP/1.1", upstream:
"fastcgi://unix:/var/run/php/php7.1-fpm.sock:", host: "XXX.co",
referrer: "http://XXX.co/tools/labels"
var/log/php7.1-fpm.log
[12-Jun-2017 13:51:40] WARNING: [pool www] child 8538 exited on signal 11 (SIGSEGV) after 3135.874644
seconds from start
[12-Jun-2017 13:51:40] NOTICE: [pool www] child 9087 started
core dump backtrace
/etc/php/7.1/fpm: Success.
[New LWP 22567]
Core was generated by `php-fpm: pool www '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fdb7d6bfbe9 in ?? ()
(gdb) backtrace
#0 0x00007fdb7d6bfbe9 in ?? ()
#1 0x372e350a0000005b in ?? ()
#2 0x3921ba89f34d7f00 in ?? ()
#3 0x0004000400000006 in ?? ()
#4 0x000055d8b19043a0 in ?? ()
#5 0x000055d8b1904810 in ?? ()
#6 0x0000000000000000 in ?? ()
(gdb)

How to use GDB to debug QEMU with SMP (symmetric multiple processors)?

I am in a graduate operating systems class, and we are emulating our kernel using QEMU and debugging it using gdb. Debugging has been straightforward enough up until now. How can I connect gdb to the other CPUs I have running in QEMU?
Our makefile allows us to start qemu with either "make qemu-nox" or "make qemu-nox-gdb" in one terminal, and if we used the latter, then to connect to it with gdb using just "gdb" in another terminal (in the same directory). Thus, I'm not quite sure how to connect to the same QEMU, again, but to a different processor (I'm running with a total of 4 right now).
Each qemu CPU is visible as a separate thread within gdb. To inspect the state of another CPU, use the thread command to switch CPUs.
(gdb) info thread
Id Target Id Frame
* 1 Thread 1 (CPU#0 [running]) 0x80105163 in stosl (addr=0x89c3e000, data=16843009, cnt=1024) at x86.h:44
2 Thread 2 (CPU#1 [halted ]) halt () at x86.h:127
3 Thread 3 (CPU#2 [halted ]) halt () at x86.h:127
4 Thread 4 (CPU#3 [halted ]) halt () at x86.h:127
(gdb) where
#0 0x80105163 in stosl (addr=0x89c3e000, data=16843009, cnt=1024) at x86.h:44
#1 0x801051bf in memset (dst=0x89c3e000, c=1, n=4096) at string.c:8
#2 0x80102b5a in kfree (v=0x89c3e000 "\001\001\001\001") at kalloc.c:63
#3 0x80102af4 in freerange (vstart=0x80400000, vend=0x8e000000) at kalloc.c:47
#4 0x80102ac1 in kinit2 (vstart=0x80400000, vend=0x8e000000) at kalloc.c:38
#5 0x8010386a in main () at main.c:37
(gdb) thread 3
[Switching to thread 3 (Thread 3)]
#0 halt () at x86.h:127
127 }
(gdb) where
#0 halt () at x86.h:127
#1 0x80104aeb in scheduler () at proc.c:288
#2 0x801038f6 in mpmain () at main.c:59
#3 0x801038b0 in mpenter () at main.c:50
#4 0x0000705a in ?? ()

Are TCL Regular Expressions Shared Across Interpreters - TclReExec Program terminated with signal 11, Segmentation fault?

In a single EXE process, are TCL regular expressions shared by each interpreter instance returned by Tcl_CreateInterp? How could threads with 4 different interpreter instances (0x94fbcd8,0x94dff20,0x94c4170,0x94a8760) all be making a call like TclReFree (re=0x86b0444) at ./../generic/regfree.c:52?
This comment in the TCL manual hints that objects may be shared...
Tcl objects are allocated on the heap and are shared as much as possible to
reduce storage requirements. Reference counting is used to determine when an
object is no longer needed and can safely be freed.
Source: https://www.tcl.tk/man/tcl8.4/TclLib/Object.htm
We're encountering crashes in our 32-bit server application. We've isolated the root cause to a TCL regular expression shared between threads concurrently running in separate TCL interpreter instances.
The interpreters are failing on this line of TCL
regsub "\\*" $s "\\*" s
The application concurrently runs TCL 8.4.11 interpreter instances. Each interpreter is executing "user TCL scripts" in separate threads. The app creates threads that "own" 1 interpreter instance created using Tcl_CreateInterp. Each thread then tells the interpreter instance to run a "user TCL script" with Tcl_EvalObjv. The crash happens when each interpreter is configured to run the same "user TCL script" on the line containing the regsub shown above.
This app has been running in dozens of different production environments for over 15 years. In the current environment, the app is running on Red Hat Linux 6.5 64-bit.
The core dump looks like...
Program terminated with signal 11, Segmentation fault.
#0 0x0811c020 in miss ()
(gdb) bt
#0 0x0811c020 in miss ()
#1 0x0811b7ed in shortest ()
#2 0x0811a4fa in find ()
#3 0x0811a429 in TclReExec ()
#4 0x080fc83f in RegExpExecUniChar ()
#5 0x080fc970 in Tcl_RegExpExecObj ()
#6 0x080bb9f1 in Tcl_RegsubObjCmd ()
#7 0x080b027a in TclEvalObjvInternal ()
#8 0x080d2726 in TclExecuteByteCode ()
#9 0x080d1bd1 in TclCompEvalObj ()
#10 0x080fbd6c in TclObjInterpProc ()
#11 0x080b027a in TclEvalObjvInternal ()
#12 0x080d2726 in TclExecuteByteCode ()
#13 0x080d1bd1 in TclCompEvalObj ()
#14 0x080fbd6c in TclObjInterpProc ()
#15 0x080b027a in TclEvalObjvInternal ()
#16 0x080b0527 in Tcl_EvalObjv ()
After recompiling the app with a version of TCL with the compile flag --enable-symbols=mem and linked with D.U.M.A. - Detect Unintended Memory Access http://duma.sourceforge.net/ (a fork of Electric Fence to help catch buffer overruns), I'm getting a core dump like
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xea8eeb70 (LWP 31004)]
0x08151496 in TclReFree (re=0x86b0444) at ./../generic/regfree.c:52
52 (*((struct fns *)re->re_fns)->free)(re);
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.132.el6_5.2.i686
(gdb) list
47 regfree(re)
48 regex_t *re;
49 {
50 if (re == NULL)
51 return;
52 (*((struct fns *)re->re_fns)->free)(re);
53 }
(gdb) bt
#0 0x08151496 in TclReFree (re=0x86b0444) at ./../generic/regfree.c:52
#1 0x08124360 in FreeRegexp (regexpPtr=0x86b0440) at ./../generic/tclRegexp.c:989
#2 0x08123ec2 in FreeRegexpInternalRep (objPtr=0xf64041b8) at ./../generic/tclRegexp.c:746
#3 0x08128cab in SetStringFromAny (interp=0x0, objPtr=0xf64041b8) at ./../generic/tclStringObj.c:1762
#4 0x08127894 in Tcl_GetUnicodeFromObj (objPtr=0xf64041b8, lengthPtr=0xea8ecee8) at ./../generic/tclStringObj.c:567
#5 0x080c3e9a in Tcl_RegsubObjCmd (dummy=0x0, interp=0x94fbcd8, objc=4, objv=0x94fbf28) at ./../generic/tclCmdMZ.c:718
#6 0x080b1386 in TclEvalObjvInternal (interp=0x94fbcd8, objc=5, objv=0x94fbf24, command=0x0, length=0, flags=0) at ./../generic/tclBasic.c:3088
#7 0x080e5a88 in TclExecuteByteCode (interp=0x94fbcd8, codePtr=0x95174e0) at ./../generic/tclExecute.c:1417
#8 0x080e4959 in TclCompEvalObj (interp=0x94fbcd8, objPtr=0x95097f0) at ./../generic/tclExecute.c:981
#9 0x08122a35 in TclObjInterpProc (clientData=0x9514520, interp=0x94fbcd8, objc=2, objv=0x94fbf1c) at ./../generic/tclProc.c:1100
#10 0x080b1386 in TclEvalObjvInternal (interp=0x94fbcd8, objc=2, objv=0x94fbf1c, command=0x0, length=0, flags=0) at ./../generic/tclBasic.c:3088
#11 0x080e5a88 in TclExecuteByteCode (interp=0x94fbcd8, codePtr=0xf64011f8) at ./../generic/tclExecute.c:1417
#12 0x080e4959 in TclCompEvalObj (interp=0x94fbcd8, objPtr=0x9513f68) at ./../generic/tclExecute.c:981
#13 0x08122a35 in TclObjInterpProc (clientData=0x9514d10, interp=0x94fbcd8, objc=2, objv=0xea8ee34c) at ./../generic/tclProc.c:1100
#14 0x080b1386 in TclEvalObjvInternal (interp=0x94fbcd8, objc=2, objv=0xea8ee34c, command=0x81a4ffe "", length=0, flags=0) at ./../generic/tclBasic.c:3088
#15 0x080b15e4 in Tcl_EvalObjv (interp=0x94fbcd8, objc=2, objv=0xea8ee34c, flags=0) at ./../generic/tclBasic.c:3204
#16 0x0808a812 in run_tcl_proc (pDevice=0x82405e0, pInterp=0x830d340, iNumArgs=2, objv=0xea8ee34c, bIsCommand=0 '\000', pCommand=0x0)
#17 0x08093492 in Tcl_begin_next_state (pDevice=0x82405e0, iNextState=RunPoll, pCommand=0x0)
#18 0x08093579 in Tcl_port_thread (dummy=0x8232c00)
#19 0x0014fb39 in start_thread () from /lib/libpthread.so.0
#20 0x00967d7e in clone () from /lib/libc.so.6
(gdb)
This gdb session also clearly shows concurrent threads executing regfree on the same regular expression, even though each thread's TCL interpreter instance is completely thread bound. There should be zero sharing between threads. The only thing they have in common is that they are executing a "user TCL script" file with the same filename. The files were all loaded with Tcl_EvalFile into per-thread interpreter instances.
(gdb) info threads
45 Thread 0xe30e2b70 (LWP 31017) 0x00110430 in __kernel_vsyscall ()
--snip--
34 Thread 0xe9eedb70 (LWP 31005) 0x00110430 in __kernel_vsyscall ()
* 33 Thread 0xea8eeb70 (LWP 31004) 0x08151496 in TclReFree (re=0x86b0444) at ./../generic/regfree.c:52
32 Thread 0xeb2efb70 (LWP 31003) 0x08151496 in TclReFree (re=0x86b0444) at ./../generic/regfree.c:52
31 Thread 0xebcf0b70 (LWP 31002) 0x08151496 in TclReFree (re=0x86b0444) at ./../generic/regfree.c:52
30 Thread 0xec6f1b70 (LWP 31001) 0x08151496 in TclReFree (re=0x86b0444) at ./../generic/regfree.c:52
29 Thread 0xed0f2b70 (LWP 31000) 0x00110430 in __kernel_vsyscall ()
--snip--
1 Thread 0xf7fec8d0 (LWP 30970) 0x00110430 in __kernel_vsyscall ()
(gdb)
Note that this question is a completely separate crash from my previous question, "alloc: invalid block - Are Tcl_IncrRefCount and Tcl_DecrRefCount thread safe for threaded Tcl / 1 interp per thread?".
After digging through the app's code, I found a case where in thread A an interpreter is created and asked to run a proc but then in thread B used to run many other procs. I'm guessing this may be the root cause of this crash. Strangely, the app doesn't crash on Windows but crashes immediately (most of the time) on Linux. The app creates threads:
On Windows, using the Win32 API.
On Linux, using POSIX Threads / pthreads.
To answer your immediate question, REs are shared by two mechanisms. Firstly, they're bound to the internal representation of the Tcl_Obj values generated from the values in your script (e.g., the literals and the results of operations). Secondly, they're also stored in a size-bounded per-thread LRU cache.
Both of these mechanisms are strictly thread-bound. REs are not shared between threads; Tcl shares extremely little between threads.
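To make "size-bounded per-thread LRU cache" concrete, here is a minimal sketch of the idea in Java. It is an analogy only and not Tcl's implementation: the class PerThreadRegexCache, the use of java.util.regex.Pattern, and the bound of 30 entries are all assumptions chosen for illustration. The point it shows is that each thread gets its own cache, so a compiled RE is never handed to another thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Illustrative analogy (not Tcl code): one small LRU cache of compiled REs per thread. */
final class PerThreadRegexCache {
    /** Arbitrary bound picked for this sketch. */
    static final int MAX_ENTRIES = 30;

    // ThreadLocal gives every thread its own map, so nothing here is shared across threads.
    private static final ThreadLocal<Map<String, Pattern>> CACHE =
            new ThreadLocal<Map<String, Pattern>>() {
                @Override
                protected Map<String, Pattern> initialValue() {
                    // accessOrder=true makes iteration order least-recently-used first.
                    return new LinkedHashMap<String, Pattern>(16, 0.75f, true) {
                        @Override
                        protected boolean removeEldestEntry(Map.Entry<String, Pattern> eldest) {
                            return size() > MAX_ENTRIES; // evict the LRU entry when over the bound
                        }
                    };
                }
            };

    /** Returns the calling thread's cached Pattern for this regex, compiling it on a miss. */
    static Pattern compile(String regex) {
        Map<String, Pattern> cache = CACHE.get();
        Pattern compiled = cache.get(regex);
        if (compiled == null) {
            compiled = Pattern.compile(regex);
            cache.put(regex, compiled);
        }
        return compiled;
    }

    public static void main(String[] args) throws InterruptedException {
        final Pattern mine = compile("\\*");
        Thread other = new Thread(new Runnable() {
            @Override
            public void run() {
                // A different thread compiles its own copy instead of reusing ours.
                System.out.println("same instance across threads: " + (compile("\\*") == mine));
            }
        });
        other.start();
        other.join();
        System.out.println("same instance within a thread: " + (compile("\\*") == mine));
    }
}
```

In this model, the only way two threads end up touching the same compiled RE is if the application itself moves an interpreter or value from one thread to another.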
However, there are a number of larger issues in your question.
If you're sending messages (err, scripts) between threads for execution, you're strongly recommended to use the Thread extension for this, as this takes care to copy things that need to be copied. The Thread extension ships with a full distribution of Tcl 8.6 (it's now a contributed package, along with [incr Tcl], SQLite and TDBC) but it should be available separately for older versions of Tcl.
Also, you're using a doubly-unsupported version of Tcl. The most recent version of 8.4 is 8.4.20 (which should be a drop-in replacement) and even that has been out of security/build support for several years now. You really are recommended to upgrade. 8.5.17 is the current long-term support release, and 8.6.3 is the current production release. (They're also quite a bit faster on a lot of code.)

Apache Camel: handling unix file permission errors wrapped in GenericFileOperationFailedException

Here's the problem I've been grappling with for a while...I'm using Camel (v2.10.2) to set up many file routes to move data across file systems, servers, and in/out of the organisation (B2B). There are data and signal files in their respective dirs with some of the routes being short lived, while others run as services on different VMs/servers. These processes (routes) are run under different unix 'functional' ids, but there is an attempt to make them belong to the same unix group(s) if possible...
Of course on unix there is always the potential for file/dir permission problems...and that is the issue I'm facing/trying to solve.
I use the DefaultErrorHandler and log success or failure for an exchange via a custom RoutePolicy within the onExchangeDone(...) checking the Exchange.isFailed(). The signal file is either moved to the destination on success or moved to .error dir on fail, with an alert written to a system-wide alert log (checked by Tivoli)
The file route is configured to propagate errors occurring while picking up files, etc via the consumer.bridgeErrorHandler=true
Basically, if I have any unix permission related errors, then I want to stop (and maybe remove) the affected route, indicating clearly that this has happened and why - a permission issue is not easily solvable programmatically, so stop and alert is the only option.
So I'll illustrate a test case that causes an issue...
App_A creates some data files in ./data/. Then App_A creates the same number of signal files in ./signal/, but due to some 'data' related bug it also creates a signal file ./signal/acc_xyz.csv that doesn't have a corresponding data file.
Route starts to process ./signal/acc_xyz.csv and the 'validation process' finds that ./data/acc_xyz.csv doesn't exist and throws an exception to indicate this, hence stopping the exchange being processed further.
The File component is configured with moveFailed=.error to move the signal file to ./signal/.error/, but this dir is locked (don't worry why this is) to the functional user id executing the Java process and internal Camel processing throws a GenericFileOperationFailedException indicating the cause to be an underlying 'Permission denied' issue.
Oh dear, the same signal file is then processed again, and again, and...
I have tried to get this 'secondary error' propagated to my code, but have failed, hence I can't stop the route.
How can I get this and other internal Camel errors propagated to my code/exception handler/whatever and not just seeing it be logged and swallowed?
thanks in advance
ok more detail from log4j...note the sequence of times
Camel DefaultErrorHandler:
2013-04-25 15:06:26,001 [Camel (camel-1) thread #0 - file:///FTROOT/fileTransfer/outbound/signal] ERROR (MarkerIgnoringBase.java:161) - Failed delivery for (MessageId: ID-rwld601-rw-discoverfinancial-com-60264-1366902384246-0-1 on ExchangeId: ID-rwld601-rw-discoverfinancial-com-60264-1366902384246-0-2). Exhausted after delivery attempt: 1 caught: java.lang.IllegalStateException: missingFile: route [App_A.outboundReceipt] has missing file at /FTROOT/fileTransfer/outbound/data/stuff.log
java.lang.IllegalStateException: missingFile: route [App_A.outboundReceipt] has missing file at /FTROOT/fileTransfer/outbound/data/stuff.log
at com.myco.mft.process.BaseFileRouteBuilder.checkFile(BaseFileRouteBuilder.java:934)
My alert logger via the RoutePolicy.onExchangeDone(...) - at this point the exchange has completed with a failure:
2013-04-25 15:06:26,011|Camel (camel-1) thread #0 - file:///FTROOT/fileTransfer/outbound/signal|exchange|App_A.outboundReceipt|signalFile=/FTROOT/fileTransfer/outbound/signal/stuff.log|there has been a routing failure|missingFile: route [App_A.outboundReceipt] has missing file at /FTROOT/fileTransfer/outbound/data/stuff.log
Camel endpoint post-processing - this is the stuff that Camel doesn't propagate to me:
2013-04-25 15:06:26,027 [Camel (camel-1) thread #0 - file:///FTROOT/fileTransfer/outbound/signal] WARN (GenericFileOnCompletion.java:149) - Rollback file strategy: org.apache.camel.component.file.strategy.GenericFileDeleteProcessStrategy@104e28b for file: GenericFile[/FTROOT/fileTransfer/outbound/signal/stuff.log]
2013-04-25 15:06:28,038 [Camel (camel-1) thread #0 - file:///FTROOT/fileTransfer/outbound/signal] WARN (MarkerIgnoringBase.java:136) - Caused by: [org.apache.camel.component.file.GenericFileOperationFailedException - Error renaming file from /FTROOT/fileTransfer/outbound/signal/stuff.log to /FTROOT/fileTransfer/outbound/signal/.error/stuff.log]
org.apache.camel.component.file.GenericFileOperationFailedException: Error renaming file from /FTROOT/fileTransfer/outbound/signal/stuff.log to /FTROOT/fileTransfer/outbound/signal/.error/stuff.log
at org.apache.camel.component.file.FileOperations.renameFile(FileOperations.java:72)
...
Caused by: java.io.FileNotFoundException: /FTROOT/fileTransfer/outbound/signal/stuff.log (Permission denied)
at java.io.FileInputStream.open(Native Method)
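The trace above shows the permission problem only as a nested cause (a FileNotFoundException whose message ends in "Permission denied"). Assuming the exception can be obtained at all, which is the open part of this question, a check like the following sketch could decide whether to stop the route. PermissionErrors is a hypothetical helper using only the JDK, the RuntimeException in main is a stand-in for Camel's GenericFileOperationFailedException, and matching on the message text is an assumption that depends on the OS and locale.

```java
import java.io.FileNotFoundException;
import java.nio.file.AccessDeniedException;

/** Hypothetical helper: detects a permission problem anywhere in an exception's cause chain. */
final class PermissionErrors {
    /** Guards against cyclic cause chains. */
    private static final int MAX_DEPTH = 32;

    static boolean isPermissionDenied(Throwable error) {
        int depth = 0;
        for (Throwable t = error; t != null && depth < MAX_DEPTH; t = t.getCause(), depth++) {
            if (t instanceof AccessDeniedException) {
                return true; // what java.nio.file operations throw on EACCES
            }
            String message = t.getMessage();
            // java.io reports EACCES as FileNotFoundException("<path> (Permission denied)");
            // the text comes from the OS, so this match is an assumption.
            if (message != null && message.contains("Permission denied")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Rebuilds the shape of the failure in the log; RuntimeException is only a stand-in.
        Throwable cause = new FileNotFoundException(
                "/FTROOT/fileTransfer/outbound/signal/stuff.log (Permission denied)");
        Throwable wrapper = new RuntimeException("Error renaming file", cause);
        System.out.println("permission problem: " + isPermissionDenied(wrapper));
    }
}
```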
