CamelFileName vs. message body, file operation - apache-camel

I have implemented a bz2 decompressor using the Apache commons-compress library so that Camel can decompress the bz2 files found below a certain point in the directory structure. I pick up the name of the file to decompress from the CamelFileName header, open the file with my decompressor and put the decompressed file back into the same directory. It works fine. Below is a shortened version of the process() method that calls the decompressor; a Camel route invokes this processor for every file that needs it:
public void process(Exchange exchange) throws Exception {
    LOG.info(" #### BZ2Processor ####");
    BZ2 bz2 = new BZ2();
    // the file consumer puts the file name into the CamelFileName header
    String camelFileName = exchange.getIn().getHeader("CamelFileName", String.class);
    bz2.uncompress(camelFileName);
}
I think it would have been nicer to take the file from the message body. How would you implement it that way?

The body can be read as an InputStream, and you can work with that Java type directly. Camel reads the file on demand, i.e. when you access the body in the route or in your bean:
String text = exchange.getIn().getBody(String.class); //or
byte[] bytes = exchange.getIn().getBody(byte[].class); //or
InputStream is = exchange.getIn().getBody(InputStream.class);
Use whichever of the above fits. As for closing the stream, don't worry: Camel will take care of it.

Related

Detecting file MIME in C

I have files with wrong extensions and am trying to find the correct MIME type in a C program.
For a PDF file with a txt extension, magic (#include <magic.h>)
const char *mime;
magic_t magic;
magic = magic_open(MAGIC_MIME_TYPE);
magic_load(magic, NULL);
magic_compile(magic, NULL);
mime = magic_file(magic, filename);
printf("%s\n", mime);
magic_close(magic);
returned
application/octet-stream
which is not very helpful.
GIO 2.0 (#include <gio/gio.h>)
char *content_type = g_content_type_guess (file_name, NULL, 0, &is_certain);
if (content_type != NULL)
  {
    char *mime_type = g_content_type_get_mime_type (content_type);
    g_print ("Content type for file '%s': %s (certain: %s)\n"
             "MIME type for content type: %s\n",
             file_name,
             content_type,
             is_certain ? "yes" : "no",
             mime_type);
    g_free (mime_type);
  }
returned
Content type for file 'test.txt': text/plain (certain: no)
MIME type for content type: text/plain
However, the file command on Linux identifies the file correctly:
file test.txt
test.txt: PDF document, version 1.6
This cannot be the expected behavior of these well-established C libraries. What am I doing wrong?
It is true that the file utility is built on top of libmagic, but what actually determines the returned values is the flags passed to magic_open (or the corresponding setter functions) and the magic database that is used.
The library can work with either a pre-compiled database or a raw database (which has to be compiled by calling magic_compile), and the latter is your case. According to the documentation, when NULL is passed the default database files are /usr/local/share/misc/magic for the raw database (on Debian, /usr/share/misc/magic is a link to ../file/magic/, which is empty) and magic.mgc in the same parent directory.
The compiled database is written to the working directory by default, and on my Debian system it seems to be empty (consistent with the default raw database directory being empty). After realizing this, I tried your example with magic_compile removed, and it seems to improve things significantly.

How to receive large HTML data using SSL_read

while(byte_count != 0){
byte_count = SSL_read(conn,get_buffer,sizeof(get_buffer));
printf("%s",get_buffer);
write_to_file(get_buffer,html,byte_count); // func to write to file
}
I've been trying to write an HTTP/HTTPS client using sockets and SSL in C. The task is to save the HTML of the landing page of a given website into a file on my system. I've handled the HTTP redirections, but at first I could read only a portion of the HTTP payload because I called recv/SSL_read just once. When I put the call in a while loop, it reads a few more 16 KB segments and then the connection times out. Is there any other way to obtain the whole HTML file? (Sorry if this question seems vague; I'll be glad to edit it according to your responses.)

How to release lock of file (camel exchange) to move it on exception (corrupted gz file)

I need to implement a handler that reacts to a ZipException by moving corrupted gz files away; otherwise the route will endlessly retry unmarshalling the gz.
The problem is that at the moment the exception is thrown, the file is still locked (on Linux, canWrite() returns false) and the Camel lock file is still present.
Is there an elegant Camel way to configure onException so that the lock is released (get write access and remove the lock file, if there is one)?
At the moment my code looks like this:
onException(ZipException.class)
    .handled(true)
    .process(corruptedFileProcessor)
    .stop();
Thanks in advance.
The following route reads gzipped files from srcDir and writes the unzipped files to destDir (without the .gz extension); when a ZipException occurs, it sends the file to errorDir.
from("file://srcDir/?delete=true")
    .onException(ZipException.class)
        .handled(true).useOriginalMessage()
        .to("file://errorDir?autoCreate=true")
    .end()
    .unmarshal().gzip()
    .to("file://destDir?autoCreate=true&fileName=${file:name.noext}");

Changes being made to a file downloaded with libcurl don't take effect

Let me explain in more detail. I'm trying to write a program that downloads a file from a remote FTP server, appends one line to the end of it, and then re-uploads it. The file operations work, and the text is appended to the file and re-uploaded, but when I download the file again, no text was appended. I've written a small test program to demonstrate this; here's the code at Pastebin:
http://pastebin.com/r07TkxEK
The program prints the following output on both the initial run and subsequent runs:
Remote URL: ftp://orangesquirrels.com
Got data.
Local data file size: 678 bytes.
Current position in file: 678
Uploading database file back to server...
Local data file size: 690 bytes.
Remote URL is ftp://orangesquirrels.com !
*** We read 690 bytes from file.
If the program works, the output from the subsequent run should be:
Remote URL: ftp://orangesquirrels.com
Got data.
Local data file size: 690 bytes.
Current position in file: 690
Uploading database file back to server...
Local data file size: 702 bytes.
Remote URL is ftp://orangesquirrels.com !
*** We read 702 bytes from file.
Because the data is written to the file and re-uploaded (I know this because the uploaded file is larger than the downloaded file), I assume the upload worked; my suspicion is that the problem lies in the download process and/or the curl_database_write function. I've been doing everything humanly possible to find out why this is happening, to no avail. If anyone knows anything about why this isn't working, I'd love to know. I'm being paid to write this program, and I know I've got to find a solution soon...
You are using neither "short_database" nor "file_to_write" in your download() function, so you download /tmp/musiclist.txt from the FTP server instead of musiclist.txt.
You should check your defines, and be clear about when you use a define, when you use a string variable, and when you use a parameter:
#define DATABASE_FILE "/tmp/musiclist.txt"
#define REMOTE_DATABASE_FILE "musiclist.txt"
int main() {
    remove(DATABASE_FILE);
    download(DATABASE_FILE, "musiclist.txt", REMOTE_URL, "Testing 123");
    //                      ^^^ shouldn't this be REMOTE_DATABASE_FILE?
    ...
}
void download(const char* file_to_write, const char* short_database, const char* addr, const char* msg) {
    remove(DATABASE_FILE); // again?!?
    struct FtpFile ftpfile={
        DATABASE_FILE, /* name to store the file as if successful */
        // ^^^ which one? file_to_write or short_database
        NULL
    };

Writing my own HTTP Server - How to find relative path of a file

I'm currently writing an HTTP Server over UNIX Sockets in C, and I'm about to implement the part of the GET request that checks the requested file to make sure it has appropriate permissions.
Before I knew anything about HTTP servers, I set up an Apache server, and it is my understanding that there is a single directory which the HTTP server looks to find a requested file. I do not know if this is because the server somehow has no permissions outside of the directory, or if it actually validates the path to ensure it is inside the directory.
Now that I am about to implement this on my own, I'm not sure how to properly handle this. Is there a function in C that will allow me to determine if a path is inside a given directory (e.g. is foo/bar/../../baz inside foo/)?
In Python, I would use os.path.relpath and check whether the result starts with .. to ensure that the path is not outside the given directory.
For example, if the directory is /foo/bar/htdocs, and the given path is index.html/../../passwords.txt, I want ../passwords.txt, so I can see from the leading .. that the file is outside the /foo/bar/htdocs directory.
You'd be surprised how much of Python's I/O functionality more or less maps directly to what POSIX can do. :)
In other words, look up realpath().
It's awesome when POSIX has the more descriptive name for a function, with that extra letter included! :)
How to get the absolute path for a given relative path programmatically in Linux?
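#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char resolved_path[PATH_MAX]; /* realpath() may write up to PATH_MAX bytes */

    if (realpath("../../", resolved_path) == NULL) {
        perror("realpath");
        return 1;
    }
    printf("\n%s\n", resolved_path);
    return 0;
}
You can try that, as the same user (unwind) answered there.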
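To turn realpath() into the containment check the question asks for, one option is to canonicalize both the document root and the requested path and then compare them as prefixes. The following is only a minimal sketch under stated assumptions: the helper name is_inside_dir is hypothetical, POSIX realpath() is available, and a path that cannot be resolved (for example because the file does not exist) is simply treated as "not inside".

```c
/* Ask the C library for POSIX/XSI declarations (realpath) even in strict ISO C mode. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `path` resolves to `dir` itself or to
 * something below it, and 0 otherwise (also when either path cannot be
 * resolved). */
int is_inside_dir(const char *dir, const char *path)
{
    char real_dir[PATH_MAX];
    char real_path[PATH_MAX];
    size_t len;

    /* Canonicalize both sides: resolves "..", "." and symbolic links. */
    if (realpath(dir, real_dir) == NULL || realpath(path, real_path) == NULL)
        return 0;

    if (strcmp(real_dir, "/") == 0)
        return 1; /* every absolute path lives under the root directory */

    len = strlen(real_dir);
    if (strncmp(real_dir, real_path, len) != 0)
        return 0;

    /* Require a component boundary so /foo/bar does not match /foo/barbaz. */
    return real_path[len] == '\0' || real_path[len] == '/';
}
```

A server would build the candidate path first (document root plus request path) and serve the file only if is_inside_dir(root, candidate) returns 1. Because symbolic links are resolved too, a link inside the document root that points outside it is rejected as well.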
The way it works is much simpler: once the server receives a request, it ONLY looks at its htdoc (static contents) directory to check if the requested resource exists:
const char *htdoc = "/opt/server/htdoc"; // here a sub-directory of the server program
const char *request = "GET /index.html"; // the client request
const char *req_path = strchr(request, ' ') + 1; // the URI path, "/index.html"
char filepath[512]; // build the file-system path
snprintf(filepath, sizeof(filepath), "%s%s", htdoc, req_path); // req_path already starts with '/'
FILE *f = fopen(filepath, "r"); // try to open the file
...
Note that this code is unsafe because it does not check if the request ventures in the file system by containing "../" patterns (and other tricks). You should also use stat() to make sure that the file is a regular file and that the server has permissions to read it.
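The stat() check mentioned above could look roughly like this. It is a sketch, not a complete solution: the helper name is_servable_file is hypothetical, and because the file can change between the check and the later fopen(), opening the file first and calling fstat() on the descriptor would be more robust.

```c
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `path` names a regular file that this
 * process is allowed to read, and 0 otherwise. */
int is_servable_file(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0; /* missing file or inaccessible path */
    if (!S_ISREG(st.st_mode))
        return 0; /* directories, devices, sockets, ... are not served */

    return access(path, R_OK) == 0; /* read permission check */
}
```

This check says nothing about whether the path stays inside the document root, so it complements the "../" validation rather than replacing it.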
As a simple (but incomplete) solution, I just decided to write a bit of code to check the file path for any ".." components.
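#include <string.h>

#define TRUE 1
#define FALSE 0

/* Returns FALSE if any path component starts with "..", TRUE otherwise. */
int is_valid_fname(const char *fname) {
    const char *it = fname;
    while (TRUE) {
        if (strncmp(it, "..", 2) == 0) {
            return FALSE;
        }
        it = strchr(it, '/'); /* move to the next component */
        if (it == NULL) break;
        it++;
    }
    return TRUE;
}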
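One limitation of the check above is that it also rejects harmless names that merely begin with two dots, such as "..notes.txt". A possible variant, sketched here with a hypothetical helper name, compares whole path components instead. It is still incomplete in the same way: it knows nothing about symbolic links, and a percent-encoded request path would have to be decoded before the check.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if no component of `path` is exactly "..",
 * and 0 otherwise. */
int has_no_dotdot_component(const char *path)
{
    const char *p = path;

    while (*p != '\0') {
        size_t len = strcspn(p, "/"); /* length of the current component */

        if (len == 2 && p[0] == '.' && p[1] == '.')
            return 0;

        p += len;
        if (*p == '/')
            p++; /* skip the separator */
    }
    return 1;
}
```

With this variant, "a/..b/c" is accepted while "a/../b" and "a/b/.." are rejected.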
