File upload over REST API - getting headers in the stream

I'm using the latest version of CodeIgniter and the REST_Controller library for REST functionality.
I'm trying to do a simple file upload to a server with Apache 2.2 and PHP 5.4 installed.
To get the content from the client, I've tried:
$file_content = $this->put();
//Or
$file_content = file_get_contents('php://input');
No matter which way I use, I always get something like this:
------WebKitFormBoundary3MTYXUNPMDmX8MXs
Content-Disposition: form-data; name="fileUpload"; filename="something.txt"
Content-Type: application/octet-stream
-----BEGIN RSA PRIVATE KEY-----
MIICXQIBAAKBgQC113DhhghzOiZHds2EOY7578Q1X141/kzpXodQZ4sCq+dOs3/O
iZS/j2y7ScE+4aRzQrPw/fPCsotwcCARfR0mhbKtUB8pE1n2pTcXJxqRGQPIVk6g
ZjsVhuCk9l880Zx8M4A2ebOR1i0SgLazpThlh3BNLPbwDIuXYE+9Qp94uQIDAQAB
AoGBALF61kz3wfWdEtF7bfmZKChf0XR6YXx3eN/piE580RvJZpjU73BJrioNtYVS
5k8WcqiguPoFE067bwdOGK6ZG8HgzfgZvs8hVN153fPoidmkPPvViwD7bNDJIG/5
-----END RSA PRIVATE KEY-----
------WebKitFormBoundary3MTYXUNPMDmX8MXs--
This is not what I want, since the file content is wrapped in the multipart boundary lines and the part's headers.
So the question is: is there a way to avoid receiving these headers, or to strip them out?

You need to use the File Uploading class for this:
http://ellislab.com/codeigniter/user-guide/libraries/file_uploading.html
https://github.com/EllisLab/CodeIgniter/blob/develop/system/libraries/Upload.php#L346
Don't forget to configure the Upload object: the uploaded file cannot be read from the temp directory, so it needs to be moved to a local directory with read/write permissions.
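For context, the extra lines in the question are the multipart/form-data wrapper: a boundary line, then per-part headers, then the file bytes. The sketch below is not CodeIgniter code; it uses Python's standard email package purely to illustrate how that wrapper can be parsed away. extract_files is a hypothetical helper name, and the approach is only intended for small, mostly textual payloads.

```python
from email import message_from_bytes


def extract_files(body: bytes, content_type: str) -> dict:
    """Return {filename: raw bytes} for each file part of a multipart body.

    content_type is the request's Content-Type header value, which carries
    the boundary string needed to split the body.
    """
    # The email parser expects headers first, so prepend the Content-Type.
    header = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    message = message_from_bytes(header + body)

    files = {}
    if not message.is_multipart():
        return files
    for part in message.get_payload():
        filename = part.get_filename()
        if filename:
            # decode=True returns bytes and undoes any transfer encoding.
            files[filename] = part.get_payload(decode=True)
    return files
```

A caller would pass the raw request body together with the request's Content-Type header, for example "multipart/form-data; boundary=----WebKitFormBoundary3MTYXUNPMDmX8MXs".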

Related

Form Data: wrong Content-Type for .p7m files

I need to save a file with correct MimeType for .p7m files (application/pkcs7-mime) via form upload to the server.
In the request I noticed that Content-Type is wrong:
------WebKitFormBoundaryaglEgtBJlb65v7d5
Content-Disposition: form-data; name="file0"; filename="getmymimeplease.p7m"
Content-Type: application/pkcs7
It should be:
Content-Type: application/pkcs7-mime
How is it possible that the '-mime' part is missing (or truncated)?
This is usually controlled by the OS and/or the browser. On Windows, it is set in the registry under HKEY_CLASSES_ROOT\.<fileextension> (e.g. HKEY_CLASSES_ROOT\.p7m), in the Content Type value.
In the end, then, this is controlled by the client. If several MIME types are possible for the same extension, your server code has to handle that: accept or decline each one, and decide whether to convert it to your default.
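A minimal sketch of that server-side handling, assuming the server decides by file extension and keeps a small table of aliases it is willing to accept. The table contents and the name resolve_content_type are illustrative assumptions, not an authoritative list of types browsers send.

```python
import os

# Hypothetical policy: the type the server stores for each extension.
CANONICAL_BY_EXTENSION = {".p7m": "application/pkcs7-mime"}

# Hypothetical aliases the server accepts from clients for each canonical type.
ACCEPTED_ALIASES = {
    "application/pkcs7-mime": {
        "application/pkcs7-mime",
        "application/x-pkcs7-mime",
        "application/pkcs7",
    },
}


def resolve_content_type(filename, reported_type):
    """Return the canonical type for the upload, or None to decline it."""
    extension = os.path.splitext(filename)[1].lower()
    canonical = CANONICAL_BY_EXTENSION.get(extension)
    if canonical is None:
        return None  # unknown extension: decline
    if reported_type.strip().lower() in ACCEPTED_ALIASES[canonical]:
        return canonical  # convert any accepted alias to the default
    return None  # reported type does not fit the extension: decline
```

With this policy, an upload named getmymimeplease.p7m reported as application/pkcs7 would be stored as application/pkcs7-mime.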

What is the type you get when you call javax.mail.Message.getInputStream()

I am trying to parse an email, then save the actual email as a file on the file system for auditing and remove it from the Exchange server (the same way "Save As" works in Outlook).
To do that, I found out that I can call
Message.getInputStream()
To retrieve the file bytes. It works, and I can write the email as a file to the file system.
My question is: what is this file type? Is it .eml, .msg, or something else?
When looking at its content I see text, not binary data:
--_004_MWH_#TRUNCATED#_11namp_
Content-Type: multipart/alternative;
boundary="_000_MWHPR_#TRUNCATED#_1711namp_"
--_000_MWHPR10_#TRUNCATED#_HPR10MB1711namp_
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
getInputStream doesn't say much about the type of data...
https://docs.oracle.com/javaee/7/api/javax/mail/Part.html#getInputStream--
Use getContentType on the message. MIME messages map to .eml on Windows.
To save the complete message (headers included) to a file, use Message.writeTo; getInputStream only gives you the message content.
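The same idea can be sketched outside JavaMail. The example below is a Python analogue using the standard email package, not the javax.mail API: it reports the content type and writes the whole message, headers and body, to a file that can be given an .eml name. save_as_eml is a hypothetical helper name.

```python
from email import message_from_bytes, policy


def save_as_eml(raw_message: bytes, path: str) -> str:
    """Write the full MIME message to path and return its content type."""
    message = message_from_bytes(raw_message, policy=policy.default)
    with open(path, "wb") as handle:
        # as_bytes() serializes headers plus body, which is what makes the
        # file a complete message rather than just its content.
        handle.write(message.as_bytes())
    return message.get_content_type()
```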

Add SSL certificate from Godaddy to Google App Engine

I'm trying to add an SSL certificate that I created on Godaddy to my Google App Engine account on a Mac.
Using Keychain, I created a new 2048-bit RSA private-public key pair, and with it created a CertificateSigningRequest.certSigningRequest. I then used this certificate signing request to create the new SSL certificate on Godaddy. They then let me download a zip file with two .crt files in it (734b34####.crt and gd_bundle-g2-g1.crt).
When I then try to add it to GAE, I am shown an upload form with two fields.
Can anyone tell me what to enter as "PEM encoded X.509 public key certificate" and what as "Unencrypted PEM encoded RSA private key"?
I tried exporting all the relevant keys and certificates from Keychain in all kinds of formats (p12, cer, and converting them to pem), even without passwords on them.
For some reason, whenever I export & convert the private key, its beginning looks like this:
Bag Attributes
friendlyName: *.mydomain.com
localKeyID: 10 93 42 BE 45...
subject=/OU=Domain Control Validated/CN=*.mydomain.com
issuer=/C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./OU=http://certs.godaddy.com/repository//CN=Go Daddy Secure Certificate Authority - G2
-----BEGIN CERTIFICATE-----
After not finding any guide to do it on a Mac, and trying different options for hours, here's what I did:
1. Concatenate the two .crt files provided by Godaddy into one: cat 734b34####.crt gd_bundle-g2-g1.crt > godaddy.crt.
2. Use godaddy.crt for the first certificate ("PEM encoded X.509 public key certificate").
3. In Keychain, export (without a password) the private key that was used for the certificate signing request in p12 format; let's call it private.p12.
4. Convert the p12 private key: openssl pkcs12 -in private.p12 -out private.pem -nodes -clcerts. The password is just empty.
5. [EDIT] Then convert the private.pem file to RSA type: openssl rsa -in private.pem -out private_unencrypted.pem -outform PEM
6. Copy the contents of the created file: pbcopy < private_unencrypted.pem.
7. Paste what you've just copied into the second text area ("Unencrypted PEM encoded RSA private key").
8. Edit the pasted text so that everything from Bag Attributes up to (but not including) -----BEGIN RSA PRIVATE KEY----- is deleted. The result is a long string that starts with -----BEGIN RSA PRIVATE KEY----- and ends with -----END RSA PRIVATE KEY-----.
9. You should now be able to click the Upload button at the bottom.
Phew!
Would love to see if anyone has a more elegant / official way to do it.
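The manual clean-up of the pasted key (deleting everything before the BEGIN line) could also be scripted. This is a minimal sketch that only does text trimming, assuming the exported file contains exactly the BEGIN/END RSA PRIVATE KEY markers shown above; extract_rsa_key is a hypothetical helper name and it does not validate the key itself.

```python
import re

# Matches one PEM block, including the marker lines, across multiple lines.
PEM_BLOCK = re.compile(
    r"-----BEGIN RSA PRIVATE KEY-----.*?-----END RSA PRIVATE KEY-----",
    re.DOTALL,
)


def extract_rsa_key(pem_text: str) -> str:
    """Return only the RSA private key block, dropping Bag Attributes etc."""
    match = PEM_BLOCK.search(pem_text)
    if match is None:
        raise ValueError("no RSA private key block found")
    return match.group(0) + "\n"
```

Reading private_unencrypted.pem, passing its text through this function, and copying the result would give the string expected by the second text area.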

Rackspace cloud files return "application/unknown" as mime-type when uploaded with Jclouds

Basically, I have this code, which uploads JavaScript files and other content to Rackspace using Jclouds:
SwiftObject obj = cloudFilesClient.newSwiftObject();
obj.getInfo().setName(name);
obj.getInfo().setContentType(contentType);
obj.setPayload(payloadFile);
cloudFilesClient.putObject(container, obj);
I noticed that Chrome complains about scripts being transferred as text/plain, so I set out to investigate. curl -I instead reports Content-Type: application/unknown.
I've Googled a lot looking for clues, and I've tried:
not setting content type at all
setting empty string (found some rumour about that somewhere)
setting to application/javascript (correct)
setting to text/javascript (wrong, but common)
obj.getAllHeaders().put("Content-Type", contentType);
When we previously uploaded over basic HTTP, this just worked without setting anything manually.
I finally managed to figure it out by digging into the source code. This works:
FilePayload payload = new FilePayload(uploadableFile.localPath.toFile());
payload.getContentMetadata().setContentType(uploadableFile.contentType);
obj.setPayload(payload);
In case anyone else is looking for this in the future, posting Q&A.

How to get a dataset of 10,000 static HTML pages from Wiki

I am working on a classification algorithm. For that I need a dataset that contains about 10,000 static HTML pages from Wikimedia. Something like:
page-title-1.html .... page-title-10000.html
After searching on Google, I found that my best option was the dump at http://dumps.wikimedia.org/other/static_html_dumps/2008-06/en/
However, I do not know how to use it to get what I want.
It contains the following files:
html.lst 2008-Jun-19 17:25:05 692.2M application/octet-stream
images.lst 2008-Jun-19 18:02:09 307.4M application/octet-stream
skins.lst 2008-Jun-19 17:25:06 6.0K application/octet-stream
wikipedia-en-html.tar.7z 2008-Jun-21 16:44:22 14.3G application/x-7z-compressed
I want to know what to do with the *.lst files and what is in wikipedia-en-html.tar.7z.
You might want to read the section "Static HTML tree dumps for mirroring or CD distribution" of Database download on Wikipedia (and in fact that whole page, which points you to 7zip for unpacking the main archive).
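Once the .7z layer has been unpacked with 7zip, what remains is a tar archive, and a fixed number of pages can be pulled from it without extracting everything. The sketch below assumes the unpacked file is a plain .tar containing .html members; sample_html_pages and the page-N.html output naming are illustrative choices, not something the dump prescribes.

```python
import tarfile
from pathlib import Path


def sample_html_pages(tar_path, out_dir, limit=10_000):
    """Copy the first `limit` .html members of a tar archive into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with tarfile.open(tar_path, "r:*") as tar:
        # Iterating the archive reads members one at a time, in order.
        for member in tar:
            if len(saved) >= limit:
                break
            if not (member.isfile() and member.name.endswith(".html")):
                continue
            source = tar.extractfile(member)
            if source is None:
                continue
            # Write under a flat, sequential name instead of the archive path.
            target = out / f"page-{len(saved) + 1}.html"
            target.write_bytes(source.read())
            saved.append(target)
    return saved
```

Stopping at the limit means only the beginning of the 14.3G archive has to be read, at the cost of the sample not being random.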

Resources