Simple library to encapsulate a file in an uncompressed ZIP file? - c

I want to send a file as an email attachment, but at present there is an email filter that prevents that. Is there a simple method or library to encapsulate a file of any length inside an uncompressed ZIP file? I'd like to avoid adding an actual ZIP library that compresses, if I can. For one thing, the file I'm sending is already compressed.

The ZIP format has a "stored" method (compression method 0) that lets you simply wrap the file in the appropriate headers. See PKWARE's APPNOTE.TXT for a description of the format. You would need to calculate the CRC-32 of the data to include in the headers.
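The stored-method layout described above can be sketched as follows. This is a minimal illustration in Python (chosen for brevity; the same byte layout can be written from C), not a complete ZIP writer. The function name make_stored_zip, the fixed 1980-01-01 timestamp, and the ASCII-only file name are assumptions of this sketch. It builds the archive in memory and does not emit ZIP64 records, so it only suits files smaller than 4 GiB.

```python
import struct
import zlib


def make_stored_zip(name: str, data: bytes) -> bytes:
    """Wrap `data` in a single-entry ZIP archive using method 0 (stored)."""
    fname = name.encode("ascii")            # assumption: plain ASCII file name
    crc = zlib.crc32(data) & 0xFFFFFFFF     # CRC-32 of the uncompressed data
    size = len(data)                        # stored: compressed == uncompressed
    dos_time = 0                            # 00:00:00
    dos_date = (0 << 9) | (1 << 5) | 1      # 1980-01-01 (assumed fixed timestamp)

    # Local file header (30 bytes) followed by the file name.
    local = struct.pack(
        "<IHHHHHIIIHH",
        0x04034B50,   # local file header signature
        10,           # version needed to extract (1.0 is enough for stored)
        0,            # general purpose flags
        0,            # compression method 0 = stored
        dos_time, dos_date,
        crc, size, size,
        len(fname), 0,
    ) + fname

    # Central directory header (46 bytes) followed by the file name.
    central = struct.pack(
        "<IHHHHHHIIIHHHHHII",
        0x02014B50,   # central directory signature
        10, 10,       # version made by, version needed
        0, 0,         # flags, method
        dos_time, dos_date,
        crc, size, size,
        len(fname), 0, 0,   # name, extra and comment lengths
        0, 0, 0,            # disk number, internal attrs, external attrs
        0,                  # offset of the local header
    ) + fname

    # End of central directory record (22 bytes).
    end = struct.pack(
        "<IHHHHIIH",
        0x06054B50,
        0, 0,         # disk numbers
        1, 1,         # entries on this disk, entries in total
        len(central),
        len(local) + size,   # offset where the central directory starts
        0,            # archive comment length
    )
    return local + data + central + end
```

For a large file, the same three records can be written around the data as it is streamed, provided the CRC-32 and size are computed first (or in a first pass over the file).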

Related

Determine whether a CSS file is a valid CSS file

I'm trying to validate that a file uploaded to my application is actually of the type it claims to be, to improve security.
For image formats, we can check the magic bytes and the file extension to determine the format.
I'd like to do the same for CSS files. As far as I understand, CSS has no magic bytes. The most I could check is the byte order mark of a UTF-8 encoded file, which wouldn't protect against scripts in the event of an injection flaw.
Currently we validate that the file extension is correct, but anyone can change a file's extension, so any file could be uploaded.
Is there a best-practice way of validating a CSS file?
You can read the file content and use a regex to check for valid CSS rules, such as:
([#.#]?[\w.:> ]+)[\s]{[\r\n]?([A-Za-z\- \r\n\t]+[:][\s]*[\w .\/()\-!]+;[\r\n]*(?:[A-Za-z\- \r\n\t]+[:][\s]*[\w .\/()\-!]+;[\r\n]*(?2)*)*)}
See:
https://regex101.com/r/fK9mY3/1
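As a rough sketch of the same idea, the check below walks the file content and accepts it only if it consists entirely of "selector { property: value; ... }" blocks. It is written in Python with a simplified pattern of its own (Python's re module does not support the (?2) recursion used above), and the name looks_like_css is hypothetical. It assumes flat rules only, so nested constructs such as @media blocks are rejected, and it is a structural sanity check rather than a full CSS parser.

```python
import re

# CSS comments are removed before matching.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

# One flat rule: selector text, then a brace-delimited list of declarations.
RULE = re.compile(
    r"""
    \s* [^{}\s] [^{}]*? \s*                    # selector text up to the brace
    \{
    (?: \s* [A-Za-z-]+ \s* : \s* [^;{}]+ ; )*  # 'property: value;' declarations
    (?: \s* [A-Za-z-]+ \s* : \s* [^;{}]+ )?    # last declaration may omit ';'
    \s* \}
    """,
    re.VERBOSE,
)


def looks_like_css(text: str) -> bool:
    """Return True if `text` is made up only of flat CSS rules."""
    text = COMMENT.sub("", text)
    pos = 0
    while True:
        match = RULE.match(text, pos)
        if match is None:
            # Nothing but whitespace may remain after the last rule.
            return text[pos:].strip() == ""
        pos = match.end()
```

For example, looks_like_css("body { color: red; }") is accepted, while content with no rule blocks, or with trailing text after the last rule, is rejected.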

How to add a header to a large file in Camel?

I have finally managed to split a large file and re-aggregate it into smaller (but still very large) files.
At the end of writing, I have the count of records in each file. This count needs to be added to each of the smaller files as a header.
What is the best way to accomplish this in a performant way?
Possibilities I considered:
1. Write a separate file containing each header while the split data files are being generated. At the end, match up each header with its data file and concatenate them. I am running into issues with how to read the files in a non-polling way and how to trigger the concatenation phase. This would also require rewriting the entire big files.
2. Keep the file headers in a message header or on the exchange and, once all files are written, read all the files from the directory, find the matching header for each, and add it to the output. This would also require rewriting the entire big files.
3. Write a dummy header with placeholders for the count and then somehow modify the data files in place. This seems the most performant, but I'm not sure how to do it.
Header: 3 records
a
b
c
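The placeholder option can be sketched outside Camel as plain file I/O. The key assumption is that the header occupies a fixed number of bytes, so overwriting it later does not shift the records that follow. The sketch below is in Python; HEADER_WIDTH, write_placeholder and patch_header are hypothetical names, and the padded header ends up with trailing spaces that the consumer of the file would need to tolerate.

```python
HEADER_WIDTH = 40  # hypothetical fixed header length in bytes, newline included


def write_placeholder(f) -> None:
    """Reserve space for the header at the start of a file opened in binary mode."""
    f.write(b" " * (HEADER_WIDTH - 1) + b"\n")


def patch_header(path: str, record_count: int) -> None:
    """Overwrite the reserved space in place once the record count is known."""
    text = "Header: {} records".format(record_count).encode("ascii")
    if len(text) > HEADER_WIDTH - 1:
        raise ValueError("header does not fit in the reserved space")
    # "r+b" opens the existing file for update without truncating it;
    # the write starts at offset 0 and covers exactly HEADER_WIDTH bytes.
    with open(path, "r+b") as f:
        f.write(text.ljust(HEADER_WIDTH - 1) + b"\n")
```

Only the first HEADER_WIDTH bytes are rewritten, so the cost does not depend on the size of the data file.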

File extension detection mechanism

How does an application detect a file's extension?
I learned that every file has a header that contains information about that file.
My question is: how does an application use that header to identify the file?
Every file in a file system has some metadata associated with it. For example, if I change an audio file's extension from .mp3 to .txt and then open that file with VLC, VLC is still able to play it.
I found out that every file has a header section that contains information about that file.
How can I access that header?
Just to give you some more details:
A file extension is basically a way to indicate the format of the data (for example, TIFF image files have a format specification).
This way an application can check whether the file it handles is in the right format.
Some applications don't check the file format (or accept a wrong one) and just try to use the file as the format they need. So for your .mp3 file, the data in the file does not change when you simply change the extension to .txt.
When VLC reads the .txt file byte by byte and interprets it as MP3 data, it can still extract the correct music data from that file.
Some file formats also include a header for extra validation of what kind of data is inside the file. For example, a Unicode text file (should) include a BOM to indicate how the data in the file needs to be handled. An application can check whether the header tag matches the expected one, and so know for sure that your '.txt' file actually contains data in the MP3 format.
There are quite a few applications that read those header tags, but they are often specific to one format. This TIFF Tag Viewer, for example (I used it in the past to check the header tags of my TIFF files).
So you could either open your file in a hex viewer and look up in the format specification what each byte means, or search Google for a header viewer for the format you want to inspect.
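To make the header check above concrete, here is a small sketch in Python that reads the first bytes of a file and compares them with a few well-known signatures, ignoring the extension entirely. The function names and the short signature table are assumptions of this sketch. Real tools know far more formats, and some files (an MP3 without an ID3 tag, for instance) have no fixed leading signature at all.

```python
from typing import Optional

# A few well-known leading byte sequences ("magic numbers").
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"II*\x00", "TIFF image (little-endian)"),
    (b"MM\x00*", "TIFF image (big-endian)"),
    (b"ID3", "MP3 audio with an ID3v2 tag"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\xef\xbb\xbf", "UTF-8 text with BOM"),
]


def sniff_bytes(head: bytes) -> Optional[str]:
    """Return a format description if `head` starts with a known signature."""
    for magic, description in SIGNATURES:
        if head.startswith(magic):
            return description
    return None


def sniff_file(path: str) -> Optional[str]:
    """Identify a file by its first bytes; the file name is never consulted."""
    with open(path, "rb") as f:
        return sniff_bytes(f.read(16))
```

Because only the content is inspected, an MP3 file with an ID3v2 tag is recognized the same way whether it is named song.mp3 or song.txt.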

Combine a binary file and a .txt file into a single file in Python

I have a binary file (.bin) and a (.txt) file.
Using Python3, is there any way to combine these two files into one file (WITHOUT using any compressor tool if possible)?
And if I have to use a compressor, I want to do this with python.
As an example, I have 'file.txt' and 'file.bin', I want a library that gets these two and gives me one file, and also be able to un-merge the file.
Thank you
Just create a tar archive. A module that lets you accomplish this task is already bundled with CPython; it's called tarfile.
More examples are available in the tarfile documentation.
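A minimal sketch of the tarfile approach, assuming the two inputs are regular files and that storing them under their base names is acceptable. The function names merge_files and unmerge_files are hypothetical. Mode "w" writes a plain, uncompressed tar archive.

```python
import os
import shutil
import tarfile


def merge_files(paths, archive_path):
    """Bundle the given files into one uncompressed tar archive."""
    with tarfile.open(archive_path, "w") as tar:
        for path in paths:
            # Store each file under its base name only.
            tar.add(path, arcname=os.path.basename(path))


def unmerge_files(archive_path, dest_dir):
    """Write every regular file in the archive back out into dest_dir."""
    with tarfile.open(archive_path, "r") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            target = os.path.join(dest_dir, os.path.basename(member.name))
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
```

For the example in the question, merge_files(['file.txt', 'file.bin'], 'merged.tar') produces the single file, and unmerge_files('merged.tar', 'out') restores both files into an existing directory named out.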
There are a lot of solutions for compression!
gzip or zlib allow compression and decompression and could be part of a solution to your problem, although each compresses a single stream, so on their own they do not bundle two files into one.
Example of how to GZIP compress an existing file, from [http://docs.python.org]:
import gzip
import shutil

# Stream file.txt into a gzip-compressed copy; both files are closed automatically.
with open('file.txt', 'rb') as f_in, gzip.open('file.txt.gz', 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)
But tarfile is also a good solution!
Tar is the best solution if the output can be a binary file.
If you want the output to be text, you can use base64 to turn the binary file into text, then concatenate the two into one file (using some unique string, or another technique, to mark the point where they were merged).

How to Generate the HTTP Content-Type Header in C?

So I'm working on a networking assignment to produce a basic HTTP/1.0 web server in C. I have most of it figured out, but one of the requirements is that it properly populate the Content-Type field in the header, and I can't seem to find any way to do this automatically.
I'm already using fstat() to get the file size and when it was last modified, and I noticed that also includes a "st_objtype" field, but after some research it looks like this is just the AS/400 object type (which is obviously not what I need) and stat() & lstat() appear to do essentially the same thing as fstat().
Is there any way in C to automatically generate a string with the HTTP-style file type for a given file, or do I just need to make a big list of types and plug the correct value into the header based on the ending of the requested file (.txt, .html, .png, etc)?
Some examples of the Content-Type field for various files I checked:
Content-Type: text/html; charset=ISO-8859-1
Content-Type: image/png
Content-Type: application/x-gzip
Content-Type: application/pdf
Some systems contain a file called /etc/mime.types which contains a bunch of extensions and MIME type pairs.
See the documentation for such a file.
Probably the best approach is a lookup table based on extensions. The idea that a given file has a single "type" associated with it is just wrong. At least using extensions gives you a bit of power to control how the file contents are interpreted. For example if you wanted to show an example of how html source works, you could rename example.html to example.html.txt and have the client treat it as text/plain. If you just used a heuristic to determine that the file contents "are html", you'd be stuck.
A file is only a bag of bytes, so there is no way to know for sure that a .html file doesn't contain C code, for example. You could delegate to the file command, which contains a lot of heuristics for determining these kinds of things based on the first few bytes of a file (scripts start with #!, for instance), but file still isn't foolproof. I would recommend the lookup table based on filenames.
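The extension lookup table recommended above might look like the sketch below. It is written in Python for brevity; in C the same table could be an array of extension/type pairs searched after locating the last '.' in the file name. The table entries, the application/octet-stream fallback, and the name content_type_header are assumptions of this sketch.

```python
import os

# Hypothetical, deliberately small table; extend it as the server needs.
MIME_TYPES = {
    ".txt": "text/plain",
    ".html": "text/html",
    ".png": "image/png",
    ".gz": "application/x-gzip",
    ".pdf": "application/pdf",
}

# Assumed fallback for unknown or missing extensions.
DEFAULT_TYPE = "application/octet-stream"


def content_type_header(filename: str) -> str:
    """Build the Content-Type header line from the last extension of the name."""
    extension = os.path.splitext(filename)[1].lower()
    return "Content-Type: " + MIME_TYPES.get(extension, DEFAULT_TYPE)
```

Because only the last extension counts, content_type_header("example.html.txt") yields "Content-Type: text/plain", which is the renaming trick described above.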
