jMimeMagic returning mime type for docx, pptx, jar files as application/zip - mime-types

I read that the MIME type for a .docx file is application/vnd.openxmlformats-officedocument.wordprocessingml.document. But when I upload a .docx file (one that I just created, not one taken from a zip file) and check its MIME type in my application using
String mimeType = Magic.getMagicMatch(file1, false).getMimeType();
I get the MIME type application/zip.
I get the same result when I try to upload a .jar file.
Given this, how can I check whether the user is uploading an MS Word file or a JAR file to my application?

All of the .*x Office variants (.docx, .pptx, and so on) are XML-based content wrapped in a ZIP "container" to keep them compact. Your library is detecting the ZIP header correctly, but it is then either not checking for, or failing to find, the additional information that would allow it to distinguish those files from a ZIP file containing whatever random data someone put into it.
Similarly, the JAR file format is an extension of the ZIP file format, so if the library does not know to check for the "special type of ZIP" case, it would simply report it as a ZIP file.
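If the detection library stops at application/zip, one way to do the "special type of ZIP" check yourself is to open the archive and look for well-known entries. The following is only a sketch: ZipContainerSniffer and refineZipMimeType are made-up names, and the marker entries are the conventional ones (an OOXML package normally has [Content_Types].xml at its root plus a main part such as word/document.xml, and a JAR normally has META-INF/MANIFEST.MF). A file that deviates from these conventions will still be reported as a plain ZIP.

```java
import java.io.File;
import java.io.IOException;
import java.util.zip.ZipFile;

// Hypothetical helper (not part of jMimeMagic): refines a generic
// "application/zip" result by looking for marker entries in the archive.
final class ZipContainerSniffer {

    static String refineZipMimeType(File file) throws IOException {
        try (ZipFile zip = new ZipFile(file)) {
            // OOXML packages conventionally carry [Content_Types].xml at the root.
            if (zip.getEntry("[Content_Types].xml") != null) {
                if (zip.getEntry("word/document.xml") != null) {
                    return "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
                }
                if (zip.getEntry("ppt/presentation.xml") != null) {
                    return "application/vnd.openxmlformats-officedocument.presentationml.presentation";
                }
                if (zip.getEntry("xl/workbook.xml") != null) {
                    return "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
                }
            }
            // JAR files normally contain a manifest at this fixed path.
            if (zip.getEntry("META-INF/MANIFEST.MF") != null) {
                return "application/java-archive";
            }
            // No known marker found: treat it as an ordinary ZIP archive.
            return "application/zip";
        }
    }
}
```

You would call this only after the library has already reported application/zip, for example refineZipMimeType(file1), and treat the result as a hint rather than proof, since anyone can put those entries into an archive.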

Related

how to get the type of the file before its compression

For example, suppose we have the file file.txt, which after compression is named file.new (.new is the new extension). How can I recover the .txt extension that has been lost?
I need that to decompress the file.
In general, if you lose the file name extension you can't get it back. It's as simple as this.
However, there might be a chance, depending on the compression format. Some formats do store the original file name (along with other information) in the compressed file, and the "decompressor" will be able to recreate those properties.
Anyway, it's good practise to name a compressed file with an additional extension, in your case file.txt.new.
Oh, and you don't need to know the file name extension to uncompress the compressed file. Just uncompress it and give it a temporary name. As @MarcoBonelli said, file contents and file name extensions have no fixed relation. Extensions are just a convention for handling files conveniently.
For example: you can rename an EXE to DOCX. Windows will show the Word icon, but it is still an executable. Windows will not attempt to run it, though.
To know what a file contains can be difficult. The magic number Marco linked to might give you some hint.
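As a rough illustration of the magic-number idea, the sketch below reads the first few bytes of a file and compares them with a handful of well-known signatures (ZIP local file header "PK\x03\x04", gzip 0x1F 0x8B, bzip2 "BZh"). The class and method names are invented for this example, and the list of signatures is deliberately tiny. A custom format such as the .new one from the question would come back as "unknown" unless it defines its own signature.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: guesses a container format from its leading bytes.
final class MagicNumberSniffer {

    static String guessFormat(Path path) throws IOException {
        byte[] head = new byte[4];
        int filled = 0;
        try (InputStream in = Files.newInputStream(path)) {
            int read;
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (filled < head.length
                    && (read = in.read(head, filled, head.length - filled)) != -1) {
                filled += read;
            }
        }
        if (startsWith(head, filled, 0x50, 0x4B, 0x03, 0x04)) return "zip";
        if (startsWith(head, filled, 0x1F, 0x8B)) return "gzip";
        if (startsWith(head, filled, 0x42, 0x5A, 0x68)) return "bzip2";
        return "unknown";
    }

    // True if the first bytes of data match the signature exactly.
    private static boolean startsWith(byte[] data, int length, int... signature) {
        if (length < signature.length) return false;
        for (int i = 0; i < signature.length; i++) {
            if ((data[i] & 0xFF) != signature[i]) return false;
        }
        return true;
    }
}
```

This only tells you the outer format. It says nothing about what the file was called before compression.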

How to load a table with multiple zip files in Snowflake?

I'm trying to upload data to a Snowflake table using a zip file containing multiple CSV files, but I keep getting the following message:
Unable to copy files into table. Found character '\u0098' instead of
field delimiter ',' File 'tes.zip', line 118, character 42 Row 110,
column "TEST"["CLIENT_USERNAME":1] If you would like to continue
loading when an error is encountered, use other values such as
'SKIP_FILE' or 'CONTINUE' for the ON_ERROR option. For more
information on loading options, please run 'info loading_data' in a
SQL client.
If I skip the errors, some data loads, but it looks as if Snowflake is not properly opening the zip file: I just get random characters, as if the zip file had been opened in Notepad.
I tried changing the File Format Compression Method to each of the available ones: Auto, Gzip, Deflate, Raw Deflate, Bz2, Brotli, Zstd and None. Each gave a different error message.
I know my Zip file is compressed using the standard Deflate compression method but when I select this type I'm getting the following error:
Invalid data encountered during decompression for file: 'test.zip',compression type used: 'DEFLATE', cause: 'data error'
The "Auto" method gives the same error message as None.
I also tried zip files containing only one file and I get the same errors. The files that loaded correctly were an uncompressed one (CSV) and one compressed using GZ, but I need this to work with a zip file containing multiple CSVs.
A zip file is not a DEFLATE file, even though zip uses deflate. All the supported compression methods are single-file compression methods, whereas zip is a file archive format, which is why it can hold many files. It is similar to a tar.gz, which is also not supported.
Thus you will either need to uncompress your files yourself, in your S3 bucket, or alter your data export tool to produce a supported format.
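If you control a step before the upload, one option is to split the archive into one gzip file per entry, since the question reports that a GZ-compressed CSV loads fine. Below is a minimal Java sketch of that repacking step. ZipToGzip and repack are hypothetical names, it uses only java.util.zip, and it does not talk to Snowflake or S3 at all. You would still have to stage the resulting .gz files yourself.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: rewrites every file in a zip archive as its own .gz file.
final class ZipToGzip {

    static List<Path> repack(Path zipFile, Path outputDir) throws IOException {
        List<Path> written = new ArrayList<>();
        Files.createDirectories(outputDir);
        try (ZipInputStream zin = new ZipInputStream(Files.newInputStream(zipFile))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // Keep only the last path segment so an entry name cannot point
                // outside outputDir. Entries with the same base name overwrite
                // each other in this sketch.
                String name = Paths.get(entry.getName()).getFileName().toString();
                Path target = outputDir.resolve(name + ".gz");
                try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
                    byte[] buffer = new byte[8192];
                    int read;
                    // ZipInputStream.read returns -1 at the end of the current entry.
                    while ((read = zin.read(buffer)) != -1) {
                        out.write(buffer, 0, read);
                    }
                }
                written.add(target);
            }
        }
        return written;
    }
}
```

For example, repack(Paths.get("test.zip"), Paths.get("out")) would produce one .gz file per CSV in the out directory, and those could then be loaded with the Gzip compression setting.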
See the Snowflake CREATE FILE FORMAT documentation for the supported compression options.

What mime type should I use for CSV ZIP files?

I am trying to send an email using Simple Java Mail API. The email will contain a CSV ZIP file attachment. Should I use text/csv or application/zip for the mime type of the attachment?
A zip file is a zip file, no matter what it contains.
It should be application/zip.
It is not a CSV file, if you tried to parse it as CSV it would fail. It is not text/csv.
On Linux, you can find many of the known MIME types in the following file: /etc/mime.types.
See below to find out what mime is suitable for zip files.
grep zip /etc/mime.types
application/bacnet-xdd+zip xdd
application/epub+zip epub
application/gzip gz tgz
application/lpf+zip lpf
application/prs.hpub+zip hpub
application/tlsrpt+gzip
application/vnd.airzip.filesecure.azf azf
application/vnd.airzip.filesecure.azs azs
application/vnd.comicbook+zip cbz
application/vnd.d2l.coursepackage1p0+zip
application/vnd.dece.zip uvz uvvz
application/vnd.espass-espass+zip espass
application/vnd.etsi.asic-e+zip asice sce
application/vnd.etsi.asic-s+zip asics
application/vnd.exstream-empower+zip mpw
application/vnd.ficlab.flb+zip flb
application/vnd.gov.sk.e-form+zip
application/vnd.imagemeter.folder+zip imf
application/vnd.imagemeter.image+zip imi
application/vnd.iso11783-10+zip
application/vnd.laszip
application/vnd.logipipe.circuit+zip lcs lca
application/vnd.software602.filler.form-xml-zip zfo
application/vnd.stepmania.package smzip
application/zip zip
image/vnd.airzip.accelerator.azv azv
model/vnd.usdz+zip usdz
application/x-bzip2 bz2

Huffman compressed files with my own extension

I am working on a Java project that uses the Huffman algorithm to compress files. I want to create my own file extension, say .huff, for the compressed files, and when I right-click a file that has the .huff extension, I want a new menu option that decompresses it. I searched the web but did not find anything useful.
Any help would be appreciated.
To set the file extension, just append it to the file name with string concatenation (or String.concat) and use the result as the new file name. Simple as that.
String compressedName = filename + extension;
To decompress the compressed file, I suggest you write a method that takes the path to the file as an argument, checks that the file extension is correct, and then calls another method that decompresses the file.
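The naming and checking steps could look roughly like this. HuffNames and its methods are made-up names for illustration, and the actual Huffman decoding is left out. Appending .huff to the full original name (report.txt becomes report.txt.huff) means the original extension is still there after decompression.

```java
import java.util.Locale;

// Hypothetical helper for the .huff naming convention; no compression logic here.
final class HuffNames {

    static final String EXTENSION = ".huff";

    // "report.txt" -> "report.txt.huff"
    static String compressedName(String originalName) {
        return originalName + EXTENSION;
    }

    // True if the name ends with .huff (case-insensitive) and has something before it.
    static boolean isCompressed(String name) {
        return name.length() > EXTENSION.length()
                && name.toLowerCase(Locale.ROOT).endsWith(EXTENSION);
    }

    // "report.txt.huff" -> "report.txt"; rejects names without the extension.
    static String originalName(String compressedName) {
        if (!isCompressed(compressedName)) {
            throw new IllegalArgumentException("Not a " + EXTENSION + " file: " + compressedName);
        }
        return compressedName.substring(0, compressedName.length() - EXTENSION.length());
    }
}
```

The decompression entry point would call isCompressed first and only then hand the file to the decoder.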
There is nothing special about a file extension, it's just a part of the file name. To create a .huff file extension, just add .huff to the end of the file name.
To add the Windows context menu entry, see the question linked in the comments: How can I add a context menu to the Windows Explorer for a Java application?
I would recommend creating a batch script that will launch your program taking in the file to decompress as an argument.
Something similar to:
@echo off
java -cp <path-to-jar> <decompression main class> %1
Add in any other setup or program arguments you need. A registry entry might then look like this:
HKEY_CLASSES_ROOT\.huff\shell\Decompress huffman encoded file\command
"<path to batch file>" "%1"

What is file name extension .done?

What is the file name extension ".done"?
How should files with this extension be handled? For example, if the file name is filename.log.gz.done, can we just remove the .done extension and use the file?
What is the use of this file extension?
".done" is just a marker that signifies that the file is ready for consumption.
So yes, get rid of the done extension and use it.
More details can be found here: http://www.davsclaus.com/2010/12/camel-26-using-done-files-with-fileftp.html
The ".done" file extension may be appended onto any type of file, such as a .TXT or .LOG file, and may be found on an FTP server where multiple users have access to the file. The DONE file helps prevent a user from accessing a file that is not meant to be accessed. The ".done" extension should be removed in order to open the actual file.
This takes you to the source I'm quoting my answer from. You will see a clear explanation of the .done file extension, and it's also a great reference for any other file extensions that you might need information about.
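If you want to strip the marker programmatically, a small Java sketch could look like the following. DoneSuffix and stripDoneSuffix are invented names, and the sketch assumes the convention described in the answers above, where .done is simply appended to the real file name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: renames "filename.log.gz.done" to "filename.log.gz".
final class DoneSuffix {

    static final String SUFFIX = ".done";

    static Path stripDoneSuffix(Path file) throws IOException {
        String name = file.getFileName().toString();
        if (!name.endsWith(SUFFIX) || name.length() == SUFFIX.length()) {
            throw new IllegalArgumentException("Not a " + SUFFIX + " file: " + file);
        }
        Path target = file.resolveSibling(name.substring(0, name.length() - SUFFIX.length()));
        // Files.move without options fails if the target already exists,
        // so an existing file is never silently overwritten.
        return Files.move(file, target);
    }
}
```

After the rename, filename.log.gz can be opened like any other gzip file.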

Resources