Gzip Files: Extracting Does Not Work as Expected

I'm facing a very strange problem when working with gzip files. I'm trying to download this file: https://www.sec.gov/Archives/edgar/daily-index/2014/QTR2/master.20140402.idx.gz
When I view the contents of the file inside the archive, it looks perfect.
However, when I unzip the archive and open the extracted file, it is all gibberish.
Is something wrong with the file, or am I missing something here?

If I remember correctly, an idx file is a Java file. It can also be a plain-text format, which it is in this case.
On Linux, try running
gunzip master.20140402.idx.gz
This will extract it into an idx file, which you should be able to open with any text reader, such as vi, since vi can open pretty much anything.
On Windows, you can use WinZip from the command line:
wzunzip -d master.20140402.idx.gz
You can then open the extracted file in something like IE, Edge, or WordPad, which should display it as readable text.
EDIT:
I downloaded the file and was able to extract it and view it in vi, IE, and WordPad using the commands above, so if you are seeing gibberish, try downloading it again. It should be about 104 KB as a .gz file and 533 KB extracted.
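If you want to rule out a bad download without depending on a particular unzip tool, here is a minimal Python sketch. It checks the two-byte gzip signature and then decompresses the file in a streaming fashion. The function name gunzip_to is my own placeholder, and the file names you pass in are whatever you saved the download as.

```python
import gzip
import shutil

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def gunzip_to(src_path, dst_path):
    """Decompress src_path into dst_path; return the decompressed size in bytes."""
    with open(src_path, "rb") as raw:
        if raw.read(2) != GZIP_MAGIC:
            raise ValueError(f"{src_path} does not start with the gzip signature")
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams, so large files are fine
        return dst.tell()
```

Calling gunzip_to("master.20140402.idx.gz", "master.20140402.idx") and comparing the returned size with the one quoted above is a quick sanity check. A ValueError here would suggest that the saved file is not actually gzip data.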

Related

how to get the type of the file before its compression

For example, suppose we have a file file.txt that, after compression, is now file.new (new is the new extension). How can I recover the .txt extension that was lost?
I need that to decompress the file.

In general, if you lose the file name extension you can't get it back. It's as simple as this.
However, there might be a chance, depending on the compression format. Some formats store the original file name (along with other information) in the compressed file, and the "decompressor" will be able to recreate those properties.
Anyway, it's good practise to name a compressed file with an additional extension, in your case file.txt.new.
Oh, and you don't need to know the file name extension to uncompress the compressed file. Just uncompress it and give it a temporary name. As @MarcoBonelli said, file contents and file name extensions have no fixed relation. Extensions are just a convention for handling files conveniently.
For example: You can rename an EXE to DOCX. Windows will show the Word icon, but it is still an executable. Windows will not attempt to run it, though.
Knowing what a file contains can be difficult. The magic number Marco linked to might give you a hint.
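As one concrete case of a format that can store the original name: gzip has an optional name field in its header. The sketch below reads it. I am assuming here that your .new file is gzip data, which the question does not say, and gzip_original_name is a name I made up for illustration.

```python
import struct

FEXTRA, FNAME = 0x04, 0x08  # flag bits defined by the gzip format (RFC 1952)


def gzip_original_name(path):
    """Return the file name stored in a gzip header, or None if absent."""
    with open(path, "rb") as fh:
        header = fh.read(10)  # fixed part: magic, method, flags, mtime, xfl, os
        if len(header) < 10 or header[:2] != b"\x1f\x8b":
            raise ValueError("not a gzip file")
        flags = header[3]
        if flags & FEXTRA:
            # Skip the optional extra field: 2-byte little-endian length + data.
            (xlen,) = struct.unpack("<H", fh.read(2))
            fh.read(xlen)
        if not flags & FNAME:
            return None
        # The name is stored as zero-terminated Latin-1 bytes.
        name = bytearray()
        while (byte := fh.read(1)) not in (b"", b"\x00"):
            name += byte
        return name.decode("latin-1")
```

Not every compressor fills this field in, so a None result only means that the name was not recorded.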

Reconstruct odt file with missing content.xml file

I have an .odt file that's corrupt. I looked online and apparently if you can get to the content.xml file, there's a chance the file can be repaired. However, in my case, when I convert the file to a .zip and extract it, I don't have that file. However, the .odt file is 2.9MB and has content in it when you convert it to a .txt file.
How can I recreate the content.xml file from the .txt file?

You might not want to hear this, but depending on where the corruption happened, there is nothing you can do.
The idea behind the method you are describing is that if the corruption only concerns, for example, the styles.xml, you can still recover the contents by looking at content.xml. For more details on this, see https://en.wikipedia.org/wiki/OpenDocument_technical_specification#Format_internals
However, from your zip extract, it looks like the only uncorrupted file is styles.xml, which doesn't help you much.
What you can try is the following: rename your .odt file so that it ends in .zip, and then try to repair that archive using one of the many zip recovery tools available on the internet, until you get a valid content.xml file.
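To see which parts of the archive are still readable before and after a repair attempt, a short Python sketch like the following may help. It relies on the fact that an .odt file is a zip archive; the function name odt_members is hypothetical. If the zip directory itself is damaged, zipfile.ZipFile will raise BadZipFile before the loop even starts.

```python
import zipfile
import zlib


def odt_members(path):
    """Return the names of archive members that can be read without errors."""
    readable = []
    with zipfile.ZipFile(path) as archive:
        for info in archive.infolist():
            try:
                with archive.open(info) as member:
                    while member.read(65536):  # read fully so the CRC is checked
                        pass
                readable.append(info.filename)
            except (zipfile.BadZipFile, zlib.error, OSError, EOFError):
                continue  # this member is damaged; keep checking the others
    return readable
```

If "content.xml" is not in the returned list, the text of the document is not recoverable from the archive as it stands.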

Unzip single zipped file into parts?

I have a single file that is zipped. I want to unzip this, but I don't have enough space on the computer. Is there a way to unzip it in parts? For example, first I'd want to unzip the first quarter (or x GB), then stop, look at the resulting file, delete it, and then unzip the next part. The parts do not have to fit together perfectly to form a new file.
I'm using Windows.
EDIT
The original pre-zipped file is only 1 file. This single file was zipped, and now I need to unzip it, but in parts.

Assuming it's a text file:
I'm not sure how to do this natively in Windows, but this is very easy to do under unix. You can download Cygwin which will give you access to the unix tools that can do this.
Then you can do:
/cygdrive/c/yourDisk/
$ zcat yourFile.zip | sed -n 1,1000p > file1.txt
This will give you the first thousand lines in a file in c:\yourDisk\file1.txt
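If the file is not text, or you would rather not install Cygwin, the same idea can be sketched in Python: decompress the single member as a stream and keep only the byte range you are interested in. The function name extract_part and its parameters are my own, and it assumes the archive holds exactly one member, as described in the question.

```python
import zipfile


def extract_part(zip_path, out_path, start, length, chunk_size=1024 * 1024):
    """Write up to `length` decompressed bytes, beginning at offset `start`.

    Returns the number of bytes actually written (less than `length` if the
    member ends first).
    """
    with zipfile.ZipFile(zip_path) as archive:
        (member,) = archive.infolist()  # ValueError unless exactly one member
        with archive.open(member) as src, open(out_path, "wb") as dst:
            # Decompress and discard everything before `start`.
            remaining = start
            while remaining > 0:
                skipped = src.read(min(chunk_size, remaining))
                if not skipped:
                    break
                remaining -= len(skipped)
            # Copy the requested range one chunk at a time.
            written = 0
            while written < length:
                chunk = src.read(min(chunk_size, length - written))
                if not chunk:
                    break
                dst.write(chunk)
                written += len(chunk)
    return written
```

Each call has to decompress from the beginning again, so later parts take longer to reach, but only the requested range is ever written to disk.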

Use your Windows Explorer to explore the zip file... it allows you to open subdirectories (folders) and for you to navigate anywhere within the zip as if it was a normal folder.
When you find something you want to view, either double click it, or drag it to another location in your drive. If you drag it, you will end up copying whatever you are dragging to a new location (say, your temporary work area). Note, copying is not the same as moving as the original compressed version will continue to exist within the zipped folder.
When you have finished with whatever you dragged out, you can delete it (the copy) and return to your original and pull out more data/files for inspection.
Look at my attached image... notice the directory path where I have the red arrow. It says I opened a file called myzipfile.zip (I did a right mouse button over the file and clicked Open With... and selected Windows Explorer).

Code Injection in .png file

I am using libpng, a C library, to check whether a .png file is valid. If a file is valid, it passes the test. I want to inject shell code into it. How can I craft a .png file so that it is still a valid image file and also contains some shell code? Please tell me how this is possible. Thanks.

Well, AFAIK there is no way to inject code into a png file and have it executed. But you can embed your png file in a shell script and display it afterwards. You would, however, have to convince the person you are targeting to make the file executable and to open the so-called png file from a terminal.
The procedure is:
Create a text file, call it executeme.png
Paste the following code into it, note that there shouldn't be any new line at the end of the file.
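#!/bin/bash
PNG_FILE=$(mktemp /tmp/hack.XXXXXXX.png)
# Find the line number just after the marker line.
ARCHIVE=$(awk '/^__ARCHIVE_BELOW__/ {print NR + 1; exit 0; }' "$0")
# Copy everything after the marker (the appended png) to the temp file.
tail -n +"$ARCHIVE" "$0" > "$PNG_FILE"
# whatever you want to do is here!
xdg-open "$PNG_FILE"
exit 0
__ARCHIVE_BELOW__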
Append your original png file using cat injectme.png >> executeme.png.
Make executeme.png executable.
If you run the executeme.png from terminal, the original png file will be shown using the default image viewer, and your injected code will be run.
Note: I don't believe there is someone so stupid to execute that file.
Note2: On Ubuntu, executeme.png cannot be executed from file managers because it's tried to be opened using the file manager due to the png extension. You may rename file executeme.png to execute.\rpng (append a carriage return before png after dot) so at first it looks like a png file, since its extension is not png it will be executed with double click if it's executable. To make that renaming, you may need to use terminal.
Have a good time hacking! :D
Further reading: Linux journal, making installers

Is it possible to detect file format and encoding of file using batch files?

Is it possible to detect file format and encoding of file using batch files? And if a particular file is not of intended format, throw an error?

As a *nix guy, I'd want to jump for something more powerful than a batch file, such as Python. (or a shell script, but I'm assuming you're using Windows --- you might look into PowerShell, but I've never tried it.)
Unix has a great utility for this sort of thing, named file. There appears to be a Windows version here: http://gnuwin32.sourceforge.net/packages/file.htm
Basically, you run file [your filename here] and file spits out a blurb about the file. For example:
$ file zdoom-2.4.1-src.7z
zdoom-2.4.1-src.7z: 7-zip archive data, version 0.3
It's not always right, and it doesn't mean that if file says "this is a JPEG" that the file is actually a JPEG: it could be corrupt, etc.
Also, if I rename the above 7z archive to "foo":
$ file foo
foo: 7-zip archive data, version 0.3
... file will still get it.
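Since I mentioned Python: here is a rough sketch of the same idea, checking a few magic numbers and byte-order marks and raising an error when the file is not the intended format. The signature table is a tiny hand-picked subset, nowhere near what file knows, and the names sniff and require_format are my own.

```python
import codecs

# A few well-known signatures; a real tool such as `file` knows far more.
SIGNATURES = [
    (b"\x1f\x8b", "gzip"),
    (b"PK\x03\x04", "zip"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"7z\xbc\xaf\x27\x1c", "7z"),
    (b"%PDF-", "pdf"),
]

# UTF-32 marks come first: the UTF-32-LE mark begins with the UTF-16-LE one.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff(path):
    """Return (format, encoding) guesses from the first bytes; either may be None."""
    with open(path, "rb") as fh:
        head = fh.read(16)
    fmt = next((name for magic, name in SIGNATURES if head.startswith(magic)), None)
    enc = next((name for bom, name in BOMS if head.startswith(bom)), None)
    return fmt, enc


def require_format(path, expected):
    """Raise ValueError unless the file's signature matches `expected`."""
    fmt, _ = sniff(path)
    if fmt != expected:
        raise ValueError(f"{path}: expected {expected}, detected {fmt}")
```

A text file without a byte-order mark gives (None, None) here, so this only detects encodings that announce themselves. Guessing the encoding of unmarked text needs heuristics that are beyond this sketch.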