Decoding an .iwa file - file-format

I am looking to view the contents of an .iwa file in my Apple Numbers project. It seems like the format is a protobuf wrapped in snappy. I was wondering if there was a relatively simple/crude way to view the text contents of the data. I tried with a basic hexdump but it just gives me some gibberish, probably because of the snappy encoding.
What might be a way to view this data? As an example, here is a sample file: https://drive.google.com/file/d/1-AZdpuoshfmpjxMfi7ARKhneeUNmW9Xx/view?usp=sharing
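One crude way to look for text without decoding Snappy or protobuf at all is to pull out runs of printable bytes, much like the Unix strings tool does. Snappy stores literal runs uncompressed, so some text usually survives, but repeated substrings are replaced by back-references, so words may come out truncated. The sketch below is only an illustration: the function name printable_runs and the file name sample.iwa are hypothetical, and it only finds ASCII text.

```python
import re

def printable_runs(data: bytes, min_len: int = 4):
    """Return every run of at least min_len printable ASCII bytes."""
    # 0x20-0x7e covers space through tilde; anything else ends a run.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

if __name__ == "__main__":
    # "sample.iwa" is a placeholder path, not a name from the question.
    with open("sample.iwa", "rb") as fh:
        for run in printable_runs(fh.read()):
            print(run)
```

For complete text, the Snappy layer would have to be decompressed properly first; this only shows what can be seen without doing so.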

Related

What is the difference between .bin and .dat file?

When writing to a binary file, when should I use .bin vs .dat? If I'm just trying to store information not meant to be read by humans, like item description/serial number pair, does it matter which one I pick if I'm just trying to make it unreadable from a text editor?
Let me give you some brief details about these files:
.BIN file: The BIN file type is primarily associated with 'Binary File'. Binary files are used for a wide variety of content and can be associated with a great many different programs. In general, a .BIN file will look like garbage when viewed in a text editor.
.DAT file: The DAT file type is primarily associated with 'Data'. It can be just about anything: text, graphics, or general binary data, either in a special format or in ASCII.
Reference:
Abhijit Banerjee answered that question on quora
.dat is a more frequently used suffix for binary data. It doesn't matter what extension you pick, as long as you are on Unix or Linux based systems.
Suffixes can mean whatever you want them to mean... Those rules are more like guidelines than actual rules...
However, BIN looks like it is short for binary, so a BIN file will likely hold data in binary form. DAT looks like it is short for data, so a DAT file will contain information in whatever format the developer of the program that reads the file sees fit (ASCII, binary, a mix of them, something else entirely).
On a UNIX system, there is no difference. The extensions are interchangeable.
If you do not put any extension on the file, it is harder for someone who does not know what the extension should be to open it. Additionally, on Unix or Linux, a file whose name begins with a dot (period) is hidden from the default directory listing.

How to read content of unknown file

I have a file that holds manufacturing orders for a machine.
I would like to read the content of this file and edit it, but when I open it in a text editor, e.g. Notepad++, I get a bunch of weird characters:
xÚ¥—_HSQÀo«a)’êaAXŽâê×pD8R‰¬©s“i+ƒ´#¡$
-þl-ó/ÓíºIúPôàƒHˆP–%a&RÎÈn÷ü¹·;Ú;ç<ìòÝÃý}¿ó}‡{϶«rWg>˜›ãR‡)Çn0³Ûf³yÎW[5–šw½ÇRW{ñ’rO6¹ŽŸp¦ÙœcÏ.9yÀnýg
)Ë—e90ejÕø£rC. f¦}3ËŒ˜hü”å1g[…ø±ú ÜJøz®‹˜YfÈ,4`ŽKÉ—ù“ÔË¿d„þlG3#=˜Ž´+hF¬¦£€«šm¿áØ
ïÖµv‡ËpíÍ~™‡Aù
šëÈÚ]ÿç™DŒÉFØ ïƒæsij  ¦y=-74Æ/t=ÕŠr\˜š»Âä‰Ý­¨žã΢
dz·à‡'fœ½­yâ½4qåPjácòÄŒeÊhñ“ý™ÙÎÕ÷5ôlñ=˜Õ{ú;ø=Û;4OêYä>Ìpxbæâ­'è"oëB×1gQ9“'¹]Ô³’Ô³ø!ÌózÞyŸõžÓIŽù*&OÌXPÕ"ŽWžpíOÌè‚Þ3Òr0{Ž†R=_?…/¼žÞ0,ê=/?£ûÓËîy“2Z<ij³[ËÁì™÷–ôžÎ’Ããa÷<Maêéí…¼ž}©žYýZ-˜=­”á¤}π>3°¢÷œ$ïè‰3ìž«ƒÄs¿—xnŒÀ*¯gi$ÕómDËÁìùIeоû‡À¬?3°x¾"~ª§c˜öÝÇî颌°›x¾Fßb>Ï}QXÓ{öFi-êÙßóR”œe^Ñ÷ü‘¿g[Lë ŽwJZϘë¹3”³L©gH‚,^Ïe 2ôžWGøëÙ2‚Î
øœL¾ÅqÈäõ,ýç\œË3¾þeྗ&`Ϻ<KÒf“’»ðù]í‰ãžU^wèþåÔÖy”H}ò•6ø6
It looks like the file is encoded.
Any idea how to find the encoding and make the file readable and editable?
It's binary and probably encoded, so without knowledge of the data structure you can't do much beyond reverse engineering: change something, check what changed, and work in a hex editor.
It isn't impossible, tho. If you can change the data in a way you know (e.g. change the number of orders from 1 to 2) and export to a file, you can compare the binary values and find which byte holds that number. Of course, if it is encrypted and you don't know the key... It's easier to find another way.
For further reading, check this out - https://en.wikibooks.org/wiki/Reverse_Engineering/File_Formats
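The compare-two-exports idea above can be sketched in a few lines of Python. This is a minimal illustration, not a full diff tool: the names diff_bytes and compare_files are made up here, and it only compares bytes at the same offset, so it is most useful when the two exports have the same length.

```python
def diff_bytes(old: bytes, new: bytes):
    """Return (offset, old_byte, new_byte) for every position that differs."""
    # zip stops at the shorter input, so only the common prefix is compared.
    return [(i, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]

def compare_files(path_a: str, path_b: str) -> None:
    """Print the differing offsets between two files in hex."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a, b = fa.read(), fb.read()
    if len(a) != len(b):
        print(f"lengths differ: {len(a)} vs {len(b)} bytes")
    for offset, x, y in diff_bytes(a, b):
        print(f"0x{offset:08x}: {x:02x} -> {y:02x}")
```

If the file is compressed or encrypted, a one-field change will typically alter many bytes at once, which is itself a useful hint about the format.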
If you've got access to a Linux box why not use
hexdump -C <filename>
You will be able to get a much better insight into how the file is structured, than by using a text editor.
There are also many "hexdump" equivalent commands on Windows
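Once the first bytes are visible in hex, they can be compared against well-known signatures. For instance, the dump in the question begins with the characters xÚ; if the editor rendered the bytes as Latin-1, that would be 0x78 0xDA, a common zlib header. The sketch below is a guess-and-try helper under that assumption: the signature table is a small hand-picked sample, the function names are hypothetical, and it must be run on the original file, since text pasted from an editor will have lost bytes.

```python
import zlib

# A few common signatures; a real tool would know many more.
SIGNATURES = [
    (b"\x78\x01", "zlib stream"),
    (b"\x78\x9c", "zlib stream"),
    (b"\x78\xda", "zlib stream"),
    (b"\x1f\x8b", "gzip"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"%PDF", "PDF"),
]

def guess_format(head: bytes) -> str:
    """Guess a format name from the leading bytes of a file."""
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    return "unknown"

def try_zlib(data: bytes):
    """Return the decompressed bytes, or None if data is not valid zlib."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        return None
```

If try_zlib returns something, the result may be plain text or yet another binary layer that needs the same treatment.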

identifying data file type

I have a huge 1.9 GB data file without an extension that I need to open and get some data from. The problem is that I need to know what extension it should have and what software I can open it with to view the data in a table.
Here is the picture:
It's only a 2-line file. I already tried opening it as CSV in Excel, but it did not work. Any help?
I have never used it, but you could try this:
http://mark0.net/soft-tridnet-e.html
explained here:
http://www.labnol.org/software/unknown-file-extensions/20568/
The third "column" of that line looks 99% likely to be output from PHP's print_r function (with newlines imploded so it can be stored on a single line).
There may not be a "format" or program to open it with if it's just some app's custom debug/output log.
A quick Google search found a few programs that split large files into smaller units. That may make it easier to load into something (may or may not be n++) for reading.
It shouldn't be too hard to mash out a script to read the lines and reconstitute the session "array" into a more readable format (read: vertical, not inline), but it would be a one-off custom job, since no one other than the holder of your file would have a use for it.
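The splitting step mentioned above does not need a separate program. Here is a minimal Python sketch that cuts a file into fixed-size pieces without loading it all into memory; the function name split_file and the part_0000 naming scheme are assumptions made for illustration. Because it cuts at byte offsets rather than line breaks, a piece may start or end in the middle of a record.

```python
import os

def split_file(path, chunk_size, out_dir):
    """Write consecutive chunk_size-byte pieces of path into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_size)  # at most chunk_size bytes in memory
            if not chunk:
                break
            part = os.path.join(out_dir, f"part_{index:04d}")
            with open(part, "wb") as dst:
                dst.write(chunk)
            parts.append(part)
            index += 1
    return parts
```

For a 1.9 GB file, a chunk size of something like 50 * 1024 * 1024 bytes gives pieces that most text editors can open.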

combine a binary file and a .txt file to a single file in python

I have a binary file (.bin) and a (.txt) file.
Using Python3, is there any way to combine these two files into one file (WITHOUT using any compressor tool if possible)?
And if I have to use a compressor, I want to do this with python.
As an example, I have 'file.txt' and 'file.bin', I want a library that gets these two and gives me one file, and also be able to un-merge the file.
Thank you
Just create a tar archive. A module that lets you accomplish this task is already bundled with CPython, and it's called tarfile.
more examples here.
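As a rough sketch of the tarfile approach, the two helpers below pack files into an uncompressed tar archive and unpack them again. The names bundle and unbundle are made up for this example; the archive stores each file under its base name only, so two inputs with the same file name would collide.

```python
import os
import tarfile

def bundle(paths, archive_path):
    """Pack the given files into an uncompressed tar archive."""
    with tarfile.open(archive_path, "w") as tar:
        for p in paths:
            # Store only the file name, not the full directory path.
            tar.add(p, arcname=os.path.basename(p))

def unbundle(archive_path, out_dir):
    """Extract every member of the archive into out_dir."""
    with tarfile.open(archive_path, "r") as tar:
        tar.extractall(out_dir)
```

Usage would look like bundle(["file.txt", "file.bin"], "merged.tar") followed later by unbundle("merged.tar", "restored"). Only extract archives you created or trust.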
There are a lot of solutions for compressing!
gzip or zlib would allow compression and decompression and could be a solution for your problem.
Example of how to GZIP compress an existing file from [http://docs.python.org]:
import gzip
import shutil

# Compress file.txt into file.txt.gz; the with-blocks close both files.
with open('file.txt', 'rb') as f_in:
    with gzip.open('file.txt.gz', 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)
but also tarfile is a good solution!
Tar is the best solution if you want a binary file.
If you want the output to be text, you can use base64 to transform the binary file into text data, then concatenate the two into one file (using some unique string, or another technique, to mark the point where they were merged).
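The base64 idea can be sketched as follows. The marker string and the function names merge and unmerge are assumptions made for this example. Putting the base64 part first is a deliberate choice: base64 output never contains a hyphen or a newline, so the first occurrence of the marker is always the real boundary, even if the text file happens to contain the marker itself.

```python
import base64

# Hypothetical separator; any string containing non-base64 characters works.
MARKER = "\n-----MERGE-BOUNDARY-----\n"

def merge(binary: bytes, text: str) -> str:
    """Return one text blob holding the base64 binary, a marker, then the text."""
    encoded = base64.b64encode(binary).decode("ascii")
    return encoded + MARKER + text

def unmerge(merged: str):
    """Split a merged blob back into (binary, text)."""
    # Split on the first marker only; later ones belong to the text.
    encoded, text = merged.split(MARKER, 1)
    return base64.b64decode(encoded), text
```

The cost of this approach is size: base64 makes the binary part roughly a third larger.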

How to read lines from a pdf file into a c program using ghostscript?

I am currently taking a course in C programming, and for our final project we need to read some text from a pdf into a string, so we can manipulate the string.
In essence, what I am looking for is something similar to this, only with a .pdf instead of a .txt file.
char line[256];
FILE *fp = fopen("myfile.txt", "r");
fscanf(fp, " %255[^\n]", line);
I have no experience with Ghostscript, so I have no idea if this is even possible, although we were told that we should use Ghostscript.
The current version of Ghostscript includes the 'txtwrite' device, which will extract text from any supported input (PostScript, PDF, XPS, PCL) and will emit it in a variety of forms.
The UTF-8 output would probably be most useful to you.
Caveat! Many things which appear to be text in PDF files are not text, and no attempt is made to deal with these.
ps2ascii is deprecated with the release of the txtwrite device, but in any case it's perfectly capable (despite the name) of dealing with PDF as an input.
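A minimal sketch of driving the txtwrite device from a script is shown below, in Python for brevity. It assumes the Ghostscript executable is on the PATH under the name gs (on Windows it is usually named differently), and the function names are hypothetical. The switches used are standard Ghostscript options: -q for quiet, -dNOPAUSE and -dBATCH for non-interactive use, -sDEVICE to pick the device and -sOutputFile to name the output.

```python
import subprocess

def txtwrite_command(pdf_path, txt_path, gs="gs"):
    """Build the Ghostscript command line that extracts text from a PDF."""
    return [
        gs,
        "-q",                        # suppress startup messages
        "-dNOPAUSE",                 # do not prompt between pages
        "-dBATCH",                   # exit when the input is processed
        "-sDEVICE=txtwrite",         # text extraction device
        f"-sOutputFile={txt_path}",  # where the extracted text goes
        pdf_path,
    ]

def pdf_to_text(pdf_path, txt_path):
    """Run Ghostscript; raises CalledProcessError if it fails."""
    subprocess.run(txtwrite_command(pdf_path, txt_path), check=True)
```

The resulting text file can then be read from the C program with the usual fopen and fscanf or fgets calls.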
I can't think why anyone assigned you this project. PDF files are not text files, and cannot be treated as such. In addition to the fact that PDF files are generally compressed, identifying the contents stream and all the other streams it relies on (which may themselves include text) is non-trivial. Plus, the text is often encoded in a way which can be difficult to understand (this is particularly true of CIDFonts and TrueType fonts).
Perhaps your tutor expected you to first become expert in the PDF format, but that seems excessive for a C course.
You can convert your PDF to PostScript using pdf2ps, and then to ASCII using ps2ascii. You already know how to read ASCII.
Both utilities mentioned are in the Ghostscript package.
