Is file format different from file extension? - file

I don't understand how files are stored. I opened my text editor, wrote down some gibberish and saved it as .pdf, and then tried to open it with a pdf reader. The pdf reader could not open it. Someone please explain what happened here?

Let's say you have a folder system in your office, that says "English documents go into brown folders; Spanish documents go into pink folders; French documents in red folders; Japanese documents in white folders;..."
If you put "これは英語じゃね〜よばか" into a brown folder, it doesn't make the text English. It just means you put it into a wrong folder. If you put "egu egu egu egu egu" into a Japanese folder, it doesn't stop being gibberish.
A file extension is the part of a file name that we use, by convention, to mark what kind of content a file has (kind of like a folder colour). A file format is the structure of that content (kind of like the language you need to know to read the paper inside).
Extensions are primarily for users, so we know what each file holds. They are also a shortcut that lets the OS open a file with the application it thinks is suited to it (just like one could see a brown folder and decide it should go to the English-speaking employee).
Just like a Japanese speaker will be able to read a Japanese text in a brown folder (if explicitly handed one), a PDF reader will be able to read a PDF-format document, whatever the extension (if you make the PDF reader open it, rather than relying on the OS to figure out the correct application). Just like no one can read "egu egu egu egu egu" despite its folder's claim that it is Japanese, the PDF reader is confused when it finds non-PDF content inside a file with a .pdf extension.

File extensions and file formats are often spoken about
interchangeably. In reality, however, a file extension is just the
characters that appear after the period, while the file format speaks
to the way in which the data in the file is organized.
So, in your example, you created a file in plain-text format and manually changed its extension to .pdf. The PDF reader assumed it could open the file (since it has a .pdf extension), but it could not, because the content is plain text rather than PDF.
To sum up, you can change the extension, but that does not change the format. To change the format, you have to use some kind of converter.
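To make the difference concrete, here is a minimal Python sketch that ignores the extension and looks at the first bytes of the file instead. PDF files start with the signature %PDF-; the function name looks_like_pdf is made up for this example, and a real PDF reader validates far more than the first five bytes.

```python
PDF_MAGIC = b"%PDF-"  # signature found at the start of PDF files


def looks_like_pdf(path) -> bool:
    """Return True if the file starts with the PDF signature.

    Only the content is inspected; the file name and its
    extension play no part in the answer.
    """
    with open(path, "rb") as f:
        return f.read(len(PDF_MAGIC)) == PDF_MAGIC
```

A text file renamed to notes.pdf would fail this check, while a genuine PDF renamed to report.txt would still pass it.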

The file extension is part of the file name, so it's just a label. You won't change the file format by changing its label, just as changing the name of the shortcut on your desktop from "MS Word" to "Battlefield 2" won't magically allow you to play this game for free :)

Related

how to get the type of the file before its compression

For example, if we have the file file.txt, which after compression becomes file.new (new is the new extension), how can we obtain the .txt extension that has been lost?
I need that to decompress the file.
In general, if you lose the file name extension you can't get it back. It's as simple as this.
However, there might be a chance, depending on the compression format. Some formats store the original file name (along with other information) in the compressed file, and the decompressor will be able to recreate those properties.
Anyway, it's good practice to name a compressed file with an additional extension, in your case file.txt.new.
Oh, and you don't need to know the file name extension to uncompress the compressed file. Just uncompress it and give it a temporary name. As @MarcoBonelli said, file contents and file name extensions have no fixed relation. Extensions are just a convention for handling files conveniently.
For example: you can rename an EXE to DOCX. Windows will show the Word icon, but the file is still an executable. Windows will not attempt to run it, though.
To know what a file contains can be difficult. The magic number Marco linked to might give you some hint.
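As an illustration of a format that does keep the original name, here is a small Python sketch using gzip, whose header has an optional FNAME field (RFC 1952). The question does not say which format .new is, so gzip is only an assumption for the example, and the helper names compress_with_name and gzip_original_name are made up.

```python
import gzip
import io


def compress_with_name(payload: bytes, original_name: str) -> bytes:
    """Gzip the payload and record original_name in the gzip header."""
    buf = io.BytesIO()
    # With fileobj given, filename is only used for the header's FNAME field.
    with gzip.GzipFile(filename=original_name, mode="wb", fileobj=buf) as gz:
        gz.write(payload)
    return buf.getvalue()


def gzip_original_name(data: bytes):
    """Return the name stored in a gzip header, or None if none was stored."""
    if len(data) < 10 or data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    flags = data[3]
    pos = 10  # the fixed part of the header is 10 bytes long
    if flags & 0x04:  # FEXTRA: 2-byte little-endian length, then the extra data
        pos += 2 + int.from_bytes(data[pos:pos + 2], "little")
    if not flags & 0x08:  # FNAME flag not set, so no name was stored
        return None
    end = data.index(b"\x00", pos)  # the name is zero-terminated
    return data[pos:end].decode("latin-1")
```

If the stored name comes back as file.txt, the lost extension is recovered; if the function returns None, the compressor did not record a name and you are back to guessing from the content.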

How would I store different types of data in one file

I need to store data in a file in this format
word, audio, jpeg
How would I store all of that in one file? Is it even possible, or would I need to store links to other data files in place of the audio and JPEG? Would I need a custom file format?
1. Your own filetype
As mentioned by @Ken White, you would need to create your own custom file format for this sort of thing, which would then mean writing your own parser. This could be done in almost any language you want, but since you are planning on using Word format, maybe C# would be best for you. However, this technique can get quite complicated, and thoroughly testing your file compressor / decompressor can take a relatively large amount of time, but it may be the best option depending on your needs.
2. Command line utilities
Another way to go about this would be to use a shell script to combine all of the files into one file, and then split it apart again at the other end. For example, the steps could involve:
Combine the files using the Windows copy / Linux cat command on the command line.
Create a metadata file of your own that says how many files are in this custom file, and how many bytes each one takes up (could be a short XML or JSON file, for example).
Use the Linux split command or install a Windows command-line file splitter program (here's just one example) to split the file back into the components that make it up.
This way you only have to create a really small file type, and let the OS utilities handle the combining for you.
Example on Windows:
Copy all of the files in your current directory into one output file called 'file.custom'
copy /b * file.custom
Generate a custom metadata file describing the contents (e.g. get each file's size on disk; C# example here). This is just what I might do in JSON. SO formatting was being annoying, so here's a link (copy and paste it into an editor or online JSON viewer).
Use a Windows / Linux command-line splitting tool to cut each file back out at the exact length (and export it under the exact name) specified in the JSON metadata file. (More info on splitting files in this post.)
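The same idea (concatenated payloads plus a metadata index of names and sizes) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the index is stored at the front of the same file rather than in a separate metadata file, the layout (4-byte length, JSON index, raw payloads) is invented for the example, and pack / unpack are hypothetical names.

```python
import json
import struct


def pack(files: dict) -> bytes:
    """Combine {name: bytes} into one blob.

    Layout (made up for this sketch): 4-byte big-endian index length,
    JSON index listing each name and size, then the payloads back to back.
    """
    index = [{"name": name, "size": len(data)} for name, data in files.items()]
    header = json.dumps(index).encode("utf-8")
    return struct.pack(">I", len(header)) + header + b"".join(files.values())


def unpack(blob: bytes) -> dict:
    """Split a blob produced by pack() back into {name: bytes}."""
    (header_len,) = struct.unpack(">I", blob[:4])
    index = json.loads(blob[4:4 + header_len].decode("utf-8"))
    files, pos = {}, 4 + header_len
    for entry in index:
        # Each payload is cut out at exactly the size recorded in the index.
        files[entry["name"]] = blob[pos:pos + entry["size"]]
        pos += entry["size"]
    return files
```

Writing the result of pack() to file.custom gives you a single file, and unpack() restores each part at the exact length recorded in the index.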
3. ZIP files
You could always store all of the files in a compressed ZIP file, and then just use a zip compressor / expander as and when you like to retrieve any of the file formats stored within.
I found a couple of examples of:
Combining multiple files into one ZIP file in only C# .net,
Unzipping ZIP files in C#
Zipping & Unzipping with only windows built-in utilities
Zipping & Unzipping in Linux command line
Good Zipping/Unzipping library in Java
Zipping/Unzipping in Python
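For the Python route, a minimal sketch with the standard library's zipfile module could look like this. The function names bundle and unbundle are made up for the example, and everything is kept in memory to keep it short.

```python
import io
import zipfile


def bundle(files: dict) -> bytes:
    """Store {name: bytes} entries in a single in-memory ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)  # the archive records each member's name
    return buf.getvalue()


def unbundle(blob: bytes) -> dict:
    """Read every member of a ZIP archive back into {name: bytes}."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}
```

The Word document, the audio and the JPEG each keep their own format inside the archive; the ZIP file is just the container around them.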

How to remove specific characters from a file name?

I have a bunch of files that need to have a dot (.) removed from the file name, e.g. "Mr.-John-Smith.jpg" to "Mr-John-Smith.jpg". I don't have real experience with programming and know only HTML/CSS and a little JavaScript. I found another identical question here on Stack Overflow, but from what I gathered it was solved on a Linux system using Bash.
Anyways, if anyone could provide me a tutorial on which program to use for this and what code to execute in that program to make it happen I'd be grateful.
If you are using a Windows environment (which I guess you are), you can download this free utility to mass-rename files.
Main page:
http://www.bulkrenameutility.co.uk/Main_Intro.php
Download page:
http://www.bulkrenameutility.co.uk/Download.php
It's easy to use.
If your file names are listed in a file:
1- Open Microsoft Word or any text editor. Press Ctrl+H, search for "." (without the quotes) and replace it with nothing.
2- That removes all dots, so put the "." back in front of the file extension (such as .jpg or .png): search for the extension, for example "jpg", and replace it with ".jpg".
This works 100%; I use this method every time.
If they are not in a file and you want to rename the files directly in your operating system's file system:
Try this program. It is very useful for this operation:
Download
To remove all dots except the extension dot from all files in a directory, you can use PowerShell, which comes with newer versions of Windows and can be downloaded for older versions.
Line breaks are inserted for readability; this should go on one line:
PS> dir | rename-item -newname {
[System.IO.Path]::GetFileNameWithoutExtension($_.name).Replace(".","") +
[System.IO.Path]::GetExtension($_.name); }
What it does is to take the file name without an extension and remove all dots in it, and then add back the extension. It then renames the file to the resulting name.
This will change for example do.it.now to doit.now, or in your case, Mr.-John-Smith.jpg to Mr-John-Smith.jpg.
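If you would rather not use PowerShell, the same logic can be sketched in Python. This is an illustration only: strip_inner_dots and rename_in are names I made up, it does not treat dotfiles such as .bashrc specially, and it does not check whether the new name already exists, so try it with dry_run=True first.

```python
from pathlib import Path


def strip_inner_dots(name: str) -> str:
    """Remove every dot except the one in front of the extension."""
    p = Path(name)
    # stem is the name without its last suffix; suffix is the last ".ext"
    return p.stem.replace(".", "") + p.suffix


def rename_in(directory, dry_run=True):
    """Rename files in directory; return the (old, new) name pairs.

    With dry_run=True nothing is renamed, the planned changes are only listed.
    """
    changes = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # leave sub-directories alone
        new_name = strip_inner_dots(path.name)
        if new_name != path.name:
            changes.append((path.name, new_name))
            if not dry_run:
                path.rename(path.with_name(new_name))
    return changes
```

For example, strip_inner_dots("Mr.-John-Smith.jpg") gives "Mr-John-Smith.jpg", and rename_in(".") lists what would change in the current directory without touching anything.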

sublime text suffix mapping

Is it possible to map the suffix of files opened by Sublime Text 2 to another language to get text coloring? For example, when I open a *.cu file, Sublime Text does not recognize it as a known file extension, so it opens the file with no text coloring at all. However, if you open a PHP or C++ file, the extension is recognized and there is text coloring, making the file much easier to read.
I know TextWrangler has this feature: you can map the *.cu file extension to C++, so when you open a .cu file, the text coloring is the same as for a C++ file.
Yes, it is.
Please see the section "Map file formats to syntax highlighting" in:
http://opensourcehacker.com/2012/05/11/sublime-text-2-tips-for-python-and-web-developers/

Is there a possibility of storing a .txt file in a .exe file or at least hiding it from user view?

Is it possible to include a text file in a .exe file, or at least hide it from the user's view? For example, I have a target.exe that opens and reads the contents of a data.txt file, and it works perfectly on my computer. But when I transfer target.exe without data.txt to another computer, which does not have the required data.txt, it results in an error.
data.txt holds confidential text, for example contact info. When someone runs the .exe, they should enter a name and the contact info for that name is displayed, but this only works if data.txt is present. I want data.txt to be hidden so that it cannot be accessed normally; the data in it should be accessible only through the .exe. How could I solve this? Remember that I should give my friends only the .exe file, and using that .exe they should be able to save their data and display contact info. Does anyone have an idea how to do this?
Yes, you can include any user data as a resource and link it into your .exe:
Resources in Windows
Resource compiler reference
LoadResource
If the exe opens the file, then it is nearly impossible to prevent users from accessing the contents of that file. If you store it as a raw resource, then one can use a resource editor to view it. If you do some sort of basic encryption, then using Process Explorer, one could view the strings in the process to see the information while the program executes. You could use DRM-style protections, but that seems like overkill.
The answer to your general question is yes, you can store resources in an EXE file and then the EXE can open and load those resources at runtime.

Resources