Use HEX to find file type - file

I have a file and I don't know its type. I tried running tools to identify it, but they were of no use. When I open the file in a hex editor, it shows the value 00 from the start up to a certain address (50+ lines). I know a file's type can be found from its hex signature, but in this case it is all 00. Can anyone help me find the file type using the hex values? Is there any way to obscure the hex information so that the file type is hidden?

If you are using Linux or Unix, you can type
$ file filename
Or you can use the hex signatures of the file. Refer to this list: http://www.garykessler.net/library/file_sigs.html
Or use the third-party library libmagic (header "magic.h") from C or C++ like this:
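#include <stdio.h>
#include "magic.h"
int main(int argc, char **argv) {
    if (argc < 2) return 1; /* expects the file to inspect as the first argument */
    /* MAGIC_MIME reports a MIME type and encoding instead of a textual description */
    magic_t myt = magic_open(MAGIC_CONTINUE|MAGIC_ERROR/*|MAGIC_DEBUG*/|MAGIC_MIME);
    if (myt == NULL || magic_load(myt, NULL) != 0) return 1; /* NULL loads the default database */
    const char *result = magic_file(myt, argv[1]);
    printf("magic output: '%s'\n", result ? result : "(error)");
    magic_close(myt);
    return 0;
}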

No, there is not. The hex editor always shows the real content (if it has permission to read the file at all).
Most binary file formats start with a magic number, but not all of them. However, a bunch of NUL bytes at the beginning looks more like the file is simply corrupt.
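As a rough illustration of checking a magic number by hand, here is a minimal sketch in C. The function name guess_type and the four-entry table are made up for this example; real signature tables, such as the list linked above or libmagic's database, are far larger. It also reports the case from the question, where the start of the file is nothing but zero bytes.

```c
#include <stdio.h>
#include <string.h>

/* A handful of well-known signatures, for illustration only. */
struct signature {
    const char *name;
    const unsigned char *bytes;
    size_t len;
};

static const unsigned char SIG_PDF[] = { '%', 'P', 'D', 'F' };
static const unsigned char SIG_PNG[] = { 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A };
static const unsigned char SIG_ZIP[] = { 'P', 'K', 0x03, 0x04 };
static const unsigned char SIG_ELF[] = { 0x7F, 'E', 'L', 'F' };

static const struct signature SIGNATURES[] = {
    { "PDF", SIG_PDF, sizeof SIG_PDF },
    { "PNG", SIG_PNG, sizeof SIG_PNG },
    { "ZIP", SIG_ZIP, sizeof SIG_ZIP },
    { "ELF", SIG_ELF, sizeof SIG_ELF },
};

/* Hypothetical helper: returns a format name, a note when the inspected
   prefix is all zero bytes, "unknown" when nothing matches, or NULL when
   the file cannot be opened. */
const char *guess_type(const char *path)
{
    unsigned char buf[64];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);

    /* Compare the start of the file against each known signature. */
    for (size_t i = 0; i < sizeof SIGNATURES / sizeof SIGNATURES[0]; i++) {
        const struct signature *s = &SIGNATURES[i];
        if (n >= s->len && memcmp(buf, s->bytes, s->len) == 0)
            return s->name;
    }

    /* Count leading NUL bytes in the prefix that was read. */
    size_t zeros = 0;
    while (zeros < n && buf[zeros] == 0)
        zeros++;
    if (n > 0 && zeros == n)
        return "leading zero bytes (possibly corrupt)";
    return "unknown";
}
```

Calling guess_type("somefile") on the file from the question would presumably return the zero-bytes note, which fits the suspicion that the file is corrupt rather than disguised.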

Related

Stray 377 and 376

I am new to Linux and I am trying to compile a simple C program.
I wrote it using a text editor:
#include<stdio.h>
void main(){
printf("Hello!");
}
I typed gcc -o main main.c
and the following issue shows up
main.c:1:1: error: stray '\377' in program
# i n c l u d e < s t d i o . h >
main.c:1:2: error: stray '\376' in program
This happens whenever I compile a C or C++ program.
\377 and \376 are the octal representations of the bytes 0xFF and 0xFE. Together they are the UTF-16 byte order mark (U+FEFF) as stored in little-endian order. Your compiler doesn't expect those bytes in your source code.
You need to change the encoding of your source file to either be UTF-8 or ASCII. Given the number of text editors that exist and the lack of that information in your question I cannot list every possibility for how to do that.
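To confirm that a source file really starts with a byte order mark before changing editor settings, a small check like the following may help. It is only a sketch: detect_bom is a name invented for this example, and it does not try to tell UTF-32 apart from UTF-16.

```c
#include <stdio.h>

/* Hypothetical helper: reports which byte order mark, if any, a file
   starts with. Returns NULL if the file cannot be opened.
   Simplification: a UTF-32LE BOM (FF FE 00 00) is reported as UTF-16LE. */
const char *detect_bom(const char *path)
{
    unsigned char b[3] = { 0, 0, 0 };
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;
    size_t n = fread(b, 1, sizeof b, f);
    fclose(f);

    if (n >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF)
        return "UTF-8";
    if (n >= 2 && b[0] == 0xFF && b[1] == 0xFE)   /* \377 \376 */
        return "UTF-16LE";
    if (n >= 2 && b[0] == 0xFE && b[1] == 0xFF)   /* \376 \377 */
        return "UTF-16BE";
    return "none";
}
```

For the file in the question, detect_bom("main.c") would be expected to report UTF-16LE, since gcc complained about \377 followed by \376.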
You could just do this in a bash shell:
cat > program.c
// File content here
^D
This will create a file called "program.c" with "// File content here" as its content, in UTF-8.
Your text editor is saving the program in the wrong character encoding. Save it as ASCII plain text and try again.
There is no text but encoded text.
With your editor, you have chosen to save your text file with the UTF-16LE character encoding (presumably).
Any program that reads a text file must know the character encoding of the text file. It could accept one documented character encoding (only or default) and/or allow you to tell it which you used.
This could work
gcc -finput-charset=UTF-16LE main.c
but since you have include files, the include files must use the same character encoding. On my system, they use UTF-8 (and include ©, which is good because gcc chokes on the bytes for that, letting me know that I've messed up).
Note: It's not very common to save a C source file (or most any text file) with UTF-16. UTF-8 is very common for all types of text files. (ASCII is not very common either. You might not find it as an option in many text editors. Historically, MS-DOS did not support it and Windows only got it very late and only for the sake of completeness.)

How to check the given file is binary file or not in C programming?

I'm trying to check whether a given file is binary or not.
I referred to the question linked below to find a solution:
How can I check if file is text (ASCII) or binary in C
But the given solutions do not work properly. If I pass a .c file as the argument, they give the wrong output.
The possible files I may pass as argument:
a.out
filename.c
filename.txt
filename.pl
filename.php
So I need to know whether there is any function or way to solve the problem?
Thanks...
Note: [In case of any query, please ask me before downvoting.]
You need to clearly define what a binary file is for you, and then base your checking on that.
If you just want to filter based on file extensions, then create a list of the ones you consider binary files, and a list of the ones you don't consider binary files and then check based on that.
If you have a list of known formats rather than file extensions, attempt to parse the file in those formats; if it doesn't parse, or the parse result makes no sense, it's a binary file (for your purposes).
Depending on your OS, executable binary files begin with a header that identifies them as executables and contains information about them (architecture, size, byte order, etc.). So by trying to parse this header, you can tell whether a file is an executable binary or not. If you are on a Mac, check the Mach-O file format; if you are on Linux, it should be the ELF format. But beware, it's a lot of documentation.
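For the ELF case, only the first few bytes of the header are needed to recognise the format. The sketch below reads the four magic bytes plus the class and data-encoding bytes that follow them. The struct and function names are invented for this example, and the rest of the header is not parsed.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative result type, not part of any system header. */
struct elf_info {
    int bits;            /* 32 or 64, 0 if the class byte is unrecognised */
    int little_endian;   /* 1 or 0, -1 if the data byte is unrecognised */
};

/* Returns 1 and fills *info if the file starts with an ELF
   identification, 0 if it does not, -1 if it cannot be opened. */
int read_elf_ident(const char *path, struct elf_info *info)
{
    unsigned char id[6];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(id, 1, sizeof id, f);
    fclose(f);

    /* Magic: 0x7F followed by the letters E, L, F. */
    if (n < sizeof id || memcmp(id, "\x7f" "ELF", 4) != 0)
        return 0;

    /* Byte 4 is the class (1 = 32-bit, 2 = 64-bit),
       byte 5 is the data encoding (1 = little-endian, 2 = big-endian). */
    info->bits = id[4] == 1 ? 32 : id[4] == 2 ? 64 : 0;
    info->little_endian = id[5] == 1 ? 1 : id[5] == 2 ? 0 : -1;
    return 1;
}
```

A Mach-O check would follow the same pattern with different magic values, which are left out here.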

Write File byte[] array received from C# to a file in C dll

I just created a simple PDF Document containing a word "Test" in it and created a byte stream out of it in C# Console Application:
buff = File.ReadAllBytes(<Path of File>);
The size of the file is around 9,651 bytes. I also created a Win32 C dll that exports a function which takes the file byte array and the length of the byte array as an argument, declared in C# using this:
[DllImport("<path to dll>", CallingConvention = CallingConvention.Cdecl)]
public static extern int file_data(byte[] byteArray, int length);
The method in C dll is exported as below:
#define FILEDATA_API __declspec(dllexport)
FILEDATA_API int file_data(char *byteArray, int size);
I then invoked ret = file_data(buff, buff.Length); and, in the C code, wrote the bytes received through the character pointer directly to a temp file, character by character, as below:
while (length> 0)
{
fprintf(outFile, "%c", *fileData); //fileData is the actual byte array received from C# Code
fileData++;
length--;
}
But here the problem arises: the C code that dumps the byte array to a file character by character generates a file of 9,755 bytes. Most of the content looks correct, except for some newlines that get introduced (as far as I know, and maybe some additional data), which corrupt the PDF file, so the dumped version does not open in Adobe. Can someone please provide some pointers on where I might be going wrong? I cannot use %s in fprintf, since some byte combinations in the PDF look like a null-terminated string in C, which then dumps out even less data than I expect.
Thanks.
UPDATE:
The desired behavior is that the file byte array as received from C#
and when written using C code to a file should make the file open
successfully in Adobe.
The code present in the problem should be
sufficient for someone to generate a win32 dll, that simply writes
out the char pointer to a file, but I have added few more details.
You're probably calling fopen without the b mode flag. Append b to your mode specifier:
FILE *outFile = fopen("file.txt", "wb");
From this site (emphasis mine):
Text files are files containing sequences of lines of text. Depending on the environment where the application runs, some special character conversion may occur in input/output operations in text mode to adapt them to a system-specific text file format. Although on some environments no conversions occur and both text files and binary files are treated the same way, using the appropriate mode improves portability.
In my experience, this "conversion" on Windows changes \n to \r\n at least.
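Putting the binary mode flag together with a single fwrite call, the C side could look roughly like this. It is a sketch: write_bytes and its path parameter are hypothetical, and the export declaration from the question is left out.

```c
#include <stdio.h>

/* Hypothetical helper: writes exactly `size` bytes from byteArray to
   `path`. Returns 0 on success, -1 on failure. The "wb" mode disables
   the text-mode newline translation, and fwrite does not stop at
   embedded zero bytes the way %s would. */
int write_bytes(const char *path, const char *byteArray, int size)
{
    if (byteArray == NULL || size < 0)
        return -1;

    FILE *out = fopen(path, "wb");
    if (out == NULL)
        return -1;

    size_t written = fwrite(byteArray, 1, (size_t)size, out);

    /* fclose is always called; a failed flush also counts as failure. */
    if (fclose(out) != 0 || written != (size_t)size)
        return -1;
    return 0;
}
```

The exported file_data function could then forward its two arguments to this helper along with whatever output path the DLL uses.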

How to use bin2h?

I'm trying to use bin2h to convert a font file (font.ttf) into a C file but it won't work.
Can someone please tell me the syntax to save the output to a text file?
I've been trying to figure this out but nothing is working, and it's driving me insane. I'm really frustrated because I know the tool is working (I got it to work like a year ago) but I can't remember how I used it.
The example syntax on that site doesn't really help...
Please
Thanks to Lightness Races in Orbit's comment below I finally got the syntax right!
bin2h -cz font < font.ttf > output.h
That's working, thanks
Perhaps you are looking at the usage example on the website and not realising that it is a program that you execute from shell? It is not a line of C code.
So if you want to use this from a C program, you will need to execute it through a function like system or exec. However, since its output is C code, you'd be better off running it from within your build script to generate a C header that you then include in the rest of your program.
Example (in C++ as my C is rusty — port to C as required):
Source code for main.cpp
#include <iostream>
#include <string>
#include "eula.h"
int main()
{
    std::cout << std::string(eula, eula_size) << std::endl;
}
Build commands
$ bin2h -cz eula < eula.txt > eula.h
$ g++ main.cpp -o myProgram
Execution command
$ ./myProgram
I would just write my own.
Here's the algorithm:
Open the source code file as text output.
Open the font file as binary input.
Write the array declaration to the output file, something like:
static const unsigned char font[] =
{
While the font file is not empty do:
Read unsigned char from font file, using binary read methods.
Output the unsigned char, in text format, to the source file.
end-while
Write the ending brace and semicolon to the source file.
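The steps above can be sketched in C roughly as follows. The function name bin_to_header, the twelve-bytes-per-line layout and the extra name_size constant are choices made for this example, not something bin2h is known to produce.

```c
#include <stdio.h>

/* Sketch of the algorithm above. Writes the bytes of in_path as a C
   array called `name` into out_path. Returns 0 on success, -1 on error. */
int bin_to_header(const char *in_path, const char *out_path, const char *name)
{
    FILE *in = fopen(in_path, "rb");          /* binary input */
    if (in == NULL)
        return -1;
    FILE *out = fopen(out_path, "w");         /* text output */
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    fprintf(out, "static const unsigned char %s[] =\n{\n", name);

    unsigned long count = 0;
    int c;
    while ((c = fgetc(in)) != EOF) {
        /* Twelve bytes per line, each written as a hex literal. */
        fprintf(out, "%s0x%02x,", count % 12 == 0 ? "    " : " ", (unsigned)c);
        if (++count % 12 == 0)
            fputc('\n', out);
    }
    if (count == 0)
        /* An empty initializer is not valid in older C standards. */
        fputs("    0x00 /* placeholder: empty input */\n", out);
    else if (count % 12 != 0)
        fputc('\n', out);

    fprintf(out, "};\nstatic const unsigned long %s_size = %luUL;\n", name, count);

    int failed = ferror(in) || ferror(out);
    fclose(in);
    if (fclose(out) != 0)
        failed = 1;
    return failed ? -1 : 0;
}
```

For the font in the question, a call such as bin_to_header("font.ttf", "output.h", "font") would generate a header that can be included like any other.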

Get MIME type from filename in C

I want to get the MIME type from a filename using C.
Is there a way to do this without using a textfile containing MIME types and file extensions (i.e. Apache's file mime.types)?
Maybe there is a function to get the MIME type using the filename? I'd rather not use the file extension if I don't have to.
I just implemented this for a project on which I'm working. libmagic is what you're looking for. On RHEL/CentOS it's provided by file-libs and file-devel. On Debian/Ubuntu it appears to be libmagic-dev.
http://darwinsys.com/file/
Here's some example code:
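#include <stdio.h>
#include <magic.h>
int main(int argc, char **argv){
    const char *mime;
    magic_t magic;
    if (argc < 2) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return 1;
    }
    printf("Getting magic from %s\n", argv[1]);
    magic = magic_open(MAGIC_MIME_TYPE);
    if (magic == NULL || magic_load(magic, NULL) != 0) { /* NULL selects the default database */
        fprintf(stderr, "cannot load magic database\n");
        return 1;
    }
    /* No magic_compile call: it is only needed to build a compiled database file */
    mime = magic_file(magic, argv[1]);
    printf("%s\n", mime ? mime : magic_error(magic));
    magic_close(magic);
    return 0;
}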
The code above uses the default magic database /usr/share/misc/magic. Once you get the dev packages installed, the libmagic man page is pretty helpful. I know this is an old question, but I found it on my hunt for the same answer. This was my preferred solution.
If there was a way to do it, Apache wouldn't need its mime.types file!
The table has to be somewhere. It's either in a separate file which is parsed by your code, or it's hard coded into your software. The former is clearly the better solution...
It's also possible to guess at the MIME type of a file by examining the content of the file, i.e. header fields, data structures, etc. This is the approach used by the file(1) program and also by Apache's mod_mime_magic. In both cases they still use a separate text file to store the lookup rules rather than have any details hard-coded in the program itself.
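For comparison, the hard-coded variant of the table mentioned above might look like the sketch below. The five entries and the function name mime_from_filename are illustrative only, and the match is case-sensitive. A real program would more likely load a file such as Apache's mime.types.

```c
#include <string.h>

/* Illustrative lookup table: extension (without the dot) to MIME type. */
struct mime_entry {
    const char *ext;
    const char *type;
};

static const struct mime_entry MIME_TABLE[] = {
    { "txt",  "text/plain" },
    { "html", "text/html" },
    { "png",  "image/png" },
    { "jpg",  "image/jpeg" },
    { "pdf",  "application/pdf" },
};

/* Returns the MIME type for the extension after the last dot, or the
   generic application/octet-stream when the extension is missing or
   not in the table. */
const char *mime_from_filename(const char *filename)
{
    const char *dot = strrchr(filename, '.');
    if (dot == NULL || dot[1] == '\0')
        return "application/octet-stream";

    for (size_t i = 0; i < sizeof MIME_TABLE / sizeof MIME_TABLE[0]; i++) {
        if (strcmp(dot + 1, MIME_TABLE[i].ext) == 0)
            return MIME_TABLE[i].type;
    }
    return "application/octet-stream";
}
```

This only looks at the name, so a renamed file gets the wrong answer, which is the weakness the content-based approach avoids.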
As far as I know, the Unix command file outputs the MIME string with the option -i:
> file -i main.c
main.c: text/x-c charset=us-ascii

Resources