I'm writing my own version of the stat command and I am having difficulty getting the correct output for the Device field.
When I run the Linux stat command on an empty file in the working directory I get:
Device: 801h/2049d
To replicate this I tried to extract from the stat structure, the st_dev field.
But printing st_dev gives me
Device: 801
I am missing the h at the end and I am not sure where the 2049d comes from.
Is the first part just a formatting problem? I am printing in hex format. And how can I extract 2049d?
Since (hexadecimal) 0x801 == 2049 (decimal), you can get the output you're after from:
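printf("Device: %lxh/%lud\n", (unsigned long)st.st_dev, (unsigned long)st.st_dev);
The h in the format is the h that appears at the end of 801h; the %lx means 'print an unsigned long in hex'. Similarly, the %lu means print in decimal, and the trailing d is the d in 2049d. The casts are there because the exact type of dev_t varies between platforms, so it should not be passed to printf() uncast.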
Incidentally, on Linux and other POSIX platforms, you can also avoid repeating the st.st_dev argument. For example:
#include <stdio.h>
int main(void)
{
printf("Device: %1$xh/%1$dd\n", 0x801);
return 0;
}
This also produces:
Device: 801h/2049d
To see why, read the printf() specification very carefully. Note that if you use one of the 1$ modifiers, you must (should) use it with every conversion specification.
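If you would rather keep the formatting in one place, a small helper along these lines might do. This is only a sketch: format_device() is a name I made up (it is not part of any library), and the cast to unsigned long long assumes dev_t is an integer type no wider than that, which holds on Linux.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: writes e.g. "801h/2049d" into buf.
   Returns what snprintf() returns, i.e. the length that was needed. */
int format_device(char *buf, size_t size, dev_t dev)
{
    /* Assumption: dev_t fits in unsigned long long. */
    unsigned long long d = (unsigned long long)dev;
    return snprintf(buf, size, "%llxh/%llud", d, d);
}
```

After a successful stat(path, &st), calling format_device(buf, sizeof buf, st.st_dev) and printing the result with "Device: %s\n" should reproduce the field shown by the stat command.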
I am trying to figure out the file type of a file, without using external libs or the "file" command.
I have viewed a number of posts and threads, and they point to using the stat() function (unix man stat) and playing with the "st_mode" from the stat struct.
But I have no idea how to do this, nor am I able to find a good example of doing it.
For example, the program takes a file F as input; I want to examine F the way the program below does and give similar output. Suppose F is a PDF but does not have the extension on it.
FURTHER EXAMPLE: If I have foo.pdf but rename it to foo.png, I want to pass "foo.png" to my program and have it report that the file is in fact a PDF.
As I understand it, a file format carries a "magic number"; for example, PDF files start with "%PDF" (hex 25 50 44 46).
How can I use the magic number to figure out the file type?
I understand that I will need to build some kind of table of supported file types on my end. I am only handling a small handful (fewer than 10).
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void errorInput()
{
fprintf(stderr, "\nYou have received this message due to an error. \n");
fprintf(stderr, "Please type 'filetype <file>' to properly execute the program.\n");
fprintf(stderr, "Thank you and have a fine day! \n\n");
exit(EXIT_FAILURE); /* signal the error to the caller instead of reporting success */
}
int main(int argc, char *argv[])
{
char command[128];
if (argc == 2)
{
strcpy(command, "file ");
strcat(command, argv[1]);
system(command);
}
else
{
errorInput();
}
return 0;
}
Thank You in advance!
As Jonathon Reinhart pointed out, don't try to reinvent the wheel; use libmagic:
#include <stdio.h>
#include <magic.h>
int main(void) {
    /* magic_open() returns NULL on failure; magic_load() returns -1 on failure. */
    struct magic_set *magic = magic_open(MAGIC_MIME|MAGIC_CHECK);
    if (magic == NULL || magic_load(magic, NULL) != 0) {
        fprintf(stderr, "cannot initialize libmagic\n");
        return 1;
    }
    printf("Output1: '%s'\n",magic_file(magic,"ValgrindOut.xml"));
    printf("Output2: '%s'\n",magic_file(magic,"program"));
    printf("Output3: '%s'\n",magic_file(magic,"Chapter9.pdf"));
    printf("Output4: '%s'\n",magic_file(magic,"test.txt"));
    printf("Output5: '%s'\n",magic_file(magic,"linux-3.17.tar.xz"));
    printf("Output6: '%s'\n",magic_file(magic,"gcc-5.2.0.tar.gz"));
    printf("Output7: '%s'\n",magic_file(magic,"/home/michi"));
    magic_close(magic); /* release the magic database */
    return 0;
}
Compile:
gcc -o program program.c -lmagic
Output:
Output1: 'application/xml; charset=us-ascii'
Output2: 'application/x-executable; charset=binary'
Output3: 'application/pdf; charset=binary'
Output4: 'text/plain; charset=utf-8'
Output5: 'application/x-xz; charset=binary'
Output6: 'application/gzip; charset=binary'
Output7: 'inode/directory; charset=binary'
First, you need to include sys/stat.h.
Next, you need to declare a struct stat in your code:
struct stat s;
Next, you pass a pointer to your stat structure along with the file/object name:
returnval = stat("filename", &s);
Check the return value; stat() returns -1 on error. If there is no error, the object exists and we can use the file type macros to determine what it is:
if (S_ISREG(s.st_mode)) {
    /* Regular file... */
} else if (S_ISDIR(s.st_mode)) {
    /* Is a directory.... */
}
I suggest you have a look at the man page (man 2 stat); it covers all of the types that st_mode can encode (it can be used to identify files, directories, block devices, symlinks, etc.).
Another very useful member of the stat struct is st_size, which gives you a file's size in bytes.
ETA - the stat() system call won't tell you if a file is a PDF or anything like that. Normally we'd use the extension; if there is no extension and you're trying to identify specific file formats, then stat() won't be of much use to you.
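Putting the steps above together, a sketch could look like the following. The function name file_kind() and the strings it returns are my own choices, and I use lstat() rather than stat() so that a symbolic link is reported as such instead of being followed.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper: returns a short description of what kind of
   object path names, or NULL if lstat() fails (e.g. no such file). */
const char *file_kind(const char *path)
{
    struct stat s;

    /* lstat() reports on a symlink itself instead of following it. */
    if (lstat(path, &s) == -1)
        return NULL;

    if (S_ISREG(s.st_mode))  return "regular file";
    if (S_ISDIR(s.st_mode))  return "directory";
    if (S_ISLNK(s.st_mode))  return "symbolic link";
    if (S_ISCHR(s.st_mode))  return "character device";
    if (S_ISBLK(s.st_mode))  return "block device";
    if (S_ISFIFO(s.st_mode)) return "FIFO";
    if (S_ISSOCK(s.st_mode)) return "socket";
    return "unknown";
}
```

Note that this only distinguishes kinds of filesystem objects; a PDF and a PNG are both just "regular file" here.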
Most file formats have a header or metadata section at the start of the file that holds details about the file itself. This section also contains the signature that identifies the file type. Be aware that most of these signatures are documented as hex byte sequences.
Example
PDF Signature - 25 50 44 46 (in hex), i.e. %PDF
JPEG Signature - starts with FF D8 and ends with FF D9 at the end of the file
So basically you need to open the file in binary mode, read its first bytes and compare them against the signatures of the file types you define in your program. For example, to check whether it's a PDF file, open the file in binary mode and check whether it begins with the byte sequence of the PDF signature. Use the C fopen() function in binary mode, i.e. "rb".
Or you can open the file normally, without binary mode, and read it one byte at a time like this:
int data; /* int rather than unsigned int, so that EOF can be detected */
data=fgetc(pfile);
You might want to look into this for further details,
Magic Number
File Signatures
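A table-driven check along those lines might look like this. It is a sketch, not a complete detector: the names sniff_type() and sniff_file() are made up, the PDF and JPEG signatures are the ones quoted above, and the PNG and gzip entries are extra ones I added to show how the table grows.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct signature {
    const char *name;
    const unsigned char *bytes;
    size_t len;
};

static const unsigned char SIG_PDF[]  = { 0x25, 0x50, 0x44, 0x46 };   /* %PDF */
static const unsigned char SIG_JPEG[] = { 0xFF, 0xD8 };
static const unsigned char SIG_PNG[]  = { 0x89, 0x50, 0x4E, 0x47,
                                          0x0D, 0x0A, 0x1A, 0x0A };
static const unsigned char SIG_GZIP[] = { 0x1F, 0x8B };

static const struct signature SIGNATURES[] = {
    { "PDF",  SIG_PDF,  sizeof SIG_PDF  },
    { "JPEG", SIG_JPEG, sizeof SIG_JPEG },
    { "PNG",  SIG_PNG,  sizeof SIG_PNG  },
    { "gzip", SIG_GZIP, sizeof SIG_GZIP },
};

/* Returns the name of the first signature that matches the start of
   buf, or NULL if none matches. */
const char *sniff_type(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < sizeof SIGNATURES / sizeof SIGNATURES[0]; i++) {
        const struct signature *sig = &SIGNATURES[i];
        if (len >= sig->len && memcmp(buf, sig->bytes, sig->len) == 0)
            return sig->name;
    }
    return NULL;
}

/* Reads the first bytes of the file (binary mode) and classifies them.
   Returns NULL if the file cannot be opened or nothing matches. */
const char *sniff_file(const char *path)
{
    unsigned char buf[16];
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;
    size_t n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    return sniff_type(buf, n);
}
```

With this, sniff_file("foo.png") would report "PDF" for a renamed PDF, because only the leading bytes are looked at and the extension is ignored.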
I'm trying to write a small program to show me the internal representation of a directory in Linux (Debian, specifically). The idea was a small C program using open(".", O_RDONLY), but this seems to give no output. The program is the following:
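#include <stdio.h>
#include <fcntl.h>
#include <unistd.h> /* read() */
int main(int argc, char** argv)
{
    int fd = open(argv[1],O_RDONLY,0 );
    unsigned char buf; /* unsigned, so bytes >= 0x80 are not sign-extended by %x */
    printf("%i\n",fd);
    while(read(fd, &buf, 1) > 0)
        printf("%x ", buf);
    putchar('\n');
    return 0;
}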
When I run it on regular files it works as expected, but on a directory such as ".", it gives no output. The value of fd is 3 (as expected) but the call to read returns -1.
Why isn't this working, and how can I read the internal representation?
Thanks!
For handling directories, you need to use opendir/readdir/closedir. Read the corresponding man pages for more information.
To check whether a filename corresponds to a directory, you first need to call stat for the filename and check whether it's a directory (S_ISDIR(myStatStruc.st_mode)).
Directories are a filesystem-specific representation and are part of the file system. On extfs, they are a table of string/inode pairs, unlike files, which have blocks of data (which is what your code above reads).
To read directory-specific information in C, you need to use dirent.h.
Look at this page for more information
http://pubs.opengroup.org/onlinepubs/7908799/xsh/dirent.h.html
On POSIX systems, the system call "stat" would give you all the information about an inode on the filesystem(file/directory/etc.)
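For reference, a minimal sketch of the opendir/readdir/closedir loop mentioned above could look like this. The function name list_dir() is my own, and printing the inode number next to each name is just to show the "string/inode pair" idea; it is not the raw on-disk layout.

```c
#include <dirent.h>
#include <stdio.h>

/* Hypothetical helper: prints one "inode  name" line per directory
   entry and returns the number of entries, or -1 if the directory
   cannot be opened. */
int list_dir(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        printf("%lu  %s\n", (unsigned long)entry->d_ino, entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling list_dir(".") lists the current directory, typically including the "." and ".." entries.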
I am using ar.h for the struct definition. I was wondering how I would go about getting information about a file and putting it into the fields of the struct.
struct ar_hdr {
char ar_name[16]; /* name of this member */
char ar_date[12]; /* file mtime */
char ar_uid[6]; /* owner uid; printed as decimal */
char ar_gid[6]; /* owner gid; printed as decimal */
char ar_mode[8]; /* file mode, printed as octal */
char ar_size[10]; /* file size, printed as decimal */
char ar_fmag[2]; /* should contain ARFMAG */
};
Using the struct defined above, how would I fill in the information that ls -la shows for a file, such as:
-rw-rw----. 1 clean-unix upg40883 368 Oct 29 15:17 testar
You're looking for stat(2,3p).
In order to emulate the behavior of ls -la you need a combination of readdir and stat. Do a man 3 readdir and a man 2 stat to get information on how to use them.
Capturing the output of ls -la is possible, but not such a good idea. People might expect that of a shell script, but not a C or C++ program. It's even sort of the wrong thing to do in Python or Perl if you can help it.
You will have to construct your structure yourself from the data available to you. strftime can be used for formatting the time in a manner you like.
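As a rough sketch of constructing the header yourself, something like the following could work. The names put_field() and fill_ar_hdr() are hypothetical, and I am assuming the usual ar conventions that every field is left-justified, padded with spaces and not null-terminated, with the date stored as seconds since the epoch in decimal. The member name is passed in separately because different ar implementations terminate names differently.

```c
#include <ar.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Copy text into a fixed-width field: left-justified, space-padded,
   NOT null-terminated. Overlong text is silently truncated. */
static void put_field(char *dst, size_t width, const char *text)
{
    size_t len = strlen(text);
    if (len > width)
        len = width;
    memset(dst, ' ', width);
    memcpy(dst, text, len);
}

/* Hypothetical helper: fills *hdr from stat() data for path.
   Returns 0 on success, -1 if stat() fails. */
int fill_ar_hdr(const char *path, const char *member_name, struct ar_hdr *hdr)
{
    struct stat st;
    char tmp[32];

    if (stat(path, &st) == -1)
        return -1;

    put_field(hdr->ar_name, sizeof hdr->ar_name, member_name);
    snprintf(tmp, sizeof tmp, "%lld", (long long)st.st_mtime);
    put_field(hdr->ar_date, sizeof hdr->ar_date, tmp);
    snprintf(tmp, sizeof tmp, "%lu", (unsigned long)st.st_uid);
    put_field(hdr->ar_uid, sizeof hdr->ar_uid, tmp);
    snprintf(tmp, sizeof tmp, "%lu", (unsigned long)st.st_gid);
    put_field(hdr->ar_gid, sizeof hdr->ar_gid, tmp);
    snprintf(tmp, sizeof tmp, "%lo", (unsigned long)st.st_mode);
    put_field(hdr->ar_mode, sizeof hdr->ar_mode, tmp);
    snprintf(tmp, sizeof tmp, "%lld", (long long)st.st_size);
    put_field(hdr->ar_size, sizeof hdr->ar_size, tmp);
    memcpy(hdr->ar_fmag, ARFMAG, sizeof hdr->ar_fmag);
    return 0;
}
```

The filled struct can then be written out with fwrite(&hdr, sizeof hdr, 1, fp), followed by the file's contents.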
For collecting data about a single file into an archive header entry, the primary answer is stat(); in other contexts (such as ls -la), you might also need to use lstat() and readlink(). (Beware: readlink() does not null terminate its return string!)
With ls -la, you would probably use the opendir() family of functions (readdir() and closedir() too) to read the contents of a directory.
If you needed to handle a recursive search, then you'd be looking at nftw(). (There's also a less capable ftw(), but you'd probably be better off using nftw().)
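Since the readlink() caveat above trips people up, here is a small sketch of a wrapper that adds the terminator. The name read_link_string() is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: reads the target of the symlink at path into
   buf and null-terminates it. Returns the length of the target, or -1
   on error. If the result equals size - 1, the target may have been
   truncated and a larger buffer should be tried. */
ssize_t read_link_string(const char *path, char *buf, size_t size)
{
    if (size == 0)
        return -1;

    /* Leave room for the terminator that readlink() does not write. */
    ssize_t n = readlink(path, buf, size - 1);
    if (n == -1)
        return -1;

    buf[n] = '\0';
    return n;
}
```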
The default limit for the max open files on Mac OS X is 256 (ulimit -n) and my application needs about 400 file handles.
I tried to change the limit with setrlimit(), but even though the function executes without error, I'm still limited to 256.
Here is the test program I use:
#include <stdio.h>
#include <sys/resource.h>
main()
{
struct rlimit rlp;
FILE *fp[10000];
int i;
getrlimit(RLIMIT_NOFILE, &rlp);
printf("before %d %d\n", rlp.rlim_cur, rlp.rlim_max);
rlp.rlim_cur = 10000;
setrlimit(RLIMIT_NOFILE, &rlp);
getrlimit(RLIMIT_NOFILE, &rlp);
printf("after %d %d\n", rlp.rlim_cur, rlp.rlim_max);
for(i=0;i<10000;i++) {
fp[i] = fopen("a.out", "r");
if(fp[i]==0) { printf("failed after %d\n", i); break; }
}
}
and the output is:
before 256 -1
after 10000 -1
failed after 253
I cannot ask the people who use my application to poke inside a /etc file or something. I need the application to do it by itself.
rlp.rlim_cur = 10000;
Two things.
1st. LOL. Apparently you have found a bug in Mac OS X's stdio. If I fix your program up (add error handling, etc.) and also replace fopen() with the open() syscall, I can easily reach the limit of 10000 (which is 240 fds below my 10.6.3 OPEN_MAX limit of 10240).
2nd. RTFM: man setrlimit. The case of max open files has to be treated specially with regard to OPEN_MAX.
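To illustrate the second point, a sketch of setting the soft limit with error handling and the OPEN_MAX cap might look like this. The function name set_nofile_soft_limit() is my own; the #ifdef is there because OPEN_MAX is not defined on every system.

```c
#include <limits.h>
#include <sys/resource.h>

/* Hypothetical helper: sets the soft RLIMIT_NOFILE limit to wanted,
   capped at the hard limit (and at OPEN_MAX where that is defined).
   Stores the limit actually in effect in *actual.
   Returns 0 on success, -1 on error. */
int set_nofile_soft_limit(rlim_t wanted, rlim_t *actual)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
        return -1;

    /* The soft limit may not exceed the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;
#ifdef OPEN_MAX
    /* The Mac OS X man page asks for min(OPEN_MAX, rlim_max). */
    if (wanted > OPEN_MAX)
        wanted = OPEN_MAX;
#endif

    rl.rlim_cur = wanted;
    if (setrlimit(RLIMIT_NOFILE, &rl) == -1)
        return -1;

    /* Read the limit back rather than assuming it took effect. */
    if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
        return -1;
    *actual = rl.rlim_cur;
    return 0;
}
```

This only changes the kernel's per-process descriptor limit; whether stdio can then use that many streams is a separate matter.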
etresoft found the answer on the apple discussion board:
The whole problem here is your printf() function. When you call printf(), you are initializing internal data structures to a certain size. Then, you call setrlimit() to try to adjust those sizes. That function fails because you have already been using those internal structures with your printf(). If you use two rlimit structures (one for before and one for after), and don't print them until after calling setrlimit, you will find that you can change the limits of the current process even in a command line program. The maximum value is 10240.
For some reason (perhaps binary compatibility), you have to define _DARWIN_UNLIMITED_STREAMS before including <stdio.h>:
#define _DARWIN_UNLIMITED_STREAMS
#include <stdio.h>
#include <sys/resource.h>
main()
{
struct rlimit rlp;
FILE *fp[10000];
int i;
getrlimit(RLIMIT_NOFILE, &rlp);
printf("before %d %d\n", rlp.rlim_cur, rlp.rlim_max);
rlp.rlim_cur = 10000;
setrlimit(RLIMIT_NOFILE, &rlp);
getrlimit(RLIMIT_NOFILE, &rlp);
printf("after %d %d\n", rlp.rlim_cur, rlp.rlim_max);
for(i=0;i<10000;i++) {
fp[i] = fopen("a.out", "r");
if(fp[i]==0) { printf("failed after %d\n", i); break; }
}
}
prints
before 256 -1
after 10000 -1
failed after 9997
This feature appears to have been introduced in Mac OS X 10.6.
This may be a hard limitation of your libc. Some versions of Solaris have a similar limitation because they store the fd as an unsigned char in the FILE struct. If this is the case for your libc as well, you may not be able to do what you want.
As far as I know, things like setrlimit only affect how many files you can open with open (fopen is almost certainly implemented in terms of open). So if this limitation is at the libc level, you will need an alternate solution.
Of course, you could always skip fopen and use the open system call instead, which is available on just about every variant of Unix.
The downside is that you have to use write and read instead of fwrite and fread, and they don't do things like buffering (that's all done in your libc, not by the OS itself). So it could end up being a performance bottleneck.
Can you describe the scenario that requires 400 files open simultaneously? I am not saying that there is no case where that is needed. But, if you describe your use case more clearly, then perhaps we can recommend a better solution.
I know that sounds like a silly question, but do you really need 400 files open at the same time?
By the way, are you running this code as root?
Mac OS doesn't let us change the limit as easily as many other Unix-based operating systems do. We have to create two files
/Library/LaunchDaemons/limit.maxfiles.plist
/Library/LaunchDaemons/limit.maxproc.plist
describing the max proc and max file limits. The ownership of the files needs to be changed to 'root:wheel'.
This alone doesn't solve the problem. By default, recent versions of Mac OS X enable System Integrity Protection, which is controlled by 'csrutil', and we need to disable it. To do that, we reboot the Mac in recovery mode and disable it with csrutil from the terminal there.
Now we can change the max open file handle limit from the terminal itself (even in normal boot mode).
This method is explained in detail in the following link. http://blog.dekstroza.io/ulimit-shenanigans-on-osx-el-capitan/
Works for OS X El Capitan and macOS Sierra.
I am working with the TTCN-3 (Testing and Test Control Notation) scripting language. I want to write a guideline checker for TTCN-3 code files.
For that I want to read the lines of a TTCN-3 script file (something like file.ttcn) one by one into a buffer. But fopen / sopen / open / fgetc / fscanf do not work properly for me and do not read the file correctly; I get NULL. Is there any way I can read its characters into a buffer? I think C cannot read files with more than three extension characters (like .ttcn). Forgive me if my assumption is wrong.
My environment is Turbo C on Windows.
Edit:
Yes, I checked for those errors as well, but I get "unknown error" for read()
and "no such file or directory".
My code is as follows
#include <errno.h>
#include <io.h>
#include <fcntl.h>
#include <sys\stat.h>
#include <process.h>
#include <share.h>
#include <stdio.h>
int main(void)
{
int handle;
int status;
int i=0;
char ch;
FILE *fp;
char *buffer;
char *buf;
clrscr();
handle = sopen("c:\\tc\\bin\\hi.ttcn", O_BINARY, SH_DENYNONE, S_IREAD);
/* here I even tried O_TEXT and others */
if (handle == -1) /* sopen() returns -1 on failure, not 0 */
{
printf("sopen failed\n");
// exit(1);
}
printf("\nObtained string %s #",buf);
close(handle);
fp=fopen("c:\\tc\\bin\\hi.ttcn","r"); // sorry for the old version with one slash
if(fp==NULL) // I was doing it with argv[1] for opening
{ // user given file name
printf("\nCannot open file");
}
ch=fgetc(fp);
i=0;
while(i<10)
{
printf("\ncharacter is %c %d",ch,ch);
i++; //Here I wanted to take characters into
ch=fgetc(fp); //buffer
}
getch();
return 0;
}
The most likely culprit is your Turbo C, an ancient compiler. It's technically a DOS compiler, not a Windows one. That would limit its runtime library to 8.3 filenames. Upgrade to something newer - Turbo C++ seems like a logical successor, but Microsoft's VC++ Express would work as well.
Your assumption is wrong about extensions. If fopen is returning NULL, you should output the result of strerror(errno) or use the perror() function to see why it failed.
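A minimal sketch of that kind of error reporting, assuming a made-up wrapper name open_reporting():

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: like fopen(), but prints the reason to stderr
   when the open fails. errno is preserved for the caller. */
FILE *open_reporting(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL) {
        int saved = errno;   /* fprintf() may change errno */
        fprintf(stderr, "cannot open '%s': %s\n", path, strerror(saved));
        errno = saved;
    }
    return fp;
}
```

A message such as "No such file or directory" points at the path itself rather than at the length of the extension.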
Edit: The problem is probably because you have "c:\tc\bin\hi.ttcn". In C, "\t" is interpreted as a tab, for example.
You could do
"c:\\tc\\bin\\hi.ttcn"
But this is extremely ugly, and your system should accept:
"c:/tc/bin/hi.ttcn"
MS-DOS does not know about long file names, which includes file names with extensions longer than 3 characters. Therefore, the CRT provided by Turbo C most probably does not look for the name you are providing, but a truncated one - or something else.
Windows conveniently provides a short (i.e. matching the 8.3 format, most of the time ending in ~1 unless you play with files having the same 8-character prefix) file name for those; one way to discover it is to open a console window and to run "dir /x" in the folder your file is stored.
Find the short name associated to your file and patch it into your C source file.
Edit: Darn, I'll read the comments next time. All credits to j_random_hacker.
Now that you've posted the code, another problem comes to light.
The following line:
fp=fopen("c:\tc\bin\hi.ttcn","r");
Should instead read:
fp=fopen("c:\\tc\\bin\\hi.ttcn","r");
In C strings, the backslash (\) is an escape character that is used to encode special characters (e.g. \n represents a newline character, \t a tab character). To actually use a literal backslash, you need to double it. As it stands, the program is actually trying to open a file named "c:<tab>c<backspace>in\hi.ttcn" (and \h is not even a valid escape sequence, so what the compiler makes of it is not guaranteed). Needless to say, no such file exists!
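If you want to convince yourself of what ends up in the string, a tiny check like this can help. The function name count_backslashes() is just something I made up for the demonstration.

```c
#include <stddef.h>

/* Hypothetical helper: counts the backslash characters that are
   actually present in the string at run time. */
size_t count_backslashes(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (*s == '\\')
            n++;
    return n;
}
```

count_backslashes("c:\\tc\\bin\\hi.ttcn") gives 3, whereas count_backslashes("c:\tc\bin") gives 0, because \t and \b were already turned into a tab and a backspace by the compiler.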