How to implement unix ls -s command in C?

I have to write a program in C which returns file size in blocks just like ls -s command.
Please help.
I tried using stat() function (st_blksize)...And I am unable to implement it.
My code looks like this
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <dirent.h>
void main(int argc, char **argv)
{
DIR *dp;
struct dirent *dirp;
struct stat buf;
if(argc < 2)
{
dp = opendir(".");
}
if(dp == NULL)
{
perror("Cannot open directory ");
exit(2);
}
while ((dirp = readdir(dp)) != NULL)
{
printf("%s\n", dirp->d_name);
if (stat(".", &buf))
printf("%d ", buf.st_blksize);
}
closedir(dp);
exit(0);
}
It gives an error saying that the size of buf is not declared. I don't know what the problem is.
Addition
Thanks for the correction. I included the <sys/stat.h> header file. Now it is giving a warning:
warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘__blksize_t’
I am new to C so can't make out what should be the possible solution.

You need to include the correct header:
#include <sys/stat.h>
That declares the structure and associated functions.
Note that stat() returns zero on success, so your test needs changing (and, as @jsmchmier pointed out in a comment, the call to stat should probably use dirp->d_name rather than the string literal "."). Also, st_blksize is the preferred I/O block size, not the size of the file. The file size in bytes is st_size, and the number of blocks allocated to the file is st_blocks.
POSIX says:
off_t      st_size     For regular files, the file size in bytes. For symbolic links, the length in bytes of the pathname contained in the symbolic link.
blksize_t  st_blksize  A file system-specific preferred I/O block size for this object. In some file system types, this may vary from file to file.
blkcnt_t   st_blocks   Number of blocks allocated for this object.
Note that old (very old) versions of Unix did not support st_blksize or st_blocks. I expect most current versions do.
Now it is giving a warning..warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘__blksize_t’
The chances are that __blksize_t is an integer type wider than int (POSIX requires blksize_t to be a signed integer type, and on Linux it is typically long). I'd probably use a simple cast:
printf("Block size = %d\n", (int)buf.st_blksize);
Alternatively, if you have C99 available, you could use the facilities from <inttypes.h> to use a bigger size:
printf("Block size = %" PRIu64 "\n", (uint64_t)buf.st_blksize);
In practice, this is overkill; the block size is unlikely to exceed 2 GB this decade, so int is likely to be sufficient for the foreseeable future.
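As a middle ground between the two casts above, here is a minimal sketch that converts st_blksize to intmax_t and prints it with %jd, so the code does not depend on the underlying type. It assumes a POSIX system with C99 support; the helper name format_block_size is hypothetical and only used for illustration.

```c
#include <inttypes.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: writes the preferred I/O block size of path into out
   as decimal text. Returns the number of characters written, or -1 if
   stat() fails. */
int format_block_size(const char *path, char *out, size_t outlen)
{
    struct stat sb;

    if (stat(path, &sb) != 0)
        return -1;
    /* intmax_t holds any signed integer type, so %jd is safe whatever
       blksize_t turns out to be on this platform. */
    return snprintf(out, outlen, "%jd", (intmax_t)sb.st_blksize);
}
```

Calling format_block_size(".", buf, sizeof buf) with a small char buffer should leave something like 4096 in buf on many Linux file systems, but the exact value depends on the file system.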

From man 2 stat on my Mac OS X box:
NAME
fstat, fstat64, lstat, lstat64, stat, stat64 -- get file status
SYNOPSIS
#include <sys/stat.h>
int
fstat(int fildes, struct stat *buf);
Note the #include <sys/stat.h> which you have not done. No doubt the actual layout of struct stat is defined in there, which is what your compiler is complaining about.
This is one aspect of the man pages which is not always discussed with beginners but is very useful indeed: the whole unix API is documented in them. Oh, it is not always the easiest place to find a function when you know what it should do but don't know what it is called, but all the answers are there.

Open the file, and stat/fstat it. The struct field st_blocks should contain the information you want. If you're dealing with a directory, use opendir, readdir, closedir (posix)... Just pointers to start your work.
EDIT
Add unistd.h and sys/stat.h. Then remember that stat returns 0 on success, so
if (stat(dirp->d_name, &buf) == 0)
I've also changed "." to the name of the entry, which is what you wanted, I suppose. Another change is to use st_blocks rather than st_blksize: st_blksize says how big each block is (e.g. 1024 or 4096 or...), whereas -s reports the size as a number of blocks, not the size of a block.
The fragment of code is of course incomplete: if you pass an argument, dp is not initialized, so even the dp == NULL test is unreliable. You should have initialized it to NULL beforehand:
DIR *dp = NULL;
struct dirent *dirp = NULL;
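Putting the corrections from this thread together, a minimal sketch of the listing loop might look like the following. It assumes a POSIX system; the function names file_blocks and list_blocks are hypothetical. Note that POSIX does not define the unit of st_blocks (it is 512 bytes on Linux), while GNU ls -s reports 1024-byte blocks by default, so the numbers may differ from ls by a factor of two.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: number of blocks allocated to path, or -1 on error. */
long long file_blocks(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) != 0)   /* stat() returns 0 on success */
        return -1;
    return (long long)sb.st_blocks;
}

/* Prints "<blocks> <name>" for every entry in dirpath.
   Returns the number of entries printed, or -1 if the directory
   cannot be opened. */
int list_blocks(const char *dirpath)
{
    DIR *dp = opendir(dirpath);
    struct dirent *entry;
    char path[4096];
    int count = 0;

    if (dp == NULL) {
        perror(dirpath);
        return -1;
    }
    while ((entry = readdir(dp)) != NULL) {
        /* stat() needs a usable path, not just the bare entry name */
        int n = snprintf(path, sizeof path, "%s/%s", dirpath, entry->d_name);
        long long blocks;

        if (n < 0 || (size_t)n >= sizeof path)
            continue;           /* path too long for the buffer, skip it */
        blocks = file_blocks(path);
        if (blocks < 0)
            continue;           /* entry vanished or is not accessible */
        printf("%lld %s\n", blocks, entry->d_name);
        count++;
    }
    closedir(dp);
    return count;
}
```

A main function would call list_blocks(argc < 2 ? "." : argv[1]), which also avoids the uninitialized dp problem mentioned above.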

Careful, one bug in your code is that dp points to garbage and is only initialised if argc is less than 2, but you still try to use it in your while loop and you also try to closedir it. If you invoke your application with any arguments at all, it will probably crash.

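To avoid the warning, change the %d to %ld in the line: printf("%d ", buf.st_blksize); This assumes blksize_t is long, as it is on Linux. Adding a (long) cast makes it work elsewhere too.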

Related

How to refer to string content when using ls

I'm working with IRAF, based on SPP, kind of a mix between Fortran and C. I'm looking for a way of referring to a string's content when using ls. For example, I can type ls *hola* if I want to list every file containing the word hola in my directory. Suppose I have a string called id whose content is the word hola. How could I refer to the content of id? I'm looking for some sort of ls id (I know that construction won't work) which returns the same result as ls *hola*.
Thank you in advance.
EDIT: SPP is somehow hidden on the Internet but here you have a reference manual https://www.mn.uio.no/astro/english/services/it/help/visualization/iraf/SPPManual.pdf although I haven't found any information there related to this topic.

Using the C side of your SPP thing, if you have access to the C headers, the simplest approach is to use something like
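#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    const char *command = "ls";
    const char *args = "*.c";
    char line[80];
    /* snprintf cannot write past the end of line, unlike sprintf */
    snprintf(line, sizeof line, "%s %s", command, args);
    system(line); /* the shell expands the wildcard */
    return 0;
}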
But in C you have dirent.h, which declares the functions that do this kind of thing. Try man opendir on your machine:
OPENDIR(3) Linux Programmer's Manual OPENDIR(3)
NAME
opendir, fdopendir - open a directory
SYNOPSIS
#include <sys/types.h>
#include <dirent.h>
DIR *opendir(const char *name);
DIR *fdopendir(int fd);
Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
fdopendir():
Since glibc 2.10:
_POSIX_C_SOURCE >= 200809L
Before glibc 2.10:
_GNU_SOURCE
DESCRIPTION
The opendir() function opens a directory stream corresponding to the directory name, and returns a pointer to the directory stream. The stream is positioned at the first entry in the directory.
The fdopendir() function is like opendir(), but returns a directory stream for the directory referred to by the open file descriptor fd. After a successful call to fdopendir(), fd is used internally by the implementation, and should not otherwise be used by the application.
RETURN VALUE
...
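As a rough illustration of the dirent.h route, the sketch below lists the entries of a directory whose names contain a given substring, which is close to what ls *hola* does when the substring is held in a variable. It assumes a POSIX system; count_matches is a hypothetical name, and unlike the shell glob it does not skip names that begin with a dot.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints every entry of dirpath whose name contains
   needle. Returns the number of matches, or -1 if the directory cannot
   be opened. */
int count_matches(const char *dirpath, const char *needle)
{
    DIR *dp = opendir(dirpath);
    struct dirent *entry;
    int count = 0;

    if (dp == NULL)
        return -1;
    while ((entry = readdir(dp)) != NULL) {
        /* strstr() returns non-NULL when needle occurs anywhere in the name */
        if (strstr(entry->d_name, needle) != NULL) {
            puts(entry->d_name);
            count++;
        }
    }
    closedir(dp);
    return count;
}
```

With a string variable id holding "hola", the call would be count_matches(".", id).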

First, I would recommend not using SPP (and IRAF) at all. IRAF is no longer officially maintained, and writing new code for deprecated software is probably a dead end.
Concerning your question: SPP comes with a function fntopnb() for IRAF-style access to filename templates. These functions are documented on pages 101ff. of the SPP manual. A usage example can be found in the IRAF sources, in pkg/system/files.x:
call salloc (fname, SZ_FNAME, TY_CHAR)
list = fntopnb ("*id*", NO)
while (fntgfnb (list, Memc[fname], SZ_FNAME) != EOF) {
call printf ("%s\n")
call pargstr (Memc[fname])
}
call fntclsb (list)

readdir() 32/64 compatibility issues

I'm trying to get some old legacy code working on new 64-bit systems, and I'm currently stuck. Below is a small C file I'm using to test functionality that exists in the actual program that is currently breaking.
#define _POSIX_SOURCE
#include <dirent.h>
#include <sys/types.h>
#undef _POSIX_SOURCE
#include <stdio.h>
main(){
DIR *dirp;
struct dirent *dp;
char *const_dir;
const_dir = "/any/path/goes/here";
if(!(dirp = opendir(const_dir)))
perror("opendir() error");
else{
puts("contents of path:");
while(dp = readdir(dirp))
printf(" %s\n", dp->d_name);
closedir(dirp);
}
}
The Problem:
The OS is Red Hat 7.0 Maipo x86_64.
The legacy code is 32-bit, and must be kept that way.
I've gotten the compile for the program working fine using the -m32 flag with g++. The problem that arises is during runtime, readdir() gets a 64-bit inode and then throws an EOVERFLOW errno and of course nothing gets printed out.
I've tried using readdir64() in place of readdir() to some success. I no longer get the errno EOVERFLOW, and the lines come out on the terminal, but the files themselves don't get printed. I'm assuming this is due to the buffer not being what dirent expects.
I've attempted to use dirent64 to try to alleviate this problem but whenever I attempt this I get:
test.c:19:22 error: dereferencing pointer to incomplete type
printf(" %s\n", dp->d_name);
I'm wondering if there's a way to manually shift the dp->d_name buffer for dirent to be used with readdir(). I've noticed in GDB that using readdir() and dirent results in dp->d_name having directories listed at dp->d_name[1], whereas readdir64() and dirent gives the first directory at dp->d_name[8].
That or somehow get dirent64 to work, or maybe I'm just on the wrong path completely.
Lastly, it's worth noting that the program functions perfectly without the -m32 flag included, so I'm assuming it has to be a 32/64 compatibility error somewhere. Any help is appreciated.

Thanks to @Martin in the comments above, I was led to try defining the dirent64 struct in my code. This works. There's probably a #define that can be used to avoid pasting libc .h code into my own code, but this works for now.
The code I needed was found in <bits/dirent.h>.
I should also note that this makes it work when using both readdir64() and dirent64.

In order to get a 64-bit ino_t with GCC and Glibc, you need to define the features _XOPEN_SOURCE and _FILE_OFFSET_BITS=64.
$ echo '#include <dirent.h>' | gcc -m32 -E -D_XOPEN_SOURCE -D_FILE_OFFSET_BITS=64 - | grep ino
__extension__ typedef unsigned long int __ino_t;
__extension__ typedef __u_quad_t __ino64_t;
typedef __ino64_t ino_t;
__ino64_t d_ino;
I say this from documentation reading and checking the preprocessor, not from deep experience or testing with a filesystem with inode numbers above 2^32, so I don't guarantee that you won't run into other problems down the line.
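A minimal sketch of the same idea inside the source file rather than on the command line: defining _FILE_OFFSET_BITS before any system header should make plain readdir() and struct dirent use the 64-bit layout when compiling with -m32 on glibc. This is an assumption based on the glibc feature test macros, not something tested against a file system with large inode numbers; the function names are hypothetical.

```c
#define _FILE_OFFSET_BITS 64   /* must come before any system header */
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>

/* Width in bits of the d_ino field actually seen by this translation unit. */
int dirent_ino_bits(void)
{
    return (int)(sizeof(((struct dirent *)0)->d_ino) * 8);
}

/* Prints the entries of path. Returns the number of entries printed,
   or -1 if the directory cannot be opened. */
int print_dir(const char *path)
{
    DIR *dirp = opendir(path);
    struct dirent *dp;
    int count = 0;

    if (dirp == NULL)
        return -1;
    for (;;) {
        errno = 0;                  /* lets us tell end-of-directory from an error */
        dp = readdir(dirp);
        if (dp == NULL)
            break;
        printf(" %s\n", dp->d_name);
        count++;
    }
    if (errno != 0)
        perror("readdir");          /* e.g. EOVERFLOW with a 32-bit ino_t */
    closedir(dirp);
    return count;
}
```

Printing dirent_ino_bits() from a -m32 build is a quick way to check whether the macro took effect.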

unix command result to a variable - char*

How can I assign the result of "pwd" (or any other command, for that matter), i.e. the present working directory, to a variable of type char*?
The command can be anything; it is not limited to "pwd".
Thanks.

Start with popen. That will let you run a command with its standard output directed to a FILE * that your parent can read. From there it's just a matter of reading its output like you would any normal file (e.g., with fgets, getchar, etc.)
Generally, however, you'd prefer to avoid running an external program for that -- you should have getcwd available, which will give the same result much more directly.

Why not just call getcwd()? It's not part of C's standard library, but it is POSIX, and it's very widely supported.
Anyway, if pwd was just an example, have a look at popen(). That will run an external command and give you a FILE* with which to read its output.

There is a POSIX function, getcwd() for this - I'd use that.

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int main(void) {
    /* getcwd(NULL, 0) allocates a large enough buffer for us */
    char *dir = getcwd(NULL, 0);
    if (dir == NULL) {
        perror("getcwd");
        return 1;
    }
    printf("Current directory is: %s\n", dir);
    free(dir);
    return 0;
}
I'm lazy, and like the NULL, 0 parameters, which is a GNU extension to allocate as large a buffer as necessary to hold the full pathname. (It can probably still fail, if you're buried a few hundred thousand characters deep.)
Because it is allocated for you, you need to free(3) it when you're done. I'm done with it quickly, so I free(3) it quickly, but that might not be how you need to use it.

You can fork and use one of the execv* functions to call pwd from your C program, but getting the result of that would be messy at best.
The proper way to get the current working directory in a C program is to call char* getcwd(char* name, size_t size);
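A minimal sketch of that call with a caller-supplied buffer, which avoids relying on the allocation extension. It assumes a POSIX system; current_dir is a hypothetical wrapper name, and the fallback value for PATH_MAX is an assumption for systems that do not define it.

```c
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 4096   /* assumed fallback where limits.h does not define it */
#endif

/* Hypothetical wrapper: stores the current working directory in buf.
   Returns 0 on success, -1 on failure (for example when buf is too small). */
int current_dir(char *buf, size_t size)
{
    if (getcwd(buf, size) == NULL) {
        perror("getcwd");
        return -1;
    }
    return 0;
}
```

A typical caller declares char buf[PATH_MAX]; and passes sizeof buf as the size.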

Executing machine code in memory

I'm trying to figure out how to execute machine code stored in memory.
I have the following code:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char* argv[])
{
FILE* f = fopen(argv[1], "rb");
fseek(f, 0, SEEK_END);
unsigned int len = ftell(f);
fseek(f, 0, SEEK_SET);
char* bin = (char*)malloc(len);
fread(bin, 1, len, f);
fclose(f);
return ((int (*)(int, char *)) bin)(argc-1, argv[1]);
}
The code above compiles fine in GCC, but when I try and execute the program from the command line like this:
./my_prog /bin/echo hello
The program segfaults. I've figured out the problem is on the last line, as commenting it out stops the segfault.
I don't think I'm doing it quite right, as I'm still getting my head around function pointers.
Is the problem a faulty cast, or something else?

You need a page with write and execute permissions. See mmap(2) and mprotect(2) if you are on Unix. You shouldn't do it using malloc.
Also, read what the others said: your loader can only run raw machine code. If you try to run an ELF header, it will probably segfault all the same.
Regarding the content of replies and downmods:
1- OP said he was trying to run machine code, so I replied on that rather than executing an executable file.
2- See why you don't mix malloc and mman functions:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/mman.h>
int main()
{
char *a=malloc(10);
char *b=malloc(10);
char *c=malloc(10);
memset (a,'a',4095);
memset (b,'b',4095);
memset (c,'c',4095);
puts (a);
memset (c,0xc3,10); /* return */
/* c is not aligned to a page boundary, so this is a NOOP.
Many implementations include a header in malloc'ed data, so it's always a NOOP. */
mprotect(c,10,PROT_READ|PROT_EXEC);
b[0]='H'; /* oops, it is still writable. If you provided an aligned
address it would segfault */
char *d=mmap(0,4096,PROT_READ|PROT_WRITE|PROT_EXEC,MAP_PRIVATE|MAP_ANON,-1,0);
memset (d,0xc3,4096);
((void(*)(void))d)();
((void(*)(void))c)(); /* oops it isn't executable */
return 0;
}
It displays exactly this behavior on Linux x86_64; other ugly behavior is sure to arise on other implementations.

Using malloc works fine.
OK, this is my final answer. Please note I used the original poster's code.
I'm loading from disk the compiled version of this code into a heap-allocated area "bin", just as the original code did. The file name is fixed rather than taken from argv, and the value 0x674 comes from:
objdump -F -D foo|grep -i hoho
08048674 <hohoho> (File Offset: 0x674):
This can be looked up at run time with BFD (the Binary File Descriptor library) or something else. You can call other binaries (not just yourself) so long as they are statically linked to the same set of libraries.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
unsigned char *charp;
unsigned char *bin;
void hohoho()
{
printf("merry mas\n");
fflush(stdout);
}
int main(int argc, char **argv)
{
int what;
charp = malloc(10101);
memset(charp, 0xc3, 10101);
mprotect(charp, 10101, PROT_EXEC | PROT_READ | PROT_WRITE);
__asm__("leal charp, %eax");
__asm__("call (%eax)" );
printf("am I alive?\n");
char *more = strdup("more heap operations");
printf("%s\n", more);
FILE* f = fopen("foo", "rb");
fseek(f, 0, SEEK_END);
unsigned int len = ftell(f);
fseek(f, 0, SEEK_SET);
bin = (char*)malloc(len);
printf("read in %d\n", fread(bin, 1, len, f));
printf("%p\n", bin);
fclose(f);
mprotect(&bin, 10101, PROT_EXEC | PROT_READ | PROT_WRITE);
asm volatile ("movl %0, %%eax"::"g"(bin));
__asm__("addl $0x674, %eax");
__asm__("call %eax" );
fflush(stdout);
return 0;
}
running...
co tmp # ./foo
am I alive?
more heap operations
read in 30180
0x804d910
merry mas
You can use UPX to manage the load/modify/exec of a file.
P.S. sorry for the previous broken link :|

It seems to me you're loading an ELF image and then trying to jump straight into the ELF header? http://en.wikipedia.org/wiki/Executable_and_Linkable_Format
If you're trying to execute another binary, why don't you use the process creation functions for whichever platform you're using?

A typical executable file has:
a header
entry code that is called before main(int, char **)
The first means that you can't generally expect byte 0 of the file to be executable; instead, the information in the header describes how to load the rest of the file into memory and where to start executing it.
The second means that when you have found the entry point, you can't expect to treat it like a C function taking arguments (int, char **). It may, perhaps, be usable as a function taking no parameters (and hence requiring nothing to be pushed prior to calling it). But you do need to populate the environment that will in turn be used by the entry code to construct the command line strings passed to main.
Doing this by hand under a given OS would go into some depth which is beyond me; but I'm sure there is a much nicer way of doing what you're trying to do. Are you trying to execute an external file as a one-off operation, or load an external binary and treat its functions as part of your program? Both are catered for by the C libraries in Unix.

It is more likely that the code jumped to through the function pointer is causing the segfault, rather than the call itself. There is no way to determine from the code you have posted whether the code loaded into bin is valid. Your best bet is to use a debugger, switch to assembler view, break on the return statement and step into the function call to determine that the code you expect to run is indeed running, and that it is valid.
Note also that in order to run at all, the code will need to be position independent and fully resolved.
Moreover, if your processor/OS enables data execution prevention, then the attempt is probably doomed. It is at best ill-advised in any case; loading code is what the OS is for.

What you are trying to do is something akin to what interpreters do. Except that an interpreter reads a program written in an interpreted language like Python, compiles that code on the fly, puts executable code in memory and then executes it.
You may want to read more about just-in-time compilation too:
Just in time compilation
Java HotSpot JIT runtime
There are libraries available for JIT code generation, such as GNU lightning and libJIT, if you are interested. You'd have to do a lot more than just reading from a file and trying to execute code, though. An example usage scenario would be:
Read a program written in a scripting language (maybe your own).
Parse and compile the source into an intermediate language understood by the JIT library.
Use the JIT library to generate code for this intermediate representation, for your target platform's CPU.
Execute the JIT-generated code.
And for executing the code you'd have to use techniques such as using mmap() to map the executable code into the process's address space, marking that page executable and jumping to that piece of memory. It's more complicated than this, but it's a good start for understanding what's going on beneath all those interpreters of scripting languages such as Python, Ruby etc.
The online version of the book "Linkers and Loaders" will give you more information about object file formats, what goes on behind the scenes when you execute a program, the roles of the linkers and loaders and so on. It's a very good read.

You can dlopen() a file, look up the symbol "main" and call it with 0, 1, 2 or 3 arguments (all of type char*) via a cast to pointer-to-function-returning-int-taking-0,1,2,or3-char*

Use the operating system for loading and executing programs.
On unix, the exec calls can do this.
Your snippet in the question could be rewritten:
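#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char* argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s program [args...]\n", argv[0]);
        return 2;
    }
    /* argv+1, so that the new program sees its own name as argv[0] */
    execv(argv[1], argv + 1);
    perror("execv"); /* only reached if execv fails */
    return 1;
}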
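Since a successful execv replaces the calling process, a program that wants to keep running afterwards usually pairs it with fork() and waitpid(). A minimal sketch, assuming a POSIX system; run_program is a hypothetical helper name, and the exit code 127 for a failed exec follows the shell convention.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with the given NULL-terminated argument
   vector in a child process and waits for it. Returns the child's exit
   status, or -1 if it could not be started or did not exit normally. */
int run_program(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: replace this process image with the requested program. */
        execv(argv[0], argv);
        perror("execv");    /* only reached if execv failed */
        _exit(127);
    }
    /* Parent: wait for the child to finish. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the example in the question, the argument vector would be { "/bin/echo", "hello", NULL }.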

Executable files contain much more than just code. Header, code, data, more data: this stuff is separated and loaded into different areas of memory by the OS and its libraries. You can't load a program file into a single chunk of memory and expect to jump to its first byte.
If you are trying to execute your own arbitrary code, you need to look into dynamic libraries because that is exactly what they're for.

uses undefined struct compile error - C

The compiler doesn't know where stat.h is?
Error:
c:\Projects\ADC_HCI\mongoose.c(745) : error C2079: 'st' uses undefined struct '_stat64'
#include <sys/types.h>
#include <sys/stat.h>
static int
mg_stat(const char *path, struct mgstat *stp)
{
struct _stat64 st; //<-- ERROR
int ok;
wchar_t wbuf[FILENAME_MAX];
to_unicode(path, wbuf, ARRAY_SIZE(wbuf));
if (_wstat64(wbuf, &st) == 0) {
ok = 0;
stp->size = st.st_size;
stp->mtime = st.st_mtime;
stp->is_directory = S_ISDIR(st.st_mode);
} else {
ok = -1;
}
return (ok);
}
...downloaded the files straight from the source.

See MSDN: _wstat64 takes a parameter of struct __stat64 (with two underscores). Redeclare your variable st to be of type struct __stat64.

Note that neither _stat64 nor __stat64 is 'standard' in the sense of documented by any standard, such as POSIX. You would normally use struct stat; if you are worried about whether that will work with big files (over 2 GiB), then check what compilation options are required on your platform to obtain 'large file support'. For 64-bit machines and 64-bit compilations (not necessarily Windows 64), you usually don't need to worry. You can often obtain large file support using:
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
These are at least semi-standardized. Systems such as autoconf detect these things automatically (if you ask them to do so).
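A minimal sketch of how the two naming schemes can sit behind one function: struct __stat64 with _stat64() on the Microsoft CRT, and plain struct stat elsewhere. The helper name file_size is hypothetical, and the Windows branch is written from the MSDN signature mentioned above rather than tested here.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: size of the file at path in bytes, or -1 on error. */
long long file_size(const char *path)
{
#ifdef _WIN32
    struct __stat64 st;         /* two underscores, as _stat64() expects */

    if (_stat64(path, &st) != 0)
        return -1;
#else
    struct stat st;             /* build with -D_FILE_OFFSET_BITS=64 on 32-bit targets */

    if (stat(path, &st) != 0)
        return -1;
#endif
    return (long long)st.st_size;
}
```

Keeping the platform-specific structure inside one function means the rest of the code only ever sees a long long.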

Change the _stat64 to stat64. At least on my Linux machines, that's the name of the structure. I don't know if it is different on Windows.

I suggest you sync to the SVN trunk.
If you don't have an SVN client, simply download two files:
http://mongoose.googlecode.com/svn/trunk/mongoose.h (and the .c file too)
The reason is that the code was recently refactored, and the CRT _stat function was replaced with the WinAPI one, GetFileAttributesExW().
