I have recently (yesterday) started trying to learn Linux and to program on this OS. One interesting and probably easy problem I came across while surfing the net was something like this:
Consider a C program that takes a directory as an argument in the command line and calculates the sum of all the files' dimensions that are in the directory's tree.
Now, because I've been doing a lot of reading and researching in a short amount of time, all my knowledge is piled up in my brain creating a cloud of confusion. If anyone could help me with the code, I'd be really thankful.
What you are asking is a basic task. It can be done in Linux, and also in Microsoft Windows with minor code tweaks, if you are writing a program in C or C++. You would be writing code at a lower level compared to other ways of doing it.
However, you don't need to write a C program, which you would then have to compile into an executable. Because this is a basic task, you might be able to do it with a bash shell script, which would be Linux specific. If you wanted to do this in Windows, you would write either a .bat file (the DOS scripting language) or a Windows PowerShell script. I am not that familiar with Windows; I only mention it to help give you a general understanding for "all the knowledge piled up in your brain creating a cloud of confusion".
There is the WinDirStat program, which runs under Microsoft Windows. You can get it free from SourceForge, and I think it does mostly what you are asking. I am not sure if you can get the source code for it.
For Linux there is KDirStat, and you can get the source code for it from
http://kdirstat.cvs.sourceforge.net/viewvc/kdirstat/
You can download it as a GNU tarball.
Look at how that program is written. It is C++, as you'll see from the bunch of .cpp files. That would be a good template to work off of, and you can see what libraries they are using to accomplish file system functions. There are 21 .cpp files; look at the file kdirstatmain.cpp first.
For C/C++ code the start of execution is with the function int main(int argc, char *argv[]).
Regarding accomplishing this task with a bash shell script in Linux, the best I can tell you is to do a web search on bash shell scripting for Linux.
And in Linux, to calculate the sum of all the files' sizes in a directory's tree, we can quickly do that at the prompt with the du -sh . command. At the prompt, do man du to read about the disk usage command. Then consider looking for the source code for du to use as a template, and work off how they implemented du to learn and then modify their ways to meet your needs.
linux du command source code
Use opendir(3) to "open" the directory. Since you are interested in learning how to program in GNU/Linux, start by typing man opendir in the terminal to read how this function works. The (3) in opendir(3) means that the help for this function can be found in the section 3 of the manpages. Notice, at the top of the page, that the manpage tells you which #includes you'll need.
If everything goes right, opendir(3) will return a DIR* object. To know which files or subdirectories it contains, you use this object with readdir(3). This should return a pointer of type struct dirent*. You can check the manual pages for details on the fields of this structure, but the most important for you will probably be d_type and d_name. A second call to this function will return the next entry. When it returns NULL, that either means you have read all entries or an error occurred. To know which happened, set errno to 0 before the call and check it afterwards.
Here's a short example that lists all entries in /tmp:
#include <stdio.h>
#include <dirent.h>
#include <sys/types.h>
int main(void)
{
    DIR *dir;
    struct dirent *entry;
    dir = opendir("/tmp");
    if (dir == NULL) {
        perror("opendir");
        return 1;
    }
    while ((entry = readdir(dir)) != NULL) {
        printf("Found %s\n", entry->d_name);
    }
    /* You may want to check errno here to see if readdir returned
     * NULL because all files were read or because of some error;
     * but this is beyond the purposes of my example.
     */
    closedir(dir);
    return 0;
}
Now you have to process each entry. If it is a directory, you have to descend into it and read its contents. A recursive function will probably help you here. If it is a file, then you have at least two options:
Open it with fopen(3), then use fseek(3) to seek to the end of the file. fseek(3) itself only returns 0 on success, so call ftell(3) afterwards to get the current position, which is the size of the file in bytes;
Use stat(2) to get a structure with information on the file. Do not confuse it with stat(1). If you simply type man stat, you'll get information about the latter. To force man to read from section 2, type man 2 stat in the command line.
The first approach is certainly simpler. The second will require you to do a bit of reading on how stat(2) works. My advice: you should do it. Not only because it's more in line with how things are done on Linux, but also because it gives you information that fseek(3) and ftell(3) don't. For instance, you can use stat(2) to see not only how many bytes the file contains, but how many bytes it occupies on the disk (like du does).
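Here is a rough sketch of both options side by side. The function names size_with_ftell and size_with_stat are made up for this example. Note that fseek(3) returns 0 on success rather than a position, so the sketch asks ftell(3) for the offset. The on-disk figure relies on st_blocks being counted in 512-byte units, as the stat(2) manual page describes.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <sys/stat.h>

/* Option 1 (hypothetical helper): seek to the end, then ask for the position.
 * Returns the size in bytes, or -1 on error. */
long size_with_ftell(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0)   /* fseek returns 0 on success */
        size = ftell(fp);              /* ftell reports the offset   */
    fclose(fp);
    return size;
}

/* Option 2 (hypothetical helper): ask the kernel with stat(2).
 * Stores the file length in *bytes and the space it occupies on disk
 * in *disk_bytes. Returns 0 on success, -1 on error. */
int size_with_stat(const char *path, long long *bytes, long long *disk_bytes)
{
    struct stat st;
    if (stat(path, &st) == -1)
        return -1;

    *bytes = st.st_size;
    *disk_bytes = (long long)st.st_blocks * 512;  /* 512-byte units */
    return 0;
}
```

The stat(2) version never opens the file, so it also works on files you are not allowed to read, as long as you can search the directories leading to them.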
While reading the directory, you may stumble on entries other than files and directories. stat(2) will probably help you figure out the sizes of them as well. But you may want to simply ignore them for now.
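Putting the pieces together, a minimal sketch of the recursive walk could look like the following. The name tree_size is hypothetical, and I am assuming that only regular files should be counted and that symbolic links should not be followed (hence lstat(2) instead of stat(2)). Treat it as a starting point, not a finished du replacement.

```c
#define _XOPEN_SOURCE 700
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: adds the size in bytes of every regular file under
 * `path` to *total. Returns 0 on success, -1 if something could not be read. */
int tree_size(const char *path, long long *total)
{
    struct stat st;

    /* lstat() does not follow symbolic links, which avoids link loops. */
    if (lstat(path, &st) == -1) {
        perror(path);
        return -1;
    }
    if (S_ISREG(st.st_mode)) {
        *total += st.st_size;
        return 0;
    }
    if (!S_ISDIR(st.st_mode))
        return 0;                      /* ignore links, sockets, devices, ... */

    DIR *dir = opendir(path);
    if (dir == NULL) {
        perror(path);
        return -1;
    }

    int status = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Skip "." and "..", otherwise the recursion would never end. */
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;

        char child[PATH_MAX];
        int n = snprintf(child, sizeof child, "%s/%s", path, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof child) {
            fprintf(stderr, "%s/%s: path too long\n", path, entry->d_name);
            status = -1;
            continue;
        }
        if (tree_size(child, total) == -1)
            status = -1;               /* keep going, but remember the failure */
    }
    closedir(dir);
    return status;
}
```

To turn this into the program from the question, a main() would start with a total of 0, call tree_size(argv[1], &total) and print the total.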
I am writing a C program that takes system commands such as "ls" or "cd" as inputs. However, the user can type anything, and some of the inputs are not commands. How can I find which command is valid and which is not? I am writing the code in Ubuntu.
Off the top of my head there are two ways to check if the input is a valid system command, without actually attempting to run it:
1. A long list of if()s and else if()s which strcmp() the input string with a hard-coded, predetermined list of valid commands. This may be relatively slow, both to write and to run, but with conditional compilation using #ifdef it can be nearly perfectly portable (i.e., it can be made to work with Windows, Linux, BSD et al. from one codebase with enough hard work).
2. If you don't mind being restricted to UNIX-like platforms only, parse the $PATH variable, search the directories found in $PATH for an executable with the same filename as the input string, and handle the error if no match is found.
You may wish to implement a hybrid of 2. and 1. by hard-coding some exceptions which may not be found in $PATH.
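A minimal sketch of the $PATH approach, assuming a POSIX system. The names command_in_path and is_executable_file are made up for this example. Shell builtins such as cd usually have no executable of their own in $PATH, which is exactly the kind of exception you would hard-code.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: true if `path` is a regular file we may execute. */
static int is_executable_file(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 && S_ISREG(st.st_mode) && access(path, X_OK) == 0;
}

/* Hypothetical helper: returns 1 if `cmd` names an executable found in one
 * of the directories listed in $PATH, 0 otherwise. */
int command_in_path(const char *cmd)
{
    if (cmd == NULL || *cmd == '\0')
        return 0;

    /* A name containing '/' is already a path; $PATH is not searched. */
    if (strchr(cmd, '/') != NULL)
        return is_executable_file(cmd);

    const char *start = getenv("PATH");
    if (start == NULL)
        return 0;

    for (;;) {
        const char *end = strchr(start, ':');
        size_t len = end ? (size_t)(end - start) : strlen(start);

        char candidate[4096];
        int n;
        if (len == 0)                  /* empty element: current directory */
            n = snprintf(candidate, sizeof candidate, "./%s", cmd);
        else
            n = snprintf(candidate, sizeof candidate, "%.*s/%s",
                         (int)len, start, cmd);

        if (n > 0 && (size_t)n < sizeof candidate && is_executable_file(candidate))
            return 1;
        if (end == NULL)
            break;                     /* that was the last element */
        start = end + 1;
    }
    return 0;
}
```

Keep in mind that the result can be stale by the time you actually run the command, so you still have to handle a failing exec.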
IMHO, however, I fail to see why you would want to do this; it seems puzzling to me.
Xcode's generic Kernel Extension requires file parsing.
For example, I want to read the contents of the A.txt file and save it in a variable, just as you would with FILE, fopen and EOF in C.
As you can see, generic Kernel Extension can not include stdio.h, resulting in an error of use of undeclared identifier.
I am wondering if there is a way to parse a file in generic Kernel Extension like c.
(The following code cannot be used in a Kernel Extension)
FILE *f;
int c; /* int rather than char, so that EOF can be told apart from data */
int index = 0;
f = fopen(filepath, "rt");
while((c = fgetc(f)) != EOF){
    fileContent[index] = (char)c;
    index++;
}
fileContent[index] = '\0';
It is certainly possible. You'll need to do the following:
Open the file with vnode_open(). This will turn your path into a vnode_t reference. You'll need a VFS authorisation context; you can obtain the current thread's context (i.e. open the file as the user in whose process's context the kernel is currently running) with vfs_context_create() if you don't already have one.
Perform I/O with vn_rdwr(). (Reads & writes use the same function, just pass UIO_READ or UIO_WRITE as the first argument.)
Close the file and drop references to the vnode with vnode_close(). Possibly dispose of a created VFS context using vfs_context_rele().
You'll want to look at the headerdocs for all of those functions, they're defined in <sys/vnode.h> in the Kernel.framework, and explaining every parameter exceeds the scope of a SO question/answer.
Note: As a commenter has already pointed out however, you'll want to make sure that opening files is really what needs to be done to solve whatever your problem is, particularly if you're newish to kernel programming. If at all unsure, I suggest you post a question along the lines of "I'm trying to do X, is reading the file in a kext really the best way forward?" where X is sufficiently high level, not "I need the contents of a file in the kernel" but why, and why a file specifically?
In various kernel execution contexts, file I/O may not be safe (i.e. may sometimes hang the system). If your kext loads early during boot, there might not be a file system yet. File I/O causes a lot to happen in the system, and can take a very long time in kernel terms - especially if you consider network file systems (including netboot environments!). If you're not careful, you might cause a bad user experience if the user is trying to eject a volume with a file your kext has open: the user has no way of resolving this, the OS can only suggest specific apps to close, it can't reach deep into your kext. Plus, there's the usual warnings about kernel programming in general: just because it can be done in the kernel, doesn't mean it should be. It's more the opposite: only if it can't be done any other way should it be done in a kext.
I am writing a C shared library in Linux in which a function would like to discover the path to the currently running executable. It does NOT have access to argv[0] in main(), and I don't want to require the program accessing the library to pass that in.
How can a function like this, outside main() and in the wild, get to the path of the running executable? So far I've thought of 2 rather unportable, unreliable ways: 1) try to read /proc/getpid()/exe and 2) try to climb the stack to __libc_start_main() and read the stack params. I worry about all machines having /proc mounted.
Can you think of another way? Is there something buried anywhere in dlopen(NULL, 0) ? Can I get a reliable proc image of self from the kernel??
Thanks for any thoughts.
/proc is your best chance, as "path of the executable" is not that well-defined a concept in Linux (you can even delete the executable while the program is running).
To get the breakdown of loaded modules (with the main executable usually being the first entry) you should look at /proc/<pid>/maps. It's a text formatted file which will allow you to associate executable and library paths with load addresses (if the former are known and still valid).
Unless you are writing software that may be used very early in system startup, you can safely assume that /proc will always be mounted on a Linux system. It contains quite a bit of data that is not accessible any other way, and thus must be mounted for a system to function properly. As such, you can pretty easily obtain a path to your executable using:
readlink("/proc/self/exe", buf, sizeof(buf));
If for some reason you want to avoid this, it's also possible to read it from the process's auxiliary vector:
#include <sys/auxv.h>
#include <elf.h>
const char *execpath = (const char *) getauxval(AT_EXECFN);
Note that this will require a recent version of glibc (2.16 or later). It'll also return the path that was used to execute your application (e.g. possibly something like ./binary), rather than its absolute path.
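One detail worth spelling out for the readlink() call: readlink(2) does not append a terminating null byte and silently truncates if the buffer is too small. A small wrapper can take care of both. The name get_exe_path is hypothetical, and treating a completely filled buffer as an error is my own conservative choice.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: stores the path of the running executable in `buf`
 * as a null-terminated string. Returns its length, or -1 on error or if
 * the buffer may have been too small. */
ssize_t get_exe_path(char *buf, size_t size)
{
    if (size == 0)
        return -1;

    ssize_t len = readlink("/proc/self/exe", buf, size - 1);
    if (len == -1)
        return -1;

    /* readlink() truncates silently; a completely filled buffer may mean
     * the path did not fit, so report that as an error. */
    if ((size_t)len == size - 1)
        return -1;

    buf[len] = '\0';                   /* readlink() does not terminate */
    return len;
}
```

If the executable was deleted while running, the kernel appends " (deleted)" to the link target, so the result is not always a path that still exists.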
Platform: Debian Wheezy 3.2.0-4-686-pae
Compiler: GCC (Debian 4.7.2-5) 4.7.2 (Code::Blocks)
I want to move a file from one location to another. Nothing complex like moving to different drives or to different file systems. I know the "standard" way to do this would be simply copying the file and then removing the original. But I want some way of preserving the file's ownership, mode, last access/modification, etc. I am assuming that I will have to copy the file and then edit the new file's ownership, mode, etc. afterwards, but I have no idea how to do this.
The usual way to move a file in C is to use rename(2), which sometimes fails.
If you cannot use the rename(2) syscall (e.g. because source and target are on different filesystems), you have to query the size, permission and other metadata of the source file with stat(2); copy the data using open(2), a loop on read(2) and write(2) (using a buffer of several kilobytes), and close(2); copy the metadata using chmod(2), chown(2), utime(2); and finally remove the source with unlink(2). You might also care about copying extended attributes using getxattr(2), setxattr(2), listxattr(2). You could also in some cases use sendfile(2), as commented by David C. Rankin.
And if the source and target are on different filesystems, there is no way to make the move atomic and avoid race conditions (so using rename(2) is preferable when possible, because it is atomic according to its man page). The source file can always be modified (by another process) during the move operations...
So a practical way to move files is to first try doing a rename(2), and if that fails with EXDEV (when oldpath and newpath are not on the same mounted filesystem), then you need to copy bytes and metadata. Several libraries provide functions doing that, e.g. Qt QFile::rename.
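As a rough sketch of that "rename first, copy on EXDEV" strategy: the names move_file and copy_file are made up here, and I use the file-descriptor variants fchown(2), fchmod(2) and futimens(2) instead of chown(2), chmod(2) and utime(2). Extended attributes are not copied, read(2)/write(2) are not retried on EINTR, and unlike rename(2) the fallback refuses to overwrite an existing target.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copies data, owner, mode and timestamps of src to a
 * new file dst. Returns 0 on success, -1 on error. */
static int copy_file(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in == -1)
        return -1;

    struct stat st;
    if (fstat(in, &st) == -1) { close(in); return -1; }

    /* O_EXCL: fail instead of overwriting an existing target. */
    int out = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (out == -1) { close(in); return -1; }

    char buf[64 * 1024];
    ssize_t nread = 0;
    int ok = 1;
    while (ok && (nread = read(in, buf, sizeof buf)) > 0) {
        char *p = buf;
        while (nread > 0) {            /* write() may write less than asked */
            ssize_t nwritten = write(out, p, (size_t)nread);
            if (nwritten == -1) { ok = 0; break; }
            p += nwritten;
            nread -= nwritten;
        }
    }
    if (nread == -1)
        ok = 0;                        /* read error */

    /* Owner first: changing it can clear set-user-ID bits. Non-root users
     * usually cannot give files away, so EPERM is tolerated here. */
    if (ok && fchown(out, st.st_uid, st.st_gid) == -1 && errno != EPERM) ok = 0;
    if (ok && fchmod(out, st.st_mode & 07777) == -1) ok = 0;
    struct timespec times[2] = { st.st_atim, st.st_mtim };
    if (ok && futimens(out, times) == -1) ok = 0;

    close(in);
    if (close(out) == -1) ok = 0;
    if (!ok) { unlink(dst); return -1; }
    return 0;
}

/* Hypothetical helper: rename if possible, copy and delete otherwise. */
int move_file(const char *src, const char *dst)
{
    if (rename(src, dst) == 0)
        return 0;
    if (errno != EXDEV)
        return -1;                     /* a real error, not "other filesystem" */
    if (copy_file(src, dst) == -1)
        return -1;
    return unlink(src);
}
```

The copy path is not atomic: another process can see a partially written target or modify the source while it is being read.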
Read Advanced Linux Programming - and see syscalls(2) - for more (and also try to strace some mv command to understand what it is doing). That book is freely and legally downloadable (so you could find several copies on the Web).
The /bin/mv command (see mv(1)) is part of GNU coreutils which is free software. You could either study its source code, or use strace(1) to understand what that command does (in terms of syscalls(2)). In some open source Unix shells like sash or busybox, mv might be a shell builtin. See also path_resolution(7) and glob(7).
There are subtle corner cases (imagine another process or pthread doing some file operations on the same filesystem, directory, or files). Read some operating system textbook for more.
Using a mix of snprintf(3), system(3), mv(1) could be tricky if the file name contains weird characters such as tabs or newlines, or starts with an initial -. See errno(3).
If the original and new location for the file are on the same filesystem then a "move" is conceptually identical to a "rename."
#include <stdio.h>
int rename(const char *oldname, const char *newname);
I need to write a program that crawls websites, and I already have an algorithm and some parts of the code. The problem is, I do not know how to insert wget into my source code. Our student assistant hinted that some kind of keyword or function should be used before the wget (system, I think, or something like that, but I'm not so sure).
When not to use system:
1.) when you want to distribute the program to different environments, where the program you call via system may not be available
2.) in a security-relevant environment, where you have to make sure that the program you call is really the program you want it to be
3.) when the thing you want to do can easily be accomplished in 10-20 lines of C code
4.) in performance-critical applications
So you should use system virtually never.
Instead, to accomplish the same thing, you could use libcurl, as David suggested (his answer seems to be gone...), or do some socket programming (it's C, after all).
In a real-world scenario, I'd probably just default to writing the crawler in a different language. Web requests and complex string processing are not necessarily the strong sides of C, and most definitely not very convenient to do in it :)
You can use the system() function.
In your case (possibly):
system("/bin/wget");
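But if you really want to call wget with parameters, you should use execl(). The first argument after the path is the program's argv[0], and the argument list must end with a null pointer:
execl("/bin/wget", "wget", "http://anyadress.com/file", (char *)NULL);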
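One thing to be aware of: the exec functions replace the calling process, so a crawler that wants to keep running normally calls fork(2) first and waits for the child. A minimal sketch, with the made-up name run_command and execvp(3) so that the program is looked up in $PATH:

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] (looked up in $PATH) with the given
 * null-terminated argument vector and waits for it to finish.
 * Returns the child's exit status, or -1 if it could not be run or did
 * not exit normally. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: replace this process with the requested program. */
        execvp(argv[0], argv);
        _exit(127);                    /* only reached if execvp() failed */
    }

    /* Parent: wait for the child and report how it ended. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the crawler, the argument vector would be something like { "wget", "-q", "http://anyadress.com/file", NULL }. Because no shell is involved, odd characters in the URL are passed to wget unchanged.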
Whenever you want to run shell commands from your C program, you use system("shell command"). In your case:
system("wget");
Note - wget is an executable whose location is normally listed in the PATH environment variable, so there is no need to specify the path explicitly.
-- Example --
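#include <stdio.h>
#include <stdlib.h>
#define BUFFLEN 2500
int main(void)
{
    char web_address[BUFFLEN] = "www.google.com";
    char command[BUFFLEN + 16];
    /* Build the command string; system() hands it to the shell as-is,
     * so web_address must not contain quote characters. */
    snprintf(command, sizeof command, "wget '%s'", web_address);
    system(command);
    return 0;
}
The system() function is used to execute a shell command; see man system.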