Print out contents of archived files in C

I have the same question posted here:
How to print the name of the files inside an archive file?
but those answers don't necessarily address the problem. I have an archived file week.a and I'd like to print out the names of the files inside that archive, called mon.txt, and fri.txt.
It should work just like the ar -t command, except I'm not allowed to use that.
What I've tried:
My first attempt was to create a for loop and print out the arguments, but then I realized the file is already archived by that point, and so that wouldn't work.
My second attempt was to look at the print_contents function of ar, which I've listed below:
static void
print_contents (bfd *abfd)
{
size_t ncopied = 0;
char *cbuf = (char *) xmalloc (BUFSIZE);
struct stat buf;
size_t size;
if (bfd_stat_arch_elt (abfd, &buf) != 0)
/* xgettext:c-format */
fatal (_("internal stat error on %s"), bfd_get_filename (abfd));
if (verbose)
printf ("\n<%s>\n\n", bfd_get_filename (abfd));
bfd_seek (abfd, (file_ptr) 0, SEEK_SET);
size = buf.st_size;
while (ncopied < size)
{
size_t nread;
size_t tocopy = size - ncopied;
if (tocopy > BUFSIZE)
tocopy = BUFSIZE;
nread = bfd_bread (cbuf, (bfd_size_type) tocopy, abfd);
if (nread != tocopy)
/* xgettext:c-format */
fatal (_("%s is not a valid archive"),
bfd_get_filename (bfd_my_archive (abfd)));
/* fwrite in mingw32 may return int instead of size_t. Cast the
return value to size_t to avoid comparison between signed and
unsigned values. */
if ((size_t) fwrite (cbuf, 1, nread, stdout) != nread)
fatal ("stdout: %s", strerror (errno));
ncopied += tocopy;
}
free (cbuf);
}
But with this route, I don't really know what a lot of that code means or does (I'm very new to C). Could someone help make sense of this code, or point me in the right direction for writing my program? Thank you.

Based on the format described at wikipedia.org/wiki/Ar_(Unix), the basic shape of your program will be:
fopen(filename)
fscanf 8 characters /* global header */
check header is "!<arch>" followed by LF
while not at end of file /* check return value of fscanf below */
fscanf each item in file header
print filename /* first 16 characters of file header */
check magic number is 0x60 0x0A
skip file size characters /* file contents - can use fseek with origin = SEEK_CUR; the data is padded to an even length */
fclose(file)
Refer to the C stdio library documentation for details of the functions needed, or see the Wikipedia article on C file input/output.
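The outline above can be sketched in C roughly as follows. This is a minimal sketch, not the code of ar itself: the function name list_archive and the two macros are made up for illustration, it reads the fixed 60-byte member headers with fread rather than fscanf, and it assumes short member names only (GNU special members whose names start with '/' are skipped, and long-name tables are not resolved).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define AR_MAGIC    "!<arch>\n"   /* 8-byte global header */
#define AR_HDR_SIZE 60            /* every member header is 60 bytes */

/* Hypothetical helper: print one member name per line to `out`.
 * Returns the number of names printed, or -1 on a malformed archive. */
int list_archive(const char *path, FILE *out)
{
    char magic[8];
    char hdr[AR_HDR_SIZE];
    int count = 0;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    if (fread(magic, 1, sizeof magic, f) != sizeof magic ||
        memcmp(magic, AR_MAGIC, sizeof magic) != 0) {
        fclose(f);
        return -1;
    }
    while (fread(hdr, 1, sizeof hdr, f) == sizeof hdr) {
        char name[17];
        char sizetxt[11];
        long size;
        size_t n;

        /* bytes 58..59 of each header hold the magic 0x60 0x0A */
        if (hdr[58] != 0x60 || hdr[59] != 0x0A) {
            fclose(f);
            return -1;
        }
        /* bytes 0..15: name, space padded; GNU ar appends a '/' */
        memcpy(name, hdr, 16);
        name[16] = '\0';
        n = strlen(name);
        while (n > 0 && name[n - 1] == ' ')
            name[--n] = '\0';
        if (n > 1 && name[n - 1] == '/')
            name[--n] = '\0';

        /* bytes 48..57: member size as decimal text */
        memcpy(sizetxt, hdr + 48, 10);
        sizetxt[10] = '\0';
        size = strtol(sizetxt, NULL, 10);
        if (size < 0) {
            fclose(f);
            return -1;
        }
        if (name[0] != '/') {          /* skip GNU special members */
            fprintf(out, "%s\n", name);
            count++;
        }
        /* skip the data, which is padded to an even number of bytes */
        if (fseek(f, size + (size % 2), SEEK_CUR) != 0) {
            fclose(f);
            return -1;
        }
    }
    fclose(f);
    return count;
}
```

For the archive in the question, calling list_archive("week.a", stdout) from main would be the intended use.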

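#include <ar.h>      /* struct ar_hdr, SARMAG */
#include <stdio.h>
#include <stdlib.h>

/* Count the members of an ar archive that was opened in binary mode. */
int counting(FILE *f)
{
    int count = 0;
    struct ar_hdr myheader;

    rewind(f);
    fseek(f, SARMAG, SEEK_CUR);   /* skip the 8-byte "!<arch>\n" global header */
    while (fread(&myheader, sizeof(struct ar_hdr), 1, f) == 1)
    {
        /* ar_size is decimal text padded with spaces; atol stops at the padding */
        long test = atol(myheader.ar_size);
        if (test % 2 != 0)
            test++;               /* member data is padded to an even length */
        fseek(f, test, SEEK_CUR);
        count++;
    }
    printf("count is : %d\n", count);
    return count;
}
I wrote this code to count the number of files in an archive. You can adapt it to print the file names as well; each name is in myheader.ar_name.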

Related

Listing directory's files using open() and read() syscalls in POSIX systems

I was wondering how to do this. I have tried several things, but nothing seems to work for me. I don't want to use opendir() or readdir(). Could you please tell me how to do this? I want to list the files inside a folder, but with the code below I only get garbage values stored in the buffer.
char buffer[16];
size_t offset = 0;
size_t bytes_read;
int i;
/* Open the file for reading. */
int fd = open ("testfolder", O_RDONLY);
/* Read from the file, one chunk at a time. Continue until read
“comes up short”, that is, reads less than we asked for.
This indicates that we’ve hit the end of the file. */
do {
/* Read the next line’s worth of bytes. */
bytes_read = read (fd, buffer, 16);
/* Print the offset in the file, followed by the bytes themselves.*/
// printf ("0x%06lx : ", offset);
// for (i = 0; i < bytes_read; ++i)
// printf ("%02x ", buffer[i]);
printf("%s", buffer);
printf ("\n");
/* Keep count of our position in the file. */
// offset += bytes_read;
}
while (bytes_read!=-1);
You can't do this. The kernel only allows open on a directory with special options, and it doesn't allow read on a directory at all. You have to use opendir and readdir.
(Under the hood, opendir calls open with those special options I mentioned, and readdir calls the private system call getdents. See https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/opendir.c and https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/readdir.c . Doing this yourself is not recommended.)
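For comparison, the recommended opendir/readdir route is short. The following is a minimal sketch; the function name list_directory is made up for illustration, and skipping "." and ".." is a choice of this sketch rather than something the question asked for.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print every entry of `path` except "." and "..".
 * Returns the number of entries printed, or -1 if the directory
 * cannot be opened. */
int list_directory(const char *path, FILE *out)
{
    DIR *dir = opendir(path);
    struct dirent *entry;
    int count = 0;

    if (dir == NULL)
        return -1;
    /* readdir returns NULL once every entry has been returned */
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        fprintf(out, "%s\n", entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

With the directory from the question, the call would be list_directory("testfolder", stdout).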

fgets statement reads first line and not sure how to modify because I have to return a pointer [duplicate]

I need to copy the contents of a text file to a dynamically-allocated character array.
My problem is getting the size of the contents of the file; Google reveals that I need to use fseek and ftell, but for that the file apparently needs to be opened in binary mode, and that gives only garbage.
EDIT: I tried opening in text mode, but I get weird numbers. Here's the code (I've omitted simple error checking for clarity):
long f_size;
char* code;
size_t code_s, result;
FILE* fp = fopen(argv[0], "r");
fseek(fp, 0, SEEK_END);
f_size = ftell(fp); /* This returns 29696, but file is 85 bytes */
fseek(fp, 0, SEEK_SET);
code_s = sizeof(char) * f_size;
code = malloc(code_s);
result = fread(code, 1, f_size, fp); /* This returns 1045, it should be the same as f_size */
The root of the problem is here:
FILE* fp = fopen(argv[0], "r");
argv[0] is your executable program, NOT the parameter. It certainly won't be a text file. Try argv[1], and see what happens then.
You cannot determine the size of a file in characters without reading the data, unless you're using a fixed-width encoding.
For example, a file in UTF-8 which is 8 bytes long could be anything from 2 to 8 characters in length.
That's not a limitation of the file APIs, it's a natural limitation of there not being a direct mapping from "size of binary data" to "number of characters."
If you have a fixed-width encoding then you can just divide the size of the file in bytes by the number of bytes per character. ASCII is the most obvious example of this, but if your file is encoded in UTF-16 and you happen to be on a system which treats UTF-16 code units as the "native" internal character type (which includes Java, .NET and Windows) then you can predict the number of "characters" to allocate as if UTF-16 were fixed width. (UTF-16 is variable width because Unicode characters above U+FFFF are encoded as two code units, a surrogate pair, but a lot of the time developers ignore this.)
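To make the byte count versus character count distinction concrete, here is a small sketch that counts code points in a UTF-8 buffer. The function name utf8_count is made up for illustration, and the sketch assumes the input is valid UTF-8; it does no validation.

```c
#include <stddef.h>

/* Hypothetical helper: count the code points in `nbytes` bytes of
 * (assumed valid) UTF-8. Continuation bytes have the bit pattern
 * 10xxxxxx; every other byte starts a new code point. */
size_t utf8_count(const char *s, size_t nbytes)
{
    size_t count = 0;
    size_t i;

    for (i = 0; i < nbytes; i++) {
        if (((unsigned char) s[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

An 8-byte buffer of plain ASCII gives 8, while 8 bytes holding two 4-byte sequences give 2, which is the range mentioned above.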
I'm pretty sure argv[0] won't be a text file.
Give this a try (haven't compiled this, but I've done this a bazillion times, so I'm pretty sure it's at least close):
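#include <stdio.h>
#include <stdlib.h>

/* Returns a NUL-terminated copy of the file, or NULL on failure.
   The caller must free() the result. */
char* readFile(const char* filename)
{
    FILE* file = fopen(filename, "r");
    if (file == NULL)
    {
        return NULL;
    }
    fseek(file, 0, SEEK_END);
    long int size = ftell(file);
    if (size < 0)
    {
        fclose(file);
        return NULL;
    }
    rewind(file);
    char* content = calloc(size + 1, 1);   /* +1 keeps a terminating '\0' */
    if (content != NULL)
    {
        fread(content, 1, size, file);
    }
    fclose(file);
    return content;
}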
If you're developing for Linux (or other Unix-like operating systems), you can retrieve the file-size with stat before opening the file:
#include <stdio.h>
#include <sys/stat.h>
int main() {
struct stat file_stat;
if(stat("main.c", &file_stat) != 0) {
perror("could not stat");
return (1);
}
printf("%d\n", (int) file_stat.st_size);
return (0);
}
EDIT: Now that I see the code, I have to agree with the other posters:
The array that takes the arguments from the program-call is constructed this way:
[0] name of the program itself
[1] first argument given
[2] second argument given
[n] n-th argument given
You should also check argc before trying to use any element of argv other than argv[0]:
if (argc < 2) {
printf ("Usage: %s arg1", argv[0]);
return (1);
}
argv[0] is the path to the executable, so argv[1] will be the first user-supplied argument. Try changing that and adding some simple error checking, such as checking whether fp == NULL, and we might be able to help you further.
You can open the file, move the cursor to the end of the file, store the offset, and then go back to the start of the file; the stored offset is the size.
You can use fseek for text files as well.
fseek to end of file
ftell the offset
fseek back to the beginning
and you have size of the file
Kind of hard with no sample code, but fstat (or stat) will tell you how big the file is. You allocate the memory required, and slurp the file in.
Another approach is to read the file a piece at a time and extend your dynamic buffer as needed:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define PAGESIZE 128
int main(int argc, char **argv)
{
char *buf = NULL, *tmp = NULL;
size_t bufSiz = 0;
char inputBuf[PAGESIZE];
FILE *in;
if (argc < 2)
{
printf("Usage: %s filename\n", argv[0]);
return 0;
}
in = fopen(argv[1], "r");
if (in)
{
/**
* Read a page at a time until reaching the end of the file
*/
while (fgets(inputBuf, sizeof inputBuf, in) != NULL)
{
/**
* Extend the dynamic buffer by the length of the string
* in the input buffer
*/
tmp = realloc(buf, bufSiz + strlen(inputBuf) + 1);
if (tmp)
{
/**
* Add to the contents of the dynamic buffer
*/
buf = tmp;
buf[bufSiz] = 0;
strcat(buf, inputBuf);
bufSiz += strlen(inputBuf) + 1;
}
else
{
printf("Unable to extend dynamic buffer: releasing allocated memory\n");
free(buf);
buf = NULL;
break;
}
}
if (feof(in))
printf("Reached the end of input file %s\n", argv[1]);
else if (ferror(in))
printf("Error while reading input file %s\n", argv[1]);
if (buf)
{
printf("File contents:\n%s\n", buf);
printf("Read %lu characters from %s\n",
(unsigned long) strlen(buf), argv[1]);
}
free(buf);
fclose(in);
}
else
{
printf("Unable to open input file %s\n", argv[1]);
}
return 0;
}
There are drawbacks with this approach; for one thing, if there isn't enough memory to hold the file's contents, you won't know it immediately. Also, realloc() is relatively expensive to call, so you don't want to make your page sizes too small.
However, this avoids having to use fstat() or fseek()/ftell() to figure out how big the file is beforehand.

reading data from large file into struct in C

I am a beginner at C programming. I need to efficiently read millions of lines from a file into a struct. Below is an example of the input file.
2,33.1609992980957,26.59000015258789,8.003999710083008
5,15.85200023651123,13.036999702453613,31.801000595092773
8,10.907999992370605,32.000999450683594,1.8459999561309814
11,28.3700008392334,31.650999069213867,13.107999801635742
My current code is shown below. It prints "Error file", suggesting that the file pointer is NULL, but the file has data.
#include<stdio.h>
#include<stdlib.h>
struct O_DATA
{
int index;
float x;
float y;
float z;
};
int main ()
{
FILE *infile ;
struct O_DATA input;
infile = fopen("input.dat", "r");
if (infile == NULL);
{
fprintf(stderr,"\nError file\n");
exit(1);
}
while(fread(&input, sizeof(struct O_DATA), 1, infile))
printf("Index = %d X= %f Y=%f Z=%f", input.index , input.x , input.y , input.z);
fclose(infile);
return 0;
}
I need to efficiently read and store data from an input file to process it further. Any help would be really appreciated. Thanks in advance.
First figure out how to convert one line of text to data
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct my_data
{
unsigned int index;
float x;
float y;
float z;
};
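struct my_data *
deserialize_data(struct my_data *data, const char *input, const char *separators)
{
    (void) separators; /* unused here: the sscanf format hard-codes ',' */
    /* sscanf returns the number of fields converted; all four must succeed */
    if(sscanf(input, "%u,%f,%f,%f", &data->index, &data->x, &data->y, &data->z) != 4)
        return NULL;
    return data;
}
/* Alternative version using strtok. Keep only one of the two definitions. */
struct my_data *
deserialize_data(struct my_data *data, const char *input, const char *separators)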
{
char *p;
struct my_data tmp;
char *str = strdup(input); /* make a copy of the input line because we modify it */
if (!str) { /* I couldn't make a copy so I'll die */
return NULL;
}
p = strtok (str, separators); /* use line for first call to strtok */
if (!p) goto err;
tmp.index = strtoul (p, NULL, 0); /* convert text to integer */
p = strtok (NULL, separators); /* strtok remembers line */
if (!p) goto err;
tmp.x = atof(p);
p = strtok (NULL, separators);
if (!p) goto err;
tmp.y = atof(p);
p = strtok (NULL, separators);
if (!p) goto err;
tmp.z = atof(p);
memcpy(data, &tmp, sizeof(tmp)); /* copy values out */
goto out;
err:
data = NULL;
out:
free (str);
return data;
}
int main() {
struct my_data somedata;
deserialize_data(&somedata, "1,2.5,3.12,7.955", ",");
printf("index: %u, x: %2f, y: %2f, z: %2f\n", somedata.index, somedata.x, somedata.y, somedata.z);
}
Combine it with reading lines from a file:
just the main function here (insert the rest from the previous example)
int
main(int argc, char *argv[])
{
FILE *stream;
char *line = NULL;
size_t len = 0;
ssize_t nread;
struct my_data somedata;
if (argc != 2) {
fprintf(stderr, "Usage: %s <file>\n", argv[0]);
exit(EXIT_FAILURE);
}
stream = fopen(argv[1], "r");
if (stream == NULL) {
perror("fopen");
exit(EXIT_FAILURE);
}
while ((nread = getline(&line, &len, stream)) != -1) {
if (deserialize_data(&somedata, line, ",") == NULL) {
fprintf(stderr, "skipping malformed line: %s", line);
continue;
}
printf("index: %u, x: %2f, y: %2f, z: %2f\n", somedata.index, somedata.x, somedata.y, somedata.z);
}
free(line);
fclose(stream);
exit(EXIT_SUCCESS);
}
You've got an incorrect ; after your if (infile == NULL) test - try removing that...
[Edit: 2nd by 9 secs! :-)]
if (infile == NULL);
{ /* floating block */ }
The above if is a complete statement that does nothing regardless of the value of infile. The "floating" block is executed no matter what infile contains.
Remove the semicolon to 'attach' the "floating" block to the if
if (infile == NULL)
{ /* if block */ }
You already have solid responses in regard to syntax/structs/etc, but I will offer another method for reading the data in the file itself: I like Martin York's CSVIterator solution. This is my go-to approach for CSV processing because it requires less code to implement and has the added benefit of being easily modifiable (i.e., you can edit the CSVRow and CSVIterator defs depending on your needs).
Here's a mostly complete example using Martin's unedited code without structs or classes. In my opinion, and especially so as a beginner, it is easier to start developing your code with simpler techniques. As your code begins to take shape, it is much clearer why and where you need to implement more abstract/advanced devices.
Note this would technically need to be compiled with C++11 or greater because of my use of std::stod (and maybe some other stuff too I am forgetting), so take that into consideration:
//your includes (at least <array>, <cstdio>, <fstream>, <string>, <vector>)
//...
#include"wherever_CSVIterator_is.h"
int main (int argc, char* argv[])
{
    if (argc < 2) return 1;
    std::vector<std::array<double, 3>> saved; //one triplet per row, stored by value
    std::vector<int> indices;
    std::ifstream file(argv[1]);
    for (CSVIterator loop(file); loop != CSVIterator(); ++loop) { //loop over rows
        std::array<double, 3> tmp{}; //since we know the shape of your input data
        indices.push_back(std::stoi((*loop)[0])); //store int index first, always col 0
        for (std::size_t k = 1; k < (*loop).size() && k <= 3; k++) { //loop across columns
            tmp[k-1] = std::stod((*loop)[k]); //save double values now
        }
        saved.push_back(tmp); //copies the triplet into the vector
    }
    /*now we have two vectors of the same 'size'
    (let's pretend I wrote a check here to confirm this is true),
    so we loop through them together and access with something like:*/
    for (std::size_t j = 0; j < indices.size(); j++) {
        const std::array<double, 3>& row = saved.at(j); //the triplet for this row
        printf("\nindex: %d |", indices.at(j));
        for (int k = 0; k < 3; k++) {
            printf(" %4.3f ", row[k]);
        }
        printf("\n");
    }
}
Less fuss to write. Some unnecessary copying is present, but we benefit from using std::vector containers in lieu of knowing exactly how much memory we need to allocate. Note that each triplet is stored by value: pushing a pointer to a single local tmp array would be dangerous, because every entry would point at the same storage and would dangle once tmp goes out of scope.
Don't give an example of input file. Specify your input file format -at least on paper or in comments- e.g. in EBNF notation (since your example is textual... it is not a binary file). Decide if the numbers have to be in different lines (or if you might accept a file with a single huge line made of million bytes; read about the Comma Separated Values format). Then, code some parser for that format. In your case, it is likely that some very simple recursive descent parsing is enough (and your particular parser won't even use recursion).
Read more about <stdio.h> and its routines. Take time to carefully read that documentation. Since your input is textual, not binary, you don't need fread. Notice that input routines can fail, and you should handle the failure case.
Of course, fopen can fail (e.g. because your working directory is not what you believe it is). You'd better use perror or errno to find out more about the cause of the failure. So at least code:
infile = fopen("input.dat", "r");
if (infile == NULL) {
perror("fopen input.dat");
exit(EXIT_FAILURE);
}
Notice that semi-colons (or their absence) are very important in C (no semi-colon after condition of if). Read again the basic syntax of C language. Read about How to debug small programs. Enable all warnings and debug info when compiling (with GCC, compile with gcc -Wall -g at least). The compiler warnings are very useful!
Remember that fscanf doesn't handle the end of line (newline) differently from a space character. So if the input has to be on separate lines, you need to read every line separately.
You'll probably read every line using fgets (or getline) and parse every line individually. You could do that parsing with the help of sscanf (perhaps the %n could be useful) - and you want to use the return count of sscanf. You could also perhaps use strtok and/or strtod to do such a parsing.
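As a rough illustration of the sscanf approach with %n and the return count, here is a minimal sketch. The helper name parse_line is made up, and struct O_DATA is repeated from the question so that the block stands alone. Each line would first be read with fgets (or getline) and then handed to this function.

```c
#include <stdio.h>

struct O_DATA
{
    int index;
    float x;
    float y;
    float z;
};

/* Hypothetical helper: parse one "index,x,y,z" line.
 * Returns 1 if exactly four fields were converted and nothing but
 * whitespace follows them, 0 otherwise. */
int parse_line(const char *line, struct O_DATA *out)
{
    int consumed = 0;
    /* the space before %n skips a trailing newline; %n then stores how
       many characters were consumed and does not count as a field */
    int fields = sscanf(line, "%d,%f,%f,%f %n",
                        &out->index, &out->x, &out->y, &out->z, &consumed);

    if (fields != 4)
        return 0;
    return line[consumed] == '\0';
}
```

Checking line[consumed] rejects lines with trailing junk such as a fifth column, which a bare return-count check would accept.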
Make sure that your parsing and your entire program is correct. With current computers (they are very fast, and most of the time your input file sits in the page cache) it is very likely that it would be fast enough. A million lines can be read pretty quickly (if on Linux, you could compare your parsing time with the time used by wc to count the lines of your file). On my computer (a powerful Linux desktop with AMD2970WX processor -it has lots of cores, but your program uses only one-, 64Gbytes of RAM, and SSD disk) a million lines can be read (by wc) in less than 30 milliseconds, so I am guessing your entire program should run in less than half a second, if given a million lines of input, and if the further processing is simple (in linear time).
You are likely to fill a large array of struct O_DATA and that array should probably be dynamically allocated, and reallocated when needed. Read more about C dynamic memory allocation. Read carefully about C memory management routines. They could fail, and you need to handle that failure (even if it is very unlikely to happen). You certainly don't want to re-allocate that array at every loop. You probably could allocate it in some geometrical progression (e.g. if the size of that array is size, you'll call realloc or a new malloc for some int newsize = 4*size/3 + 10; only when the old size is too small). Of course, your array will generally be a bit larger than what is really needed, but memory is quite cheap and you are allowed to "lose" some of it.
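The geometric growth described above could look something like the following sketch. The names struct data_vec and data_vec_push are made up for illustration; the growth formula 4*size/3 + 10 is the one suggested in the paragraph, and struct O_DATA is repeated from the question so that the block stands alone.

```c
#include <stdlib.h>

struct O_DATA
{
    int index;
    float x;
    float y;
    float z;
};

/* Hypothetical growable array of records. Start it as {NULL, 0, 0}. */
struct data_vec
{
    struct O_DATA *items;
    size_t count;      /* elements in use */
    size_t capacity;   /* elements allocated */
};

/* Append one record, growing the array only when it is full.
 * Returns 0 on success, -1 if realloc fails (the old array is kept). */
int data_vec_push(struct data_vec *v, struct O_DATA item)
{
    if (v->count == v->capacity) {
        size_t newcap = 4 * v->capacity / 3 + 10;
        struct O_DATA *tmp = realloc(v->items, newcap * sizeof *tmp);

        if (tmp == NULL)
            return -1;
        v->items = tmp;
        v->capacity = newcap;
    }
    v->items[v->count++] = item;
    return 0;
}
```

Because the capacity grows by a constant factor, a million appends trigger only a few dozen reallocations rather than one per record. The caller frees v.items when done.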
But StackOverflow is not a "do my homework" site. I gave some advice above, but you should do your homework.

C: How could I send a list of file names as a string on a socket?

I'm trying to take file names from a dirent struct, and send a list of all names as a concatenated string to the client.
After a few hours of trying to figure it out, I can't seem to allocate memory properly, or read it properly, I'm getting nonsense string back, so It must be reading memory wrong, even though I think I've appended the string with "\0"
Here is what I've done so far,
Send string to client:
void send_file_list(int socketNumber)
{
DIR *mydir;
if ((mydir = opendir("upload")) == NULL) {
perror("error");
exit(EXIT_FAILURE);
}
struct dirent *entry = NULL;
size_t len;
//loop through entry to get size of all filenames as string.
while ((entry = readdir(mydir)) != NULL)
{
len = len + strlen(entry->d_name);
}
char filelist[len];
//returns NULL when dir contents all processed
while ((entry = readdir(mydir)) != NULL)
{
strcat(strcat(filelist, entry->d_name),"\n");
}
closedir(mydir);
size_t n = len;
writen(socketNumber, (unsigned char *) &n, sizeof(size_t));
writen(socketNumber, (unsigned char *) filelist, n);
printf("Sent file list of size %zu bytes\n",n);
}//end send_file_list()
Get string from server:
void get_file_list(int socket)
{
size_t k;
readn(socket, (unsigned char *) &k, sizeof(size_t));
char filelist[k];
readn(socket, (unsigned char *) filelist, k);
printf("Received: %zu bytes\n\n", k);
printf("\n---Files On Server -------------------\n");
printf("%s", filelist);
printf("\n--------------------------------------\n");
} // end get_file_list()
writen() and readn() are from an imported file, rdwrn.c and .h. This was part of a coursework project; I'm not fully sure how they work, but basically they write to and read from the given socket. I didn't write them.
I've been at it for hours, and I feel It's getting messy. Is there a better way of doing this?
this loop:
while ((entry = readdir(mydir)) != NULL)
{
len = len + strlen(entry->d_name);
}
has a couple of problems.
1) Generic to the whole posted code, the indenting is not consistent. For consistency, indent after every opening brace '{'. unindent before every closing brace '}'. Suggest each indent level be 4 spaces.
2) There are lots of kinds of entries in the directory entries. (links, sub directories, etc etc)
Before including any specific 'file name' in the resulting string, the code needs to check the type of the entry to ensure it is a regular file AND that the name of the file is neither . nor ..
====
the variable: len is not initialized, suggest using:
size_t len = 0;
====
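Note that strlen() returns the number of characters before the terminating NUL byte, so the storage needed for a string is 1 byte longer than the value strlen() returns. The posted code also appends a "\n" after each name, which the length calculation does not count.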
====
The question doesn't state: is this code to collect only the current directory file names or is it to include the file names in the sub directories?
====
The question doesn't state: are the file names to be jammed together or separated by a space or comma. If separated by a space, what about file names that contain a space?
====
BTW: where do the function names: writen() and readn() come from?
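Putting these points together, one possible shape for the list-building step is sketched below. The function name build_file_list is made up, the names are separated by '\n' as in the question, and only regular files are included (checked with stat). It also sidesteps two further problems in the posted code: filelist is never initialised before strcat, and the second readdir loop runs on a directory stream that is already at its end (no rewinddir), so it would find nothing. The sketch uses a single pass with a growing buffer instead.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: return a malloc'd, NUL-terminated string holding
 * the names of the regular files in `dirpath`, one per line. The length
 * (without the NUL) is stored in *out_len. Returns NULL on failure. */
char *build_file_list(const char *dirpath, size_t *out_len)
{
    DIR *dir = opendir(dirpath);
    struct dirent *entry;
    char *list;
    size_t len = 0;

    if (dir == NULL)
        return NULL;
    list = malloc(1);
    if (list == NULL) {
        closedir(dir);
        return NULL;
    }
    list[0] = '\0';
    while ((entry = readdir(dir)) != NULL) {
        char path[4096];
        struct stat st;
        size_t namelen = strlen(entry->d_name);
        char *tmp;
        int n = snprintf(path, sizeof path, "%s/%s", dirpath, entry->d_name);

        if (n < 0 || (size_t) n >= sizeof path)
            continue;                  /* path too long for this sketch */
        if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
            continue;                  /* skips ".", "..", subdirectories */
        tmp = realloc(list, len + namelen + 2);   /* name + '\n' + '\0' */
        if (tmp == NULL) {
            free(list);
            closedir(dir);
            return NULL;
        }
        list = tmp;
        memcpy(list + len, entry->d_name, namelen);
        len += namelen;
        list[len++] = '\n';
        list[len] = '\0';
    }
    closedir(dir);
    *out_len = len;
    return list;
}
```

The sender would then transmit the length followed by the bytes, as the original writen calls do, and the receiver should allocate k + 1 bytes and add the terminating '\0' itself before printing with %s.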

retrofitting a .h & .c file to my already working .c program

I have a program that I'm doing for class where I need to take the content of one file, reverse it, and write that reversed content to another file. I have written a program that successfully does this (after much googling as I am new to the C programming language). The problem however is that my professor wants us to submit the program in a certain way with a couple supporting .h and .c files (which I understand is good practice). So I was hoping someone could help me understand exactly how I can take my already existing program and make it into one that is to his specifications, which are as follows:
he would like a file named "file_utils.h" that has function signatures and guards for the following two functions
int read_file( char* filename, char **buffer );
int write_file( char* filename, char *buffer, int size);
thus far I have created this file to try and accomplish this.
#ifndef UTILS_H
#define UTILS_H
int read_file(char* filename, char **buffer);
int write_file(char* filename, char *buffer, int size);
#endif
he would like a file named "file_utils.c" that has the implemented code for the previous two functions
he would like a file named "reverse.c" that accepts command arguments, includes a main function, and calls the functions from the previous two files.
now. I understand how this is supposed to work, but as I'm looking at the program I wrote my way I'm unsure how to actually accomplish the same result by adhering to the previously mentioned specifications.
Below is the program that successfully accomplishes the desired functionality
#include<stdlib.h>
#include<stdio.h>
#include<fcntl.h>
#include<string.h>
#include<sys/stat.h>
#include<unistd.h>
int main(int argc, char *argv[]) {
int file1, file2, char_count, x, k;
char buffer;
// if the number of parameters passed are not correct, exit
//
if (argc != 3) {
fprintf(stderr, "usage %s <file1> <file2>", argv[0]);
exit(EXIT_FAILURE);
}
// if the origin file cannot be opened for whatever reason, exit
// S_IRUSR specifies that this file is to be read by only the file owner
//
if ((file1 = open(argv[1], S_IRUSR)) < 0) {
fprintf(stderr, "The origin-file is inaccessible");
exit(EXIT_FAILURE);
}
// if the destination-file cannot be opened for whatever reason, exit
// S_IWUSR specifies that this file is to be written to by only the file owner
//
if ((file2 = creat(argv[2], S_IWUSR)) < 0) {
fprintf(stderr, "The destination-file is inaccessible");
exit(EXIT_FAILURE);
}
// SEEK_END is used to place the read/write pointer at the end of the file
//
char_count = lseek(file1, (off_t) 0, SEEK_END);
printf("origin-file size is %d\n", char_count - 1);
for (k = char_count - 1; k >= 0; k--) {
lseek(file1, (off_t) k, SEEK_SET);
x = read(file1, &buffer, 1);
if (x != 1) {
fprintf(stderr, "can't read 1 byte");
exit(-1);
}
x = write(file2, &buffer, 1);
if (x != 1) {
fprintf(stderr, "can't write 1 byte");
exit(-1);
}
}
write(STDOUT_FILENO, "Reversal & Transfer Complete\n", 5);
close(file1);
close(file2);
return 0;
}
any insight as to how I can accomplish this "re-factoring" of sorts would be much appreciated, thanks!
The assignment demands a different architecture than your program. Unfortunately, this will not be a refactoring but a rewrite.
You have most of the pieces of read_file and write_file already: opening the file, determining its length, error handling. Those can be copy-pasted into the new functions.
But read_file should call malloc and read the file into memory, which is different.
You should create a new function in reverse.c, called by main, to reverse the bytes in a memory buffer.
After that function runs, write_file should attempt to open the file, and only do its error checking at that point.
Your simple program is superior because it validates the output file before any I/O, and it requires less memory. Its behavior satisfies the assignment, but its form does not.
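A rough sketch of the three pieces follows. Several things here are assumptions, since the assignment text only gives the two signatures: read_file is assumed to return the number of bytes read (or -1 on error) and to hand back a malloc'd buffer that the caller frees, write_file is assumed to return the number of bytes written (or -1), and the name reverse_buffer is made up. The sketch also uses stdio (fopen/fread/fwrite) instead of the open/read/write calls in the original program.

```c
#include <stdio.h>
#include <stdlib.h>

/* file_utils.c: read the whole file into a malloc'd buffer. */
int read_file(char *filename, char **buffer)
{
    FILE *f = fopen(filename, "rb");
    long size;

    if (f == NULL)
        return -1;
    if (fseek(f, 0, SEEK_END) != 0 || (size = ftell(f)) < 0) {
        fclose(f);
        return -1;
    }
    rewind(f);
    *buffer = malloc(size > 0 ? (size_t) size : 1);
    if (*buffer == NULL) {
        fclose(f);
        return -1;
    }
    if (fread(*buffer, 1, (size_t) size, f) != (size_t) size) {
        free(*buffer);
        *buffer = NULL;
        fclose(f);
        return -1;
    }
    fclose(f);
    return (int) size;
}

/* file_utils.c: write `size` bytes from buffer to the file. */
int write_file(char *filename, char *buffer, int size)
{
    FILE *f = fopen(filename, "wb");

    if (f == NULL)
        return -1;
    if (fwrite(buffer, 1, (size_t) size, f) != (size_t) size) {
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0)
        return -1;
    return size;
}

/* reverse.c: reverse the bytes of a memory buffer in place. */
void reverse_buffer(char *buffer, int size)
{
    int i, j;

    for (i = 0, j = size - 1; i < j; i++, j--) {
        char tmp = buffer[i];
        buffer[i] = buffer[j];
        buffer[j] = tmp;
    }
}
```

In reverse.c, main would check argc, call read_file(argv[1], &buf), pass the result to reverse_buffer, call write_file(argv[2], buf, n), and free the buffer. The two prototypes go in file_utils.h, which both .c files include.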

Resources