Word Extraction C [closed] - c

This question is unlikely to help any future visitors; it is only relevant to a small geographic area, a specific moment in time, or an extraordinarily narrow situation that is not generally applicable to the worldwide audience of the internet. For help making this question more broadly applicable, visit the help center.
Closed 9 years ago.
I want to extract the words from a file (and later, from console input), count their appearances and store them in my Word structure:
typedef struct cell{
char *info; /* word itself */
int nr; /* number of appearances of the word */
}*Word;
This structure will be allocated dynamically for as many words as are contained in the file. Consider this function:
void Word_Allocation (Word* a) /* The function that allocates space for one structure */
My questions are:
How do I correctly open a file and read it line by line?
How do I correctly store words and number of appearances in my structure?

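File I/O is the basic part: open the file with fopen, read it with fgets (line by line) or fscanf (word by word), and close it with fclose.
As for the algorithm, C has no built-in equivalent of C++'s std::map, which would make this problem trivial. A straightforward solution in C might be:
Allocate an array of cell and read the words into it.
Sort the array on char *info.
Count the runs of equal adjacent words.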
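The sort-then-count steps might be sketched as follows. This is only an illustration under the assumption that the words are already in an array of strings; struct word_count, cmp_str and count_sorted are hypothetical names, not part of the question's code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one distinct word and how often it occurred. */
struct word_count {
    const char *word;
    int count;
};

/* qsort comparator for an array of char pointers. */
int cmp_str(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/*
 * Sorts words[0..n-1] in place, then collapses each run of equal words
 * into one entry of out[]. out must have room for n entries.
 * Returns the number of distinct words.
 */
size_t count_sorted(const char **words, size_t n, struct word_count *out)
{
    size_t distinct = 0;

    qsort(words, n, sizeof *words, cmp_str);
    for (size_t i = 0; i < n; i++) {
        if (distinct > 0 && strcmp(out[distinct - 1].word, words[i]) == 0) {
            out[distinct - 1].count++;      /* same word as the previous one */
        } else {
            out[distinct].word = words[i];  /* first occurrence of a new word */
            out[distinct].count = 1;
            distinct++;
        }
    }
    return distinct;
}
```

Sorting costs O(n log n) and the counting pass is linear, so this scales better than searching the whole array for every word read.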

Your allocator function should return a Word (the typedef is already a pointer to struct cell) and receive the size to allocate for the word itself. Something like this, perhaps:
Word Word_Allocation (size_t size) {
    Word w = malloc(sizeof *w);
    if (!w)
        return NULL;
    w->info = malloc(size);
    if (!w->info)
    {
        free(w);
        return NULL;
    }
    w->nr = 0;
    return w;
}
You can read a word at a time with:
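#define STR_(x) #x
#define STR(x) STR_(x) /* expand x first, then stringify it */
#define MAX_WORD 99 /* must be a macro: an enum constant would be stringified as its name */
char buf[MAX_WORD + 1]; /* one extra byte for the terminating '\0' */
fscanf(infile, "%" STR(MAX_WORD) "s", buf); /* returns 1 when a word was read */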
And then strlen(buf)+1 is the size to pass to Word_Allocation. Or you can pass buf and have Word_Allocation call strlen and copy the data over.
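Putting the pieces together, a minimal sketch of reading and counting could look like this. It assumes words are whitespace-separated and at most 99 characters long (longer ones are split); word_new and count_words are hypothetical helper names, and the linear search is chosen for brevity, not speed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD 99
#define STR_(x) #x
#define STR(x) STR_(x)

typedef struct cell {
    char *info; /* word itself */
    int nr;     /* number of appearances of the word */
} *Word;

/* Allocates one cell holding a private copy of text; NULL on failure. */
Word word_new(const char *text)
{
    Word w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;
    w->info = malloc(strlen(text) + 1);
    if (w->info == NULL) {
        free(w);
        return NULL;
    }
    strcpy(w->info, text);
    w->nr = 1;
    return w;
}

/*
 * Reads whitespace-separated words from in and counts them.
 * *table becomes a realloc-grown array of Word. Returns the number of
 * distinct words, or (size_t)-1 on allocation failure (this sketch does
 * not free the partial table in that case).
 */
size_t count_words(FILE *in, Word **table)
{
    char buf[MAX_WORD + 1];
    size_t used = 0, cap = 0;

    *table = NULL;
    while (fscanf(in, "%" STR(MAX_WORD) "s", buf) == 1) {
        size_t i = 0;
        while (i < used && strcmp((*table)[i]->info, buf) != 0)
            i++;                        /* linear search for the word */
        if (i < used) {
            (*table)[i]->nr++;          /* seen before: bump the counter */
            continue;
        }
        if (used == cap) {              /* table full: double its capacity */
            size_t ncap = cap ? cap * 2 : 16;
            Word *tmp = realloc(*table, ncap * sizeof *tmp);
            if (tmp == NULL)
                return (size_t)-1;
            *table = tmp;
            cap = ncap;
        }
        if (((*table)[used] = word_new(buf)) == NULL)
            return (size_t)-1;
        used++;
    }
    return used;
}
```

To use it on a file, open the file with fopen(path, "r"), check the result against NULL, pass the stream to count_words, and fclose it afterwards. Passing stdin instead covers the console-input case.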

Related

Seg Fault when Parsing a CSV Line in C [closed]

Closed 9 years ago.
I am working on a project in which I need to read CSV lines from a text file into my program. I was given a code skeleton, and asked to fill in functionality. I have a struct containing a variable for each type of value I am going to receive, but my char array is causing segmentation faults.
Here is an excerpt of my code.
None of the excerpt is part of the given code; it is all mine.
My error is a segmentation fault (core dumped), due to the code within the "get the timestamp" section.
My test file contained only one line:
5, 10:00:10, 1, 997
/*
* YOUR CODE GOES HERE:
* (1) Read an input csv line from stdin
* (2) Parse csv line into appropriate fields
* (3) Take action based on input type:
* - Check-in or check-out a patient with a given ID
* - Add a new health data type for a given patient
* - Store health data in patient record or print if requested
* (4) Continue (1)-(3) until EOF
*/
/* A new struct to hold all of the values from the csv file */
typedef struct {
int iD;
char *time[MAXTIME + 1];
int value;
int type;
}csv_input;
/* Declare an instance of the struct, and assign pointers for its values */
csv_input aLine;
int *idptr;
char timeval[MAXTIME + 1];
int *valueptr;
int *typeptr;
/*Note: because the time char is already a pointer, I did not make another one for it but instead dereferenced the pointer I was given */
idptr = &aLine.iD;
int j; /* iterator variable */
for(j; j < MAXTIME; j++){
*aLine.time[j] = timeval[j];
}
valueptr = &aLine.value;
typeptr = &aLine.type;
/* Get the Patient ID */
*idptr = getchar();
printf("%c", aLine.iD); /* a test to see if my pointers worked and the correct value was read */
/*Skip the first comma */
int next;
next = getchar();
/* get the timestamp */
int i;
for(i = 0; i < MAXTIME; i++)
{
while ((next = getchar()) != ',')
{
timeval[i] = next;
//printf("%s", aLine.time[i]);
}
}
First:
int j; /* iterator variable */
for(j; j < MAXTIME; j++){
You need to initialize j; j = 0 makes sense. Without it you are indexing an array with an indeterminate value, which is undefined behavior.
Second:
/*Note: because the time char is already a pointer,
No: time is an array of pointers to char, not a pointer to char. Those are different things.
This line:
*aLine.time[j] = timeval[j];
won't work because your note ("but instead dereferenced the pointer I was given") rests on an incorrect assumption. Yes, you were given an array of pointers, but they don't point to anything: they are uninitialized, so you can't dereference them until you set them to a valid non-NULL address.
I think you were trying to do something like this:
aLine.time[j] = &timeval[j]; /* point each entry at the local array */
but those pointers are only valid while timeval is in scope. It would be better to malloc storage for the pointers to refer to, or simpler still to declare time as a plain char array.
char *time[MAXTIME + 1];
This is an array of pointers to char, not an array of chars.
The crash comes from this line:
*aLine.time[j] = timeval[j];
because, as I said, aLine.time[j] is a pointer, and you have not allocated memory for it to point to before writing through it.
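A minimal sketch of the alternative both answers point toward: store the timestamp in a plain char array and let sscanf split the line. It assumes MAXTIME is 8 (enough for "hh:mm:ss") and that the third and fourth fields are the type and the value; neither is stated in the question, and parse_csv_line is a hypothetical name.

```c
#include <stdio.h>

#define MAXTIME 8 /* assumed: "hh:mm:ss" */

typedef struct {
    int iD;
    char time[MAXTIME + 1]; /* array of char, not array of pointers */
    int value;
    int type;
} csv_input;

/*
 * Parses one line such as "5, 10:00:10, 1, 997".
 * Returns 1 when all four fields were converted, 0 otherwise.
 */
int parse_csv_line(const char *line, csv_input *out)
{
    /* %8[^,] reads at most 8 characters up to the next comma;
       the 8 must be kept in sync with MAXTIME. */
    return sscanf(line, "%d , %8[^,] , %d , %d",
                  &out->iD, out->time, &out->type, &out->value) == 4;
}
```

Each input line can be read with fgets into a local buffer and handed to parse_csv_line until fgets returns NULL at EOF. No pointer inside the struct needs separate allocation.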

Nested structs and pointers in C [closed]

Closed 9 years ago.
I have spent ages trying to figure out what is the problem with the following program:
typedef struct user {
char host[40];
int order;
} user;
typedef struct data {
user userdat;
} data;
int read_user (char *datname, user *userdat) {
...
fscanf (datin, "%s", &userdat->host);
fscanf (datin, "%d", &userdat->order);
//1
printf ("%d\n", userdat->order);
...
}
void init_data (data *dat) {
init_userdat (&dat->userdat);
}
void init_userdat (user *userdat) {
*userdat->host = "127.0.0.1";
userdat->order = 0;
}
int user_print (int i, data *dat) {
//2
printf ("%d\n", dat->userdat.order);
}
int main(int argc, char *argv[]) {
...
data dat;
init_data (&dat);
read_user (datname, &dat->userdat);
user_print (&dat);
}
The program is very simplified to highlight the relevant sections. What happens is that the first print statement (//1) outputs the value correctly, while the second (//2) does not - it outputs something that looks like a possible memory location.
I have tried numerous combinations of accessing the stored variable, but I just can't crack it. Any help would be appreciated.
Edit1: Fixed up a couple of non essential errors in code (not relating to pointers or structs)
Final Edit: Thank you all for your help. The issue that Arun Saha pointed out was indeed in the original code and is now fixed. However, the problem of printing two different strings persisted. Your assurance that the code should otherwise compile and work led me to discover the true culprit - I was not properly initializing another part of the otherwise complex struct and this resulted in overwriting of the user.order variable.
The following line does not do what it appears to do :-)
*userdat->host = "127.0.0.1";
userdat->host decays to a pointer to the first character of the host[40] array, so *userdat->host is that single char. The statement above tries to store the address of the string literal "127.0.0.1" in one char; it does not copy the string. To copy the entire string, use the standard library function strncpy():
strncpy(userdat->host, "127.0.0.1", sizeof userdat->host);
In your main function, the call read_user (datname, &dat->userdat); should not compile. It should be read_user (datname, &dat.userdat); because dat is not a pointer but an object itself.
With this change and Arun's previous recommendation, I have tried your program and it works.
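For illustration, here is one way to fill the fixed-size host buffer without overrunning it. This is a sketch, not the asker's final code: user_set_host and user_read are hypothetical helpers, and snprintf is used in place of strncpy because it always terminates the string.

```c
#include <stdio.h>
#include <string.h>

typedef struct user {
    char host[40];
    int order;
} user;

/* Copies src into u->host, truncating if needed and always NUL-terminating. */
void user_set_host(user *u, const char *src)
{
    snprintf(u->host, sizeof u->host, "%s", src);
}

/*
 * Reads "host order" from in. The width 39 leaves room for the terminator,
 * and host is passed without & because the array already decays to char *.
 * Returns 1 when both fields were read, 0 otherwise.
 */
int user_read(FILE *in, user *u)
{
    return fscanf(in, "%39s %d", u->host, &u->order) == 2;
}
```

The width limit matters for the symptom in the question: an unbounded %s that writes past host[40] would land in the neighbouring members of the struct.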

Array of files and value assigning [closed]

Closed 10 years ago.
I have 2 questions.
I want to create an array of files in C, but I'm not sure whether I have to malloc the size beforehand or not. Can I just use FILE** files as an array, or do I have to malloc them first? And if I have to make space, do I need to reserve 4 bytes (x86)?
I have the variable "char extra[8] = { 0xAE00AF00B000B100 };" and I want to assign it to the end of another char array[24]. Is there a faster way of doing that without having to type in every value by hand or using a for loop.
char extra[8] = { 0xAE00AF00B000B100 };
// index is a random place in the string
name[index] = '\0';
i = 0;
if (index > 16) {
for (i = 24-index; i < 8; i++) {
index++;
name[index] = extra[i];
}
}
else {
name[17] = 0xAE;
name[18] = 0x00;
name[19] = 0xAF;
name[20] = 0x00;
name[21] = 0xB0;
name[22] = 0x00;
name[23] = 0xB1;
name[24] = 0x00;
}
I need to add those extra bytes btw.
I want to create an array of files in C. But I'm not sure whether I
have to malloc the size before or not. Can I just use FILE** files as
an array or do I have to malloc them before. And if I have to make
space, do I need to reserve 4 bytes (x86)?
If you need an array of files, you can use an array of pointers as follows:
#include <stdio.h>
FILE *array[NB_FILES];
Or you can do it dynamically if NB_FILES is only known at runtime.
#include <stdio.h>
#include <stdlib.h>
FILE **array = malloc(nb_files * sizeof *array);
I have the variable "char extra[8] = { 0xAE00AF00B000B100 };" and I want to assign it to the end of another char array[24]. Is there a faster way of doing that without having to type in every value by hand or using a for loop.
The standard C library provides the function memcpy, which is a builtin on many compilers (so it is usually at least as fast as a hand-written for loop).
#include <string.h>
char array[24];
char extra[8];
memcpy(array + sizeof array - sizeof extra, extra, sizeof extra); /* fills array[16..23] */
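A slightly fuller sketch, assuming the goal is to overwrite the last eight bytes of the buffer. Note that char extra[8] = { 0xAE00AF00B000B100 }; initializes only extra[0] (with a truncated value), so the bytes are listed individually here; append_extra is a hypothetical name.

```c
#include <string.h>

/* The eight bytes written out one by one; a single 64-bit constant in
   the braces would initialize only extra[0]. */
static const unsigned char extra[8] = {
    0xAE, 0x00, 0xAF, 0x00, 0xB0, 0x00, 0xB1, 0x00
};

/* Overwrites the last sizeof extra bytes of a buffer of name_size bytes. */
void append_extra(unsigned char *name, size_t name_size)
{
    if (name_size >= sizeof extra)
        memcpy(name + name_size - sizeof extra, extra, sizeof extra);
}
```

For a 24-byte buffer this writes indices 16 through 23. Index 24, used in the question's else branch, would be outside such a buffer.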

(C) realloc array modifies data _pointed_ by items [closed]

Closed 10 years ago.
Hello,
A nice weird bug I feel like sharing ;-) Requires some preliminary explanations:
First, I have a type of strings PString which hold their size (and a hash value), followed by a flexible array member with the bytes. Here is the type and kind of constructor (the printfl statement at the end is debug):
typedef struct {
size_t size;
uint hash;
char bytes[];
} PString;
// offset from start of pstring struct to start of data bytes:
static const size_t PSTRING_OFFSET = sizeof(size_t) + sizeof(uint);
PString * pstring_struct (string str, size_t size, uint hash) {
// memory zone
char *mem = malloc(PSTRING_OFFSET + size * sizeof(char));
check_mem(mem);
// string data bytes:
memcpy(mem + PSTRING_OFFSET, str, size);
mem[PSTRING_OFFSET + size] = NUL;
// pstring struct:
PString * pstr = (PString *) mem;
pstr->size = size;
pstr->hash = hash;
printfl("*** str:'%s' (%u) --> pstr:'%s' (%u) 0x%X",
str, size, pstr->bytes, pstr->size, pstr); ///////////////////////
return pstr;
}
[Any comments on this construction are welcome: I'm not at all sure I'm doing things right here. It's the first time I've used flexible array members, and I could not find examples of using them in allocated structs.]
Second, those pstrings are stored in a string pool, meaning a set implemented as a hash table. As usual, "buckets" for collisions (after hash & modulo) are plain linked lists of cells, each holding a pstring pointer and a pointer to the next cell. The only special detail is that the cells themselves are stored in an array, instead of being allocated anywhere on the heap [1]. Hope the picture is clear. Here is the definition of Cell:
typedef struct SCell {
PString * pstr;
struct SCell * next;
} Cell;
All seemed to work fine, including a battery of tests of the pool itself. Now, when testing a pstring routine (search), I noticed a string had changed. After some research, I guessed the problem was related to pool growth, and eventually narrowed the issue down to the growing of the array of cells (so, well before redistributing cells into lists). Here are the debug prints around this growth, with a copy of the show_pool routine producing the output (it just shows the strings), and the output itself:
static void pool_grow (StringPool * pool, uint n_new) {
...
// Grow arrays:
show_pool(pool); /////////////////////
pool->cells = realloc(pool->cells, pool->n_cells * sizeof(Cell));
check_mem(pool->cells);
show_pool(pool); ////////////////////
...
static void show_pool (StringPool * pool) {
if (pool->n == 0) {
printfl("{}");
return;
}
printf("pool : {\"%s\"", pool->cells[0].pstr->bytes);
PString * pstr;
uint i;
for (i = 1; i < pool->n; i++) {
pstr = pool->cells[i].pstr;
printf(", \"%s\"", pstr->bytes);
}
printl("}");
}
// output:
pool : {"", "abc", "b", "abcXXXabcXXX"}
pool : {"", "abc", "b", "abcXXXabcXXXI"}
As you can see, the last string stored has an additional byte 'I'. Since in the meantime I'm just calling realloc, I find myself a bit stuck for further debugging, and thinking hard does not throw any light on this mystery. (Note that cells just hold pstring pointers, so how can growing the array of cells alter the string bytes?) Also, I'm baffled by the fact that there seems to be a quite convenient NUL just after the mysterious 'I', since printf halts there.
Thank you.
Can you help?
[1] There is no special reason for doing that here, with a string pool. I usually do that to get for free an ordered set or map, and in addition locality of reference. (The only overhead is that the array of cells must grow in addition to the array of buckets, but one can reduce the number of growings by predimensioning.)
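Since size doesn't include the null terminator, the allocation is only PSTRING_OFFSET + size bytes, and
mem[PSTRING_OFFSET + size] = NUL;
writes one byte past its end. That is undefined behavior, and every other issue stems from it: allocate PSTRING_OFFSET + size + 1 bytes instead.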
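A corrected constructor might look like the sketch below. It is an assumption-laden rewrite rather than the asker's code: pstring_new is a hypothetical name, unsigned stands in for the uint typedef, and the debug print and check_mem call are left out. Using sizeof on the struct also avoids computing the header offset by hand, which would silently break if padding were ever inserted before bytes.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t size;
    unsigned hash;
    char bytes[]; /* flexible array member */
} PString;

/* Builds a PString holding size bytes of str plus a terminating NUL. */
PString *pstring_new(const char *str, size_t size, unsigned hash)
{
    /* sizeof *p covers the header including any padding;
       the +1 reserves the byte for the terminator. */
    PString *p = malloc(sizeof *p + size + 1);
    if (p == NULL)
        return NULL;
    p->size = size;
    p->hash = hash;
    memcpy(p->bytes, str, size);
    p->bytes[size] = '\0';
    return p;
}
```

With the extra byte reserved, the terminator no longer lands in memory owned by the allocator, so a later realloc of an unrelated block has nothing of the string's to overwrite.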

Printing an int array from a struct in C [closed]

Closed 10 years ago.
I am trying to read in a struct from a file and then display (and sort) an array. I am having trouble, though, which I think is to do with me not accessing the correct memory. When I print the array it comes up as loads of random numbers.
struct details
{
int numberOfPresents;
int numberOfBuildings;
int buildings[];
};
void print_int_array(const int *array)
{
for(int i=0; i<200; i++)
printf("%d | ", array[i]);
putchar('\n');
}
void sort(int buildings[], int count)
{
int i, j, temp;
do {
j = 0;
for (i = 0;i<count-1;i++)
{
if (buildings[i] < buildings[i+1])
{
j = 1;
temp = buildings[i];
buildings[i] = buildings[i+1];
buildings[i+1] = temp;
}
}
} while (j == 1);
}
int main()
{
FILE *fp;
fp = fopen("buildings.out", "r");
struct details data1;
size_t structSize = sizeof(struct details);
//size_t arraySize = sizeof(int)*sizeof(buildings);
fread(&data1, structSize, 1, fp);
for(int i=0; i<200; i++)
printf("%d | ", data1.buildings[i]);
//sort(data1.buildings );
//print_int_array(data1.buildings, arraySize);
//printf("Number of Houses: %d\n",numberOfHouses(data1.numberOfPresents, data1.buildings));
fclose(fp);
return 0;
}
sizeof your struct does not include any storage for the array: buildings[] is a flexible array member, so sizeof(struct details) covers only the two ints (plus any padding). It certainly doesn't allocate room for the 200 entries you want. There are a few possible fixes.
If it will always be 200 entries, then just declare buildings as having size 200. This is the easiest.
If you know the number of entries prior to reading it, then you can allocate the struct and its array in one block (s is the number of entries):
struct details *data1 = malloc(sizeof(struct details) + s * sizeof(int));
and free data1 when you are done. This kind of code looks unpleasant, but it is how C99 flexible array members are meant to be used. The read becomes two steps as well: first the fixed-size header, then the s ints.
The final option would be to change buildings to an int* and then malloc that array before reading. Again, the read would have to be done in a loop.
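The second option could be sketched like this. The file layout is an assumption, since the question never shows it: two ints followed by numberOfBuildings ints, written in binary on the same kind of machine. read_details is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>

struct details {
    int numberOfPresents;
    int numberOfBuildings;
    int buildings[]; /* flexible array member: adds nothing to sizeof */
};

/*
 * Reads one record laid out as two ints followed by numberOfBuildings
 * ints (an assumed file format). Returns a malloc'd struct that the
 * caller must free, or NULL on a short read or allocation failure.
 */
struct details *read_details(FILE *fp)
{
    int hdr[2];
    if (fread(hdr, sizeof hdr[0], 2, fp) != 2 || hdr[1] < 0)
        return NULL;

    size_t n = (size_t)hdr[1];
    struct details *d = malloc(sizeof *d + n * sizeof d->buildings[0]);
    if (d == NULL)
        return NULL;

    d->numberOfPresents = hdr[0];
    d->numberOfBuildings = hdr[1];
    if (fread(d->buildings, sizeof d->buildings[0], n, fp) != n) {
        free(d); /* file ended before all buildings were read */
        return NULL;
    }
    return d;
}
```

The printing and sorting loops should then run up to numberOfBuildings rather than a hard-coded 200, and the file is best opened in binary mode ("rb").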
You've got two issues working against you printing the data.
Enough room is not allocated for the records.
Enough data is not read for the records.
The line struct details data1 only allocates enough room on the stack for one copy of the struct. You need enough for 200 of them. I'd immediately suggest an array.
struct details data1[200];
When you perform the read, fread(&data1, structSize, 1, fp), you're only reading in one record of size structSize. Now that you have enough memory allocated to read in 200 records, you can bump up the number of records you're reading to 200 as well.
fread(data1, structSize, 200, fp);
(Notice we dropped the & because we're dealing with an array now. An array name used in an expression decays to a pointer to its first element.)
Now, what if your file doesn't have 200 records in it? You need to capture the return value of fread() to determine how many records you actually read.
size_t numberOfRecords = fread(data1, structSize, 200, fp);
for (size_t i = 0; i < numberOfRecords; i++)
[...]
Now that we have that working, we can look a little closer at the struct itself. We've got a challenge with the definition that we can't easily overcome.
struct details{
int numberOfPresents;
int numberOfBuildings;
int buildings[];
};
The last member, buildings[], is not going to read correctly from a file. It is a flexible array member, which adds no storage to the struct: sizeof(struct details) covers only the two counters, so fread fills those and nothing else, and a struct with such a member cannot legally be an array element anyway. What you won't end up with is an array that contains the buildings. If you try to access it (e.g. in your sort routine) you read memory that doesn't belong to the struct, and you'll more than likely seg-fault. Trying to post a general solution for this is a little out of the scope of my answer. Suffice it to say, you'll either have to go with a fixed-size array or read and write variable-size arrays (variable-length records). The fixed-size array would be a lot easier. If we change your definition to the following, we'd load up some data from the disk.
struct details{
int numberOfPresents;
int numberOfBuildings;
int buildings[16];
};
We'd also avoid seg-faulting, which is a nice plus. However, I don't know what your input file looks like so I don't know if this will work given your data.

Resources