Error in ./thrash: free(): invalid pointer - c

I realize this has been asked several times, but none of the solutions have helped me. I am writing a lab program in C that allocates a large amount of memory: specifically, an array of char pointers, each of which points to an allocation the size of a memory page (4096 bytes).
char** pgs =(char**) malloc(sizeof(char *) * pages);
if(pgs == NULL){
printf("Failed to allocate memory");
exit(0);
}
int i;
for(i = 0; i < pages; i++){
pgs[i] = malloc(4096);
/*if(pgs[i] == NULL){
printf("Failed to allocate memory");
exit(0);
}*/
*pgs[i] = "\0";
/*if(pgs[i] == NULL){
printf("Failed to allocate memory");
exit(0);
}*/
}
In the middle of the program elements of this array are accessed and modified at random so as to induce thrashing (as part of the lab):
while(time(NULL) - startTime < seconds){
long rando = rand() % pages;
if(modify > 0){
*pgs[rando]++;
}
else{
long temp = *pgs[rando];
}
At the end of the program I attempt to free this memory:
for(i = 0; i < pages; i++){
free(pgs[i]);
}
free(pgs);
I am, however, getting the dreaded "invalid pointer" error. If anyone has any advice or knowledge on how this might be fixable, please share.

The program fragments you show exhibit numerous problems, some of which were identified in comments:
Programs should report errors on standard error, not standard output.
Programs should exit with a non-zero status if they fail.
Programs should compile without warnings.
Messages in general and error messages in particular should end with a newline.
The program only attempts to modify one byte of each page.
However, the primary problem is that the code in the question uses *pgs[rando]++, which is intended to modify the allocated memory. That expression is equivalent to *(pgs[rando]++), which increments the pointer, reads the byte at the old address, and discards the value. It is not equivalent to (*pgs[rando])++, which would modify the byte pgs[rando][0]. The code in the question should generate a warning that the computed value is not used (or an error if you compile with warnings treated as errors). Because your code increments the pointers, the values you pass back to the memory allocation system with free() are not, in general, the same as the ones that the memory allocation system returned to you, so you are indeed passing invalid pointers to free().
This code avoids the problems described above. It does a fixed number of iterations and doesn't use time(). It prints the sum so that the optimizer cannot optimize away the read accesses to the memory.
/* SO 4971-2352 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
enum { PAGESIZE = 4096 };
int main(void)
{
int pages = PAGESIZE;
char **pgs = (char **)malloc(sizeof(char *) * pages);
if (pgs == NULL)
{
fprintf(stderr, "Failed to allocate memory\n");
exit(EXIT_FAILURE);
}
for (int i = 0; i < pages; i++)
{
pgs[i] = malloc(PAGESIZE);
if (pgs[i] == NULL)
{
fprintf(stderr, "Failed to allocate memory\n");
exit(EXIT_FAILURE);
}
memset(pgs[i], '\0', PAGESIZE); // Or use calloc()!
}
size_t sum = 0;
for (int i = 0; i < PAGESIZE * PAGESIZE; i++)
{
int pagenum = rand() % pages;
int offset = rand() % PAGESIZE;
int modify = i & 2;
if (modify != 0)
{
pgs[pagenum][offset]++;
}
else
{
sum += pgs[pagenum][offset];
}
}
printf("Sum: 0x%.8zX\n", sum);
for (int i = 0; i < pages; i++)
free(pgs[i]);
free(pgs);
return 0;
}
I called that code thrash31.c and compiled it into thrash31 using:
$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror thrash31.c -o thrash31
$
When run with a timing program, I got the output:
$ timecmd -u -- thrash31
2018-04-07 15:48:58.546809 [PID 9178] thrash31
Sum: 0x001FE976
2018-04-07 15:48:59.355508 [PID 9178; status 0x0000] - 0.808699s
$
So, it took about 0.8 seconds to run. The sum it generates is the same each time because the code doesn't seed the random number generator.

Related

Calloc/Malloc and freeing too often or too large a space?

Disclaimer: this is help with a school assignment. That said, my issue only occurs about 50% of the time: if I compile and run my code without edits, sometimes it makes it through to the end and other times it does not. Through the use of multiple print statements I know exactly where the issue occurs when it does: in my second call to hugeDestroyer (right after 354913546879519843519843548943513179 is printed), and more exactly at the free(p->digits) line.
I have tried the advice found here (free a pointer to dynamic array in c) and setting the pointers to NULL after freeing them with no luck.
Through some digging and soul searching I have learned a little more about how free works from (How do malloc() and free() work?) and I wonder if my issue stems from what user Juergen mentions in his answer and that I am "overwriting" admin data in the free list.
To be clear, my question is two-fold.
Is free(p->digits) syntactically correct and if so why might I have trouble half the time when running the code?
Secondly, how can I guard against this kind of behavior in my functions?
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
typedef struct HugeInteger
{
// a dynamically allocated array to hold the digits of a huge integer
int *digits;
// the number of digits in the huge integer (approx. equal to array length)
int length;
} HugeInteger;
// Functional Prototypes
int str2int(char str) //converts single digit numbers contained in strings to their int value
{
return str - 48;
}
HugeInteger *parseInt(unsigned int n)
{
int i = 0, j = 0;
int *a = (int *)calloc(10, sizeof(int));
HugeInteger *p = (HugeInteger *)calloc(1, sizeof(HugeInteger));
if(n == 0)
{
p->digits = (int *)calloc(1, sizeof(int));
p->length = 1;
return p;
}
while(n != 0)
{
a[i] = n % 10;
n = n / 10;
i++;
}
p->length = i;
p->digits = (int *)calloc(p->length, sizeof(int));
for(i = 0; i <= p->length; i++, j++)
p->digits[j] = a[i];
return p;
}
HugeInteger *parseString(char *str) //notice datatype is char (as in char array), so a simple for loop should convert to huge int array
{
int i = 0, j = 0;
HugeInteger *p = (HugeInteger *)calloc(1, sizeof(HugeInteger));
if(str == NULL)
{
free(p);
p = NULL;
return p;
}
else
{
for(i=0; str[i] != '\0'; i++)
;
p->length = i;
p->digits = (int *)calloc(p->length, sizeof(int));
for(; i >= 0; i--)
p->digits[j++] = str2int(str[i - 1]);
}
return p;
} //end of HugeInteger *parseString(char *str)
HugeInteger *hugeDestroyer(HugeInteger *p)
{
//printf("No problem as we enter the function\n");
if(p == NULL)
return p;
//printf("No problem after checking for p = NULL\n");
if(p->digits == NULL)
{
free(p);
p = NULL;
return p;
}
//printf("No Problem after checking if p->digits = NULL\n");
//else
//{
free(p->digits);
printf("We made it through free(p->digits)\n");
p->digits = NULL;
printf("We made it through p->digits = NULL\n");
free(p);
printf("We made it through free(p)\n");
p = NULL;
printf("We made it through p = NULL\n");
return p;
//}
//return NULL;
}//end of HugeInteger *hugeDestroyer(HugeInteger *p)
// print a HugeInteger (followed by a newline character)
void hugePrint(HugeInteger *p)
{
int i;
if (p == NULL || p->digits == NULL)
{
printf("(null pointer)\n");
return;
}
for (i = p->length - 1; i >= 0; i--)
printf("%d", p->digits[i]);
printf("\n");
}
int main(void)
{
HugeInteger *p;
hugePrint(p = parseString("12345"));
hugeDestroyer(p);
hugePrint(p = parseString("354913546879519843519843548943513179"));
hugeDestroyer(p);
hugePrint(p = parseString(NULL));
hugeDestroyer(p);
hugePrint(p = parseInt(246810));
hugeDestroyer(p);
hugePrint(p = parseInt(0));
hugeDestroyer(p);
hugePrint(p = parseInt(INT_MAX));
hugeDestroyer(p);
//hugePrint(p = parseInt(UINT_MAX));
//hugeDestroyer(p);
return 0;
}
First of all, this is a really outstanding question. You did a lot of research on the topic and, generally speaking, solved the issue yourself; I'm here mainly to confirm your findings.
Is free(p->digits) syntactically correct and if so why might I have trouble half the time when running the code?
The syntax is correct. @Shihab suggested in the comments that you release only p and not p->digits, but that suggestion is wrong: it leads to a memory leak. There is a simple rule: for each calloc you must eventually call free, so your current approach of freeing p->digits and then p is totally fine.
However, the program fails on a valid line. How is that possible? Quick answer: free can't do its work because the metadata that tracks the lists of allocated and free blocks has been corrupted. The program corrupted that metadata at some earlier point, but the damage was revealed only when free tried to use it.
As you already discovered, in most implementations memory routines such as calloc allocate a buffer with metadata prepended. You receive a pointer to the buffer itself, but the small piece of information right before that pointer is crucial for managing the buffer later (e.g. freeing it). By writing 11 integers into a buffer intended for 10, you are likely to corrupt the metadata of the block that follows the buffer. Whether corruption actually happens, and what its consequences are, depends heavily on both implementation specifics and the current memory layout (which block follows the buffer, and exactly which metadata is corrupted). It doesn't surprise me that you see one crash per two executions, nor that I observe a 100% crash rate on my system.
Secondly, how can I guard against this kind of behavior in my functions?
Let's start with fixing the overflows. There are a couple of them:
parseString: the loop for(; i >= 0; i--) is executed length+1 times, so p->digits is overflown (and the last iteration, with i == 0, also reads str[-1])
parseInt: the loop for (i = 0; i <= p->length; i++, j++) is executed length+1 times, so p->digits is overflown
Direct memory management in C is error-prone and troublesome to debug. Memory leaks and buffer overflows are the worst nightmare in a programmer's life; it's usually better to simplify or reduce direct use of dynamic memory, unless you are learning to cope with it, of course. If you need to stick with a lot of direct memory management, take a look at valgrind, which is intended to detect exactly these problems.
By the way, there is also a memory leak in your program: each call to parseInt allocates a buffer for a, but never frees it.

(C) Using Arrays with Dynamic memory allocation

I want to create a simple array of integers having 10 elements.
I'm using dynamic memory to allocate a space in memory and whenever I exceed that amount It will call realloc to double its size.
Whenever I type 'q' it will exit the loop and print the array.
I know my program is full of bugs, so please guide me into where I'm going wrong.
/* Simple example of dynamic memory allocation */
/* When array reaches 10 elements, it calls
realloc to double the memory space available */
#include <stdio.h>
#include <stdlib.h>
#define SIZE_ARRAY 10
int main()
{
int *a;
int length=0,i=0,ch;
a= calloc(SIZE_ARRAY, sizeof(int));
if(a == NULL)
{
printf("Not enough space. Allocation failed.\n");
exit(0);
}
printf("Fill up your array: ");
while(1)
{
scanf("%d",&ch);
if(ch == 'q') //possible mistake here
break;
a[length]=ch;
length++;
if(length % 10 == 0) //when length is 10, 20, 30 ..
{
printf("Calling realloc to double size.\n");
a=realloc(a, 2*SIZE_ARRAY*sizeof(int));
}
}
printf("You array is: \n");
for(;i<length;i++)
printf("%d ",a[i]);
return 0;
}
Whenever I type 'q' the program crashes. I'm a beginner so I know I'm doing a dumb mistake. Any help would be greatly appreciated.
You should not double the memory on every realloc(); that can get very large very quickly. You normally expand the memory in small chunks only. realloc() also has the nasty habit of moving the data to another part of memory if the old block cannot be extended enough. If the call fails and you have assigned its result directly to your only pointer, you lose all of the data in the old memory. This can be avoided by using a temporary pointer for the new memory and swapping them after a successful allocation. That comes at the cost of just one extra pointer (mostly 4 or 8 bytes) and a swap (needing only a handful of CPU cycles at most. Be careful with x86's xchg: it uses a lock in case of multiple processors, which is quite expensive!)
#include <stdio.h>
#include <stdlib.h>
// would normally be some small power of two, like
// e.g.: 64 or 256
#define REALLOC_GROW 10
int main()
{
// a needs to be NULL to avoid a first malloc()
int *a = NULL, *cp;
// to avoid complications allocated is int instead of size_t
int allocated = 0, length = 0, i, ch, r;
printf("Fill up your array: ");
while (1) {
// Will also do the first allocation when allocated == length
if (allocated <= length) {
// realloc() might choose another chunk of memory, so
// it is safer to work on copy here, such that nothing is lost
cp = realloc(a, (allocated + REALLOC_GROW) * sizeof(int));
if (cp == NULL) {
fprintf(stderr, "Malloc failed\n");
// but we can still use the old data
for (i = 0; i < length; i++) {
printf("%d ", a[i]);
}
// that we still have the old data means that we need to
// free that memory, too
free(a);
exit(EXIT_FAILURE);
}
a = cp;
// don't forget to keep the amount of memory we've just allocated
allocated += REALLOC_GROW;
}
// out, if user typed in anything but an integer
if ((r = scanf("%d", &ch)) != 1) {
break;
}
a[length] = ch;
length++;
}
printf("Your array is: \n");
// keep informations together, set i=0 in the loop
for (i = 0; i < length; i++) {
printf("%d ", a[i]);
}
fputc('\n', stdout);
// clean up
free(a);
exit(EXIT_SUCCESS);
}
If you play around a bit with the start-value of allocated, the value of REALLOC_GROW and use multiplication instead of addition in the realloc() and replace the if(allocated <= length) with if(1) you can trigger the no-memory error and see if it still prints what you typed in before. Now change the realloc-on-copy by using a directly and see if it prints the data. That still might be the case but it is not guaranteed anymore.

Inconsistent malloc memory corruption

I'm currently writing a method that reads from an allocated block of memory and prints out its contents from a certain offset and up to a specified size, both of which are passed as parameters. I'm using char pointers to accomplish this, but keep getting a malloc error around line
char *content = (char *)malloc(size+1);
Code for the method:
int file_read(char *name, int offset, int size)
{
//First find file and its inode, if existing
int nodeNum = search_cur_dir(name);
if(nodeNum < 0) {
printf("File read error: file does not exist\n");
return -1;
}
//Size check, to avoid overflows/overreads
if(offset > inode[nodeNum].size || size > inode[nodeNum].size || (offset+size) > inode[nodeNum].size) {
printf("File read error: offset and/or size is too large\n");
return -1;
}
int i, read_size, track_size = size, content_offset = 0;
int target_block = offset / BLOCK_SIZE; //Defined as constant 512
int target_index = offset % BLOCK_SIZE;
char *raw_content = (char *)malloc(inode[nodeNum].size+1);
printf("check1\n"); //Debug statment
for(i = target_block; i < (inode[nodeNum].blockCount-(size/BLOCK_SIZE)); i++) {
disk_read(inode[nodeNum].directBlock[i], raw_content+content_offset);
content_offset += BLOCK_SIZE;
}
printf("check2\n"); //Debug statment
char *content = (char *)malloc(size+1);
memcpy(content, raw_content+target_index, size);
printf("%s\n", content);
free(raw_content);
free(content);
return 0;
}
and code for disk_read:
char disk[MAX_BLOCK][BLOCK_SIZE]; //Defined as 4096 and 512, respectively
int disk_read(int block, char *buf)
{
if(block < 0 || block >= MAX_BLOCK) {
printf("disk_read error\n");
return -1;
}
memcpy(buf, disk[block], BLOCK_SIZE);
return 0;
}
structure for node
typedef struct {
TYPE type;
int owner;
int group;
struct timeval lastAccess;
struct timeval created;
int size;
int blockCount;
int directBlock[10];
int indirectBlock;
char padding[24];
} Inode; // 128 byte
The error I get when using this method is one of memory corruption
*** glibc detected *** ./fs_sim: malloc(): memory corruption (fast): 0x00000000009f1030 ***
Now the strange part is, firstly this only occurs after I have used the method a few times - for the first two or three attempts it will work and then the error occurs. For instance, here is an example test run:
% read new 0 5
z12qY
% read new 0 4
z12q
% read new 0 3
*** glibc detected *** ./fs_sim: malloc(): memory corruption (fast): 0x00000000009f1030 ***
Even stranger still, this error disappears completely when I comment out
free(raw_content);
free(content);
Even though this would tie up the memory. I've read through previous posts regarding malloc memory corruption and understand this usually results from overwriting memory bounds or under-allocating space, but I can't see where I could be doing this. I've attempted other sizes for malloc as well, and these produced the best results when I commented out the lines freeing both pointers. Does anyone see what I could be missing? And why does this occur so inconsistently?
The code allocates space for the characters and a null character, but does not ensure the array is terminated with a null character before printing it as a string.
char *content = (char *)malloc(size+1);
memcpy(content, raw_content+target_index, size);
// add
content[size] = '\0';
printf("%s\n", content);
Likely other issues too.
[Edit]
OP's code is prone to mis-coding and depends on inode[] having coherent values (.blockCount, .size). Clarify and simplify by determining the loop count and allocating according to that count.
int loop_count = (inode[nodeNum].blockCount-(size/BLOCK_SIZE)) - target_block;
char *raw_content = malloc(sizeof *raw_content * loop_count * BLOCK_SIZE);
assert(raw_content); // requires <assert.h>
for (int loop = 0; loop < loop_count; loop++) {
i = target_block + loop;
disk_read(inode[nodeNum].directBlock[i], raw_content + content_offset);
content_offset += BLOCK_SIZE;
}
Also recommend checking the success of disk_read()

making your own malloc function in C

I need your help in this. I have an average knowledge of C and here is the problem. I am about to use some benchmarks to test some computer architecture stuff (branch misses, cache misses) on a new processor. The thing about it is that benchmarks are in C but I must not include any library calls. For example, I cannot use malloc because I am getting the error
"undefined reference to malloc"
even if I have included the library. So I have to write my own malloc. I do not want it to be super efficient - just do the basics. As I am thinking it I must have an address in memory and everytime a malloc happens, I return a pointer to that address and increment the counter by that size. Malloc happens twice in my program so I do not even need large memory.
Can you help me on that? I have designed a Verilog and do not have so much experience in C.
I have seen previous answers but all seem too complicated for me. Besides, I do not have access to K-R book.
Cheers!
EDIT: maybe this can help you more:
I am not using gcc but the sde-gcc compiler. Does it make any difference? Maybe that's why I am getting an undefined reference to malloc?
EDIT2:
I am testing a MIPS architecture:
I have included:
#include <stdlib.h>
and the errors are:
undefined reference to malloc
relocation truncated to fit: R_MIPS_26 against malloc
and the compiler command is:
test.o: test.c cap.h
sde-gcc -c -o test.s test.c -EB -march=mips64 -mabi=64 -G -O -ggdb -O2 -S
sde-as -o test.o test.s EB -march=mips64 -mabi=64 -G -O -ggdb
as_objects:=test.o init.o
EDIT 3:
OK, I used the implementation above and it runs without any problems. The problem is that when doing embedded programming, you have to define everything you use yourself, so I defined my own malloc. sde-gcc didn't recognize the malloc function.
This is a very simple approach, which may get you past your 2 mallocs:
static unsigned char our_memory[1024 * 1024]; //reserve 1 MB for malloc
static size_t next_index = 0;
void *malloc(size_t sz)
{
void *mem;
if(sizeof our_memory - next_index < sz)
return NULL;
mem = &our_memory[next_index];
next_index += sz;
return mem;
}
void free(void *mem)
{
//we cheat, and don't free anything.
}
If required, you might need to align the memory you hand back, so that you always return addresses that are a multiple of 4, 8, 16 or whatever you require.
Here is an attempt at a thread-safe version of nos's answer above. I am reusing his code with some changes:
#include <pthread.h>
#include <stddef.h>
static unsigned char our_memory[1024 * 1024]; //reserve 1 MB for malloc
static size_t next_index = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
void *malloc(size_t sz)
{
void *mem;
pthread_mutex_lock(&lock);
if(sizeof our_memory - next_index < sz){
pthread_mutex_unlock(&lock);
return NULL;
}
mem = &our_memory[next_index];
next_index += sz;
pthread_mutex_unlock(&lock);
return mem;
}
void free(void *mem)
{
//we cheat, and don't free anything.
}
You need to link against libc.a or the equivalent for your system. If you don't use the standard C library, you won't get any of the startup code that runs before the main function either. Your program will never run....
You could either allocate a block of static data and use that in place of malloc, like:
// char* fred = malloc(10000);
// becomes
static char fred[10000];
or call the standard malloc for a large block of contiguous memory on startup and write your own malloc-type function to divide that up. In the second case you would start benchmarking after calling the system's malloc, so as not to affect the benchmarks.
I am sharing a complete approach for malloc and free that covers the scenarios listed below. It is implemented using an array; the metadata could also be kept in a linked list.
There are three scenarios we have to cover:
Contiguous memory allocation: allocate memory in a contiguous manner.
Allocation between two allocated blocks: when there is free memory between two allocated blocks, we have to use that chunk for the allocation.
Allocation from the initial block: when the initial block is free.
For details, see the diagram. [Diagram: memory allocation algorithm]
Source code for malloc
#include <stdio.h>  /* printf */
#include <string.h> /* memcpy */
#define TRUE 1
#define FALSE 0
#define MAX_ALOCATION_ALLOWED 20
static unsigned char our_memory[1024 * 1024];
static int g_allocted_number = 0;
static int g_heap_base_address = 0;
typedef struct malloc_info
{
int address;
int size;
}malloc_info_t;
malloc_info_t metadata_info[MAX_ALOCATION_ALLOWED] ={0};
void* my_malloc(int size)
{
int j =0;
int index = 0 ;
int initial_gap =0;
int gap =0;
int flag = FALSE;
int initial_flag = FALSE;
void *address = NULL;
int heap_index = 0;
malloc_info_t temp_info = {0};
if(g_allocted_number >= MAX_ALOCATION_ALLOWED)
{
return NULL;
}
for(index = 0; index < g_allocted_number; index++)
{
if(metadata_info[index+1].address != 0 )
{
initial_gap = metadata_info[0].address - g_heap_base_address; /*Checked Initial Block (Case 3)*/
if(initial_gap >= size)
{
initial_flag = TRUE;
break;
}
else
{
gap = metadata_info[index+1].address - (metadata_info[index].address + metadata_info[index].size); /*Check Gap Between two allocated memory (Case 2)*/
if(gap >= size)
{
flag = TRUE;
break;
}
}
}
}
if(flag == TRUE) /*Get Index for allocating memory for case 2*/
{
heap_index = ((metadata_info[index].address + metadata_info[index].size) - g_heap_base_address);
for(j = MAX_ALOCATION_ALLOWED -1; j > index+1; j--)
{
memcpy(&metadata_info[j], &metadata_info[j-1], sizeof(malloc_info_t));
}
}
else if (initial_flag == TRUE) /*Get Index for allocating memory for case 3*/
{
heap_index = 0;
for(j = MAX_ALOCATION_ALLOWED -1; j > index+1; j--)
{
memcpy(&metadata_info[j], &metadata_info[j-1], sizeof(malloc_info_t));
}
}
else /*Get Index for allocating memory for case 1*/
{
if(g_allocted_number != 0)
{
heap_index = ((metadata_info[index -1].address + metadata_info[index-1].size) - g_heap_base_address);
}
else /* 0 th Location of Metadata for First time allocation*/
heap_index = 0;
}
address = &our_memory[heap_index];
metadata_info[index].address = g_heap_base_address + heap_index;
metadata_info[index].size = size;
g_allocted_number += 1;
return address;
}
Now the code for free:
void my_free(int address)
{
int i =0;
int copy_meta_data = FALSE;
for(i = 0; i < g_allocted_number; i++)
{
if(address == metadata_info[i].address)
{
// memset(&our_memory[metadata_info[i].address], 0, metadata_info[i].size);
g_allocted_number -= 1;
copy_meta_data = TRUE;
printf("g_allocted_number in free = %d %d\n", g_allocted_number, address);
break;
}
}
if(copy_meta_data == TRUE)
{
if(i == MAX_ALOCATION_ALLOWED -1)
{
metadata_info[i].address = 0;
metadata_info[i].size = 0;
}
else
memcpy(&metadata_info[i], &metadata_info[i+1], sizeof(malloc_info_t));
}
}
The test code is:
int main()
{
int *ptr =NULL;
int *ptr1 =NULL;
int *ptr2 =NULL;
int *ptr3 =NULL;
int *ptr4 =NULL;
int *ptr5 =NULL;
int *ptr6 =NULL;
g_heap_base_address = &our_memory[0];
ptr = my_malloc(20);
ptr1 = my_malloc(20);
ptr2 = my_malloc(20);
my_free(ptr);
ptr3 = my_malloc(10);
ptr4 = my_malloc(20);
ptr5 = my_malloc(20);
ptr6 = my_malloc(10);
printf("Addresses are: %p, %p, %p, %p, %p, %p, %p\n", (void *)ptr, (void *)ptr1, (void *)ptr2, (void *)ptr3, (void *)ptr4, (void *)ptr5, (void *)ptr6);
return 0;
}

Reallocating memory for a struct array in C

I am having trouble with a struct array. I need to read in a text file line by line, and compare the values side by side. For example "Mama" would return 2 ma , 1 am because you have ma- am- ma. I have a struct:
typedef struct{
char first, second;
int count;
} pair;
I need to create an array of structs for the entire string, and then compare those structs. We also were introduced to memory allocation so we have to do it for any size file. That is where my trouble is really coming in. How do I reallocate the memory properly for an array of structs? This is my main as of now (doesn't compile, has errors obviously having trouble with this).
int main(int argc, char *argv[]){
//allocate memory for struct
pair *p = (pair*) malloc(sizeof(pair));
//if memory allocated
if(p != NULL){
//Attempt to open io files
for(int i = 1; i<= argc; i++){
FILE * fileIn = fopen(argv[i],"r");
if(fileIn != NULL){
//Read in file to string
char lineString[137];
while(fgets(lineString,137,fileIn) != NULL){
//Need to reallocate here, sizeof returning error on following line
//having trouble seeing how much memory I need
pair *realloc(pair *p, sizeof(pair)+strlen(linestring));
int structPos = 0;
for(i = 0; i<strlen(lineString)-1; i++){
for(int j = 1; j<strlen(lineSTring);j++){
p[structPos]->first = lineString[i];
p[structPos]->last = lineString[j];
structPos++;
}
}
}
}
}
}
else{
printf("pair pointer length is null\n");
}
}
I am happy to change things around obviously if there is a better method for this. I HAVE to use the above struct, have to have an array of structs, and have to work with memory allocation. Those are the only restrictions.
Allocating memory for an array of struct is as simple as allocating for one struct:
pair *array = malloc(sizeof(pair) * count);
Then you can access each item by indexing array:
array[0] => first item
array[1] => second item
etc
Regarding the realloc part, instead of:
pair *realloc(pair *p, sizeof(pair)+strlen(linestring));
(which is not syntactically valid, looks like a mix of realloc function prototype and its invocation at the same time), you should use:
p=realloc(p,[new size]);
In fact, you should use a different variable to store the result of realloc: if the allocation fails, realloc returns NULL but leaves the already allocated memory in place, and overwriting p would lose your only pointer to it. On most Unix systems, when doing casual processing (not some heavy-duty task), reaching the point where malloc/realloc returns NULL is a fairly rare case (you must have exhausted all free virtual memory). Still, it's better to write:
pair*newp=realloc(p,[new size]);
if(newp != NULL) p=newp;
else { ... last resort error handling, screaming for help ... }
So if I get this right you're counting how many times pairs of characters occur? Why all the mucking about with nested loops and using that pair struct when you can just keep a frequency table in a 64KB array, which is much simpler and orders of magnitude faster.
Here's roughly what I would do (SPOILER ALERT: especially if this is homework, please don't just copy/paste):
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>
void count_frequencies(size_t* freq_tbl, FILE* pFile)
{
int first, second;
first = fgetc(pFile);
while( (second = fgetc(pFile)) != EOF)
{
/* Only consider printable characters */
if(isprint(first) && isprint(second))
++freq_tbl[(first << 8) | second];
/* Proceed to next character */
first = second;
}
}
int main(int argc, char*argv[])
{
size_t* freq_tbl = calloc(1 << 16, sizeof(size_t));
FILE* pFile;
size_t i;
/* Handle some I/O errors */
if(argc < 2)
{
fprintf(stderr, "No file given\n"); /* perror() would append an unrelated errno message */
return EXIT_FAILURE;
}
if(! (pFile = fopen(argv[1],"r")))
{
perror ("Error opening file");
return EXIT_FAILURE;
}
if(feof(pFile))
{
perror ("Empty file");
return EXIT_FAILURE;
}
count_frequencies(freq_tbl, pFile);
/* Print frequencies */
for(i = 0; i <= 0xffff; ++i)
if(freq_tbl[i] > 0)
printf("%c%c : %zu\n", (char) (i >> 8), (char) (i & 0xff), freq_tbl[i]);
free(freq_tbl);
return EXIT_SUCCESS;
}
Sorry for the bit operations and hex notation. I just happen to like them in such a context of char tables, but they can be replaced with multiplications and additions, etc for clarity.

Resources