Hoard performance degrades severely when allocating large size chunks - c

I have written below sample program in 'C' which is dynamic memory intensive and tried to benchmark the same (in terms of time taken) for the 'glibc' default allocator versus Hoard allocator.
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 #define NUM_OF_BLOCKS (1 * 4096)
5
6 void *allocated_mem_ptr_arr[NUM_OF_BLOCKS];
7
8 int
9 main (int argc, char *argv[])
10 {
11 void *myblock = NULL;
12
13 int count, iter;
14
15 int blk_sz;
16
17 if (argc != 2)
18 {
19 fprintf (stderr, "Usage:./memory_intensive <Block size (KB)>\n\n");
20 exit (-1);
21 }
22
23 blk_sz = atoi (argv[1]);
24
25 for (iter = 0; iter < 1024; iter++)
26 {
27 /*
28 * The allocated memory is not accessed (read/write) hence the residual memory
29 * size remains low since no corresponding physical pages are being allocated
30 */
31 printf ("\nCurrently at iteration %d\n", iter);
32 fflush (NULL);
33
34 for (count = 0; count < NUM_OF_BLOCKS; count++)
35 {
36 myblock = (void *) malloc (blk_sz * 1024);
37 if (!myblock)
38 {
39 printf ("malloc() fails\n");
40 sleep (30);
41 return;
42 }
43
44 allocated_mem_ptr_arr[count] = myblock;
45 }
46
47 for (count = 0; count < NUM_OF_BLOCKS; count++)
48 {
49 free (allocated_mem_ptr_arr[count]);
50 }
51 }
52 }
As a result of this benchmark activity, I got below results (Block Size, Time elapsed for default allocator, Time elapsed for Hoard):
'1K' '4.380s' '0.927s'
'2k' '8.390s' '0.960s'
'4k' '16.757s' '1.078s'
'8k' '16.619s' '1.154s'
'16k' '17.028s' '13m 6.463s'
'32k' '17.755s' '5m 45.039s'
As can be seen, Hoard performance severely degrades with block size >= 16K. What is the reason? Can we say that Hoard is not meant for applications allocating large size chunks?

There are some nice benchmarks and explanations at http://locklessinc.com/benchmarks_allocator.shtml:
For small allocaitons, it still performs similarly to tcmalloc. However, beyond about 64KiB, it drops drastically in performance. It uses a central "hoard" to redistribute memory between threads. This presents a bottle-neck as only one thread at a time can be using it. As the number of threads increases, the problem gets worse and worse.

Related

How can I know why a program causes a coredump sigfault?

I have those functions and I would like to know if anyone can help me. I have to investigate why they cause a "segfault", and why it happens faster or slower depending on its conditions.
I supposed that in Rec1, it's caused by an infinite loop that collapses the memory of the memory stack. In rec2 I suppose it's caused faster because of the same condition that in Rec1 but adding that it is allocating memory everytime for the pointer too.
In Rec3 () that crashes instantly because it's allocating the same memory spot in the second iteration and causes a problem because the program is trying to access the same allocated memory.
In Rec4 () I think it's caused because it creates an array with infinite positions, ask is the limiting of the array max space.
Can you give me some advices on those suppositions?
#include <stdio.h>
#include <stdlib.h>
#define MOD 10000
int k = 0;
char global[100];
void
panic (char *m)
{
fprintf (stderr, "%s\n", m);
exit (0);
}
void
rec1 ()
{
k++;
if (k % MOD == 0)
fprintf (stderr, "%d ", k / MOD);
rec1 ();
}
void
rec2 ()
{
int *tmp;
tmp = malloc (100 * sizeof (int));
if (tmp == NULL)
exit (0);
k++;
if (k % MOD == 0)
fprintf (stderr, "%d ", k / MOD);
rec2 ();
}
void
rec3 ()
{
int tmp[100];
k++;
if (k % MOD == 0)
fprintf (stderr, "%d ", k / MOD);
rec3 ();
}
void
rec4 ()
{
global[k] = k;
if (k % 200 == 0)
fprintf (stderr, "%d ", k);
k++;
rec4 ();
}
int
main (int argc, char *argv[])
{
int mode = 1;
if (argc > 1)
mode = atoi (argv[1]);
printf ("Testing rec%d...\n", mode);
switch (mode)
{
case 1:
rec1 ();
break;
case 2:
rec2 ();
break;
case 3:
rec3 ();
break;
case 4:
rec4 ();
break;
default:
panic ("Wrong mode");
}
return 0;
}
This is the output when I run the compiled C program in terminal.
Testing rec1...
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52
Program received signal SIGSEGV, Segmentation fault.
0x0000555555554904 in rec1 () at stack.c:24
24 rec1 ();
Testing rec2...
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7a7b96a in __GI___libc_free (mem=0x555555757670) at malloc.c:3086
3086 malloc.c: No existe el archivo o el directorio.
Testing rec3...
1
Program received signal SIGSEGV, Segmentation fault.
0x0000555555554a43 in rec3 () at stack.c:53
53 rec3 ();
Testing rec4...
0 200 400 600 800 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800 3000 3200 3400 3600 3800 4000 Violación de segmento (`core' generado)
Program received signal SIGSEGV, Segmentation fault.
0x0000555555554a1f in rec4 ()
The Code that you have is very likely to trigger an error in my experience. Without any compiler or program feedback, it's a little difficult to discern exactly what went wrong, but I believe you may be looking (Generally) for information regarding stacks, heaps, and recursion.
Firstly, please note that
void rec1 () {
k++;
if (k % MOD == 0)
fprintf (stderr, "%d ", k / MOD);
rec1 ();
}
is NOT "Iteration". Iteration refers to the repetition of a sequential portion of code (usually a for or while loop). What you have here is recursion. Recursion creates a new instance of the method to operate from, along with a stack pointer to navigate to the last execution point (As well as store any immediately relevant variables). This occurs every time you call the rec1 () function from your rec1 () function Eventually, you'll run out of space on the stack to store those pointers. The number of pointers you can store on a stack is usually quite large on modern computers, but considering that you have no return statement, you will run into the maximum capacity eventually.
EDIT
This post has been edited to reflect the new material presented by the question.
Okay...From the material you presented, it looks like you're essentially being asked about WHERE each rec stores and processes information...
In the case of Rec1, it does indeed appear to be a simple case of stack overflow. The pointer to the last execution point of the previous Rec1 is stored on the stack, ultimately resulting in the program's crash after about 520,000 instances. Given that each pointer is 4 bytes, that's around 2 MB of just recursive pointer information alone stored on your stack before it collapses and triggers a Seg Fault due to stack overflow.
The second case is a little trickier. Note that your program indicates that it makes it to roughly 260,000 recursions before it triggers a Seg Fault. This is exactly half of Rec1. HOWEVER, this is not necessarily a stack overflow per se. Rec2 allocates 400 bytes of data on the heap per recursion. The pointer to the heap is stored on the stack, meaning that 8 bytes are stored on the stack per recursion (which may be related to why its exactly half, but could also be explained by the ratio of your stack / heap size). Now, the error for Rec2 states that malloc could not find the file or directory, which seems to me as though malloc could not complete correctly. This may actually indicate that the max heap size has been hit.
Rec3 is pretty straightforward. The entire integer array tmp is stored on the stack for each recursion. that's 4 bytes per integer times 100 ints, which is 400 bytes on the stack PER recursion. This is no surprise that it crashes between 10,000 to 20,000 recursions. There just wasn't enough space to store the data on the stack. NOTE: In relation to something you mentioned in your question, this tmp array does not attempt to allocate the same region of memory. Due to the fact that this is recursively built, it creates a new space on the stack for that function instance.
Rec4 is a simple case of buffer overflow. After overwriting the first 100 bytes of memory allocated in global[100], it was only a matter of time before k++ would cause global[k] to point to an address space restricted to the process. This triggered the seg fault after about 4000 recursions (k was not mod 10,000 in rec4).
I hope this clarification helps.

How to convert a linking list to an dynamic array

I created an function in c as a linked list but i can't understand how convert it to a linked list.
My Question is how to convert an linked list to an dynamic array. This function adds elements to an linked list i want to be able to add elements to an dynamic array. i don't know if the struct blockhead_node is right like i said i'm still learning c. I do not understand dynamic arrays so i'm trying to create an program that uses them so i may understand them better. i want to create an add function that adds elements to the front or end of the list. This is what i have:
//this struct i'm trying to use for dynamic array
struct blockhead_node
{
float x, y;
float dx, dy;
long color;
int size; // slots used so far
int capacity; // total available slots
int *data; // array of integers we're storing
};
//this struct is for the linked list
struct blockhead_node
{
float x,y;
float dx, dy;
long color;
int size;
struct blockhead_node * next;
};
void add(struct blockhead_node ** blockhead_list) // double pointer because we can't modify the list it self
{
while((*blockhead_list)!=NULL)
{
blockhead_list=&(*blockhead_list)->next;
}
(*blockhead_list) = (struct blockhead_node*) malloc(sizeof(struct blockhead_node));
(*blockhead_list)->x = rand()%screen_width + (*blockhead_list)->size;
//(*blockhead_list)->x = 400;
//
//look up how to create an random floating point number
//
(*blockhead_list)->dx = ((float)rand()/(float)(10000));
(*blockhead_list)->y = rand()%screen_height + (*blockhead_list)->size;
(*blockhead_list)->dy = ((float)rand()/(float)(10000));
(*blockhead_list)->size = rand()%100;
(*blockhead_list)->next = NULL;
if((*blockhead_list)->x + (*blockhead_list)->size > screen_width)
{
(*blockhead_list)->x = screen_width - (*blockhead_list)->size;
}
if((*blockhead_list)->y + (*blockhead_list)->size > screen_height)
{
(*blockhead_list)->y = screen_height - (*blockhead_list)->size;
}
}
Using a dynamic array is not much different than using a linked-list. The primary difference is you are responsible for keeping track of size and realloc your data array when size == capacity (or when capacity == 0 for a new stuct).
Other than that you are either allocating per-node with a linked-list, whereas with a dynamic array you can allocate in a more efficient manner by allocating by some multiple of the current capacity when size == capacity. For arrays of completely unknown size you can either start with capacity = 2 or 8 and then either increase by some multiple (e.g. 3/2 or 2, etc..) or some fixed amount (8, 16, etc..) each time a reallocation is needed. I generally just double the current capacity, e.g. starting a 2, reallocating storage in data for 4, 8, 16, 32, ... integers as more numbers are added to data. You can do it any way you like -- as long as you keep track of your size and capacity.
A short example may help. Let's start with just a simple struct where data is the only dynamic member we are concerned about. For example:
#define CAP 2 /* initial capacity for new struct */
typedef struct {
size_t size,
capacity;
int *data;
} blkhd_node;
Where size is your "slots used so far" and capacity is your "total available slots" and data your "array of integers we're storing".
The add function is fairly trivial. All you need to do is check whether you need to realloc (e.g. this is a struct not used yet where capacity == 0 or all the slots are used and size == capacity). Since realloc can be used for both new and resizing the allocation, it is all you need. The scheme is simple, if this is a new struct, we will allocate for CAP number of int, otherwise if size == capacity we will allocated 2 * capacity number of int.
Whenever we realloc we do so with a temporary pointer! Why? When realloc fails it returns NULL. If you realloc with the original pointer (e.g. data = realloc (data, newsize); and NULL is returned, you overwrite your original pointer address for data with NULL and your ability to reach (or free) the original block of memory is lost -- creating a memory leak. By using a temporary pointer, we can validate whether realloc succeeds before we assign the new address to data. Importantly, if realloc does fail -- then our existing integers pointed to by data are still fine, so we are free to use our original allocation in the event of realloc failure. Important to remember.
We also need a way to indicate success/failure of or add function. Here, you are not adding a new node, so returning a pointer to a new node isn't an option. In this case a simple int return of 0 for failure of 1 for success is just as good as anything else.
With that, an add function could be as simple as:
int add (blkhd_node *b, int v)
{
/* realloc if (1) new struct or (2) size == capacity */
if (!b->capacity || b->size == b->capacity) {
size_t newsize;
if (!b->capacity) /* new stuct, set size = CAP */
newsize = CAP;
else if (b->size == b->capacity) /* otherwise double size */
newsize = b->capacity * 2;
/* alway realloc with a temporary pointer */
void *tmp = realloc (b->data, newsize * sizeof *b->data);
if (!tmp) { /* validate reallocation */
perror ("realloc_b->data");
return 0; /* return failure */
}
b->data = tmp; /* assign new block of mem to data */
b->capacity = newsize; /* set capacity to newsize */
}
b->data[b->size++] = v; /* set data value to v */
return 1; /* return success */
}
Note: since you are initializing your struct and both size and capacity will be 0, you can dispense with separate checks for capacity == 0 and size == capacity and use a ternary to set newsize. This is arguably the less-readable way to go and a good compiler will optimize both identically, but for sake of completeness, you could replace the code setting newsize with:
/* realloc if (1) new struct or (2) size == capacity */
if (b->size == b->capacity) {
/* set newsize */
size_t newsize = b->capacity ? b->capacity * 2 : CAP;
Given the option, your choice should always be for the more readable and maintainable code. You do your job, and let the compiler worry about the rest.
(you can do a dynamic array of struct blockhead_node in exactly the same way -- just account for the number of structs you have in the array in addition to the size, capacity for data in each -- that is left to you)
So, let's see if our scheme of originally allocating for 2-int will allow us to add 100-int to our data array with a short example putting it altogether:
#include <stdio.h>
#include <stdlib.h>
#define CAP 2 /* initial capacity for new struct */
typedef struct {
size_t size,
capacity;
int *data;
} blkhd_node;
int add (blkhd_node *b, int v)
{
/* realloc if (1) new struct or (2) size == capacity */
if (!b->capacity || b->size == b->capacity) {
size_t newsize;
if (!b->capacity) /* new stuct, set size = CAP */
newsize = CAP;
else if (b->size == b->capacity) /* otherwise double size */
newsize = b->capacity * 2;
/* alway realloc with a temporary pointer */
void *tmp = realloc (b->data, newsize * sizeof *b->data);
if (!tmp) { /* validate reallocation */
perror ("realloc_b->data");
return 0; /* return failure */
}
b->data = tmp; /* assign new block of mem to data */
b->capacity = newsize; /* set capacity to newsize */
}
b->data[b->size++] = v; /* set data value to v */
return 1; /* return success */
}
void prndata (blkhd_node *b)
{
for (size_t i = 0; i < b->size; i++) {
if (i && i % 10 == 0)
putchar ('\n');
printf (" %3d", b->data[i]);
}
putchar ('\n');
}
int main (void) {
blkhd_node b = { .size = 0 }; /* declare/initialize struct */
for (int i = 0; i < 100; i++) /* add 100 data values */
if (!add (&b, i + 1))
break; /* don't exit, all prior data still good */
/* output results */
printf ("\nsize : %zu\ncapacity : %zu\n\n", b.size, b.capacity);
prndata (&b);
free (b.data); /* don't forget to free what you allocate */
return 0;
}
Example Use/Output
$ ./bin/dynarrayint
size : 100
capacity : 128
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/dynarrayint
==12589== Memcheck, a memory error detector
==12589== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==12589== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==12589== Command: ./bin/dynarrayint
==12589==
size : 100
capacity : 128
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
==12589==
==12589== HEAP SUMMARY:
==12589== in use at exit: 0 bytes in 0 blocks
==12589== total heap usage: 7 allocs, 7 frees, 1,016 bytes allocated
==12589==
==12589== All heap blocks were freed -- no leaks are possible
==12589==
==12589== For counts of detected and suppressed errors, rerun with: -v
==12589== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.

Valgrind Error for Eratosthenes Prime (C)

I've written this program to calculate the prime numbers up to len using Eratosthenes method. The program works fine and I can even calculate up to very large numbers like 999,999 and so on and I get a fine output. But the issue is that valgrind always shows me errors, no matter how small or how big len is.
Program:
#include <stdio.h>
#include <stdlib.h>
int main(){
size_t len=100;
int *array=malloc(len * sizeof(*array));
// initialize all elements to 1
for(int i=0;i<len;i++)
array[i]=1;
//set multiples of array[a] to 0
for(int a=2;a<len;a++){
for(int b=2;b<len;b++){
if(a*b>len)
break;
array[a*b]=0;
}
}
//print the index of "1"s in the array
for(int a=2;a<=len;a++){
if(array[a]==1)
printf("%d ", a);
}
printf("\n");
free(array);
return 0;
}
Errors:
I compile using: gcc -std=c99 -Wall -g test.c -o test
Output: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
gdb shows no errors or malfunction
valgrind ./test shows:
==10134== Invalid write of size 4
==10134== at 0x400695: main (...)
==10134== Address 0x52041d0 is 0 bytes after a block of size 400 alloc'd
==10134== at 0x4C2DB8F: malloc (...)
==10134== by 0x400625: main (...)
==10134==
==10134== Invalid read of size 4
==10134== at 0x4006D9: main (...)
==10134== Address 0x52041d0 is 0 bytes after a block of size 400 alloc'd
==10134== at 0x4C2DB8F: malloc (...)
==10134== by 0x400625: main (...)
==10134==
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
==10134==
==10134== HEAP SUMMARY:
==10134== in use at exit: 0 bytes in 0 blocks
==10134== total heap usage: 2 allocs, 2 frees, 1,424 bytes allocated
==10134==
==10134== All heap blocks were freed -- no leaks are possible
==10134==
==10134== For counts of detected and suppressed errors, rerun with: -v
==10134== ERROR SUMMARY: 8 errors from 2 contexts (suppressed: 0 from 0)
As you can see in the code, I am declaring my array as int *array=malloc(len * sizeof(*array)); which seems to me to be the problem. If I declare it like this: int array[len];, valgrind doesn't show me error for small numbers of len, which is also something that fascinates me. Higher numbers of len with VLA declaration causes some unexpected behavior. So what is happening here? Is there anything wrong with the code or can I simply ignore the valgrind errors, since the output is ok? Also, as I said earlier, the program works fine for very large numbers like 999,999 but running valgrind for that len gives exactly 999,999 errors in valgrind. Any explanation is highly appreciated :)
Your malloc is correct, sizeof(*array) is the same as sizeof(int), because *array is an int.
for(int a=2;a<=len;a++) overflows array because array[len-1] is the last element in the array and eventually you do
if(array[a]==1)
printf("%d ", a);
with a==100 which is out of bounds.
It should be
for(int a=2;a<len;a++)
if(array[a] == 1)
printf("%d ", a);
Edit
As Jonathan Leffler has pointed out in the comments,
if(a*b>len)
break;
if also wrong. For a==2 and b==50 you have a*b==100 which is not greater
than len and you are accessing out of bounds again. The condition should be
if(a*b >= len)
break;
Just two comments about your code:
The algorithm implemented here is not exactly the algorithm known as The sieve of Erathostenes, because if you analyse your code you'll see that you have two loops, running all possible values of a and b and marking them as compound. Yes, you finish with a sieve... but that was not the efficient one from Erathostenes. The sieve of Erathostenes consists in getting the next unmarked element, and mark all elements that are multiples of it (until you reach the maximum index, at len) this is something like:
#include <stdio.h>
#include <stdlib.h>
#define N 10000
#define sqrt_N 100
int array[N];
int main(){
int a, b;
/* better to leave them at 0 and marking with ones */
// initialize all elements to 1
//for(int i=0;i<len;i++)
//array[i]=1;
//set multiples of a to 1
/* We only need to do this upto sqrt_N because if a unmarked number
* is discovered above sqrt_N it will be marked (as compund) because
* it was the product of a number less than sqrt_N or be prime (because
* it cannot be the product of two numbers greater than the sqrt(N)
* or it would be greater than N. */
for(a = 2; a < sqrt_N; a++){
if (array[a] == 0) { /* do the inner loop only if a is a prime */
for(b = 2; a * b < N; b++)
array[a * b] = 1;
}
}
//print the index of "0"s in the array
for(a = 2; a <= N; a++){
if(!array[a])
printf("%s%d", a == 2 ? "" : ", ", a);
}
printf("\n");
//free(array); /* array is no more dynamic */
return 0;
}
just check the computation time and you'll see it runs faster than yours.
The problem you are having with a declaration like this:
int sieve[999999];
is that, if you put that as a local variable on main() (as you do) it is going to be stored in the stack, so that means you need at least one million integers (of size 4, most probably) and this is four megabytes in the stack. I don't actually know the operating system you are using, but it is common that the stack size is limited (in the order of 4-10 Mb) so you can overflow the stack if you are not careful with the size of your automatic variables.
The invalid write message from valgrind(1) comes from the fact that you have written the following code:
if (a*b > len) break;
and that means that, in the possible case that a*b results exactly len, you are writing in the array cell array[len] which is outside of the bounds of array (by one, they go from 0 to len - 1) You have to change that line to read:
if (a*b >= len) break;
or
if (a*b < len) {
array[a*b] = 0;
}

Having issues iterating through machine code

I'm attempting to recreate the wc command in c and having issues getting the proper number of words in any file containing machine code (core files or compiled c). The number of logged words always comes up around 90% short of the amount returned by wc.
For reference here is the project info
Compile statement
gcc -ggdb wordCount.c -o wordCount -std=c99
wordCount.c
/*
* Author(s) - Colin McGrath
* Description - Lab 3 - WC LINUX
* Date - January 28, 2015
*/
#include<stdio.h>
#include<string.h>
#include<dirent.h>
#include<sys/stat.h>
#include<ctype.h>
struct counterStruct {
int newlines;
int words;
int bt;
};
typedef struct counterStruct ct;
ct totals = {0};
struct stat st;
void wc(ct counter, char *arg)
{
printf("%6lu %6lu %6lu %s\n", counter.newlines, counter.words, counter.bt, arg);
}
void process(char *arg)
{
lstat(arg, &st);
if (S_ISDIR(st.st_mode))
{
char message[4056] = "wc: ";
strcat(message, arg);
strcat(message, ": Is a directory\n");
printf(message);
ct counter = {0};
wc(counter, arg);
}
else if (S_ISREG(st.st_mode))
{
FILE *file;
file = fopen(arg, "r");
ct currentCount = {0};
if (file != NULL)
{
char holder[65536];
while (fgets(holder, 65536, file) != NULL)
{
totals.newlines++;
currentCount.newlines++;
int c = 0;
for (int i=0; i<strlen(holder); i++)
{
if (isspace(holder[i]))
{
if (c != 0)
{
totals.words++;
currentCount.words++;
c = 0;
}
}
else
c = 1;
}
}
}
currentCount.bt = st.st_size;
totals.bt = totals.bt + st.st_size;
wc(currentCount, arg);
}
}
int main(int argc, char *argv[])
{
if (argc > 1)
{
for (int i=1; i<argc; i++)
{
//printf("%s\n", argv[i]);
process(argv[i]);
}
}
wc(totals, "total");
return 0;
}
Sample wc output:
135 742 360448 /home/cpmcgrat/53/labs/lab-2/core.22321
231 1189 192512 /home/cpmcgrat/53/labs/lab-2/core.26554
5372 40960 365441 /home/cpmcgrat/53/labs/lab-2/file
24 224 12494 /home/cpmcgrat/53/labs/lab-2/frequency
45 116 869 /home/cpmcgrat/53/labs/lab-2/frequency.c
5372 40960 365441 /home/cpmcgrat/53/labs/lab-2/lineIn
12 50 1013 /home/cpmcgrat/53/labs/lab-2/lineIn2
0 0 0 /home/cpmcgrat/53/labs/lab-2/lineOut
39 247 11225 /home/cpmcgrat/53/labs/lab-2/parseURL
138 318 2151 /home/cpmcgrat/53/labs/lab-2/parseURL.c
41 230 10942 /home/cpmcgrat/53/labs/lab-2/roman
66 162 1164 /home/cpmcgrat/53/labs/lab-2/roman.c
13 13 83 /home/cpmcgrat/53/labs/lab-2/romanIn
13 39 169 /home/cpmcgrat/53/labs/lab-2/romanOut
7 6 287 /home/cpmcgrat/53/labs/lab-2/URLs
11508 85256 1324239 total
Sample rebuild output (./wordCount):
139 76 360448 /home/cpmcgrat/53/labs/lab-2/core.22321
233 493 192512 /home/cpmcgrat/53/labs/lab-2/core.26554
5372 40960 365441 /home/cpmcgrat/53/labs/lab-2/file
25 3 12494 /home/cpmcgrat/53/labs/lab-2/frequency
45 116 869 /home/cpmcgrat/53/labs/lab-2/frequency.c
5372 40960 365441 /home/cpmcgrat/53/labs/lab-2/lineIn
12 50 1013 /home/cpmcgrat/53/labs/lab-2/lineIn2
0 0 0 /home/cpmcgrat/53/labs/lab-2/lineOut
40 6 11225 /home/cpmcgrat/53/labs/lab-2/parseURL
138 318 2151 /home/cpmcgrat/53/labs/lab-2/parseURL.c
42 3 10942 /home/cpmcgrat/53/labs/lab-2/roman
66 162 1164 /home/cpmcgrat/53/labs/lab-2/roman.c
13 13 83 /home/cpmcgrat/53/labs/lab-2/romanIn
13 39 169 /home/cpmcgrat/53/labs/lab-2/romanOut
7 6 287 /home/cpmcgrat/53/labs/lab-2/URLs
11517 83205 1324239 total
Notice the difference in the word count (second int) from the first two files (core files) as well as the roman file and parseURL files (machine code, no extension).
C strings do not store their length. They are terminated by a single NUL (0) byte.
Consequently, strlen needs to scan the entire string, character by character, until it reaches the NUL. That makes this:
for (int i=0; i<strlen(holder); i++)
desperately inefficient: for every character in holder, it needs to count all the characters in holder in order to test whether i is still in range. That transforms a simple linear Θ(N) algorithm into an Θ(N2) cycle-burner.
But in this case, it also produces the wrong result, since binary files typically include lots of NUL characters. Since strlen will actually tell you where the first NUL is, rather than how long the "line" is, you'll end up skipping a lot of bytes in the file. (On the bright side, that makes the scan quadratically faster, but computing the wrong result more rapidly is not really a win.)
You cannot use fgets to read binary files because the fgets interface doesn't tell you how much it read. You can use the Posix 2008 getline interface instead, or you can do binary input with fread, which is more efficient but will force you to count newlines yourself. (Not the worst thing in the world; you seem to be getting that count wrong, too.)
Or, of course, you could read the file one character at a time with fgetc. For a school exercise, that's not a bad solution; the resulting code is easy to write and understand, and typical implementations of fgetc are more efficient than the FUD would indicate.

How fast is mprotect

My question is how fast is mprotect. What will be the difference between mprotecting say 1 MB of contiguous memory as compared to 1 GB of contiguous memory? Of course I can measure the time, but I want to know what goes under the hood.
A quick check on the source seems to indicate that it iterates over the process mappings in the selected region and change their flags. If you protect less than a whole mapping it'll split it into two or three.
So in short it's O(n) where n is the number of times you've called mmap.
You can see all the current maps in /proc/pid/maps
It is O(n) on page count in the region too, because it should change a access bits on all PTEs (Page translation Entries, which describes virtual->physical page mappings in PageTable). Calltree:
mprotect
->
mprotect_fixup
->
change_pte_range
http://lxr.free-electrons.com/source/mm/mprotect.c#L32
47 do {
48 oldpte = *pte;
49 if (pte_present(oldpte)) {
50 pte_t ptent;
51
52 ptent = ptep_modify_prot_start(mm, addr, pte);
53 ptent = pte_modify(ptent, newprot);
54
55 /*
56 * Avoid taking write faults for pages we know to be
57 * dirty.
58 */
59 if (dirty_accountable && pte_dirty(ptent))
60 ptent = pte_mkwrite(ptent);
61
62 ptep_modify_prot_commit(mm, addr, pte, ptent);
63 } else if (PAGE_MIGRATION && !pte_file(oldpte)) {
64 swp_entry_t entry = pte_to_swp_entry(oldpte);
65
66 if (is_write_migration_entry(entry)) {
67 /*
68 * A protection check is difficult so
69 * just be safe and disable write
70 */
71 make_migration_entry_read(&entry);
72 set_pte_at(mm, addr, pte,
73 swp_entry_to_pte(entry));
74 }
75 }
76 } while (pte++, addr += PAGE_SIZE, addr != end);
Note the increment: addr += PAGE_SIZE, addr != end);

Resources