How fast is mprotect

How fast is mprotect - c

My question is how fast is mprotect. What will be the difference between mprotecting say 1 MB of contiguous memory as compared to 1 GB of contiguous memory? Of course I can measure the time, but I want to know what goes under the hood.

A quick check on the source seems to indicate that it iterates over the process mappings in the selected region and change their flags. If you protect less than a whole mapping it'll split it into two or three.
So in short it's O(n) where n is the number of times you've called mmap.
You can see all the current maps in /proc/pid/maps

It is O(n) on page count in the region too, because it should change a access bits on all PTEs (Page translation Entries, which describes virtual->physical page mappings in PageTable). Calltree:
mprotect
->
mprotect_fixup
->
change_pte_range
http://lxr.free-electrons.com/source/mm/mprotect.c#L32
47 do {
48 oldpte = *pte;
49 if (pte_present(oldpte)) {
50 pte_t ptent;
51
52 ptent = ptep_modify_prot_start(mm, addr, pte);
53 ptent = pte_modify(ptent, newprot);
54
55 /*
56 * Avoid taking write faults for pages we know to be
57 * dirty.
58 */
59 if (dirty_accountable && pte_dirty(ptent))
60 ptent = pte_mkwrite(ptent);
61
62 ptep_modify_prot_commit(mm, addr, pte, ptent);
63 } else if (PAGE_MIGRATION && !pte_file(oldpte)) {
64 swp_entry_t entry = pte_to_swp_entry(oldpte);
65
66 if (is_write_migration_entry(entry)) {
67 /*
68 * A protection check is difficult so
69 * just be safe and disable write
70 */
71 make_migration_entry_read(&entry);
72 set_pte_at(mm, addr, pte,
73 swp_entry_to_pte(entry));
74 }
75 }
76 } while (pte++, addr += PAGE_SIZE, addr != end);
Note the increment: addr += PAGE_SIZE, addr != end);

Related

How to convert a linking list to an dynamic array

I created an function in c as a linked list but i can't understand how convert it to a linked list.
My Question is how to convert an linked list to an dynamic array. This function adds elements to an linked list i want to be able to add elements to an dynamic array. i don't know if the struct blockhead_node is right like i said i'm still learning c. I do not understand dynamic arrays so i'm trying to create an program that uses them so i may understand them better. i want to create an add function that adds elements to the front or end of the list. This is what i have:
//this struct i'm trying to use for dynamic array
struct blockhead_node
{
float x, y;
float dx, dy;
long color;
int size; // slots used so far
int capacity; // total available slots
int *data; // array of integers we're storing
};
//this struct is for the linked list
struct blockhead_node
{
float x,y;
float dx, dy;
long color;
int size;
struct blockhead_node * next;
};
void add(struct blockhead_node ** blockhead_list) // double pointer because we can't modify the list it self
{
while((*blockhead_list)!=NULL)
{
blockhead_list=&(*blockhead_list)->next;
}
(*blockhead_list) = (struct blockhead_node*) malloc(sizeof(struct blockhead_node));
(*blockhead_list)->x = rand()%screen_width + (*blockhead_list)->size;
//(*blockhead_list)->x = 400;
//
//look up how to create an random floating point number
//
(*blockhead_list)->dx = ((float)rand()/(float)(10000));
(*blockhead_list)->y = rand()%screen_height + (*blockhead_list)->size;
(*blockhead_list)->dy = ((float)rand()/(float)(10000));
(*blockhead_list)->size = rand()%100;
(*blockhead_list)->next = NULL;
if((*blockhead_list)->x + (*blockhead_list)->size > screen_width)
{
(*blockhead_list)->x = screen_width - (*blockhead_list)->size;
}
if((*blockhead_list)->y + (*blockhead_list)->size > screen_height)
{
(*blockhead_list)->y = screen_height - (*blockhead_list)->size;
}
}

Using a dynamic array is not much different than using a linked-list. The primary difference is you are responsible for keeping track of size and realloc your data array when size == capacity (or when capacity == 0 for a new stuct).
Other than that you are either allocating per-node with a linked-list, whereas with a dynamic array you can allocate in a more efficient manner by allocating by some multiple of the current capacity when size == capacity. For arrays of completely unknown size you can either start with capacity = 2 or 8 and then either increase by some multiple (e.g. 3/2 or 2, etc..) or some fixed amount (8, 16, etc..) each time a reallocation is needed. I generally just double the current capacity, e.g. starting a 2, reallocating storage in data for 4, 8, 16, 32, ... integers as more numbers are added to data. You can do it any way you like -- as long as you keep track of your size and capacity.
A short example may help. Let's start with just a simple struct where data is the only dynamic member we are concerned about. For example:
#define CAP 2 /* initial capacity for new struct */
typedef struct {
size_t size,
capacity;
int *data;
} blkhd_node;
Where size is your "slots used so far" and capacity is your "total available slots" and data your "array of integers we're storing".
The add function is fairly trivial. All you need to do is check whether you need to realloc (e.g. this is a struct not used yet where capacity == 0 or all the slots are used and size == capacity). Since realloc can be used for both new and resizing the allocation, it is all you need. The scheme is simple, if this is a new struct, we will allocate for CAP number of int, otherwise if size == capacity we will allocated 2 * capacity number of int.
Whenever we realloc we do so with a temporary pointer! Why? When realloc fails it returns NULL. If you realloc with the original pointer (e.g. data = realloc (data, newsize); and NULL is returned, you overwrite your original pointer address for data with NULL and your ability to reach (or free) the original block of memory is lost -- creating a memory leak. By using a temporary pointer, we can validate whether realloc succeeds before we assign the new address to data. Importantly, if realloc does fail -- then our existing integers pointed to by data are still fine, so we are free to use our original allocation in the event of realloc failure. Important to remember.
We also need a way to indicate success/failure of or add function. Here, you are not adding a new node, so returning a pointer to a new node isn't an option. In this case a simple int return of 0 for failure of 1 for success is just as good as anything else.
With that, an add function could be as simple as:
int add (blkhd_node *b, int v)
{
/* realloc if (1) new struct or (2) size == capacity */
if (!b->capacity || b->size == b->capacity) {
size_t newsize;
if (!b->capacity) /* new stuct, set size = CAP */
newsize = CAP;
else if (b->size == b->capacity) /* otherwise double size */
newsize = b->capacity * 2;
/* alway realloc with a temporary pointer */
void *tmp = realloc (b->data, newsize * sizeof *b->data);
if (!tmp) { /* validate reallocation */
perror ("realloc_b->data");
return 0; /* return failure */
}
b->data = tmp; /* assign new block of mem to data */
b->capacity = newsize; /* set capacity to newsize */
}
b->data[b->size++] = v; /* set data value to v */
return 1; /* return success */
}
Note: since you are initializing your struct and both size and capacity will be 0, you can dispense with separate checks for capacity == 0 and size == capacity and use a ternary to set newsize. This is arguably the less-readable way to go and a good compiler will optimize both identically, but for sake of completeness, you could replace the code setting newsize with:
/* realloc if (1) new struct or (2) size == capacity */
if (b->size == b->capacity) {
/* set newsize */
size_t newsize = b->capacity ? b->capacity * 2 : CAP;
Given the option, your choice should always be for the more readable and maintainable code. You do your job, and let the compiler worry about the rest.
(you can do a dynamic array of struct blockhead_node in exactly the same way -- just account for the number of structs you have in the array in addition to the size, capacity for data in each -- that is left to you)
So, let's see if our scheme of originally allocating for 2-int will allow us to add 100-int to our data array with a short example putting it altogether:
#include <stdio.h>
#include <stdlib.h>
#define CAP 2 /* initial capacity for new struct */
typedef struct {
size_t size,
capacity;
int *data;
} blkhd_node;
int add (blkhd_node *b, int v)
{
/* realloc if (1) new struct or (2) size == capacity */
if (!b->capacity || b->size == b->capacity) {
size_t newsize;
if (!b->capacity) /* new stuct, set size = CAP */
newsize = CAP;
else if (b->size == b->capacity) /* otherwise double size */
newsize = b->capacity * 2;
/* alway realloc with a temporary pointer */
void *tmp = realloc (b->data, newsize * sizeof *b->data);
if (!tmp) { /* validate reallocation */
perror ("realloc_b->data");
return 0; /* return failure */
}
b->data = tmp; /* assign new block of mem to data */
b->capacity = newsize; /* set capacity to newsize */
}
b->data[b->size++] = v; /* set data value to v */
return 1; /* return success */
}
void prndata (blkhd_node *b)
{
for (size_t i = 0; i < b->size; i++) {
if (i && i % 10 == 0)
putchar ('\n');
printf (" %3d", b->data[i]);
}
putchar ('\n');
}
int main (void) {
blkhd_node b = { .size = 0 }; /* declare/initialize struct */
for (int i = 0; i < 100; i++) /* add 100 data values */
if (!add (&b, i + 1))
break; /* don't exit, all prior data still good */
/* output results */
printf ("\nsize : %zu\ncapacity : %zu\n\n", b.size, b.capacity);
prndata (&b);
free (b.data); /* don't forget to free what you allocate */
return 0;
}
Example Use/Output
$ ./bin/dynarrayint
size : 100
capacity : 128
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/dynarrayint
==12589== Memcheck, a memory error detector
==12589== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==12589== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==12589== Command: ./bin/dynarrayint
==12589==
size : 100
capacity : 128
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
==12589==
==12589== HEAP SUMMARY:
==12589== in use at exit: 0 bytes in 0 blocks
==12589== total heap usage: 7 allocs, 7 frees, 1,016 bytes allocated
==12589==
==12589== All heap blocks were freed -- no leaks are possible
==12589==
==12589== For counts of detected and suppressed errors, rerun with: -v
==12589== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.

how to get a clock from a device tree node

I have the following issue: I want to define the clock a CPU should use during frequency transitions in the device tree rather than in the clock driver code (in this way it will be more generic). I want to define the "transition-clock" property in the device tree, something like:
232 cpu: cpu#01c20050 {
233 #clock-cells = <0>;
234 compatible = "allwinner,sun4i-a10-cpu-clk";
235 reg = <0x01c20050 0x4>;
-----
243 clocks = <&osc32k>, <&osc24M>, <&pll1>, <&pll1>;
244 transition-clock = <&osc24M>;
245 clock-output-names = "cpu";
246 };
I changed the file "drivers/clk/clkdev.c" to search this property so I can get a clk pointer and store it in a new property in the cpu clock structure. This is what I have managed so far (changes begin on line 78):
59 static struct clk *__of_clk_get(struct device_node *np, int index,
60 const char *dev_id, const char *con_id)
61 {
62 struct of_phandle_args clkspec;
63 struct clk *clk, *transition_clk;
64 struct device_node *clock_node, *transition_clock_node;
65 int rc;
66
67 if (index < 0)
68 return ERR_PTR(-EINVAL);
69
70 rc = of_parse_phandle_with_args(np, "clocks", "#clock-cells", index,
71 &clkspec);
72 if (rc)
73 return ERR_PTR(rc);
74
75 clk = __of_clk_get_by_clkspec(&clkspec, dev_id, con_id);
76 of_node_put(clkspec.np);
77
78 clock_node = of_parse_phandle(np, "clocks", 0);
79 pr_err("-------------------- parsing node %s\n", np->name);
80 if (clock_node!=NULL) {
81 pr_err("============ Clock node found %p\n", clock_node);
82 transition_clock_node =
83 of_parse_phandle(clock_node, "transition-clock", 0);
84 if (transition_clock_node!=NULL) {
85 pr_err("============ Transition clock node found %p\n",
86 transition_clock_node);
87 transition_clk = clk_get(clock_node, 0);
88 pr_err("============ Transition clock %p\n", transition_clk);
89 }
90 }
91
92 return clk;
93 }
I get pointers to the clock node and the transition clock node, but when I try to get a clock pointer from that, I get 0xfffffffe, which looks like an error:
[ 2.540542] -------------------- parsing node cpu
[ 2.540546] ============ Clock node found eeef9520
[ 2.540550] ============ Transition clock node found eeef8934
[ 2.540555] ============ Transition clock fffffffe
I want a clock object which I can later use in
clk_set_parent(cpu_clk, cpu_clk->transition_clk)
Any ideas are welcome :)

OK, I got it figured out in the meantime and since nobody answered I'm going to provide my solution (maybe somebody else bumps into this issue):
81 if (clock_node != NULL) {
82 rc = of_parse_phandle_with_args(clock_node, "transition-clock", NULL,
83 0, &transition_clkspec);
84 of_node_put(transition_clkspec.np);
85
86 if (!rc) {
87 transition_clock = __of_clk_get_by_clkspec(&transition_clkspec,
88 dev_id, con_id);
89 clk_set_transition_parent(clk, transition_clock);
90 }
91 }
So, the solution is to get an object of type "of_phandle_args" and get the clock from there using __of_clk_get_by_clkspec.
(the clk_set_transition_parent function is defined somewhere else and it does exactly what it's name suggests)

bufbomb stack overflow failed

I'm using bufbomb.c to do some buffer overflow attack experimenting.
I successfully used gdb to debug the code. Howeverer; when I run the program directly, I get a "Segmentation fault (core dumped)" when I enter the characters to try the attack.
I used gcc (Ubuntu/Linaro 4.8.1-10ubuntu9) 4.8.1. to build the following.
//bufbomb.c
/* Bomb program that is solved using a buffer overflow attack */
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
/* Like gets, except that characters are typed as pairs of hex digits.
Nondigit characters are ignored. Stops when encounters newline */
char *getxs(char *dest)
{
int c;
int even =1; /* Have read even number of digits */
int otherd =0; /* Other hex digit of pair */
char*sp = dest;
while ((c = getchar()) != EOF && c !='\n') {
if (isxdigit(c)) {
int val;
if ('0'<= c && c <='9')
val = c -'0';
else if ('A'<= c && c <='F')
val = c -'A'+10;
else
val = c -'a'+10;
if (even) {
otherd = val;
even =0;
}
else {
*sp++= otherd *16+ val;
even =1;
}
}
}
*sp++='\0';
return dest;
}
/* $begin getbuf-c */
int getbuf()
{
char buf[12];
getxs(buf);
return 1;
}
void test()
{
int val;
printf("Type Hex string:");
val = getbuf();
printf("getbuf returned 0x%x\n", val);
}
/* $end getbuf-c */
int main()
{
int buf[16];
/* This little hack is an attempt to get the stack to be in a
stable position
*/
int offset = (((int) buf) &0xFFF);
int*space = (int*) alloca(offset);
*space =0; /* So that don't get complaint of unused variable */
test();
return 0;
}
Then I executed it under gdb:
...> gdb ./bugbomb
...
..run
Type Hex string:30 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37 38 39 d8 bf ff ff 9f 85 04 08 b0 86 04 08 30 31 32 33 30 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37 38 39 ef be ad de
getbuf returned 0xdeadbeef
[Inferior 1 (process 13530) exited normally]
And then without gdb::
./bufbomb
Type Hex string:30 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37 38 39 d8 bf ff ff 9f 85 04 08 b0 86 04 08 30 31 32 33 30 31 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37 38 39 ef be ad de
Segmentation fault (core dumped)
I am looking for some help to resolve the seg-fault.

Run it under gdb with a bigger buffer to see which address it's trying to access to guess the stack offset of the return address used by getbuf().
To bear with small differences in memory offsets that arise from the use of gdb, use a NOP-sled. Your attack buffer should look like this:
|RET ADDRESS x 30 | NOPS (0x90) x 1000 | SHELLCODE|.
The return address should point to the middle of the NOP-sled.
If the execution jumps anywhere in the sled, it will slide to the shellcode.

You are accessing memory that your process doesn't "own".
When you run gdb, the compiler adds stuff (like extra debug info).
You can bypass the segmentation fault by expanding the stack before you attempt the buffer overflow:
int expand_stack(int n_bytes)
{
char buf[n_bytes];
return buf[n_bytes-1]; // access the memory to make sure the optimiser doesn't remove buf.
}
And in main, add a call to expand_stack before you call test:
int main()
{
int buf[16];
/* This little hack is an attempt to get the stack to be in a
stable position
*/
int offset = (((int) buf) &0xFFF);
int*space = (int*) alloca(offset);
*space = expand_stack(200);
test();
return 0;
}
Note that your code still invokes undefined behaviour.
Note2: If your compiler doesn't support variable length arrays, just use a fixed array, buf[200].

Hoard performance degrades severely when allocating large size chunks

I have written below sample program in 'C' which is dynamic memory intensive and tried to benchmark the same (in terms of time taken) for the 'glibc' default allocator versus Hoard allocator.
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 #define NUM_OF_BLOCKS (1 * 4096)
5
6 void *allocated_mem_ptr_arr[NUM_OF_BLOCKS];
7
8 int
9 main (int argc, char *argv[])
10 {
11 void *myblock = NULL;
12
13 int count, iter;
14
15 int blk_sz;
16
17 if (argc != 2)
18 {
19 fprintf (stderr, "Usage:./memory_intensive <Block size (KB)>\n\n");
20 exit (-1);
21 }
22
23 blk_sz = atoi (argv[1]);
24
25 for (iter = 0; iter < 1024; iter++)
26 {
27 /*
28 * The allocated memory is not accessed (read/write) hence the residual memory
29 * size remains low since no corresponding physical pages are being allocated
30 */
31 printf ("\nCurrently at iteration %d\n", iter);
32 fflush (NULL);
33
34 for (count = 0; count < NUM_OF_BLOCKS; count++)
35 {
36 myblock = (void *) malloc (blk_sz * 1024);
37 if (!myblock)
38 {
39 printf ("malloc() fails\n");
40 sleep (30);
41 return;
42 }
43
44 allocated_mem_ptr_arr[count] = myblock;
45 }
46
47 for (count = 0; count < NUM_OF_BLOCKS; count++)
48 {
49 free (allocated_mem_ptr_arr[count]);
50 }
51 }
52 }
As a result of this benchmark activity, I got below results (Block Size, Time elapsed for default allocator, Time elapsed for Hoard):
'1K' '4.380s' '0.927s'
'2k' '8.390s' '0.960s'
'4k' '16.757s' '1.078s'
'8k' '16.619s' '1.154s'
'16k' '17.028s' '13m 6.463s'
'32k' '17.755s' '5m 45.039s'
As can be seen, Hoard performance severely degrades with block size >= 16K. What is the reason? Can we say that Hoard is not meant for applications allocating large size chunks?

There are some nice benchmarks and explanations at http://locklessinc.com/benchmarks_allocator.shtml:
For small allocaitons, it still performs similarly to tcmalloc. However, beyond about 64KiB, it drops drastically in performance. It uses a central "hoard" to redistribute memory between threads. This presents a bottle-neck as only one thread at a time can be using it. As the number of threads increases, the problem gets worse and worse.

extract coap query

I need your help in extracting the query value in a coap message.The coap message looks like
coap://[ff08:90:5001:0:0:0:0:1]:12345/c?a=4
decoded packet is 52 02 00 00 91 63 63 61 3d 34 . Here 63 61 3d 34 is the query part ?a=4 . There is a data after query. I have pointed my buffer pointer to 63(?), now I'm struck in getting the value 34(4). How do I go to the value and extract it?
coap_h *hdr = (coap_h *)(buf);
buf = (uint8_t *)(hdr + 1);
len = buf[0] & 0xf;
buf += len + 1;
buf points to 52 initially and then I move the buf to the options field 91 and check for length then increment the buf which points to 63 (?). hope i'm clear this time.

I don't have the time at the moment to parse your packet by hand, but you should know that the way options work has changed dramatically in CoAP-12. I have implemented some functions to encode and parse options, which you might find useful:
https://github.com/darconeous/smcp/blob/master/src/smcp/coap.c
https://github.com/darconeous/smcp/blob/master/src/smcp/coap.h