While learning C, I have been paying more attention to memory allocation, and at my university we were given this code, which builds a binary tree.
Each iteration of the for loop creates a pointer, and only afterwards do we assign the addresses of each subtree and its children to the nodes. But that way we have created over 10 pointers, right?
Could we just have one pointer and keep updating it? Like the recursive implementation, where the main pointer only gets updated and then returned.
typedef struct tr {
    int data;
    struct tr *left, *right;
} btree, *btreeptr;
btree *createtree(int data) {
    btree *newtree;
    newtree = (btree *)malloc(sizeof(btree));
    newtree->left = NULL;
    newtree->right = NULL;
    newtree->data = data; // same as (*newtree).data = data
    return newtree;
}
int main() {
    btree *test[10];
    btreeptr baum;
    int i;
    for (i = 0; i < 10; i++) {
        test[i] = createtree(i);
    }
    baum = test[0];
    test[0]->left = test[1];
    test[0]->right = test[2];
    test[1]->left = test[3];
    test[1]->right = test[4];
    test[2]->right = test[5];
    test[4]->left = test[6];
    test[5]->left = test[7];
    test[5]->right = test[8];
    test[7]->left = test[9];
    printf("Size: %d\n", size(baum));
    printf("Leaves: %d\n", numberOfLeaves(baum));
    printf("Height: %d\n", height(baum));
    return 0;
}
If my understanding of memory is wrong, I would appreciate being corrected.
It probably would have been better to ask your teacher why they chose to do that; I can only guess. My guess is that they want you to write the functions called by the three printf statements at the end, none of which depend on the values stored in the tree nodes. I'm pretty sure that later on you will be writing functions to insert and delete nodes in the tree, at which point the temporary vector of allocations will disappear. For now, however, there must be some mechanism for creating the tree edges for at least one test tree. And how are you going to verify that your implementations of size, height and numberOfLeaves work if you don't have some tree to try them with?
Your instructor is probably trying to teach you something quite important: when you are writing a system, you should test everything, as soon as possible (before you write hundreds of lines of code using untested functions), and as far as possible without dependencies on other untested functions. "Mocking" a test tree is one technique for testing without depending on the tree insert function (which hasn't yet been written).
Eventually, the data fields will be important. For now, storing the node number as the value of the data field might help you during debugging, since it will let you see which node is being processed at any point in time. Learning to debug is also a very important skill. So you might eventually decide to thank whoever designed that course material.
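For reference, here is a minimal sketch of what the three functions called by the printf statements might look like. The names size, numberOfLeaves and height come from the question; the signatures, and the choice to count height in nodes (empty tree = 0), are assumptions, since the course material is not shown.

```c
#include <stddef.h>

/* Same node layout as in the question. */
typedef struct tr {
    int data;
    struct tr *left, *right;
} btree;

/* Number of nodes in the tree rooted at t (0 for an empty tree). */
int size(const btree *t) {
    if (t == NULL) return 0;
    return 1 + size(t->left) + size(t->right);
}

/* Number of nodes that have no children at all. */
int numberOfLeaves(const btree *t) {
    if (t == NULL) return 0;
    if (t->left == NULL && t->right == NULL) return 1;
    return numberOfLeaves(t->left) + numberOfLeaves(t->right);
}

/* Height counted in nodes (assumption): empty tree = 0, single node = 1. */
int height(const btree *t) {
    if (t == NULL) return 0;
    int l = height(t->left);
    int r = height(t->right);
    return 1 + (l > r ? l : r);
}
```

None of these functions allocates anything; each one only follows the left and right pointers that the test setup wired together, which is why a hand-built test tree is enough to exercise them.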
Title: Is that usage of tree a waste of memory?
It's good to question what comes your way.
If the purpose of the example was to introduce the data structure known as a binary tree (with arbitrary connections), then the example serves its purpose. Is such a binary tree an economic solution given the values and connections presented? Perhaps not...
For your interest:
#include <stdio.h>
struct tr {
    int data;
    struct tr *left, *right;
};
int main( void ) {
    /* one byte per node: high nibble = left child, low nibble = right child, 0 = no child */
    unsigned char t[ 10 ] = { 0x12, 0x34, 0x05, 0x00, 0x60, 0x78, 0x00, 0x90 };
    for( size_t i = 0; i < sizeof t; i++ )
        printf( "node %zu: left %d right %d\n", i, t[i]>>4, t[i]&0xF );
    printf( "size of t[] = %zu\n", sizeof t );
    printf( "size of 10x struct tr = %zu\n", 10 * sizeof( struct tr ) );
    return 0;
}
node 0: left 1 right 2
node 1: left 3 right 4
node 2: left 0 right 5
node 3: left 0 right 0
node 4: left 6 right 0
node 5: left 7 right 8
node 6: left 0 right 0
node 7: left 9 right 0
node 8: left 0 right 0
node 9: left 0 right 0
size of t[] = 10
size of 10x struct tr = 120
(And, my ancient compiler uses 4 byte pointers, not 8 byte pointers.)
The "payload" in the example is limited to binary values 0 through 9, so one 'nybble' (4 bits) fulfills the shown requirements.
Some may argue that the initialisation shown above requires too much effort.
Below is initialisation that corresponds to the OP version.
unsigned char t[ 10 ] = { 0 };
t[0] = 1 << 4 | 2;
t[1] = 3 << 4 | 4;
t[2] = 0 << 4 | 5;
t[4] = 6 << 4 | 0;
t[5] = 7 << 4 | 8;
t[7] = 9 << 4 | 0;
Perhaps someone will re-write these statements using a simple macro.
Is it a waste of memory? Well, in a sense. The code lays out a bunch of stuff into a binary tree of pointers but doesn't do anything with them. This code could eventually be utilized for something, which is probably why it's being shown to you.
Another interesting thing is that the numbering in this code is reminiscent of a heap, a kind of binary tree that can be stored in a contiguous array, in which the children of the node at index i are simply the entries at i*2+1 and i*2+2. That style of binary tree requires no pointers, because all the structural relationships are implied by the nodes' positions. The tree here does not follow that layout exactly (node 4's left child is node 6, where the array layout would put node 9), and a heap also imposes an ordering on its values, so this code would not represent a heap if, say, the values in the initial array were random instead of sequential.
Likewise most binary trees cannot be represented as a contiguous array. For those, this may indeed be the most efficient representation. It also has the benefit of being really easy to handle recursively.
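To make the pointer-free layout concrete, here is a small sketch of the index arithmetic for a complete binary tree stored in an array with the root at index 0. The function names are made up for illustration, and the sketch covers only the structural part; it says nothing about the ordering property a real heap would also need.

```c
#include <stddef.h>

/* Index arithmetic for a complete binary tree stored in an array, root at 0. */
size_t left_child(size_t i)  { return 2 * i + 1; }
size_t right_child(size_t i) { return 2 * i + 2; }
size_t parent(size_t i)      { return (i - 1) / 2; }  /* only meaningful for i > 0 */

/* Count the nodes of the subtree rooted at index i in an array of n nodes.
 * No pointers are stored anywhere: a child exists exactly when its index is < n. */
size_t subtree_size(size_t i, size_t n) {
    if (i >= n) return 0;
    return 1 + subtree_size(left_child(i), n) + subtree_size(right_child(i), n);
}
```

With 10 nodes, the subtree under index 1 consists of indices 1, 3, 4, 7, 8 and 9, so subtree_size(1, 10) is 6.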
Related
I'm trying to create a type set, that holds like so
typedef struct set
{
    char *blocks;
    char *numbers;
    char blockNum;
} set;
then in order to create it I do
void createSet(set *newSet)
{
    newSet->blockNum = 1;
    newSet->blocks = calloc(newSet->blockNum, BYTE_SIZE);
}
Initially the set can receive numbers from [0,127], with no duplicates.
To store a number n, I want to set the nth bit to 1. Then, when I want to read the set, I look at the memory that set.blocks occupies and check each bit in it: if a bit is equal to 1, the number equal to the position of that bit is in the set.
here's the code for adding an item
void addItem(set *set, unsigned int item)
{
    int blocksToAdd = 0;
    int i = 0;
    int r = (item / BYTE_SIZE);
    int rem=item % BYTE_SIZE;
    if (r > set->blockNum)
    {
        if(r!=0) {
            blocksToAdd = r - (set->blockNum);
        }
        set->blocks = realloc(set->blocks, (set->blockNum + blocksToAdd)*BYTE_SIZE);
    }
    *(set->blocks+r) = (1 << rem);
}
I also grow blocks by adding as many bytes as are needed to reach the nth bit.
So far it kind of works, but when the number to add is 7, I get *blocks == -128 when it is supposed to be 128. I guess this happens because *blocks is then 1000 0000 and char is signed.
So I tried working around it, but I couldn't get it to work, and I'm pretty sure I'm doing something wrong.
The goal is to store numbers from [0,127] with no duplicates, in such a way that each element (number) takes 1 bit of memory.
int num =*(set->blocks+r);
if(num<0) {
    set->blocks = (char *)realloc(set->blocks, (set->blockNum + 1)*BYTE_SIZE);
    num*=-1;
    *(set->blocks+r) = num;
}
I tried doing this to get around it, but it didn't work.
Edit
I managed to fix the issue by changing blocks from char* to unsigned char*. I also allocate 16 bytes from the start, since calling realloc to increase the size every time isn't efficient; instead, I check whether I can shrink the memory allocated for the set after adding all the items.
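A minimal sketch of the fixed-size variant described in the edit, assuming 8-bit bytes and a made-up type name bitset128. It uses unsigned char storage and |= rather than =, so that adding a number does not clear other bits already set in the same byte (the plain assignment in addItem above would do that).

```c
#include <limits.h>
#include <string.h>

#define SET_MAX   128                     /* values 0..127, as in the question */
#define SET_BYTES (SET_MAX / CHAR_BIT)    /* 16 bytes when CHAR_BIT is 8 (assumed) */

typedef struct {
    unsigned char blocks[SET_BYTES];      /* unsigned, so bit 7 reads as 128, not -128 */
} bitset128;                              /* hypothetical name */

void bitset_clear(bitset128 *s) { memset(s->blocks, 0, sizeof s->blocks); }

/* Returns 0 on success, -1 if item is out of range. */
int bitset_add(bitset128 *s, unsigned item) {
    if (item >= SET_MAX) return -1;
    s->blocks[item / CHAR_BIT] |= (unsigned char)(1u << (item % CHAR_BIT));
    return 0;
}

/* Returns 1 if item is in the set, 0 otherwise. */
int bitset_contains(const bitset128 *s, unsigned item) {
    if (item >= SET_MAX) return 0;
    return (s->blocks[item / CHAR_BIT] >> (item % CHAR_BIT)) & 1u;
}
```

Because the storage is a fixed array inside the struct, there is no realloc and nothing to free; the cost is always paying for all 16 bytes.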
I'm trying to write a function that when given an array and a value, it checks if the value is in that array. If it is there then keep finding a new unique random value before adding it to the array. This is what I have done so far but I think the problem is my lack of understanding of pointers. Here is what I have so far:
#include <stdio.h>
#include <stdlib.h>
int getNewIndex(int index, int *visitedPixels, int *visitedPixelsIndex);
int main() {
    int *visitedPixels = malloc(2 * sizeof(int));
    int *visitedPixelsIndex = 0;
    srand(1);
    int randIndex = rand() % 16, i;
    printf("Initial randIndex = %d\n", randIndex);
    for(i = 0; i < 16; i++) {
        randIndex = getNewIndex(randIndex, visitedPixels, visitedPixelsIndex);
        printf("randIndex[%d] = %d\n", i, visitedPixels[i]);
    }
    return 0;
}
int getNewIndex(int index, int *visitedPixels, int *visitedPixelsIndex) {
    int i = 0;
    while (i < *visitedPixelsIndex) {
        (index == visitedPixels[i]) ? index = rand() % 16, i = 0 : i++;
    }
    visitedPixels[*visitedPixelsIndex] = index;
    (*visitedPixelsIndex)++;
    //(*visitedPixels) = realloc(visitedPixels, (*visitedPixelsIndex+1) * sizeof(int));
    return index;
}
Any help would be appreciated.
Okay, so. I'm going to try to explain with a metaphor. Hopefully it helps rather than confusing more.
Imagine memory is a long board you can write numbers on. It takes an inch of board to write a small number. Bigger numbers can be represented by writing across more slots.
An array, in our metaphor, is just a contiguous length of board you can write stuff into. If you want an array of 5 integers, and each integer takes 4 inches, you'll need 20 inches of board for it. If you wanted to pass all these integers to a function, instead of copying them all across, you would instead write down how many inches from the end of the board your array is. That's what a pointer is. It's a number telling where something is.
When you called malloc( 2 * sizeof( int ) ), you requested a segment of the board big enough for two integers, and you received the number of inches from the end of the board at which that new segment starts. So we've got 8 inches of board X inches from the end, with X being our pointer.
Incrementing a pointer says "increase this value to point at the next element of the underlying array". An int* will increase by sizeof(int) (4 in this metaphor), and a pointer to a structure by the size of the structure, including any padding the compiler has added for alignment.
It does not increase the amount of storage.
If I have a pointer to 8 inches of board, write a 4-inch number, increment the pointer to point 4 inches further in, write another 4-inch number and increment again, my pointer is now right after the last element of the array. If I write here, all bets are off. What was on the board after the array? Who knows. It could be anything. Maybe it was a different array. Maybe it was information for keeping track of what parts of the board have been handed out to the program. Maybe it was the end of my board and I'll write off the end. Writing to memory that the operating system has not given your program permission to use is where "segmentation violation" (SIGSEGV) program failures come from.
You need to request more space up front, or request bigger arrays as you need them; realloc can do the latter. For all of these calls, you have to check whether the call failed and terminate or otherwise recover appropriately.
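As a rough illustration of "bigger arrays as you need them", here is a sketch of a growable int array. The type int_vec and its functions are made up for this example; only realloc and free are standard library calls. The result of realloc goes into a temporary first, so the old block is not lost if the call fails.

```c
#include <stdlib.h>

typedef struct {
    int *items;      /* heap block holding the elements (NULL when empty) */
    size_t len;      /* number of elements in use */
    size_t cap;      /* number of elements the block has room for */
} int_vec;           /* hypothetical name */

/* Append value, doubling the capacity when the block is full.
 * Returns 0 on success, -1 if the allocation failed (the vector is unchanged). */
int int_vec_push(int_vec *v, int value) {
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 2;
        int *tmp = realloc(v->items, new_cap * sizeof *tmp);
        if (tmp == NULL) return -1;       /* v->items is still valid here */
        v->items = tmp;
        v->cap = new_cap;
    }
    v->items[v->len++] = value;
    return 0;
}

/* Release the storage and reset the vector to the empty state. */
void int_vec_free(int_vec *v) {
    free(v->items);
    v->items = NULL;
    v->len = v->cap = 0;
}
```

A vector is started with `int_vec v = {0};`, since realloc on a NULL pointer behaves like malloc. In the question's terms, `len` plays the role of visitedPixelsIndex, but it is a plain integer stored next to the array rather than a pointer.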
Hopefully this is more helpful than confusing. Good luck :)
I'm trying to code a calculator in C, and I want it to be able to evaluate multiple inputs at once, e.g. (5*9 + 1 -2). These inputs can be completely arbitrary, and I'm stuck on how to do this.
I know how to initialize a variable, ask the user to input a number and all that, but if the user wanted to add up 50 random numbers, that calculator wouldn't be able to do it.
Hope you can help, or share some tips
Thanks !
You will need to implement an expression parser that takes operator precedence into account. To me the two simplest ways to do this would be to either implement a recursive descent parser or to implement the shunting-yard algorithm.
See this for an example.
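As a rough illustration of the recursive descent option, here is a minimal sketch for +, -, *, / and parentheses. The grammar and all function names are my own assumptions for this example, and error reporting is left out entirely: malformed input simply stops being consumed.

```c
#include <ctype.h>
#include <stdlib.h>

/* Grammar (one function per rule, which is what gives precedence):
 *   expr   := term   (('+' | '-') term)*
 *   term   := factor (('*' | '/') factor)*
 *   factor := number | '(' expr ')' | '-' factor
 */
static const char *cur;   /* current read position; file-scope, so not reentrant */

static void skip_spaces(void) { while (isspace((unsigned char)*cur)) cur++; }

static double parse_expr(void);

static double parse_factor(void) {
    skip_spaces();
    if (*cur == '(') {
        cur++;
        double v = parse_expr();
        skip_spaces();
        if (*cur == ')') cur++;          /* a missing ')' is silently accepted */
        return v;
    }
    if (*cur == '-') { cur++; return -parse_factor(); }
    char *end;
    double v = strtod(cur, &end);        /* reads one number, sets end past it */
    cur = end;
    return v;
}

static double parse_term(void) {
    double v = parse_factor();
    for (;;) {
        skip_spaces();
        if (*cur == '*')      { cur++; v *= parse_factor(); }
        else if (*cur == '/') { cur++; v /= parse_factor(); }
        else return v;
    }
}

static double parse_expr(void) {
    double v = parse_term();
    for (;;) {
        skip_spaces();
        if (*cur == '+')      { cur++; v += parse_term(); }
        else if (*cur == '-') { cur++; v -= parse_term(); }
        else return v;
    }
}

/* Evaluate a whole expression string such as "5*9 + 1 -2". */
double evaluate(const char *text) {
    cur = text;
    return parse_expr();
}
```

Because parse_expr only ever combines results of parse_term, multiplication and division bind tighter than addition and subtraction without any explicit priority numbers; evaluate("5*9 + 1 -2") yields 44.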
In order to do this, you need to read the entire line (it shouldn't be too hard),
then you need to parse it and store it into some data structures.
Here are two ways I know to store and use it :
The first one is easy to do and easy to use, but neither beautiful nor fast:
A doubly linked list in which each link contains an operator or a number, plus a priority if it's an operator (you can use an enum + union if you want something cleaner):
struct list {
    struct list *prev;
    struct list *next;
    char operator;
    int number;
    unsigned int priority;
};
You loop through your string and apply a simple algorithm for priority (pseudocode):
var priority = 0
var array = cut string // ["5", "+", "3", "*", "(", "6", "-", "2", ")"]
check string // verify the string is correct
for each element of array :
    if (element is number)
        store in list
    else if (element is operator)
        store in list
        if (element is '*' or '/' or '%')
            set its priority to priority + 1
        else
            set its priority to priority
    else if (element is open parenthesis)
        priority += 2
    else if (element is close parenthesis)
        priority -= 2
For example:
string:
5 + 3 * (6 - 2) - 1
priorities:
  0   1    2    0
Then, to do your calculations :
while list has more than one element:
    find operator with the highest priority // if there is more than one, take the first
    calculate this operator // with the element prev and next
    replace by result in list // operator, but also prev and next
An example, again, with 5 + 3 * (6 - 2) - 1 :
first iteration:
5 + 3 * 4 - 1
then
5 + 12 - 1
then
17 - 1
then
16
The other one (better, even though it is a little harder if you're not familiar with recursion): a binary tree using reverse Polish notation
(see here and here)
This one is more common, so I won't explain it.
You can use a string when you read and divide that string when you find {+,-,*,/}. And what you find between them are your numbers.
Try to add your code!
Good luck!
It's not a complete solution but hey it's something...
try this code:
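#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_ITEMS (5)
static char tempNumStr[50] = {0};
static double parsedDoubleArr[MAX_ITEMS] = {0};
static char parsedOperators[MAX_ITEMS] = {0};
int tempNumStrIterator=0;
int parsedDoubleArrIterator=0;
int parsedOperatorsIterator=0;
/* These helpers were not shown in the original post; the definitions are assumptions. */
static int IsDigit(char c) { return isdigit((unsigned char)c) || c == '.'; }
static int IsCalcOperator(char c) { return c != '\0' && strchr("+-*/", c) != NULL; }
static int IsBracket(char c) { return c == '(' || c == ')'; }
int main(void)
{
    char sourceStr[] = "(17.5 + 8)";
    int i = 0;
    while (sourceStr[i] != '\0') /* iterate over string till \0 */
    {
        if (IsDigit(sourceStr[i]))
        {
            /* collect the characters of one number, then convert it */
            tempNumStrIterator = 0;
            while (IsDigit(sourceStr[i]) && tempNumStrIterator < (int)sizeof tempNumStr - 1)
                tempNumStr[tempNumStrIterator++] = sourceStr[i++];
            tempNumStr[tempNumStrIterator] = '\0';
            if (parsedDoubleArrIterator < MAX_ITEMS)
                sscanf(tempNumStr, "%lf", &parsedDoubleArr[parsedDoubleArrIterator++]);
            continue; /* i already points past the number, so don't skip a character */
        }
        if (IsCalcOperator(sourceStr[i]))
        {
            if (parsedOperatorsIterator < MAX_ITEMS)
                parsedOperators[parsedOperatorsIterator++] = sourceStr[i];
        }
        else if (IsBracket(sourceStr[i]))
        {
            /* do something */
        }
        ++i;
    }
    /* do what you want with parsedDoubleArr and parsedOperators */
    return EXIT_SUCCESS;
}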
There are 2 very big series of elements, the second 100 times bigger than the first. For each element of the first series, there are 0 or more elements on the second series. This can be traversed and processed with 2 nested loops. But the unpredictability of the amount of matching elements for each member of the first array makes things very, very slow.
The actual processing of the 2nd series of elements involves logical and (&) and a population count.
I couldn't find good optimizations using C, but I am considering writing inline asm, doing rep* mov* or similar for each element of the first series and then batch processing the matching bytes of the second series, perhaps in buffers of 1MB or so. But the code would get quite messy.
Does anybody know of a better way? C preferred but x86 ASM OK too. Many thanks!
Sample/demo code with simplified problem, first series are "people" and second series are "events", for clarity's sake. (the original problem is actually 100m and 10,000m entries!)
#include <stdio.h>
#include <stdint.h>
#define PEOPLE 1000000 // 1m
struct Person {
    uint8_t age; // Filtering condition
    uint8_t cnt; // Number of events for this person in E
} P[PEOPLE]; // Each has 0 or more bytes with bit flags
#define EVENTS 100000000 // 100m
uint8_t P1[EVENTS]; // Property 1 flags
uint8_t P2[EVENTS]; // Property 2 flags
void init_arrays() {
    for (int i = 0; i < PEOPLE; i++) { // just some stuff
        P[i].age = i & 0x07;
        P[i].cnt = i % 220; // assert( sum < EVENTS );
    }
    for (int i = 0; i < EVENTS; i++) {
        P1[i] = i % 7; // just some stuff
        P2[i] = i % 9; // just some other stuff
    }
}
int main(int argc, char *argv[])
{
    uint64_t sum = 0, fcur = 0;
    int age_filter = 7; // just some
    init_arrays(); // Init P, P1, P2
    for (int64_t p = 0; p < PEOPLE ; p++)
        if (P[p].age < age_filter)
            for (int64_t e = 0; e < P[p].cnt ; e++, fcur++)
                sum += __builtin_popcount( P1[fcur] & P2[fcur] );
        else
            fcur += P[p].cnt; // skip this person's events
    printf("(dummy %llu %llu)\n", (unsigned long long)sum, (unsigned long long)fcur );
    return 0;
}
gcc -O5 -march=native -std=c99 test.c -o test
Since on average you get 100 items per person, you can speed things up by processing multiple bytes at a time. I re-arranged the code slightly in order to use pointers instead of indexes, and replaced one loop by two loops:
uint8_t *p1 = P1, *p2 = P2;
for (int64_t p = 0; p < PEOPLE ; p++) {
    if (P[p].age < age_filter) {
        int64_t e = P[p].cnt;
        for ( ; e >= 8 ; e -= 8) {
            sum += __builtin_popcountll( *((long long*)p1) & *((long long*)p2) );
            p1 += 8;
            p2 += 8;
        }
        for ( ; e ; e--) {
            sum += __builtin_popcount( *p1++ & *p2++ );
        }
    } else {
        p1 += P[p].cnt;
        p2 += P[p].cnt;
    }
}
In my testing this speeds up your code from 1.515s to 0.855s.
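One caveat about the 8-byte loads above: casting a uint8_t* to long long* and dereferencing it is not strictly portable C, since the pointer may be misaligned and the access falls foul of the aliasing rules. A hedged sketch of the same idea using memcpy for the wide loads follows; the function name and_popcount is made up, and __builtin_popcountll is the GCC/Clang builtin already used in this thread. Compilers commonly turn such fixed-size memcpy calls into plain loads, but I have not measured that for this case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum of popcount(a[i] & b[i]) over n bytes, processing 8 bytes per step. */
uint64_t and_popcount(const uint8_t *a, const uint8_t *b, size_t n) {
    uint64_t sum = 0;
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        uint64_t x, y;
        memcpy(&x, a + i, sizeof x);     /* well-defined even if a + i is unaligned */
        memcpy(&y, b + i, sizeof y);
        sum += (uint64_t)__builtin_popcountll(x & y);
    }
    for (; i < n; i++)                   /* leftover bytes, one at a time */
        sum += (uint64_t)__builtin_popcount(a[i] & b[i]);
    return sum;
}
```

In the loop over people, a call such as and_popcount(p1, p2, P[p].cnt) would replace the two inner loops, followed by advancing p1 and p2 by P[p].cnt.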
The answer by Neil doesn't require sorting by age, which, by the way, could be a good idea.
If the second loop has holes (please correct the original source code to support that idea), a common solution is to precompute cumulative sums: cumsum[n+1] = cumsum[n] + __builtin_popcount(P1[n] & P2[n]);
Then, for each person:
sum+=cumsum[fcur + P[p].cnt] - cumsum[fcur];
In any case, it seems that the computational burden is merely of order EVENTS, not EVENTS*PEOPLE. Some optimization can still take place by running the inner loop once for each run of consecutive people meeting the condition.
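The cumulative-sum idea can be sketched as follows. The function names are made up for illustration, and the sketch assumes the cumsum array (one entry more than the number of events, 8 bytes each here) fits in memory, which may not hold for the full-size problem.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill cumsum[0..n] so that cumsum[k] is the total popcount of
 * p1[i] & p2[i] over all i < k. cumsum must have room for n + 1 entries. */
void build_cumsum(const uint8_t *p1, const uint8_t *p2, size_t n, uint64_t *cumsum) {
    cumsum[0] = 0;
    for (size_t i = 0; i < n; i++)
        cumsum[i + 1] = cumsum[i] + (uint64_t)__builtin_popcount(p1[i] & p2[i]);
}

/* Total for the events in the half-open range [first, first + count):
 * one subtraction, regardless of how many events the person has. */
uint64_t range_sum(const uint64_t *cumsum, size_t first, size_t count) {
    return cumsum[first + count] - cumsum[first];
}
```

The table is only worth building if it is reused, for example when the same P1 and P2 are queried with several different age filters.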
If there are really at most 8 predicates, it could make sense to precalculate all the sums (_popcounts(predicate[0..255])) for each person into separate arrays C[256][PEOPLE]. That just about doubles the memory requirements (on disk?), but localizes the search from 10GB+10GB+...+10GB (8 predicates) to one stream of 200MB (assuming 16 bit entries).
Depending on the probability of p(P[i].age < condition && P[i].height < cond2), it may no longer make sense to calculate cumulative sums. Maybe, maybe not. More likely, just some SSE parallelism over 8 or 16 people at a time will do.
A completely new approach could be to use ROBDDs to encode the truth tables of each person / each event. If the event tables are not very random and do not consist of pathological functions, such as truth tables of bignum multiplication, then, first, one may achieve compression of the functions and, second, arithmetic operations on truth tables can be calculated in compressed form. Each subtree can be shared between users, and each arithmetic operation on two identical subtrees has to be calculated only once.
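I don't know if your sample code accurately reflects your problem, but if the events of the people who pass the filter are stored first and contiguously, it can be rewritten like this (in the sample as posted, skipped people also advance fcur, so the two versions are not equivalent in general):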
for (int64_t p = 0; p < PEOPLE ; p++)
    if (P[p].age < age_filter)
        fcur += P[p].cnt;
for (int64_t e = 0; e < fcur ; e++)
    sum += __builtin_popcount( P1[e] & P2[e] );
I don't know about gcc -O5 (it doesn't seem to be documented here); it seems to produce exactly the same code as gcc -O3 with my gcc 4.5.4 (though I only tested it on a relatively small code sample).
Depending on what you want to achieve, -O3 can be slower than -O2.
As for your problem, I'd suggest thinking more about your data structure than about the actual algorithm.
You should not focus on solving the problem through algorithm or code optimisation as long as your data isn't represented in a convenient manner.
If you want to quickly cut down a large data set based on a single criterion (age, in your example), I'd recommend using a variant of a sorted tree.
If your actual data(age,count etc.) is indeed 8-bit there is probably a lot of redundancy in calculations. In this case you can replace the processing by lookup tables - for each 8-bit value you'll have 256 possible outputs and instead of computation it might be possible to read the computed data from the table.
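As one possible reading of the lookup-table suggestion, here is a sketch that replaces the per-byte population count with a 256-entry table. The names popcount_table, init_popcount_table and popcount8 are made up, and whether this beats __builtin_popcount depends on the CPU and compiler flags; it may well be slower where a hardware popcount instruction is available.

```c
#include <stdint.h>

static uint8_t popcount_table[256];      /* hypothetical name; filled once at startup */

/* Build the table with the recurrence bits(v) = bits(v >> 1) + (v & 1). */
void init_popcount_table(void) {
    popcount_table[0] = 0;
    for (int v = 1; v < 256; v++)
        popcount_table[v] = (uint8_t)(popcount_table[v >> 1] + (v & 1));
}

/* Number of set bits in an 8-bit value, by table lookup instead of computation. */
unsigned popcount8(uint8_t v) {
    return popcount_table[v];
}
```

init_popcount_table() has to run before the first lookup; the inner loop would then use popcount8(P1[fcur] & P2[fcur]).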
To tackle the branch mispredictions (missing in other answers) the code could do something like:
#ifdef MISPREDICTIONS
if (cond)
    sum += value;
#else
mask = -(cond != 0);     // cond == 0: mask is 0, binary 00..; cond != 0: mask is -1, binary 11..
sum += (value & mask);   // adds value if mask is all ones, else adds 0
#endif
It's not completely free since there are data dependencies (think superscalar cpu). But it usually gets a 10x boost for mostly unpredictable conditions.
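A self-contained sketch of the branchless pattern, under the assumption that the condition is available as a 0/non-zero flag per element. The function name masked_sum and the flags array are made up for illustration; any speedup depends on how unpredictable the condition really is and on what the compiler already does with the branchy version.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum values[i] for every i whose flags[i] is non-zero, with no branch
 * on the flag inside the loop body. */
uint64_t masked_sum(const uint32_t *values, const uint8_t *flags, size_t n) {
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* (flags[i] != 0) is 0 or 1; subtracting it from 0 in unsigned
         * arithmetic gives 0x00000000 or 0xFFFFFFFF. */
        uint32_t mask = 0u - (uint32_t)(flags[i] != 0);
        sum += values[i] & mask;         /* adds values[i] or 0 */
    }
    return sum;
}
```

Doing the negation on an unsigned type avoids relying on how negative signed values are represented.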
I am trying to understand virtual memory paging. I have the following code snippet that represents the first step in the process. Here search_tbl is called from the main program for each logical address in order to check if the page table already has an entry that maps the provided logical address to a location in physical memory. vfn is the virtual frame number.
EDITED:
Does this implementation make any sense? Or am I going down the wrong road?
Any help/suggestion would be greatly appreciated. Thank you.
uint vfn_bits; // number of bits in the virtual frame number
static tbl_entry **tbl;
uint page_bits = log_2(pagesize);
vfn_bits = addr_space_bits - page_bits;
tbl = calloc(pow_2(vfn_bits), sizeof (tbl_entry*));
tbl_entry *search_tbl(uint vfn) {
    uint index = vfn;
    if (tbl[index] == NULL) {
        /* Initial miss */
        tbl[index] = create_tbl_entry(vfn);
    }
    return tbl[index];
}
tbl_entry *create_tbl_entry(uint vfn) {
    tbl_entry *te;
    te = (tbl_entry*) (malloc(sizeof (tbl_entry)));
    te->vfn = vfn;
    te->pfn = -1;
    te->valid = FALSE;
    te->modified = FALSE;
    te->reference = 0;
    return te;
}
The only real issue I can see with it is that search_tbl()'s return type is tbl_entry* but it is actually returning a tbl_entry. Thinking about it, that could be a major issue if the page table is really an array of pointers to page table entries. Also, if sizeof(tbl_entry) > sizeof(tbl_entry*), you are not allocating enough space for the table.
Another issue might be getbits(). It's normal practice to number the bits of an n-bit integer type with 0 as the least significant bit and n - 1 as the most significant bit. If that is the case for the getbits() API, you are calculating the index based on the wrong part of the address.
Edit
The above was true for the original version of the code in the question which has been edited out.
As for the getbits question in the comment, suppose the following is used (assuming 32-bit addresses):
uint32_t getbits(uint32_t x, unsigned int p, unsigned int n)
{
    return (x >> (p + 1-n)) & ~(~0u << n); /* ~0u: shifting a negative int is undefined; requires n < 32 */
}
That assumes that the most significant bit is the one with the highest number i.e. bit 31 is the highest bit. So, if you assume a page size of 4096 bytes, the frame number of an address can be obtained like this:
vfn = getbits(x, 31, 20); // 31 is the top bit, number of bits is 32 - log2(4096)
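For completeness, a sketch of the same extraction that also handles n == 32 and splits an address into frame number and offset. The 4096-byte page size and the helper names vfn_of and offset_of are assumptions for illustration; getbits keeps the bit numbering described above, with bit 0 as the least significant bit.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the n bits of x whose highest bit is bit p (bit 0 = least significant). */
uint32_t getbits(uint32_t x, unsigned p, unsigned n) {
    assert(p < 32 && n >= 1 && n <= p + 1);
    /* Build the mask without ever shifting by 32, which would be undefined. */
    uint32_t mask = (n == 32) ? UINT32_MAX : ((UINT32_C(1) << n) - 1);
    return (x >> (p + 1 - n)) & mask;
}

/* Assuming 4096-byte pages: 12 offset bits, 20 frame-number bits. */
uint32_t vfn_of(uint32_t addr)    { return getbits(addr, 31, 20); }
uint32_t offset_of(uint32_t addr) { return getbits(addr, 11, 12); }
```

For the address 0x12345678 this gives a frame number of 0x12345 and an offset of 0x678.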