Checking if two pointers are on the same page - c

I saw this interview question and wanted to know if my function is doing what it's supposed to or if there's a better way to do this.
Here's the exact quote of the question:
The operating system typically allocates memory in pages such that the base address of the page are 0, 4K, 8K etc. Given two addresses (pointers), write a function to find if two pointers are on the same page. Here's the function prototype: int AreOnSamePage (void * a, void * b);
Here's my implementation. I made it return 4 if it's between 4k and 8k. It returns 1 if it's between 0 and 4k and it returns -1 if it's over 8k away. Am I getting the right addresses? The interview question is worded vaguely. Is it correct to use long's since the addresses could be pretty big?
int AreOnSamePage(void* a, void* b){
long difference = abs(&a - &b);
printf("%ld %ld\n",(long)&a,(long)&b);
if(difference > 8000)
return -1;
if(difference >= 4000)
return 4;
return 1;
}

a and b are pointers, so the distance between them is:
ptrdiff_t difference = (char *)a - (char *)b;
if (difference < 0) difference = -difference;
But you don't need it.
Two pointers are on the same page, if
(uintptr_t)a / 4096 == ( uintptr_t ) b / 4096
Else they are on different pages.
So:
int AreOnSamePage(void* a, void* b) {
const size_t page_size = 4096;
if ( (uintptr_t) a / page_size == (uintptr_t) b / page_size)
return 1;
else
return 0;
}

There are many problems with your code.
You are comparing the addresses of the function parameters (which sit side by side on the stack), not the pointer values themselves.
There is no reason to compare the difference with 8000.
4K != 4000; 4K is 4096.
Imagine one address is 3K and the other is 5K: according to your code they are on the same page, but they are not.
The return values are a bad choice.

The name AreOnSamePage() implies that the function returns either 0 or 1; I'd find it odd to have it return -1, 4 or other values.
If a page is 4KB, then it means you need 12 bits to index each byte inside a page (because 2^12 = 4096), so as long as the N-12 most significant bits of both pointer values compare equal, then you know they are on the same page (where N is the size of a pointer).
So you can do this:
#include <stdint.h>
static const uintptr_t PAGE_SIZE = 4096;
static const uintptr_t PAGE_MASK = ~(PAGE_SIZE-1);
int AreOnSamePage(void *a, void *b) {
return (((uintptr_t) a) & PAGE_MASK) == (((uintptr_t) b) & PAGE_MASK);
}
PAGE_MASK is a bit mask that has all N-12 most significant bits set to 1 and the 12 least significant bits set to 0. By doing the bitwise AND with an address, we effectively clear the least significant 12 bits (the offset into the page), so we can compare only the other bits that matter.
Note that uintptr_t is guaranteed to be wide enough to store pointer values, unlike long.
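To make the masking concrete, here is a minimal sketch that splits an address into its page base and its offset inside the page. The names page_base, page_offset, same_page and SKETCH_PAGE_SIZE are made up for illustration, and the 4096-byte page size is simply the assumption taken from the question.

```c
#include <stdint.h>

/* Assumed page size from the question; must be a power of two. */
#define SKETCH_PAGE_SIZE ((uintptr_t)4096)

/* Clear the low 12 bits: the address of the first byte of the page. */
static uintptr_t page_base(uintptr_t addr) {
    return addr & ~(SKETCH_PAGE_SIZE - 1);
}

/* Keep only the low 12 bits: the position inside the page. */
static uintptr_t page_offset(uintptr_t addr) {
    return addr & (SKETCH_PAGE_SIZE - 1);
}

/* Two addresses share a page exactly when their page bases are equal. */
static int same_page(uintptr_t a, uintptr_t b) {
    return page_base(a) == page_base(b);
}
```

With these definitions, same_page(4095, 4096) is 0 even though the two addresses are adjacent, while same_page(4096, 8191) is 1.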

As already stated, you should use uintptr_t to process the pointers. Your code is wrong, however, because you test the distance, not the page. Also, you forget that computers use powers of two. 8000 is not one; the correct value would be 8192. The same applies to 4000, which should be 4096.
The fastest approach for the test would be:
#include <stdbool.h>
#include <stdint.h>
// this should better be found in a system header:
#define PAGESIZE 4096U
bool samePage(void *a, void *b)
{
return ((uintptr_t)a ^ (uintptr_t)b) < PAGESIZE;
}
or:
return !(((uintptr_t)a ^ (uintptr_t)b) / PAGESIZE);
Note that the result is converted to bool. If this is used as an inline function, it will just be tested for zero/non-zero.
The XOR will zero all bits which are equal. So if any higher order bits differ, they will be set after XOR, and make the result >= PAGESIZE. This saves you one division or masking.
This requires PAGESIZE to be a power of two, of course.
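On POSIX systems the page size does not have to be hard-coded. The following is only a sketch, assuming a POSIX platform where sysconf(_SC_PAGESIZE) is available; the function name same_page_runtime and its -1 error return are my own choices for illustration.

```c
#include <stdint.h>
#include <unistd.h>

/* Returns 1 if a and b lie on the same page, 0 if they do not,
 * and -1 if the page size is unavailable or not a power of two.
 * POSIX only: sysconf() and _SC_PAGESIZE come from <unistd.h>. */
static int same_page_runtime(const void *a, const void *b) {
    long size = sysconf(_SC_PAGESIZE);
    if (size <= 0 || (size & (size - 1)) != 0)
        return -1; /* the XOR trick needs a power-of-two page size */
    /* XOR clears every bit the addresses share; any remaining bit at or
     * above the page-size bit means the page numbers differ. */
    return (((uintptr_t)a ^ (uintptr_t)b) < (uintptr_t)size) ? 1 : 0;
}
```

Querying the size on every call is wasteful; in real code you would probably cache it once at startup.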

Your attempt to solve the interview question is wrong.
You should be comparing a and b. Not &a and &b.
But even then it would still be wrong.
Consider the case where pointer a points to the last byte of page 0 and pointer b points to the first byte of page 1, the page immediately after page 0.
Their difference is 1. But they are in different pages.
In order to implement it correctly, you should consider that a page is 4 KiB long. 4 KiB = 2^12 = 4096 bytes. So all the bits of a pair of pointers, save for the last 12, will be equal if they are in the same page.
#include<stdint.h>
int AreOnSamePage(void* a, void* b){
return ((intptr_t)a & ~(intptr_t)0xFFF) ==
((intptr_t)b & ~(intptr_t)0xFFF);
}
A more concise but equivalent implementation :
int AreOnSamePage(void* a, void* b){
return ((uintptr_t)a)>>12 == ((uintptr_t)b)>>12; /* unsigned: right-shifting a negative signed value is implementation-defined */
}

Related

Adding 32 bit signed in C

I have been given this problem and would like to solve it in C:
Assume you have a 32-bit processor and that the C compiler does not support long long (or long int). Write a function add(a,b) which returns c = a+b where a and b are 32-bit integers.
I wrote this code which is able to detect overflow and underflow
#define INT_MIN (-2147483647 - 1) /* minimum (signed) int value */
#define INT_MAX 2147483647 /* maximum (signed) int value */
int add(int a, int b)
{
if (a > 0 && b > INT_MAX - a)
{
/* handle overflow */
printf("Handle over flow\n");
}
else if (a < 0 && b < INT_MIN - a)
{
/* handle underflow */
printf("Handle under flow\n");
}
return a + b;
}
I am not sure how to implement the long using 32-bit registers so that I can print the value properly. Can someone help me with how to use the underflow and overflow information so that I can store the result properly in the c variable, which I think should be two 32-bit locations? I think that is what the problem is saying when it hints that long is not supported. Would the variable c be two 32-bit registers put together somehow to hold the correct result so that it can be printed? What action should I perform when the result overflows or underflows?
Since this is a homework question I'll try not to spoil it completely.
One annoying aspect here is that the result is bigger than anything you're allowed to use (I interpret the ban on long long to also include int64_t, otherwise there's really no point to it). It may be tempting to go for "two ints" for the result value, but the value of that is weird to interpret. So I'd go for two uint32_t's and interpret them as two halves of a 64-bit two's complement integer.
Unsigned multiword addition is easy and has been covered many times (just search). The signed variant is really the same if the inputs are sign-extended: (not tested)
uint32_t a_l = a;
uint32_t a_h = -(a_l >> 31); // sign-extend a
uint32_t b_l = b;
uint32_t b_h = -(b_l >> 31); // sign-extend b
// todo: implement the addition
return some struct containing c_l and c_h
It can't overflow the 64 bit result when interpreted signed, obviously. It can (and should, sometimes) wrap.
To print that thing, if that's part of the assignment, first reason about which values c_h can have. There aren't many possibilities. It should be easy to print using existing integer printing functions (that is, you don't have to write a whole multiword-itoa, just handle a couple of cases).
As a hint for the addition: what happens when you add two decimal digits and the result is larger than 9? Why is the low digit of 7+6=13 a 3? Given only 7, 6 and 3, how can you determine the second digit of the result? You should be able to apply all this to base 2^32 as well.
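For readers who want to check their own work against something, here is one possible way the hint could be carried through. It is only a sketch: the struct wide32x2 and the function add32_sketch are names I made up, and it assumes int32_t/uint32_t exist on the target. The carry out of the low word is detected the same way as in the decimal hint: the low result is smaller than one of its inputs exactly when the addition wrapped.

```c
#include <stdint.h>

/* Hypothetical result type: two halves of a 64-bit two's complement value. */
struct wide32x2 {
    uint32_t lo;
    uint32_t hi;
};

static struct wide32x2 add32_sketch(int32_t a, int32_t b) {
    uint32_t a_l = (uint32_t)a;
    uint32_t a_h = -(a_l >> 31);      /* sign-extend a: 0 or 0xFFFFFFFF */
    uint32_t b_l = (uint32_t)b;
    uint32_t b_h = -(b_l >> 31);      /* sign-extend b */
    struct wide32x2 c;
    uint32_t carry;

    c.lo = a_l + b_l;                 /* wraps modulo 2^32 */
    carry = (c.lo < a_l) ? 1u : 0u;   /* wrapped, so carry into the high word */
    c.hi = a_h + b_h + carry;
    return c;
}
```

For example, adding -1 and -1 gives lo = 0xFFFFFFFE and hi = 0xFFFFFFFF, which is -2 when both halves are read as one 64-bit two's complement number.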
First, the simplest solution that satisfies the problem as stated:
double add(int a, int b)
{
// this will not lose precision, as a double-precision float
// will have more than 33 bits in the mantissa
return (double) a + b;
}
More seriously, the professor probably expected the number to be decomposed into a combination of ints. Holding the sum of two 32-bit integers requires 33 bits, which can be represented with an int and a bit for the carry flag. Assuming unsigned integers for simplicity, adding would be implemented like this:
struct add_result {
unsigned int sum;
unsigned int carry:1;
};
struct add_result add(unsigned int a, unsigned int b)
{
struct add_result ret;
ret.sum = a + b;
ret.carry = b > UINT_MAX - a;
return ret;
}
The harder part is doing something useful with the result, such as printing it. As proposed by harold, a printing function doesn't need to do full division, it can simply cover the possible large 33-bit values and hard-code the first digits for those ranges. Here is an implementation, again limited to unsigned integers:
void print_result(struct add_result n)
{
if (!n.carry) {
// no carry flag - just print the number
printf("%u\n", n.sum);
return;
}
if (n.sum < 705032704u)
printf("4%09u\n", n.sum + 294967296u);
else if (n.sum < 1705032704u)
printf("5%09u\n", n.sum - 705032704u);
else if (n.sum < 2705032704u)
printf("6%09u\n", n.sum - 1705032704u);
else if (n.sum < 3705032704u)
printf("7%09u\n", n.sum - 2705032704u);
else
printf("8%09u\n", n.sum - 3705032704u);
}
Converting this to signed quantities is left as an exercise.

Determine the ranges of char by direct computation in C89 (do not use limits.h)

I am trying to solve the Ex 2-1 of K&R's C book. The exercise asks to, among others, determine the ranges of char by direct computation (rather than printing the values directly from the limits.h). Any idea on how this should be done nicely?
Ok, I throw my version in the ring:
unsigned char uchar_max = (unsigned char)~0;
// min is 0, of course
signed char schar_min = (signed char)(uchar_max & ~(uchar_max >> 1));
signed char schar_max = (signed char)(0 - (schar_min + 1));
It does assume 2's complement for signed and the same size for signed and unsigned char. While the former I just define, the latter I'm sure can be deduced from the standard, as both are char and have to hold all encodings of the "execution charset". (What would that imply for RL-encoded charsets like UTF-8?)
It is straightforward to get a 1's complement and sign/magnitude version from this. Note that the unsigned version is always the same.
One advantage is that it runs entirely with char types and no loops, etc. So it will still perform well on 8-bit architectures.
Hmm ... I really thought this would need a loop for signed. What did I miss?
Assuming that the type will wrap intelligently [1], you can simply start by setting the char variable to be zero.
Then increment it until the new value is less than the previous value.
The new value is the minimum, the previous value was the maximum.
The following code should be a good start:
#include<stdio.h>
int main (void) {
char prev = 0, c = 0;
while (c >= prev) {
prev = c;
c++;
}
printf ("Minimum is %d\n", c);
printf ("Maximum is %d\n", prev);
return 0;
}
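[1] Technically, a signed variable is not guaranteed to wrap: signed arithmetic overflow is undefined behaviour, and converting an out-of-range value back to a signed char is implementation-defined. The vast majority of implementations will work, but keep in mind it's not guaranteed.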
In fact, the difficulty in working this out in a portable way (some implementations had various different bit-widths for char and some even used different encoding schemes for negative numbers) is probably precisely why those useful macros were put into limits.h in the first place.
You could always try the ol' standby, printf...
let's just strip things down for simplicity's sake.
This isn't a complete answer to your question, but it will check to see if a char is 8-bit--with a little help (yes, there's a bug in the code). I'll leave it up to you to figure out how.
#include <stdio.h>
#DEFINE MMAX_8_BIT_SIGNED_CHAR 127
main ()
{
char c;
c = MAX_8_BIT_SIGNED_CHAR;
printf("%d\n", c);
c++;
printf("%d\n", c);
}
Look at the output. I'm not going to give you the rest of the answer because I think you will get more out of it if you figure it out yourself, but I will say that you might want to take a look at the bit shift operator.
There are 3 relatively simple functions that can cover both the signed and unsigned types on both x86 & x86_64:
/* signed data type low storage limit */
long long limit_s_low (unsigned char bytes)
{ return -(1ULL << (bytes * CHAR_BIT - 1)); }
/* signed data type high storage limit */
long long limit_s_high (unsigned char bytes)
{ return (1ULL << (bytes * CHAR_BIT - 1)) - 1; }
/* unsigned data type high storage limit */
unsigned long long limit_u_high (unsigned char bytes)
{
if (bytes < sizeof (long long))
return (1ULL << (bytes * CHAR_BIT)) - 1;
else
return ~0ULL; /* all bits set: the largest unsigned long long */
}
With CHAR_BIT generally being 8.
The smart way: simply calculate sizeof() of your variable, and you know it's that many times larger than whatever has sizeof() = 1, which is char. Given that, you can use math to calculate the range. This doesn't work if you have odd-sized types, like 3-bit chars or something.
The try-hard way: put 0 in the type and increment until it doesn't increment anymore (it wraps around or stays the same, depending on the machine). Whatever the number before that was, that's the max. Do the same for the min.
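As a rough illustration of the "use math" idea without limits.h, the width of a char can be found by shifting a bit through an unsigned char. This is only a sketch under stated assumptions: the function names are mine, it assumes two's complement with no padding bits for the signed range, and it assumes char is narrower than unsigned long so the shifts below stay in range.

```c
/* Count the bits of unsigned char by shifting a 1 until it falls off the top. */
static int uchar_bits(void) {
    unsigned char c = 1;
    int bits = 0;
    while (c != 0) {
        /* Converting back to unsigned char wraps, which is well defined. */
        c = (unsigned char)(c << 1);
        bits++;
    }
    return bits;
}

/* Largest unsigned char: all bits set.
 * Assumes char is narrower than unsigned long. */
static unsigned long uchar_max(void) {
    return (1UL << uchar_bits()) - 1UL;
}

/* Signed limits, assuming two's complement and no padding bits. */
static long schar_max(void) {
    return (long)((1UL << (uchar_bits() - 1)) - 1UL);
}

static long schar_min(void) {
    return -schar_max() - 1L;
}
```

On a platform with 8-bit chars this yields 255, 127 and -128. Whether plain char matches the signed or the unsigned range still has to be checked separately, for example by testing whether (char)-1 is negative.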

Data stored with pointers

void *memory;
unsigned int b=65535; //1111 1111 1111 1111 in binary
int i=0;
memory= &b;
for(i=0;i<100;i++){
printf("%d, %d, d\n", (char*)memory+i, *((unsigned int * )((char *) memory + i)));
}
I am trying to understand one thing.
(char*)memory+i prints addresses in the range 2686636 - 2686735.
When I store 65535 with memory = &b, this should store the number at addresses 2686636 and 2686637,
because every address is just one byte (8 binary digits). So when I print it out,
*((unsigned int * )((char *) memory + i)) should print 2686636, 255 and 2686637, 255.
Instead, it prints 2686636, 65535 and 2686637, random number.
I am trying to implement memory allocation for a school project. This should represent memory. One address should be one byte, so the header will be 2686636-2586639 (4 bytes for the size of the block) and 2586640 (1 byte char for the free or used memory flag). Can someone explain it to me? Thanks.
Thanks for answers.
void *memory;
void *abc;
abc=memory;
for(i=0;i<100;i++){
*(int*)abc=0;
abc++;
}
*(int*)memory=16777215;
for(i=0;i<100;i++){
printf("%p, %c, %d\n", (char*)memory+i, *((char *)memory +i), *((char *)memory +i));
}
output is
0028FF94,  , -1
0028FF95,  , -1
0028FF96,  , -1
0028FF97, , 0
0028FF98, , 0
0028FF99, , 0
0028FF9A, , 0
0028FF9B, , 0
I think it works: 255 gives only one -1, 65535 gives two and 16777215 gives three.
In your program it seems that the address of b is 2686636. When you write (char*)memory+i or (char*)&b+i, the pointer points to char, so adding one to it advances it by only one byte, i.e. to 2686637, and so on up to 2686735 (i.e. (char*)2686636+99).
Now, when you dereference, i.e. *((unsigned int * )((char *) memory + i)), you get the value at that memory address, but you have only given a value to b (whose address is 2686636). All the other memory addresses hold garbage values, which is what you are printing.
So first you have to store some data at the rest of the addresses (2686637 to 2686735).
Good luck.
I hope this helps.
I did not mention this in my comments yesterday but it is obvious that your for loop from 0 to 100 overruns the size of an unsigned integer.
I simply ignored some of the obvious issues in the code and tried to give hints on the actual question you asked (difficult to do more than that on a mobile phone :-)). Unfortunately I did not have time to complete this yesterday. So, with one day's delay, here are my hints for you.
Try to avoid making assumptions about how big a certain type is (like 2 bytes or 4 bytes). Even if your assumption holds true now, it might change if you switch the compiler or switch to another platform. So use sizeof(type) consistently throughout the code. For a longer discussion on this you might want to take a look at: size of int, long, etc. on Stack Overflow. The standard mandates only the minimum ranges a certain type should be able to hold (at least 0-65535 for unsigned int), so only a minimal size for types. This means that the size of int might be (and typically is) bigger than 2 bytes. Beyond primitive types, sizeof also helps you with computing the size of structures, where due to memory alignment and packing the size of a structure might be different from what you would "expect" by simply looking at its attributes. So the sizeof operator is your friend.
Make sure you use the correct formatting in printf.
Be careful with pointer arithmetic and casting, since the result depends on the type of the pointer (and obviously on the value of the integer you add).
I.e.
(unsigned int*)memory + 1 != (unsigned char*)memory + 1
(unsigned int*)memory + 1 == (unsigned char*)memory + 1 * sizeof(unsigned int)
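The two relations above can be checked directly by measuring, in bytes, how far a pointer moves when 1 is added to it. This is just a small sketch; the helper names are invented for the illustration.

```c
#include <stddef.h>

/* Number of bytes an unsigned int pointer advances when 1 is added to it. */
static ptrdiff_t int_step_in_bytes(unsigned int *p) {
    return (char *)(p + 1) - (char *)p;
}

/* Number of bytes an unsigned char pointer advances when 1 is added to it. */
static ptrdiff_t char_step_in_bytes(unsigned char *p) {
    return (char *)(p + 1) - (char *)p;
}
```

For a pointer to an unsigned int object, the first helper returns sizeof(unsigned int) and the second (after casting the pointer to unsigned char *) returns 1, whatever the platform's int size happens to be.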
Below is how I would write the code:
//check how big is int on our platform for illustrative purposes
printf("Sizeof int: %lu bytes\n", (unsigned long)sizeof(unsigned int));
//we initialize b with maximum representable value for unsigned int
//include <limits.h> for UINT_MAX
unsigned int b = UINT_MAX; //0xffffffff (if sizeof(unsigned int) is 4)
//we print out the value and its hexadecimal representation
printf("B=%u 0x%X\n", b, b);
//we take the address of b and store it in a void pointer
void* memory= &b;
int i = 0;
//we loop the unsigned chars starting at the address of b up to the sizeof(b)
//(in our case b is unsigned int) using sizeof(b) is better since if we change the type of b
//we do not have to remember to change the sizeof in the for loop. The loop works just the same
for(i=0; i<sizeof(b); ++i)
{
//here we kept %d for formating the individual bytes to represent their value as numbers
//we cast to unsigned char since char might be signed (so from -128 to 127) on a particular
//platform and we want to illustrate that the expected (all bits 1 -> printed value 255) occurs.
printf("%p, %d\n", (void *)((unsigned char *)memory + i), *((unsigned char *) memory + i));
}
I hope you will find this helpful. And good luck with your school assignment, I hope you learned something you can use now and in the future :-).

Concat 4 integers into one integer

Hi, I am trying to concatenate 4 integers into one integer. I used the concatenate function found here:
https://stackoverflow.com/a/12700533/2016977
My code:
unsigned concatenate(unsigned x, unsigned y) {
unsigned pow = 10;
while(y >= pow)
pow *= 10;
return x * pow + y;
}
void stringtoint(){
struct router *ptr;
ptr=start;
while(ptr!=NULL){
int a;
int b;
int c;
int d;
sscanf(ptr->ip, "%d.%d.%d.%d", &a, &b, &c, &d);
int num1 = concatenate(a,b);
int num2 = concatenate(c,d);
int num3 = concatenate(num1,num2);
printf("%d\n",num3);
ptr=ptr->next;
};
}
The problem:
I am dealing with IP addresses, e.g. 198.32.141.140. I break them down into 4 integers and concatenate them to form 19832141140. However, for the larger numbers my concatenate function gives a wrong result, e.g. 198.32.141.140 becomes -1642695340,
but it concatenates IPs that produce small numbers correctly, e.g. 164.78.104.1 becomes 164781041.
How should I solve the problem? Basically, I am trying to turn an IP string, e.g. 198.32.141.140, into the integer 19832141140.
Your proposed approach is likely a very big mistake. How do you distinguish 127.0.1.1 from 127.0.0.11?
It's much better to treat IP addresses as exactly what they are. Namely, a.b.c.d represents
a * 256^3 + b * 256^2 + c * 256^1 + d * 256^0
and done in this way you can not possibly run into the issue I just described. Moreover, the implementation is trivial:
unsigned int number;
number = ((unsigned int)a << 24) + (b << 16) + (c << 8) + d; /* cast so that a >= 128 does not overflow a signed int */
You may read a line, and then use inet_aton(). Otherwise, you can do as Jason says, but you'd need to check each integers value to be within 0 ... 255 (those 4 x 8 bits represent the 32bit integer containing an IPv4 address). inet_aton() would support hex, dec and octal notation of IPv4 addresses.
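If you go the manual route, the range check and the packing could look roughly like this. It is a sketch only: parse_ipv4_sketch is a made-up name, it accepts plain decimal dotted-quad input (not the hex and octal forms inet_aton() understands), and it rejects any trailing character, including a newline left over from fgets().

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: parses "a.b.c.d" into *out.
 * Returns 0 on success, -1 on malformed input or an octet above 255. */
static int parse_ipv4_sketch(const char *text, uint32_t *out) {
    unsigned int a, b, c, d;
    char extra;

    /* The trailing %c only matches if something follows the fourth number,
     * so a return value of exactly 4 means the string ended cleanly. */
    if (sscanf(text, "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return -1;

    /* Each octet occupies its own 8 bits, so no two addresses collide. */
    *out = ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | (uint32_t)d;
    return 0;
}
```

With this, 127.0.1.1 becomes 0x7F000101 and 127.0.0.11 becomes 0x7F00000B, so the two addresses stay distinct.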
/**
** You DO NOT want to do this usually...
**/
#include <stdint.h>
uint_fast64_t
concatIPv4Addr(uint_fast16_t parts[])
{
uint_fast64_t n = 0;
for (int i = 0; i < 3; ++i) {
n += parts[i];
n *= 1000;
}
return (n += parts[3]);
}
I used the "fast" integer types for speed purposes, but if you have a storage requirement, use the corresponding "least" types instead. Of course this assumes you have a C99 compiler or a C89 compiler with extensions. Otherwise you're stuck with the primitive types where a char could even be 32-bit according to the C standard. Since I don't know your target environment, I made no assumptions. Feel free to change to the appropriate primitive types as you see fit.
I used a 16-bit value (minimum) because an 8-bit number can only represent 0-255, meaning if 358 was entered accidentally, it would be interpreted as 102, which is still valid. If you have a type able to store more than 8 bits and less than 16 bits, you can obviously use that, but the type must be able to store more than 8 bits.
That aside, you will need at least a 38-bit type:
4294967295 (32-bit unsigned max)
255255255255 (255.255.255.255 converted to the integer you want)
274877906943 (38-bit unsigned max)
The function above will convert 127.0.1.1 and 127.0.0.11 to 127000001001 and 127000000011 respectively:
127.0.1.1 ->
127.000.001.001 ->
127000001001
127.0.0.11 ->
127.000.000.011 ->
127000000011
Why so many zeros? Because otherwise you can't tell the difference between them! As others have said, you could confuse 127.0.1.1 and 127.0.0.11. Using the function above or something more appropriate that actually converts an IPv4 address to its real decimal representation, you won't have such a problem.
Lastly, I did no validation on the IPv4 address passed to the function. I assume you already ensure the address is valid before calling any functions that save or use the IPv4 address. BTW, if you wanted to do this same thing for IPv6, you can't so easily because that would require a string or conversion to decimal of each of the 8 parts, each of which is at most 16-bit, yielding 5 decimal digits per part, or 40 digits. To store that, you'd need a minimum of 133 bits, rather than the 128 bits required for the IPv6 address, just as you'd need 38 bits to store an IPv4 address instead of the 32 bits required.
Still not too bad, right? How about a theoretical IPv8 where there are 16 parts, each of which is 32-bit in size? The equivalent function to the one above would require 531 bits (160 decimal digits), instead of the proper mathematical requirement: 512 bits. While not a problem today, I'm simply pointing out the error in doing anything with an IPv4 address represented by concatenating the decimal values of each part. It scales absolutely terribly.

How to make sure that two addresses have the least significant 4 bits the same?

So I have two pointers:
unsigned char * a;
unsigned char * b;
Let's assume that I used malloc and they are allocated of a certain size.
I want to make the least significant 4 bits of the address of the pointers to be the same... but I really don't know how.
First of all I want to take the least significant 4 bits from a. I tried something like
int least = (&a) & 0x0f;
but I get an error that & is an invalid operand. I was thinking to allocate more for b and search for an address that has the least significant 4 bits the same as a but I really have no idea how I can do that.
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
int main()
{
unsigned char *a;
unsigned char *b;
a = malloc(8);
b = malloc(8);
if (((uintptr_t)a & 0x0F) == ((uintptr_t)b & 0x0F)) {
printf("Yeah, the least 4 bits are the same.\n");
} else {
printf("Nope, the least 4 bits are not the same.\n");
}
free(a);
free(b);
return EXIT_SUCCESS;
}
Try this:
int main()
{
unsigned char *a, *b;
a = malloc(32);
b = a + 16;
printf("%p %p\n", (void *)a, (void *)b); // You should see that their least significant
// 4 bits are equal
}
Since a and b are 16 byte apart and part of a contiguous memory block, their addresses should have the property you want.
One possible way to solve this problem is to use an allocation function that will only return allocations that are aligned on 16-byte boundaries (therefore the least significant 4 bits will always be zero).
Some platforms have such alignment-guaranteed allocation functions such as _aligned_malloc() in MSVC or posix_memalign() on Unix variants. If you don't have such an allocator available, returning an aligned block of memory using plain vanilla malloc() is a common interview question - an internet search will net you many possible solutions.
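If no aligned allocator is at hand, the idea from the question (allocate a bit more for b and pick a suitable address inside the block) can be sketched with plain malloc(). The helper name match_low4 is made up, and the sketch assumes that converting a pointer to uintptr_t exposes the low address bits directly, which holds on mainstream platforms but is not promised by the C standard.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns a pointer inside the block starting at raw
 * whose low 4 address bits equal those of ref. The block must have at
 * least 15 spare bytes beyond what the caller actually needs. */
static unsigned char *match_low4(unsigned char *raw, const void *ref) {
    uintptr_t want = (uintptr_t)ref & 0x0F;
    uintptr_t have = (uintptr_t)raw & 0x0F;
    /* Unsigned subtraction wraps, so the masked result is the forward
     * distance (0..15) from raw to the next matching address. */
    return raw + ((want - have) & 0x0F);
}
```

A caller would do something like a = malloc(n), raw = malloc(n + 15), b = match_low4(raw, a), and later pass raw, not b, to free().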
What about this:
uintptr_t least;
least = (uintptr_t)a ^ (uintptr_t)b; // bitwise XOR: a result bit is 0 wherever the two addresses agree
if (least % 16 == 0)
{
// the low four bits of the XOR are zero, meaning they all match
}

Resources