I am having trouble coding a function that takes 3 rand() values and generates a 32-bit value. Specifically, it has to take 15 bits from the first rand() call, then attach another 15 bits from the second rand() call, and lastly attach 2 bits from the third rand() call.
My idea was something like this
unsigned int x = (rand()<<17 | rand()<<2 ) | rand()>>13;
However, I don't think this expression gives me values over the entire unsigned int range. My guess is it has something to do with the fact that rand() is only guaranteed to produce values up to 32767 (as far as I understand).
I hope someone can help me out.
Cheers
There are two problems with your code. First of all, rand returns int, not unsigned int, and if your int is 32 bits, shifting a 15-bit value left by 17 can move a 1 into the sign bit, which is undefined behavior for signed integers. The second problem is that if rand does return more than 15 bits, you're getting too many. One simple fix for both is to & each value with 0x7fffu - that ensures only the low 15 bits are kept and that the operand is converted to unsigned:
unsigned int x = ((rand() & 0x7fffu)<<17 | (rand() & 0x7fffu)<<2 ) | (rand() & 0x7fffu)>>13;
Another option, if you know the values are truly independent (Edit: and, as Eric noticed, that the range is a power of two), would be to use xor ^ instead of or - this would ensure that all values are still possible.
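For instance, a minimal sketch of the xor variant (keeping the masks from above, under those independence assumptions):

unsigned int x = ((rand() & 0x7fffu) << 17)
               ^ ((rand() & 0x7fffu) << 2)
               ^ ((rand() & 0x7fffu) >> 13);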
Note, though, that any random number generator that returns values only from 0 to 32767 is suspect of being really bad: it is quite likely that its low-order bits are not very independent, so you may still end up unable to generate all values...
I'm trying to find or figure out the algorithm for converting a signed 64-bit int (two's-complement, natch) to the closest IEEE double (64-bit) value, staying within bitwise operations. What I'm looking for is generic "C-like" pseudocode; I'm implementing a toy JVM on a platform that is not C and doesn't have a native int64 type, so I'm operating on 8-byte arrays (details of that are mercifully outside this scope) and that's the domain the data needs to stay in.
So: input is a big-endian string of 64 bits, signed two's-complement. Output is a big-endian string of 64 bits in IEEE double format that represents the original int64 value as nearly as possible. In between is some set of masks, shifts, etc.! The algorithm absolutely does not need to be especially clever or optimized. I just want to be able to get to the result and ideally understand what the process is.
I'm having trouble tracking this down because I suspect it's an unusual need. This answer addresses a parallel question (I think) in x86 SSE, but I don't speak SSE, and my attempts at translation leave me more confused than enlightened.
I'd love someone to either point me in the right direction for a recipe or, ideally, explain the bitwise math behind it so I actually understand it. Thanks!
Here's a simple (and wrong in several ways) implementation, including a test harness.
#include <stdint.h>  // for int64_t/uint64_t
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
double do_convert(int64_t input)
{
    uint64_t sign = (input < 0);
    uint64_t magnitude;

    // breaks on INT64_MIN
    if (sign)
        magnitude = -input;
    else
        magnitude = input;

    // use your favourite algorithm here instead of the builtin
    int leading_zeros = __builtin_clzl(magnitude);
    uint64_t exponent = (63 - leading_zeros) + 1023;
    uint64_t significand = (magnitude << (leading_zeros + 1)) >> 12;

    uint64_t fake_double = sign << 63
                         | exponent << 52
                         | significand;

    double d;
    memcpy(&d, &fake_double, sizeof d);
    return d;
}

int main(int argc, char** argv)
{
    for (int i = 1; i < argc; i++)
    {
        long l = strtol(argv[i], NULL, 0);
        double d = do_convert(l);
        printf("%ld %f\n", l, d);
    }
    return 0;
}
The breakages here are many - the basic idea is to first extract the sign bit, then treat the number as positive the rest of the way, which won't work if the input is INT64_MIN. It also doesn't handle input 0 correctly because it doesn't correctly deal with the exponent in that case. These extensions are left as an exercise for the reader. ;-)
Anyway - the algorithm figures out the exponent by calculating log2 of the input number and offsetting it by 1023 (the IEEE double exponent bias), then gets the significand by shifting the number up far enough to drop off the most significant bit (which is implicit in the IEEE format), then shifting back down into the right field position.
After all that, the assembly of the final double is pretty straightforward.
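As a hand-worked check of those steps (my own example, using input 12):

// input 12 = binary 1100, so leading_zeros = 60
// exponent    = (63 - 60) + 1023 = 1026 = 0x402
// significand = (12 << 61) >> 12 = 0x0008000000000000  (fraction bits of 1.5)
// fake_double = 0x402 << 52 | 0x0008000000000000 = 0x4028000000000000
// and 0x4028000000000000 is indeed 12.0 (1.5 * 2^3)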
Edit:
Speaking of exercises for the reader - I implemented this using the GCC builtin __builtin_clzl(). You can expand that part as necessary.
I have been given this problem and would like to solve it in C:
Assume you have a 32-bit processor and that the C compiler does not support long long (or long int). Write a function add(a,b) which returns c = a+b where a and b are 32-bit integers.
I wrote this code which is able to detect overflow and underflow
#include <stdio.h>

#define INT_MIN (-2147483647 - 1) /* minimum (signed) int value */
#define INT_MAX 2147483647        /* maximum (signed) int value */

int add(int a, int b)
{
    if (a > 0 && b > INT_MAX - a)
    {
        /* handle overflow */
        printf("Handle overflow\n");
    }
    else if (a < 0 && b < INT_MIN - a)
    {
        /* handle underflow */
        printf("Handle underflow\n");
    }
    return a + b;
}
I am not sure how to implement the long using 32-bit registers so that I can print the value properly. Can someone help me with how to use the underflow and overflow information so that I can store the result properly in the c variable, which I think should span two 32-bit locations? I think that is what the problem is hinting at when it says that long is not supported. Would the variable c be two 32-bit registers put together somehow to hold the correct result so that it can be printed? What action should I perform when the result overflows or underflows?
Since this is a homework question I'll try not to spoil it completely.
One annoying aspect here is that the result is bigger than anything you're allowed to use (I interpret the ban on long long to also include int64_t, otherwise there's really no point to it). It may be tempting to go for "two ints" for the result value, but the value of such a pair is awkward to interpret. So I'd go for two uint32_t's and interpret them as the two halves of a 64-bit two's complement integer.
Unsigned multiword addition is easy and has been covered many times (just search). The signed variant is really the same if the inputs are sign-extended: (not tested)
uint32_t a_l = a;
uint32_t a_h = -(a_l >> 31); // sign-extend a into the high word
uint32_t b_l = b;
uint32_t b_h = -(b_l >> 31); // sign-extend b into the high word
// todo: implement the addition
// return some struct containing c_l and c_h
The sum of two sign-extended 32-bit values obviously can't overflow the 64-bit result when interpreted as signed. The low word can (and sometimes should) wrap.
To print that thing, if that's part of the assignment, first reason about which values c_h can have. There aren't many possibilities. It should be easy to print using existing integer printing functions (that is, you don't have to write a whole multiword-itoa, just handle a couple of cases).
As a hint for the addition: what happens when you add two decimal digits and the result is larger than 9? Why is the low digit of 7+6=13 a 3? Given only 7, 6 and 3, how can you determine the second digit of the result? You should be able to apply all this to base 2³² as well.
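To make the decimal analogy concrete (digits only, so it shouldn't spoil the base-2³² case):

int low   = (7 + 6) % 10; // 3, the low digit
int carry = (7 + 6) / 10; // 1, the carry
// note: the carry is 1 exactly when the low digit is smaller than either addend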
First, the simplest solution that satisfies the problem as stated:
double add(int a, int b)
{
    // this will not lose precision, as a double-precision float
    // will have more than 33 bits in the mantissa
    return (double) a + b;
}
More seriously, the professor probably expected the number to be decomposed into a combination of ints. Holding the sum of two 32-bit integers requires 33 bits, which can be represented with an int and a bit for the carry flag. Assuming unsigned integers for simplicity, adding would be implemented like this:
#include <limits.h>  // for UINT_MAX

struct add_result {
    unsigned int sum;
    unsigned int carry:1;
};

struct add_result add(unsigned int a, unsigned int b)
{
    struct add_result ret;
    ret.sum = a + b;
    ret.carry = b > UINT_MAX - a;
    return ret;
}
The harder part is doing something useful with the result, such as printing it. As proposed by harold, a printing function doesn't need to do full division, it can simply cover the possible large 33-bit values and hard-code the first digits for those ranges. Here is an implementation, again limited to unsigned integers:
void print_result(struct add_result n)
{
    if (!n.carry) {
        // no carry flag - just print the number
        printf("%u\n", n.sum);
        return;
    }
    // with the carry set, the true value is n.sum + 2^32 = n.sum + 4294967296
    if (n.sum < 705032704u)
        printf("4%09u\n", n.sum + 294967296u);
    else if (n.sum < 1705032704u)
        printf("5%09u\n", n.sum - 705032704u);
    else if (n.sum < 2705032704u)
        printf("6%09u\n", n.sum - 1705032704u);
    else if (n.sum < 3705032704u)
        printf("7%09u\n", n.sum - 2705032704u);
    else
        printf("8%09u\n", n.sum - 3705032704u);
}
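A quick sanity check of those ranges (my hypothetical usage):

print_result(add(4000000000u, 3000000000u)); // prints 7000000000
print_result(add(1u, 2u));                   // prints 3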
Converting this to signed quantities is left as an exercise.
I am attempting exercise 2.1 of K&R. The exercise reads:
Write a program to determine the ranges of char, short, int, and long variables, both signed and unsigned, by printing appropriate values from standard headers and by direct computation. Harder if you compute them: determine the ranges of the various floating-point types.
Printing the values of constants in the standard headers is easy, just like this (only int shown as an example):
printf("Integral Ranges (from constants)\n");
printf("int max: %d\n", INT_MAX);
printf("int min: %d\n", INT_MIN);
printf("unsigned int max: %u\n", UINT_MAX);
However, I want to determine the limits programmatically.
I tried this code which seems like it should work but it actually goes into an infinite loop and gets stuck there:
printf("Integral Ranges (determined programmatically)\n");
int i_max = 0;
while ((i_max + 1) > i_max) {
    ++i_max;
}
printf("int max: %d\n", i_max);
Why is this getting stuck in a loop? It would seem that when an integer overflows it jumps from 2147483647 to -2147483648. The incremented value is obviously smaller than the previous value so the loop should end, but it doesn't.
Ok, I was about to write a comment but it got too long...
Are you allowed to use sizeof?
If so, then there is an easy way to find the max value for any type:
For example, I'll find the maximum value for an integer:
Definition: INT_MAX = (1 << 31) - 1 for 32-bit integer (2^31 - 1)
The previous expression overflows if we use 32-bit integers to compute INT_MAX, so it has to be adapted:
INT_MAX = (1 << 31) - 1
        = ((1 << 30) * 2) - 1
        = ((((1 << 30) - 1) * 2) + 2) - 1
        = (((1 << 30) - 1) * 2) + 1
And using sizeof:
INT_MAX = ((1 << (sizeof(int)*8 - 2)) - 1) * 2 + 1
You can do the same for any signed/unsigned type by just reading the rules for each type.
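A compilable version of that formula (my sketch, assuming 8-bit bytes and no padding bits):

#include <stdio.h>

int main(void)
{
    // ((2^(w-2)) - 1) * 2 + 1 == 2^(w-1) - 1, computed without overflow
    int int_max = ((1 << (sizeof(int) * 8 - 2)) - 1) * 2 + 1;
    printf("int max: %d\n", int_max);
    return 0;
}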
So it actually wasn't getting stuck in an infinite loop. C code is usually so fast that I assume it's broken if it doesn't complete immediately.
It did eventually return the correct answer after I let it run for about 10 seconds. Turns out that 2,147,483,647 increments take quite a few cycles to complete.
I should also note that I compiled with cc -O0 to disable optimizations, so this wasn't the problem.
A faster solution might look something like this:
int i_max = 0;
int step_size = 256;
while ((i_max + step_size) > i_max) {
    i_max += step_size;
}
while ((i_max + 1) > i_max) {
    ++i_max;
}
printf("int max: %d\n", i_max);
However, as signed overflow is undefined behavior, it is probably a terrible idea ever to try to guess this programmatically in practice. Better to use INT_MAX.
The simplest I could come up with is:
// shifts are done on unsigned values to avoid undefined behavior;
// converting the min bit pattern back to signed int is implementation-defined
signed int max_signed_int = ~(1U << ((sizeof(int) * 8) - 1));
signed int min_signed_int = (1U << ((sizeof(int) * 8) - 1));
unsigned int max_unsigned_int = ~0U;
unsigned int min_unsigned_int = 0U;
On my system:
// max_signed_int = 2147483647
// min_signed_int = -2147483648
// max_unsigned_int = 4294967295
// min_unsigned_int = 0
Assuming a two's complement processor, use unsigned math:
unsigned ... smax, smin;
smax = ((unsigned ...)0 - (unsigned ...)1) / (unsigned ...) 2;
smin = ~smax;
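For example, filling in the ... with plain int (my sketch, same two's complement assumption):

#include <stdio.h>

int main(void)
{
    unsigned int smax, smin;
    smax = ((unsigned int)0 - (unsigned int)1) / (unsigned int)2; // 0x7fffffff
    smin = ~smax;                                                 // 0x80000000
    printf("INT_MAX bit pattern: %u\n", smax); // 2147483647
    printf("INT_MIN magnitude:   %u\n", smin); // 2147483648
    return 0;
}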
As has been pointed out here in other solutions, trying to overflow an integer in C is undefined behaviour, but, at least in this case, I think you can get a valid answer, even from the U.B. thing:
The point is that if you increment a value and compare the new value with the last, you always get a greater value, except on an overflow (in that case you'll get a value less than or equal to the last, since there are no greater values left; that is what an overflow means). So you can try at least:
int i_old = 0, i = 0;
while (++i > i_old)
    i_old = i;
printf("MAX_INT guess: %d\n", i_old);
After this loop, you will have hit the expected overflow, and i_old will store the last valid number. Of course, in case you want to go down instead, use this snippet:
int i_old = 0, i = 0;
while (--i < i_old)
    i_old = i;
printf("MIN_INT guess: %d\n", i_old);
Of course, U.B. can even mean the program stops running (in that case, you'll have to add traces to get at least the last value printed).
By the way, in the ancient times of K&R, integers used to be 16 bits wide, a limit easily reachable by counting up (easier than now; try overflowing a 64-bit integer by counting up from 0).
I would use the properties of two's complement to compute the values.
unsigned int uint_max = ~0U;
signed int int_max = uint_max >> 1;
signed int int_min1 = (-int_max - 1);
signed int int_min2 = ~int_max;
2^3 is 1000. 2^3 - 1 is 0111. 2^4 - 1 is 1111.
w is the length in bits of your data type.
uint_max is 2^w - 1, or 111...111. This effect is achieved by using ~0U.
int_max is 2^(w-1) - 1, or 0111...111. This effect can be achieved by bitshifting uint_max 1 bit to the right. Since uint_max is an unsigned value, the logical shift is applied by the >> operator, meaning it shifts in leading zeroes instead of extending the sign bit.
int_min is -2^(w-1), or 100...000. In two's complement, the most significant bit has a negative weight!
This is how to visualize the first expression for computing int_min1:
...
011...111   int_max          +2^(w-1) - 1
100...000   (-int_max - 1)   -2^(w-1)      == -2^(w-1) + 1 - 1
100...001   -int_max         -2^(w-1) + 1  == -(+2^(w-1) - 1)
...
Adding 1 moves down the listing, and subtracting 1 moves up. First we negate int_max in order to stay within valid int values, then we subtract 1 to get int_min. We can't just negate (int_max + 1), because int_max + 1 would already exceed int_max, the biggest int value.
Depending on which version of C or C++ you are using, the expression -(int_max + 1) would either become a signed 64-bit integer, keeping the signedness but sacrificing the original bit width, or it would become an unsigned 32-bit integer, keeping the original bit width but sacrificing the signedness. We need to declare int_min programmatically in this roundabout way to keep it a valid int value.
If that's a bit (or byte) too complicated for you, you can just do ~int_max, observing that int_max is 011...111 and int_min is 100...000.
Keep in mind that the techniques I've mentioned here can be used for any bit width w of an integer data type: char, short, int, long, and also long long. Bear in mind that integer literals are almost always 32 bits by default, so you may have to cast the 0U to the data type of the appropriate bit width before bitwise NOTing it. Other than that, these techniques rest on the fundamental mathematical principles of two's complement integer representation; they won't work if your computer represents integers some other way, for example ones' complement or sign-and-magnitude.
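A minimal sketch applying the same trick to the other widths the exercise asks about (assuming two's complement and no padding bits):

#include <stdio.h>

int main(void)
{
    unsigned char uc_max = ~(unsigned char)0;
    signed char   sc_max = uc_max >> 1;   // 0111...1
    signed char   sc_min = ~sc_max;       // 1000...0

    unsigned long ul_max = ~0UL;
    long          l_max  = ul_max >> 1;
    long          l_min  = ~l_max;

    printf("signed char: %d to %d\n", sc_min, sc_max);
    printf("long: %ld to %ld\n", l_min, l_max);
    return 0;
}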
The assignment says that "printing appropriate values from standard headers" is allowed, and in the real world, that is what you would do. As your prof wrote, direct computation is harder, and why make things harder for its own sake when you're working on another interesting problem and you just want the result? Look up the constants in <limits.h>, for example, INT_MIN and INT_MAX.
Since this is homework and you want to solve it yourself, here are some hints.
The language standard technically allows any of three different representations for signed numbers: two's-complement, one's-complement and sign-and-magnitude. Sure, every computer made in the last fifty years has used two's-complement (with the partial exception of legacy code for certain Unisys mainframes), but if you really want to language-lawyer, you could compute the smallest number for each of the three possible representations and find the minimum by comparing them.
Attempting to find the answer by overflowing or underflowing a signed value does not work: that is undefined behavior! You may, in theory but not really in practice, increment an unsigned value of the same width, convert it to the corresponding signed type, and compare that to the result of casting the previous or next unsigned value. For a 32-bit long, this might just be tolerable; it will not scale to a machine where long is 64 bits wide.
You want to use the bitwise operators, particularly ~ and <<, to calculate the largest and smallest value for every type. Note: CHAR_BIT * sizeof(x) gives you the number of bits in x, and left-shifting 0x01UL by one fewer than that, then casting to the desired type, sets the highest bit.
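Made concrete, that hint looks something like this (a sketch, not the full solution):

#include <limits.h>

// sets only the highest bit of an unsigned long, per the hint above
unsigned long high_bit = 0x01UL << (CHAR_BIT * sizeof(unsigned long) - 1);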
For floating-point values, the only portable way is to use the constants in <float.h>; floating-point types might or might not be able to represent positive and negative infinity, and are not constrained to use any particular format. That said, if your compiler supports the optional Annex F of the C11 standard, which specifies IEC 60559 floating-point arithmetic, then dividing a nonzero floating-point number by zero is defined as producing infinity, which does allow you to "compute" infinity and negative infinity. If so, the implementation will #define __STDC_IEC_559__ as 1.
If you detect that infinity is not supported on your implementation, for instance by checking whether INFINITY and -INFINITY are infinities, you would want to use HUGE_VAL and -HUGE_VAL instead.
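For instance, a sketch of that check (using isinf from <math.h> and the range constants from <float.h>):

#include <float.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    printf("double range: %g to %g\n", -DBL_MAX, DBL_MAX);
    if (isinf(INFINITY))
        printf("infinities: %g and %g\n", -(double)INFINITY, (double)INFINITY);
    else
        printf("no infinity; fall back to +/-HUGE_VAL = %g\n", HUGE_VAL);
    return 0;
}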
#include <stdio.h>

int main() {
    int n = 1;
    // relies on signed-overflow wraparound, which is undefined behavior in standard C
    while (n > 0) {
        n = n << 1;
    }
    int int_min = n;
    int int_max = -(n + 1);
    printf("int_min is: %d\n", int_min);
    printf("int_max is: %d\n", int_max);
    return 0;
}
unsigned long LMAX = (unsigned long)-1L;
long SLMAX = LMAX / 2;
long SLMIN = -SLMAX - 1;
If you don't have the L suffix, just use a variable or cast to signed before casting to unsigned.
For long long:
unsigned long long LLMAX = (unsigned long long)-1LL;
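Following the same pattern, the signed long long limits (my extension, not in the original snippet):

long long SLLMAX = LLMAX / 2;
long long SLLMIN = -SLLMAX - 1;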
I am writing a program in which I have to store the distances between pairs of numbers in a hash table.
I will be given a Range R. Let's say the range is 5.
Now I have to find distances between the following pairs:
1 2
1 3
1 4
1 5
2 3
2 4
2 5
3 4
3 5
4 5
that is, the total number of pairs is (R^2 - R)/2. I want to store them in a hash table. All these are unsigned integers, so they are 32 bits each. My idea was to take an unsigned long (64 bits).
Let's say I need to hash the distance between 1 and 5. Now
long k = 1;
k = k<<31;
k+=5;
Since I have 64 bits, I am storing the first number in the first 31 bits and the second number in the second 31 bits. This guarantees unique keys which can then be used for hashing.
But when I do this:
long k = 2;
k << 31;
k+= 2;
The value of k becomes zero.
I am not able to wrap my head around this shifting concept.
Essentially, what I am trying to achieve is this:
An unsigned long has: |   32 bits   |   32 bits   |
Store:                | 1st integer | 2nd integer |
How can I achieve this to get unique keys for each pair?
I am running the code on a 64 bit AMD Opteron processor. sizeof(ulong) returns 8. So it is 64 bits. Do I need a long long in such a case?
Also, I need to know if this will create unique keys. From my understanding, it does seem to create unique keys, but I would like confirmation.
Assuming you're using C or something that follows vaguely similar rules, your problem is primarily with types.
long k = 2;  // This defines `k` as a long
k << 31;     // This (sort of) shifts k left, but the result is discarded
             // (it isn't assigned), and the shift happens in a 32-bit long.
What you almost certainly want/need to do is convert k to a long long before you shift it left, so you're shifting in a 64-bit word.
unsigned long first_number = 2;
unsigned long long both_numbers = (unsigned long long)first_number << 32;
unsigned long second_number = 5;
both_numbers |= second_number;
In this case, if (for example) you print out both_numbers, in hexadecimal, you should get 0x0000000200000005.
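To see why the keys are unique, note that both inputs can be recovered from the key (a quick sketch):

unsigned long high = (unsigned long)(both_numbers >> 32);         // 2
unsigned long low  = (unsigned long)(both_numbers & 0xffffffffu); // 5
// distinct pairs always give distinct keys, because (high, low) is recoverable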
The concept makes sense. As Oli has added, you want to shift by 32, not 31: shifting by 31 puts the first number's low bit in bit 31, so if you shifted back to the right to recover the first number you would end up with a bit missing, and the second number could look huge because the first number may have put a 1 in the second number's uppermost bit.
But if you want to do bit manipulation, I would do this instead:
k = 1L << 32; // the L suffix matters: shifting a plain int by 32 is undefined
k = k | 5;
It really should produce the same result, though. Are you sure that long is 64 bits on your machine? This is not always the case (although it usually is, I think). If long is actually 32 bits, 2<<31 will result in 0.
How large is R? You can get away with a 32-bit variable if R doesn't go past 65535...
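For instance, a sketch of that 16-bit packing (hypothetical, valid only while both numbers stay below 65536):

unsigned int first = 1, second = 5;
unsigned int key = (first << 16) | second; // each number must fit in 16 bits
// the key for the pair (1, 5) is 0x00010005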