How to find out number of bits enabled : bits handling


This was asked in an interview I gave, and I couldn't answer it properly.
I want to find the value that has a given number of bits enabled.
For example, if the number is 2, I should return 3; if the number is 3, I should return 7:
8 4 2 1
0 0 1 1   (2 bits enabled = 3)
8 4 2 1
0 1 1 1   (3 bits enabled = 7)
Is there an easy way of doing it?

Yes, there is: subtract 1 from the corresponding power of 2, like this:
int allBitsSet = (1U << n) - 1;
The subexpression 1U << n computes the value of 2 to the power of n, which always has this form in binary:
1000...00
i.e. one followed by n zeros. When you subtract 1 from a number of that form, you "borrow" from the bit that is set to 1 making it zero, and flip the remaining bits to 1.
You can visualize this by solving an analogous problem in decimal system: "make a number that has n nines". The solution is the same, except now you need to use 10 instead of 2.
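A minimal sketch of that expression wrapped in a function. The name all_bits_set is made up for illustration, and the guard is an addition: in C, shifting by the full width of the type is undefined, so a request for every bit is handled separately.

```c
#include <limits.h>

/* Returns a value whose lowest n bits are 1 (hypothetical helper name).
   Shifting by the full width of the type is undefined behavior in C,
   so n >= width is handled separately. */
unsigned all_bits_set(unsigned n)
{
    const unsigned width = sizeof(unsigned) * CHAR_BIT;
    if (n >= width)
        return ~0u;            /* every bit set */
    return (1u << n) - 1u;     /* 2^n - 1 */
}
```

With this, all_bits_set(2) gives 3 and all_bits_set(3) gives 7, matching the values in the question.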

Related

Fastest way to compute sum of first set bit over consecutive integers?

Edit: I wish SO let me accept 2 answers because neither is complete without the other. I suggest reading both!
I am trying to come up with a fast implementation of a function that, given an unsigned 32-bit integer x, returns the sum of 2^trailing_zeros(i) for i=1..x-1, where trailing_zeros is the count trailing zeros operation, defined as returning the number of 0 bits below the least significant 1 bit. This seems like the kind of problem that should lend itself to a clever bit manipulation implementation that takes the same number of instructions regardless of the input, but I haven't been able to derive it.
Mathematically, 2^trailing_zeros(i) is the largest power of 2 that exactly divides i. So we are summing those largest powers for 1..x-1.
i                    |  1  2  3  4  5  6  7  8  9 10
---------------------+-------------------------------
2^trailing_zeroes(i) |  1  2  1  4  1  2  1  8  1  2
Sum (desired value)  |  0  1  3  4  8  9 11 12 20 21
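As a reference point, the definition can be written as a brute-force loop. This is only a sketch for checking faster versions against; the function names are made up here, and it relies on the standard trick that i & -i isolates the lowest set bit, which is exactly 2^trailing_zeros(i).

```c
#include <stdint.h>

/* Largest power of two dividing i (i must be non-zero):
   i & -i isolates the lowest set bit. */
static uint32_t lowest_set_bit(uint32_t i)
{
    return i & (0u - i);
}

/* Reference: sum of 2^trailing_zeros(i) for i = 1 .. x-1. */
uint64_t sum_brute_force(uint32_t x)
{
    uint64_t sum = 0;
    for (uint32_t i = 1; i < x; ++i)
        sum += lowest_set_bit(i);
    return sum;
}
```

The results agree with the "Sum" row of the table, for example sum_brute_force(9) is 20.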
It is a little easier to see the structure of 2^trailing_zeroes(i) if we 'plot' the values -- horizontal position increasing from left to right corresponding to i and vertical position increasing from top to bottom corresponding to trailing_zeroes(i).
1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
  2       2       2       2       2       2       2       2
      4               4               4               4
              8                               8
                              16
(Shown for i = 1..31, two characters per column; the rows for 32, 64, ... continue the same pattern.)
Here it is easier to see the pattern that 2's are always 4 apart, 8's are always 16 apart, etc. However, each pattern starts at a different time -- 8's don't begin until i=8, 16 doesn't begin until i=16, etc. If you don't take into account that the patterns don't start right away you can come up with formulas that don't work -- for example you might think to determine the number of 8's going into the total you should just compute floor(x/16) but i=25 is far enough to the right to include both of the first two 8s.
The best solution I have come up with so far is:
Set n = floor(log2(x)). This can be computed quickly using the count leading zeros operation. This tells us the highest power of two that is going to be involved in the sum.
Set sum = 0
for i = 0..n (as in z(x) in the script below, so that the 1s are counted too)
    sum += floor((x - 2^i) / 2^(i+1))*2^i + 2^i
The way this works is that, for each power, it calculates the horizontal distance on the plot between x and the first appearance of that power, e.g. the distance between x and the first 8 is (x-8). It then divides by the distance between repeating instances of that power, e.g. floor((x-8)/16), which gives us how many more times that power appears, and multiplies by the power to get its contribution to the sum, e.g. floor((x-8)/16)*8. Then we add one instance of the given power, because that calculation excludes the very first time that power appears.
In practice this implementation should be pretty fast because the division/floor can be done by a right bit shift and powers of two can be done with 1 bit-shifted to the left. However, it seems like it should still be possible to do better. This implementation will loop more for larger inputs, up to 32 times (it is O(log2(x))); ideally we want O(1) without a gigantic lookup table using up all the CPU cache. I've been eyeing the BMI/BMI2 intrinsics but I don't see an obvious way to apply them.
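A rough C sketch of that loop with shifts in place of division and pow. The function name is made up, and it follows the prototype's convention of summing over 1..x inclusive, so a caller wanting 1..x-1 passes x - 1. The shift is split in two so that it never reaches 32 bits.

```c
#include <stdint.h>

/* Sum of 2^trailing_zeros(i) for i = 1 .. x (inclusive), using the
   per-power counting formula. Call with x - 1 for the sum over 1 .. x-1. */
uint64_t sum_by_powers(uint32_t x)
{
    uint64_t sum = 0;
    for (unsigned i = 0; i < 32 && (UINT32_C(1) << i) <= x; ++i) {
        uint32_t power = UINT32_C(1) << i;
        /* occurrences after the first one: floor((x - 2^i) / 2^(i+1)) */
        uint32_t repeats = (x - power) >> 1 >> i;
        sum += ((uint64_t)repeats + 1) * power;
    }
    return sum;
}
```

For example, sum_by_powers(8) is 20, which is the table entry for x = 9.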
Although my goal is to implement this in a compiled language like C++ or Rust with real bit shifting and intrinsics, I've been prototyping in Python. Included below is my script that includes the implementation I described, z(x), and the code for generating the plot, tower(x).
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from math import pow, floor, log, ceil

def trailing_zeros(x):
    # Number of '0' characters after the last '1' in bin(x),
    # i.e. the count of trailing zero bits.
    return len(bin(x).split('b')[-1].split('1')[-1])

def f(x):
    # Brute-force reference: sum of 2^trailing_zeros(i) for i = 1..x-1.
    s = 0
    for i in range(1, x):
        s += pow(2, trailing_zeros(i))
    return s

def g(x): return sum([pow(2,i)*floor((x+pow(2,i)-1)/pow(2,i+1)) for i in range(0,32)])

def h(x):
    # Same as g(x), unrolled for debugging.
    s = 0
    extra = 0
    extra_s = 0
    for i in range(0,32):
        num = (x+pow(2,i)-1)
        den = pow(2,i+1)
        fraction = num/den
        floored = floor(num/den)
        power = pow(2,i)
        product = power*floored
        if product == 0:
            break
        s += product
        extra += (fraction - floored)
        extra_s += power*fraction
        #print(f"i={i} s={s} num={num} den={den} fraction={fraction} floored={floored} power={power} product={product} extra={extra} extra_s={extra_s}")
    return s

def z(x):
    # The implementation described above; sums over 1..x inclusive.
    upper_bound = floor(log(x,2)) if x > 0 else 0
    s = 0
    for i in range(upper_bound+1):
        num = (x - pow(2,i))
        den = pow(2,i+1)
        fraction = num/den
        floored = floor(fraction)
        added = pow(2,i)
        s += floored * added
        s += added
        print(f"i={i} s={s} upper_bound={upper_bound} num={num} den={den} floored={floored} added={added}")
    return s
    # return sum([floor((x - pow(2,i))/pow(2,i+1) + pow(2,i)) for i in range(floor(log(x, 2)))])

def tower(x):
    # Prints the plot: one row per power of two, one column per i.
    table = [[" " for i in range(x)] for j in range(ceil(log(x,2)))]
    for i in range(1,x):
        p = trailing_zeros(i)
        table[p][i] = 2**p
    for row in table:
        for col in row:
            print(col,end='')
        print()

# h(9000)
for i in range(1,16):
    tower(i)
    print((i, f(i), g(i), h(i), z(i-1)))

Based on the method of Eric Postpischil, here is a way to do it without a loop.
Note that every bit is being multiplied by its position, and the results are summed (sort of, except there is also a factor of 0.5 in it, let's put that aside for now). Let's call those values that are being added up "the partial products" just to call them something, it's not really accurate to call them that, I can't come up with anything better. If we transpose that a little bit, then it's built up like this: the lowest bit of every partial product is the lowest bit of the position of every bit multiplied by that bit. Single-bit-products are bitwise-AND, and the values of the lowest bits of the positions are 0,1,0,1 etc, so it works out to x & 0xAAAAAAAA, the second bit of every partial product is x & 0xCCCCCCCC (and has a "weight" of 2, so this must be multiplied by 2) etc.
Then the whole thing needs to be shifted right by 1, to account for the factor of 0.5
So in total:
unsigned CountCumulativeTrailingZeros(unsigned x)
{
    --x;
    unsigned sum = x;
    sum += (x >> 1) & 0x55555555;
    sum += x & 0xCCCCCCCC;
    sum += (x & 0xF0F0F0F0) << 1;
    sum += (x & 0xFF00FF00) << 2;
    sum += (x & 0xFFFF0000) << 3;
    return sum;
}
For an additional explanation, here is a more visual example. Let's temporarily drop the factor of 0.5 again, it doesn't fundamentally change the algorithm but adds some complication.
First I write above every bit of v (some example value), the position of that bit in binary (p0 is the least significant bit of the position, p1 the second bit etc). Read the ps vertically, every column is a number:
p0: 10101010101010101010101010101010
p1: 11001100110011001100110011001100
p2: 11110000111100001111000011110000
p3: 11111111000000001111111100000000
p4: 11111111111111110000000000000000
v : 00000000100001000000001000000000
So for example bit 9 is set, and it has (reading from bottom to top) 01001 above it (9 in binary).
What we want to do (why this works has been explained by Eric's answer), is take the indexes of the bits that are set, shift them to their corresponding positions, and add them. In this case, they are already at their own positions (by construction, the numbers were written at their own positions), so there is no shift, but they still need to be filtered so only the numbers that correspond to set bits survive. This is what I meant by the "single bit products": take a bit of v and multiply it by the corresponding bits of p0, p1, etc.
You can look at that as multiplying the bit value by its index as well so 2^bit * bit as mentioned in the comments. That is not how it is done here, but that is effectively what is done.
Back to the example, applying bitwise-AND results in these partial products:
pp0: 00000000100000000000001000000000
pp1: 00000000100001000000000000000000
pp2: 00000000100000000000000000000000
pp3: 00000000000000000000001000000000
pp4: 00000000100001000000000000000000
v : 00000000100001000000001000000000
The only values that are left are 01001, 10010, 10111, and they are at their corresponding positions (so, already shifted to where they need to go).
Those values must be added, while keeping them at their positions. They don't need to be extracted from the strange form which they are in; addition is freely reorderable (associative and commutative), so it's OK to add all the least significant bits of the partial products to the sum first, then all the second bits, and so on. But they have to be added with the right "weight": after all, a set bit in pp0 corresponds to a 1 at that position, but a set bit in pp1 really corresponds to a 2 at that position (since it's the second bit of the number that it is part of). So pp0 is used directly, but pp1 is shifted left by 1, pp2 is shifted left by 2, etc.
Then the factor of 0.5 must still be accounted for, which I did mostly by shifting the bits of the partial products by one less than what their weight would imply. pp0 would have been shifted left by zero, so it must be shifted right by 1 now. This could be done with less complication by just putting return sum >> 1; at the end, but that would reduce the range of values that the function can handle before running into integer wrapping modulo 2^32 (also it would cost an extra operation, and doing it the weird way does not).
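For comparison, here is a sketch of the "less complicated" variant mentioned above: each partial product is shifted by its full weight and the total is halved once at the end. To sidestep the range problem it accumulates in 64 bits, which is a choice made for this example rather than part of the answer; the function name is also made up.

```c
#include <stdint.h>

/* Sum of 2^trailing_zeros(i) for i = 1 .. x-1, for x >= 1.
   Each mask selects the set bits whose position has a given bit set;
   shifting by k applies that position bit's weight of 2^k. */
uint64_t sum_partial_products(uint32_t x)
{
    uint64_t v = x - 1u;
    uint64_t weighted = (v & 0xAAAAAAAAu)            /* p0, weight 1  */
                      + ((v & 0xCCCCCCCCu) << 1)     /* p1, weight 2  */
                      + ((v & 0xF0F0F0F0u) << 2)     /* p2, weight 4  */
                      + ((v & 0xFF00FF00u) << 3)     /* p3, weight 8  */
                      + ((v & 0xFFFF0000u) << 4);    /* p4, weight 16 */
    /* weighted = sum over set bits b of 2^b * b; halve it, then add v
       for the "1" factored out of (1 + b/2). */
    return v + (weighted >> 1);
}
```

Halving at the end loses nothing: the term for bit 0 is zero and every other term is even.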

Observe that if we count from 1 to x instead of to x−1, we have a pattern:
x    sum   sum/x
1    1     1
2    3     1.5
4    8     2
8    20    2.5
16   48    3
So we can easily calculate the sum for any power of two p as p • (1 + ½b), where b is the exponent (equivalently, the number of the bit that is set, or the log2 of p). We can see this by induction: if the sum from 1 to 2^b is 2^b•(1+½b) (which it is for b=0), then the sum from 1 to 2^(b+1) reprises the individual term contributions twice, except that the last term adds 2^(b+1) instead of 2^b, so the sum is 2•2^b•(1+½b) − 2^b + 2^(b+1) = 2^(b+1)•(1+½b) + ½•2^(b+1) = 2^(b+1)•(1+½(b+1)).
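The closed form for powers of two is easy to check numerically. A small sketch, with a made-up function name, that evaluates p • (1 + ½b) in integer arithmetic by rewriting it as 2^b + b•2^(b−1):

```c
#include <stdint.h>

/* Sum of 2^trailing_zeros(i) for i = 1 .. 2^b, via the closed form
   2^b * (1 + b/2) = 2^b + b * 2^(b-1). Valid for 0 <= b <= 31 here. */
uint64_t sum_up_to_power_of_two(unsigned b)
{
    uint64_t p = UINT64_C(1) << b;
    return p + (b * p) / 2;   /* b * p is even for b >= 1, zero for b = 0 */
}
```

For b = 0..4 this reproduces the "sum" values 1, 3, 8, 20, 48 listed above.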
Further, between any two powers of two, the lower bits reprise the previous partial sums. Thus, for any x, we can compute the cumulative sum by adding up the sums for the set bits in it. Since this gives the sum for numbers from 1 to x, we get the desired sum from 1 to x−1 by subtracting one from x before the computation:
#include <limits.h>  /* CHAR_BIT */
unsigned CountCumulative(unsigned x)
{
    --x;
    unsigned sum = 0;
    for (unsigned bit = 0; bit < sizeof x * CHAR_BIT; ++bit)
        sum += (x & 1u << bit) * (1 + bit * .5);
    return sum;
}
We can terminate the loop when x is exhausted:
unsigned CountCumulative(unsigned x)
{
    --x;
    unsigned sum = 0;
    for (unsigned bit = 0; x; ++bit, x >>= 1)
        sum += ((x & 1) << bit) * (1 + bit * .5);
    return sum;
}
As harold points out, we can factor out the 1, as summing the value of each bit of x equals x:
unsigned CountCumulative(unsigned x)
{
    --x;
    unsigned sum = x;
    for (unsigned bit = 0; x; ++bit, x >>= 1)
        sum += ((x & 1) << bit) * bit * .5;
    return sum;
}
Then eliminate the floating-point:
unsigned CountCumulative(unsigned x)
{
    unsigned sum = --x;
    for (unsigned bit = 0; x; ++bit, x >>= 1)
        sum += ((x & 1) << bit) / 2 * bit;
    return sum;
}
Note that when bit is zero, ((x & 1) << bit) / 2 will lose the fraction, but this is irrelevant as * bit makes the contribution zero anyway. For all other values of bit, (x & 1) << bit is even, so the division does not lose anything.
This will overflow unsigned at some point, so one might want to use a wider type for the calculations.
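One way to follow that suggestion, sketched with fixed-width types. It is the integer loop above with a 64-bit accumulator; the function name and the choice of uint64_t are assumptions for the example.

```c
#include <stdint.h>

/* Same loop as above, accumulating in 64 bits so the result cannot
   wrap for any 32-bit x >= 1. */
uint64_t count_cumulative_wide(uint32_t x)
{
    uint64_t rest = x - 1u;
    uint64_t sum = rest;
    for (unsigned bit = 0; rest; ++bit, rest >>= 1)
        sum += ((rest & 1) << bit) / 2 * bit;
    return sum;
}
```

For x = 0x80000001 the result is 33 • 2^30 = 35433480192, which does not fit in 32 bits.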
More Code Golf
Another way to add half the values of the bits of x repeatedly depending on their bit position is to shift x (to halve its bit values) and then add that repeatedly while removing successive bits from low to high:
unsigned CountCumulative(unsigned x)
{
    unsigned sum = --x;
    for (unsigned bit = 0; x >>= 1; ++bit)
        sum += x << bit;
    return sum;
}

GROUPING_ID functionality

I don't quite understand how this function works, I've been looking over the below documentation, and I have some issues.
http://msdn.microsoft.com/en-us/library/bb510624.aspx
So, I understand how GROUPING() works perfectly, but I cannot work out how the output of GROUPING_ID is produced, because it does not match the explanation.
For example, I have the following string of ones and zeroes: 010. The documentation says it is equal to 2. I also read in a SQL book that each bit (starting from the rightmost) is equal to 2 raised to the power of the bit position minus one.
So, (2^2 - 1) + (2^1 - 1 ) + (2^0 - 1), but isn't that the same for each binary number? (100/101/110/etc), and the result isn't 2 either....
EDIT 1 :
This is how the explanation from the book is:
Another function that you can use to identify the grouping sets is GROUPING_ID. This
function accepts the list of grouped columns as inputs and returns an integer representing a
bitmap. The rightmost bit represents the rightmost input. The bit is 0 when the respective element
is part of the grouping set and 1 when it isn’t. Each bit represents 2 raised to the power
of the bit position minus 1; so the rightmost bit represents 1, the one to the left of it 2, then 4,
then 8, and so on. The result integer is the sum of the values representing elements that are
not part of the grouping set because their bits are turned on. Here’s a query demonstrating
the use of this function.
There has to be an error, because there is no way a number is calculated by 2^(position) - 1. Is it a mistake? I've been calculating with 2^(bitposition) * 1 and the outputs are correct.
For example, I've done this:
GROUPING_ID(a,b,c),
GROUPING(a),
GROUPING(b),
GROUPING(c)
And let's say we have the following output
3, 0, 1, 1
So our binary string is 011 and 3 is the output of the GROUPING_ID function. If we calculate the string:
2^0 * 1 + 2^1 * 1 + 2^2 * 0 = 1 + 2 + 0 = 3
There is no other logic that I can see here; I cannot calculate with the minus as the quote above says. The thing is that the odd-looking definition on MSDN seems to be somewhat similar to this one:
Each GROUPING_ID argument must be an element of the GROUP BY list. GROUPING_ID () returns an integer bitmap whose lowest N bits may be lit.
A lit bit indicates the corresponding argument is not a grouping column for the given output row. The lowest-order bit corresponds to argument N, and the N-1th lowest-order bit corresponds to argument 1.

First of all, when they say
Each bit represents 2 raised to the power of the bit position minus 1
they do not mean 2^position - 1 but rather 2^(position - 1). Apparently, for the purpose of their description they chose to count bits from 1 (for the rightmost bit) rather than from 0.
Secondly, each bit represents the said value when it is set, i.e. when it is 1. So, naturally, you do not do just
2^(1 - 1) + 2^(2 - 1) + ... + 2^(N - 1)
but rather
bit1 × 2^(1 - 1) + bit2 × 2^(2 - 1) + ... + bitN × 2^(N - 1)
which is the normal way of converting the binary representation to the decimal one and is also the method you have shown near the end of your question.
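The same arithmetic as a short C sketch. This is not how SQL Server implements GROUPING_ID; it only illustrates how a list of GROUPING()-style flags combines into one integer when the rightmost argument maps to the lowest bit. The function name and the array representation are assumptions.

```c
#include <stddef.h>

/* Combines GROUPING()-style flags (each 0 or 1) into one integer.
   flags[0] is the leftmost argument, so the last flag lands in the
   lowest bit, matching the book's description. Hypothetical helper. */
unsigned grouping_bitmap(const int *flags, size_t count)
{
    unsigned result = 0;
    for (size_t i = 0; i < count; ++i)
        result = (result << 1) | (flags[i] ? 1u : 0u);
    return result;
}
```

With the flags 0, 1, 1 from the question this returns 3, and with 0, 1, 0 it returns 2.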

Let's say we have the binary number 0101.
Going from right to left:
1 -> (2^0 * 1) = 1
0 -> (2^1 * 0) = 0
1 -> (2^2 * 1) = 4
0 -> (2^3 * 0) = 0
If we sum all the results we get 5,
so 0101 (binary) = 5 (decimal).

C - Fast conversion between binary and hex representations

When reading or writing C code, I often have difficulty translating numbers between their binary and hex representations. Masks like 0xAAAA5555 are used very often in low-level programming, but it's difficult to recognize the bit pattern they represent. Is there an easy-to-remember rule for doing this quickly in one's head?

Each hex digit maps onto exactly 4 bits. I usually keep in mind the 8-4-2-1 weights of each of these bits, so it is very easy to do the conversion even in your head, i.e.
A = 10 = 8+2 = 1010 ...
5 = 4+1 = 0101
Just keep the 8-4-2-1 weights in mind.
   A          5
8 4 2 1    8 4 2 1
1 0 1 0    0 1 0 1
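The 8-4-2-1 rule translates directly into code. A small sketch (the function name is made up) that turns one hex digit into its four-bit pattern by testing each weight in turn:

```c
#include <ctype.h>

/* Writes the 4-bit pattern of one hex digit into out (5 bytes incl. NUL).
   Returns 0 on success, -1 if c is not a hex digit. */
int hex_digit_to_bits(char c, char out[5])
{
    int value;
    if (isdigit((unsigned char)c))
        value = c - '0';
    else if (isxdigit((unsigned char)c))
        value = tolower((unsigned char)c) - 'a' + 10;
    else
        return -1;

    /* Test the 8, 4, 2, 1 weights from left to right. */
    for (int i = 0; i < 4; ++i)
        out[i] = (value & (8 >> i)) ? '1' : '0';
    out[4] = '\0';
    return 0;
}
```

Applied to 'A' it produces "1010" and applied to '5' it produces "0101".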

I always find it easy to map HEX to BINARY numbers. Since each hex digit can be directly mapped to a four-digit binary number, you can think of:
> 0xA4
As
> b 1010 0100
> ---- ---- (4 binary digits for each part)
> A 4

The conversion is calculated by dividing the base 10 representation by 2 and stringing the remainders in reverse order. I do this in my head, seems to work.
So you say what does 0xAAAA5555 look like
I just work out what A looks like and 5 looks like by doing
A = 10
10 / 2 = 5 r 0
5 / 2 = 2 r 1
2 / 2 = 1 r 0
1 / 2 = 0 r 1
so I know the A's look like 1010 (Note that 4 fingers are a good way to remember the remainders!)
You can string blocks of 4 bits together, so A A is 1010 1010. To convert binary back to hex, I always go through base 10 again by summing up the powers of 2. You can do this by forming blocks of 4 bits (padding with 0s) and string the results.
so 111011101 is 0001 1101 1101 which is (1) (1 + 4 + 8) (1 + 4 + 8) = 1 13 13 which is 1DD
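The reverse direction, grouping bits in fours from the right, can be sketched as follows. The function name and the string-based interface are assumptions made for the example, and input validation is left out.

```c
#include <stddef.h>
#include <string.h>

/* Converts a string of '0'/'1' characters to uppercase hex by reading
   4-bit groups from the right; a short leftmost group is implicitly
   zero-padded. out must hold at least (strlen(bits) + 3) / 4 + 1 bytes.
   Assumes bits is non-empty and contains only '0' and '1'. */
void binary_to_hex(const char *bits, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t len = strlen(bits);
    size_t groups = (len + 3) / 4;
    size_t pos = 0;                          /* index into bits */
    size_t first = len - (groups - 1) * 4;   /* size of leftmost group, 1..4 */

    for (size_t g = 0; g < groups; ++g) {
        size_t width = (g == 0) ? first : 4;
        unsigned value = 0;
        for (size_t k = 0; k < width; ++k)
            value = value * 2 + (unsigned)(bits[pos++] - '0');
        out[g] = digits[value];
    }
    out[groups] = '\0';
}
```

For the input "111011101" this yields "1DD", the same result as the manual conversion above.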

Bit Hack - Round off to multiple of 8

can anyone please explain how this works (asz + 7) & ~7; It rounds off asz to the next higher multiple of 8.
It is easy to see that ~7 produces 11111000 (8bit representation) and hence switches off the last 3 bits ,thus any number which is produced is a multiple of 8.
My question is how does adding asz to 7 before masking [edit] produce the next higher[end edit] multiple of 8 ? I tried writing it down on paper
like :
1 + 7 = 8 = 1|000 (& ~7) -> 1000
2 + 7 = 9 = 1|001 (& ~7) -> 1000
3 + 7 = 10 = 1|010 (& ~7) -> 1000
4 + 7 = 11 = 1|011 (& ~7) -> 1000
5 + 7 = 12 = 1|100 (& ~7) -> 1000
6 + 7 = 13 = 1|101 (& ~7) -> 1000
7 + 7 = 14 = 1|110 (& ~7) -> 1000
8 + 7 = 15 = 1|111 (& ~7) -> 1000
A pattern clearly seems to emerge which has been exploited. Can anyone please help me figure it out?
Thank you all for the answers. They helped confirm what I was thinking. I continued writing out the pattern above, and when I crossed 10, I could clearly see that the numbers are promoted to the next "block of 8", if I can say so.
Thanks again.

Well, if you were trying to round down, you wouldn't need the addition. Just doing the masking step would clear out the bottom bits and you'd get rounded to the next lower multiple.
If you want to round up, first you have to add enough to "get past" the next multiple of 8. Then the same masking step takes you back down to the multiple of 8. The reason you choose 7 is that it's the only number guaranteed to be "big enough" to get you from any number up past the next multiple of 8 without going up an extra multiple if your original number were already a multiple of 8.
In general, to round up to a power of two:
unsigned int roundTo(unsigned int value, unsigned int roundTo)
{
    return (value + (roundTo - 1)) & ~(roundTo - 1);
}
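To see the difference between masking alone and add-then-mask side by side, here is a small sketch with both directions. The names are made up, and both functions assume the alignment is a power of two.

```c
/* Rounds value down to a multiple of align (align assumed to be a
   power of two): masking clears the low bits. */
unsigned round_down_pow2(unsigned value, unsigned align)
{
    return value & ~(align - 1u);
}

/* Rounds value up to the next multiple of align (a power of two).
   Adding align - 1 first pushes any value that is not already a
   multiple past the next boundary; the result wraps if the addition
   exceeds the range of unsigned. */
unsigned round_up_pow2(unsigned value, unsigned align)
{
    unsigned mask = align - 1u;   /* e.g. 7 for align = 8 */
    return (value + mask) & ~mask;
}
```

For example, round_down_pow2(9, 8) is 8 while round_up_pow2(9, 8) is 16, and a value that is already a multiple, such as 8, is left unchanged by both.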

It's actually adding 7 to the number and rounding down.
This has the desired effect of rounding up to the next multiple of 8. (Adding +8 instead of +7 would bump a value of 8 to 16.)

The +7 isn't to produce an exact multiple of 8, it's to make sure you get the next highest multiple of eight.
edit: Beaten by 16 seconds and several orders of quality. Oh well, back to lurking.

Well, the mask would produce an exact multiple of 8 by itself. Adding 7 to asz ensures that you get the next higher multiple.

Without the +7 it will be the biggest multiple of 8 less or equal to your orig number

Adding 7 does not produce a multiple of 8. The multiple of 8 is produced by anding with ~7. ~7 is the complement of 7, which is 0xffff fff8 (except using however many bits are in an int). This truncates, or rounds down.
Adding 7 before doing that ensures that no value lower than asz is returned. You've already worked out how that works.

Uhh, you just answered your own question??? by adding 7, you are guaranteeing the result will be at or above the next multiple of 8. truncating then gives you that multiple.

k&r exercise confusion with bit-operations

The exercise is:
Write a function setbits(x,p,n,y) that returns x with the n bits that begin at position p set to the rightmost n bits of y, leaving the other bits unchanged.
My attempt at a solution is:
#include <stdio.h>
unsigned setbits(unsigned, int, int, unsigned);
int main(void)
{
    printf("%u\n", setbits(256, 4, 2, 255));
    return 0;
}
unsigned setbits(unsigned x, int p, int n, unsigned y)
{
    return (x >> (p + 1 - n)) | (1 << (n & y));
}
It's probably incorrect, but am I on the right path here? If not, what am I doing wrong? I'm unsure as to why I don't perfectly understand this, but I spent about an hour trying to come up with this.
Thanks.

Here's your algorithm:
If n is 0, return x.
Take 1, left shift it n times and then subtract 1. Call this mask.
Left shift mask p times; call this mask2.
AND x with the inverse of mask2. AND y with mask, and left shift the result p times.
OR the results of those two operations, and return that value.

I think the answer is a slightly modified application of the getbits example from section 2.9.
Lets break it down as follows:
Let bitstring x be 1 0 1 1 0 0
Let bitstring y be 1 0 1 1 1 1
positions -------->5 4 3 2 1 0
Setting p = 4 and n =3 gives us the bitstring from x which is 0 1 1. It starts at 4 and ends at 2 and spans 3 elements.
What we want to do is to replace 0 1 1 with 1 1 1(the last three elements of bitstring y).
Lets forget about left-shift/right-shift for the moment and visualize the problem as follows:
We need to grab the last three digits from bitstring y which is 1 1 1
Place 1 1 1 directly under positions 4 3 and 2 of bitstring x.
Replace 0 1 1 with 1 1 1 while keeping the rest of the bits intact...
Now lets go into a little more detail...
My first statement was:
We need to grab the last three digits from bitstring y which is 1 1 1
The way to isolate bits from a bitstring is to first start with bitstring that has all 0s.
We end up with 0 0 0 0 0 0.
0s have this incredible property where bitwise '&'ing it with another number gives us all 0s and bitwise '|'ing it with another number gives us back that other number.
0 by itself is of no use here...but it tells us that if we '|' the last three digits of y with a '0', we will end up with 1 1 1. The other bits in y don't really concern us here, so we need to figure out a way to zero out those numbers while keeping the last three digits intact. In essence we need the number 0 0 0 1 1 1.
So lets look at the series of transformations required:
Start with -> 0 0 0 0 0 0
apply ~0 -> 1 1 1 1 1 1
lshift by 3 -> 1 1 1 0 0 0
apply ~ -> 0 0 0 1 1 1
& with y -> 0 0 0 1 1 1 & 1 0 1 1 1 1 -> 0 0 0 1 1 1
And this way we have the last three digits to be used for setting purposes...
My second statement was:
Place 1 1 1 directly under positions 4 3 and 2 of bitstring x.
A hint for doing this can be found from the getbits example in section 2.9. What we know about positions 4,3 and 2, can be found from the values p = 4 and n =3. p is the position and n is the length of the bitset. Turns out p+1-n gives us the offset of the bitset from the rightmost bit. In this particular example p+1-n = 4 +1-3 = 2.
So..if we do a left shift by 2 on the string 0 0 0 1 1 1, we end up with 0 1 1 1 0 0. If you put this string under x, you will notice that 1 1 1 aligns with positions 4 3 and 2 of x.
I think I am finally getting somewhere...the last statement I made was..
Replace 0 1 1 with 1 1 1 while keeping the rest of the bits intact...
Lets review our strings now:
x -> 1 0 1 1 0 0
isolated y -> 0 1 1 1 0 0
Doing a bitwise or on these two values gives us what we need for this case:
1 1 1 1 0 0
But this would fail if instead of 1 1 1, we had 1 0 1...so if we need to dig a little more to get to our "silver-bullet"...
Lets look at the above two strings one more time...
x -> bit by bit...1(stays) 0(changes) 1(changes) 1(changes) 0(stays) 0(stays)
So ideally..we need the bitstring 1 x x x 0 0, where the x's will be swapped with 1's.
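Here's a leap of intuition that will help us..
Bitwise complement of the shifted mask 0 1 1 1 0 0 (here identical to isolated y) -> 1 0 0 0 1 1
& this with x gives us -> 1 0 0 0 0 0
| this with isolated y -> 1 1 1 1 0 0 (TADA!)
Note that it is the mask that must be complemented, not isolated y: with 1 0 1 as the new bits, complementing isolated y (0 1 0 1 0 0) would leave the old middle bit of x in place.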
Hope this long post helps people with rationalizing and solving such bitmasking problems...
Thanks
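The walkthrough can be traced in code. The sketch below records each intermediate value so it can be compared with the bitstrings above; the struct and function names are made up for illustration, and it assumes 1 <= n <= p + 1 with n smaller than the width of unsigned.

```c
/* Walks through the steps above and stores each intermediate result
   so it can be inspected. */
struct setbits_trace {
    unsigned low_mask;     /* ~(~0 << n): lowest n bits set      */
    unsigned isolated_y;   /* low n bits of y, moved under x     */
    unsigned cleared_x;    /* x with the target field zeroed     */
    unsigned result;       /* cleared_x | isolated_y             */
};

struct setbits_trace trace_setbits(unsigned x, int p, int n, unsigned y)
{
    struct setbits_trace t;
    int shift = p + 1 - n;                 /* offset of the field from bit 0 */
    t.low_mask   = ~(~0u << n);
    t.isolated_y = (y & t.low_mask) << shift;
    /* Clear the whole field in x with the complement of the shifted
       mask (in the example above, the shifted mask and isolated y are
       both 0 1 1 1 0 0). */
    t.cleared_x  = x & ~(t.low_mask << shift);
    t.result     = t.cleared_x | t.isolated_y;
    return t;
}
```

With x = 1 0 1 1 0 0, y = 1 0 1 1 1 1, p = 4 and n = 3, the trace holds 0 0 0 1 1 1, 0 1 1 1 0 0, 1 0 0 0 0 0 and 1 1 1 1 0 0 in that order.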

Note that ~0 << i gives you a number with the least significant i bits set to 0, and the rest of the bits set to 1. Similarly, ~(~0 << i) gives you a number with the least significant i bits set to 1, and the rest to 0.
Now, to solve your problem:
1. First, you want a number that has all the bits except the n bits that begin at position p set to the bits of x. For this, you need a mask that consists of 1s in all the places except the n bits beginning at position p:
   - this mask has the topmost (most significant) bits set, starting with the bit at position p+1.
   - this mask also has the least significant p+1-n bits set.
2. Once you have the above mask, & of this mask with x will give you the number you wanted in step 1.
3. Now, you want a number that has the least significant n bits of y set, shifted left p+1-n bits.
   3.1. You can easily make a mask that has only the least significant n bits set, and & it with y to extract y's least significant n bits.
   3.2. Then, you can shift this number left by p+1-n bits.
4. Finally, you can bitwise-or (|) the results of steps 2 and 3.2 to get your number.
Clear as mud? :-)
(The above method should be independent of the size of the numbers, which I think is important.)
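Put together, and with the width taken from the type rather than hard-coded, the steps might look like the sketch below. It is one possible reading of the exercise, not the book's solution: p is taken to be the position of the field's highest bit, as in the getbits example, and the two guards are additions that avoid shifting by the full width of the type.

```c
#include <limits.h>

/* Sketch of setbits under the K&R convention: the field is n bits wide
   and its highest bit is at position p (bit 0 is the rightmost).
   Assumes n <= p + 1 and p < width of unsigned. */
unsigned setbits(unsigned x, int p, int n, unsigned y)
{
    const int width = (int)(sizeof(unsigned) * CHAR_BIT);
    if (n <= 0)
        return x;                          /* nothing to replace */
    /* ~(~0u << n) is undefined when n equals the width, so special-case it. */
    unsigned low  = (n >= width) ? ~0u : ~(~0u << n);
    int shift     = p + 1 - n;
    unsigned mask = low << shift;          /* 1s exactly over the field */
    return (x & ~mask) | ((y & low) << shift);
}
```

For the call in the question, setbits(256, 4, 2, 255), this returns 280: bits 4 and 3 of 256 are replaced by the two lowest bits of 255.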
Edit: looking at your effort: n & y doesn't do anything with n bits. For example, if n is 8, you want the last 8 bits of y, but n & y will just pick the 4th bit of y (8 in binary is 1000). So you know that can't be right. Similarly, right-shifting x p+1-n times gives you a number that has the most significant p+1-n bits set to zero and the rest of the bits are made of the most significant bits of x. This isn't what you want either.
