Assuming that x is a positive integer, the following function returns 1 if x is a certain type
of value; otherwise it returns 0.
int mystery(int x) {
    return !((x-1) & x);
}
What does mystery(20) return?
May I ask how to approach this type of question?
My idea is to express x in binary and apply the bitwise operations to it.
Do correct me if I am wrong, thanks!
Let's work from the outside in.
!(expression)
You will get a 0 if expression is true, that is, non-zero,
and you will get a 1 if expression is false, that is, zero.
So when will expression be non-zero, giving a zero as a result? Whenever (x-1) has some bits in common with x.
What are examples?
0 - 1 = 0xfff... & 0, no bits in common, returns 1
1 - 1 = 0 & 1, no bits in common, returns 1
2 - 1 = 1 & 2, no bits in common, returns 1
3 - 1 = 2 & 3, bits in common, returns 0
4 - 1 = 3 & 4, no bits in common, returns 1
5 - 1 = 4 & 5, bits in common, returns 0
6 - 1 = 5 & 6, bits in common, returns 0
7 - 1 = 6 & 7, bits in common, returns 0
8 - 1 = 7 & 8, no bits in common, returns 1
It looks to me like we can say it returns 1 when the binary representation has exactly zero or one bits turned on in it.
0 or 1 or 10 or 100
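To answer the original question directly: 20 is 10100 in binary and 19 is 10011, so 19 & 20 is 10000, which is non-zero, and mystery(20) returns 0. Here is a minimal sketch for checking this kind of thing by hand. The function is copied from the question; the helper name check_mystery is made up for illustration.

```c
#include <assert.h>

/* The function from the question: 1 when x has at most one bit set. */
int mystery(int x) {
    return !((x - 1) & x);
}

/* Hypothetical helper (not in the original post): spot-check a few inputs. */
void check_mystery(void) {
    assert(mystery(16) == 1);  /* 10000 & 01111 == 00000 */
    assert(mystery(20) == 0);  /* 10100 & 10011 == 10000 */
    assert(mystery(1) == 1);   /* 1 & 0 == 0 */
}
```

Since x is stated to be positive, "zero or one bits set" reduces to "x is a power of two".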
Edit: I wish SO let me accept 2 answers because neither is complete without the other. I suggest reading both!
I am trying to come up with a fast implementation of a function that given an unsigned 32-bit integer x returns the sum of 2^trailing_zeros(i) for i=1..x-1, where trailing_zeros is the count trailing zeros operation which is defined as returning the 0 bits after the least significant 1 bit. This seems like the kind of problem that should lend itself to a clever bit manipulation implementation that takes the same number of instructions regardless of the input, but I haven't been able to derive it.
Mathematically, 2^trailing_zeros(i) is equivalent to the largest factor of 2 that exactly divides i. So we are summing those largest factors for 1..x-1.
i                    |   1   2   3   4   5   6   7   8   9  10
---------------------+----------------------------------------
2^trailing_zeroes(i) |   1   2   1   4   1   2   1   8   1   2
---------------------+----------------------------------------
Sum (desired value)  |   0   1   3   4   8   9  11  12  20  21
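For checking any faster version against the table, a brute-force reference can be sketched as below. It relies on the standard identity that i & -i isolates the lowest set bit of i, which is exactly 2^trailing_zeros(i) for i > 0. The function names are made up for this sketch, and a 64-bit accumulator is used so the sum cannot wrap.

```c
#include <assert.h>
#include <stdint.h>

/* 2^trailing_zeros(i) for i > 0: isolates the lowest set bit. */
uint32_t lowest_set_bit(uint32_t i) {
    assert(i != 0);
    return i & (~i + 1u);   /* same as i & -i in two's complement */
}

/* Reference: sum of 2^trailing_zeros(i) for i = 1 .. x-1, in O(x) time. */
uint64_t sum_brute_force(uint32_t x) {
    uint64_t sum = 0;
    for (uint32_t i = 1; i < x; ++i)
        sum += lowest_set_bit(i);
    return sum;
}
```

This is only useful as a test oracle; it is far too slow for large x.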
It is a little easier to see the structure of 2^trailing_zeroes(i) if we 'plot' the values -- horizontal position increasing from left to right corresponding to i and vertical position increasing from top to bottom corresponding to trailing_zeroes(i).
  1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1
     2           2           2           2           2           2           2           2
           4                       4                       4                       4
                       8                                               8
                                              16
                                                                                              32
(columns i = 1..32 shown; the pattern continues to the right and downward with 64, 128, ...)
Here it is easier to see the pattern that 2's are always 4 apart, 8's are always 16 apart, etc. However, each pattern starts at a different time -- 8's don't begin until i=8, 16 doesn't begin until i=16, etc. If you don't take into account that the patterns don't start right away you can come up with formulas that don't work -- for example you might think to determine the number of 8's going into the total you should just compute floor(x/16) but i=25 is far enough to the right to include both of the first two 8s.
The best solution I have come up with so far is:
Set n = floor(log2(x)). This can be computed quickly using the count leading zeros operation. This tells us the highest power of two that is going to be involved in the sum.
Set sum = 0
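for i = 0..n
    sum += floor((x - 2^i) / 2^(i+1))*2^i + 2^i
(As written this counts the terms for 1..x inclusive, so it is called with x-1 to get the desired value.)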
The way this works is that, for each power, it calculates the horizontal distance on the plot between x and the first appearance of that power, e.g. the distance between x and the first 8 is (x-8). It then divides by the distance between repeating instances of that power, e.g. floor((x-8)/16), which gives us how many more times that power appeared, and multiplies by the power to get the sum for that power, e.g. floor((x-8)/16)*8. Then we add one instance of the given power because that calculation excludes the very first time that power appears.
In practice this implementation should be pretty fast because the division/floor can be done by right bit shift and powers of two can be done with 1 bit-shifted to the left. However it seems like it should still be possible to do better. This implementation will loop more for larger inputs, up to 32 times (it's O(log2(x)); ideally we want O(1) without a gigantic lookup table using up all the CPU cache). I've been eyeing the BMI/BMI2 intrinsics but I don't see an obvious way to apply them.
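For reference, the per-power loop can be sketched in C with shifts roughly like this. It is a sketch under a few assumptions: the name sum_by_powers is made up, the loop bound replaces the explicit floor(log2(x)) computation, x is assumed to be at least 1, and the x-1 adjustment is folded in so that the result is the sum over i = 1..x-1.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of 2^trailing_zeros(i) for i = 1 .. x-1, counting how often each
   power of two occurs. Assumes x >= 1. */
uint64_t sum_by_powers(uint32_t x) {
    assert(x >= 1);
    uint64_t last = (uint64_t)x - 1;   /* largest i included in the sum */
    uint64_t sum = 0;
    for (unsigned k = 0; (1ull << k) <= last; ++k) {
        uint64_t power = 1ull << k;
        /* This power first appears at i = power and then repeats every
           2 * power, so count the repeats and add the first occurrence. */
        uint64_t count = ((last - power) >> (k + 1)) + 1;
        sum += count * power;
    }
    return sum;
}
```

For x = 10 the loop adds 5*1, 2*2, 1*4 and 1*8, giving 21, which matches the table.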
Although my goal is to implement this in a compiled language like C++ or Rust with real bit shifting and intrinsics, I've been prototyping in Python. Included below is my script that includes the implementation I described, z(x), and the code for generating the plot, tower(x).
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from math import pow, floor, log, ceil


def trailing_zeros(x):
    # Number of 0 bits after the least significant 1 bit of x.
    return len(bin(x).split('b')[-1].split('1')[-1])


def f(x):
    # Brute-force reference: sum of 2^trailing_zeros(i) for i = 1..x-1.
    s = 0
    for i in range(1, x):
        s += pow(2, trailing_zeros(i))
    return s


def g(x):
    # Sum over each power of two; intended to match f(x).
    return sum([pow(2,i)*floor((x+pow(2,i)-1)/pow(2,i+1)) for i in range(0,32)])


def h(x):
    # Same computation as g(x), written out step by step for debugging.
    s = 0
    extra = 0
    extra_s = 0
    for i in range(0,32):
        num = (x+pow(2,i)-1)
        den = pow(2,i+1)
        fraction = num/den
        floored = floor(num/den)
        power = pow(2,i)
        product = power*floored
        if product == 0:
            break
        s += product
        extra += (fraction - floored)
        extra_s += power*fraction
        #print(f"i={i} s={s} num={num} den={den} fraction={fraction} floored={floored} power={power} product={product} extra={extra} extra_s={extra_s}")
    return s


def z(x):
    # The loop described above. It sums over i = 1..x inclusive,
    # so z(x-1) is the value that corresponds to f(x).
    upper_bound = floor(log(x,2)) if x > 0 else 0
    s = 0
    for i in range(upper_bound+1):
        num = (x - pow(2,i))
        den = pow(2,i+1)
        fraction = num/den
        floored = floor(fraction)
        added = pow(2,i)
        s += floored * added
        s += added
        print(f"i={i} s={s} upper_bound={upper_bound} num={num} den={den} floored={floored} added={added}")
    return s
    # return sum([floor((x - pow(2,i))/pow(2,i+1) + pow(2,i)) for i in range(floor(log(x, 2)))])


def tower(x):
    # Print the plot: row p holds 2^p in every column i with trailing_zeros(i) == p.
    table = [[" " for i in range(x)] for j in range(ceil(log(x,2)))]
    for i in range(1,x):
        p = trailing_zeros(i)
        table[p][i] = 2**p
    for row in table:
        for col in row:
            print(col,end='')
        print()


# h(9000)
for i in range(1,16):
    tower(i)
    print((i, f(i), g(i), h(i), z(i-1)))
Based on the method of Eric Postpischil, here is a way to do it without a loop.
Note that every bit is being multiplied by its position, and the results are summed (sort of, except there is also a factor of 0.5 in it, let's put that aside for now). Let's call those values that are being added up "the partial products" just to call them something, it's not really accurate to call them that, I can't come up with anything better. If we transpose that a little bit, then it's built up like this: the lowest bit of every partial product is the lowest bit of the position of every bit multiplied by that bit. Single-bit-products are bitwise-AND, and the values of the lowest bits of the positions are 0,1,0,1 etc, so it works out to x & 0xAAAAAAAA, the second bit of every partial product is x & 0xCCCCCCCC (and has a "weight" of 2, so this must be multiplied by 2) etc.
Then the whole thing needs to be shifted right by 1, to account for the factor of 0.5
So in total:
unsigned CountCumulativeTrailingZeros(unsigned x)
{
    --x;
    unsigned sum = x;
    sum += (x >> 1) & 0x55555555;
    sum += x & 0xCCCCCCCC;
    sum += (x & 0xF0F0F0F0) << 1;
    sum += (x & 0xFF00FF00) << 2;
    sum += (x & 0xFFFF0000) << 3;
    return sum;
}
For an additional explanation, here is a more visual example. Let's temporarily drop the factor of 0.5 again, it doesn't fundamentally change the algorithm but adds some complication.
First I write above every bit of v (some example value), the position of that bit in binary (p0 is the least significant bit of the position, p1 the second bit etc). Read the ps vertically, every column is a number:
p0: 10101010101010101010101010101010
p1: 11001100110011001100110011001100
p2: 11110000111100001111000011110000
p3: 11111111000000001111111100000000
p4: 11111111111111110000000000000000
v : 00000000100001000000001000000000
So for example bit 9 is set, and it has (reading from bottom to top) 01001 above it (9 in binary).
What we want to do (why this works has been explained by Eric's answer), is take the indexes of the bits that are set, shift them to their corresponding positions, and add them. In this case, they are already at their own positions (by construction, the numbers were written at their own positions), so there is no shift, but they still need to be filtered so only the numbers that correspond to set bits survive. This is what I meant by the "single bit products": take a bit of v and multiply it by the corresponding bits of p0, p1, etc.
You can look at that as multiplying the bit value by its index as well so 2^bit * bit as mentioned in the comments. That is not how it is done here, but that is effectively what is done.
Back to the example, applying bitwise-AND results in these partial products:
pp0: 00000000100000000000001000000000
pp1: 00000000100001000000000000000000
pp2: 00000000100000000000000000000000
pp3: 00000000000000000000001000000000
pp4: 00000000100001000000000000000000
v : 00000000100001000000001000000000
The only values that are left are 01001, 10010, 10111, and they are at their corresponding positions (so, already shifted to where they need to go).
Those values must be added, while keeping them at their positions. They don't need to be extracted from the strange form which they are in; addition is freely reorderable (associative and commutative), so it's OK to add all the least significant bits of the partial products to the sum first, then all the second bits, and so on. But they have to be added with the right "weight": after all, a set bit in pp0 corresponds to a 1 at that position, but a set bit in pp1 really corresponds to a 2 at that position (since it's the second bit of the number that it is part of). So pp0 is used directly, but pp1 is shifted left by 1, pp2 is shifted left by 2, etc.
Then the factor of 0.5 must still be accounted for, which I did mostly by shifting over the bits of the partial products by one less than what their weight would imply. pp0 was shifted left by zero, so it must be shifted right by 1 now. This could be done with less complication by just putting return sum >> 1; at the end, but that would reduce the range of values that the function can handle before running into integer wrapping modulo 2^32 (also it would cost an extra operation, and doing it the weird way does not).
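For comparison, here is a sketch of the "less complicated" variant: every partial product gets its full weight and the factor of 0.5 is applied once at the end. To sidestep the reduced range it accumulates in 64 bits, which is an assumption of this sketch and not part of the code above. The name sum_masked is made up, and x is assumed to be at least 1.

```c
#include <assert.h>
#include <stdint.h>

/* Same result as the masked version above, but with full weights and a
   single halving at the end. Assumes x >= 1. */
uint64_t sum_masked(uint32_t x) {
    assert(x >= 1);
    uint64_t v = (uint64_t)x - 1;
    uint64_t weighted = 0;
    weighted += v & 0xAAAAAAAAull;          /* pp0, weight 1  */
    weighted += (v & 0xCCCCCCCCull) << 1;   /* pp1, weight 2  */
    weighted += (v & 0xF0F0F0F0ull) << 2;   /* pp2, weight 4  */
    weighted += (v & 0xFF00FF00ull) << 3;   /* pp3, weight 8  */
    weighted += (v & 0xFFFF0000ull) << 4;   /* pp4, weight 16 */
    /* Each set bit b contributed 2^b * b, which is even, so the
       halving below loses nothing. */
    return v + (weighted >> 1);
}
```

For x = 10, v is 1001 in binary: pp0 contributes 8, pp1 contributes 16, the halved total is 12, and 9 + 12 = 21.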
Observe that if we count from 1 to x instead of to x−1, we have a pattern:
x     sum   sum/x
1     1     1
2     3     1.5
4     8     2
8     20    2.5
16    48    3
So we can easily calculate the sum for any power of two p as p • (1 + ½b), where b is the exponent (equivalently, the number of the bit that is set, or the log2 of p). We can see this by induction: if the sum from 1 to 2^b is 2^b•(1+½b) (which it is for b=0), then the sum from 1 to 2^(b+1) reprises the individual term contributions twice, except that the last term adds 2^(b+1) instead of 2^b, so the sum is 2•2^b•(1+½b) − 2^b + 2^(b+1) = 2^(b+1)•(1+½b) + ½•2^(b+1) = 2^(b+1)•(1+½(b+1)).
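As a quick sanity check of the closed form against the table, here is a small sketch in integer arithmetic. The function name is made up, and the exponent is assumed to be below 32.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of 2^trailing_zeros(i) for i = 1 .. 2^b, using p * (1 + b/2)
   with p = 2^b. Assumes b < 32. */
uint64_t sum_to_power_of_two(unsigned b) {
    assert(b < 32);
    uint64_t p = 1ull << b;
    /* p * b / 2 is written as (p / 2) * b. For b == 0 the halving
       truncates to 0, but the term is multiplied by 0 anyway. */
    return p + (p >> 1) * b;
}
```

For b = 0, 1, 2, 3, 4 this gives 1, 3, 8, 20, 48, the same values as the sum column.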
Further, between any two powers of two, the lower bits reprise the previous partial sums. Thus, for any x, we can compute the cumulative sum by adding up the sums for the set bits in it. Recalling that this provides the sum for numbers from 1 to x, we adjust to get the desired sum from 1 to x−1 by subtracting one from x before the computation:
#include <limits.h>  /* for CHAR_BIT */

unsigned CountCumulative(unsigned x)
{
    --x;
    unsigned sum = 0;
    for (unsigned bit = 0; bit < sizeof x * CHAR_BIT; ++bit)
        sum += (x & (1u << bit)) * (1 + bit * .5);
    return sum;
}
We can terminate the loop when x is exhausted:
unsigned CountCumulative(unsigned x)
{
    --x;
    unsigned sum = 0;
    for (unsigned bit = 0; x; ++bit, x >>= 1)
        sum += ((x & 1) << bit) * (1 + bit * .5);
    return sum;
}
As harold points out, we can factor out the 1, as summing the value of each bit of x equals x:
unsigned CountCumulative(unsigned x)
{
    --x;
    unsigned sum = x;
    for (unsigned bit = 0; x; ++bit, x >>= 1)
        sum += ((x & 1) << bit) * bit * .5;
    return sum;
}
Then eliminate the floating-point:
unsigned CountCumulative(unsigned x)
{
    unsigned sum = --x;
    for (unsigned bit = 0; x; ++bit, x >>= 1)
        sum += ((x & 1) << bit) / 2 * bit;
    return sum;
}
Note that when bit is zero, ((x & 1) << bit) / 2 will lose the fraction, but this is irrelevant as * bit makes the contribution zero anyway. For all other values of bit, (x & 1) << bit is even, so the division does not lose anything.
This will overflow unsigned at some point, so one might want to use a wider type for the calculations.
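A sketch of what the wider-type version might look like, assuming a 32-bit input and a 64-bit accumulator. The name CountCumulativeWide is made up for this sketch, and x is assumed to be at least 1.

```c
#include <assert.h>
#include <stdint.h>

/* Same loop as the integer version above, but accumulating in 64 bits
   so that no 32-bit input can make the sum wrap. Assumes x >= 1. */
uint64_t CountCumulativeWide(uint32_t x) {
    assert(x >= 1);
    uint64_t v = (uint64_t)x - 1;
    uint64_t sum = v;
    for (unsigned bit = 0; v; ++bit, v >>= 1)
        sum += ((v & 1) << bit) / 2 * bit;
    return sum;
}
```

For the largest 32-bit input, 0xFFFFFFFF, the result is 2^36 − 1 = 68719476735, so 64 bits leave plenty of headroom.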
More Code Golf
Another way to add half the values of the bits of x repeatedly depending on their bit position is to shift x (to halve its bit values) and then add that repeatedly while removing successive bits from low to high:
unsigned CountCumulative(unsigned x)
{
    unsigned sum = --x;
    for (unsigned bit = 0; x >>= 1; ++bit)
        sum += x << bit;
    return sum;
}
I was studying bitwise operators and they make sense until the unary ~ (one's complement) operator is used with them. Can anyone explain to me how this works?
For example, these make sense; however, the rest of the computations aside from these do not:
1&~0 = 1 (~0 is 1 -> 1&1 = 1)
~0^~0 = 0 (~0 is 1 -> 1^1 = 0)
~1^0 = 1 (~1 is 0 -> 0^1 = 1)
~0&1 = 1 (~0 is 1 -> 1&1 = 1)
~0^~1 = 1 (~0 is 1, ~1 is 0 -> 1^0 = 1)
~1^~1 = 0 (~1 is 0 -> 0^0)
The rest of the results produced are negative (or a very large number if unsigned) or contradict the logic I am aware of. For example:
0&~1 = 0 (~1 = 0 therefor 0&0 should equal 0 but they equal 1)
~0&~1 = -2
~1|~0 = -1
etc. Anywhere you can point me to learn about this?
They actually do make sense when you expand them out a little more. A few things to be aware of though:
Bitwise AND yields a 1 only when both bits involved are 1. Otherwise, it yields 0. 1 & 1 = 1, 0 & anything = 0.
Bitwise OR yields a 1 when any of the bits in that position are a 1, and 0 only if all bits in that position are 0. 1 | 0 = 1, 1 | 1 = 1, 0 | 0 = 0.
Signed numbers are generally done as two's complement (though a processor does not have to do it that way!). Remember with two's complement, you invert and add 1 to get the magnitude when the highest bit position is a 1.
Assuming a 32-bit integer, you get these results:
0 & ~1 = 0 & 0xFFFFFFFE = 0
~0 & ~1 = 0xFFFFFFFF & 0xFFFFFFFE = 0xFFFFFFFE (0x00000001 + 1) = -2
~1 | ~0 = 0xFFFFFFFE | 0xFFFFFFFF = 0xFFFFFFFF (0x00000000 + 1) = -1
0&~1 = 0 (~1 = 0 therefor 0&0 should equal 0 but they equal 1)
~1 equals -2. If you flip all the bits of a Two's Complement number, you multiply it by -1 and subtract 1 from the result. Regardless of that 0 has 0 for all the bits, so the result of & is going to be 0 anyway.
~0&~1 = -2
~0 has all bits set so ~0&~1 is just ~1. Which is -2.
~1|~0 = -1
~0 has all bits set, so the result of the | is ~0 (= -1) no matter what it is OR'd with.
~1 = 0? No, it's not. It's equal to -2. Let's take an eight-bit two's complement number as an example. The decimal number 1 has the representation 0000 0001, so ~1 is 1111 1110, which is the two's complement representation of -2.
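If you want to convince yourself, the results discussed in these answers can be checked with a few assertions. This is only a sketch: it assumes 32-bit unsigned values and a two's complement machine, and the helper name check_complement is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: checks the values discussed above. */
void check_complement(void) {
    uint32_t zero = 0, one = 1;
    assert(~zero == 0xFFFFFFFFu);              /* all 32 bits set   */
    assert(~one == 0xFFFFFFFEu);               /* all but the lowest */
    assert((zero & ~one) == 0u);
    assert((~zero & ~one) == 0xFFFFFFFEu);
    assert((~one | ~zero) == 0xFFFFFFFFu);
    /* The same bit patterns read as signed two's complement values:
       ~x equals -x - 1. */
    assert(~0 == -1);
    assert(~1 == -2);
}
```

The point is that ~ flips every bit of the whole integer, not just the lowest one, which is why ~0 is not 1 and ~1 is not 0.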
I have a question: Is bitwise anding transitive, particularly in C and C++?
Say res = (1 & 2 & 3 & 4). Is this the same as computing res1 = (1 & 2) and res2 = (3 & 4), and
then res = (res1 & res2)? Will the result be the same?
Yes, bitwise AND is transitive as you've used the term.
It's perhaps easier to think of things as a stack of bits. So if we have four 4-bit numbers, we can do something like this:
A = 0xB;
B = 0x3;
C = 0x1;
D = 0xf;
If we simply stack them up:
A 1 0 1 1
B 0 0 1 1
C 0 0 0 1
D 1 1 1 1
Then the result of a bitwise AND looks at one column at a time, and produces a 1 for that column if and only if there's a 1 for every line in that column, so in the case above, we get: 0 0 0 1, because the last column is the only one that's all ones.
If we split that in half to get:
A 1 0 1 1
B 0 0 1 1
A&B 0 0 1 1
And:
C 0 0 0 1
D 1 1 1 1
C&D 0 0 0 1
Then AND those intermediate results:
A&B 0 0 1 1
C&D 0 0 0 1
End 0 0 0 1
Our result is still going to be the same--anywhere there's a zero in a column, that'll produce a zero in the intermediate result, which will produce a zero in the final result.
The term you're looking for is associative. We generally wouldn't call a binary operator "transitive". And yes, & and | are both associative, by default. Obviously, you could overload the operators to be something nonsensical, but the default implementations will be associative. To see this, consider one-bit values a, b, and c and note that
(a & b) & c == a & (b & c)
because both will be 1 if and only if all three inputs are 1. And this is the operation that is being applied pointwise to each bit in your integer values. The same is true of |, simply replacing 1 with 0.
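There are also some issues to consider if your integers are signed: & and | still combine the underlying bits position by position, so associativity is unaffected, but the numeric value of a result involving negative operands depends on the underlying bit representation.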
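Since the argument works one bit position at a time, it is cheap to confirm by brute force on small values. A sketch that tries every combination of three 4-bit values (the function names are made up for illustration):

```c
#include <assert.h>

/* Returns 1 if & and | are associative for all 4-bit a, b, c. */
int and_or_are_associative(void) {
    for (unsigned a = 0; a < 16; ++a)
        for (unsigned b = 0; b < 16; ++b)
            for (unsigned c = 0; c < 16; ++c) {
                if (((a & b) & c) != (a & (b & c))) return 0;
                if (((a | b) | c) != (a | (b | c))) return 0;
            }
    return 1;
}

/* The concrete cases from this thread. */
void check_examples(void) {
    assert((1 & 2 & 3 & 4) == ((1 & 2) & (3 & 4)));        /* both are 0 */
    assert((0xB & 0x3 & 0x1 & 0xF) == 0x1);                /* the stacked example */
    assert(((0xB & 0x3) & (0x1 & 0xF)) == 0x1);            /* split in half */
}
```

This is a check on small inputs, not a proof, but together with the column-by-column reasoning it covers every bit position independently.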
So I saw this code, which prints out the individual bits of a number. I do not understand why the individual bits are accessed rather than the entire number itself.
#include <stdio.h>
int main()
{
    int x=10, b;
    for(b=0; x!=0; x>>=1) {
        printf("%d:%d\n", b, (x&1));
        b++;
    }
}
OUTPUT:
0:0
1:1
2:0
3:1
Please help me understand this piece of code.
Your code prints the value of the variable x in binary, one bit per line, starting with the least significant bit. To do this it uses two bitwise operations: AND and right shift.
In the loop's update expression (x>>=1), x is shifted one bit to the right on each iteration:
for b = 0 you get x = 1010
for b = 1 you get x = 101
for b = 2 you get x = 10
for b = 3 you get x = 1
Then, in the printf call, you print the loop counter (b) and the value of x AND 1.
The AND operator gives these values:
0 AND 0 = 0
0 AND 1 = 0
1 AND 0 = 0
1 AND 1 = 1
In your case, you have:
1010 AND (000)1 = 0
101 AND (00)1 = 1
10 AND (0)1 = 0
1 AND 1 = 1
There are two questions in your post: how to access an individual bit, and how to select which bit to access.
Concerning the first question, suppose you want to access the least significant bit (or, to make it simpler, the rightmost bit). You can use a mask: if your data is b0011, for instance, you can mask it with b0001 (i.e. 1 in decimal).
0 0 1 1
& 0 0 0 1
---------
0 0 0 1
The & operator is the bitwise and. If you look in your code sample, you have printf("%d:%d\n", b, (x&1)); in which you can see x & 1, i.e. print the rightmost bit of x.
Now comes the second question: how to put each bit in the rightmost position one after the other (in other words, how to select the bit to print)? An easy solution is to shift your data by 1 position to the right each time you want to select the next bit (i.e. the bit to the left of the current one).
In C, you can shift using >>. For instance, if x is b0011, then x >> 1 is b0001 (in this case, the leftmost position is filled with a zero; for a negative signed value, what gets shifted in is implementation-defined). If you look at your code sample, you have x>>=1 in the for-loop, which assigns x >> 1 to x.
Hence, suppose you take the previous example:
  0 0 1 1 = x                             0 0 0 1 = x
& 0 0 0 1                               & 0 0 0 1
---------     x >> 1 = b0001 -> x       ---------
  0 0 0 1                                 0 0 0 1
and so on...
A last bonus point: the loop stopping condition is x != 0, which implies that you don't print all the bits of your data, but only the bits up to and including the leftmost 1. For instance, in the above example, after printing the two rightmost bits, x becomes 0 and the loop exits.
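If you do want a fixed number of bits, leading zeros included, one option is to loop over bit positions instead of looping until x is zero. Here is a minimal sketch using the same select-then-mask idea. The function name and the buffer-based interface are assumptions of this sketch, not something from the code in the question.

```c
#include <assert.h>
#include <limits.h>

/* Writes the `width` lowest bits of x into buf as '0'/'1' characters,
   most significant bit first, followed by a terminating NUL.
   buf must have room for width + 1 characters. */
void bits_to_string(unsigned x, unsigned width, char *buf) {
    assert(width >= 1 && width <= sizeof x * CHAR_BIT);
    for (unsigned i = 0; i < width; ++i) {
        /* shift the wanted bit to the rightmost position, then mask it */
        unsigned bit = (x >> (width - 1 - i)) & 1u;
        buf[i] = bit ? '1' : '0';
    }
    buf[width] = '\0';
}
```

With x = 10 and width = 8 the buffer ends up holding 00001010, whereas the loop in the question stops after the four bits 0, 1, 0, 1.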
The exercise is:
Write a function setbits(x,p,n,y) that returns x with the n bits that begin at position p set to the rightmost n bits of y, leaving the other bits unchanged.
My attempt at a solution is:
#include <stdio.h>
unsigned setbits(unsigned, int, int, unsigned);
int main(void)
{
    printf("%u\n", setbits(256, 4, 2, 255));
    return 0;
}
unsigned setbits(unsigned x, int p, int n, unsigned y)
{
    return (x >> (p + 1 - n)) | (1 << (n & y));
}
It's probably incorrect, but am I on the right path here? If not, what am I doing wrong? I'm unsure as to why I don't perfectly understand this, but I spent about an hour trying to come up with this.
Thanks.
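Here's your algorithm:
1. If n is 0, return x.
2. Take 1, left shift it n times, and then subtract 1. Call this mask.
3. Left shift mask p times; call this mask2.
4. AND x with the inverse of mask2. AND y with mask, and left shift the result p times.
5. OR the results of those two operations, and return that value.
(This takes p as the position of the lowest bit of the field. With the book's getbits convention, where p is the position of the highest bit of the field, shift by p+1-n instead of p.)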
I think the answer is a slightly modified application of the getbits example from section 2.9.
Let's break it down as follows:
Let bitstring x be 1 0 1 1 0 0
Let bitstring y be 1 0 1 1 1 1
positions -------->5 4 3 2 1 0
Setting p = 4 and n =3 gives us the bitstring from x which is 0 1 1. It starts at 4 and ends at 2 and spans 3 elements.
What we want to do is to replace 0 1 1 with 1 1 1(the last three elements of bitstring y).
Lets forget about left-shift/right-shift for the moment and visualize the problem as follows:
We need to grab the last three digits from bitstring y which is 1 1 1
Place 1 1 1 directly under positions 4 3 and 2 of bitstring x.
Replace 0 1 1 with 1 1 1 while keeping the rest of the bits intact...
Now lets go into a little more detail...
My first statement was:
We need to grab the last three digits from bitstring y which is 1 1 1
The way to isolate bits from a bitstring is to first start with bitstring that has all 0s.
We end up with 0 0 0 0 0 0.
0s have this incredible property where bitwise '&'ing it with another number gives us all 0s and bitwise '|'ing it with another number gives us back that other number.
0 by itself is of no use here...but it tells us that if we '|' the last three digits of y with a '0', we will end up with 1 1 1. The other bits in y don't really concern us here, so we need to figure out a way to zero out those numbers while keeping the last three digits intact. In essence we need the number 0 0 0 1 1 1.
So lets look at the series of transformations required:
Start with -> 0 0 0 0 0 0
apply ~0 -> 1 1 1 1 1 1
lshift by 3 -> 1 1 1 0 0 0
apply ~ -> 0 0 0 1 1 1
& with y -> 0 0 0 1 1 1 & 1 0 1 1 1 1 -> 0 0 0 1 1 1
And this way we have the last three digits to be used for setting purposes...
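The series of transformations above can be sketched as two small helpers. The names are made up for illustration, and unsigned arithmetic (~0u) is used here so the left shift is well defined; n is assumed to be smaller than the number of bits in an unsigned.

```c
#include <assert.h>
#include <limits.h>

/* Mask with the n least significant bits set, built as ~(~0 << n). */
unsigned low_bits_mask(int n) {
    assert(n >= 0 && n < (int)(sizeof(unsigned) * CHAR_BIT));
    return ~(~0u << n);
}

/* The n rightmost bits of y, with everything else cleared. */
unsigned low_bits(unsigned y, int n) {
    return y & low_bits_mask(n);
}
```

With y = 1 0 1 1 1 1 (0x2F) and n = 3 this gives 0 0 0 1 1 1, as in the walk-through.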
My second statement was:
Place 1 1 1 directly under positions 4 3 and 2 of bitstring x.
A hint for doing this can be found from the getbits example in section 2.9. What we know about positions 4,3 and 2, can be found from the values p = 4 and n =3. p is the position and n is the length of the bitset. Turns out p+1-n gives us the offset of the bitset from the rightmost bit. In this particular example p+1-n = 4 +1-3 = 2.
So..if we do a left shift by 2 on the string 0 0 0 1 1 1, we end up with 0 1 1 1 0 0. If you put this string under x, you will notice that 1 1 1 aligns with positions 4 3 and 2 of x.
I think I am finally getting somewhere...the last statement I made was..
Replace 0 1 1 with 1 1 1 while keeping the rest of the bits intact...
Lets review our strings now:
x -> 1 0 1 1 0 0
isolated y -> 0 1 1 1 0 0
Doing a bitwise or on these two values gives us what we need for this case:
1 1 1 1 0 0
But this would fail if instead of 1 1 1, we had 1 0 1...so we need to dig a little more to get to our "silver-bullet"...
Lets look at the above two strings one more time...
x -> bit by bit...1(stays) 0(changes) 1(changes) 1(changes) 0(stays) 0(stays)
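So ideally we need the bitstring 1 _ _ _ 0 0, where the three blanks will be filled with the bits taken from y.
Here's a leap of intuition that will help us: first clear the field in x, then | in the isolated bits of y. To clear the field we need a mask with 1s over positions 4, 3 and 2, which is 0 0 0 1 1 1 shifted left by 2. (In this example the mask happens to equal isolated y, because the bits taken from y are all 1s; in general it is the mask that must be complemented, not isolated y.)
Field mask -> 0 1 1 1 0 0
Bitwise complement of field mask -> 1 0 0 0 1 1
& this with x gives us -> 1 0 0 0 0 0
| this with isolated y -> 1 1 1 1 0 0 (TADA!)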
Hope this long post helps people with rationalizing and solving such bitmasking problems...
Thanks
Note that ~0 << i gives you a number with the least significant i bits set to 0, and the rest of the bits set to 1. Similarly, ~(~0 << i) gives you a number with the least significant i bits set to 1, and the rest to 0.
Now, to solve your problem:
1. First, you want a number that has all the bits except the n bits that begin at position p set to the bits of x. For this, you need a mask that consists of 1s in all the places except the n bits beginning at position p:
   1.1. this mask has the topmost (most significant) bits set, starting with the bit at position p+1.
   1.2. this mask also has the least significant p+1-n bits set.
2. Once you have the above mask, & of this mask with x will give you the number you wanted in step 1.
3. Now, you want a number that has the least significant n bits of y set, shifted left p+1-n bits.
   3.1. You can easily make a mask that has only the least significant n bits set, and & it with y to extract y's least significant n bits.
   3.2. Then, you can shift this number by p+1-n bits.
4. Finally, you can bitwise-or (|) the results of step 2 and 3.2 to get your number.
Clear as mud? :-)
(The above method should be independent of the size of the numbers, which I think is important.)
Edit: looking at your effort: n & y doesn't do anything with n bits. For example, if n is 8, you want the last 8 bits of y, but n & y will just pick the 4th bit of y (8 in binary is 1000). So you know that can't be right. Similarly, right-shifting x p+1-n times gives you a number that has the most significant p+1-n bits set to zero and the rest of the bits are made of the most significant bits of x. This isn't what you want either.
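Putting the pieces together, one possible shape for the function is sketched below. It is not the only way to write it. It assumes the book's getbits convention (p is the position of the highest bit of the field, counting from 0 at the right), uses ~0u rather than ~0 so the shifts are well defined, and asserts that the field fits in the word and that n is smaller than the word size.

```c
#include <assert.h>
#include <limits.h>

/* Returns x with the n bits that begin at position p replaced by the
   rightmost n bits of y; all other bits of x are left unchanged. */
unsigned setbits(unsigned x, int p, int n, unsigned y) {
    int width = (int)(sizeof x * CHAR_BIT);
    assert(n >= 1 && n < width && p < width && p + 1 - n >= 0);
    int shift = p + 1 - n;           /* position of the field's lowest bit */
    unsigned low = ~(~0u << n);      /* n ones at the right end */
    unsigned field = low << shift;   /* n ones covering the field */
    /* clear the field in x, then drop in the low n bits of y */
    return (x & ~field) | ((y & low) << shift);
}
```

With the strings from the walk-through above, setbits(0x2C, 4, 3, 0x2F) turns 1 0 1 1 0 0 into 1 1 1 1 0 0, and with y ending in 1 0 1 (0x2D) it gives 1 1 0 1 0 0. For the call in the question, setbits(256, 4, 2, 255) comes out as 280.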