Writing a C function that uses pointers and bit operators to change one bit in memory? - c

First off, here is the exact wording of the problem I need to solve:
The BIOS (Basic Input Output Services) controls low level I/O on a computer. When a computer
first starts up the system BIOS creates a data area starting at memory address 0x400 for its own use.
Address 0x0417 is the Keyboard shift flags register, the bits of this byte have the following
meanings:
Bit Value Meaning
7 0/1 Insert off/on
6 0/1 CapsLock off/on
5 0/1 NumLock off/on
4 0/1 ScrollLock off/on
3 0/1 Alt key up/down
2 0/1 Control key up/down
1 0/1 Left shift key up/down
0 0/1 Right shift key up/down
This byte can be written as well as read. Thus we may change the status of the CapsLock, NumLock and ScrollLock LEDs on the keyboard by setting or clearing the relevant bit.
Write a C function using pointers and bit operators to turn Caps lock on without changing the other bits.
Our teacher didn't go over this at all, and I've referenced the textbook and conducted many Google searches looking for some help.
I understand how bitwise operators work, and understand that the solution is to OR this byte with the binary value '00000010'. However, I'm stuck when it comes to implementing this. How do I write this in C code? I don't know how to declare a pointer to exactly 1 byte of memory. Besides that, I'm assuming the answer looks like the following (with byte replaced with something proper):
byte* b_ptr = 0x417;
(*b_ptr) |= 00000010;
Is the above solution correct?

unsigned char is the typical synonym for a byte. You can typedef byte to unsigned char if it's not already available.
The notation you're using isn't binary: a literal with a leading zero, like 00000010, is actually octal notation for a literal number. Bit patterns are easiest to write in hex, so I'd just use 0x02 instead of 00000010.
I think that you've reversed the order of the bits, though. The table numbers the bits from most to least significant, and per that table the mask 00000010 (bit 1) is the Left-shift key; CapsLock is bit 6, so the mask you want is 0x40.
Otherwise, assuming byte is typedef'd to unsigned char, byte *b_ptr = (byte *)0x417; will point you at memory address 0x417, which is what you want (note the cast: converting an integer to a pointer requires one).

Related

Is there a way to set/clear a bit from an unknown byte whilst leaving all other bits unchanged?

I want to write to a register.
The register holds 1 byte of information. I wish to change bit 6 for argument's sake.
The way I am accomplishing this right now is to read the register, then do a set/clear.
This way I only change the bit I am interested in, and leave the rest untouched.
E.g.:
// Read register
uint8_t reading = read_reg(0x00);
// Set or clear bit 6
if (wanting_to_set)   { reading |= (1 << 6); }
if (wanting_to_reset) { reading &= ~(1 << 6); }
// Write back to register
write_reg(0x00, reading);
Is there a way I can set or reset the nth bit without knowing what the byte is? This way I can avoid having to read the register first.
There is no standard way to perform such a thing in C. Even in the event that you have a machine with individually addressable bits, the C language does not define any means of accessing memory in units smaller than 8 bits, no matter what the capabilities of the underlying hardware.
Therefore, in standard C, if you want to modify an individual bit without disturbing its neighbors, you must overwrite at least the seven other bits of that byte with the same values they already have. That means you must either know the current values of those bits or not care what values are written to them.
As an addition to @John's answer:
Very popular ARM Cortex-M3/M4 MCUs have a bit-banding memory area. All bits in this area are individually addressable, so you can read or write a single bit simply by reading or writing its alias address.

Arduino calculating with pins

I just searched for a library for an LCD. When I found one I tried to understand how it works.
Then I saw
PORTD &= ~(0xF0>>(4-PD0));
I never saw this (4-PD0) and don't know what that will return.
I would like to know how that works, what it returns, and what it's useful for.
So thanks for everyone who helps :D.
PD0 probably holds the bit number of the pin rather than a mask directly, i.e. "bit 0", "bit 1", "bit 2"… which can go up to 3 here (in the AVR headers, PD0 is defined as 0).
The 0xF0 mask covers everything but the four lowest bits, so shifting it right by (4 - PD0) slides a four-bit window down toward bit PD0: four shifts for bit 0, three shifts for bit 1, and so on. Since 0xF0 is a positive int, the right shift fills with zeros from the left.
After the final "~" complement, you obtain a mask whose zero bits sit exactly over that window, so the &= clears those four bits of PORTD and leaves all the others set.

bit endianness and portability of C binary files

In C, I have a char array that I use to store data at the bit level. I store these arrays to files, then read them on machines with different architectures. My question is whether the order of the bits is guaranteed to be consistent. For example, if I store "10010011" to the first byte, will the adjacent 1's always be read as being in the 2^0 and 2^1 positions, or could they end up interpreted as the 2^7 and 2^6 bits?
EDIT: I want to clarify this question a little for people who read this page later. Byte endianness is the order of bytes in a multibyte object, but my concern is with the bits in a given byte. When a byte is stored to disc, it is stored as a sequence of (usually) 8 bits. I'm no hardware expert, but it has to come down to that somehow. So, my concern is if the way the byte is stored is such that any machine will read the original unsigned char value, or if what is 3 to one machine will be 192 to another. I am concerned the bits will end up shuffled somehow. Apparently, this is not a concern, according to the answer I selected as well as one of the comments below. Thanks.
the simple answer:
The bits will still be in the correct order.
However, if you perform any format conversion beyond %c, for instance reading the bytes back as a multi-byte %d value, then the endianness of the reading architecture will determine the byte order. The bits within each byte will still be the same.
Endianness is about byte order, not bit order. So 00001101 on a little-endian machine will be the same on a big-endian machine. However, there is something you should know about bit order on different machines: the layout of bit-fields changes. If you are going to overlay a bit-field struct in a union, read up on how endianness affects bitfield packing.
The concept you are trying to ask about is known as bit-numbering or bit endianness and system architectures are referred to as least-or most- significant bit (MSB, LSB) ordering.
As far as I know, the reference is always with respect to the 0th (first) bit position.
With respect to a single 8-bit byte (octet), it will be portable: the value of the byte will consistently be read back as 0x93 (147 decimal). That assumes you are writing the bit string LSB-first, with the 0th bit as the rightmost bit (the norm for a little-endian processor), as typically done by users of left-to-right natural languages such as English.

Does endianness apply to bit order too?

I haven't found a specific question here on SO; if this is a duplicate, please point it out to me and I'll delete this.
So really, does endianness have anything to do with bit order?
This seems to imply that the answer is NO, while other sources (I can't find one right now, but I'm sure I've read some articles a while ago) imply that endianness covers the order of both bytes and bits.
To be more specific: in a big-endian architecture, where the MSB comes first, is the MSb also first inside each byte? Conversely, on little-endian systems, where the LSB comes first, is the LSb also first?
LAST EDIT: I found this which says "Bit order usually follows the same endianness as the byte order for a given computer system"
The other responses are not completely accurate. Yes, memory is byte addressable, so usually endianness stops there. But addressability isn't the only way to create well defined endianness.
In C, you can define bit fields. Bit fields have particular layouts; for example, the first bit field, if it is one bit wide, could be stored in either the msb or the lsb, and wrapping bit fields across byte boundaries in a big-endian way is strikingly different from doing so in a little-endian way. So if you define bit fields, you may have bit endianness.
But how these are arranged would have more to do with the compiler than the architecture, at least as a rule.
Endianness applies only to byte order, not to bit order. The bit order remains the same.
Why?
Memory is byte-addressable. This is just a fancy way of saying that each address stores one byte. So you can change the order of bytes in memory, but not bits.
Endianness only makes sense when you want to break a large value (such as a word) into several small ones. You must decide on an order to place it in memory.
Endianness only makes sense when you are breaking up a multi-byte quantity and attempting to store the bytes at consecutive memory locations. But for a register it doesn't apply: a register is simply a 32-bit quantity (depending on your processor/controller), and endianness does not apply to it.
"Bit order usually follows the same endianness as the byte order for a given computer system" - that is true.
For more info, see Endianness.
No, simply because you cannot address bits individually.
In modern computing endianness only applies to ordering of bytes, not bits.
Little-endian CPUs usually employ "LSB 0" bit numbering, however both bit numbering conventions can be seen in big-endian machines. Some architectures like SPARC and Motorola 68000 use "LSB 0" bit numbering, while S/390, PowerPC and PA-RISC use "MSB 0".[2]
please see
http://en.wikipedia.org/wiki/Bit_numbering
http://en.wikipedia.org/wiki/Most_significant_bit
Looking back at this question I asked some years ago, today I can add a partial answer, with an example where bit endianness exists and is important: communication protocols. Any protocol specification needs to define which bit of an octet is sent first when it is pushed onto a bit stream.
Endianness works on the basis of bytes not bits. Bits are not addressable.
What does "bits are not addressable" mean, for dummies (like me and my students)?
The smallest value/entity a computer can store/move/change is a byte = 8 bits. You cannot instruct it to read/write 3 bits, then 1 bit, then 11 bits, and have it memorize them as separate values. No, it only handles single bytes or byte packs/ranges.
Then the whatever application/process/hardware can play around with the bits inside a byte (or a native integer type, always a multiple of a byte).
Yet, once you are done playing, the stored values in memory/storage are still arranged by ranges of bytes, where each single BYTE value, despite having had its bits modified, gets updated directly with its new value at BYTE level.
That means, unless you build processors, you only ever deal with bytes, 0-255, period.
In records? Yes. Memory addresses? Yes. File data/streams? Yes. Network transfers? YES, dammit!
As an example: let's say I'm on my little-endian computer, and my application has a 3-byte (24-bit) record in memory:
structure Test24 {
var16 : 0xC1D3, // 49619 = 1100 0001 1101 0011
var8 : 0xCA // 202 = 1100 1010
}
I save that record raw to a file, which I pass on to a buddy on his big-endian machine. What he gets on his machine upon loading the record in the same application is:
structure Test24 {
var16 : 0xD3C1, // 54209 = 1101 0011 1100 0001 (bytes swapped)
var8 : 0xCA // 202 = 1100 1010 (same value)
}
The var16 is different (corrupted), but that's because I didn't code my application to handle I/O data on a big-endian architecture. What he does not get, though, is the following record, where the bits are entirely reversed:
structure Test24 {
var16 : 0xCB83, // 52099 = 1100 1011 1000 0011 (bits swapped)
var8 : 0x53 // 83 = 0101 0011 (different value)
}
^^ and that's because the file is read byte-by-byte, not bit-by-bit.
That's what "bits are not addressable" implies.
Student: But why do people make such a fuss about LSBit (LSB-0), MSBit, sometimes bit-endianness, like "it's very important knowledge"?
Because that way of naming the bits actually takes the endianness concern out of the equation. Use the meaning of things the right way; at the very least, use logic. The misunderstanding arises when we mix up talk about data with talk about (processor) architecture. They-are-un-re-la-ted!
Data requires a context, like "this byte at this address contains the eight flags defining your access privileges on the system". That's NOT an endianness concern; that's what a bit at a given index is used for, in the context of a defined piece of data.
We, humans, are trained to read horizontally. But here is the problem:
What is the byte 241? How may we represent what it contains?
perhaps on little endian, LSB-0 on the left (?)
bit index: 0, 1, 2, 3, 4, 5, 6, 7 => 10001111b ?
or on big endian, LSB-0 on the right (?)
bit index: 7, 6, 5, 4, 3, 2, 1, 0 => 11110001b, which looks more intuitive?
STOP SHOOTING YOURSELF IN THE FOOT! Consider looking at it this way:
LSB 0 <- see : here it is, it doesn't matter where it is, it's just there
1
2
3
4
5
6
MSB 7 <- and here the other one, on the opposite side.
or this way :
MSB 7
6
5
4
3
2
1
0 LSBit
It doesn't matter where LSB-0 and MSBit are, as long as one is on the opposite side of the other. Get endianness out of the way! Use the LSB-0 term to fix the start of your data encoding.
^^ now when you define "flag order starts at LSB-0", the flags being "directory open, open pwd protected, directory copy, copy pwd protected, directory write, write pwd protected, directory delete, delete pwd protected", that means:
LSB 0 directory open
1 open pwd protected
2 directory copy
3 copy pwd protected
4 directory write
5 write pwd protected
6 directory delete
MSB 7 delete pwd protected
So a byte value of 241 = LSB-0 : 1, 0, 0, 0, 1, 1, 1, MSB : 1 means
LSB 0 = 1 -> you can open directory
1 = 0 -> directory access is not password protected
2 = 0 -> you cannot copy directory
3 = 0 -> noop
4 = 1 -> you may create files or sub directories
5 = 1 -> but a password is required to create file or dir
6 = 1 -> you may delete the things in the directory
MSB 7 = 1 -> but a password is required to delete
There is no endianness involved at the bit level when it comes to data; only bit position matters when you give a meaning to the value of a byte, and LSB-0 is always the 0x01 bit, no matter whether it's a big-endian machine or a little-endian one.
Byte values have nothing to do with how the machine treats a bit, most significant or not. So when you ask "could my data get corrupted?", no one should be talking about the MSBit or LSBit being on the left or on the right. If you ask "what are the most volatile bits on this CPU?", then LSB/MSB positions belong right in the middle of the talk (but are you into processor engineering?).
The actual SO question here is not how most architectures are assumed to behave (memory addressing), or what they can or cannot do; that we should ask over on Theoretical Computer Science or Software Engineering.
The real underlying question is: when the data produced by my application goes through network, I/O, COM, hardware, whatever, do its bits get swapped, or does it get corrupted because I failed to handle bit order in my application?
The answer is: that doesn't happen, as long as you are not writing ROM data at the hardware level, and you care about endianness at BYTE level ONLY, when transferring data between different architectures/protocols. People are confused because of unnecessary endianness drama, and that endianness-thing lurks where it should not.
At least, that's why I'm here: for students to start understanding what matters for a given challenge. They wouldn't be here if they already knew. Most people only have either a little-endian computer or a big-endian one, and don't have the privilege to cross-check. They just don't know for sure; they get here via Google, read "bits are not addressable" and think: errrmm... what? What does that mean? Should I swap my game data at byte or bit level to be read correctly on all platforms, or not? I still don't know :/ That's the underlying reason why one would ask. Beyond the technical fact that "bits are not addressable", there are many who don't have the background to fully grasp the implications of such a statement.
You only have to care about byte endianness in the very rare case that you actually encode your own (or company-specified) binary format in your application, via your proprietary or borrowed stream read/write logic.
If you don't want to handle endianness, use engines/frameworks with built-in read/write (like databases, clouds, etc.) or go plain text, such as JSON or XML.

how do I know if the case is true?

Let say if we are given a byte of binary data, how can you know what that data represents?
Is it true that you can't really know what the data represents, because you need to know whether the one byte of binary data is signed, unsigned, and so on?
Or is it that you can know what it represents, since binary is base 2?
I am sorry to tell you that a byte of data has nothing to do with its supposed representation.
You state that because it's a byte, it's a binary representation. This is purely an assumption.
It depends on the intention of the person who stored the very data.
It might represent anything. As @nos told you, it really depends on the convention the setter used to store it.
You may have a two's-complement number, a signed value in 7 bits, an unsigned value in 8 bits, an octal representation (or a partial one), or a mask (each group of bits within the byte may describe something totally different from another). It could also be a representation of some special coding. Etc.
This is truly unlimited.
In order to interpret it properly, you need to know the underlying convention (a spec). @fede1024 told you about file formats, which use special characters so that you can double-check against the convention.
One more thing… Bear in mind that even binary data can be stored in natural order or in reverse order: that's endianness. So when you examine a number stored in at least 2 bytes, you have to know whether the most significant byte is stored first or second in memory. If you misinterpret this, you won't understand the underlying piece of data. Endianness is a constant for a given processor.
Base 2 and binary refer to the same thing. Typically, you do need to know at least whether the byte is signed or unsigned (in C). As for what the data represents - well, "it depends": whether you want to interpret it as a single byte, as a character, and so on. With multi-byte data, you often also have to take endianness (the ordering of the bytes into larger words) into account.
Some file formats start with a magic number; for example, all PNG files start with 89 50 4E 47 0D 0A 1A 0A. That said, if you have a general binary file without any kind of magic number, you can only guess at its contents.
You can try to open it with a hexadecimal editor, but there is no automatic way to understand what the data represents.
You know it's base 2, since it's a byte of binary data, as you said. As for "true": in C, everything but 0 is true; if it's 0, then it's false.
