Meaning of leading zero in integer literal - C

I'm studying (ANSI) C from The C Programming Language (2nd Edition).
This is a code snippet from 2.10 Assignment Operators and Expressions:
1  /* bitcount() counts the number of 1-bits in its integer argument */
2  int bitcount(unsigned x)
3  {
4      int b;
5      for (b = 0; x != 0; x >>= 1)
6          if (x & 01)
7              b++;
8      return b;
9  }
I am confused about why x & 01 is written in line 6 rather than x & 1 or x & 0x1. Is the 0 before the 1 necessary?

The 01 could also be written 1, or 0x1.
01 = octal constant, base 8 (1 in decimal)
1 = decimal constant, base 10 (1 in decimal)
0x1 = hexadecimal constant, base 16 (1 in decimal)
When the book was written, base 8 (octal) was pervasive in computer programming.

A leading 0 makes a constant into an octal (base 8) value.
In this case it's no different from 1 or 0x1, because they all have the same binary representation: 000...001. Today, people often use hex constants to signal that a value is being used as a bitmask or other bitwise value; in the past, octal constants were more often used for the same purpose.
0x1 = hex constant = 1
0X1 = hex constant = 1
01 = octal constant = 1
1 = decimal constant = 1
1U = unsigned decimal constant = 1

Indeed "01" is distinct from 1 and 0x1 in principle. "01" is encoded in octal. Kernighan/Ritchie probably value the ease in which one can convert from octal to binary for this particular problem.

Formula to convert byte array representing signed integer into integer

This question is language-agnostic; I am more interested in solving it generally across languages. Every answer I find references a built-in method like getInt32 to extract an integer from a byte array.
I have a byte array which contains the big-endian representation of a signed integer.
1 -> [0, 0, 0, 1]
-1 -> [255, 255, 255, 255]
-65535 -> [255, 255, 0, 1]
Getting the values for the positive cases is easy:
arr[3] | arr[2] << 8 | arr[1] << 16 | arr[0] << 24
What I would like to figure out is the more general case. I have been reading about two's complement, which led me to this Python function from Wikipedia:
def twos_complement(input_value, num_bits):
    '''Calculates a two's complement integer from the given input value's bits'''
    mask = 2**(num_bits - 1) - 1
    return -(input_value & mask) + (input_value & ~mask)
which in turn led me to produce this function:
# Note that the mask from the wiki function has an additional - 1
mask = 2**(32 - 1)

def arr_to_int(arr):
    uint_val = arr[3] | arr[2] << 8 | arr[1] << 16 | arr[0] << 24
    if determine_if_negative(uint_val):
        return -(uint_val & mask) + (uint_val & ~mask)
    else:
        return uint_val
In order for my function to work I need to fill in determine_if_negative (I should mask the sign bit and check whether it is 1). But is there a standard formula to handle this? One thing I found is that in some languages, like Go, the bit shift might overflow the int value.
This is pretty hard to search for, because I get a thousand results explaining the difference between big-endian and little-endian, or explaining two's complement, and many more giving examples of using the standard library, but I haven't seen a complete formula using only bitwise operations.
Is there a canonical example, in C or a similar language, of converting a char array using only array access and bitwise operations (i.e., no memcpy, pointer casting, or other tricky stuff)?
The bitwise method only works properly for unsigned values, so you will need to build the unsigned integer and then convert it to signed. The code could be:
#include <stdint.h>

int32_t val( uint8_t *s )
{
    uint32_t x = ((uint32_t)s[0] << 24) + ((uint32_t)s[1] << 16) + ((uint32_t)s[2] << 8) + s[3];
    return x;
}
Note, this assumes you are on a two's complement system which also defines unsigned-to-signed conversion as no change in representation. If you want to support other systems too, it would be more complicated.
The casts are necessary so that the shift is performed over the right width.
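If you want to avoid relying on that implementation-defined conversion, a portable variant (a sketch, with the hypothetical name val_portable; note that int32_t, where it exists, is guaranteed two's complement) recovers the negative value by arithmetic instead:

#include <stdint.h>

int32_t val_portable( const uint8_t *s )
{
    uint32_t x = ((uint32_t)s[0] << 24) + ((uint32_t)s[1] << 16)
               + ((uint32_t)s[2] << 8) + s[3];
    if (x & 0x80000000u) {
        uint32_t inv = ~x;        /* 2^32 - 1 - x, always fits in int32_t here */
        return -(int32_t)inv - 1; /* equals x - 2^32, with no signed overflow */
    }
    return (int32_t)x;
}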
Even C might be too high-level for this. After all, the exact representation of int is machine-dependent. On top of that, not all integer types on all systems are two's complement.
When you mention a byte array and converting it to an integer, you must specify what format that byte array implies.
If you assume two's complement and little-endian (like Intel/AMD), then the last byte contains the sign.
For simplicity's sake, let's start with a 3-bit two's complement integer, then 4 bits, then 2-byte integers, and then 4.
BIN  SIGNED_DEC  UNSIGNED_DEC
000       0           0
001       1           1
010       2           2
011       3           3
100      -4 (oops)    4
101      -3           5
110      -2           6
111      -1           7
let each bit be b1, b2, b3 (left to right), where b1 is the most significant bit (and the sign)
then the formula would be:
b3*2^0 + b2*2^1 - b1*2^2
for 4 bits the formula would look like this:
b4*2^0 + b3*2^1 + b2*2^2 - b1*2^3
for 2 bytes it is the same, but we have to multiply the most significant byte by 256; the value to subtract when the sign bit is set is then 256^2, i.e. 2^16.
/**
 * Returns the calculated value of a two's complement bit string.
 * Expects a string of bits '0' or '1'; if a character is not '1' it is considered 0.
 */
public static long twosComplementFromBitArray(String input) {
    if (input.length() < 2) throw new RuntimeException("input too short");
    int sign = input.charAt(0) == '1' ? 1 : 0;
    long unsignedComplementSum = 1;
    long unsignedSum = 0;
    for (int i = 1; i < input.length(); ++i) {
        char c = input.charAt(i);
        int val = (c == '1') ? 1 : 0;
        unsignedSum = unsignedSum * 2 + val;
        unsignedComplementSum *= 2;
    }
    return unsignedSum - sign * unsignedComplementSum;
}
public static void main(String[] args) {
    System.out.println(twosComplementFromBitArray("000"));
    System.out.println(twosComplementFromBitArray("001"));
    System.out.println(twosComplementFromBitArray("010"));
    System.out.println(twosComplementFromBitArray("011"));
    System.out.println(twosComplementFromBitArray("100"));
    System.out.println(twosComplementFromBitArray("101"));
    System.out.println(twosComplementFromBitArray("110"));
    System.out.println(twosComplementFromBitArray("111"));
}
outputs:
0
1
2
3
-4
-3
-2
-1
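The same idea (unsigned sum minus the sign weight) applied to the original 4-byte big-endian arrays might look like this in C (a sketch; from_bytes_be is a hypothetical name, and the 64-bit intermediate keeps the subtraction of 2^32 from overflowing):

#include <stdint.h>
#include <stdio.h>

int64_t from_bytes_be(const uint8_t a[4])
{
    /* build the unsigned value, then subtract 2^32 if the sign bit is set */
    int64_t u = ((int64_t)a[0] << 24) | ((int64_t)a[1] << 16)
              | ((int64_t)a[2] << 8) | a[3];
    return (a[0] & 0x80) ? u - ((int64_t)1 << 32) : u;
}

int main(void)
{
    uint8_t one[4]        = {0, 0, 0, 1};
    uint8_t minus1[4]     = {255, 255, 255, 255};
    uint8_t minus65535[4] = {255, 255, 0, 1};
    printf("%lld %lld %lld\n", (long long)from_bytes_be(one),
           (long long)from_bytes_be(minus1),
           (long long)from_bytes_be(minus65535)); /* prints: 1 -1 -65535 */
    return 0;
}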

How does switch case work when the case labels have numbers prefixed with 0?

I am working on an Ingenico EDC terminal. The code below is from a previous implementation. While debugging I came across this block of code, which I am having difficulty understanding.
short bankPEM = 0;
//bankPEM = 41;  //Chip
bankPEM = 17;    //Swipe
//bankPEM = 801; //Fallback

switch (bankPEM)
{
    case 021: cout << "021"; break; //Swipe
    case 051: cout << "051"; break; //Chip
    case 801: cout << "801"; break; //Fallback
    default:  cout << "Default"; break;
}
bankPEM is a short variable. I observed the following execution behavior:
When it contains 41, case 051 is executed.
When it contains 17, case 021 is executed.
When it contains 801, case 801 is executed.
I expected the code to execute the default case for the first two numbers.
Can anyone shed some light on this?
I am also converting the code to assembly language. I will share my understanding after debugging the assembly code.
Thanks in advance.
Referring to the C standard:
6.4.4.1 Integer constants
A decimal constant begins with a nonzero digit and consists of a sequence of decimal
digits. An octal constant consists of the prefix 0 optionally followed by a sequence of the
digits 0 through 7 only. A hexadecimal constant consists of the prefix 0x or 0X followed
by a sequence of the decimal digits and the letters a (or A) through f (or F) with values
10 through 15 respectively.
Emphasis mine
As the code shown seems to be C++, you can also refer to C++14:
2.13.2 Integer literals
An integer literal is a sequence of digits that has no period or exponent part, with optional separating single
quotes that are ignored when determining its value. An integer literal may have a prefix that specifies its base
and a suffix that specifies its type. The lexically first digit of the sequence of digits is the most significant.
A binary integer literal (base two) begins with 0b or 0B and consists of a sequence of binary digits. An octal
integer literal (base eight) begins with the digit 0 and consists of a sequence of octal digits. A decimal
integer literal (base ten) begins with a digit other than 0 and consists of a sequence of decimal digits. A
hexadecimal integer literal (base sixteen) begins with 0x or 0X and consists of a sequence of hexadecimal
digits, which include the decimal digits and the letters a through f and A through F with decimal values
ten through fifteen. [ Example: The number twelve can be written 12, 014, 0XC, or 0b1100. The literals
1048576, 1’048’576, 0X100000, 0x10’0000, and 0’004’000’000 all have the same value. — end example ]
Emphasis mine
The two numbers 021 and 051 are written in octal form. If you convert them into decimal form you will get:
21 (base 8) = 1 * 8^0 + 2 * 8^1 = 1 + 16 = 17 (base 10)
51 (base 8) = 1 * 8^0 + 5 * 8^1 = 1 + 40 = 41 (base 10)
So, I think you see now why case 021 is executed when bankPEM is 17, and case 051 when it is 41.
I don't understand, though, why the person who implemented the code decided to write the switch cases like this (it isn't even consistent, because the third case has a number in base 10).
The problem is the preceding zero in the case labels: a preceding zero makes the compiler treat the number as octal. Therefore 17 base 10 = 21 base 8, hence case 021: executes; the same goes for 41 base 10 = 51 base 8.
short bankPEM = 0;
//bankPEM = 41;  //Chip
bankPEM = 17;    //Swipe
//bankPEM = 801; //Fallback

switch (bankPEM) {
    case 021: cout << bankPEM << " " << 021;
        break; //Swipe
    case 051: cout << bankPEM << " " << 051;
        break; //Chip
    case 801: cout << "801";
        break; //Fallback
    default: cout << "Default";
        break;
}
I hope this helps.
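If the octal labels were not intentional, rewriting them in decimal keeps the behavior identical and makes the matching obvious; a sketch (shown in C with printf, but switch labels behave the same way in C++):

#include <stdio.h>

int main(void)
{
    short bankPEM = 17;

    switch (bankPEM)
    {
        case 17:  printf("Swipe\n");    break; /* was case 021 (octal 21 == decimal 17) */
        case 41:  printf("Chip\n");     break; /* was case 051 (octal 51 == decimal 41) */
        case 801: printf("Fallback\n"); break;
        default:  printf("Default\n");  break;
    }
    return 0;
}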

How is this bitwise AND operator masking the lower seven order bits of the number?

I am reading The C Programming Language by Brian Kernighan and Dennis Ritchie. Here is what it says about the bitwise AND operator:
The bitwise AND operator & is often used to mask off some set of bits, for example,
n = n & 0177
sets to zero all but the low order 7 bits of n.
I don't quite see how it masks the lower seven bits of n. Can somebody please clarify?
The number 0177 is an octal number representing the binary pattern below:
0000000001111111
When you AND it using the bitwise operation &, the result keeps the bits of the original only in the bits that are set to 1 in the "mask"; all other bits become zero. This is because "AND" follows this rule:
X & 0 -> 0 for any value of X
X & 1 -> X for any value of X
For example, if you AND 0177 and 052525, you get
0000000001111111 -- 0000177
0101010101010101 -- 0052525
---------------- -------
0000000001010101 -- 0000125
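You can verify that example directly (a minimal sketch; %07o prints the result as zero-padded octal):

#include <stdio.h>

int main(void)
{
    printf("%07o\n", 0177 & 052525); /* prints 0000125 */
    return 0;
}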
In C an integer literal prefixed with 0 is an octal number so 0177 is an octal number.
Each octal digit (of value 0 to 7) is represented with 3 bits and 7 is the greatest value for each digit. So a value of 7 in octal means 3 bits set.
Since 0177 is an octal literal and each octal digit is three bits, you have the following binary equivalents:
7 = 111
1 = 001
Which means 0177 is 001111111 in binary.
It has already been explained that the leading '0' marks the octal representation of a number in ANSI C. The number 0177 (octal) is the same as 127 (decimal), which is 128-1 and can also be written as 2^7-1; 2^n-1 in binary representation means n 1's in the lowest (rightmost) positions.
0177 = 127 = 128-1
which is a bitmask:
00000000000000000000000001111111
You can check with the code below:
Demo
#include <stdio.h>

int main()
{
    int n = 0177; // octal representation of 127
    printf("Decimal:[%d] : Octal:[%o]\n", n, n);

    n = 127; // decimal representation of 127
    printf("Decimal:[%d] : Octal:[%o]\n", n, n);

    return 0;
}
Output
Decimal:[127] : Octal:[177]
Decimal:[127] : Octal:[177]
0177 is an octal value; each digit is represented by 3 bits, from 000 to 111, so 0177 translates to 001111111 in binary (i.e. 001|111|111). Considered as a 32-bit value (it could be 64-bit too, with the remaining digits populated from the MSB, i.e. the sign bit, here 0), that is
00000000000000000000000001111111
and performing a bitwise AND between it and a given number outputs the lower 7 bits of the number, turning the rest of the bits in the n-bit number to 0
(since x & 0 = 0 and x & 1 = x, e.g. 0 & 0 = 0, 1 & 0 = 0, 1 & 1 = 1, 0 & 1 = 0).

Hex to Octal Conversion Program Without Using Decimal or Binary

Today I was just playing around with basic conversions from one base to another. I googled some code for converting from hex to octal, and I noticed that it mostly uses an intermediate conversion to either decimal or binary and then back to octal. Is it possible to write my own function for converting a hex string to an octal string without using any intermediate conversion? Also, I do not want to use built-in printf options like %x or %o. Thanks for your inputs.
Of course it is possible. A number is a number no matter what numeric system it is in. The only problem is that people are used to decimal and that is why they understand it better. You may convert from any base to any other.
EDIT: more info on how to perform the conversion.
First note that 3 hexadecimal digits map to exactly 4 octal digits (both are 12 bits). So given the number of hexadecimal digits, you can find the number of octal digits easily:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int get_val(char hex_digit) {
    if (hex_digit >= '0' && hex_digit <= '9') {
        return hex_digit - '0';
    } else if (hex_digit >= 'a' && hex_digit <= 'f') {
        return hex_digit - 'a' + 10;
    } else {
        return hex_digit - 'A' + 10;
    }
}

void convert_to_oct(const char* hex, char** res) {
    int hex_len = strlen(hex);
    int oct_len = (hex_len/3) * 4;
    int i;
    // One hex digit left over: that is 4 bits, or 2 oct digits.
    if (hex_len%3 == 1) {
        oct_len += 2;
    } else if (hex_len%3 == 2) { // 2 hex digits map to 3 oct digits
        oct_len += 3;
    }
    (*res) = malloc((oct_len+1) * sizeof(char));
    (*res)[oct_len] = 0; // don't forget the terminating char.
    int oct_index = oct_len - 1; // position we are changing in the oct representation.
    for (i = hex_len - 1; i - 2 >= 0; i -= 3) {
        (*res)[oct_index] = get_val(hex[i]) % 8 + '0';
        (*res)[oct_index - 1] = (get_val(hex[i])/8 + (get_val(hex[i-1])%4) * 2) + '0';
        (*res)[oct_index - 2] = get_val(hex[i-1])/4 + (get_val(hex[i-2])%2)*4 + '0';
        (*res)[oct_index - 3] = get_val(hex[i-2])/2 + '0';
        oct_index -= 4;
    }
    // If hex_len is not divisible by 3 we have to take care of the extra digits:
    if (hex_len%3 == 1) {
        (*res)[oct_index] = get_val(hex[0])%8 + '0';
        (*res)[oct_index - 1] = get_val(hex[0])/8 + '0';
    } else if (hex_len%3 == 2) {
        (*res)[oct_index] = get_val(hex[1])%8 + '0';
        (*res)[oct_index - 1] = (get_val(hex[1])/8 + (get_val(hex[0])%4)*2) + '0';
        (*res)[oct_index - 2] = get_val(hex[0])/4 + '0';
    }
}
It's a little tricky as you will be converting groups of 4 bits to groups of 3 bits - you'll probably want to work with 12 bits at a time, i.e. 3 hex digits to 4 octal digits and you'll then have to deal with any remaining bits separately.
E.g. to convert 5274 octal to hex:
5 2 7 4
101 010 111 100
|||/ \\// \|||
1010 1011 1100
A B C
All numbers in a computer's memory are base 2. So whenever you want to actually DO something with the values (mathematical operations), you'll need them as ints, floats, etc. So it is handy, or may come in handy in the future, to do the conversion via computable types.
I'd avoid direct string-to-string conversions, unless the values can be too big to fit into a numeric variable. It is surprisingly hard to write a reliable converter from scratch.
(Using base 10 makes very little sense in a binary computer.)
Yes, you can do it relatively easily: four octal digits always convert to three hex digits, so you can split your string into groups of three hex digits, and process each group from the back. If you do not have enough hex digits to complete a group of three, add leading zeros.
Each hex digit gives you four bits; take the last three, and convert them to octal. Add the next four, and take three more bits to octal. Add the last group of four - now you have six bits in total, so convert them to two octal digits.
This avoids converting the entire number to a binary, although there will be a "sliding" binary window used in the process of converting the number.
Consider an example: converting 62ABC to octal. Divide into groups of three digits: 062 and ABC (note the added zero in front of 62 to make a group of three digits).
Start from the back:
C, or 1100, gets chopped into 1 and 100, making octal 4, and 1 extra bit for the next step
B, or 1011, gets chopped into 10 for the next step and 11 for this step. The 1 from the previous step is attached on the right of 11, making an octal 7
A, or 1010, gets chopped into 101 and 0. The 10 from the previous step is attached on the right, making 010, or octal 2. The 101 is octal 5, so we have 5274 so far.
2 becomes 2 and 0 for the next step;
6 becomes 4 and 01 for the next step;
0 becomes 0 and 1 (because 01 from the previous step is added).
The final result is 01425274.
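A compact C sketch of this grouping strategy (hex_to_oct and hexval are hypothetical helpers; it assumes uppercase hex digits and processes one 12-bit window, i.e. up to three hex digits, per step):

#include <stdio.h>
#include <string.h>

static int hexval(char c) /* assumes a valid uppercase hex digit */
{
    return (c <= '9') ? c - '0' : c - 'A' + 10;
}

/* out must have room for 4 octal digits per (padded) group of 3 hex digits, plus '\0' */
void hex_to_oct(const char *hex, char *out)
{
    int len = strlen(hex);
    int pad = (3 - len % 3) % 3;      /* leading zeros needed to complete a group */
    int oi = ((len + pad) / 3) * 4;   /* total number of octal digits */
    out[oi] = '\0';
    for (int i = len; i > 0; i -= 3) {
        int w = 0;                    /* 12-bit window: up to three hex digits */
        for (int j = (i - 3 < 0 ? 0 : i - 3); j < i; ++j)
            w = w * 16 + hexval(hex[j]);
        for (int k = 0; k < 4; ++k) { /* emit four octal digits, least significant first */
            out[--oi] = '0' + (w & 7);
            w >>= 3;
        }
    }
}

int main(void)
{
    char buf[16];
    hex_to_oct("62ABC", buf);
    printf("%s\n", buf); /* prints 01425274, matching the walk-through above */
    return 0;
}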
Seems like a pretty straightforward task to me... You want a hex string and you want to convert it to an octal string. Let's take the ASCII hex and convert it to an int type to work with:
char hex_value[] = "0x123";
int value = strtol(hex_value,NULL,16);
It's still hex at this point, then if we want to convert from one base to another there's simple math that can be done:
123 / 8 = 24 R 3
24 / 8 = 4 R 4
4 / 8 = 0 R 4
This tells us that 123 (base 16) == 443 (base 8), so all we have to do is write that math into a basic function and put the final value back into a string:
#include <stdio.h>
#include <stdlib.h>

char * convert_to_oct(int value)
{
    int ret = 0, quotient = 0, remainder = 0, dividend = value, counter = 0;
    int place = 1;                            // 10^counter, shifts each octal digit into place
    char * ret_str;                           // returned string

    while (dividend > 0) {                    // while we have something to divide
        quotient = dividend / 8;              // get the quotient
        remainder = dividend - quotient * 8;  // get the remainder
        ret += remainder * place;             // add the remainder (shifted) into our return value
        place *= 10;                          // increment our shift
        counter++;                            // count the digits
        dividend = quotient;                  // get ready for the next divide operation
    }
    if (counter == 0) counter = 1;            // the value 0 still prints one digit
    ret_str = malloc(counter + 1);            // allocate the digits plus the terminating '\0'
    sprintf(ret_str, "%d", ret);              // store the result
    return ret_str;
}
So this function will convert a hex (int) value into an octal string. You could call it like:
int main()
{
    char hex_value[] = "0x123";
    char * oct_value;
    int value = strtol(hex_value, NULL, 16);

    // sanity check: see what the value should be before the conversion
    printf("value is %x, auto convert via printf gives %o\n", value, value);

    oct_value = convert_to_oct(value);
    printf("value is %s\n", oct_value);
    return 0;
}
All octal digits contain 3 bits of information. All hex digits contain 4 bits of information. The least common multiple of 3 and 4 is 12.
This means you can build a simple lookup table
0000 = 0x000
0001 = 0x001
0002 = 0x002
...
0007 = 0x007
0010 = 0x008
0011 = 0x009
0012 = 0x00A
...
0017 = 0x00F
0020 = 0x010
...
5274 = 0xABC
...
Now that the idea is there, you have several choices:
Build a Map (lookup table)
The routine here would add leading zeros to the octal (string) number until it was 4 digits long, and then look up the hexadecimal value from the table. Two variations are typing out the table statically, or populating it dynamically.
Use math to replace the lookup table
Instead of typing out each solution, you could calculate them:
hexdigit1 = ((01 & octaldigit8) << 3) + octaldigit1;
hexdigit16 = ((03 & octaldigit64) << 2) + ((06 & octaldigit8) >> 1);
hexdigit256 = (octaldigit512 << 1) + ((04 & octaldigit64) >> 2);
where octaldigit1 / hexdigit16 / octaldigit8 mean "octal 1's place", "hexadecimal 16's place", "octal 8's place", respectively.
Note that in either of these cases you don't "use binary" or "use decimal" but as these numbers can be represented in either of those two systems, it's not possible to avoid someone coming along behind and analyzing the correctness of the (or any) solution in decimal or binary terms.
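A quick self-contained check of those formulas (a sketch; it converts octal 5274, expecting hex ABC as in the earlier example):

#include <stdio.h>

int main(void)
{
    /* octal 5274, one variable per place value */
    int octaldigit512 = 5, octaldigit64 = 2, octaldigit8 = 7, octaldigit1 = 4;

    int hexdigit1 = ((01 & octaldigit8) << 3) + octaldigit1;
    int hexdigit16 = ((03 & octaldigit64) << 2) + ((06 & octaldigit8) >> 1);
    int hexdigit256 = (octaldigit512 << 1) + ((04 & octaldigit64) >> 2);

    printf("%X%X%X\n", hexdigit256, hexdigit16, hexdigit1); /* prints ABC */
    return 0;
}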
Here is an easy function to convert your characters to octal escapes in JavaScript, valid for alert() or for your pages, for character codes up to 65536 (32 bits). The concern you encounter is often with text whose character codes are beyond 127. The safest value is the octal one. Avoid the parseXX functions.
Thank you for your likes (^ _ ^). It's free to enjoy.
function enjoyOCTALJS(valuestr) {
    var arrstr = valuestr.split('');
    arrstr = arrstr.map(f => (!isNaN(f.charCodeAt(0))) ? (f.charCodeAt(0) > 127) ? '\\' + f.charCodeAt(0).toString(8) : f : f);
    return arrstr.join('');
}
If you just want to get the octal value of a character, do this (max = 65536, or 0xFFFF):
var mchar = "à";
var result = mchar.charCodeAt(0).toString(8);
Or completely :
var mchar = 'à';
var result = mchar.codePointAt(0).toString(8);
If the value is > 65536, undefined is returned. You can use the function parameter to pick the character position: charCodeAt(x) or codePointAt(x).
Your computer considers everything as values from 0 to 255.
We do not need to do big functions to convert characters, it's very easy.
CHAR TO UNICODE
var mchar = 'à';
var result = mchar.codePointAt(0); // or mchar.charCodeAt(0);
UNICODE TO OCTAL :
var mcode = 220;
var result = mcode.toString(8);
etc... :)

Converting Decimal to Hexadecimal and Octal

Show how to write a constant in C, whose decimal value is 65 as
a. a hexadecimal constant
65/16 = 4 r1
4/16 = 0 r4
Hexadecimal constant = 41
b. an octal constant (in C)
65/8 = 8 r1
8/8 = 1 r0
1/8 = 0 r1
Octal constant = 101
Is this the right way to convert constants in C?
You just need a while loop and a string. As this is homework, I do not think I should say more than that.
The method is to divide by the base until the result is less than the base.
So 65/8 gives 8 r1, but you don't stop there, because the result, 8, is not less than 8.
You divide by 8 again and get 1.
It should be
65/64 = 1 r1, where 64 = 8x8 = octal 100
so the remainder 1 gives 0 eights and 1 one, i.e. 101.
I don't think I've said too much.
Maybe I am misunderstanding the question, but it seems like you are being asked how hex and oct constants are represented in C, not how to implement an algorithm to convert dec to hex and oct.
If that is the case:
hex numbers are represented by a preceding 0x or 0X
oct numbers are represented by a preceding 0
int hex = 0x41;
int oct = 0101;
Of course, you can verify this by printing out the values in decimal:
printf("%d\n", hex);
printf("%d\n", oct);
Both print 65.
