How to modify character size to 2 bytes in C?

Hi, how can I modify the character size to 2 bytes in C? The size of a character in C is only 1 byte.

You can't change the size of any data type. Instead, choose a data type that suits the values you are going to store.
If you need to store more than one character, use an array of characters, as below:
char a[2];
In the above declaration, 'a' can hold two characters; note that to store them as a C string you need a third element for the terminating '\0'.
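For instance, a minimal sketch (the contents "hi" are just an example) of storing a two-character string:
#include <stdio.h>

int main(void)
{
    char s[3] = "hi";   /* two characters plus the terminating '\0' */
    printf("%s\n", s);  /* prints: hi */
    return 0;
}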

You can use unsigned short. You cannot modify the data types.

You can do it as follows:
typedef unsigned short newChar;

int main()
{
    newChar c = 'a';
}
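For example, a minimal sketch of using that typedef (the 2-byte size is typical on mainstream platforms but not guaranteed by the standard; unsigned short only has to be at least 16 bits):
#include <stdio.h>

typedef unsigned short newChar;   /* the typedef suggested above */

int main(void)
{
    newChar c = 'a';
    /* %c works because c is promoted to int; %zu prints the size of the type */
    printf("%c takes %zu bytes\n", c, sizeof c);
    return 0;
}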

No, it is not possible. You cannot modify the character size to 2 bytes: the size of char is fixed at 1 byte by the language, and you cannot modify the data types. You can use an array of characters to store more than one character, like:
char s[10];
On a side note, quoting a reference on size qualifiers:
Size qualifiers alter the size of the basic data types. There are two size qualifiers that can be applied to integer: short and long. The minimum size of short int is 16 bits. The size of int must be greater than or equal to that of a short int. The size of long int must be greater than or equal to a short int. The minimum size of a long int is 32 bits.
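To see what these sizes come out to on your own system, here is a small sketch (the printed values vary by implementation; a typical 64-bit desktop prints 2, 4 and 8):
#include <stdio.h>

int main(void)
{
    printf("short: %zu bytes\n", sizeof(short));
    printf("int:   %zu bytes\n", sizeof(int));
    printf("long:  %zu bytes\n", sizeof(long));
    return 0;
}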

C typically stores characters using ASCII or a superset of it. Codes 0-127 are the standard ASCII character set and 128-255 are the extended set, so every code fits in 1 byte, which is why C's char is 1 byte. Java, by contrast, uses Unicode, which has a much larger range, so a character in Java takes more than 1 byte (a Java char is a 16-bit UTF-16 code unit).

Related

The size of a char and an int in C

In the C programming language, if an int is 4 bytes and letters are represented in ASCII as a number (also an int), then why is a char 1 byte?
A char is one byte because the standard says so. But that's not really what you are asking. In terms of decimal values, a (signed) char can hold -128 to 127. Have a look at a table of ASCII character codes: you'll notice that the decimal values of those codes are between 0 and 127, so they fit in the non-negative values of a char. There are extended character sets that use unsigned char, with values from 0 to 255.
6.2.5 Types
...
3 An object declared as type char is large enough to store any member of the basic execution character set. If a member of the basic execution character set is stored in a char object, its value is guaranteed to be nonnegative. If any other character is stored in a char object, the resulting value is implementation-defined but shall be within the range of values that can be represented in that type.
...
5 An object declared as type signed char occupies the same amount of storage as a ‘‘plain’’ char object. A ‘‘plain’’ int object has the natural size suggested by the architecture of the execution environment (large enough to contain any value in the range INT_MIN to INT_MAX as defined in the header <limits.h>).
C 2012 Online Draft
Type sizes are not defined in terms of bits, but in terms of the range of values that must be represented.
The basic execution character set consists of 96 or so characters (26 uppercase Latin characters, 26 lowercase Latin characters, 10 decimal digits, 29 graphical characters, space, vertical tab, horizontal tab, line feed, form feed); 8 bits is more than sufficient to represent those.
int, OTOH, must be able to represent a much wider range of values; the minimum range as specified in the standard is [-32767..32767]¹, although on most modern implementations it’s much wider.
¹ The standard doesn’t assume two’s complement representation of signed integers, which is why the required minimum for INT_MIN is -32767 and not -32768.
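As an illustration (a sketch; the output depends on your implementation), the actual ranges are exposed through <limits.h>:
#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("CHAR_BIT = %d\n", CHAR_BIT);   /* bits per byte, at least 8 */
    printf("CHAR_MIN = %d, CHAR_MAX = %d\n", CHAR_MIN, CHAR_MAX);
    printf("INT_MIN  = %d, INT_MAX  = %d\n", INT_MIN, INT_MAX);
    return 0;
}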
In the C language, a char usually has a size of 8 bits.
In all the compilers that I have seen (which are, admittedly, not very many), char is taken to be large enough to hold the ASCII character set (or the so-called “extended ASCII”), and the size of the char data type is 8 bits (this includes compilers on major desktop platforms and some embedded systems).
1 byte was sufficient to represent the whole character set.

Relationship between char and ASCII Code?

My computer science teacher taught us that which data type to declare depends on the size of the value you need for a variable. And then he demonstrated having a char add and subtract a number to output a different char. I remember he said this has something to do with ASCII code. Can anyone explain this more specifically and clearly? So, is a char considered a number (since we can do math with it), a character, or both? Can we print out the number behind a char? How?
So, is a char considered a number, a character, or both?
Both. It is an integer, but that integer value represents a character, as described by the character encoding of your system. The character encoding of the system that your computer science teacher uses happens to be ASCII.
Can we print out the number behind a char? How?
C++ (as the question used to be tagged):
The behaviour of the character output stream (such as std::cout) is to print the represented character when you insert an integer of type char. But the behaviour for all other integer types is to print the integer value. So, you can print the integer value of a char by converting it to another integer type:
std::cout << (unsigned)'c';
C:
There are no templated output streams, so you don't need to do explicit conversion to another integer (except for the signedness). What you need is the correct format specifier for printf:
printf("%hhu", (unsigned char)'c');
hh is for an integer of char size; u is for unsigned, as you are probably interested in the unsigned representation.
A char can hold a number; it's the smallest integer type available on your machine and must have at least 8 bits. It is synonymous with a byte.
Its typical use is to store the codes of characters. Computers can only deal with numbers, so, to represent characters, numbers are used. Of course, you must agree on which number means which character.
C doesn't require a specific character encoding, but most systems nowadays use a superset of ASCII (a very old encoding using only 7 bits), such as UTF-8.
So, if you have a char that holds a character and you add or subtract some value, the result will be another number that happens to be the code for a different character.
In ASCII, the characters 0-9, a-z and A-Z have consecutive code points, so adding e.g. 2 to 'A' gives 'C'.
Can we print out the number behind a char?
Of course. It just depends on whether you interpret the value in the char as just a number or as the code of a character. E.g. with printf:
printf("%c\n", 'A'); // prints the character
printf("%hhu\n", (unsigned char)'A'); // prints the number of the code
The cast to (unsigned char) is only needed because char is allowed to be either signed or unsigned, we want to treat it as unsigned here.
A char takes up a single byte. On systems with an 8 bit byte this gives it a range (assuming char is signed) of -128 to 127. You can print this value as follows:
char a = 65;
printf("a=%d\n", a);
Output:
65
The %d format specifier prints its argument as a decimal integer. If on the other hand you used the %c format specifier, this prints the character associated with the value. On systems that use ASCII, that means it prints the ASCII character associated with that number:
char a = 65;
printf("a=%c\n", a);
Output:
A
Here, the character A is printed because 65 is the ASCII code for A.
You can perform arithmetic on these numbers and print the character for the resulting code:
char a = 65;
printf("a=%c\n", a);
a = a + 1;
printf("a=%c\n", a);
Output:
A
B
In this example we first print A which is the ASCII character with code 65. We then add 1 giving us 66. Then we print the ASCII character for 66 which is B.
Every variable is stored in binary (i.e. as a number); chars are just numbers of a specific size.
They represent a character when interpreted with some character encoding; the ASCII standard is at www.asciitable.com.
As in @Igor's comment, if you run the following code, you see the ASCII character and the decimal and hexadecimal representations of your char.
char c = 'A';
printf("%c %d 0x%x", c, c, c);
Output:
A 65 0x41
As an exercise to understand it better, you could make a program to generate the ASCII Table yourself.
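A minimal sketch of such a generator, restricted to the printable range (codes 32 through 126) so control characters don't garble the output:
#include <stdio.h>

int main(void)
{
    /* decimal code, hex code, character */
    for (int i = 32; i < 127; i++) {
        printf("%3d  0x%02X  %c\n", i, i, i);
    }
    return 0;
}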
My computer science teacher taught us that which data type to declare depends on the size of the value for a variable you need.
This is correct. Different types can represent different ranges of values. For reference, here are the various integral types and the minimum ranges they must be able to represent:
Type                Minimum Range
----                -------------
signed char         -127...127
unsigned char       0...255
char                same as signed or unsigned char, depending on implementation
short               -32767...32767
unsigned short      0...65535
int                 -32767...32767
unsigned int        0...65535
long                -2147483647...2147483647
unsigned long       0...4294967295
long long           -9223372036854775807...9223372036854775807
unsigned long long  0...18446744073709551615
An implementation may represent a larger range in a given type; for example, on most modern implementations an int covers at least the 32-bit range that the table above only requires of long.
C doesn't mandate a fixed size (bit width) for the basic integral types (although unsigned types are the same size as their signed equivalent); at the time C was first developed, byte and word sizes could vary between architectures, so it was easier to specify a minimum range of values that the type had to represent and leave it to the implementor to figure out how to map that onto the hardware.
C99 introduced the stdint.h header, which defines fixed-width types like int8_t (8-bit), int32_t (32-bit), etc., so you can define objects with specific sizes if necessary.
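For example (a sketch; the exact-width types are technically optional but available on virtually every mainstream platform):
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    printf("int8_t:  %zu byte\n",  sizeof(int8_t));
    printf("int32_t: %zu bytes\n", sizeof(int32_t));
    printf("int64_t: %zu bytes\n", sizeof(int64_t));
    return 0;
}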
So, is a char considered a number (since we can do math with it), a character, or both?
char is an integral data type that can represent values in at least the range [0...127]¹, which is the range of encodings for the basic execution character set (upper- and lowercase Latin alphabet, decimal digits 0 through 9, and common punctuation characters). It can be used for storing and doing regular arithmetic on small integer values, but that's not the typical use case.
You can print char objects out as characters or as numeric values:
#include <ctype.h>   // for isprint
#include <limits.h>  // for CHAR_MAX
#include <stdio.h>   // for printf
...
printf( "%5s%5s\n", "dec", "char" );
printf( "%5s%5s\n", "---", "----" );
for ( char i = 0; i < CHAR_MAX; i++ )
{
    printf( "%5hhd%5c\n", i, isprint( i ) ? i : '.' );
}
That code will print out the integral value and the associated character, like so (this is ASCII, which is what my system uses):
...
65 A
66 B
67 C
68 D
69 E
70 F
71 G
72 H
73 I
...
Control characters like SOH and EOT don't have an associated printing character, so for those values the code above just prints a '.'.
By definition, a char object takes up a single storage unit (byte); the number of bits in a single storage unit must be at least 8, but could be more.
¹ Plain char may be either signed or unsigned depending on the implementation, so it can represent additional values outside that range, but it must be able to represent *at least* those values.

Difference between char and int when declaring character

I just started learning C and am rather confused over declaring characters using int and char.
I am well aware that characters are represented by integers, in the sense that each character corresponds to its respective ASCII decimal value.
That said, I learned that it's perfectly possible to declare a character using int without using the ASCII decimals. Eg. declaring variable test as a character 'X' can be written as:
char test = 'X';
and
int test = 'X';
And for both declarations of the character, the conversion character used to print it is %c (even though test is defined as int).
Therefore, my question is/are the difference(s) between declaring character variables using char and int and when to use int to declare a character variable?
The difference is the size in bytes of the variable, and from that, the range of values the variable can hold.
A char is required to accept all values between 0 and 127 (inclusive), so in common environments it occupies exactly one byte (8 bits). It is unspecified by the standard whether it is signed (-128 to 127) or unsigned (0 to 255).
An int is required to be at least a 16 bits signed word, and to accept all values between -32767 and 32767. That means that an int can accept all values from a char, be the latter signed or unsigned.
If you want to store only characters in a variable, you should declare it as char. Using an int would just waste memory, and could mislead a future reader. One common exception to that rule is when you want to process a wider value for special conditions. For example the function fgetc from the standard library is declared as returning int:
int fgetc(FILE *fd);
because the special value EOF (End Of File) is defined as the int value -1 (all bits set to one on a two's-complement system), which needs more room than a char provides. That way, no character (only 8 bits on a common system) can collide with the EOF constant. If the function were declared to return a plain char, nothing could distinguish the EOF value from the (valid) character 0xFF.
That's the reason why the following code is bad and should never be used:
char c; // a terrible memory saving...
...
while ((c = fgetc(stdin)) != EOF) { // NEVER WRITE THAT!!!
...
}
Inside the loop a char would be enough, but for the EOF test not to fire spuriously when the character 0xFF is read, the variable used in the comparison needs to be an int; the correct pattern is sketched below.
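The usual correct idiom, as a minimal sketch that simply echoes standard input:
#include <stdio.h>

int main(void)
{
    int c;                     /* int, not char, so EOF stays distinguishable */
    while ((c = fgetc(stdin)) != EOF) {
        fputc(c, stdout);      /* the value fits back into an unsigned char here */
    }
    return 0;
}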
The char type has multiple roles.
The first is that it is simply part of the chain of integer types, char, short, int, long, etc., so it's just another container for numbers.
The second is that its underlying storage is the smallest unit, and all other objects have a size that is a multiple of the size of char (sizeof returns a number in units of char, so sizeof(char) == 1).
The third is that it plays the role of a character in a string, at least historically. When seen like this, the value of a char maps to a specified character, for instance via the ASCII encoding, but it can also be used with multi-byte encodings (one or more chars together map to one character), as sketched below.
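For example, here is a sketch of the multi-byte case, with the euro sign hard-coded as its UTF-8 byte sequence (so the example does not depend on the source file's encoding):
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* one character, three char (byte) elements */
    const char euro[] = "\xE2\x82\xAC";
    printf("char elements: %zu, characters: 1\n", strlen(euro));   /* prints 3 */
    return 0;
}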
Size of an int is 4 bytes on most architectures, while the size of a char is 1 byte.
Usually you should declare characters as char and use int for integers capable of holding bigger values. On most systems a char occupies a byte, which is 8 bits. Depending on your system this char might be signed or unsigned by default; as such, it will be able to hold values in the range 0 to 255 or -128 to 127.
An int might be 32 bits long, but if you really want exactly 32 bits for your integer you should declare it as int32_t or uint32_t instead.
I think there's no functional difference, but you're allocating extra memory you're not going to use. You could also write const long a = 1;, but it is more suitable to use const char a = 1; instead.

What's the difference between these two uses of sizeof() in C?

If I do sizeof('r'), the result says that the character 'r' requires 4 bytes in memory. Alternatively, if I first declare a char variable and initialize it like so:
char val = 'r';
printf("%d\n", sizeof(val));
The output indicates that 'r' only requires 1 byte in memory.
Why is this so?
This is because the constant 'c' is interpreted as an int.
If you run this:
printf("%d\n", sizeof( (char) 'c' ) );
it will print 1.
In C, the literal 'c' is called an integer character constant, and according to the C Standard:
10 An integer character constant has type int.
On the other hand, in C++ this literal is called a character literal, and according to the C++ Standard:
An ordinary character literal that contains a single c-char representable in the execution character set has type char.
In this declaration
char val = 'r';
variable val is explicitly declared as having type char. In both languages, sizeof( char ) is equal to 1.
This is because the literal 'r' is treated as an integer whose value is its ASCII code. An int generally requires 4 bytes, hence the output. In the second case you explicitly declare the variable as a char, hence it outputs 1.
If you try the line printf("%d", (10 + 'c')); it will print 109, i.e. (10 + 99).
For some clarification you may want to take a look at this table.
http://goo.gl/nOa5ju (ascii table for chars)
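Putting the above together, a small sketch (the value 4 for sizeof(int) is typical but not guaranteed):
#include <stdio.h>

int main(void)
{
    char val = 'r';
    /* 'r' is an int constant, so sizeof 'r' equals sizeof(int);
       the char variable holding the same value is always exactly 1 byte */
    printf("sizeof 'r' = %zu\n", sizeof 'r');
    printf("sizeof val = %zu\n", sizeof val);
    printf("10 + 'c'   = %d\n", 10 + 'c');   /* 109 with ASCII codes */
    return 0;
}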
Firstly, the size of int in C varies by platform; it is commonly 16 bits (2 bytes) or 32 bits (4 bytes).
A character constant in C is treated as an int whose value is the code of the character it represents in the table; the decimal value of 'c' is 99.
So the size you got for the constant is the size of an int, typically 4 bytes.
On the other hand, char var = 'c'; is a 1-byte value, because an ASCII character is represented with 8 bits (1 byte).
Table of C type sizes: http://goo.gl/yhxmSF

How to increment the value of an unsigned char *

I have a value stored as an unsigned char *. It holds the SHA1 hash of a string. Since I'm using <openssl/evp.h> to generate the hashes, I end up with an unsigned char* holding the SHA1 value.
Now I want to iterate from a value until the end of the SHA1 image space. So if the value was a decimal int I would iterate with i = <original_value> and i++ till I reach the max possible value of the image space.
How do I do this over an unsigned char * value?
I am assuming your pointer refers to 20 bytes, for the 160-bit value. (An alternative would be text characters representing the same 160-bit value in hex, occupying more characters.)
You can wrap the data in a small struct, or simply pass the pointer and length to a function that increments the lowest-order unsigned byte, tests it for zero, and, if it wrapped to zero, increments the next higher-order byte, and so on.
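In C, that idea might look like the following sketch (the name increment_digest and the big-endian interpretation of the 20 bytes are assumptions, not part of the OpenSSL API):
#include <stddef.h>

/* Treat the digest as one big-endian integer and add 1, carrying from the
   lowest-order (last) byte upward. Returns 1 when the value wraps around to
   all zeros, i.e. the end of the image space has been passed. */
static int increment_digest(unsigned char *digest, size_t len)
{
    for (size_t i = len; i-- > 0; ) {
        if (++digest[i] != 0)   /* no carry needed, we are done */
            return 0;
    }
    return 1;                   /* carried out of the most significant byte */
}
You would call it as increment_digest(hash, 20) after processing each value, stopping when it returns 1.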
