Size in bytes of a char number and an int number - C

Please tell me if I am wrong: if a number is stored as a character, will it take 1 byte per character of the number (not 4 bytes)?
For example, if I make an int variable of the number 8 and a char variable of '8', will the int variable have consumed more memory?
And if I create an int variable with the number 12345 and a character array of "12345", will the character array have consumed more memory?
And when numbers are stored in text files, are they considered integers or characters?
Thank you.

Yes, all of your assumptions are correct.
An int will always take up sizeof(int) bytes; assuming a 32-bit int, the number 8 stored as an int takes 4 bytes, whereas '8' stored as a char takes one byte.
The way to think about your last question, IMO, is that data is stored as bytes; char and int are ways of interpreting bytes. So in text files you write bytes, but if you want to write a human-readable "8" into a text file, you must write it in some encoding, such as ASCII, where bytes correspond to human-readable characters. To write "8" you would therefore write the byte 0x38 (the ASCII code of '8').
So, in files you have data, not ints or chars.
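A minimal sketch of the difference (the file name is just for illustration):

#include <stdio.h>

int main(void)
{
    int  i = 8;
    char c = '8';

    printf("sizeof i = %zu\n", sizeof i);  /* typically 4 */
    printf("sizeof c = %zu\n", sizeof c);  /* always 1 */

    /* writing '8' to a text file stores the single byte 0x38 (ASCII '8'),
       not the 4-byte in-memory representation of the int */
    FILE *f = fopen("out.txt", "w");
    if (f) {
        fputc(c, f);
        fclose(f);
    }
    return 0;
}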

When we consider the memory allocated for an int or a char, we think of it as a whole. Integers are commonly stored using a word of memory, which is 4 bytes or 32 bits, so an unsigned int can hold integers from 0 up to 4,294,967,295 (2^32 - 1). Since we need 32 bits in total (32/8 = 4), we need 4 bytes for an int variable.
But to store an ASCII character we need only 7 bits: the ASCII table has 128 characters, with values from 0 through 127, so 7 bits are sufficient to represent a character in ASCII. However, most computers typically reserve 1 bit more (i.e., 8 bits) for an ASCII character.
And about your question:
And if I create an int variable with the number 12345 and a character array of "12345", will the character array have consumed more memory?
Yes, from the above definition this is true. In the first case (the int value) it needs just 4 bytes, while in the second case it needs 5 bytes, because in the first case 12345 is a single integer value and in the second case "12345" is 5 ASCII characters. In fact, in the second case you need one byte more again to hold the '\0' character that marks the end of the string, for 6 bytes in total.
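To see this concretely, here is a small sketch (the exact sizeof(int) is platform dependent):

#include <stdio.h>

int main(void)
{
    int  n   = 12345;
    char s[] = "12345";

    printf("sizeof n = %zu\n", sizeof n);  /* typically 4 bytes */
    printf("sizeof s = %zu\n", sizeof s);  /* 6 bytes: five digits plus '\0' */
    return 0;
}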

When an int is defined, the memory allocated depends on the compiler and platform (typically 4 bytes, though it can range from 2 to 8). The number assigned to the int is stored as is.
e.g., int a = 86;
The number 86 would be stored at the memory allocated for a.
When a char is defined, there is a number assigned to each character. When the character needs to be printed, the character itself is printed, but in memory it is stored as a number. These numbers are the ASCII codes (and there are other encodings as well).
The allocation is 1 byte, because with 1 byte you can represent 2^8 = 256 symbols.
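For example, printing the same char with %c and with %d shows that it is stored as a number:

#include <stdio.h>

int main(void)
{
    char ch = 'A';
    printf("%c\n", ch);  /* prints the character: A */
    printf("%d\n", ch);  /* prints the stored ASCII code: 65 */
    return 0;
}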

If a number is stored as a character, will it take 1 byte per character of the number (not 4 bytes)? For example, if I make an int variable of the number 8 and a char variable of '8', will the int variable have consumed more memory?
Yes, since it is guaranteed that (assuming 8-bit bytes):
sizeof(char) == 1
sizeof(int) >= 2
If I create an int variable with the number 12345 and a character array of "12345", will the character array have consumed more memory?
Correct. See the difference between:
strlen("12345") == 5
sizeof(12345) >= 2
Of course, for small numbers like 7, it is not true:
strlen("7") == 1
sizeof(7) >= 2
When numbers are stored in text files, are they considered integers or characters?
To read any data (be it in a file or on a clay tablet!) you need to know its encoding.
If it is a text file, then typically the numbers will be encoded using characters, possibly in their decimal representation.
If it is a binary file, then you may find them written as they are stored in memory for a particular computer.
In short, it depends.
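A sketch of both cases (the file names are arbitrary):

#include <stdio.h>

int main(void)
{
    int n = 12345;

    /* text encoding: writes the five characters '1' '2' '3' '4' '5' */
    FILE *t = fopen("number.txt", "w");
    if (t) {
        fprintf(t, "%d", n);
        fclose(t);
    }

    /* binary encoding: writes sizeof(int) bytes exactly as stored in memory */
    FILE *b = fopen("number.bin", "wb");
    if (b) {
        fwrite(&n, sizeof n, 1, b);
        fclose(b);
    }
    return 0;
}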

Related

Convert int/string to byte array with length n

How can I convert a value like 5 or "Testing" to an array of type byte with a fixed length of n bytes?
Edit:
I want to represent the number 5 in bits. I know that it's 101, but I want it represented as an array with a length of, for example, 6 bytes, so 000000 ....
I'm not sure what you are trying to accomplish here, but assuming you simply want to represent characters in the binary form of their ASCII codes, you can pad the binary representation with zeros. For example, if the set number of digits you want is 10, then encoding the letter a (ASCII code 97) in binary gives 1100001, which padded to 10 digits becomes 0001100001 - but that is the encoding of a single character. The encoding of a string, which is made up of multiple characters, will be a sequence of these 10-digit binary codes, each representing the corresponding character in the ASCII table. The encoding of data matters so that the system knows how to interpret the binary data. Then there is also endianness, depending on the system architecture - but that's less of an issue these days, with many modern processors, such as ARM, being bi-endian.
So forget about representing the number 5 and the string "WTF" using the same number of bytes - it makes the brain hurt. Stop it.
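A minimal sketch of the zero-padding idea (the function name and the widths are just for illustration):

#include <stdio.h>

/* print the low 'width' bits of value, most significant first, zero-padded */
static void print_bits(unsigned value, int width)
{
    for (int i = width - 1; i >= 0; i--)
        putchar(((value >> i) & 1u) ? '1' : '0');
    putchar('\n');
}

int main(void)
{
    print_bits('a', 10);  /* 0001100001: ASCII 97 padded to 10 digits */
    print_bits(5, 6);     /* 000101: the number 5 padded to 6 digits */
    return 0;
}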
A bit more reading on character encoding will be great.
Start here - https://en.wikipedia.org/wiki/ASCII
Then this - https://en.wikipedia.org/wiki/UTF-8
Then brain hurt - https://en.wikipedia.org/wiki/Endianness

Convert a char* to uppercase in C without using a loop

Is it possible to convert a char* to uppercase without traversing it character by character in a loop?
Assumptions:
1. The char pointer points to a fixed-size string array.
2. The array pointed to contains only lowercase characters.
In the ASCII encoding, converting lowercase to uppercase amounts to clearing the bit of weight 32 (i.e. 20H, which happens to be the code of the space character).
With a bitwise operator,
Char &= ~0x20;
You can process several characters at a time by mapping longer data types onto the array. For instance, to convert an array of 11 characters,
int ToUpper = ~0x20202020;
*(int*) &Char[0] &= ToUpper;
*(int*) &Char[4] &= ToUpper;
*(short*)&Char[8] &= ToUpper;
Char[10] &= ~0x20;
You can go to 64-bit ints and even larger (up to 512 bits = 64 characters at a time) with the SIMD intrinsics (SSE, AVX).
If your code allows it, it is better to extend the buffer length to the next larger data type so that all bytes can be updated in a single operation. But don't forget to restore the terminating null.
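A complete sketch of the word-at-a-time idea, using memcpy to sidestep the alignment and strict-aliasing pitfalls of casting the pointer directly (the buffer contents and size are assumptions; per the question, only lowercase letters are present):

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void)
{
    /* buffer padded to a multiple of 8 bytes; the padding bytes are 0,
       and clearing bit 5 of a 0 byte leaves it 0, so the null survives */
    char buf[16] = "helloworld";

    uint64_t mask = 0x2020202020202020ULL;  /* bit 5 of every byte */
    for (size_t i = 0; i < sizeof buf; i += 8) {
        uint64_t w;
        memcpy(&w, buf + i, 8);  /* load 8 bytes as one word */
        w &= ~mask;              /* clear bit 5: lowercase -> uppercase */
        memcpy(buf + i, &w, 8);  /* store them back */
    }

    printf("%s\n", buf);  /* HELLOWORLD */
    return 0;
}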

I have 24 individual bits (1 or 0) and want to form a byte array of size 3 (forming 3 bytes) with these bits, and require a pointer to it

The 24 bits correspond to different parameters, with each parameter taking a varying number of bits (each bit taking the value 1 or 0).
I need to create a byte array (or unsigned char array) so that all these 24 bits fit in with no wasted memory (which would happen with an int data type).
I require a byte pointer so that incrementing the pointer seeks to positions 0, 8, and 16 in the byte array formed.
I couldn't find a proper way to do this, as I wasn't able to access the individual bits (say, 0 to 23) of the byte array. Kindly help me with this in a C program.
Example:
Available: a=0; b=1; c=1; d=0; ... for all 24 parameters
Required: byte arr[3] containing the above 24 parameters, and byte *p with p = arr to access every 8 bits
You can use bit-fields. Here is a guide: Bit fields
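Alternatively, a minimal sketch of packing the bits by hand (the parameter values and the LSB-first bit order are assumptions for illustration):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* hypothetical parameter values, one bit each, 24 in total */
    int bits[24] = {0, 1, 1, 0};  /* remaining entries default to 0 */

    uint8_t arr[3] = {0};

    /* pack bit i into byte i/8 at position i%8 (LSB-first within a byte) */
    for (int i = 0; i < 24; i++)
        if (bits[i])
            arr[i / 8] |= (uint8_t)(1u << (i % 8));

    /* a byte pointer: p, p+1, p+2 address bits 0-7, 8-15, 16-23 */
    uint8_t *p = arr;
    for (int i = 0; i < 3; i++)
        printf("byte %d = 0x%02X\n", i, p[i]);

    return 0;
}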

Why is char 1 byte in the C language

Why is a char 1 byte long in C? Why is it not 2 bytes or 4 bytes long?
What is the basic logic behind keeping it at 1 byte? I know that in Java a char is 2 bytes long; the same question applies there.
char is 1 byte in C because it is specified so in the standards.
The most probable logic is that the (binary) representation of a char (in the standard character set) can fit into 1 byte. At the time of the primary development of C, the most commonly available standards were ASCII and EBCDIC, which needed 7- and 8-bit encodings, respectively. So, 1 byte was sufficient to represent the whole character set.
OTOH, by the time Java came into the picture, the concepts of extended character sets and Unicode were present. So, to be future-proof and support extensibility, char was given 2 bytes, which is capable of handling extended character-set values.
Why would a char hold more than 1 byte? A char normally represents an ASCII character. Just have a look at an ASCII table: there are only 256 characters in the (extended) ASCII code, so you only need to represent numbers from 0 to 255, which comes down to 8 bits = 1 byte.
Have a look at an ASCII table, e.g. here: http://www.asciitable.com/
That's for C. When Java was designed, they anticipated that in the future 16 bits = 2 bytes would be enough to hold any character (including Unicode).
It is because the C language is 37 years old and there was no need to have more bytes for 1 char, as only the 128 ASCII characters were used (http://en.wikipedia.org/wiki/ASCII).
When C was developed (the first book on it was published by its developers in 1978), the two primary character encoding standards were ASCII and EBCDIC, which were 7- and 8-bit encodings for characters, respectively. Memory and disk space were both greater concerns at the time; C was popularized on machines with a 16-bit address space, and using more than a byte per character in strings would have been considered wasteful.
By the time Java came along (mid 1990s), some with vision were able to perceive that a language could make use of an international standard for character encoding, and so Unicode was chosen for its definition. Memory and disk space were less of a problem by then.
The C language standard defines an abstract machine in which all objects occupy an integral number of abstract storage units, each made up of some fixed number of bits (specified by the CHAR_BIT macro in limits.h). Each storage unit must be uniquely addressable. A storage unit is defined as the amount of storage occupied by a single character from the basic character set [1]. Thus, by definition, the size of the char type is 1.
Eventually, these abstract storage units have to be mapped onto physical hardware. Most common architectures use individually addressable 8-bit bytes, so char objects usually map to a single 8-bit byte.
Usually.
Historically, native byte sizes have been anywhere from 6 to 9 bits wide. In C, the char type must be at least 8 bits wide in order to represent all the characters in the basic character set, so to support a machine with 6-bit bytes, a compiler may have to map a char object onto two native machine bytes, with CHAR_BIT being 12. sizeof (char) is still 1, so types with size N will map to 2 * N native bytes.
[1] The basic character set consists of all 26 English letters in both upper- and lowercase, the 10 digits, punctuation and other graphic characters, and control characters such as newlines, tabs, form feeds, etc., all of which fit comfortably into 8 bits.
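A quick way to inspect these values on a given implementation:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("CHAR_BIT     = %d\n", CHAR_BIT);       /* bits per storage unit, >= 8 */
    printf("sizeof(char) = %zu\n", sizeof(char));  /* always 1, by definition */
    printf("sizeof(int)  = %zu\n", sizeof(int));   /* implementation-defined */
    return 0;
}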
You don't need more than a byte to represent the whole ASCII table (128 characters).
But there are other C types which have more room to contain data, like int (typically 4 bytes) or long double (often 12 or 16 bytes).
All of these contain numerical values (even chars! Even if they're represented as "letters", they're "numbers": you can compare them, add them...).
These are just different standard sizes, like cm and m for length.

How long can a char be?

Why does int a = 'adf'; compile and run in C?
The literal 'adf' is a multi-character constant. Its value is implementation-defined and platform dependent. Don't use it.
For example, on one platform a 32-bit unsigned integer could take the value 0x00616466, on another it could be 0x66646100, and on yet another it could be 0x84860081...
This, as Kerrek said, is a multi-character constant. It works because each character takes up 8 bits; 'adf' is 3 characters, which is 24 bits, and an int is usually large enough to contain this.
But all of the above is platform dependent and could differ from architecture to architecture. This kind of thing is still used in ancient Apple code - can't quite remember where, although file creator codes ring a bell.
Note the difference in syntax between " and ':
char *x = "this is a string";  /* the value assigned to x is a pointer to the string in memory */
char y = '!';  /* the value assigned to y is the numerical character value of '!' */
char z = 'asd';  /* the value of z comes from the numerical value of the multi-character constant, which can in theory be expressed as an int if it's short enough */
It works just because 'adf' is 3 ASCII characters and thus 3 bytes long, and your platform has an int of 24 bits or larger. It would fail on a system with 16-bit ints, for instance.
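A sketch you can try (the printed value is implementation-defined; gcc and clang typically pack the characters as shown in the comment):

#include <stdio.h>

int main(void)
{
    int a = 'adf';        /* multi-character constant; implementation-defined value */
    printf("0x%X\n", a);  /* 0x616466 with gcc/clang: 'a','d','f' packed into an int */
    return 0;
}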
It's also worth remembering that although sizeof(char) will always return 1, depending on the platform and compiler, more than 1 byte of memory space can effectively be assigned to a char. Hence, for
struct st
{
    int a;
    char c;
};
when you take sizeof(struct st), a number of 32-bit systems will return 8. This is because the system pads out the single byte for char c to 4 bytes.
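A quick demonstration of the padding (the exact value is platform dependent; 8 is common on 32- and 64-bit systems):

#include <stdio.h>

struct st
{
    int a;
    char c;
};

int main(void)
{
    printf("sizeof(int)       = %zu\n", sizeof(int));        /* typically 4 */
    printf("sizeof(char)      = %zu\n", sizeof(char));       /* always 1 */
    printf("sizeof(struct st) = %zu\n", sizeof(struct st));  /* commonly 8, due to padding */
    return 0;
}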
ASCII: every character has a numerical value. Halfway through this tutorial is a description, if you need more information: http://en.wikibooks.org/wiki/C_Programming/Variables
Edit:
char letter1 = 'a';
char letter2 = 97; /* in ASCII, 97 = 'a' */
Initializing a char from its numerical code is considered by some to be extremely bad practice when we are using it to store a character, not a small number, in that anyone reading the code is forced to look up which character corresponds to the number 97 in the encoding scheme. In the end, letter1 and letter2 both store the same thing - the letter "a" - but the first method is clearer, easier to debug, and much more straightforward.
One important thing to mention is that characters for numerals are represented differently from their corresponding numbers, i.e. '1' is not equal to 1.
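A one-liner to see the difference (ASCII assumed):

#include <stdio.h>

int main(void)
{
    char digit = '1';
    printf("%d\n", digit);        /* 49: the character code of '1' */
    printf("%d\n", digit - '0');  /* 1: converting the digit character to its number */
    return 0;
}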
