I have a value stored as an unsigned char *. It holds the SHA1 hash of a string. Since I'm using <openssl/evp.h> to generate the hashes, I end up with an unsigned char* holding the SHA1 value.
Now I want to iterate from that value to the end of the SHA-1 image space. If the value were a plain decimal int, I would set i = <original_value> and increment with i++ until I reached the maximum possible value of the image space.
How do I do this over an unsigned char * value?
I am assuming your pointer refers to 20 bytes holding the raw 160-bit value. (The alternative would be text characters representing hex values with the same 160-bit meaning, but occupying more characters.)
You can declare a class for the data and implement a method that increments the low-order unsigned byte, tests it for zero, and, if it wrapped to zero, carries by incrementing the next higher-order byte, and so on.
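That carry-propagating increment can be sketched in plain C as follows (a minimal sketch; increment_hash is a hypothetical name, and the digest is treated as a single big-endian 160-bit number):

```c
#include <stddef.h>

/* Increment a big-endian multi-byte value in place. Returns 1 on
 * wrap-around (every byte was 0xFF, i.e. the end of the image space),
 * 0 otherwise. len is 20 for a SHA-1 digest. */
static int increment_hash(unsigned char *buf, size_t len)
{
    for (size_t i = len; i-- > 0; ) {
        if (++buf[i] != 0)   /* no carry needed: done */
            return 0;
        /* byte wrapped to 0x00: carry into the next higher-order byte */
    }
    return 1;                /* overflowed past the top of the image space */
}
```

Calling this in a loop until it returns 1 walks from the original value to the end of the image space.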
Related
Say I have the string (represented as a char pointer) given from a strSHA2 hash of a file:
"f731d405b522b69d79f2495f0963e48d534027cc1852dd99fa84ef1f5f3387ee"
How could I effectively turn it into an integer? Is there any way to cast it? atoi() stops as soon as it reaches a non-digit character.
Would iterating through and converting char's using arithmetic such as letter - 'a' be the best way?
I intend to use it as an index for a hash table, thus need an integer.
The length of the integer would be the standard 32 bits for C.
You probably want to transform the hexadecimal number made of the first 8 chars of the SHA-2 string into an unsigned 32-bit integer, which sounds like a pretty good hash function to me, as it is pretty unlikely that two different SHA-2 hashes start with the same 8 characters:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

unsigned int GetHashValueFromSHA2String(const char *sha2string)
{
    char first[9];
    memcpy(first, sha2string, 8); // copy first 8 chars of sha2 string
    first[8] = 0;                 // null terminate
    return strtoul(first, NULL, 16);
}

int main()
{
    unsigned int hashvalue = GetHashValueFromSHA2String("f731d405b522b69d79f2495f0963e48d534027cc1852dd99fa84ef1f5f3387ee");
    printf("Hashvalue = %08x", hashvalue);
}
Or even simpler:
unsigned int GetHashValueFromSHA2String(const char *sha2string)
{
    unsigned int value;
    sscanf(sha2string, "%8x", &value);
    return value;
}
Say I have the string (represented as a char pointer) given from a strSHA2 hash of a file:
That is then a hexadecimal representation of a 256 bit integer.
Your computer doesn't have a 256-bit integer type, so you can't possibly cast it to one.
Instead, you'll want to use a different function from your hashing library that doesn't give you a printable string, but just 32 bytes of raw hash data. You can then use, say, the upper 2 bytes as hash table indices.
Using a 32 byte (256 bit) hash table index makes no sense – no computer in this world has enough memory for a table with 2²⁵⁶ entries.
Honestly, however, if you want a hash table, use an existing hash table instead of building your own.
In order to convert a hexadecimal string to a 32-bit unsigned integer data type, you can use the function strtoul.
However, a 32-bit unsigned integer data type is only able to represent numbers up to 2³²−1, which is insufficient for your example of a 256-bit number.
Therefore, it would only be possible to convert this number into eight 32-bit integers.
However, as pointed out in one of the other answers, it does not make sense to use a 256-bit index into a hash table. Since you can probably assume that all of the bits of a SHA-2 hash are sufficiently uniformly distributed for your use-case, it should be sufficient to simply take the first 10 or 16 bits of the SHA-2 hash and use them as an index into your hash table. That way, your hash table would have a size between 8 KiB and 512 KiB, assuming 8 bytes per hash table entry.
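Taking the top bytes of the raw digest as a table index can be sketched like this (hash_index and TABLE_BITS are hypothetical names; this assumes the raw binary digest, not the hex string):

```c
#include <stddef.h>

enum { TABLE_BITS = 16 };  /* table of 2^16 = 65536 buckets */

/* Build a table index from the first two raw digest bytes and mask it
 * down to TABLE_BITS bits. */
static size_t hash_index(const unsigned char *digest)
{
    return ((size_t)digest[0] << 8 | digest[1]) & ((1u << TABLE_BITS) - 1);
}
```

With TABLE_BITS = 10 you would instead shift the combined 16-bit value right by 6 before masking, giving the 8 KiB table from the paragraph above.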
I just started learning C and am rather confused over declaring characters using int and char.
I am well aware that characters are stored as integers, in the sense that each character corresponds to its ASCII decimal value.
That said, I learned that it's perfectly possible to declare a character using int without using the ASCII decimals. Eg. declaring variable test as a character 'X' can be written as:
char test = 'X';
and
int test = 'X';
And for both declarations, the conversion character for printing is %c (even though test is defined as int).
Therefore, my question is/are the difference(s) between declaring character variables using char and int and when to use int to declare a character variable?
The difference is the size in byte of the variable, and from there the different values the variable can hold.
A char is required to accept all values between 0 and 127 (inclusive). So in common environments it occupies exactly one byte (8 bits). It is implementation-defined whether plain char is signed (-128 to 127) or unsigned (0 to 255).
An int is required to be at least a 16-bit signed word and to accept all values between -32767 and 32767. That means that an int can accept all values from a char, whether the latter is signed or unsigned.
If you want to store only characters in a variable, you should declare it as char. Using an int would just waste memory, and could mislead a future reader. One common exception to that rule is when you want to process a wider value for special conditions. For example the function fgetc from the standard library is declared as returning int:
int fgetc(FILE *fd);
because the special value EOF (for End Of File) is defined as the int value -1 (all bits set to one on a two's-complement system), which requires more than the range of a char. That way no unsigned char value (only 8 bits on a common system) can compare equal to the EOF constant. If the function were declared to return a plain char, nothing could distinguish the EOF value from the (valid) char 0xFF.
That's the reason why the following code is bad and should never be used:
char c; // a terrible memory saving...
...
while ((c = fgetc(stdin)) != EOF) { // NEVER WRITE THAT!!!
    ...
}
Inside the loop, a char would be enough, but for the test not to succeed when reading character 0xFF, the variable needs to be an int.
The char type has multiple roles.
The first is that it is simply part of the chain of integer types, char, short, int, long, etc., so it's just another container for numbers.
The second is that its underlying storage is the smallest unit, and all other objects have a size that is a multiple of the size of char (sizeof returns a number that is in units of char, so sizeof char == 1).
The third is that it plays the role of a character in a string, certainly historically. When seen like this, the value of a char maps to a specified character, for instance via the ASCII encoding, but it can also be used with multi-byte encodings (one or more chars together map to one character).
Size of an int is 4 bytes on most architectures, while the size of a char is 1 byte.
Usually you should declare characters as char and use int for integers that need to hold bigger values. On most systems a char occupies one byte, which is 8 bits. Depending on your system, char might be signed or unsigned by default, so it will be able to hold values between 0 and 255 or between -128 and 127.
An int might be 32 bits long, but if you really want exactly 32 bits for your integer you should declare it as int32_t or uint32_t instead.
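A quick way to check these sizes on your own system (a throwaway sketch; only the char and int32_t sizes are guaranteed, the int size is platform-dependent):

```c
#include <stdio.h>
#include <stdint.h>

/* Print the storage size of each type on the current platform. */
static void print_sizes(void)
{
    printf("char:    %zu byte(s)\n", sizeof(char));     /* always exactly 1 */
    printf("int:     %zu byte(s)\n", sizeof(int));      /* often 4; only >= 2 is guaranteed */
    printf("int32_t: %zu byte(s)\n", sizeof(int32_t));  /* exactly 4 by definition */
}
```

On a typical desktop system this reports 1, 4, and 4.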
I think there's no difference, but you're allocating extra memory you're not going to use. You can also do const long a = 1;, but it will be more suitable to use const char a = 1; instead.
Hi, how can I modify the character size to be 2 bytes in C? The size of a character in C is only 1 byte.
You can't change any data type's size. You need to choose the data type according to the values you are going to store.
If you need to store more than one character, use an array of characters, as below:
char a[2];
In the above declaration, the variable a can hold two characters.
You can use unsigned short. You cannot modify the data types.
You can do it as follows:

typedef unsigned short newChar;

int main()
{
    newChar c = 'a';
}
No, it is not possible. You cannot modify the character size to 2 bytes, as the size of a character is fixed at 1 byte. You cannot modify the data types. You can use an array of characters to store more than one character, like:
char s[10];
On a side note:-
From here
Size qualifiers alter the size of the basic data types. There are two size qualifiers that can be applied to integers: short and long. The minimum size of short int is 16 bits. The size of int must be greater than or equal to that of a short int. The size of long int must be greater than or equal to that of a short int. The minimum size of a long int is 32 bits.
C commonly stores characters as ASCII, with byte values in the range 0-255: 0-127 is the general ASCII character set, and 128 onwards is the extended set. So C supports 1-byte characters only, whereas Java uses Unicode, which has a larger range, so character storage in Java can be greater than 1 byte.
I have a buffer structure with a field
char inc_factor;
which is the amount of bytes to increment in a character array. The problem is that it must be able to hold a value up to 255. Obviously the easiest solution is to change it to unsigned char, but I'm not able to change the supplied structure definition. The function:
Buffer * b_create(short init_capacity, char inc_factor, char o_mode)
Takes in those parameters and return a pointer to a buffer. I was wondering how I would be able to fit the number 255 in a signed char.
You can convert the type:
unsigned char n = inc_factor;
Signed-to-unsigned conversion is well defined and does what you want, since all three char types are required to have the same width.
You may need to be careful on the calling end (or when you store the char in your structure) and do something like f(n - (UCHAR_MAX + 1)) when n exceeds CHAR_MAX (since again, if this is negative and char is unsigned, all is well).
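The round trip can be sketched as follows (round_trip is a hypothetical name; the note about -1 assumes a common two's-complement platform):

```c
#include <limits.h>

/* Round-trip a value through a (possibly signed) char. Storing a value
 * above CHAR_MAX into a signed char is implementation-defined before
 * C23; on two's-complement systems 255 becomes -1. Converting back to
 * unsigned char is fully defined (value modulo 256) and recovers 255. */
static unsigned char round_trip(unsigned char wanted)
{
    char stored = (char)wanted;   /* may become negative */
    return (unsigned char)stored; /* well-defined conversion back */
}
```

This is exactly why the structure's plain char field can carry the value 255 without loss, as long as every reader converts it back to unsigned char.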
Let's use the term "byte" to represent 8 bits of storage in memory.
A byte with the value of "0xff" can be accessed either as an unsigned character or as a signed character.
unsigned char byte = 0xff;
unsigned char *uc = &byte;
signed char *sc = (signed char *)&byte; // same as plain "char" on platforms where char is signed by default
printf("uc = %u, sc = %d\n", *uc, *sc);
(I chose to use pointers because I want to demonstrate that the underlying value stored in memory is the same).
Will output
uc = 255, sc = -1
"signed" numbers use the same storage space (number of bits) as unsigned, but they use the upper-most bit as a flag to tell the cpu whether to treat them as negative or not.
The bit pattern that represents 255 (11111111) when unsigned is the same bit pattern that represents -1 when signed. The bit pattern 10000000 is either 128 or -128.
So you can store the number 255 in a signed char by storing -1 and then converting it to an unsigned char when you read it back.
EDIT:
In-case you're wondering: negative numbers start "at the top" (i.e. 0xff/255) for computational convenience. Remember that the underlying storage is a byte, so if you take "0xff" and add 1, just using normal, unsigned cpu math, it produces the value "0x00". Which is the correct value for "i + 1" when "i = -1". Of course, it would be equally odd if negative numbers started with "-1" having the value 0x80/128.
You COULD go through a pointer cast (note that casting the lvalue directly, as in (unsigned char)inc_factor = 250;, is not valid C):
*(unsigned char *)&inc_factor = 250;
And then you could read it back the same way:
if ( *(unsigned char *)&inc_factor == 250 ) {...}
However, that's really not best practices, It'll confuse anyone who has to maintain the code.
In addition, it's not going to help you if you're passing inc_factor into a function that expects a signed char.
There's no way to read that value as a signed char and get a value above 127.
I am using the following code to generate the MD5 hash for a string. The value printed in hex seems to be correct (i verified it on a website for the same string). However when I print the value as an integer it has 36 digits. My understanding is that it should have 16 digits because the hash generated is 128 bits long.
I want to know how the conversion from unsigned char to int should be done, and how the result can be stored in a variable so that it can ultimately be printed to a file.
It would be nice if someone could explain how the values are stored in the unsigned char array: how many bits it takes to represent one hex or decimal digit, and how to convert between them. I tried sscanf and strtol, but I guess I am not using them right.
int main(void)
{
    char *str = "tell";
    u_int8_t *output; // unsigned char
    output = malloc(16 * sizeof(char));
    int i = 0;

    MD5_CTX ctx;
    MD5Init(&ctx);
    MD5Update(&ctx, str, strlen(str));
    MD5Final(output, &ctx);

    while (i < 16)
        printf("%x", output[i++]);
    printf("\n");

    i = 0;
    while (i < 16)
        printf("%i", output[i++]);
    printf("\n");
}
The output here is
fe17ec3c451f132ef82a3a54e84a461e
254232366069311946248425884232747030
You are printing the decimal value of each byte (254, 23, 236, ...), not a single 128-bit value converted to an int. Your loop should look something like value <<= 8; value+=output[i++] (provided your value is a type which can accommodate such a big number).
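That accumulation loop can be sketched for the first 8 digest bytes using a 64-bit word, since standard C has no 128-bit integer type (digest_to_u64 is a hypothetical name):

```c
#include <stdint.h>
#include <stddef.h>

/* Pack the first 8 digest bytes into one 64-bit integer (big-endian).
 * A full 128-bit MD5 value would need two such words or a bignum. */
static uint64_t digest_to_u64(const unsigned char *digest)
{
    uint64_t value = 0;
    for (size_t i = 0; i < 8; i++) {
        value <<= 8;         /* shift the bytes read so far upward */
        value += digest[i];  /* append the next byte at the low end */
    }
    return value;
}
```

For the digest in the question, the first 8 bytes 0xfe 0x17 0xec 0x3c 0x45 0x1f 0x13 0x2e pack into the single value 0xfe17ec3c451f132e.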
Your expectation of how many digits the result should have is also off; unless you print leading zeros, its length could be anything from 1 to 39 decimal digits (the length of 2¹²⁸−1).
By the by, you should also zero-pad your hex output, otherwise any byte with a value below 16 will print a single hex digit (printf("%02x",...)).
You can't store a 128-bit value in an int (unless int is at least 128 bits on your C implementation, which I confidently predict it isn't). Same goes for long long, the biggest standard integer type.
The decimal value you've printed is "254" (decimal for 0xfe), followed by "23" (decimal for 0x17), and so on. It's basically meaningless -- if you represented either 0x010001 or 0x0A01 like this you'd get the same string, 101. You got 36 digits because that happens to be the total number of decimal digits in each of the 16 byte values.
The hex value you've printed is 32 characters long (4 bits per character, 32 characters, 128 bits). This is actually a bit of luck, that each byte in your digest happens to be at least 0x10. You should print with %02x to include a lead 0 for small values.
If you want to represent a 128-bit value as a decimal string then you need either a bignum library or else long division. But it's fairly pointless to express MD5 checksums in decimal: when people represent them as strings they always use hex.
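Zero-padded hex formatting of a raw digest, as suggested above, can be sketched as follows (digest_to_hex is a hypothetical name):

```c
#include <stdio.h>
#include <stddef.h>

/* Format a raw digest as a lowercase hex string. out must have room
 * for 2 * len + 1 bytes (two hex digits per byte plus the NUL). */
static void digest_to_hex(const unsigned char *digest, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++)
        sprintf(out + 2 * i, "%02x", digest[i]);  /* %02x keeps the leading zero */
}
```

Unlike the plain %x loop in the question, a byte such as 0x0a comes out as "0a" rather than "a", so the string length is always exactly 2 * len.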
It'll be nice of someone can explain how the values are being stored in the unsigned char
8 bits of the digest in each unsigned char, 16 unsigned chars in the array, makes 128 bits. You can't use sscanf or strtol because the value stored by MD5Final is not a string.