openssl/sha.h seems to compute wrong hash - c

I am trying to compute the SHA1 value of a given string in C. I am using the OpenSSL library via #include <openssl/sha.h>. The relevant part of the program is below.
but it shouldn't cause any issues.
void checkHash(char* tempString) {
    unsigned char testHash[SHA_DIGEST_LENGTH];
    unsigned char* sha1String = (unsigned char*)tempString;
    SHA1(sha1String, sizeof(sha1String), testHash);
    printf("String: %s\nActual hash: 86f7e437faa5a7fce15d1ddcb9eaeaea377667b8\nComputed hash: ", tempString);
    // I verified the actual hash for "a" using multiple online hash generators.
    for (i = 0; i < SHA_DIGEST_LENGTH; i++)
        printf("%x", testHash[i]);
    printf("\n");
}
Running the program with checkHash("a"); yields the following output:
String: a
Actual hash: 86f7e437faa5a7fce15d1ddcb9eaeaea377667b8
Computed hash: 16fac7d269b6674eda4d9cafee21bb486556527c
How come these hashes do not match? I am running in a 64-bit Linux VM on top of a 64-bit Windows 7 machine. That has caused some problems with poor hashing implementations for me in the past but I doubt that is the issue using the OpenSSL version.

sizeof(sha1String) is the same thing as sizeof(unsigned char*), i.e. the size of a data pointer (typically 8 bytes on a 64-bit system), not the length of the string. You want to pass the string's length there, so use strlen instead of sizeof; otherwise you won't be hashing what you think you're hashing.
If tempString isn't a null-terminated string but arbitrary data, you need to pass the length of the data in to checkHash, because in that case there is no way to tell the length from within the function.
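To make the difference concrete, here is a minimal sketch. The helper names length_via_sizeof and length_via_strlen are made up for illustration and are not part of OpenSSL; the snippet only compares the two lengths, and the closing comment shows where the corrected value would go in the question's SHA1 call.

```c
#include <stddef.h>
#include <string.h>

/* What the question's code passed as the length: the size of the pointer
 * itself, which does not depend on the text it points to. */
size_t length_via_sizeof(const char *s)
{
    return sizeof(s);
}

/* What should be passed for a null-terminated string: the number of bytes
 * before the terminating '\0'. */
size_t length_via_strlen(const char *s)
{
    return strlen(s);
}

/* With the question's variables, the corrected call would read:
 *   SHA1(sha1String, strlen(tempString), testHash);
 */
```

For "a", the first helper returns the pointer size while the second returns 1. Separately, printing each digest byte with "%02x" rather than "%x" keeps leading zeros, so the output always has 40 hex digits.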

Related

Hashing a timestamp into a sha256 checksum in c

Quick question for those more experienced in c...
I want to compute a SHA256 checksum using the functions from openssl for the current time an operation takes place. My code consists of the following:
time_t cur_time = 0;
char t_ID[40];
char obuf[40];
char * timeBuf = malloc(sizeof(char) * 40 + 1);
sprintf(timeBuf, "%s", asctime(gmtime(&cur_time)));
SHA256(timeBuf, strlen(timeBuf), obuf);
sprintf(t_ID, "%02x", obuf);
And yet, when I print out the value of t_ID in a debug statement, it looks like 'de54b910'. What am I missing here?
Edited to fix my typo around malloc and also to say I expected to see the digest form of a sha256 checksum, in hex.
Since obuf is an array, printing its value causes it to decay to a pointer and prints the value of the memory address that the array is stored at. Write sensible code to print a 256-bit value.
Maybe something like:
for (int i = 0; i < 32; ++i)
    printf("%02X", (unsigned char)obuf[i]); // cast so a signed char is not sign-extended
This is not really intended as an answer, I'm just sharing a code fragment with the OP.
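If the hex digits are needed in a string (as with t_ID in the question) rather than on stdout, the same loop can write into a buffer. The following is a sketch only; bytes_to_hex is a hypothetical helper name, not an OpenSSL function.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes 2*n lowercase hex digits plus a terminating NUL into out.
 * Returns 0 on success, -1 if out_size is too small. */
int bytes_to_hex(const unsigned char *bytes, size_t n, char *out, size_t out_size)
{
    if (out_size < 2 * n + 1)
        return -1;
    for (size_t i = 0; i < n; ++i)
        snprintf(out + 2 * i, 3, "%02x", (unsigned)bytes[i]);
    out[2 * n] = '\0'; /* also terminates the string when n == 0 */
    return 0;
}
```

A SHA-256 digest is 32 bytes, so the destination needs at least 65 chars; the question's t_ID[40] would be too small to hold it.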
To hash the binary time_t directly without converting the time to a string, you could use something like (untested):
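time_t cur_time;
char t_ID[65]; // 64 hex digits plus the terminating NUL
unsigned char obuf[40]; // only the first 32 bytes are used by SHA256
time(&cur_time); // time() stores the current time; gmtime() would not modify cur_time
SHA256((const unsigned char *)&cur_time, sizeof(cur_time), obuf);
// You know this doesn't work:
// sprintf(t_ID, "%02x", obuf);
// Instead see https://stackoverflow.com/questions/6357031/how-do-you-convert-buffer-byte-array-to-hex-string-in-c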
How do you convert buffer (byte array) to hex string in C?
This doesn't address byte order. You could use network byte order functions, see:
htons() function in socket programing
http://beej.us/guide/bgnet/output/html/multipage/htonsman.html
One complication: the size of time_t is not specified and can vary by platform. It was traditionally 32 bits, but on 64-bit machines it is usually 64 bits. It's also usually the number of seconds since the Unix epoch: midnight UTC, January 1, 1970.
If you're willing to live with the assumption that the resolution is seconds and don't have to worry about the code working in 20 years (see: https://en.wikipedia.org/wiki/Year_2038_problem) then you might use (untested):
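#include <netinet/in.h>
#include <stdint.h>
#include <time.h>
time_t cur_time;
uint32_t net_cur_time; // cur_time converted to network byte order
unsigned char obuf[40];
time(&cur_time); // time() stores the current time; gmtime() would not modify cur_time
net_cur_time = htonl((uint32_t)cur_time);
SHA256((const unsigned char *)&net_cur_time, sizeof(net_cur_time), obuf);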
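The byte-order conversion can also be done by hand, which avoids the socket headers. The sketch below is untested against any particular platform's time_t; time_to_be32 is a hypothetical name, and it keeps the same assumption as above that only the low 32 bits of the timestamp matter.

```c
#include <stdint.h>
#include <time.h>

/* Stores the low 32 bits of t in out[0..3], most significant byte first,
 * so the bytes fed to the hash are the same on every platform. */
void time_to_be32(time_t t, unsigned char out[4])
{
    uint32_t v = (uint32_t)t;
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)((v >> 16) & 0xff);
    out[2] = (unsigned char)((v >> 8) & 0xff);
    out[3] = (unsigned char)(v & 0xff);
}
```

The four bytes in out could then be passed to SHA256 with a length of 4.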
I'll repeat what I mentioned in a comment: it's hard to understand what you possibly hope to gain from this hash, or why you can't use the timestamp directly. Cryptographically secure hashes such as SHA256 go through a lot of work to ensure the hash is not reversible. You can't benefit from that because the input data comes from a limited, known set. At the very least, why not use CRC32 instead? It's much faster.
Good luck.

Received MAC address of available WiFi networks are not true

I am trying to find the MAC addresses of the Wi-Fi networks available in this area, but I receive wrong MAC addresses (there is at least one access point here whose MAC address I know, and it does not match what I receive).
My code is:
char MAC[64];
int len=sizeof(MAC)/sizeof(int);
int i;
for(i=1;i<len;i++){
MyScanResults = WFScanList(i);
//unsigned long long testMac =MyScanResults.bssid[i];
unsigned char* pTestMac = (unsigned char*)&MyScanResults.bssid[i];
sprintf(MAC, "%02x:%02x:%02x:%02x:%02x:%02x",
(unsigned)pTestMac[6],
(unsigned)pTestMac[5],
(unsigned)pTestMac[4],
(unsigned)pTestMac[3],
(unsigned)pTestMac[2],
(unsigned)pTestMac[1]
);
and my expected answer is:
bssid: 00:12:17:C6:F4:36
but each time I receive an address like this, and sometimes the address also changes:
MAC: 73:6D:65:36:F4:C6
I have changed also order of numbers but nothing...
is there anyone to tell me where is my problem?
thanks
Regards
Your code doesn't make a lot of sense.
You call MyScanResults = WFScanList(i); before even declaring i. Also, the looping and indexing from 1 is very suspect.
I also think the use of i is very strange throughout, the calculation of a pointer into MyScanResults.bssid, effectively slicing it, can't be right.
I think your loop should be something like:
for(i=0; i < WFNetworkFound; i++)
{
    const tWFNetwork myScanResult = WFScanList(i);
    // bssid holds the access point's MAC address; ssid is the network name
    sprintf(MAC, "%02x:%02x:%02x:%02x:%02x:%02x",
        myScanResult.bssid[0],
        myScanResult.bssid[1],
        myScanResult.bssid[2],
        myScanResult.bssid[3],
        myScanResult.bssid[4],
        myScanResult.bssid[5]);
}
This assumes you've run the scan already so that the global variable WFNetworkFound has been updated. It also assumes that you're using openPicus, so that this reference code from which I picked up a thing or two is valid.
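The formatting step on its own can be checked without the Wi-Fi hardware. Here is a small sketch, assuming the scan result exposes the address as six bytes in transmission order; format_mac is a hypothetical helper and not part of the openPicus API.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats six address bytes as "xx:xx:xx:xx:xx:xx" (17 characters + NUL).
 * Returns 0 on success, -1 if the buffer is too small or formatting fails. */
int format_mac(const unsigned char bssid[6], char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%02x:%02x:%02x:%02x:%02x:%02x",
                     (unsigned)bssid[0], (unsigned)bssid[1], (unsigned)bssid[2],
                     (unsigned)bssid[3], (unsigned)bssid[4], (unsigned)bssid[5]);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Note that the bytes are read at indices 0 through 5; the question's code reads indices 6 down to 1, which both reverses the order and steps one byte past the end of a six-byte address.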

Using snprintf to print an array? Alternatives?

Is it at all possible to use snprintf to print an array? I know that it can take multiple arguments, and it expects at least as many as your formatting string suggests, but if I just give it 1 formatting string and an array of values, will it print the entire array to the buffer?
The reason I ask, is because I am modifying source code, and the current implementation only supported one value being placed in a string, but I am modifying it to support an array of values. I want to change the original implementation as little as possible.
If this doesn't work, is there another way someone would recommend to do this? Should I just suck it up and use a for loop (how well would that really work without stringbuffers)?
Essentially: What would be the best way to get all of the values from an array of doubles into the same string for a return?
No, there's no formatting specifier for that.
Sure, use a loop. You can use snprintf() to print each double after the one before it, so you never need to copy the strings around:
double a[] = { 1, 2, 3 };
char outbuf[128], *put = outbuf;
for(size_t i = 0; i < sizeof a / sizeof *a; ++i)
{
    put += snprintf(put, sizeof outbuf - (put - outbuf), "%f ", a[i]);
}
The above is untested, but you get the general idea. It separates each number with a single space, and also emits a trailing space which might be annoying.
It does not do a lot to protect itself against buffer overflow: snprintf() returns the length it would have written, so after a truncation put would point past the end of outbuf. For code like this you can generally know the range of the inputs and make sure outbuf is big enough. For production code you would need to think about this, of course; the point here is to show how to solve the core problem.
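For completeness, here is one way the loop could check for truncation. This is a sketch rather than a drop-in replacement; join_doubles is a hypothetical name, and it also writes the separator before each value instead of after it, so there is no trailing space.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes the values into out, separated by single spaces.
 * Returns 0 on success, -1 if out is too small or formatting fails. */
int join_doubles(const double *values, size_t count, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < count; ++i) {
        int n = snprintf(out + used, out_size - used, "%s%f",
                         i == 0 ? "" : " ", values[i]);
        /* snprintf returns the length it wanted to write; a value that does
         * not fit in the remaining space means the output was truncated. */
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Because the remaining space is recomputed from used on every iteration and the function stops at the first truncation, the write position never moves past the end of the buffer.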
I decided to go with this:
int ptr = 0;
for( i = 0; i < size; i++)
{
    ptr += snprintf(outbuf + ptr, sizeof(outbuf) - ptr, "%.15f ", values[i]);
}
It is slightly different, but has the same effect as @unwind's solution. I got this idea from the reference page for snprintf().

How to write into a char array in C at specific location using sprintf?

I am trying to port some code written in MATLAB to C, so that I can compile the function and execute it faster (the code is executed very often and it would bring a significant speed increase).
So basically what my MATLAB code does is take a matrix and convert it to a string, adding brackets and commas, so I can write it to a text file. Here's an idea of how this would work for a vector MyVec:
MyVec = rand(1,5);
NbVal = length(MyVec)
VarValueAsText = blanks(2 + NbVal*30 + (NbVal-1));
VarValueAsText([1 end]) = '[]';
VarValueAsText(1 + 31*(1:NbVal-1)) = ',';
for i = 1:NbVal
VarValueAsText(1+(i-1)*31+(1:30)) = sprintf('%30.15f', MyVec(i));
end
Now, how can I achieve a similar result in C? It doesn't seem too difficult, since I can calculate in advance the size of my string (char array) and I know the position of each element that I need to write to my memory area. Also the sprintf function exists in C. However, I have trouble understanding how to set this up, also because I don't have an environment where I can learn easily by trial and error (for each attempt I have to recompile, which often leads to a segmentation fault and MATLAB crashing...).
I hope someone can help, even though the problem will probably seem trivial; I have very little experience with C and I haven't been able to find an appropriate example to start from...
Given an offset (in bytes) into a string, retrieving a pointer to this offset is done simply with:
char *ptr = &string[offset];
If you are iterating through the lines of your matrix to print them, your loop might look as follows:
char *ptr = output_buffer;
for (i = 0; i < n_lines; i++) {
    sprintf (ptr, "...", ...);
    ptr = &ptr[line_length];
}
Be sure that you have allocated enough memory for your output buffer though.
Remember that sprintf will put a string terminator at the end of whatever it prints. If you print into the middle of a longer string, that terminator overwrites the character after the printed text and cuts the destination string short at that point.
So if you just want to overwrite part of the string, you should probably use sprintf to a temporary buffer, and then use memcpy to copy that buffer into the actual string. Something like this:
char temp[32];
sprintf(temp, "...", ...);
memcpy(&destination[position], temp, strlen(temp));
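Putting the two answers together, the MATLAB snippet from the question could be ported roughly as follows. This is a sketch under the question's layout (30-character fields, commas between them, brackets at both ends); vector_to_text and FIELD_WIDTH are names invented here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FIELD_WIDTH 30

/* Returns a malloc'd string "[v0,v1,...]" with each value right-aligned in a
 * 30-character field, or NULL on failure. The caller frees the result. */
char *vector_to_text(const double *v, size_t n)
{
    size_t len;
    char *text;
    char field[FIELD_WIDTH + 1];

    if (n == 0)
        return NULL;
    len = 2 + n * FIELD_WIDTH + (n - 1); /* brackets + fields + commas */
    text = malloc(len + 1);
    if (text == NULL)
        return NULL;

    memset(text, ' ', len);
    text[0] = '[';
    text[len - 1] = ']';
    text[len] = '\0';

    for (size_t i = 0; i < n; ++i) {
        size_t pos = 1 + i * (FIELD_WIDTH + 1);
        /* Print into a temporary buffer so the terminating NUL never lands
         * inside the result. A value too large for the field makes snprintf
         * report more than FIELD_WIDTH characters. */
        int written = snprintf(field, sizeof field, "%*.15f", FIELD_WIDTH, v[i]);
        if (written != FIELD_WIDTH) {
            free(text);
            return NULL;
        }
        memcpy(text + pos, field, FIELD_WIDTH);
        if (i + 1 < n)
            text[pos + FIELD_WIDTH] = ',';
    }
    return text;
}
```

Checking the return value of snprintf matters here because "%30.15f" only sets a minimum width: a value such as 1e30 needs more than 30 characters and would otherwise shift every later field.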

Best way to convert whole file to lowercase in C

I was wondering if there's a really good (performant) way to convert a whole file to lowercase in C.
I use fgetc, convert the char to lowercase, and write it to a temp file with fputc. At the end I remove the original and rename the temp file to the original's name. But I think there must be a better solution.
This doesn't really answer the question (community wiki), but here's an (over?)-optimized function to convert text to lowercase:
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
int fast_lowercase(FILE *in, FILE *out)
{
    char buffer[65536];
    size_t readlen, wrotelen;
    char *p, *e;
    char conversion_table[256];
    int i;
    for (i = 0; i < 256; i++)
        conversion_table[i] = tolower(i);
    for (;;) {
        readlen = fread(buffer, 1, sizeof(buffer), in);
        if (readlen == 0) {
            if (ferror(in))
                return 1;
            assert(feof(in));
            return 0;
        }
        for (p = buffer, e = buffer + readlen; p < e; p++)
            *p = conversion_table[(unsigned char) *p];
        wrotelen = fwrite(buffer, 1, readlen, out);
        if (wrotelen != readlen)
            return 1;
    }
}
This isn't Unicode-aware, of course.
I benchmarked this on an Intel Core 2 T5500 (1.66GHz), using GCC 4.6.0 and i686 (32-bit) Linux. Some interesting observations:
It's about 75% as fast when buffer is allocated with malloc rather than on the stack.
It's about 65% as fast using a conditional rather than a conversion table.
I'd say you've hit the nail on the head. Temp file means that you don't delete the original until you're sure that you're done processing it which means upon error the original remains. I'd say that's the correct way of doing it.
As suggested by another answer (if file size permits) you can do a memory mapping of the file via the mmap function and have it readily available in memory (no real performance difference if the file is less than the size of a page as it's probably going to get read into memory once you do the first read anyway)
You can usually get a little bit faster on big inputs by using fread and fwrite to read and write big chunks of the input/output. Also you should probably convert a bigger chunk (whole file if possible) into memory and then write it all at once.
Edit: I just remembered one more thing. Sometimes programs can be faster if you select a prime number (or at the very least not a power of 2) as the buffer size. I seem to recall this has to do with specifics of the caching mechanism.
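The temp-file workflow from the question, combined with chunked fread/fwrite, might be sketched like this. The function name lowercase_file and the ".tmp" suffix are choices made for this example, and it assumes a POSIX-style rename() that replaces an existing destination file (on other systems the original may need to be removed first).

```c
#include <ctype.h>
#include <stdio.h>

/* Lowercases the file at path via a temporary file next to it.
 * Returns 0 on success, -1 on failure (the original is then left in place). */
int lowercase_file(const char *path)
{
    char tmp_path[4096];
    unsigned char buf[65536];
    size_t n;
    FILE *in, *out;
    int failed = 0;
    int len = snprintf(tmp_path, sizeof tmp_path, "%s.tmp", path);

    if (len < 0 || (size_t)len >= sizeof tmp_path)
        return -1;
    in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    out = fopen(tmp_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* Convert one chunk at a time instead of one character at a time. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        for (size_t i = 0; i < n; i++)
            buf[i] = (unsigned char)tolower(buf[i]);
        if (fwrite(buf, 1, n, out) != n) {
            failed = 1;
            break;
        }
    }
    if (ferror(in))
        failed = 1;
    fclose(in);
    if (fclose(out) != 0)
        failed = 1;

    /* Only replace the original once the copy is known to be complete. */
    if (failed || rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

Like the other answers, this uses tolower() and is therefore locale-dependent and not Unicode-aware.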
If you're processing big files (big as in, say, multi-megabytes) and this operation is absolutely speed-critical, then it might make sense to go beyond what you've inquired about. One thing to consider in particular is that a character-by-character operation will perform less well than using SIMD instructions.
For example, with SSE2 you could code a tolower_parallel like this (pseudocode):
for (cur_parallel_word = begin_of_block;
     cur_parallel_word < end_of_block;
     cur_parallel_word += parallel_word_width) {
    /*
     * in SSE2, parallel compares are either about 'greater' or 'equal'
     * so '>=' and '<=' have to be constructed. This would use 'PCMPGTB'.
     * The 'ALL' macro is supposed to replicate into all parallel bytes.
     */
    mask1 = parallel_compare_greater_than(*cur_parallel_word, ALL('A' - 1));
    mask2 = parallel_compare_greater_than(ALL('Z' + 1), *cur_parallel_word);
    /*
     * vector op - and all bytes in two vectors, 'PAND'
     */
    mask = mask1 & mask2;
    /*
     * vector op - add a vector of bytes. Would use 'PADDB'.
     */
    new = parallel_add(*cur_parallel_word, ALL('a' - 'A'));
    /*
     * vector op - zero bytes in the original vector that will be replaced
     */
    *cur_parallel_word &= ~mask; // that'd become 'PANDN'
    /*
     * vector op - extract characters from new that replace old, then or in.
     */
    *cur_parallel_word |= (new & mask); // PAND / POR
}
That is, you'd use parallel comparisons to check which bytes are uppercase, and then mask both the original value and the lowercased version (one with the mask, the other with its inverse) before you OR them together to form the result.
If you use mmap'ed file access, this could even be performed in-place, saving on the bounce buffer, and saving on many function and/or system calls.
There is a lot to optimize when your starting point is a character-by-character 'fgetc' / 'fputc' loop; even shell utilities are highly likely to perform better than that.
But I agree that if your need is very special-purpose (i.e. something as clear-cut as ASCII input to be converted to lowercase) then a handcrafted loop as above, using vector instruction sets (like SSE intrinsics/assembly, or ARM NEON, or PPC Altivec), is likely to make a significant speedup possible over existing general-purpose utilities.
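With compiler intrinsics, the idea might look like the sketch below. It is a variation rather than a line-by-line translation of the pseudocode: instead of blending two vectors, it adds 32 only in the lanes flagged as uppercase. The function name ascii_tolower_buffer is invented here, the code handles plain ASCII only, and it falls back to a scalar loop when the compiler does not define __SSE2__.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Lowercases ASCII 'A'..'Z' in place; every other byte is left unchanged. */
void ascii_tolower_buffer(unsigned char *buf, size_t len)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i below_a = _mm_set1_epi8('A' - 1);
    const __m128i above_z = _mm_set1_epi8('Z' + 1);
    const __m128i delta = _mm_set1_epi8('a' - 'A');

    for (; i + 16 <= len; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(buf + i));
        /* PCMPGTB compares signed bytes, so bytes >= 0x80 never match. */
        __m128i ge_a = _mm_cmpgt_epi8(v, below_a);
        __m128i le_z = _mm_cmpgt_epi8(above_z, v);
        __m128i is_upper = _mm_and_si128(ge_a, le_z);
        /* Add 32 only in the lanes that hold an uppercase letter. */
        v = _mm_add_epi8(v, _mm_and_si128(is_upper, delta));
        _mm_storeu_si128((__m128i *)(buf + i), v);
    }
#endif
    /* Remaining tail bytes, or the whole buffer when SSE2 is unavailable. */
    for (; i < len; ++i)
        if (buf[i] >= 'A' && buf[i] <= 'Z')
            buf[i] = (unsigned char)(buf[i] + ('a' - 'A'));
}
```

The unaligned load and store keep the sketch simple; whether this actually beats the table-driven loop would have to be measured on the target machine.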
Well, you can definitely speed this up a lot, if you know what the character encoding is. Since you're using Linux and C, I'm going to go out on a limb here and assume that you're using ASCII.
In ASCII, we know A-Z and a-z are each contiguous and always 32 apart, with the lowercase letters at the higher codes. So, what we can do is skip the safety checks and locale checks of the tolower() function and do something like this:
(pseudo code)
foreach (int) char c in the file:
c += 32.
Or, if there may be upper and lowercase letters, do a check like
if (c > 64 && c < 91) // the upper case ASCII range
then do the addition and write it out to the file.
Also, batch writes are faster, so I would suggest first writing to an array, then all at once writing the contents of the array to the file.
This should be considerably faster.
