Handling cryptic output when printing bytes array in C

I've recently encountered a problem with a script I wrote.
#include "md5.h"
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
char* hashfunc (char* word1, char* word2){
MD5_CTX md5;
MD5_Init(&md5);
char * word = (char *) malloc(strlen(word1)+ strlen(word2) );
strcpy(word,word1);
strcat(word,word2);
MD5_Update(&md5,word,strlen(word));
unsigned char* digest = malloc(sizeof(char)* 16); //MD5 generates 128bit hashes, each char has 4 bit, 128/4 = 32
MD5_Final(digest,&md5);
return digest;
}
int main(int argc, char* argv[]){
char* a = argv[1];
char* b = argv[2];
unsigned char* ret = hashfunc(a,b);
printf("%s\n",ret);
printf("%i\n",sizeof(ret));
}
As the hash function returns an array of unsigned chars I thought I'd print that as is.
Unfortunately, the following is my output:
��.�#a��%Ćw�0��
which, according to sizeof() is 8 bytes long.
How do I convert that to a readable format?
Thanks in advance!
PS:
Output should look like this:
1c0143c6ac7505c8866d10270a480dec

First, sizeof applied to a pointer gives the size of the pointer itself, not of the data it points to. That is 8 bytes on a 64-bit machine, which matches your output. A pointer carries no information about the size of the buffer it points to; you have to track that separately.
Since you know that an MD5 digest is 16 bytes long, you can iterate over the bytes and print each one as two hex digits using printf. Something like this:
for (int i = 0; i < 16; i++)
printf("%02x", (int)(unsigned char)digest[i]);
putchar('\n');
If you want to print it to a file, change printf to fprintf and putchar to fputc (the arguments change a bit however).
To put it into a string, use sprintf to write each byte at the correct position in the string (32 hex digits plus the terminating null byte, hence 33 chars), something like this:
char* str = malloc(33 * sizeof(char));
for (int i = 0; i < 16; i++)
sprintf(str+2*i, "%02x", (int)(unsigned char)digest[i]);
P.S: don’t forget to free everything after.
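The string-building loop above can also be wrapped in a small helper that writes into a buffer supplied by the caller, so there is nothing to free afterwards. This is only a sketch: the name digest_to_hex and the MD5_DIGEST_LEN constant are made up for illustration and are not part of any MD5 library.

```c
#include <stddef.h>
#include <stdio.h>

#define MD5_DIGEST_LEN 16  /* hypothetical name: an MD5 digest is 16 bytes */

/* Writes two hex digits per digest byte plus a terminating '\0' into out.
 * Returns 0 on success, -1 if out is too small. */
int digest_to_hex(const unsigned char *digest, char *out, size_t out_size)
{
    if (out_size < 2 * MD5_DIGEST_LEN + 1)
        return -1;
    for (size_t i = 0; i < MD5_DIGEST_LEN; i++)
        snprintf(out + 2 * i, 3, "%02x", (unsigned)digest[i]);
    return 0;
}
```

In main you would then declare char hex[33]; and call digest_to_hex(ret, hex, sizeof hex) before printing hex with "%s".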

There is no guarantee that your hashfunc is going to produce printable ASCII strings. In theory since they are really just binary data they could have embedded 0s which will screw up all the normal string handling functions anyway.
Your best bet is to print each unsigned char as a two-digit hexadecimal value in a for loop.
void printhash(unsigned char* hash)
{
for(int i = 0; i < 16; i++)
{
printf("%02x", hash[i]);
}
printf("\n");
}

Related

Caesar Encryption in C

Hello, I am working on a Caesar encryption program. Right now it takes as input the file with the message to encrypt and the key.
The input is currently in this format:
"text.txt", "ccc"
I need to convert this into taking a number so that it fits my requirements, so something like this:
"text.txt", "3"
Then I need to convert this "3" back into "ccc" so that the program still works. The logic is that 3 translates to the third letter of the alphabet, "c", which is repeated 3 times. Another example: if the key entered is "2", it should return "bb".
This is what I have so far, but it's giving me a lot of warnings and the function does not work correctly.
#include <stdio.h>
void number_to_alphabet_string(int n) {
char buffer[n];
char *str;
str = malloc(256);
char arr[8];
for(int i = 0; i < n; i++) {
buffer[i] = n + 64;
//check ASCII table the difference is fixed to 64
arr[i] = buffer[i];
strcat(str, arr);
}
printf(str);
}
int main(int argc, char *argv[]) {
const char *pt_path = argv[1]; //text.txt
char *key = argv[2]; //3
number_to_alphabet_string((int)key); //should change '3' to 'CCC'
}
Your problem is that you have a function
void number_to_alphabet_string(int n)
that takes an int but you call it with a char*
char* key = argv[2]; //3
number_to_alphabet_string(key);
My compiler says
1>C:\work\ConsoleApplication3\ConsoleApplication3.cpp(47,34): warning C4047: 'function': 'int' differs in levels of indirection from 'char *'
You need
char* key = argv[2]; //3
number_to_alphabet_string(atoi(key));
to convert that string to a number
With char *key = argv[2];, the cast (int) key does not reinterpret the contents of that string as a valid integer. What that cast does is take the pointer value of key and interpret it as an integer. The result of this is implementation-defined, or undefined if the result cannot be represented in the integer type (a likely outcome if sizeof (int) < sizeof (char *)).
The C standard does not define any meaning for these values.
Here is a test program that, depending on your platform, should give you an idea of what is happening (or failing to happen)
#include <stdio.h>
int main(int argc, char **argv) {
if (sizeof (long long) >= sizeof (char *))
printf("Address %p as an integer: %lld (%llx)\n",
(void *) argv[0],
(long long) argv[0],
(unsigned long long) argv[0]); /* %llx expects an unsigned argument */
}
As an example of implementation-defined behaviour, on my system this prints something like
Address 0x7ffee6ffdb70 as an integer: 140732773948272 (7ffee6ffdb70)
On my system, casting that same pointer value to (int) results in undefined behaviour.
Note that intptr_t and uintptr_t are the proper types for treating a pointer value as an integer, but these types are optional.
To actually convert a string to an integer, you can use functions such as atoi, strtol, or sscanf. Each of these has its pros and cons, and different ways of handling and reporting bad input.
Examples without error handling:
int three = atoi("3");
long four = strtol("4", NULL, 10);
long long five;
sscanf("5", "%lld", &five);
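The examples above skip error handling. As a rough sketch of what checked parsing with strtol can look like, the helper below rejects empty input, trailing junk, and values that do not fit in an int. The name parse_int and its return convention are my own choice, not anything from the standard library.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parses a base-10 int from s. Returns 1 and stores the value in *out
 * on success; returns 0 for empty input, trailing junk, or overflow. */
int parse_int(const char *s, int *out)
{
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0')   /* no digits, or leftover characters */
        return 0;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;
    *out = (int)v;
    return 1;
}
```

Unlike atoi, this lets the caller tell a valid "0" apart from input such as "abc".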
number_to_alphabet_string has a few problems.
malloc can fail, returning NULL. You should be prepared to handle this event.
In the event malloc succeeds, the contents of its memory are indeterminate. This means that you need to initialize (at least partially) the memory before passing it to a function like strcat, which expects a proper null terminated string. As is, strcat(str, arr); will result in undefined behaviour.
Additionally, memory allocated by malloc should be deallocated with free when you are done using it, otherwise you will create memory leaks.
char *foo = malloc(32);
if (foo) {
foo[0] = '\0';
strcat(foo, "bar");
puts(foo);
free(foo);
}
In general, strcat and the additional buffers are unnecessary. The use of char arr[8]; in particular is unsafe, as arr[i] = buffer[i]; can easily access the array out-of-bounds if n is large enough.
Additionally, in strcat(str, arr);, arr is also never null terminated (more UB).
Note also that printf(str); is generally unsafe. If str contains format specifiers, you will again invoke undefined behaviour when the matching arguments are not provided. Use printf("%s", str), or perhaps puts(str).
As far as I can tell, you simply want to translate your integer value n into the uppercase character it would be associated with if A=1, B=2, ... and repeat it n times.
To start, there is no need for buffers of any kind.
void number_to_alphabet_string(int n) {
if (1 > n || n > 26)
return;
for (int i = 0; i < n; i++)
putchar('A' + n - 1);
putchar('\n');
}
When passed 5, this will print EEEEE.
If you want to create a string, ensure there is an additional byte for the terminating character, and that it is set. calloc can be used to zero out the buffer during allocation, effectively null terminating it.
void number_to_alphabet_string(int n) {
if (1 > n || n > 26)
return;
char *str = calloc(n + 1, 1);
if (str) {
for (int i = 0; i < n; i++)
str[i] = 'A' + n - 1;
puts(str);
free(str);
}
}
Note that dynamic memory is not actually needed. char str[27] = { 0 }; would suffice as a buffer for the duration of the function.
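If the caller needs the string rather than just the printed output, one option is to fill a buffer that the caller owns. The following is a sketch under that assumption; fill_letters is a hypothetical name, and like the versions above it relies on the letters A to Z being contiguous in the character set (true for ASCII).

```c
#include <stddef.h>

/* Fills out with the n-th uppercase letter repeated n times (A=1 ... Z=26).
 * Returns 0 on success, -1 if n is out of range or out is too small. */
int fill_letters(int n, char *out, size_t out_size)
{
    if (n < 1 || n > 26 || out_size < (size_t)n + 1)
        return -1;
    for (int i = 0; i < n; i++)
        out[i] = (char)('A' + n - 1);   /* assumes contiguous letters, as in ASCII */
    out[n] = '\0';
    return 0;
}
```

A caller would declare char key[27]; and pass it with sizeof key; 27 bytes cover the longest case (26 letters plus the terminator).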
A cursory main for either of these:
#include <stdio.h>
#include <stdlib.h>
void number_to_alphabet_string(int n);
int main(int argc, char *argv[]) {
if (argc > 1)
number_to_alphabet_string(atoi(argv[1]));
}
Note that with an invalid string, atoi simply returns 0, which is indistinguishable from a valid "0" - a sometimes unfavourable behaviour.
You can't use a cast to cast from a char array to an int, you have to use functions, such as atoi().
You never free your str after you allocate it. You should use free(str) when you no longer need it. Otherwise, it will cause a memory leak, which means the memory that you malloc() will always be occupied until your process dies. C doesn't have garbage collection, so you have to do it yourself.
Avoid declarations such as char buffer[n];. GCC compiles them, but MSVC does not.
Variable-length arrays were added in C99 and made optional in C11, so they are not portable. Use
char* buffer = malloc(n);
//don't forget to free() in order to avoid a memory leak
free(buffer);

Copying strings from extern char environ in C

I have a question pertaining to the extern char **environ. I'm trying to make a C program that counts the size of the environ list, copies it to an array of strings (array of array of chars), and then sorts it alphabetically with a bubble sort. It will print in name=value or value=name order depending on the format value.
I tried using strncpy to get the strings from environ to my new array, but the string values come out empty. I suspect I'm trying to use environ in a way I can't, so I'm looking for help. I've tried to look online for help, but this particular program is very limited. I cannot use system(), yet the only help I've found online tells me to make a program to make this system call. (This does not help).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
extern char **environ;
int main(int argc, char *argv[])
{
char **env = environ;
int i = 0;
int j = 0;
printf("Hello world!\n");
int listSZ = 0;
char temp[1024];
while(env[listSZ])
{
listSZ++;
}
printf("DEBUG: LIST SIZE = %d\n", listSZ);
char **list = malloc(listSZ * sizeof(char**));
char **sorted = malloc(listSZ * sizeof(char**));
for(i = 0; i < listSZ; i++)
{
list[i] = malloc(sizeof(env[i]) * sizeof(char)); // set the 2D Array strings to size 80, for good measure
sorted[i] = malloc(sizeof(env[i]) * sizeof(char));
}
while(env[i])
{
strncpy(list[i], env[i], sizeof(env[i]));
i++;
} // copy is empty???
for(i = 0; i < listSZ - 1; i++)
{
for(j = 0; j < sizeof(list[i]); j++)
{
if(list[i][j] > list[i+1][j])
{
strcpy(temp, list[i]);
strcpy(list[i], list[i+1]);
strcpy(list[i+1], temp);
j = sizeof(list[i]); // end loop, we resolved this specific entry
}
// else continue
}
}
This is my code, help is greatly appreciated. Why is this such a hard to find topic? Is it the lack of necessity?
EDIT: Pasted wrong code, this was a separate .c file on the same topic, but I started fresh on another file.
On Unix-like systems, the environment is also passed as a third parameter to main.
Try this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char *argv[], char **envp)
{
while (*envp) {
printf("%s\n", *envp);
envp++;
}
}
There are multiple problems with your code, including:
Allocating the 'wrong' size for list and sorted: you multiply by sizeof(char **), but should be multiplying by sizeof(char *) because you're allocating an array of char *. This bug won't actually hurt you this time, since the two pointer types have the same size on virtually every platform. Using sizeof(*list) avoids the problem.
Allocating the wrong size for the elements in list and sorted. You need to use strlen(env[i]) + 1 for the size, remembering to allow for the null that terminates the string.
You don't check the memory allocations.
Your string copying loop uses strncpy() and shouldn't (in fact, you should seldom use strncpy()). As written, the loop does not even run, because i still equals listSZ after the allocation loop and env[listSZ] is NULL; that is why the copy is empty. Once i is reset to 0, it would copy only 4 or 8 bytes of each environment variable (depending on whether you're on a 32-bit or 64-bit system), and it would not ensure that the copies are null-terminated strings (just one of the many reasons for not using strncpy()).
Your outer loop of your 'sorting' code is OK; your inner loop is 100% bogus because you should be using the length of one or the other string, not the size of the pointer, and your comparisons are on single characters, but you're then using strcpy() where you simply need to move pointers around.
You allocate but don't use sorted.
You don't print the sorted environment to demonstrate that it is sorted.
Your code is missing the final }.
Here is some simple code that uses the standard C library qsort() function to do the sorting, and simulates POSIX strdup() under the name dup_str(); you could use strdup() directly if you have POSIX available to you.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
extern char **environ;
/* Can also be spelled strdup() and provided by the system */
static char *dup_str(const char *str)
{
size_t len = strlen(str) + 1;
char *dup = malloc(len);
if (dup != NULL)
memmove(dup, str, len);
return dup;
}
static int cmp_str(const void *v1, const void *v2)
{
const char *s1 = *(const char **)v1;
const char *s2 = *(const char **)v2;
return strcmp(s1, s2);
}
int main(void)
{
char **env = environ;
int listSZ;
for (listSZ = 0; env[listSZ] != NULL; listSZ++)
;
printf("DEBUG: Number of environment variables = %d\n", listSZ);
char **list = malloc(listSZ * sizeof(*list));
if (list == NULL)
{
fprintf(stderr, "Memory allocation failed!\n");
exit(EXIT_FAILURE);
}
for (int i = 0; i < listSZ; i++)
{
if ((list[i] = dup_str(env[i])) == NULL)
{
fprintf(stderr, "Memory allocation failed!\n");
exit(EXIT_FAILURE);
}
}
qsort(list, listSZ, sizeof(list[0]), cmp_str);
for (int i = 0; i < listSZ; i++)
printf("%2d: %s\n", i, list[i]);
return 0;
}
Other people pointed out that you can get at the environment via a third argument to main(), using the prototype int main(int argc, char **argv, char **envp). Note that Microsoft explicitly supports this. They're correct, but you can also get at the environment via environ, even in functions other than main(). The variable environ is unique amongst the global variables defined by POSIX in not being declared in any header file, so you must write the declaration yourself.
Note that the memory allocation is error checked and the error reported on standard error, not standard output.
Clearly, if you like writing and debugging sort algorithms, you can avoid using qsort(). Note that string comparisons need to be done using strcmp(), but you can't use strcmp() directly with qsort() when you're sorting an array of pointers because the argument types are wrong.
Part of the output for me was:
DEBUG: Number of environment variables = 51
0: Apple_PubSub_Socket_Render=/private/tmp/com.apple.launchd.tQHOVHUgys/Render
1: BASH_ENV=/Users/jleffler/.bashrc
2: CDPATH=:/Users/jleffler:/Users/jleffler/src:/Users/jleffler/src/perl:/Users/jleffler/src/sqltools:/Users/jleffler/lib:/Users/jleffler/doc:/Users/jleffler/work:/Users/jleffler/soq/src
3: CLICOLOR=1
4: DBDATE=Y4MD-
…
47: VISUAL=vim
48: XPC_FLAGS=0x0
49: XPC_SERVICE_NAME=0
50: _=./pe17
If you want to sort the values instead of the names, you have to do some harder work. You'd need to define what output you wish to see. There are multiple ways of handling that sort.
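As one possible reading of "sort by value", the comparator below orders entries by the text after the first '=' and falls back to the whole entry when two values are equal. The names value_of and cmp_by_value are hypothetical, and treating an entry without '=' as having an empty value is my own assumption.

```c
#include <string.h>

/* Returns the text after the first '=', or "" if there is none. */
const char *value_of(const char *entry)
{
    const char *eq = strchr(entry, '=');
    return eq ? eq + 1 : "";
}

/* qsort() comparator for an array of "name=value" string pointers:
 * orders by value first, then by the whole entry to break ties. */
int cmp_by_value(const void *v1, const void *v2)
{
    const char *s1 = *(const char *const *)v1;
    const char *s2 = *(const char *const *)v2;
    int c = strcmp(value_of(s1), value_of(s2));
    return c != 0 ? c : strcmp(s1, s2);
}
```

In the program above, it would be passed to qsort() in place of cmp_str.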
To get the environment variables, you need to declare main like this:
int main(int argc, char **argv, char **env);
The third parameter is the NULL-terminated list of environment variables. See:
#include <stdio.h>
int main(int argc, char **argv, char **env)
{
for(size_t i = 0; env[i]; ++i)
puts(env[i]);
return 0;
}
The output of this is:
LD_LIBRARY_PATH=/home/shaoran/opt/node-v6.9.4-linux-x64/lib:
LS_COLORS=rs=0:di=01;34:ln=01;36:m
...
Note also that sizeof(environ[i]) in your code does not get you the length of
the string, it gets you the size of a pointer, so
strncpy(list[i], environ[i], sizeof(environ[i]));
is wrong. Also the whole point of strncpy is to limit based on the destination,
not on the source, otherwise if the source is larger than the destination, you
will still overflow the buffer. The correct call would be
strncpy(list[i], environ[i], 80);
list[i][79] = 0;
Bear in mind that strncpy might not write the '\0'-terminating byte if the
destination is not large enough, so you have to make sure to terminate the
string. Also note that 79 characters might be too short for storing env variables. For example, my LS_COLORS variable
is huge, at least 1500 characters long. You might want to base your list[i] = malloc calls on strlen(environ[i])+1.
Another thing: your swapping
strcpy(temp, list[i]);
strcpy(list[i], list[i+1]);
strcpy(list[i+1], temp);
j = sizeof(list[i]);
works only if all list[i] point to memory of the same size. Since the list[i] are pointers, the cheaper way of swapping would be by
swapping the pointers instead:
char *tmp = list[i];
list[i] = list[i+1];
list[i+1] = tmp;
This is more efficient: it is an O(1) operation and you don't have to worry about whether the
memory blocks are the same size.
What I don't get is what you intend with j = sizeof(list[i]). Not only
does sizeof(list[i]) return the size of a pointer (which is constant
for all list[i]), but why are you changing the loop variable j inside the
block? If you want to leave the loop, use break. And what you are looking for is
strlen(list[i]): this gives you the length of the string.
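Putting these points together, a bubble sort over the list could look roughly like the sketch below. It compares whole strings with strcmp and swaps only the pointers. The function name bubble_sort_strings is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Bubble sort on an array of n string pointers; only pointers are swapped. */
void bubble_sort_strings(char **list, size_t n)
{
    for (size_t pass = 0; pass + 1 < n; pass++) {
        int swapped = 0;
        for (size_t i = 0; i + 1 < n - pass; i++) {
            if (strcmp(list[i], list[i + 1]) > 0) {
                char *tmp = list[i];
                list[i] = list[i + 1];
                list[i + 1] = tmp;
                swapped = 1;
            }
        }
        if (!swapped)
            break;  /* no swaps in this pass: the list is sorted */
    }
}
```

Because only pointers move, the individual strings may have any length and no temp buffer is needed.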

converting character pointer to string pointer, removing duplication

I'm writing a program that takes in a string as input e.g. 35x40x12. I want to then store the numbers as separate elements using an int pointer. So far I've managed to do this so that single digit numbers work, i.e. 3x4x6 works, however if I put in two digit numbers such as 35x40x12, the 35 will be stored in the first position, however in the second position it will also store the 5 from 35, it does this for positions 3 and 4 with regard to 40 as well. How do I remove this duplication?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int present(int l, int w, int h);
int *stringC (char *z);
int main(int argc, char *argv[])
{
char *d = "53x23x4";//input
printf("%d", *(stringC(d)+2));//whatever's stored in pointer position
return 0;
}
int *stringC (char *z)
{
int i;
int *k = malloc(sizeof(int)*20);
int j = 0;
for(i=0; z[i] !='\0';i++)
{
if( z[i]!= 'x')
{
k[j]=atoi(&z[i]);
j++;}
}
return k;
}
As others have suggested, learn to debug. It's going to be worth it!
Have a look at strtok. From man strtok:
The strtok() function parses a string into a sequence of tokens.
These tokens are divided by delimiters like "x". So, in order to parse the numbers, use something like this:
char d[] = "53x23x4";
int array[3];
char* it = strtok(d, "x");
for (size_t i = 0; i < sizeof(array) / sizeof(*array) && it; ++i, it = strtok(NULL, "x"))
array[i] = atoi(it);
Note that d is an automatic, writable array. strtok modifies the string it parses, and since modifying a string literal is undefined behavior, the string must live in writable memory.
If the number of values is not known in advance, replace array with dynamically allocated memory. Either way, this spares you the manual character-by-character parsing you're doing now.
Notes:
Stop using char* to point to string literals. Use const char* instead. This prevents subtle errors where you try to modify string literals (undefined behavior).
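For context, the duplication in the question comes from calling atoi(&z[i]) at every character that is not 'x': for "35x40x12" the second call starts at "5x40x12" and yields 5. An alternative to strtok that leaves the input untouched (so it also works on a const string) is to let strtol report where each number ends. This is a sketch only; parse_dims is a hypothetical name and overflow checking is omitted for brevity.

```c
#include <stddef.h>
#include <stdlib.h>

/* Parses numbers separated by 'x' (e.g. "35x40x12") into out.
 * Returns how many numbers were stored (at most max). */
size_t parse_dims(const char *s, int *out, size_t max)
{
    size_t count = 0;
    while (count < max) {
        char *end;
        long v = strtol(s, &end, 10);
        if (end == s)          /* no digits at this position */
            break;
        out[count++] = (int)v;
        if (*end != 'x')       /* end of string or unexpected character */
            break;
        s = end + 1;           /* skip the separator */
    }
    return count;
}
```

The return value tells the caller how many elements of out are valid, which avoids reading uninitialized entries.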

Handling BSTRs on MacOSX in C

I've written some code in C for converting strings passed from VBA, when the C code is called from VBA from a MacOSX dylib. I got some good hints here, and since I only care about ASCII strings I've written the following functions to convert the BSTR to a simple char*:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include "myheader.h"
size_t vbstrlen(BSTR *vbstr)
{
size_t len = 0U;
while(*(vbstr++)) ++len;
len = len*2;
return len;
}
void vbstochr(BSTR *vbstr, char** out)
{
int len = vbstrlen(vbstr);
char str[len+1];
int i;
for(i = 0; i < len; i++)
{
str[i] = (char) (((uint16_t*) vbstr)[i]);
}
str[i] = '\0';
asprintf(out, str);
}
int test(BSTR *arg1)
{
char* convarg;
vbstochr(arg1, &convarg);
return 1;
}
The myheader.h looks like this:
typedef uint16_t OLECHAR;
typedef OLECHAR * BSTR;
I used uint16_t because of the 4 byte (not 2 byte) wchar_t in the MacOSX C compiler. I added a breakpoint after vbstochr is called to look at the content of convarg, and it seems to work when called from Excel.
So this works, but one thing I don't understand is why I have to multiply my len in the vbstrlen function by 2. I'm new to C, so I had to read up on pointers a little bit - and I thought since my BSTR contains 2 byte characters, I should get the right string length without having to multiply by two? It would be great if someone could explain this to me, or post a link to a tutorial?
Also, my functions with string arguments work when called in VBA, but only after the first call. So when I call a function with a BSTR* argument from a dylib for the first time (after I start the application, Excel in this case), the BSTR* pointer just points at some (random?) address, but not the string. When I call the function from VBA a second time, everything works just fine - any ideas why this is the case?!
A BSTR has an embedded length, you do not need to calculate the length manually.
As for the need to multiply the length by 2, that is because a BSTR uses 2-byte characters, but char is only 1 byte. You coded your vbstrlen() function to return the number of bytes in the BSTR, not the number of characters.
Since you are only interested in ASCII strings, you can simplify the code to the following:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include "myheader.h"
size_t vbstrlen(BSTR *vbstr)
{
if (vbstr)
return *(((uint32_t*)vbstr)-1);
return 0;
}
void vbstochr(BSTR *vbstr, char** out)
{
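/* assumes the length prefix is a byte count, as in the Windows BSTR layout */
size_t len = vbstrlen(vbstr) / sizeof(OLECHAR);
char str[len+1]; /* a variable-length array cannot have an initializer */
for(size_t i = 0; i < len; ++i)
str[i] = (char) ((uint16_t*)vbstr)[i]; /* index the 2-byte characters, as in the question */
str[len] = '\0';
asprintf(out, "%s", str); /* never pass data as the format string */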
}
The chances are that the VB string is a UTF-16 string that uses 2 bytes per character (except for characters beyond the BMP, Basic Multilingual Plane, or U+0000..U+FFFF, which are encoded as surrogate pairs). So, for your 'ASCII' data, you will have alternating ASCII characters and zero bytes. The 'multiply by 2' is because UTF-16 uses two bytes to store each counted character.
This is almost definitive when we see:
typedef uint16_t OLECHAR;
typedef OLECHAR * BSTR;
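To make the bytes-versus-characters distinction concrete: the ASCII text "Hi" is two 16-bit code units, so it occupies four bytes. A conversion that works in code units rather than bytes might look like the sketch below. The name utf16_to_ascii is hypothetical, and replacing anything outside 7-bit ASCII with '?' is my own choice for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies up to nchars UTF-16 code units into dst as single bytes.
 * Code units outside 7-bit ASCII become '?'. Always null-terminates
 * when dst_size > 0. Returns the number of characters written. */
size_t utf16_to_ascii(const uint16_t *src, size_t nchars,
                      char *dst, size_t dst_size)
{
    if (dst_size == 0)
        return 0;
    size_t n = nchars < dst_size - 1 ? nchars : dst_size - 1;
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] < 0x80 ? (char)src[i] : '?';
    dst[n] = '\0';
    return n;
}
```

Here nchars is a character count, so a byte length would have to be divided by 2 before being passed in.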

building a string out of a variable amount of arguments

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
int main(int argc, char * argv[])
{
char *arr[] = { "ab", "cd", "ef" };
char **ptr, **p, *str;
int num = 3;
int size = 0;
ptr = calloc(num, 4);
p = ptr;
for (; num > 0; num--)
size += strlen(*(p++) = arr[num - 1]);
str = calloc(1, ++size);
sprintf(str, "%s%s%s", ptr[0], ptr[1], ptr[2]);
printf("%s\n", str);
return 0;
}
output: "efcdab" as expected.
now, this is all fine and suitable if the argument count to sprintf is predetermined and known. what i'm trying to achieve, however, is an elegant way of building a string if the argument count is variable (ptr[any]).
first problem: 2nd argument that is required to be passed to sprintf is const char *format.
second: the 3rd argument is the actual amount of passed on arguments in order to build the string based on the provided format.
how can i achieve something of the following:
sprintf(str, "...", ...)
basically, what if the function receives 4 (or more) char pointers out of which i want to build a whole string (currently, within the code provided above, there's only 3). that would mean, that the 2nd argument must be (at least) in the form of "%s%s%s%s", followed by an argument list of ptr[0], ptr[1], ptr[2], ptr[3].
how can make such a 'combined' call, to sprintf (or vsprintf), in the first place? things would be easier, if i could just provide a whole pointer array (**ptr) as the 3rd argument, instead.. but that does not seem to be feasible? at least, not in a way that sprintf would understand it, so it seems.. as it would need some special form of format.
ideas / suggestions?
karlphillip's suggestion of strcat does seem to be the solution here. Or rather, you'd more likely want to use something like strncat (though if you're working with a C library that supports it, I'd recommend strlcat, which, in my opinion, is much better than strncat).
So, rather than sprintf(str, "%s%s%s", ptr[0], ptr[1], ptr[2]);, you could do something like this:
int i;
for (i = 0; i < any; i++)
strncat(str, arr[i], size - strlen(str) - 1);
(Or strlcat(str, arr[i], size);; the nice thing about strlcat is that its return value will indicate how many bytes are needed for reallocation if the destination buffer is too small, but it's not a standard C function and a lot of systems don't support it.)
There's no other way to do this in C without manipulating buffers.
You could, however, switch to C++ and use the fabulous std::string to make your life easier.
Your first problem is handled by: const char * is for the function, not you. Put together your own string -- that signature just means that the function won't change it.
Your second problem is handled by: pass in your own va_list. How do you get it? Make your own varargs function:
char *assemble_strings(int count, ...)
{
va_list data_list;
va_list len_list;
int size;
char *arg;
char *formatstr;
char *str;
int i;
va_start(len_list, count);
for (i = 0, size = 0; i < count; i++)
{
arg = va_arg(len_list, char *);
size += strlen(arg);
}
va_end(len_list);
formatstr = malloc(2*count + 1);
formatstr[2*count] = 0;
for (i = 0; i < count; i++)
{
formatstr[2*i] = '%';
formatstr[2*i+1] = 's';
}
str = malloc(size + 1);
va_start(data_list, count);
vsprintf(str, formatstr, data_list);
va_end(data_list);
free(formatstr);
return(str);
}
You need some way to tell the function where the variable arguments end; here the count parameter does that. It also serves as the named parameter that standard C requires before the ellipsis, so the entire string list can stay in the variable arguments and be passed straight to vsprintf.
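Another common way to mark the end of a variable argument list is a NULL sentinel instead of a count. The sketch below takes that approach and copies the pieces directly rather than building a format string. concat_all is a hypothetical name, and the caller must pass (char *)NULL as the last argument.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenates all arguments up to a terminating NULL pointer.
 * Returns a malloc'd string (caller frees) or NULL on allocation failure. */
char *concat_all(const char *first, ...)
{
    va_list ap;
    size_t total = 0;

    /* First pass: add up the lengths. */
    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);

    char *result = malloc(total + 1);
    if (result == NULL)
        return NULL;

    /* Second pass: copy each piece to the end of the result. */
    char *p = result;
    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, char *)) {
        size_t len = strlen(s);
        memcpy(p, s, len);
        p += len;
    }
    va_end(ap);
    *p = '\0';
    return result;
}
```

A call looks like concat_all("ab", "cd", "ef", (char *)NULL); forgetting the sentinel is undefined behavior, which is the main drawback compared with passing a count.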
The loop I would use for the final copy into str would be something like:
for(i=0, p=str; i < num; i++)
p += sprintf(p, "%s", ptr[i]);
or
for(i=0, p=str; i < num; i++)
p += strlen(strcpy(p, ptr[i]));
rather than trying to deal with a variable number of arguments in a single call to sprintf.
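Wrapped into a function that also does the size calculation, the second loop could look something like this. It is a sketch, not the only way to do it; join_strings is a made-up name, and it uses memcpy with the lengths it has already computed instead of strcpy.

```c
#include <stdlib.h>
#include <string.h>

/* Joins count strings from parts into one malloc'd string.
 * Returns NULL on allocation failure; the caller frees the result. */
char *join_strings(char **parts, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += strlen(parts[i]);

    char *result = malloc(total + 1);   /* +1 for the terminating '\0' */
    if (result == NULL)
        return NULL;

    char *p = result;
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(parts[i]);
        memcpy(p, parts[i], len);
        p += len;
    }
    *p = '\0';
    return result;
}
```

With the question's data, join_strings(ptr, 3) would produce "efcdab" for any number of elements, with no format string involved.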

Resources