Printing variable number of bytes using format strings with printf - c

Goal: Print variable number of bytes using a single format specifier.
Environment: x86-64 Ubuntu 20.04.3 LTS running in VM on an x86-64 host machine.
Example:
Let %kmagic be the format specifier I am looking for, which prints k bytes by popping them from the stack and adding them to the output. Then, for %rsp pointing to a region in memory holding bytes 0xde 0xad 0xbe 0xef, I want printf("Next 4 bytes on the stack: %4magic") to print Next 4 bytes on the stack: deadbeef.
What I tried so far:
%khhx, which unfortunately just results in k-1 blank spaces followed by two hex-characters (one byte of data).
%kx, which I expected to print k/2 bytes interpreted as one number. This only prints 8 hex-characters (4 bytes) prepended by k - 8 blank spaces.
The number of non-blank characters printed matches the length of the format specifiers, i.e. the expected length of %hhx is 2, which is also the number of non-blank characters printed. The same holds for %x, which one expects to print 8 characters.
Question:
Is it possible to get the desired behavior? If so, how?

Is it possible to get the desired behavior? If so, how?
There is no printf format specifier that does what you want.
Is it possible
Write your own printf implementation that supports what you want, or use implementation-specific tools to create your own printf format specifier. You can take inspiration from the Linux kernel's printk %*phN format specifier.

It is not possible using standard printf. You need to write your own function and customize the printf function.
http://www.gnu.org/software/libc/manual/html_node/Customizing-Printf.html
Example (simple dump):
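#include <stdio.h>
#include <printf.h>

/* Handler for %Y: dumps `size` bytes starting at `ptr` as hex, 8 per line. */
int printdump (FILE *stream, const struct printf_info *info, const void *const *args)
{
    const unsigned char *ptr = *(const unsigned char *const *)args[0];
    int size = *(const int *)args[1];   /* registered as PA_INT below */
    int written = 0;
    (void)info;
    for (int i = 1; i <= size; i++)
    {
        written += fprintf(stream, "%02X%c", ptr[i-1], i % 8 ? ' ' : '\n');
    }
    return written;   /* number of characters printed */
}

/* Tells printf that %Y consumes two arguments: a pointer and an int. */
int printdumpargs (const struct printf_info *info, size_t n, int *argtypes)
{
    (void)info;
    /* only fill as many slots as the caller provided */
    if (n > 0)
        argtypes[0] = PA_POINTER;
    if (n > 1)
        argtypes[1] = PA_INT;
    return 2;
}

int main(void)
{
    double x[4] = {456543645.6786e45, 456543654, 1e345, -345.56e67};
    register_printf_function ('Y', printdump, printdumpargs);
    printf("%Y\n", &x, (int)sizeof(x));   /* cast: the handler expects an int */
}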
As far as I can see, it is deprecated now (probably no one was using it).
https://godbolt.org/z/qKs6e1d9q
Output:
30 18 CB 5A EF 10 13 4B
00 00 00 A6 4D 36 BB 41
00 00 00 00 00 00 F0 7F
C4 5D ED 48 9C 05 60 CE

There is no standard conversion specifier for your purpose, but you can achieve your goal in C99 using an ancillary function and a compound literal as the output buffer:
#include <stdio.h>
char *dump_bytes(char *buf, const void *p, size_t count) {
    const unsigned char *src = p;
    char *dest = buf;
    while (count --> 0) {
        dest += sprintf(dest, "%.2X", *src++);
        if (count)
            *dest++ = ' ';
    }
    *dest = '\0'; // return an empty string for an empty memory chunk
    return buf;
}
int main() {
    long n = 0x12345;
    printf("n is at address %p with contents: %s\n",
           (void *)&n,
           dump_bytes((char[3 * sizeof(n)]){""}, &n, sizeof(n)));
    return 0;
}
Output: n is at address 0x7fff523f57d8 with contents: 45 23 01 00 00 00 00 00
You can use a macro for simpler invocation:
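#include <stdlib.h>
/* n should be an integer constant expression: a C99 compound literal
   cannot have variable length array type */
#define DUMPBYTES(p, n) dump_bytes((char[3 * (n)]){""}, p, n)
int main() {
    char *p = malloc(5);
    if (p == NULL)
        return 1;
    printf("allocated 5 bytes at address %p with contents: %s\n",
           (void *)p, DUMPBYTES(p, 5));
    free(p);
    return 0;
}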

Related

Comparing integers with memcmp()

I am making a function to get the maximum value of an array of NMEMB members each one of size SIZ,
comparing each member with memcmp(). The problem is that when comparing signed integers the result is incorrect but at the same time correct. Here is an example:
void *
getmax(const void *data, size_t nmemb, size_t siz){
const uint8_t *bytes = (const uint8_t *)data;
void *max = malloc(siz);
if (!max){
errno = ENOMEM;
return NULL;
}
memcpy(max, bytes, siz);
while (nmemb > 0){
hexdump(bytes, siz);
if (memcmp(max, bytes, siz) < 0)
memcpy(max, bytes, siz);
bytes += siz;
--nmemb;
}
return max;
}
int
main(int argc, char **argv){
int v[] = {5, 1, 3, 1, 34, 198, -12, -11, -0x111118};
size_t nmemb = sizeof(v)/sizeof(v[0]);
int *maximum = getmax(v, nmemb, sizeof(v[0]));
printf("%d\n", *maximum);
return 0;
}
hexdump() is just a debugging function, doesn't alter the program.
When compiling and executing the output is the following:
05 00 00 00 // hexdump() output
01 00 00 00
03 00 00 00
01 00 00 00
22 00 00 00
c6 00 00 00
f4 ff ff ff
f5 ff ff ff
e8 ee ee ff
-11 // "maximum" value
This is correct in the sense that memcmp() compares a string of bytes and doesn't care about types or sign, so -11 = 0xfffffff5 is the maximum string of bytes in the array v[]. At the same time it is incorrect, since -11 is not the maximum integer in the array.
Is there any way of getting the maximum integer of an array using this function?
memcmp compares the raw bytes and does not care about the sign. As the hexdump shows, it sees -11 as f5 ff ff ff, -12 as f4 ff ff ff, and the biggest number in the array, 198, as c6 00 00 00. Compared byte by byte as unsigned values, starting from the lowest address, -11 is the biggest of these, so it is the one returned. You should not use memcmp to compare signed numbers.
Go down the qsort route and require a custom comparator. Note that you absolutely don't need dynamic memory allocation in a function this simple:
#include <stdio.h>
void const *getmax(void const *data, size_t const count, size_t const elm_sz,
int (*cmp)(void const *, void const *)) {
char const *begin = data;
char const *end = begin + count * elm_sz;
char const *max = begin;
while (begin != end) {
if (cmp(max, begin) < 0) max = begin;
begin += elm_sz;
}
return max;
}
int int_cmp(void const *e1, void const *e2) {
int const i1 = *(int const *)e1;
int const i2 = *(int const *)e2;
if (i1 > i2) return 1;
if (i1 < i2) return -1;
return 0;
}
int main() {
int v[] = {5, 1, 3, 1, 34, 198, -12, -11, -0x111118};
int const *maximum = getmax(v, sizeof(v) / sizeof(*v), sizeof(*v), int_cmp);
printf("%d\n", *maximum);
}
All comparisons made by memcmp are unsigned and operate on char-sized elements. When you feed it an array of signed int, whose elements have a different size, the result can only be used to test equality of the binary representations: 0 means equal and nonzero means unequal. The sign of a nonzero result reflects a comparison of the individual bytes of the integers, decomposed in the machine's byte order and compared as unsigned values, even though the integers themselves are signed. In addition, the significance of the bytes affects the ordering: bytes are compared from lower to higher addresses, which matches numeric order only if the integers are unsigned and (very important) stored in memory in big-endian order. If you are using an Intel architecture, as is likely, the byte order is little-endian, which is exactly the opposite of what this would require.

Wrong ciphertext length in c libgcrypt

I am trying to encrypt and decrypt a string using the libgcrypt library (version 1.8.7) on Arch. So far I have tried two modes, CBC and GCM (I'm not sure about GCM, so let's solve CBC first), and the same problem appears in both.
I pad the string and then encrypt it block by block. Sometimes, seemingly at random, the gcry_cipher_encrypt function returns the wrong number of bytes (5, 7, 11...), but if I understand correctly, the output should be 16 bytes (128 bits). The same thing happens with decryption, which I do in exactly the same way, block by block. I'm using the same GCRY handler throughout the encryption or decryption process, and it feels like I'm really missing something. Here is an example with only encryption in CBC mode, to make it easier to find the problem.
Code:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <gcrypt.h>
// Define cipher details
#define GCRY_CIPHER GCRY_CIPHER_AES256
#define GCRY_C_MODE GCRY_CIPHER_MODE_CBC
char * encrypt_block(gcry_cipher_hd_t handler, unsigned char * key, unsigned char * input) {
size_t key_length = 32;
size_t blk_length = 16;
// Encryption result variable
unsigned char * enc = (char *) calloc(16, sizeof(char));
// Error variable
gcry_error_t err = 0;
// Set key
err = gcry_cipher_setkey(handler, key, key_length);
if (err) {
printf("Couldn't set the key!\n%s\n%s\n", gcry_strsource(err), gcry_strerror(err));
exit(-1);
}
// Start encryption process
err = gcry_cipher_encrypt(handler, enc, blk_length, input, blk_length);
if (err) {
printf("Couldn't encrypt!\n%s\n%s\n", gcry_strsource(err), gcry_strerror(err));
exit(-1);
}
if (strlen(enc) != 16) {
printf("\n\nCORRUPTED BLOCK!\n\n");
}
// Printing the block result
printf("\nENC BLOCK:\t%d\t", strlen(enc));
for (unsigned short int i = 0; i < strlen(enc); ++i) {
printf("%X ", enc[i]);
}
printf("\n");
return enc;
}
int main() {
// Creating basic variables
unsigned char * input = (char *) calloc(2048, sizeof(char));
unsigned char * key = (char *) calloc(32, sizeof(char));
unsigned char * iv = (char *) calloc(16, sizeof(char));
// Taking user input
printf("Input (2048 max): ");
scanf(" %[^\n]", input);
printf("Key (32 max): ");
scanf(" %[^\n]", key);
printf("RAW DATA:\n\tinput: %d\t%s\n\tkey: %d\t%s\n\n", strlen(input), input, strlen(key), key);
// Create GCRY handler
gcry_cipher_hd_t handler;
gcry_error_t err = 0;
// Initialize cipher handler
err = gcry_cipher_open(&handler, GCRY_CIPHER, GCRY_C_MODE, 0);
if (err) {
printf("Couldn't initialize the cipher!\n%s\n%s\n", gcry_strsource(err), gcry_strerror(err));
exit(-1);
}
// Add padding to the input
if ((strlen(input) % 16) != 0) {
for (unsigned short int i = 0; i < (((strlen(input) / 16) * 16) - strlen(input)); ++i) {
strcat(input, "X");
}
}
// Add padding to the key
if (strlen(key) < 32) {
for (unsigned short int i = strlen(key); i < 32; ++i) {
key[i] = 0x0058;
}
}
// Generate random IV
char charset[] = "abcdefghijklmnopqrstuvwxyz0123456789";
unsigned short int iv_size = 16;
for (unsigned short int i = 0; i < iv_size; ++i) {
unsigned short int index = rand() % (unsigned short int) (sizeof charset - 1);
iv[i] = charset[index];
}
// Set the IV
err = gcry_cipher_setiv(handler, iv, 16);
if (err) {
printf("Couldn't set the IV!\n%s\n%s\n", gcry_strsource(err), gcry_strerror(err));
exit(-1);
}
printf("ENC DATA:\n\tinput: %d\t%s\n\tkey: %d\t%s\n\tiv: %d\t%s\n\n", strlen(input), input, strlen(key), key, strlen(iv), iv);
// Create encryption variables
unsigned char * input_buffer = (char *) calloc(16, sizeof(char));
unsigned char * enc_buffer = (char *) calloc(16, sizeof(char));
unsigned char * out = (char *) calloc(strlen(input), sizeof(char));
// Start encryption process block by block
for (unsigned short int i = 0; i < (strlen(input) / 16); ++i) {
// Create a new block
for (unsigned short int j = 0; j < 16; ++j) {
input_buffer[j] = input[(i * 16) + j];
}
printf("\nENC INPUT:\t%d\t%s\n", strlen(input_buffer), input_buffer);
// Check if this is a final round
if (i == ((strlen(input) / 16) - 1)) {
err = gcry_cipher_final(handler);
}
// Start encrypting the block
enc_buffer = encrypt_block(handler, key, input_buffer);
// Adding up the block to the out result
strcat(out, enc_buffer);
memset(input_buffer, 0, 16);
memset(enc_buffer, 0, 16);
}
// Print the encryption result
printf("\n\nENC RESULT:\n\t%d\n\t", strlen(out));
for (unsigned short int i = 0; i < strlen(out); ++i) {
printf("%X ", out[i]);
}
printf("\n");
gcry_cipher_close(handler);
}
Output:
Input (2048 max): This string is made for testing the program
Key (32 max): hey my password
RAW DATA:
input: 43 This string is made for testing the program
key: 15 hey my password
ENC DATA:
input: 48 This string is made for testing the programXXXXX
key: 32 hey my passwordXXXXXXXXXXXXXXXXX
iv: 16 t8jhfhkm7bo5ohxw
ENC INPUT: 16 This string is m
ENC BLOCK: 16 2 BF AA A0 1 7C A8 77 DA 4A 5A 72 29 EB FA F6
ENC INPUT: 16 ade for testing
ENC BLOCK: 16 41 BA CE 61 8A E3 F4 89 8A 46 50 2 47 5 11 A4
ENC INPUT: 16 the programXXXXX
CORRUPTED BLOCK!
ENC BLOCK: 12 AE D6 92 D2 5A AF 85 CB 57 2 1B 93
ENC RESULT:
44
2 BF AA A0 1 7C A8 77 DA 4A 5A 72 29 EB FA F6 41 BA CE 61 8A E3 F4 89 8A 46 50 2 47 5 11 A4 AE D6 92 D2 5A AF 85 CB 57 2 1B 93
I am really sorry for this mess, it's just me going crazy at this point, it seems like the solution is so simple, but i just can't get it.
strlen does not tell you anything about your output buffer's length. Your output buffer is, in fact, always the same length. There's no need to test its length because libgcrypt has no way of modifying its length.
If you want to understand why strlen is returning "chaotic" values, you need to understand what strlen is intended to do. strlen is intended to operate on a C-style (null-terminated) string, not on arbitrary bytes. Strings in C are stored as arrays of characters ending with a '\0' (0x00) character. This is the null terminator. This is how the length of C-strings can be determined.
// example implementation to explicate the concept
size_t strlen(const char *s) {
size_t i = 0;
while (s[i] != '\0')
++i;
return i;
}
When you apply strlen to arbitrary bytes, the results are nonsensical. It is perfectly possible for your binary ciphertext to contain the byte 0x00 anywhere. It could appear at the beginning or anywhere in the middle. It could appear several times. Or it could never appear, in which case strlen would read past the end of the buffer, which is undefined behavior and may end in a segmentation fault. Wherever 0x00 happens to first appear in your ciphertext, that will be where strlen assumes it ends. The behavior appears "chaotic" because encryption produces random-seeming data, so the distribution of 0x00 within that data is also random-seeming.
PS: You don't need to reset the key every time you encrypt a block.

scanf hexadecimal a long byte array

I have a file with the following format:
0 b71b3a8de0c18abd2e56ec5f4efc4af2ba084604
1 4bec20891a68887eef982e9cda5d02ca8e6d4f57
The first value is an integer, and the second is a 20-byte value encoded in hexadecimal. I want to be able to read in both values using an fscanf loop like so:
FILE *file = fopen("file.txt", "r");
int id;
char hash[20];
while(fscanf(file, "%i %40x\n", &id, hash) == 2){
// Do Stuff
}
However, this clearly doesn't work, as %40x expects an unsigned int pointer, but this is not large enough to hold the value. I know I can do multiple formatters, like %x%x%x, but this doesn't seem elegant. Is there a better way I can do this using fscanf?
b7 1b 3a 8d e0 c1 8a bd 2e 56 ec 5f 4e fc 4a f2 ba 08 46 04
Each pair of characters is in the range 0 to 0xff. This fits in one byte, or unsigned char. Hash functions normally expect unsigned char as well.
Use the following conversion:
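int i, id;
unsigned int v;
unsigned char hash[20];
char buf[41];
/* %40s limits the read to 40 hex digits so buf cannot overflow */
while (fscanf(file, "%d %40s\n", &id, buf) == 2)
{
    for (i = 0; i < 20; i++)
    {
        /* convert each pair of hex digits into one byte */
        if (sscanf(buf + i * 2, "%2x", &v) != 1) break;
        hash[i] = (unsigned char)v;
    }
}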

Reading in bytes from a struct one at a time

I have a struct of six 16-bit integers and one 32-bit integer (16 bytes total), and I'm trying to read the struct one byte at a time. Currently I use
printf("%.4x %.4x %.4x %.4x %.4x %.4x %.4x\n", );
with the 7 struct members as the following parameters.
My output is as following:
0001 0100 0010 0002 0058 0070 464c45
And I would like to format it as:
01 00 00 01 10 00 02 00 58 00 70 00 45 4c 46 00
I've been searching everywhere to try and find out how to properly format it. Any help would be greatly appreciated! thank you in advance!
You can just move an unsigned char pointer over the struct, reading byte for byte (I hope I don't mix things up with C++, getting into undefined behavior may happen when doing such things):
#include <stdio.h>
#include <stdint.h>
struct Data {
int16_t small[6];
int32_t big;
};
void funky_print(struct Data const * data) {
unsigned char const * ptr = (unsigned char const *)data;
size_t i;
printf("%.2hhx", *ptr);
++ptr;
for (i = 1; i < sizeof(*data); ++i) {
printf(" %.2hhx", *ptr);
++ptr;
}
}
int main(void) {
struct Data d = {{0xA0B0, 0xC0D0, 84, 128, 3200, 0}, 0x1BADCAFE};
funky_print(&d);
return 0;
}

Use printf to print character string in hexadecimal format, distorted results

I want to print a character string in hexadecimal format on machine A. Something like:
ori_mesg = gen_rdm_bytestream (1400, seed)
sendto(machine B, ori_mesg, len(mesg))
On machine B
recvfrom(machine A, mesg)
mesg_check = gen_rdm_bytestream (1400, seed)
for(i=0;i<20;i++){
printf("%02x ", *(mesg+i)& 0xFF);
}
printf("\n");
for(i=0; i<20; i++){
printf("%02x ", *(mesg_check+i));
}
printf("\n");
seed varies among 1, 2, 3, ...
The byte generation function is:
u_char *gen_rdm_bytestream (size_t num_bytes, unsigned int seed)
{
u_char *stream = malloc (num_bytes+4);
size_t i;
u_int16_t seq = seed;
seq = htons(seq);
u_int16_t tail = num_bytes;
tail = htons(tail);
memcpy(stream, &seq, sizeof(seq));
srand(seed);
for (i = 3; i < num_bytes+2; i++){
stream[i] = rand ();
}
memcpy(stream+num_bytes+2, &tail, sizeof(tail));
return stream;
}
But I got results from printf like:
00 01 00 67 c6 69 73 51 ff 4a ec 29 cd ba ab f2 fb e3 46 7c
00 01 00 67 ffffffc6 69 73 51 ffffffff 4a ffffffec 29 ffffffcd ffffffba ffffffab fffffff2 fffffffb ffffffe3 46 7c
or
00 02 88 fa 7f 44 4f d5 d2 00 2d 29 4b 96 c3 4d c5 7d 29 7e
00 02 00 fffffffa 7f 44 4f ffffffd5 ffffffd2 00 2d 29 4b ffffff96 ffffffc3 4d ffffffc5 7d 29 7e
Why are there so many fffff for mesg_check?
Are there any potential reasons for this phenomenon?
Here's a small program that illustrates the problem I think you might be having:
#include <stdio.h>
int main(void) {
char arr[] = { 0, 16, 127, 128, 255 };
for (int i = 0; i < sizeof arr; i ++) {
printf(" %2x", arr[i]);
}
putchar('\n');
return 0;
}
On my system (on which plain char is signed), I get this output:
0 10 7f ffffff80 ffffffff
The value 255, when stored in a (signed) char, is stored as -1. In the printf call, it's promoted to (signed) int -- but the "%2x" format tells printf to treat it as an unsigned int, so it displays ffffffff.
Make sure that your mesg and mesg_check arrays are defined as arrays of unsigned char, not plain char.
UPDATE: Rereading this answer more than a year later, I realize it's not quite correct. Here's a program that works correctly on my system, and will almost certainly work on any reasonable system:
#include <stdio.h>
int main(void) {
unsigned char arr[] = { 0, 16, 127, 128, 255 };
for (int i = 0; i < sizeof arr; i ++) {
printf(" %02x", arr[i]);
}
putchar('\n');
return 0;
}
The output is:
00 10 7f 80 ff
An argument of type unsigned char is promoted to (signed) int (assuming that int can hold all values of type unsigned char, i.e., INT_MAX >= UCHAR_MAX, which is the case on practically all systems). So the argument arr[i] is promoted to int, while the " %02x" format requires an argument of type unsigned int.
The C standard strongly implies, but doesn't quite state directly, that arguments of corresponding signed and unsigned types are interchangeable as long as they're within the range of both types -- which is the case here.
To be completely correct, you need to ensure that the argument is actually of type unsigned int:
printf("%02x", (unsigned)arr[i]);
Yes, always print the string in hexadecimal format as:
for (i = 0; i < len; i++)  /* len is the string length */
    printf("%02X", (unsigned char)str[i]);
Without the cast to unsigned char, printing the string character by character gives wrong (sign-extended) output whenever the string's element type is something other than unsigned char, such as plain char on platforms where it is signed.
