I have a file with the following format:
0 b71b3a8de0c18abd2e56ec5f4efc4af2ba084604
1 4bec20891a68887eef982e9cda5d02ca8e6d4f57
The first value is an integer, and the second is a 20-byte value encoded in hexadecimal. I want to be able to read in both values using an fscanf loop like so:
FILE *file = fopen("file.txt", "r");
int id;
char hash[20];
while(fscanf(file, "%i %40x\n", &id, hash) == 2){
// Do Stuff
}
However, this clearly doesn't work, as %40x expects an unsigned int pointer, but this is not large enough to hold the value. I know I can do multiple formatters, like %x%x%x, but this doesn't seem elegant. Is there a better way I can do this using fscanf?
b7 1b 3a 8d e0 c1 8a bd 2e 56 ec 5f 4e fc 4a f2 ba 08 46 04
Each pair of characters is a value in the range 0 to 0xff, which fits in one byte (an unsigned char). Hash functions normally expect unsigned char as well.
Use the following conversion:
int i, id;
unsigned int v;
unsigned char hash[20];
char buf[41];
while(fscanf(file, "%d %40s\n", &id, buf) == 2) // %40s keeps the read within the 41-byte buf
{
for(i = 0; i < 20; i++)
{
if(sscanf(buf + i * 2, "%2x", &v) != 1) break;
hash[i] = (unsigned char)v;
}
}
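As an alternative to calling sscanf once per byte, the hex string can be decoded by hand, which also makes it easy to reject malformed input. The following is a minimal sketch, not part of the original answer; the names hex_digit_value and parse_hex_bytes are hypothetical, and it assumes the digest has already been read into a null-terminated buffer such as buf above.

```c
#include <stddef.h>

/* Convert one hex digit to its value, or return -1 if it is not a hex digit. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode exactly 2 * out_len hex characters from `hex` into `out`.
   Returns 0 on success, -1 if the string is too short, too long,
   or contains a non-hex character. */
int parse_hex_bytes(const char *hex, unsigned char *out, size_t out_len)
{
    for (size_t i = 0; i < out_len; i++) {
        int hi = hex_digit_value(hex[2 * i]);
        if (hi < 0) return -1;            /* also catches an early '\0' */
        int lo = hex_digit_value(hex[2 * i + 1]);
        if (lo < 0) return -1;
        out[i] = (unsigned char)((hi << 4) | lo);
    }
    return hex[2 * out_len] == '\0' ? 0 : -1;
}
```

Inside the fscanf loop this would be called as parse_hex_bytes(buf, hash, 20), treating a nonzero return as a corrupt line.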
I want to make something like a small hex editor for my project.
So I wrote a function like this (to replace the original code with the new code):
int replace(FILE *binaryFile, long offset, unsigned char *replaced, int length) {
if (binaryFile != NULL) {
for (int i = 0; i < length; i++) {
fseek(binaryFile, offset + i, SEEK_SET);
fwrite(&replaced[i], sizeof(replaced), 1, binaryFile);
}
fclose(binaryFile);
return 1;
} else return -1;
}
So I wrote this code to test the function and sent it to address 0x0:
unsigned char code[] = "\x1E\xFF\x2F\xE1";
and I got this hexadecimal result:
1e ff 2f e1 00 10 2b 35 ff fe 07 00
But I don't need data after E1 (00 10 2b 35 ff fe 07 00)
How can I write the function so that only the data sent to the function is stored?
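sizeof(replaced) is wrong. replaced is an unsigned char *, so sizeof(replaced) is the size of the pointer, not the size of one element.
You probably want sizeof(unsigned char) or sizeof(*replaced).
Currently, each fwrite call writes sizeof(replaced) bytes (8 on a typical 64-bit system) instead of 1, so every call writes several times too much and the last call spills past the end of your data.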
Note that you could also write in a single step:
if (binaryFile != NULL)
{
fseek(binaryFile, offset, SEEK_SET);
fwrite(replaced, sizeof(unsigned char), length, binaryFile);
}
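Putting the single-step version together with error checking might look like the sketch below. It is only an illustration under some assumptions: the name replace_bytes is hypothetical, the stream is assumed to have been opened in a binary update mode such as "r+b", and, unlike the function in the question, it leaves closing the file to the caller.

```c
#include <stdio.h>

/* Overwrite `length` bytes at `offset` in an already-open binary stream.
   Returns 0 on success and -1 on failure. The caller keeps ownership
   of the stream and is responsible for closing it. */
int replace_bytes(FILE *fp, long offset, const unsigned char *data, size_t length)
{
    if (fp == NULL || data == NULL)
        return -1;
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    /* Element size 1, count `length`: exactly `length` bytes are written. */
    if (fwrite(data, 1, length, fp) != length)
        return -1;
    return fflush(fp) == 0 ? 0 : -1;
}
```

Note that an array initialized from the string literal "\x1E\xFF\x2F\xE1" has five elements because of the terminating null byte, so pass sizeof code - 1 (or use a brace initializer) if only the four bytes should be written.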
Goal: Print variable number of bytes using a single format specifier.
Environment: x86-64 Ubuntu 20.04.3 LTS running in VM on an x86-64 host machine.
Example:
Let %kmagic be the format specifier I am looking for, which prints k bytes by popping them from the stack and adding them to the output. Then, for %rsp pointing to a region in memory holding bytes 0xde 0xad 0xbe 0xef, I want printf("Next 4 bytes on the stack: %4magic") to print Next 4 bytes on the stack: deadbeef.
What I tried so far:
%khhx, which unfortunately just results in k-1 blank spaces followed by two hex-characters (one byte of data).
%kx, which I expected to print k/2 bytes interpreted as one number. This only prints 8 hex-characters (4 bytes) prepended by k - 8 blank spaces.
The number of non-blank characters printed matches the length of the format specifiers, i.e. the expected length of %hhx is 2, which is also the number of non-blank characters printed. The same holds for %x, which one expects to print 8 characters.
Question:
Is it possible to get the desired behavior? If so, how?
There is no printf format specifier that does what you want.
As for whether it is possible at all: you can write your own printf implementation that supports it, or use implementation-specific tools to create your own printf format specifier. You can take inspiration from the Linux kernel's printk %*phN format specifier.
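If a plain helper function is acceptable instead of a new conversion specifier, a loop over the bytes is enough. The sketch below is an assumption about what "print k bytes" should mean here (lowercase hex, no separators, bytes taken from a pointer the caller supplies rather than popped from the stack); fprint_hex is a hypothetical name.

```c
#include <stdio.h>

/* Print `count` bytes starting at `p` as lowercase hex, two digits per byte.
   Returns the number of characters written, or -1 on an output error. */
int fprint_hex(FILE *stream, const void *p, size_t count)
{
    const unsigned char *bytes = p;
    int written = 0;
    for (size_t i = 0; i < count; i++) {
        int n = fprintf(stream, "%02x", (unsigned int)bytes[i]);
        if (n < 0)
            return -1;
        written += n;
    }
    return written;
}
```

For a buffer holding 0xde 0xad 0xbe 0xef, calling fprint_hex(stdout, buffer, 4) writes deadbeef.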
It is not possible using standard printf. You need to write your own function and customize the printf function.
http://www.gnu.org/software/libc/manual/html_node/Customizing-Printf.html
Example (simple dump):
int printdump (FILE *stream, const struct printf_info *info, const void *const *args)
{
const unsigned char *ptr = *(const unsigned char **)args[0];
size_t size = *(size_t*)args[1];
for(size_t i = 1; i <= size; i++)
{
fprintf(stream, "%02X%c", ptr[i-1], i % 8 ? ' ' : '\n');
}
return 1;
}
int printdumpargs (const struct printf_info *info, size_t n, int *argtypes)
{
if (n >= 2)
{
argtypes[0] = PA_POINTER;
argtypes[1] = PA_INT;
}
return 2;
}
int main(void)
{
double x[4] = {456543645.6786e45, 456543654, 1e345, -345.56e67};
register_printf_function ('Y', printdump, printdumpargs);
printf("%Y\n", &x, sizeof(x));
}
As far as I can see, it is deprecated now (probably no one was using it).
https://godbolt.org/z/qKs6e1d9q
Output:
30 18 CB 5A EF 10 13 4B
00 00 00 A6 4D 36 BB 41
00 00 00 00 00 00 F0 7F
C4 5D ED 48 9C 05 60 CE
There is no standard conversion specifier for your purpose, but you can achieve your goal in C99 using an ancillary function and a compound literal as the output buffer:
#include <stdio.h>
char *dump_bytes(char *buf, const void *p, size_t count) {
const unsigned char *src = p;
char *dest = buf;
while (count --> 0) {
dest += sprintf(dest, "%.2X", *src++);
if (count)
*dest++ = ' ';
}
*dest = '\0'; // return an empty string for an empty memory chunk
return buf;
}
int main() {
long n = 0x12345;
printf("n is at address %p with contents: %s\n",
(void *)&n,
dump_bytes((char[3 * sizeof(n)]){""}, &n, sizeof(n)));
return 0;
}
Output: n is at address 0x7fff523f57d8 with contents: 45 23 01 00 00 00 00 00
You can use a macro for simpler invocation:
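#include <stdlib.h>
#define DUMPBYTES(p, n) dump_bytes((char[3 * (n)]){""}, p, n)
int main() {
char *p = calloc(5, 1); // zero-initialized so the dump reads defined values
if (p == NULL) return 1;
printf("allocated 5 bytes at address %p with contents: %s\n",
(void *)p, DUMPBYTES(p, 5));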
free(p);
return 0;
}
I am trying to encrypt and decrypt a string using the libgcrypt library (version 1.8.7) on Arch. So far I have tried two modes, CBC and GCM (I'm not sure about GCM, so let's solve CBC first), and the same problem appears in both.
I pad the string and then encrypt it block by block. Sometimes, seemingly at random, the gcry_cipher_encrypt function returns the wrong number of bytes (5, 7, 11...), but if I understood correctly, the output should be 16 bytes (128 bits). The same thing happens with decryption, which I do in exactly the same way, block by block. I'm using the same GCRY handler throughout the encryption or decryption process, and it feels like I'm really missing something. Here is an example with only encryption in CBC mode, to make it easier to find the problem.
Code:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <gcrypt.h>
// Define cipher details
#define GCRY_CIPHER GCRY_CIPHER_AES256
#define GCRY_C_MODE GCRY_CIPHER_MODE_CBC
char * encrypt_block(gcry_cipher_hd_t handler, unsigned char * key, unsigned char * input) {
size_t key_length = 32;
size_t blk_length = 16;
// Encryption result variable
unsigned char * enc = (char *) calloc(16, sizeof(char));
// Error variable
gcry_error_t err = 0;
// Set key
err = gcry_cipher_setkey(handler, key, key_length);
if (err) {
printf("Couldn't set the key!\n%s\n%s\n", gcry_strsource(err), gcry_strerror(err));
exit(-1);
}
// Start encryption process
err = gcry_cipher_encrypt(handler, enc, blk_length, input, blk_length);
if (err) {
printf("Couldn't encrypt!\n%s\n%s\n", gcry_strsource(err), gcry_strerror(err));
exit(-1);
}
if (strlen(enc) != 16) {
printf("\n\nCORRUPTED BLOCK!\n\n");
}
// Printing the block result
printf("\nENC BLOCK:\t%d\t", strlen(enc));
for (unsigned short int i = 0; i < strlen(enc); ++i) {
printf("%X ", enc[i]);
}
printf("\n");
return enc;
}
int main() {
// Creating basic variables
unsigned char * input = (char *) calloc(2048, sizeof(char));
unsigned char * key = (char *) calloc(32, sizeof(char));
unsigned char * iv = (char *) calloc(16, sizeof(char));
// Taking user input
printf("Input (2048 max): ");
scanf(" %[^\n]", input);
printf("Key (32 max): ");
scanf(" %[^\n]", key);
printf("RAW DATA:\n\tinput: %d\t%s\n\tkey: %d\t%s\n\n", strlen(input), input, strlen(key), key);
// Create GCRY handler
gcry_cipher_hd_t handler;
gcry_error_t err = 0;
// Initialize cipher handler
err = gcry_cipher_open(&handler, GCRY_CIPHER, GCRY_C_MODE, 0);
if (err) {
printf("Couldn't initialize the cipher!\n%s\n%s\n", gcry_strsource(err), gcry_strerror(err));
exit(-1);
}
// Add padding to the input
if ((strlen(input) % 16) != 0) {
for (unsigned short int i = 0; i < (((strlen(input) / 16) * 16) - strlen(input)); ++i) {
strcat(input, "X");
}
}
// Add padding to the key
if (strlen(key) < 32) {
for (unsigned short int i = strlen(key); i < 32; ++i) {
key[i] = 0x0058;
}
}
// Generate random IV
char charset[] = "abcdefghijklmnopqrstuvwxyz0123456789";
unsigned short int iv_size = 16;
for (unsigned short int i = 0; i < iv_size; ++i) {
unsigned short int index = rand() % (unsigned short int) (sizeof charset - 1);
iv[i] = charset[index];
}
// Set the IV
err = gcry_cipher_setiv(handler, iv, 16);
if (err) {
printf("Couldn't set the IV!\n%s\n%s\n", gcry_strsource(err), gcry_strerror(err));
exit(-1);
}
printf("ENC DATA:\n\tinput: %d\t%s\n\tkey: %d\t%s\n\tiv: %d\t%s\n\n", strlen(input), input, strlen(key), key, strlen(iv), iv);
// Create encryption variables
unsigned char * input_buffer = (char *) calloc(16, sizeof(char));
unsigned char * enc_buffer = (char *) calloc(16, sizeof(char));
unsigned char * out = (char *) calloc(strlen(input), sizeof(char));
// Start encryption process block by block
for (unsigned short int i = 0; i < (strlen(input) / 16); ++i) {
// Create a new block
for (unsigned short int j = 0; j < 16; ++j) {
input_buffer[j] = input[(i * 16) + j];
}
printf("\nENC INPUT:\t%d\t%s\n", strlen(input_buffer), input_buffer);
// Check if this is a final round
if (i == ((strlen(input) / 16) - 1)) {
err = gcry_cipher_final(handler);
}
// Start encrypting the block
enc_buffer = encrypt_block(handler, key, input_buffer);
// Adding up the block to the out result
strcat(out, enc_buffer);
memset(input_buffer, 0, 16);
memset(enc_buffer, 0, 16);
}
// Print the encryption result
printf("\n\nENC RESULT:\n\t%d\n\t", strlen(out));
for (unsigned short int i = 0; i < strlen(out); ++i) {
printf("%X ", out[i]);
}
printf("\n");
gcry_cipher_close(handler);
}
Output:
Input (2048 max): This string is made for testing the program
Key (32 max): hey my password
RAW DATA:
input: 43 This string is made for testing the program
key: 15 hey my password
ENC DATA:
input: 48 This string is made for testing the programXXXXX
key: 32 hey my passwordXXXXXXXXXXXXXXXXX
iv: 16 t8jhfhkm7bo5ohxw
ENC INPUT: 16 This string is m
ENC BLOCK: 16 2 BF AA A0 1 7C A8 77 DA 4A 5A 72 29 EB FA F6
ENC INPUT: 16 ade for testing
ENC BLOCK: 16 41 BA CE 61 8A E3 F4 89 8A 46 50 2 47 5 11 A4
ENC INPUT: 16 the programXXXXX
CORRUPTED BLOCK!
ENC BLOCK: 12 AE D6 92 D2 5A AF 85 CB 57 2 1B 93
ENC RESULT:
44
2 BF AA A0 1 7C A8 77 DA 4A 5A 72 29 EB FA F6 41 BA CE 61 8A E3 F4 89 8A 46 50 2 47 5 11 A4 AE D6 92 D2 5A AF 85 CB 57 2 1B 93
I am really sorry for this mess; I'm going a bit crazy at this point. It seems like the solution is simple, but I just can't find it.
strlen does not tell you anything about your output buffer's length. Your output buffer is, in fact, always the same length. There's no need to test its length because libgcrypt has no way of modifying its length.
If you want to understand why strlen is returning "chaotic" values, you need to understand what strlen is intended to do. strlen is intended to operate on a C-style (null-terminated) string, not on arbitrary bytes. Strings in C are stored as arrays of characters ending with a '\0' (0x00) character. This is the null terminator. This is how the length of C-strings can be determined.
// example implementation to explicate the concept
size_t strlen(const char *s) {
size_t i = 0;
while (s[i] != '\0')
++i;
return i;
}
When you apply strlen to arbitrary bytes, the results are nonsensical. It is perfectly possible for your binary ciphertext to contain the byte 0x00 anywhere. It could appear at the beginning or anywhere in the middle. It could appear several times. Or it could never appear, in which case strlen reads past the end of the buffer, which is undefined behavior and may crash with a segmentation fault. Wherever 0x00 happens to first appear in your ciphertext, that will be where strlen assumes it ends. The behavior appears "chaotic" because encryption produces random-seeming data, so the distribution of 0x00 within that data is also random-seeming.
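To make the point concrete, here is a small sketch that prints a binary block using an explicit length instead of strlen, next to a bounded helper that reports what strlen would effectively see. Both function names are hypothetical and the code does not use libgcrypt at all; it only assumes you already know the block size (16 for AES).

```c
#include <stdio.h>
#include <string.h>

/* Write `len` bytes of `data` into `out` as uppercase hex.
   `out` must have room for 2 * len + 1 characters. */
void bytes_to_hex(const unsigned char *data, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++)
        sprintf(out + 2 * i, "%02X", (unsigned int)data[i]);
    out[2 * len] = '\0';
}

/* What strlen() reports for a binary buffer: the index of the first
   0x00 byte, searched only within the first `len` bytes so that the
   scan never leaves the buffer. */
size_t bytes_before_first_zero(const unsigned char *data, size_t len)
{
    const unsigned char *zero = memchr(data, 0, len);
    return zero ? (size_t)(zero - data) : len;
}
```

For a 16-byte block that starts AE D6 92 D2 5A AF 85 CB 57 02 1B 93 and has 0x00 as its thirteenth byte, bytes_before_first_zero returns 12, matching the "corrupted" length in the question, while bytes_to_hex with len set to 16 still formats all sixteen bytes.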
PS: You don't need to reset the key every time you encrypt a block.
I want to print a character string in hexadecimal format on machine A. Something like:
ori_mesg = gen_rdm_bytestream (1400, seed)
sendto(machine B, ori_mesg, len(mesg))
On machine B
recvfrom(machine A, mesg)
mesg_check = gen_rdm_bytestream (1400, seed)
for(i=0;i<20;i++){
printf("%02x ", *(mesg+i)& 0xFF);
}
printf("\n");
for(i=0; i<20; i++){
printf("%02x ", *(mesg_check+i));
}
printf("\n");
seed varies among 1, 2, 3, ...
The byte generation function is:
u_char *gen_rdm_bytestream (size_t num_bytes, unsigned int seed)
{
u_char *stream = malloc (num_bytes+4);
size_t i;
u_int16_t seq = seed;
seq = htons(seq);
u_int16_t tail = num_bytes;
tail = htons(tail);
memcpy(stream, &seq, sizeof(seq));
srand(seed);
for (i = 3; i < num_bytes+2; i++){
stream[i] = rand ();
}
memcpy(stream+num_bytes+2, &tail, sizeof(tail));
return stream;
}
But I got results from printf like:
00 01 00 67 c6 69 73 51 ff 4a ec 29 cd ba ab f2 fb e3 46 7c
00 01 00 67 ffffffc6 69 73 51 ffffffff 4a ffffffec 29 ffffffcd ffffffba ffffffab fffffff2 fffffffb ffffffe3 46 7c
or
00 02 88 fa 7f 44 4f d5 d2 00 2d 29 4b 96 c3 4d c5 7d 29 7e
00 02 00 fffffffa 7f 44 4f ffffffd5 ffffffd2 00 2d 29 4b ffffff96 ffffffc3 4d ffffffc5 7d 29 7e
Why are there so many fffff for mesg_check?
Are there any potential reasons for this phenomenon?
Here's a small program that illustrates the problem I think you might be having:
#include <stdio.h>
int main(void) {
char arr[] = { 0, 16, 127, 128, 255 };
for (int i = 0; i < sizeof arr; i ++) {
printf(" %2x", arr[i]);
}
putchar('\n');
return 0;
}
On my system (on which plain char is signed), I get this output:
0 10 7f ffffff80 ffffffff
The value 255, when stored in a (signed) char, is stored as -1. In the printf call, it's promoted to (signed) int -- but the "%2x" format tells printf to treat it as an unsigned int, so it displays ffffffff.
Make sure that your mesg and mesg_check arrays are defined as arrays of unsigned char, not plain char.
UPDATE: Rereading this answer more than a year later, I realize it's not quite correct. Here's a program that works correctly on my system, and will almost certainly work on any reasonable system:
#include <stdio.h>
int main(void) {
unsigned char arr[] = { 0, 16, 127, 128, 255 };
for (int i = 0; i < sizeof arr; i ++) {
printf(" %02x", arr[i]);
}
putchar('\n');
return 0;
}
The output is:
00 10 7f 80 ff
An argument of type unsigned char is promoted to (signed) int (assuming that int can hold all values of type unsigned char, i.e., INT_MAX >= UCHAR_MAX, which is the case on practically all systems). So the argument arr[i] is promoted to int, while the " %02x" format requires an argument of type unsigned int.
The C standard strongly implies, but doesn't quite state directly, that arguments of corresponding signed and unsigned types are interchangeable as long as they're within the range of both types -- which is the case here.
To be completely correct, you need to ensure that the argument is actually of type unsigned int:
printf("%02x", (unsigned)arr[i]);
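The two behaviors can be compared side by side without relying on whether plain char is signed. The sketch below is illustrative only: the function names are hypothetical, signed char is used explicitly to make the sign extension reproducible, and the eight-digit result assumes a 32-bit int.

```c
#include <stdio.h>

/* Format one byte the way printf("%02x", c) ends up seeing a negative
   char: promoted to int, then reinterpreted as unsigned int. */
void format_sign_extended(signed char c, char *out, size_t out_size)
{
    snprintf(out, out_size, "%02x", (unsigned int)(int)c);
}

/* Format one byte after converting it to unsigned char first, so only
   the low 8 bits survive the promotion. */
void format_as_byte(signed char c, char *out, size_t out_size)
{
    snprintf(out, out_size, "%02x", (unsigned int)(unsigned char)c);
}
```

With c equal to -58 (the bit pattern 0xc6), format_sign_extended produces ffffffc6 on a system with 32-bit int, and format_as_byte produces c6.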
Yes, always print the string in hexadecimal format like this, where len is the number of bytes in str:
for (i = 0; i < len; i++)
printf("%02X", (unsigned char)str[i]);
Without the cast you will get wrong output (the run of leading f digits) whenever the string's element type is something other than unsigned char, such as a plain char that is signed.
noob warning.
I'm trying to create a compression program. It takes a .txt with ASCII characters as an argument, and cuts off the leading 0 of the binary representation of each character.
It does this by using the last 2 bytes of two different integers. A character with a leading zero is put into the 4th byte of the integer 'write', and the next character is put into the 3rd byte of the integer 'temp'. The 'temp' int is then shifted to the right once, and then OR'd with 'write', so that the leading zero slot has been filled with data we need. This repeats, with the shift counter increasing after every character. The first case is a bit odd. The algorithm isn't very complex if written out on paper.
I feel like I've tried everything. I've been over the algorithm so many times. I'm pretty sure the problem is when shift_counter gets to 8.. but it should work fine. It just doesn't. I can show you why here (the code is further down):
This is the hex dump of my output:
0000000 3f 00 00 00 41 10 68 9e 6e c3 d9 65 10 88 5e c6
0000020 d3 41 e6 74 9a 5d 06 d1 df a0 7a 7d 5e 06 a5 dd
0000040 20 3a bd 3c a7 a7 dd 67 10 e8 5d a7 83 e8 e8 72
0000060 19 a4 c7 c9 6e a0 f1 f8 dd 86 cb cb f3 f9 3c
0000077
And the correct output:
0000000 3f 00 00 00 41 d0 3c dd 86 b3 cb 20 7a 19 4f 07
0000020 99 d3 ec 32 88 fe 06 d5 e7 65 50 da 0d a2 97 e7
0000040 f4 b4 fb 0c 7a d7 e9 20 3a ba 0c d2 e3 64 37 d0
0000060 f8 dd 86 cb cb f3 79 fa ed 76 29 00 0a 0a
0000076
code:
int compress(char *filename_ptr){
int in_fd;
in_fd = open(filename_ptr, O_RDONLY);
//set pointer to the end of the file, find file size, then reset position
//by closing/opening
unsigned int file_bytes = lseek(in_fd, 0, SEEK_END);
close(in_fd);
in_fd = open(filename_ptr, O_RDONLY);
//store file contents in buffer
unsigned char read_buffer[file_bytes];
read(in_fd, read_buffer, file_bytes);
//file where the output will be stored
int out_fd;
creat("output.txt", 0644);
out_fd = open("output.txt", O_WRONLY);
//sets file size in header (needed for decompression, this is the size of the
//file before compression. everything after this we write this 4-byte int
//is a 1 byte char
write(out_fd, &file_bytes, 4);
unsigned int writer;
unsigned int temp;
unsigned char out_char;
int i;
int shift_count = 8;
for(i = 0; i < file_bytes; i++){
if(shift_count == 8){
writer = read_buffer[i];
temp = temp & 0x00000000;
temp = read_buffer[i+1] << 8;
shift_count = 1;
}else{
//moves the next char's bits to the left, for the purpose of filling the
//8 bit buffer (writer) via OR operation
temp = read_buffer[i] << 8;
}
temp = temp >> shift_count;
writer = writer | temp;
//output right byte of writer
unsigned int right_byte = writer & 0x000000ff;
//output right_byte as a char
out_char = (char) right_byte;
//write_buffer[i] = out_char;
write(out_fd, &out_char, 1);
//clear right side of writer
writer = writer & 0x0000ff00;
//shift left side of writer to the right by 8
writer = writer >> 8;
shift_count++;
}
return 0;
}
It seems to me that input and output are too strongly coupled.
At some point, the program should be reading (roughly) the 80th octet from the input and writing (roughly) the 70th octet to the output, because you want to (on average) write 7 bits out for every 8 bits you read in, right?
What the loop
for(i = 0; i < file_bytes; i++){
...
... = read_buffer[i];
...
write(out_fd, &out_char, 1);
...
}
actually seems to be doing is:
On the 70th pass through the loop -- when 70==i --
it's reading the 70th octet from the input and writing the 70th octet to the output.
On the 80th pass through the loop -- when 80==i --
it's reading the 80th octet from the input and writing the 80th octet to the output.
You must decide:
Do you want "i" to represent the number of input characters processed, or the number of output chars processed?
Because it's not possible to do both -- it's not possible to have 70 equal 80.
Perhaps something like this is closer to what you wanted:
/* test.c
http://stackoverflow.com/questions/15080239/c-how-to-fix-this-algorithm-for-z827-ascii-compression
WARNING: untested code.
*/
int compress(char *filename_ptr){
int in_fd;
in_fd = open(filename_ptr, O_RDONLY);
//set pointer to the end of the file, find file size, then reset position
//by closing/opening
unsigned int file_bytes = lseek(in_fd, 0, SEEK_END);
close(in_fd);
in_fd = open(filename_ptr, O_RDONLY);
//store file contents in buffer
unsigned char read_buffer[file_bytes];
read(in_fd, read_buffer, file_bytes);
//file where the output will be stored
int out_fd;
creat("output.txt", 0644);
out_fd = open("output.txt", O_WRONLY);
//sets file size in header (needed for decompression, this is the size of the
//file before compression. everything after this we write this 4-byte int
//is a 1 byte char
write(out_fd, &file_bytes, 4);
unsigned int writer = 0; // no bits buffered yet
unsigned int temp;
unsigned char out_char;
int i;
int writer_bits = 0; // 0 bits of data in writer so far
for(i = 0; i < file_bytes; i++){
// i is the number of (7 bit ASCII) characters
// read from the input so far.
// add 7 more bits to the writer
temp = read_buffer[i];
//moves the next char's bits to the left, for the purpose of filling the
//8 bit buffer (writer) via OR operation
//(avoid overwriting the "writer_bits" of good bits
//already in the buffer).
temp = read_buffer[i] << writer_bits;
writer = writer | temp;
writer_bits = writer_bits + 7;
//output right byte of writer
unsigned int right_byte = writer & 0x000000ff;
//output right_byte as a char
out_char = (unsigned char) right_byte;
// output 8 bits of data whenever
// we have *at least* 8 bits of data in the writer buffer.
if(writer_bits >= 8){
//write_buffer[i] = out_char;
write(out_fd, &out_char, 1);
//shift left side of writer to the right by 8
writer = writer >> 8;
writer_bits = writer_bits - 8;
}else{
// 7 or fewer bits in writer --
// skip writing until next time.
}
}
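// are there any leftover bits still in writer?
if(writer_bits > 0){
// recompute the byte: out_char still holds the value from before the last shift
out_char = (unsigned char) (writer & 0x000000ff);
write(out_fd, &out_char, 1);
}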
return 0;
}
(Currently the program reads the entire input file into RAM, then writes the entire output file. Some programmers prefer to read a little at a time, then write a little at a time. Both approaches have advantages and disadvantages).
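The same bit-accumulator idea can be isolated from the file handling, which makes it easier to test on its own. The sketch below is a hypothetical pack7 function working on in-memory buffers; it assumes the 7-bit groups are packed least significant bits first and has not been checked against the expected output shown in the question, so the bit order may need adjusting for the actual format.

```c
#include <stddef.h>

/* Pack the low 7 bits of each input byte into a contiguous bit stream,
   least significant bits first. `out` must have room for
   (in_len * 7 + 7) / 8 bytes. Returns the number of bytes written. */
size_t pack7(const unsigned char *in, size_t in_len, unsigned char *out)
{
    unsigned int acc = 0;      /* bit accumulator */
    unsigned int acc_bits = 0; /* number of valid bits in acc */
    size_t out_len = 0;        /* output index, independent of the input index */

    for (size_t i = 0; i < in_len; i++) {
        acc |= (unsigned int)(in[i] & 0x7f) << acc_bits;
        acc_bits += 7;
        if (acc_bits >= 8) {
            out[out_len++] = (unsigned char)(acc & 0xff);
            acc >>= 8;
            acc_bits -= 8;
        }
    }
    if (acc_bits > 0)          /* flush the final partial byte */
        out[out_len++] = (unsigned char)(acc & 0xff);
    return out_len;
}
```

Keeping a separate out_len counter is the point of the design: the input index advances once per character, while the output index advances only when a full byte is available, so eight input characters yield seven output bytes.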