Convert buffer to array of 8-bit values and back in C

I would like to write a program in C that gets the file content via stdin and reads it line by line and, for each line, converts it to an array of 8-bit integer values.
I also would like to be able to do the reverse process. After working with my array of 8-bit values, I would like to convert it again to "lines" that would be organized as a new buffer.
So basically, I would like to convert a char * line to an int array[] and back (an int array[] to a char * line) while keeping the content consistent, so that when I recreate the file from the conversions, the file is valid. By valid I mean that converting the int array[] back to a char * line yields the same content as the original char * line, for each line read from stdin.
My code is currently as follows:
#include <stdio.h>
#include <stdlib.h>
int main() {
FILE *stream;
char *line = NULL;
size_t len = 0;
ssize_t read;
stream = stdin;
if (stream == NULL)
exit(EXIT_FAILURE);
while ((read = getline(&line, &len, stream)) != -1) {
char * array = line_to_array(line);
// here I include the rest of my code
// where I am going to use the generated array
// ...
}
free(line);
fclose(stream);
exit(EXIT_SUCCESS);
}
The line_to_array function would be the one to convert the "line" content to the array of integers. In a second file, I would just do the opposite.
The mechanics of the process would be like this:
The first program (first.c) would receive the file content via stdin. Reading it with getline, I would convert each line to an array of integers and send it to a second program (second.c), which would convert each array back to a char * buffer and then reconstruct the file.
In the terminal, I would run it like this:
./first | ./second
I appreciate any help on this matter.
Thank you.

I believe you may already know that the name of an array behaves much like a constant pointer to its first element. You can verify this with the following code:
char hello[] = "hello world!";
for( int idx=0; *(hello + idx) != 0; idx++ )
{
printf("%c", *(hello + idx));
}
printf("\n");
So there is no reason to convert a character pointer to an array. For your information, a char in C is 8 bits on virtually every platform (strictly, at least 8 bits) and can hold an integer value that represents a character: 65 represents 'A' in ASCII.
Secondly, this link may help you understand how to convert between a C string and std::string.
On second thought, your input file may be Unicode or UTF-8 encoded and use multi-byte character codes. In that case, you may not be able to use getline() to read the string from the file. If so, please refer to this question: Reading unicode characters.
I hope the following code helps you understand the char type, arrays and pointers in C/C++:
std::string hello("Hello world");
const char *ptr = hello.c_str();
for( int idx=0; idx < hello.size(); idx++ )
{
printf("%3d ", *(ptr + idx));
}
printf("\n");
std::string hello("Hello world");
const char *ptr = hello.c_str();
for( int idx=0; idx < hello.size(); idx++ )
{
printf("%3d ", ptr[idx]);
}
printf("\n");
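For the round trip the question asks about, a minimal sketch in plain C could look like the following. The names line_to_array and array_to_line follow the question, but the signatures (an explicit length, uint8_t as the 8-bit type) are assumptions made for illustration. Since a char is already a byte, the "conversion" is just a copy.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the first n bytes of a line into a newly
   allocated array of 8-bit values. The caller frees the result. */
uint8_t *line_to_array(const char *line, size_t n)
{
    uint8_t *array = malloc(n ? n : 1);   /* avoid a zero-size request */
    if (array != NULL)
        memcpy(array, line, n);
    return array;
}

/* Hypothetical inverse: copy n bytes back into a NUL-terminated buffer.
   The caller frees the result. */
char *array_to_line(const uint8_t *array, size_t n)
{
    char *line = malloc(n + 1);
    if (line != NULL) {
        memcpy(line, array, n);
        line[n] = '\0';
    }
    return line;
}
```

In the question's loop, the value returned by getline (read) would be the natural choice for n, and fwrite(array, 1, n, stdout) would pass the bytes on to the second program unchanged.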

Related

reading and printing unsigned int from file

I am trying to print and read unsigned ints from a .txt file. I am using fprintf to print the unsigned int (manually checking the file shows the expected values), but when reading it back I get a weird offset on all the values I read (the same offset for every read within a run, but a different one on each run of the program). Here is the reading code I am using:
unsigned int tempDuration = (unsigned int)fileFgets(file);
and this is fileFgets:
char tempStr[MAX_STR_SIZE] = { 0 };
char* str = 0;
fgets(tempStr, MAX_STR_SIZE, file);
tempStr[strcspn(tempStr, "\n")] = 0;
str = (char*)malloc(strlen(tempStr) * sizeof(str));
strcpy(str, tempStr);
return str;
I am using this function because it is meant to read both strings and unsigned ints, separated by '\n', but I am open to using different solutions for both or either. (Reading the strings works as intended.)
Casting from an array of characters to an unsigned integer will actually cast the pointer and not the string itself. You need to convert it using strtoul().
Replacing the '\n' character isn't required because strtoul stops at the first character that is not a valid digit.
I modified your function :
unsigned int fileFgets(FILE *file)
{
char tempStr[MAX_STR_SIZE] = { 0 };
if (fgets(tempStr, MAX_STR_SIZE, file) == NULL)
return 0; // nothing could be read
return (unsigned int)strtoul(tempStr, NULL, 0);
}
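If you also want to tell "no number on this line" apart from a real 0, a sketch along these lines may help. The helper name parse_uint and its return convention are made up for illustration; the strtoul, errno and UINT_MAX usage is standard C.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse an unsigned int from a line of text.
   Returns 1 on success and stores the value in *out, 0 otherwise. */
int parse_uint(const char *text, unsigned int *out)
{
    char *end;
    unsigned long value;

    errno = 0;
    value = strtoul(text, &end, 10);
    if (end == text)                          /* no digits were found */
        return 0;
    if (errno == ERANGE || value > UINT_MAX)  /* value does not fit */
        return 0;
    *out = (unsigned int)value;
    return 1;
}
```

The end pointer is what makes the check possible: strtoul leaves it equal to the start of the text when nothing could be converted.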

Cannot get Call to Function Working

I found this piece of code at Reading a file character by character in C; it compiles and is what I wish to use. My problem is that I cannot get the call to it working properly. The code is as follows:
char *readFile(char *fileName)
{
FILE *file = fopen(fileName, "r");
char *code;
size_t n = 0;
int c;
if (file == NULL)
return NULL; //could not open file
code = malloc(1500);
while ((c = fgetc(file)) != EOF)
{
code[n++] = (char) c;
}
code[n] = '\0';
return code;
}
I am not sure of how to call it. Currently I am using the following code to call it:
.....
char * rly1f[1500];
char * RLY1F; // This is the Input File Name
rly1f[0] = readFile(RLY1F);
if (rly1f[0] == NULL) {
printf ("NULL array); exit;
}
int n = 0;
while (n++ < 1000) {
printf ("%c", rly1f[n]);
}
.....
How do I call the readFile function such that I have an array (rly1f) which is not NULL? The file RLY1F exists and has data in it. I have successfully opened it previously using 'in line code' not a function.
Thanks
The error you're experiencing is that you forgot to pass a valid filename. So either the program crashes, or fopen tries to open a garbage name and returns NULL.
char * RLY1F; // This is not initialized!
RLY1F = "my_file.txt"; // initialize it!
The next problem you'll have will be in your loop to print the characters.
You have defined an array of pointers char * rly1f[1500];
You read 1 file and store it in the first pointer of the array rly1f[0]
But when you display it you display the pointer values as characters which is not what you want. You should just do:
while (n < 1000) {
printf ("%c", rly1f[0][n]);
n++;
}
Note: that would not crash but would print garbage if the file read is shorter than 1000 characters.
(BLUEPIXY suggested moving the increment of n to the end of the loop body, by the way; otherwise the first character is skipped.)
Or do it more simply: since your string is nul-terminated, pass the array to puts:
puts(rly1f[0]);
EDIT: you have a problem when reading your file too. You malloc 1500 bytes, but you read the file fully. If the file is bigger than 1500 bytes, you get buffer overflow.
You have to compute the length of the file before allocating the memory. For instance like this (using stat would be a better alternative maybe):
char *readFile(char *fileName, unsigned int *size) {
...
fseek(file,0,SEEK_END); // set pos to end of file
*size = ftell(file); // get pos, i.e. size
rewind(file); // set pos to 0
code = malloc(*size+1); // allocate the proper size plus one
Notice the extra parameter, which allows you to return the size as well as the file data.
Note: on Windows systems, text files use \r\n (CRLF) to delimit lines, so the allocated size will be higher than the number of characters read if you use text mode (\r\n is converted to \n, so there are fewer chars in your buffer; you could consider a realloc once you know the exact size to shave off the unused allocated space).
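Put together, the size-first approach might look like the sketch below. It is only one possible completion: the name read_whole_file is hypothetical, and opening in binary mode ("rb") is my own choice so that the size reported by ftell matches the bytes read; the value stored in *size is whatever fread actually returned.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole file into a NUL-terminated buffer.
   On success returns the buffer (caller frees it) and stores the number
   of bytes read in *size. Returns NULL on any failure. */
char *read_whole_file(const char *file_name, size_t *size)
{
    FILE *file = fopen(file_name, "rb");
    char *code = NULL;
    long length;

    if (file == NULL)
        return NULL;                     /* could not open file */
    if (fseek(file, 0, SEEK_END) == 0 && (length = ftell(file)) >= 0) {
        rewind(file);
        code = malloc((size_t)length + 1);   /* plus one for the '\0' */
        if (code != NULL) {
            *size = fread(code, 1, (size_t)length, file);
            code[*size] = '\0';
        }
    }
    fclose(file);
    return code;
}
```

Terminating the buffer at the count returned by fread, rather than at the size from ftell, keeps the string valid even when fewer bytes arrive than expected.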

C text parser does not identify word

I'm trying to do a simple parsing of text with a C program. A function I have written is supposed to check the buffer that has a line of text saved into it and see if this line contains a particular word at BOL.
The input arguments are:
size: the sizeof(word), calculated before the function is called.
buf: the buffer containing a line from the text being parsed.
word: the word that the function looks for at BOL.
The code is as follows:
#include <stdio.h>
#include <string.h>
int strchk(int size, const char buf[1024], char *word) {
char a[size];
int i;
for (i = 0; i < size - 1; i++) {
a[i] = buf[i];
}
if (strcmp(a, word) == 0)
return 1;
else
return 0;
}
The problem is that for some reason, a word is not being recognized. Previous words have been correctly identified by the same function. Below are two contexts wherein the function is being called, the first one results in a correct identification, the second does not, while the text contains both words at the start of different lines within the text.
char c[] = "|conventional_long_name";
if (strchk(sizeof(c), buf, c)) {
fputs(" conventional_long_name: \"", stdout);
getdata(buf, c, sizeof(c));
}
char d[] = "|official_languages";
if (strchk(sizeof(d), buf, d)) {
fputs(" religion: \"", stdout);
getdata(buf, d, sizeof(d));
}
When I check string a in the strchk() function for size first, it gives me a size of 20, but if I make it print out the string it tells me it is in fact |official_languagesfici. When you count the number of characters it's just as long as the previously mentioned |conventional_long_name, which would suggest some parameter from that function call is at play in the next function call, I just can't figure out where I have made the mistake. Any help would be greatly appreciated.
You need to set the null terminator of the a array in the function strchk.
Do it using
a[size - 1] = '\0';
after the for-loop.
Notes:
Since you don't modify the string word in the function strchk, declare the parameter const. Const-correctness is important!
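As an alternative to copying into a temporary array at all, a prefix test can be written directly with strncmp. This is a sketch of a different technique than the one in the question; the name starts_with is hypothetical.

```c
#include <string.h>

/* Hypothetical replacement for strchk: returns 1 when buf starts with
   word, 0 otherwise. There is no temporary copy, so there is no
   terminator to forget. */
int starts_with(const char *buf, const char *word)
{
    return strncmp(buf, word, strlen(word)) == 0;
}
```

The call site would then become starts_with(buf, d), with no need to pass sizeof(d).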

How to read a char array stored in a file, into a char buffer during runtime

I'm working in C and I'm modifying existing code.
I have a char array which is stored in a file as follows:
"\x01\x02\x03"
"\x04\x05\x06"
"\x07\x08\x09"
In the original source code this char array is included as follows:
const static char chs[] =
#include "file.h"
;
I'm modifying this code to load the file into a char array during runtime (to get the exact same result as with the above approach) instead of it to be included by the pre-processor. My first approach was to simply read the file into a char buffer, as follows:
FILE *fp;
const char *filename = "file.h";
fp = fopen (filename, "rb");
assert(fp != NULL);
fseek(fp, 0L, SEEK_END);
long int size = ftell(fp);
rewind(fp);
// read entire file into the buffer
char *buffer = (char*)malloc(sizeof(char) * size);
size_t nrOfBytesRead = fread(buffer, 1, size, fp);
However I've quickly discovered that this is not correct. The file already contains the exact code representation of the char array, I cannot simply read it into a char buffer and get the same result as the include approach.
What is the best way to get my char array which is stored in file, into a char array during runtime?
As you've seen, when you read the file using fread it reads it byte for byte. It doesn't get any of the syntactic processing that the compiler does on your source files. It doesn't know that strings live inside of quotes. It doesn't map escape sequences like \x01 into single bytes.
You have several different possibilities for fixing this:
Teach your program how to do that processing as it reads the file. This would be a fair amount of work.
Put just the bytes you want into the file.
Pick a different encoding for your file.
To say a little more about #2: If you don't want to change your file-reading code, what you can do is to create an (in this case) 9-byte file containing just the nine bytes you want. Since your nine bytes are not text, it'll end up being a "binary" file, which you won't be able to straightforwardly edit with an ordinary text editor, etc. (In fact, depending on the tools you have available to you, it might be challenging just to create this particular 9-byte file.)
So if you can't use #1 or #2, you might want to go with #3: pick a brand-new way to encode the data in the file, easier to parse than #1, but easier to prepare than #2. My first thought would be to have the file be hexadecimal. That is, the file would contain
010203040506070809
or
010203
040506
070809
Your file-reading code, instead of the single call to fread, would read two characters at a time and assemble them into bytes for your array. (I'd sketch this out for you, but the compilation I was waiting for has finished, and I ought to get back to my job.)
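For what it's worth, the two-characters-at-a-time idea for the hexadecimal format could be sketched roughly as follows. The helper names hex_digit and decode_hex are made up for illustration, the input is assumed to be already in memory as a string, and the letter handling assumes an ASCII-compatible character set.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_digit(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper: decode hex text such as "010203\n040506" into out.
   Whitespace between pairs is skipped. Returns the number of bytes
   written, or -1 on a malformed pair or when out is too small. */
long decode_hex(const char *text, unsigned char *out, size_t out_size)
{
    size_t n = 0;

    while (*text != '\0') {
        int hi, lo;
        if (isspace((unsigned char)*text)) { text++; continue; }
        hi = hex_digit((unsigned char)text[0]);
        lo = (hi < 0) ? -1 : hex_digit((unsigned char)text[1]);
        if (hi < 0 || lo < 0 || n >= out_size)
            return -1;
        out[n++] = (unsigned char)(hi * 16 + lo);   /* assemble one byte */
        text += 2;
    }
    return (long)n;
}
```

Because newlines are skipped as whitespace, the same function handles both the single-line and the three-line layout of the file.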
This should read the hex values from the file and save them to buffer.
fgets() reads each line from the file.
sscanf() reads each hex value from the line.
The format string for sscanf, "\\x%x%n", scans the backslash, an x, the hex value and stores the number of characters processed by the scan. The number of characters processed is used to advance through the line. This is needed if some lines have a different number of hex values.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char line[100] = {'\0'};
unsigned char *buffer = NULL;
unsigned char *temp = NULL;
unsigned int hex = 0;
int size = 0;
int offset = 0;
int used = 0;
int bufferused = 0;
int increment = 100;
int each = 0;
FILE *pf = NULL;
if ( ( pf = fopen ( "file.h", "r")) != NULL) {
while ( fgets ( line, sizeof ( line), pf)) {//get each line of the file
offset = 1;//to skip leading quote
//sscanf each hex value in the line
while ( ( sscanf ( line + offset, "\\x%x%n", &hex, &used)) == 1) {
offset += used;// to advance through the line
if ( bufferused >= size) {
temp = realloc ( buffer, size + increment);
if ( temp == NULL) {
//one way to handle the failure
printf ( "realloc failed\n");
free ( buffer);
exit (1);
}
buffer = temp;
size += increment;
}
buffer[bufferused] = hex;
bufferused++;
}
}
fclose ( pf);
}
for ( each = 0; each < bufferused; each++) {
printf ( "%x\n", buffer[each]);
}
free ( buffer);
return 0;
}

Correctly store content of file by line to array and later print the array content

I'm getting some issues with reading the content of my array. I'm not sure if I'm storing it correctly as my result for every line is '1304056712'.
#include <stdio.h>
#include <stdlib.h>
#define INPUT "Input1.dat"
int main(int argc, char **argv) {
int data_index, char_index;
int file_data[1000];
FILE *file;
int line[5];
file = fopen(INPUT, "r");
if(file) {
data_index = 0;
while(fgets(line, sizeof line, file) != NULL) {
//printf("%s", line); ////// the line seems to be ok here
file_data[data_index++] = line;
}
fclose(file);
}
int j;
for(j = 0; j < data_index; j++) {
printf("%i\n", file_data[j]); // when i display data here, i get '1304056712'
}
return 0;
}
I think you need to say something like
file_data[data_index++] = atoi(line);
From your results I assume the file is a plain-text file.
You cannot simply read the line from file (a string, an array of characters) into an array of integers, this will not work. When using pointers (as you do by passing line to fgets()) to write data, there will be no conversion done. Instead, you should read the line into an array of chars and then convert it to integers using either sscanf(), atoi() or some other function of your choice.
fgets reads newline terminated strings. If you're reading binary data, you need fread. If you're reading text, you should declare line as an array of char big enough for the longest line in the file.
Because file_data is an array of int, file_data[data_index] is a single integer. It is being assigned a pointer (the base address of the int line[5] buffer), which is why every entry prints the same meaningless number. If reading binary data, file_data should be an array of integers. If reading strings, it should be an array of strings, i.e. char pointers, like char * file_data[1000].
You also need to initialize data_index=0 outside the if (file) ... block, because the output loop needs it to be set even if the file failed to open. And when looping and storing input, the loop should test that it has not reached the size of the array being stored into.
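Combining the points above (a char buffer for fgets, an explicit text-to-integer conversion, and a bounds check on the destination array), a sketch could look like this. The function name read_ints and the 64-byte line buffer are assumptions; it uses strtol rather than atoi so that lines without a number can be detected.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one integer per line from file into data,
   storing at most max values. Returns the number of values stored. */
size_t read_ints(FILE *file, int *data, size_t max)
{
    char line[64];          /* assumed large enough for one number per line */
    size_t count = 0;

    while (count < max && fgets(line, sizeof line, file) != NULL) {
        char *end;
        long value = strtol(line, &end, 10);
        if (end != line)    /* skip lines that hold no number */
            data[count++] = (int)value;
    }
    return count;
}
```

In the question's main, this would be called as read_ints(file, file_data, 1000) after a successful fopen, and the returned count would bound the printing loop.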

Resources