Splitting string in C by null terminator - c

I am currently trying to implement a program that intakes a file, reads the file and copies its contents to an array (farray). After this we copy the contents of farray as strings separated by null terminators into a string array called sarray.
For example, say farray contains "ua\0\0Z3q\066\0", then sarray[0] should contain "ua", sarray[1] should contain "\0", sarray[2] should contain "Z3q", and finally sarray[3] should contain "66"
However, I cannot figure out how to separate the string at the null terminators. I can currently only use standard I/O functions such as fread, fopen, fclose, fwrite, etc. Can someone please help me?
src code:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
int main(int argc, char *argv[]){
char *farray;
const char *sarray;
long length;
int i;
//Open the input file
FILE *input = fopen(argv[1], "rb");
if(!input){
perror("INPUT FILE ERROR");
exit(EXIT_FAILURE);
}
//Find the length
fseek(input, 0, SEEK_END);
length = ftell(input);
fseek(input, 0, SEEK_SET);
//Allocate memory for farray and sarray
farray = malloc(length + 1);
//Read the file contents to farray then close the file
fread(farray, 1, length, input);
fclose(input);
//Do string splitting here
//Free the memory
free(farray);
return 0;
}

1. Keep the number of characters, length, for further use.
2. What characters do you want to replace by the null characters? Based on that, walk through farray and replace the appropriate characters by the null character. While doing that, count the number of characters that were replaced by the null character.
3. If the number of characters that were replaced by the null character is N, then the array of pointers needs to be of size N+1.
4. Allocate memory for the array of pointers.
5. Walk through farray again and make sure the elements in the array of pointers point to the right locations in farray.
Update, in response to OP's comment
In step (2) above, don't replace anything, just compute N.
In step (5), use strdup and assign the returned value to the elements of the array of pointers instead of pointing into farray.
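The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: split_on_nul and free_pieces are hypothetical names, the buffer is passed together with its length (as read by fread), a trailing null byte is taken to end the last string rather than start a new empty one, and malloc plus memcpy stands in for strdup so that an unterminated tail is copied safely.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: releases an array produced by split_on_nul. */
void free_pieces(char **pieces, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        free(pieces[i]);
    free(pieces);
}

/* Hypothetical helper: splits buf[0..len) at every '\0' byte.
 * Returns a malloc'ed array of malloc'ed strings and stores the number
 * of strings in *count. Returns NULL if len is 0 or allocation fails. */
char **split_on_nul(const char *buf, size_t len, size_t *count)
{
    size_t pieces = 0, start = 0, k = 0, i;
    char **out;

    *count = 0;
    if (len == 0)
        return NULL;

    /* First pass: count the separators (this is N in the steps above). */
    for (i = 0; i < len; i++)
        if (buf[i] == '\0')
            pieces++;
    if (buf[len - 1] != '\0')
        pieces++;                 /* data after the last '\0' is one more piece */

    out = malloc(pieces * sizeof *out);
    if (out == NULL)
        return NULL;

    /* Second pass: copy each piece into its own allocation. */
    for (i = 0; i <= len && k < pieces; i++) {
        if (i == len || buf[i] == '\0') {
            size_t n = i - start;
            out[k] = malloc(n + 1);
            if (out[k] == NULL) {
                free_pieces(out, k);
                return NULL;
            }
            memcpy(out[k], buf + start, n);
            out[k][n] = '\0';
            k++;
            start = i + 1;
        }
    }
    *count = k;
    return out;
}
```

In the question's program this would be called as split_on_nul(farray, (size_t)length, &count) after fread, and the result released with free_pieces once it is no longer needed. Two consecutive null bytes produce an empty string, which matches the sarray[1] entry in the example.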

Related

How to insert a number in the place I need in a string, in the C programming language

I'm just starting to write C and ran into a small problem that has more to do with algorithms than with the features of the language.
In my program, the task is to insert the file size after the _ symbol, if the file name contains one.
I don't quite understand how this can be implemented. Maybe someone can tell me whether there is a ready-made algorithm for inserting a number into a string (an array of characters)?
Here is an example of my code, with comments on where and what is being done:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <string.h>
int main(){
DIR * p;
p = opendir("."); // Open catalog
if(p!=NULL){ // check on error
struct dirent * dir;
while((errno=0, dir=readdir(p))){ // Reading catalog
struct stat infoAboutFile;
int res = stat(dir->d_name, &infoAboutFile);
if(res==0){ // check on error
if(S_ISREG(infoAboutFile.st_mode)){ // Check on regular files
char str[256];
strcpy(str,dir->d_name);
int size = infoAboutFile.st_size; //Size file
printf("\n");
}
}else{
perror("Errors in Stat");
}
}
if(errno!=0){ // check on error
perror("Errors in Readdir");
}
int res = closedir(p);
if(res==-1){ // check on errorr
perror("Errors in Closedir");
}
}else{
perror("Errors in Opendir");
}
return 0;
}
The simplest way is to use sprintf:
sprintf( str, "%s_%d", dir->d_name, size );
You will need to make sure str is wide enough for the final string.
EDIT
William Pursell points out in the comments that there's also snprintf, where you specify the maximum number of characters to write to the target buffer such that you avoid an accidental buffer overflow:
snprintf( str, sizeof str, "%s_%d", dir->d_name, size );
Personally, I prefer making sure the target buffer is large enough for the final string, rather than risk truncating the string. The first thing we need to do is compute how large the buffer will need to be:
size_t str_len = strlen( dir->d_name ) // length of d_name
+ 10 // number of decimal digits in a 32-bit int
+ 1; // for the '_' character
If your compiler supports variable-length arrays, then we just declare an array with size str_len:
char str[ str_len + 1 ]; // +1 for the string terminator
Otherwise, we need to allocate that buffer dynamically:
char *str = malloc( str_len + 1 ); // +1 for the string terminator
Then you can use regular sprintf:
sprintf( str, "%s_%d", dir->d_name, size );
If you use malloc, then you will have to remember to deallocate str with free when you're done with it.
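The sprintf calls above append "_size" to the whole name. If the number has to land directly after an underscore that is already inside the name, as the question describes, one possible sketch is shown below. It assumes a hypothetical helper name, insert_size_after_underscore, uses only the first underscore, and lets snprintf with a NULL buffer report the required length (standard since C99) instead of estimating the digit count.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'ed copy of name with size inserted
 * right after the first '_'. If name has no '_', a plain copy is returned.
 * Returns NULL on failure; the caller frees the result. */
char *insert_size_after_underscore(const char *name, long long size)
{
    const char *us = strchr(name, '_');
    size_t head;
    int need;
    char *out;

    if (us == NULL) {
        size_t len = strlen(name);
        out = malloc(len + 1);
        if (out != NULL)
            memcpy(out, name, len + 1);
        return out;
    }

    head = (size_t)(us - name) + 1;       /* characters up to and including '_' */

    /* Ask snprintf how many characters the result needs. */
    need = snprintf(NULL, 0, "%.*s%lld%s", (int)head, name, size, name + head);
    if (need < 0)
        return NULL;

    out = malloc((size_t)need + 1);       /* +1 for the string terminator */
    if (out == NULL)
        return NULL;

    snprintf(out, (size_t)need + 1, "%.*s%lld%s", (int)head, name, size, name + head);
    return out;
}
```

With this sketch, insert_size_after_underscore("report_final.txt", 1234) yields "report_1234final.txt". In the question's loop it could be called with dir->d_name and (long long)infoAboutFile.st_size.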

Inaccurate file readings from fopen and/or fscanf

I am trying to read a file with this code. I want to load images and store them in my program as strings, so that I can later recreate the identical image by writing to a new file with fprintf. I am not allowed to use any file-duplication facility; I need to load the file as a string and write it to a new file later. My plan is to use a char array: since one char is one byte, the array is as long as the file, and each element corresponds to one byte of the diamond block texture. I then want to write this array from the code to a new file and get another diamond block that I can open with an image viewer.
#include <stdio.h>
#include <stdlib.h>
char Contents[468];
int main(int argc, char *argv[]) {
char *WD = getenv("HOME");
char Path[strlen(WD)+strlen("/Desktop/diamond_block.png")+1];
sprintf(Path, "%s/Desktop/diamond_block.png", WD);
FILE *File = fopen(Path, "r");
fscanf(File, "%s", Contents);
printf(Contents);
}
The result is just four letters, âPNG, and it is supposed to be hundreds of characters meaning the file is NOT being fully read. I suspect it is somehow being terminated early by some terminating character, but how can I solve my problem?
This is a very basic answer to your question. The code below should help you understand what your issue is. It still needs a review to handle every error that the functions it uses can return. By the way ... enjoy it!
The code loads the whole file fname into the char array imgMem. It computes the file size in the variable n, allocates the memory for the array imgMem (malloc) and then loads the whole file into imgMem (fread).
Then the code prints the first 30 bytes of the file in two formats:
the hex value of the byte
the char value if the byte has a console representation (otherwise it prints a .)
Here is the code:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    const char * fname = "/home/sergio/Pictures/vpn.png";
    FILE * fptr;
    char * imgMem=NULL;
    long n;
    int i;
    //Open in binary mode so that no byte is translated
    fptr=fopen(fname, "rb");
    if (fptr==NULL) {
        perror(fname);
        return EXIT_FAILURE;
    }
    //Determine the file size
    fseek(fptr,0,SEEK_END); n=ftell(fptr);
    //Set the file cursor to the beginning
    fseek(fptr,0,SEEK_SET);
    printf("The file is %ld bytes long.\n\n",n);
    //Allocate n bytes to load the file
    imgMem = malloc((size_t)n);
    if (imgMem==NULL) {
        fclose(fptr);
        return EXIT_FAILURE;
    }
    //Load the file
    fread(imgMem,1,(size_t)n,fptr);
    //Print at most the first 30 bytes
    for(i=0; i<30 && i<n; i++) {
        printf("[%02X %c] ",
            (unsigned char)imgMem[i],
            (imgMem[i]>31 && imgMem[i]<127)?
            imgMem[i]:'.'
        );
        if ((i+1)%8==0)
            puts("");
    }
    puts("");
    free(imgMem);
    fclose(fptr);
    return 0;
}
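For the other half of the question, writing the bytes back out, the same idea works in reverse with fwrite. The sketch below is a minimal illustration, not a complete tool: load_file and save_file are hypothetical helper names, and the data is treated as raw bytes with an explicit length rather than as a C string. That matters here because a PNG file starts with the bytes 89 50 4E 47 0D 0A 1A 0A: fscanf with %s stops at the first whitespace byte (the 0D), and printf stops at the first zero byte, which is why only four characters showed up.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file into a malloc'ed buffer.
 * Stores the byte count in *len. Returns NULL on any failure. */
unsigned char *load_file(const char *path, size_t *len)
{
    FILE *f = fopen(path, "rb");          /* binary mode: bytes are not translated */
    unsigned char *buf = NULL;
    long n;

    if (f == NULL)
        return NULL;

    if (fseek(f, 0, SEEK_END) == 0 && (n = ftell(f)) >= 0 &&
        fseek(f, 0, SEEK_SET) == 0) {
        buf = malloc(n > 0 ? (size_t)n : 1);
        if (buf != NULL && fread(buf, 1, (size_t)n, f) != (size_t)n) {
            free(buf);                    /* short read: treat as failure */
            buf = NULL;
        }
        if (buf != NULL)
            *len = (size_t)n;
    }
    fclose(f);
    return buf;
}

/* Hypothetical helper: writes len bytes to path. Returns 0 on success. */
int save_file(const char *path, const unsigned char *data, size_t len)
{
    FILE *f = fopen(path, "wb");
    size_t written;

    if (f == NULL)
        return -1;
    written = fwrite(data, 1, len, f);
    if (fclose(f) != 0 || written != len)
        return -1;
    return 0;
}
```

Calling load_file on the image and then save_file with the returned buffer and length should produce a byte-for-byte copy, since neither function interprets the contents.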

printf only displays 24 characters of char*

I am doing a project of creating a bot that surfs the internet.
I have to code it in C, and for now I'm focusing on choosing the address it will go to (chosen from a list in a file). This works properly, but when I display the addresses the bot has chosen, some are truncated to 24 characters and end with "!", which makes the code unusable with long addresses. Does anyone have any idea where this might come from?
The program :
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
#include <math.h>
int main() {
FILE* file = fopen("test.txt", "r+");
char *line = NULL;
char *tab[1023];
int tailleTab = 0;
line = malloc(sizeof(*line));
if(line == NULL)
return(EXIT_FAILURE);
while((fgets(line, 1023, file)) != NULL ) {
if(line[0] != '#' && line[0] != '\n') {
tab[tailleTab] = line;
line = malloc(sizeof(*line));
tailleTab++;
}
}
srand(time(NULL));
int n = rand()%tailleTab;
printf("\n%d = %.32s\n", n, tab[n]);
printf("%s\n", tab[n]);
fclose(file);
}
The file from which the address is chosen:
www.google.com
www.wikipedia.org
www.dahunicorn.xyz
www.cloudimperiumgames.com
www.robertspaceindustries.com
www.candybox2.net
www.42.com
www.1337.com
The main problem is this:
line = malloc(sizeof(*line));
This only allocates a single character to line. The expression *line is a char which means you allocate sizeof(char) bytes, and sizeof(char) is defined to always be 1.
That means your call to fgets will write out of bounds of your allocated memory and you will have undefined behavior.
There's no reason to actually allocate line dynamically. Instead, create it as an array, and then use strdup when saving it in the tab array. Either that, or allocate more memory (1023 is a good number, since that's the amount you pass to fgets).
As already explained in another answer, with this code:
line = malloc(sizeof(*line));
you are allocating with malloc a single char on the heap, since the expression *line is equivalent to a char (as line is declared as char *).
I would simplify your code using named constants instead of magic numbers like 1023 that are spread through code (and make it harder to maintain), in addition to just reserving space for the temporary line buffer on the stack instead of dynamically allocating it on the heap, e.g.:
/* Instead of: line = malloc(sizeof(*line)); */
#define LINE_MAX_SIZE 1024
char line[LINE_MAX_SIZE];
Also consider doing:
#define TAB_MAX_ITEMS 1023 /* or whatever limit you need */
char* tab[TAB_MAX_ITEMS];
In the while loop consider using LINE_MAX_SIZE instead of the magic number 1023:
while ((fgets(line, LINE_MAX_SIZE, file)) != NULL ) {
You may also want to add a check to the index in the tab array, to avoid buffer overruns:
if (tailleTab >= TAB_MAX_ITEMS) {
/* Index out of range */
...
}
/* tailleTab is a valid index.
* Deep-copy the line read in the temporary buffer
* and save a pointer to the copy into the tab array.
*/
tab[tailleTab] = strdup(line);
In production code you should also loop through the pointers stored in the tab array and call free on them to release the memory allocated on the heap.
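Put together, the suggestions from both answers could look something like the sketch below. The helper names read_lines and free_lines are hypothetical, and a few choices are assumptions rather than part of the answers: the trailing newline is stripped, reading stops once the array is full, and malloc plus memcpy is used in place of strdup so the code does not depend on POSIX.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE_MAX_SIZE 1024

/* Hypothetical helper: reads lines from fp, skips comments ('#') and blank
 * lines, and stores a heap copy of each remaining line (without its newline)
 * in tab. Stops when tab is full. Returns the number of lines stored. */
size_t read_lines(FILE *fp, char *tab[], size_t max_items)
{
    char line[LINE_MAX_SIZE];             /* temporary buffer, reused per line */
    size_t count = 0;

    while (count < max_items && fgets(line, sizeof line, fp) != NULL) {
        size_t len;

        if (line[0] == '#' || line[0] == '\n')
            continue;

        len = strcspn(line, "\n");        /* length without the newline */
        tab[count] = malloc(len + 1);
        if (tab[count] == NULL)
            break;                        /* out of memory: keep what we have */
        memcpy(tab[count], line, len);
        tab[count][len] = '\0';
        count++;
    }
    return count;
}

/* Hypothetical helper: releases the copies made by read_lines. */
void free_lines(char *tab[], size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        free(tab[i]);
}
```

Because every stored line has its own allocation, later calls to fgets can no longer overwrite earlier entries, and long addresses are kept whole up to the buffer size.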

Issue accessing contents of array of character pointers in C

I am going through The C Programming Language by K&R and trying to understand character pointers and arrays.
I am creating a function in C that reads multiple lines from stdin and stores the lines (char*) in an array of character pointers (char* []).
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
enum {MAXINPUT = 1024, MAXLINES = 100};
/* Reads at most `maxLines` lines and stores them in an array of char pointers. Returns number of lines read. */
int readlines(char* lineptr[], int maxLines);
/* Takes a single line input from stdin and stores it in str. Returns str length. */
int getInputLine(char* str, int maxInput);
int main(int argc, char** argv) { ... }
int readlines(char* lineptr[], int maxLines) {
/* Return number of lines read. */
int numLines = 0;
/* Buffer to store current line being read. */
char currentLine[MAXINPUT];
/* Terminate loop when enter is pressed at empty input or number of lines exceeds max. */
while(getInputLine(currentLine,MAXINPUT) && numLines < maxLines) {
/* Address of current line's first character is set to the appropriate index at lineptr. */
lineptr[numLines] = currentLine;
/* Both currentLine and lineptr[numLines] print accurately (note they are the same). */
printf("CURRENT LINE:\t %s\n",currentLine);
printf("lineptr[%d]:\t %s\n",numLines,lineptr[numLines]);
numLines++;
}
/* ISSUE: Outside the loop, lineptr does NOT print anything. */
printf("\nLOOPING\n");
for(int i = 0; i < numLines; i++) {
printf("%d: %s\n",i,lineptr[i]);
}
/* ISSUE: currentLine (which should be the last line entered) ALSO does not print outside the while. */
printf("\ncurrentLine: %s",currentLine);
return numLines;
}
My issue is that in the while(), the contents of lineptr and currentLine print accurately. But outside the while(), both lineptr and currentLine do not print anything.
And of course, this issue persists when I try to read lines into a char* [] in the main() and try to print its contents.
Why is it that the contents at the addresses being accessed by lineptr are printing inside the loop but not outside? Am I missing something obvious?
That's because you have a single buffer called currentLine into which you read text. Then you assign the address of currentLine to your lineptr[i], and proceed to overwrite its contents with new text. So, all your lineptrs essentially point to the same one location, which is the address of currentLine, and currentLine contains only the last line that you read. I suppose the loop does not print anything because the last line you read is empty.
So, to get this to work, you need to read a line into currentLine, measure its length, use malloc() to allocate enough memory for that line, copy the line from currentLine to the allocated memory, and store the pointer to the allocated memory in lineptr[i].
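The difference between storing the buffer's address and storing a copy can be seen in a small sketch. The two helpers below, store_alias and store_copy, are hypothetical names made up for this illustration; store_copy follows the measure, allocate, copy, store sequence described above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: stores the address of the shared buffer.
 * Every slot filled this way refers to the same memory. */
void store_alias(char *lineptr[], int idx, char *buffer)
{
    lineptr[idx] = buffer;
}

/* Hypothetical helper: stores a private heap copy of the buffer's
 * current contents. Returns 0 on success, -1 if malloc fails. */
int store_copy(char *lineptr[], int idx, const char *buffer)
{
    size_t len = strlen(buffer);          /* measure the line */
    char *copy = malloc(len + 1);         /* allocate room for it */

    if (copy == NULL)
        return -1;
    memcpy(copy, buffer, len + 1);        /* copy, including the terminator */
    lineptr[idx] = copy;                  /* store the pointer to the copy */
    return 0;
}
```

If a buffer holding "first" is stored with both helpers and then overwritten with "second", the aliased slot reads "second" while the copied slot still reads "first". The copies have to be released with free once they are no longer needed.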
This line
lineptr[numLines] = currentLine;
just assigns a pointer to lineptr[numLines]. There are a couple of issues with that:
Every element of lineptr points to the same buffer.
The pointer is invalid after you return from the function.
You need to use something akin to:
lineptr[numLines] = strdup(currentLine);
Remember that strdup is a POSIX function; it only became part of the C standard with C23. If your platform does not provide it, you can implement it very easily.
char* strdup(char const* in)
{
    char* ret = malloc(strlen(in)+1);
    if (ret == NULL)
        return NULL; /* allocation failed */
    return strcpy(ret, in);
}

C fwrite function giving data with unknown space

This function is supposed to take a file as its parameter, put the whole file into the struct anagram, and then write it to another file. Right now there is a lot of space between the data items. charCompare is working fine, since I made a test file to test it.
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <string.h>
#include <errno.h>
#include <ctype.h>
#include "anagrams.h"
#define SIZE 80
//struct
struct anagram {
char word[SIZE];
char sorted[SIZE];
};
void buildDB ( const char *const dbFilename ){
FILE *dict, *anagramsFile;
struct anagram a;
//check if dict and anagram.data are open
errno=0;
dict= fopen(dbFilename, "r");
if(errno!=0) {
perror(dbFilename);
exit(1);
}
errno=0;
anagramsFile = fopen(anagramDB,"wb");
char word[SIZE];
char *pos;
int i=0;
while(fgets(word, SIZE, dict) !=NULL){
//get rid of the '\n'
pos=strchr(word, '\n');
*pos = '\0';
//lowercase word
int j=0;
while (word[j]){
tolower(word[j]);
j++;
}
/* sort array using qsort functions */
qsort(word,strlen(word), sizeof(char), charCompare);
strncpy(a.sorted,word,sizeof(word));
fwrite(&a,1,sizeof(struct word),anagramsFile);
i++;
}
fclose(dict);
fclose(anagramsFile);
}
data:
10th 1st 2nd
A probable cause is the size argument passed to qsort(). From the linked reference page for qsort():
size - size of each element in the array in bytes
Therefore the size argument should be 1, which is guaranteed to be sizeof(char), and not sizeof(char*), which is likely to be 4 or 8. The posted code incorrectly informs qsort() that word points to an array whose elements are 4 (or 8) times larger than they actually are, so qsort() will access memory it is not supposed to. Change to:
qsort(word,strlen(word), 1, charCompare);
Another possible cause is buffer overrun caused by this line:
strncpy(&a.sorted[i],word,sizeof(word));
i is being incremented on every iteration of the while loop but sizeof(word) is always being written. The values of SIZE and BUFSIZ are not posted but even if they were equal the strncpy() will write beyond the bounds of a.sorted after the first iteration.
Other points:
fgets() is not guaranteed to read the new-line character, so check the return value of strchr() before dereferencing it.
tolower() returns the lowercase character; it does not change its argument.
Why read into a temporary buffer (word) and then copy? Just read directly into the struct members.
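The points above can be combined into one small routine. This is only a sketch: make_sorted_key and char_compare are hypothetical names (char_compare stands in for the asker's charCompare, whose source is not shown), and the newline is removed with strcspn so that a line without '\n' is handled safely.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for charCompare: orders two single characters. */
static int char_compare(const void *a, const void *b)
{
    unsigned char x = *(const unsigned char *)a;
    unsigned char y = *(const unsigned char *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: strips the newline from word, lowercases it in place,
 * and writes its letters in sorted order into sorted (capacity cap bytes). */
void make_sorted_key(char *word, char *sorted, size_t cap)
{
    size_t i, len;

    if (cap == 0)
        return;

    word[strcspn(word, "\n")] = '\0';     /* no-op if there is no '\n' */

    for (i = 0; word[i] != '\0'; i++)     /* store tolower's return value */
        word[i] = (char)tolower((unsigned char)word[i]);

    len = strlen(word);
    if (len >= cap)
        len = cap - 1;                    /* truncate rather than overflow */
    memcpy(sorted, word, len);
    sorted[len] = '\0';

    qsort(sorted, len, 1, char_compare);  /* element size is 1 byte */
}
```

Inside the question's loop this could be called as make_sorted_key(a.word, a.sorted, sizeof a.sorted) after reading with fgets(a.word, sizeof a.word, dict), followed by fwrite(&a, sizeof a, 1, anagramsFile). Each record is then a fixed sizeof(struct anagram) bytes, so some padding after short words is to be expected in the output file.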
