Copying a file line by line into a char array with strncpy - c

So I am trying to read a text file line by line and save each line into a char array.
From my printout in the loop I can tell it is counting the lines and the number of characters per line properly but I am having problems with strncpy. When I try to print the data array it only displays 2 strange characters. I have never worked with strncpy so I feel my issue may have something to do with null-termination.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char* argv[])
{
FILE *f = fopen("/home/tgarvin/yes", "rb");
fseek(f, 0, SEEK_END);
long pos = ftell(f);
fseek(f, 0, SEEK_SET);
char *bytes = malloc(pos); fread(bytes, pos, 1, f);
int i = 0;
int counter = 0;
char* data[counter];
int length;
int len=strlen(data);
int start = 0;
int end = 0;
for(; i<pos; i++)
{
if(*(bytes+i)=='\n'){
end = i;
length=end-start;
data[counter]=(char*)malloc(sizeof(char)*(length)+1);
strncpy(data[counter], bytes+start, length);
printf("%d\n", counter);
printf("%d\n", length);
start=end+1;
counter=counter+1;
}
}
printf("%s\n", data);
return 0;
}

Your "data[]" array is declared as an array of pointers to characters of size 0. When you assign pointers to it there is no space for them. This could cause no end of trouble.
The simplest fix would be to make a pass over the array to determine the number of lines and then do something like "char **data = malloc(number_of_lines * sizeof(char *))". Then doing assignments of "data[counter]" will work.
You're right that strncpy() is a problem: it won't '\0'-terminate the string if it copies the maximum number of bytes. After the strncpy(), add "data[counter][length] = '\0';"
The printf() at the end is wrong, because it passes the whole array of pointers where %s expects a single string. To print all the lines use "for (i = 0; i < counter; i++) printf("%s\n", data[i]);"
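The allocate, copy and terminate steps above can be folded into one small helper. This is only a sketch: dup_span is a made-up name that does not appear in the question, and it uses memcpy instead of strncpy because the length is already known.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the question): copy `length` bytes
 * starting at `src` into a freshly allocated, '\0'-terminated string.
 * Returns NULL if the allocation fails; the caller frees the result. */
char *dup_span(const char *src, size_t length)
{
    char *out = malloc(length + 1);   /* +1 leaves room for the terminator */
    if (out == NULL)
        return NULL;
    memcpy(out, src, length);         /* copies exactly `length` bytes */
    out[length] = '\0';               /* terminate explicitly */
    return out;
}
```

In the question's loop this would be used as data[counter] = dup_span(bytes + start, length); once data has room for the pointer.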

Several instances of bad juju, the most pertinent one being:
int counter = 0;
char* data[counter];
You've just declared data as a variable-length array with zero elements. Despite their name, VLAs are not truly variable; you cannot change the length of the array after allocating it. So when you execute the lines
data[counter]=(char*)malloc(sizeof(char)*(length)+1);
strncpy(data[counter], bytes+start, length);
data[counter] is referring to memory you don't own, so you're invoking undefined behavior.
Since you don't know how many lines you're reading from the file beforehand, you need to create a structure that can be extended dynamically. Here's an example:
/**
* Initial allocation of data array (array of pointer to char)
*/
char **dataAlloc(size_t initialSize)
{
char **data= malloc(sizeof *data * initialSize);
return data;
}
/**
* Extend data array; each extension doubles the length
* of the array. If the extension succeeds, the function
* will return 1; if not, the function returns 0, and the
* values of data and length are unchanged.
*/
int dataExtend(char ***data, size_t *length)
{
int r = 0;
char **tmp = realloc(*data, sizeof *tmp * 2 * *length);
if (tmp)
{
*length= 2 * *length;
*data = tmp;
r = 1;
}
return r;
}
Then in your main program, you would declare data as
char **data;
with a separate variable to track the size:
size_t dataLength = SOME_INITIAL_SIZE_GREATER_THAN_0;
You would allocate the array as
data = dataAlloc(dataLength);
initially. Then in your loop, you would compare your counter against the current array size and extend the array when they compare equal, like so:
if (counter == dataLength)
{
if (!dataExtend(&data, &dataLength))
{
/* Could not extend data array; treat as a fatal error */
fprintf(stderr, "Could not extend data array; exiting\n");
exit(EXIT_FAILURE);
}
}
data[counter] = malloc(sizeof *data[counter] * length + 1);
if (data[counter])
{
strncpy(data[counter], bytes+start, length);
data[counter][length] = 0; // add the 0 terminator
}
else
{
/* malloc failed; treat as a fatal error */
fprintf(stderr, "Could not allocate memory for string; exiting\n");
exit(EXIT_FAILURE);
}
counter++;

You are trying to print data with the format specifier %s, but data is an array of pointers to char, not a string.
Now, about copying a string with a given size:
Where it is available, I would suggest using
strlcpy() instead of strncpy()
size_t strlcpy( char *dst, const char *src, size_t siz);
strncpy() does not NUL-terminate the destination when the source is at least as long as the size limit;
strlcpy() solves this issue.
Strings copied by strlcpy() are always NUL-terminated (provided siz is greater than 0). Note that strlcpy() is a BSD extension rather than part of standard C, so it may not be available on every platform.
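If strlcpy() is not available on your platform, similar behaviour is easy to write by hand. The following is a minimal sketch under that assumption; copy_truncated is a hypothetical name, and the return convention (length of the source) is modelled on strlcpy().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, whose capacity is dstsize
 * bytes, truncating if necessary. The result is always terminated
 * when dstsize > 0. Returns strlen(src), so a return value that is
 * >= dstsize tells the caller that truncation happened. */
size_t copy_truncated(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);
    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);   /* at most dstsize - 1 characters */
        dst[n] = '\0';         /* terminator always fits */
    }
    return srclen;
}
```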

Allocate proper memory for the data array. In your case counter is 0 when data[counter] is declared, so the array has no elements, and any access such as data[0] or data[1] is out of bounds and may cause a segmentation fault.
Declaring an array like data[counter] is bad practice here: the size is fixed at the point of declaration, so changing counter later in the program does not allocate any more memory for data.
Hence use a double char pointer (char **) as stated above.
You can use your existing loop to find the number of lines first.
The last printf is wrong: it passes data itself, an array of pointers, where %s expects a single string.
Loop over the array and print each line once you have fixed the issues above.
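The counting pass could look something like the sketch below. count_lines is a hypothetical name, and counting a final line that lacks a trailing '\n' is my own choice; the loop in the question only counts lines that end in '\n'.

```c
#include <stddef.h>

/* Hypothetical helper: count the lines in a buffer of n bytes.
 * A last line without a trailing '\n' is counted as well. */
size_t count_lines(const char *bytes, size_t n)
{
    size_t lines = 0;
    for (size_t i = 0; i < n; i++)
        if (bytes[i] == '\n')
            lines++;
    if (n > 0 && bytes[n - 1] != '\n')
        lines++;               /* unterminated final line */
    return lines;
}
```

The pointer array can then be sized in one go, for example char **data = malloc(count_lines(bytes, pos) * sizeof *data);, after checking that the count is not zero.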

Change
int counter = 0;
char* data[counter];
...
int len=strlen(data);
...
for(; i<pos; i++)
...
strncpy(data[counter], bytes+start, length);
...
to
int counter = 0;
#define MAX_DATA_LINES 1024
char* data[MAX_DATA_LINES]; //1
...
for(; i<pos && counter < MAX_DATA_LINES ; i++) //2
...
strncpy(data[counter], bytes+start, length);
...
//1: This provides valid storage for the pointers to lines (data[0] to data[MAX_DATA_LINES - 1]). Without it you may hit a 'segmentation fault' error; if you do not, you are just lucky.
//2: This ensures that if the file has more than MAX_DATA_LINES lines you do not run into a 'segmentation fault' error, because there is no valid storage for line pointers at data[MAX_DATA_LINES] and beyond.

I think that this might be a quicker implementation as you won't have to copy the contents of all the strings from the bytes array to a secondary array. You will of course lose your '\n' characters though.
It also handles files that don't end with a newline character. Since pos is defined as long, the index used for bytes[] and the length should be long as well.
#include <stdio.h>
#include <stdlib.h>
#define DEFAULT_LINE_ARRAY_DIM 100
int main(int argc, char* argv[])
{
FILE *f = fopen("test.c", "rb");
fseek(f, 0, SEEK_END);
long pos = ftell(f);
fseek(f, 0, SEEK_SET);
char *bytes = malloc(pos+1); /* include an extra byte in case the file isn't '\n' terminated */
fread(bytes, pos, 1, f);
if (pos == 0 || bytes[pos-1]!='\n') /* the pos == 0 test avoids reading bytes[-1] for an empty file */
{
bytes[pos++] = '\n';
}
long i;
long length = 0;
int counter = 0;
size_t size=DEFAULT_LINE_ARRAY_DIM;
char** data=malloc(size*sizeof(char*));
data[0]=bytes;
for(i=0; i<pos; i++)
{
if (bytes[i]=='\n') {
bytes[i]='\0';
counter++;
if (counter>=size) {
size+=DEFAULT_LINE_ARRAY_DIM;
data=realloc(data,size*sizeof(char*));
if (data==NULL) {
fprintf(stderr,"Couldn't allocate enough memory!\n");
exit(1);
}
}
data[counter]=&bytes[i+1];
length = data[counter] - data[counter - 1] - 1;
printf("%d\n", counter);
printf("%ld\n", length);
}
}
for (i=0;i<counter;i++)
printf("%s\n", data[i]);
return 0;
}

Related

Partition a 1D char* into 2D char**

There are a lot of questions about converting a 2D array into a 1D array, but I am attempting just the opposite. I'm trying to partition a string into substrings of constant length and house them in a 2D array. Each row of this 2D matrix should contain a substring of the initial string, and, if each row were to be read in succession and concatenated, the initial string should be reproduced.
I nearly have it working, but for some reason I am losing the first substring (partitions[0] -- length 8*blockSize) of the initial string (bin):
int main (void){
char* bin = "00011101010000100001111101001101000010110000111100000010000111110100111100010011010011100011110000011010";
int blockSize = 2; // block size in bytes
int numBlocks = strlen(bin)/(8*blockSize); // number of block to analyze
char** partitions = (char**)malloc((numBlocks+1)*sizeof(char)); // break text into block
for(int i = 0; i<numBlocks;++i){
partitions[i] = (char*)malloc((8*blockSize+1)*sizeof(char));
memcpy(partitions[i],&bin[8*i*blockSize],8*blockSize);
partitions[i][8*blockSize] = '\0';
printf("Printing partitions[%d]: %s\n", i, partitions[i]);
}
for(int j=0; j<numBlocks;++j)
printf("Printing partitions[%d]: %s\n", j,partitions[j]);
return 0;
}
The output is as follows:
Printing partitions[0]: 0001110101000010
Printing partitions[1]: 0001111101001101
Printing partitions[2]: 0000101100001111
Printing partitions[3]: 0000001000011111
Printing partitions[4]: 0100111100010011
Printing partitions[5]: 0100111000111100
Printing partitions[0]: Hj
Printing partitions[1]: 0001111101001101
Printing partitions[2]: 0000101100001111
Printing partitions[3]: 0000001000011111
Printing partitions[4]: 0100111100010011
Printing partitions[5]: 0100111000111100
The construction of partitions in the first for loop is successful. When it is read back after construction, the string at partitions[0] contains garbage values. Can anyone offer some insight?
int numBlocks = strlen(bin)/(8*blockSize); // number of block to analyze
char** partitions = (char**)malloc((numBlocks+1)*sizeof(char)); // break text into block
for(int i = 0; i<numBlocks;++i){
partitions[i] = (char*)malloc((8*blockSize+1)*sizeof(char));
memcpy(partitions[i],&bin[8*i*blockSize],8*blockSize);
partitions[i][8*blockSize] = '\0';
printf("Printing partitions[%d]: %s\n", i, partitions[i]);
}
This all looks suspicious to me; it's far too complex for the task, making it a prime suspect for errors.
For reasons explained in answers to this question, void * pointers which are returned by malloc and other functions shouldn't be cast.
There's no need to multiply by 1 (sizeof (char) is always 1 in C). In fact, in your first call to malloc you should be multiplying by sizeof (char *) (or better yet, sizeof *partitions, as in the example below), since that's the size of the type of element that partitions points at.
malloc might return NULL, resulting in undefined behaviour when you attempt to assign into the location it points at.
Anything else (i.e. everything that isn't NULL) that malloc, calloc or realloc returns needs to be freed when no longer in use. Otherwise tools such as valgrind (a leak detection program, useful for finding allocations that were never freed) will report these blocks as leaks, and the extra noise makes their reports less useful.
numBlocks, i, or anything else that's for counting elements of an array, should be declared as a size_t to follow standard convention (e.g. check the strlen manual, synopsis section to see how strlen is declared, noting the type of the return value is size_t). Negative values caused by overflows here will obviously cause the program to misbehave.
I gather you've yet to think about any excess beyond the last group of 8 characters... This shouldn't be difficult to incorporate.
I suggest using a single allocation, such as:
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BLOCK_SIZE 8
int main(void) {
char const *bin = "00011101010000100001111101001101000010110000111100000010000111110100111100010011010011100011110000011010";
size_t bin_length = strlen(bin),
block_count = (bin_length / BLOCK_SIZE)
+ (bin_length % BLOCK_SIZE > 0); // round up for any excess, as per the last point above
char (*block)[BLOCK_SIZE + 1] = malloc(block_count * sizeof *block);
if (!block) { exit(EXIT_FAILURE); }
for (size_t x = 0; x < block_count; x++) {
snprintf(block[x], BLOCK_SIZE + 1, "%s", bin + x * BLOCK_SIZE);
printf("Printing partitions[%zu]: %s\n", x, block[x]);
}
for (size_t x = 0; x < block_count; x++) {
printf("Printing partitions[%zu]: %s\n", x, block[x]);
}
free(block);
exit(0);
}
There are a few problems with your code.
You are allocating partitions incorrectly.
Instead of:
char** partitions = (char**)malloc((numBlocks+1)*sizeof(char)); /* the +1 is not needed; numBlocks elements are enough */
you need to allocate space for char* pointers, not char characters.
So this needs to be:
char** partitions = malloc(numBlocks*sizeof(char*));
Also read "Why not to cast the result of malloc()"; the cast is not needed in C.
malloc() needs to be checked every time, as it returns NULL when unsuccessful.
Once you are finished with the allocated space, free() the memory previously requested by malloc(). It is important to do this at some point in the program.
Here is some code which shows this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BLOCKSIZE 2
#define BLOCK_MULTIPLIER 8
int main(void) {
const char *bin = "00011101010000100001111101001101000010110000111100000010000111110100111100010011010011100011110000011010";
const size_t blocksize = BLOCKSIZE;
const size_t multiplier = BLOCK_MULTIPLIER;
const size_t numblocks = strlen(bin)/(multiplier * blocksize);
const size_t numbytes = multiplier * blocksize;
char **partitions = malloc(numblocks * sizeof(*partitions));
if (partitions == NULL) {
printf("Cannot allocate %zu spaces\n", numblocks);
exit(EXIT_FAILURE);
}
for (size_t i = 0; i < numblocks; i++) {
partitions[i] = malloc(numbytes+1);
if (partitions[i] == NULL) {
printf("Cannot allocate %zu bytes for pointer\n", numbytes+1);
exit(EXIT_FAILURE);
}
memcpy(partitions[i], &bin[numbytes * i], numbytes);
partitions[i][numbytes] = '\0';
printf("Printing partitions[%zu]: %s\n", i, partitions[i]);
}
printf("\n");
for(size_t j = 0; j < numblocks; j++) {
printf("Printing partitions[%zu]: %s\n", j,partitions[j]);
free(partitions[j]);
partitions[j] = NULL;
}
free(partitions);
partitions = NULL;
return 0;
}
Which outputs non-garbage values:
Printing partitions[0]: 0001110101000010
Printing partitions[1]: 0001111101001101
Printing partitions[2]: 0000101100001111
Printing partitions[3]: 0000001000011111
Printing partitions[4]: 0100111100010011
Printing partitions[5]: 0100111000111100
Printing partitions[0]: 0001110101000010
Printing partitions[1]: 0001111101001101
Printing partitions[2]: 0000101100001111
Printing partitions[3]: 0000001000011111
Printing partitions[4]: 0100111100010011
Printing partitions[5]: 0100111000111100

copy a const char* into array of char (facing a bug)

I have the following method
static void setName(const char* str, char buf[16])
{
int sz = MIN(strlen(str), 16);
for (int i = 0; i < sz; i++) buf[i] = str[i];
buf[sz] = 0;
}
int main()
{
const char* string1 = "I am getting bug for this long string greater than 16 length";
char mbuf[16];
setName(string1, mbuf);
// if I use buf in my code it is leading to spurious characters since length is greater than 16 .
}
Please let me know the correct way to code the above, given that the buffer length is restricted to 16 in static void setName(const char* str, char buf[16])
When an array is passed as an argument, it decays into a pointer to the FIRST element of the array. You must define a rule that lets the function know the number of elements.
You declare char mbuf[16] and pass it to setName(); setName() does not receive a char[], it receives a char* instead.
So the declaration is equivalent to, and is more honestly written as,
static void setName(const char* str, char* buf)
Next, char mbuf[16] can only store 15 chars, because the last char has to be the 'null terminator', which is '\0'. Otherwise, the following situation will occur:
// if I use buf in my code it is leading to spurious characters since length is greater than 16 .
Perhaps this will help you understand:
char str[] = "foobar"; // = {'f','o','o','b','a','r','\0'};
So the code should be
static void setName(const char* str, char* buf)
{
int sz = MIN(strlen(str), 15); // not 16
for (int i = 0; i < sz; i++) buf[i] = str[i];
buf[sz] = '\0'; // assert that you're assigning 'null terminator'
}
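Also, I would recommend not reinventing the wheel; why not use strncpy instead? Just remember that strncpy does not add the terminator when the source is longer than the limit, so set it yourself:
char mbuf[16];
strncpy(mbuf, "12345678901234567890", 15);
mbuf[15] = '\0';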
The following code passes the size of the memory allocated to the buffer, to the setName function.
That way the setName function can ensure that it does not write outside the allocated memory.
Inside the function either a for loop or strncpy can be used. Both will be controlled by the size parameter sz and both will require that a null terminator character is placed after the copied characters. Again, sz will ensure that the null terminator is written within the memory allocated to the buffer.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void setName(const char *str, char *buf, int sz);
int main()
{
const int a_sz = 16;
char* string = "This bit is OK!! but any more than 15 characters are dropped";
/* allocate memory for a buffer & test successful allocation*/
char *mbuf = malloc(a_sz);
if (mbuf == NULL) {
printf("Out of memory!\n");
return(1);
}
/* call function and pass size of buffer */
setName(string, mbuf, a_sz);
/* print resulting buffer contents */
printf("%s\n", mbuf); // printed: This bit is OK!
/* free the memory allocated to the buffer */
free(mbuf);
return(0);
}
static void setName(const char *str, char *buf, int sz)
{
int i;
/* size of string or max 15 */
if (strlen(str) > sz - 1) {
sz--;
} else {
sz = strlen(str);
}
/* copy a maximum of 15 characters into buffer (0 to 14) */
for (i = 0; i < sz; i++) buf[i] = str[i];
/* null terminate the string - won't be more than buf[15] */
buf[i] = '\0';
}
Changing the single value const int a_sz allows different numbers of characters to be copied. The size is not hard-coded in the function, which reduces the risk of errors if the code is modified later on.
I replaced MIN with a simple if ... else structure so that I could test the code.

My returnList[0] gets rewritten to #5'

I am trying to return an array of strings, and while I copy the strings something weird happens when it passes the 4th index. For example, when it loops through the first 3 times it is stored as "the", but then it suddenly gets rewritten, yet the next index [index 5] is written just fine. Can you find anything wrong with it? I'm stumped.
#include <stdlib.h>
#include <stdio.h>
#include "hash.h"
#include <string.h>
#define MAX 200
#define TERMINATE "asdfghjkl"
int createTable(int numFiles, char** files, char** stopList)
{
printf("stepped into create table\n");
FILE* fp1;
char oneWord[100];
HashTable hash = InitializeTable(900000);
int index = 2;
while(numFiles >0) {
fp1 = fopen(files[index++], "r");
while(fscanf(fp1, "%s", oneWord)!=EOF){
Insert(oneWord, hash, stopList);
}
numFiles--;
}
return 0;
}
char** createStopList(char* stopL)
{
FILE* fp1;
fp1 = fopen(stopL, "r");
char oneWord[100];
int i = 0;
char* stopList[MAX];
while(fscanf(fp1, "%s", oneWord)!=EOF){
stopList[i] = (char*)malloc(sizeof(oneWord));
strcpy(stopList[i++], oneWord);
}
stopList[i] = (char*)malloc(sizeof(char*));
strcpy(stopList[i], TERMINATE);
char** strings = stopList;
char** returnList = malloc(sizeof(strings));
i=0;
while(strcmp(strings[i], TERMINATE)!=0){
returnList[i] = malloc(sizeof(char*));
strcpy(returnList[i], strings[i]);
i++;
}
returnList[i] = (char*)malloc(sizeof(char*));
strcpy(returnList[i], TERMINATE);
return returnList;
}
int main(int argc, char** argv)
{
printf("start of prg\n");
char** stopList= createStopList(argv[1]);
createTable(argc-2, argv, stopList);
return 0;
}
This code causes a buffer overflow:
#define TERMINATE "asdfghjkl"
// ...
returnList[i] = (char*)malloc(sizeof(char*));
strcpy(returnList[i], TERMINATE);
TERMINATE occupies 10 bytes (9 characters plus the terminator), but sizeof(char*) is probably less than 10 (typically 4 or 8).
To fix it:
returnList[i] = malloc( sizeof TERMINATE );
strcpy(returnList[i], TERMINATE);
Your comments suggest you used strdup instead (that function is not in Standard C, but it is commonly provided).
This is also completely fubar'd:
char** strings = stopList;
char** returnList = malloc(sizeof(strings));
// ...
returnList[i] = malloc(sizeof(char*));
sizeof(strings) is the same as sizeof(char **), which is probably 4 or 8, but you go on to write past the end of this array, as soon as i gets to 1! This is probably the cause of your symptoms.
I think perhaps you have a misconception about what sizeof does. It tells you how many bytes are used to store a variable (NOT how many bytes are at the location the variable is pointing to, if that variable is a pointer).
Presumably you meant:
returnList = malloc( (i+1) * sizeof *returnList );
which gives you enough pointers for indices returnList[0] through returnList[i].
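To make the sizeof point concrete, here is a small sketch that sizes both levels of the allocation from the data rather than from the pointer type. copy_list is a hypothetical helper, and ending the list with a NULL pointer instead of the TERMINATE string is an assumption made only to keep the example short.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: deep-copy the first n strings of src into a
 * new array that ends with a NULL pointer. Returns NULL on failure. */
char **copy_list(char *const *src, size_t n)
{
    /* n + 1 pointers of sizeof *out bytes each; sizeof(out) would
     * only be the size of one pointer variable */
    char **out = malloc((n + 1) * sizeof *out);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        /* strlen + 1 bytes: the characters plus the terminator */
        out[i] = malloc(strlen(src[i]) + 1);
        if (out[i] == NULL) {
            while (i > 0)          /* undo the copies made so far */
                free(out[--i]);
            free(out);
            return NULL;
        }
        strcpy(out[i], src[i]);
    }
    out[n] = NULL;                 /* end-of-list marker */
    return out;
}
```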
The code after that is badly designed: you have unnecessary code duplication. Change the while loop to do...while, then the last iteration will copy TERMINATE for you without you having to write extra code for it.
Earlier on in that same function, this line is poor:
while(fscanf(fp1, "%s", oneWord)!=EOF){
You should prevent the input from overflowing oneWord (for example with a field width such as %99s), and you never check whether i exceeds MAX. You could make another improvement: instead of copying TERMINATE into stopList, just save i, and write TERMINATE on the end of returnList.
Finally you seem to be pointlessly storing and copying your array instead of just dynamically allocating it in the first place. Oh, and your mallocs have warts.
Putting all of those changes together:
char **createStopList(char const *stopL)
{
FILE* fp1;
fp1 = fopen(stopL, "r");
char oneWord[100];
size_t i;
char **stopList;
if ( !fp1 )
return NULL;
stopList = malloc(MAX * sizeof *stopList);
if ( !stopList )
exit(EXIT_FAILURE);
for (i = 0; i < MAX - 1 && fscanf(fp1, "%99s", oneWord) == 1; ++i)
{
stopList[i] = malloc( strlen(oneWord) + 1);
if ( !stopList[i] )
exit(EXIT_FAILURE);
strcpy(stopList[i], oneWord);
}
stopList[i] = malloc(sizeof TERMINATE);
if ( !stopList[i] )
exit(EXIT_FAILURE);
strcpy(stopList[i], TERMINATE);
fclose(fp1);
// (optional) free entries you didn't use in the list
stopList = realloc(stopList, (i+1) * sizeof *stopList);
if ( !stopList )
exit(EXIT_FAILURE);
return stopList;
}

String (array) capacity via pointer

I am trying to create a sub-routine that inserts a string into another string. I want to check that the host string has enough capacity to hold all the characters and, if not, return an error integer. This requires something like sizeof, but one that can be called using a pointer. My code is below and I would be very grateful for any help.
#include<stdio.h>
#include<conio.h>
//#include "string.h"
int string_into_string(char* host_string, char* guest_string, int insertion_point);
int main(void) {
char string_one[21] = "Hello mother"; //12 characters
char string_two[21] = "dearest "; //8 characters
int c;
c = string_into_string(string_one, string_two, 6);
printf("Sub-routine string_into_string returned %d and creates the string: %s\n", c, string_one);
getch();
return 0;
}
int string_into_string(char* host_string, char* guest_string, int insertion_point) {
int i, starting_length_of_host_string;
//check host_string is long enough
if(strlen(host_string) + strlen(guest_string) >= sizeof(host_string) + 1) {
//host_string is too short
sprintf(host_string, "String too short(%d)!", sizeof(host_string));
return -1;
}
starting_length_of_host_string = strlen(host_string);
for(i = starting_length_of_host_string; i >= insertion_point; i--) { //make room
host_string[i + strlen(guest_string)] = host_string[i];
}
//i++;
//host_string[i] = '\0';
for(i = 1; i <= strlen(guest_string); i++) { //insert
host_string[i + insertion_point - 1] = guest_string[i - 1];
}
i = strlen(guest_string) + starting_length_of_host_string;
host_string[i] = '\0';
return strlen(host_string);
}
C does not allow you to pass arrays as function arguments, so all arrays of type T[N] decay to pointers of type T*. You must pass the size information manually. However, you can use sizeof at the call site to determine the size of an array:
int string_into_string(char * dst, size_t dstlen, char const * src, size_t srclen, size_t offset, size_t len);
char string_one[21] = "Hello mother";
char string_two[21] = "dearest ";
string_into_string(string_one, sizeof string_one, // gives 21
string_two, strlen(string_two), // gives 8
6, strlen(string_two));
If you are creating dynamic arrays with malloc, you have to store the size information somewhere separately anyway, so this idiom will still fit.
(Beware that sizeof(T[N]) == N * sizeof(T), and I've used the fact that sizeof(char) == 1 to simplify the code.)
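Here is one way the capacity check could look once the size is passed in. This is a sketch, not the prototype shown above: insert_string is a hypothetical name with a shorter parameter list, and it uses memmove to shift the tail in place.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: insert src into dst at offset `at`.
 * dstcap is the full capacity of dst in bytes (e.g. sizeof array).
 * Returns 0 on success, or -1 if `at` is past the end of the string
 * or the result would not fit; dst is left unchanged on failure. */
int insert_string(char *dst, size_t dstcap, const char *src, size_t at)
{
    size_t dstlen = strlen(dst);
    size_t srclen = strlen(src);

    if (at > dstlen || dstlen + srclen + 1 > dstcap)
        return -1;
    /* shift the tail, including its terminator, to make room */
    memmove(dst + at + srclen, dst + at, dstlen - at + 1);
    memcpy(dst + at, src, srclen);
    return 0;
}
```

At the call site the capacity comes from the array itself: insert_string(string_one, sizeof string_one, string_two, 6). A function that only holds a pointer has to receive the capacity from its own caller and pass it along.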
This code needs a whole lot more error handling, but it should do what you need without any obscure loops. To speed it up, you could also pass the size of the source string as a parameter, so the function does not need to calculate it at runtime.
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
signed int string_into_string (char* dest_buf,
int dest_size,
const char* source_str,
int insert_index)
{
int source_str_size;
char* dest_buf_backup;
if (insert_index >= dest_size) // sanity check of parameters
{
return -1;
}
// save data from the original buffer into temporary backup buffer
dest_buf_backup = malloc (dest_size - insert_index);
memcpy (dest_buf_backup,
&dest_buf[insert_index],
dest_size - insert_index);
source_str_size = strlen(source_str);
// copy new data into the destination buffer
strncpy (&dest_buf[insert_index],
source_str,
source_str_size);
// restore old data at the end
strcpy(&dest_buf[insert_index + source_str_size],
dest_buf_backup);
// delete temporary buffer
free(dest_buf_backup);
return 0; // success; the function is declared to return a value
}
int main()
{
char string_one[21] = "Hello mother"; //12 characters
char string_two[21] = "dearest "; //8 characters
(void) string_into_string (string_one,
sizeof(string_one),
string_two,
6);
puts(string_one);
return 0;
}
I tried using a macro and changing string_into_string to include the requirement for a size argument, but I still strike out when I call the function from within another function. I tried using the following Macro:
#define STRING_INTO_STRING( a, b, c) (string_into_string2(a, sizeof(a), b, c))
The other function which causes failure is below. This fails because string has already become the pointer and therefore has size 4:
int string_replace(char* string, char* string_remove, char* string_add) {
int start_point;
int c;
start_point = string_find_and_remove(string, string_remove);
if(start_point < 0) {
printf("string not found: %s\n ABORTING!\n", string_remove);
while(1);
}
c = STRING_INTO_STRING(string, string_add, start_point);
return c;
}
It looks like this function will have to proceed at risk. Looking at strcat, it also proceeds at risk, in that it doesn't check that the string you are appending to is large enough to hold its intended contents (perhaps for the very same reason).
Thanks for everyone's help.

Segmentation fault when returning a struct

I am trying to do a pretty simple thing: read a file and then turn it into a char**, splitting it into lines. However, when I return a struct containing the char** and its size, I get a segmentation fault. I read here: C segmentation fault before/during return statement that it's probably a "mangled stack". However, I still don't know what I did to mangle it. This is my code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#include "comp_words.h"
#define BLOCK 4096
struct sized_str {
char* str;
long size;
};
struct sized_arr {
char** content;
int size;
};
struct sized_str readfile(char* name) {
FILE *f;
long filesize;
char *buf;
struct sized_str res;
int r, p = 0;
f = fopen(name, "r");
fseek(f, 0, SEEK_END);
filesize = ftell(f);
rewind(f);
buf = calloc(filesize + 1, sizeof(char));
while ((r = fread(buf + p, sizeof(char), BLOCK, f))) {
p += r;
}
res.str = buf;
res.size = filesize + 1;
return res;
}
struct sized_arr read_dict() {
struct sized_str file_content;
struct sized_arr result;
char *buf, *buf_cpy, *buf_cpy_point, *line, **res;
int i = 0, j, line_count = 0;
file_content = readfile("/var/tmp/twl06.txt");
buf = file_content.str;
buf_cpy = (char*)malloc(file_content.size * sizeof(char));
strcpy(buf_cpy, buf);
buf_cpy_point = buf_cpy;
while (strtok(buf_cpy_point, "\n\r")) {
line_count++;
buf_cpy_point = NULL;
}
res = (char**)malloc(sizeof(char*) * line_count);
while ((line = strtok(buf, "\n\r"))) {
res[i] = (char*)malloc(sizeof(char) * strlen(line));
j = 0;
while ((res[i][j] = tolower(line[j]))) {
j++;
}
buf = NULL;
}
free(buf_cpy);
result.size = line_count;
result.content = res;
return result;
}
// ...
int main (int argc, char** argv) {
struct sized_str input;
struct sized_arr dict;
dict = read_dict();
// ...
return 0;
}
The code segfaults while returning from read_dict function.
At least at first glance, this seems to have a couple of problems. First:
while ((line = strtok(buf, "\n\r"))) {
To use strtok you normally pass the buffer on the first call, then make subsequent calls passing NULL for the first parameter until strtok returns NULL (indicating that it has reached the end of the buffer). [Edit: upon further examination, it's apparent this isn't really a bug. As pointed out by @Casablanca, he sets buf to NULL in the loop, so the second and subsequent iterations actually do pass NULL for the first parameter. The current code is a bit hard to understand and (at least arguably) somewhat fragile, but not actually wrong.]
Second, when you allocate your space, it looks like you're not allocating space for the terminating NUL:
res[i] = (char*)malloc(sizeof(char) * strlen(line));
At least at first glance, it looks like this should be:
res[i] = malloc(strlen(line)+1);
[As an aside, sizeof(char)==1 and casting the return from malloc can mask the bug of failing to #include <stdlib.h> to get a proper prototype in scope.]
Some of your other code isn't exactly wrong, but strikes me as less readable than ideal. For example:
j = 0;
while ((res[i][j] = tolower(line[j]))) {
j++;
}
This appears to be a rather obfuscated way of writing:
for (j=0; line[j] != '\0'; j++)
res[i][j] = tolower((unsigned char)line[j]);
res[i][j] = '\0';
Also note that when you call tolower, you generally need/want to cast the parameter to unsigned char (passing a negative value gives undefined behavior, and quite a few characters with accents, umlauts, etc., will normally show up as negative in the typical case that char is signed).
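Putting the two points together (room for the terminator, and the unsigned char cast), a per-line helper might look like this. It is only a sketch, and dup_lower is a name I made up for it.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated lower-case copy of
 * line, or NULL if the allocation fails. The caller frees it. */
char *dup_lower(const char *line)
{
    size_t len = strlen(line);
    char *out = malloc(len + 1);          /* +1 for the terminating NUL */
    if (out == NULL)
        return NULL;
    for (size_t j = 0; j < len; j++)
        out[j] = (char)tolower((unsigned char)line[j]);
    out[len] = '\0';
    return out;
}
```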
You also seem to have a memory leak -- read_dict calls readfile, which allocates a buffer (with calloc -- why not malloc?) and returns a pointer to that memory in a structure. read_dict receives the structure, but unless I've missed something, the struct goes out of scope without your ever freeing the memory it pointed to.
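For the leak described above, a single free(file_content.str) before read_dict returns should be enough, since the lines in res are separate copies. The same ownership question applies to the result that read_dict hands back: whoever receives it eventually has to release every line and the pointer array. A minimal sketch, reusing the struct sized_arr definition from the question; free_sized_arr is a hypothetical name, and it assumes every one of the size entries was allocated with malloc.

```c
#include <stdlib.h>

struct sized_arr {
    char **content;
    int size;
};

/* Hypothetical helper: release each line, then the pointer array,
 * and reset the struct so it cannot be used by accident afterwards. */
void free_sized_arr(struct sized_arr *arr)
{
    for (int i = 0; i < arr->size; i++)
        free(arr->content[i]);
    free(arr->content);
    arr->content = NULL;
    arr->size = 0;
}
```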
Rather than try to find and fix the problem you've seen, my immediate reaction would be to start over. It seems to me that you've made the problem considerably more complex than it really is. If I were doing it, I'd probably start with a function to allocate space and read a line into the space, something on this order:
// Warning: Untested code.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *readline(FILE *file) {
char *buffer = NULL;
size_t current_size = 1;
char *temp;
const int block_size = 256;
do {
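if (NULL == (temp = realloc(buffer, current_size+block_size))) {
free(buffer); /* out of memory: give up on this line */
return NULL;
}
buffer = temp;
buffer[current_size-1] = '\0';
if (fgets(buffer+current_size-1, block_size, file)==NULL) {
if (buffer[0] != '\0')
break; /* EOF after a partial line: return what we have */
free(buffer); /* nothing was read: don't leak the buffer */
return NULL;
}
current_size += block_size-1;
} while (strchr(buffer, '\n') == NULL);
buffer[strcspn(buffer, "\n")] = '\0'; /* strip the newline; unlike strtok, this also handles empty lines */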
if (NULL != (temp = realloc(buffer, strlen(buffer)+1)))
buffer =temp;
return buffer;
}
Once that's working, reading all the lines in the file and converting them to upper-case comes out something like:
// Warning: more untested code.
while (res[i] = readline(file)) {
size_t j;
for (j=0; res[i][j]; j++)
res[i][j] = toupper((unsigned char)res[i][j]);
++i;
}
It looks like you forgot to increment i after storing each line into the result array, so you end up storing all lines into res[0]. But you still set result.size = line_count at the end, so all array elements beyond the first are undefined. An i++ at the end of this loop: while ((line = strtok(buf, "\n\r"))) should fix it.
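A stripped-down version of that loop with the increment in place might look like the sketch below. split_lines is a hypothetical name, and unlike the question it stores pointers into the buffer instead of copying each line, just to keep the example short.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: store up to max line pointers from buf into
 * res and return how many were stored. strtok modifies buf in place
 * and skips empty lines. */
size_t split_lines(char *buf, char **res, size_t max)
{
    size_t i = 0;
    char *line = strtok(buf, "\n\r");
    while (line != NULL && i < max) {
        res[i] = line;    /* points into buf; no copy is made here */
        i++;              /* the increment missing from the question */
        line = strtok(NULL, "\n\r");
    }
    return i;
}
```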
