C - Array of Char Arrays

I'm trying to work with the example in the K&R book for this topic, but I'm struggling.
I want an array of char arrays, whereby each element of the 'father' array points to an array of characters (a string). Basically, I am reading from a file one line at a time, storing each line in an array, and then trying to store that array in another array, which I can then sort via qsort.
But I can't seem to get anywhere with this! Any help with my code is much appreciated, i.e. where to go from where I am!
EDIT: The problem is that the printing loop isn't printing the words that should be in the array of arrays; instead it's just printing garbage. The main problem is that I'm not sure whether I am dereferencing things correctly, or at all, or whether I am adding to the array of arrays correctly.
Regards.
#define MAXLINES 5000 /* max no. lines to be stored */
#define MAXLEN 1000 /* max length of single line */
char *lineptr[MAXLINES];
void writelines(char *lineptr[], int nlines);
int main(int argc, char *argv[]) {
int nlines = 0, i, j, k;
char line[MAXLEN];
FILE *fpIn;
fpIn = fopen(argv[1], "rb");
while((fgets(line, 65, fpIn)) != NULL) {
j = strlen(line);
if (j > 0 && (line[j-1] == '\n')) {
line[j-1] = '\0';
}
if (j > 8) {
lineptr[nlines++] = line;
}
}
for(i = 0; i < nlines; i++)
printf("%s\n", lineptr[i] );
return 0;
}

One problem is that line[MAXLEN] is a single automatic array, so every pass through the while loop writes into the same storage. fgets always writes to the same memory location, and every lineptr[i] ends up pointing at that one array, which holds only the last line read. You should dynamically allocate a fresh buffer each time through the while loop: declare line as char * and call line = calloc(MAXLEN, sizeof(char)) before calling fgets.

Dan definitely found one error, the identical storage. But I think there are more bugs here:
while((fgets(line, 65, fpIn)) != NULL) {
Why only 65? You've got MAXLEN space to work with, you might as well let your input be a bit longer.
j = strlen(line);
if (j > 0 && (line[j-1] == '\n')) {
line[j-1] = '\0';
}
if (j > 8) {
lineptr[nlines++] = line;
}
}
Why exactly j > 8? Are you supposed to be throwing away short lines? Don't forget to deallocate the memory for the line in this case, once you've moved to the dynamic allocation that Dan suggests.
Update
ott recommends strdup(3) -- this would be easy to fit into your existing system:
while((fgets(line, 65, fpIn)) != NULL) {
j = strlen(line);
if (j > 0 && (line[j-1] == '\n')) {
line[j-1] = '\0';
}
if (j > 8) {
lineptr[nlines++] = strdup(line);
}
}
Dan recommended calloc(3); that would be only slightly more work (line must now be declared as char *line rather than as an array):
line = calloc(MAXLEN, sizeof(char));
while((fgets(line, 65, fpIn)) != NULL) {
j = strlen(line);
if (j > 0 && (line[j-1] == '\n')) {
line[j-1] = '\0';
}
if (j > 8) {
lineptr[nlines++] = line;
line = calloc(MAXLEN, sizeof(char));
}
}
Of course, both these approaches will blow up if the memory allocation fails --
checking error returns from memory allocation is always a good idea. And
there's something distinctly unbeautiful about the second mechanism.
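As a rough sketch of where these suggestions lead, the version below copies each line into its own allocation, checks every allocation, and sorts the pointers with qsort. The names read_lines, sort_lines and compare_lines are made up for this illustration, malloc plus memcpy stands in for strdup (which is POSIX rather than ISO C before C23), and the j > 8 length filter from the question is left out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXLEN 1000   /* max length of a single line, as in the question */

/* qsort hands the comparator pointers to the array elements,
 * i.e. pointers to char *, so one extra dereference is needed. */
static int compare_lines(const void *a, const void *b)
{
    const char *const *left = a;
    const char *const *right = b;
    return strcmp(*left, *right);
}

/* Reads up to maxlines lines from fp and stores a private copy of each
 * in lineptr. Lines longer than MAXLEN - 1 characters are split.
 * Returns the number of lines stored, or -1 if an allocation fails
 * (in which case everything stored so far is freed). */
int read_lines(FILE *fp, char *lineptr[], int maxlines)
{
    char line[MAXLEN];
    int nlines = 0;

    while (nlines < maxlines && fgets(line, sizeof line, fp) != NULL) {
        size_t len = strlen(line);
        if (len > 0 && line[len - 1] == '\n')
            line[--len] = '\0';          /* drop the trailing newline */

        char *copy = malloc(len + 1);    /* one allocation per line */
        if (copy == NULL) {
            while (nlines > 0)
                free(lineptr[--nlines]);
            return -1;
        }
        memcpy(copy, line, len + 1);
        lineptr[nlines++] = copy;
    }
    return nlines;
}

/* Sorts the stored lines in place. */
void sort_lines(char *lineptr[], int nlines)
{
    qsort(lineptr, (size_t)nlines, sizeof lineptr[0], compare_lines);
}
```

The caller owns the copies and should free each lineptr[i] once the lines are no longer needed.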

Related

Reversing a string without two loops?

I came up with the following basic item to reverse a string in C:
void reverse(char in[], char out[]) {
int string_length = 0;
for(int i=0; in[i] != '\0'; i++) {
string_length += 1;
}
for(int i=0; i < string_length ; i++) {
out[string_length-1-i] = in[i];
}
out[string_length] = '\0';
}
Is there a way to do this in one for loop or is it necessary to first use a for length to get the string length, and then do a second one to reverse it? Are there other approaches to doing a reverse, or is this the basic one?
Assuming you can't use functions to get the string length and you want to preserve the second loop, I'm afraid this is the shortest way.
Just as a side note, though: this code is not very safe. The loop for(int i=0; in[i] != '\0'; i++) does not consider the case where the argument passed to in is not a valid C string, i.e. where no element of the array pointed to by in is '\0'. In that case the first for loop reads beyond the bounds of in (a buffer over-read), and the second for loop can write beyond the bounds of out (a buffer overflow). In functions like this you should ask the caller for the lengths of both arrays, in and out, and use them as the maximum index when accessing them.
As pointed out by Rishikesh Raje in the comments, you should also change the exit condition in the second for loop from i <= string_length to i < string_length; otherwise, when i == string_length, the loop accesses out at a negative index, which is another out-of-bounds access.
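The advice about passing both array sizes could look roughly like the following. This is only a sketch: the name reverse_bounded, the parameter names and the 0 / -1 return convention are assumptions made for the example, not part of the original question.

```c
#include <stddef.h>

/* Reverses the string in `in` into `out`.
 * in_size and out_size are the sizes of the two arrays in bytes.
 * Returns 0 on success, or -1 if `in` has no '\0' within in_size bytes
 * or the reversed string plus its '\0' does not fit in `out`. */
int reverse_bounded(const char *in, size_t in_size, char *out, size_t out_size)
{
    size_t length = 0;

    /* First loop: find the length without ever reading past in_size. */
    while (length < in_size && in[length] != '\0')
        length++;
    if (length == in_size || length >= out_size)
        return -1;

    /* Second loop: copy in reverse order; indices stay in 0..length-1. */
    for (size_t i = 0; i < length; i++)
        out[length - 1 - i] = in[i];
    out[length] = '\0';
    return 0;
}
```

A call such as reverse_bounded(src, sizeof src, dst, sizeof dst) works when src and dst are real arrays; for pointers the caller has to track the sizes itself.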
void reverse(char *in, char *out) {
static int index;
index = 0;
if (in == NULL || in[0] == '\0')
{
out[0] = '\0';
return;
}
else
{
reverse(in + 1, out);
out[index + 1] = '\0';
out[index++] = in[0];
}
}
With no loops.
This code is certainly not efficient or robust, and it won't work in multithreaded programs. However, the OP just asked for an alternative method, and the stress was on methods with fewer loops:
Are there other approaches to doing a reverse, or is this the basic one
Also, there was no real need to use static int; it is what prevents the code from working in multithreaded programs. To get it working correctly in those cases:
int reverse(char *in, char *out) {
int index;
if (in == NULL || in[0] == '\0')
{
out[0] = '\0';
return 0;
}
else
{
index = reverse(in + 1, out);
out[index + 1] = '\0';
out[index++] = in[0];
return index;
}
}
You can always tweak two loops into one, more confusing version, by using some kind of condition to determine which phase in the algorithm you are in. Below code is untested, so most likely contains bugs, but you should get the idea...
void reverse(const char *in, char *out) {
if (*in == '\0') {
// handle special case
*out = *in;
return;
}
char *out_begin = out;
char *out_end = out; // must start at the beginning of out; it is advanced while scanning in
do {
if (out == out_begin) {
// we are still looking for where to start copying from
if (*in != '\0') {
// end of input not reached, just go forward
++in;
++out_end;
continue;
}
// else reached end of input, put terminating NUL to out
*out_end = '\0';
}
// if below line seems confusing, write it out as 3 separate statements.
*(out++) = *(--in);
} while (out != out_end); // end loop when out reaches out_end (which has NUL already)
}
However, this is exactly as many loop iterations so it is not any faster, and it is much less clear code, so don't do this in real code...

Taking large character array as input in c

I want to take a large character array as input.
E.g.: char array[c][d]
where c <= 200000 and d <= 500000.
Is there any way in C programming language to take such a character array as input?
After declaring a pointer, you can use the calloc function to allocate the array. Your code will look approximately like this:
char** array;
array = calloc(c, sizeof(char*));
This allocates only the c row pointers; the second dimension is not sized automatically, so each row needs its own allocation before it is used. To give every row the full length d, you can write:
for (int i = 0; i < c; i++)
array[i] = calloc(d, sizeof(char));
Taking large size character array as input in c ...
Is there any way in C programming language to take input such a character array?
If code truly needs to retain all c <= 200000 test cases at once ...
Yes, read in each line, one at a time and then allocate per its length. There is certainly little need for a char array[200000][500000] array. Use an array of char * pointers instead.
#define c_MAX 200000
#define d_MAX 500000
// Allocate pointer array and a single buffer for reading the lines
char **array = malloc(sizeof *array * c_MAX);
if (array == NULL) Handle_OutOfMemory();
char *buffer = malloc(sizeof *buffer * (d_MAX + 3)); // Add room for 1 extra, \n, \0
if (buffer == NULL) Handle_OutOfMemory();
size_t c_count = 0;
while (fgets(buffer, d_MAX + 3, stdin)) {
if (c_count >= c_MAX) Handle_too_many_lines();
size_t len = strlen(buffer);
// lop off potential \n
if (len > 0 && buffer[len-1] == '\n') {
buffer[--len] = '\0';
}
if (len > d_MAX) Handle_too_long_a_line();
// Make a copy
size_t sz = sizeof array[c_count][0] * (len + 1);
array[c_count] = malloc(sz);
if (array[c_count] == NULL) Handle_OutOfMemory();
memcpy(array[c_count], buffer, sz);
c_count++;
}
free(buffer);
// TBD code
// Right-size `array` if desired with realloc()
// Use the `c_count` elements of `array`
// when done free them all
for (size_t i = 0; i< c_count; i++) {
free(array[i]);
}
free(array);
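To put numbers on why a full char array[200000][500000] is impractical while a table of pointers is cheap, here is a small back-of-the-envelope sketch. The helper names are invented for this example, and the pointer size depends on the platform.

```c
#include <stddef.h>

/* Bytes needed for a full rows x cols array of char.
 * sizeof(char) is 1 by definition, so this is just the product. */
unsigned long long full_array_bytes(unsigned long long rows,
                                    unsigned long long cols)
{
    return rows * cols;
}

/* Bytes needed for the pointer table alone in the char ** approach. */
unsigned long long pointer_table_bytes(unsigned long long rows)
{
    return rows * sizeof(char *);
}
```

With the limits from the question, full_array_bytes(200000, 500000) is 100,000,000,000 bytes, about 100 GB (roughly 93 GiB). The pointer table is 200000 * sizeof(char *), which is 1.6 MB where pointers are 8 bytes; on top of that the program pays only for the characters actually read.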

Parsing character array to words held in pointer array (C-programming)

I am trying to separate each word from a character array and put them into a pointer array, one word for each slot. Also, I am supposed to use isspace() to detect blanks. But if there is a better way, I am all ears. At the end of the code I want to print out the content of the parameter array.
Let's say the line is: "this is a sentence". What happens is that it prints out "sentence" (the last word in the line, and usually followed by some random character) 4 times (the number of words). Then I get "Segmentation fault (core dumped)".
Where am I going wrong?
int split_line(char line[120])
{
char *param[21]; // Here I want to put one word for each slot
char buffer[120]; // Word buffer
int i; // For characters in line
int j = 0; // For param words
int k = 0; // For buffer chars
for(i = 0; i < 120; i++)
{
if(line[i] == '\0')
break;
else if(!isspace(line[i]))
{
buffer[k] = line[i];
k++;
}
else if(isspace(line[i]))
{
buffer[k+1] = '\0';
param[j] = buffer; // Puts word into pointer array
j++;
k = 0;
}
else if(j == 21)
{
param[j] = NULL;
break;
}
}
i = 0;
while(param[i] != NULL)
{
printf("%s\n", param[i]);
i++;
}
return 0;
}
There are many little problems in this code:
param[j] = buffer; k = 0; : every param[j] points to the start of the same buffer, and each new word is written over the previous one
if(!isspace(line[i])) ... else if(isspace(line[i])) ... else ... : isspace(line[i]) is either true or false, so one of the first two branches is always taken and the third is never reached
if (line[i] == '\0') : you forget to terminate the current word with a '\0' and to store the final NULL in param
if there are multiple white spaces, you currently (try to) add empty words to param
Here is a working version :
int split_line(char line[120])
{
char *param[21]; // Here I want to put one word for each slot
char buffer[120]; // Word buffer
int i; // For characters in line
int j = 0; // For param words
int k = 0; // For buffer chars
int inspace = 0;
param[j] = buffer;
for(i = 0; i < 120; i++) {
if(line[i] == '\0') {
param[j++][k] = '\0';
param[j] = NULL;
break;
}
else if(!isspace(line[i])) {
inspace = 0;
param[j][k++] = line[i];
}
else if (! inspace) {
inspace = 1;
param[j++][k] = '\0';
param[j] = &(param[j-1][k+1]);
k = 0;
if(j == 20) { // param has 21 slots; the last valid index is kept for the NULL terminator
param[j] = NULL;
break;
}
}
}
i = 0;
while(param[i] != NULL)
{
printf("%s\n", param[i]);
i++;
}
return 0;
}
I only fixed the errors. I leave the following improvements to you as an exercise:
the split_line routine should not print the words itself but rather return an array of words - beware that you cannot return an automatic array, but that would be another question
you should not have magic constants in your code (120); you should at least have a #define and use symbolic constants, or better, accept a line of any size - here again it is not simple, because you will have to malloc and free at the appropriate places, and again that would be a different question
Anyway good luck in learning that good old C :-)
This line does not seem right to me:
param[j] = buffer;
because you keep assigning the same value, buffer, to every param[j].
I would suggest you copy all the chars from line[120] to buffer[120], then point each param[j] to buffer + Next_Word_Position.
You may want to look at strtok in string.h. It sounds like this is what you are looking for, as it will separate words/tokens based on the delimiter you choose. To separate by spaces, simply use:
dest = strtok(src, " ");
Where src is the source string and dest is the destination for the first token on the source string. Looping through until dest == NULL will give you all of the separated words, and all you have to do is change dest each time based on your pointer array. It is also nice to note that passing NULL for the src argument will continue parsing from where strtok left off, so after an initial strtok outside of your loop, just use src = NULL inside. I hope that helps. Good luck!
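A minimal sketch of the strtok approach described above, assuming the line may be modified in place. The name split_words and the MAX_WORDS limit are made up for this example; the delimiter string lists the characters isspace() accepts in the "C" locale, so tabs and newlines also separate words.

```c
#include <string.h>

#define MAX_WORDS 20   /* assumed limit, matching the 21-slot array in the question */

/* Splits `line` in place on whitespace. Stores up to MAX_WORDS word
 * pointers in param, followed by a NULL terminator, and returns the
 * number of words. param must have room for MAX_WORDS + 1 pointers. */
int split_words(char *line, char *param[])
{
    static const char delimiters[] = " \t\n\v\f\r";
    int count = 0;
    char *word = strtok(line, delimiters);   /* first call passes the string */

    while (word != NULL && count < MAX_WORDS) {
        param[count++] = word;
        word = strtok(NULL, delimiters);     /* later calls pass NULL */
    }
    param[count] = NULL;
    return count;
}
```

Because strtok overwrites delimiters with '\0' and skips runs of them, repeated spaces do not produce empty words, and every pointer in param refers to a different position inside line. The line must be a writable array, not a string literal.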

strtok disappearing when returning -1

So I'm writing code to put strings into arrays and it's working perfectly, however I want it to terminate the reading of the strings when I hit a ## in the file. I'm running a loop and parsing the strings line by line. Within my string parser I put a loop to check for the ##. It's at the very end of my parser function and it goes:
for (i = 0; i < strlen(line); i++)
{
if ((buffer[i] == '#') && (buffer[i+1] == '#'))
{
return -1;
}
}
The problem is that when it hits the line with the ## at the end it doesn't parse the string into my array. It seems like it's just ignoring the code before this loop.
As additional information I'm using strtok to put the tokens in positions in my char* array before this for loop.
EDIT: Here's my parseString function:
int parseString(char* line, char*** inString)
{
char* buffer;
int Token, i;
buffer = (char*) malloc(strlen(line) * sizeof(char));
strcpy(buffer,line);
(*inString) = (char**) malloc(MAX_TOKS * sizeof(char**));
Token = 0;
(*inString)[Token++] = strtok(buffer, DELIMITERS);
while ((((*inString)[Token] = strtok(NULL, DELIMITERS)) != NULL) && (Token < MAX_TOKS))
Token++;
for(i=0; i<strlen(line); i++)
{
if ((buffer[i] == '#') && (buffer[i+1] == '#'))
{
return -1;
}
}
return Token;
}
First of all, you are reading out of bounds on an array, because array[-1] is not good. Secondly, use a variable to hold the string length, as the way you do it causes the for loop to re-evaluate strlen(line) for each iteration.
Now, for your problem, it seems like you're putting it before the code that adds it to an array. If you could give us a bit more code, that would help.
Insufficient buffer allocation
// buffer = (char*) malloc(strlen(line) * sizeof(char));
buffer = malloc(strlen(line) + 1); // +1 for the \0
strcpy(buffer,line);
Memory Leak
The allocated 'buffer' may be lost. The *inString array may hold a pointer to the beginning of 'buffer', allowing it to be freed in the calling routine, but that is iffy: strtok skips leading delimiters, so the first token need not start at the beginning of the buffer. Suggest using the first element of *inString to save that buffer explicitly.
Algorithm hole
(*inString)[token-1] == NULL should be asserted before for().
O(n*n) via strlen()
Suggestion:
// for(i=0; i<strlen(line); i++)
int length = strlen(line); // `length` should be used in `malloc()` too.
for(i=0; i<length; i++)
OP's early edit approach was almost OK
It just needed to start indexing at 1 rather than 0. There is no need to test every index i of line, only length-1 of them. So use (i = 1; i<length; i++) or (i = 0; i<length-1; i++).
// for (i = 0; i < strlen(line); i++) {
int length = strlen(line);
for (i = 1; i<length; i++) { // start at 1
if ((buffer[i-1] == '#') && (buffer[i] == '#')) {
return -1;
}
}
For better assistance, recommend OP provide sample line, line with the ## at the end, MAX_TOKS and DELIMITERS.
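The points above can be combined into something like the following sketch. It is one possible reading of what the function is meant to do, not the OP's actual design: the values of MAX_TOKS and DELIMITERS are guesses (the question does not show them), the name parse_string and the saw_end out-parameter are invented here, and the "##" marker is reported separately so that the tokens on the marker line stay available to the caller.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_TOKS 16          /* assumed value; not shown in the question */
#define DELIMITERS " \t\n"   /* assumed value; not shown in the question */

/* Tokenizes a private copy of `line`.
 * On success, (*tokens)[0] is the copy itself (kept so it can be freed),
 * (*tokens)[1..count] are the tokens, and (*tokens)[count + 1] is NULL.
 * *saw_end is set to 1 if the line contains "##", else 0.
 * Returns the token count, or -1 if an allocation fails. */
int parse_string(const char *line, char ***tokens, int *saw_end)
{
    size_t length = strlen(line);                 /* computed once */
    char *buffer = malloc(length + 1);            /* +1 for the '\0' */
    char **list = malloc((MAX_TOKS + 2) * sizeof *list);
    int count = 0;
    char *tok;

    if (buffer == NULL || list == NULL) {
        free(buffer);
        free(list);
        return -1;
    }
    memcpy(buffer, line, length + 1);

    /* Look for the marker in the untouched original, before strtok
     * starts writing '\0' bytes into the copy. */
    *saw_end = (strstr(line, "##") != NULL);

    list[0] = buffer;
    for (tok = strtok(buffer, DELIMITERS);
         tok != NULL && count < MAX_TOKS;
         tok = strtok(NULL, DELIMITERS)) {
        list[1 + count] = tok;
        count++;
    }
    list[1 + count] = NULL;
    *tokens = list;
    return count;
}
```

The caller releases everything with free((*tokens)[0]) followed by free(*tokens), and can stop reading further lines when *saw_end is set.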

remove a specified number of characters from a string in C

I can't write working code for a function that deletes N characters from the string S, starting from position P. How would you guys write such a function?
void remove_substring(char *s, int p, int n) {
int i;
if(n == 0) {
printf("%s", s);
}
for (i = 0; i < p - 1; i++) {
printf("%c", s[i]);
}
for (i = strlen(s) - n; i < strlen(s); i++) {
printf("%c", s[i]);
}
}
Example:
s: "abcdefghi"
p: 4
n: 3
output:
abcghi
But for a case like n = 0 and p = 1 it's not working!
Thanks a lot!
A few people have shown you how to do this, but most of their solutions are highly condensed, use standard library functions or simply don't explain what's going on. Here's a version that includes not only some very basic error checking but some explanation of what's happening:
void remove_substr(char *s, size_t p, size_t n)
{
// p is 1-indexed for some reason... adjust it.
// p is unsigned, so reject 0 first: decrementing it would wrap around.
if(p == 0)
return;
p--;
// ensure that we're not being asked to access
// memory past the current end of the string.
// Note that if p is already past the end of
// string then p + n will, necessarily, also be
// past the end of the string so this one check
// is sufficient.
if(p + n > strlen(s)) // p + n == strlen(s) is valid: it removes the tail
return;
// Offset n to account for the data we will be
// skipping.
n += p;
// We copy one character at a time until we
// find the end-of-string character
while(s[n] != 0)
s[p++] = s[n++];
// And make sure our string is properly terminated.
s[p] = 0;
}
One caveat to watch out for: please don't call this function like this:
remove_substr("abcdefghi", 4, 3);
Or like this:
char *s = "abcdefghi";
remove_substr(s, 4, 3);
Doing so will result in undefined behavior, as string literals are read-only and modifying them is not allowed by the standard.
Strictly speaking, you didn't implement a removal of a substring: your code prints the original string with a range of characters removed.
Another thing to note is that according to your example, the index p is one-based, not zero-based like it is in C. Otherwise the output for "abcdefghi", 4, 3 would have been "abcdhi", not "abcghi".
With this in mind, let's make some changes. First, your math is a little off: the last loop should look like this:
for (i = p+n-1; i < strlen(s); i++) {
printf("%c", s[i]);
}
Demo on ideone.
If you would like to use C's zero-based indexing scheme, change your loops as follows:
for (i = 0; i < p; i++) {
printf("%c", s[i]);
}
for (i = p+n; i < strlen(s); i++) {
printf("%c", s[i]);
}
In addition, you should return from the if at the top, or add an else:
if(n == 0) {
printf("%s", s);
return;
}
or
if(n == 0) {
printf("%s", s);
} else {
// The rest of your code here
...
}
or remove the if altogether: it's only an optimization, your code is going to work fine without it, too.
Currently, your code would print the original string twice when n is 0.
If you would like to make your code remove the substring and return a result, you need to allocate the result, and replace printing with copying, like this:
char *remove_substring(char *s, int p, int n) {
// You need to do some checking before calling malloc.
// n == 0 simply yields a full copy, so every non-NULL result can be freed.
size_t len = strlen(s);
if (n < 0 || p < 0 || p+n > len) return NULL;
size_t rlen = len-n+1;
char *res = malloc(rlen);
if (res == NULL) return NULL;
char *pt = res;
// Now let's use the two familiar loops,
// except printf("%c"...) will be replaced with *pt++ = ...
for (int i = 0; i < p; i++) {
*pt++ = s[i];
}
for (int i = p+n; i < strlen(s); i++) {
*pt++ = s[i];
}
*pt='\0';
return res;
}
Note that this new version of your code returns dynamically allocated memory, which needs to be freed after use.
Here is a demo of this modified version on ideone.
Try copying the first part of the string, then the second
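char result[10];
const char input[] = "abcdefg";
int n = 3;
int p = 4;
size_t tail = strlen(input) - p - n; // characters left after the removed range
strncpy(result, input, p);
strncpy(result+p, input+p+n, tail);
result[p + tail] = '\0'; // strncpy does not add the terminator here
printf("%s", result);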
If you are looking to do this without the use of functions like strcpy or strncpy (which I see you said in a comment) then use a similar approach to how strcpy (or at least one possible variant) works under the hood:
void strnewcpy(char *dest, char *origin, int n, int p) {
while(p-- && (*dest++ = *origin++)) // parentheses needed: && binds tighter than =
;
origin += n;
while(*dest++ = *origin++)
;
}
metacode:
allocate a buffer for the destination
declare a pointer s to your source string
advance the pointer "p-1" positions in your source string and copy them on the fly to destination
advance "n" positions
copy rest to destination
What did you try? Doesn't strcpy(s+p, s+p+n) work?
Edit: Fixed to not rely on undefined behaviour in strcpy:
void remove_substring(char *s, int p, int n)
{
p--; // 1 indexed - why?
memmove(s+p, s+p+n, strlen(s+p+n) + 1); // move the tail, including its '\0'
}
If your heart's really set on it, you can also replace the memmove call with a loop:
char *dst = s + p;
char *src = s + p + n;
size_t count = strlen(src) + 1; // tail length, including the '\0'
for (size_t i = 0; i < count; i++)
*dst++ = *src++;
And if you do that, you can strip out the strlen call, too:
while ((*dst++ = *src++) != '\0');
But I'm not sure I recommend compressing it that much.
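For comparison, here is a sketch of the in-place memmove approach with the range checked up front. The name remove_range and the return convention are invented for this example, and pos is zero-based here, unlike the one-based p in the question.

```c
#include <string.h>

/* Removes n characters from s, starting at zero-based index pos, in place.
 * Returns 0 on success, or -1 if the range does not lie inside the string. */
int remove_range(char *s, size_t pos, size_t n)
{
    size_t len = strlen(s);

    /* Written this way so that pos + n cannot overflow. */
    if (pos > len || n > len - pos)
        return -1;

    /* Move the tail, including its '\0', down over the removed range.
     * memmove is required because source and destination overlap. */
    memmove(s + pos, s + pos + n, len - pos - n + 1);
    return 0;
}
```

The argument must be a modifiable array, for example char s[] = "abcdefghi"; remove_range(s, 3, 3); leaves "abcghi" in s. Passing a string literal is undefined behavior.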
