Taking large character array as input in c - c

I want to take a large character array as input.
E.g.: char array[c][d]
where c <= 200000 and d <= 500000.
Is there any way in C programming language to take such a character array as input?

After declaring the array as a char **, you can use the calloc function to give it a size. Your code will look approximately like this:
char** array;
array = calloc(c, sizeof(char*));
This allocates only the c row pointers; the rows do not get a size automatically, so each one must be allocated before it is used:
for (int i = 0; i < c; i++)
array[i] = calloc(d, sizeof(char));
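A minimal sketch of the same allocation wrapped in two helpers with error handling. The names alloc_rows and free_rows are made up for illustration and do not come from the answer. Keep in mind that at the question's full limits the rows alone would need 200000 × 500000 = 10^11 bytes (about 100 GB), so this is only practical when the rows are sized to the real input.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `rows` zero-filled strings of `cols` bytes each.
   Returns NULL (after releasing anything already allocated) on failure. */
char **alloc_rows(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    char **array = calloc(rows, sizeof *array);
    if (array == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++) {
        array[i] = calloc(cols, sizeof **array);
        if (array[i] == NULL) {
            /* undo the rows allocated so far */
            while (i > 0)
                free(array[--i]);
            free(array);
            return NULL;
        }
    }
    return array;
}

/* Release everything obtained from alloc_rows(). */
void free_rows(char **array, size_t rows)
{
    if (array == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(array[i]);
    free(array);
}
```

A caller would pair alloc_rows(c, d) with free_rows(array, c) and treat a NULL return as out of memory.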

Taking large size character array as input in c ...
Is there any way in C programming language to take input such a character array?
If code truly needs to retain all c <= 200000 test cases at once ...
Yes, read in each line, one at a time and then allocate per its length. There is certainly little need for a char array[200000][500000] array. Use an array of char * pointers instead.
#define c_MAX 200000
#define d_MAX 500000
// Allocate pointer array and a single buffer for reading the lines
char **array = malloc(sizeof *array * c_MAX);
if (array == NULL) Handle_OutOfMemory();
char *buffer = malloc(sizeof *buffer * (d_MAX + 3)); // Add room for 1 extra, \n, \0
if (buffer == NULL) Handle_OutOfMemory();
size_t c_count = 0;
while (fgets(buffer, d_MAX + 3, stdin)) {
if (c_count >= c_MAX) Handle_too_many_lines();
size_t len = strlen(buffer);
if (len > 0 && buffer[len-1] == '\n') { // lop off potential \n
buffer[--len] = '\0';
}
if (len > d_MAX) Handle_too_long_a_line();
// Make a copy
size_t sz = sizeof array[c_count][0] * (len + 1);
array[c_count] = malloc(sz);
if (array[c_count] == NULL) Handle_OutOfMemory();
memcpy(array[c_count], buffer, sz);
c_count++;
}
free(buffer);
// TBD code
// Right-size `array` if desired with realloc()
// Use the `c_count` elements of `array`
// when done free them all
for (size_t i = 0; i < c_count; i++) {
free(array[i]);
}
free(array);
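The same approach can be packaged as a function, which makes the ownership rules easier to see. This is only a sketch: read_lines and free_lines are hypothetical names, the limits are passed as parameters instead of the c_MAX and d_MAX macros, and every error is reported by returning NULL rather than through the Handle_... calls.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Release an array returned by read_lines(). */
void free_lines(char **lines, size_t count)
{
    if (lines == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(lines[i]);
    free(lines);
}

/* Hypothetical helper: read up to max_lines lines of up to max_len characters
   from `in`, storing each line in a block sized to its actual length.
   Returns NULL on allocation failure or when a limit is exceeded.
   Assumes max_len + 3 fits in an int, as fgets() requires. */
char **read_lines(FILE *in, size_t max_lines, size_t max_len, size_t *count)
{
    assert(in != NULL && count != NULL && max_lines > 0);
    *count = 0;
    char **lines = malloc(max_lines * sizeof *lines);
    /* room for max_len characters, one excess character, '\n' and '\0' */
    char *buffer = malloc(max_len + 3);
    if (lines == NULL || buffer == NULL) {
        free(lines);
        free(buffer);
        return NULL;
    }
    size_t n = 0;
    while (fgets(buffer, (int)(max_len + 3), in) != NULL) {
        size_t len = strcspn(buffer, "\n");  /* length without the newline */
        buffer[len] = '\0';
        if (n >= max_lines || len > max_len)
            goto fail;                       /* too many lines or line too long */
        lines[n] = malloc(len + 1);
        if (lines[n] == NULL)
            goto fail;
        memcpy(lines[n], buffer, len + 1);   /* copy includes the terminator */
        n++;
    }
    free(buffer);
    *count = n;
    return lines;
fail:
    free(buffer);
    free_lines(lines, n);
    return NULL;
}
```

Memory use then follows the total size of the input plus one pointer per line, not the worst-case c × d product.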

Related

Why are there extra characters in my char[]?

I'm trying to shorten a char[] by a specified number, and for some reason, I've got more characters in my new char[]. Can you help me fix this?
When I tried with 1 or 2 letters, the result is this:
(the d, n, k, a are the first letters of each lines reversed)
#▬w #▬n #▬k #▬a
(the di, an, ok, la are the first two letters of each lines reversed)
#id #an #ok #la
With 3 letters, it works perfectly:
nid ran rok mla
But same problem with more than 3:
qp░nnid qp░aran qp░trok qp░amla
And with more letters than the longest line, it also works perfectly:
eynnid scnaran etrok amla
<--- These are my words backwards --->
char **read(FILE *file, int lineLength, int *pLines)
{
size_t total = 0;
size_t allocated = START;
int line = 0;
char buffer[MAX_LENGTH];
char shortened[lineLength];
/////////
//printf("%d", sizeof(shortened));
char **lines= (char **)malloc(allocated* sizeof(char *));
while (fgets(buffer, MAX_LENGTH, file) != NULL)
{
for (int i = 0; i < lineLength; i++)
{
shortened[i] = buffer[i];
}
int length = strlen(shortened);
if (shortened[length - 1] == '\n')
{
shortened[length - 1] = '\0';
}
if (line == allocated)
{
allocated*= 2;
lines= realloc(lines, allocated* sizeof(char *));
}
lines[line] = (char *)malloc(lineLength);
strcpy(lines[line], shortened);
line++;
}
*pLines = line;
return lines;
}
One major problem is this loop:
for (int i = 0; i < lineLength; i++)
{
shortened[i] = buffer[i];
}
If lineLength > strlen(buffer) then you will copy the null-terminator (and beyond, including data that isn't initialized by the fgets call).
But if strlen(buffer) >= lineLength you will not copy the null-terminator. Then you use the strlen function on shortened which will then go beyond the end of shortened and you will have undefined behavior.
And for a better way to remove the newline (which you need to do for buffer and not shortened) see Removing trailing newline character from fgets() input
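One way to avoid both problems is to measure the line first and terminate the copy explicitly. The sketch below assumes a hypothetical helper named shorten_line and a destination buffer of at least max_chars + 1 bytes (the question's shortened[lineLength] and malloc(lineLength) are each one byte too small for that).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most max_chars characters of `line` (without
   its trailing newline) into dst, which must hold max_chars + 1 bytes.
   Returns the number of characters copied. */
size_t shorten_line(char *dst, const char *line, size_t max_chars)
{
    assert(dst != NULL && line != NULL);
    size_t len = strcspn(line, "\n");   /* length up to '\n' or '\0' */
    if (len > max_chars)
        len = max_chars;
    memcpy(dst, line, len);
    dst[len] = '\0';                    /* always terminate explicitly */
    return len;
}
```

Because the terminator is written at dst[len] in every case, a later strlen or strcpy on the result never reads past the copied characters.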

Reversing a string without using any library function

So I want to write a program which will reverse a string taken from the user.
Here's my source code:
#include <stdio.h>
#include <stdlib.h>
int main(int argv, char *argc[]) {
if (argv != 2) {
printf("Please enter the number of elements in your string!\n");
return 1;
}
int n = atoi(argc[1]);
char *c = malloc((sizeof(char) * n) + 1);
char *o = malloc((sizeof(char) * n) + 1);
printf("Enter your string - ");
fgets(c, n, stdin);
for (int i = 0; i < n + 1; i++) {
*(o + i) = *(c + (n - 1) - i);
}
printf("%s\n", o);
free(c);
free(o);
}
But the printed output is nothing!
Can someone please point out what's wrong with my code?
The issue that prevents the code from working is a mismatch between the size of the o and c buffers and the size passed to fgets, because fgets null-terminates the string it reads.
Say n = 6: fgets(c, 6, stdin) stores at most 5 characters and writes the null terminator after them, so for an input of 6 or more characters c[5] is '\0'. When you reverse the buffer, that null terminator becomes the first character of o, which makes o an empty string, since a C string ends at its first null byte.
To fix this, give fgets the size of your malloc'ed space.
fgets(c, n + 1, stdin);
And null-terminate o when you are finished reversing.
*(o + n) = '\0';
Or
o[n] = '\0'; //you can use this notation which is more readable than dereferencing
Minor issues:
The names of the main arguments are switched. The usual form is int main(int argc, char * argv[]); the swapped names can confuse someone who reads your code.
char *c = malloc((sizeof(char) * n) + 1); has unnecessary logic; it can be char *c = malloc(n + 1); because a char is one byte in size.
There is an underlying problem with the logic of the program: when the input string is shorter than the length the user announced, the output will not be the desired one. You can make an extra effort to bullet-proof your code against erroneous input.
All things considered, taking your code as base, it can be something like:
//Only the changed parts are represented, the rest is the same
#include <string.h> //for strlen
//...
if (argc != 2 || atoi(argv[1]) < 1) { //n must be positive (I switched argv and argc)
printf("Please enter the number of elements in your string!\n");
return 1;
}
size_t n = atoi(argv[1]); //size_t type more suited for sizes
char *c = malloc(n + 1);
char *o = malloc(n + 1);
//...
fgets(c, n + 1, stdin); //as stated n + 1 size argument
if(strlen(c) < n) { //if the input string is shorter than intended
puts("The string size shorter than stated!");
return 1;
}
//...
for (size_t i = 0; i < n; i++) { //replaced int iterator type with size_t; stops at n so that n - 1 - i cannot wrap around
//...
o[n] = '\0'; //null terminate o
//...
There are multiple problems in your program:
why do you require an argument for the number of characters? it would be much simpler to assume a maximum length and define char arrays in main() with automatic storage.
the statement char *c = malloc((sizeof(char) * n) + 1); computes the correct allocation size, but by chance because sizeof(char) is always 1. You should write char *c = malloc(n + 1); or char *c = malloc(sizeof(*c) * (n + 1));.
since fgets() will store the newline, you should increase the allocation size by 1 to avoid leaving the newline in the input stream, but you will need to avoid including the newline in the characters to reverse. In all cases, you must pass the size of the array to fgets(), not n because fgets() would then only store n - 1 bytes into the array and set c[n - 1] to a null terminator, which causes the reversed string to start with a null terminator, making it an empty string.
you do not test if fgets() succeeded at reading standard input.
you do not compute the number of characters to reverse. If the user entered fewer characters than n, you will transpose bytes beyond those that were entered, possibly null bytes which will make the reversed string empty (this is a good explanation for what you observe).
the transposition loop should iterate for i = 0 while i < n, not n + 1.
you do not set the null terminator at the end of the destination array. This array is allocated with malloc(), so it is uninitialized.
Here is a modified version:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argv, char *argc[]) {
if (argv != 2) {
printf("Please enter the maximum number of characters in your string!\n");
return 1;
}
int n = atoi(argc[1]);
if (n < 1) {
printf("Invalid number of characters: %d\n", n);
return 1;
}
// allocate 1 extra byte for the newline, one more for the null terminator
char *buf = malloc(n + 2);
char *out = malloc(n + 2);
printf("Enter your string: ");
if (!fgets(buf, n + 2, stdin)) {
printf("no input\n");
return 1;
}
// get the number of characters in the input before the newline, if any
int len;
for (len = 0; buf[len] && buf[len] != '\n'; len++)
continue;
// if you can use the library function `strcspn()`, replace the for loop with this:
//len = strcspn(buf, "\n");
// copy the string in reverse order
for (int i = 0; i < len; i++) {
out[i] = buf[len - 1 - i];
}
// set the null terminator
out[len] = '\0';
printf("%s\n", out);
free(buf);
free(out);
return 0;
}
It is also possible that you run your program from the IDE on a system that closes the terminal window as soon as the program terminates. This would prevent you from seeing the output. Add a getchar(); before the return 0; to fix this problem or run the program manually from a shell window.
what's wrong with my code!
Key functional problems include:
Reading "12345\n" with fgets() takes at least 6 bytes; 7 is better (see note 1).
Missing null character on o[]
With fgets(c, n, stdin), c[n-1] is a null character and with "reversing", as code assumes n characters, c[n-1] becomes o[0] and so code prints the empty string.
// fgets(c, n, stdin); // too small
fgets(c, n + 1, stdin);
// add before printing.
o[n] = '\0';
Other minor issues exist.
Note 1: Increase the allocation too, and then lop off the \n from the input.
your program will fail if the value n does not match the number of
characters in the input string mainly because you do not initialize
the memory that you allocate.
e.g.
n = 10
c = "hello"
length of c is 5 but you have allocated 11 bytes, so the bytes after hello\n\0 are uninitialized in c since fgets will not fill those out for you.
in memory it looks something like this
+---+---+---+---+---+---+---+---+---+---+---+
c ->| h | e | l | l | o |\n |\0 | | | | |
+---+---+---+---+---+---+---+---+---+---+---+
when you turn the string around with
*(o + i) = *(c + n - 1 - i)
since you are using n as offset to start copying characters, you start beyond "hello\n\0" copying
position 9 (10 - 1 - 0) and placing this as first character in o,
but since all of c is not initialized anything can be there, also even a \0 which could explain why you
don't print anything.
A better approach is to calculate the length of the string with a simple for loop once you have read it:
int len = 0;
for (len = 0; c[len] && c[len] != '\n'; ++len);
and then use len as the offset instead of n
*(o + i) = *(c + len - 1 - i)
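Putting the points from these answers together, the reversal can be sketched as a small function that measures the input itself and does not rely on n. The name reverse_line is made up for illustration; no string library function is called, and assert is used only as a precondition check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write the reverse of `src` (up to, but not including,
   the first '\n' or '\0') into dst and return the number of characters.
   dst must have room for that many characters plus the terminator. */
size_t reverse_line(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t len = 0;
    while (src[len] != '\0' && src[len] != '\n')
        len++;                      /* measured length, not the announced n */
    for (size_t i = 0; i < len; i++)
        dst[i] = src[len - 1 - i];
    dst[len] = '\0';                /* malloc'ed memory is not zeroed for us */
    return len;
}
```

With buffers of n + 2 bytes, as in the modified program above, the call would be reverse_line(out, buf) right after a successful fgets.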

Dynamic array using malloc and realloc?

I'm trying to collect input integers one by one. My array starts with size 1 and I want to expand it by 1, with every input collected (is this a smart thing to do?)
Anyway, this is the code I managed to come up with, but it's not working as intended.
After this process, sizeof(array) always returns 8, which I assume means that the array is only being resized once. (sizeof(int) is 4 bytes)
Trying to output the array results in multiple instances of the first input variable.
OUTPUT CODE
for(int s=0;s<sizeof(array)/sizeof(int);s++){
printf("%i\n",array[i]);
}
ORIGINAL CODE:
int i;
int size = 1;
int *array = malloc(size * sizeof(int));
int position = 0;
do{
i = getchar() - 48;
if (i != -16 && i != -38 && i != 0) {
array[position] = i;
position++;
size++;
*array = realloc(array, size * sizeof(int));
}
} while (i != 0);
UPDATED STILL NOT WORKING CODE
int i;
int size = 1;
int *array = malloc(size * sizeof(int));
int position = 0;
do{
i = getchar() - 48;
if (i != -16 && i != -38 && i != 0) {
array[position] = i;
position++;
size++;
array = realloc(array, size * sizeof(int));
}
} while (i != 0);
array = realloc(...)
not *array. Per the realloc docs, realloc returns the pointer, which you can store directly in your array pointer.
Edit: One thing that will make your life easier: use char constants instead of raw numbers. E.g.,
i = getchar();
if(i != ' ' && i != '\n' && i != '0') {
/* 48-16 48-38 48-0 right? */
array[position] = i - '0'; /* '0' = 48 */
One thing that jumps out at me: inside your loop, this line:
*array = realloc(array, size * sizeof(int));
should instead be:
array = realloc(array, size * sizeof(int));
In the original version, you were sticking the result of realloc in the first element of the array by dereferencing the pointer first. Without the asterisk, you're reassigning the array itself.
(With some copy-paste from my comment:) sizeof(array) returns 8 because it equals sizeof(int*) (array is type int*) which is 8 (you're probably compiling as 64-bit). sizeof doesn't work how you think for pointers to arrays.
Similarly, your output code is wrong, for the same reason. You only print the first two elements because sizeof(array)/sizeof(int) will always be 8/4=2. It should be
for(int s=0;s<size;s++){
printf("%i\n",array[s]);
}
(note also changed index variable i to s)
where size is the variable from your other code chunk(s). You cannot find the length of the array from sizeof if it's dynamically allocated with pointers; that's impossible. Your code must "remember" the size of your array.
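As for whether growing by one element per input is a smart thing to do: it works, but a common alternative is to double the capacity when the array is full and to keep the element count next to the pointer. The sketch below is one possible shape for that, with made-up names (struct int_vec, int_vec_push); it also assigns the realloc result to a temporary first, so the old block is not lost if realloc fails.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array: the struct remembers both how many values are
   stored (count) and how many fit in the current allocation (capacity). */
struct int_vec {
    int *data;
    size_t count;
    size_t capacity;
};

/* Append one value; returns 0 on success, -1 if memory runs out
   (in which case the existing contents are left untouched). */
int int_vec_push(struct int_vec *v, int value)
{
    assert(v != NULL);
    if (v->count == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 4;
        int *tmp = realloc(v->data, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;              /* v->data is still valid here */
        v->data = tmp;
        v->capacity = new_cap;
    }
    v->data[v->count++] = value;
    return 0;
}
```

Start from struct int_vec v = {0}; so the first push allocates, loop over v.count elements when printing, and call free(v.data) at the end.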

Invalid read - Valgrind and C

New to C and Valgrind and manual memory management and I'm having trouble locating an error that I'm getting when I run Valgrind. I have this function which gets strings from the user:
char **get_fragments_from_user(){
// No more than 20k strings containing at most 1k characters
char **strings = malloc(20000 * sizeof(char *));
char tempstring[MAX_INPUT]; //MAX_INPUT = 1001
int count = 0;
while(true){
printf("\n> ");
fgets(tempstring, MAX_INPUT, stdin);
if((strlen(tempstring) > 0) && (tempstring[strlen(tempstring) - 1] == '\n')){
tempstring[strlen(tempstring) - 1] = '\0';
}
if(tempstring[0] == 'q') break;
strings[count] = malloc(sizeof(char) * (strlen(tempstring)+1));
strcpy(strings[count], tempstring);
count++;
}
int i = 0;
char **fstrings = malloc((count)*sizeof(char *)); // count+1 needed? Something I tried removing while debugging
for(i = 0; i < count; i++){
fstrings[i] = malloc(sizeof(char) * (strlen(strings[i])+1));
strcpy(fstrings[i], strings[i]);
free(strings[i]);
}
free(strings);
return fstrings;
}
The idea here is simply to get strings and put them in an array. I initially allocate an array that is large enough to fit the maximum number of strings that could ever be entered (20,000), but I then resize the array so that I don't allocate more memory than each string needs. I am a little embarrassed by the above code, since it's less clean than anything I would have written in another language, but that was my first pass through.
I then get "Invalid read of size 8" from Valgrind when I try to calculate the number of strings in the array using this function:
int lengthOf(char **arr){
int i = 0;
while(arr[i] != NULL){
i++;
}
return i;
}
I'm pretty sure this is due to a dereferenced pointer or something, but I can't find it for the life of me and I've been looking at this code for an hour or so.
So, I believe the problem was that I wasn't allocating enough memory to store the whole array.
Instead of doing:
malloc(count * sizeof(char *));
I should have been allocating count+1, so either:
malloc((count + 1) * sizeof(char *))
or
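calloc((count + 1), sizeof(char *));
In either case, fstrings[count] also has to be set to NULL after the copy loop, so that lengthOf has a terminator to find.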
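A minimal sketch of the sentinel idea, assuming hypothetical helpers copy_with_sentinel and length_of (the latter mirrors lengthOf from the question). The extra slot is reserved and then filled with NULL explicitly, which is the part a loop like lengthOf depends on.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `count` strings into a new array that ends with a
   NULL pointer, so a counting loop has a sentinel to stop at. */
char **copy_with_sentinel(char *const *strings, size_t count)
{
    assert(strings != NULL || count == 0);
    char **out = malloc((count + 1) * sizeof *out);   /* +1 for the sentinel */
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        size_t size = strlen(strings[i]) + 1;
        out[i] = malloc(size);
        if (out[i] == NULL) {
            while (i > 0)
                free(out[--i]);
            free(out);
            return NULL;
        }
        memcpy(out[i], strings[i], size);
    }
    out[count] = NULL;      /* malloc does not initialize this slot */
    return out;
}

/* Count entries up to the NULL sentinel. */
size_t length_of(char *const *arr)
{
    size_t n = 0;
    while (arr[n] != NULL)
        n++;
    return n;
}
```

Returning the count through an output parameter would avoid the sentinel altogether; the sentinel version is shown because it matches the lengthOf function in the question.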

Allocate a 3D char array in C (char ***)

I would like to allocate a char***.
I have got a sentence like this: "This is a command && which I || need to ; split"
I need to put in each box a full sentence just like that:
cmd[0] = "This is a command"
cmd[1] = "which I"
cmd[2] = "need to"
cmd[3] = "split"
Sentences are separated by tokens like &&, ||, ;, |.
My problem is that I don't know how to allocate my triple dimension array.
I always get a Segmentation Fault.
This is what I do :
for(k = 0; k < 1024; k++)
for( j = 0; j < 1024; j++)
cmd[k][j] = malloc(1024);
But a few lines later, in another loop:
cmd[k][l] = array[i];
I get a segfault here.
How can I do this, please?
Thanks in advance
Please keep in mind that a 2/3D array in C is not the same as a char ***.
If all you wish is to have a 1024^3 character array then you will be good with
char array[1024][1024][1024];
But keep in mind that this will allocate 1 GB of space on your stack which may or may not work.
To allocate this much on the heap you need to type it correctly:
char (*array)[1024][1024] = malloc(1024*1024*1024);
In this scenario array is a pointer to an array of 2D 1024x1024 character matrices.
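To make the pointer-to-array form easier to experiment with, here is a sketch with deliberately small dimensions. ROWS, COLS, plane_t and alloc_cube are names invented for this example; the answer itself uses 1024 for every dimension.

```c
#include <assert.h>
#include <stdlib.h>

enum { ROWS = 4, COLS = 8 };        /* small illustrative sizes, not the 1024 used above */
typedef char plane_t[ROWS][COLS];   /* one 2D character matrix */

/* Allocate `planes` zero-filled matrices as a single contiguous block.
   Index the result as cube[p][r][c] and release it with one free(). */
plane_t *alloc_cube(size_t planes)
{
    assert(planes > 0);
    return calloc(planes, sizeof(plane_t));
}
```

Since the whole cube is one allocation, there are no intermediate pointer arrays to set up or free, which is the main difference from the char *** layout.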
If you really want to work with char *** (which I do not recommend if your array lengths are static) then you need to allocate all intermediary arrays too:
char *** cmd = malloc(sizeof(char **) * 1024);
for(k = 0; k < 1024; k++) {
cmd[k] = malloc(sizeof(char *) * 1024);
for( j = 0; j < 1024; j++)
cmd[k][j] = malloc(1024);
}
If you are going to be splitting your string by delimiters that are longer than a single character, then this is how you could do it with string search.
The following function will accept an input string and a delimiter string.
It will return a char ** which has to be freed, and it will destroy your input string (reusing its memory to store the tokens).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char ** split_string(char * input, const char * delim) {
size_t num_tokens = 0;
size_t token_memory = 16; // initial capacity: 16 tokens
char ** tokens = malloc(token_memory * sizeof(char *));
char * found;
if (!tokens || !*delim) { // out of memory, or an empty delimiter that would never advance
free(tokens);
return NULL;
}
while ((found = strstr(input, delim))) { // while a delimiter is found
if (input != found) { // if the string does not start with a delimiter
if (num_tokens == token_memory) { // grow the token array if it is too small
void * tmp = realloc(tokens, (token_memory * 2) * sizeof(char *));
if (!tmp) {
perror("realloc"); // out of memory
free(tokens);
return NULL;
}
tokens = tmp;
token_memory *= 2;
}
tokens[num_tokens++] = input;
*found = '\0';
}
// trim off the processed part of the string
input = found + strlen(delim);
}
// make room for the text after the last delimiter and for the terminating NULL
void * tmp = realloc(tokens, (num_tokens + 2) * sizeof(char *));
if (!tmp) {
perror("realloc"); // out of memory
free(tokens);
return NULL;
}
tokens = tmp;
if (*input != '\0') tokens[num_tokens++] = input; // text after the last delimiter is a token too
// the NULL terminator lets the caller count the tokens
tokens[num_tokens] = NULL;
return tokens;
}
You will need to recursively run this to split by more than one delimiter.
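As an alternative to running the function once per delimiter, the string can be cut at any of several delimiters in a single pass. The sketch below is not the answer's method; split_any is a hypothetical function that, like split_string, modifies the input in place and returns a NULL-terminated array of pointers into it. Delimiters are matched exactly, so surrounding spaces belong to the delimiter strings if they should be removed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical one-pass splitter: cuts `input` in place at every occurrence of
   any string in delims[0..ndelims-1]. Empty pieces are skipped. The caller
   frees only the returned array, not the individual tokens. */
char **split_any(char *input, const char *const *delims, size_t ndelims)
{
    assert(input != NULL && delims != NULL);
    size_t count = 0, capacity = 8;
    char **tokens = malloc(capacity * sizeof *tokens);
    if (tokens == NULL)
        return NULL;
    char *start = input;            /* beginning of the current piece */
    char *p = input;
    for (;;) {
        /* longest delimiter that matches at p (so "||" wins over "|") */
        size_t match = 0;
        for (size_t d = 0; *p != '\0' && d < ndelims; d++) {
            size_t len = strlen(delims[d]);
            if (len > match && strncmp(p, delims[d], len) == 0)
                match = len;
        }
        if (match == 0 && *p != '\0') {
            p++;                    /* ordinary character, keep scanning */
            continue;
        }
        /* p is now at a delimiter or at the end of the string */
        int at_end = (*p == '\0');
        *p = '\0';                  /* terminate the current piece */
        if (*start != '\0') {
            if (count + 1 == capacity) {    /* keep one slot for the final NULL */
                char **tmp = realloc(tokens, capacity * 2 * sizeof *tmp);
                if (tmp == NULL) {
                    free(tokens);
                    return NULL;
                }
                tokens = tmp;
                capacity *= 2;
            }
            tokens[count++] = start;
        }
        if (at_end)
            break;
        p += match;                 /* continue after the delimiter */
        start = p;
    }
    tokens[count] = NULL;
    return tokens;
}
```

For the sentence in the question, passing the delimiters " && ", " || ", " ; " and " | " yields the four commands as cmd[0] to cmd[3], which suggests a char ** with one string per command is enough and a char *** is not needed for this step.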

Resources