Getting the input strings into an array in C - c

In C, we do something like:
int main(int argc, char **argv) {
printf("The first argument is %s", argv[1]);
printf("The second argument is %s", argv[2]);
return 0;
}
I was wondering if it's possible to store strings in an array in similar way as above when using scanf or fgets .
I tried like:
char **input;
scanf("%s", &input);
Is there any way I can access the strings entered as input[0], input[1], and so on?

Yes, but you need to make sure you have enough space to do so:
char input[3][50]; // enough space for 3 strings with
// a length of 50 (including \0)
fgets(input[0], 50, stdin); // input[0] decays to char *, which is what fgets expects
printf("Inputted string: %s\n", input[0]);
A bare char **input is an uninitialized pointer with no storage behind it, so there is nowhere to put the input; that is why your attempt cannot work.
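To fill every slot of such an array rather than just the first, a loop over fgets is enough. The following is a minimal sketch, not part of the original answer: the names read_fixed_lines, MAX_STRINGS and MAX_LEN are made up for illustration, and the limits simply mirror the 3 x 50 array above.

```c
#include <stdio.h>
#include <string.h>

#define MAX_STRINGS 3   /* illustrative limits, mirroring the 3 x 50 array */
#define MAX_LEN     50

/* Reads up to MAX_STRINGS lines from fp into lines[][], stripping the
 * trailing newline. Returns the number of lines stored. */
size_t read_fixed_lines(FILE *fp, char lines[][MAX_LEN])
{
    size_t n = 0;
    while (n < MAX_STRINGS && fgets(lines[n], MAX_LEN, fp) != NULL) {
        lines[n][strcspn(lines[n], "\n")] = '\0';  /* drop '\n' if present */
        n++;
    }
    return n;
}
```

Calling read_fixed_lines(stdin, input) would then let you use input[0], input[1] and so on, much like argv. Note that a line longer than 49 characters is not rejected here; its remainder spills into the next slot.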

It's possible, but somewhat tedious, especially if you don't know the number of strings at the beginning.
char **input;
That much is fine. From there, you need to allocate an array of (the right number of) pointers:
input = malloc(sizeof(char *) * MAX_LINES);
Then you need to allocate space for each line. Since you typically only want enough space for each string, you typically do something like this:
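#define MAX_LINE_LEN 8192
static char buffer[MAX_LINE_LEN];
long current_line = 0;
// check the limit first, so a line is never read and then dropped
while (current_line < MAX_LINES && fgets(buffer, sizeof(buffer), infile)) {
input[current_line] = malloc(strlen(buffer)+1);
if (input[current_line] == NULL) break; // out of memory
strcpy(input[current_line++], buffer); // copy into the new string, not into buffer
}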
If you don't know the number of lines up-front, you typically allocate a number of pointers to start with (about as above), but as you read each line, check whether you've exceeded the current allocation, and if you have, realloc the array of pointers to get more space.
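The growing-array idea can be sketched as follows. This is an illustration under assumptions rather than the answer's own code: the function name read_lines, the starting capacity of 16 and the doubling policy are all arbitrary choices.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads every line of fp into a growing array of malloc'd strings.
 * Lines longer than the local buffer end up split across entries.
 * On success stores the line count in *count and returns the array;
 * on allocation failure frees everything and returns NULL. */
char **read_lines(FILE *fp, size_t *count)
{
    char buffer[8192];
    size_t used = 0, cap = 16;              /* 16 is an arbitrary start */
    char **lines = malloc(cap * sizeof *lines);
    if (lines == NULL)
        return NULL;

    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        if (used == cap) {                  /* pointer array is full: double it */
            char **tmp = realloc(lines, 2 * cap * sizeof *tmp);
            if (tmp == NULL)
                goto fail;
            lines = tmp;
            cap *= 2;
        }
        lines[used] = malloc(strlen(buffer) + 1);
        if (lines[used] == NULL)
            goto fail;
        strcpy(lines[used++], buffer);
    }
    *count = used;
    return lines;

fail:
    while (used > 0)
        free(lines[--used]);
    free(lines);
    return NULL;
}
```

The caller owns the result and has to free each lines[i] and then lines itself. Doubling keeps the number of realloc calls logarithmic in the number of lines; a fixed increment would also work.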
If you want to badly enough, you can do the same with each individual line. Above, I've simply set a maximum that's large enough you probably won't exceed it very often with most typical text files. If you need it larger, it's pretty easy to expand that. At the same time, any number you pick will be an arbitrary limit. If you want to, you can read a chunk into your buffer, and if the last character in the string is not a new-line, keep reading more into the same string (and, again, use realloc to expand the allocation as needed). This isn't terribly difficult to do, but covering all the corner cases correctly can/does get tedious.
Edit: I should add that there's a rather different way to get the same basic effect. Read the entire content of the file into a single big buffer, then (typically) use strtok to break the buffer into lines (replacing "\n" with "\0") to build an array of pointers into the buffer. This typically improves speed somewhat (one big read instead of many one-line reads) as well as allocation overhead because you use one big allocation instead of many small ones. Each allocation will typically have a header, and get rounded to something like a (multiple of some) power of two. The effect of this varies with the line length involved. If you have a few long lines, it probably won't matter much. If you have a lot of short lines, it can save a lot.
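A rough sketch of the splitting step, assuming the file content is already in one writable, null-terminated buffer. The name split_lines and its parameters are invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* Splits buf in place at '\n', storing a pointer to the start of each line
 * in lines[]. At most max pointers are stored. Returns the number stored.
 * strtok treats a run of delimiters as one, so empty lines are skipped. */
size_t split_lines(char *buf, char **lines, size_t max)
{
    size_t n = 0;
    for (char *p = strtok(buf, "\n"); p != NULL && n < max; p = strtok(NULL, "\n"))
        lines[n++] = p;
    return n;
}
```

The pointers in lines[] point into buf, so nothing is copied and the whole lot is released by freeing the one big buffer. If empty lines matter, a manual loop with strchr would be needed instead of strtok.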

Related

How to store multiple sets of strings in a character array

I am trying to store a given number of sets of strings in a 3D character array, but couldn't.
Is it even possible using char arrays, or should I use some other concept like data structures?
int main()
{
int i,j,T,N[10];
char word[10][10][10];
scanf("%d",&T);/* No of test cases*/
for(i=0;i<T;i++)
{
scanf("%d",&N[i]); /*No of strings*/
for(j=0;j<N[i];j++)
scanf("%s",word[i][j]); /* reading the strings*/
}
return 0;
}
First: a "3D character array" is better thought of as a "2D string matrix" in this case.
And yes, of course it's very possible.
There are some weaknesses in your code that might trip it up; it's hard to say which, since you don't show a full test case with the input data you provide.
scanf() can fail, in which case you cannot rely on the variables having values.
scanf() with %s will stop on the first whitespace character, which might cause your scanning code to become very confused.
You don't limit the size of the string you scan, but only provide 10 bytes of buffer space per string, so it is easy to get buffer overruns.
A better solution would be to check that the scanning succeeded, and make each string be on a line on its own, and read full lines using fgets() into an appropriately-sized buffer, then perhaps copy the part you want to keep into place in the matrix.
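One way this advice might look in code is sketched below. The helper names read_count and read_word are hypothetical, and WORD_LEN is assumed to match the 10-byte slots of the word array in the question.

```c
#include <stdio.h>

#define WORD_LEN 10  /* assumed to match word[10][10][10] in the question */

/* Reads one line and parses a count from it. Returns the count,
 * or -1 if the line is missing, not a number, or outside 0..max. */
int read_count(FILE *fp, int max)
{
    char line[64];
    int n;
    if (fgets(line, sizeof line, fp) == NULL || sscanf(line, "%d", &n) != 1)
        return -1;
    return (n >= 0 && n <= max) ? n : -1;
}

/* Reads one line and copies its first whitespace-delimited token
 * (at most WORD_LEN - 1 characters) into word.
 * Returns 1 on success, 0 otherwise. */
int read_word(FILE *fp, char word[WORD_LEN])
{
    char line[256];
    if (fgets(line, sizeof line, fp) == NULL)
        return 0;
    return sscanf(line, "%9s", word) == 1;
}
```

With these, the loops in the question would call read_count(stdin, 10) for T and for each N[i], stop on -1, and call read_word(stdin, word[i][j]) for each string. Over-long words are truncated to 9 characters instead of overflowing.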

Will assigning a large value for length of char string be an issue?

I am reading a line from a file and I do not know the length it is going to be. I know there are ways to do this with pointers, but I am specifically asking about just a plain char string. For example, if I initialize the string like this:
char string[300]; //(or bigger)
Will having large string values like this be a problem?
Any hard coded number is potentially too small to read the contents of a file. It's best to compute the size at run time, allocate memory for the contents, and then read the contents.
See Read file contents with unknown size.
char string[300]; //(or bigger)
I am not sure which of the two issues you are concerned with, so I will try to address both below:
If the string in the file is larger than 300 bytes and you try to "stick" that string in that buffer without accounting for the maximum length of your array, you will get undefined behaviour because you write past the end of the array.
If you are just asking whether 300 bytes is too much to allocate, then no, it is not a big deal unless you are on some very restricted device. For example, in Visual Studio the default stack size (where that array would be stored) is 1 MB, if I am not wrong. The benefit of doing so is understandable: you don't need to concern yourself with freeing it, for instance.
PS. So if you are sure the buffer size you specify is enough, this can be a fine approach, as you free yourself from the memory-management issues that come with pointers and dynamic memory.
Will having large string values like this be a problem?
Absolutely.
If your application must read the entire line from a file before processing it, then you have two options.
1) Allocate buffer large enough to hold the line of maximum allowed length. For example, the SMTP protocol does not allow lines longer than 998 characters. In that case you can allocate a static buffer of length 1001 (998 + \r + \n + \0). Once you have read a line from a file (or from a client, in the example context) which is longer than the maximum length (that is, you have read 1000 characters and the last one is not \n), you can treat it as a fatal (protocol) error and report it.
2) If there are no limitations on the length of the input line, the only thing you can do to keep your program robust is to allocate buffers dynamically as the input is read. This may involve storing multiple malloc-ed buffers in a linked list, or calling realloc when buffer exhaustion is detected (this is how the getline function works, although it is not specified in the C standard, only in POSIX.1-2008).
In either case, never use gets to read the line. Call fgets instead.
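Option 1 can be sketched with fgets alone. The function name read_bounded_line and its return convention are assumptions made for this example; the policy of discarding the rest of an over-long line is one choice among several.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line into buf (capacity size, which must be at least 2).
 * Returns 1 if a whole line was read (including a final line that ends
 * at end-of-file without '\n'), 0 at end of input, and -1 if the line
 * did not fit; in that case the rest of the line is discarded. */
int read_bounded_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        return 1;
    if (len + 1 < size)
        return 1;              /* short final line with no '\n' before EOF */
    int c = fgetc(fp);
    if (c == EOF)
        return 1;              /* buffer exactly full at end of file */
    while (c != '\n' && c != EOF)
        c = fgetc(fp);         /* discard the remainder of the over-long line */
    return -1;
}
```

For the SMTP-style case described above, the caller would pass a 1001-byte buffer and treat a return value of -1 as the fatal protocol error.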
It all depends on how you read the line. For example:
char string[300];
FILE* fp = fopen(filename, "r");
//Error checking omitted
fgets(string, 300, fp);
Taken from tutorialspoint.com
The C library function char *fgets(char *str, int n, FILE *stream) reads a line from the specified stream and stores it into the string pointed to by str. It stops when either (n-1) characters are read, the newline character is read, or the end-of-file is reached, whichever comes first.
That means that this will read 299 characters from the file at most. This will cause only a logical error (because you might not get all the data you need) that won't cause any undefined behavior.
But, if you do:
char string[300];
int i = 0;
FILE* fp = fopen(filename, "r");
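do{
string[i] = fgetc(fp);
i++;
} while(string[i - 1] != '\n');
This never checks i against the size of the array, so on lines longer than 300 characters it writes past the end of string. That is undefined behavior, which typically shows up as a segmentation fault or silently corrupted memory.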

How to declare a char array of size same as the size of the string entered at the runtime in C?

I want to get a string from the user in a char array that has no fixed length. The length should be equal to the length of the string that the user enters. I tried malloc(), but that also requires the size to be specified. Please help.
Please note: I want to use a char array, not a string type.
C strings do not pack their length with them. Every C string is a plain array of characters, with a null after the last char to indicate its end. Standard functions from the C IO library will generally receive, therefore, an array of chars and write data into it. The array will have to be big enough to hold everything that is typed by the user. Most functions won't even check for buffer overflows.
Now what you can do is ask first for the max length of the string the user is going to type and allocate exact memory, or you can declare a huge array and define its size as the max string length.
char bigBuffer[2048];
fgets(bigBuffer, 2048, stdin);
fgets() allows you to specify the maximum number of chars to store. If the user types more than 2047 chars in this example, fgets() does not report an error: it stores the first 2047 chars plus the terminating null and leaves the rest in the input stream for the next read. Either way the buffer cannot overflow.
It is not possible to allocate memory of infinite length. Every allocation is bounded in size one way or another.
There are two ways to handle your situation.
1. Allocate a buffer large enough that no plausible user input will overrun it.
2. [Better Option] Use a reasonably sized buffer and read into it with a length-checked function such as fgets.
What you need is a basic unlimited input function. The idea is to allocate a reasonably sized buffer for input, begin reading one char at a time, and if you exceed the buffer size to realloc it and increase its size.
You could optimize this a bit by reading strings the length of the remaining space, but that gets complicated and fiddly. Mainly not worth it.
I wrote this code off the top of my head so it probably won't compile and work as-is, but it should give you the basic idea.
char *buffer = malloc(100);
size_t bufferLen = 100;
size_t currLen = 0;
int c;
while ((c = getchar()) != EOF && c != '\n')
{
if (currLen + 1 >= bufferLen) // +1 because we must leave room for the null terminator
{
char *tmp = realloc(buffer, bufferLen + 100);
if (tmp == NULL)
break; // out of memory: keep what has been read so far
buffer = tmp;
bufferLen += 100;
}
buffer[currLen++] = (char)c;
}
buffer[currLen] = '\0';
This can be done in indirect way.
Read one character at a time from the input, using realloc to grow the allocation as you go. Calling realloc for every character is not efficient, but it achieves what you asked for.
Here is the code snippet for that.
int ch; // int, not char, so EOF can be told apart from real characters
int count=0;
char *charArray=NULL;
printf("Enter string\n");
while((ch=getchar())!='\n' && ch!=EOF)//This condition can be changed according to needs
{
count=count+1;
charArray=(char *)realloc(charArray,count+1);//+1 leaves room for the terminating '\0'
charArray[count-1]=(char)ch;
charArray[count]='\0';
}
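You create the array after the user has entered the string: read into a temporary buffer first, then size the array from what was read. Something along the lines of
char word[256]; // temporary buffer; the 255-character cap is arbitrary
if (scanf("%255s", word) == 1) {
char myArray[strlen(word) + 1]; // C99 variable-length array sized to the input
strcpy(myArray, word);
}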

string input and output in C

I have this snippet of the code:
char* receiveInput(){
char *s;
scanf("%s",s);
return s;
}
int main()
{
char *str = receiveInput();
int length = strlen(str);
printf("Your string is %s, length is %d\n", str, length);
return 0;
}
I receive this output:
Your string is hellàÿ", length is 11
my input was:
helloworld!
can somebody explain why, and why this style of the coding is bad, thanks in advance
Several answers have addressed what you've done wrong and how to fix it, but you also said (emphasis mine):
can somebody explain why, and why this style of the coding is bad
I think scanf is a terrible way to read input. It's inconsistent with printf, makes it easy to forget to check for errors, makes it hard to recover from errors, and is incompatible with ordinary (and easier to do correctly) read operations (like fgets and company).
First, note that the "%s" format will read only until it sees whitespace. Why whitespace? Why does "%s" print out an entire string, but reads in strings in such a limited capacity?
If you'd like to read in an entire line, as you may often be wont to do, scanf provides... with "%[^\n]". What? What is that? When did this become Perl?
But the real problem is that neither of those are safe. They both freely overflow with no bounds checking. Want bounds checking? Okay, you got it: "%10s" (and "%10[^\n]" is starting to look even worse). That will read at most 10 characters and add a terminating nul-character automatically, so the array needs room for 11. So that's good... for when our array size never needs to change.
What if we want to pass the size of our array as an argument to scanf? printf can do this:
char string[] = "Hello, world!";
printf("%.*s\n", (int) sizeof string, string); // prints whole message; the precision argument must be an int
printf("%.*s\n", 6, string); // prints just "Hello,"
Want to do the same thing with scanf? Here's how:
static char tmp[32]; // enough for '%', up to 20 digits, 's' and the nul
// if we did the math right we shouldn't need to use snprintf
snprintf(tmp, sizeof tmp, "%%%us", (unsigned)(bufsize - 1)); // the width excludes the terminating nul
scanf(tmp, buffer);
That's right - scanf doesn't support the "%.*s" variable precision printf does, so to do dynamic bounds checking with scanf we have to construct our own format string in a temporary buffer. This is all kinds of bad, and even though it's actually safe here it will look like a really bad idea to anyone just dropping in.
Meanwhile, let's look at another world. Let's look at the world of fgets. Here's how we read in a line of data with fgets:
fgets(buffer, bufsize, stdin);
Infinitely less headache, no wasted processor time converting an integer precision into a string that will only be reparsed by the library back into an integer, and all the relevant elements are sitting there on one line for us to see how they work together.
Granted, this may not read an entire line. It will only read an entire line if the line is shorter than bufsize - 1 characters. Here's how we can read an entire line:
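char *readline(FILE *file)
{
size_t size = 80; // start off small
size_t curr = 0;
char *buffer = malloc(size);
if(buffer == NULL) return NULL;
while(fgets(buffer + curr, size - curr, file))
{
if(strchr(buffer + curr, '\n')) return buffer; // success
curr += strlen(buffer + curr); // resume at the terminating nul
size *= 2;
char *tmp = realloc(buffer, size);
if(tmp == NULL) { free(buffer); return NULL; }
buffer = tmp;
}
if(curr > 0) return buffer; // last line had no trailing newline
free(buffer);
return NULL; // end of file or read error before any data
}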
The curr variable is an optimization to prevent us from rechecking data we've already read, and is unnecessary (although useful as we read more data). We could even use the return value of strchr to strip off the ending "\n" character if you preferred.
Notice also that size_t size = 80; as a starting place is completely arbitrary. We could use 81, or 79, or 100, or add it as a user-supplied argument to the function. We could even add an int (*inc)(int) argument, and change size *= 2; to size = inc(size);, allowing the user to control how fast the array grows. These can be useful for efficiency, when reallocations get costly and boatloads of lines of data need to be read and processed.
We could write the same with scanf, but think of how many times we'd have to rewrite the format string. We could limit it to a constant increment, instead of the doubling (easily) implemented above, and never have to adjust the format string; we could give in and just store the number, do the math with as above, and use snprintf to convert it to a format string every time we reallocate so that scanf can convert it back to the same number; we could limit our growth and starting position in such a way that we can manually adjust the format string (say, just increment the digits), but this could get hairy after a while and may require recursion (!) to work cleanly.
Furthermore, it's hard to mix reading with scanf with reading with other functions. Why? Say you want to read an integer from a line, then read a string from the next line. You try this:
int i;
char buf[BUFSIZE];
scanf("%i", &i);
fgets(buf, BUFSIZE, stdin);
If the input is "2" followed by a newline, that will read the 2, but then fgets will read an empty line because scanf didn't consume the newline! Okay, take two:
...
scanf("%i\n", &i);
...
You think this eats up the newline, and it does - but it also eats up leading whitespace on the next line, because scanf can't tell the difference between newlines and other forms of whitespace. (Also, turns out you're writing a Python parser, and leading whitespace in lines is important.) To make this work, you have to call getchar or something to read in the newline and throw it away:
...
scanf("%i", &i);
getchar();
...
Isn't that silly? What happens if you use scanf in a function, but don't call getchar because you don't know whether the next read is going to be scanf or something saner (or whether or not the next character is even going to be a newline)? Suddenly the best way to handle the situation seems to be to pick one or the other: do we use scanf exclusively and never have access to fgets-style full-control input, or do we use fgets exclusively and make it harder to perform complex parsing?
Actually, the answer is we don't. We use fgets (or non-scanf functions) exclusively, and when we need scanf-like functionality, we just call sscanf on the strings! We don't need to have scanf mucking up our filestreams unnecessarily! We can have all the precise control over our input we want and still get all the functionality of scanf formatting. And even if we couldn't, many scanf format options have near-direct corresponding functions in the standard library, like the infinitely more flexible strtol and strtod functions (and friends). Plus, i = strtoumax(str, NULL, 10) for C99 sized integer types is a lot cleaner looking than scanf("%" SCNuMAX, &i);, and a lot safer (we can use that strtoumax line unchanged for smaller types and let the implicit conversion handle the extra bits, but with scanf we have to make a temporary uintmax_t to read into).
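As a sketch of the strtol route, here is a small parser meant to run on a line that fgets has already read. The name parse_long and its return convention are invented for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Parses a base-10 integer from line, allowing leading whitespace and a
 * trailing newline or spaces. Returns 1 and stores the value in *out on
 * success, 0 otherwise. */
int parse_long(const char *line, long *out)
{
    char *end;
    errno = 0;
    long v = strtol(line, &end, 10);
    if (end == line || errno == ERANGE)
        return 0;                       /* no digits, or out of range */
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r')
        end++;
    if (*end != '\0')
        return 0;                       /* trailing junk such as "12abc" */
    *out = v;
    return 1;
}
```

Used as fgets(buf, BUFSIZE, stdin) followed by parse_long(buf, &value), this consumes the whole line including its newline, so the next fgets starts cleanly on the following line with its leading whitespace intact.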
The moral of this story: avoid scanf. If you need the formatting it provides, and don't want to (or can't) do it (more efficiently) yourself, use fgets / sscanf.
scanf doesn't allocate memory for you.
You need to allocate memory for the variable passed to scanf.
You could do like this:
char* receiveInput(){
char *s = (char*) malloc( 100 );
scanf("%s",s);
return s;
}
But warning:
the function that calls receiveInput will take the ownership of the returned memory: you'll have to free(str) after you print it in main. (Giving the ownership away in this way is usually not considered a good practice).
An easy fix is getting the allocated memory as a parameter.
if the input string is longer than 99 characters (in my case), your program will suffer a buffer overflow (which is what is already happening in your code).
An easy fix is to pass to scanf the length of your buffer:
scanf("%99s",s);
A fixed code could be like this:
// s must be of at least 100 chars!!!
char* receiveInput( char *s ){
scanf("%99s",s);
return s;
}
int main()
{
char str[100];
receiveInput( str );
int length = strlen(str);
printf("Your string is %s, length is %d\n", str, length);
return 0;
}
You first have to allocate memory for s in your receiveInput() function, for example:
s = (char *)calloc(50, sizeof(char));

Reading a file to char array then malloc size. (C)

Hey, so lets say I get a file as the first command line argument.
int main(int argc, char** argv) {
unsigned char* fileArray;
FILE* file1 = fopen(argv[1], "r");
}
Now how can I go about reading that file, char by char, into the char* fileArray?
Basically how can I convert a FILE* to a char* before I know how big I need to malloc the char*
I know a possible solution is to use a buffer, but my problem here is I'm dealing with files that could have over 900000 chars, and don't see it fit making a buffer that is that large.
If only "real" files (not stream, devices, ...) are used, you can use stat/fstat or something like
int retval=fseek(file1,0,SEEK_END); // succeeded if ==0 (file seekable, etc.)
long size=ftell(file1); // size==-1 would be error
rewind(file1);
to get the file's size beforehand. Then you can malloc and read.
But since file1 might change in the meantime you still have to ensure not to read beyond your malloced size.
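Put together, the size-then-read approach might look like the sketch below. The function name read_whole_file is hypothetical, and the code assumes a seekable stream opened in binary mode, since ftell on a text-mode stream is not guaranteed to return a byte count.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads all of fp (a seekable stream opened in binary mode) into a
 * malloc'd, null-terminated buffer. Stores the byte count in *len.
 * Returns NULL on any failure. */
char *read_whole_file(FILE *fp, size_t *len)
{
    if (fseek(fp, 0, SEEK_END) != 0)
        return NULL;
    long size = ftell(fp);
    if (size < 0)
        return NULL;
    rewind(fp);

    char *data = malloc((size_t)size + 1);
    if (data == NULL)
        return NULL;
    /* fread never stores more than size bytes, even if the file grew meanwhile */
    size_t got = fread(data, 1, (size_t)size, fp);
    data[got] = '\0';
    *len = got;
    return data;
}
```

Passing the measured size as the fread limit is what guards against reading past the allocation if the file changes between ftell and fread.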
There are a couple of approaches you can take:
1. Specify a maximum size that you can handle, then you just allocate once (whether as a global or on the heap).
2. Handle the file in chunks if you're worried about fitting it all into memory at once.
3. Handle an arbitrary size by using malloc with realloc (as you read bits in).
Number 1 is easy:
static char buff[900001]; // or malloc/free of 900001
size_t count = fread (buff, 1, sizeof buff, fIn);
if (count > 900000) // problem! the file is bigger than we planned for
Number 2 is probably the best way to do it unless you absolutely need the whole file in memory at once. For example, if your program counts the number of words, it can sequentially process the file a few K at a time.
Number 3, you can maintain a buffer, used and max variable. Initially set max to 50K and allocate buffer as that size.
Then try to read one 10K chunk into a fixed buffer tbuff. Add up the current used and the number of bytes read into tbuff and, if that's greater than max, do a realloc to increase buffer by another 50K (adjusting max at the same time).
Then append tbuff to buffer, adjust used, rinse and repeat. Note that all those values (10K, 50K and so on) are examples only. There are different values you can use depending on your needs.
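Number 3 could be sketched like this. The function name slurp and the macros CHUNK and GROW are assumptions for the example; the 10K and 50K values are the same illustrative numbers as in the description.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK (10 * 1024)   /* bytes read per fread call (example value) */
#define GROW  (50 * 1024)   /* bytes added per realloc (example value) */

/* Reads fp to the end into a malloc'd buffer that grows in GROW-sized
 * steps. Stores the number of bytes read in *used_out. The result is
 * not null-terminated. Returns NULL on allocation failure. */
char *slurp(FILE *fp, size_t *used_out)
{
    char tbuff[CHUNK];
    size_t used = 0, max = GROW, got;
    char *buffer = malloc(max);
    if (buffer == NULL)
        return NULL;

    while ((got = fread(tbuff, 1, sizeof tbuff, fp)) > 0) {
        if (used + got > max) {          /* CHUNK <= GROW, so one step suffices */
            char *tmp = realloc(buffer, max + GROW);
            if (tmp == NULL) {
                free(buffer);
                return NULL;
            }
            buffer = tmp;
            max += GROW;
        }
        memcpy(buffer + used, tbuff, got);
        used += got;
    }
    *used_out = used;
    return buffer;
}
```

This works on pipes and other non-seekable streams as well, since it never needs the size up front. A caller that cares about the difference between end-of-file and a read error can check ferror(fp) afterwards.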

Resources