I wrote this small function to read all the data from stdin.
I need to know whether this function is POSIX compatible (by this, I mean that it will work under Unix and Unix-like systems). At least it works on Windows...
char* getLine()
{
int i = 0, c;
char* ptrBuff = NULL;
while ((c = getchar()) != '\n' && c != EOF)
{
if ((ptrBuff = (char*)realloc(ptrBuff, sizeof (char)+i)) != NULL)
ptrBuff[i++] = c;
else
{
free(ptrBuff);
return NULL;
}
}
if (ptrBuff != NULL)
ptrBuff[i] = '\0';
return ptrBuff;
}
The function reads characters from stdin until it gets '\n' or EOF and returns a pointer to a newly allocated buffer holding them. I don't know whether this is the most efficient or the safest way to do it, or whether it works under Unix and Unix-like systems, so I need a little help here. How can I improve this function? Or is there a better way to read all the data from stdin without leaving garbage in the input buffer? I know that fgets() is an option, but it may leave unread input behind if the user types more than expected. Besides, I want to get all the characters that the user has typed.
EDIT:
New version of getLine():
char* readLine()
{
int i = 0, c;
size_t p4kB = 4096;
void *nPtr = NULL;
char *ptrBuff = (char*)malloc(p4kB);
while ((c = getchar()) != '\n' && c != EOF)
{
if (i == p4kB)
{
p4kB += 4096;
if ((nPtr = realloc(ptrBuff, p4kB)) != NULL)
ptrBuff = (char*)nPtr;
else
{
free(ptrBuff);
return NULL;
}
}
ptrBuff[i++] = c;
}
if (ptrBuff != NULL)
{
ptrBuff[i] = '\0';
ptrBuff = realloc(ptrBuff, strlen(ptrBuff) + 1);
}
return ptrBuff;
}
LAST EDIT:
This is the final version of the char* readLine() function. I can't see any more bugs or better ways to improve it. If somebody knows a better way, please tell me.
char* readLine()
{
int c;
size_t p4kB = 4096, i = 0;
void *newPtr = NULL;
char *ptrString = malloc(p4kB * sizeof (char));
while (ptrString != NULL && (c = getchar()) != '\n' && c != EOF)
{
if (i == p4kB * sizeof (char))
{
p4kB += 4096;
if ((newPtr = realloc(ptrString, p4kB * sizeof (char))) != NULL)
ptrString = (char*) newPtr;
else
{
free(ptrString);
return NULL;
}
}
ptrString[i++] = c;
}
if (ptrString != NULL)
{
ptrString[i] = '\0';
ptrString = realloc(ptrString, strlen(ptrString) + 1);
}
else return NULL;
return ptrString;
}
POSIX-compatible: yes!
You're calling only getchar(), malloc(), realloc() and free(), all of which are standard C functions and therefore also available under POSIX. As far as I can tell, you've done all the necessary return code checks too. Given that, the code will be good in any environment that supports malloc() and stdin.
Only thing I would change is the last call to strlen, which is not necessary since the length is already stored in i.
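One edge case may still be worth a look: in the final version the size check runs before each store, so a line of exactly 4096 characters fills the buffer and the terminating '\0' is then written one byte past its end. The following is only a sketch of one way to avoid that, not a definitive fix. The name read_line_from and the FILE * parameter are assumptions made here so the function can be exercised on any stream; readLine() itself would pass stdin.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant of readLine() that takes the stream as a parameter.
   Returns a heap string without the newline, or NULL on allocation failure. */
char *read_line_from(FILE *in)
{
    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;

    while ((c = fgetc(in)) != '\n' && c != EOF) {
        /* Keep one byte spare for the terminator before storing c. */
        if (len + 1 == cap) {
            char *tmp = realloc(buf, cap + 4096);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap += 4096;
        }
        buf[len++] = (char)c;
    }
    buf[len] = '\0';

    /* Shrink to fit using len directly (no strlen needed); if the shrink
       fails, the original block is still valid. */
    char *fit = realloc(buf, len + 1);
    return fit != NULL ? fit : buf;
}
```

Like the original, this returns an empty string rather than NULL when EOF arrives before any character, so a caller that needs to detect end of input would have to check feof() itself.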
According to c-for-dummies.com:
The latest and most trendy function for reading a string of text is getline(). It’s a new C library function, having appeared around 2010 or so.
You might not have heard of the getline() function, and a few C
programmers avoid it because it uses — brace yourself — pointers! Even
so, it’s a good line-input function, and something you should be
familiar with, even if you don’t plan on using it.
That page provides two simple examples after that description, but they don't explain step by step how the getline function actually works behind the scenes. So I searched the net for the code of this function and found a "replica" of getline called _getline, which in turn uses a "replica" of getchar called _getchar. Here it is:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
int _getchar(void)
{
int rd;
char buff[2];
rd = read(STDIN_FILENO, buff, 1);
if (rd == 0)
return (EOF);
if (rd == -1)
exit(99);
return (*buff);
}
ssize_t _getline(char **lineptr, size_t *n, FILE *stream)
{
char *temp;
const size_t n_alloc = 120;
size_t n_read = 0;
size_t n_realloc;
int c;
if (lineptr == NULL || n == NULL || stream == NULL)
return (-1);
if (*lineptr == NULL)
{
*lineptr = malloc(n_alloc);
if (*lineptr == NULL)
return (-1);
*n = n_alloc;
}
while ((c = _getchar()) != EOF)
{
if (n_read >= *n)
{
n_realloc = *n + n_alloc;
temp = realloc(*lineptr, n_realloc + 1);
if (temp == NULL)
return (-1);
*lineptr = temp;
*n = n_realloc;
}
n_read++;
(*lineptr)[n_read - 1] = (char) c;
if (c == '\n')
break;
}
if (c == EOF)
return (-1);
(*lineptr)[n_read] = '\0';
return ((ssize_t) n_read);
}
After reading the code above, I realized why a few C programmers are said to avoid it, and I don't want to be part of that group. Could someone here explain how these replicas work together, please?
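For comparison, here is a minimal sketch of how the real getline() (standardized in POSIX.1-2008) is typically called, which may make the roles of lineptr and n in the replica easier to follow. The helper name longest_line is hypothetical and exists only for illustration. The caller starts with a NULL pointer and a zero capacity, getline() allocates and grows the buffer as needed and updates both through the pointers, and the caller frees the buffer once at the end.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns the length of the longest line in the
   stream, not counting the trailing newline. */
size_t longest_line(FILE *in)
{
    char *line = NULL;   /* NULL plus capacity 0 asks getline() to allocate */
    size_t cap = 0;      /* getline() stores the buffer size here */
    size_t best = 0;
    ssize_t n;

    /* getline() returns the number of characters read, newline included,
       or -1 at end of file or on error. The same buffer is reused. */
    while ((n = getline(&line, &cap, in)) != -1) {
        size_t len = (size_t)n;
        if (len > 0 && line[len - 1] == '\n')
            len--;
        if (len > best)
            best = len;
    }
    free(line);          /* one free() for the whole loop */
    return best;
}
```

Unlike the replica, the library function reads from the stream it is given rather than always from standard input.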
I don't know why the while loop doesn't stop. It stops neither when c equals Delim nor when it reaches EOF.
wchar_t* Getline(const wchar_t* Filename, const wchar_t Delim){
FILE* f = _wfopen(Filename, L"r, ccs=UTF-8");
wchar_t* info = NULL;
wchar_t* temp = NULL;
int count = 1;
int i = 0;
wchar_t c;
c = fgetwc(f);
while (c != Delim || !feof(f))
{
count++;
temp = (wchar_t*)realloc(info, count * sizeof(wchar_t));
if (temp)
{
info = temp;
info[i] = c;
i++;
}
else
{
free(info);
wprintf(L"Failed to read\n");
}
c = fgetwc(f);
}
info[i] = '\0';
fclose(f);
return info;
}
After reading every character in the file, the loop does not seem to stop, even when c is the same as Delim. !feof(f) doesn't work either. I have also tried c != WEOF, but that fails too.
I thought the problem was in the file I was reading, but it is not: I tried another file and got the same problem.
Thanks for helping me!
You wish to loop whilst you have not got a Delim character and it is not the end of the file, so replace the || with &&
while (!feof(f) && c != Delim)
Edit: the order has also been changed in response to comments
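A sketch of the same loop that tests the return value of fgetwc() directly instead of calling feof() is shown here. The helper name read_until is hypothetical, and it takes an already opened FILE * rather than a file name, which is an assumption made to keep the example portable (the original uses the Windows-specific _wfopen). Storing the result in a wint_t lets WEOF be compared reliably, and the buffer always keeps room for the terminator.

```c
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: reads wide characters from f up to (not including)
   delim or end of file. Returns a heap string, or NULL if allocation fails. */
wchar_t *read_until(FILE *f, wchar_t delim)
{
    size_t len = 0, cap = 16;
    wchar_t *buf = malloc(cap * sizeof *buf);
    wint_t c;   /* wint_t can hold every wchar_t value plus WEOF */

    if (buf == NULL)
        return NULL;

    /* Stop when the read fails OR the delimiter is seen. */
    while ((c = fgetwc(f)) != WEOF && c != (wint_t)delim) {
        if (len + 1 == cap) {            /* keep one slot for L'\0' */
            wchar_t *tmp = realloc(buf, 2 * cap * sizeof *buf);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (wchar_t)c;
    }
    buf[len] = L'\0';
    return buf;
}
```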
I wrote this simple readline function. It can return each line's length, but it doesn't return a pointer to the allocated buffer. Another issue is that the last line is ignored (it isn't returned):
FILE *passFile = NULL;
char *current = NULL;
size_t len = 0;
passFile = fopen("pass.txt", "r");
while(readline(passFile, &current, &len) != -1) {
printf("%s\n", current); // SEGMENTATION FAULT
printf("%zu\n", len);
free(current);
current = NULL;
}
ssize_t
readline(FILE *file, char **bufPtr, size_t *len)
{
char c, *buf = NULL;
size_t n = 0;
buf = (char*)malloc(sizeof(char));
while((c = fgetc(file)) != '\n' && (c != EOF)) {
buf[n] = c;
++n;
buf = realloc(buf, n + 1);
}
buf[n] = '\0';
*bufPtr = buf;
*len = n;
if(c == EOF) // reach end of file
return -1;
return 0;
}
Your readline() function is not returning a pointer to allocated memory. In your call, current is never set, so the pointer is invalid and you get the error.
In C, functions are "call by value". Inside readline(), bufPtr is a copy of whatever was passed to readline(). Assigning to bufPtr merely overwrites the local copy and does not return a value that the calling code can see.
In pseudocode:
TYPE a;
define function foo(TYPE x)
{
x = new_value;
}
foo(a); // does not change a
This only changes the local copy of x and does not return a value. If you change the function to take a pointer, it still gets a copy, but now it's a copy of a pointer, and it can use that pointer value to find the original variable. In pseudocode:
TYPE a;
define function foo(TYPE *px)
{
*px = new_value;
}
foo(&a); // does change a
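The two pseudocode cases can be written as real C along these lines. The function names are hypothetical and the 16-byte size is arbitrary; this is only an illustration of the calling convention, not part of readline().

```c
#include <stdlib.h>

/* Receives a copy of the caller's pointer. The assignment is lost when the
   function returns, so the block is freed here to avoid leaking it. */
void alloc_by_value(char *p)
{
    p = malloc(16);
    free(p);
}

/* Receives the address of the caller's pointer and writes through it,
   so the caller sees the new value and owns the allocation. */
void alloc_by_pointer(char **pp)
{
    *pp = malloc(16);
}
```

After `char *a = NULL; alloc_by_value(a);` the variable a is still NULL, whereas `alloc_by_pointer(&a);` leaves a pointing at the new block (or NULL if malloc() failed).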
Now, to change your function:
ssize_t
readline(FILE *file, char **pbufPtr, size_t *len)
{
// ...deleted...
buf[n] = '\0';
*pbufPtr = buf;
// ...deleted...
}
And you call it like so:
while(readline(passFile, &current, &len) != -1)
P.S. It is not a good idea to call realloc() the way you do here. It's potentially a very slow function, and for an input string of 65 characters you will call it 65 times. It would be better to use an internal buffer for the initial file input, then use malloc() to allocate a string that is just the right size and copy the string into the buffer. If the string is too long to fit in the internal buffer at once, use malloc() to get a big-enough place to copy out the part of the string you have in the internal buffer, then continue using the internal buffer to copy more of the string, and then call realloc() as needed. Basically I'm suggesting you have an internal buffer of size N, and copy the string in chunks of N characters at a time, thus minimizing the number of calls to realloc() while still allowing arbitrary-length input strings.
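The internal-buffer idea could look roughly like the sketch below. It uses fgets() to fill a fixed-size chunk and appends each chunk to a heap string, so realloc() runs once per chunk instead of once per character. The names read_line_chunked and CHUNK_SIZE are assumptions for illustration, and the sketch does not handle lines containing embedded NUL bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 256  /* hypothetical size of the internal buffer */

/* Returns the next line without its newline, or NULL at end of file
   or on allocation failure. The caller frees the result. */
char *read_line_chunked(FILE *in)
{
    char chunk[CHUNK_SIZE];
    char *line = NULL;
    size_t len = 0;

    while (fgets(chunk, sizeof chunk, in) != NULL) {
        size_t got = strlen(chunk);
        char *tmp = realloc(line, len + got + 1);
        if (tmp == NULL) {
            free(line);
            return NULL;
        }
        line = tmp;
        memcpy(line + len, chunk, got + 1);  /* copies the terminator too */
        len += got;
        if (len > 0 && line[len - 1] == '\n') {
            line[len - 1] = '\0';            /* complete line: drop newline */
            break;
        }
    }
    return line;  /* still NULL if nothing was read before end of file */
}
```

A final line that ends at EOF without a newline is still returned, and only the call after it yields NULL.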
EDIT: Your last-line problem is that you return -1 when you hit end of file, even though there is a line to return.
Change your code so that you return -1 only if c == EOF and n == 0, so a final line that ends with EOF will be correctly returned.
You should also make readline() use the feof() function to check if file is at end-of-file, and if so, return -1 without calling malloc().
Basically, when you return -1, you don't want to call malloc(), and when you did call malloc() and copy data into it, you don't want to return -1! -1 should mean "you got nothing because we hit end of file". If you got something before we hit end of file, that's not -1, that is 0. Then the next call to readline() after that will return -1.
In your readline function you pass current by value, so changing bufPtr inside the function doesn't change the value of current outside it. If you want to change the value of current, pass its address, &current, and change the readline() parameter to char **bufPtr.
Passing current the way you did would be fine if you only wanted to change what it points to, but here you want to change where it points in the first place.
Replace your readline function with this:
char* readline(FILE *file, size_t *len)
{
    int c; // int, so that EOF can be told apart from any char
    char *buf = malloc(1), *tmp;
    size_t n = 0;
    if(buf == NULL)
        return NULL;
    while((c = fgetc(file)) != '\n' && c != EOF) {
        buf[n] = (char)c;
        ++n;
        tmp = realloc(buf, n + 1);
        if(tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
    }
    if(c == EOF && n == 0) { // reached end of file, nothing read
        free(buf);
        return NULL;
    }
    buf[n] = '\0';
    *len = n;
    return buf;
}
and then in main replace the line while(readline(passFile, current, &len) != -1) with while((current = readline(passFile, &len)) != NULL)
Now it works:
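ssize_t
readline(FILE *file, char **bufPtr, size_t *len)
{
    int c; /* int, not char, so EOF stays distinguishable from data */
    size_t n = 0, portion = CHUNK;
    char *buf = malloc(portion), *tmp;
    if(buf == NULL)
        return -1;
    while((c = fgetc(file)) != '\n' && c != EOF) {
        buf[n++] = (char)c;
        if(n == portion) { /* buffer full: grow by one more CHUNK */
            tmp = realloc(buf, portion + CHUNK);
            if(tmp == NULL) {
                free(buf);
                return -1;
            }
            buf = tmp;
            portion += CHUNK;
        }
    }
    if(c == EOF && n == 0) { /* reached end of file with nothing read */
        free(buf);
        return -1;
    }
    buf[n] = '\0';
    *bufPtr = buf;
    *len = n;
    return 0;
}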
I'm using this function to read, char by char, a text file or a stdin input
void readLine(FILE *stream, char **string) {
char c;
int counter = 0;
do {
c = fgetc(stream);
string[0] = (char *) realloc (string[0], (counter+1) * sizeof(char));
string[0][counter++] = c;
} while(c != ENTER && !feof(stream));
string[counter-1] = '\0';
}
But when I call it, my program crashed and I really don't know why, because I don't forget the 0-terminator and I'm convinced that I stored correctly the char sequence. I've verified the string length, but it appears alright.
This is an error:
do {
c = fgetc(stream);
// What happens here?!?
} while(c != ENTER && !feof(stream));
"What happens here" is that you add c to string before you've checked for EOF, whoops.
This is very ungood:
string[0] = (char *) realloc (string[0], (counter+1) * sizeof(char));
in a loop. realloc is a potentially expensive call and you do it for every byte of input! It is also a silly and confusing interface to ask for a pointer parameter that has (apparently) not been allocated anything -- passing in the pointer usually indicates that is already done. What if string were a static array? Instead, allocate in chunks and return a pointer:
char *readLine (FILE *stream) {
// A whole 4 kB!
int chunksz = 4096;
int counter = 0;
char *buffer = malloc(chunksz);
char *test;
int c;
if (!buffer) return NULL;
while ((c = fgetc(stream)) != ENTER && c != EOF) {
buffer[counter++] = (char)c;
if (counter == chunksz) {
chunksz *= 2;
test = realloc(buffer, chunksz);
// Abort on out-of-memory.
if (!test) {
free(buffer);
return NULL;
} else buffer = test;
}
}
// Now null terminate and shrink to fit.
buffer[counter] = '\0';
test = realloc(buffer, counter + 1);
return test ? test : buffer;
}
That is a standard "power of 2" allocation scheme (it doubles). If you really want to submit a pointer, pre-allocate it and also submit a "max length" parameter:
void readLine (FILE *stream, char *buffer, int max) {
    int counter = 0;
    int c;
    // Check the remaining space first so no character is consumed and lost.
    while (
        counter < max - 1
        && (c = fgetc(stream)) != ENTER
        && c != EOF
    ) buffer[counter++] = (char)c;
    // Now null terminate.
    buffer[counter] = '\0';
}
There are a few issues in this code:
fgetc() returns int.
Don't cast the return value of malloc() and friends, in C.
Avoid using sizeof (char), it's just a very clumsy way of writing 1, so multiplication by it is very redundant.
Normally, buffers are grown more than 1 char at a time, realloc() can be expensive.
string[0] would be more clearly written as *string, since it's not an array but just a pointer to a pointer.
Your logic around end of file means it will store the truncated version of EOF, not very nice.
Change this line
string[counter-1] = '\0';
to
string[0][counter-1] = '\0';
You want to terminate string stored at string[0].
I feel silly asking this question, since this should be easy, but I can't figure out what's wrong.
void loadIniIntoMemory() {
FILE *fp ;
fp = fopen (iniFile, "r");
int ch;
int final_line_num = 0;
int char_index;
char* current_line = (char*) malloc(sizeof(char) * MAX_INI_LINE_LENGTH);
while((ch = fgetc(fp)) != EOF) {
if(ch == 10) {
// new line
*(current_line + char_index) = '\0';
char_index = 0;
iniFileData[final_line_num] = current_line;
final_line_num++;
} else {
// regular char
*(current_line + char_index) = ch; // CAN'T DO THIS, CRASH
char_index++;
if(ch == 13) {
// carriage return
continue;
}
}
}
}
Been a little while since I did C, it crashes at this line : *(current_line + char_index) = ch;
Thanks for any help.
--EDIT--
Also, no one noticed that this code doesn't save the last line. Here is the full, correct, working code, which saves a file into an array of pointers.
void loadIniIntoMemory() {
FILE *fp ;
fp = fopen (iniFile, "r");
int ch;
final_line_num = 0;
int char_index = 0;
char* current_line = (char*) malloc(sizeof(char) * MAX_INI_LINE_LENGTH);
while((ch = fgetc(fp)) != EOF) {
if(ch == '\n') {
// new line
*(current_line + char_index) = '\0';
char_index = 0;
iniFileData[final_line_num] = current_line;
final_line_num++;
current_line = (char*) malloc(sizeof(char) * MAX_INI_LINE_LENGTH);
} else if(ch != '\r') {
// regular char
*(current_line + char_index) = ch;
char_index++;
}
}
*(current_line + char_index) = '\0'; // terminate the last line, which has no trailing newline
iniFileData[final_line_num] = current_line;
fclose(fp);
}
For starters, you don't initialize char_index, meaning it will likely have garbage in it. If you don't initialize it, your program will add some unknown number to the current_line pointer.
int char_index = 0; /* initialize to 0 */
Secondly, a bit more "natural" syntax would be:
current_line[char_index] = ...
Thirdly, you can test the characters without using their integer equivalents:
if (ch == '\n') {
/* this is the same as "ch == 10" */
Fourth, you should close the open file prior to leaving the routine:
fclose(fp);
Finally, I'm not sure what the ch == 13 ('\r') and continue is meant to handle, since the continue is effectively a no-op, but you probably don't want to copy it into the data:
if (ch != '\r') {
current_line[char_index] = ch;
char_index++;
/* or on one line: current_line[char_index++] = ch; */
}
As an aside, a powerful feature of C (and many other languages) is the switch statement:
/* substitutes your if...elseif...else */
switch (ch) {
case '\n':
current_line[char_index] = '\0';
char_index = 0;
iniFileData[final_line_num++] = current_line;
break; /* <-- very important, C allows switch cases to fall thru */
case '\r':
/* do nothing */
break;
default:
/* any character that is not a newline or carriage return */
current_line[char_index++] = ch;
break;
}
You didn't initialize char_index. I suppose you want to initialize it to 0. In C, uninitialized variable will contain garbage. Most likely your char_index equals some very large number.
Besides what others have already pointed out, you will also have to move the line buffer allocation call into the loop and immediately after the final_line_num++; statement. Otherwise, each new line you read will be overwriting the previous line.
Also, some operating systems, such as most POSIX-compliant ones and Linux in particular, give you the ability to map a file segment (possibly the entire file) into virtual memory. On Linux, you could consider using the mmap system call.
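As a rough illustration of the mmap approach, the sketch below maps a whole file read-only and scans it in place. The helper name count_lines_mmap is hypothetical, and it only counts newline bytes; splitting the mapping into an array of line pointers, as the INI loader does, would need extra work because a read-only mapping cannot be terminated in place.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: counts '\n' bytes in the file at path by mapping it
   read-only. Returns -1 on any error. */
long count_lines_mmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t size = (size_t)st.st_size;
    char *data = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (data == MAP_FAILED)
        return -1;

    long lines = 0;
    for (size_t i = 0; i < size; i++)
        if (data[i] == '\n')
            lines++;

    munmap(data, size);
    return lines;
}
```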