According to c-for-dummies.com:
The latest and most trendy function for reading a string of text is getline(). It’s a new C library function, having appeared around 2010 or so.
You might not have heard of the getline() function, and a few C
programmers avoid it because it uses — brace yourself — pointers! Even
so, it’s a good line-input function, and something you should be
familiar with, even if you don’t plan on using it.
That page gives two simple examples after that description, but they don't explain step by step how getline() actually works behind the scenes. So I searched the net for the code of this function and found a "replica" of getline called _getline, which in turn uses a "replica" of getchar called _getchar. Here it is:
#include <stdio.h> /* EOF, FILE */
#include <unistd.h>
#include <stdlib.h>
int _getchar(void)
{
int rd;
char buff[2];
rd = read(STDIN_FILENO, buff, 1);
if (rd == 0)
return (EOF);
if (rd == -1)
exit(99);
return (*buff);
}
ssize_t _getline(char **lineptr, size_t *n, FILE *stream)
{
char *temp;
const size_t n_alloc = 120;
size_t n_read = 0;
size_t n_realloc;
int c;
if (lineptr == NULL || n == NULL || stream == NULL)
return (-1);
if (*lineptr == NULL)
{
*lineptr = malloc(n_alloc);
if (*lineptr == NULL)
return (-1);
*n = n_alloc;
}
while ((c = _getchar()) != EOF)
{
if (n_read >= *n)
{
n_realloc = *n + n_alloc;
temp = realloc(*lineptr, n_realloc + 1);
if (temp == NULL)
return (-1);
*lineptr = temp;
*n = n_realloc;
}
n_read++;
(*lineptr)[n_read - 1] = (char) c;
if (c == '\n')
break;
}
if (c == EOF)
return (-1);
(*lineptr)[n_read] = '\0';
return ((ssize_t) n_read);
}
After reading the code above, I understood why a few C programmers are said to avoid it. I don't want to stay in that group, so could someone here explain how these two replicas work together, please?
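One way to see how the two pieces cooperate is to rewrite them as a small pair that reads from an explicit file descriptor. This is only a sketch, not the code from the question: the names fd_getchar and fd_getline are made up for illustration, and it assumes a POSIX system with read(). Unlike the replica above, it keeps one byte free for the terminating '\0' when it checks capacity, and it returns a final line that ends at end of input without a newline.

```c
#include <stdio.h>      /* EOF */
#include <stdlib.h>     /* realloc */
#include <sys/types.h>  /* ssize_t */
#include <unistd.h>     /* read */

/* Hypothetical helper: read one byte from fd; EOF at end of input or on error. */
static int fd_getchar(int fd)
{
    unsigned char ch;
    ssize_t rd = read(fd, &ch, 1);

    if (rd != 1)
        return EOF;
    return ch;                      /* 0..255, so never mistaken for EOF */
}

/* Hypothetical getline-like function.
 * *lineptr is the caller's buffer (or NULL) and *n its capacity,
 * following the getline() convention. Returns the number of bytes
 * stored (newline included), or -1 if nothing could be read. */
ssize_t fd_getline(char **lineptr, size_t *n, int fd)
{
    const size_t step = 120;
    size_t n_read = 0;
    int c;

    if (lineptr == NULL || n == NULL)
        return -1;
    if (*lineptr == NULL || *n == 0) {
        char *first = realloc(*lineptr, step);  /* acts like malloc for NULL */
        if (first == NULL)
            return -1;
        *lineptr = first;
        *n = step;
    }
    while ((c = fd_getchar(fd)) != EOF) {
        if (n_read + 1 >= *n) {                 /* keep one byte for '\0' */
            char *bigger = realloc(*lineptr, *n + step);
            if (bigger == NULL)
                return -1;
            *lineptr = bigger;
            *n += step;
        }
        (*lineptr)[n_read++] = (char)c;         /* store the byte just read */
        if (c == '\n')
            break;                              /* a full line is complete */
    }
    if (n_read == 0)
        return -1;                              /* end of input, nothing read */
    (*lineptr)[n_read] = '\0';                  /* make it a valid C string */
    return (ssize_t)n_read;
}
```

The caller passes the address of its own pointer and capacity (for example char *line = NULL; size_t cap = 0; fd_getline(&line, &cap, STDIN_FILENO);), reuses them across calls, and frees the buffer once at the end.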
Related
The code below works as follows: it reads every single char from stdin using a function called _getchar, stores them in an array, and finally returns the number of characters read if c != EOF.
I'd just like to know what the statement (*lineptr)[n_read] = '\0'; does in the code below:
#include <stdio.h> /* EOF, FILE */
#include <unistd.h>
#include <stdlib.h>
int _getchar(void)
{
int rd;
char buff[2];
rd = read(STDIN_FILENO, buff, 1);
if (rd == 0)
return (EOF);
if (rd == -1)
exit(99);
return (*buff);
}
ssize_t _getline(char **lineptr, size_t *n, FILE *stream)
{
char *temp;
const size_t n_alloc = 120;
size_t n_read = 0;
size_t n_realloc;
int c;
if (lineptr == NULL || n == NULL || stream == NULL)
return (-1);
if (*lineptr == NULL)
{
*lineptr = malloc(n_alloc);
if (*lineptr == NULL)
return (-1);
*n = n_alloc;
}
while ((c = _getchar()) != EOF)
{
if (n_read >= *n)
{
n_realloc = *n + n_alloc;
temp = realloc(*lineptr, n_realloc + 1);
if (temp == NULL)
return (-1);
*lineptr = temp;
*n = n_realloc;
}
n_read++;
(*lineptr)[n_read - 1] = (char) c;
if (c == '\n')
break;
}
if (c == EOF)
return (-1);
(*lineptr)[n_read] = '\0';
return ((ssize_t) n_read);
}
char **lineptr means lineptr contains the address of a char pointer.
A pointer is a variable that contains an address, so by writing *lineptr you get that address (the char pointer itself).
In your case, **lineptr <=> *(*lineptr) <=> (*lineptr)[0]
Edit: by the way, I was not answering the question. The instruction (*lineptr)[n_read] = '\0' terminates your string: '\0' is the null character that marks the end of a C string (it is not EOF, and it is not a newline).
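To see the effect of that assignment in isolation, here is a small sketch. The helper name store_line is hypothetical; it copies n_read bytes into the caller's buffer through a char ** and then terminates the string with the same expression as the code in the question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n_read bytes into *lineptr, then terminate.
 * The caller must provide at least n_read + 1 bytes in *lineptr. */
void store_line(char **lineptr, const char *src, size_t n_read)
{
    memcpy(*lineptr, src, n_read);   /* raw bytes, no terminator yet */
    (*lineptr)[n_read] = '\0';       /* now strlen() and "%s" know where to stop */
}
```

Without the second statement, functions such as strlen() or printf("%s", ...) would keep reading past the copied bytes until they happened to find a zero byte.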
They look the same but are different.
The first one:
int (*ptr)[something];
defines a pointer to an array of something ints.
The latter:
(*ptr1)[something] // without a type in front of it
means: dereference the pointer ptr1, then index the result with something (add something to it and dereference again).
It is equivalent to:
*((*ptr1) + something)
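A compilable sketch of that equivalence, with made-up function names and an array length of 4 chosen only for the example:

```c
#include <stddef.h>

/* Declaration form: ptr is a pointer to an array of 4 ints. */
int via_index(int (*ptr)[4], size_t k)
{
    return (*ptr)[k];          /* expression form: dereference, then index */
}

int via_arith(int (*ptr)[4], size_t k)
{
    return *((*ptr) + k);      /* same access written with pointer arithmetic */
}

/* The same expression applied to a char **, as with getline's lineptr. */
char nth_char(char **lineptr, size_t k)
{
    return (*lineptr)[k];
}
```

Both via_index(&a, 2) and via_arith(&a, 2) read the third element of an int a[4].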
I want to write a function that reads a chosen line from a given text file. It takes two parameters: the int fd returned by open, and an int line_number.
It must be written in C using Unix system calls (read and/or open).
It should also read any spaces, and it must not impose a real limit on line length (i.e. the line can be arbitrarily long).
The function I wrote is this:
char* read_line(int file, int numero_riga){
char myb[1];
if (numero_riga < 1) {
return NULL;
}
char* myb2 = malloc(sizeof(char)*100);
memset(myb2, 0, sizeof(char));
ssize_t n;
int i = 1;
while (i < numero_riga) {
if((n = read(file, myb, 1)) == -1){
perror("read fail");
exit(EXIT_FAILURE);
}
if (strncmp(myb, "\n", 1) == 0) {
i++;
}else if (n == 0){
return NULL;
}
}
numero_riga++;
int j = 0;
while (i < numero_riga) {
ssize_t n = read(file, myb, 1);
if (strncmp(myb, "\n", 1) == 0) {
i++;
}else if (n == 0){
return myb2;
}else{
myb2[j] = myb[0];
j++;
}
}
return myb2;
}
Until recently, I thought this would work, but it has some problems.
When I send it through a message queue, the string read by read_line is received as an empty string ("\0"). I know the message queues are not the problem, because passing a normal string works fine.
If possible, I would like a fix with an explanation of why it should be corrected in a certain way. If I do not understand my mistakes, I risk repeating them in the future.
EDIT 1. Based on the answers, I decided to add some questions.
How do I terminate myb2? Can someone give me an example based on my code?
How can I know in advance how many characters make up the line of text to read?
EDIT 2. I don't know how many chars the line has, so I don't know how many chars to allocate; that's why I use *100.
Partial Analysis
You've got a memory leak at:
char* myb2 = (char*) malloc((sizeof(char*))*100);
memset(myb2, 0, sizeof(char));
if (numero_riga < 1) {
return NULL;
}
Check numero_riga before you allocate the memory.
The following loop is also dubious at best:
int i = 1;
while (i < numero_riga) {
ssize_t n = read(file, myb, 1);
if (strncmp(myb, "\n", 1) == 0) {
i++;
}else if (n == 0){
return NULL;
}
}
You don't check quickly enough whether read() actually returned anything, and when you do check, you leak memory (again) and ignore anything that was read beforehand, and you don't detect errors (n < 0). When you do detect a newline, you simply add 1 to i. At no point do you save the character read in a buffer (such as myb2). All in all, that seems pretty thoroughly broken… unless you're trying to read the Nth line in the file from scratch, rather than the next line in the file, which is more usual.
What you need to be doing is:
scan N-1 lines, paying attention to EOF
while another byte is available
if it is newline, terminate the string and return it
otherwise, add it to the buffer, allocating space if there isn't room.
Implementation
I think I'd probably use a function get_ch() like this:
static inline int get_ch(int fd)
{
char c;
if (read(fd, &c, 1) == 1)
return (unsigned char)c;
return EOF;
}
Then in the main char *read_nth_line(int fd, int line_no) function you can do:
char *read_nth_line(int fd, int line_no)
{
if (line_no <= 0)
return NULL;
/* Skip preceding lines */
for (int i = 1; i < line_no; i++)
{
int c;
while ((c = get_ch(fd)) != '\n')
{
if (c == EOF)
return NULL;
}
}
/* Capture next line */
size_t max_len = 8;
size_t act_len = 0;
char *buffer = malloc(max_len);
if (buffer == NULL)
return NULL;
int c;
while ((c = get_ch(fd)) != EOF && c != '\n')
{
if (act_len + 2 >= max_len)
{
size_t new_len = max_len * 2;
char *new_buf = realloc(buffer, new_len);
if (new_buf == 0)
{
free(buffer);
return NULL;
}
buffer = new_buf;
max_len = new_len;
}
buffer[act_len++] = c;
}
if (c == '\n')
buffer[act_len++] = c;
buffer[act_len] = '\0';
return buffer;
}
Test code added:
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
extern char *read_nth_line(int fd, int line_no);
…code from main answer…
int main(void)
{
char *line;
while ((line = read_nth_line(0, 3)) != NULL)
{
printf("[[%s]]\n", line);
free(line);
}
return 0;
}
This reads every third line from standard input. It seems to work correctly. It would be a good idea to do more exhaustive checking of boundary conditions (short lines, etc) to make sure it doesn't abuse memory. (Testing lines of lengths 1 — newline only — up to 18 characters with valgrind shows it is OK. Random longer tests also seemed to be correct.)
I've written this small function to read all the data from stdin.
I need to know whether this function is POSIX compatible (by this, I mean that it will work under Unix and Unix-like systems); at least it works on Windows.
char* getLine()
{
int i = 0, c;
char* ptrBuff = NULL;
while ((c = getchar()) != '\n' && c != EOF)
{
if ((ptrBuff = (char*)realloc(ptrBuff, sizeof (char)+i)) != NULL)
ptrBuff[i++] = c;
else
{
free(ptrBuff);
return NULL;
}
}
if (ptrBuff != NULL)
ptrBuff[i] = '\0';
return ptrBuff;
}
The function reads all the data from stdin until it gets '\n' or EOF and returns a pointer to a newly allocated buffer holding all the chars. I don't know whether this is the best or safest way to do it, nor whether it works under Unix and Unix-like systems, so I need a little help here. How can I improve this function? Or is there a better way to get all the data from stdin without leaving garbage in the buffer? I know that fgets() is an option, but it may leave garbage if the user input is longer than expected. Besides, I want to get all the chars that the user has typed.
EDIT:
New version of getLine():
char* readLine()
{
int i = 0, c;
size_t p4kB = 4096;
void *nPtr = NULL;
char *ptrBuff = (char*)malloc(p4kB);
while ((c = getchar()) != '\n' && c != EOF)
{
if (i == p4kB)
{
p4kB += 4096;
if ((nPtr = realloc(ptrBuff, p4kB)) != NULL)
ptrBuff = (char*)nPtr;
else
{
free(ptrBuff);
return NULL;
}
}
ptrBuff[i++] = c;
}
if (ptrBuff != NULL)
{
ptrBuff[i] = '\0';
ptrBuff = realloc(ptrBuff, strlen(ptrBuff) + 1);
}
return ptrBuff;
}
LAST EDIT:
This is the final version of the char* readLine() function. I can't see any more bugs or better ways to improve it; if somebody knows a better way, please tell me.
char* readLine()
{
int c;
size_t p4kB = 4096, i = 0;
void *newPtr = NULL;
char *ptrString = malloc(p4kB * sizeof (char));
while (ptrString != NULL && (c = getchar()) != '\n' && c != EOF)
{
if (i == p4kB * sizeof (char))
{
p4kB += 4096;
if ((newPtr = realloc(ptrString, p4kB * sizeof (char))) != NULL)
ptrString = (char*) newPtr;
else
{
free(ptrString);
return NULL;
}
}
ptrString[i++] = c;
}
if (ptrString != NULL)
{
ptrString[i] = '\0';
ptrString = realloc(ptrString, strlen(ptrString) + 1);
}
else return NULL;
return ptrString;
}
POSIX-compatible: yes!
You're calling only getchar(), malloc(), realloc() and free(), all of which are
standard C functions and therefore also available under POSIX. As far as I can tell, you've done all the necessary return code checks too. Given that, the code will be good in any environment that supports malloc() and stdin.
Only thing I would change is the last call to strlen, which is not necessary since the length is already stored in i.
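A sketch of that suggestion, under a few assumptions of mine: the function takes a FILE * (so it can be exercised without a terminal) and is called read_line_from, neither of which appears in the question. It uses i for the final size instead of strlen(). It also grows the buffer one byte early so that there is always room for the terminator, which matters when a line is exactly as long as the current capacity.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant of readLine(): reads up to '\n' or EOF from fp.
 * Returns a malloc'd string without the newline, or NULL if allocation fails. */
char *read_line_from(FILE *fp)
{
    size_t cap = 4096, i = 0;
    char *buf = malloc(cap);
    char *fit;
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = fgetc(fp)) != '\n' && c != EOF) {
        if (i + 1 == cap) {                 /* keep one byte for '\0' */
            char *tmp = realloc(buf, cap + 4096);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap += 4096;
        }
        buf[i++] = (char)c;
    }
    buf[i] = '\0';
    /* Shrink to fit. The length is already known, so strlen() is not needed. */
    fit = realloc(buf, i + 1);
    return fit != NULL ? fit : buf;         /* keep the larger buffer if shrinking fails */
}
```

Calling read_line_from(stdin) gives the behaviour of the original readLine().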
I wrote this simple readline function. It can return each line's length, but it doesn't return a pointer to the allocated buffer. Another issue is that the last line is ignored (it isn't returned):
FILE *passFile = NULL;
char *current = NULL;
size_t len = 0;
passFile = fopen("pass.txt", "r");
while(readline(passFile, &current, &len) != -1) {
printf("%s\n", current); // SEGMENTATION FAULT
printf("%zu\n", len); // len is a size_t, so use %zu
free(current);
current = NULL;
}
ssize_t
readline(FILE *file, char **bufPtr, size_t *len)
{
char c, *buf = NULL;
size_t n = 0;
buf = (char*)malloc(sizeof(char));
while((c = fgetc(file)) != '\n' && (c != EOF)) {
buf[n] = c;
++n;
buf = realloc(buf, n + 1);
}
buf[n] = '\0';
*bufPtr = buf;
*len = n;
if(c == EOF) // reach end of file
return -1;
return 0;
}
Your readline() function is not returning a pointer to allocated memory. In your call, current is never set, so the pointer is invalid and you get the error.
In C, functions are "call by value". Inside readline(), bufPtr is a copy of whatever was passed to readline(). Assigning to bufPtr merely overwrites the local copy and does not return a value that the calling code can see.
In pseudocode:
TYPE a;
define function foo(TYPE x)
{
x = new_value;
}
foo(a); // does not change a
This only changes the local copy of x and does not return a value. You change it to use a pointer... the function still gets a copy, but now it's a copy of a pointer, and it can use that pointer value to find the original variable. In pseudocode:
TYPE a;
define function foo(TYPE *px)
{
*px = new_value;
}
foo(&a); // does change a
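The same contrast in real C, as a small sketch. The function names are made up for illustration, and a static string stands in for the allocated buffer so that nothing needs to be freed.

```c
#include <stddef.h>

/* Receives a copy of the caller's pointer; the assignment is lost on return. */
void set_by_value(char *buf)
{
    static char text[] = "changed";

    buf = text;         /* only the local copy changes */
    (void)buf;          /* silence "set but not used" warnings */
}

/* Receives the address of the caller's pointer and writes through it. */
void set_by_pointer(char **pbuf)
{
    static char text[] = "changed";

    if (pbuf != NULL)
        *pbuf = text;   /* the caller's variable now points at text */
}
```

After char *a = NULL; set_by_value(a); the variable a is still NULL, whereas set_by_pointer(&a); leaves a pointing at "changed".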
Now, to change your function:
ssize_t
readline(FILE *file, char **pbufPtr, size_t *len)
{
// ...deleted...
buf[n] = '\0';
*pbufPtr = buf;
// ...deleted...
}
And you call it like so:
while(readline(passFile, &current, &len) != -1)
P.S. It is not a good idea to call realloc() the way you do here. It's potentially a very slow function, and for an input string of 65 characters you will call it 65 times. It would be better to use an internal buffer for the initial file input, then use malloc() to allocate a string that is just the right size and copy the string into the buffer. If the string is too long to fit in the internal buffer at once, use malloc() to get a big-enough place to copy out the part of the string you have in the internal buffer, then continue using the internal buffer to copy more of the string, and then call realloc() as needed. Basically I'm suggesting you have an internal buffer of size N, and copy the string in chunks of N characters at a time, thus minimizing the number of calls to realloc() while still allowing arbitrary-length input strings.
EDIT: Your last-line problem is that you return -1 when you hit end of file, even though there is a line to return.
Change your code so that you return -1 only if c == EOF and n == 0, so a final line that ends with EOF will be correctly returned.
You should also make readline() use the feof() function to check if file is at end-of-file, and if so, return -1 without calling malloc().
Basically, when you return -1, you don't want to call malloc(), and when you did call malloc() and copy data into it, you don't want to return -1! -1 should mean "you got nothing because we hit end of file". If you got something before we hit end of file, that's not -1, that is 0. Then the next call to readline() after that will return -1.
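Put together, the return convention described here could look like the following sketch. It is not the asker's code: the name readline_sketch and the 64-byte growth step are my own choices, and instead of calling feof() first it simply allocates nothing until a character has actually been read.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>  /* ssize_t (POSIX) */

/* Returns 0 and a malloc'd line (newline stripped) in *bufPtr,
 * or -1 if nothing could be read. */
ssize_t readline_sketch(FILE *file, char **bufPtr, size_t *len)
{
    size_t cap = 0, n = 0;
    char *buf = NULL;
    int c;                                  /* int, so EOF is distinguishable */

    while ((c = fgetc(file)) != EOF && c != '\n') {
        if (n + 1 >= cap) {                 /* room for this byte and '\0' */
            char *tmp = realloc(buf, cap + 64);
            if (tmp == NULL) {
                free(buf);
                return -1;
            }
            buf = tmp;
            cap += 64;
        }
        buf[n++] = (char)c;
    }
    if (c == EOF && n == 0) {               /* end of file, no pending data */
        free(buf);                          /* buf is still NULL here */
        return -1;
    }
    if (buf == NULL) {                      /* empty line: only "\n" was read */
        buf = malloc(1);
        if (buf == NULL)
            return -1;
    }
    buf[n] = '\0';
    *bufPtr = buf;
    *len = n;
    return 0;
}
```

With input "a\n\nb" (no trailing newline), successive calls yield "a", "", "b" with return value 0, and only the fourth call returns -1.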
In your readline function you pass current by value. So if you change bufPtr inside your function, it doesn't change the value of current outside. If you want to change the value of current, pass it by reference: &current, and change the readline() parameter to char **bufPtr.
You could pass current the way you did if you wanted to change something it points to, but you want to change where it points in first place.
Replace your readline function with this:
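char* readline(FILE *file, size_t *len)
{
int c; // int, so that EOF can be distinguished from a valid char
char *buf = NULL, *tmp;
size_t n = 0;
buf = malloc(1);
if(buf == NULL)
return NULL;
while((c = fgetc(file)) != '\n' && (c != EOF)) {
buf[n] = c;
++n;
tmp = realloc(buf, n + 1);
if(tmp == NULL) {
free(buf);
return NULL;
}
buf = tmp;
}
buf[n] = '\0';
*len = n;
if(c == EOF && n == 0) { // end of file and nothing read
free(buf);
return NULL;
}
return buf;
}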
and then in main replace the line while(readline(passFile, current, &len) != -1) with while((current = readline(passFile, &len)) != NULL)
Now it works:
ssize_t
readline(FILE *file, char **bufPtr, size_t *len)
{
if(feof(file)) // reach end of file
return -1;
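int c; // int, not char, so EOF can be told apart from a valid byte
char *buf = NULL, *tmp;
size_t n = 0, portion = CHUNK;
buf = malloc(CHUNK);
if(buf == NULL)
return -1;
while((c = fgetc(file)) != '\n' && (c != EOF)) {
buf[n] = c;
++n;
if(n == portion) { // buffer full: grow by one more CHUNK
tmp = realloc(buf, portion + CHUNK);
if(tmp == NULL) {
free(buf);
return -1;
}
buf = tmp;
portion += CHUNK;
}
}
if(c == EOF && n == 0) { // end of file with nothing read
free(buf);
return -1;
}
buf[n] = '\0';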
*bufPtr = buf;
*len = n;
return 0;
}
How do I constantly get user input (strings) until enter is pressed in C just like string class in C++?
I don't know the input size, so I can't declare a variable of fixed size, and I can't even allocate the right amount of memory up front with malloc() or calloc().
Is there any way to implement this as a separate function?
As H2CO3 said, you should allocate a buffer with malloc(), then resize it with realloc() whenever it fills up. Like this:
size_t bufsize = 256;
size_t buf_used = 0;
int c;
char *buf = malloc(bufsize);
if (buf == NULL) { /* error handling here */ }
while ((c = fgetc(stdin)) != EOF) {
if (c == '\n') break;
if (buf_used == bufsize-1) {
bufsize *= 2;
buf = realloc(buf, bufsize);
if (buf == NULL) { /* error handling here */ }
}
buf[buf_used++] = c;
}
buf[buf_used] = '\0';
Use exponential storage expansion:
char *read_a_line(void)
{
size_t alloc_size = LINE_MAX;
size_t len = 0;
char *buf = malloc(LINE_MAX); // should be good for most, euh, *lines*...
if (!buf)
abort();
int c;
while ((c = fgetc(stdin)) != '\n' && c != EOF) {
if (len >= alloc_size) {
alloc_size <<= 1;
char *tmp = realloc(buf, alloc_size);
if (!tmp)
abort(); // or whatever
buf = tmp;
}
buf[len++] = c;
}
if (len >= alloc_size) {
alloc_size++;
char *tmp = realloc(buf, alloc_size);
if (!tmp)
abort(); // or whatever
buf = tmp;
}
buf[len] = 0;
return buf;
}
In C, you have little choice: if you want to input a string of unbounded length, you have to use allocations in a loop. Whether you use realloc() or a linked list of buffers, it comes down to reading (usually through fgets()), reading some more, and so on until the buffer you've just read contains a \n.
Then, depending on the method, you either already have a contiguous buffer (the realloc method) or just need to concatenate them all (the linked list method). Then you can return.
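A minimal sketch of the realloc method built on fgets(). The function name read_line_fgets and the starting size of 128 are assumptions for the example, and it assumes the input contains no embedded '\0' bytes, since it uses strlen() to find out how much fgets() stored.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from fp with fgets().
 * Returns a malloc'd string (newline included if present), or NULL at
 * end of file with nothing read, or on allocation failure. */
char *read_line_fgets(FILE *fp)
{
    size_t cap = 128, len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);           /* how much this call appended */
        if (len > 0 && buf[len - 1] == '\n')
            return buf;                     /* the whole line is in the buffer */
        if (len + 1 == cap) {               /* buffer is full: grow, read more */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (len == 0) {                         /* end of file, nothing read */
        free(buf);
        return NULL;
    }
    return buf;                             /* last line without a '\n' */
}
```

Because each fgets() call appends directly after the previous chunk, the result is already contiguous and no final concatenation step is needed.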
If you're lucky, your platform comes with the extension function getline() that does the realloc method for you. If not, you'll have to write it yourself.
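Where getline() is available (it is part of POSIX.1-2008), usage might look like this sketch. The helper longest_line is made up for the example, and the feature-test macro on the first line is what glibc-style systems expect when compiling in a strict standard mode; it has to come before any #include.

```c
#define _POSIX_C_SOURCE 200809L   /* request getline() from the C library */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: length of the longest line in fp (newline included),
 * letting getline() allocate and grow the buffer. */
size_t longest_line(FILE *fp)
{
    char *line = NULL;      /* getline() allocates when this is NULL */
    size_t cap = 0;         /* capacity, maintained by getline() */
    size_t longest = 0;
    ssize_t got;

    while ((got = getline(&line, &cap, fp)) != -1) {
        if ((size_t)got > longest)
            longest = (size_t)got;
    }
    free(line);             /* one free for all iterations */
    return longest;
}
```

The same line and cap variables are reused on every call, so the buffer is only reallocated when a longer line turns up.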