How to properly set buffer and bufsize with getpwuid_r()? - c

Background Info
I am trying to get the string of a user's username, with the only info provided about that user being their uid number. I have the uid as a result of a preceding call to fstat (and the uid is stored in a struct stat).
I need to get the username in a thread-safe manner, and so I am trying to use getpwuid_r(). According to the getpwuid (3) man page:
int getpwuid_r(uid_t uid, struct passwd *pwd, char *buffer,
size_t bufsize, struct passwd **result);
The getpwuid_r() function shall update the passwd structure pointed
to by pwd and store a pointer to that structure at the location
pointed to by result. The structure shall contain an entry from
the user database with a matching uid. Storage referenced by the
structure is allocated from the memory provided with the buffer
parameter, which is bufsize bytes in size. A call to
sysconf(_SC_GETPW_R_SIZE_MAX) returns either −1 without changing
errno or an initial value suggested for the size of this buffer. A
null pointer shall be returned at the location pointed to by result
on error or if the requested entry is not found.
If successful, the getpwuid_r() function shall return zero;
otherwise, an error number shall be returned to indicate the error.
Problem Statement
Upon reading the man page example below, I am confused as to why they need to iterate, while increasing the size of the buffer, until the buffer can hold its information.
I am under the presumption that the buffer holds the struct passwd pwd - considering this, why can't we just set buffer = (void *) malloc(sizeof(struct passwd)) and bufsize = sizeof(struct passwd)?
long int initlen = sysconf(_SC_GETPW_R_SIZE_MAX);
size_t len;
if (initlen == -1)
    /* Default initial length. */
    len = 1024;
else
    len = (size_t) initlen;
struct passwd result;
struct passwd *resultp;
char *buffer = malloc(len);
if (buffer == NULL)
    ...handle error...
int e;
while ((e = getpwuid_r(42, &result, buffer, len, &resultp)) == ERANGE)
{
    size_t newlen = 2 * len;
    if (newlen < len)
        ...handle error...
    len = newlen;
    char *newbuffer = realloc(buffer, len);
    if (newbuffer == NULL)
        ...handle error...
    buffer = newbuffer;
}
if (e != 0)
    ...handle error...
free (buffer);
Is there something I'm not understanding about how this function sets the data within pwd? Perhaps I don't fully understand how the struct passwd we are setting is related to the buffer space.

The passwd struct is defined by the standard to contain at least these members:
char *pw_name // User's login name.
uid_t pw_uid // Numerical user ID.
gid_t pw_gid // Numerical group ID.
char *pw_dir // Initial working directory.
char *pw_shell // Program to use as shell.
Note the three char * members; they point to storage that lies elsewhere, outside of the struct.
Many implementations will have two more char * members: pw_passwd and pw_gecos.
The difference between getpwuid and getpwuid_r is that the former may use a static buffer to store the name, passwd, dir, gecos, and shell strings [1] - as well as the passwd struct itself - while the latter, since it's reentrant, requires the user to supply one buffer to hold struct passwd and another buffer to hold the character strings.
In practice, the two functions share a lot of common code.
"I am confused as to why they need to iterate, while increasing the size of the buffer, until the buffer can hold its information."
sysconf(_SC_GETPW_R_SIZE_MAX) only suggests an initial size for the string buffer, and it may return -1, in which case you just have to guess. Either way, the strings in an entry may turn out to be longer than the buffer, so you keep increasing its size until getpwuid_r() stops returning ERANGE.
[1] In V7, when all the info was in /etc/passwd, this static buffer was just a copy of the appropriate line of /etc/passwd with a NUL inserted at the end of each of the five string fields.
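To make the relationship between pwd and buffer concrete, here is a minimal sketch of a wrapper around getpwuid_r(). The helper name uid_to_name and its return convention are made up for illustration; the point is that struct passwd itself is fixed-size, while the strings its char * members point to live in the separate buffer, so the name has to be copied out before that buffer is freed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the login name for `uid` into out[0..outlen).
   Returns 0 on success, ENOENT if no entry was found, ERANGE if `out`
   is too small, or another errno value on failure. */
int uid_to_name(uid_t uid, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);

    long hint = sysconf(_SC_GETPW_R_SIZE_MAX);
    size_t len = (hint <= 0) ? 1024 : (size_t)hint;  /* only a starting guess */
    char *buf = malloc(len);          /* storage for the strings, not for pwd */
    if (buf == NULL)
        return ENOMEM;

    struct passwd pwd;                /* fixed size: ids plus char * members */
    struct passwd *found = NULL;
    int e;
    while ((e = getpwuid_r(uid, &pwd, buf, len, &found)) == ERANGE) {
        if (len > (size_t)-1 / 2) { e = ENOMEM; break; }
        char *bigger = realloc(buf, len * 2);
        if (bigger == NULL) { e = ENOMEM; break; }
        buf = bigger;
        len *= 2;
    }

    if (e == 0 && found == NULL)
        e = ENOENT;                   /* call succeeded, but no matching entry */
    if (e == 0) {
        /* pwd.pw_name points into buf, so copy it out before freeing buf. */
        size_t n = strlen(pwd.pw_name);
        if (n < outlen)
            memcpy(out, pwd.pw_name, n + 1);
        else
            e = ERANGE;
    }
    free(buf);
    return e;
}
```

A caller would pass the st_uid field from the struct stat together with a char array of its own, and check the return value before using the name.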

Related

Can you help explain how this buffer logic works

I'm trying to do some work with inotify and better understand C in general. I'm very novice. I was looking over the inotify man page and I saw an example of using inotify. I have some questions around how exactly they use buffers. The code is here:
http://man7.org/linux/man-pages/man7/inotify.7.html
The block I'm most interested is:
char buf[4096]
    __attribute__ ((aligned(__alignof__(struct inotify_event))));
const struct inotify_event *event;
int i;
ssize_t len;
char *ptr;
/* Loop while events can be read from inotify file descriptor. */
for (;;) {
    /* Read some events. */
    len = read(fd, buf, sizeof buf);
    if (len == -1 && errno != EAGAIN) {
        perror("read");
        exit(EXIT_FAILURE);
    }
    /* If the nonblocking read() found no events to read, then
       it returns -1 with errno set to EAGAIN. In that case,
       we exit the loop. */
    if (len <= 0)
        break;
    /* Loop over all events in the buffer */
    for (ptr = buf; ptr < buf + len;
            ptr += sizeof(struct inotify_event) + event->len) {
        event = (const struct inotify_event *) ptr;
What I'm trying to understand is how exactly they are processing the bytes in this buffer. This is what I know:
We define a char buf of 4096, which means we have a buffer 4 KB in size. When we call read(fd, buf, sizeof buf), len will be anywhere from 0 to 4096 (partial reads can occur).
We do some async checking, that's obvious.
Now we get to the for loop, and here is where I'm a little confused. We set ptr equal to buf and then compare ptr's size to buf + len.
At this point does ptr equal the value '4096'? And if so, are we asking: is ptr:4096 < buf:4096 + len:[0-4096]? I'm using a colon here to signify what I think the variable's value is, and [] to mean a range.
We then, as the iterator expression, increase ptr by the size of an inotify event.
I'm used to higher level OOP languages, in which I'd declare a buffer of 'inotify_event' objects. However, I'm assuming that since we are just getting back a byte array from 'read', we need to pull off the bytes at the 'inotify_event' boundary and cast those bytes into an event object. Does this sound correct?
Also I'm not exactly sure how comparison works with a buf[4096] value. We don't have a concept of checking an array's current size (allocated indexes), so I'm assuming that when it is used in a comparison, we are comparing the size of its allocated memory space, '4096' in this case?
Thanks for the help, this is my first time really working with processing bits off a buffer. Trying to wrap my head around all this. Any further reading would be helpful! I've been finding a good amount of reading on C as a language, a good amount of reading on linux systems programming, but I can't seem to find topics such as 'working with buffers' or the grey area between the two.
When you do the assignment ptr = buf in C, you are assigning the address of the first element of buf to ptr. Thus, the comparison is checking whether ptr has gone beyond the end of the buffer.
The loop advances by the number of bytes needed to skip over one struct inotify_event (declared in <sys/inotify.h>) plus event->len, the length of the name that follows it.
ptr = buf
You are assigning the address of the first element of buf (i.e. &buf[0]) to the pointer ptr, so the loop starts walking buf from its first element.
ptr < buf + len;
This checks that ptr has not yet moved past the end of the data in buf. It is done with pointer arithmetic: the loop compares the address held in ptr with the address buf + len, where len is the number of bytes returned by read().
ptr += sizeof(struct inotify_event) + event->len
Lastly, the pointer is moved forward by the size of struct inotify_event plus event->len, the length of the name field that follows the struct, which varies from event to event.
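The same walking pattern can be tried without inotify at all. The sketch below uses a made-up record layout (struct rec_hdr, put_record and count_records are all hypothetical names) that mimics the shape of an inotify event: a fixed header followed by a variable number of name bytes. It copies each header out with memcpy() instead of casting, so it does not depend on the buffer being aligned the way the man page example arranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size header; `len` name bytes follow it in the buffer,
   much like the name field that follows struct inotify_event. */
struct rec_hdr {
    uint32_t id;
    uint32_t len;
};

/* Write one record at offset `off` and return the offset just past it.
   The caller must make sure the buffer is large enough. */
size_t put_record(char *buf, size_t off, uint32_t id, const char *name)
{
    assert(buf != NULL && name != NULL);
    struct rec_hdr hdr = { id, (uint32_t)strlen(name) + 1 };
    memcpy(buf + off, &hdr, sizeof hdr);
    memcpy(buf + off + sizeof hdr, name, hdr.len);
    return off + sizeof hdr + hdr.len;
}

/* Walk the first `len` bytes of buf and count the records found there. */
int count_records(const char *buf, size_t len)
{
    int n = 0;
    const char *ptr = buf;               /* address of the first byte */
    while (ptr < buf + len) {            /* compares addresses, not sizes */
        struct rec_hdr hdr;
        memcpy(&hdr, ptr, sizeof hdr);   /* read the header at this position */
        ptr += sizeof hdr + hdr.len;     /* skip header plus its name bytes */
        n++;
    }
    return n;
}
```

Note that buf + len is an address one past the last valid byte, not the number 4096, and that each step skips a different distance depending on the record's own len field.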

Why is read() syscall blocking when I pass in a invalid buffer pointer?

Here is my code snippet, read(STDIN, NULL, 10), executed on Linux-2.6.32.431. After browsing the read() syscall's source code, I assumed it would return immediately:
SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
{
    struct file *file;
    ssize_t ret = -EBADF;
    int fput_needed;
    file = fget_light(fd, &fput_needed);
    if (file) {
        loff_t pos = file_pos_read(file);
        ret = vfs_read(file, buf, count, &pos);
        file_pos_write(file, pos);
        fput_light(file, fput_needed);
    }
    return ret;
}
and
ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
{
    ssize_t ret;
    if (!(file->f_mode & FMODE_READ))
        return -EBADF;
    if (!file->f_op || (!file->f_op->read && !file->f_op->aio_read))
        return -EINVAL;
    if (unlikely(!access_ok(VERIFY_WRITE, buf, count))) //I suppose it should return here
        return -EFAULT;
    ...
}
However, it blocked. After I typed in some characters and hit return, the program consumed one character and returned, while the remaining characters were input to the terminal.
My questions are:
Why did the read() syscall block?
Why did the remaining characters get input to the terminal?
I believe access_ok does not do exactly what its name implies.
From the comments in arch/x86/include/asm/uaccess.h:
/**
* access_ok: - Checks if a user space pointer is valid
* @type: Type of access: %VERIFY_READ or %VERIFY_WRITE. Note that
* %VERIFY_WRITE is a superset of %VERIFY_READ - if it is safe
* to write to a block, it is always safe to read from it.
* @addr: User space pointer to start of block to check
* @size: Size of block to check
*
* Context: User context only. This function may sleep.
*
* Checks if a pointer to a block of memory in user space is valid.
*
* Returns true (nonzero) if the memory block may be valid, false (zero)
* if it is definitely invalid.
*
* Note that, depending on architecture, this function probably just
* checks that the pointer is in the user space range - after calling
* this function, memory access functions may still return -EFAULT.
*/
The comments appear to be accurate; on x86, if you trace the definition of access_ok, you will find it just checks (essentially) whether addr + size > user_addr_max(). In particular, it returns "true" for a NULL pointer.
So you have to trace vfs_read a little further, into the call to file->f_op->read(), which is presumably invoking the read function for the TTY driver, which is presumably where it is blocking.
(Note that POSIX guarantees nothing when you pass a NULL pointer to read, so I would advise not doing that.)
[Update]
For your second question, it's the same reason this sequence reads one character and then passes the rest to the terminal:
$ head -c 1 > /dev/null
lalala
$ alala
alala: command not found
All I did was input "lalala" to the head command. Your program is presumably consuming one character of TTY input, terminating (crashing), and then the rest of the input to the TTY is being consumed by the shell after your program exits.
If you check the read manual page you will see that:
EFAULT buf is outside your accessible address space.
A NULL pointer is still within the accessible address space of all processes. Writing to, or dereferencing, a NULL pointer leads to undefined behavior, but it's still a valid address.
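So the read call blocks because there's no input to be read. When input does arrive, the kernel's attempt to copy it to the bad address fails; on Linux that typically shows up as read() returning -1 with errno set to EFAULT rather than as a crash inside the call.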
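One way to see that the NULL pointer is only caught once there is data to copy is to make the data available up front. This is a small sketch, assuming Linux behaviour; the helper name errno_from_null_read is made up. It puts one byte into a pipe and then reads it into a NULL buffer, so the call cannot block and the only thing left to fail is the copy to user space.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: make one byte available on a pipe, then read() it
   into a NULL buffer. Returns the errno left by read(), 0 if read()
   unexpectedly succeeded, or -1 if the pipe could not be set up. */
int errno_from_null_read(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    int result = -1;
    void *volatile bad = NULL;     /* volatile keeps the compiler from
                                      diagnosing the deliberate NULL */
    if (write(fds[1], "x", 1) == 1) {
        errno = 0;
        ssize_t n = read(fds[0], bad, 1);  /* data is ready, so no blocking */
        result = (n == -1) ? errno : 0;
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

On a Linux system I would expect this to return EFAULT. As noted above, POSIX does not promise anything for a NULL buffer, so treat this as an observation about one kernel, not as portable behaviour.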

C adding one char to buffer

I would like to ask how to add one char to a buffer. For example:
char buffer[50];
char one_symbol;
How do I add one_symbol to buffer? I don't know how much of the buffer is in use at the time, so I can't just write, for example, buffer[5] = one_symbol;
Thanks.
You need to do something to keep track of the length of the data in the buffer.
You have a couple of choices about how to do that. Strings store data in the buffer (a NUL byte) to signal where the data ends. Another possibility is to store the length externally:
typedef struct {
    char data[50];
    size_t len;
} buffer;
This latter is particularly preferable when/if you want to allow for data that itself might include NUL bytes. If you don't want your buffer size fixed at 50, you can go a step further:
typedef struct {
    size_t allocated;
    size_t in_use;
    char data[];
} dyn_buffer; /* type name is illustrative; the typedef needs one */
Note that this uses a flexible array member, which was added in C99, so some older compilers don't support it.
Keep track of the buffer's current size. You can do it by adding a new variable for that.
Something like:
char buffer[50];
size_t current_size = 0; /* Buffer is of size zero from the start */
/* ... */
/* Add one character to the buffer */
buffer[current_size++] = 'a';
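Putting the two ideas together, here is a minimal sketch that keeps the length next to the data and refuses to write past the end. The names char_buffer, BUF_CAPACITY and buffer_push are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_CAPACITY 50

struct char_buffer {
    char   data[BUF_CAPACITY];
    size_t len;               /* number of bytes currently in use */
};

/* Append one character; returns 0 on success, -1 if the buffer is full. */
int buffer_push(struct char_buffer *b, char c)
{
    assert(b != NULL);
    if (b->len >= BUF_CAPACITY)
        return -1;            /* no room left, leave the buffer unchanged */
    b->data[b->len++] = c;
    return 0;
}
```

Start with a zero-initialized struct (for example struct char_buffer b = {0};) and call buffer_push(&b, one_symbol). If the contents must also work as a C string, reserve one byte of the capacity for the terminating NUL.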

C: how to read in a variable amount of info from files and store it in array

I am not used to programming in C, so I am wondering how to have an array, read a variable number of lines from a file, and store those lines in the array.
//how do I declare an array whose size varies
do {
    char buffer[1000];
    fscanf(file, "%[^\n]\n", buffer);
    //how do i add buffer to array
} while(!feof(file));
int nlines = 0;
char **lines = NULL; /* Array of resulting lines */
int curline = 0;
char buffer[BUFSIZ]; /* Just allocate this once, not each time through the loop */
do {
    if (fgets(buffer, sizeof buffer, file)) { /* fgets() is the easy way to read a line */
        if (curline >= nlines) { /* Have we filled up the result array? */
            nlines += 1000; /* Increase size by 1,000 */
            lines = realloc(lines, nlines * sizeof(*lines)); /* And grow the array */
        }
        lines[curline] = strdup(buffer); /* Make a copy of the input line and add it to the array */
        curline++;
    }
} while(!feof(file));
Arrays are always fixed-size in C. You cannot change their size. What you can do is make an estimate of how much space you'll need beforehand and allocate that space dynamically (with malloc()). If you happen to run out of space, you reallocate. See the documentation for realloc() for that. Basically, you do:
buffer = realloc(buffer, size);
The new size can be larger or smaller than what you had before (meaning you can "grow" or "shrink" the array.) So if at first you want, say, space for 5000 characters, you do:
char* buffer = malloc(5000);
If later you run out of space and want an additional 2000 characters (so the new size will be 7000), you would do:
buffer = realloc(buffer, 7000);
The already existing contents of buffer are preserved. Note that realloc() might not be able to really grow the memory block, so it might allocate an entirely new block first, then copy the contents of the old memory to the new block, and then free the old memory. That means that if you stored a copy of the buffer pointer elsewhere, it will point to the old memory block which doesn't exist anymore. For example:
char* ptr = buffer;
buffer = realloc(buffer, 7000);
At that point, ptr is only valid if ptr == buffer, which is not guaranteed to be the case.
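One more thing to watch: realloc() returns NULL on failure and leaves the old block untouched, so assigning its result straight back to the only pointer you hold would leak that block. A minimal sketch of the usual pattern with a temporary pointer follows; the growbuf type and growbuf_append function are names invented for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct growbuf {
    char  *data;   /* heap block, or NULL while empty */
    size_t used;   /* bytes currently stored */
    size_t cap;    /* bytes allocated */
};

/* Append n bytes, growing the block when needed.
   Returns 0 on success, -1 on allocation failure (buffer left intact). */
int growbuf_append(struct growbuf *g, const void *src, size_t n)
{
    assert(g != NULL);
    if (n == 0)
        return 0;
    if (n > g->cap - g->used) {
        size_t newcap = g->cap ? g->cap : 16;
        while (newcap - g->used < n) {
            if (newcap > (size_t)-1 / 2)
                return -1;                 /* doubling would overflow */
            newcap *= 2;
        }
        char *tmp = realloc(g->data, newcap);
        if (tmp == NULL)
            return -1;                     /* g->data is still valid here */
        g->data = tmp;                     /* the old pointer may now be stale */
        g->cap = newcap;
    }
    memcpy(g->data + g->used, src, n);
    g->used += n;
    return 0;
}
```

Start from struct growbuf g = {0}; and free(g.data) when done. Storing offsets into the buffer instead of raw pointers avoids the stale-pointer problem described above.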
It appears that you are trying to read until you read a newline.
The easiest way to do this is via getline.
char *buffer = NULL;
size_t buffer_len = 0;
ssize_t ret = getline(&buffer, &buffer_len, file);
...this will read one line of text from the file file (unless ret is -1, in which case there's an error or you're at the end of the file). getline() allocates and grows buffer as needed; free() it when you are done.
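Combining getline() with a growing array of pointers gives a compact way to collect every line. This is a sketch only; read_lines is a made-up helper name, and it assumes a POSIX.1-2008 system where getline() and strdup() are available.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: read the lines of `file` into a malloc()ed array of
   malloc()ed strings with the trailing newline removed. Stores the number
   of lines in *count. If an allocation fails, reading stops and the lines
   collected so far are returned. Returns NULL if nothing was read. */
char **read_lines(FILE *file, size_t *count)
{
    assert(file != NULL && count != NULL);
    char **lines = NULL;
    size_t used = 0, cap = 0;
    char *line = NULL;          /* getline() allocates and grows this */
    size_t linecap = 0;
    ssize_t n;

    while ((n = getline(&line, &linecap, file)) != -1) {
        if (n > 0 && line[n - 1] == '\n')
            line[n - 1] = '\0';             /* drop the trailing newline */
        if (used == cap) {                  /* pointer array is full: grow it */
            size_t newcap = cap ? cap * 2 : 8;
            char **tmp = realloc(lines, newcap * sizeof *tmp);
            if (tmp == NULL)
                break;
            lines = tmp;
            cap = newcap;
        }
        if ((lines[used] = strdup(line)) == NULL)
            break;
        used++;
    }
    free(line);
    *count = used;
    return lines;
}
```

The caller owns the result and should free() each lines[i] and then lines itself.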
An array where the string data is in the array entry is usually a non-optimal choice. If the complete set of data will fit comfortably in memory and there's a reasonable upper bound on the number of entries, then a pointer-array is one choice.
But first, avoid scanf %s and %[] formats without explicit lengths. Using your example buffer size of 1000, the maximum string length that you can read is 999, so:
/* Some needed data */
int n;
struct ptrarray_t
{
    char **strings;
    int nalloc; /* number of string pointers allocated */
    int nused; /* number of string pointers used */
} pa_hdr; /* presume this was initialized previously */
...
n = fscanf(file, "%999[^\n]", buffer);
if (n != 1 || getc(file) != '\n')
{
    /* there's a problem */
}
/* Now add a string to the array */
if (pa_hdr.nused < pa_hdr.nalloc)
{
    size_t len = strlen(buffer);
    char *cp = malloc(len + 1);
    strcpy(cp, buffer);
    pa_hdr.strings[pa_hdr.nused++] = cp;
}
A reference to any string hereafter is just pa_hdr.strings[i], and a decent design will use function calls or macros to manage the header, which in turn will be in a header file and not inline. When you're done with the array, you'll need a delete function that will free all of those malloc()ed pointers.
If there are a large number of small strings, malloc() can be costly, both in time and space overhead. You might manage pools of strings in larger blocks that will live nicely with the memory allocation and paging of the host OS. Using a set of functions to effectively make an object out of this string-array will help your development. You can pick a simple strategy, as above, and optimize the implementation later.

Howto use readlink with dynamic memory allocation

Problem:
On a linux machine I want to read the target string of a link. From documentation I have found the following code sample (without error processing):
struct stat sb;
ssize_t r;
char * linkname;
lstat("<some link>", &sb);
linkname = malloc(sb.st_size + 1);
r = readlink("/proc/self/exe", linkname, sb.st_size + 1);
The problem is that sb.st_size returns 0 for links on my system.
So how does one allocate memory dynamically for readlink on such systems?
Many thanks!
One possible solution:
For future reference. Using the points made by jilles:
struct stat sb;
ssize_t r = INT_MAX;
int linkSize = 0;
const int growthRate = 255;
char * linkTarget = NULL;
// get length of the pathname the link points to
if (lstat("/proc/self/exe", &sb) == -1) { // could not lstat: insufficient permissions on directory?
    perror("lstat");
    return;
}
// read the link target into a string
linkSize = sb.st_size + 1 - growthRate;
while (r >= linkSize) { // i.e. symlink increased in size since lstat() or non-POSIX compliant filesystem
    // allocate sufficient memory to hold the link
    linkSize += growthRate;
    free(linkTarget);
    linkTarget = malloc(linkSize);
    if (linkTarget == NULL) { // insufficient memory
        fprintf(stderr, "setProcessName(): insufficient memory\n");
        return;
    }
    // read the link target into variable linkTarget
    r = readlink("/proc/self/exe", linkTarget, linkSize);
    if (r < 0) { // readlink failed: link was deleted?
        perror("readlink");
        free(linkTarget);
        return;
    }
}
linkTarget[r] = '\0'; // readlink does not null-terminate the string
POSIX says the st_size field for a symlink shall be set to the length of the pathname in the link (without '\0'). However, the /proc filesystem on Linux is not POSIX-compliant. (It has more violations than just this one, such as when reading certain files one byte at a time.)
You can allocate a buffer of a certain size, try readlink() and retry with a larger buffer if the buffer was not large enough (readlink() returned as many bytes as fit in the buffer), until the buffer is large enough.
Alternatively you can use PATH_MAX and break portability to systems where it is not a compile-time constant or where the pathname may be longer than that (POSIX permits either).
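The retry approach can be sketched as a small function that never consults st_size. The name read_link_alloc and the starting size of 64 bytes are arbitrary choices made for this example. A return value equal to the buffer size is treated as possible truncation and triggers another attempt with a buffer twice as large.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the target of symlink `path` as a malloc()ed,
   null-terminated string, or NULL on error. Does not rely on st_size. */
char *read_link_alloc(const char *path)
{
    assert(path != NULL);
    size_t size = 64;                 /* arbitrary starting guess */
    char *buf = NULL;

    for (;;) {
        char *tmp = realloc(buf, size);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;

        ssize_t r = readlink(path, buf, size);
        if (r == -1) {
            free(buf);
            return NULL;
        }
        if ((size_t)r < size) {       /* fits, with room for the terminator */
            buf[r] = '\0';            /* readlink() does not add one */
            return buf;
        }
        /* r == size: the target may have been truncated, so try again */
        if (size > (size_t)-1 / 2) {
            free(buf);
            return NULL;
        }
        size *= 2;
    }
}
```

For example, char *exe = read_link_alloc("/proc/self/exe"); followed by free(exe) once the string is no longer needed.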
The other answers don't mention it, but there is the realpath function, that does exactly what you want, which is specified by POSIX.1-2001.
char *realpath(const char *path, char *resolved_path);
from the manpage:
realpath() expands all symbolic links and resolves references to
/./, /../ and extra '/' characters in the null-terminated string named
by path to produce a canonicalized absolute pathname.
realpath also handles the dynamic memory allocation for you, if you want. Again, excerpt from the manpage:
If resolved_path is specified as NULL, then realpath() uses
malloc(3) to allocate a buffer of up to PATH_MAX bytes to hold the
resolved pathname, and returns a pointer to this buffer. The caller
should deallocate this buffer using free(3).
As a simple, complete example:
#include <limits.h>
#include <stdlib.h>
#include <stdio.h>
int
resolve_link (const char *filename)
{
    char *res = realpath(filename, NULL);
    if (res == NULL)
    {
        perror("realpath failed");
        return -1;
    }
    printf("%s -> %s\n", filename, res);
    free(res);
    return 0;
}
int
main (void)
{
    resolve_link("/proc/self/exe");
    return 0;
}
st_size does not give the correct answer on /proc.
Instead you can malloc PATH_MAX, or pathconf(_PC_PATH_MAX) bytes. That should be enough for most cases. If you want to be able to handle paths longer than that, you can call readlink in a loop and reallocate your buffer if the readlink return value indicates that the buffer is too short. Note though that many other POSIX functions simply assume PATH_MAX is enough.
I'm a bit puzzled as to why st_size is zero. Per POSIX:
For symbolic links, the st_mode member shall contain meaningful information when used with the file type macros. The file mode bits in st_mode are unspecified. The structure members st_ino, st_dev, st_uid, st_gid, st_atim, st_ctim, and st_mtim shall have meaningful values and the value of the st_nlink member shall be set to the number of (hard) links to the symbolic link. The value of the st_size member shall be set to the length of the pathname contained in the symbolic link not including any terminating null byte.
Source: http://pubs.opengroup.org/onlinepubs/9699919799/functions/lstat.html
If st_size does not work, I think your only option is to dynamically allocate a buffer and keep resizing it larger as long as the return value of readlink is equal to the buffer size.
The manpage for readlink(2) says it will silently truncate if the buffer is too small. If you truly want to be unbounded (and don't mind paying some cost for extra work) you can start with a given allocation size and keep increasing it and re-trying the readlink call. You can stop growing the buffer when the next call to readlink returns the same string it did for the last iteration.
What exactly are you trying to achieve with the lstat?
You should be able to get the target with just the following
char buffer[1024];
ssize_t r = readlink ("/proc/self/exe", buffer, sizeof buffer - 1);
if (r != -1) {
    buffer[r] = 0; /* readlink() does not null-terminate */
    printf ("%s\n", buffer);
}
If you're trying to get the length of the file name, I don't think st_size is the right field for that... But that's possibly a different question.
