Weird behavior of getpwnam - c

#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
printf("%s %s\n", getpwnam("steve")->pw_name, getpwnam("root")->pw_name);
printf("%d %d\n", getpwnam("steve")->pw_uid, getpwnam("root")->pw_uid);
return EXIT_SUCCESS;
}
$ gcc main.c && ./a.out
steve steve
1000 0
In line 8, we try to print the user names of steve and root, but it prints steve twice. In line 9, we try to print the UIDs of steve and root, and it successfully prints them.
I wanna ascertain the root cause of that bizarre behavior in line 8.
I know the pointer returned by getpwnam points to a statically allocated memory, and the memory pointed by fields like pw_name/pw_passwd/pw_gecos/pw_dir/pw_shell are also static, which means these values can be overwritten by subsequent calls. But still confused about this strange result.
This is exercise 8-1 of The Linux Programming Interface. Add this so that someone like me could find this through the search engine in the future:). And the question in the book is wrong, go here to see the revised version.

The code calls getpwnam() in succession returning a pointer to the same address and passing the same pointer to printf() twice. The order the compiler decides to make the calls will determine whether it shows “steve” or “root”.
Allocate two buffer spaces and use one for each in the call to printf() by calling getpwnam_r() instead.

The getpwnam function can return a pointer to static data, so each time it's called it returns the same pointer value. And because you're calling this function multiple times as an argument to printf, you'll only see the result of whichever one of those function calls happens last.
The key point here is that the evaluation order of the arguments to a function are unsequenced, which means there's no guarantee whether getpwnam("steve") happens first or getpwnam("root") happens first.

The result from getpwnam() may be overwritten by another call to getpwnam() or getpwuid() or getpwent(). Your code is demonstrating that.
See the POSIX specifications of:
getpwnam()
getpwuid()
getpwent()
You have no control over the order of evaluation of the calls. If you saved the pointers returned, you'd probably get different results printed.
POSIX also says:
The application shall not modify the structure to which the return value points, nor any storage areas pointed to by pointers within the structure. The returned pointer, and pointers within the structure, might be invalidated or the structure or the storage areas might be overwritten by a subsequent call to getpwent(), getpwnam(), or getpwuid(). The returned pointer, and pointers within the structure, might also be invalidated if the calling thread is terminated.
You should treat the return values as if they were const-qualified, in other words.
Note that the code in the library functions need not overwrite previous data. For example, on macOS Big Sur 11.6.8, the following code, compiled with -DUSER1=\"daemon\" as one of the compiler options, yields the result:
daemon root
1 0
U1A = 0x7fecfa405fd0, U2A = 0x7fecfa405c60
U1B = 0x7fecfa405fd0, U2B = 0x7fecfa405c60
Modified code:
/* SO 7345-2740 */
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#ifndef USER1
#define USER1 "steve"
#endif
#ifndef USER2
#define USER2 "root"
#endif
int main(void)
{
const struct passwd *user1a = getpwnam(USER1);
const struct passwd *user2a = getpwnam(USER2);
const char *user1_name = user1a->pw_name;
const char *user2_name = user2a->pw_name;
printf("%s %s\n", user1_name, user2_name);
const struct passwd *user1b = getpwnam(USER1);
const struct passwd *user2b = getpwnam(USER2);
int user1_uid = user1b->pw_uid;
int user2_uid = user2b->pw_uid;
printf("%d %d\n", user1_uid, user2_uid);
printf("U1A = %p, U2A = %p\n", (void *)user1a, (void *)user2a);
printf("U1B = %p, U2B = %p\n", (void *)user1b, (void *)user2b);
return EXIT_SUCCESS;
}
It is moderately likely that the library functions read the whole file into memory and then pass pointers to relevant sections of that memory. Certainly, in this example, the pointers to the entry for user daemon and user root are stable.
YMWV — Your Mileage Will Vary!

Related

Returning an array from function in C

I've written a function that returns an array whilst I know that I should return a dynamically allocated pointer instead, but still I wanted to know what happens when I am returning an array declared locally inside a function (without declaring it as static), and I got surprised when I noticed that the memory of the internal array in my function wasn't deallocated, and I got my array back to main.
The main:
int main()
{
int* arr_p;
arr_p = demo(10);
return 0;
}
And the function:
int* demo(int i)
{
int arr[10] = { 0 };
for (int i = 0; i < 10; i++)
{
arr[i] = i;
}
return arr;
}
When I dereference arr_p I can see the 0-9 integers set in the demo function.
Two questions:
How come when I examined arr_p I saw that its address is the same as arr which is in the demo function?
How come demo_p is pointing to data which is not deallocated (the 0-9 numbers) already in demo? I expected that arr inside demo will be deallocated as we got out of demo scope.
One of the things you have to be careful of when programming is to pay good attention to what the rules say, and not just to what seems to work. The rules say you're not supposed to return a pointer to a locally-allocated array, and that's a real, true rule.
If you don't get an error when you write a program that returns a pointer to a locally-allocated array, that doesn't mean it was okay. (Although, it means you really ought to get a newer compiler, because any decent, modern compiler will warn about this.)
If you write a program that returns a pointer to a locally-allocated array and it seems to work, that doesn't mean it was okay, either. Be really careful about this: In general, in programming, but especially in C, seeming to work is not proof that your program is okay. What you really want is for your program to work for the right reasons.
Suppose you rent an apartment. Suppose, when your lease is up, and you move out, your landlord does not collect your key from you, but does not change the lock, either. Suppose, a few days later, you realize you forgot something in the back of one closet. Suppose, without asking, you sneak back to try to collect it. What happens next?
As it happens, your key still works in the lock. Is this a total surprise, or mildly unexpected, or guaranteed to work?
As it happens, your forgotten item still is in the closet. It has not yet been cleared out. Is this a total surprise, or mildly unexpected, or guaranteed to happen?
In the end, neither your old landlord, nor the police, accost you for this act of trespass. Once more, is this a total surprise, or mildly unexpected, or just about completely expected?
What you need to know is that, in C, reusing memory you're no longer allowed to use is just about exactly analogous to sneaking back in to an apartment you're no longer renting. It might work, or it might not. Your stuff might still be there, or it might not. You might get in trouble, or you might not. There's no way to predict what will happen, and there's no (valid) conclusion you can draw from whatever does or doesn't happen.
Returning to your program: local variables like arr are usually stored on the call stack, meaning they're still there even after the function returns, and probably won't be overwritten until the next function gets called and uses that zone on the stack for its own purposes (and maybe not even then). So if you return a pointer to locally-allocated memory, and dereference that pointer right away (before calling any other function), it's at least somewhat likely to "work". This is, again, analogous to the apartment situation: if no one else has moved in yet, it's likely that your forgotten item will still be there. But it's obviously not something you can ever depend on.
arr is a local variable in demo that will get destroyed when you return from the function. Since you return a pointer to that variable, the pointer is said to be dangling. Dereferencing the pointer makes your program have undefined behavior.
One way to fix it is to malloc (memory allocate) the memory you need.
Example:
#include <stdio.h>
#include <stdlib.h>
int* demo(int n) {
int* arr = malloc(sizeof(*arr) * n); // allocate
for (int i = 0; i < n; i++) {
arr[i] = i;
}
return arr;
}
int main() {
int* arr_p;
arr_p = demo(10);
printf("%d\n", arr_p[9]);
free(arr_p) // free the allocated memory
}
Output:
9
How come demo_p is pointing to data which is not deallocated (the 0-9 numbers) already in demo? I expected that arr inside demo will be deallocated as we got out of demo scope.
The life of the arr object has ended and reading the memory addresses previously occupied by arr makes your program have undefined behavior. You may be able to see the old data or the program may crash - or do something completely different. Anything can happen.
… I noticed that the memory of the internal array in my function wasn't deallocated…
Deallocation of memory is not something you can notice or observe, except by looking at the data that records memory reservations (in this case, the stack pointer). When memory is reserved or released, that is just a bookkeeping process about what memory is available or not available. Releasing memory does not necessarily erase memory or immediately reuse it for another purpose. Looking at the memory does not necessarily tell you whether it is in use or not.
When int arr[10] = { 0 }; appears inside a function, it defines an array that is allocated automatically when the function starts executing (or at certain times within the function execution if the definition is in some nested scope). This is commonly done by adjusting the stack pointer. In common systems, programs have a region of memory called the stack, and a stack pointer contains an address that marks the end of the portion of the stack that is currently reserved for use. When a function starts executing, the stack pointer is changed to reserve more memory for that function’s data. When execution of the function ends, the stack pointer is changed to release that memory.
If you keep a pointer to that memory (how you can do that is another matter, discussed below), you will not “notice” or “observe” any change to that memory immediately after the function returns. That is why you see the value of arr_p is the address that arr had, and it is why you see the old data in that memory.
If you call some other function, the stack pointer will be adjusted for the new function, that function will generally use the memory for its own purposes, and then the contents of that memory will have changed. The data you had in arr will be gone. A common example of this that beginners happen across is:
int main(void)
{
int *p = demo(10);
// p points to where arr started, and arr’s data is still there.
printf("arr[3] = %d.\n", p[3]);
// To execute this call, the program loads data from p[3]. Since it has
// not changed, 3 is loaded. This is passed to printf.
// Then printf prints “arr[3] = 3.\n”. In doing this, it uses memory
// on the stack. This changes the data in the memory that p points to.
printf("arr[3] = %d.\n", p[3]);
// When we try the same call again, the program loads data from p[3],
// but it has been changed, so something different is printed. Two
// different things are printed by the same printf statement even
// though there is no visible code changing p[3].
}
Going back to how you can have a copy of a pointer to memory, compilers follow rules that are specified abstractly in the C standard. The C standard defines an abstract lifetime of the array arr in demo and says that lifetime ends when the function returns. It further says the value of a pointer becomes indeterminate when the lifetime of the object it points to ends.
If your compiler is simplistically generating code, as it does when you compile using GCC with -O0 to turn off optimization, it typically keeps the address in p and you will see the behaviors described above. But, if you turn optimization on and compile more complicated programs, the compiler seeks to optimize the code it generates. Instead of mechanically generating assembly code, it tries to find the “best” code that performs the defined behavior of your program. If you use a pointer with indeterminate value or try to access an object whose lifetime has ended, there is no defined behavior of your program, so optimization by the compiler can produce results that are unexpected by new programmers.
As you know dear, the existence of a variable declared in the local function is within that local scope only. Once the required task is done the function terminates and the local variable is destroyed afterwards. As you are trying to return a pointer from demo() function ,but the thing is the array to which the pointer points to will get destroyed once we come out of demo(). So indeed you are trying to return a dangling pointer which is pointing to de-allocated memory. But our rule suggests us to avoid dangling pointer at any cost.
So you can avoid it by re-initializing it after freeing memory using free(). Either you can also allocate some contiguous block of memory using malloc() or you can declare your array in demo() as static array. This will store the allocated memory constant also when the local function exits successfully.
Thank You Dear..
#include<stdio.h>
#define N 10
int demo();
int main()
{
int* arr_p;
arr_p = demo();
printf("%d\n", *(arr_p+3));
}
int* demo()
{
static int arr[N];
for(i=0;i<N;i++)
{
arr[i] = i;
}
return arr;
}
OUTPUT : 3
Or you can also write as......
#include <stdio.h>
#include <stdlib.h>
#define N 10
int* demo() {
int* arr = (int*)malloc(sizeof(arr) * N);
for(int i = 0; i < N; i++)
{
arr[i]=i;
}
return arr;
}
int main()
{
int* arr_p;
arr_p = demo();
printf("%d\n", *(arr_p+3));
free(arr_p);
return 0;
}
OUTPUT : 3
Had the similar situation when i have been trying to return char array from the function. But i always needed an array of a fixed size.
Solved this by declaring a struct with a fixed size char array in it and returning that struct from the function:
#include <time.h>
typedef struct TimeStamp
{
char Char[9];
} TimeStamp;
TimeStamp GetTimeStamp()
{
time_t CurrentCalendarTime;
time(&CurrentCalendarTime);
struct tm* LocalTime = localtime(&CurrentCalendarTime);
TimeStamp Time = { 0 };
strftime(Time.Char, 9, "%H:%M:%S", LocalTime);
return Time;
}

How to I output the name of an input?

For example, if I have char character = a, how would I be able to print out it's name, which is "character", along with it's output within a single printf?
I know I can do printf("character = %c", character);, but how can I still print out the input's name if I enter char something = b and make it print out "something = b" instead of "character = b"?
I agree that this sounds like an "XY Problem". Nevertheless, it IS possible with C macros:
https://www.geeksforgeeks.org/how-to-print-a-variable-name-in-c/
#include <stdio.h>
#define getName(var) #var
int main()
{
int myVar;
printf("%s", getName(myVar));
return 0;
}
The printf() output of this example would be "myVar"
You can read more about the "Stringizing Operator" here:
https://www.includehelp.com/c/stringize-operator-in-c.aspx
I think paulsm4's coding example from https://www.geeksforgeeks.org/how-to-print-a-variable-name-in-c/ demonstrates a method of doing this but I thought I might delve into why it is not possible to do this without pre-processor macros. The reason for this is once you compile your C program into assembly language, all variable names are completely done away with. Thus, variables are not stored in RAM or on the disk as names with values associated to them but rather addresses with values associated with them. Because of this, it is impossible once a program is compiled to find out what any of the variables were originally called, and this is one of the reasons why reverse engineering a program from assembly language is so difficult.
Consider this code:
#include<stdio.h>
int main(void) {
int exampleVar = 16;
printf("The memory address of a is: %p\n", (void*) &exampleVar);
// Note the "&" symbol makes this return the
// address of exampleVar rather than its actual value
return 0;
}
If you run this on your computer you will notice it returns a very large hexadecimal number which is its address in RAM. If you were to ask the computer to fetch the value of "exampleVar" it would not use its name but rather the location we just printed and this process would simply return 16, the value of our variable. You'll also notice if you run it multiple times this number changes every time. This is because when you run any computer program the variables are initialized and stored in RAM. Data in RAM tends to constantly be changing and because of this the variables are almost always in a different place every time, though the location will not (typically) change while running the actual program, only between sessions.
If you were to look at the compiled version of this program you would never find "exampleVar" being referred to by its name but rather its address, because for all intents and purposes its address is its name to the computer.
We can store variable name in a string using sprintf() in C.
# include <stdio.h>
# define getName(var, str) sprintf(str, "%s", #var)
int main()
{
int myVar;
char str[20];
getName(myVar, str);
printf("%s", str);
return 0;
}
Output:
myVar
I think the simplest form of what you are asking for (using a macro) is as follows:
#include <stdio.h>
#define print_variable(var) printf("%s = %d\n",#var,var)
int main()
{
int myVar = 42;
print_variable(myVar);
exit(0)l
}
Output:
myVar = 42
This utilized the stringizing operator ('#'), which only works in a macro, as it is a preprocessor operator. As pointed out by others here, this MUST be done at the preprocessor level, because once the program is compiled, all name information is lost.

Memcpy with function pointers leads to a segfault

I know I can just copy the function by reference, but I want to understand what's going on in the following code that produces a segfault.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int return0()
{
return 0;
}
int main()
{
int (*r0c)(void) = malloc(100);
memcpy(r0c, return0, 100);
printf("Address of r0c is: %x\n", r0c);
printf("copied is: %d\n", (*r0c)());
return 0;
}
Here's my mental model of what I thought should work.
The process owns the memory allocated to r0c. We are copying the data from the data segment corresponding to return0, and the copy is successful.
I thought that dereferencing a function pointer is the same as calling the data segment that the function pointer points to. If that's the case, then the instruction pointer should move to the data segment corresponding to r0c, which will contain the instructions for function return0. The binary code corresponding to return0 doesn't contain any jumps or function calls that would depend on the address of return0, so it should just return 0 and restore ip... 100 bytes is certainly enough for the function pointer, and 0xc3 is well within the bounds of r0c (it is at byte 11).
So why the segmentation fault? Is this a misunderstanding of the semantics of C's function pointers or is there some security feature that prevents self-modifying code that I'm unaware of?
The memory pages used by malloc to allocate memory are not marked as executable. You can't copy code to the heap and expect it to run.
If you want to do something like that you have to go deeper into the operating system, and allocate pages yourself. Then you need to mark those as executable. You would most likely need administrator rights to be able to set the executable flag on memory pages.
And it's really dangerous. If you do this in a program you distribute and have some kind of bug that lets an attacker use our program to write to those allocated memory pages, then the attacker can gain administrator rights and take control of the computer.
There's also other problems with your code, like pointers to functions might not translate well into general pointers on all platforms. It's very hard (not to mention non-standard) to predict or otherwise get the size of a function. You also print out pointers wrong in your code example. (use the "%p" format to print a void *, casting the pointer to a void * is needed).
Also when you declare a function like int fun() that's not the same as declaring a function that takes no arguments. If you want to declare a function that takes no arguments you should explicitly use void as in int fun(void).
The standard says:
The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1.
[C2011, 7.24.2.1/2; emphasis added]
In the standard's terminology, functions are not "objects". The standard does not define behavior for the case where the source pointer points to a function, therefore such a memcpy() call produces undefined behavior.
Additionally, the pointer returned by malloc() is an object pointer. C does not provide for direct conversion of object pointers to function pointers, and it does not provide for objects to be called as functions. It is possible to convert between object pointer and function pointer by means of an intermediate integer value, but the effect of doing so is at minimum doubly implementation-defined. Under some circumstances it is undefined.
As in other cases, UB can turn out to be precisely the behavior you hoped for, but it is not safe to rely on that. In this particular case, other answers present good reasons to not expect to get the behavior you hoped for.
As was said in some comments, you need to make the data executable. This requires communicating with the OS to change protections on the data. On Linux, this is the system call int mprotect(void* addr, size_t len, int prot) (see http://man7.org/linux/man-pages/man2/mprotect.2.html).
Here is a Windows solution using VirtualProtect.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#ifdef _WIN32
#include <Windows.h>
#endif
int return0()
{
return 0;
}
int main()
{
int (*r0c)(void) = malloc(100);
memcpy((void*) r0c, (void*) return0, 100);
printf("Address of r0c is: %p\n", (void*) r0c);
#ifdef _WIN32
long unsigned int out_protect;
if(!VirtualProtect((void*) r0c, 100, PAGE_EXECUTE_READWRITE, &out_protect)){
puts("Failed to mark r0c as executable");
exit(1);
}
#endif
printf("copied is: %d\n", (*r0c)());
return 0;
}
And it works.
Malloc returns a pointer to an allocated memory (100 bytes in your case). This memory area is uninitialized; assuming that memory could be executed by the CPU, for your code to work, you would have to fill those 100 bytes with the executable instructions that the function implements (if indeed it can be held in 100 bytes). But as has been pointed out, your allocation is on the heap, not in the text (program) segment and I don't think it can be executed as instructions. Perhaps this would achieve what it is you want:
int return0()
{
return 0;
}
typedef int (*r0c)(void);
int main(void)
{
r0c pf = return0;
printf("Address of r0c is: %x\n", pf);
printf("copied is: %d\n", pf());
return 0;
}

write(), printf(), and function references in C

A few questions about this simple scenario:
#include <unistd.h>
#include <stdio.h>
void empty(){};
int main()
{
printf("%p\t%lu\n", empty, sizeof(empty));
write(1, empty, 100);
return 0;
}
What exactly is happening when I use the function's name as a reference?
It shows size of one but printf still treats it as a pointer and yet a void pointer is of size 8. Additionally, onto the write function:
Essentially I want to replicate the printf %p writing the value of a memory address rather than the value at that address mainly to get a better handle on how this all works :^)
Thanks!

why it show 777 when first time

this is my code :
#include <stdio.h>
void foo(void)
{
int i;
printf("%d\n", i);
i = 777;
}
int main(void)
{
foo();
//printf("hello\n");
foo();
return 0;
}
it show :
-1079087064
777
but if you remove the // in the code , it will be show :
134513817
hello
13541364
so why it show 777 when first time ,
thanks
What you are doing is undefined behavior. You are printing the value of an uninitialized variable. The number could be based on whatever was previously in the memory location of that variable.
If you want it to be 777 both times do this:
void foo(void)
{
int i = 777;
printf("%d\n", i);
}
Let me also explain why it is uninitialized something gives this undefined behavior. When you make a variable it allocates some memory for the variable. This memory could already have some values on it from other programs previously using that (or those) memory blocks. These value are then stored on the variable until you make it something else such as assigning it with the = operator.
I believe it showed 777 the first time (without the printf) because of how your c compiler works and converts the c to assembly using a stack based memory model. Check wikipedia about assembly language and stacks for more information.
In your first test, you entered the function foo, it 'created' (probably by just moving the stack pointer) the space on the stack for the local variable i, printed it's garbage value (-1079087064) then set that memory to 777. It then returned to main and called foo again. This call behaves exactly the same as the first, it 'creates' it's room for i in the same place as the previous call, and prints whatever value is there. Since that memory was previously set to 777 that is what was printed. The two foo calls would compile to something like this:
call foo
call foo
In your second example there is a call to printf between the calls to foo.
call foo
; code to set up call to printf, probably involving stack
call printf ; most definitely changing the stack
; clean up also probably changes the stack
call foo
Thus the assignment of the variable i in stack memory was altered by the many assignments involved in calling printf. The subsequent call to foo just prints out a new garbage value, presumably at the same memory location as before.
i is not initialized to anything but asking to print what it has. So, it can print anything ( garbage values ) which is what the undefined behavior.

Resources