I was going to use /dev/random output as a seed for OpenSSL key generation, so I wrote this small program just to check what I was about to do:
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

#define LEN 128

void uc2hex(char *hex, unsigned char *uc, unsigned short uc_len)
{
    FILE *bp = fmemopen(hex, 2 * uc_len + 1, "w");
    unsigned short i;

    for (i = 0; i < uc_len; i++) {
        fprintf(bp, "%02x", uc[i]);
        //printf("%02x\n",uc[i]);
        //fprintf(bp,"%d-",i);
    }
    fprintf(bp, "%c", '\0');
    fclose(bp);
}

int main()
{
    unsigned char buf[LEN];
    char str[2 * LEN + 1];
    int fd = open("/dev/random", O_RDONLY);

    read(fd, buf, LEN);
    uc2hex(str, buf, LEN);
    printf("%s\n", str);
    close(fd);
    return 0;
}
I ran the program once or twice and everything seemed to work fine, but then I ran it four more times in short sequence, and this is the output:
[walter@eM350 ~]$ ./random
0ee08c942ddf901af1278ba8f335b5df8db7cf18e5de2a67ac200f320a7a20e84866f533667a7e66a4572b3bf83d458e6f71f325783f2e3f921868328051f8f296800352cabeaf00000000000000000001000000000000005d08400000000000c080300e00000000000000000000000010084000000000000006400000000000
[walter@eM350 ~]$ ./random
1f69a0b931c16f796bbb1345b3f58f17f74e3df600000000bb03400000000000ffffffff00000000880e648aff7f0000a88103b4d67f000000305cb4d67f000030415fb4d67f0000000000000000000001000000000000005d08400000000000c080300e00000000000000000000000010084000000000000006400000000000
[walter@eM350 ~]$ ./random
4e8a1715238644a840eb66d9ff7f00002e4e3df600000000bb03400000000000ffffffff00000000a8ec66d9ff7f0000a871a37ad97f00000020fc7ad97f00003031ff7ad97f0000000000000000000001000000000000005d08400000000000c080300e00000000000000000000000010084000000000000006400000000000
[walter@eM350 ~]$ ./random
598c57563e8951e6f0173f0cff7f00002e4e3df600000000bb03400000000000ffffffff0000000058193f0cff7f0000a8e1cbda257f0000009024db257f000030a127db257f0000000000000000000001000000000000005d08400000000000c080300e00000000000000000000000010084000000000000006400000000000
These look to me like anything but 128-byte random strings, since they are mostly the same. So, excluding the possibility that the NSA tampered with the Linux kernel's random number generator, I can only guess that this has something to do with the available entropy on my machine, which gets exhausted when I ask for too many bytes in sequence. My questions are:
1) Is this guess correct?
2) Assuming 1) is correct, how do I know whether there is enough entropy to generate a truly random byte sequence?
From the man page for read:
Upon successful completion, read(), readv(), and pread() return the number of bytes actually read and placed in the buffer. The system guarantees to read the number of bytes requested if the descriptor references a normal file that has that many bytes left before the end-of-file, but in no other case.
Bottom line: check the return value from read and see how many bytes you actually read - there may not have been enough entropy to generate the number of bytes you requested.
int len = read(fd, buf, LEN);
printf("read() returned %d bytes: ", len);
if (len > 0)
{
uc2hex(str, buf, len);
printf("%s\n", str);
}
Test:
$ ./a.out
read() returned 16 bytes: c3d5f6a8ee11ddc16f00a0dea4ef237a
$ ./a.out
read() returned 8 bytes: 24e23c57852a36bb
$ ./a.out
read() returned 16 bytes: 4ead04d1eedb54ee99ab1b25a41e735b
$
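As for question 2: on Linux, the kernel exposes its current entropy estimate via /proc/sys/kernel/random/entropy_avail. A minimal sketch that reads it (assuming a Linux system; what counts as "enough" depends on how many bytes you want to draw):

#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("/proc/sys/kernel/random/entropy_avail", "r");
    int bits;

    if (fp == NULL) {
        perror("fopen");
        return 1;
    }
    if (fscanf(fp, "%d", &bits) == 1)
        printf("entropy estimate: %d bits\n", bits);
    fclose(fp);
    return 0;
}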
As other people have suggested, you need to check the return value for the number of bytes read.
If /dev/random did not have sufficient bytes available, it will have returned fewer.
However, you still use the expected length in your following calls:
uc2hex(str, buf, LEN);
printf("%s\n", str);
So, you are converting and printing uninitialised memory. I am not surprised that subsequent calls then show the same value: if that memory hasn't been written to between calls, the value won't change.
EDIT: Better would be:
int nBytes = read(fd, buf, LEN);
uc2hex(str, buf, nBytes);
printf("%s\n", str);
Related
I'm having some trouble comprehending exactly how read() works. For example, given the following program, with the file infile containing the string "abcdefghijklmnop":
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

int main() {
    int fd;
    char buf[5] = "WXYZ";

    fd = open("infile", O_RDONLY);
    read(fd, buf, 2);
    read(fd, buf + 2, 2);
    close(fd);
    printf("%c%c%c%c\n", buf[0], buf[1], buf[2], buf[3]);
    return 0;
}
Looking at the read system call function:
ssize_t read(int fildes, void *buf, size_t nbyte);
I get that *buf is the buffer that holds the bytes read, and nbyte is the number of bytes to read. So after the first read(), only the first 2 characters of infile are read ("a" and "b"). Why isn't the output just "abcd"? Why are there other possibilities such as "aXbZ" or "abcZ"?
The manual page for the version of read I have says:
read() attempts to read nbyte bytes of data from the object referenced by the descriptor fildes into the buffer pointed to by buf.
and:
Upon successful completion, read(), readv(), and pread() return the number of bytes actually read and placed in the buffer. The system guarantees to read the number of bytes requested if the descriptor references a normal file that has that many bytes left before the end-of-file, but in no other case.
Thus, in the case you describe, with “the file infile containing the string "abcdefghijklmnop"”, the two read calls are guaranteed to put “ab” and “cd” into buf, so the program will print “abcd” and a newline character. (I would not take that guarantee literally. Certainly the system can guarantee that it will not allow unrelated interrupts to prevent the read from completely reading the requested data, but it could not guarantee there is no hardware failure, such as a disk drive failing before the read is completed.)
In other situations, when read is reading from a source other than a normal file, each of the two read calls may read 0, 1, or 2 bytes. Thus, the possible buffer contents are:
Bytes read in   Bytes read in   Buffer
first read      second read     contents
0               0               WXYZ
0               1               WXaZ
0               2               WXab
1               0               aXYZ
1               1               aXbZ
1               2               aXbc
2               0               abYZ
2               1               abcZ
2               2               abcd
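If you want to know which of these cases you actually hit, check the return values of the two calls, e.g. (a sketch based on the program above; a negative return would indicate an error):

ssize_t n1 = read(fd, buf, 2);      /* bytes placed at buf[0..1] */
ssize_t n2 = read(fd, buf + 2, 2);  /* bytes placed at buf[2..3] */
printf("first read: %zd bytes, second read: %zd bytes\n", n1, n2);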
I am trying to speed up my C program to spit out data faster.
Currently I am using printf() to give some data to the outside world. It is a continuous stream of data, so I cannot simply return the data.
How can I use write() or fwrite() to send the data to the console instead of to a file?
Overall, my setup consists of a program written in C whose output goes to a Python script, where the data is processed further. I form a pipe:
./program_in_c | script_in_python
This gives an additional benefit on the Raspberry Pi by using more of the processor's cores.
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t count);
write() writes up to count bytes from the buffer starting at buf to
the file referred to by the file descriptor fd.
The standard output file descriptor is 1 (on Linux, at least).
Also be sure to flush the stdout buffer before calling the write system call, to ensure that any previously buffered output is written first:
fflush(stdout); // Will now print everything in the stdout buffer
write(1, buf, count);
using fwrite:
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream);
The function fwrite() writes nmemb items of data, each size bytes
long, to the stream pointed to by stream, obtaining them from the
location given by ptr.
fflush(stdout);
int buf[8];
fwrite(buf, sizeof(int), sizeof(buf) / sizeof(int), stdout); /* nmemb is the item count, not sizeof(buf) */
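Putting it together, a minimal sketch of a producer that writes binary ints to stdout for a downstream script to read (the data here is purely illustrative):

#include <stdio.h>

int main(void)
{
    int buf[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    size_t nmemb = sizeof(buf) / sizeof(buf[0]);

    /* the third argument is the item count, not the byte count */
    if (fwrite(buf, sizeof(buf[0]), nmemb, stdout) != nmemb)
        fprintf(stderr, "short write\n");
    fflush(stdout);
    return 0;
}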
Please refer to the man pages for further reading, at the links below:
fwrite
write
Well, there's little or nothing to be gained by trying to beat the buffering already done by the stdio.h package. If you try to use fwrite() with larger buffers, you'll probably save no time and use more memory than necessary, as stdio.h selects the buffer size best suited to the filesystem where the data is to be written.
A simple program like the following will show that speed is of no concern, as stdio is already buffering output.
#include <stdio.h>

int
main()
{
    int c;

    while ((c = getchar()) >= 0)
        putchar(c);
}
If you compare the program above with the one below:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main()
{
    char buffer[512];
    int n;

    while ((n = read(0, buffer, sizeof buffer)) > 0)
        write(1, buffer, n);
    if (n < 0) {
        perror("read");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
You will see that there's no significant difference or, even, that the first program is faster, despite doing I/O on a per-character basis (it is as Kernighan & Ritchie wrote it in the first edition of "The C Programming Language"). Most probably the first program will win.
The calls to read() and write() involve a system call each, with a buffer size decided by you. The individual getchar() and putchar() calls don't: they just store the characters in a memory buffer whose size is decided by the stdio.h library implementation, based on the filesystem, and flush the buffer once it fills with data. If you grow the buffer size in the second program, you'll see that results improve up to a point, but after that you'll see no further increase in speed. The number of calls made to the library is insignificant compared with the time spent doing the actual I/O, and selecting a very large buffer eats memory (and the Raspberry Pi is limited in this sense, to about 1 GB of RAM). If you end up swapping because of an oversized buffer, you'll lose the battle completely.
Most filesystems have a preferred buffer size, because the kernel does read-ahead (on sequential reads, the kernel reads more than you asked for, anticipating that you'll continue reading after you have consumed the data), and this affects the optimum buffer size. The stat(2) system call reports that preferred size, and stdio uses it when selecting its actual buffer size.
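For example, a sketch that asks for the preferred block size of standard output via fstat(2) (the relevant field is st_blksize):

#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    struct stat st;

    if (fstat(1, &st) == 0)   /* 1 = standard output */
        printf("preferred I/O block size: %ld bytes\n", (long)st.st_blksize);
    return 0;
}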
Don't think you are going to get better (or much better) than the program listed first above. Even if you use large enough buffers.
What is not correct (or valid) is to intermix calls that buffer (like everything in the stdio package) with raw system calls (like read(2) or write(2)) that do not buffer the data. I've seen a recommendation to use fflush(3) after write(2), which is totally incoherent. There's nothing to gain, and you'll probably get your output incorrectly ordered if you do part of the output with printf(3) and part with write(2). This happens especially in pipelines like the one you plan, because there the buffers are not line oriented (another characteristic of stdio's buffered output).
Finally, I recommend you read "The Unix Programming Environment" by Brian Kernighan and Rob Pike. It will teach you a lot of Unix, and one very good thing is that it will teach you to use the stdio package and the Unix filesystem calls for reading and writing properly. With a little luck you'll find it as a PDF on the internet.
The next program shows you the effect of buffering:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main()
{
    int i;
    char *sep = "";

    for (i = 0; i < 10; i++) {
        printf("%s%d", sep, i);
        sep = ", ";
        sleep(1);
    }
    printf("\n");
}
One would assume you are going to see (on the terminal) the program writing the numbers 0 to 9, separated by ", " and paced at one-second intervals.
But due to the buffering, what you observe is quite different: the program waits for 10 seconds without writing anything at all on the terminal, and at the end writes everything in one shot, including the final line end, when the program terminates and the shell shows you the prompt again.
If you change the program to this:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main()
{
    int i;
    char *sep = "";

    for (i = 0; i < 10; i++) {
        printf("%s%d", sep, i);
        fflush(stdout);
        sep = ", ";
        sleep(1);
    }
    printf("\n");
}
You'll see the expected output, because you have told stdio to flush the buffer on each pass of the loop. In both programs you made 10 calls to printf(3), but in the first there was only one write(2), at the end, writing the full buffer. In the second version you forced stdio to do one such write(2) after each printf, which showed the data as the program went through the loop.
Be careful, because another characteristic of stdio can be confusing you: when you print to a terminal device, printf(3) flushes the output at each \n, but when you run it through a pipe, it flushes only when the buffer fills up completely. This saves system calls (on FreeBSD, for example, the buffer size selected by stdio is around 32 KB, large enough to be optimal; you'll not do better by going above that size).
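If you do want line-at-a-time behavior even through a pipe, you can request it explicitly with setvbuf(3) before writing anything; a minimal sketch:

#include <stdio.h>

int main(void)
{
    /* force line buffering even when stdout is a pipe, not a terminal */
    setvbuf(stdout, NULL, _IOLBF, BUFSIZ);
    printf("flushed at the newline even through a pipe\n");
    return 0;
}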
The console output in C works almost the same way as a file. Once you have included stdio.h, you can write to the console output, named stdout (for "standard output"). In the end, the following statement:
printf("hello world!\n");
is essentially the same as:
char str[] = "hello world\n";
fwrite(str, sizeof(char), sizeof(str) - 1, stdout);
fflush(stdout);
I have a sample program that takes in an input from the terminal and executes it in a cloned child in a subshell.
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/wait.h>
#include <sched.h>
#include <unistd.h>
#include <string.h>
#include <signal.h>

int clone_function(void *arg) {
    execl("/bin/sh", "sh", "-c", (char *)arg, (char *)NULL);
    return 1;  /* only reached if execl fails */
}

int main() {
    while (1) {
        char data[512] = {'\0'};
        int n = read(0, data, sizeof(data));
        // fgets(data, 512, stdin);
        // int n = strlen(data);
        if ((strcmp(data, "exit\n") != 0) && n > 1) {
            char *line;
            char *lines = strdup(data);
            while ((line = strsep(&lines, "\n")) != NULL && strcmp(line, "") != 0) {
                void *clone_process_stack = malloc(8192);
                void *stack_top = clone_process_stack + 8192;
                int clone_flags = CLONE_VFORK | CLONE_FS;
                clone(clone_function, stack_top, clone_flags | SIGCHLD, (void *)line);
                int status;
                wait(&status);
                free(clone_process_stack);
            }
        } else {
            exit(0);
        }
    }
    return 0;
}
The above code works on an older Linux system (with minimal RAM) but not on a newer one. By "not working" I mean that if I type a simple command like "ls" I don't see the output on the console, but on the older system I do.
Also, if I run the same code under gdb in debugger mode, then I see the output printed to the console on the newer system as well.
In addition, if I use fgets() instead of read(), it works as expected on both systems without an issue.
I have been trying to understand the behavior but couldn't figure it out. I tried running strace; the difference I see is that in the trace the return from wait() is followed by the output of the ls program in the cases where it works, and by nothing in the cases where it does not.
The only thing I can think of is that read(), since it's a system call rather than a library function, might behave differently across systems. But I can't see how that would affect the output.
Can someone point me out to why I might be observing this behavior?
EDIT
The code is compiled as:
gcc test.c -o test
strace when it's not working as expected is shown below
strace when it's working as expected (the only difference is that I added a printf("%d\n", n); after the call to read())
Thank you
Shabir
There are multiple problems in your code:
A successful read system call can return any number of bytes between 1 and the buffer size (or 0 at end of file), depending on the type of handle and the available input. It does not stop at newlines like fgets(), so you might get line fragments, multiple lines, or multiple lines plus a line fragment.
Furthermore, if read fills the whole buffer, as it might when reading from a regular file, there is no trailing null terminator, so passing the buffer to string functions has undefined behavior.
The test if ((strcmp(data, "exit\n") != 0) && n > 1) is performed in the wrong order: first test whether read was successful, and only then test the buffer contents.
You do not set a null terminator after the last byte read by read(), relying instead on buffer initialization, which is wasteful and insufficient if read fills the whole buffer. Instead, make data one byte longer than the size argument passed to read, and set data[n] = '\0'; when n > 0.
Here are ways to fix the code:
Using fgets(), you can remove the line-splitting code: just remove initial and trailing white space, ignore empty and comment lines, clone and execute the commands.
Using read(), you could read one byte at a time, collecting the bytes into the buffer until you have a complete line; then null terminate the buffer and use the same rudimentary parser as above (see the sketch below). This approach mimics fgets() while bypassing the buffering performed by the standard streams: it is quite inefficient, but it avoids reading from handle 0 past the end of the line, thus leaving pending input available for the child process to read.
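A sketch of that byte-at-a-time line reader (assuming <unistd.h> is included; error handling abbreviated):

/* Collect bytes from fd into buf until a newline, EOF, or buf is full.
   Returns the number of bytes stored, 0 on EOF with no data. */
size_t read_line(int fd, char *buf, size_t size)
{
    size_t pos = 0;
    char c;

    while (pos < size - 1 && read(fd, &c, 1) == 1) {
        buf[pos++] = c;
        if (c == '\n')
            break;
    }
    buf[pos] = '\0';   /* always null terminate */
    return pos;
}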
It looks like 8192 is simply too small a value for the stack size on a modern system. execl needs more than that, so you are hitting a stack overflow. Increase the value to 32768 or so (a sketch follows) and everything should start working again.
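A sketch of the change (STACK_SIZE is an illustrative name; the void * arithmetic is a GNU extension, which the original code already relies on):

#define STACK_SIZE 32768   /* 8192 is too small for execl on newer systems */

void *clone_process_stack = malloc(STACK_SIZE);
void *stack_top = clone_process_stack + STACK_SIZE;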
I want to mmap stdin. But I can't mmap stdin.
So I have to call read and realloc in a loop, and I want to optimize it by choosing a good buffer size.
fstat of file descriptor 0 gives a struct stat with a member named st_size, which seems to represent the number of bytes already in the pipe buffer corresponding to stdin.
Depending on how I call my program, st_size varies between 0 and the full pipe buffer size of stdin, which is about 65520. For example, in this program:
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main()
{
    struct stat buf;
    int i;

    for (i = 0; i < 20; i++) {
        fstat(0, &buf);
        printf("%lld\n", (long long)buf.st_size);  /* st_size is off_t */
        usleep(10);
    }
}
We can observe the buffer is still being filled:
$ cc fstat.c
$ for (( i=0; i<10000; i++ )) do echo "0123456789" ; done | ./a.out
9196
9306
9350
9394
9427
9471
9515
9559
...
And the output of this program changes every time I re-run it.
I'd like to use st_size for the initial buffer size so that I make fewer calls to realloc.
I have three questions, most important one first:
Can I 'flush' stdin, i.e. wait until the buffer is no longer being filled?
Could I do better? Maybe there is another way to optimize this, to make the realloc-in-a-loop feel less dirty.
Can you confirm that I can't use mmap with stdin?
A few details:
I wanted to use mmap because I already wrote a program, for educational purposes, that does what nm does, using mmap. I was wondering how it would handle stdin.
I know that st_size can be 0, in which case I will have no choice but to define a buffer size.
I know stdin can give more bytes than the pipe buffer size.
I want to optimize for educational purposes; there is no imperative need for it.
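For reference, a minimal sketch of the read()/realloc() loop in question (the initial capacity is an arbitrary guess; the idea would be to seed it with st_size when that is nonzero):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    size_t cap = 65536, len = 0;   /* initial guess for the buffer size */
    char *data = malloc(cap);
    ssize_t n;

    if (data == NULL)
        return 1;
    while ((n = read(0, data + len, cap - len)) > 0) {
        len += (size_t)n;
        if (len == cap) {          /* buffer full: double it and keep reading */
            char *tmp = realloc(data, cap * 2);
            if (tmp == NULL) {
                free(data);
                return 1;
            }
            data = tmp;
            cap *= 2;
        }
    }
    fprintf(stderr, "read %zu bytes total\n", len);
    free(data);
    return 0;
}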
Thank you
I have an exercise in which I need to find the start and end address of a buffer (buf2). I don't have permission to edit the code.
Here is the code:
(the password for the level2 code is fsckmelogic)
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, const char **argv) {
    if (argc < 2) {
        printf("Fail. More Args...\n");
        return 1;
    } else {
        setresuid(geteuid(), geteuid(), geteuid());
        char buf2[4096];
        char buf[16];
        const char password[] = "XXXXXXXXXXX";
        strncpy(buf, argv[1], sizeof(buf) - 1);
        if (strcmp(buf, password) != 0) {
            printf("Wrong.\n");
            return 1;
        } else {
            strcpy(buf2, argv[2]);
            printf("%s", buf2);
            return 0;
        }
    }
}
OK, so no changing the code; however, I'm assuming you can compile it. Also, I'm going to approach this from a hacker point of view (given the web site) and only use tools and techniques that might be available to a hacker. Finally, I am assuming that you are working on a Linux box. So, let's compile the code:
gcc level2.c -o so_test
Now we want to find out some starting addresses... so let's use ltrace (I hope it is installed on the system), and we get:
ltrace ./so_test XXXXXXXXXXX ababaabaabab
__libc_start_main(0x4006f0, 3, 0x7fff291bddb8, 0x400800 <unfinished ...>
geteuid() = 1000
geteuid() = 1000
geteuid() = 1000
setresuid(1000, 1000, 1000, -1) = 0
strncpy(0x7fff291bdcb0, "XXXXXXXXXXX", 15) = 0x7fff291bdcb0
strcmp("XXXXXXXXXXX","XXXXXXXXXXX") = 0
strcpy(0x7fff291bcca0, "ababaabaabab") = 0x7fff291bcca0
printf("%s", "ababaabaabab") = 12
ok...so what?
remember that strncpy returns a pointer to the destination string, and from the code we know that the destination is buf, thus its starting address is 0x7fff291bdcb0 (on my machine; your number will be different).
the third argument to strncpy is the number of characters to copy, which in this case is 15. From the code, we can see that this argument is sizeof(buf) - 1, which means that sizeof(buf) is 16. From this we can deduce that the ending address of buf is 0x7fff291bdcb0 + 0x10, or 0x7fff291bdcc0.
we can also learn that the starting address of buf2 is 0x7fff291bcca0 from the results of the strcpy function call.
We can learn that the string entered by the user was 12 characters long due to the return value from the printf function call.
So now what is left is to find the end address of buf2. We can start throwing input at it until it blows up on us. Hint: if you do not want to type a long string of the same character, you can do the following:
ltrace ./so_test XXXXXXXXXXX `perl -e "print 'A' x 1024;"`
Just change the 1024 to however many characters you want to pump into buf2. On my system, through a bit of trial and error, I found that the largest value I can input is 4151; pushing in 4152 A's results in a segmentation fault (so we know that the maximum size of buf2 is going to be less than this).
The only thing left to do is figure out the length of buf2.
Hopefully this gives you a start; I don't want to do your entire challenge for you :)
If you cannot modify the code, and your assignment is to get the beginning and ending address of buf2, then isn't the only thing left to run it under a debugger and obtain the addresses from there?
The starting address of the buffer can be viewed using:
printf("%p", (void *)buf2);
Once you have the starting address, find the length of the string in the buffer, i.e.
len = strlen(buf2);
Now add the length to the pointer and you get the end address of the string in buf2 (note: the string, not the whole 4096-byte buffer), i.e.
printf("%p %p", (void *)buf2, (void *)(buf2 + strlen(buf2)));