I am trying to write out the size in bytes of a string that is defined as
#define PATHA "/tmp/matrix_a"
using the code
rtn=write(data,(strlen(PATHA)*sizeof(char)),sizeof(int));
if(rtn < 0)
perror("Writing data_file 2 ");
I get back Writing data_file 2 : Bad address
What exactly about this is a bad address? The data file descriptor is open, and writes immediately before and after the code above succeed. The data written to the file needs to be raw, not ASCII.
I have also tried defining the string as a char[], with the same result.
The second argument to write() is the address of the bytes you want to write, but you are passing the bytes you want to write themselves. In order to get an address, you must store those bytes in a variable (you can't take the address of the result of an expression). For example:
size_t patha_len = strlen(PATHA);
rtn = write(data, &patha_len, sizeof patha_len);
The arguments to POSIX write() are:
#include <unistd.h>
ssize_t write(int fildes, const void *buf, size_t nbyte);
That's a:
file descriptor
buffer
size
You've passed two sizes instead of an address and a size.
Use:
rtn = write(data, PATHA, sizeof(PATHA)-1);
or:
rtn = write(data, PATHA, strlen(PATHA));
If you are seeking to write the size of the string as an int, then you need an int variable to pass to write(), like this:
int len = strlen(PATHA);
rtn = write(data, &len, sizeof(len));
Note that you can't just use a size_t variable unless you want to write a size_t; on 64-bit Unix systems, in particular, sizeof(size_t) != sizeof(int) in general, and you need to decide which size it is you want to write.
You also need to be aware that some systems are little-endian and others big-endian, and what you write using this mechanism on one type is not going to be readable on the other type (without mapping work done before or after I/O operations). You might choose to ignore this as a problem, or you might decide to use a portable format (usually, that's called 'network order', and is equivalent to big-endian), or you might decide to define that your code uses the opposite order. You can write the code so that the same logic is used on all platforms if you're careful (and all platforms get the same answers).
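A minimal sketch of the portable "network order" option, assuming a 32-bit length is acceptable (the helper name write_len_be is mine):

```c
#include <arpa/inet.h>   /* htonl(), ntohl() */
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define PATHA "/tmp/matrix_a"

/* Write the string length as a fixed-width, big-endian ("network order")
   value so it can be read back the same way on any platform. */
static int write_len_be(int fd)
{
    uint32_t len = htonl((uint32_t)strlen(PATHA));
    if (write(fd, &len, sizeof len) != (ssize_t)sizeof len)
        return -1;
    return 0;
}
```

The reader would apply ntohl() after reading the four bytes back, so both sides agree on the byte order regardless of the host CPU.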
The second argument to write() is the buffer and third argument is the size:
ssize_t write(int fd, const void *buf, size_t count);
The posted code passes the length where an address is expected, which is incorrect. The compiler should have emitted a warning about this (don't ignore compiler warnings, and compile with warnings enabled at the highest level).
Change to:
rtn=write(data, PATHA, strlen(PATHA));
Note sizeof(char) is guaranteed to be 1 so it can be omitted from the size calculation.
The Bad address error has already been answered. If you want to print the size of a string, just use printf. Note that strlen() returns a size_t, so %zu is the matching conversion:
printf("Length: %zu\n", strlen(PATHA));
Either that, or you can write a function that converts an integer to a string and print that out... I prefer printf :)
rtn = write(data, PATHA, strlen(PATHA));
is what you want I think. Arguments are supposed to be
file descriptor (data)
the source buffer (your string constant PATHA)
The number of bytes to pull from that buffer (measured using strlen() on the same PATHA constant)
Also, to be complete, you should always check rtn for how many bytes were actually written. You are not guaranteed that write() transfers all the bytes requested on every descriptor type, so you sometimes end up writing in chunks: each call reports how much it wrote, and you subtract that from what remains until everything is out.
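That chunked-write idiom can be sketched like this (the helper name write_all is mine):

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Keep calling write() until every byte is out, or a real error occurs.
   Returns 0 on success, -1 on error (errno is set by write()). */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        p   += n;                  /* advance past bytes already written */
        len -= (size_t)n;
    }
    return 0;
}
```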
The manual says that
Upon successful return, these functions [printf, dprintf etc.] return the number of characters printed.
The manual does not mention whether this number may be less (but still nonnegative) than the length of the "final" (substitutions and formatting done) string. Nor does it mention how to check whether (or ensure that) the string was completely written.
The dprintf function operates on a file descriptor, similarly to the write function, for which the manual does mention that
On success, the number of bytes written is returned (zero indicates nothing was written). It is not an error if this number is smaller than the number of bytes requested;
So if I want to write a string completely then I have to enclose the n = write() in a while-loop. Should I have to do the same in case of dprintf or printf?
My understanding of the documentation is that dprintf would either fail or produce all the output. But I agree that it is something of a gray area (and I might not understand it well); my guess is that a partial output counts as a failure (and so returns a negative size).
Here is the implementation of musl-libc:
In stdio/dprintf.c the dprintf function just calls vdprintf
But in stdio/vdprintf.c you just have:
static size_t wrap_write(FILE *f, const unsigned char *buf, size_t len)
{
return __stdio_write(f, buf, len);
}
int vdprintf(int fd, const char *restrict fmt, va_list ap)
{
FILE f = {
.fd = fd, .lbf = EOF, .write = wrap_write,
.buf = (void *)fmt, .buf_size = 0,
.lock = -1
};
return vfprintf(&f, fmt, ap);
}
So dprintf is returning a size like vfprintf (and fprintf....) does.
However, if you really are concerned, you would be better off using snprintf or asprintf to produce the output in a memory buffer, and then explicitly using write(2) on that buffer.
Look into stdio/__stdio_write.c the implementation of __stdio_write (it uses writev(2) with a vector of two data chunks in a loop).
In other words, I would often not really care; but if you really need to be sure that every byte has been written as you expect it (for example if the file descriptor is some HTTP socket), I would suggest to buffer explicitly (e.g. by calling snprintf and/or asprintf) yourself, then use your explicit write(2).
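A minimal sketch of that buffer-then-write approach (the helper name dprintf_fully and the format string are mine):

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Format into a local buffer first, then push the whole thing out with
   write(2), retrying on short writes. Returns 0 on success, -1 on error. */
static int dprintf_fully(int fd, int value)
{
    char buf[64];
    int len = snprintf(buf, sizeof buf, "value=%d\n", value);
    if (len < 0 || (size_t)len >= sizeof buf)
        return -1;                     /* encoding error or truncation */

    const char *p = buf;
    size_t left = (size_t)len;
    while (left > 0) {                 /* loop over partial writes */
        ssize_t n = write(fd, p, left);
        if (n < 0)
            return -1;
        p += n;
        left -= (size_t)n;
    }
    return 0;
}
```

Because the formatting happens entirely in memory, any failure from write(2) is visible directly, and you know exactly how many bytes remain unwritten.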
PS. You might check yourself the source code of your particular C standard library providing dprintf; for GNU glibc see notably libio/iovdprintf.c
With stdio, returning the number of partially written bytes doesn't make much sense because stdio functions work with a (more or less) global buffer whose state is unknown to you and gets dragged in from previous calls.
If stdio functions allowed you to work with that, the error return values would need to be more complex as they would not only need to communicate how many characters were or were not outputted, but also whether the failure was before your last input somewhere in the buffer, or in the middle of your last input and if so, how much of the last input got buffered.
The d-functions could theoretically give you the number of partially written characters easily, but POSIX specifies that they should mirror the stdio functions, so they only give you a further-unspecified negative value on error.
If you need more control, you can use the lower level functions.
Concerning printf(), it is quite clear.
The printf function returns the number of characters transmitted, or a negative value if an output or encoding error occurred. C11dr §7.21.6.3 3
A negative value is returned if an error occurred; in that case, 0 or more characters may have been printed. The count is unknowable via the standard library.
If the return value is not negative, that is the number of characters sent to stdout.
Since stdout is often buffered, that may not be the number received at the output device when printf() concludes. Follow printf() with fflush(stdout):
int r1 = printf(....);
int r2 = fflush(stdout);
if (r1 < 0 || r2 != 0) Handle_Failure();
For the finest control, "print" to a buffer and use putchar() or various non-standard functions.
My bet is no (after looking into the obfuscated source of printf). So any nonnegative return value means that printf was fully successful (it reached the end of the format string, and everything was passed to kernel buffers).
But someone authoritative should confirm this.
I have some code that uses the low level i/o read and write system calls, as described on page 170 of the C programming language book Kernighan and Ritchie.
The function prototypes are this
int n_read = read ( int fd, char *buf, int n )
int n_read = write ( int fd, char *buf, int n )
Now, the two .c files that use read and write are called by a larger Fortran-based program to read and write lots of data.
The C code is simply this, with no #include of any kind, an underscore after each function name, and arguments passed by reference:
int read_ ( int *descriptor, char *buffer, int *nbyte )
{
return ( read( *descriptor, buffer, *nbyte ) );
}
int write_ ( int *descriptor, char *buffer, int *nbyte )
{
return ( write( *descriptor, buffer, *nbyte ) );
}
and the larger fortran based program will do something like this
INTEGER nbyte
COMPLEX*16 matrix(*)
INTEGER READ, WRITE
EXTERNAL READ, WRITE
status = READ( fd, matrix, nbyte )
if ( status .eq. -1 ) then
CALL ERROR('C call read failure')
stop
endif
As you may have already guessed, this works fine for nbyte values less than 2^31. I need to read more than 2 GB of data, so I need nbyte to be a long integer in C and INTEGER*8 in Fortran.
Is there an equivalent read64 and write64, like there is an lseek64 provided by unistd.h and features.h ?
What is the best way to recode this?
Should I use fread and fwrite?
Is the int fd from the low-level write the same as the FILE *stream from fread()?
My requirement is to pass a long integer of 8 bytes, to allow nbyte values of up to 100 to 500 gigabytes (an integer with 12 digits).
Am I gaining or losing anything by currently using read and write, which are identified as "system calls"? What does that mean?
Edit: You can't, at least not on Linux. read will never transfer more than what a 32-bit integer can hold.
From the manpages of Linux on read:
On Linux, read() (and similar system calls) will transfer at most
0x7ffff000 (2,147,479,552) bytes, returning the number of bytes
actually transferred. (This is true on both 32-bit and 64-bit
systems.)
This is not a constraint imposed by POSIX; POSIX allows it, but in the end how read behaves is implementation-defined. As Andrew Hanle reports, reading a 32 GB file works just fine on Solaris. In that case, my old answer is still valid.
Old Answer:
read can work with 64-bit files just fine. It's defined in <unistd.h> as the following:-
ssize_t read(int fd, void *buf, size_t count);
You would have to adjust your routines to work with size_t instead of int, to properly support big files.
You should check SSIZE_MAX (the maximum value supported for count) before using read with a big file, and abort if it is too small (or split the transfer into smaller chunks). SSIZE_MAX is an implementation-defined value.
As #Leandros observed, POSIX-conforming implementations of read() and write() accept byte counts of type size_t, and return byte counts of type ssize_t. These are probably the definitions that actually apply to you, as the read() and write() functions are not specified by the C standard. That's a distinction without much difference, however, because size_t is not required to be wider than int -- in fact, it can be narrower.
You anyway have a bigger problem. The Fortran code seems to assume that the C functions it is calling will read / write the full specified number of bytes or else fail, but POSIX read() and write() are not guaranteed to do that when they succeed. Indeed, there was a question around here the other day that hinged on the fact that these functions did not transfer more bytes at a time than can be represented by a signed, 32-bit integer, even on a 64-bit system with 64-bit [s]size_t.
You can kill both of these birds with one stone by implementing the read_() and write_() functions to loop, performing successive calls to the underlying read() or write() function, until the full number of specified bytes is transferred or an error occurs.
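A sketch of such a looping wrapper for the read side, assuming the Fortran caller passes an INTEGER*8 count (the name read64_ is mine; the write side is symmetric):

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Fortran-callable wrapper: loops until all *nbyte bytes are read, so
   counts above 2^31 (or above the kernel's per-call limit) still work.
   Returns 0 on success, -1 on error or premature end of file. */
int read64_(int *descriptor, char *buffer, int64_t *nbyte)
{
    int64_t left = *nbyte;
    while (left > 0) {
        /* never ask for more than a single read() call can deliver */
        size_t chunk = left > 0x7ffff000 ? 0x7ffff000 : (size_t)left;
        ssize_t n = read(*descriptor, buffer, chunk);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            return -1;          /* unexpected end of file */
        buffer += n;
        left   -= n;
    }
    return 0;
}
```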
For write(fd[1], string, size) - what would happen if string is shorter than size?
I looked up the man page but it doesn't clearly specify that situation. I know that read would simply stop there and read whatever the string is, but that's certainly not the case for write. So what is write doing? The return value is still size, so is it appending a null terminator? Why doesn't it just stop like read?
When you call write(), the system assumes you are writing generic data to some file - it doesn't care that you have a string. A null-terminated string is seen as a bunch of non-zero bytes followed by a zero byte - the system will keep writing out until it's written size bytes.
Thus, specifying a size longer than your string can be dangerous: the system reads memory beyond the end of the string and writes it to your file, probably garbage data.
write will write size bytes of data starting at string. If you define string as an array shorter than size, it will have undefined behaviour. But in your previous question, char *line = "apple"; contains 6 characters (i.e. a, p, p, l, e and the null character).
So it is best to call write with size set to the correct value.
write(int fildes, const void *buf, size_t nbyte) does not write null terminated strings. It writes the content of a buffer. If there are any null characters in the buffer they will be written as well.
read(int fildes, void *buf, size_t nbyte) also pays no attention to null characters. It reads a number of bytes into the given buffer, up to a maximum of nbyte. It does not add any null terminating bytes.
These are low level routines, designed for reading and writing arbitrary data.
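For instance, a buffer with an embedded zero byte is written in full; a small sketch (the helper name write_raw_demo is mine):

```c
#include <unistd.h>

/* write() transfers raw bytes: the embedded '\0' below is written like
   any other byte, and the transfer does not stop at it. */
static ssize_t write_raw_demo(int fd)
{
    const char buf[5] = { 'a', 'b', '\0', 'c', 'd' };
    return write(fd, buf, sizeof buf);   /* asks for all 5 bytes */
}
```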
The write call outputs a buffer of the given size. It does not attempt to interpret the data in the buffer. That is, you give it a pointer to a memory location and a number of bytes to write (the length) then, as long as those memory locations exist in a legal portion of your program's data, it will copy those bytes to the output file descriptor.
Unlike the string-manipulation routines, write (and read, for that matter) ignores null bytes, that is, bytes with the value zero. read does pay attention to the end-of-file condition and, on certain devices, will only read the amount of data available at the time, perhaps returning less data than requested, but both operate on raw bytes without interpreting them as "strings".
If you attempt to write more data than the buffer contains, it may or may not work, depending on the layout of memory. At best the behavior is undefined. At worst you'll get a segmentation fault and your program will crash.
I have the code below:
#include <errno.h>
#include <fcntl.h>   /* open() and the O_* flags */
#include <stdio.h>
#include <unistd.h>
int main () {
    int fd = open("filename.dat", O_CREAT|O_WRONLY|O_TRUNC, 0600);
    int result = write(fd, "abcdefghijklmnopqrstuvxz", 100);
    printf("\n\nfd = %d, result = %d, errno = %d", fd, result, errno);
    close(fd);
    return 0;
}
I am trying to understand what happens when I try to write more bytes to a file than I have available. So I am calling write and asking the program to write 100 bytes while I have much less than that. The result: a bunch of stuff from stdout ends up on filename.dat. If instead of 100 I use strlen("abcdefghijklmnopqrstuvxz"), I get the desired result. My question then is: why is the program trying to write beyond the '\0' character on my string? Is there some undefined behavior going on here?
My question then is: why is the program trying to write beyond the
'\0' character on my string?
The function write(2) doesn't care about 0-terminators. It actually doesn't care about buffer contents at all: it will try to write as many bytes as you tell it.
Is there some undefined behavior going on here
Of course, trying to write more than you have might incur the wrath of the OS, which could decide to terminate your process if it touches inaccessible memory.
The write() function you are using does not care about the content. It just writes the number of bytes you tell it to write into the file.
So when you tell it to write 100 bytes but provide fewer than 100, the remaining bytes are read as garbage values.
But when you use strlen("abcdefghijklmnopqrstuvxz"), you are asking write() to write exactly as many bytes as the length of the string, so it works fine there.
There are two techniques for representing a string: the null-terminated version, and one where you supply a pointer to the first byte together with a size. write uses the second. It needs a pointer to where your data begins and a length so it knows how much data to copy to the file, but it pays no attention to null bytes. Sometimes these methods wrap a simple memcpy.
So when you specified the length 100, the program stored into the file whatever lay in memory after your abcdefghijklmnopqrstuvxz: your "bunch of stdout stuff". That's why you see garbage. You were lucky; you can easily get a SEGFAULT in cases like this!
My question then is: why is the program trying to write beyond a \0 Because you want it to write 100 chars.
Is there some undefined behavior going on here? If you increase that 100 to a large number and the memory involved extends into an area your process may not access, it is undefined behaviour.
I think that the basic issue here is that you're thinking of C strings as values, you think you're passing this value to the write function, and the write function is writing out your value plus extra junk.
C is lower level than that. In C, we don't really pass strings around, instead we pass pointers to strings, which are 'char *' values but with the added promise that they point to a valid block of memory that should be treated as a null-terminated string.
The write() function doesn't care about the null-terminated string convention. The parameters in the write call provide a file descriptor, a char *, and a buffer length.
Also, the compiler converts string constants into char arrays. The equivalent of this happens at the top level:
const char stringconst00001[25] = { 'a', 'b', 'c', ... 'v', 'x', 'z', '\0' }
And it does this in main():
int result = write(fd, stringconst00001, 100);
Is it possible to write something other than a char * using the write() function? I need to print an unsigned long and I have no idea how to do it; in other words, how do I pass an unsigned long in the buf parameter?
It's usually preferable to use the standard C functions where available since they're more portable. If you can, and you want to output it as text, you should look at fprintf rather than fwrite. The former will format it for you, the latter is meant for writing raw memory blocks.
For example:
int val = 42;
fprintf (fh, "%d", val); // should check return value.
will output the text "42".
If you do want to write out the binary representation, fwrite is the means for that:
int val = 42;
fwrite (&val, sizeof (val), 1, fh); // should check return value.
That will write out the binary representation, so that the bytes 0, 0, 0 and 42 are written to the file on a big-endian machine with a 32-bit int (the exact layout depends on the implementation; a little-endian machine would write 42, 0, 0, 0).
That's if you're able to use file handles rather than descriptors, otherwise the f* functions are no good for you. There may be valid reasons why you want to work with the lower levels.
So, if all you have is a descriptor for write, you'll need to format the variable into a string first, with something like:
char buff[100];
sprintf (buff, "%d", val);
write (fd, buff, strlen (buff)); // should check return value.
That's assuming you want it as text. If you want it as a binary value, it's similar to the way we've done it above with the fwrite:
write (fd, &val, sizeof (val)); // should check return value.
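Adapting the text variant above to the unsigned long from the question (%lu is the matching conversion; the helper name write_ulong_text is mine):

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Format an unsigned long as decimal text and write() it to a file
   descriptor. Returns the number of bytes written, or -1 on error. */
static ssize_t write_ulong_text(int fd, unsigned long val)
{
    char buf[32];                         /* plenty for a 64-bit value */
    int len = snprintf(buf, sizeof buf, "%lu", val);
    if (len < 0)
        return -1;                        /* encoding error */
    return write(fd, buf, (size_t)len);
}
```

For the binary variant, write(fd, &val, sizeof val) works the same as the int example, with the same endianness caveat.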