The OpenGroup POSIX.1-2001 defines strerror_r, as does The Linux Standard Base Core Specification 3.1. But I can find no reference to the maximum size that could be reasonably expected for an error message. I expected some define somewhere that I could put in my code but there is none that I can find.
The code must be thread safe, which is why strerror_r is used and not strerror.
Does anyone know the symbol I can use? Or should I create my own?
Example
int result = gethostname(p_buffy, size_buffy);
int errsv = errno;
if (result < 0)
{
char buf[256];
char const * str = strerror_r(errsv, buf, 256);
syslog(LOG_ERR,
"gethostname failed; errno=%d(%s), buf='%s'",
errsv,
str,
p_buffy);
return errsv;
}
From the documents:
The Open Group Base Specifications Issue 6:
ERRORS
The strerror_r() function may fail if:
[ERANGE] Insufficient storage was supplied via strerrbuf and buflen to
contain the generated message string.
From the source:
glibc-2.7/glibc-2.7/string/strerror.c:41:
char *
strerror (errnum)
int errnum;
{
...
buf = malloc (1024);
Having a sufficiently large static limit is probably good enough for all situations.
If you really need to get the entire error message, you can use the GNU version of strerror_r, or you can use the standard version
and poll it with successively larger buffers until you get what you need. For example,
you may use something like the code below.
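#define _POSIX_C_SOURCE 200112L /* request the standard (int-returning) strerror_r */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Call strerror_r and get the full error message. Allocate memory for the
 * entire string with malloc. Return string. Caller must free string.
 * If malloc fails, return NULL.
 */
char *all_strerror(int n)
{
    char *s;
    size_t size;
    size = 1024;
    s = malloc(size);
    if (s == NULL)
        return NULL;
    for (;;) {
        /* Newer systems return ERANGE directly; older ones return -1 and set errno. */
        int rc = strerror_r(n, s, size);
        if (rc != ERANGE && !(rc == -1 && errno == ERANGE))
            break;
        size *= 2;
        char *t = realloc(s, size);
        if (t == NULL) {
            free(s); /* realloc failed: release the old block instead of leaking it */
            return NULL;
        }
        s = t;
    }
    return s;
}
int main(int argc, char **argv)
{
    for (int i = 1; i < argc; ++i) {
        int n = atoi(argv[i]);
        char *s = all_strerror(n);
        if (s == NULL)
            return EXIT_FAILURE;
        printf("[%d]: %s\n", n, s);
        free(s);
    }
    return 0;
}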
I wouldn't worry about it: a buffer size of 256 is far more than sufficient, and 1024 is overkill. You could use strerror() instead of strerror_r(), and then optionally strdup() the result if you need to store the error string, but that isn't thread-safe. If you need strerror_r() for thread safety, just use a size of 256. In glibc-2.7, the longest error message string is 49 characters, or 50 bytes with the terminating null ("Invalid or incomplete multibyte or wide character"). I wouldn't expect future error messages to be significantly longer (in the worst case, a few bytes longer).
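For what it's worth, here is a minimal sketch of the fixed-buffer approach. It assumes the standard (int-returning) strerror_r is selected via _POSIX_C_SOURCE. The helper name describe_errno and the numeric fallback text are made up for illustration and are not part of any standard API.

```c
#define _POSIX_C_SOURCE 200112L /* select the XSI strerror_r, which returns int */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a message for errnum into buf and return buf.
 * If strerror_r reports a failure (unknown number, or buffer too small),
 * fall back to a plain numeric description so buf always holds a valid string. */
char *describe_errno(int errnum, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    if (strerror_r(errnum, buf, buflen) != 0)
        snprintf(buf, buflen, "errno %d", errnum);
    return buf;
}
```

A caller would declare char buf[256]; on its own stack and pass describe_errno(errsv, buf, sizeof buf) straight to syslog, so no state is shared between threads.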
This program:
#include <stdio.h>
#include <errno.h>
#include <string.h>
int main(){
const int limit = 5;
int unknowns = 0;
int maxlen = 0;
int i=0; char* s = strerror(i);
while(1){
if (maxlen<strlen(s)) maxlen = strlen(s);
if (/*BEGINS WITH "Unknown "*/ 0==strncmp("Unknown ", s , sizeof("Unknown ")-1) )
unknowns++;
printf("%.3d\t%s\n", i, s);
i++; s=strerror(i);
if ( limit == unknowns ) break;
}
printf("Max: %d\n", maxlen);
return 0;
}
lists and prints all the errors on the system and keeps track of the maximum length. By the looks of it, the length does not exceed 49 characters (pure strlen's without the final \0) so with some leeway, 64–100 should be more than enough.
I got curious if the whole buffer size negotiation couldn't simply be avoided by returning structs and whether there was a fundamental reason for not returning structs. So I benchmarked:
#define _POSIX_C_SOURCE 200112L //or else the GNU version of strerror_r gets used
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
typedef struct { char data[64]; } error_str_t;
error_str_t strerror_reent(int errn) __attribute__((const));
error_str_t strerror_reent(int errn){
error_str_t ret;
strerror_r(errn, ret.data, sizeof(ret));
return ret;
}
int main(int argc, char** argv){
int reps = atoi(argv[1]);
char buf[64];
volatile int errn = 1;
for(int i=0; i<reps; i++){
#ifdef VAL
error_str_t err = strerror_reent(errn);
#else
strerror_r(errn, buf, 64);
#endif
}
return 0;
}
and the performance difference between the two at -O2 is minimal:
gcc -O2 : The VAL version is slower by about 5%
g++ -O2 -x c++ : The VAL version is faster by about 1% than the standard version compiled as C++ and by about 4% faster than the standard version compiled as C (surprisingly, even the slower C++ version beats the faster C version by about 3%).
In any case, I think it's extremely weird that strerror is even allowed to be thread unsafe. Those returned strings should be pointers to string literals. (Please enlighten me, but I can't think of a case where they should be synthesized at runtime). And string literals are by definition read only and access to read only data is always thread safe.
Nobody has provided a definitive answer yet, so I looked into this further. There is a better function for the job, perror(3): you will probably want to display this error somewhere anyway, so I'd recommend it unless your requirements rule it out.
That's not a full answer, but the reason to use it is that it uses a buffer sized to suit any locale. Internally it calls strerror_r(3). Both functions conform to the POSIX standard and are widely available, so in my eyes the implementation is an authoritative source of truth in this matter.
excerpt from glibc implementation:
static void
perror_internal (FILE *fp, const char *s, int errnum)
{
char buf[1024];
const char *colon;
const char *errstring;
if (s == NULL || *s == '\0')
s = colon = "";
else
colon = ": ";
errstring = __strerror_r (errnum, buf, sizeof buf);
(void) __fxprintf (fp, "%s%s%s\n", s, colon, errstring);
}
From this I infer that, for now and (given how stable these things are) for the foreseeable future, you will not go wrong with a buffer size of 1024 chars.
I have a question pertaining to the extern char **environ. I'm trying to make a C program that counts the size of the environ list, copies it to an array of strings (array of array of chars), and then sorts it alphabetically with a bubble sort. It will print in name=value or value=name order depending on the format value.
I tried using strncpy to get the strings from environ to my new array, but the string values come out empty. I suspect I'm trying to use environ in a way I can't, so I'm looking for help. I've tried to look online for help, but this particular program is very limited. I cannot use system(), yet the only help I've found online tells me to make a program to make this system call. (This does not help).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
extern char **environ;
int main(int argc, char *argv[])
{
char **env = environ;
int i = 0;
int j = 0;
printf("Hello world!\n");
int listSZ = 0;
char temp[1024];
while(env[listSZ])
{
listSZ++;
}
printf("DEBUG: LIST SIZE = %d\n", listSZ);
char **list = malloc(listSZ * sizeof(char**));
char **sorted = malloc(listSZ * sizeof(char**));
for(i = 0; i < listSZ; i++)
{
list[i] = malloc(sizeof(env[i]) * sizeof(char)); // set the 2D Array strings to size 80, for good measure
sorted[i] = malloc(sizeof(env[i]) * sizeof(char));
}
while(env[i])
{
strncpy(list[i], env[i], sizeof(env[i]));
i++;
} // copy is empty???
for(i = 0; i < listSZ - 1; i++)
{
for(j = 0; j < sizeof(list[i]); j++)
{
if(list[i][j] > list[i+1][j])
{
strcpy(temp, list[i]);
strcpy(list[i], list[i+1]);
strcpy(list[i+1], temp);
j = sizeof(list[i]); // end loop, we resolved this specific entry
}
// else continue
}
}
This is my code, help is greatly appreciated. Why is this such a hard to find topic? Is it the lack of necessity?
EDIT: Pasted wrong code, this was a separate .c file on the same topic, but I started fresh on another file.
In a unix environment, the environment is a third parameter to main.
Try this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char *argv[], char **envp)
{
    while (*envp) {
        printf("%s\n", *envp);
        envp++; /* advance the pointer; no dereference needed */
    }
    return 0;
}
There are multiple problems with your code, including:
Allocating the 'wrong' size for list and sorted: you multiply by sizeof(char **), but should be multiplying by sizeof(char *) because you're allocating an array of char *. This bug won't actually hurt you this time. Using sizeof(*list) avoids the problem.
Allocating the wrong size for the elements in list and sorted. You need to use strlen(env[i]) + 1 for the size, remembering to allow for the null that terminates the string.
You don't check the memory allocations.
Your string copying loop is using strncpy() and shouldn't (actually, you should seldom use strncpy()), not least because it is only copying 4 or 8 bytes of each environment variable (depending on whether you're on a 32-bit or 64-bit system), and it is not ensuring that they're null-terminated strings (just one of the many reasons for not using strncpy()).
Your outer loop of your 'sorting' code is OK; your inner loop is 100% bogus because you should be using the length of one or the other string, not the size of the pointer, and your comparisons are on single characters, but you're then using strcpy() where you simply need to move pointers around.
You allocate but don't use sorted.
You don't print the sorted environment to demonstrate that it is sorted.
Your code is missing the final }.
Here is some simple code that uses the standard C library qsort() function to do the sorting, and simulates POSIX strdup()
under the name dup_str() — you could use strdup() if you have POSIX available to you.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
extern char **environ;
/* Can also be spelled strdup() and provided by the system */
static char *dup_str(const char *str)
{
size_t len = strlen(str) + 1;
char *dup = malloc(len);
if (dup != NULL)
memmove(dup, str, len);
return dup;
}
static int cmp_str(const void *v1, const void *v2)
{
const char *s1 = *(const char **)v1;
const char *s2 = *(const char **)v2;
return strcmp(s1, s2);
}
int main(void)
{
char **env = environ;
int listSZ;
for (listSZ = 0; env[listSZ] != NULL; listSZ++)
;
printf("DEBUG: Number of environment variables = %d\n", listSZ);
char **list = malloc(listSZ * sizeof(*list));
if (list == NULL)
{
fprintf(stderr, "Memory allocation failed!\n");
exit(EXIT_FAILURE);
}
for (int i = 0; i < listSZ; i++)
{
if ((list[i] = dup_str(env[i])) == NULL)
{
fprintf(stderr, "Memory allocation failed!\n");
exit(EXIT_FAILURE);
}
}
qsort(list, listSZ, sizeof(list[0]), cmp_str);
for (int i = 0; i < listSZ; i++)
printf("%2d: %s\n", i, list[i]);
return 0;
}
Other people pointed out that you can get at the environment via a third argument to main(), using the prototype int main(int argc, char **argv, char **envp). Note that Microsoft explicitly supports this. They're correct, but you can also get at the environment via environ, even in functions other than main(). The variable environ is unique amongst the global variables defined by POSIX in not being declared in any header file, so you must write the declaration yourself.
Note that the memory allocation is error checked and the error reported on standard error, not standard output.
Clearly, if you like writing and debugging sort algorithms, you can avoid using qsort(). Note that string comparisons need to be done using strcmp(), but you can't use strcmp() directly with qsort() when you're sorting an array of pointers because the argument types are wrong.
Part of the output for me was:
DEBUG: Number of environment variables = 51
0: Apple_PubSub_Socket_Render=/private/tmp/com.apple.launchd.tQHOVHUgys/Render
1: BASH_ENV=/Users/jleffler/.bashrc
2: CDPATH=:/Users/jleffler:/Users/jleffler/src:/Users/jleffler/src/perl:/Users/jleffler/src/sqltools:/Users/jleffler/lib:/Users/jleffler/doc:/Users/jleffler/work:/Users/jleffler/soq/src
3: CLICOLOR=1
4: DBDATE=Y4MD-
…
47: VISUAL=vim
48: XPC_FLAGS=0x0
49: XPC_SERVICE_NAME=0
50: _=./pe17
If you want to sort the values instead of the names, you have to do some harder work. You'd need to define what output you wish to see. There are multiple ways of handling that sort.
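As one possible reading of "sort by value", here is a sketch that orders entries by the text after the first '=' and breaks ties on the whole entry. The names env_value, cmp_by_value and sort_env_by_value are hypothetical, and treating an entry without '=' as having an empty value is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the text after the first '=' in a NAME=value entry,
 * or an empty string if there is no '=' (an assumption of this sketch). */
static const char *env_value(const char *entry)
{
    const char *eq = strchr(entry, '=');
    return eq != NULL ? eq + 1 : "";
}

/* qsort() comparator for an array of char *: by value, then by whole entry. */
static int cmp_by_value(const void *v1, const void *v2)
{
    const char *e1 = *(const char *const *)v1;
    const char *e2 = *(const char *const *)v2;
    int rc = strcmp(env_value(e1), env_value(e2));
    return rc != 0 ? rc : strcmp(e1, e2);
}

/* Sort n NAME=value entries in place by their values. */
void sort_env_by_value(char **list, size_t n)
{
    assert(list != NULL || n == 0);
    qsort(list, n, sizeof list[0], cmp_by_value);
}
```

In the program above, calling sort_env_by_value(list, listSZ) in place of the qsort() line would be enough; the printing loop stays the same.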
To get the environment variables, you need to declare main like this:
int main(int argc, char **argv, char **env);
The third parameter is the NULL-terminated list of environment variables. See:
#include <stdio.h>
int main(int argc, char **argv, char **env)
{
    for(size_t i = 0; env[i]; ++i)
        puts(env[i]);
    return 0;
}
The output of this is:
LD_LIBRARY_PATH=/home/shaoran/opt/node-v6.9.4-linux-x64/lib:
LS_COLORS=rs=0:di=01;34:ln=01;36:m
...
Note also that sizeof(environ[i]) in your code does not get you the length of
the string, it gets you the size of a pointer, so
strncpy(list[i], environ[i], sizeof(environ[i]));
is wrong. Also the whole point of strncpy is to limit based on the destination,
not on the source, otherwise if the source is larger than the destination, you
will still overflow the buffer. The correct call would be
strncpy(list[i], environ[i], 80);
list[i][79] = 0;
Bear in mind that strncpy does not write the terminating '\0' byte if the
source is at least as long as the given count, so you have to terminate the
string yourself. Also note that 79 characters might be too short for storing env variables. For example, my LS_COLORS variable
is huge, at least 1500 characters long. You might want to base your list[i] = malloc calls on strlen(environ[i])+1.
Another thing: your swapping
strcpy(temp, list[i]);
strcpy(list[i], list[i+1]);
strcpy(list[i+1], temp);
j = sizeof(list[i]);
works only if all list[i] point to memory of the same size. Since the list[i] are pointers, the cheaper way of swapping would be by
swapping the pointers instead:
char *tmp = list[i];
list[i] = list[i+1];
list[i+1] = tmp;
This is more efficient, is an O(1) operation and you don't have to worry if the
memory spaces are not of the same size.
What I don't get is, what do you intend with j = sizeof(list[i])? Not only
does sizeof(list[i]) give you the size of a pointer (which is constant
for all list[i]), but why are you modifying the loop variable j inside the
block? If you want to leave the loop, use break. And what you are looking for is
strlen(list[i]): this will give you the length of the string.
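Putting the pointer swap, strcmp and break together, a bubble sort over the list could look roughly like this. It is only a sketch: the function name sort_strings is made up, and it assumes every list[i] is a valid null-terminated string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Bubble sort an array of n string pointers into strcmp() order.
 * Only the pointers move; the strings themselves are never copied. */
void sort_strings(char **list, size_t n)
{
    assert(list != NULL || n == 0);
    for (size_t pass = 0; pass + 1 < n; pass++) {
        bool swapped = false;
        for (size_t i = 0; i + 1 < n - pass; i++) {
            if (strcmp(list[i], list[i + 1]) > 0) {
                char *tmp = list[i];
                list[i] = list[i + 1];
                list[i + 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped)
            break; /* nothing moved in this pass: already sorted */
    }
}
```

Because only pointers are exchanged, it does not matter that the strings have different lengths, and no temp buffer is needed.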
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
char compliance[256] = {'\0'};
if(compliance == NULL)
{
printf("compliance is null \n");
return 0;
}
printf("length of compliance %zd \n",strlen(compliance));
return 0;
}
Output:
length of compliance 0
int main(int argc, char *argv[])
{
char compliance[256] = {'\0'};
memset(compliance,0,256);
if(compliance == NULL)
{
printf("compliance is null \n");
return 0;
}
printf("length of compliance %zd \n",strlen(compliance));
return 0;
}
Output
length of compliance 0
As many of you have pointed out, I wanted to use memset (instead of memcpy). But I still don't get why compliance is not NULL in the second program. In other words, how do I make it NULL?
both programs are broken;
if(compliance == NULL)
makes no sense as compliance is never NULL (it is a variable on stack)
In the second part
memcpy(compliance,0,256);
copies from source address 0 (NULL), which causes a segfault on most platforms. You probably want to use memset here.
compliance is an array, not a pointer (an array name is automatically converted to a pointer to its first element in some situations, but arrays are not pointers), so it will never be equal to a null pointer.
The segmentation fault in the second example is caused by the call to memcpy.
memcpy(compliance,0,256);
You are copying from a null pointer. Probably what you want is memset.
You probably meant memset. As written, you are copying 256 bytes from address 0.
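Since the array itself can never be NULL, what you can test is whether it holds an empty string. A small sketch of that check follows; the helper name is_empty_string is hypothetical. It also accepts a null pointer, which is only possible when the caller has a real pointer rather than an array.

```c
#include <stdbool.h>
#include <stddef.h>

/* True when s is a null pointer or points at an empty string.
 * For an array argument only the second condition can ever apply. */
bool is_empty_string(const char *s)
{
    return s == NULL || s[0] == '\0';
}
```

With char compliance[256] = {'\0'}; the call is_empty_string(compliance) is true, which is probably the condition the original if was meant to express.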
I'm writing a method to parse a string in a specific format, "55555;fhihehj;"
I have used sscanf in the past to do something similar, so I thought why not.
Here is my current code.
char toBreak[] = "55555;fjfjfhhj;";
char* strNum = malloc(256); //256 * sizeof(char) = 256
char* name = malloc(256);
if (sscanf(toBreak, "%[^;];%[^;];", strNum, name)!=2)
return -1;
printf("%s, %s\n", strNum, name);
For some reason, it isn't parsing the string correctly and I am not sure why.
Taking your compilable code and making it into an SSCCE (Short, Self-Contained, Correct Example) gives us:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char toBreak[] = "55555;fjfjfhhj;";
char *strNum = malloc(256);
char *name = malloc(256);
if (sscanf(toBreak, "%255[^;];%255[^;];", strNum, name) != 2)
return -1;
printf("%s, %s\n", strNum, name);
return 0;
}
I added the 255 to prevent buffer overflows in the general case where the input is not a constant in the program. Here, of course, the 256 byte allocations are very much larger than necessary, and overflow is not a problem. Note the 'off-by-one' on the length; the length in the format does not include the null byte, but there must be space for the null byte.
When compiled and run on Mac OS X 10.9 Mavericks with GCC 4.8.2, that gives:
55555, fjfjfhhj
What platform are you running on — o/s and compiler? What do you get with the code I show?
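To make the off-by-one between field width and buffer size concrete, here is a sketch with small fixed buffers. The names parse_record and FIELD_MAX are made up for this example, and the width 15 written in the format string has to be kept in step with FIELD_MAX by hand.

```c
#include <assert.h>
#include <stdio.h>

#define FIELD_MAX 15 /* longest field accepted, not counting the null byte */

/* Parse "number;name;" into two buffers of at least FIELD_MAX + 1 bytes each.
 * Returns 0 on success, -1 if either field could not be converted. */
int parse_record(const char *line, char *num, char *name)
{
    assert(line != NULL && num != NULL && name != NULL);
    /* %15[^;] stores at most 15 characters plus the terminating null byte. */
    if (sscanf(line, "%15[^;];%15[^;];", num, name) != 2)
        return -1;
    return 0;
}
```

Note that %[^;] needs at least one character to match, so an empty field such as ";x;" makes sscanf stop early and the function returns -1.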
The following simple code is supposed to read one wide char from stdin and echo it back to stdout, except that it dies of SIGSEGV on the iconv() call. The question is – what's wrong with the code?
#include <unistd.h> /* STDIN_FILENO */
#include <locale.h> /* LC_ALL, setlocale() */
#include <langinfo.h> /* nl_langinfo(), CODESET */
#include <wchar.h> /* wchar_t, putwchar() */
#include <iconv.h> /* iconv_t, iconv_open(), iconv(), iconv_close() */
#include <stdlib.h> /* malloc(), EXIT_SUCCESS */
int main(void) {
setlocale(LC_ALL, ""); // We initialize the locale
iconv_t converter = iconv_open("WCHAR_T", nl_langinfo(CODESET)); // We initialize a converter
wchar_t out; // We allocate memory for one wide char on stack
wchar_t* pOut = &out;
size_t outLeft = sizeof(wchar_t);
while(outLeft > 0) { // Until we've read one wide char...
char in; // We allocate memory for one byte on stack
char* pIn=&in;
size_t inLeft = 1;
if(read(STDIN_FILENO, pIn, 1) == 0) break; // We read one byte from stdin to the buffer
iconv(&converter, &pIn, &inLeft, (char**)&pOut, &outLeft); // We feed the byte to the converter
}
iconv_close(converter); // We deinitialize a converter
putwchar(out); // We echo the wide char back to stdout
return EXIT_SUCCESS;
}
UPDATE: After the following update based on @gsg's answer:
iconv(converter, &pIn, &inLeft, &pOut, &outLeft);
the code doesn't throw SIGSEGV anymore, but out == L'\n' for any non-ASCII input.
The signature of iconv is
size_t iconv(iconv_t cd,
char **inbuf, size_t *inbytesleft,
char **outbuf, size_t *outbytesleft);
But you call it with a first argument of pointer to iconv_t:
iconv(&converter, &pIn, &inLeft, (char**)&pOut, &outLeft);
Which should be
iconv(converter, &pIn, &inLeft, (char**)&pOut, &outLeft);
An interesting question is why a warning is not generated. For that, let's look at the definition in iconv.h:
/* Identifier for conversion method from one codeset to another. */
typedef void *iconv_t;
That's an... unfortunate choice.
I would program this a bit differently:
#define _XOPEN_SOURCE 500
#include <stdio.h>
#include <unistd.h>
#include <locale.h>
#include <langinfo.h>
#include <wchar.h>
#include <iconv.h>
#include <stdlib.h>
#include <err.h>
int main(void)
{
iconv_t converter;
char input[8]; /* enough space for a multibyte char */
wchar_t output[8];
char *pinput = input;
char *poutput = (char *)&output[0];
ssize_t bytes_read;
size_t error;
size_t input_bytes_left, output_bytes_left;
setlocale(LC_ALL, "");
converter = iconv_open("WCHAR_T", nl_langinfo(CODESET));
if (converter == (iconv_t)-1)
err(2, "failed to alloc conv_t");
bytes_read = read(STDIN_FILENO, input, sizeof input);
if (bytes_read <= 0)
err(2, "bad read");
input_bytes_left = bytes_read;
output_bytes_left = sizeof output;
error = iconv(converter,
&pinput, &input_bytes_left,
&poutput, &output_bytes_left);
if (error == (size_t)-1)
err(2, "failed conversion");
printf("%lc\n", output[0]);
iconv_close(converter);
return EXIT_SUCCESS;
}
I am by no means an expert, but here's an example that follows what you seem to be trying to do:
http://www.gnu.org/software/libc/manual/html_node/iconv-Examples.html
From the website:
The example also shows the problem of using wide character strings
with iconv. As explained in the description of the iconv function
above, the function always takes a pointer to a char array and the
available space is measured in bytes. In the example, the output
buffer is a wide character buffer; therefore, we use a local variable
wrptr of type char *, which is used in the iconv calls.
This looks rather innocent but can lead to problems on platforms that
have tight restriction on alignment. Therefore the caller of iconv has
to make sure that the pointers passed are suitable for access of
characters from the appropriate character set. Since, in the above
case, the input parameter to the function is a wchar_t pointer, this
is the case (unless the user violates alignment when computing the
parameter). But in other situations, especially when writing generic
functions where one does not know what type of character set one uses
and, therefore, treats text as a sequence of bytes, it might become
tricky.
Essentially, there are issues with alignment with iconv. In fact, there have been a few bugs listed regarding this very issue:
http://lists.debian.org/debian-glibc/2007/02/msg00043.html
Hope that this at least gets you started. I'd try using a char* instead of a wchar_t* for pOut, as shown in the example.
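A rough sketch of what that could look like: the output buffer is declared as wchar_t (so it is correctly aligned), and a separate char * cursor into it is what gets passed to iconv. The function name to_wide is hypothetical, and the sketch assumes the whole input converts in one call.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <wchar.h>

/* Convert inlen bytes of text in encoding fromcode into wide characters.
 * Returns the number of wchar_t written to out, or (size_t)-1 on failure. */
size_t to_wide(const char *fromcode, char *in, size_t inlen,
               wchar_t *out, size_t outcap)
{
    assert(fromcode != NULL && in != NULL && out != NULL);
    iconv_t cd = iconv_open("WCHAR_T", fromcode);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *outp = (char *)out;              /* iconv takes char *, counted in bytes */
    size_t outleft = outcap * sizeof *out;
    size_t rc = iconv(cd, &in, &inlen, &outp, &outleft); /* cd itself, not &cd */
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;
    return outcap - outleft / sizeof *out;
}
```

In the question's program, fromcode would be nl_langinfo(CODESET) after setlocale(LC_ALL, ""), and the input would be the bytes read from stdin.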
I was trying to implement getchar() function using read() in unistd.h.
Since system calls are pricey, I wanted to make as few read() calls as possible.
If I use "getchar", it works fine. However, "mygetchar" does not work in this case.
Can anyone point out what I have done wrong below?
#include <stdio.h>
#include <unistd.h>
#define BUF_SIZE 1024
int startIndex;
int endIndex;
int mygetchar(void){
char buffer[BUF_SIZE];
startIndex=0;
endIndex=0;
if(startIndex == endIndex){
int r;
r = read(0,buffer,BUF_SIZE);
startIndex=0;
endIndex=r;
}
return buffer[startIndex++];
}
int main(){
char c;
int i=0;
do{
c = mygetchar();
putchar(c);
i++;
}
while(c != EOF);
return 0;
}
Think carefully about your buffer. What happens to the buffer when the function call ends? It goes away.
This means that for 1023 out of 1024 calls, your buffer is uninitialized and your offsets are pointing into nonsensical data.
Basically you need a global variable for the buffer too:
static char buf[BUF_SIZE];
static size_t bufCur = 0;
static size_t bufEnd = 0;
int mygetchar(void)
{
// ...
}
(Note that the static is pretty much pointless when your code is all in one file. If you were to pull your mygetchar into a header and implementation file though, you would want to use a static global so as to keep it from being linkable from outside of the same compilation unit.)
(Fun fact: the 0s for bufCur and bufEnd can actually be left implicit. For clarity, I would put them, but the standard dictates that they are zero-initialized.)
As Jonathan Leffler pointed out, unless you plan on using the buffer elsewhere (and I don't know where that would be), there's no need for a global. You can just use a static variable inside of the function:
int mygetchar(void)
{
    static char buf[BUF_SIZE];
    static size_t bufCur = 0;
    static size_t bufEnd = 0;
    // ...
}
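Filling in the elided body, one way it could go is sketched below. The split into a buffered_getc helper that takes a file descriptor is my own choice for illustration and not from the original code. Because the buffer is static and shared, the sketch assumes a single descriptor and a single thread.

```c
#include <assert.h>
#include <stdio.h>   /* EOF */
#include <unistd.h>  /* read(), ssize_t, STDIN_FILENO */

#define BUF_SIZE 1024

/* Return the next byte from fd as an unsigned char value, or EOF at end of
 * input or on a read error. The static buffer survives between calls. */
int buffered_getc(int fd)
{
    static unsigned char buf[BUF_SIZE];
    static size_t cur = 0;
    static size_t end = 0;

    assert(fd >= 0);
    if (cur == end) {                      /* buffer drained: refill it */
        ssize_t r = read(fd, buf, sizeof buf);
        if (r <= 0)
            return EOF;
        cur = 0;
        end = (size_t)r;
    }
    return buf[cur++];
}

int mygetchar(void)
{
    return buffered_getc(STDIN_FILENO);
}
```

The caller should store the result in an int, not a char, before comparing it with EOF; otherwise a byte such as 0xFF can be mistaken for end of input.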