bzero() & bcopy() versus memset() & memcpy() - c

Is there any reason to use the non-standard bzero() and bcopy() instead of memset() and memcpy() in a Linux environment? I've heard many say that they're better for Linux compilers, but haven't seen any advantages over the standard functions.
Are they more optimized than the standard ones, or do they have any behavioral particularity for which they're preferred?

While bzero and bcopy functions aren't ISO C (the actual standard that I assume you're talking about when referring to them as non-standard), they were a POSIX standard thing, although they pre-dated both ISO and POSIX.
And note the use of the word "were": these functions were deprecated in POSIX.1-2001 and finally removed in POSIX.1-2008, in deference to memset, memcpy and memmove. So you're better off using the standard C functions where possible.
If you have a lot of code that uses them and you don't want to have to go and change it all (though you probably should at some point), you can use the following quick substitutions:
// void bzero(void *s, size_t n);
#define bzero(s, n) memset((s), 0, (n))
// void bcopy(const void *s1, void *s2, size_t n);
#define bcopy(s1, s2, n) memmove((s2), (s1), (n))
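If you would rather have type checking than macros, the same substitutions can be sketched as inline functions. The names compat_bzero and compat_bcopy are made up for this illustration and are not part of any library; the point is mainly to show the swapped argument order and the use of memmove.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical shim names, not provided by any standard or libc. */

/* Zero n bytes starting at s, like the old bzero(). */
static inline void compat_bzero(void *s, size_t n)
{
    memset(s, 0, n);
}

/* BSD argument order: source first, destination second. */
static inline void compat_bcopy(const void *src, void *dest, size_t n)
{
    /* memmove rather than memcpy, because bcopy allowed overlapping areas. */
    memmove(dest, src, n);
}
```

Calling compat_bcopy(buf, buf + 2, 4) on a buffer holding "abcdef" is well defined and leaves "ababcd" in the buffer; the same call expressed with memcpy would be undefined because the areas overlap.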

#include <strings.h>
void bcopy(const void *src, void *dest, size_t n);
Description
The bcopy() function copies n bytes from src to dest. The result is correct, even when both areas overlap.
Conforming to:
4.3BSD. The b prefix apparently comes from BSD, and the function appears to be deprecated.
This means bcopy is analogous to memmove(), not memcpy(), as R.. said in his comment.
Note: strings.h is also distinct from string.h.

Actually, nowadays it could be the other way around. Because memcpy and memset are part of the standard, the compiler is allowed to assume that a function with that name does exactly what the standard prescribes. This means the compiler can replace the call with the most efficient way of performing the operation that it can figure out. For bcopy and bzero, on the other hand, the standard does not prescribe any behavior, so the compiler can't assume anything, which means it would need to issue an actual function call.
However, GCC, for example, knows about bcopy and bzero if it's built for an OS that has them.

Related

Is it valid to compare 2 strings using type punning?

I had just discovered the mindf*** that is type-punning when learning C and while experimenting I ran this code:
char* str="abc";
void* n=(void*)str;
uint32_t str_in_int=*(uint32_t*)n;
printf("%u", str_in_int);
which obviously gave out a uint32_t integer.
Since I thought that this is a pointer operation, if the addresses would be different it would give a different result, but each time I ran it it gave the same result. I also stored a duplicate value in another variable and compared it with the original (in case there were some addressing shenanigans going on under the hood) and it still came out the same. The code:
char* str="abc";
char* str2="abc";
void* n=(void*)str;
void* n2=(void*)str2;
uint32_t str_in_int=*(uint32_t*)n;
uint32_t str_in_int2=*(uint32_t*)n2;
printf("%u %u", str_in_int,str_in_int2);
Is this a viable form of string comparison in case of smaller strings as an alternative to strcmp or comparing character by character? Also an example where the resulting uint is the same for different strings is also welcome if it exists.
It is undefined behaviour because you break the strict aliasing rules.
The correct way of doing it:
char *str = "abc";
uint32_t x;
memcpy(&x, str, sizeof(x));
printf("%"PRIu32"\n", x);
Most optimizing compilers will not call memcpy and the performance will be the same as using dangerous pointer punning.
https://godbolt.org/z/fnrn7b9jK
Is this a viable form of string comparison in case of smaller strings as an alternative to strcmp or comparing character by character?
Yes and no. Implementations of strcmp and memcmp may use techniques like this to compare multiple bytes at once. However, because an implementation of the C standard library is coordinated with the compiler, the code in the library implementation may use things that are not completely defined by the C standard, because they are defined by the compiler. Further, the library will be written to respect alignment requirements and memory mapping issues.
When similar code is written in an ordinary program, the semantics that are not fully defined by the C standard may be changed by the compiler, particularly when high optimization is requested. Most particularly, if an object is defined as an array of char but you use it as a uint32_t, the behavior is not defined by the C standard. Sometimes these issues can be worked around, as the library implementors do, but doing so requires a good knowledge of the C standard and the particular features of the compiler being used.
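To make the trade-off concrete, here is a minimal sketch of a well-defined variant of the idea. The helper pack4 is a made-up name: it copies at most four characters into a zero-padded byte array and then uses memcpy to build the integer, so there is no aliasing or alignment problem and no read past the end of a short string.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: pack the first (up to) 4 characters of s into a
 * uint32_t, padding with zero bytes if the string is shorter. */
static uint32_t pack4(const char *s)
{
    unsigned char bytes[4] = {0, 0, 0, 0};
    size_t i;
    uint32_t v;

    /* Stop at the terminator so we never read beyond the string. */
    for (i = 0; i < sizeof bytes && s[i] != '\0'; i++)
        bytes[i] = (unsigned char)s[i];

    /* memcpy is the defined way to reinterpret the bytes. */
    memcpy(&v, bytes, sizeof v);
    return v;
}
```

This also answers the request for a collision: only the first four bytes take part, so pack4("abcd1") and pack4("abcd2") are equal even though the strings differ. The comparison is therefore only an equality test for strings of at most four characters, and the numeric value itself depends on the machine's byte order.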

Is there a way to check if <string.h> is included?

I'm creating a header only library and I wanted to check if the user has defined <string.h> so that I can use memcpy. I read online about how libraries like stdio have guard macros, but I couldn't find one for string.h. Any ideas? Or is there a way just to see if memcpy is a function?
You can portably tell if string.h has not been included.
Per 7.24.1 String function conventions, paragraph 1 of the (draft) C11 standard:
The header <string.h> declares one type and several functions, and defines one macro useful for manipulating arrays of character type and other objects treated as arrays of character type. The type is size_t and the macro is NULL ...
If NULL is not defined, then the user could not have included string.h prior to including your header(s).
I see no portable way of definitively determining if string.h has been included.
If you need <string.h>, include it yourself. You can also forward-declare memcpy and use it without including <string.h> (size_t must still be declared, for example via <stddef.h>):
void* memcpy( void *restrict dest, const void *restrict src, size_t count );
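As a rough illustration of the "include it yourself" option, a header-only library can pull in <string.h> inside its own include guard; standard headers are safe to include more than once, so it does not matter whether the user included it first. The file name mylib.h and the function mylib_copy are invented for this sketch.

```c
/* mylib.h: hypothetical header-only library */
#ifndef MYLIB_H
#define MYLIB_H

#include <stddef.h>
#include <string.h>   /* harmless if the user has already included it */

/* Thin example function that relies on memcpy being declared. */
static inline void *mylib_copy(void *dest, const void *src, size_t n)
{
    return memcpy(dest, src, n);
}

#endif /* MYLIB_H */
```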

Which is most standard: strnlen or strnlen_s?

In my current project, I am coding according to the C11 standard (building with gcc -std=c11) and needed something like strnlen (a "safe" version of strlen which returns the length of a 0-terminated string, but only up to a given maximum). So I looked it up (e.g. https://en.cppreference.com/w/c/string/byte/strlen) and it seems the C11 standard mentions such a function, but with the name strnlen_s.
Hence I went with strnlen_s, but this turned out to be undefined when including string.h. On the other hand, strnlen is defined, so my current solution is to use strnlen with a remark that the standard name seems to be strnlen_s but that this is not defined by GCC.
The question is: am I correct to assume that strnlen is the most portable name to use or what could I do for the code to be most portable/standard?
Note: Microsoft (https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/strnlen-strnlen-s) implements both functions with the distinction that strnlen_s checks if the string pointer is NULL and returns 0 in that case while strnlen has no such check.
The question is: am I correct to assume that strnlen is the most portable name to use or what could I do for the code to be most portable/standard?
No, it isn't portable at all. It was never part of C. It is included in POSIX, which doesn't mean much.
I would imagine the reason the function doesn't exist in the standard is that it's superfluous when we already have memchr(str, '\0', max);.
strnlen_s is part of the optional bounds-checking interface in C11 Annex K. This whole chapter turned out to be a huge fiasco and barely any compiler implements it. Microsoft has similarly named functions, but they are sometimes not compatible. So I would assume that all _s functions are completely non-portable.
So use neither of these, use memchr or strlen.
EDIT
In case you must implement strnlen yourself for some reason, then this is what I'd recommend:
#include <string.h>
size_t strnlength (const char* s, size_t n)
{
    const char* found = memchr(s, '\0', n);
    return found ? (size_t)(found-s) : n;
}
strnlen_s() is specified in Annex K of the C Standard starting at version C11. This Annex is not widely implemented and even Microsoft's implementation is not fully conformant with the specified version. The semantics are contorted especially regarding error handling. I would recommend not using it.
strnlen() is a simple function specified in POSIX.1-2008 and available on many platforms. It is easy to implement on platforms that do not provide it:
#include <string.h>
size_t strnlen(const char *s, size_t n) {
    size_t i;
    for (i = 0; i < n && s[i] != '\0'; i++)
        continue;
    return i;
}
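A typical place where a bounded length is needed is a fixed-width field that may not be terminated. The following is a minimal sketch under that assumption; bounded_len and field_to_cstr are hypothetical names chosen so they cannot clash with a strnlen supplied by the C library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name: length of s, looking at no more than n bytes. */
static size_t bounded_len(const char *s, size_t n)
{
    const char *end = memchr(s, '\0', n);
    return end ? (size_t)(end - s) : n;
}

/* Copy a possibly unterminated field into out, always terminating it.
 * Returns the number of characters copied (excluding the terminator). */
static size_t field_to_cstr(char *out, size_t out_size,
                            const char *field, size_t field_size)
{
    size_t len;

    if (out_size == 0)
        return 0;

    len = bounded_len(field, field_size);
    if (len > out_size - 1)
        len = out_size - 1;   /* truncate to leave room for '\0' */

    memcpy(out, field, len);
    out[len] = '\0';
    return len;
}
```

With a four-byte field holding 'a', 'b', 'c', 'd' and no terminator, field_to_cstr produces the string "abcd", whereas strlen on the same field would read past its end.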
The question is: am I correct to assume that strnlen is the most portable name to use or what could I do for the code to be most portable/standard?
For C, strnlen is OK as the name is not reserved. It is not part of the standard, so OK for you to add.
POSIX reserves str...(), so you might want to use another name.
strnlen_s collides with K.3.7.4.4 The strnlen_s function and has a controversial history that you might not want your code tied into. Avoid naming your function strnlen_s().
I would avoid name collisions with common libraries by giving any function one adds two names: a formal, less-likely-to-collide name and a macro:
size_t nielsen_strnlen(const char *s, size_t maxsize);
#define slength nielsen_strnlen
Or simply go directly with something less likely to collide.
size_t nstrnlen(const char *s, size_t maxsize);
Deeper: OP appears to want to use a popular function that is outside the standard C library (or current version), but might be available when code is ported to other systems. OP wants to provide a use-my-code-if-not-available function.
Careful where you tread.
I would use a macro (or a wrapper function)
#if ON_SYSTEM_WITH_strnlen
#define slength strnlen
#else
#define slength nielsen_strnlen
#endif
... and then use calls to slength().
Problems come up when OP's version of the code does not behave exactly like the desired function (today and tomorrow) or, because the function is not standard, when implementations differ slightly. To mitigate this, consider a macro or function wrapper indirection.
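Put together, the indirection might look like the sketch below. It assumes the build system defines HAVE_STRNLEN when the platform provides strnlen (the convention used by autoconf-style checks); that macro is an assumption of this example, and on some systems a feature-test macro such as _POSIX_C_SOURCE may also be needed for the declaration to be visible.

```c
#include <stddef.h>
#include <string.h>

#if defined(HAVE_STRNLEN)   /* assumed to be set by the build system */
#define slength strnlen
#else
/* Fallback with a name that is unlikely to collide with any library. */
static size_t nielsen_strnlen(const char *s, size_t maxsize)
{
    size_t i = 0;
    while (i < maxsize && s[i] != '\0')
        i++;
    return i;
}
#define slength nielsen_strnlen
#endif
```

The rest of the code then calls only slength(), so switching between the system function and the fallback is a one-line build decision.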
Side issue: Parameter order and a potential new principle to the "original principles" of C.
size_t foo1(const char *s, size_t maxsize);
// arranged such that the size of an array appears before the array.
size_t foo2(size_t maxsize, const char *s);
size_t foo3(size_t maxsize, const char s[maxsize]);
string is the C++ header and string.h is the C header (at least with gcc). strlen_s (afaik) is a Microsoft extension to the C library. You're right, strlen would be the more standard. You could also use memchr if you need a byte count. To @Basile's point, if you need a count of characters you need something that is UTF-8 aware.

Dev C++ strtok_s throws [Warning] assignment makes pointer from integer without a cast

I have the following program:
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
    char *tp = NULL, *cp = NULL, *next_token = NULL;
    char TokenListe[] = "Hello,I Am,1";
    tp = strtok_s(TokenListe, ", ", &next_token);
    printf(tp);
    return 0;
}
When I compile it with Visual Studio 2015 it compiles, without any warning.
But when I compile it with Dev C++ 5.11 I get the following warning in line 10:
[Warning] assignment makes pointer from integer without a cast
Is there any solution to fix that warning?
Since C11, strtok_s is now standard C, part of the optional "bounds-checking interface" (Annex K). Compilers need not support it.
But if they do, the format is this (C11 K.3.7.3.1):
#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>
char *strtok_s(char * restrict s1,
rsize_t * restrict s1max,
const char * restrict s2,
char ** restrict ptr);
Any other format is non-standard garbage and should not be used, including Microsoft strtok_s.
Dev C++ is no longer maintained and therefore only contains a very old version of gcc. It does not support C11, but to my knowledge, no newer version of gcc + libraries yet support the C11 bounds-checking interface either. Visual Studio is a non-conforming compiler and can't be used for compiling standard C. Generally, I would advise to use neither of these compilers, but to update to a new version of gcc (for example Codeblocks with Mingw).
Summary: strtok_s cannot be used in sensible ways. Use strtok instead. Simply ensure that all buffers involved are large enough and can't be overrun. In case of a multi-threaded program, simply don't use strtok at all.
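If neither strtok_s nor strtok's hidden static state is acceptable, one option is a small reentrant tokenizer built only from strspn and strcspn, both of which are standard C. The function name tok_next is invented for this sketch, and it mirrors the three-argument calling style used in the question rather than the Annex K signature.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reentrant tokenizer. Pass the string on the first call and
 * NULL afterwards; *save carries the position between calls.
 * Like strtok, it modifies the input buffer. */
static char *tok_next(char *str, const char *delims, char **save)
{
    char *start = str ? str : *save;
    char *end;

    if (start == NULL)
        return NULL;

    start += strspn(start, delims);      /* skip leading delimiters */
    if (*start == '\0') {
        *save = NULL;                    /* nothing left */
        return NULL;
    }

    end = start + strcspn(start, delims); /* find end of the token */
    if (*end != '\0') {
        *end = '\0';                     /* terminate the token */
        *save = end + 1;
    } else {
        *save = NULL;                    /* this was the last token */
    }
    return start;
}
```

Applied to "Hello,I Am,1" with the delimiters ", ", successive calls return "Hello", "I", "Am" and "1", followed by NULL.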
If Dev C++ doesn't have the non-standard strtok_s, in C it will be implicitly declared, and assumed to return integer.
Note: strtok_s is in the standard, but as an "optional extension", according to (my free draft copy of the) C11 standard.
You should enable other warnings too, such as the warning for implicit declarations of functions.
If Dev C++ does contain an implementation of strtok_s, and links with it, declaring it yourself might work. But a better option is to find the right header file, or compiler flags, to get it declared, if any such options exist. Consult the documentation.
But note, as Michael Walz commented, that the strtok_s in the C11 standard and Microsoft's strtok_s are different, and don't have the same parameters! I don't know which version Dev C++ implements.
Based on the answer from @thomas-padron-mccarthy, I could fix my problem by declaring the strtok_s function in my header file.
extern char* strtok_s(char*, char*, char**);

posix_memalign, malloc and calloc have problems with lli interpreter

I use polybench kernels. In polybench.c, code has a line as follows:
int ret = posix_memalign (&new, 32, num);
This line causes a problem with the lli interpreter. I tried to use malloc instead, but I get the same error:
LLVM ERROR: Tried to execute an unknown external function: posix_memalign
Is there any other function that could be used without having this problem?
You will not be surprised to hear that posix_memalign() is standardized as part of POSIX, not part of standard C. As such, providing that function is not a requirement on conforming C implementations. On the other hand, as part of POSIX, it is widely available.
malloc() promises to return a pointer to memory aligned properly for an object of any type. I'm not sure why you want to ensure an even stronger alignment requirement, but your next best bet for doing so is the aligned_alloc() function, which is standard C since C2011. If your C library conforms to C2011, then you can replace your posix_memalign() call with
#include <stdlib.h>
#include <errno.h>
// ...
new = aligned_alloc(32, num);
int ret = (new ? 0 : errno);
If you don't have aligned_alloc(), either, then your implementation may provide other alternatives, but none of them are standard.
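One difference from posix_memalign() is worth keeping in mind: C11 requires the size passed to aligned_alloc() to be an integral multiple of the alignment, and ISO C does not require the function to set errno on failure. A minimal sketch that rounds the size up before allocating could look like this; alloc_aligned is a hypothetical wrapper name, and it assumes a C library that actually provides aligned_alloc().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocate at least size bytes aligned to alignment.
 * Returns NULL on failure; release the memory with free(). */
static void *alloc_aligned(size_t alignment, size_t size)
{
    size_t rounded;

    /* The alignment must be a nonzero power of two. */
    if (alignment == 0 || (alignment & (alignment - 1)) != 0)
        return NULL;

    /* Guard against overflow when rounding up. */
    if (size > SIZE_MAX - (alignment - 1))
        return NULL;

    /* Round up to the next multiple of alignment, as C11 requires. */
    rounded = (size + alignment - 1) / alignment * alignment;
    if (rounded == 0)
        rounded = alignment;

    return aligned_alloc(alignment, rounded);
}
```

A call such as alloc_aligned(32, num) then stands in for the posix_memalign() line, with failure reported through the NULL return instead of an error code.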
