I'm using the following function on two different computers. One computer is running Ubuntu and the other OS X. The function works on OS X, but not on Ubuntu.
#include <stdio.h>
#define MAXBUF 256
char *safe_strncat(char *dest, const char *src, size_t n) {
snprintf(dest, n, "%s%s", dest, src);
return dest;
}
int main(int argc, const char * argv[]){
char st1[MAXBUF+1] = "abc";
char st2[MAXBUF+1] = "def";
char* st3;
printf("%s + %s = ",st1, st2);
st3 = safe_strncat(st1, st2, MAXBUF);
printf("%s\n",st3);
printf("original string = %s\n",st1);
}
Compile and run on Ubuntu
gcc concat_test.c -o concat_test
./concat_test
abc + def = def
original string = def
Compile and run in Xcode in OS X
abc + def = abcdef
original string = abcdef
Why does this work on mac and not on Ubuntu?
Should it work on Ubuntu?
Should it work on Mac?
I could swear it used to work in Ubuntu until recently, but I don't know what would have changed to make it stop working?
Could compiler settings have anything to do with this working or not?
Your code invokes undefined behavior because you pass the destination buffer as one of the source strings of your snprintf() format. This is not supported:
7.21.6.5 The snprintf function
Synopsis
#include <stdio.h>
int snprintf(char * restrict s, size_t n,
const char * restrict format, ...);
Description
The snprintf function is equivalent to fprintf, except that the output is written into an array (specified by argument s) rather than to a stream. If n is zero, nothing is written, and s may be a null pointer. Otherwise, output characters beyond the n-1st are discarded rather than being written to the array, and a null character is written at the end of the characters actually written into the array. If copying takes place between objects that overlap, the behavior is undefined.
(emphasis mine).
The implementation of snprintf differs between Ubuntu (glibc) and OS X (Apple libc, based on BSD sources). Because the behavior is undefined, neither result can be relied upon.
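If you still want to build the append on snprintf(), one way to avoid the overlap is to start writing at the existing terminator, so the destination is never read as an input. The following is only a sketch: append_str is a hypothetical name, and it assumes src does not point into dest.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: appends src to the string in dest (capacity: size bytes).
   Returns 0 on success, -1 if dest has no terminator or the result was truncated. */
int append_str(char *dest, size_t size, const char *src) {
    assert(dest != NULL && src != NULL);
    const char *end = memchr(dest, '\0', size);
    if (end == NULL)
        return -1;                      /* no terminator within size bytes */
    size_t used = (size_t)(end - dest);
    /* Output starts at the old terminator, so dest is never passed as an
       input string (assumption: src does not point into dest). */
    int n = snprintf(dest + used, size - used, "%s", src);
    if (n < 0 || (size_t)n >= size - used)
        return -1;                      /* encoding error or truncation */
    return 0;
}
```

With char st1[MAXBUF+1] = "abc", calling append_str(st1, sizeof st1, "def") leaves "abcdef" in st1 on both platforms, because no overlapping copy takes place.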
You can implement safe_strcat() this way:
#include <string.h>
char *safe_strcat(char *dest, size_t size, const char *src) {
char *p = memchr(dest, '\0', size);
if (p != NULL) {
strncat(p, src, size - (p - dest) - 1);
}
return dest;
}
Notes:
do not call this function safe_strncat(), it is really a safe version of strcat().
pass the size of the destination array after the destination pointer itself, not after the source pointer.
returning the destination pointer does not allow the caller to detect truncation. You could instead return the length the result would have had if the destination array had been large enough, like snprintf(), but that would still not tell the caller if the destination was not null terminated before the call (for safe_strcat and safe_strncat).
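As an illustration of the last note, here is a sketch of a variant that returns a length instead of the pointer. The name safe_strcat_len and the (size_t)-1 sentinel for an unterminated destination are assumptions made for this example, not part of any standard API.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical variant: returns the length the result would have had with
   unlimited room, or (size_t)-1 if dest holds no terminator within size bytes. */
size_t safe_strcat_len(char *dest, size_t size, const char *src) {
    assert(dest != NULL && src != NULL);
    char *p = memchr(dest, '\0', size);
    if (p == NULL)
        return (size_t)-1;
    size_t dest_len = (size_t)(p - dest);
    size_t src_len = strlen(src);
    size_t room = size - dest_len - 1;   /* bytes available before the terminator */
    size_t n = src_len < room ? src_len : room;
    memcpy(p, src, n);
    p[n] = '\0';
    return dest_len + src_len;           /* a value >= size means truncation */
}
```

The caller compares the return value with the size of the destination array: anything greater than or equal to it means the result was cut short.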
You can use the same model for safe versions of strcpy, strncat and strncpy (but not implementing strncpy()'s counter-intuitive semantics):
char *safe_strcpy(char *dest, size_t size, const char *src) {
if (size > 0) {
*dest = '\0';
strncat(dest, src, size - 1);
}
return dest;
}
char *safe_strncat(char *dest, size_t size, const char *src, size_t n) {
char *p = memchr(dest, '\0', size);
if (p != NULL) {
if (n > size - (p - dest) - 1)
n = size - (p - dest) - 1;
strncat(p, src, n);
}
return dest;
}
char *safe_strncpy(char *dest, size_t size, const char *src, size_t n) {
if (size > 0) {
if (n > size - 1)
n = size - 1;
*dest = '\0';
strncat(dest, src, n);
}
return dest;
}
snprintf(dest, n, "%s%s", dest, src);
This line causes undefined behaviour because dest is both the output buffer and one of the input strings, so snprintf() overwrites a buffer it is still reading from. Because the behaviour is undefined, it's pointless to work out why it works on one machine and not the other.
More details can be found here: Is sprintf(buffer, "%s […]", buffer, […]) safe?
I am writing a re-implementation of strlcat as an exercise. I have performed several tests and they produce similar results. However, in one particular case my function gives a segmentation fault while the original does not. Could you explain to me why? I am not allowed to use any standard library functions, which is why I have re-implemented strlen().
Here is the code I have written :
#include <stdio.h>
#include <string.h>
int ft_strlen(char *s)
{
int i;
i = 0;
while (s[i] != '\0')
i++;
return (i);
}
unsigned int ft_strlcat(char *dest, char *src, unsigned int size)
{
size_t i;
int d_len;
int s_len;
i = 0;
d_len = ft_strlen(dest);
s_len = ft_strlen(src);
if (!src || !*src)
return (d_len);
while ((src[i] && (i < (size - d_len - 1))))
{
dest[i + d_len] = src[i];
i++;
}
dest[i + d_len] = '\0';
return (d_len + s_len);
}
int main(void)
{
char s1[5] = "Hello";
char s2[] = " World!";
printf("ft_strcat :: %s :: %u :: sizeof %lu\n", s1, ft_strlcat(s1, s2, sizeof(s1)), sizeof(s1));
// printf("strlcat :: %s :: %lu :: sizeof %lu\n", s1, strlcat(s1, s2, sizeof(s1)), sizeof(s1));
}
The output using strlcat is : strlcat :: Hello World! :: 12 :: sizeof 5. I am on macOS and I am using clang to compile if that can be of some help.
ft_strlcat() is not so bad, but it expects pointers to strings. main() is troublesome: s1 lacks a null character: so s1 is not a string.
//char s1[5] = "Hello";
char s1[] = "Hello"; // Use a string
s1[] too small for the concatenated string "Hello World!"
char s1[13 /* or more */] = "Hello"; // 12 characters plus the null character
"%lu" matches unsigned long. size_t from sizeof matches "%zu".
Some ft_strlcat() issues:
unsigned, int vs. size_t
unsigned, int too narrow for long strings. Use size_t to handle all strings.
Test too late
if (!src || ...) is too late as prior ft_strlen(src); invokes UB when src == NULL.
const
ft_strlcat() should use a pointer to const to allow usage with const strings with src.
Advanced: restrict
Use restrict so the compiler can assume dest, src do not overlap and emit more efficient code - assuming they should not overlap.
Corner cases
It does not handle some pesky corner cases like when d_len >= size, but I will leave that detailed analysis for later.
Suggested signature
// unsigned int ft_strlcat(char *dest, char *src, unsigned int size)
size_t ft_strlcat(char * restrict dest, const char * restrict src, size_t size)
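For the corner case where d_len >= size, here is a sketch that follows the documented BSD strlcat() contract: the return value is the length of dest (capped at size) plus the length of src. The name ft_strlcat_sketch is made up for this example, and no library string functions are used, to match the exercise's constraint.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch only: appends src to dest without writing more than size bytes in total. */
size_t ft_strlcat_sketch(char *restrict dest, const char *restrict src, size_t size) {
    assert(dest != NULL && src != NULL);
    size_t d_len = 0;
    while (d_len < size && dest[d_len] != '\0')
        d_len++;                          /* never read past size bytes of dest */
    size_t s_len = 0;
    while (src[s_len] != '\0')
        s_len++;
    if (d_len == size)
        return size + s_len;              /* no terminator found: append nothing */
    size_t i = 0;
    while (i < s_len && d_len + i + 1 < size) {
        dest[d_len + i] = src[i];         /* always leaves room for the terminator */
        i++;
    }
    dest[d_len + i] = '\0';
    return d_len + s_len;                 /* a value >= size means truncation */
}
```

With char s1[5] holding "Hello" without a terminator, this returns 5 plus the length of src and leaves s1 untouched, instead of walking off the end of the array.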
Some untested code for your consideration:
Tries to mimic strlcat().
Returns sum of string lengths, but not more than size.
Does not examine more than size characters to prevent reading out of bounds.
Does not append a null character when not enough room.
Does not check for dst, src as NULL. Add if you like.
Does not handle overlapping dest, src. To do so is tricky unless library routines available.
Use unsigned char * pointer to properly handle rare signed non-2's complement char.
size_t my_strlcat(char * restrict dst, const char * restrict src, size_t size) {
const size_t size_org = size;
// Walk dst
unsigned char *d = (unsigned char*) dst;
while (size > 0 && *d) {
d++;
size--;
}
if (size == 0) {
return size_org;
}
// Copy src to dst
const unsigned char *s = (const unsigned char*) src;
while (size > 0 && *s) {
*d++ = *s++;
size--;
}
if (size == 0) {
return size_org;
}
*d = '\0';
return (size_t) (d - (unsigned char*) dst);
}
If the return value is less than size, success!
s1 is not even long enough to accommodate the "Hello"
Use the correct type for sizes.
size_t ft_strlcat(char *dest, const char *src, size_t len)
{
char *savedDest = dest;
if(dest && src && len)
{
while(*dest && len)
{
len--;
dest++;
}
if(len)
{
while(len && (*dest = *src)) // test len first so nothing is written past the buffer
{
len--;
dest++;
src++;
}
}
}
if(!len) dest[-1] = 0;
}
return dest ? dest - savedDest : 0;
}
Also, the order in which function arguments are evaluated is unspecified, so it is better not to call ft_strlcat() inside the printf() argument list. It should be:
int main(void)
{
char s1[5] = "Hello"; //will only work for len <= sizeof(s1) as s1 is not null character terminated
char s2[] = " World!";
size_t result = ft_strlcat(s1, s2, sizeof(s1));
printf("ft_strcat :: %s :: %zu :: sizeof %zu\n", s1, result, sizeof(s1));
}
https://godbolt.org/z/8hhbKjsbx
My task is like this: I should implement the strcpy function under the following constraints:
The function should use pointer expression (*(d+i))
I should implement it without using <string.h>
I'm programming in Visual Studio 2019.
I searched Google for some source code and ran it, but my program has a logical error. The program ends right away, every time. I don't know what I'm doing wrong.
Here's my code in Visual Studio 2019 on Windows. Please tell me what's wrong.
#include <stdio.h>
void strcpy(char*, char*);
int main()
{
char* sen1 = "Hello";
char* sen2 = "Friends";
strcpy(sen1, sen2);
printf("The result: %s\n", sen1);
return 0;
}
void strcpy(char* str1, char* str2)
{
int i = 0;
while (*(str2 + i) != '\0')
{
*(str1 + i) = *(str2 + i);
i++;
}
*(str1 + i) = '\0';
}
In addition to needing to provide writable storage for sen1, you should also check to ensure str2 != NULL in your function before dereferencing str2 (otherwise, even if you fix all other errors -- a segfault will likely result)
For example, in your code you can define a constant to use in setting the size of a sen1 array (or you can allocate storage with malloc(), calloc(), or realloc() -- save that for later). Using an array you can do, e.g.
#include <stdio.h>
#include <stdlib.h>
#define MAXC 64 /* if you need a constant, #define one (or more) */
...
int main (void)
{
char sen1[MAXC] = "Hello";
char *sen2 = "Friends";
mystrcpy (sen1, sen2);
printf ("The result: %s\n", sen1);
}
In your strcpy function, check that str2 isn't NULL before using str2 in your function, e.g.
char *mystrcpy (char *dest, const char *src)
{
char *p = dest;
if (!src || !dest) { /* ensure src or dest is not NULL */
fputs ("error: src or dest parameters NULL in mystrcpy().\n", stderr);
exit (EXIT_FAILURE);
}
do /* loop */
*p++ = *src; /* copy each char in src to dest */
while (*src++); /* (including the nul-terminating char) */
return dest; /* return pointer to dest */
}
Now you will copy your source string to your destination string in your (renamed) mystrcpy() function, receiving the results you expect:
Example Use/Output
$ ./bin/mystrcpy
The result: Friends
Look things over and let me know if you have further questions.
Two problems, at least:
String literals are not writable in C. Often the symptom is a crash (SIGSEGV).
You are not allowed to use the identifier strcpy for your own function. Use another name.
Three clean code issues, at least:
Turn int main() into int main(void) to make it properly typed.
str1 and str2 are too generic names. They don't indicate which is the source and which is the destination pointer. What about my_strcpy(char *dest, char *src)?
I'd use size_t i for the index counter instead of int, because that's the type all the string length functions and the sizeof operator return. It's also an unsigned type and can copy really long strings :-) The size_t is available after #include <stddef.h>.
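Putting these points together, a minimal sketch could look like the following. It keeps the *(d + i) pointer expressions the assignment asks for; the name my_strcpy and the use of assert() for NULL arguments are choices made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Copies the string at src, including its terminator, into dest.
   The caller must supply a writable dest that is large enough. */
char *my_strcpy(char *dest, const char *src) {
    assert(dest != NULL && src != NULL);
    size_t i = 0;
    while (*(src + i) != '\0') {
        *(dest + i) = *(src + i);
        i++;
    }
    *(dest + i) = '\0';
    return dest;
}
```

Called with a writable array such as char sen1[16] = "Hello", my_strcpy(sen1, "Friends") leaves "Friends" in sen1.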
You want this:
...
char* source = "Hello";
// or char source[] = "Hello";
char destination[1000]; // destination buffer long enough for playing around
my_strcpy(destination, source);
printf("%s\n", destination); // should print "Hello" if my_strcpy is correct
...
For the rest read Jens's answer.
Among the other good answers, and looking only at the implementation of your strcpy function rather than a detailed analysis of the issues in your actual code, another approach is this:
char * n_strcpy(char * dest, char const * src)
{
if (dest == NULL || src == NULL)
{
return NULL;
}
char *ptr = dest;
while ((*dest++ = *src++));
return ptr;
}
I am trying to replicate the memcpy function. When I pass NULL for both pointer parameters with a non-zero size (5, for example), the original function aborts, but my version writes random characters.
void *ft_memcpy(void *dst, const void *src, size_t n)
{
size_t i;
char *d;
char *s;
s = (char*)src;
d = (char*)dst;
i = 0;
while (i < n)
{
d[i] = s[i];
i++;
}
i = 0;
return (dst);
}
int main()
{
char dst[0];
char src[0];
size_t n = 5;
printf("%s", ft_memcpy(dst, src, n));
printf("%s\n", memcpy(dst, src, n));
return (0);
}
src and dst have size 0, which is one way of specifying flexible arrays in C. You usually only define them inside structures that are going to be dynamically allocated, for example:
struct buffer {
size_t len;
char bytes[0];
};
#define NBYTES 8
struct buffer* ptr = malloc(sizeof(struct buffer) + NBYTES * sizeof(char));
const char* src = "hello!";
ptr->len = strlen(src);
memcpy(ptr->bytes, src, ptr->len);
Basically, indexing any of those arrays in your example will end up in a buffer overflow (you are accessing beyond the limits of the array).
The difference between this and passing NULL as parameters is that src and dst point to valid memory (main function stack). In C a buffer overflow has no defined behaviour (undefined behaviour), so the compiler is free to do what it wants. If you use a memory sanitizer (compile with -fsanitize=address) it will warn you about this problem and ask you to fix the error.
I recommend using a debugger or adding the following print statement to your copy function:
printf("%s: src: %p, dst: %p\n", __func__, src, dst);
See Array of zero length
Update: since you asked how to generate the abort error, the easiest and most convenient way for this scenario is using assertions.
#include <assert.h>
void function(void *addr) {
assert(addr != NULL);
}
will abort the execution if the condition addr != NULL evaluates to false.
Assertions are very useful to expose the conditions that you assume will always hold and that you don't want to pay the cost of checking when you build the code for production, since these checks may have a performance impact. You can disable them by compiling with the flag -DNDEBUG.
See also: When should we use asserts in C?
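The two points above (give the buffers real storage, and assert on the pointers) can be combined in a short sketch. The name copy_bytes is hypothetical; the assertion only fires when NDEBUG is not defined.

```c
#include <assert.h>
#include <stddef.h>

/* Byte-wise copy that aborts on NULL pointers (unless built with -DNDEBUG).
   Both buffers must hold at least n bytes and must not overlap. */
void *copy_bytes(void *dst, const void *src, size_t n) {
    assert(n == 0 || (dst != NULL && src != NULL));
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}
```

A caller would declare both arrays with enough room, for example char src[6] = "abcde" and char dst[6], and pass sizeof src as n.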
Another way is making the program to simply abort:
#include <stdlib.h>
void function(void *addr) {
if(addr == NULL) abort();
}
or to set errno variable to EINVAL:
#include <errno.h>
void function(void *addr) {
if (addr == NULL) {
errno = EINVAL;
return;
}
}
I have the following code:
char* get_address_string(PACKAGE* pkg){
char *c;
sprintf(c, "%02x:%02x:%02x:%02x:%02x:%02x", pkg->address[0], pkg->address[1],
pkg->address[2], pkg->address[3], pkg->address[4], pkg->address[5]);
return c;
}
The code works fine. However, I know this is not the proper way to return a string in C. I am receiving the warning "c is used uninitialized in this function".
What is the proper way to write this function in C?
"Proper way to return a string in C" is not truly possible. In C, a string is a character array (up to and including the null character) and arrays, by themselves, cannot be returned from a function.
A function can return pointers. So the usual methods of "returning a string" are to:
1. Return a pointer. char *foo1(...) like char *strdup()
2. Pass in a pointer to a character array and modify its contents. void foo2(char *,...) like int sprintf(char *dest, const char *format, ...)
3. Combine 1 & 2. char *foo3(char *, ...) like char *strcpy(char *dest, const char *src)
4. Pass the address of a pointer and update that. foo4(char **ptr) like ssize_t getline(char **lineptr, size_t *n, FILE *stream)
The key is that the memory associated with the pointer must be valid after the function is complete. Returning a pointer to a function's non-static local memory and then using it is undefined behavior. Successful methods include having the calling code pass in the pointer, or having the function provide it via memory allocation or a pointer to some persistent object like a global variable or string constant.
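As an illustration of the "persistent object" option, here is a sketch that returns a pointer to a static buffer. The name mac_to_string and the plain unsigned char array parameter are assumptions for this example (the question's PACKAGE type is not shown). The trade-off is that each call overwrites the previous result and the function is not thread-safe.

```c
#include <assert.h>
#include <stdio.h>

/* Returns a pointer to a static buffer that stays valid after the call,
   but is overwritten by the next call. */
const char *mac_to_string(const unsigned char address[6]) {
    static char buf[18];   /* "xx:xx:xx:xx:xx:xx" is 17 characters plus terminator */
    assert(address != NULL);
    snprintf(buf, sizeof buf, "%02x:%02x:%02x:%02x:%02x:%02x",
             (unsigned)address[0], (unsigned)address[1], (unsigned)address[2],
             (unsigned)address[3], (unsigned)address[4], (unsigned)address[5]);
    return buf;
}
```

The caller must copy the result if it needs to keep it across a second call.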
What is the proper way to write this function in C?
Current design practice encourages functions like #2 & #3 above to also supply a size_t size so the function knows the limitations of the memory available.
char *foo2(char *s, size_t size, const pkg_T *pkg) {
int result = snprintf(s, size, "%02x:%02x:%02x:%02x:%02x:%02x",
pkg->address[0], pkg->address[1], pkg->address[2],
pkg->address[3], pkg->address[4], pkg->address[5]);
// encoding error or not enough room
if (result < 0 || result >= size) return NULL;
return s;
}
Another method would allocate memory (I favor the above though). This obliges the calling code to free() the memory.
#include <limits.h> // CHAR_BIT
#include <string.h> // strdup()
#define UINT_MAX_WIDTH (sizeof(unsigned)*CHAR_BIT/3 + 3)
char *foo2alloc(const pkg_T *pkg) {
char buf[(UINT_MAX_WIDTH+3)*6 + 1];
int result = snprintf(buf, sizeof buf, "%02x:%02x:%02x:%02x:%02x:%02x",
pkg->address[0], pkg->address[1], pkg->address[2],
pkg->address[3], pkg->address[4], pkg->address[5]);
// encoding error or not enough room
if (result < 0 || (size_t) result >= sizeof buf) return NULL;
return strdup(buf); // the caller must free() the returned pointer
}
c is a pointer, but no memory is allocated. The return value is ok, that's how it can be done in C.
But you need to allocate memory.
Since c is uninitialized, sprintf writes to an unknown memory location, which leads to undefined behavior. It might crash immediately, it might not crash at all, or it might crash on some completely unrelated line of code.
You need to initialize the pointer by allocating memory to it with malloc.
char* get_address_string(PACKAGE* pkg){
char *c = malloc(20); // enough room for output as 00:11:22:33:44:55 plus null terminator
if (c == NULL) {
perror("malloc failed");
exit(1);
}
sprintf(c, "%02x:%02x:%02x:%02x:%02x:%02x", pkg->address[0], pkg->address[1], pkg->address[2], pkg->address[3], pkg->address[4], pkg->address[5]);
return c;
}
Note that even though you know ahead of time how much memory you need, you can't set it aside at compile time via an array. This is wrong:
char* get_address_string(PACKAGE* pkg){
char c[20]; // allocated on the stack, contents unspecified on return
sprintf(c, "%02x:%02x:%02x:%02x:%02x:%02x", pkg->address[0], pkg->address[1], pkg->address[2], pkg->address[3], pkg->address[4], pkg->address[5]);
return c;
}
As is this:
char* get_address_string(PACKAGE* pkg){
char c[20]; // allocated on the stack, contents unspecified on return
char *p = c;
sprintf(p, "%02x:%02x:%02x:%02x:%02x:%02x", pkg->address[0], pkg->address[1], pkg->address[2], pkg->address[3], pkg->address[4], pkg->address[5]);
return p;
}
Since c is allocated on the stack, its lifetime ends when get_address_string returns, so using the returned pointer again leads to undefined behavior.
I prefer allocating heap from the caller so that it's clear who should free it.
#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
bool GetString(char ** retString, size_t size)
{
// use size to do range check
sprintf_s(*retString, size, "blah blah blah");
return true;
}
int _tmain(int argc, _TCHAR* argv[])
{
size_t size = 100;
char *data = (char *)malloc(size);
if (data)
{
GetString(&data, size);
free(data);
}
return 0;
}
I know C is purposefully bare-bones, but I'm curious as to why something as commonplace as a substring function is not included in <string.h>.
Is it that there is not one "right enough" way to do it? Too many domain specific requirements? Can anyone shed any light?
BTW, this is the substring function I came up with after a bit of research.
Edit: I made a few updates based on comments.
void substr (char *outStr, const char *inpStr, int startPos, size_t strLen) {
/* Cannot do anything with NULL. */
if (inpStr == NULL || outStr == NULL) return;
size_t len = strlen (inpStr);
/* All negative positions to go from end, and cannot
start before start of string, force to start. */
if (startPos < 0) {
startPos = len + startPos;
}
if (startPos < 0) {
startPos = 0;
}
/* Cannot start after end of string, force to end. */
if ((size_t)startPos > len) {
startPos = len;
}
len = strlen (&inpStr[startPos]);
/* Adjust length if source string too short. */
if (strLen > len) {
strLen = len;
}
/* Copy string section */
memcpy(outStr, inpStr+startPos, strLen);
outStr[strLen] = '\0';
}
Edit: Based on a comment from r I also came up with this one liner. You're on your own for checks though!
#define substr(dest, src, startPos, strLen) snprintf(dest, BUFF_SIZE, "%.*s", strLen, src+startPos)
Basic standard library functions don't burden themselves with excessive expensive safety checks, leaving them to the user. Most of the safety checks you carry out in your implementation are of expensive kind: totally unacceptable in such a basic library function. This is C, not Java.
Once you get some checks out of the picture, the "substring" function boils down to ordinary strlcpy. I.e. ignoring the safety check on startPos, all you need to do is
char *substr(const char *inpStr, char *outStr, size_t startPos, size_t strLen) {
strlcpy(outStr, inpStr + startPos, strLen + 1); /* the size argument counts the terminator */
return outStr;
}
While strlcpy is not a part of the standard library, it can be crudely replaced by a [misused] strncpy. Again, ignoring the safety check on startPos, all you need to do is
char *substr(const char *inpStr, char *outStr, size_t startPos, size_t strLen) {
strncpy(outStr, inpStr + startPos, strLen);
outStr[strLen] = '\0';
return outStr;
}
Ironically, in your code strncpy is misused in the very same way. On top of that, many of your safety checks are the direct consequence of your choosing a signed type (int) to represent indices, while proper type would be an unsigned one (size_t).
Perhaps because it's a one-liner:
snprintf(dest, dest_size, "%.*s", sub_len, src+sub_start);
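A slightly more defensive sketch of the same one-liner follows. The precision argument of "%.*s" must be an int, so the length is range-checked and cast. The name substr_snprintf and the 0 / -1 return convention are assumptions made for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Copies at most len characters of src, starting at start, into dest.
   Returns 0 on success, -1 if start is out of range or dest is too small. */
int substr_snprintf(char *dest, size_t dest_size, const char *src,
                    size_t start, size_t len) {
    assert(dest != NULL && src != NULL);
    if (start > strlen(src) || len > INT_MAX)
        return -1;
    /* The precision limits how many characters of the source are printed. */
    int n = snprintf(dest, dest_size, "%.*s", (int)len, src + start);
    if (n < 0 || (size_t)n >= dest_size)
        return -1;                       /* encoding error or truncation */
    return 0;
}
```

For example, substr_snprintf(out, sizeof out, "This is a long string", 10, 4) stores "long" in out when out has room for at least 5 bytes.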
You DO have strcpy and strncpy. Aren't they enough for you? With strcpy you can simulate the substring from a character to the end; with strncpy you can simulate the substring from a character for a number of characters (you only need to remember to add the \0 at the end of the string). strncpy is even better than the C# equivalent, because you can overshoot the length of the substring and it won't throw an error (if you have allocated enough space in dest, you can do strncpy(dest, src, 1000) even if src has length 1. In C# you can't.)
As written in the comment, you can even use memcpy, but remember to always add a \0 at the end of the string, and you must know how many characters you are copying (so you must know exactly the length of the src substring). It is also a little more complex to use if one day you want to refactor your code to use wchar_t, and it is not type-safe (because it accepts void* instead of char*). All this in exchange for a little more speed over strncpy.
In C you have a function that returns a subset of symbols from a string via pointers: strstr.
char *ptr;
char string1[] = "Hello World";
char string2[] = "World";
ptr = strstr(string1, string2);
ptr will point to the first character of the first occurrence of string2 in string1, or be NULL if there is none.
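Since strstr() returns a pointer into the original string rather than a copy, a common follow-up is to turn it into an index. A small sketch, where find_index is a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the first occurrence of needle in haystack,
   or (size_t)-1 if needle does not occur. */
size_t find_index(const char *haystack, const char *needle) {
    assert(haystack != NULL && needle != NULL);
    const char *p = strstr(haystack, needle);
    return p != NULL ? (size_t)(p - haystack) : (size_t)-1;
}
```

For the strings above, find_index("Hello World", "World") gives 6.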
BTW you did not write a function but a procedure, ANSI string functions: string.h
Here's a lighter weight version of what you want. Avoids the redundant strlen calls and guarantees null termination on the destination buffer (something strncpy won't do).
void substr(char* pszSrc, int start, int N, char* pszDst, int lenDest)
{
const char* psz = pszSrc + start;
int x = 0;
while ((x < N) && (x < lenDest))
{
char ch = psz[x];
pszDst[x] = ch;
x++;
if (ch == '\0')
{
return;
}
}
// guarantee null termination without dropping a copied character when there is room
if (x < lenDest)
{
pszDst[x] = 0;
}
else if (x > 0)
{
pszDst[x-1] = 0;
}
}
Example:
char *pszLongString = "This is a long string";
char szSub[10];
substr(pszLongString, 10, 4, szSub, 10); // copies "long" into szSub and includes the null char
So while there isn't a formal substring function in C, C++ string classes usually have such a method:
#include <string>
...
std::string str;
std::string strSub;
str = "This is a long string";
strSub = str.substr(10, 4); // "long"
printf("%s\n", strSub.c_str());
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
const char* substr(const char *string, size_t from, size_t to);
int main(int argc, char *argv[])
{
if (argc < 2)
return 1;
char *string = argv[1];
const char *substring = substr(string,6,80);
if (substring == NULL)
return 1;
printf("string is [%s] substring is [%s]\n",string,substring);
free((void *)substring);
return 0;
}
const char* substr(const char *string, size_t from, size_t to)
{
if (string == NULL)
return NULL;
size_t len = strlen(string);
if (to > len)
to = len; // clamp the end of the range to the string length
if (from >= to) // empty or inverted range (also covers an empty string)
return NULL;
char *substring = malloc((to - from) + 1);
if (substring == NULL)
return NULL;
size_t index;
for (index = 0; from < to; from++, index++)
substring[index] = string[from];
substring[index] = '\0';
return substring;
}