Copying n chars with strncpy more efficiently in C

I'm wondering if there's a cleaner and more efficient way of doing the following strncpy, given a maximum number of chars. I feel like I am overdoing it.
int main(void)
{
char *string = "hello world foo!";
int max = 5;
char *str = malloc (max + 1);
if (str == NULL)
return 1;
if (string) {
int len = strlen (string);
if (len > max) {
strncpy (str, string, max);
str[max] = '\0';
} else {
strncpy (str, string, len);
str[len] = '\0';
}
printf("%s\n", str);
}
return 0;
}

I wouldn't use strncpy for this at all. At least if I understand what you're trying to do, I'd probably do something like this:
char *duplicate(char *input, size_t max_len) {
// compute the size of the result -- the lesser of the specified maximum
// and the length of the input string.
// (standard C has no min(), so clamp the length by hand.)
size_t len = strlen(input);
if (len > max_len)
len = max_len;
// allocate space for the result (including NUL terminator).
char *buffer = malloc(len+1);
if (buffer) {
// if the allocation succeeded, copy the specified number of
// characters to the destination.
memcpy(buffer, input, len);
// and NUL terminate the result.
buffer[len] = '\0';
}
// if we copied the string, return it; otherwise, return the null pointer
// to indicate failure.
return buffer;
}

Firstly, for strncpy, "No null-character is implicitly appended to the end of destination, so destination will only be null-terminated if the length of the C string in source is less than num."
We use memcpy() because strncpy() checks each byte for 0 as it copies. We already know the length of the string, so memcpy() does the job faster.
First calculate the length of the string, then decide how much to allocate and copy:
int max = 5; // No more than 5 characters
int len = strlen(string); // Get length of string
int to_allocate = (len > max ? max : len); // If len > max, it'll return max. If len <= max, it'll return len. So the variable will be bounded within 0...max, whichever is smaller
char *str = malloc(to_allocate + 1); // Only allocate as much as we need to
if (!str) { /* handle bad allocation here */ }
memcpy(str,string,to_allocate); // We don't need any if's, just do the copy. memcpy is faster, since we already have done strlen() we don't need strncpy's overhead
str[to_allocate] = 0; // Make sure there's a null terminator
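The quoted behaviour of strncpy() is easy to check. The sketch below is not part of the answer above: strncpy_terminated is a hypothetical helper name, and the 64-byte scratch buffer is an arbitrary assumption.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if strncpy(dst, src, n) left a '\0'
 * somewhere in the first n bytes of dst, and 0 otherwise. */
int strncpy_terminated(const char *src, size_t n)
{
    char dst[64];                    /* arbitrary scratch buffer */

    if (n > sizeof dst)
        n = sizeof dst;
    memset(dst, 'X', sizeof dst);    /* sentinel, so stale zeros cannot mislead us */
    strncpy(dst, src, n);
    return memchr(dst, '\0', n) != NULL;
}
```

With the question's data, strncpy_terminated("hello world foo!", 5) reports that no terminator was written, which is why the explicit str[max] = '\0' (or the memcpy() approach) is needed.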

Basically you're reinventing the strlcpy that was introduced in 1996 - see the "strlcpy and strlcat - consistent, safe, string copy and concatenation" paper by Todd C. Miller and Theo de Raadt. You might not have heard of it because glibc refused to add it for many years: the glibc maintainer called it “horribly inefficient BSD crap”, and it was resisted long after other operating systems had adopted it (glibc only gained strlcpy and strlcat in version 2.38) - see the "Secure Portability" paper by Damien Miller (Part 4: Choosing the right API).
You can use strlcpy on Linux using the libbsd project (packaged on Debian, Ubuntu and other distros) or by simply copying the source code easily found on the web (e.g. on the two links in this answer).
But going back to your question of what would be most efficient in your case, where you don't need the length of the source string: here is my idea, based on the strlcpy source from OpenBSD at http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/lib/libc/string/strlcpy.c?rev=1.11 but without measuring the original string, which may be very long (though still properly '\0'-terminated):
char *d = str; // the destination in your example
const char *s = string; // the source in your example
size_t size = max + 1; // the destination size in your example: max chars plus the NUL
size_t n = size;
/* Copy as many bytes as will fit */
if (n != 0) {
while (--n != 0) {
if ((*d++ = *s++) == '\0')
break;
}
}
/* Not enough room in dst, add NUL */
if (n == 0) {
if (size != 0)
*d = '\0'; /* NUL-terminate dst */
}
Here is a version of strlcpy on http://cantrip.org/strlcpy.c that uses memcpy:
/*
* ANSI C version of strlcpy
* Based on the NetBSD strlcpy man page.
*
* Nathan Myers <ncm-nospam#cantrip.org>, 2003/06/03
* Placed in the public domain.
*/
#include <string.h> /* for size_t, strlen and memcpy */
size_t
strlcpy(char *dst, const char *src, size_t size)
{
const size_t len = strlen(src);
if (size != 0) {
memcpy(dst, src, (len > size - 1) ? size - 1 : len);
dst[size - 1] = 0;
}
return len;
}
Which one is more efficient depends, I think, on the source string. For very long source strings the strlen may take a long time, so if you don't need to know the original length, the first example may be faster for you.
It all depends on your data, so profiling on real data would be the only way to find out.
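A middle ground between the two versions is to bound the length scan itself. The following is only a sketch: dup_at_most is a hypothetical name, and it relies on memchr() stopping at the first match, which C11 and later guarantee (POSIX strnlen() does the same job where it is available).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate at most max characters of src.
 * memchr() stops at the first '\0' or after max bytes, whichever comes
 * first, so a very long source is never scanned past max. */
char *dup_at_most(const char *src, size_t max)
{
    const char *end = memchr(src, '\0', max);
    size_t len = end ? (size_t)(end - src) : max;
    char *copy = malloc(len + 1);

    if (copy) {
        memcpy(copy, src, len);
        copy[len] = '\0';
    }
    return copy;   /* NULL if the allocation failed */
}
```

This keeps the single memcpy() of the second version while doing no more scanning than the first one. The caller owns the result and must free() it.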

You can reduce the volume of code by:
int main(void)
{
char *string = "hello world foo!";
int max = 5;
char *str = malloc(max + 1);
if (str == NULL)
return 1;
if (string) {
int len = strlen(string);
if (len > max)
len = max;
strncpy(str, string, len);
str[len] = '\0';
printf("%s\n", str);
}
return 0;
}
There isn't much you can do to speed the strncpy() up further. You could reduce the time by using:
char string[] = "hello world foo!";
and then avoid the strlen() by using sizeof(string) - 1 instead (sizeof also counts the terminating null).
Note that if the maximum size is large and the string to be copied is small, then the fact that strncpy() writes a null over each unused position in the target string can really slow things down.

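strncpy() stops reading the source once it hits a NUL (it pads the rest of the destination with NULs), so passing max without checking the length first is enough, provided you still set str[max] = '\0' for the case where the source is longer than max.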
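Spelled out, that shortest variant might look like the sketch below. copy_max is a hypothetical name, and the buffer is always max + 1 bytes, as in the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy at most max characters of src into a fresh
 * buffer of max + 1 bytes. */
char *copy_max(const char *src, size_t max)
{
    char *dst = malloc(max + 1);

    if (dst) {
        /* Copies up to max chars; pads with '\0' if src is shorter. */
        strncpy(dst, src, max);
        /* Needed when src has max or more chars: strncpy adds no terminator then. */
        dst[max] = '\0';
    }
    return dst;
}
```

The price of the brevity is that the buffer is always max + 1 bytes and strncpy() zero-fills whatever the source does not use, which matters only when max is large.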

I believe this is sufficient:
char *str = malloc(max+1);
if(! str)
return 1;
int len = strlen(string);
memset(str, 0, max+1);
int copy = len > max ? max : len;
strncpy(str, string, copy);

Related

reduce the size of a string

(Disclaimer: this is not a complete exercise because I still have to finish it, but the error occurs in this part of the code.)
I did this exercise to practice memory allocation.
Create a function that takes a URL (a C string) and returns the name of the website (with "www." and with the extension).
For example, given Wikipedia's link, "http://www.wikipedia.org/", it has to return only "www.wikipedia.org" in another string (dynamically allocated on the heap).
This is what I did so far:
Do a for-loop, and when "i" is greater than 6, start copying each character into another string until "/" is reached.
I need to allocate the other string, and then reallocate it.
Here's my attempt so far:
char *read_website(const char *url) {
char *str = malloc(sizeof(char));
if (str == NULL) {
exit(1);
}
for (unsigned int i = 0; url[i] != "/" && i > 6; ++i) {
if (i <= 6) {
continue;
}
char* s = realloc(str, sizeof(char) + 1);
if (s == NULL) {
exit(1);
}
*str = *s;
}
return str;
}
int main(void) {
char s[] = "http://www.wikipedia.org/";
char *str = read_website(s);
return 0;
}
(1) By debugging line by line, I've noticed that the program ends as soon as the for-loop is reached.
(2) Another thing: I've chosen to create another pointer when calling realloc, because I have to check whether there's a memory leak. Is that good practice, or should I have done something else?
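There are multiple problems in your code:
The loop condition url[i] != "/" && i > 6 is false on the very first iteration, because i starts at 0. The loop body therefore never runs, which is why the debugger leaves the for-loop immediately.
url[i] != "/" is incorrect; it is a type mismatch. You should compare the character url[i] with the character constant '/', not the string literal "/".
char *s = realloc(str, sizeof(char) + 1); always reallocates to size 2, not to the current length plus 1.
You do not advance the pointers, nor do you use the index variable to copy anything: *str = *s; copies a single char instead of storing the new pointer.
Instead of using malloc and realloc, you should first compute the length of the server name and allocate the array with the correct size directly.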
Here is a modified version:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *read_website(const char *url) {
// skip the protocol part
if (!strncmp(url, "http://", 7))
url += 7;
else if (!strncmp(url, "https://", 8))
url += 8;
// compute the length of the domain name, stop at ':' or '/'
size_t n = strcspn(url, "/:");
// return an allocated copy of the substring
return strndup(url, n);
}
int main(void) {
char s[] = "http://www.wikipedia.org/";
char *str = read_website(s);
printf("%s -> %s\n", s, str);
free(str);
return 0;
}
strndup() is a POSIX function available on many systems, and it is part of the C23 Standard. If it is not available on your target, here is a simple implementation:
char *strndup(const char *s, size_t n) {
char *p;
size_t i;
for (i = 0; i < n && s[i]; i++)
continue;
p = malloc(i + 1);
if (p) {
memcpy(p, s, i);
p[i] = '\0';
}
return p;
}
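On the second part of the question: assigning the result of realloc() to a separate pointer is indeed the usual practice, because on failure realloc() returns NULL and leaves the old block allocated. If the exercise requires growing the string step by step, a sketch could look like this. read_website_grow is a hypothetical name, and the "://" search is a simplifying assumption about what the input looks like.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of read_website() that grows the result one
 * character at a time, to illustrate the realloc() pattern. */
char *read_website_grow(const char *url)
{
    /* Assumption: the protocol, if present, ends with "://". */
    const char *p = strstr(url, "://");
    p = p ? p + 3 : url;

    size_t len = 0;
    char *str = malloc(1);               /* room for the terminator only */
    if (str == NULL)
        return NULL;

    for (; *p != '\0' && *p != '/' && *p != ':'; ++p) {
        /* Room for the characters so far, one more, and the '\0'. */
        char *tmp = realloc(str, len + 2);
        if (tmp == NULL) {
            free(str);                   /* the old block is still ours to release */
            return NULL;
        }
        str = tmp;                       /* store the new pointer, not *tmp */
        str[len++] = *p;
    }
    str[len] = '\0';
    return str;
}
```

Calling realloc() once per character is wasteful compared with computing the length first, so this is for practice rather than production use.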
The assignment doesn't say the returned string must be of minimal size, and the amount of memory used for URLs is minimal.
Building on chqrlie's solution, I'd start by finding the beginning of the domain name (skipping the protocol portion), duplicate the rest of the string, and then truncate the result. Roughly:
const char *prot[] = { "http://", "https://" };
const char *s = url;
for( int i=0; i < 2; i++ ) {
size_t n = strlen(prot[i]);
if( 0 == strncmp(url, prot[i], n) ) {
s += n;
break;
}
}
char *output = strdup(s);
if( output ) {
size_t n = strcspn(output, "/:");
output[n] = '\0';
}
return output;
The returned pointer can still be freed by the caller, so the total "wasted" space is limited to the trailing part of the truncated URL.

allocating enough space for the actual length of the string

Here's a part of my code. I'm putting some lines of my text file into array1. I chose the number 28, but it has to be a different number for each line I'm storing. I need to allocate space for the actual length of each line, and I'm not sure how to find the length of each string because sizeof(str) always gives me 100.
while (fgets(str, sizeof(char)*100, fp) != NULL) {
array1[j] = (char *)malloc(sizeof(char)*28);
strcpy(array1[j], str);
j++;
//rest of the code
}
The cast (char *) is not needed in (char *)malloc(sizeof(char)*28);.
Find the length by using strlen(str) (@M Oehm). This length does not include the '\0'.
Find the size needed by adding 1 to the length.
Allocate for the string size, not length.
Best to use size_t for string length/size computations. int may be insufficient.
The problem is like writing a string duplicate function. Research the common strdup() function.
char *s96_strdup(const char *s) {
size_t length = strlen(s); // Get the string length = does not include the \0
size_t size = length + 1;
char *new_string = malloc(size);
// Was this successful?
if (new_string) {
memcpy(new_string, s, size); // copy
}
return new_string;
}
Usage. fgets() reads a line which usually includes a '\n'.
char str[100];
while (j < JMAX && fgets(str, sizeof str, fp) != NULL) {
array1[j] = s96_strdup(str);
j++;
}
Remember to eventually call free(array1[j]) for each string allocated.
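Putting the reading loop and the cleanup together could look like the sketch below. read_lines and free_lines are hypothetical names, and the 100-byte line buffer is carried over from the question; longer lines would be split across entries.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read up to max_lines lines from fp, storing each in
 * a buffer of exactly the required size. Returns the number stored. */
size_t read_lines(FILE *fp, char **lines, size_t max_lines)
{
    char buf[100];                        /* same limit as in the question */
    size_t count = 0;

    while (count < max_lines && fgets(buf, sizeof buf, fp) != NULL) {
        size_t size = strlen(buf) + 1;    /* length plus the '\0' */
        lines[count] = malloc(size);
        if (lines[count] == NULL)
            break;                        /* keep what was read so far */
        memcpy(lines[count], buf, size);
        count++;
    }
    return count;
}

/* Release everything read_lines() stored. */
void free_lines(char **lines, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(lines[i]);
}
```

Each stored line still carries the '\n' that fgets() read, exactly as with s96_strdup() above.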

Copying char array into another without copying newline

I'm using fgets to read a string in a char array, and then I want to move the pointer over 5 indices and copy the rest of the array into a separate array, BUT I don't want it to copy a newline character; so I've got something like this:
char str1[45], str2[50];
fgets(str2, 50, stdin);
*str2 += 5;
sprintf(str1, "%[^\n]s", str2);
but when I try to compile, I get an error that says:
unknown conversion type character '[' in format [-Wformat]
I'm pretty sure I've used the "%[^\n]s" before with scanf, and it worked just fine, any suggestions?
The pattern %[^\n]s is a valid format for scanf, but it is not a valid format specifier for printf (or sprintf).
Additionally, *str2 += 5 does not skip the first 5 characters (as appears to be the intention); instead it adds 5 to the byte stored in the first element of str2. str2 = str2 + 5 will not compile since str2 is an array. You could assign str2 + 5 to a temporary pointer or pass it directly to sprintf.
Here is a slightly better way of doing what you are asking:
size_t len;
char *str1 = NULL, str2[50];
fgets(str2, 50, stdin);
len = strlen(str2);
if (len > 5) {
if (str2[len-1] == '\n') {
str2[len-1] = '\0'; // remove newline
len--;
}
str1 = str2 + 5;
len -= 5;
}
if (str1 != NULL) {
// do stuff
}
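If the scanset is what you are after, the function that understands it is sscanf(), not sprintf(). A minimal sketch follows, assuming the destination holds at least 45 bytes like str1 in the question; copy_tail is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src, minus its first skip characters and minus
 * anything from the first newline on, into dst (at least 45 bytes).
 * Returns 1 if something was copied, 0 otherwise. */
int copy_tail(char dst[45], const char *src, size_t skip)
{
    dst[0] = '\0';
    if (strlen(src) < skip)
        return 0;                 /* too short to skip that far */
    /* %44[^\n] reads up to 44 non-newline characters and appends the '\0'. */
    return sscanf(src + skip, "%44[^\n]", dst) == 1;
}
```

The field width of 44 keeps the copy inside the 45-byte array, and there is no trailing "s" after the scanset.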
Try removing the final "s" in "%[^\n]s"
"%[^\n]s" is ok for scanf(), but not with printf(). Note: certainly the "s" is superfluous.
Various methods exist to trim the trailing \n. I suggest:
if (fgets(str2, 50, stdin) == NULL) Handle_EOForIOError();
size_t len = strlen(str2);
if (len > 0 && str2[len-1] == '\n') str2[--len] = '\0'; // overwrite the newline
if (len < 5) Handle_ShortString();
memcpy(str1, str2 + 5, len-5+1); // the +1 copies the null terminator
Note that strings returned from fgets() do not always end in '\n'.
See trimming-fgets
To solve this problem properly, you must not lose sight of the fact that you're dealing with C arrays that have a limited size. The copy must not only stop at the newline, as required, but must also ensure that the target array is properly null terminated and that it is not overrun.
For this, it may be best to write a function, like:
#include <string.h>
/* Copy src to dst, until the point that a character from the bag set
* is encountered in src. (That character is not included in the copy.)
* Ensures that dst is null terminated, unless dstsize is zero.
* dstsize gives the size of the dst.
* Returns the number of characters required to perform a complete copy;
* if this exceeds dstsize, then the copy was truncated.
*/
size_t copyspan(char *dst, size_t dstsize, const char *src, const char *bag)
{
size_t ideal_length = strcspn(src, bag); /* how many chars to copy */
size_t limited_length = (ideal_length < dstsize) ? ideal_length : dstsize - 1;
if (dstsize > 0) {
memcpy(dst, src, limited_length);
dst[limited_length] = 0;
}
return ideal_length + 1;
}
With this function we can now do:
if (copyspan(str1, sizeof str1, str2, "\n") > sizeof str1) {
/* oops, truncated: handle this somehow */
}
Of course, there is also the issue that fgets may have truncated the original data already.
Dealing with just the trailing newline that is usually returned by fgets (except in the case of an overflowing line or a file not terminated by a newline) is usually done like this:
{
char line[128];
/*...*/
if (fgets(line, sizeof line, file)) {
char *pnl = strchr(line, '\n'); /* obtain pointer to first newline */
if (pnl != 0) /* if found, overwrite it with null */
*pnl = 0;
}
/*...*/
}
If you are doing this kind of line reading in many places, it is better to make a wrapper than to repeat this logic, of course.
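Such a wrapper could be as small as the sketch below. read_line is a hypothetical name, and it uses strcspn() instead of the strchr() test shown above, which handles the "no newline" case without a branch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper around fgets(): reads one line into buf and removes
 * the trailing newline, if any. Returns buf, or NULL at end of file or on
 * a read error. */
char *read_line(char *buf, int size, FILE *fp)
{
    if (fgets(buf, size, fp) == NULL)
        return NULL;
    /* strcspn() returns the index of the first '\n', or the string length. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

A line longer than size - 1 characters is still delivered in pieces, so callers that care need their own check for truncation.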
#include <stdio.h>
int main(void){
char str1[45], str2[50];
if(fgets(str2, sizeof str2, stdin)){
int i=0, j;
for(j=0; str2[j] && str2[j] != '\n' && j < 5; ++j)
;
if(j == 5){//move success
while(str2[j] && str2[j] != '\n')
str1[i++] = str2[j++];
str1[i]=0;
puts(str1);
}
}
return 0;
}
Personally, I would just implement it from scratch with a simple for loop.
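char str1[45], str2[50];
if (fgets(str2, sizeof str2, stdin) != NULL) {
size_t len = strlen(str2);
size_t k;
// stop at the newline so it is not copied
for (k = 5; k < len && str2[k] != '\n'; k++) {
str1[k - 5] = str2[k];
}
str1[k - 5] = '\0'; // terminate the copy
}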

How do you insert a string within a dynamic target using realloc()?

I'm struggling with memory allocation. I wanted to insert one string into another, and I wrote two functions that both stop working at the same place: realloc. The functions are very similar. In the first one I copy char by char into a temporary string, and I get errors when I try to copy the temporary string back to the first one. In the second function I copy the end of the first string (from the given position) to a temporary string, reallocate the first string (this is where I get errors) and remove everything in it from the given position onwards. Then I append the second string and the temporary string to the first string. Here is my code.
First function:
// str2 - is a string that I want to input in first string(str)
// at certain position (pos)
void ins (char **str, char *str2, int pos)
{
// lenght of first and second strings
int len = strlen(str[0]),
len2 = strlen(str2),
i, j, l = 0;
// creating temporary string
char *s = (char *) malloc ((len + len2) * sizeof(char));
// copying first part of first string
for (i = 0; i < pos; i++)
s[i] = str[0][i];
// copying second string
for (j = 0; j < len2; j++)
s[i + j] = str2[j];
// copying second part of first string
for (int k = pos; k < len; k++)
{
s[i + j + l] = str[0][k];
l++;
}
// reallocating additional space for second string
// and copying temporary string to first string
str[0] = (char *) realloc (str[0], (len + len2) * sizeof(char));
strcpy(str[0], s);
free(s);
s = NULL;
}
Second function:
void ins2 (char **str,char *str2, int pos)
{
// lenght of first and second string
int len = strlen(str[0]),
len2 = strlen(str2);
// creating a temporary string and copying
// from the given position
char *s = (char *) malloc ((len - pos) * sizeof(char));
strcpy(s, str[0] + pos);
// reallocating space for string that will be added
// deleting part of it from the given position
str[0] = (char *) realloc(str[0], (len + len2) * sizeof(char));
str[0][pos] = '\0';
// adding second string and temporary string
strcat(str[0], str2);
strcat(str[0], s);
// be free, temporary string
free(s);
s = NULL;
}
If you're doing what I think you're trying to do, you need one realloc() for this, assuming the incoming string is indeed already dynamically allocated (it better be):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void ins (char **str, const char *str2, size_t pos)
{
// length of first and second strings
size_t len = strlen(*str);
size_t len2 = strlen(str2);
// reallocate new string
char *tmp = realloc(*str, len + len2 + 1);
if (tmp != NULL)
{
*str = tmp;
memmove(tmp+pos+len2, tmp+pos, len-pos);
memcpy(tmp+pos, str2, len2);
tmp[len+len2] = 0;
}
}
int main()
{
char *str = strdup("A simple string");
char s2[] = "inserted ";
printf("%s\n", str);
ins(&str, s2, 9);
printf("%s\n", str);
free(str);
return 0;
}
Output
A simple string
A simple inserted string
How It Works
The passed-in strings are both sent through strlen() to obtain their lengths. Once we have those we know how large the resulting buffer needs to be.
Once we realloc() the buffer, the original content is preserved, but we need to (possibly) shift content of the first string to open a hole for the second string. That shift, if done, may require overlapped memory be moved (as it does in the sample). For such memory copying, memmove() is used. Unlike memcpy(), the memmove() library function supports copying where the source and destination regions may overlap.
Once the hole is made, we memcpy() the second string into position. There is no need for strcpy() since we already know the length.
We finish by setting the last slot to a terminating 0, thereby finishing the null-terminated string and completing the operation.
Note I made no affordances at all for this regarding someone passing an invalid pos (out of range), NULL strings, optimizing to nothing if str2 is empty (or NULL), etc. That cleanup I leave to you, but I hope the idea of how this can be done is clear.
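One possible shape for that cleanup is sketched below. ins_checked and its return convention (0 on success, -1 on bad arguments or allocation failure) are assumptions for illustration, not part of the answer above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical checked variant of ins(): inserts str2 into *str at pos.
 * Returns 0 on success, -1 on invalid arguments or allocation failure;
 * on failure *str is left unchanged. */
int ins_checked(char **str, const char *str2, size_t pos)
{
    if (str == NULL || *str == NULL || str2 == NULL)
        return -1;

    size_t len = strlen(*str);
    size_t len2 = strlen(str2);

    if (pos > len)
        return -1;                /* position outside the string */
    if (len2 == 0)
        return 0;                 /* nothing to insert */

    char *tmp = realloc(*str, len + len2 + 1);
    if (tmp == NULL)
        return -1;

    /* Shift the tail, including its '\0', to open a hole of len2 bytes. */
    memmove(tmp + pos + len2, tmp + pos, len - pos + 1);
    memcpy(tmp + pos, str2, len2);
    *str = tmp;
    return 0;
}
```

Moving len - pos + 1 bytes carries the existing terminator along, so no separate store of the final 0 is needed.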

Dynamic memory allocation + truncating a string issue

I've been fooling around with malloc, realloc and free in order to write some basic functions to operate on C strings (char*). I've encountered this weird issue when erasing the last character from a string. I wrote a function with such a prototype:
int string_erase_end (char ** dst, size_t size);
It's supposed to shorten the "dst" string by one character. So far I have come up with this code:
int string_erase_end (char ** dst, size_t size)
{
size_t s = strlen(*dst) - size;
char * tmp = NULL;
if (s < 0) return (-1);
if (size == 0) return 0;
tmp = (char*)malloc(s);
if (tmp == NULL) return (-1);
strncpy(tmp,*dst,s);
free(*dst);
*dst = (char*)malloc(s+1);
if (*dst == NULL) return (-1);
strncpy(*dst,tmp,s);
*dst[s] = '\0';
free(tmp);
return 0;
}
In main(), when I truncate strings (yes, I called malloc on them previously), I get strange results. Depending on the number of characters I want to truncate, it either works OK, truncates a wrong number of characters or throws a segmentation fault.
I have no experience with dynamic memory allocation and have always used C++ and its std::string to do all such dirty work, but this time I need to make this work in C. I'd appreciate if someone helped me locate and correct my mistake(s) here. Thanks in advance.
The first strncpy() doesn't put a '\0' at the end of tmp.
Also, you could avoid a double copy: *dst = tmp;
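Two more details in the question's code are worth pointing out: *dst[s] parses as *(dst[s]), so it indexes the pointer-to-pointer instead of the string, and s < 0 can never be true because size_t is unsigned. A sketch that keeps the original prototype but truncates in place might look like this (the error convention is the question's; the rest is an assumption about the intended behaviour):

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: remove the last size characters from *dst.
 * Returns 0 on success, -1 on invalid arguments. */
int string_erase_end(char **dst, size_t size)
{
    if (dst == NULL || *dst == NULL)
        return -1;

    size_t len = strlen(*dst);
    if (size > len)               /* check before subtracting: size_t cannot go negative */
        return -1;

    size_t s = len - size;
    (*dst)[s] = '\0';             /* parentheses needed: *dst[s] means *(dst[s]) */

    /* Optionally give the unused bytes back. If shrinking fails, the old
     * block is still valid and already truncated, so that is not an error. */
    char *tmp = realloc(*dst, s + 1);
    if (tmp != NULL)
        *dst = tmp;
    return 0;
}
```

No temporary copy is needed at all, since shortening a string only requires moving the terminator.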
According to your description your function is supposed to erase the last n characters in a string:
/* Assumes passed string is zero terminated... */
void string_erase_last_char(char * src, size_t num_chars_to_erase)
{
size_t len = strlen(src);
if (num_chars_to_erase > len)
{
num_chars_to_erase = len;
}
src[len - num_chars_to_erase] = '\0';
}
I don't understand the purpose of the size parameter.
If your strings are initially allocated using malloc(), you should just use realloc() to change their size. That will retain the content automatically, and require fewer operations:
int string_erase_end (char ** dst)
{
size_t len;
char *ns;
if (dst == NULL || *dst == NULL)
return -1;
len = strlen(*dst);
if (len == 0)
return -1;
ns = realloc(*dst, len); /* len - 1 characters plus the terminator */
if (ns == NULL)
return -1;
ns[len - 1] = '\0';
*dst = ns;
return 0;
}
In the "real world", you would generally not change the allocated size for a 1-char truncation; it's too inefficient. You would instead keep track of the string's length and its allocated size separately. That makes it easy for strings to grow; as long as there is allocated space already, it's very fast to append a character.
Also, in C you never need to cast the return value of malloc(); it serves no purpose and can hide bugs so don't do it.
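Tracking the length and the allocated size separately can be sketched as follows. The strbuf type, its function names, the starting capacity of 16 and the doubling policy are all assumptions chosen for illustration.

```c
#include <stdlib.h>

/* Hypothetical growable string: len is the number of characters stored,
 * cap is the number of bytes allocated. Start from {NULL, 0, 0}. */
struct strbuf {
    char *data;
    size_t len;
    size_t cap;
};

/* Append one character, growing the buffer only when it is full.
 * Returns 0 on success, -1 if the allocation failed. */
int strbuf_push(struct strbuf *sb, char c)
{
    if (sb->len + 1 >= sb->cap) {            /* no room for c plus the '\0' */
        size_t new_cap = sb->cap ? sb->cap * 2 : 16;
        char *tmp = realloc(sb->data, new_cap);
        if (tmp == NULL)
            return -1;
        sb->data = tmp;
        sb->cap = new_cap;
    }
    sb->data[sb->len++] = c;
    sb->data[sb->len] = '\0';
    return 0;
}

/* Erase the last character: no reallocation, just move the terminator. */
void strbuf_pop(struct strbuf *sb)
{
    if (sb->len > 0)
        sb->data[--sb->len] = '\0';
}
```

Truncation becomes a single store and most appends touch no allocator at all; free(sb.data) releases the buffer when the string is no longer needed.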

Resources