Working with substrings - c

I have a string char *buff and want to work on a substring of it (from buff + x to buff + y).
Do I have to copy this substring to another variable, or is there a better way to reach it?
Right now I only want to write this substring to a file.

No, you don't have to copy it; just do the write directly:
fwrite(buff + x, y - x + 1, 1, my_file);
The above assumes a closed interval, by the way; if you mean a half-open one, you need to remove the + 1. For instance, with const char *buff = "hello, world"; the above will write "world" if x = 7 and y = 11.
The write will be done from the "slice" of your buffer, since that's all you say to fwrite(). It has no idea that the data it receives is part of something larger, of course.
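As a rough sketch of the same idea wrapped in a function: write_slice is a hypothetical name, not something from the question. Unlike the call above, it passes 1 as the element size so that the return value of fwrite() counts bytes rather than whole items.

```c
#include <stdio.h>

/* Hypothetical helper: writes the closed interval buff[x..y] to out.
   Returns the number of bytes actually written. */
static size_t write_slice(const char *buff, size_t x, size_t y, FILE *out)
{
    if (y < x)
        return 0;               /* inverted range: nothing to write */
    size_t len = y - x + 1;     /* closed interval, hence the + 1 */
    return fwrite(buff + x, 1, len, out);
}
```

With buff pointing at "hello, world", write_slice(buff, 7, 11, my_file) should write the five bytes of "world" and return 5.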
As pointed out in a comment, the above treats the slice as binary data, which might be bad if it's really a string. In that case, to be able to use e.g. fprintf() with %s, you should use a dynamic format string (you need %.Ns where N is y - x + 1):
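static int substring_print(const char *s, size_t start, size_t end)
{
    char fmt[32];
    /* Build a format such as "%.5s"; %zu is the conversion for size_t. */
    snprintf(fmt, sizeof fmt, "%%.%zus", end - start + 1);
    /* Return the character count from fprintf() (negative on error). */
    return fprintf(stdout, fmt, s + start);
}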

Related

Size of formatted string

I am struggling to understand what happens during snprintf.
Let's say I have two numbers:
int i =11; int k = 3;
I want to format them like this "[%02d] %03d\t" and use snprintf.
Afterwards I use the resulting string with write().
snprintf needs the length/bytes n.
I do not understand what is the length I need to provide...
I have 2 theories:
a) It is
sizeof(int)*2
b) I check how many chars the formatted string will contain by counting the digits of the two integers and adding the other chars that the output will have:
2*sizeof(char) + 1*sizeof(char) + 2*sizeof(char) + 3*sizeof(char)+ 1*sizeof(char)
-> digits of i + digits of k + zeros added to first int + zeros added to second int + tab
I am struggling to understand what is the "n" I have to give to snprintf
It is the buffer size.
According to the documentation:
Maximum number of bytes to be used in the buffer. The generated string
has a length of at most n-1, leaving space for the additional
terminating null character. size_t is an unsigned integral type.
Suppose you write to an array such as this:
char buf[32];
The buffer can hold 32 chars (including the null terminator). Therefore we call the function like this:
snprintf (buf, 32, "[%02d] %03d\t", i, k);
You can also check the return value to see how many chars have been written (or would have been written, not counting the null terminator). In this case, if it's 32 or more, then some characters had to be discarded because they didn't fit.
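A minimal sketch of that return-value check, using the format string from the question. The helper name format_pair is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: formats i and k into buf, which holds cap bytes.
   Returns 1 if the whole string fit, 0 on truncation or error. */
static int format_pair(char *buf, size_t cap, int i, int k)
{
    int n = snprintf(buf, cap, "[%02d] %03d\t", i, k);
    /* n excludes the terminating null, so n == cap already means
       that the last character was dropped. */
    return n >= 0 && (size_t)n < cap;
}
```

For i = 11 and k = 3 the formatted text is 9 characters long, so a 32-byte buffer is more than enough, while a 9-byte buffer would lose the trailing tab.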
Pass NULL and 0 first to obtain the exact length:
int n = snprintf(NULL, 0, "[%02d] %03d\t", i, k);
Then you know you need n + 1 bytes:
char *buf = malloc(n + 1);
snprintf(buf, n + 1, "[%02d] %03d\t", i, k);
free(buf);
See it on ideone: https://ideone.com/pt0cOQ
n is the size of the buffer you're passing into snprintf, so it knows when to stop writing to it. This is to prevent a category of errors known as buffer overflows. snprintf will write at most n - 1 characters into the passed-in buffer and then terminate it with the null character.

Equivalent of write in printf in the following code?

I wanted to ask what will be the equivalent of this write statement in printf statement?
write(STDOUT_FILENO, buf + start, end - start);
Where buf is a char*, start is int, end is int.
The line which is confusing me is buf + start?
Or how can I save this to a char array using strcpy and then printf that char array? I don't know how to copy the output of the above code to a char array, and I don't understand what buf + start is doing.
thanks
The expression buf + start uses pointer arithmetic and is equivalent to &buf[start], the pointer to the position start in buf. The code you show prints the slice from start to end (exclusively) of your char buffer buf.
If your buffer doesn't contain zeros, you can rewrite that as:
printf("%.*s", (int) (end - start), buf + start);
The cast to (int) isn't strictly necessary in your case, but the * precision in printf requires an int and one often uses size_t for positions, so I've made that a habit.
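To make the "doesn't contain zeros" caveat concrete, here is a small sketch. print_slice is a hypothetical wrapper around the same %.*s call, and its return value shows how many characters were really printed.

```c
#include <stdio.h>

/* Hypothetical helper: prints the half-open slice buf[start..end) to out.
   Returns what fprintf() returns: the number of characters printed,
   or a negative value on error. */
static int print_slice(FILE *out, const char *buf, size_t start, size_t end)
{
    if (end < start)
        return 0;               /* inverted range: print nothing */
    return fprintf(out, "%.*s", (int)(end - start), buf + start);
}
```

print_slice(stdout, "hello, world", 7, 12) prints "world" and returns 5. For a buffer such as "ab\0cd" with start = 0 and end = 5, only "ab" is printed, because %s stops at the first zero byte, whereas write() would emit all five bytes.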
To copy this data you need
char *mybuffer;
mybuffer = malloc(end - start + 1);
if (mybuffer != NULL)
{
memcpy(mybuffer, buf + start, end - start);
mybuffer[end - start] = '\0';
}
There you go: mybuffer can now be used in a printf-like function. Remember to call free(mybuffer) at some point after you are done using it. You also need to check that end - start >= 0, and be aware that if there is a null byte embedded in the data, printf and family will treat the string as shorter than end - start.
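Putting the copy and the range check together might look like the following sketch. copy_slice is a hypothetical name, and the int parameters mirror the types given in the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd, null-terminated copy of
   buf[start..end), or NULL if the range is invalid or allocation fails.
   The caller is responsible for calling free() on the result. */
static char *copy_slice(const char *buf, int start, int end)
{
    if (start < 0 || end < start)
        return NULL;                    /* reject negative lengths */
    size_t len = (size_t)(end - start);
    char *out = malloc(len + 1);
    if (out != NULL) {
        memcpy(out, buf + start, len);  /* copies embedded zeros too */
        out[len] = '\0';
    }
    return out;
}
```

If the slice contains a zero byte, all end - start bytes are still copied, but strlen() and printf("%s") on the result stop at that byte.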

Performing lots of string concatenation in C?

I'm porting some code from Java to C, and so far things have gone well.
However, I have a particular function in Java that makes liberal use of StringBuilder, like this:
StringBuilder result = new StringBuilder();
// .. build string out of variable-length data
for (SolObject object : this) {
result.append(object.toString());
}
// .. some parts are conditional
if (freezeCount < 0) result.append("]");
else result.append(")");
I realize SO is not a code translation service, but I'm not asking for anyone to translate the above code.
I'm wondering how to efficiently perform this type of mass string concatenation in C. It's mostly small strings, but each is determined by a condition, so I can't combine them into a simple sprintf call.
How can I reliably do this type of string concatenation?
A rather "clever" way to convert a number of "objects" to a string is:
char buffer[100];
char *str = buffer;
str += sprintf(str, "%06d", 123);
str += sprintf(str, "%s=%5.2f", "x", 1.234567);
This is fairly efficient, since sprintf returns the length of the string copied, so we can "move" str forward by the return value, and keep filling in.
Of course, if there are true Java Objects, then you'll need to figure out how to turn a Java-style toString function into "%something" in C's printf family.
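The sprintf chain above trusts that everything fits in buffer. A bounds-checked variation, sketched here with a hypothetical helper named append_fmt built on vsnprintf(), could look like this:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: appends formatted text at *pos inside a buffer
   whose one-past-the-end address is `end`. On success it advances *pos
   and returns 1; if the text would not fit it returns 0 and leaves the
   string as it was before the call. */
static int append_fmt(char **pos, char *end, const char *fmt, ...)
{
    size_t room = (size_t)(end - *pos);
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(*pos, room, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= room) {
        if (room > 0)
            **pos = '\0';       /* discard the partial write */
        return 0;
    }
    *pos += n;
    return 1;
}
```

Usage follows the same pattern as before: keep char *str = buffer; and call append_fmt(&str, buffer + sizeof buffer, "%06d", 123); for each piece, checking the result where truncation matters.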
The performance problem with strcat() is that it has to scan the destination string to find the terminating '\0' before it can start appending to it.
But remember that strcat() doesn't take strings as arguments, it takes pointers.
If you maintain a separate pointer that always points to the terminating '\0' of the string you're appending to, you can use that pointer as the first argument to strcat(), and it won't have to re-scan it every time. For that matter, you can use strcpy() rather than strcat().
Maintaining the value of this pointer and ensuring that there's enough room are left as an exercise.
NOTE: you can use strncat() to avoid overwriting the end of the destination array (though it will silently truncate your data). I don't recommend using strncpy() for this purpose. See my rant on the subject.
If your system supports them, the (non-standard) strlcpy() and strlcat() functions can be useful for this kind of thing. They both return the total length of the string they tried to create. But their use makes your code less portable; on the other hand, there are open-source implementations that you can use anywhere.
Another solution is to call strlen() on the string you're appending. This isn't ideal, since it's then scanned twice, once by strcat() and once by strlen() -- but at least it avoids re-scanning the entire destination string.
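One possible take on the exercise of maintaining the end pointer and checking for room is sketched below. append_at and its limit parameter are illustrative names, not part of any library.

```c
#include <string.h>

/* Hypothetical helper: `end` must point at the terminating '\0' of the
   string being built, and `limit` one past the last byte of the
   destination array. Copies src there and returns the new end pointer,
   or returns NULL and leaves the destination untouched if src does not fit. */
static char *append_at(char *end, const char *limit, const char *src)
{
    size_t len = strlen(src);
    if (len >= (size_t)(limit - end))
        return NULL;                /* no room for src plus its '\0' */
    memcpy(end, src, len + 1);      /* copy including the terminator */
    return end + len;
}
```

Each call scans only the string being appended, never the destination, so the cost of building the result stays proportional to its total length.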
The cause of poor performance when concatenating strings is the reallocation of memory. Joel Spolsky discusses this in his article Back to basics. He describes the naive method of concatenating strings:
Shlemiel gets a job as a street painter, painting the dotted lines down the middle of the road. On the first day he takes a can of paint out to the road and finishes 300 yards of the road. "That's pretty good!" says his boss, "you're a fast worker!" and pays him a kopeck.
The next day Shlemiel only gets 150 yards done. "Well, that's not nearly as good as yesterday, but you're still a fast worker. 150 yards is respectable," and pays him a kopeck.
The next day Shlemiel paints 30 yards of the road. "Only 30!" shouts his boss. "That's unacceptable! On the first day you did ten times that much work! What's going on?"
"I can't help it," says Shlemiel. "Every day I get farther and farther away from the paint can!"
If you can, you want to know how large your destination buffer needs to be before allocating it. The only realistic way to do this is to call strlen on all of the strings you want to concatenate. Then allocate the appropriate amount of memory and use a slightly modified version of strncpy that returns a pointer to the end of the destination buffer.
// Copies src to dest and returns a pointer to the next available
// character in the dest buffer.
// Ensures that a null terminator is at the end of dest. If
// src is larger than size then size - 1 bytes are copied
char* StringCopyEnd( char* dest, char* src, size_t size )
{
size_t pos = 0;
if ( size == 0 ) return dest;
while ( pos < size - 1 && *src )
{
*dest = *src;
++dest;
++src;
++pos;
}
*dest = '\0';
return dest;
}
Note how you have to set the size parameter to be the number of bytes left until the end of the destination buffer.
Here's a sample test function:
// Requires <stdio.h>, <stdlib.h>, and <string.h>.
void testStringCopyEnd( char* str1, char* str2, size_t size )
{
    // Create an oversized buffer and fill it with A's so that
    // if a string is not null terminated it will be obvious.
    char* dest = (char*) malloc( size + 10 );
    if ( dest == NULL ) return;
    memset( dest, 'A', size + 10 );
    char* end = StringCopyEnd( dest, str1, size );
    end = StringCopyEnd( end, str2, size - ( end - dest ) );
    // strlen returns size_t, so print it with %zu rather than %d.
    printf( "length: %zu - '%s'\n", strlen( dest ), dest );
    free( dest );
}
int main( void )
{
    // Test with a large enough buffer size to concatenate 'Hello World',
    // and then reduce the buffer size from there.
    for ( int i = 12; i > 0; --i )
    {
        testStringCopyEnd( "Hello", " World", i );
    }
    return 0;
}
Which produces:
length: 11 - 'Hello World'
length: 10 - 'Hello Worl'
length: 9 - 'Hello Wor'
length: 8 - 'Hello Wo'
length: 7 - 'Hello W'
length: 6 - 'Hello '
length: 5 - 'Hello'
length: 4 - 'Hell'
length: 3 - 'Hel'
length: 2 - 'He'
length: 1 - 'H'
length: 0 - ''
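The measure-first approach described earlier in this answer (call strlen on every piece, allocate once, then copy) can be sketched as follows. concat_all is a hypothetical name, and plain memcpy stands in for the custom copy routine because the sizes are already known.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: concatenates count strings into one malloc'd
   buffer sized exactly for the result. Returns NULL if allocation fails.
   The caller is responsible for calling free() on the result. */
static char *concat_all(const char **parts, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += strlen(parts[i]);      /* first pass: measure */

    char *out = malloc(total + 1);
    if (out == NULL)
        return NULL;

    char *end = out;
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(parts[i]);
        memcpy(end, parts[i], len);     /* second pass: copy */
        end += len;
    }
    *end = '\0';
    return out;
}
```

Each source string is scanned twice, but the destination is never rescanned and no reallocation is needed.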
If operations like these are very frequent, you could implement them in your own buffer class. Example (error handling omitted for brevity ;-):
struct buff {
size_t used;
size_t size;
char *data;
} ;
struct buff * buff_new(size_t size)
{
struct buff *bp;
bp = malloc (sizeof *bp);
bp->data = malloc (size);
bp->size = size;
bp->used = 0;
return bp;
}
void buff_add_str(struct buff *bp, const char *add)
{
    size_t len;
    len = strlen(add);
    /* To be implemented: buff_resize() ... */
    if (bp->used + len + 1 >= bp->size) buff_resize(bp, bp->used + 1 + len);
    /* Copy the string together with its terminating '\0'. */
    memcpy(bp->data + bp->used, add, len + 1);
    bp->used += len;
    return;
}
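The answer leaves buff_resize() unimplemented. One plausible version is sketched here; the doubling growth policy, the initial size of 16, and the int return value are all assumptions rather than the author's design. The struct is repeated only to keep the block self-contained.

```c
#include <stdlib.h>

struct buff {
    size_t used;
    size_t size;
    char *data;
};

/* Hypothetical buff_resize(): makes sure bp->data can hold at least
   `need` bytes, doubling the capacity until it is large enough.
   Returns 0 on success and -1 on failure (bp is left unchanged). */
static int buff_resize(struct buff *bp, size_t need)
{
    size_t newsize = bp->size ? bp->size : 16;
    while (newsize < need) {
        if (newsize > (size_t)-1 / 2)
            return -1;              /* doubling would overflow */
        newsize *= 2;
    }
    char *p = realloc(bp->data, newsize);
    if (p == NULL)
        return -1;                  /* the old block is still valid */
    bp->data = p;
    bp->size = newsize;
    return 0;
}
```

Doubling keeps the number of reallocations logarithmic in the final length, which is roughly what Java's StringBuilder does internally.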
Given that the strings look so small, I'd be inclined just to use strcat and revisit if performance becomes an issue.
You could make your own method that remembers the string length so it doesn't need to iterate through the string to find the end (which is potentially the slow bit of strcat if you are doing lots of appends to long strings)

malloc double datatype with strlen

how do I allocate memory for strlen(esc) in a proper way? The temp and str are char datatypes.
double esc = t1.tv_sec+(t1.tv_usec/1000000.0);
strAll = malloc(strlen(temp) + strlen(str) + strlen(esc) + 1);
You cannot take strlen(esc). As I am sure the compiler has already told you, the argument to strlen() must be a char *, but you are passing it a double. Try first converting the double to an array of char with snprintf().
You can find the length you need using snprintf. Passing 0 as the size prevents it from writing any bytes, and it returns the number of bytes it would have needed (not counting the terminating null, hence the + 1 below).
size_t length = snprintf(0, 0, "%lf%s%lf", esc, temp, esc) + 1;
strAll = malloc(length);
snprintf(strAll, length, "%lf%s%lf", esc, temp, esc);
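Applied to the three values from the question (temp, str, then esc), the same two-pass idea might be wrapped up like this. join_with_double is a hypothetical name, and %f is just one possible choice of format for the double.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd string holding temp, str and
   the text form of esc, in that order, or NULL on failure.
   The caller is responsible for calling free() on the result. */
static char *join_with_double(const char *temp, const char *str, double esc)
{
    /* First pass: NULL and 0 make snprintf only report the length. */
    int n = snprintf(NULL, 0, "%s%s%f", temp, str, esc);
    if (n < 0)
        return NULL;
    char *out = malloc((size_t)n + 1);
    if (out != NULL)
        snprintf(out, (size_t)n + 1, "%s%s%f", temp, str, esc);
    return out;
}
```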
You'll need to convert esc to a string, probably with snprintf(). Then use the length from that in the malloc():
char buffer[32];
int n = snprintf(buffer, sizeof(buffer), "%.6f", esc);
if (n >= sizeof(buffer))
...handle overlong string problems (bail out)...
char *strAll = malloc(strlen(temp) + strlen(str) + n + 1);
if (strAll == 0)
...handle out of memory problem (bail out)...
sprintf(strAll, "%s%s%s", temp, str, buffer);
(I didn't check the length returned by sprintf() because 'it cannot go wrong'. You calculated the length of the component strings, and therefore, it will fill exactly the allocated space. If you do decide to check it, then preserve the length that is the argument to malloc() and test against that.)
Your code doesn't compile. strlen expects a string argument, that is, a pointer to a sequence of char (like an array).
Perhaps you want something like
char buf[30];
double esc = somedoublefunction();
snprintf (buf, sizeof(buf), "%f", esc);
return strdup(buf);
Of course, you should take care to free the resulting pointer later.
If you're just trying to save the raw data in one buffer, use one of the following as its size:
sizeof(esc) or sizeof(double)
If instead you want to turn esc into a string, I would suggest using a fixed-point format when converting, e.g. snprintf(buffer, sizeof buffer, "%.3f", esc);

The simplest way of printing a portion of a char[] in C

Let's say I have a char* str = "0123456789" and I want to cut the first and the last three letters and print just the middle, what is the simplest, and safest, way of doing it?
Now the trick: The portion to cut and the portion to print are of variable size, so I could have a very long char*, or a very small one.
You can use printf(), and a special format string:
char *str = "0123456789";
printf("%.6s\n", str + 1);
The precision in the %s conversion specifier specifies the maximum number of characters to print. You can use a variable to specify the precision at runtime as well:
int length = 6;
char *str = "0123456789";
printf("%.*s\n", length, str + 1);
In this example, the * is used to indicate that the next argument (length) will contain the precision for the %s conversion; the corresponding argument must be an int.
Pointer arithmetic can be used to specify the starting position as I did above.
[EDIT]
One more point: if your string is shorter than your precision specifier, fewer characters will be printed. For example:
int length = 10;
char *str = "0123456789";
printf("%.*s\n", length, str + 5);
Will print "56789". If you always want to print a certain number of characters, specify both a minimum field width and a precision:
printf("%10.10s\n", str + 5);
or
printf("%*.*s\n", length, length, str + 5);
which will print:
"     56789"
You can use the minus sign to left-justify the output in the field:
printf("%-10.10s\n", str + 5);
Finally, the minimum field width and the precision can be different, i.e.
printf("%8.5s\n", str);
will print at most 5 characters right-justified in an 8 character field.
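To check the padding behaviour without counting spaces on a terminal, the formatted text can be captured with snprintf(). The wrapper format_field below is a made-up name for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: writes at most prec characters of s,
   right-justified in a field of at least width characters, into out.
   Returns snprintf()'s result: the length the full text needs. */
static int format_field(char *out, size_t cap, const char *s,
                        int width, int prec)
{
    return snprintf(out, cap, "%*.*s", width, prec, s);
}
```

Calling it with "56789", width 10 and precision 10 yields five spaces followed by 56789; calling it with "0123456789", width 8 and precision 5 yields three spaces followed by 01234.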
Robert Gamble and Steve separately have most of the pieces.
Assembled into a whole:
void print_substring(const char *str, int skip, int tail)
{
int len = strlen(str);
assert(skip >= 0);
assert(tail >= 0 && tail < len);
assert(len > skip + tail);
printf("%.*s", len - skip - tail, str + skip);
}
Invocation for the example:
print_substring("0123456789", 1, 3);
If you don't mind modifying the data, you could just do some pointer arithmetic. This is assuming that str is a char pointer and not an array:
char string[] = "0123456789";
char *str = string;
str += 3; // "removes" the first 3 items
str[4] = '\0'; // sets the 5th item to the null character, effectively truncating the string
printf("%s", str); // prints "3456"; never pass data as the format string itself
Here is a clean and simple substring function I dug up from my personal library that may be useful:
char *
substr(const char *src, size_t start, size_t len)
{
char *dest = malloc(len+1);
if (dest) {
memcpy(dest, src+start, len);
dest[len] = '\0';
}
return dest;
}
It's probably self-explanatory but it takes a string, a starting position (starting at zero), and a length and returns a substring of the original string or a null pointer if malloc fails. The pointer returned can be free'd by the caller when the memory is no longer needed. In the spirit of C, the function doesn't validate the starting position and length provided.
I believe there is some magic you can do with printf that will print only a certain number of characters, but it's not commonly understood or used. We tried to do it at a previous job and couldn't get it to work consistently.
What I would do is save off a character, null that character in the string, print it, then restore the saved character.
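A sketch of that save, null, print, restore sequence follows. print_by_patching is a hypothetical name; the approach only works on writable memory (not string literals), and it assumes the caller guarantees that start + len does not exceed the string length.

```c
#include <stdio.h>

/* Hypothetical helper: prints len characters of str beginning at start
   by temporarily overwriting the character just past the slice.
   Returns the number of characters printed (negative on error). */
static int print_by_patching(char *str, size_t start, size_t len, FILE *out)
{
    char saved = str[start + len];      /* save off a character */
    str[start + len] = '\0';            /* null it to end the slice */
    int n = fprintf(out, "%s", str + start);
    str[start + len] = saved;           /* put the character back */
    return n;
}
```

With char s[] = "0123456789";, the call print_by_patching(s, 1, 6, stdout) prints "123456" and leaves s unchanged afterwards. Because the buffer is modified while printing, this is not safe if another thread reads the string at the same time.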

Resources