getline check if line is whitespace - c

Is there an easy way to check if a line is empty. So i want to check if it contains any white space such as \r\n\t and spaces.
Thanks

You can use the isspace function in a loop to check if all characters are whitespace:
int is_empty(const char *s) {
while (*s != '\0') {
if (!isspace((unsigned char)*s))
return 0;
s++;
}
return 1;
}
This function will return 0 if any character is not whitespace (i.e. line is not empty), or 1 otherwise.

If a string s consists only of white space characters then strspn(s, " \r\n\t") will return the length of the string. Therefore a simple way to check is strspn(s, " \r\n\t") == strlen(s) but this will traverse the string twice. You can also write a simple function that would traverse at the string only once:
bool isempty(const char *s)
{
while (*s) {
if (!isspace(*s))
return false;
s++;
}
return true;
}

I won't check for '\0' since '\0' is not space and the loop will end there.
int is_empty(const char *s) {
while ( isspace( (unsigned char)*s) )
s++;
return *s == '\0' ? 1 : 0;
}

Given a char *x=" "; here is what I can suggest:
bool onlyspaces = true;
for(char *y = x; *y != '\0'; ++y)
{
if(*y != '\n') if(*y != '\t') if(*y != '\r') if(*y != ' ') { onlyspaces = false; break; }
}

Consider the following example:
#include <iostream>
#include <ctype.h>
bool is_blank(const char* c)
{
while (*c)
{
if (!isspace(*c))
return false;
c++;
}
return false;
}
int main ()
{
char name[256];
std::cout << "Enter your name: ";
std::cin.getline (name,256);
if (is_blank(name))
std::cout << "No name was given." << std:.endl;
return 0;
}

My suggestion would be:
int is_empty(const char *s)
{
while ( isspace(*s) && s++ );
return !*s;
}
with a working example.
Loops over the characters of the string and stops when
either a non-space character was found,
or nul character was visited.
Where the string pointer has stopped, check if the contains of the string is the nul character.
In matter of complexity, it's linear with O(n), where n the size of the input string.

For C++11 you can check is a string is whitespace using std::all_of and isspace (isspace checks for spaces, tabs, newline, vertical tab, feed and carriage return:
std::string str = " ";
std::all_of(str.begin(), str.end(), isspace); //this returns true in this case
if you really only want to check for the character space then:
std::all_of(str.begin(), str.end(), [](const char& c) { return c == ' '; });

This can be done with strspn in one pass (just bool expression):
char *s;
...
( s[ strspn(s, " \r\n\t") ] == '\0' )

You can use sscanf to look for a non-whitespace string of length 1. sscanf will then return -1 if it only finds whitespace.
char test_string[] = " \t\r\n"; // 'blank' example string to be tested
char dummy_string[2]; // holds the first char if it exists
bool isStringOnlyWhiteSpace = (-1 == sscanf(test_string, "%1s", dummy_string));

Related

Deleting extra spaces in a string - C [duplicate]

Is there a clean, preferably standard method of trimming leading and trailing whitespace from a string in C? I'd roll my own, but I would think this is a common problem with an equally common solution.
If you can modify the string:
// Note: This function returns a pointer to a substring of the original string.
// If the given string was allocated dynamically, the caller must not overwrite
// that pointer with the returned value, since the original pointer must be
// deallocated using the same allocator with which it was allocated. The return
// value must NOT be deallocated using free() etc.
char *trimwhitespace(char *str)
{
char *end;
// Trim leading space
while(isspace((unsigned char)*str)) str++;
if(*str == 0) // All spaces?
return str;
// Trim trailing space
end = str + strlen(str) - 1;
while(end > str && isspace((unsigned char)*end)) end--;
// Write new null terminator character
end[1] = '\0';
return str;
}
If you can't modify the string, then you can use basically the same method:
// Stores the trimmed input string into the given output buffer, which must be
// large enough to store the result. If it is too small, the output is
// truncated.
size_t trimwhitespace(char *out, size_t len, const char *str)
{
if(len == 0)
return 0;
const char *end;
size_t out_size;
// Trim leading space
while(isspace((unsigned char)*str)) str++;
if(*str == 0) // All spaces?
{
*out = 0;
return 1;
}
// Trim trailing space
end = str + strlen(str) - 1;
while(end > str && isspace((unsigned char)*end)) end--;
end++;
// Set output size to minimum of trimmed string length and buffer size minus 1
out_size = (end - str) < len-1 ? (end - str) : len-1;
// Copy trimmed string and add null terminator
memcpy(out, str, out_size);
out[out_size] = 0;
return out_size;
}
Here's one that shifts the string into the first position of your buffer. You might want this behavior so that if you dynamically allocated the string, you can still free it on the same pointer that trim() returns:
char *trim(char *str)
{
size_t len = 0;
char *frontp = str;
char *endp = NULL;
if( str == NULL ) { return NULL; }
if( str[0] == '\0' ) { return str; }
len = strlen(str);
endp = str + len;
/* Move the front and back pointers to address the first non-whitespace
* characters from each end.
*/
while( isspace((unsigned char) *frontp) ) { ++frontp; }
if( endp != frontp )
{
while( isspace((unsigned char) *(--endp)) && endp != frontp ) {}
}
if( frontp != str && endp == frontp )
*str = '\0';
else if( str + len - 1 != endp )
*(endp + 1) = '\0';
/* Shift the string so that it starts at str so that if it's dynamically
* allocated, we can still free it on the returned pointer. Note the reuse
* of endp to mean the front of the string buffer now.
*/
endp = str;
if( frontp != str )
{
while( *frontp ) { *endp++ = *frontp++; }
*endp = '\0';
}
return str;
}
Test for correctness:
#include <stdio.h>
#include <string.h>
#include <ctype.h>
/* Paste function from above here. */
int main()
{
/* The test prints the following:
[nothing to trim] -> [nothing to trim]
[ trim the front] -> [trim the front]
[trim the back ] -> [trim the back]
[ trim front and back ] -> [trim front and back]
[ trim one char front and back ] -> [trim one char front and back]
[ trim one char front] -> [trim one char front]
[trim one char back ] -> [trim one char back]
[ ] -> []
[ ] -> []
[a] -> [a]
[] -> []
*/
char *sample_strings[] =
{
"nothing to trim",
" trim the front",
"trim the back ",
" trim front and back ",
" trim one char front and back ",
" trim one char front",
"trim one char back ",
" ",
" ",
"a",
"",
NULL
};
char test_buffer[64];
char comparison_buffer[64];
size_t index, compare_pos;
for( index = 0; sample_strings[index] != NULL; ++index )
{
// Fill buffer with known value to verify we do not write past the end of the string.
memset( test_buffer, 0xCC, sizeof(test_buffer) );
strcpy( test_buffer, sample_strings[index] );
memcpy( comparison_buffer, test_buffer, sizeof(comparison_buffer));
printf("[%s] -> [%s]\n", sample_strings[index],
trim(test_buffer));
for( compare_pos = strlen(comparison_buffer);
compare_pos < sizeof(comparison_buffer);
++compare_pos )
{
if( test_buffer[compare_pos] != comparison_buffer[compare_pos] )
{
printf("Unexpected change to buffer # index %u: %02x (expected %02x)\n",
compare_pos, (unsigned char) test_buffer[compare_pos], (unsigned char) comparison_buffer[compare_pos]);
}
}
}
return 0;
}
Source file was trim.c. Compiled with 'cc -Wall trim.c -o trim'.
My solution. String must be changeable. The advantage above some of the other solutions that it moves the non-space part to the beginning so you can keep using the old pointer, in case you have to free() it later.
void trim(char * s) {
char * p = s;
int l = strlen(p);
while(isspace(p[l - 1])) p[--l] = 0;
while(* p && isspace(* p)) ++p, --l;
memmove(s, p, l + 1);
}
This version creates a copy of the string with strndup() instead of editing it in place. strndup() requires _GNU_SOURCE, so maybe you need to make your own strndup() with malloc() and strncpy().
char * trim(char * s) {
int l = strlen(s);
while(isspace(s[l - 1])) --l;
while(* s && isspace(* s)) ++s, --l;
return strndup(s, l);
}
Here's my C mini library for trimming left, right, both, all, in place and separate, and trimming a set of specified characters (or white space by default).
contents of strlib.h:
#ifndef STRLIB_H_
#define STRLIB_H_ 1
enum strtrim_mode_t {
STRLIB_MODE_ALL = 0,
STRLIB_MODE_RIGHT = 0x01,
STRLIB_MODE_LEFT = 0x02,
STRLIB_MODE_BOTH = 0x03
};
char *strcpytrim(char *d, // destination
char *s, // source
int mode,
char *delim
);
char *strtriml(char *d, char *s);
char *strtrimr(char *d, char *s);
char *strtrim(char *d, char *s);
char *strkill(char *d, char *s);
char *triml(char *s);
char *trimr(char *s);
char *trim(char *s);
char *kill(char *s);
#endif
contents of strlib.c:
#include <strlib.h>
char *strcpytrim(char *d, // destination
char *s, // source
int mode,
char *delim
) {
char *o = d; // save orig
char *e = 0; // end space ptr.
char dtab[256] = {0};
if (!s || !d) return 0;
if (!delim) delim = " \t\n\f";
while (*delim)
dtab[*delim++] = 1;
while ( (*d = *s++) != 0 ) {
if (!dtab[0xFF & (unsigned int)*d]) { // Not a match char
e = 0; // Reset end pointer
} else {
if (!e) e = d; // Found first match.
if ( mode == STRLIB_MODE_ALL || ((mode != STRLIB_MODE_RIGHT) && (d == o)) )
continue;
}
d++;
}
if (mode != STRLIB_MODE_LEFT && e) { // for everything but trim_left, delete trailing matches.
*e = 0;
}
return o;
}
// perhaps these could be inlined in strlib.h
char *strtriml(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_LEFT, 0); }
char *strtrimr(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_RIGHT, 0); }
char *strtrim(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_BOTH, 0); }
char *strkill(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_ALL, 0); }
char *triml(char *s) { return strcpytrim(s, s, STRLIB_MODE_LEFT, 0); }
char *trimr(char *s) { return strcpytrim(s, s, STRLIB_MODE_RIGHT, 0); }
char *trim(char *s) { return strcpytrim(s, s, STRLIB_MODE_BOTH, 0); }
char *kill(char *s) { return strcpytrim(s, s, STRLIB_MODE_ALL, 0); }
The one main routine does it all.
It trims in place if src == dst, otherwise,
it works like the strcpy routines.
It trims a set of characters specified in the string delim, or white space if null.
It trims left, right, both, and all (like tr).
There is not much to it, and it iterates over the string only once. Some folks might complain that trim right starts on the left, however, no strlen is needed which starts on the left anyway. (One way or another you have to get to the end of the string for right trims, so you might as well do the work as you go.) There may be arguments to be made about pipelining and cache sizes and such -- who knows. Since the solution works from left to right and iterates only once, it can be expanded to work on streams as well. Limitations: it does not work on unicode strings.
Here is my attempt at a simple, yet correct in-place trim function.
void trim(char *str)
{
int i;
int begin = 0;
int end = strlen(str) - 1;
while (isspace((unsigned char) str[begin]))
begin++;
while ((end >= begin) && isspace((unsigned char) str[end]))
end--;
// Shift all characters back to the start of the string array.
for (i = begin; i <= end; i++)
str[i - begin] = str[i];
str[i - begin] = '\0'; // Null terminate string.
}
Late to the trim party
Features:
1. Trim the beginning quickly, as in a number of other answers.
2. After going to the end, trimming the right with only 1 test per loop. Like #jfm3, but works for an all white-space string)
3. To avoid undefined behavior when char is a signed char, cast *s to unsigned char.
Character handling "In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined." C11 §7.4 1
#include <ctype.h>
// Return a pointer to the trimmed string
char *string_trim_inplace(char *s) {
while (isspace((unsigned char) *s)) s++;
if (*s) {
char *p = s;
while (*p) p++;
while (isspace((unsigned char) *(--p)));
p[1] = '\0';
}
// If desired, shift the trimmed string
return s;
}
#chqrlie commented the above does not shift the trimmed string. To do so....
// Return a pointer to the (shifted) trimmed string
char *string_trim_inplace(char *s) {
char *original = s;
size_t len = 0;
while (isspace((unsigned char) *s)) {
s++;
}
if (*s) {
char *p = s;
while (*p) p++;
while (isspace((unsigned char) *(--p)));
p[1] = '\0';
// len = (size_t) (p - s); // older errant code
len = (size_t) (p - s + 1); // Thanks to #theriver
}
return (s == original) ? s : memmove(original, s, len + 1);
}
Here's a solution similar to #adam-rosenfields in-place modification routine but without needlessly resorting to strlen(). Like #jkramer, the string is left-adjusted within the buffer so you can free the same pointer. Not optimal for large strings since it does not use memmove. Includes the ++/-- operators that #jfm3 mentions. FCTX-based unit tests included.
#include <ctype.h>
void trim(char * const a)
{
char *p = a, *q = a;
while (isspace(*q)) ++q;
while (*q) *p++ = *q++;
*p = '\0';
while (p > a && isspace(*--p)) *p = '\0';
}
/* See http://fctx.wildbearsoftware.com/ */
#include "fct.h"
FCT_BGN()
{
FCT_QTEST_BGN(trim)
{
{ char s[] = ""; trim(s); fct_chk_eq_str("", s); } // Trivial
{ char s[] = " "; trim(s); fct_chk_eq_str("", s); } // Trivial
{ char s[] = "\t"; trim(s); fct_chk_eq_str("", s); } // Trivial
{ char s[] = "a"; trim(s); fct_chk_eq_str("a", s); } // NOP
{ char s[] = "abc"; trim(s); fct_chk_eq_str("abc", s); } // NOP
{ char s[] = " a"; trim(s); fct_chk_eq_str("a", s); } // Leading
{ char s[] = " a c"; trim(s); fct_chk_eq_str("a c", s); } // Leading
{ char s[] = "a "; trim(s); fct_chk_eq_str("a", s); } // Trailing
{ char s[] = "a c "; trim(s); fct_chk_eq_str("a c", s); } // Trailing
{ char s[] = " a "; trim(s); fct_chk_eq_str("a", s); } // Both
{ char s[] = " a c "; trim(s); fct_chk_eq_str("a c", s); } // Both
// Villemoes pointed out an edge case that corrupted memory. Thank you.
// http://stackoverflow.com/questions/122616/#comment23332594_4505533
{
char s[] = "a "; // Buffer with whitespace before s + 2
trim(s + 2); // Trim " " containing only whitespace
fct_chk_eq_str("", s + 2); // Ensure correct result from the trim
fct_chk_eq_str("a ", s); // Ensure preceding buffer not mutated
}
// doukremt suggested I investigate this test case but
// did not indicate the specific behavior that was objectionable.
// http://stackoverflow.com/posts/comments/33571430
{
char s[] = " foobar"; // Shifted across whitespace
trim(s); // Trim
fct_chk_eq_str("foobar", s); // Leading string is correct
// Here is what the algorithm produces:
char r[16] = { 'f', 'o', 'o', 'b', 'a', 'r', '\0', ' ',
' ', 'f', 'o', 'o', 'b', 'a', 'r', '\0'};
fct_chk_eq_int(0, memcmp(s, r, sizeof(s)));
}
}
FCT_QTEST_END();
}
FCT_END();
I'm not sure what you consider "painless."
C strings are pretty painful. We can find the first non-whitespace character position trivially:
while (isspace(* p)) p++;
We can find the last non-whitespace character position with two similar trivial moves:
while (* q) q++;
do { q--; } while (isspace(* q));
(I have spared you the pain of using the * and ++ operators at the same time.)
The question now is what do you do with this? The datatype at hand isn't really a big robust abstract String that is easy to think about, but instead really barely any more than an array of storage bytes. Lacking a robust data type, it is impossible to write a function that will do the same as PHperytonby's chomp function. What would such a function in C return?
Another one, with one line doing the real job:
#include <stdio.h>
int main()
{
const char *target = " haha ";
char buf[256];
sscanf(target, "%s", buf); // Trimming on both sides occurs here
printf("<%s>\n", buf);
}
If you're using glib, then you can use g_strstrip
I didn't like most of these answers because they did one or more of the following...
Returned a different pointer inside the original pointer's string (kind of a pain to juggle two different pointers to the same thing).
Made gratuitous use of things like strlen() that pre-iterate the entire string.
Used non-portable OS-specific lib functions.
Backscanned.
Used comparison to ' ' instead of isspace() so that TAB / CR / LF are preserved.
Wasted memory with large static buffers.
Wasted cycles with high-cost functions like sscanf/sprintf.
Here is my version:
void fnStrTrimInPlace(char *szWrite) {
const char *szWriteOrig = szWrite;
char *szLastSpace = szWrite, *szRead = szWrite;
int bNotSpace;
// SHIFT STRING, STARTING AT FIRST NON-SPACE CHAR, LEFTMOST
while( *szRead != '\0' ) {
bNotSpace = !isspace((unsigned char)(*szRead));
if( (szWrite != szWriteOrig) || bNotSpace ) {
*szWrite = *szRead;
szWrite++;
// TRACK POINTER TO LAST NON-SPACE
if( bNotSpace )
szLastSpace = szWrite;
}
szRead++;
}
// TERMINATE AFTER LAST NON-SPACE (OR BEGINNING IF THERE WAS NO NON-SPACE)
*szLastSpace = '\0';
}
Use a string library, for instance:
Ustr *s1 = USTR1(\7, " 12345 ");
ustr_sc_trim_cstr(&s1, " ");
assert(ustr_cmp_cstr_eq(s1, "12345"));
...as you say this is a "common" problem, yes you need to include a #include or so and it's not included in libc but don't go inventing your own hack job storing random pointers and size_t's that way only leads to buffer overflows.
A bit late to the game, but I'll throw my routines into the fray. They're probably not the most absolute efficient, but I believe they're correct and they're simple (with rtrim() pushing the complexity envelope):
#include <ctype.h>
#include <string.h>
/*
Public domain implementations of in-place string trim functions
Michael Burr
michael.burr#nth-element.com
2010
*/
char* ltrim(char* s)
{
char* newstart = s;
while (isspace( *newstart)) {
++newstart;
}
// newstart points to first non-whitespace char (which might be '\0')
memmove( s, newstart, strlen( newstart) + 1); // don't forget to move the '\0' terminator
return s;
}
char* rtrim( char* s)
{
char* end = s + strlen( s);
// find the last non-whitespace character
while ((end != s) && isspace( *(end-1))) {
--end;
}
// at this point either (end == s) and s is either empty or all whitespace
// so it needs to be made empty, or
// end points just past the last non-whitespace character (it might point
// at the '\0' terminator, in which case there's no problem writing
// another there).
*end = '\0';
return s;
}
char* trim( char* s)
{
return rtrim( ltrim( s));
}
Very late to the party...
Single-pass forward-scanning solution with no backtracking. Every character in the source string is tested exactly once twice. (So it should be faster than most of the other solutions here, especially if the source string has a lot of trailing spaces.)
This includes two solutions, one to copy and trim a source string into another destination string, and the other to trim the source string in place. Both functions use the same code.
The (modifiable) string is moved in-place, so the original pointer to it remains unchanged.
#include <stddef.h>
#include <ctype.h>
char * trim2(char *d, const char *s)
{
// Sanity checks
if (s == NULL || d == NULL)
return NULL;
// Skip leading spaces
const unsigned char * p = (const unsigned char *)s;
while (isspace(*p))
p++;
// Copy the string
unsigned char * dst = (unsigned char *)d; // d and s can be the same
unsigned char * end = dst;
while (*p != '\0')
{
if (!isspace(*dst++ = *p++))
end = dst;
}
// Truncate trailing spaces
*end = '\0';
return d;
}
char * trim(char *s)
{
return trim2(s, s);
}
Just to keep this growing, one more option with a modifiable string:
void trimString(char *string)
{
size_t i = 0, j = strlen(string);
while (j > 0 && isspace((unsigned char)string[j - 1])) string[--j] = '\0';
while (isspace((unsigned char)string[i])) i++;
if (i > 0) memmove(string, string + i, j - i + 1);
}
I know there have many answers, but I post my answer here to see if my solution is good enough.
// Trims leading whitespace chars in left `str`, then copy at almost `n - 1` chars
// into the `out` buffer in which copying might stop when the first '\0' occurs,
// and finally append '\0' to the position of the last non-trailing whitespace char.
// Reture the length the trimed string which '\0' is not count in like strlen().
size_t trim(char *out, size_t n, const char *str)
{
// do nothing
if(n == 0) return 0;
// ptr stop at the first non-leading space char
while(isspace(*str)) str++;
if(*str == '\0') {
out[0] = '\0';
return 0;
}
size_t i = 0;
// copy char to out until '\0' or i == n - 1
for(i = 0; i < n - 1 && *str != '\0'; i++){
out[i] = *str++;
}
// deal with the trailing space
while(isspace(out[--i]));
out[++i] = '\0';
return i;
}
The easiest way to skip leading spaces in a string is, imho,
#include <stdio.h>
int main()
{
char *foo=" teststring ";
char *bar;
sscanf(foo,"%s",bar);
printf("String is >%s<\n",bar);
return 0;
}
Ok this is my take on the question. I believe it's the most concise solution that modifies the string in place (free will work) and avoids any UB. For small strings, it's probably faster than a solution involving memmove.
void stripWS_LT(char *str)
{
char *a = str, *b = str;
while (isspace((unsigned char)*a)) a++;
while (*b = *a++) b++;
while (b > str && isspace((unsigned char)*--b)) *b = 0;
}
#include <ctype.h>
#include <string.h>
char *trim_space(char *in)
{
char *out = NULL;
int len;
if (in) {
len = strlen(in);
while(len && isspace(in[len - 1])) --len;
while(len && *in && isspace(*in)) ++in, --len;
if (len) {
out = strndup(in, len);
}
}
return out;
}
isspace helps to trim all white spaces.
Run a first loop to check from last byte for space character and reduce the length variable
Run a second loop to check from first byte for space character and reduce the length variable and increment char pointer.
Finally if length variable is more than 0, then use strndup to create new string buffer by excluding spaces.
This one is short and simple, uses for-loops and doesn't overwrite the string boundaries.
You can replace the test with isspace() if needed.
void trim (char *s) // trim leading and trailing spaces+tabs
{
int i,j,k, len;
j=k=0;
len = strlen(s);
// find start of string
for (i=0; i<len; i++) if ((s[i]!=32) && (s[i]!=9)) { j=i; break; }
// find end of string+1
for (i=len-1; i>=j; i--) if ((s[i]!=32) && (s[i]!=9)) { k=i+1; break;}
if (k<=j) {s[0]=0; return;} // all whitespace (j==k==0)
len=k-j;
for (i=0; i<len; i++) s[i] = s[j++]; // shift result to start of string
s[i]=0; // end the string
}//_trim
If, and ONLY IF there's only one contiguous block of text between whitespace, you can use a single call to strtok(3), like so:
char *trimmed = strtok(input, "\r\t\n ");
This works for strings like the following:
" +1.123.456.7890 "
" 01-01-2020\n"
"\t2.523"
This will not work for strings that contain whitespace between blocks of non-whitespace, like " hi there ". It's probably better to avoid this approach, but now it's here in your toolbox if you need it.
Personally, I'd roll my own. You can use strtok, but you need to take care with doing so (particularly if you're removing leading characters) that you know what memory is what.
Getting rid of trailing spaces is easy, and pretty safe, as you can just put a 0 in over the top of the last space, counting back from the end. Getting rid of leading spaces means moving things around. If you want to do it in place (probably sensible) you can just keep shifting everything back one character until there's no leading space. Or, to be more efficient, you could find the index of the first non-space character, and shift everything back by that number. Or, you could just use a pointer to the first non-space character (but then you need to be careful in the same way as you do with strtok).
#include "stdafx.h"
#include "malloc.h"
#include "string.h"
int main(int argc, char* argv[])
{
char *ptr = (char*)malloc(sizeof(char)*30);
strcpy(ptr," Hel lo wo rl d G eo rocks!!! by shahil sucks b i g tim e");
int i = 0, j = 0;
while(ptr[j]!='\0')
{
if(ptr[j] == ' ' )
{
j++;
ptr[i] = ptr[j];
}
else
{
i++;
j++;
ptr[i] = ptr[j];
}
}
printf("\noutput-%s\n",ptr);
return 0;
}
Most of the answers so far do one of the following:
Backtrack at the end of the string (i.e. find the end of the string and then seek backwards until a non-space character is found,) or
Call strlen() first, making a second pass through the whole string.
This version makes only one pass and does not backtrack. Hence it may perform better than the others, though only if it is common to have hundreds of trailing spaces (which is not unusual when dealing with the output of a SQL query.)
static char const WHITESPACE[] = " \t\n\r";
static void get_trim_bounds(char const *s,
char const **firstWord,
char const **trailingSpace)
{
char const *lastWord;
*firstWord = lastWord = s + strspn(s, WHITESPACE);
do
{
*trailingSpace = lastWord + strcspn(lastWord, WHITESPACE);
lastWord = *trailingSpace + strspn(*trailingSpace, WHITESPACE);
}
while (*lastWord != '\0');
}
char *copy_trim(char const *s)
{
char const *firstWord, *trailingSpace;
char *result;
size_t newLength;
get_trim_bounds(s, &firstWord, &trailingSpace);
newLength = trailingSpace - firstWord;
result = malloc(newLength + 1);
memcpy(result, firstWord, newLength);
result[newLength] = '\0';
return result;
}
void inplace_trim(char *s)
{
char const *firstWord, *trailingSpace;
size_t newLength;
get_trim_bounds(s, &firstWord, &trailingSpace);
newLength = trailingSpace - firstWord;
memmove(s, firstWord, newLength);
s[newLength] = '\0';
}
This is the shortest possible implementation I can think of:
static const char *WhiteSpace=" \n\r\t";
char* trim(char *t)
{
char *e=t+(t!=NULL?strlen(t):0); // *e initially points to end of string
if (t==NULL) return;
do --e; while (strchr(WhiteSpace, *e) && e>=t); // Find last char that is not \r\n\t
*(++e)=0; // Null-terminate
e=t+strspn (t,WhiteSpace); // Find first char that is not \t
return e>t?memmove(t,e,strlen(e)+1):t; // memmove string contents and terminator
}
These functions will modify the original buffer, so if dynamically allocated, the original
pointer can be freed.
#include <string.h>
void rstrip(char *string)
{
int l;
if (!string)
return;
l = strlen(string) - 1;
while (isspace(string[l]) && l >= 0)
string[l--] = 0;
}
void lstrip(char *string)
{
int i, l;
if (!string)
return;
l = strlen(string);
while (isspace(string[(i = 0)]))
while(i++ < l)
string[i-1] = string[i];
}
void strip(char *string)
{
lstrip(string);
rstrip(string);
}
What do you think about using StrTrim function defined in header Shlwapi.h.? It is straight forward rather defining on your own.
Details can be found on:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb773454(v=vs.85).aspx
If you have
char ausCaptain[]="GeorgeBailey ";
StrTrim(ausCaptain," ");
This will give ausCaptain as "GeorgeBailey" not "GeorgeBailey ".
To trim my strings from the both sides I use the oldie but the gooody ;)
It can trim anything with ascii less than a space, meaning that the control chars will be trimmed also !
char *trimAll(char *strData)
{
unsigned int L = strlen(strData);
if(L > 0){ L--; }else{ return strData; }
size_t S = 0, E = L;
while((!(strData[S] > ' ') || !(strData[E] > ' ')) && (S >= 0) && (S <= L) && (E >= 0) && (E <= L))
{
if(strData[S] <= ' '){ S++; }
if(strData[E] <= ' '){ E--; }
}
if(S == 0 && E == L){ return strData; } // Nothing to be done
if((S >= 0) && (S <= L) && (E >= 0) && (E <= L)){
L = E - S + 1;
memmove(strData,&strData[S],L); strData[L] = '\0';
}else{ strData[0] = '\0'; }
return strData;
}
I'm only including code because the code posted so far seems suboptimal (and I don't have the rep to comment yet.)
void inplace_trim(char* s)
{
int start, end = strlen(s);
for (start = 0; isspace(s[start]); ++start) {}
if (s[start]) {
while (end > 0 && isspace(s[end-1]))
--end;
memmove(s, &s[start], end - start);
}
s[end - start] = '\0';
}
char* copy_trim(const char* s)
{
int start, end;
for (start = 0; isspace(s[start]); ++start) {}
for (end = strlen(s); end > 0 && isspace(s[end-1]); --end) {}
return strndup(s + start, end - start);
}
strndup() is a GNU extension. If you don't have it or something equivalent, roll your own. For example:
r = strdup(s + start);
r[end-start] = '\0';
Here i use the dynamic memory allocation to trim the input string to the function trimStr. First, we find how many non-empty characters exist in the input string. Then, we allocate a character array with that size and taking care of the null terminated character. When we use this function, we need to free the memory inside of main function.
#include<stdio.h>
#include<stdlib.h>
char *trimStr(char *str){
char *tmp = str;
printf("input string %s\n",str);
int nc = 0;
while(*tmp!='\0'){
if (*tmp != ' '){
nc++;
}
tmp++;
}
printf("total nonempty characters are %d\n",nc);
char *trim = NULL;
trim = malloc(sizeof(char)*(nc+1));
if (trim == NULL) return NULL;
tmp = str;
int ne = 0;
while(*tmp!='\0'){
if (*tmp != ' '){
trim[ne] = *tmp;
ne++;
}
tmp++;
}
trim[nc] = '\0';
printf("trimmed string is %s\n",trim);
return trim;
}
int main(void){
char str[] = " s ta ck ove r fl o w ";
char *trim = trimStr(str);
if (trim != NULL )free(trim);
return 0;
}

Making myfgets using char pointer as an input

char *myfgets(char *s, int n, FILE *in) {
char ch;
if (n<=0) {
return s;
}
while (((ch = getc(in)) != '\n')) {
if(ch == EOF){
return NULL;
}
else if (ch == '\n') {
break;
} else {
*s=ch;
s++;
}
}
*s = '\n';
if (ferror(in) != 0) {
return NULL;
} else {
return s;
}
if (strlen(s) > 512) {
return NULL;
}
}
I want to take 1 line only from some specific file and put the line into the char pointer (*s). I made this function but when I run the program it didn't work really well. For the result, there are a lot of unknown characters that are displayed.
Is there anything wrong with my code?
From the man page of strlen():
The strlen() function calculates the length of the string pointed to by s, excluding the terminating null byte ('\0').
So, strlen counts the number of characters until it finds a null byte '\0', and not the newline character '\n'. Consider changing
*s = '\n'
to
*s = '\0'
Alternatively, you could write your own strlen that looks for '\n' instead of '\0'.

Remove punctuation at beginning and end of a string

I have a string and I want to remove all the punctuation from the beginning and the end of it only, but not the middle.
I have wrote a code to remove the punctuation from the first and last character of a string only, which is clearly very inefficient and useless if a string has 2 or more punctuations at the end.
Here is an example:
{ Hello ""I am:: a Str-ing!! }
Desired output
{ Hello I am a Str-ing }
Are there any functions that I could use? Thanks.
This is what I've done so far. I'm actually editing the string in a linked-list
if(ispunct(removeend->string[(strlen(removeend->string))-1]) != 0) {
removeend->string[(strlen(removeend->string))-1] = '\0';
}
else {}
Iterate over the string, use isalpha() to check each character, write the characters which pass into a new string.
char *rm_punct(char *str) {
char *h = str;
char *t = str + strlen(str) - 1;
while (ispunct(*p)) p++;
while (ispunct(*t) && p < t) { *t = 0; t--; }
/* also if you want to preserve the original address */
{ int i;
for (i = 0; i <= t - p + 1; i++) {
str[i] = p[i];
} p = str; } /* --- */
return p;
}
Iterate over the string, use isalpha() to check each character, after the first character that passes start writing into a new string.
Iterate over the new string backwards, replace all punctuation with \0 until you find a character which isn't punctuation.
#include <stdio.h>
#include <ctype.h>
#include <string.h>
char* trim_ispunct(char* str){
int i ;
char* p;
if(str == NULL || *str == '\0') return str;
for(i=strlen(str)-1; ispunct(str[i]);--i)
str[i]='\0';
for(p=str;ispunct(*p);++p);
return strcpy(str, p);
}
int main(){
//test
char str[][16] = { "Hello", "\"\"I", "am::", "a", "Str-ing!!" };
int i, size = sizeof(str)/sizeof(str[0]);
for(i = 0;i<size;++i)
printf("%s\n", trim_ispunct(str[i]));
return 0;
}
/* result:
Hello
I
am
a
Str-ing
*/
Ok, in a while iteration, call multiple times the strtok function to separate each single string by the character (white space). You could also use sscanf instead of strtok.
Then, for each string, you have to do a for cycle, but beginning from the end of the string up to the beginning.As soon as you encounter !isalpha(current character) put a \0 in the current string position. You have eliminated the tail's punctuation chars.
Now, do another for cycle on the same string. Now from 0 to strlen(currentstring). While is !isalpha(current character) continue. If isalpha put the current character in in a buffer and all the remaining characters. The buffer is the cleaned string. Copy it into the original string.
Repeat the above two steps for the others strtok's outputs. End.
Construct a tiny state machine. The cha2class() function divides the characters into equivalence classes. The state machine will always skip punctuation, except when it has alphanumeric characters on the left and the right; in that case it will be preserved. (that is the memmove() in state 3)
#include <stdio.h>
#include <string.h>
#define IS_ALPHA 1
#define IS_WHITE 2
#define IS_PUNCT 3
int cha2class(int ch);
void scrutinize(char *str);
int cha2class(int ch)
{
if (ch >= 'a' && ch <= 'z') return IS_ALPHA;
if (ch >= 'A' && ch <= 'Z') return IS_ALPHA;
if (ch == ' ' || ch == '\t') return IS_WHITE;
if (ch == EOF || ch == 0) return IS_WHITE;
return IS_PUNCT;
}
void scrutinize(char *str)
{
size_t pos,dst,start;
int typ, state ;
state = 0;
for (dst = pos = start=0; ; pos++) {
typ = cha2class(str[pos]);
switch(state) {
case 0: /* BOF, white seen */
if (typ==IS_WHITE) break;
else if (typ==IS_ALPHA) { start = pos; state =1; }
else if (typ==IS_PUNCT) { start = pos; state =2; continue;}
break;
case 1: /* inside a word */
if (typ==IS_ALPHA) break;
else if (typ==IS_WHITE) { state=0; }
else if (typ==IS_PUNCT) { start = pos; state =3;continue; }
break;
case 2: /* inside punctuation after whitespace: skip it */
if (typ==IS_PUNCT) continue;
else if (typ==IS_WHITE) { state=0; }
else if (typ==IS_ALPHA) {state=1; }
break;
case 3: /* inside punctuation after a word */
if (typ==IS_PUNCT) continue;
else if (typ==IS_WHITE) { state=0; }
else if (typ==IS_ALPHA) {
memmove(str+dst, str+start, pos-start); dst += pos-start;
state =1; }
break;
}
str[dst++] = str[pos];
if (str[pos] == '\0') break;
}
}
int main (int argc, char **argv)
{
char test[] = ".This! is... ???a.string?" ;
scrutinize(test);
printf("Result=%s\n", test);
return 0;
}
int main (int argc, char **argv)
{
char test[] = ".This! is... ???a.string?" ;
scrutinize(test);
printf("Result=%s\n", test);
return 0;
}
OUTPUT:
Result=This is a.string

Trim a string in C [duplicate]

This question already has answers here:
How do I trim leading/trailing whitespace in a standard way?
(40 answers)
Closed 5 years ago.
Briefly:
I'm after the equivalent of .NET's String.Trim in C using the win32 and standard C api (compiling with MSVC2008 so I have access to all the C++ stuff if needed, but I am just trying to trim a char*).
Given that there is strchr, strtok, and all manner of other string functions, surely there should be a trim function, or one that can be repurposed...
Thanks
There is no standard library function to do this, but it's not too hard to roll your own. There is an existing question on SO about doing this that was answered with source code.
This made me want to write my own - I didn't like the ones that had been provided. Seems to me there should be 3 functions.
char *ltrim(char *s)
{
while(isspace(*s)) s++;
return s;
}
char *rtrim(char *s)
{
char* back = s + strlen(s);
while(isspace(*--back));
*(back+1) = '\0';
return s;
}
char *trim(char *s)
{
return rtrim(ltrim(s));
}
You can use the standard isspace() function in ctype.h to achieve this. Simply compare the beginning and end characters of your character array until both ends no longer have spaces.
"spaces" include:
' ' (0x20) space (SPC)
'\t' (0x09) horizontal tab (TAB)
'\n' (0x0a) newline (LF)
'\v' (0x0b) vertical tab (VT)
'\f' (0x0c) feed (FF)
'\r' (0x0d) carriage return (CR)
although there is no function which will do all of the work for you, you will have to roll your own solution to compare each side of the given character array repeatedly until no spaces remain.
Edit:
Since you have access to C++, Boost has a trim implementation waiting for you to make your life a lot easier.
Surprised to see such implementations. I usually do trim like this:
char *trim(char *s) {
char *ptr;
if (!s)
return NULL; // handle NULL string
if (!*s)
return s; // handle empty string
for (ptr = s + strlen(s) - 1; (ptr >= s) && isspace(*ptr); --ptr);
ptr[1] = '\0';
return s;
}
It is fast and reliable - serves me many years.
/* Function to remove white spaces on both sides of a string i.e trim */
void trim (char *s)
{
int i;
while (isspace (*s)) s++; // skip left side white spaces
for (i = strlen (s) - 1; (isspace (s[i])); i--) ; // skip right side white spaces
s[i + 1] = '\0';
printf ("%s\n", s);
}
#include "stdafx.h"
#include <string.h>
#include <ctype.h>
char* trim(char* input);
int _tmain(int argc, _TCHAR* argv[])
{
char sz1[]=" MQRFH ";
char sz2[]=" MQRFH";
char sz3[]=" MQR FH";
char sz4[]="MQRFH ";
char sz5[]="MQRFH";
char sz6[]="M";
char sz7[]="M ";
char sz8[]=" M";
char sz9[]="";
char sz10[]=" ";
printf("sz1:[%s] %d\n",trim(sz1), strlen(sz1));
printf("sz2:[%s] %d\n",trim(sz2), strlen(sz2));
printf("sz3:[%s] %d\n",trim(sz3), strlen(sz3));
printf("sz4:[%s] %d\n",trim(sz4), strlen(sz4));
printf("sz5:[%s] %d\n",trim(sz5), strlen(sz5));
printf("sz6:[%s] %d\n",trim(sz6), strlen(sz6));
printf("sz7:[%s] %d\n",trim(sz7), strlen(sz7));
printf("sz8:[%s] %d\n",trim(sz8), strlen(sz8));
printf("sz9:[%s] %d\n",trim(sz9), strlen(sz9));
printf("sz10:[%s] %d\n",trim(sz10), strlen(sz10));
return 0;
}
char *ltrim(char *s)
{
while(isspace(*s)) s++;
return s;
}
char *rtrim(char *s)
{
char* back;
int len = strlen(s);
if(len == 0)
return(s);
back = s + len;
while(isspace(*--back));
*(back+1) = '\0';
return s;
}
char *trim(char *s)
{
return rtrim(ltrim(s));
}
Output:
sz1:[MQRFH] 9
sz2:[MQRFH] 6
sz3:[MQR FH] 8
sz4:[MQRFH] 7
sz5:[MQRFH] 5
sz6:[M] 1
sz7:[M] 2
sz8:[M] 2
sz9:[] 0
sz10:[] 8
I like it when the return value always equals the argument. This way, if the string array has been allocated with malloc(), it can safely be free() again.
/* Remove leading whitespaces */
char *ltrim(char *const s)
{
size_t len;
char *cur;
if(s && *s) {
len = strlen(s);
cur = s;
while(*cur && isspace(*cur))
++cur, --len;
if(s != cur)
memmove(s, cur, len + 1);
}
return s;
}
/* Remove trailing whitespaces */
char *rtrim(char *const s)
{
size_t len;
char *cur;
if(s && *s) {
len = strlen(s);
cur = s + len - 1;
while(cur != s && isspace(*cur))
--cur, --len;
cur[isspace(*cur) ? 0 : 1] = '\0';
}
return s;
}
/* Remove leading and trailing whitespaces */
char *trim(char *const s)
{
rtrim(s); // order matters
ltrim(s);
return s;
}
void ltrim(char str[PATH_MAX])
{
int i = 0, j = 0;
char buf[PATH_MAX];
strcpy(buf, str);
for(;str[i] == ' ';i++);
for(;str[i] != '\0';i++,j++)
buf[j] = str[i];
buf[j] = '\0';
strcpy(str, buf);
}
static inline void ut_trim(char * str) {
char * start = str;
char * end = start + strlen(str);
while (--end >= start) { /* trim right */
if (!isspace(*end))
break;
}
*(++end) = '\0';
while (isspace(*start)) /* trim left */
start++;
if (start != str) /* there is a string */
memmove(str, start, end - start + 1);
}
How about this... It only requires one iteration over the string (doesn't use strlen, which iterates over the string). When the function returns you get a pointer to the start of the trimmed string which is null terminated. The string is trimmed of spaces from the left (until the first character is found). The string is also trimmed of all trailing spaces after the last nonspace character.
char* trim(char* input) {
char* start = input;
while (isSpace(*start)) { //trim left
start++;
}
char* ptr = start;
char* end = start;
while (*ptr++ != '\0') { //trim right
if (!isSpace(*ptr)) { //only move end pointer if char isn't a space
end = ptr;
}
}
*end = '\0'; //terminate the trimmed string with a null
return start;
}
bool isSpace(char c) {
switch (c) {
case ' ':
case '\n':
case '\t':
case '\f':
case '\r':
return true;
break;
default:
return false;
break;
}
}
/* iMode 0:ALL, 1:Left, 2:Right*/
char* Trim(char* szStr,const char ch, int iMode)
{
if (szStr == NULL)
return NULL;
char szTmp[1024*10] = { 0x00 };
strcpy(szTmp, szStr);
int iLen = strlen(szTmp);
char* pStart = szTmp;
char* pEnd = szTmp+iLen;
int i;
for(i = 0;i < iLen;i++){
if (szTmp[i] == ch && pStart == szTmp+i && iMode != 2)
++pStart;
if (szTmp[iLen-i-1] == ch && pEnd == szTmp+iLen-i && iMode != 1)
*(--pEnd) = '\0';
}
strcpy(szStr, pStart);
return szStr;
}
Here's my implementation, behaving like the built-in string functions in libc (that is, it expects a c-string, it modifies it and returns it to the caller).
It trims leading spaces & shifts the remaining chars to the left, as it parses the string from left to right. It then marks a new end of string and starts parsing it backwards, replacing trailing spaces with '\0's until it finds either a non-space char or the start of the string. I believe those are the minimum possible iterations for this particular task.
// ----------------------------------------------------------------------------
// trim leading & trailing spaces from string s (return modified string s)
// alg:
// - skip leading spaces, via cp1
// - shift remaining *cp1's to the left, via cp2
// - mark a new end of string
// - replace trailing spaces with '\0', via cp2
// - return the trimmed s
//
char *s_trim(char *s)
{
char *cp1; // for parsing the whole s
char *cp2; // for shifting & padding
// skip leading spaces, shift remaining chars
for (cp1=s; isspace(*cp1); cp1++ ) // skip leading spaces, via cp1
;
for (cp2=s; *cp1; cp1++, cp2++) // shift left remaining chars, via cp2
*cp2 = *cp1;
*cp2-- = 0; // mark new end of string for s
// replace trailing spaces with '\0'
while ( cp2 > s && isspace(*cp2) )
*cp2-- = 0; // pad with '\0's
return s;
}
Not the best way but it works
char* Trim(char* str)
{
int len = strlen(str);
char* buff = new char[len];
int i = 0;
memset(buff,0,len*sizeof(char));
do{
if(isspace(*str)) continue;
buff[i] = *str; ++i;
} while(*(++str) != '\0');
return buff;
}
void inPlaceStrTrim(char* str) {
int k = 0;
int i = 0;
for (i=0; str[i] != '\0';) {
if (isspace(str[i])) {
// we have got a space...
k = i;
for (int j=i; j<strlen(str)-1; j++) {
str[j] = str[j+1];
}
str[strlen(str)-1] = '\0';
i = k; // start the loop again where we ended..
} else {
i++;
}
}
}
Easiest thing to do is a simple loop. I'm going to assume that you want the trimmed string returned in place.
char *
strTrim(char * s){
int ix, jx;
int len ;
char * buf
len = strlen(s); /* possibly should use strnlen */
buf = (char *) malloc(strlen(s)+1);
for(ix=0, jx=0; ix < len; ix++){
if(!isspace(s[ix]))
buf[jx++] = s[ix];
buf[jx] = '\0';
strncpy(s, buf, jx); /* always looks as far as the null, but who cares? */
free(buf); /* no good leak goes unpunished */
return s; /* modifies s in place *and* returns it for swank */
}
This gets rid of embedded blanks too, if String.Trim doesn't then it needs a bit more logic.

How do I trim leading/trailing whitespace in a standard way?

Is there a clean, preferably standard method of trimming leading and trailing whitespace from a string in C? I'd roll my own, but I would think this is a common problem with an equally common solution.
If you can modify the string:
// Note: This function returns a pointer to a substring of the original string.
// If the given string was allocated dynamically, the caller must not overwrite
// that pointer with the returned value, since the original pointer must be
// deallocated using the same allocator with which it was allocated. The return
// value must NOT be deallocated using free() etc.
char *trimwhitespace(char *str)
{
char *end;
// Trim leading space
while(isspace((unsigned char)*str)) str++;
if(*str == 0) // All spaces?
return str;
// Trim trailing space
end = str + strlen(str) - 1;
while(end > str && isspace((unsigned char)*end)) end--;
// Write new null terminator character
end[1] = '\0';
return str;
}
If you can't modify the string, then you can use basically the same method:
// Stores the trimmed input string into the given output buffer, which must be
// large enough to store the result. If it is too small, the output is
// truncated.
size_t trimwhitespace(char *out, size_t len, const char *str)
{
if(len == 0)
return 0;
const char *end;
size_t out_size;
// Trim leading space
while(isspace((unsigned char)*str)) str++;
if(*str == 0) // All spaces?
{
*out = 0;
return 1;
}
// Trim trailing space
end = str + strlen(str) - 1;
while(end > str && isspace((unsigned char)*end)) end--;
end++;
// Set output size to minimum of trimmed string length and buffer size minus 1
out_size = (end - str) < len-1 ? (end - str) : len-1;
// Copy trimmed string and add null terminator
memcpy(out, str, out_size);
out[out_size] = 0;
return out_size;
}
Here's one that shifts the string into the first position of your buffer. You might want this behavior so that if you dynamically allocated the string, you can still free it on the same pointer that trim() returns:
char *trim(char *str)
{
size_t len = 0;
char *frontp = str;
char *endp = NULL;
if( str == NULL ) { return NULL; }
if( str[0] == '\0' ) { return str; }
len = strlen(str);
endp = str + len;
/* Move the front and back pointers to address the first non-whitespace
* characters from each end.
*/
while( isspace((unsigned char) *frontp) ) { ++frontp; }
if( endp != frontp )
{
while( isspace((unsigned char) *(--endp)) && endp != frontp ) {}
}
if( frontp != str && endp == frontp )
*str = '\0';
else if( str + len - 1 != endp )
*(endp + 1) = '\0';
/* Shift the string so that it starts at str so that if it's dynamically
* allocated, we can still free it on the returned pointer. Note the reuse
* of endp to mean the front of the string buffer now.
*/
endp = str;
if( frontp != str )
{
while( *frontp ) { *endp++ = *frontp++; }
*endp = '\0';
}
return str;
}
Test for correctness:
#include <stdio.h>
#include <string.h>
#include <ctype.h>
/* Paste function from above here. */
int main()
{
/* The test prints the following:
[nothing to trim] -> [nothing to trim]
[ trim the front] -> [trim the front]
[trim the back ] -> [trim the back]
[ trim front and back ] -> [trim front and back]
[ trim one char front and back ] -> [trim one char front and back]
[ trim one char front] -> [trim one char front]
[trim one char back ] -> [trim one char back]
[ ] -> []
[ ] -> []
[a] -> [a]
[] -> []
*/
char *sample_strings[] =
{
"nothing to trim",
" trim the front",
"trim the back ",
" trim front and back ",
" trim one char front and back ",
" trim one char front",
"trim one char back ",
" ",
" ",
"a",
"",
NULL
};
char test_buffer[64];
char comparison_buffer[64];
size_t index, compare_pos;
for( index = 0; sample_strings[index] != NULL; ++index )
{
// Fill buffer with known value to verify we do not write past the end of the string.
memset( test_buffer, 0xCC, sizeof(test_buffer) );
strcpy( test_buffer, sample_strings[index] );
memcpy( comparison_buffer, test_buffer, sizeof(comparison_buffer));
printf("[%s] -> [%s]\n", sample_strings[index],
trim(test_buffer));
for( compare_pos = strlen(comparison_buffer);
compare_pos < sizeof(comparison_buffer);
++compare_pos )
{
if( test_buffer[compare_pos] != comparison_buffer[compare_pos] )
{
printf("Unexpected change to buffer # index %u: %02x (expected %02x)\n",
compare_pos, (unsigned char) test_buffer[compare_pos], (unsigned char) comparison_buffer[compare_pos]);
}
}
}
return 0;
}
Source file was trim.c. Compiled with 'cc -Wall trim.c -o trim'.
My solution. String must be changeable. The advantage above some of the other solutions that it moves the non-space part to the beginning so you can keep using the old pointer, in case you have to free() it later.
void trim(char * s) {
char * p = s;
int l = strlen(p);
while(isspace(p[l - 1])) p[--l] = 0;
while(* p && isspace(* p)) ++p, --l;
memmove(s, p, l + 1);
}
This version creates a copy of the string with strndup() instead of editing it in place. strndup() requires _GNU_SOURCE, so maybe you need to make your own strndup() with malloc() and strncpy().
char * trim(char * s) {
int l = strlen(s);
while(isspace(s[l - 1])) --l;
while(* s && isspace(* s)) ++s, --l;
return strndup(s, l);
}
Here's my C mini library for trimming left, right, both, all, in place and separate, and trimming a set of specified characters (or white space by default).
contents of strlib.h:
#ifndef STRLIB_H_
#define STRLIB_H_ 1
enum strtrim_mode_t {
STRLIB_MODE_ALL = 0,
STRLIB_MODE_RIGHT = 0x01,
STRLIB_MODE_LEFT = 0x02,
STRLIB_MODE_BOTH = 0x03
};
char *strcpytrim(char *d, // destination
char *s, // source
int mode,
char *delim
);
char *strtriml(char *d, char *s);
char *strtrimr(char *d, char *s);
char *strtrim(char *d, char *s);
char *strkill(char *d, char *s);
char *triml(char *s);
char *trimr(char *s);
char *trim(char *s);
char *kill(char *s);
#endif
contents of strlib.c:
#include <strlib.h>
char *strcpytrim(char *d, // destination
char *s, // source
int mode,
char *delim
) {
char *o = d; // save orig
char *e = 0; // end space ptr.
char dtab[256] = {0};
if (!s || !d) return 0;
if (!delim) delim = " \t\n\f";
while (*delim)
dtab[*delim++] = 1;
while ( (*d = *s++) != 0 ) {
if (!dtab[0xFF & (unsigned int)*d]) { // Not a match char
e = 0; // Reset end pointer
} else {
if (!e) e = d; // Found first match.
if ( mode == STRLIB_MODE_ALL || ((mode != STRLIB_MODE_RIGHT) && (d == o)) )
continue;
}
d++;
}
if (mode != STRLIB_MODE_LEFT && e) { // for everything but trim_left, delete trailing matches.
*e = 0;
}
return o;
}
// perhaps these could be inlined in strlib.h
char *strtriml(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_LEFT, 0); }
char *strtrimr(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_RIGHT, 0); }
char *strtrim(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_BOTH, 0); }
char *strkill(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_ALL, 0); }
char *triml(char *s) { return strcpytrim(s, s, STRLIB_MODE_LEFT, 0); }
char *trimr(char *s) { return strcpytrim(s, s, STRLIB_MODE_RIGHT, 0); }
char *trim(char *s) { return strcpytrim(s, s, STRLIB_MODE_BOTH, 0); }
char *kill(char *s) { return strcpytrim(s, s, STRLIB_MODE_ALL, 0); }
The one main routine does it all.
It trims in place if src == dst, otherwise,
it works like the strcpy routines.
It trims a set of characters specified in the string delim, or white space if null.
It trims left, right, both, and all (like tr).
There is not much to it, and it iterates over the string only once. Some folks might complain that trim right starts on the left, however, no strlen is needed which starts on the left anyway. (One way or another you have to get to the end of the string for right trims, so you might as well do the work as you go.) There may be arguments to be made about pipelining and cache sizes and such -- who knows. Since the solution works from left to right and iterates only once, it can be expanded to work on streams as well. Limitations: it does not work on unicode strings.
Here is my attempt at a simple, yet correct in-place trim function.
void trim(char *str)
{
int i;
int begin = 0;
int end = strlen(str) - 1;
while (isspace((unsigned char) str[begin]))
begin++;
while ((end >= begin) && isspace((unsigned char) str[end]))
end--;
// Shift all characters back to the start of the string array.
for (i = begin; i <= end; i++)
str[i - begin] = str[i];
str[i - begin] = '\0'; // Null terminate string.
}
Late to the trim party
Features:
1. Trim the beginning quickly, as in a number of other answers.
2. After going to the end, trimming the right with only 1 test per loop. Like #jfm3, but works for an all white-space string)
3. To avoid undefined behavior when char is a signed char, cast *s to unsigned char.
Character handling "In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined." C11 §7.4 1
#include <ctype.h>
// Return a pointer to the trimmed string
char *string_trim_inplace(char *s) {
while (isspace((unsigned char) *s)) s++;
if (*s) {
char *p = s;
while (*p) p++;
while (isspace((unsigned char) *(--p)));
p[1] = '\0';
}
// If desired, shift the trimmed string
return s;
}
#chqrlie commented the above does not shift the trimmed string. To do so....
// Return a pointer to the (shifted) trimmed string
char *string_trim_inplace(char *s) {
char *original = s;
size_t len = 0;
while (isspace((unsigned char) *s)) {
s++;
}
if (*s) {
char *p = s;
while (*p) p++;
while (isspace((unsigned char) *(--p)));
p[1] = '\0';
// len = (size_t) (p - s); // older errant code
len = (size_t) (p - s + 1); // Thanks to #theriver
}
return (s == original) ? s : memmove(original, s, len + 1);
}
Here's a solution similar to #adam-rosenfields in-place modification routine but without needlessly resorting to strlen(). Like #jkramer, the string is left-adjusted within the buffer so you can free the same pointer. Not optimal for large strings since it does not use memmove. Includes the ++/-- operators that #jfm3 mentions. FCTX-based unit tests included.
#include <ctype.h>
void trim(char * const a)
{
char *p = a, *q = a;
while (isspace(*q)) ++q;
while (*q) *p++ = *q++;
*p = '\0';
while (p > a && isspace(*--p)) *p = '\0';
}
/* See http://fctx.wildbearsoftware.com/ */
#include "fct.h"
FCT_BGN()
{
FCT_QTEST_BGN(trim)
{
{ char s[] = ""; trim(s); fct_chk_eq_str("", s); } // Trivial
{ char s[] = " "; trim(s); fct_chk_eq_str("", s); } // Trivial
{ char s[] = "\t"; trim(s); fct_chk_eq_str("", s); } // Trivial
{ char s[] = "a"; trim(s); fct_chk_eq_str("a", s); } // NOP
{ char s[] = "abc"; trim(s); fct_chk_eq_str("abc", s); } // NOP
{ char s[] = " a"; trim(s); fct_chk_eq_str("a", s); } // Leading
{ char s[] = " a c"; trim(s); fct_chk_eq_str("a c", s); } // Leading
{ char s[] = "a "; trim(s); fct_chk_eq_str("a", s); } // Trailing
{ char s[] = "a c "; trim(s); fct_chk_eq_str("a c", s); } // Trailing
{ char s[] = " a "; trim(s); fct_chk_eq_str("a", s); } // Both
{ char s[] = " a c "; trim(s); fct_chk_eq_str("a c", s); } // Both
// Villemoes pointed out an edge case that corrupted memory. Thank you.
// http://stackoverflow.com/questions/122616/#comment23332594_4505533
{
char s[] = "a "; // Buffer with whitespace before s + 2
trim(s + 2); // Trim " " containing only whitespace
fct_chk_eq_str("", s + 2); // Ensure correct result from the trim
fct_chk_eq_str("a ", s); // Ensure preceding buffer not mutated
}
// doukremt suggested I investigate this test case but
// did not indicate the specific behavior that was objectionable.
// http://stackoverflow.com/posts/comments/33571430
{
char s[] = " foobar"; // Shifted across whitespace
trim(s); // Trim
fct_chk_eq_str("foobar", s); // Leading string is correct
// Here is what the algorithm produces:
char r[16] = { 'f', 'o', 'o', 'b', 'a', 'r', '\0', ' ',
' ', 'f', 'o', 'o', 'b', 'a', 'r', '\0'};
fct_chk_eq_int(0, memcmp(s, r, sizeof(s)));
}
}
FCT_QTEST_END();
}
FCT_END();
I'm not sure what you consider "painless."
C strings are pretty painful. We can find the first non-whitespace character position trivially:
while (isspace(* p)) p++;
We can find the last non-whitespace character position with two similar trivial moves:
while (* q) q++;
do { q--; } while (isspace(* q));
(I have spared you the pain of using the * and ++ operators at the same time.)
The question now is what do you do with this? The datatype at hand isn't really a big robust abstract String that is easy to think about, but instead really barely any more than an array of storage bytes. Lacking a robust data type, it is impossible to write a function that will do the same as PHperytonby's chomp function. What would such a function in C return?
Another one, with one line doing the real job:
#include <stdio.h>
int main()
{
const char *target = " haha ";
char buf[256];
sscanf(target, "%s", buf); // Trimming on both sides occurs here
printf("<%s>\n", buf);
}
If you're using glib, then you can use g_strstrip
I didn't like most of these answers because they did one or more of the following...
Returned a different pointer inside the original pointer's string (kind of a pain to juggle two different pointers to the same thing).
Made gratuitous use of things like strlen() that pre-iterate the entire string.
Used non-portable OS-specific lib functions.
Backscanned.
Used comparison to ' ' instead of isspace() so that TAB / CR / LF are preserved.
Wasted memory with large static buffers.
Wasted cycles with high-cost functions like sscanf/sprintf.
Here is my version:
void fnStrTrimInPlace(char *szWrite) {
const char *szWriteOrig = szWrite;
char *szLastSpace = szWrite, *szRead = szWrite;
int bNotSpace;
// SHIFT STRING, STARTING AT FIRST NON-SPACE CHAR, LEFTMOST
while( *szRead != '\0' ) {
bNotSpace = !isspace((unsigned char)(*szRead));
if( (szWrite != szWriteOrig) || bNotSpace ) {
*szWrite = *szRead;
szWrite++;
// TRACK POINTER TO LAST NON-SPACE
if( bNotSpace )
szLastSpace = szWrite;
}
szRead++;
}
// TERMINATE AFTER LAST NON-SPACE (OR BEGINNING IF THERE WAS NO NON-SPACE)
*szLastSpace = '\0';
}
Use a string library, for instance:
Ustr *s1 = USTR1(\7, " 12345 ");
ustr_sc_trim_cstr(&s1, " ");
assert(ustr_cmp_cstr_eq(s1, "12345"));
...as you say this is a "common" problem, yes you need to include a #include or so and it's not included in libc but don't go inventing your own hack job storing random pointers and size_t's that way only leads to buffer overflows.
A bit late to the game, but I'll throw my routines into the fray. They're probably not the most absolute efficient, but I believe they're correct and they're simple (with rtrim() pushing the complexity envelope):
#include <ctype.h>
#include <string.h>
/*
Public domain implementations of in-place string trim functions
Michael Burr
michael.burr#nth-element.com
2010
*/
char* ltrim(char* s)
{
char* newstart = s;
while (isspace( *newstart)) {
++newstart;
}
// newstart points to first non-whitespace char (which might be '\0')
memmove( s, newstart, strlen( newstart) + 1); // don't forget to move the '\0' terminator
return s;
}
char* rtrim( char* s)
{
char* end = s + strlen( s);
// find the last non-whitespace character
while ((end != s) && isspace( *(end-1))) {
--end;
}
// at this point either (end == s) and s is either empty or all whitespace
// so it needs to be made empty, or
// end points just past the last non-whitespace character (it might point
// at the '\0' terminator, in which case there's no problem writing
// another there).
*end = '\0';
return s;
}
char* trim( char* s)
{
return rtrim( ltrim( s));
}
Very late to the party...
Single-pass forward-scanning solution with no backtracking. Every character in the source string is tested exactly once twice. (So it should be faster than most of the other solutions here, especially if the source string has a lot of trailing spaces.)
This includes two solutions, one to copy and trim a source string into another destination string, and the other to trim the source string in place. Both functions use the same code.
The (modifiable) string is moved in-place, so the original pointer to it remains unchanged.
#include <stddef.h>
#include <ctype.h>
char * trim2(char *d, const char *s)
{
// Sanity checks
if (s == NULL || d == NULL)
return NULL;
// Skip leading spaces
const unsigned char * p = (const unsigned char *)s;
while (isspace(*p))
p++;
// Copy the string
unsigned char * dst = (unsigned char *)d; // d and s can be the same
unsigned char * end = dst;
while (*p != '\0')
{
if (!isspace(*dst++ = *p++))
end = dst;
}
// Truncate trailing spaces
*end = '\0';
return d;
}
char * trim(char *s)
{
return trim2(s, s);
}
Just to keep this growing, one more option with a modifiable string:
void trimString(char *string)
{
size_t i = 0, j = strlen(string);
while (j > 0 && isspace((unsigned char)string[j - 1])) string[--j] = '\0';
while (isspace((unsigned char)string[i])) i++;
if (i > 0) memmove(string, string + i, j - i + 1);
}
I know there have many answers, but I post my answer here to see if my solution is good enough.
// Trims leading whitespace chars in left `str`, then copy at almost `n - 1` chars
// into the `out` buffer in which copying might stop when the first '\0' occurs,
// and finally append '\0' to the position of the last non-trailing whitespace char.
// Reture the length the trimed string which '\0' is not count in like strlen().
size_t trim(char *out, size_t n, const char *str)
{
// do nothing
if(n == 0) return 0;
// ptr stop at the first non-leading space char
while(isspace(*str)) str++;
if(*str == '\0') {
out[0] = '\0';
return 0;
}
size_t i = 0;
// copy char to out until '\0' or i == n - 1
for(i = 0; i < n - 1 && *str != '\0'; i++){
out[i] = *str++;
}
// deal with the trailing space
while(isspace(out[--i]));
out[++i] = '\0';
return i;
}
The easiest way to skip leading spaces in a string is, imho,
#include <stdio.h>
int main()
{
char *foo=" teststring ";
char *bar;
sscanf(foo,"%s",bar);
printf("String is >%s<\n",bar);
return 0;
}
Ok this is my take on the question. I believe it's the most concise solution that modifies the string in place (free will work) and avoids any UB. For small strings, it's probably faster than a solution involving memmove.
void stripWS_LT(char *str)
{
char *a = str, *b = str;
while (isspace((unsigned char)*a)) a++;
while (*b = *a++) b++;
while (b > str && isspace((unsigned char)*--b)) *b = 0;
}
#include <ctype.h>
#include <string.h>
char *trim_space(char *in)
{
char *out = NULL;
int len;
if (in) {
len = strlen(in);
while(len && isspace(in[len - 1])) --len;
while(len && *in && isspace(*in)) ++in, --len;
if (len) {
out = strndup(in, len);
}
}
return out;
}
isspace helps to trim all white spaces.
Run a first loop to check from last byte for space character and reduce the length variable
Run a second loop to check from first byte for space character and reduce the length variable and increment char pointer.
Finally if length variable is more than 0, then use strndup to create new string buffer by excluding spaces.
This one is short and simple, uses for-loops and doesn't overwrite the string boundaries.
You can replace the test with isspace() if needed.
void trim (char *s) // trim leading and trailing spaces+tabs
{
int i,j,k, len;
j=k=0;
len = strlen(s);
// find start of string
for (i=0; i<len; i++) if ((s[i]!=32) && (s[i]!=9)) { j=i; break; }
// find end of string+1
for (i=len-1; i>=j; i--) if ((s[i]!=32) && (s[i]!=9)) { k=i+1; break;}
if (k<=j) {s[0]=0; return;} // all whitespace (j==k==0)
len=k-j;
for (i=0; i<len; i++) s[i] = s[j++]; // shift result to start of string
s[i]=0; // end the string
}//_trim
If, and ONLY IF there's only one contiguous block of text between whitespace, you can use a single call to strtok(3), like so:
char *trimmed = strtok(input, "\r\t\n ");
This works for strings like the following:
" +1.123.456.7890 "
" 01-01-2020\n"
"\t2.523"
This will not work for strings that contain whitespace between blocks of non-whitespace, like " hi there ". It's probably better to avoid this approach, but now it's here in your toolbox if you need it.
Personally, I'd roll my own. You can use strtok, but you need to take care with doing so (particularly if you're removing leading characters) that you know what memory is what.
Getting rid of trailing spaces is easy, and pretty safe, as you can just put a 0 in over the top of the last space, counting back from the end. Getting rid of leading spaces means moving things around. If you want to do it in place (probably sensible) you can just keep shifting everything back one character until there's no leading space. Or, to be more efficient, you could find the index of the first non-space character, and shift everything back by that number. Or, you could just use a pointer to the first non-space character (but then you need to be careful in the same way as you do with strtok).
#include "stdafx.h"
#include "malloc.h"
#include "string.h"
int main(int argc, char* argv[])
{
char *ptr = (char*)malloc(sizeof(char)*30);
strcpy(ptr," Hel lo wo rl d G eo rocks!!! by shahil sucks b i g tim e");
int i = 0, j = 0;
while(ptr[j]!='\0')
{
if(ptr[j] == ' ' )
{
j++;
ptr[i] = ptr[j];
}
else
{
i++;
j++;
ptr[i] = ptr[j];
}
}
printf("\noutput-%s\n",ptr);
return 0;
}
Most of the answers so far do one of the following:
Backtrack at the end of the string (i.e. find the end of the string and then seek backwards until a non-space character is found,) or
Call strlen() first, making a second pass through the whole string.
This version makes only one pass and does not backtrack. Hence it may perform better than the others, though only if it is common to have hundreds of trailing spaces (which is not unusual when dealing with the output of a SQL query.)
static char const WHITESPACE[] = " \t\n\r";
static void get_trim_bounds(char const *s,
char const **firstWord,
char const **trailingSpace)
{
char const *lastWord;
*firstWord = lastWord = s + strspn(s, WHITESPACE);
do
{
*trailingSpace = lastWord + strcspn(lastWord, WHITESPACE);
lastWord = *trailingSpace + strspn(*trailingSpace, WHITESPACE);
}
while (*lastWord != '\0');
}
char *copy_trim(char const *s)
{
char const *firstWord, *trailingSpace;
char *result;
size_t newLength;
get_trim_bounds(s, &firstWord, &trailingSpace);
newLength = trailingSpace - firstWord;
result = malloc(newLength + 1);
memcpy(result, firstWord, newLength);
result[newLength] = '\0';
return result;
}
void inplace_trim(char *s)
{
char const *firstWord, *trailingSpace;
size_t newLength;
get_trim_bounds(s, &firstWord, &trailingSpace);
newLength = trailingSpace - firstWord;
memmove(s, firstWord, newLength);
s[newLength] = '\0';
}
This is the shortest possible implementation I can think of:
static const char *WhiteSpace=" \n\r\t";
char* trim(char *t)
{
char *e=t+(t!=NULL?strlen(t):0); // *e initially points to end of string
if (t==NULL) return;
do --e; while (strchr(WhiteSpace, *e) && e>=t); // Find last char that is not \r\n\t
*(++e)=0; // Null-terminate
e=t+strspn (t,WhiteSpace); // Find first char that is not \t
return e>t?memmove(t,e,strlen(e)+1):t; // memmove string contents and terminator
}
These functions will modify the original buffer, so if dynamically allocated, the original
pointer can be freed.
#include <string.h>
void rstrip(char *string)
{
int l;
if (!string)
return;
l = strlen(string) - 1;
while (isspace(string[l]) && l >= 0)
string[l--] = 0;
}
void lstrip(char *string)
{
int i, l;
if (!string)
return;
l = strlen(string);
while (isspace(string[(i = 0)]))
while(i++ < l)
string[i-1] = string[i];
}
void strip(char *string)
{
lstrip(string);
rstrip(string);
}
What do you think about using StrTrim function defined in header Shlwapi.h.? It is straight forward rather defining on your own.
Details can be found on:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb773454(v=vs.85).aspx
If you have
char ausCaptain[]="GeorgeBailey ";
StrTrim(ausCaptain," ");
This will give ausCaptain as "GeorgeBailey" not "GeorgeBailey ".
To trim my strings from the both sides I use the oldie but the gooody ;)
It can trim anything with ascii less than a space, meaning that the control chars will be trimmed also !
char *trimAll(char *strData)
{
unsigned int L = strlen(strData);
if(L > 0){ L--; }else{ return strData; }
size_t S = 0, E = L;
while((!(strData[S] > ' ') || !(strData[E] > ' ')) && (S >= 0) && (S <= L) && (E >= 0) && (E <= L))
{
if(strData[S] <= ' '){ S++; }
if(strData[E] <= ' '){ E--; }
}
if(S == 0 && E == L){ return strData; } // Nothing to be done
if((S >= 0) && (S <= L) && (E >= 0) && (E <= L)){
L = E - S + 1;
memmove(strData,&strData[S],L); strData[L] = '\0';
}else{ strData[0] = '\0'; }
return strData;
}
I'm only including code because the code posted so far seems suboptimal (and I don't have the rep to comment yet.)
void inplace_trim(char* s)
{
int start, end = strlen(s);
for (start = 0; isspace(s[start]); ++start) {}
if (s[start]) {
while (end > 0 && isspace(s[end-1]))
--end;
memmove(s, &s[start], end - start);
}
s[end - start] = '\0';
}
char* copy_trim(const char* s)
{
int start, end;
for (start = 0; isspace(s[start]); ++start) {}
for (end = strlen(s); end > 0 && isspace(s[end-1]); --end) {}
return strndup(s + start, end - start);
}
strndup() is a GNU extension. If you don't have it or something equivalent, roll your own. For example:
r = strdup(s + start);
r[end-start] = '\0';
Here i use the dynamic memory allocation to trim the input string to the function trimStr. First, we find how many non-empty characters exist in the input string. Then, we allocate a character array with that size and taking care of the null terminated character. When we use this function, we need to free the memory inside of main function.
#include<stdio.h>
#include<stdlib.h>
char *trimStr(char *str){
char *tmp = str;
printf("input string %s\n",str);
int nc = 0;
while(*tmp!='\0'){
if (*tmp != ' '){
nc++;
}
tmp++;
}
printf("total nonempty characters are %d\n",nc);
char *trim = NULL;
trim = malloc(sizeof(char)*(nc+1));
if (trim == NULL) return NULL;
tmp = str;
int ne = 0;
while(*tmp!='\0'){
if (*tmp != ' '){
trim[ne] = *tmp;
ne++;
}
tmp++;
}
trim[nc] = '\0';
printf("trimmed string is %s\n",trim);
return trim;
}
int main(void){
char str[] = " s ta ck ove r fl o w ";
char *trim = trimStr(str);
if (trim != NULL )free(trim);
return 0;
}

Resources