I have a character array of length 32 and would like to take certain characters out of it.
For example:
111111000000000000000000111111 <32 chars
I would like to take chars 0-6 which would be 111111
Or even take chars 26-31 which would be 111111
char check_type[32];
Above is how I'm declaring.
What I would like to be able to do is define a function or use a function that takes that starting place, and end character.
I've looked at many approaches, such as strncpy and strcpy, but haven't found one that works yet.
I would simply wrap strncpy:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Creates a sub-string of range [start, end], return value must be freed */
char *substr(char *src, size_t start, size_t end)
{
size_t sub_len = end - start + 1;
char * new_str = malloc(sub_len + 1); /* TODO: check malloc's return value */
strncpy(new_str, src + start, sub_len); /* copy from the start offset, not from src[0] */
new_str[sub_len] = '\0'; /* new_str is of size sub_len + 1 */
return new_str;
}
int main(void)
{
char str[] = "111111000000000000000000111111";
char *sub_str = substr(str, 0, 5);
puts(sub_str);
free(sub_str);
return EXIT_SUCCESS;
}
Output:
111111
Use memcpy.
// Stores s[from..to) in sub.
// The caller is responsible for memory allocation.
void extract_substr(char const *s, char *sub, size_t from, size_t to)
{
size_t sublen = to - from;
memcpy(sub, s + from, sublen);
sub[sublen] = '\0';
}
Sample:
char *substr(char *source, int startpos, int endpos)
{
int len = endpos - startpos + 2; // must account for final 0
int i = 0;
char *src, *dst;
char *ret = calloc(len, sizeof(char));
if (!ret)
return ret;
src = source + startpos;
dst = ret;
while (i++ < len - 1) // copy len - 1 characters; the last byte is reserved for the terminator
*dst++ = *src++;
*dst = 0;
return ret;
}
Of course, free the returned pointer when you no longer need it. Note that this function does not check the validity of endpos against startpos.
First define the required interface...perhaps:
int substring(char *target, size_t tgtlen, const char *source, size_t src_bgn, size_t src_end);
This takes a destination (target) array where the data will be copied, along with its length. The data will come from the source array, between positions src_bgn and src_end. The return value will be -1 for an error; otherwise it is the length of the output (excluding the terminating null). If the target string is too short, you will get an error.
With that set of details in place, you can implement the body fairly easily, and strncpy() might well be appropriate this time (it often isn't).
Usage (based on your question):
char check_type[32] = "111111000000000000000000111111";
char result1[10];
char result2[10];
if (substring(result1, sizeof(result1), check_type, 0, 6) <= 0 ||
substring(result2, sizeof(result2), check_type, 26, 31) <= 0)
...something went wrong...
else
...use result1 and result2...
Check this:
char* Substring(char *string, int len, int start, int end) {
/*
Creates a substring from a given string.
Args:
string: The string whose substring you need to find.
len: The length of the string.
start: The start position for the substring.
end: The end position of the substring (inclusive).
Returns:
substring: (of type char*) which is allocated on the heap.
NULL: on error.
*/
// Check that the start and end position are valid.
// If not valid, then return NULL.
if (start < 0 || start >= len || end < start || end >= len) {
return NULL;
}
// Allocate memory to return the substring on the heap.
char *substring = malloc(sizeof(char) * (end - start + 2));
if (substring == NULL) {
return NULL;
}
int index = 0, i;
for (i = start; i <= end; i++) {
substring[index] = string[i];
index++;
}
// End with a null character.
substring[index] = '\0';
return substring;
}
int main() {
char str[] = "11111100000000000000000000111111";
printf("%s\n", Substring(str, strlen(str), 0, 5));
printf("%s\n", Substring(str, strlen(str), 26, 31));
}
Related
I call the function below, written in C, to fetch the parent of a child:
char *getParent(char *child)
{
int len = strlen(child);
char *parent;
parent = strdup(substring(child, 0, len - 4));
return parent;
}
char *substring(const char* str, int beg, int n)
{
char *ret = malloc(n+1);
strncpy(ret, (str + beg), n);
*(ret+n) = '\n';
return strdup(ret);
}
child is '11112222'.
I am expecting the output '1111', but this function also adds extra spaces after 1111, like this: '1111---here i am getting space----'.
What's wrong with this function?
This:
*(ret+n) = '\n';
is wrong, it should be:
*(ret+n) = '\0';
to terminate the string. You're adding a linefeed, not a terminator, thus failing to produce a valid string.
Also, I would recommend preferring indexing since it's a bit cleaner syntactically:
ret[n] = '\0';
And, of course, you should check the return value of malloc() before relying on it.
UPDATE: And gosh, remove that strdup(), it's completely pointless now that you've already malloc()ed your new string.
It should be just:
char * substring(const char *str, size_t beg, size_t n)
{
char *ret = malloc(n + 1);
if(ret != NULL)
{
strncpy(ret, str + beg, n);
ret[n] = '\0';
}
return ret;
}
This still assumes that the offset and length are valid, and that str is non-NULL.
How can I implement a substring function such as the following that returns the substring but without using malloc() in the process so I don't have to worry about freeing the associated memory elsewhere in my code using the free() function. Is this even possible?
const char *substring(const char *string, int position, int length)
{
char *pointer;
int c;
pointer = malloc(length+1);
if (pointer == NULL)
{
printf("Unable to allocate memory.\n");
exit(EXIT_FAILURE);
}
for (c = 0 ; c < position -1 ; c++)
string++;
for (c = 0 ; c < length ; c++)
{
*(pointer+c) = *string;
string++;
}
*(pointer+c) = '\0';
return pointer;
}
UPDATE: 30 DEC 2012
Having considered all the answers and comments it's clear that essentially what I'm trying to do is create a dynamically sized array (i.e. the substring) and that is not possible in C without somewhere along the way having to use some kind of malloc() function and a subsequent free() call on the substring pointer or without the aid of a garbage collector. I attempted to integrate the libgc garbage collector as kindly suggested by @elhadi but so far have not been able to get this to work in my Xcode project. So I have opted to stick with using the following code with malloc() and free().
char * subStr(const char* srcString, const int offset, const int len)
{
char * sub = (char*)malloc(len+1);
memcpy(sub, srcString + offset, len);
sub[len] = 0;
return sub;
}
int main()
{
const char * message = "hello universe";
char * sub = subStr( message, 6, 8 );
printf( "substring: [%s]", sub );
free(sub);
}
I see two options:
If you can destroy the source string (usually a bad thing):
{
string[ position + length] = 0;
return & string[ position ];
}
Note: (see Cole Johnson's note: free no longer works on the returned pointer!)
If you can't modify the source string:
Modify your method's signature so that the caller has to worry about it:
const char *substring(const char *source, char* destination, int position, int length)
And put the modified string into destination (and return it).
And do not even think about this:
const char *substring(const char *string, int position, int length)
{
char *pointer;
int c;
static char modifiedString[256];
...
return modifiedString;
}
Using a static variable inside the function for the modified results...
(This is not thread-safe (not re-entrant!) )
Use a local buffer (an auto array) and a function like this:
void substr(char *dst, const char *src, size_t loc, size_t len)
{
memcpy(dst, src + loc, len);
dst[len] = 0;
}
Call it like this:
const size_t size = 3;
char buf[size + 1]; // this is an auto array, it will be "freed" at the end of the scope
substr(buf, "abcdFOObar", 4, size);
Always ensure the buffer is at least len + 1 bytes long to avoid buffer overflow errors.
const char *substring(const char *string, char *substr, int position, int length)
{
int c;
for (c = 0 ; c < position -1 ; c++)
string++;
for (c = 0 ; c < length ; c++)
{
*(substr+c) = *string;
string++;
}
*(substr+c) = '\0';
return substr;
}
calling function...
int main(int argc, char * argv[]) {
char substr[10];
substring("hello! World", &substr[0], 2, 4);
}
The best way to do it is:
typedef struct vstr_t {
const char *s; /* const, because it may point into a string literal */
int len;
} vstr_t;
#define vstr_set(d, l) \
({ \
vstr_t vs = {.s = d, .len = l}; \
\
vs; \
})
#define vstr_fmt_arg(vs) (vs).len, (vs).s
int main()
{
const char *message = "hello universe";
printf( "substring: [%.*s]\n", vstr_fmt_arg(vstr_set(message + 6, 8)));
return 0;
}
You can use a garbage collector: you allocate the memory once, and the collector frees it when it is no longer needed.
You should include
#include "gc.h"
In main you should call something like
GC_INIT(); /* Optional on Linux/X86;*/
and your substr function is:
char *substr(const char* buffer, const int offset, int len)
{
char *sub = (char*)GC_MALLOC(len+1);
memcpy(sub, buffer + offset, len);
sub[len] = 0;
return sub;
}
You should link with libgc.a
I'm trying to copy part of a string to another string using pointers. My resulting string starts copying at the correct place, but it doesn't stop after reaching the count. Also, the string isn't copied from the source string but rather from the result parameter.
#include <stdio.h>
char *getSub(const char *orig, int start, int count, char *res);
int main(void)
{
const char orig[] = "one two three";
char res[] = "123456789012345678";
printf("%s\n",getSub(orig, 4, 3, res));
return 0;
}
char *getSub(const char *orig, int start, int count, char *res)
{
const char *sCopy = orig;
while (*orig)
{
if (start >= (orig - sCopy)) && (res-sCopy < count))
{
*res++ = *orig++;
}
else
*orig++;
}
return res;
}
The big mistake is that you're calculating the difference of two unrelated pointers, res - sCopy (I suppose sourceCopy is also sCopy in the real code, or the other way round). Calculating the difference of pointers is only meaningful if both pointers point into (or one past the end of) the same array. As written, whether anything gets copied at all depends on the arbitrary locations of the two arrays.
if (start >= (orig - sourceCopy)) && (res-sCopy < c))
{
*res++ = *orig++;
}
else
*orig++;
Anyway, that expression doesn't count how many characters have been copied, if any are copied at all.
Another mistake is that you don't 0-terminate the copy.
A correct implementation would be
char *getSub(const char *orig, int start, int count, char *res)
{
const char *from = orig;
char *to = res;
// check whether the starting position is within orig
for( ; start > 0; --start, ++from)
{
if (*from == 0)
{
res[0] = 0;
return res;
}
}
// copy up to count characters from from to to
for( ; count > 0 && *from; --count)
{
*to++ = *from++;
}
// 0-terminate
*to = 0;
// return start of copy, change to return to if end should be returned
return res;
}
There are at least two problems with your code.
res - sCopy makes no sense because they are pointing at different objects.
You haven't null-terminated the destination string.
#include <string.h>
char *getSub(const char *orig, int start, int count, char *res){
int i,j,len = (orig != NULL) ? (int)strlen(orig) : 0, limit = start + count; /* don't call strlen on NULL */
if(res == NULL) return NULL;
if(start >= len || start < 0 || orig == NULL){
*res = '\0';
return res;
}
for(j=0,i=start;i<len && i < limit;++i){
res[j++]=orig[i];
}
res[j]='\0';
return res;
}
I know C is purposefully bare-bones, but I'm curious as to why something as commonplace as a substring function is not included in <string.h>.
Is it that there is not one "right enough" way to do it? Too many domain specific requirements? Can anyone shed any light?
BTW, this is the substring function I came up with after a bit of research.
Edit: I made a few updates based on comments.
void substr (char *outStr, const char *inpStr, int startPos, size_t strLen) {
/* Cannot do anything with NULL. */
if (inpStr == NULL || outStr == NULL) return;
size_t len = strlen (inpStr);
/* Allow negative positions to count from the end; cannot
start before the start of the string, so force to the start. */
if (startPos < 0) {
startPos = len + startPos;
}
if (startPos < 0) {
startPos = 0;
}
/* Cannot start after the end of the string,
so force to the end. */
if ((size_t)startPos > len) {
startPos = len;
}
len = strlen (&inpStr[startPos]);
/* Adjust length if source string too short. */
if (strLen > len) {
strLen = len;
}
/* Copy string section */
memcpy(outStr, inpStr+startPos, strLen);
outStr[strLen] = '\0';
}
Edit: Based on a comment from r I also came up with this one liner. You're on your own for checks though!
#define substr(dest, src, startPos, strLen) snprintf((dest), BUFF_SIZE, "%.*s", (int)(strLen), (src) + (startPos))
Basic standard library functions don't burden themselves with excessive expensive safety checks, leaving them to the user. Most of the safety checks you carry out in your implementation are of expensive kind: totally unacceptable in such a basic library function. This is C, not Java.
Once you get some checks out of the picture, the "substring" function boils down to an ordinary strlcpy. I.e., ignoring the safety check on startPos, all you need to do is
char *substr(const char *inpStr, char *outStr, size_t startPos, size_t strLen) {
strlcpy(outStr, inpStr + startPos, strLen + 1); /* the size argument counts the terminating null */
return outStr;
}
While strlcpy is not part of the standard library, it can be crudely replaced by a [misused] strncpy. Again, ignoring the safety check on startPos, all you need to do is
char *substr(const char *inpStr, char *outStr, size_t startPos, size_t strLen) {
strncpy(outStr, inpStr + startPos, strLen);
outStr[strLen] = '\0';
return outStr;
}
Ironically, in your code strncpy is misused in the very same way. On top of that, many of your safety checks are the direct consequence of your choosing a signed type (int) to represent indices, while proper type would be an unsigned one (size_t).
Perhaps because it's a one-liner:
snprintf(dest, dest_size, "%.*s", sub_len, src+sub_start);
You DO have strcpy and strncpy. Aren't they enough for you? With strcpy you can simulate the substring from a character to the end; with strncpy you can simulate the substring from a character for a number of characters (you only need to remember to add the \0 at the end of the string). strncpy is even better than the C# equivalent, because you can overshoot the length of the substring and it won't throw an error (if you have allocated enough space in dest, you can do strncpy(dest, src, 1000) even if src has length 1; in C# you can't).
As written in the comment, you can even use memcpy, but remember to always add a \0 at the end of the string, and you must know how many characters you are copying (so you must know exactly the length of the src substring). It is also a little more complex to use if one day you want to refactor your code to use wchar_t, and it's not type-safe (because it accepts void* instead of char*). All this in exchange for a little more speed over strncpy.
In C you have a function that returns a subset of symbols from a string via pointers: strstr.
char *ptr;
char string1[] = "Hello World";
char string2[] = "World";
ptr = strstr(string1, string2);
ptr will point to the first character of the first occurrence of string2 within string1 (or be NULL if there is none).
BTW, you did not write a function but a procedure. The ANSI string functions are declared in string.h.
Here's a lighter weight version of what you want. Avoids the redundant strlen calls and guarantees null termination on the destination buffer (something strncpy won't do).
void substr(const char* pszSrc, int start, int N, char* pszDst, int lenDest)
{
const char* psz = pszSrc + start;
int x = 0;
if (lenDest <= 0)
{
return; // no room even for the terminator
}
// leave one byte of the destination for the terminator
while ((x < N) && (x < lenDest - 1))
{
char ch = psz[x];
pszDst[x] = ch;
x++;
if (ch == '\0')
{
return;
}
}
// guarantee null termination
pszDst[x] = 0;
}
Example:
char *pszLongString = "This is a long string";
char szSub[10];
substr(pszLongString, 10, 4, szSub, 10); // copies "long" into szSub and includes the null char
So while there isn't a formal substring function in C, C++ string classes usually have such a method:
#include <string>
...
std::string str;
std::string strSub;
str = "This is a long string";
strSub = str.substr(10, 4); // "long"
printf("%s\n", strSub.c_str());
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
const char* substr(const char *string, size_t from, size_t to);
int main(int argc, char *argv[])
{
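if (argc < 2)
return EXIT_FAILURE;
char *string = argv[1];
const char *substring = substr(string,6,80);
printf("string is [%s] substring is [%s]\n",string,substring ? substring : "(none)");
free((char *)substring); /* substr returns malloc'ed memory */
return 0;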
}
const char* substr(const char *string, size_t from, size_t to)
{
if (string == NULL)
return NULL;
size_t len = strlen(string);
if (to > len)
to = len; /* clamp the end to the length of the string */
if (from >= to) /* empty or inverted range, or from is past the end */
return NULL;
char *substring = malloc(to - from + 1);
if (substring == NULL)
return NULL;
size_t index;
for (index = 0; from < to; from++, index++)
substring[index] = string[from];
substring[index] = '\0';
return substring;
}
Using pointer arithmetic, it's possible to assign characters from one array to another. My question is, how does one do it given arbitrary start and stop points?
int main(void)
{
char string1[] = "something"; //[s][o][m][e][t][h][i][n][g][\0]
int start = 2, count = 3;
char string2[10] = {0};
char *ptr1 = &string1[start];
char *ptr2 = string2;
while (*ptr2++ = *ptr1++) { } //but stop after 3 elements???
printf("%s",&string2);
}
There's some kind of pointer arithmetic I'm missing to count/test the quantity of elements in a particular array. I do NOT want to declare an integral to count the loop! I want to do it all using pointers. Thanks!
When you write ptr1++;, it is equivalent to ptr1 = ptr1 + 1;. Adding an integer to a pointer moves the memory location of the pointer by the size (in bytes) of the type being pointed to. If ptr1 is a char pointer with value 0x5678 then incrementing it by one makes it 0x5679, because sizeof(char) == 1. But if ptr1 was a Foo *, and sizeof(Foo) == 12, then incrementing the pointer would make its value 0x5684.
If you want to point to an element that is 3 elements away from an element you already have a pointer to, you just add 3 to that pointer. In your question, you wrote:
char *ptr1 = &string1[start]; // array notation
Which is the same thing as:
char *ptr1 = string1 + start; // pointer arithmetic
You could rewrite as follows:
int main(void)
{
char string1[] = "something"; //[s][o][m][e][t][h][i][n][g][\0]
int start = 2, count = 3;
char string2[10] = {0};
// Ensure there is enough room to copy the substring
// and a terminating null character.
assert(count < sizeof(string2));
// Set pointers to the beginning and end of the substring.
const char *from = string1 + start;
const char *end = from + count;
// Set a pointer to the destination.
char *to = string2;
// Copy the indicated characters from the substring,
// possibly stopping early if the end of the substring
// is reached before count characters have been copied.
while (from < end && *from)
{
*to++ = *from++;
}
// Ensure the destination string is null terminated
*to = '\0';
printf("%s",string2);
}
Using const and meaningful variable names (from, to, or src, dst, instead of ptr1, ptr2) helps you avoid mistakes. Using assert and ensuring the string is null-terminated helps you avoid having to debug segfaults and other weirdness. In this case the destination buffer is already zeroed, but when you copy parts of this code to use in another program it may not be.
#include <stdio.h>
int main(void)
{
char string1[] = "something"; //[s][o][m][e][t][h][i][n][g][\0]
int start = 2, count = 3;
char string2[10] = {0};
char *ptr1 = &string1[start];
char *stop = ptr1 + count;
char *ptr2 = string2;
while ((ptr1 < stop) && (*ptr2++ = *ptr1++));
printf("%s",string2);
return 0;
}
I usually use a specific set of variable names in these situations, called:
src - source
dst - destination
end - the end of either the source (used here) or the destination
So:
int main(void)
{
char string1[] = "something";
int start = 2;
int count = 3;
char string2[10] = {0};
const char *src = &string1[start];
const char *end = &string1[start+count];
char *dst = string2;
assert(count < sizeof(string2));
while (src < end)
*dst++ = *src++;
*dst = '\0'; // Null-terminate copied string!
printf("%s",string2);
return(0);
}
Or, more plausibly, packaged as a function:
char *copy_substr(char *dst, const char *str, size_t start, size_t len)
{
const char *src = str + start;
const char *end = src + len;
while (src < end)
*dst++ = *src++;
*dst = '\0';
return(dst);
}
int main(void)
{
char string1[] = "something";
char *end;
char string2[10] = {0};
end = copy_substr(string2, string1, 2, 3);
printf("%s",string2);
return(0);
}
The function returns a pointer to the end of the string, which is unconventional and doesn't provide a marked benefit in the example, but which does have some merits when you are building a string piecemeal:
struct substr
{
const char *str;
size_t off;
size_t len;
};
static struct substr list[] =
{
{ "abcdefghijklmnopqrstuvwxyz", 2, 5 },
...
{ "abcdefghijklmnopqrstuvwxyz", 18, 3 },
};
int main(void)
{
char buffer[256];
char *str = buffer;
char *end = buffer + sizeof(buffer) - 1;
size_t i;
for (i = 0; i < 5; i++)
{
if (str + list[i].len >= end)
break;
str = copy_substr(str, list[i].str, list[i].off, list[i].len);
}
printf("%s\n", buffer);
return(0);
}
The main point is that the return value - a pointer to the NUL at the end of the string - is what you need for string concatenation operations. (In this example, with strings that have known lengths, you could survive without this return value without needing to use strlen() or strcat() repeatedly; in contexts where the called function copies an amount of data that cannot be determined by the calling routine, the pointer to the end is even more useful.)
In order to get the size (i.e. number of elements) in a static array, you would usually do
sizeof(string1) / sizeof(*string1)
which will divide the size (in bytes) of the array by the size (in bytes) of each element, thus giving you the number of elements in the array.
But as you're obviously trying to implement a strcpy clone, you could simply break the loop if the source character *ptr1 is '\0' (C strings are zero-terminated). If you only want to copy N characters, you could break if ptr1 >= string1 + start + count.