I have a char * source, and I want to extract a substring from it. I know the substring begins with the symbols "abc" and ends where source ends. With strstr I can get the pointer, but not the position, and without the position I don't know the length of the substring. How can I get the index of the substring in pure C?
Use pointer subtraction.
char *str = "sdfadabcGGGGGGGGG";
char *result = strstr(str, "abc");
int position = result - str;
int substringLength = strlen(str) - position;
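Note that strstr returns NULL when the needle is absent, so the subtraction is worth guarding. A minimal sketch, assuming a hypothetical helper named index_of (not a standard function) that returns -1 when nothing is found:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of the first occurrence of needle in
   haystack, or -1 if it does not occur. */
ptrdiff_t index_of(const char *haystack, const char *needle)
{
    const char *hit = strstr(haystack, needle);
    if (hit == NULL)
        return -1;           /* not found: do not subtract from NULL */
    return hit - haystack;   /* a pointer difference has type ptrdiff_t */
}
```

With the string above, index_of(str, "abc") gives 5, and strlen(str) - 5 is the length of the remaining substring.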
newptr - source will give you the offset.
char *source = "XXXXabcYYYY";
char *dest = strstr(source, "abc");
int pos;
pos = dest - source;
Here is a C version of the strpos function with an offset feature...
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int strpos(char *haystack, char *needle, int offset);
int main()
{
char *p = "Hello there all y'al, hope that you are all well";
int pos = strpos(p, "all", 0);
printf("First all at : %d\n", pos);
pos = strpos(p, "all", pos + 1); /* start just past the first match; an offset of 10 would find it again */
printf("Second all at : %d\n", pos);
}
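int strpos(char *hay, char *needle, int offset)
{
/* Reject offsets outside the string. */
if (offset < 0 || (size_t)offset > strlen(hay))
return -1;
/* Search in place from hay + offset; no temporary copy is needed,
which also avoids an unterminated buffer. */
char *p = strstr(hay + offset, needle);
if (p)
return p - hay;
return -1;
}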
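An offset-based search like this is mostly useful for walking through every occurrence in turn. A minimal sketch of that loop, calling strstr directly rather than the strpos above; find_all is a hypothetical name, and the choice to skip overlapping matches is an assumption:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: stores up to max start indices of needle in hay
   into out[] and returns how many were stored. */
size_t find_all(const char *hay, const char *needle, size_t *out, size_t max)
{
    size_t count = 0;
    size_t step = strlen(needle);
    if (step == 0)
        return 0;                      /* an empty needle would match everywhere */
    const char *p = hay;
    while (count < max && (p = strstr(p, needle)) != NULL) {
        out[count++] = (size_t)(p - hay);
        p += step;                     /* resume after this match (non-overlapping) */
    }
    return count;
}
```

For the sample sentence used in main, this reports two matches of "all", at indices 12 and 40.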
If you have the pointer to the first char of the substring, and the substring ends at the end of the source string, then:
strlen(substring) will give you its length.
substring - source will give you the start index.
Formally the others are right - substring - source is indeed the start index. But you won't need it: you would use it as index into source. So the compiler calculates source + (substring - source) as the new address - but just substring would be enough for nearly all use cases.
Just a hint for optimization and simplification.
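To illustrate the point that the pointer alone is usually enough, here is a sketch that copies the tail of a string without ever computing the index. The name tail_from is hypothetical, and returning NULL when the marker is missing is an assumed convention:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'ed copy of source from the first
   occurrence of marker to the end, or NULL if marker is absent or the
   allocation fails. The caller frees the result. */
char *tail_from(const char *source, const char *marker)
{
    const char *start = strstr(source, marker);
    if (start == NULL)
        return NULL;
    size_t len = strlen(start);        /* substring length; no index needed */
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, start, len + 1);      /* +1 copies the terminator too */
    return copy;
}
```

Calling tail_from("XXXXabcYYYY", "abc") yields a fresh copy of "abcYYYY".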
A function to cut a word out of a string, given a start word and an end word (note that this snippet is C#, not C):
string search_string = "check_this_test"; // The string to extract the substring from
string from_string = "check"; // The word/string to start after
string to_string = "test"; // The word/string to stop before
string result = search_string; // Defaults to search_string (used if the from and to words are not both found)
int from_index = search_string.IndexOf(from_string); // Position of the start word, or -1
int to_match = search_string.IndexOf(to_string); // Position of the stop word, or -1
if (from_index > -1 && to_match >= from_index + from_string.Length) // Both words found, in order
{
int from_match = from_index + from_string.Length; // First character after the start word
result = search_string.Substring(from_match, to_match - from_match); // Cuts out the text between the two words
}
#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]) {
char * str = "Testing replace text...\n\n";
char* buffer = malloc(sizeof(char));
char* insertPoint = &buffer[0];
char* copy = str;
char* p = strstr(str, "epl");
char* g = "gard";
int size = 0;
size = p-copy; //p = 9, which is the number of elemts till the first element of the substring
//want to allocate this space, and then increment insertPoint, by that amt(it'll be pointing
// to nothing)
buffer = realloc(buffer, size);
printf("Size: %d\n", size);
memcpy(insertPoint, copy, size);
printf("COPY: %s\n", buffer);
copy += size;
buffer = realloc(buffer, size+strlen(g));
insertPoint += size;
printf("%c", *insertPoint);
memcpy(insertPoint, g, strlen(g)); //insert after the 9 letters, the string the size of g
size += strlen(g); //size if the size of the buffer
printf("Size2: %d\n", size);
printf("COPY2: %s\n", buffer);
return EXIT_SUCCESS;
}
Just some quick experimental code. I am trying to replace the substring "epl" in str with "gard", but when I print the buffer there are no changes. The first print works: it gets all the letters before the substring into buffer. But when I try to insert the replacement, it doesn't work. I've tested the individual pointers and they all seem correct, so I'm not sure what is happening. Any insight? Thanks. This is a fully runnable program.
I think the problem in your code arises because strlen does not include the terminating zero. Note also that insertPoint is taken before the calls to realloc, which may move the buffer and leave that pointer dangling. I tried to fix your code, but in the end I found it easier to rewrite it from scratch (using more sensible variable names).
The following simple four steps work. The continuous use of strlen may be replaced by variables, but I left them for clarity. (Also, a good compiler may very well optimize this code by leaving the calls out.)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
char *str = "Testing replace text...\n\n";
char* buffer;
char* find_str = "epl";
char* repl_str = "gard";
char *find_str_pos = strstr (str, find_str);
/* 1. create new buffer of the correct size */
buffer = malloc (strlen(str) - strlen(find_str) + strlen(repl_str) + 1);
/* 2. copy first part */
memcpy (buffer, str, find_str_pos - str);
/* 3. add new text */
memcpy (buffer + (find_str_pos - str), repl_str, strlen(repl_str));
/* 4. append original text, including its terminating zero */
memcpy (buffer + (find_str_pos - str) + strlen(repl_str), find_str_pos + strlen(find_str), strlen(find_str_pos) - strlen(find_str) + 1);
printf ("-> [%s]\n", buffer);
return EXIT_SUCCESS;
}
You are not appending the remaining text "ace text..." after replacing "epl".
Try the code below:
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
int main() {
char *source_string = "Testing replace text...";
char *search_string = "epl";
char *replace_string = "gard";
int search_length = strlen(search_string);
int replace_length = strlen(replace_string);
// Find start position of search string
char *start = strstr(source_string, search_string); // start pointing to "eplace text..."
int intial_length = start - source_string; // intial_length = 9
// Get remaining text which should append after replace
char *remaining_string = (start + search_length); // remaining_string pointing to "ace text..."
int remaining_length = strlen(remaining_string); // remaining_length = 11
// Find total length of string after replacing text
int total_string_length = intial_length + replace_length + remaining_length; // 24
char *buffer = (char *)malloc(total_string_length + 1); // +1 for the null terminator
char *current_index = buffer;
// Add initial text
memcpy(current_index, source_string, intial_length);
current_index += intial_length;
// Add replace text
memcpy(current_index, replace_string, replace_length);
current_index += replace_length;
// Add remaining text
memcpy(current_index, remaining_string, remaining_length);
current_index += remaining_length;
*current_index = '\0'; // add the null terminator at the end
printf("Final Output: %s", buffer); // Final Output: Testing rgardace text...
return 0;
}
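The same steps (size the buffer, copy the head, copy the replacement, copy the tail) can be packaged as a reusable function that also handles the case where strstr returns NULL, which the snippets above assume never happens. A sketch, assuming a hypothetical helper named replace_first; returning an unchanged copy when nothing matches is one possible convention, not the only one:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'ed copy of src with the first
   occurrence of find replaced by repl. If find is empty or does not
   occur, the copy is unchanged. Returns NULL on allocation failure.
   The caller frees the result. */
char *replace_first(const char *src, const char *find, const char *repl)
{
    size_t src_len = strlen(src);
    size_t find_len = strlen(find);
    size_t repl_len = strlen(repl);
    const char *hit = find_len ? strstr(src, find) : NULL;

    if (hit == NULL) {                      /* nothing to replace: plain copy */
        char *copy = malloc(src_len + 1);
        if (copy != NULL)
            memcpy(copy, src, src_len + 1);
        return copy;
    }

    size_t head_len = (size_t)(hit - src);
    size_t tail_len = src_len - head_len - find_len;
    char *out = malloc(head_len + repl_len + tail_len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, src, head_len);                         /* text before the match */
    memcpy(out + head_len, repl, repl_len);             /* the replacement */
    memcpy(out + head_len + repl_len, hit + find_len,
           tail_len + 1);                               /* tail plus its '\0' */
    return out;
}
```

Copying tail_len + 1 bytes in the last step carries the terminator along, so no separate write of '\0' is needed.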
I have a character pointer that points to the beginning of a string, and an index less than the length of the string. Now I want to create a pointer to a substring of the original string, from the beginning up to the index (or some other substring under the same constraints). Please help me find a way to get it done. Here is a bit of the code:
char* ch="323+465";//this is the original string
int index=2; //this is the index upto which I wish to create a substring,
// in java, it would have been ch.substring(0,3), if ch were a String
Thanks in advance.
You can't do that without creating 3 strings. The char pointer only marks the beginning of the string, so you would need to combine a pointer and an index into a new type. Remember that C has no string type. Languages like Java (and others) create copies of the substring anyway.
struct pseudo_string { char *s; int index; } vstring[3];
char* ch="323+465";
vstring[0].s = ch;
vstring[0].index = 2;
vstring[1].s = ch + vstring[0].index + 1; // weird
vstring[1].index = 1;
vstring[2].s = vstring[1].s + 1;
vstring[2].index = 2;
So it is overly complex and useless. In this case index is being used as counter...
If you want to keep the same base pointer, you are going to need two indices, or one index and a length:
struct pseudo_string2 { char *s; int start; int end; };
But that's an overkill for small strings.
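For comparison, the pointer-plus-length variant can be used without copying anything. A minimal sketch, assuming a hypothetical str_view type and helper names that are not part of any standard library:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical view type: a pointer into an existing string plus a length.
   It owns no memory and is only valid while the base string is. */
struct str_view {
    const char *s;
    size_t len;
};

/* View of base[start .. start+len); assumes the range lies inside base. */
struct str_view make_view(const char *base, size_t start, size_t len)
{
    struct str_view v = { base + start, len };
    return v;
}

/* Returns 1 if the view's characters equal the C string text, else 0. */
int view_equals(struct str_view v, const char *text)
{
    return strlen(text) == v.len && memcmp(v.s, text, v.len) == 0;
}
```

A view is not null-terminated, so it cannot be passed to functions expecting a C string; printing one needs printf("%.*s", (int)v.len, v.s).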
If you don't want to use malloc, you can try a matrix:
char vstring[3][10]={0};
strncpy(vstring[0], ch, 3);
strncpy(vstring[1], ch+3, 1);
strncpy(vstring[2], ch+4, 3);
The advantage of the matrix, even if you waste a few bytes, is that you don't need to deallocate it. But if you need to use these values outside this function, then you have no escape other than malloc and free (don't consider globals for that ;-).
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char * substr(char *s, int start, int end)
{
int size = end - start + 2; // 1 for the inclusive limits and another 1 for the \0
char * r = (char*)malloc(size);
strncpy(r,s+start, size-1);
r[size-1]=0;
return r;
}
int main()
{
char* ch="323+465";
char *parts[3];
parts[0] = substr(ch, 0,2);
parts[1] = substr(ch, 3,3);
parts[2] = substr(ch, 4,6);
printf("%s %s %s\n", parts[0], parts[1], parts[2]);
free(parts[0]);
free(parts[1]);
free(parts[2]);
}
Make a copy of a suitable number of characters:
char * substr = malloc(index + 2);
strncpy(substr, ch, index + 1);
substr[index + 1] = 0;
// ...
free(substr);
If you're happy to mutilate the original string, just insert a null byte:
ch[index + 1] = 0;
The odd + 1 comes from the fact that your index seems to be inclusive, which is generally a bad idea.
You can't, because that would imply modifying the string literal, which is illegal.
Alternative:
char ch[]="323+465";
int index=2;
ch[index] = '\0';
First, make the string writable by using a char array instead of a pointer to a string literal. Then
char ch[] = "323+456";
int idx = 2;
ch[idx] = 0; /* Now ch holds the string "32" */
You should avoid an identifier clash with the classic BSD index function, that's why I used idx instead.
This solution assumes it is okay to modify the original string. If not, you need to allocate a new string first.
To get the same behavior as Java's substring (which allocates a new string):
char* res = (char*)malloc(index+2);
strncpy(res,ch,index+1);
res[index+1]='\0';
I see that you are trying to split on +, so it is easier to use strtok:
char ch[] ="323+465";
char * res;
res = strtok (ch,"+");
// res= 323
I have a character array of length 32 and would like to take certain characters out of it.
for example
111111000000000000000000111111 <32 chars
I would like to take chars 0-6 which would be 111111
Or even take chars 26-31 which would be 111111
char check_type[32];
Above is how I'm declaring.
What I would like to be able to do is define a function or use a function that takes that starting place, and end character.
I've looked at many approaches, such as strncpy and strcpy, but haven't found one that works yet.
I would simply wrap strncpy:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Creates a sub-string of range [start, end], return value must be freed */
char *substr(char *src, size_t start, size_t end)
{
size_t sub_len = end - start + 1;
char * new_str = malloc(sub_len + 1); /* TODO: check malloc's return value */
strncpy(new_str, src + start, sub_len); /* copy from the start offset, not from the beginning of src */
new_str[sub_len] = '\0'; /* new_str is of size sub_len + 1 */
return new_str;
}
int main(void)
{
char str[] = "111111000000000000000000111111";
char *sub_str = substr(str, 0, 5);
puts(sub_str);
free(sub_str);
return EXIT_SUCCESS;
}
Output:
111111
Use memcpy.
// Stores s[from..to) in sub.
// The caller is responsible for memory allocation.
void extract_substr(char const *s, char *sub, size_t from, size_t to)
{
size_t sublen = to - from;
memcpy(sub, s + from, sublen);
sub[sublen] = '\0';
}
Sample:
char *substr(char *source, int startpos, int endpos)
{
int len = endpos - startpos + 2; // must account for final 0
int i = 0;
char *src, *dst;
char *ret = calloc(len, sizeof(char));
if (!ret)
return ret;
src = source + startpos;
dst = ret;
while (i++ < len - 1) /* copy the characters; the last slot is reserved for the 0 */
*dst++ = *src++;
*dst = 0;
return ret;
}
Of course, free the returned pointer when you don't need it anymore. Note that this function does not check the validity of endpos versus startpos.
First define the required interface...perhaps:
int substring(char *target, size_t tgtlen, const char *source, size_t src_bgn, size_t src_end);
This takes a destination (target) array where the data will be copied, and is given its length. The data will come from the source array, between positions src_bgn and src_end. The return value will be -1 on error, otherwise the length of the output (excluding the terminating null). If the target array is too short, you will get an error.
With that set of details in place, you can implement the body fairly easily, and strncpy() might well be appropriate this time (it often isn't).
Usage (based on your question):
char check_type[32] = "111111000000000000000000111111";
char result1[10];
char result2[10];
if (substring(result1, sizeof(result1), check_type, 0, 6) <= 0 ||
substring(result2, sizeof(result2), check_type, 26, 31) <= 0)
...something went wrong...
else
...use result1 and result2...
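One possible body for that interface is sketched below. The description leaves open whether src_end is inclusive, so this sketch assumes a half-open range [src_bgn, src_end), and it assumes that a range running past the end of source counts as an error; both choices are assumptions, not part of the interface as stated.

```c
#include <stddef.h>
#include <string.h>

/* Copies source[src_bgn .. src_end) into target and terminates it.
   Returns the number of characters copied (excluding the '\0'),
   or -1 on any error. Half-open range is an assumption. */
int substring(char *target, size_t tgtlen, const char *source,
              size_t src_bgn, size_t src_end)
{
    if (target == NULL || source == NULL || src_end < src_bgn)
        return -1;
    if (src_end > strlen(source))
        return -1;                     /* range runs past the end of source */
    size_t n = src_end - src_bgn;
    if (n >= tgtlen)
        return -1;                     /* no room for n characters plus '\0' */
    memcpy(target, source + src_bgn, n);
    target[n] = '\0';
    return (int)n;
}
```

Under these assumptions the call with 0, 6 yields "111111". The sample literal is only 30 characters long, so the last six characters are the range 24, 30, and a call with 26, 31 would be rejected.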
Check this:
char* Substring(char *string, int len, int start, int end) {
/*
Creates a substring from a given string.
Args:
string: The string whose substring you need to find.
len: The length of the string.
start: The start position for the substring.
end: The end position of the substring (inclusive).
Returns:
substring: (of type char*) which is allocated on the heap.
NULL: on error.
*/
// Check that the start and end position are valid.
// If not valid, then return NULL.
if (start < 0 || start >= len || end < 0 || end >= len) {
return NULL;
}
// Allocate memory to return the substring on the heap.
char *substring = malloc(sizeof(char) * (end - start + 2));
int index = 0, i;
for (i = start; i <= end; i++) {
substring[index] = string[i];
index++;
}
// End with a null character.
substring[index] = '\0';
return substring;
}
int main() {
char str[] = "11111100000000000000000000111111";
printf("%s\n", Substring(str, strlen(str), 0, 5));
printf("%s\n", Substring(str, strlen(str), 26, 31));
}
I know C is purposefully bare-bones, but I'm curious as to why something as commonplace as a substring function is not included in <string.h>.
Is it that there is not one "right enough" way to do it? Too many domain specific requirements? Can anyone shed any light?
BTW, this is the substring function I came up with after a bit of research.
Edit: I made a few updates based on comments.
void substr (char *outStr, const char *inpStr, int startPos, size_t strLen) {
/* Cannot do anything with NULL. */
if (inpStr == NULL || outStr == NULL) return;
size_t len = strlen (inpStr);
/* Allow negative positions to count from the end; cannot
start before the start of the string, so force to start. */
if (startPos < 0) {
startPos = len + startPos;
}
if (startPos < 0) {
startPos = 0;
}
/* Cannot start after the end of the string,
so force to end. */
if ((size_t)startPos > len) {
startPos = len;
}
len = strlen (&inpStr[startPos]);
/* Adjust length if source string too short. */
if (strLen > len) {
strLen = len;
}
/* Copy string section */
memcpy(outStr, inpStr+startPos, strLen);
outStr[strLen] = '\0';
}
Edit: Based on a comment from r I also came up with this one liner. You're on your own for checks though!
#define substr(dest, src, startPos, strLen) snprintf(dest, BUFF_SIZE, "%.*s", strLen, src+startPos)
Basic standard library functions don't burden themselves with excessive expensive safety checks, leaving them to the user. Most of the safety checks you carry out in your implementation are of expensive kind: totally unacceptable in such a basic library function. This is C, not Java.
Once you get some checks out of the picture, the "substring" function boils down to an ordinary strlcpy. I.e., ignoring the safety check on startPos, all you need to do is
char *substr(const char *inpStr, char *outStr, size_t startPos, size_t strLen) {
strlcpy(outStr, inpStr + startPos, strLen + 1); /* the size argument counts the terminating '\0' */
return outStr;
}
While strlcpy is not part of the standard library, it can be crudely replaced by a [misused] strncpy. Again, ignoring the safety check on startPos, all you need to do is
char *substr(const char *inpStr, char *outStr, size_t startPos, size_t strLen) {
strncpy(outStr, inpStr + startPos, strLen);
outStr[strLen] = '\0';
return outStr;
}
Ironically, in your code strncpy is misused in the very same way. On top of that, many of your safety checks are the direct consequence of your choosing a signed type (int) to represent indices, while proper type would be an unsigned one (size_t).
Perhaps because it's a one-liner:
snprintf(dest, dest_size, "%.*s", sub_len, src+sub_start);
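To make the one-liner concrete, here is a sketch that wraps it in a function. The wrapper name substr_snprintf is hypothetical, and the parameter names simply mirror the line above. Note that the precision argument for "%.*s" must be an int.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper around the one-liner. Returns snprintf's result:
   the number of characters the substring needs, excluding the '\0',
   which may exceed what actually fit into dest. */
int substr_snprintf(char *dest, size_t dest_size,
                    const char *src, size_t sub_start, int sub_len)
{
    /* Assumes sub_start <= strlen(src). "%.*s" copies at most sub_len
       characters and stops early at the end of src. */
    return snprintf(dest, dest_size, "%.*s", sub_len, src + sub_start);
}
```

Because snprintf always terminates dest (when dest_size is nonzero) and reports the length it wanted to write, a return value greater than or equal to dest_size signals truncation.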
You DO have strcpy and strncpy. Aren't they enough for you? With strcpy you can simulate a substring from a character to the end; with strncpy you can simulate a substring from a character for a number of characters (you only need to remember to add the \0 at the end of the string). strncpy is even better than the C# equivalent, because you can overshoot the length of the substring and it won't throw an error: if you have allocated enough space in dest, you can do strncpy(dest, src, 1000) even if src has length 1 (strncpy pads the rest of dest with \0). In C# you can't.
As written in the comment, you can even use memcpy, but remember to always add a \0 at the end of the string, and you must know exactly how many characters you are copying (so you must know the exact length of the src substring). It is also a little more complex to use if one day you want to refactor your code to use wchar_t, and it is not type-safe (because it accepts void* instead of char*). All this in exchange for a little more speed over strncpy.
In C you have a function that returns a subset of symbols from a string via pointers: strstr.
char *ptr;
char string1[] = "Hello World";
char string2[] = "World";
ptr = strstr(string1, string2);
ptr will point to the first character of the first occurrence of string2 within string1.
BTW, what you wrote is a procedure rather than a function, since it returns nothing. The ANSI string functions are declared in string.h.
Here's a lighter weight version of what you want. Avoids the redundant strlen calls and guarantees null termination on the destination buffer (something strncpy won't do).
void substr(char* pszSrc, int start, int N, char* pszDst, int lenDest)
{
const char* psz = pszSrc + start;
int x = 0;
if (lenDest <= 0)
{
return; // no room even for the terminator
}
// copy at most N characters, leaving room for the terminator
while ((x < N) && (x < lenDest - 1) && (psz[x] != '\0'))
{
pszDst[x] = psz[x];
x++;
}
// guarantee null termination
pszDst[x] = '\0';
}
Example:
char *pszLongString = "This is a long string";
char szSub[10];
substr(pszLongString, 10, 4, szSub, 10); // copies "long" into szSub and includes the null char
So while there isn't a formal substring function in C, C++ string classes usually have such a method:
#include <string>
...
std::string str;
std::string strSub;
str = "This is a long string";
strSub = str.substr(10, 4); // "long"
printf("%s\n", strSub.c_str());
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
const char* substr(const char *string, size_t from, size_t to);
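int main(int argc, char *argv[])
{
if (argc < 2) /* argv[1] is required */
return 1;
char *string = argv[1];
const char *substring = substr(string,6,80);
printf("string is [%s] substring is [%s]\n",string,substring ? substring : "(null)");
free((void *)substring); /* free(NULL) is harmless */
return 0;
}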
const char* substr(const char *string, size_t from, size_t to)
{
/* Returns a newly allocated copy of string[from..to), or NULL on error. */
if (string == NULL)
return NULL;
size_t len = strlen(string);
if (to > len)
to = len; /* clamp the end to the string length */
if (from >= to)
return NULL; /* empty or inverted range, or from is past the end */
char *substring = malloc((to - from) + 1);
if (substring == NULL)
return NULL;
size_t index;
for (index = 0; from < to; from++, index++)
substring[index] = string[from];
substring[index] = '\0';
return substring;
}
Is there a C library function that will return the index of a character in a string?
So far, all I've found are functions like strstr that return the found char *, not its location in the original string.
strstr returns a pointer to the found character, so you could use pointer arithmetic: (Note: this code not tested for its ability to compile, it's one step away from pseudocode.)
char * source = "test string"; /* assume source address is */
/* 0x10 for example */
char * found = strstr( source, "in" ); /* should return 0x18 */
if (found != NULL) /* strstr returns NULL if item not found */
{
int index = found - source; /* index is 8 */
/* source[8] gets you "i" */
}
I think that
size_t strcspn ( const char * str1, const char * str2 );
is what you want. Here is an example:
/* strcspn example */
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] = "fcba73";
char keys[] = "1234567890";
int i;
i = strcspn (str,keys);
printf ("The first number in str is at position %d.\n",i+1);
return 0;
}
EDIT: strchr is the better choice when searching for a single character.
Pointer arithmetic says "Hello!":
char *p = strchr (myString, '#');
int pos = p ? p - myString : -1;
Important: strchr () returns NULL if the character is not found
You can use strstr to accomplish what you want. Example:
char *a = "Hello World!";
char *b = strstr(a, "World");
int position = b - a;
printf("the offset is %i\n", position);
This produces the result:
the offset is 6
string.h is part of standard C, so you can simply use strchr() and subtract the start of the string from the pointer it returns.
Write your own :)
Code from a BSD licensed string processing library for C, called zString
https://github.com/fnoyanisi/zString
int zstring_search_chr(char *token,char s){
if (!token || s=='\0')
return 0;
for (;*token; token++)
if (*token == s)
return 1;
return 0;
}
You can write
const char *s = "bvbrburbhlkvp";
int index = strstr(s, "h") - s; /* index is 8 */
to find the index of 'h' in the given garble.