String.indexOf function in C

Is there a C library function that will return the index of a character in a string?
So far, all I've found are functions like strstr that return a char * to the match, not its location in the original string.

strstr returns a pointer to the start of the match, so you could use pointer arithmetic. (Note: this code has not been compiled or tested; it is one step away from pseudocode.)
char * source = "test string"; /* assume source address is */
/* 0x10 for example */
char * found = strstr( source, "in" ); /* should return 0x18 */
if (found != NULL) /* strstr returns NULL if item not found */
{
int index = found - source; /* index is 8 */
/* source[8] gets you "i" */
}

I think that
size_t strcspn ( const char * str1, const char * str2 );
is what you want. Here is an example:
/* strcspn example */
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] = "fcba73";
char keys[] = "1234567890";
int i;
i = strcspn (str,keys);
printf ("The first number in str is at position %d.\n",i+1);
return 0;
}
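One detail the example above does not show: when no character from the set occurs, strcspn returns the length of the whole string rather than an error value. A minimal sketch of a wrapper that turns this into -1, assuming a hypothetical helper name first_index_of_any:

```c
#include <string.h>

/* Hypothetical helper: index of the first character of s that also
   appears in set, or -1 if no such character exists. */
long first_index_of_any(const char *s, const char *set)
{
    /* strcspn returns the length of the leading run of s that
       contains no character from set. */
    size_t n = strcspn(s, set);

    /* If that run reaches the terminator, nothing matched. */
    return s[n] != '\0' ? (long)n : -1;
}
```

Called as first_index_of_any("fcba73", "1234567890"), this yields the zero-based index 4, which the example above prints as position 5.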

EDIT: strchr is the better choice when searching for a single character.
Pointer arithmetic does the rest:
char *found = strchr (myString, '#');
int pos = found ? found - myString : -1;
Important: strchr () returns NULL if the character is not found.

You can use strstr to accomplish what you want. Example:
char *a = "Hello World!";
char *b = strstr(a, "World");
int position = b - a;
printf("the offset is %i\n", position);
This produces the result:
the offset is 6
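The snippet above subtracts the pointers without checking for a failed search; if strstr returns NULL the subtraction is undefined. A minimal sketch of the same idea with that check added, using a hypothetical helper name offset_of:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: offset of needle within haystack,
   or -1 if needle does not occur. */
ptrdiff_t offset_of(const char *haystack, const char *needle)
{
    const char *hit = strstr(haystack, needle);

    /* Only subtract when the search succeeded. */
    return hit != NULL ? hit - haystack : -1;
}
```

ptrdiff_t is used because it is the type that pointer subtraction produces; an int would work for short strings but can overflow for very long ones.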

string.h is part of standard C, so you can use strchr(), which returns a pointer to the first occurrence of a character (or NULL if it is absent).

Write your own :)
Code from a BSD licensed string processing library for C, called zString
https://github.com/fnoyanisi/zString
int zstring_search_chr(char *token,char s){
if (!token || s=='\0')
return 0;
for (;*token; token++)
if (*token == s)
return 1;
return 0;
}

You can write
const char *s = "bvbrburbhlkvp";
int index = strstr(s, "h") - s;
to find the index of 'h' in the given garble. (This assumes the search succeeds; strstr returns NULL otherwise.)

Related

Get index of substring

I have char * source, and I want to extract a substring from it that I know begins with the characters "abc" and ends where source ends. With strstr I can get the pointer, but not the position, and without the position I don't know the length of the substring. How can I get the index of the substring in pure C?
Use pointer subtraction.
char *str = "sdfadabcGGGGGGGGG";
char *result = strstr(str, "abc");
int position = result - str;
int substringLength = strlen(str) - position;
newptr - source will give you the offset.
char *source = "XXXXabcYYYY";
char *dest = strstr(source, "abc");
int pos;
pos = dest - source;
Here is a C version of the strpos function with an offset feature...
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int strpos(char *haystack, char *needle, int offset);
int main()
{
char *p = "Hello there all y'al, hope that you are all well";
int pos = strpos(p, "all", 0);
printf("First all at : %d\n", pos);
/* start just past the first match; an offset of 10 would find the same "all" again */
pos = strpos(p, "all", pos + 1);
printf("Second all at : %d\n", pos);
return 0;
}
int strpos(char *hay, char *needle, int offset)
{
/* reject offsets outside the string */
if (offset < 0 || (size_t)offset > strlen(hay))
return -1;
/* search in place; no temporary copy is needed */
char *p = strstr(hay + offset, needle);
if (p)
return p - hay;
return -1;
}
If you have the pointer to the first char of the substring, and the substring ends at the end of the source string, then:
strlen(substring) will give you its length.
substring - source will give you the start index.
Formally the others are right - substring - source is indeed the start index. But you won't need it: you would use it as index into source. So the compiler calculates source + (substring - source) as the new address - but just substring would be enough for nearly all use cases.
Just a hint for optimization and simplification.
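The points made in these answers (the pointer itself is the substring, pointer subtraction gives the index, strlen gives the length) can be sketched together as one function. The name tail_from and its out-parameters are hypothetical, chosen only for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the part of source that
   starts at marker and runs to the end of source, or NULL if marker
   is absent. On success *index receives the start index and *length
   the length of that tail. */
const char *tail_from(const char *source, const char *marker,
                      size_t *index, size_t *length)
{
    const char *tail = strstr(source, marker);
    if (tail == NULL)
        return NULL;

    *index = (size_t)(tail - source);  /* pointer subtraction */
    *length = strlen(tail);            /* tail ends where source ends */
    return tail;
}
```

No copy is made: the returned pointer aliases source, which is enough whenever the substring runs to the end of the original string.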
A function to cut a word out of a string by a start and end word (note that this snippet is C#, not C):
string search_string = "check_this_test"; // The string you want to get the substring
string from_string = "check"; // The word/string you want to start
string to_string = "test"; // The word/string you want to stop
string result = search_string; // Sets the result to the search_string (if from and to word not in search_string)
int from_match = search_string.IndexOf(from_string) + from_string.Length; // Get position of start word
int to_match = search_string.IndexOf(to_string); // Get position of stop word
if (from_match > -1 && to_match > -1) // Check if start and stop word in search_string
{
result = search_string.Substring(from_match, to_match - from_match); // Cuts the text between the two words out of search_string
}
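Since the question asks for pure C, here is a rough C counterpart of the snippet above, built on strstr and memcpy. The function name between and its return convention are assumptions made for this sketch. Unlike the C# version, it looks for the stop word only after the start word.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the text between the first occurrence
   of from and the next occurrence of to into out. Returns 0 on
   success, -1 if either word is missing or out is too small. */
int between(const char *s, const char *from, const char *to,
            char *out, size_t out_size)
{
    const char *start = strstr(s, from);
    if (start == NULL)
        return -1;
    start += strlen(from);                /* skip past the start word */

    const char *end = strstr(start, to);  /* stop word after the start */
    if (end == NULL)
        return -1;

    size_t n = (size_t)(end - start);
    if (n + 1 > out_size)                 /* need room for the '\0' */
        return -1;

    memcpy(out, start, n);
    out[n] = '\0';
    return 0;
}
```

With the inputs from the snippet above ("check_this_test", "check", "test"), the buffer receives "_this_".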

Why is substring not part of the C standard library?

I know C is purposefully bare-bones, but I'm curious as to why something as commonplace as a substring function is not included in <string.h>.
Is it that there is not one "right enough" way to do it? Too many domain specific requirements? Can anyone shed any light?
BTW, this is the substring function I came up with after a bit of research.
Edit: I made a few updates based on comments.
void substr (char *outStr, const char *inpStr, int startPos, size_t strLen) {
/* Cannot do anything with NULL. */
if (inpStr == NULL || outStr == NULL) return;
size_t len = strlen (inpStr);
/* Allow negative positions to count from the end; they cannot
start before the start of the string, so force to start. */
if (startPos < 0) {
startPos = len + startPos;
}
if (startPos < 0) {
startPos = 0;
}
/* Cannot start after end of string, force to end. */
if ((size_t)startPos > len) {
startPos = len;
}
len = strlen (&inpStr[startPos]);
/* Adjust length if source string too short. */
if (strLen > len) {
strLen = len;
}
/* Copy string section */
memcpy(outStr, inpStr+startPos, strLen);
outStr[strLen] = '\0';
}
Edit: Based on a comment from r I also came up with this one liner. You're on your own for checks though!
#define substr(dest, src, startPos, strLen) snprintf(dest, BUFF_SIZE, "%.*s", strLen, src+startPos)
Basic standard library functions don't burden themselves with excessive expensive safety checks, leaving them to the user. Most of the safety checks you carry out in your implementation are of expensive kind: totally unacceptable in such a basic library function. This is C, not Java.
Once you get some checks out of the picture, the "substring" function boils down to ordinary strlcpy. I.e. ignoring the safety check on startPos, all you need to do is
char *substr(const char *inpStr, char *outStr, size_t startPos, size_t strLen) {
/* strlcpy's size argument counts the terminator, hence the + 1 */
strlcpy(outStr, inpStr + startPos, strLen + 1);
return outStr;
}
strlcpy is not part of the standard library, but it can be crudely replaced by a [misused] strncpy. Again, ignoring the safety check on startPos, all you need to do is
char *substr(const char *inpStr, char *outStr, size_t startPos, size_t strLen) {
strncpy(outStr, inpStr + startPos, strLen);
outStr[strLen] = '\0';
return outStr;
}
Ironically, in your code strncpy is misused in the very same way. On top of that, many of your safety checks are the direct consequence of your choosing a signed type (int) to represent indices, while proper type would be an unsigned one (size_t).
Perhaps because it's a one-liner:
snprintf(dest, dest_size, "%.*s", sub_len, src+sub_start);
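The one-liner leaves every check to the caller. A minimal sketch of how it might be wrapped, assuming a hypothetical function name substr_copy and a 0 / -1 return convention that are not part of the original answer:

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper around the snprintf one-liner. Returns 0 on
   success, -1 if start lies past the end of src, len does not fit in
   an int, or the result had to be truncated to fit dest. */
int substr_copy(char *dest, size_t dest_size,
                const char *src, size_t start, size_t len)
{
    if (dest_size == 0 || start > strlen(src) || len > INT_MAX)
        return -1;

    /* "%.*s" takes its precision as an int and stops early if the
       terminator of src is reached first. */
    int written = snprintf(dest, dest_size, "%.*s", (int)len, src + start);

    /* snprintf reports the length it wanted to write. */
    return (written >= 0 && (size_t)written < dest_size) ? 0 : -1;
}
```

Because snprintf always terminates the destination when dest_size is nonzero, dest holds a valid string even in the truncation case.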
You DO have strcpy and strncpy. Aren't they enough for you? With strcpy you can simulate the substring from a character to the end; with strncpy you can simulate the substring from a character for a number of characters (you only need to remember to add the \0 at the end of the string). strncpy is even more forgiving than the C# equivalent, because you can overshoot the length of the substring and it won't throw an error (if you have allocated enough space in dest, you can do strncpy(dest, src, 1000) even if src has length 1; strncpy pads the rest of dest with \0. In C# you can't.)
As written in the comment, you can even use memcpy, but remember to always add a \0 at the end of the string, and you must know how many characters you are copying (so you must know exactly the length of the src substring). It is also a little more complex to use if one day you want to refactor your code to use wchar_t, and it is not type-safe (because it accepts void* instead of char*). All this in exchange for a little more speed over strncpy.
In C you have a function that returns a subset of symbols from a string via pointers: strstr.
char *ptr;
char string1[] = "Hello World";
char string2[] = "World";
ptr = strstr(string1, string2);
ptr will point to the first character of the first occurrence (or be NULL if there is none).
BTW you did not write a function but a procedure, ANSI string functions: string.h
Here's a lighter weight version of what you want. Avoids the redundant strlen calls and guarantees null termination on the destination buffer (something strncpy won't do).
void substr(const char* pszSrc, int start, int N, char* pszDst, int lenDest)
{
const char* psz = pszSrc + start;
int x = 0;
if (lenDest <= 0)
{
return;
}
// copy at most N chars, leaving room for the terminator
while ((x < N) && (x < lenDest - 1) && (psz[x] != '\0'))
{
pszDst[x] = psz[x];
x++;
}
// guarantee null termination
pszDst[x] = '\0';
}
Example:
char *pszLongString = "This is a long string";
char szSub[10];
substr(pszLongString, 10, 4, szSub, 10); // copies "long" into szSub and includes the null char
So while there isn't a formal substring function in C, C++ string classes usually have such a method:
#include <string>
...
std::string str;
std::string strSub;
str = "This is a long string";
strSub = str.substr(10, 4); // "long"
printf("%s\n", strSub.c_str());
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
const char* substr(const char *string, size_t from, size_t to);
int main(int argc, char *argv[])
{
if (argc < 2)
return 1;
char *string = argv[1];
const char *substring = substr(string,6,80);
printf("string is [%s] substring is [%s]\n",string,substring ? substring : "(none)");
free((void *)substring);
return 0;
}
const char* substr(const char *string, size_t from, size_t to)
{
if (string == NULL)
return NULL;
size_t len = strlen(string);
/* clamp the end, then reject empty or inverted ranges */
if (to > len)
to = len;
if (from >= to)
return NULL;
char *substring = malloc((to - from) + 1);
if (substring == NULL)
return NULL;
memcpy(substring, string + from, to - from);
substring[to - from] = '\0';
return substring; /* the caller must free() the result */
}

Making specific word in string uppercase C

I'm trying very hard to figure out a way to parse a string and "highlight" the search term in the result by making it uppercase.
I've tried using strstr and moving a pointer along and "toupper"ing the characters, to no avail.
char * highlight( char *str, char *searchstr ) {
char *pnt=str;
int i;
pnt=strstr(str,searchstr);
while(pnt){
printf("ststr retured: %s\n", pnt);
for(i=0;i<strlen(searchstr);i++) {
printf("%c",toupper(pnt[i]));
}
printf("\n");
pnt=pnt+strlen(searchstr);
pnt=strstr(pnt,searchstr);
}
return str;
}
Any advice is greatly appreciated.
Since Schot mentioned every occurrence:
#include <ctype.h>
#include <stdio.h>
#include <string.h>
char *highlight(char *str, char *searchstr) {
char *pnt = str;
while (pnt = strstr(pnt, searchstr)) {
char *tmp = searchstr;
while(*(tmp++)) { *pnt = toupper(*pnt); pnt++; }
}
return str;
}
int main() {
char s[] = "hello world follow llollo";
char search[] = "llo";
puts(highlight(s, search));
return 0;
}
output is:
$ ./a.out
heLLO world foLLOw LLOLLO
You appreciate that the function takes the string as an argument and then returns that same string, while having -not- modified that string? all the function does is print to stdout the capital characters.
At some point, you would need to change the string itself, e.g.;
pnt[i] = toupper( pnt[i] );
Like Blank Xavier said, you probably want to modify the actual string. toupper does not change the value of the character you supply, but returns a new character that is its uppercase version. You have to explicitly assign it back to the original string.
Some additional tips:
Never do multiple strlen calls on a string that doesn't change, do it once and store the result.
You can express the promise of not changing searchstr by declaring it as const char *.
Below is an example with a (in my opinion) easy method of looping through all strstr matches:
#include <string.h>
#include <ctype.h>
char *highlight(char *s, const char *t)
{
char *p;
size_t i, len = strlen(t);
for (p = s; (p = strstr(p, t)); p += len)
for (i = 0; i < len; i++)
p[i] = toupper(p[i]);
return s;
}
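Two caveats apply to the in-place versions above: they must not be given a string literal (modifying one is undefined behavior), and an empty search term makes strstr match without advancing, so the loop never ends. A minimal sketch of a variant that works on a copy and guards against the empty term; the name highlight_copy is hypothetical:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant: returns a malloc'ed copy of s in which every
   occurrence of t is uppercased. The caller frees the result.
   Returns NULL if the allocation fails. */
char *highlight_copy(const char *s, const char *t)
{
    size_t len = strlen(t);
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return NULL;
    strcpy(copy, s);

    if (len == 0)
        return copy;  /* empty term: nothing to highlight */

    for (char *p = copy; (p = strstr(p, t)) != NULL; p += len)
        for (size_t i = 0; i < len; i++)
            /* the cast avoids passing a negative char to toupper */
            p[i] = (char)toupper((unsigned char)p[i]);

    return copy;
}
```

The cast to unsigned char matters on platforms where char is signed, since toupper is only defined for values representable as unsigned char (or EOF).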

strpos in C- how does it work

I am really new to C.
I want to use the strpos function, but the compiler is telling me it doesn't exist?
Here is a complete code snippet to solve your problem.
PS: It isn't too late to help. ;)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NOT_FOUND -1
/* declare strpos before main uses it */
int strpos(const char *haystack, const char *needle);
int main (){
int pos = NOT_FOUND;
if ( (pos = strpos( "subsstring", "string")) != NOT_FOUND )
printf("found at %d\n", pos);
else
printf("not found!\n");
return 0;
}
int strpos(const char *haystack, const char *needle)
{
const char *p = strstr(haystack, needle);
if (p)
return p - haystack;
return NOT_FOUND;
}
Edit: Answering Can Vural question:
No. I really think it should stay as it is. In the structured programming paradigm, it is common practice to pass the structure being operated on as the first parameter of every function that belongs to that structure's scope. The strstr function defined in string.h follows the same approach.
In OOP you have haystack.indexOf( needle ). In structured programming, you have indexOf( haystack, needle ).
The function you are looking for might be either strstr or strchr. You then need to include string.h. There is no strpos in the POSIX interface.
Yes. It's called strstr, related to strpos like (pseudo-code):
strpos(str, target) {
res = strstr(str, target);
if (res == NULL) return -1; /* PHP returns false here; in C, false would be indistinguishable from index 0 */
else return res - str;
}
I have written strpos() function from scratch with position feature(Like PHP's strpos() function). Return value will be starting position of searched string. Enjoy! :)
In this example code output will be 12
#include <stdio.h>
#include <string.h>
int strpos(char *haystack, char *needle, int pos);
int main(){
printf("%d",strpos("abcdefabcdefabcdef asdfgavcabcddd","abc",10));
return 0;
}
int strpos(char *haystack, char *needle, int pos){
int i,j,check,result = -1;
int len_needle=strlen(needle);
int len_haystack=strlen(haystack);
i = pos;
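if (pos<0 || len_needle>len_haystack || *needle=='\0' || i>(len_haystack-1)) return result;
/* stop while a full needle still fits, so haystack[i+j] never reads past the end */
for(;i+len_needle<=len_haystack;i++){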
check = 0;
for(j=0;j<len_needle;j++){
if(haystack[i+j]==needle[j]){
check++;
}
}
if(check==len_needle){
result = i;
break;
}
}
return result;
}
This is in response to Miere and Can Vural. I can't add comments yet so will add this as an answer.
Shouldn't it be strpos("string", "substring") – Can Vural
At structured programming, you have indexOf( haystack, needle ). Miere
In your code, you have:
int strpos(char *haystack, char *needle)
but you also have:
(pos = strpos( "subsstring", "string"))
I fully agree with the "int strpos(char *haystack, char *needle)" where the string to be searched comes first and the string to search FOR comes second. But to me, "subsstring" (in the context of "one is a substring and one is a string"), "subsstring" implies that IT is the shorter of the two and that you're trying to find "substring" in "string."
So the one part:
(pos = strpos( "subsstring", "string"))
should be:
(pos = strpos( "string" /*that which is being searched within*/, "substring" /*that which is being searched for in the previous parameter*/))
which would be the same as:
(pos = strpos( "haystack", "needle"))
Edit: One of the C comments above wasn't closed properly due to a typo.
There is no function strpos defined in the Standard C library nor in the POSIX Standard. PHP has a function strpos with this definition:
strpos(string $haystack, string $needle, int $offset = 0): int|false
This function locates a substring needle inside a string haystack and returns the offset from the beginning of the string haystack.
C has function strstr that can be used for this purpose and returns a pointer to the substring or NULL if no match is found:
char *strstr(const char *haystack, const char *needle);
Here is an implementation for a C equivalent of strpos relying on strstr for the dirty work. Note however that the name strpos is reserved for future functions in <string.h>:
#include <limits.h>
#include <string.h>
int strpos(const char *haystack, const char *needle, int offset) {
const char *p;
size_t len, pos;
len = strlen(haystack);
pos = 0;
if (offset < 0) {
/* a negative offset counts back from the end; clamp to the start */
if (len > INT_MAX || offset >= -(int)len)
pos = len + offset;
} else {
/* an offset past the end cannot match */
if ((size_t)offset > len)
return -1;
pos = offset;
}
p = strstr(haystack + pos, needle);
if (p != NULL && p - haystack <= INT_MAX)
return (int)(p - haystack);
else
return -1;
}

Is there a function in c that will return the index of a char in a char array?

Is there a function in c that will return the index of a char in a char array?
For example something like:
char values[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char find = 'E';
int index = findIndexOf( values, find );
strchr returns the pointer to the first occurrence, so to find the index, just take the offset with the starting pointer. For example:
char values[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char find = 'E';
const char *ptr = strchr(values, find);
if(ptr) {
int index = ptr - values;
// do something
}
There's also size_t strcspn(const char *str, const char *set); it returns the index of the first occurrence in str of any character that is included in set, or the length of str if there is none:
size_t index = strcspn(values, "E");
int index = strchr(values,find)-values;
Note that if find is not present, strchr returns NULL and the subtraction is undefined behavior, so check the result of strchr before computing the index.
Safe index_of() function that works even when it finds nothing (returns -1 in such case).
#include <stddef.h>
#include <string.h>
ptrdiff_t index_of(const char *string, char search) {
const char *moved_string = strchr(string, search);
/* If not null, return the difference. */
if (moved_string) {
return moved_string - string;
}
/* Character not found. */
return -1;
}
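The same strchr-and-subtract pattern extends to finding every occurrence by restarting the search one character past the previous hit. A minimal sketch, assuming a hypothetical helper name indices_of and a caller-supplied output array:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: stores up to max indices at which ch occurs
   in s and returns the total number of occurrences found. */
size_t indices_of(const char *s, char ch, size_t *out, size_t max)
{
    size_t count = 0;

    /* strchr would match the terminator itself for '\0';
       treat that as "no occurrences" here. */
    if (ch == '\0')
        return 0;

    for (const char *p = strchr(s, ch); p != NULL; p = strchr(p + 1, ch)) {
        if (count < max)
            out[count] = (size_t)(p - s);
        count++;
    }
    return count;
}
```

The return value can exceed max, which tells the caller that the output array was too small to hold every index.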
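What about strpos? (It is not part of the C standard library or POSIX, so this assumes your platform or your own code provides it.)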
#include <string.h>
int index;
...
index = strpos(values, find);
Note that strpos expects a zero-terminated string, which means you should add a '\0' at the end. If you can't do that, you're left with a manual loop and search.
You can use strcspn() to get the index of a char in a string, or use my lousy implementation:
// Returns the index of the first occurrence of a char
int string_indexof(char ch, char *everything) {
int everythingLength = strlen(everything);
for (int i = 0; i < everythingLength; i++) {
if (ch == everything[i]) {
return i;
}
}
return -1;
}
You can use strchr to get a pointer to the first occurrence and then subtract the original char* from it (if it is not null) to get the position.

Resources