Extracting web addresses from a string in C

I have trouble with my code and I need your help! I need to write a function that extracts the web address that starts with www. and ends with .edu from an input string. The input string has no spaces in it, so scanf() should work well here.
For example:
http://www.school.edu/admission. The extracted address should be www.school.edu.
This is what I came up with so far, it obviously didn't work, and I can't think of anything else unfortunately.
void extract(char *s1, char *s2) {
int size = 0;
char *p, *j;
p = s1;
j = s2;
size = strlen(s1);
for(p = s1; p < (s1 + size); p++) {
if(*p == 'w' && *(p+1) == 'w' && *(p+2) == 'w' && *(p+3) == '.'){
for(p; p < (p+4); p++)
strcat(*j, *p);
}
else if(*p=='.' && *(p+1)=='e' && *(p+2)=='d' && *(p+3)=='u'){
for(p; (p+1) < (p+4); p++)
strcat(*j, *p);
}
}
size = strlen(j);
*(j+size+1) = '\0';
}
The function has to use pointer arithmetic. The errors I get have something to do with incompatible types and casting. Thanks ahead!

So the most trivial approach might be:
#include <stdio.h>
int main(void)
{
char str[1000];
sscanf("http://www.school.edu/admission", "%*[^/]%*c%*c%[^/]", str);
puts(str);
}
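The conversion string in that sscanf call is dense, so here is a hedged breakdown as a small helper. The name host_from_url and the 255-character field width are assumptions added for illustration; they are not part of the answer above.

```c
#include <stdio.h>

/* Hypothetical wrapper around the sscanf approach.
   host must have room for at least 256 chars.
   Returns 0 when a host was read, -1 otherwise. */
int host_from_url(const char *url, char *host)
{
    /* %*[^/]    skip everything up to the first '/'
       %*c%*c    skip the two '/' characters
       %255[^/]  store up to 255 characters that are not '/' */
    return sscanf(url, "%*[^/]%*c%*c%255[^/]", host) == 1 ? 0 : -1;
}
```

The field width keeps sscanf from writing past the end of host, which the unbounded %[^/] in the one-liner does not guarantee.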
Now, here goes the fixed code:
#include <stdio.h>
#include <string.h>
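void extract(char *s1, char *s2) {
size_t i = 0;
/* strncmp stops at the terminator, so neither loop reads past the end of s1 */
while(s1[i] != '\0' && strncmp(s1 + i, "www.", 4)){
i++;
}
while(s1[i] != '\0' && strncmp(s1 + i, ".edu", 4)){
*s2++ = *(s1 + i);
i++;
}
*s2 = '\0';
if(s1[i] != '\0'){ /* ".edu" was found */
strcat(s2, ".edu");
}
}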
int main(void)
{
char str1[1000] = "http://www.school.edu/admission", str2[1000];
extract(str1, str2);
puts(str2);
}
Note that s2 must be large enough to contain the extracted web address; writing past its end is undefined behavior and may crash with a segfault.

This is an easy solution for your problem:
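/* needs <stdio.h>, <stdlib.h> and <string.h> */
char* extract(char *s1) {
char* ptr_www;
char* ptr_edu;
size_t len;
char* s2;
ptr_www = strstr(s1,"www.");
if (ptr_www == NULL) return NULL;
ptr_edu = strstr(ptr_www,".edu"); /* search only after "www." */
if (ptr_edu == NULL) return NULL;
len = (size_t)(ptr_edu - ptr_www) + 4; /* include ".edu" */
s2 = malloc(len + 1);
if (s2 == NULL) return NULL;
memcpy(s2,ptr_www,len);
s2[len] = '\0';
printf ("%s",s2);
return s2; /* the caller must free() this */
}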

There is a lot wrong unfortunately. Your compilation is failing because you pass a char to strcat when it expects a char*. Even if it did compile though it would crash.
for(p = s1; p < (s1 + size); p++) {
// This if statement looks up to three chars ahead of p. Short-circuit evaluation of && stops at the terminating '\0', so it stays in bounds, but only just.
if(*p=='w' && *(p+1)=='w' && *(p+2)=='w' && *(p+3)=='.') {
for(p; p < (p+4); p++) // This is an infinite loop
// strcat concatenates one string onto another.
// Dereferencing the pointer makes no sense.
// This is likely what is causing your compilation error.
// If this compiled it would almost certainly segfault.
strcat(*j, *p);
}
// This also looks ahead of p; again the comparison stops at the '\0', so it stays inside the string.
else if(*p=='.' && *(p+1)=='e' && *(p+2)=='d' && *(p+3)=='u') {
for(p; (p+1) < (p+4); p++) // This is also an infinite loop
// Again strcat expects 2x char* (aka. strings) not 2x char
// This will also almost certainly segfault.
strcat(*j, *p);
}
}
// strlen() counts the number of chars until the first '\0' occurrence
// It is never correct to call strlen() to determine where to add a '\0' string termination character.
// If the character were actually absent this would almost certainly result in a segfault.
// As it is strcat() (when called correctly) will add the terminator anyway.
size = strlen(j);
*(j+size+1) = '\0';
EDIT: This seems like a homework question, so I thought it would be more constructive to mention where your current code is going wrong, so you can recheck your knowledge in those areas.
The answer to your exact question is it doesn't compile because you dereference the string and hence pass 2x char instead of char* to strcat().
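Putting the points above together, a corrected version could look roughly like the following. This is only a sketch: the name extract_edu, the capacity parameter cap, and the int return convention are assumptions made for illustration. It locates the two markers with strstr and then copies with plain pointer arithmetic, one character at a time, instead of calling strcat on single characters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the span from "www." through ".edu" into dst.
   Returns 0 on success, -1 if the span is missing or does not fit in cap. */
int extract_edu(const char *src, char *dst, size_t cap)
{
    const char *start = strstr(src, "www.");
    if (start == NULL)
        return -1;

    /* search for ".edu" only after the "www." prefix */
    const char *end = strstr(start + 4, ".edu");
    if (end == NULL)
        return -1;
    end += 4; /* one past the final 'u' */

    size_t len = (size_t)(end - start);
    if (len + 1 > cap) /* need room for the terminator as well */
        return -1;

    for (const char *p = start; p < end; p++)
        *dst++ = *p; /* copy characters, not strings */
    *dst = '\0';
    return 0;
}
```

Called with "http://www.school.edu/admission" and a 32-byte buffer, it leaves "www.school.edu" in the buffer.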

Related

Manipulating a string and rewriting it by the function output

For some functions for string manipulation, I try to rewrite the function output onto the original string. I came up with the general scheme of
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *char_repeater(char *str, char ch)
{
int tmp_len = strlen(str) + 1; // initial size of tmp
char *tmp = (char *)malloc(tmp_len); // initial size of tmp
// the process is normally too complicated to calculate the final length here
int j = 0;
for (int i = 0; i < strlen(str); i++)
{
tmp[j] = str[i];
j++;
if (str[i] == ch)
{
tmp[j] = str[i];
j++;
}
if (j > tmp_len)
{
tmp_len *= 2; // growth factor
tmp = realloc(tmp, tmp_len);
}
}
tmp[j] = 0;
char *output = (char *)malloc(strlen(tmp) + 1);
// output matching the final string length
strncpy(output, tmp, strlen(tmp));
output[strlen(tmp)] = 0;
free(tmp); // Is it necessary?
return output;
}
int main()
{
char *str = "This is a test";
str = char_repeater(str, 'i');
puts(str);
free(str);
return 0;
}
Although it works on simple tests, I am not sure if I am on the right track.
Is this approach safe overall?
Of course, we do not re-write the string. We simply write new data (array of the characters) at the same pointer. If output is longer than str, it will rewrite the data previously written at str, but if output is shorter, the old data remains, and we would have a memory leak. How can we free(str) within the function before outputting to its pointer?

A pair of pointers can be used to iterate through the string.
When a matching character is found, increment the length.
Allocate output as needed.
Iterate through the string again and assign the characters.
This could be done in place if str was malloced in main.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *char_repeater(char *str, char ch)
{
int tmp_len = strlen(str) + 1; // initial size of tmp
char *find = str;
while ( *find) // not at terminating zero
{
if ( *find == ch) // match
{
tmp_len++; // add one
}
++find; // advance pointer
}
char *output = NULL;
if ( NULL == ( output = malloc(tmp_len)))
{
fprintf ( stderr, "malloc problem\n");
exit ( 1);
}
// output matching the final string length
char *store = output; // to advance through output
find = str; // reset pointer
while ( *find) // not at terminating zero
{
*store = *find; // assign
if ( *find == ch) // match
{
++store; // advance pointer
*store = ch; // assign
}
++store; // advance pointer
++find;
}
*store = 0; // terminate
return output;
}
int main()
{
char *str = "This is a test";
str = char_repeater(str, 'i');
puts(str);
free(str);
return 0;
}
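The remark that this could be done in place if str was malloced can be sketched as follows. The function name char_repeater_inplace and its pointer-to-pointer interface are assumptions for illustration; the idea is to grow the block once with realloc and then copy from the back so that no unread character is overwritten.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-place variant: *pstr must point to malloc'd memory.
   Returns 0 on success, -1 if realloc fails (the string is then unchanged). */
int char_repeater_inplace(char **pstr, char ch)
{
    size_t len = strlen(*pstr);
    size_t extra = 0;
    for (size_t i = 0; i < len; i++)
        extra += ((*pstr)[i] == ch); /* one extra byte per match */

    char *grown = realloc(*pstr, len + extra + 1);
    if (grown == NULL)
        return -1;

    /* walk backwards: dst never falls behind src, so nothing is clobbered */
    size_t src = len, dst = len + extra;
    grown[dst] = '\0';
    while (src > 0) {
        char c = grown[--src];
        grown[--dst] = c;
        if (c == ch)
            grown[--dst] = c;
    }
    *pstr = grown;
    return 0;
}
```

This only works for heap memory; passing the address of a pointer to a string literal, as main does above, would be undefined behavior.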

For starters the function should be declared like
char * char_repeater( const char *s, char c );
because the function does not change the passed string.
Your function is unsafe and inefficient, at least because there are many dynamic memory allocations. You need to check that each dynamic memory allocation was successful. The function strlen is also called too often.
Also this code snippet
tmp[j] = str[i];
j++;
if (str[i] == ch)
{
tmp[j] = str[i];
j++;
}
if (j > tmp_len)
//...
can invoke undefined behavior. Imagine that the source string contains only one letter 'i'. In this case the variable tmp_len is equal to 2. So tmp[0] will be equal to 'i' and tmp[1] also will be equal to 'i'. In this case j equal to 2 will not be greater than tmp_len. As a result this statement
tmp[j] = 0;
will write outside the allocated memory.
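If the grow-as-you-go design is kept, the capacity check has to run before the writes and has to leave room for the terminator. A minimal sketch, assuming a hypothetical name char_repeater_grow and a growth rule of cap * 2 + 2 chosen only for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* Sketch of the realloc-based variant with the check moved before the writes. */
char *char_repeater_grow(const char *str, char ch)
{
    size_t cap = strlen(str) + 1;
    char *tmp = malloc(cap);
    if (tmp == NULL)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; str[i] != '\0'; i++) {
        /* worst case for this step: two characters plus the final '\0' */
        if (j + 3 > cap) {
            size_t new_cap = cap * 2 + 2;
            char *grown = realloc(tmp, new_cap);
            if (grown == NULL) {
                free(tmp); /* realloc failure leaves the old block valid */
                return NULL;
            }
            tmp = grown;
            cap = new_cap;
        }
        tmp[j++] = str[i];
        if (str[i] == ch)
            tmp[j++] = ch;
    }
    tmp[j] = '\0';
    return tmp;
}
```

With the single-letter input "i" from the example, the buffer is grown before the second 'i' and the terminator are written, so nothing lands outside the allocation.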
And it is a bad idea to reassign the pointer str
char *str = "This is a test";
str = char_repeater(str, 'i');
As for your question whether you need to free the dynamically allocated array tmp
free(tmp); // Is it necessary?
then of course you need to free it because you allocated a new array for the result string
char *output = (char *)malloc(strlen(tmp) + 1);
As for your other question
but if output is shorter, the old data remains, and we would have a
memory leak. How can we free(str) within the function before
outputting to its pointer?
then it does not make sense. The function creates a new character array dynamically that you need to free, and the address of the allocated array is assigned to the pointer str in main, which as I already mentioned is not a good idea.
You should first count the length of the result array that will contain the duplicated characters, and then allocate memory only once.
Here is a demonstration program.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * char_repeater( const char *s, char c )
{
size_t n = 0;
for ( const char *p = s; ( p = strchr( p, c ) ) != NULL; ++p )
{
++n;
}
char *result = malloc( strlen( s ) + 1 + n );
if ( result != NULL )
{
if ( n == 0 )
{
strcpy( result, s );
}
else
{
char *p = result;
do
{
*p++ = *s;
if (*s == c ) *p++ = c;
} while ( *s++ );
}
}
return result;
}
int main( void )
{
const char *s = "This is a test";
puts( s );
char *result = char_repeater( s, 'i' );
if ( result != NULL ) puts( result );
free( result );
}
The program output is
This is a test
Thiis iis a test

My kneejerk reaction is to dislike the design. But I have reasons.
First, realloc() is actually quite efficient. If you are just allocating a few extra bytes every loop, then chances are that the standard library implementation simply increases the internal bytecount value associated with your memory. Caveats are:
Interleaving memory management. Your function here doesn’t have any, but should you start calling other routines then keeping track of all that becomes an issue. Anything that calls other memory management routines can lead to the next problem:
Fragmented memory. If at any time the available block is too small for your new request, then a much more expensive operation to obtain more memory and copy everything over becomes an issue.
Algorithmic issues are:
Mixing memory management in increases the complexity of your code.
Every occurrence of c invokes a function call with potential to be expensive. You cannot control when it is expensive and when it is not.
Worst-case options (char_repeater( "aaaaaaaaaa", 'a' )) trigger worst-case potentialities.
My recommendation is to simply make two passes.
This passes several smell tests:
Algorithmic complexity is broken down into two simpler parts:
counting space required, and
allocating and copying.
Worst-case scenarios for allocation/reallocation are reduced to a single call to malloc().
Issues with very large strings are reduced:
You need at most space for 2 large strings (not 3, possibly repeated)
Page fault / cache boundary issues are similar (or the same) for both methods
Considering there are no real downsides to using a two-pass approach, I think that using a simpler algorithm is reasonable. Here’s code:
#include <stdio.h>
#include <stdlib.h>
char * char_repeater( const char * s, char c )
{
// FIRST PASS
// (1) count occurrences of c in s
size_t number_of_c = 0;
const char * p = s;
while (*p) number_of_c += (*p++ == c);
// (2) get strlen s
size_t length_of_s = p - s;
// SECOND PASS
// (3) allocate space for the resulting string
char * dest = malloc( length_of_s + number_of_c + 1 );
// (4) copy s -> dest, duplicating every occurrence of c
if (dest)
{
char * d = dest;
while (*s)
if ((*d++ = *s++) == c)
*d++ = c;
*d = '\0';
}
return dest;
}
int main(void)
{
char * s = char_repeater( "Hello world!", 'o' );
puts( s );
free( s );
return 0;
}
As always, know your data
Whether or not a two-pass approach actually is better than a realloc() approach depends on more factors than what is evident in a posting on the internet.
Nevertheless, I would wager that for general purpose strings that this is a better choice.
But, even if it isn’t, I would argue that a simpler algorithm, splitting tasks into trivial sub-tasks, is far easier to read and maintain. You should start making tricky algorithms only if you have use-case profiling saying you need to spend more attention on it.
Without that, readability and maintainability trump all other concerns.

C string recursive function to find out equality from middle

I feel kind of lost. Since we started learning about pointers I can't quite follow, and I know it's a really important subject in C.
Anyway!
I have to write a recursive function that gets 2 pointers:
1) a pointer to index [0].
2) a pointer to the middle of the string.
Now I have to check whether the first part, from 0 to the middle, is equal to the part from the middle to the end, like ADAMADAM.
Before passing the string I convert all lowercase letters to capitals to avoid case sensitivity. I got something like this, but it refuses to work.
Also, using constant is prohibited...
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define TRUE 1
#define FALSE 0
#define SS 81
int CheckString(char *,int *);
int main() {
char str[SS];
int length,len,strcheck;
printf("Please enter a string:\n");
fgets(str,SS,stdin);
len=(strlen(str) - 1);
if((len>0)&&(str[len]=='\n')) // replacing '\n' key to '\0'
str[len]='\0';
length=len/2;
strcheck=CheckString(str,&length);
if (strcheck==FALSE)
printf("FALSE.\n");
else
printf("TRUE.\n");
return 0;
}
// function
int CheckString(char *str, int *i) {
if (*str != '\0')
if (*str == str[*i])
return CheckString(str+1,i++);
else
return FALSE;
return TRUE;
}
so i guess i got some problem with the pointers

It seems you mean the following
#include <stdio.h>
#include <string.h>
int CheckString(const char *s, size_t *i)
{
return s[*i] == '\0' || *s == s[*i] && CheckString(s + 1, i);
}
int main( void )
{
char *s = "ADAMADAM";
size_t i = strlen(s) / 2;
int result = CheckString(s, &i);
printf("%s\n", result ? "true" : "false");
return 0;
}
The program output
true
Note: maybe you should calculate the value for the second argument the following way
size_t i = ( strlen(s) + 1 ) / 2;
Think about this.
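To make that hint concrete, here is a small sketch that runs the same recursive check with either midpoint. The wrapper halves_match and its round_up flag are hypothetical names introduced for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Recursive comparison: *i is the fixed distance between compared characters. */
static int check_string(const char *s, const size_t *i)
{
    if (s[*i] == '\0')
        return 1; /* the second half is exhausted: everything matched */
    if (*s != s[*i])
        return 0;
    return check_string(s + 1, i);
}

/* Hypothetical wrapper: round_up selects (len + 1) / 2 instead of len / 2. */
int halves_match(const char *s, int round_up)
{
    size_t len = strlen(s);
    size_t mid = round_up ? (len + 1) / 2 : len / 2;
    return check_string(s, &mid);
}
```

For even lengths both formulas give the same offset. For an odd length such as "MADAMADAM", len / 2 lets the two halves share the middle character, while (len + 1) / 2 skips it, so the two calls can disagree.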

The outer condition inside CheckString() should check for *(str + *i) != '\0', or equivalently, str[*i] != '\0'. Also, you do not need to increment *i, and certainly not i, since that is a pointer. The value *i is the distance between the characters being checked in the two halves of the string.
The modified function could look like:
int CheckString(char *str, int *i) {
if (str[*i] != '\0') {
if (*str == str[*i]) {
return CheckString(str+1,i);
} else {
return FALSE;
}
}
return TRUE;
}

The problem specification says (more or less):
I've got to make a recursive function that will get 2 pointers:
pointer 1 to index [0].
pointer 2 to the middle of the string.
I've got to check if the first part from 0 to middle is equal to the second part from middle to end, like: ADAMADAM.
As an exercise in recursion, this is fine; as a way of implementing the functionality, recursion is overkill (iteration is fine).
There is confusion (ambiguity) about the interface to the function — the wording of the question seems to suggest two char * values, but the code uses a pointer to an integer as the second argument. That's singularly peculiar. An integer value could make sense, but a pointer to an integer does not.
We need to define the conditions carefully. Taking the example string given (char str1[] = "ADAMADAM";), the two pointers might be char *p1 = &str1[0]; char *p2 = &str1[0] + strlen(str1) / 2; — meaning p1 points to the first A and p2 to the third A. Consider an alternative string: char str2[] = "MADAMADAM";; The equivalent formula would leave p1 pointing at the first M and p2 pointing at the second M.
Assuming p1 and p2 are incremented in lock-step, then:
The strings are different if, at any point before *p2 equals '\0', *p1 != *p2.
If *p2 equals '\0', then the strings are the same.
By definition, p1 and p2 point to the same array, so pointer differences are legitimate.
Further, p1 must be less than p2 to be useful; p1 equal to p2 means the strings are identical trivially.
There is a strong argument that the 'middle of the string' criterion means that either p2[p2 - p1] == '\0' or p2[p2 - p1 + 1] == '\0' (for even and odd string lengths respectively). That is, the distance between the two pointers indicates where the end of the string must be. It means that using p1 = &str[0] and p2 = &str[2] (on either of the sample strings) should fail because the end of string isn't in the right place. And if the string was "AMAMAMAM", using &str[0] and &str[2] should fail because the end of string isn't in the right place; ditto &str[0] and &str[6].
However, this 'strong argument' is also a design decision. It would be feasible to simply demand that the substring from p2 to EOS (end of string) is the same as the string from p1 for the same length. In that case, using &str[0] with either &str[2] or &str[6] (or, indeed, with the normal &str[4]) on "AMAMAMAM" would work fine.
Using some of these observations leads to this code. If you're really under instructions not to use const, simply remove the const qualifiers where they appear. The code will work the same.
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
static bool identical_halfstrings(const char *p1, const char *p2)
{
assert(p1 <= p2);
assert(strlen(p1) >= strlen(p2) + (p2 - p1));
if (*p2 == '\0')
return true;
if (*p1 != *p2)
return false;
return identical_halfstrings(p1+1, p2+1);
}
int main(void)
{
const char *strings[] =
{
"ADAMADAM",
"MADAMADAM",
"nonsense",
};
enum { NUM_STRINGS = sizeof(strings) / sizeof(strings[0]) };
for (int i = 0; i < NUM_STRINGS; i++)
{
const char *p1 = strings[i];
const char *p2 = strings[i] + strlen(strings[i]) / 2;
printf("[%s] ([%s]) = %s\n", p1, p2,
identical_halfstrings(p1, p2) ? "TRUE" : "FALSE");
}
return 0;
}
The second assertion ensures that p1 and p2 are pointing to the same string — that there isn't a null byte between the locations pointed at by p1 and p2.
Test case output:
[ADAMADAM] ([ADAM]) = TRUE
[MADAMADAM] ([MADAM]) = TRUE
[nonsense] ([ense]) = FALSE
Just for the record, an iterative version of the same function is:
static bool identical_halfstrings(const char *p1, const char *p2)
{
assert(p1 <= p2);
assert(strlen(p1) >= strlen(p2) + (p2 - p1));
while (*p2 != '\0')
{
if (*p1++ != *p2++)
return false;
}
return true;
}
It produces the same output for the sample data.
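The stricter "middle of the string" criterion discussed earlier in this answer could be enforced with a separate check before comparing. The following is a sketch under that reading; is_midpoint is a hypothetical name, and it assumes p1 and p2 point into the same string. It compares the remaining length with the gap rather than indexing p2 directly, so it cannot read past the terminator.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: true when p2 sits at the midpoint of the string at p1.
   Precondition: p1 and p2 point into the same null-terminated array. */
bool is_midpoint(const char *p1, const char *p2)
{
    if (p2 < p1)
        return false;
    size_t gap = (size_t)(p2 - p1); /* length of the first half */
    size_t tail = strlen(p2);       /* length of the second half */
    /* even length: halves are equal; odd length: second half is one longer */
    return tail == gap || tail == gap + 1;
}
```

With "AMAMAMAM", offsets 2 and 6 are rejected and offset 4 is accepted, which matches the stricter interpretation described above.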

How to concatenate 2 strings using malloc and not the library functions

I need to create a function to concatenate 2 strings, in my case they are already given. I will need to concatenate the strings 'hello' and 'world!' to make it into 'helloworld!'. However, I can't use library functions besides strlen(). I also need to use malloc. I understand malloc would create n bytes of memory; however, how would I make it return a string array, if that's possible?
Here is what I have so far,
#include <stdio.h>
#include <string.h>
int *my_strcat(const char* const str1, const char *const str2)
{
int s1, s2, s3, i = 0;
char *a;
s1 = strlen(str1);
s2 = strlen(str2);
s3 = s1 + s2 + 1;
a = char *malloc(size_t s3);
for(i = 0; i < s1; i++)
a[i] = str1[i];
for(i = 0; i < s2; i++)
a[i+s1] = str2[i];
a[i]='\0';
return a;
}
int main(void)
{
printf("%s\n",my_strcat("Hello","world!"));
return 0;
}
Thanks to anyone who can help me out.

This problem is imo a bit simpler with pointers:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *mystrcat(char *a, char *b) {
char *p, *q, *rtn;
rtn = q = malloc(strlen(a) + strlen(b) + 1);
for (p = a; (*q = *p) != '\0'; ++p, ++q) {}
for (p = b; (*q = *p) != '\0'; ++p, ++q) {}
return rtn;
}
int main(void) {
char *rtn = mystrcat("Hello ", "world!");
printf("Returned: %s\n", rtn);
free(rtn);
return 0;
}
But you can do the same thing with indices:
char *mystrcat(char *a, char *b) {
char *rtn = malloc(strlen(a) + strlen(b) + 1);
int p, q = 0;
for (p = 0; (rtn[q] = a[p]) != '\0'; ++p, ++q) {}
for (p = 0; (rtn[q] = b[p]) != '\0'; ++p, ++q) {}
return rtn;
}

Here is an alternate fix. First, you forgot #include <stdlib.h> for malloc(). You return a pointer to char from the function my_strcat(), so you need to change the function prototype to reflect this. I also changed the const declarations so that the pointers are not const, only the values that they point to:
char * my_strcat(const char *str1, const char *str2);
Your call to malloc() is incorrectly cast, and there is no reason to do so anyway in C. It also looks like you were trying to cast the argument in malloc() to size_t. You can do so, but you have to surround the type identifier with parentheses:
a = malloc((size_t) s3);
Instead, I have changed the type declaration for s1, s2, s3, i to size_t since all of these variables are used in the context of string lengths and array indices.
The loops were the most significant change, and the reason that I changed the consts in the function prototype. Your loops looked fine, but you can also use pointers for this. You step through the strings by incrementing a pointer, incrementing a counter i, and store the value stored there in the ith location of a. At the end, the index i has been incremented to indicate the location one past the last character, and you store a '\0' there. Note that in your original code, the counter i was not incremented to indicate the location of the null terminator of the concatenated string, because you reset it when you looped through str2. @jpw shows one way of solving this problem.
I changed main() just a little. I declared a pointer to char to receive the return value from the function call. That way you can free() the allocated memory when you are through with it.
Here is the modified code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char * my_strcat(const char *str1, const char *str2)
{
size_t s1, s2, s3, i = 0;
char *a;
s1 = strlen(str1);
s2 = strlen(str2);
s3 = s1+s2+1;
a = malloc(s3);
while(*str1 != '\0') {
a[i] = *str1;
str1++;
i++;
}
while(*str2 != '\0') {
a[i] = *str2;
str2++;
i++;
}
a[i] = '\0'; // Here i = s1 + s2
return a;
}
int main(void)
{
char *str = my_strcat("Hello", "world!");
printf("%s\n", str);
/* Always free allocated memory! */
free(str);
return 0;
}

There are a few issues:
In the return from malloc you don't need to do any cast (you had the syntax for the cast wrong anyway).
You need to include the header stdlib.h for the malloc function.
And most importantly, in a[i]='\0'; the index i is not what you need it to be; you want to add the null char at the end, which should be a[s3-1]='\0'; (that is, index s1+s2).
This version should be correct (unless I missed something):
#include <stdio.h>
#include <stdlib.h> //for malloc
#include <string.h>
char *my_strcat(const char* const str1, const char *const str2)
{
int s1,s2,s3,i=0;
char *a;
s1 = strlen(str1);
s2 = strlen(str2);
s3 = s1+s2+1;
a = malloc(s3);
for(i = 0; i < s1; i++) {
a[i] = str1[i];
}
for(i = 0; i < s2; i++) {
a[i+s1] = str2[i];
}
a[s3-1] = '\0'; // you need the size of s1 + s2 + 1 here, but - 1 as it is 0-indexed
return a;
}
int main(void)
{
printf("%s\n",my_strcat("Hello","world!"));
return 0;
}
Testing with Ideone renders this output: Helloworld!
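None of the versions in this thread checks whether malloc() succeeded. A minimal sketch that adds the check while still using only strlen() from the library; the name my_strcat_checked is an assumption for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated concatenation, or NULL if allocation fails.
   The caller is responsible for calling free() on the result. */
char *my_strcat_checked(const char *str1, const char *str2)
{
    size_t s1 = strlen(str1);
    size_t s2 = strlen(str2);
    char *a = malloc(s1 + s2 + 1); /* +1 for the terminator */
    if (a == NULL)
        return NULL;

    size_t k = 0; /* single write index, never reset between the loops */
    for (size_t i = 0; i < s1; i++)
        a[k++] = str1[i];
    for (size_t i = 0; i < s2; i++)
        a[k++] = str2[i];
    a[k] = '\0'; /* k == s1 + s2 here */
    return a;
}
```

Keeping one write index for both loops avoids the original problem, where i was reset for the second loop and no longer marked the end of the result.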

How can I make a function to remove double letters in C?

I am trying to make a function that removes double letters from a string. The function is only supposed to remove double letters next to each other, not in the whole string, e.g. 'aabbaa' would become 'aba' (not 'ab'). I'm fairly new to C programming and don't fully understand pointers etc., and I need some help. Below is what I have so far. It does not work at all, and I have no idea what to return, since when I try to return string[] I get an error:
char doubleletter( char *string[] ) {
char surname[25];
int i;
for((i = 1) ; string[i] != '\0' ; i++) {
if (string[i] == string[(i-1)]) { //Supposed to compare the ith letter in array with one before
string[i] = '\0' ; //Supposed to swap duplicate chars with null
}
}
surname[25] = string;
return surname ;

Try the following. It is clear, simple, professional-looking code. :)
#include <stdio.h>
char * unique( char *s )
{
for ( char *p = s, *q = s; *q++; )
{
if ( *p != *q ) *++p = *q;
}
return s;
}
int main(void)
{
char s[] = "aabbaa";
puts( unique( s ) );
return 0;
}
The output is
aba
The function can also be rewritten as follows to avoid unnecessary copying.
char * unique( char *s )
{
for ( char *p = s, *q = s; *q++; )
{
if ( *p != *q )
{
( void )( ( ++p != q ) && ( *p = *q ) );
}
}
return s;
}
Or
char * unique( char *s )
{
for ( char *p = s, *q = s; *q++; )
{
if ( *p != *q && ++p != q ) *p = *q;
}
return s;
}
It seems that the last implementation is the best. :)

First of all, delete those parentheses around i = 1 in the for loop (why did you put them there in the first place?).
Secondly if you put \0 in the middle of the string, the string will just get shorter.
\0 terminates array (string) in C so if you have:
ababaabababa
and you replace the second 'a' in the pair with \0:
ababa\0bababa
then as far as the string functions are concerned, it is as if you had cut the string to:
ababa
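The truncation effect can be observed with a few lines of code. This is only a demonstration of the pitfall, not a fix; the function name length_after_nul_overwrite is made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Overwrites the first adjacent repeat with '\0', as the question's code does,
   and returns what strlen() reports afterwards. */
size_t length_after_nul_overwrite(char *s)
{
    if (s[0] == '\0')
        return 0; /* nothing to compare in an empty string */
    for (size_t i = 1; s[i] != '\0'; i++) {
        if (s[i] == s[i - 1]) {
            s[i] = '\0'; /* everything after this point becomes unreachable */
            break;
        }
    }
    return strlen(s);
}
```

For a writable copy of "ababaabababa" it returns 5: the characters after the inserted terminator are still in memory, but every string function stops at "ababa".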
The third error is probably the parameter type you declared here:
char *string[]
This is equivalent to char **string, so essentially you are passing an array of strings while you only want to pass a single string (that is, a pointer to its first character: char *string, or equivalently char string[]).
Next thing: you are making internal assumption that passed string will have less than 24 chars (+ \0) but you don't check it anywhere.
I guess easiest way (though maybe not the most clever) to remove duplicated chars is to copy in this for loop passed string to another one, omitting repeated characters.

One example: it does not modify the input string and returns a new dynamically allocated string. Pretty self-explanatory, I think:
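char *new_string_without_dups(const char *input_str, size_t len)
{
size_t i;
size_t j = 0;
char tmpstr[len+1]; /* variable-length array: it cannot take an initializer */
for (i = 0; i < len; i++) {
if (i > 0 && input_str[i] == input_str[i-1]) {
continue; /* skip a repeat of the previous character */
}
tmpstr[j] = input_str[i]; /* the first character is always kept */
j++;
}
tmpstr[j] = '\0';
return strdup(tmpstr);
}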
Don't forget to free the returned string after usage.
Note that there are several ways to adapt/improve this. One thing is that it requires C99 because the array size is not known at compile time. Another is that you can get rid of the len argument if you guarantee a \0-terminated string as input. I'll leave those as exercises.

Your idea behind the code is right, but you are making two fundamental mistakes:
You return a char [] from a function that has char as return type. char [], char * and char are three different types, even though in this case char [] and char * would behave identically. However you would have to return char * from your function to be able to return a string.
You return automatically allocated memory. In other languages where memory is reference counted this is OK. In C this causes undefined behavior. You cannot use automatic memory from within a function outside this very function. The memory is considered empty after the function exits and will be reused, i.e. your value will be overwritten. You have to either pass a buffer in, to hold the result, or do a dynamic allocation within the function with malloc(). Which one you do is a matter of style. You could also reuse the input buffer, but writing the function like that is undesirable in any case where you need to preserve the input, and it will make it impossible for you to pass const char* into the function, i.e. you would not be able to do something like this:
const char *str = "abbc";
... doubleletter(str,...);
If I had to write the function I would probably call it something like this:
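#include <stdio.h>
#include <string.h>
int doubleletter (const char *in, size_t inlen, char *out, size_t outlen){
size_t i;
size_t j = 0;
if (!outlen) return -1; /* no room even for the terminator */
if (!inlen) { out[0] = '\0'; return 0; }
if (outlen < 2) return -1;
out [j++] = in[0];
for (i = 1; i < inlen; ++i){
if (in[i - 1] != in[i]){
if (j >= outlen - 1) return -1; /* keep one byte for the terminator */
out[j++] = in[i];
}
}
out[j] = '\0';
return (int)j; /* length of the result */
}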
int main(void) {
const char *str1 = "aabbaa";
char out[25];
int ret = doubleletter(str1, strlen(str1), out, sizeof(out)/sizeof(out[0]));
printf("Result: %s", out);
return 0;
}

I would recommend using 2 indices to modify the string in-place:
void remove_doubles(char *str)
{
// if string is 1 or 0 length do nothing.
if(strlen(str)<=1)return;
int i=0; //index (new string)
int j=1; //index (original string)
// loop until end of string
while(str[j]!=0)
{
// as soon as we find a different letter,
// copy it to our new string and increase the index.
if(str[i]!=str[j])
{
i++;
str[i]=str[j];
}
// increase index on original/old string
j++;
}
// mark new end of string
str[i+1]='\0';
}

pointers strings

to copy at the correct place but doesn't stop after the count is reached. I thought my code should work as follows
char *GetSub(const char *orig, int start, int count, char *final);
int main(void)
{
const char source[] = "one two three";
char result[] = "123456789012345678";
printf("%s\n",GetSub(source, 4, 3, result));
return 0;
}
char *GetSub(const char *orig, int start, int count, char *final)
{
char *temp = (char *)orig;
final = temp;
for ( ; *temp && (start > 0) ; )
{
start--;
*temp++;
}
for ( ; *temp && (count > 0) ; count--)
{
*final++ = *temp++;
}
*final++ = 0;
return final;
}

The first for loop doesn't check if temp array exists (how would it check for existence of an allocated memory without asking memory manager in some way?!). The temp is merely a pointer. What you're checking for is that the orig string doesn't have a zero within the first start bytes. That's OK, perhaps that's what you meant by "existence".
Your intention is to copy from orig to final, yet you reset final to orig. That's where your error is. You must remove that line and it fixes the problem.
You don't need to create the temp pointer, you can use the orig pointer. You're free to modify it -- remember, function arguments are in effect local variables. Function arguments in C are pass-by-value, you implement pass-by-reference by passing pointers (which are values!) to data.
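The pass-by-value point is easy to verify in isolation. In this sketch the function names reassign_local and reassign_caller are invented for the illustration:

```c
#include <stddef.h>

/* Assigning to a pointer parameter changes only the function's local copy. */
void reassign_local(const char *p, const char *other)
{
    p = other; /* invisible to the caller */
    (void)p;   /* silence "set but unused" warnings */
}

/* To change the caller's pointer, pass the address of that pointer. */
void reassign_caller(const char **pp, const char *other)
{
    *pp = other;
}
```

This is why final = temp in the question had no effect on the caller's buffer: it only redirected the function's own copy of final, so all later writes went to the wrong place.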
I should add perhaps that the premise of this function is somewhat broken. It "works", but it's not what one might reasonably expect. Notably:
There's no indication that the source string was shorter than start.
There's no indication that the source string was shorter than start + count.
Perhaps those are OK, but in cases where those conditions could be an error, it should be possible for the user of the function to get an indication of it. The caller would know what's expected and what's not, so the caller can determine it if only you'd provide some feedback to the caller.
You're returning the position that's one past the end of the output -- past the zero-termination. That's not very convenient. If one were to use the returned value to concatenate a subsequent string, it'd have to be decremented by one first.
Below is the fixed code, with sanely named variables.
char *GetSub(const char *src, int start, int count, char *dst)
{
for ( ; *src && (start > 0) ; start--)
{
src++; /* Note: *src++ works too, but is pointless */
}
for ( ; *src && (count > 0) ; count--)
{
*dst++ = *src++;
}
*dst++ = 0;
return dst; /* Notice: This returns a pointer to the end of the
memory block you just wrote. Is this intentional? */
}
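If feedback to the caller is wanted, one possible shape is to return a count instead of a pointer. This is a sketch, not the answer's code: the name get_sub_checked and the convention of returning -1 when the source ends before start are assumptions.

```c
#include <stddef.h>

/* Hypothetical variant: returns the number of characters copied, or -1 when
   the arguments are negative or the source is shorter than start.
   dst must have room for count + 1 characters. */
int get_sub_checked(const char *src, int start, int count, char *dst)
{
    if (start < 0 || count < 0)
        return -1;
    for (; start > 0; start--) {
        if (*src == '\0')
            return -1; /* source ended before reaching start */
        src++;
    }
    int copied = 0;
    while (copied < count && *src != '\0') {
        *dst++ = *src++;
        copied++;
    }
    *dst = '\0';
    return copied; /* less than count means the source ran out early */
}
```

A caller that expects exactly count characters can compare the return value with count, and the output buffer itself is still available through the pointer the caller passed in.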

There are several problems in what you have written. Let's enumerate:
char *temp = (char *)orig; - You're assigning a const char * (you promise not to modify) to a char * (you break that promise). Wrong thing to be doing.
final = temp. No this does not make the original final (the copy held by the caller) change at all. It achieves nothing. It changes your (function's) copy of final to point to the same place that temp is pointing.
*temp++; - There's no point de-referencing it if you're not going to use it. Incrementing it of course, is correct [see comment thread with KubaOber below].
*final++ = *temp++; - This is just confusing to read.
*final++ = 0; return final; - You're setting the value at the address final to 0. Then you're incrementing it (to point to somewhere in space, maybe towards a black hole). Then returning that pointer. Which is also wrong.
What you really should do is to wrap strncpy in a convenient way.
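A minimal sketch of such a wrapper, assuming the caller provides room for count + 1 characters; the name get_sub_strncpy and the choice to clamp start to the string length are illustrative assumptions:

```c
#include <string.h>

/* Hypothetical strncpy wrapper; final must hold at least count + 1 chars. */
char *get_sub_strncpy(const char *orig, size_t start, size_t count, char *final)
{
    size_t len = strlen(orig);
    if (start > len)
        start = len; /* clamp so orig + start never points past the '\0' */
    strncpy(final, orig + start, count);
    final[count] = '\0'; /* strncpy does not terminate if it copies count chars */
    return final;
}
```

Because strncpy pads with zero bytes when the source is shorter than count, the result is a valid string in both the short and the long case.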
But if you insist to write your own, you'd probably want your function to be something as simple as:
char *GetSub(const char *orig, int start, int count, char *final)
{
int i;
for (i = 0; i < count; i++)
{
final[i] = orig[i+start];
if (final[i] == '\0')
break;
}
final[i] = '\0';
return final; /* Yes, we just return what we got. */
}

The problem is with the following line:
final = temp;
Remove it, and the problem should be resolved.

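char *a="abcdefgh";
I want the string "cde" to be copied into another.
The index I have is 2 (your start).
char *temp=malloc(4*sizeof(char)); /* 3 chars plus the terminator */
strncpy(temp,a+2,3);
temp[3]='\0'; /* strncpy does not terminate when the source is longer than the count */
Is this what you need?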

Change your GetSub function:
char *GetSub(const char *orig, int start, int count, char *final)
{
char *temp = (char *)orig;
// with the original final = temp and final++ you lose the valid final pointer
char *final2 = final;
for ( ; *temp && (start > 0) ; )
{
start--;
// you don't need to dereference temp
temp++;
}
for ( ; *temp && (count > 0) ; count--)
{
*final2++ = *temp++;
}
*final2 = 0;
// return a valid pointer
return final;
}

You have some mistakes in your code:
char *GetSub(const char *orig, int start, int count, char *final)
{
const char *temp = orig;
char *result = final; /* remember where the output starts */
//final = temp; /* Why this? */
for ( ; *temp && (start > 0) ; )
{
start--;
temp++; /* Instead of *temp++ */
}
for ( ; *temp && (count > 0) ; count--)
{
*final++ = *temp++;
}
*final = '\0'; /* final already points just past the last copied char */
return result;
}
Hope this helps.
