Trying to delete a specific character from a string in C?

I'm trying to delete a specific character (?) from the end of a string and return a pointer to a string, but it's not removing it at all at the moment. What am I doing wrong? Is there a better way to go about it?
char * word_copy = malloc(strlen(word)+1);
strcpy(word_copy, word);
int length = strlen(word_copy);
int i = 0;
int j = 0;
for (i = 0; word_copy[i] != '\0'; i++) {
if (word_copy[length - 1] == '?' && i == length - 1){
break;
}
}
for (int j = i; word_copy[j] != '\0'; j++) {
word_copy[j] = word_copy[j+1];
}
word = strdup(word_copy);

I'm immediately seeing a couple of problems.
The first for loop does nothing. It doesn't actually depend on i so it could be replaced with a single if statement.
if (word_copy[length - 1] == '?') {
i = length - 1;
} else {
i = length;
}
The second for loop also acts as an if statement since it starts at the end of the string and can only ever run 0 or 1 times.
You could instead do something like this to remove the ?. This code returns a newly malloc'ed string with the last character removed if it is ?.
char *remove_question_mark(char *word) {
unsigned int length = strlen(word);
if (length == 0) {
return calloc(1, 1);
}
if (word[length - 1] == '?') {
char *word_copy = malloc(length);
// Copy up to '?' and put null terminator
memcpy(word_copy, word, length - 1);
word_copy[length - 1] = 0;
return word_copy;
}
char *word_copy = malloc(length + 1);
memcpy(word_copy, word, length + 1);
return word_copy;
}
Or, if you are feeling lazy, you could just overwrite the last character with a new null terminator instead. This essentially wastes 1 byte of the buffer, but that may be an acceptable loss. It should also be a fair bit faster, since it doesn't need to allocate any new memory or copy the string. Note that it modifies word in place, so word must be writable.
unsigned int length = strlen(word);
if (length > 0 && word[length - 1] == '?') {
word[length - 1] = 0;
}

Related

function doesn't pass certain test case

I have a problem with one of the tests for my solution to a challenge on Codewars. I have to write a function that returns the alphabet positions of the characters in the input string. My solution is below. I pass all my own tests and the tests from Codewars, but fail on this one (I did not write this test; it was part of the test code implemented by Codewars):
Test(number_tests, should_pass) {
srand(time(NULL));
char in[11] = {0};
char *ptr;
for (int i = 0; i < 15; i++) {
for (int j = 0; j < 10; j++) {
char c = rand() % 10;
in[j] = c + '0';
}
ptr = alphabet_position(in);
cr_assert_eq(strcmp(ptr, ""), 0);
free(ptr);
}
}
The error I receive is the following: The expression (strcmp(ptr, "")) == (0) is false. Thanks for the help!
P.S. I also noticed that I am leaking memory. I don't know how to solve this inside the function (I suppose I could use an array to keep track of the string instead of malloc), but I suppose this is not an issue: I would just free(ptr) in the main function.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *alphabet_position(char *text);
// test
int main()
{
if (!strcmp("1 2 3", alphabet_position("abc")))
{
printf("success...\n");
}
else
{
printf("fail...\n");
}
if (!strcmp("", alphabet_position("..")))
{
printf("success...\n");
}
else
{
printf("fail...\n");
}
if (!strcmp("20 8 5 19 21 14 19 5 20 19 5 20 19 1 20 20 23 5 12 22 5 15 3 12 15 3 11", alphabet_position("The sunset sets at twelve o' clock.")))
{
printf("success...\n");
}
else
{
printf("fail...\n");
}
}
char *alphabet_position(char *text)
{
// signature: string -> string
// purpose: extact alphabet position of letters in input string and
// return string of alphabet positions
// return "123"; // stub
// track numerical value of each letter according to it's alphabet position
char *alph = "abcdefghijklmnopqrstuvwxyz";
// allocate maximum possible space for return string
// each char maps to two digit number + trailing space after number
char *s = malloc(sizeof(char) * (3 * strlen(text) + 1));
// keep track of the begining of return string
char *head = s;
int index = 0;
int flag = 0;
while(*text != '\0')
{
if ( ((*text > 64) && (*text < 91)) || ((*text > 96) && (*text < 123)))
{
flag = 1;
index = (int)(strchr(alph, tolower(*text)) - alph) + 1;
if (index > 9)
{
int n = index / 10;
int m = index % 10;
*s = n + '0';
s++;
*s = m + '0';
s++;
*s = ' ';
s++;
}
else
{
*s = index + '0';
s++;
*s = ' ';
s++;
}
}
text++;
}
if (flag != 0) // if string contains at least one letter
{
*(s -1) = '\0'; // remove the trailing space and insert string termination
}
return head;
}

Here is what I think is happening:
In the cases where none of the characters in the input string is an alphabetic character, s is never written to, and therefore the memory allocated by malloc() could contain anything. malloc() does not clear / zero-out memory.
The fact that your input case of ".." passes is just coincidence. The codewars test case does many such non-alphabetical tests in a row, each of which causes a malloc(), and if any one of them fails, the whole thing fails.
I tried recreating this situation, but it's (as I say) unpredictable. To test this, add a debugging line to output the value of s when flag is still 0:
if (flag != 0) { // if string contains at least one letter
*(s -1) = '\0'; // remove the trailing space and insert string termination
}
else {
printf("flag is still 0 : %s\n", s);
}
I'll wager that sometimes you get a garbage / random string that is not "".

How do I convert an underscored pointer to a char array into camelCasing?

I am trying to write a function that will convert a "word_string" to "wordString". However, my output from the code below is "wordSttring". I'm having trouble skipping to the next element of the array after I replace the underscore with the uppercase of the next element. Any suggestions?
void convert_to_camel(char* phrase){
int j =0;
for(int i=0;i<full_len-1;i++){
if(isalphanum(phrase[i])){
phrase[j] = phrase[i];
j++;
length++;
}
}
int flag = 0;
char new[50];
for (int i=0;i<length;i++){
if(phrase[i]== '95'){
flag = 1;
}
if(flag ==1){
new[i] = toUpper(phrase[i+1]);
i++;
new[i] = phrase[i+1];
flag = 0;
}
else{
new[i] = phrase[i];
}
}

All other solutions presented so far turn "_" into an empty string and remove _ from the end of the string.
#include <stddef.h>
#include <ctype.h>
void to_camel_case(char *str) // pun_intended
{
for (size_t i = 0, k = 0; str[i]; ++i, ++k)
{
while (k && str[k] == '_' && str[k + 1]) // 1)
str[k] = k - 1 ? toupper((char unsigned)str[++k]) : str[++k]; // 2)
str[i] = str[k];
}
}
1) Skip consecutive '_'. Make sure to leave at least one at the beginning and one at the end of the string if present.
2) Replace '_' with the next character, capitalized if needed.

You are not removing the _ properly; for that you need one more loop index (j).
You also don't need a separate loop to remove the non-alphanumeric characters; it can be done in the same loop that handles the _.
Finally, you need to terminate the string once trimming is complete, otherwise your string will have junk characters at the end.
void toCamelCase(char* phrase){
int j=0;
for (int i=0;i<strlen(phrase);i++){
if(phrase[i] != '_' && isalnum(phrase[i])){ //Copy Alpha numeric chars not including _.
phrase[j++] = phrase[i];
}
else if(phrase[i] == '_'){
phrase[j++] = toupper(phrase[i+1]);
i++;
}
}
phrase[j] = '\0'; //Terminate the string
}
Note: This method does not handle consecutive underscores (e.g. word____string).

Note that in your current implementation you are trying to use the new character array to store the processed string, but you won't be able to use it outside that function: it is a local variable, and its lifetime ends the moment control leaves the function.
Here is my proposal for such a function:
#define MAX_STR_LEN 50
// assuming that 'str' is a null terminated string
void to_camel_case(char *str)
{
int idx = 0;
int newIdx = 0;
int wasUnderscore = 0;
// just to be on the safe side
if (!str || strlen(str) >= MAX_STR_LEN)
return;
while (str[idx])
{
if (str[idx] == '_')
{
idx++;
// no copy in this case, just raise a flag that '_' was met
wasUnderscore = 1;
}
else if (wasUnderscore)
{
// next letter after the '_' should be uppercased
str[newIdx++] = toupper(str[idx++]);
// drop the flag which indicates that '_' was met
wasUnderscore = 0;
}
else
{
// copy the character and increment the indices
str[newIdx++] = str[idx++];
}
}
str[newIdx] = '\0';
}
I tested it with some inputs and this is what I got:
String hello_world became helloWorld
String hello___world became helloWorld
String hel_lo_wo_rld__ became helLoWoRld
String __hello_world__ became HelloWorld

Maybe something like this might help :
void toCamelCase(char* phrase){
int length = strlen(phrase);
int res_ind = 0;
for (int i = 0; i < length ; i++) {
// check for underscore in the sentence
if (phrase[i] == '_') {
// conversion into upper case
phrase[i + 1] = toupper(phrase[i + 1]);
continue;
}
// If not underscore, copy character
else
phrase[res_ind++] = phrase[i];
}
phrase[res_ind] = '\0';
}

function that removes spaces and tabs

I'm trying to make a function that removes spaces and tabs from a given string, except for the first tab or space in the string. When I use my function, it removes the spaces and tabs except for the first one, but it also removes the first letter after the first space or tab.
For example, "ad ad ad" becomes "ad dad" instead of "ad adad".
Why is that?
void RemoveSpacetab(char* source) {
char* i = source;
char* j = source;
int spcflg = 0;
while(*j != 0) {
*i = *j++;
if((*i != ' ') && (*i != '\t'))
i++;
if(((*i == ' ') || (*i == '\t')) && (spcflg == 0)) {
i++;
spcflg = 1;
}
}
*i = 0;
}

It helps to separate your source and destination arrays, as they will end up with different lengths. You can find the starting position before copying characters like this. Say you pass the source and its length as char* source, int length (you could also calculate the length with strlen(source)); then the body of your function could look like this:
int i = 0;
char* dest = malloc(sizeof(char) * (length + 1)); // +1 for the null terminator
// Increment i until not space to find starting point.
while (i < length && (source[i] == '\t' || source[i] == ' ')) i++;
int dest_size = 0;
while (i < length) {
if (source[i] != '\t' && source[i] != ' ') {
// Copy character if not space to dest array
dest[dest_size++] = source[i];
}
i++;
}
dest[dest_size++] = 0; // null terminator
// Feel free to realloc to the right length with
// realloc(dest, dest_size * sizeof(char))
return dest;

The problem is caused by the two if statements running one after the other: i gets ahead of j when you detect a space for the first time.
Explanation:
In the first cycle, i and j both point to position 0. The 'a' at position 0 is overwritten with itself, then j moves on to position 1. Your first if block finds that the character at position 0 is neither a space nor a tab, so it moves i to position 1.
In the second cycle, the 'd' is overwritten with itself, then j moves to position 2, which holds a space. The first if finds that the 'd' at position 1 is neither a space nor a tab, so it moves i to position 2. Now the second if sees that i points to a space for the first time and moves it to position 3, while j still points to position 2.
In the third cycle, the 'a' at position 3 is overwritten with the space from position 2, and j catches up with i.
A possible fix to your code:
#include <stdio.h>
void RemoveSpacetab(char* source) {
char* i = source;
char* j = source;
char spcflg = 0;
while(*j != 0) {
*i = *j++;
if(*i == ' ' || *i == '\t') {
if(!spcflg) {
i++;
spcflg = 1;
}
}
else {
i++;
}
}
*i = 0;
}
int main() {
char my_string[] = "ad ad ad";
RemoveSpacetab(my_string);
printf("%s\n", my_string);
return 0;
}

Why do I keep getting extra characters at the end of my string?

I have the string, "helLo, wORld!" and I want my program to change it to "Hello, World!". My program works, the characters are changed correctly, but I keep getting extra characters after the exclamation mark. What could I be doing wrong?
void normalize_case(char str[], char result[])
{
if (islower(str[0]) == 1)
{
result[0] = toupper(str[0]);
}
for (int i = 1; str[i] != '\0'; i++)
{
if (isupper(str[i]) == 1)
{
result[i] = tolower(str[i]);
}
else if (islower(str[i]) == 1)
{
result[i] = str[i];
}
if (islower(str[i]) == 0 && isupper(str[i]) == 0)
{
result[i] = str[i];
}
if (str[i] == ' ')
{
result[i] = str[i];
}
if (str[i - 1] == ' ' && islower(str[i]) == 1)
{
result[i] = toupper(str[i]);
}
}
}

You are not null terminating result so when you print it out it will keep going until a null is found. If you move the declaration of i to before the for loop:
int i ;
for ( i = 1; str[i] != '\0'; i++)
you can add:
result[i] = '\0' ;
after the for loop, this is assuming result is large enough.

Extra random-ish characters at the end of a string usually means you've forgotten to null-terminate ('\0') your string. Your loop copies everything up to, but not including, the terminal null into the result.
Add result[i] = '\0'; after the loop before you return.
Normally, you treat the isxxxx() functions (macros) as returning a boolean condition, and you'd ensure that you only have one of the chain of conditions executed. You'd do that with more careful use of else clauses. Your code actually copies str[i] multiple times if it is a blank. In fact, I think you can compress your loop to:
int i;
for (i = 1; str[i] != '\0'; i++)
{
if (isupper(str[i]))
result[i] = tolower(str[i]);
else if (str[i - 1] == ' ' && islower(str[i]))
result[i] = toupper(str[i]);
else
result[i] = str[i];
}
result[i] = '\0';
If I put result[i] outside of the for loop, won't the compiler complain about i?
Yes, it will. In this context, you need i defined outside the loop control, because you need the value after the loop. See the amended code above.
You might also note that your pre-loop code quietly skips the first character of the string if it is not lower-case, leaving garbage as the first character of the result. You should really write:
result[0] = toupper(str[0]);
so that result[0] is always set.

You should add the statement result[i] = '\0'; after the loop, because in C a string must end with the special character '\0', which tells functions such as printf where the string ends.

I took the liberty of simplifying your code as a lot of the checks you do are unnecessary. The others have already explained some basic points to keep in mind:
#include <stdio.h> /* for printf */
#include <ctype.h> /* for islower and the like */
void normalise_case(char str[], char result[])
{
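if (str[0] == '\0') /* empty input: produce an empty result */
{
result[0] = '\0';
return;
}
result[0] = toupper(str[0]); /* always set the first character; capitalise it if it is a letter */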
int i; /* older C standards (pre C99) won't like it if you don't pre-declare 'i' so I've put it here */
for (i = 1; str[i] != '\0'; i++)
{
result[i] = str[i]; /* I've noticed that you copy the string in each if case, so I've put it here at the top */
if (isupper(result[i]))
{
result[i] = tolower(result[i]);
}
if (result[i - 1] == ' ' && islower(result[i])) /* at the start of a word, capitalise! */
{
result[i] = toupper(result[i]);
}
}
result[i] = '\0'; /* this has already been explained */
}
int main()
{
char in[20] = "tESt tHIs StrinG";
char out[20] = ""; /* space to store the output */
normalise_case(in, out);
printf("%s\n", out); /* Prints 'Test This String' */
return 0;
}

C - Largest String From a Big One

So pray tell, how would I go about getting the largest contiguous string of letters out of a string of garbage in C? Here's an example:
char *s = "(2034HEY!!11 th[]thisiswhatwewant44";
Would return...
thisiswhatwewant
I had this on a quiz the other day...and it drove me nuts (still is) trying to figure it out!
UPDATE:
My fault guys, I forgot to include the fact that the only function you are allowed to use is the strlen function. Thus making it harder...

Use strtok() to split your string into tokens, using all non-letter characters as delimiters, and find the longest token.
To find the longest token you will need to organise some storage for tokens - I'd use a linked list.
As simple as this.
EDIT
OK, if strlen() is the only function allowed, you can first find the length of your source string, then loop through it and replace all non-letter characters with the null character ('\0') - basically that's what strtok() does.
Then you need to go through your modified source string a second time, advancing one token at a time, and find the longest one using strlen().

This sounds similar to the standard UNIX 'strings' utility.
Keep track of the longest run of printable characters terminated by a NUL.
Walk through the bytes until you hit a printable character. Start counting. If you hit a non-printable character, stop counting and throw away the starting point. If you hit a NUL, check to see if the length of the current run is greater than the previous record holder. If so, record it, and start looking for the next string.

What defines the "good" substrings compared to the many others -- being lowercase alphas only? (i.e., no spaces, digits, punctuation, uppercase, &c)?
Whatever the predicate P that checks for a character being "good", a single pass over s applying P to each character lets you easily identify the start and end of each "run of good characters", and remember and pick the longest. In pseudocode:
longest_run_length = 0
longest_run_start = longest_run_end = null
status = bad
for i in (all indices over s):
    if P(s[i]):  # current char is good
        if status == bad:  # previous one was bad
            current_run_start = current_run_end = i
            status = good
        else:  # previous one was also good
            current_run_end = i
    else:  # current char is bad
        if status == good:  # previous one was good -> end of run
            current_run_length = current_run_end - current_run_start + 1
            if current_run_length > longest_run_length:
                longest_run_start = current_run_start
                longest_run_end = current_run_end
                longest_run_length = current_run_length
        status = bad
# if a good run ends with end-of-string:
if status == good:  # previous one was good -> end of run
    current_run_length = current_run_end - current_run_start + 1
    if current_run_length > longest_run_length:
        longest_run_start = current_run_start
        longest_run_end = current_run_end
        longest_run_length = current_run_length

Why use strlen() at all?
Here's my version which uses no function whatsoever.
#include <stddef.h> /* size_t, NULL */
#ifdef UNIT_TEST
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#endif
/*
// largest_letter_sequence()
// Returns a pointer to the beginning of the largest letter
// sequence (including trailing characters which are not letters)
// or NULL if no letters are found in s
// Passing NULL in `s` causes undefined behaviour
// If the string has two or more sequences with the same number of letters
// the return value is a pointer to the first sequence.
// The parameter `len`, if not NULL, will have the size of the letter sequence
//
// This function assumes an ASCII-like character set
// ('z' > 'a'; 'z' - 'a' == 25; ('a' <= each of {abc...xyz} <= 'z'))
// and the same for uppercase letters
// Of course, ASCII works for the assumptions :)
*/
const char *largest_letter_sequence(const char *s, size_t *len) {
const char *p = NULL;
const char *pp = NULL;
size_t curlen = 0;
size_t maxlen = 0;
while (*s) {
if ((('a' <= *s) && (*s <= 'z')) || (('A' <= *s) && (*s <= 'Z'))) {
if (p == NULL) p = s;
curlen++;
if (curlen > maxlen) {
maxlen = curlen;
pp = p;
}
} else {
curlen = 0;
p = NULL;
}
s++;
}
if (len != NULL) *len = maxlen;
return pp;
}
#ifdef UNIT_TEST
void fxtest(const char *s) {
char *test;
const char *p;
size_t len;
p = largest_letter_sequence(s, &len);
if (len && (len < 999)) {
test = malloc(len + 1);
if (!test) {
fprintf(stderr, "No memory.\n");
return;
}
strncpy(test, p, len);
test[len] = 0;
printf("%s ==> %s\n", s, test);
free(test);
} else {
if (len == 0) {
printf("no letters found in \"%s\"\n", s);
} else {
fprintf(stderr, "ERROR: string too large\n");
}
}
}
int main(void) {
fxtest("(2034HEY!!11 th[]thisiswhatwewant44");
fxtest("123456789");
fxtest("");
fxtest("aaa%ggg");
return 0;
}
#endif

While I waited for you to post this as a question I coded something up.
This code iterates through a string passed to a "longest" function, and when it finds the first of a sequence of letters it sets a pointer to it and starts counting the length of it. If it is the longest sequence of letters yet seen, it sets another pointer (the 'maxStringStart' pointer) to the beginning of that sequence until it finds a longer one.
At the end, it allocates enough room for the new string and returns a pointer to it.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int isLetter(char c){
return ( (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') );
}
char *longest(char *s) {
char *newString = 0;
int maxLength = 0;
char *maxStringStart = 0;
int curLength = 0;
char *curStringStart = 0;
do {
//reset the current string length and skip this
//iteration if it's not a letter
if( ! isLetter(*s)) {
curLength = 0;
continue;
}
//increase the current sequence length. If the length before
//incrementing is zero, then it's the first letter of the sequence:
//set the pointer to the beginning of the sequence of letters
if(curLength++ == 0) curStringStart = s;
//if this is the longest sequence so far, set the
//maxStringStart pointer to the beginning of it
//and start increasing the max length.
if(curLength > maxLength) {
maxStringStart = curStringStart;
maxLength++;
}
} while(*s++);
//return null pointer if there were no letters in the string,
//or if we can't allocate any memory.
if(maxLength == 0) return NULL;
if( ! (newString = malloc(maxLength + 1)) ) return NULL;
//copy the longest string into our newly allocated block of
//memory (see my update for the strlen() only requirement)
//and null-terminate the string by putting 0 at the end of it.
memcpy(newString, maxStringStart, maxLength);
newString[maxLength] = 0;
return newString;
}
int main(int argc, char *argv[]) {
int i;
for(i = 1; i < argc; i++) {
printf("longest all-letter string in argument %d:\n", i);
printf(" argument: \"%s\"\n", argv[i]);
printf(" longest: \"%s\"\n\n", longest(argv[i]));
}
return 0;
}
This is my solution in simple C, without any data structures.
I can run it in my terminal like this:
~/c/t $ ./longest "hello there, My name is Carson Myers." "abc123defg4567hijklmnop890"
longest all-letter string in argument 1:
argument: "hello there, My name is Carson Myers."
longest: "Carson"
longest all-letter string in argument 2:
argument: "abc123defg4567hijklmnop890"
longest: "hijklmnop"
~/c/t $
The criteria for what constitutes a letter can easily be changed in the isLetter() function. For example:
return (
(c >= 'a' && c <= 'z') ||
(c >= 'A' && c <= 'Z') ||
(c == '.') ||
(c == ' ') ||
(c == ',') );
would count periods, commas and spaces as 'letters' also.
As per your update:
Replace memcpy(newString, maxStringStart, maxLength); with:
int i;
for(i = 0; i < maxLength; i++)
newString[i] = maxStringStart[i];
However, this problem would be much more easily solved with the use of the C standard library:
char *longest(char *s) {
int longest = 0;
int curLength = 0;
char *curString = 0;
char *longestString = 0;
char *tokens = " ,.!?'\"()#$%\r\n;:+-*/\\";
curString = strtok(s, tokens);
while( curString != NULL ) { // strtok returns NULL when no tokens remain
curLength = strlen(curString);
if( curLength > longest ) {
longest = curLength;
longestString = curString;
}
curString = strtok(NULL, tokens);
}
char *newString = 0;
if( longest == 0 ) return NULL;
if( ! (newString = malloc(longest + 1)) ) return NULL;
strcpy(newString, longestString);
return newString;
}

First, define "string" and define "garbage". What do you consider a valid, non-garbage string? Write down a concrete definition you can program - this is how programming specs get written. Is it a sequence of alphanumeric characters? Should it start with a letter and not a digit?
Once you get that figured out, it's very simple to program. Start with a naive method of looping over the "garbage" looking for what you need. Once you have that, look up useful C library functions (like strtok) to make the code leaner.

Another variant.
#include <stdio.h>
#include <string.h>
int main(void)
{
char s[] = "(2034HEY!!11 th[]thisiswhatwewant44";
int len = strlen(s);
int i = 0;
int biggest = 0;
char* p = s;
while (p[0])
{
if (!((p[0] >= 'A' && p[0] <= 'Z') || (p[0] >= 'a' && p[0] <= 'z')))
{
p[0] = '\0';
}
p++;
}
for (; i < len; i++)
{
if (s[i] && strlen(&s[i]) > biggest)
{
biggest = strlen(&s[i]);
p = &s[i];
}
}
printf("%s\n", p);
return 0;
}