Searching for block of characters (word) in a text - c

I want to search for a block of characters (word) in a text.
For example, I have the following text, "Hello xyz world", and I want to search for "xyz " (note the space after the word).
// The Text
const char * text = "Hello xyz world";
// The target word
const char * patt = "xyz ";
size_t textLen = strlen(text),
pattLen = strlen(patt), i, j;
for (i = 0; i < textLen; i++) {
printf("%c", text[i]);
for (j = 0; j < pattLen; j++) {
if (text[i] == patt[j]) {
printf(" <--");
break;
}
}
printf("\n");
}
The result should look like the following:
But unfortunately, the actual result is the following:
It marks every character in the whole text that appears anywhere in the pattern, not just the characters of the target word.
How can I fix this?

You have to do a full substring match before you print; mark the applicable characters on a first pass, and then have a second pass to print the results. In your case, you'd create a second array, with boolean values corresponding to the first, something like
text = "Hello xyz world";
match 000000111100000
I assume that you can find a basic substring match program online. Printing on the second pass will be easy: you already have the logic. Instead of if (text[i] == patt[j]), just use if (match[i]).
Is that enough of a hint?
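The two-pass idea can be sketched roughly as follows. This is only an illustration, not the answerer's code: the function names mark_matches and print_marked are made up here, and the sketch assumes the standard strstr is acceptable for the substring search.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* First pass: set match[i] to true for every character of text that lies
   inside an occurrence of patt. match needs room for strlen(text) entries.
   (Hypothetical helper name, not from the original post.) */
void mark_matches(const char *text, const char *patt, bool *match)
{
    size_t textLen = strlen(text);
    size_t pattLen = strlen(patt);

    for (size_t i = 0; i < textLen; i++)
        match[i] = false;
    if (pattLen == 0)
        return;                       /* an empty pattern marks nothing */

    /* strstr returns a pointer to the next occurrence, or NULL if none */
    for (const char *p = strstr(text, patt); p != NULL; p = strstr(p + 1, patt)) {
        size_t start = (size_t)(p - text);
        for (size_t j = 0; j < pattLen; j++)
            match[start + j] = true;
    }
}

/* Second pass: print one character per line, flagging the marked ones. */
void print_marked(const char *text, const bool *match)
{
    for (size_t i = 0; text[i] != '\0'; i++)
        printf("%c%s\n", text[i], match[i] ? " <--" : "");
}
```

With a bool match[15] array, calling mark_matches("Hello xyz world", "xyz ", match) followed by print_marked on the same text flags only the four characters of "xyz ".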

You need to make sure there is a full match before printing any <--. And to avoid reading past the end of text, you have to stop searching once fewer than pattLen characters remain.
Then, when you have found a full match, you can print the content of patt followed by <-- and advance the index by pattLen - 1. At the end, you still have to print the remaining characters of text.
Code could become:
// The Text
const char * text = "Hello xyz world";
// The target word
const char * patt = "xyz ";
size_t textLen = strlen(text),
pattLen = strlen(patt), i, j;
for (i = 0; i + pattLen <= textLen; i++) { // stop once fewer than pattLen characters remain (this form cannot wrap around if patt is longer than text)
printf("%c", text[i]);
if (text[i] == patt[0]) { // first char matches
int found = 1; // be optimistic...
for (j = 1; j < pattLen; j++) {
if (patt[j] != text[i + j]) {
found = 0;
break; // does not fully match, go on
}
}
if (found) { // yeah, a full match!
printf(" <--"); // already printed first char
for (j = 1; j < pattLen; j++) {
printf("\n%c <--", patt[j]);// print all others chars from patt
}
i += pattLen - 1; // increase index...
}
}
printf("\n");
}
while (i < textLen) {
printf("%c\n", text[i++]); // process the end of text
}
Above code gives expected output for "xyz " and also "llo"...

You should compare the pattern with the text character by character, starting from the current position, rather than testing each text character against every character of the pattern. Try this (not tested):
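size_t currIndex = 0; // characters of the current match already marked
int matching = 0;     // non-zero while we are inside a full match
for (i = 0; i < textLen; i++) {
    printf("%c", text[i]);
    // at a candidate start, verify the whole pattern before marking anything
    if (!matching && pattLen > 0 && i + pattLen <= textLen) {
        matching = 1;
        for (j = 0; j < pattLen; j++) {
            if (text[i + j] != patt[j]) {
                matching = 0;
                break;
            }
        }
    }
    if (matching) {
        printf(" <--");
        currIndex++;
        if (currIndex == pattLen) { // whole pattern marked, start over
            currIndex = 0;
            matching = 0;
        }
    }
    printf("\n");
}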
Note: It is not the best way to achieve this but the easiest with your example
Note 2: This question should be closed as it is:
Questions seeking debugging help ("why isn't this code working?") must
include the desired behavior, a specific problem or error and the
shortest code necessary to reproduce it in the question itself.
Questions without a clear problem statement are not useful to other
readers. See: How to create a Minimal, Complete, and Verifiable
example.

Related

Inserting %20 in a string character clarification

Hi, I am working on a Cracking the Coding Interview problem. I have coded up a solution by reading the answer explained in plain English; however, I do not understand one line of code.
The Question
Replace all white space with "%20". Empty spaces have been added at the end of the text to accommodate the new symbols.
Input : "Mr John Smith ", 13
Output : Mr%20John%20Smith
My Solution
static void replaceSpaces(char[] arr, int trueLength)
{
    int spaces = 0;
    int newLength = 0;
    for (int i = 0; i < trueLength; ++i)
    {
        if (arr[i] == ' ')
        {
            ++spaces;
        }
    }
    // Each space already occupies one slot, so "%20" needs 2 extra slots per space
    newLength = trueLength + spaces * 2;
    for (int i = trueLength - 1; i >= 0; i--)
    {
        if (arr[i] == ' ')
        {
            arr[newLength - 1] = '0';
            arr[newLength - 2] = '2';
            arr[newLength - 3] = '%';
            newLength = newLength - 3;
        }
        else
        {
            arr[newLength - 1] = arr[i];
            newLength = newLength - 1;
        }
    }
    System.out.println(arr);
}
I don't understand why we need this line of code (newLength = newLength - 3). I think we need it because, after we replace a space with the symbol, we subtract 3 to move to the next empty slot. Is this correct?
That's correct, if you mean: to the next empty space to write a new character.
The line newLength = newLength - 3; exists because you need to step over the 3 characters you have just written ('0', '2' and '%'). Otherwise you would overwrite them.
I must mention that your code is quite typical since you are filling the array backwards.
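Since most of this page is about C, here is a rough C sketch of the same backward fill, to show where the "minus 3" step comes from. The function name replace_spaces and the write index are made up for this illustration, and the sketch assumes the buffer already has room for the expanded text plus a terminating '\0'.

```c
#include <stddef.h>

/* Replaces each space among the first trueLength characters of arr with
   "%20", working in place from the back. Assumes arr is large enough for
   the expanded text plus '\0'. Returns the new length.
   (Hypothetical helper, not the asker's Java method.) */
size_t replace_spaces(char *arr, size_t trueLength)
{
    size_t spaces = 0;
    for (size_t i = 0; i < trueLength; i++)
        if (arr[i] == ' ')
            spaces++;

    size_t newLength = trueLength + 2 * spaces;
    size_t write = newLength;          /* one past the next slot to fill */
    arr[write] = '\0';

    for (size_t i = trueLength; i-- > 0; ) {
        if (arr[i] == ' ') {
            arr[write - 1] = '0';
            arr[write - 2] = '2';
            arr[write - 3] = '%';
            write -= 3;                /* step over the three characters just written */
        } else {
            arr[write - 1] = arr[i];
            write -= 1;                /* step over the one character just copied */
        }
    }
    return newLength;
}
```

Because the write position never falls behind the read position, characters are always copied before they can be overwritten, which is why filling from the back works in place.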

Recursion problems in C

I am working on an anagram solver in C. Hit a problem where the solver will return the first few anagrams correctly, however on ones that extend past 2 words, it begins to enter an infinite loop.
Example:
I enter "team sale rest" into the anagram solver, it responds with teamster ale, and a few others. Then when it arrives at releases, it enters an infinite loop where it prints "releases am matt" "releases am am matt" etc.
Here is the code base:
//recursively find matches for each sub-word
int findMatches(char string[], char found_so_far[])
{
printf("String entering function: %s\n", string);
int string_length = strlen(string);
int_char_ptr *results = getPowerSet(string, string_length);
if(!results)
return 2;
// selects length of subset, starting with the largest
for (int i = string_length - 1; i > 0; i--)
{
// iterates through all the subsets of a particular length
for(int j = 0; j < results->count[i]; j++)
{
word_array *matches = NULL;
// check words against dictionary
matches = dictionary_check(results->table[i][j]);
if (matches)
{
// iterate through matches
for(size_t k = 0; k < matches->size; k++)
{
int found_length;
// find out length of string needed for found
if (strcmp(found_so_far, "") == 0)
found_length = strlen(matches->arr[k]) + 1;
else
found_length = strlen(found_so_far) + strlen(matches->arr[k]) + 2;
char found[found_length];
// on first passthrough, copy directly from matches
if (strcmp(found_so_far, "") == 0)
strcpy(found, matches->arr[k]);
else
sprintf(found, "%s %s", found_so_far, matches->arr[k]);
char tempstr[string_length];
strcpy(tempstr, string);
char *remain = get_remaining_letters(tempstr, results->table[i][j]);
// if there are no letters remaining
if (strcmp(remain, "") == 0)
{
printf("MATCH FOUND: %s \n", found);
// alternatively, could store strings to array
}
else
{
findMatches(remain, found);
}
}
}
}
free(results->table[i][results->count[i] - 1]);
free(results->table[i]);
}
return 0;
}
How I read it (I am obviously missing something) is that it should try every match and, if none works, move on to the next subset of letters found.
I have tried stepping through it with a debugger and can't make rhyme or reason of it.
As mentioned in the comments above:
get_remaining_letters worked directly on the original results->table[i][j] and removed letters from it. This left an empty string for the next iteration, so the function did not behave as expected. The fix was to copy the string to a temporary one inside that function.
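The source of get_remaining_letters is not shown, so the following is only a guess at the shape of the fix: a hypothetical stand-in named remaining_letters that works on its own copy and leaves both arguments untouched. The signature and the "remove one occurrence per used letter" behaviour are assumptions made for this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for get_remaining_letters: returns a newly allocated
   string holding the characters of letters that are left after removing one
   occurrence of each character in used. Neither argument is modified.
   The caller frees the result; NULL is returned if allocation fails. */
char *remaining_letters(const char *letters, const char *used)
{
    size_t len = strlen(letters);
    char *rest = malloc(len + 1);
    if (rest == NULL)
        return NULL;
    memcpy(rest, letters, len + 1);    /* work on a private copy */

    for (const char *u = used; *u != '\0'; u++) {
        char *hit = strchr(rest, *u);
        if (hit != NULL)
            memmove(hit, hit + 1, strlen(hit));   /* close the gap, including '\0' */
    }
    return rest;
}
```

Taking const char * parameters lets the compiler reject accidental writes to the power-set table, which is the kind of aliasing bug described above.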

HackerRank staircase C confusion

Hi, I'm pretty new and trying to improve through HackerRank. I am on the staircase exercise.
However, my output differs from the expected one: my staircase seems to have an extra space in front of each row, which makes it incorrect. Here is my code.
#include <stdio.h>
int main ()
{
int size = 0;
//input size of staircase
scanf("%d" , &size);
//create array to hold staircase
char list [size];
//iterate through and fill up array with spaces
for (int i = 0; i <size; ++i)
{
list[i] = ' ';
}
//then iterate backwards, replacing one space with a '#' each time and printing each step, starting from the smallest at the top.
for (int i = size; i >0; i--)
{
list[i] = '#';
printf("%s\n", list);
}
return 0;
}
I am confused about what the problem is and why my staircase is more spaced out than the expected output. I've been trying to work it out, and any help is much appreciated.
My output:
#
##
###
####
#####
######
EDIT: thanks for the help; all the answers were helpful.
There are several mistakes:
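1) You forgot to put a null character ('\0') at the end of the string. The array also needs one extra element to hold it (declare it as char list[size + 1];), and i is no longer in scope after the loop, so index with size instead:
for (int i = 0; i < size; ++i)
{
list[i] = ' ';
}
list[size] = '\0';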
2)
for (int i = size; i >0; i--)
{
list[i] = '#';
printf("%s\n", list);
}
Here the index you are trying to access is invalid when i == size. Do it like this:
for (int i = size-1; i > -1; i--)
{
list[i] = '#';
printf("%s\n", list);
}
An array of length size is indexed from 0 to size - 1:
E.g. for size == 4:
[0][1][2][3]
In your second loop you go from size to 1, first writing outside the array, and finally leaving the first entry untouched:
[ ][#][#][#] #
You must either change your second loop to
for (int i = size - 1; i >= 0; i--)
or alternatively do
list[i - 1] = '#';
instead of list[i] = '#'.
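Putting both corrections together (valid indices and a terminating '\0'), one possible version looks like the sketch below. The helper names stair_row and print_staircase are invented for this example, and it rebuilds each row from scratch rather than reusing the asker's decrementing loop.

```c
#include <stdio.h>
#include <string.h>

/* Fills row (at least size + 1 bytes) with one step of a right-aligned
   staircase: size - step spaces followed by step '#' characters.
   Expects 0 <= step <= size. (Hypothetical helper name.) */
void stair_row(char *row, int size, int step)
{
    memset(row, ' ', (size_t)(size - step));
    memset(row + (size - step), '#', (size_t)step);
    row[size] = '\0';                  /* terminate so %s knows where to stop */
}

/* Prints the whole staircase, smallest step first. */
void print_staircase(int size)
{
    char row[size + 1];                /* one extra element for '\0' */
    for (int step = 1; step <= size; step++) {
        stair_row(row, size, step);
        puts(row);
    }
}
```

Calling print_staircase(size) after reading size with scanf would replace both loops in the original main.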

Parsing character array to words held in pointer array (C-programming)

I am trying to separate each word from a character array and put them into a pointer array, one word for each slot. Also, I am supposed to use isspace() to detect blanks. But if there is a better way, I am all ears. At the end of the code I want to print out the content of the parameter array.
Let's say the line is: "this is a sentence". What happens is that it prints out "sentence" (the last word in the line, and usually followed by some random character) 4 times (the number of words). Then I get "Segmentation fault (core dumped)".
Where am I going wrong?
int split_line(char line[120])
{
char *param[21]; // Here I want to put one word for each slot
char buffer[120]; // Word buffer
int i; // For characters in line
int j = 0; // For param words
int k = 0; // For buffer chars
for(i = 0; i < 120; i++)
{
if(line[i] == '\0')
break;
else if(!isspace(line[i]))
{
buffer[k] = line[i];
k++;
}
else if(isspace(line[i]))
{
buffer[k+1] = '\0';
param[j] = buffer; // Puts word into pointer array
j++;
k = 0;
}
else if(j == 21)
{
param[j] = NULL;
break;
}
}
i = 0;
while(param[i] != NULL)
{
printf("%s\n", param[i]);
i++;
}
return 0;
}
There are many little problems in this code:
param[j] = buffer; k = 0; : every entry of param points at the start of the same buffer, and each new word overwrites the previous one
if(!isspace(line[i])) ... else if(isspace(line[i])) ... else ... : isspace(line[i]) is either true or false, so one of the first two branches is always taken and the third is never reached
if (line[i] == '\0') : you forget to terminate the current word with a '\0' before leaving the loop
if there are multiple consecutive white spaces, you currently (try to) add empty words to param
Here is a working version :
int split_line(char line[120])
{
char *param[21]; // Here I want to put one word for each slot
char buffer[120]; // Word buffer
int i; // For characters in line
int j = 0; // For param words
int k = 0; // For buffer chars
int inspace = 0;
param[j] = buffer;
for(i = 0; i < 120; i++) {
if(line[i] == '\0') {
param[j++][k] = '\0';
param[j] = NULL;
break;
}
else if(!isspace(line[i])) {
inspace = 0;
param[j][k++] = line[i];
}
else if (! inspace) {
inspace = 1;
param[j++][k] = '\0';
param[j] = &(param[j-1][k+1]);
k = 0;
if(j == 20) { // param has 21 slots: at most 20 words plus the terminating NULL
param[j] = NULL;
break;
}
}
}
i = 0;
while(param[i] != NULL)
{
printf("%s\n", param[i]);
i++;
}
return 0;
}
I only fixed the errors. I leave the following improvements to you as an exercise:
the split_line routine should not print the words itself but rather return an array of words - beware that you cannot return an automatic array, but that would be another question
you should not have magic constants in your code (120); you should at least use a #define and symbolic constants, or better, accept a line of any size - here again it is not simple, because you will have to malloc and free at appropriate places, and again that would be a different question
Anyway good luck in learning that good old C :-)
This line does not seem right to me:
param[j] = buffer;
because you keep assigning the same value, buffer, to every param[j].
I would suggest copying all the chars from line[120] to buffer[120], then pointing each param[j] at buffer + Next_Word_Position.
You may want to look at strtok in string.h. It sounds like this is what you are looking for, as it will separate words/tokens based on the delimiter you choose. To separate by spaces, simply use:
dest = strtok(src, " ");
Here src is the source string and dest receives the first token of the source string. Be aware that strtok modifies src, overwriting each delimiter it finds with '\0'. Looping until dest == NULL gives you all of the separated words; all you have to do is store dest in the next slot of your pointer array each time. Note also that passing NULL as the src argument continues parsing from where strtok left off, so after an initial strtok call outside your loop, pass NULL inside it. I hope that helps. Good luck!
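A minimal sketch of that loop, assuming it is fine for the input line to be modified in place. The function name split_words and the max_words parameter are invented for the example; strtok itself is the standard function from string.h.

```c
#include <stddef.h>
#include <string.h>

/* Splits line in place on whitespace and stores a pointer to each word in
   param, followed by a NULL entry. At most max_words words are stored, so
   param needs max_words + 1 slots. Returns the number of words stored.
   Note that strtok overwrites the delimiters in line with '\0'.
   (Hypothetical helper name.) */
size_t split_words(char *line, char *param[], size_t max_words)
{
    size_t count = 0;
    char *word = strtok(line, " \t\n");        /* first call passes the string */

    while (word != NULL && count < max_words) {
        param[count++] = word;
        word = strtok(NULL, " \t\n");          /* later calls pass NULL */
    }
    param[count] = NULL;
    return count;
}
```

With char line[] = "this is a sentence"; and char *param[21];, calling split_words(line, param, 20) stores four words. Runs of consecutive delimiters are skipped by strtok, so no empty words are produced. The line must be a writable array, not a string literal.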

Reduce amount of C code [closed]

How can I refactor this with less code?
This is homework: it cracks a Caesar ciphertext using frequency distribution.
I have completed the assignment but would like it to be cleaner.
int main(int argc, char **argv){
// first allocate some space for our input text (we will read from stdin).
char* text = (char*)malloc(sizeof(char)*TEXT_SIZE+1);
char textfreq[ALEN][2];
char map[ALEN][2];
char newtext[TEXT_SIZE];
char ch, opt, tmpc, tmpc2;
int i, j, tmpi;
// Check the CLI arguments and extract the mode: interactive or dump and store in opt.
if(!(argc == 2 && isalpha(opt = argv[1][1]) && (opt == 'i' || opt == 'd'))){
printf("format is: '%s' [-d|-i]\n", argv[0]);
exit(1);
}
// Now read TEXT_SIZE or feof worth of characters (whichever is smaller) and convert to uppercase as we do it.
for(i = 0, ch = fgetc(stdin); i < TEXT_SIZE && !feof(stdin); i++, ch = fgetc(stdin)){
text[i] = (isalpha(ch)?upcase(ch):ch);
}
text[i] = '\0'; // terminate the string properly.
// Assign alphabet to one dimension of text frequency array and a counter to the other dimension
for (i = 0; i < ALEN; i++) {
textfreq[i][0] = ALPHABET[i];
textfreq[i][1] = 0;
}
// Count frequency of characters in the given text
for (i = 0; i < strlen(text); i++) {
for (j = 0; j < ALEN; j++) {
if (text[i] == textfreq[j][0]) textfreq[j][1]+=1;
}
}
//Sort the character frequency array in descending order
for (i = 0; i < ALEN-1; i++) {
for (j= 0; j < ALEN-i-1; j++) {
if (textfreq[j][1] < textfreq[j+1][1]) {
tmpi = textfreq[j][1];
tmpc = textfreq[j][0];
textfreq[j][1] = textfreq[j+1][1];
textfreq[j][0] = textfreq[j+1][0];
textfreq[j+1][1] = tmpi;
textfreq[j+1][0] = tmpc;
}
}
}
//Map characters to most occurring English characters
for (i = 0; i < ALEN; i++) {
map[i][0] = CHFREQ[i];
map[i][1] = textfreq[i][0];
}
// Sort the map lexicographically
for (i = 0; i < ALEN-1; i++) {
for (j= 0; j < ALEN-i-1; j++) {
if (map[j][0] > map[j+1][0]) {
tmpc = map[j][0];
tmpc2 = map[j][1];
map[j][0] = map[j+1][0];
map[j][1] = map[j+1][1];
map[j+1][0] = tmpc;
map[j+1][1] = tmpc2;
}
}
}
if(opt == 'd'){
decode_text(text, newtext, map);
} else {
// do option -i
}
// Print alphabet and map to stderr and the decoded text to stdout
fprintf(stderr, "\n%s\n", ALPHABET);
for (i = 0; i < ALEN; i++) {
fprintf(stderr, "%c", map[i][1]);
}
printf("\n%s\n", newtext);
return 0;
}
Um, refactoring != less code. Obfuscation can sometimes result in less code, if that is your objective :)
Refactoring is done to improve code readability and reduce complexity. Suggestions for improvement in your case:
Look at the chunks of logic you've implemented and consider replacing them with built-in functions; that is usually a good place to begin. I'm convinced that some of the sorting you've performed can be replaced with qsort(). Side note, however: if this is an assignment, your tutor may be a douche and want to see you write the code out in FULL rather than use C's built-in functions, and dock you points for being too smart. (Sorry, personal history here :P)
Move your logical units of work into dedicated functions, and have a main function perform the orchestration.
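As a rough illustration of the qsort() suggestion, the first bubble sort could be replaced along these lines. The struct letter_count type and both function names are hypothetical; the question stores the pairs in a char[ALEN][2] array instead, so this sketch assumes you are free to change that representation.

```c
#include <stdlib.h>

/* Hypothetical replacement for the char[ALEN][2] rows: a letter and its count. */
struct letter_count {
    char letter;
    int  count;
};

/* qsort comparator: higher counts first, ties broken alphabetically. */
int by_count_desc(const void *a, const void *b)
{
    const struct letter_count *x = a;
    const struct letter_count *y = b;
    if (x->count != y->count)
        return (x->count < y->count) - (x->count > y->count);
    return (x->letter > y->letter) - (x->letter < y->letter);
}

/* Sorts n entries in descending order of frequency. */
void sort_by_frequency(struct letter_count *freq, size_t n)
{
    qsort(freq, n, sizeof freq[0], by_count_desc);
}
```

Using an int for the count also avoids the overflow that a char counter would hit on longer texts, and the second sort in the question could reuse qsort with a comparator on the letter alone.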

Resources