Comparing substring of a character array with another character array in C - c

I have two character arrays called arraypi and arraye containing numbers that I read from a file. Each has 1,000,000 characters. I need to start from the first character in arraye (in this case, 7) and search for it in arraypi. If 7 exists in arraypi, then I have to search for the next substring of arraye (in this case, 71), then for 718, 7182 and so on, until the substring does not exist in arraypi. Then I simply have to put the length of the longest substring found in an integer variable and print it.
It is worth mentioning that arraypi contains a newline every 50 characters whereas arraye contains a newline every 80, although I don't think that will be a problem, right?
I tried thinking about a way to accomplish this but so far I haven't thought of something.

I am not absolutely sure if I got this right. I have something like this on my mind:
Assume that the whole of arraypi is open in a browser
You use the key combination ctrl+f for find
Start typing the contents of arraye letter by letter until you see the red no match
You want the number of characters you were able to type until then
If that's right, then an algorithm like the following should do the trick:
#include <stdio.h>
#define iswhitespace(X) ((X) == '\n' || (X) == ' ' || (X) == '\t')
int main( ) {
char e[1000] = "somet\n\nhing";
char pi[1000] = "some other t\nhing\t som\neth\n\ning";
int longestlen = 0;
int longestx = 0;
int pix = 0;
int ex = 0;
int piwhitespace = 0; // <-- added
int ewhitespace = 0; // <-- these
while ( pix + ex + piwhitespace < 1000 ) {
// added the following 4 lines to make it whitespace insensitive
while ( iswhitespace(e[ex + ewhitespace]) )
ewhitespace++;
while ( iswhitespace(pi[pix + ex + piwhitespace]) )
piwhitespace++;
if ( e[ex + ewhitespace] != '\0' && pi[pix + ex + piwhitespace] != '\0' && pi[pix + ex + piwhitespace] == e[ex + ewhitespace] ) {
// the following 4 lines are for obtaining correct longestx value
if ( ex == 0 ) {
pix += piwhitespace;
piwhitespace = 0;
}
ex++;
}
else {
if ( ex > longestlen ) {
longestlen = ex;
longestx = pix;
}
pix += piwhitespace + 1;
piwhitespace = 0;
// the two lines above could be replaced with
// pix++;
// and it would work just fine, the injection is unnecessary here
ex = 0;
ewhitespace = 0;
}
}
printf( "Longest sqn is %d chars long starting at %d", longestlen, longestx + 1 );
putchar( 10 );
return 0;
}
What's happening there is that the loop first searches for a starting point for a match. Until it finds one, it increments the index for the array being examined. When it finds a starting point, it starts incrementing the index for the array containing the search term, keeping the other index constant.
This continues until the next mismatch, at which point a record check is made, the search term index is reset, and the examinee index starts being incremented once again.
I hope this helps, somehow, hopefully more than resolving this single-time struggle.
Edit:
Changed the code to disregard white space characters.

Okay, since you apparently weren't really wanting this for arrays, but rather for two files with text inside, here's an appropriate solution to achieve that:
#include <stdio.h>
#define iswhitespace(X) ((X) == '\n' || (X) == ' ' || (X) == '\t')
int main( ) {
FILE * e;
FILE * pi;
if ( ( e = fopen( "e", "r" ) ) == NULL ) {
printf( "failure at line %d\n", __LINE__ );
return -1;
}
if ( ( pi = fopen( "pi", "r" ) ) == NULL ) {
printf( "failure at line %d\n", __LINE__ );
return -1;
}
int curre = fgetc( e );
int currpi = fgetc( pi );
int currentlength = 0;
int longestlength = 0;
long longestindex = 0; // file offsets from ftell() are long
int whitespaces = 0;
long startpoint = 0; // fpos_t is opaque and cannot portably be stored in an integer, so use ftell()/fseek()
if ( curre == EOF || currpi == EOF ) {
printf( "either one of the files are empty\n" );
return -1;
}
while ( 1 ) {
while ( iswhitespace( currpi ) )
currpi = fgetc( pi );
while ( iswhitespace( curre ) )
curre = fgetc( e );
if ( curre == currpi && currpi != EOF ) {
if ( currentlength == 0 && ( startpoint = ftell( pi ) ) == -1L ) {
printf( "failure at line %d\n", __LINE__ );
return -1;
}
currentlength++;
curre = fgetc( e );
}
else if ( currentlength != 0 ) {
if ( currentlength > longestlength ) {
longestlength = currentlength;
longestindex = startpoint;
}
if ( curre == EOF ) {
printf( "Complete match!\n" );
break;
}
fseek( pi, startpoint, SEEK_SET );
rewind( e );
curre = fgetc( e );
currentlength = 0;
}
if ( currpi == EOF )
break;
currpi = fgetc( pi );
}
printf( "Longest sequence is %d characters long starting at %ld",
longestlength, longestindex );
putchar( 10 );
return 0;
}
It searches for a starting point and stores it, so that it can return there after determining the length of the current match. It then determines the length of the current match, disregarding whitespace on the way, updates the record length if necessary, completely rewinds the search term file, and partially rewinds the examinee file back to the stored position.
Here's my e file:
somet
hing
And here is my pi file:
some other nhing som
eth
ing
And here's the output I get:
Complete match!
Longest sequence is 9 characters long starting at 20
By the way, as far as I remember, fread and fwrite are not intuitive for human-readable text: they transfer raw bytes. You can think of it as the computer using a representation that it itself understands when those functions are called.

You can use the strstr() function. Consider using it in a loop, with the returned string as one of the arguments.
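A minimal sketch of that strstr() idea, assuming both files have already been read into null-terminated buffers. The helper names strip_whitespace and longest_prefix_found are hypothetical, not from the question. Whitespace is removed first so that the different line lengths (50 versus 80) do not matter, and each search resumes from the previous hit, because a longer prefix cannot first occur before a shorter one does.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Remove all whitespace (including newlines) from s in place. */
void strip_whitespace(char *s)
{
    assert(s != NULL);
    char *out = s;
    for (; *s != '\0'; s++) {
        if (!isspace((unsigned char)*s)) {
            *out++ = *s;
        }
    }
    *out = '\0';
}

/* Length of the longest prefix of e that occurs somewhere in pi.
   Both arguments must be null-terminated and already stripped. */
size_t longest_prefix_found(const char *pi, const char *e)
{
    assert(pi != NULL && e != NULL);
    size_t elen = strlen(e);
    size_t best = 0;
    char *prefix = malloc(elen + 1);   /* scratch copy of the current prefix */
    if (prefix == NULL) {
        return 0;
    }
    for (size_t len = 1; len <= elen; len++) {
        memcpy(prefix, e, len);
        prefix[len] = '\0';
        const char *hit = strstr(pi, prefix);
        if (hit == NULL) {
            break;                     /* this prefix no longer occurs */
        }
        pi = hit;                      /* resume from the latest hit */
        best = len;
    }
    free(prefix);
    return best;
}
```

Call strip_whitespace on both buffers once, then print the value returned by longest_prefix_found(arraypi, arraye). Copying the prefix on every pass is simple rather than fast; for a prefix that keeps matching for a long time, a hand-written scan may be preferable.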

Related

Replacing words in a string with words given in a 2D array

I'm currently working on a program that corrects given words in a sentence to be more polite.
I'm building a function that is given the original sentence and a 2D array, that stores the words we should look for and the ones we will replace them with.
This is my main function where the "dictionary" is declared:
int main(){
const char * d1 [][2] =
{
{ "hey", "hello" },
{ "bro", "sir" },
{ NULL, NULL }
};
printf("%s\n", newSpeak("well hey bro", d1) );
}
This function's job is to go through every position of the original string and compare it with the first character of each word that could potentially be 'bad'. If the first letter matches, it goes through the rest of the word, and if it gets all the way to the end of the word, it skips the original word and replaces it with the 'good' word.
This is the function itself:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#include <assert.h>
char * newSpeak ( const char * text, const char * (*replace)[2] )
{
char * result = (char*)malloc( sizeof(char) );
int resIndex = 0; // Pointer to final text
int matches = 0; // 1 - Matches word from library, 0 - Does not
// Run through the whole original text
for ( int index = 0; text[index] != '\0'; index++ ){
for ( int line = 0; replace[line][0] != NULL; line++ ){
// If the first letter of the word matches, do the others match too?
// If yes, don't rewrite the original word, skip it, and write the replacement one by one.
if ( replace[line][0][0] == text[index] ){
matches = 1;
// Check one by one if letters from the word align with letters in the original string
for ( int letter = 0; replace[line][0][letter] != '\0'; letter++ ){
if ( replace[line][0][letter] != text[index + letter] ){
matches = 0;
break;
}
}
// If the whole word matches, skip what would be copied from original text (the bad word) and put in replacement letter by letter
if ( matches == 1 ){
// Push pointer of original string after the word
index += strlen( replace[line][0] );
for ( int r = 0; r < strlen( replace[line][1] ); r++){
result = (char*)realloc(result, (strlen( result ) + 1) * sizeof(char));
result[resIndex + r] = replace[line][1][r];
index += r;
}
}
}
}
if ( matches == 0 ){
result = (char*)realloc(result, (strlen( result ) + 1) * sizeof(char));
result[resIndex] = text[index];
}
resIndex++;
}
return result;
}
After this is run, my expected outcome is well hello sir, but instead, the function only returns well hello.
I am looking for an explanation to why the loop would stop and not check for the rest of the string, any help would be appreciated.
At least this problem:
strlen( result ) in result = (char*)realloc(result, (strlen( result ) + 1) * sizeof(char)); is not valid, because result does not point to a string: the terminating null character is missing.
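One way to avoid calling strlen on an unterminated buffer is to track the length and capacity of the result explicitly. The following is only a sketch under that assumption: it keeps the signature from the question, but the doubling growth strategy and the use of strncmp are my choices, not something the original code does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string (caller frees), or NULL on allocation failure. */
char *newSpeak(const char *text, const char *(*replace)[2])
{
    assert(text != NULL && replace != NULL);
    size_t cap = strlen(text) + 1;     /* current capacity of result */
    size_t len = 0;                    /* characters written so far */
    char *result = malloc(cap);
    if (result == NULL) {
        return NULL;
    }
    while (*text != '\0') {
        const char *piece = text;      /* default: copy one character */
        size_t piece_len = 1;
        size_t consumed = 1;
        for (size_t i = 0; replace[i][0] != NULL; i++) {
            size_t bad_len = strlen(replace[i][0]);
            if (bad_len > 0 && strncmp(text, replace[i][0], bad_len) == 0) {
                piece = replace[i][1]; /* write the replacement instead */
                piece_len = strlen(piece);
                consumed = bad_len;
                break;
            }
        }
        if (len + piece_len + 1 > cap) {   /* keep room for the final '\0' */
            cap = (len + piece_len + 1) * 2;
            char *tmp = realloc(result, cap);
            if (tmp == NULL) {
                free(result);
                return NULL;
            }
            result = tmp;
        }
        memcpy(result + len, piece, piece_len);
        len += piece_len;
        text += consumed;
    }
    result[len] = '\0';                /* result is a proper string only now */
    return result;
}
```

With the dictionary from the question, newSpeak("well hey bro", d1) should produce "well hello sir". Like the original, this matches anywhere in the text, so a dictionary word that appears inside a longer word is replaced as well.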

Checking whether a string consists of two repetitions

I am writing a function that returns 1 if a string consists of two repetitions, 0 otherwise.
Example: If the string is "hellohello", the function will return 1 because the string consists of the same two words "hello" and "hello".
The first test I did was to use a nested for loop but after a bit of reasoning I thought that the idea is wrong and is not the right way to solve, here is the last function I wrote.
It is not correct, even if the string consists of two repetitions, it returns 0.
Also, I know this problem could be handled differently with a while loop following another algorithm, but I was wondering if it could be done with the for as well.
My idea would be to divide the string in half and check it character by character.
This is the last function I tried:
int doubleString(char *s){
int true=1;
char strNew[50];
for(int i=0;i<strlen(s)/2;i++){
strNew[i]=s[i];
}
for(int j=strlen(s)/2;j<strlen(s);j++){
if(!(strNew[j]==s[j])){
true=0;
}
}
return true;
}
The problem in your function is with the comparison in the second loop: you are using the j variable as an index both for the second half of the given string and for the copied first half of that string. However, for that copied string, the indexes need to start from zero, so you need to subtract the strlen(s)/2 value from j when accessing its individual characters.
Also, it is better to use the size_t type when looping through strings and comparing to the results of functions like strlen (which return that type). You can also improve your code by saving the strlen(s)/2 value, so it isn't computed on each loop. You can also dispense with your local true variable, returning 0 as soon as you find a mismatch, or 1 if the second loop completes without finding such a mismatch:
int doubleString(char* s)
{
    char strNew[50] = { 0, };
    size_t full_len = strlen(s);
    size_t half_len = full_len / 2;
    if (full_len % 2 != 0) { // an odd-length string cannot be two equal halves
        return 0;
    }
    for (size_t i = 0; i < half_len; i++) {
        strNew[i] = s[i];
    }
    for (size_t j = half_len; j < full_len; j++) {
        if (strNew[j - half_len] != s[j]) { // x != y is clearer than !(x == y)
            return 0;
        }
    }
    return 1;
}
In fact, once you have appreciated why you need to subtract that "half length" from the j index of strNew, you can remove the need for that temporary copy completely and just use the modified j as an index into the original string:
int doubleString(char* s)
{
    size_t full_len = strlen(s);
    size_t half_len = full_len / 2;
    if (full_len % 2 != 0) { // an odd-length string cannot be two equal halves
        return 0;
    }
    for (size_t j = half_len; j < full_len; j++) {
        if (s[j - half_len] != s[j]) { // x != y is clearer than !(x == y)
            return 0;
        }
    }
    return 1;
}
This loop
for(int j=strlen(s)/2;j<strlen(s);j++){
if(!(strNew[j]==s[j])){
true=0;
}
}
is incorrect. The index in the array strNew shall start from 0 instead of the value of the expression strlen( s ) / 2.
But in any case your approach is incorrect, if only because you are using an intermediate array with the magic number 50, while the user can pass a string of any length to the function.
char strNew[50];
The function can look much simpler.
For example
int doubleString( const char *s )
{
int double_string = 0;
size_t n = 0;
if ( ( double_string = *s != '\0' && ( n = strlen( s ) ) % 2 == 0 ) )
{
double_string = memcmp( s, s + n / 2, n / 2 ) == 0;
}
return double_string;
}
That is, the function first checks that the passed string is not empty and that its length is an even number. If so, the function compares the two halves of the string.
Here is a demonstration program.
#include <stdio.h>
#include <string.h>
int doubleString( const char *s )
{
int double_string = 0;
size_t n = 0;
if (( double_string = *s != '\0' && ( n = strlen( s ) ) % 2 == 0 ))
{
double_string = memcmp( s, s + n / 2, n / 2 ) == 0;
}
return double_string;
}
int main( void )
{
printf( "doubleString( \"\" ) = %d\n", doubleString( "" ) );
printf( "doubleString( \"HelloHello\" ) = %d\n", doubleString( "HelloHello" ) );
printf( "doubleString( \"Hello Hello\" ) = %d\n", doubleString( "Hello Hello" ) );
}
The program output is
doubleString( "" ) = 0
doubleString( "HelloHello" ) = 1
doubleString( "Hello Hello" ) = 0
Note that the function parameter should have the const qualifier, because the passed string is not changed within the function. That way you can also call the function with constant arrays, without the need to define one more function for constant character arrays.
It's better to do it with a while loop, since you don't always have to iterate through all the elements of the string, but since you want the for loop version, here it is (C++ version):
#include <string>
using std::string;
int doubleString(string s){
int s_length = s.length();
if(s_length%2 != 0) {
return 0;
}
for (int i = 0; i < s_length/2; i++) {
if (s[i] != s[s_length/2 + i]){
return 0;
}
}
return 1;
}

Sides of a Triangle (or Right Triangle)

float a, b, c;
printf( "Enter three nonzero values:\n" );
scanf( "%f%f%f", &a, &b, &c );
When I input a = 2, b = 2, and c = 4, why does...
if( a < ( b + c ))
{
if( b < ( a + c ))
{
if( c < ( a + b ))
{
printf( "This is a triangle." );
}
else
{
printf( "This is not a triangle." );
}
}
}
return 0;
}
...print "This is not a triangle" but...
if( a < ( b + c ))
{
if( b < ( a + c ))
{
if( c < ( a + b ))
{
printf( "This is a triangle." );
}
}
}
else
{
printf( "This is not a triangle." );
}
...does not? A solution I am looking at uses the latter code. However, the line where "This is not a triangle" should appear is just blank in my command prompt when I run the program.
I am using notepad++ with Developer Command Prompt for VS 2019.
vvvvvvv SOLVED vvvvvvvv
This is how I solved the exercise if anyone at the same learning stage as me is interested. I am using a book to learn c programming and I've only read three chapters. Therefore, I wanted to use only what I have learned from the first three chapters.
float a, b, c, temp, no = 0, count = 1;
printf( "Enter three nonzero values:\n" );
scanf( "%f%f%f", &a, &b, &c );
while ( count <= 3 ) { /* three passes, so each side is compared with the sum of the other two */
if(a + b > c){
/* Switch place of a and b */
temp = a;
a = b;
b = temp;
/* Switch place of b and c */
temp = c;
c = b;
b = temp;
count++;
}
else {
no++;
/* Switch place of a and b */
temp = a;
a = b;
b = temp;
/* Switch place of b and c */
temp = c;
c = b;
b = temp;
count++;
}
}
if ( no > 0 ) {
printf ( "This is not a valid triangle." );
}
else {
printf ( "This is a valid triangle." );
}
return 0;
This solution uses a while loop to determine if the sides of a triangle can be a valid triangle. I use variable temp to switch places of the sides and then add one to variable no if one of the combinations is not a valid triangle.
I also want to add that the solution for this exercise on chegg.com is false. It confused me a lot.
vvvvvvvv ANOTHER SOLUTION vvvvvvv
The next exercise in my book was to calculate if three values could be the sides of a right triangle. When I solved this I realized that the same code could be used for a regular triangle, too.
float a, b, largest, temp;
printf( "Enter three nonzero values:\n" );
scanf( "%f%f%f", &largest, &a , &b );
if ( a > largest ) {
temp = largest;
largest = a;
a = temp;
}
if ( b > largest ) {
temp = largest;
largest = b;
b = temp;
}
/* a + b > largest if it's a regular triangle */
if ( a * a + b * b == largest * largest ) {
printf( "This is a valid right triangle" );
}
else {
printf( "This is not a valid right triangle" );
}
return 0;
So instead of looping, just compare the variables a and b with largest to find out which variable is largest. Then use the Pythagorean theorem to find out if the sides can represent a right triangle, or the test a + b > largest if it is a regular triangle.
I prefer this way rather than using unnecessary looping.
Because if( c < ( a + b )) is false: 4 < (2 + 2) does not hold, so the else branch executes. In your second variant, "This is not a triangle." is printed only when the first if is false; otherwise nothing is printed. Try this
if ( (a < ( b + c )) && (b < ( a + c )) && (c < ( a + b )) )
printf( "This is a triangle.");
else
printf( "This is not a triangle.");
Note: This would also print "This is not a triangle." since by your logic with those values it is not a triangle anyway.
You are missing a bracket
if( c < ( a + b ))
{
printf( "This is a triangle." );
}else {
printf( "This is not a triangle." );
}
The if wasn't closed before the else.
Notice the curly brackets and indentation.
In the 1st example:
if(...)
if(...)
if( c < ( a + b ))
{
printf( "This is a triangle." );
}
else
{
printf( "This is not a triangle." );
}
In order to get to the third if statement, the previous two ifs (the first AND the second if) have to be true. Our third if statement has an else branch, which will execute if the third condition is false.
In the 2nd example:
if( a < ( b + c ))
{
// some more code
}
else
{
printf( "This is not a triangle." );
}
else branch will execute only if the first condition is false. If the first condition is true, this else will never be executed.
That being said, the first example translates to:
If a is less than b+c AND b is less than a+c
Then if c is less than a+b this is a triangle
Else this is not a triangle
(nothing is printed when either of the first two conditions is false)
The second example translates to:
If a is less than b+c
Then check other conditions
Else this is not a triangle
As a side note, be careful when comparing float numbers
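To illustrate that side note: an exact == on values such as a * a + b * b may fail because of rounding, so a tolerance-based comparison is a common workaround. This is a sketch only; the names nearly_equal, is_triangle and is_right_triangle and the tolerance 1e-9 are assumptions made for the example, and it uses double rather than the float of the question.

```c
#include <assert.h>
#include <stdbool.h>

/* True if x and y differ by at most rel_tol times the larger magnitude. */
bool nearly_equal(double x, double y, double rel_tol)
{
    assert(rel_tol >= 0.0);
    double diff = x > y ? x - y : y - x;
    double ax = x < 0 ? -x : x;
    double ay = y < 0 ? -y : y;
    double scale = ax > ay ? ax : ay;
    return diff <= rel_tol * scale;
}

/* All three triangle inequalities in a single condition. */
bool is_triangle(double a, double b, double c)
{
    return a < b + c && b < a + c && c < a + b;
}

/* Moves the largest side into c, then applies the Pythagorean theorem
   with a tolerance instead of an exact ==. */
bool is_right_triangle(double a, double b, double c)
{
    double temp;
    if (a > c) { temp = a; a = c; c = temp; }
    if (b > c) { temp = b; b = c; c = temp; }
    return is_triangle(a, b, c) && nearly_equal(a * a + b * b, c * c, 1e-9);
}
```

For the input from the question, is_triangle(2, 2, 4) is false because 4 < 2 + 2 does not hold. The right tolerance depends on how the side lengths were obtained, so 1e-9 should be treated as a placeholder.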

Print the last word of String inside an array with only one loop

The program drops out of the loop: when I check it, it appears to reach NULL,
although it should continue to advance to the following letters in the string.
Thanks to everyone who helps.
void main()
{
char string[2][10] = { "lior","king" };
int words, letter;
for (words=0,letter = 0;words<2 , string[words][letter] != NULL;)
{
letter++;
if (string[words][letter] = NULL)
{
printf("%c\n", string[words][letter - 1]);
words++;
}
}
}
The ambition is that when it reaches the end of the first word, it will print the first letter and advance to the next string
This condition in the loop
words<2 , string[words][letter] != NULL;
is wrong. It seems you mean just
words<2
The first statement in the body of the loop
letter++;
is also wrong because you skipped the index 0.
If I have understood correctly what you need is the following
#include <stdio.h>
int main(void)
{
enum { N = 10 };
char string[][N] = { "lior","king" };
const size_t M = sizeof( string ) / sizeof( *string );
for ( size_t word = 0, letter = 0; word < M; )
{
if (string[word][letter] == '\0' )
{
if ( letter != 0 ) printf( "%c\n", string[word][letter - 1] );
letter = 0;
++word;
}
else
{
++letter;
}
}
return 0;
}
The program output is
r
g

Check for particular characters in array in C

I would like to check for certain characters in an array at certain positions.
The array starts with $$$$ then has eight characters then another $, eight more characters and finishes with $$$$. For example char my_array[50] = "$$$$01FF4C68$02543EFE$$$$";
I want to check that all the positions where there are supposed to be $ do have them.
I could split the array into the three parts that contain the characters and then test for them separately but is there a better way of doing this?
Why complicate things?
if (my_array[0] != '$'
|| my_array[1] != '$'
|| my_array[2] != '$'
|| my_array[3] != '$'
|| my_array[12] != '$'
|| my_array[21] != '$'
|| my_array[22] != '$'
|| my_array[23] != '$'
|| my_array[24] != '$')
{
printf("Wrong!\n");
}
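Use strstr()
To check that the array begins with four $ : strstr(my_array, "$$$$") == my_array
To check that the array ends with four $ : strstr(my_array + 21, "$$$$") == my_array + 21
The +21 is here to shift the pointer so that my_array + 21 is the place where the closing $ are supposed to start. Comparing the result with the starting pointer matters, because strstr on its own only tells you that the pattern occurs somewhere at or after that point. Make sure the string holds at least 21 characters before adding the offset.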
You might want to use the strstr function to find the $$$....
Yes, there is: you might want to use regular expressions. Please read http://www.peope.net/old/regex.html
If you use a POSIX-compatible platform and some more complex patterns are about to emerge in your code, you can take a look at regular expressions, e.g. PCRE
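As a rough illustration of the regular-expression route, here is a sketch that uses the POSIX regcomp/regexec interface from <regex.h> rather than PCRE. The function name matches_dollar_format is hypothetical, and the pattern assumes that the eight characters in each group are uppercase hexadecimal digits, as in the example string; the question itself only says "eight characters".

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* True if s is exactly: $$$$ + 8 hex digits + $ + 8 hex digits + $$$$ */
bool matches_dollar_format(const char *s)
{
    assert(s != NULL);
    /* [$] is a literal dollar sign; the final $ anchors the end of the string. */
    static const char pattern[] = "^[$]{4}[0-9A-F]{8}[$][0-9A-F]{8}[$]{4}$";
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return false;                  /* pattern failed to compile */
    }
    bool ok = regexec(&re, s, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}
```

Compiling the pattern on every call keeps the sketch short; code that validates many strings would normally compile it once and reuse the regex_t.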
You could also avoid using strstr, since the format is simple and fixed, as long as the example format holds:
bool ok = strlen(my_array) >= 25 /* just be sure there are at least all expected chars */ &&
strncmp(my_array, "$$$$", 4) == 0 &&
strncmp(my_array + 12, "$", 1) == 0 /* my_array[12] == '$' */&&
strncmp(my_array + 21, "$$$$", 4) == 0;
A longer option that avoids the string.h library is to make three tests:
#include <stdio.h>
int firstTest( char a[] );
int secondTest( char a[] );
int thirdTest( char a[] );
int main (void)
{
    char my_array[50] = "$$$$01FF4C68$02543EFE$$$$";
    if( ( firstTest( my_array ) == 1 ) && ( secondTest( my_array ) == 1 ) && ( thirdTest( my_array ) == 1 ) ){
        printf( "The string is valid.\n" );
    }
    else{
        printf( "The string is invalid.\n" );
    }
    return 0;
}
/* Positions 0..3 must all be '$'. */
int firstTest( char a[] )
{
    int i;
    for( i = 0; i < 4; i++ ){
        if ( a[i] != '$' ){
            return 0;
        }
    }
    return 1; /* reached only after all four positions were checked */
}
/* Position 12 must be '$'. */
int secondTest( char a[] )
{
    if( a[12] != '$' )
        return 0;
    else
        return 1;
}
/* Positions 21..24 must all be '$'. */
int thirdTest( char a[] )
{
    int i;
    for( i = 21; i < 25; i++ ){
        if ( a[i] != '$' ){
            return 0;
        }
    }
    return 1; /* reached only after all four positions were checked */
}
sscanf should do the job
char my_array[50] = "$$$$01FF4C68$02543EFE$$$$";
int n = 0, m = 0; /* stay 0 if matching stops before the corresponding %n */
if( !sscanf(my_array,"$$$$%*8[0-9A-F]%n$%*8[0-9A-F]$$$$%n",&n,&m) && n==12 && m==25 )
puts("ok");
else
puts("not ok");
