help with C string i/o - c

Reads through a .txt file and puts all chars that pass isalpha() in a char array. for spaces it puts \0, so the characters in the array are separated by strings. This works.
I need help with the second part. I need to read a string that the user inputs (lets call it the target string), find all instances of the target string in the char array, and then for each instance:
1. print the 5 words before the target string
2. print the target string itself
3. and print the 5 words after the target string
I can't figure it out, i'm new to C in general, and I find this i/o really difficult after coming from Java. Any help would be appreciated, here's the code I have right now:
#include <stdio.h>
#include <string.h>
main(argc, argv)
int argc;
char *argv[];
{
FILE *inFile;
char ch, ch1;
int i, j;
int arrayPointer = 0;
char wordArray [150000];
for (i = 0; i < 150000; i++)
wordArray [i] = ' ';
/* Reading .txt, strip punctuation, conver to lowercase, add char to array */
void extern exit(int);
if(argc > 2) {
fprintf(stderr, "Usage: fread <filename>\n");
exit(-1);
}
inFile = fopen(argv[1], "r");
ch = fgetc(inFile);
while (ch != EOF) {
if(isalpha(ch)) {
wordArray [arrayPointer] = tolower(ch);
arrayPointer++;
}
else if (isalpha(ch1)) {
wordArray [arrayPointer] = '\0';
arrayPointer++;
}
ch1 = ch;
ch = fgetc(inFile);
}
fclose;
/* Getting the target word from the user */
char str [20];
do {
printf("Enter a word, or type \"zzz\" to quit: ");
scanf ("%s", str);
char* pch;
pch = strstr(wordArray, str);
printf("Found at %d\n", pch - wordArray + 1);
pch = strstr(pch + 1, str);
} while (pch != NULL);
}

There are a number of problems here, but the one that is probably tripping you up the most is the use of strstr as you've got it. Both parameters are strings; the first is the haystack, and the second is the needle. The definition of a C string is (basically) a sequence of characters terminated by '\0'. Take a look at how you've constructed your wordArray; it's effectively a series of strings one right after the other. So when you are using strstr the first time, you are only ever looking at the first string.
I realize this isn't the entire answer you are looking for, but hopefully it points you in the right direction. You may want to consider building up an array of char * that points into your wordArray at each word. Iterate over that new array checking for the string the user is looking for. If you find it, you now have an index you can work backwards and forwards from.

Related

Seperating all characters in a string before a strstr pointer in C

I am building a find and replace program in C. it needs to be able to replace whatever is searched for in a text file with some thing else.
Eg. Search elo in input.txt replace with ELO
Result: developers devELOpers
eg search el replace EL hello hELlo welcome wELcome
I am having real trouble getting all the characters in a string before a pointer. I can get the single words and the pointer for the search word and i can get everything after the search term but i cannot get everything before the searched characters.
Tried strlen on the whole word and subtracting to no avail. I also can get the length of the first part of the word i need.
Thanks for any help.
search for era in
operating:
Outputs are
firstMatch= erating
End part =ting
startLength 2
#include <stdio.h>
#include<stdlib.h>
#include <string.h>
#define LINE_LENGTH 1000
//Function to remove a newline character and replace it with a null terminator.
void remove_newline(char *str)
{
int len =strlen (str);
if (len>0 &&str[len -1] == '\n')
str[len -1] = '0';
}
int main(int argc, char **argv)
{
char* searchWord = argv[1]; //Define the string searched for as argument 1.
char* replaceWord = ""; // Initialise the word for replacing search word.
char* text_File= ""; // Initialise the text file
for(int i = 0; i < argc; i++)
{
if (strcmp(argv[i],"-r")==0)
{
replaceWord = argv[i+1]; // Identify the replacemnt word as the argument that comes after "-r".
}
if (strcmp(argv[i],"-i")==0)
{
text_File = argv[i+1]; // Identify the file word as the argument that comes after "-i".
}
}
char slen= strlen(searchWord);
printf("You have searched for %s\n", searchWord); // Clarify for the user their search terms and input file.
printf("In the file %s\n", text_File);
FILE *input_file = fopen(text_File, "r"); // Open the text file
char line [LINE_LENGTH];
FILE *write_file = fopen("output.txt", "w"); //Create a new file for writing to.
` while (fgets(line, LINE_LENGTH, input_file) != NULL) ***// Loop through the contents of the input file.***
{
char *currentWord; ***// Separate the words at any of "\n ,.-".***
char *tempWord;
currentWord = strtok(line, "\n ,.-");
while (currentWord != NULL) ***// Loop through every seperated word.***
{
remove_newline(currentWord); //Remove the newline character form the current word.
if (strstr(currentWord, searchWord)!= NULL) ***// Check if the current word contains the searh word***
{
printf ("%s ", currentWord); ***// If it does print to console the word containing it***
char currLength= strlen(currentWord);
printf("%d\n", currLength);
char* firstMatch= strstr(currentWord, searchWord); ***// Find the first occurence of the searched term***
printf ("firstMatch %s\n ", firstMatch); ***//Print evrything after and including search.***
char* lastPart = firstMatch + slen; ***// Get the part after the search***
printf ("End part %s\n ", lastPart);
char rest = strlen(firstMatch);
char startLength = currLength - rest;
This is where it doesn't work.
char* startPart = firstMatch - startLength;
printf ("start Part %s\n ", startPart);
printf ("startLength %d\n\n ", startLength);`
The reason for the unexpected result when you try to print the "before the match" portion of the word is that there's nothing in the word string that will cause the
printf("start Part %s\n ", startPart);
call to stop after it has printed the first startLength characters of the word. When printf is told to print a string, it prints all characters from the starting point until it encounters a \0 terminator. Here, the only \0 is at the end of the word, so printf prints the entire word.
If you want to only print the first few characters of the word, you have to either construct a \0-terminated string that only contains those first few characters or you have to print them by using a mechanism that does not try to treat them as a string.
To construct a \0-terminated start string you could temporarily overwrite the first character of the match with a \0, then call printf, and then restore the match character. Something like:
char savedFirstMatch = *firstMatch;
*firstMatch = '\0';
printf("start Part %s\n ", startPart);
*firstMatch = savedFirstMatch;
If you don't want to do that then you could use a for loop to print only the first startLength characters as individual characters, not as a string, preceded and followed by a printf or puts that emits whatever extra stuff you want to print around those characters. In this case the extra stuff is a "start Part " string before the characters, and a newline and space afterwards (assuming that that space isn't just a typo). That would look something like:
puts("start Part ");
unsigned startIndex;
for (startIndex = 0; startIndex < startLength; ++startIndex) {
putchar(startPart[startIndex]);
}
puts("\n ");
Of course if you aren't comfortable with puts and putchar you can use printf("%s", ...) and printf("%c", ...) instead.

ANSI C strcmp() function never returning 0, where am I going wrong?

C isn't the language I know so I'm out of my comfort zone (learning C) and I have ran into an issue that I can't currently figure out.
I am trying to read from a text file one word at a time and compare it to a word that I have passed into the function as a pointer.
I am currently reading it from the file one character at a time and storing those characters in a new char array until it hits a space, then comparing that char array to the original word stored in the pointer (stored where it's pointing to, anyway).
When I do a printf to check if both arrays are the same they are, they both equal "Hello". At first I thought maybe it's because my char array doesn't have an end terminator but I tried adding one but still nothing is seeming to work.
My code is below and I would appreciate any help. Again C isn't my strong area.
If I do "Hello" it will be > 0 by the way, so I think it's because the gets() stdin function is also including the enter key or something of that sort. I am not sure of a better way to grab the string though.
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
int partA(char*);
main()
{
// Array to store my string
char myWord[81];
// myword = pointer to my char array to store. 80 = the size (maximum). stdin = standard input from my keyboard.
fgets(myWord, 80, stdin);
partA(myWord);
}
int partA(char *word)
{
// points to file.
FILE *readFile;
fopen_s(&readFile, "readThisFile.txt", "r");
char character;
char newWord[50];
int i = 0;
while ((character = fgetc(readFile)) != EOF)
{
if (character == ' ')
{
newWord[i] = '\0';
int sameWord = strcmp(word, newWord);
printf("Word: %s", word);
printf("newWord: %s", newWord);
if (sameWord == 0)
printf(" These words are the same.");
if (sameWord > 0)
printf(" sameWord > 0.");
if (sameWord < 0)
printf(" sameWord < 0.");
printf("\n");
i = 0;
}
if (character != ' ')
{
newWord[i] = character;
i++;
}
printf("%c", character);
}
fclose(readFile);
return 1;
}

C - char array getting phantom values after memset

My program reads in a text file line by line and prints out the largest word in each sentence line. However, it sometimes prints out previous highest words although they have nothing to do with the current sentence and I reset my char array at the end of processing each line. Can someone explain to me what is happening in memory to make this happen? Thanks.
//Program Written and Designed by R.Sharpe
//LONGEST WORD CHALLENGE
//Purpose: Find the longest word in a sentence
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "memwatch.h"
int main(int argc, char** argv)
{
FILE* file;
file = fopen(argv[1], "r");
char* sentence = (char*)malloc(100*sizeof(char));
while(fgets(sentence, 100, file) != NULL)
{
//printf("%s\n", sentence);
char sub[100];
char maxWord[100];
strncpy(sub, sentence, strlen(sentence)-1);
strcpy(sentence, sub);
char* word;
int maxLength = 0;
word = strtok(sentence, " ");
while(word != NULL)
{
if(strlen(word) > maxLength)
{
maxLength = strlen(word);
strcpy(maxWord, word);
//printf("%s\n", maxWord);
}
word = strtok(NULL, " ");
}
printf("%s\n", maxWord);
memset(maxWord, 0, sizeof(char));
maxLength = 0; //reset for next sentence;
}
free(sentence);
return 0;
}
my text file contains . .
some line with text
another line of words
Jimmy John took the a apple and something reallyreallylongword it was nonsense
test test BillGatesSteveJobsWozWasMagnificant
a b billy
the output of the program is . .
some
another
reallyreallylongword
BillGatesSteveJobsWozWasMagnificantllyreallylongword
BillGatesSteveJobsWozWasMagnificantllyreallylongword //should be billy
Also when I arbitrarily change the length of the 5th sentence the last word sometimes
comes out to be "reallyreallylongword" which is odd.
EDIT: Even when I comment MEMSET out I still get the same result so it may not have anything to do with memset but not completely sure
Trailing NULL bytes (\0) are the bane of string manipulation. You have a copy sequence that is not quite doing what you desire of it:
strncpy(sub, sentence, strlen(sentence)-1);
strcpy(sentence, sub);
Sentence is copied into sub, and then back again. Except, strncpy does not copy the '\0' out of sentence. When you copy the string from sub back into sentence, you are copying an unknown length of data back into sentence. Because the stack is being reused and the char arrays are uninitialized, the data is likely residing there from the previous iteration and thus being seen by the next execution.
Adding the following between the two strcpys fixes the problem:
sub[strlen(sentence) - 1] = '\0';
You've got a missing null terminator.
char sub[100];
char maxWord[100];
strncpy(sub, sentence, strlen(sentence)-1);
strcpy(sentence, sub);
When you strncpy, if src is longer than the number of characters to be copied, no null terminator is added. You've guaranteed this is the case, so sub has no terminator, and you're rapidly running into behavior you don't want. It looks like you're trying to trim the last character from the string; the easier way to do that is simply set the character at index strlen(sentence)-1 to '\0'.
This is bad:
strncpy(sub, sentence, strlen(sentence)-1);
strcpy(sentence, sub);
The strncpy function does not null-terminate its buffer if the source string doesn't fit. By doing strlen(sentence)-1 you guaranteed it doesn't fit. Then the strcpy causes undefined behaviour because sub isn't a string.
My advice is to not use strncpy, it is almost never a good solution to a problem. Use strcpy or snprintf.
In this case you never even use sub so you could replace these lines with:
sentence[ strlen(sentence) - 1 ] = 0;
which has the effect of removing the \n on the end that was left by fgets. (If the input was longer than 100 then this deletes a character of input).
Find the corrected code in below
int main(int argc, char** argv)
{
FILE* file;
file = fopen(argv[1], "r");
char sub[100];
char maxWord[100];
char* word;
int maxLength = 0;
char* sentence = (char*)malloc(100*sizeof(char));
while(fgets(sentence, 100, file) != NULL)
{
maxLength = 0;
strncpy(sub, sentence, strlen(sentence)-1);
sub[strlen(sentence) - 1] = '\0'; //Fix1
strcpy(sentence, sub);
word = strtok(sentence, " ");
while(word != NULL)
{
if(strlen(word) > maxLength)
{
maxLength = strlen(word);
strcpy(maxWord, word);
}
word = strtok(NULL, " ");
}
printf("%s\n", maxWord);
memset(maxWord, 0, sizeof(char));
maxLength = 0; //reset for next sentence;
}
free(sentence);
fclose (file); //Fix2
return 0;
}
Ensure that the file is closed at the end. It is good practice.

Building a basic shell, more specifically using execvp()

In my program I am taking user input and parsing it into a 2d char array. The array is declared as:
char parsedText[10][255] = {{""},{""},{""},{""},{""},
{""},{""},{""},{""},{""}};
and I am using fgets to grab the user input and parsing it with sscanf. This all works as I think it should.
After this I want to pass parsedText into execvp, parsedText[0] should contain the path and if any arguments are supplied then they should be in parsedText[1] thru parsedText[10].
What is wrong with execvp(parsedText[0], parsedText[1])?
One thing probably worth mentioning is that if I only supply a command such as "ls" without any arguments it appears to work just fine.
Here is my code:
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include "308shell.h"
int main( int argc, char *argv[] )
{
char prompt[40] = "308sh";
char text[40] = "";
char parsedText[10][40] = {{""},{""},{""},{""},{""},
{""},{""},{""},{""},{""}};
// Check for arguments to change the prompt.
if(argc >= 3){
if(!(strcmp(argv[1], "-p"))){
strcpy(prompt, argv[2]);
}
}
strcat(prompt, "> ");
while(1){
// Display the prompt.
fputs(prompt, stdout);
fflush(stdout);
// Grab user input and parse it into parsedText.
mygetline(text, sizeof text);
parseInput(text, parsedText);
// Check if the user wants to exit.
if(!(strcmp(parsedText[0], "exit"))){
break;
}
execvp(parsedText[0], parsedText[1]);
printf("%s\n%s\n", parsedText[0], parsedText[1]);
}
return 0;
}
char *mygetline(char *line, int size)
{
if ( fgets(line, size, stdin) )
{
char *newline = strchr(line, '\n'); /* check for trailing '\n' */
if ( newline )
{
*newline = '\0'; /* overwrite the '\n' with a terminating null */
}
}
return line;
}
char *parseInput(char *text, char parsedText[][40]){
char *ptr = text;
char field [ 40 ];
int n;
int count = 0;
while (*ptr != '\0') {
int items_read = sscanf(ptr, "%s%n", field, &n);
strcpy(parsedText[count++], field);
field[0]='\0';
if (items_read == 1)
ptr += n; /* advance the pointer by the number of characters read */
if ( *ptr != ' ' ) {
strcpy(parsedText[count], field);
break; /* didn't find an expected delimiter, done? */
}
++ptr; /* skip the delimiter */
}
}
execvp takes a pointer to a pointer (char **), not a pointer to an array. It's supposed to be a pointer to the first element of an array of char * pointers, terminated by a null pointer.
Edit: Here's one (not very good) way to make an array of pointers suitable for execvp:
char argbuf[10][256] = {{0}};
char *args[10] = { argbuf[0], argbuf[1], argbuf[2], /* ... */ };
Of course in the real world your arguments probably come from a command line string the user entered, and they probably have at least one character (e.g. a space) between them, so a much better approach would be to either modify the original string in-place, or make a duplicate of it and then modify the duplicate, adding null terminators after each argument and setting up args[i] to point to the right offset into the string.
You could instead do a lot of dynamic allocation (malloc) every step of the way, but then you have to write code to handle every possible point of failure. :-)

Parsing text in C

I have a file like this:
...
words 13
more words 21
even more words 4
...
(General format is a string of non-digits, then a space, then any number of digits and a newline)
and I'd like to parse every line, putting the words into one field of the structure, and the number into the other. Right now I am using an ugly hack of reading the line while the chars are not numbers, then reading the rest. I believe there's a clearer way.
Edit: You can use pNum-buf to get the length of the alphabetical part of the string, and use strncpy() to copy that into another buffer. Be sure to add a '\0' to the end of the destination buffer. I would insert this code before the pNum++.
int len = pNum-buf;
strncpy(newBuf, buf, len-1);
newBuf[len] = '\0';
You could read the entire line into a buffer and then use:
char *pNum;
if (pNum = strrchr(buf, ' ')) {
pNum++;
}
to get a pointer to the number field.
fscanf(file, "%s %d", word, &value);
This gets the values directly into a string and an integer, and copes with variations in whitespace and numerical formats, etc.
Edit
Ooops, I forgot that you had spaces between the words.
In that case, I'd do the following. (Note that it truncates the original text in 'line')
// Scan to find the last space in the line
char *p = line;
char *lastSpace = null;
while(*p != '\0')
{
if (*p == ' ')
lastSpace = p;
p++;
}
if (lastSpace == null)
return("parse error");
// Replace the last space in the line with a NUL
*lastSpace = '\0';
// Advance past the NUL to the first character of the number field
lastSpace++;
char *word = text;
int number = atoi(lastSpace);
You can solve this using stdlib functions, but the above is likely to be more efficient as you're only searching for the characters you are interested in.
Given the description, I think I'd use a variant of this (now tested) C99 code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
struct word_number
{
char word[128];
long number;
};
int read_word_number(FILE *fp, struct word_number *wnp)
{
char buffer[140];
if (fgets(buffer, sizeof(buffer), fp) == 0)
return EOF;
size_t len = strlen(buffer);
if (buffer[len-1] != '\n') // Error if line too long to fit
return EOF;
buffer[--len] = '\0';
char *num = &buffer[len-1];
while (num > buffer && !isspace((unsigned char)*num))
num--;
if (num == buffer) // No space in input data
return EOF;
char *end;
wnp->number = strtol(num+1, &end, 0);
if (*end != '\0') // Invalid number as last word on line
return EOF;
*num = '\0';
if (num - buffer >= sizeof(wnp->word)) // Non-number part too long
return EOF;
memcpy(wnp->word, buffer, num - buffer);
return(0);
}
int main(void)
{
struct word_number wn;
while (read_word_number(stdin, &wn) != EOF)
printf("Word <<%s>> Number %ld\n", wn.word, wn.number);
return(0);
}
You could improve the error reporting by returning different values for different problems.
You could make it work with dynamically allocated memory for the word portion of the lines.
You could make it work with longer lines than I allow.
You could scan backwards over digits instead of non-spaces - but this allows the user to write "abc 0x123" and the hex value is handled correctly.
You might prefer to ensure there are no digits in the word part; this code does not care.
You could try using strtok() to tokenize each line, and then check whether each token is a number or a word (a fairly trivial check once you have the token string - just look at the first character of the token).
Assuming that the number is immediately followed by '\n'.
you can read each line to chars buffer, use sscanf("%d") on the entire line to get the number, and then calculate the number of chars that this number takes at the end of the text string.
Depending on how complex your strings become you may want to use the PCRE library. At least that way you can compile a perl'ish regular expression to split your lines. It may be overkill though.
Given the description, here's what I'd do: read each line as a single string using fgets() (making sure the target buffer is large enough), then split the line using strtok(). To determine if each token is a word or a number, I'd use strtol() to attempt the conversion and check the error condition. Example:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
/**
* Read the next line from the file, splitting the tokens into
* multiple strings and a single integer. Assumes input lines
* never exceed MAX_LINE_LENGTH and each individual string never
* exceeds MAX_STR_SIZE. Otherwise things get a little more
* interesting. Also assumes that the integer is the last
* thing on each line.
*/
int getNextLine(FILE *in, char (*strs)[MAX_STR_SIZE], int *numStrings, int *value)
{
char buffer[MAX_LINE_LENGTH];
int rval = 1;
if (fgets(buffer, buffer, sizeof buffer))
{
char *token = strtok(buffer, " ");
*numStrings = 0;
while (token)
{
char *chk;
*value = (int) strtol(token, &chk, 10);
if (*chk != 0 && *chk != '\n')
{
strcpy(strs[(*numStrings)++], token);
}
token = strtok(NULL, " ");
}
}
else
{
/**
* fgets() hit either EOF or error; either way return 0
*/
rval = 0;
}
return rval;
}
/**
* sample main
*/
int main(void)
{
FILE *input;
char strings[MAX_NUM_STRINGS][MAX_STRING_LENGTH];
int numStrings;
int value;
input = fopen("datafile.txt", "r");
if (input)
{
while (getNextLine(input, &strings, &numStrings, &value))
{
/**
* Do something with strings and value here
*/
}
fclose(input);
}
return 0;
}

Resources