I'm trying to write a program that takes any number of one-word text string arguments, each less than 128 characters long. The program copies text from stdin to stdout, except that any of the words seen in the input are replaced with the word "CENSORED".
Example:
I have this file called poem.txt:
Said Hamlet to Ophelia,
I'll draw a sketch of thee,
What kind of pencil shall I use?
2B or not 2B?
The program should do this:
./censor Ophelia < poem.txt
Said Hamlet to CENSORED,
I'll draw a sketch of thee,
What kind of pencil shall I use?
2B or not 2B?
Here's my code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
char lines[200][200];
int numLines=0,i,j;
int nbytes = 128;
int bytes_read=0;
char *my_string;
char * pch;
//reading from stdin
while(stdin)
{
my_string=(char *) malloc (nbytes + 1);
bytes_read = getline (&my_string, &nbytes, stdin);
strcpy(lines[numLines++],my_string);
}
//scanning and replacing specified words by "CENSORED"
for(i=0;i<argc;i++)
{
for(j=0;j<numLines;j++)
{
pch = strstr (lines[j],argv[i]);
strncpy (pch,"CENSORED",8);
}
}
//display the result in output screen
for(j=0;j<numLines;j++)
{
printf("\n%s",lines[i]);
}
}
The problem is that this gives a segmentation fault, but I can't identify the mistake.
You're not properly overwriting a match with the replacement, which might be longer or shorter than the word it replaces. You're just stuffing it in regardless, potentially overwriting the terminating \0. Worse, strstr returns NULL for every line that doesn't contain the word, and you pass that NULL straight to strncpy, which is the most likely cause of your segmentation fault. Your read loop is also a problem: while(stdin) is always true because stdin is a non-null pointer, so the loop never stops at end of input. Also, it looks like you miss double hits, as you only check each command-line word once against each line. Finally, you've made this more complicated by storing all the lines. No line affects any other, so why store them rather than process and print each line in turn?
Here's an overall simplified approach with more detailed replacement code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define REPLACEMENT "CENSORED"
#define BUFFER_SIZE (1024)
int main(int argc, char *argv[])
{
ssize_t bytes_read;
char *s, *line = malloc(BUFFER_SIZE);
size_t nbytes = BUFFER_SIZE, replacement_length = strlen(REPLACEMENT);
// read from stdin
while ((bytes_read = getline(&line, &nbytes, stdin)) != -1)
{
// scanning and replacing specified words
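for (int i = 1; i < argc; i++)
{
size_t search_length = strlen(argv[i]);
size_t offset = 0; // resume after each replacement so replaced text is never rescanned
while (search_length > 0 && (s = strstr(line + offset, argv[i])) != NULL)
{
size_t tail_length = strlen(s + search_length);
offset = (size_t)(s - line);
size_t needed = offset + replacement_length + tail_length + 1;
if (needed > nbytes) // the line is about to outgrow the buffer getline() gave us
{
char *grown = realloc(line, needed);
if (grown == NULL) { free(line); return EXIT_FAILURE; }
line = grown;
nbytes = needed;
s = line + offset; // realloc may have moved the line
}
(void) memmove(s + replacement_length, s + search_length, tail_length + 1);
(void) memcpy(s, REPLACEMENT, replacement_length);
offset += replacement_length;
}
}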
// display the result in output screen
(void) fputs(line, stdout);
}
free(line);
}
Oh yeah, and you forgot to free what you malloc'd. And since your loop starts at i = 0, you're searching for the name of the program (argv[0]) as one of your targets...
EXAMPLE
> ./a.out pencil 2B < poem.txt
Said Hamlet to Ophelia,
I'll draw a sketch of thee,
What kind of CENSORED shall I use?
CENSORED or not CENSORED?
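For comparison, here is a minimal sketch of a variant that never shifts text in place: it copies the line into a separate output buffer and writes the replacement wherever the word matches, so the question of growing the line buffer goes away. The name censor_copy and its snprintf-style return value are my own choices for illustration, not part of the code above, and like the code above it matches substrings rather than whole words.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the answer above): copies src into dst,
 * writing `replacement` wherever `word` occurs. Returns the length the full
 * result needs, excluding the '\0'; the output is truncated to fit dst_size. */
size_t censor_copy(char *dst, size_t dst_size, const char *src,
                   const char *word, const char *replacement)
{
    size_t word_len = strlen(word);
    size_t repl_len = strlen(replacement);
    size_t out = 0;                     /* length of the full result so far */

    while (*src != '\0') {
        const char *piece = src;        /* bytes to emit on this step */
        size_t piece_len = 1;

        if (word_len > 0 && strncmp(src, word, word_len) == 0) {
            piece = replacement;        /* emit the replacement instead */
            piece_len = repl_len;
            src += word_len;            /* skip the matched word */
        } else {
            src++;                      /* ordinary character, copy as-is */
        }
        for (size_t i = 0; i < piece_len; i++, out++) {
            if (out + 1 < dst_size)     /* always keep room for the '\0' */
                dst[out] = piece[i];
        }
    }
    if (dst_size > 0)
        dst[out < dst_size ? out : dst_size - 1] = '\0';
    return out;
}
```

If the return value is greater than or equal to dst_size, the output was truncated and the caller can retry with a larger buffer.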
Related
I'm using an Ubuntu machine, compiling with Clang.
I'm reading a simple file, storing it into a buffer then getting the length. I'm anticipating receiving a 5 but got a 6.
strlen() isn't supposed to include the null terminator. Is this perhaps because I performed a cast on the buffer?
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main() {
unsigned char buffer[30];
memset(buffer, '\0', 30);
int fd_read = open("test.txt", O_RDONLY);
read(fd_read, buffer, 29);
ssize_t length = strlen((const char *)buffer);
printf("%zu\n", length);
}
Contents of test.txt:
Hello
Output:
6
strlen() isn't supposed to include the null terminator.
That is true.
Is this perhaps because I performed a cast on the buffer?
The cast would be unnecessary if buffer were a plain char array, but it is not what is causing the problem.
I'm reading a simple file, storing it into a buffer then getting the length. I'm anticipating receiving a 5 but got a 6.
The likely scenario is that you have a newline character at the end of the read string, as pointed out by Chris Dodd, which strlen will count. To remove it (with your unsigned char buffer, strcspn needs the same cast as strlen):
buffer[strcspn((const char *)buffer, "\n")] = '\0';
Other considerations about your code:
You should verify the return value of open to confirm that the file was successfully accessed.
memset(buffer, '\0', 30); is unnecessary; you can null-terminate buffer after reading:
ssize_t nbytes = read(fd_read, buffer, sizeof buffer - 1);
if(nbytes >= 0)
buffer[nbytes] = '\0';
Or you can initialize the array with 0s:
unsigned char buffer[30] = {'\0'}; // or 0
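Putting these points together, here is a minimal sketch that reads once, terminates the buffer using the byte count returned by read, and cuts the string at the first newline. The helper name read_text is hypothetical, and it assumes a plain char buffer so that no casts are needed.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: performs a single read() from fd into buf (capacity
 * `size`), null-terminates the result and cuts it at the first newline.
 * Returns the resulting string length, or -1 on a read error or if size is 0. */
ssize_t read_text(int fd, char *buf, size_t size)
{
    if (size == 0)
        return -1;

    ssize_t nbytes = read(fd, buf, size - 1);  /* leave room for the '\0' */
    if (nbytes < 0)
        return -1;

    buf[nbytes] = '\0';                /* read() never adds a terminator */
    buf[strcspn(buf, "\n")] = '\0';    /* drop the newline, if there is one */
    return (ssize_t)strlen(buf);
}
```

With a file that contains Hello followed by a newline, this returns 5 rather than 6.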
Your program is somewhat convoluted, using modified types for no reason. Yet the problem does not come from these typing issues nor from the use of casts; it is much more likely that the file contains 6 bytes instead of 5, namely the letters Hello and a newline(*).
Here is a modified version:
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main() {
char buffer[30] = { 0 };
int fd_read = open("test.txt", O_RDONLY);
if (fd_read >= 0) {
int count = read(fd_read, buffer, sizeof(buffer) - 1);
size_t length = strlen(buffer);
printf("count=%d, length=%zu\n", count, length);
printf("contents: {");
for (int i = 0; i < count; i++) { // int index: count is -1 if read() failed
printf("%3.2X", (unsigned char)buffer[i]);
}
printf(" }\n");
close(fd_read);
}
return 0;
}
(*) or possibly, on legacy platforms, Hello and a CR/LF end-of-line sequence (7 bytes), which the read library function translates to a single '\n' byte; on those platforms read is a wrapper around system calls that performs complex postprocessing.
My program was built as a test that reads as many sentences as the user wants (until they enter -1) and then concatenates all the sentences (\n included). If I input a few characters it is fine, but if I input more than 25 characters I get the two errors listed above. I tried simulating what would happen on paper and I can't find the problem. Help is appreciated, thanks in advance.
The code is displayed below:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(void)
{
char *s = malloc(1), *sentence = malloc(0);
int sSize = 0;
printf("Insert sentences, press '-1' if you want to exit:\n");
do
{
fgets(s,100,stdin);
if(strcmp(s,"-1\n") != 0)
{
sSize += strlen(s);
sentence = realloc(sentence, sSize * sizeof(char));
//s[strcspn(s, "\0")] = '\n';
strcat(sentence, s);
}
}while(strcmp(s,"-1\n") != 0);
printf("==================sentence================\n");
printf("%s", sentence);
return 0;
}
This is a classic buffer overrun problem:
s = malloc(1) - s now points to a one-character buffer.
fgets(s,100,stdin); - reads up to 100 characters into s - which is a one-character buffer.
EDIT
Here's a version which works and doesn't use a separate "sentence buffer":
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(void)
{
const char *terminator = "-1\n";
char *sentences = malloc(100);
char *pNext_sentence;
printf("Insert sentences, press '-1' if you want to exit:\n");
*sentences = '\0';
do
{
sentences = realloc(sentences, strlen(sentences)+100);
pNext_sentence = sentences + strlen(sentences);
if (fgets(pNext_sentence, 100, stdin) == NULL)
break; // end of input or read error: stop instead of looping forever
} while(strcmp(pNext_sentence, terminator) != 0);
*pNext_sentence = '\0'; // drop the terminator line (harmless if input ended without one)
printf("==================sentences================\n");
printf("%s", sentences);
free(sentences);
return 0;
}
You must reallocate memory with realloc before calling fgets, which, in your case, can store up to 100 bytes (99 characters plus the terminating null).
Your buffer has an initial size of 1.
I am attempting to scanf a multiline input in C and output it. However, I'm having trouble handling spaces and newline characters. If the input is:
Hello.
My name is John.
Pleased to meet you!
I want to output all three lines. But my output ends up being just:
Hello.
Here's my code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char s[100];
scanf("%[^\n]%*c", &s);
printf(s);
return 0;
}
It's much easier to use fgets():
#include <stdio.h>
int main(void)
{
char buffer[1000];
while (fgets(buffer, sizeof(buffer), stdin) && buffer[0] != '\n') {
printf("%s", buffer);
}
}
An empty line (first character is newline) ends input.
If you have to read all input first before printing the result, things get a little bit more complicated:
#include <stddef.h> // size_t
#include <stdlib.h> // EXIT_FAILURE, realloc(), free()
#include <stdio.h> // fgets(), puts()
#include <string.h> // strlen(), strcpy()
int main(void)
{
char buffer[1000];
char *text = NULL; // pointer to memory that will contain the whole text
size_t total_length = 0; // keep track of where to copy our buffer to
while (fgets(buffer, sizeof(buffer), stdin) && buffer[0] != '\n') {
size_t length = strlen(buffer); // remember so we don't have to call strlen() twice
// (re)allocate memory to copy the buffer to; + 1 for the terminating '\0'
char *new_text = realloc(text, total_length + length + 1);
if (!new_text) { // if (re)allocation failed
free(text); // clean up our mess
fputs("Not enough memory :(\n\n", stderr);
return EXIT_FAILURE;
}
text = new_text; // now it's safe to discard the old pointer
// strcpy instead of strcat so we don't have to care about
// uninitialized memory on the first pass *)
strcpy(text + total_length, buffer);
total_length += length;
}
if (text) // text is still NULL if no line was read
puts(text); // print all of it
free(text); // never forget
}
*) and it is also more efficient, since strcat() would have to find the end of text before appending the new string, which is information we already have.
I have been trying to get strcmp to report a match in the following program for many days now. I have read the man pages for strcmp, read, and write, and I have read others' posts from people who had the exact same problem. The following source code is just a test program that is frustrating the heck out of me; the commented-out lines are other attempts I've made at getting strcmp to work as expected. I have compiled with 'gcc -g' and stepped through one instruction at a time in gdb. The printf statements pretty much tell the whole story. I cannot get the value of buf or bufptr to equal 't', ever. I have simplified the program and had it just print one character at a time to the screen, and the characters print as expected from whatever file is read in. However, as soon as I start playing with strcmp, things get crazy. I cannot for the life of me figure out a way to get the value in buf to be the single char that I am expecting it to be.
When simplified to just the write(1,...) call, it writes the expected single char to stdout, but strcmp against a single 't' never returns 0. Thank you in advance. I originally didn't have bufptr in there and was doing a strcmp on buf itself. I also tried using bufptr[0] = buf[0], and they still were not the same.
#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#define BUF_SIZE 1
void main(int argc, char *argv[])
{
char buf[BUF_SIZE];
int inputFd = open(argv[1], O_RDONLY);
char tee[] = "t";
int fff = 999;
char bufptr[BUF_SIZE];
// char *bufptr[BUF_SIZE];
while (read(inputFd, buf, BUF_SIZE) > 0) {
bufptr[0] = buf[0];
// bufptr = buf;
printf("********STRCMP RETURNED->%d\n", fff); // for debugging purposes
printf("--------tee is -> %s\n", tee); // for debugging purposes
printf("++++++++buf is -> %s\n", buf); // " " "
printf("########bufptr is -> %s", bufptr); // " " "
write (1, buf, BUF_SIZE);
if ((fff = strcmp(tee, bufptr)) == 0)
printf("THIS CHARACTER IS A T");
}
close(inputFd);
}
The str-family of functions expects strings as inputs, which are arrays storing null-terminated character sequences. However, you do not provide space in the buffer for the null character. To make the buffers strings, you need to add space for the null-character and zero-out the value so that they end with the null character.
int main(int argc, char *argv[]) // main must return int
{
char buf[ BUF_SIZE + 1 ] = {0}; // one extra byte that always holds the terminating '\0'
int inputFd = open(argv[1], O_RDONLY);
char tee[] = "t";
while (read(inputFd, buf, BUF_SIZE) > 0) {
if ( strcmp( tee, buf ) == 0 )
printf("THIS CHARACTER IS A T");
}
close(inputFd);
return 0;
}
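Since only a single byte is compared here, another option is to skip strings altogether and compare characters with ==, which needs no terminator at all. A minimal sketch, where count_char_fd is a hypothetical helper name and error handling is reduced to stopping at the first failed read:

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: reads fd one byte at a time and counts how many
 * bytes are equal to `target`. Stops at end of file or on a read error. */
size_t count_char_fd(int fd, char target)
{
    char c;
    size_t count = 0;

    while (read(fd, &c, 1) == 1) {
        if (c == target)    /* plain character comparison, no strcmp needed */
            count++;
    }
    return count;
}
```

Calling count_char_fd(inputFd, 't') on the question's file descriptor would report how many times the THIS CHARACTER IS A T branch should fire.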
I want to use read() and write() for reading from and writing to the console instead of scanf() and printf(), because the former are system calls and work with signals.
I have to make a mini Unix shell, which forks into children when performing a command. Here is my initial try for testing the reading and writing:
#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define COMMAND_LENGTH 1024
#define NUM_TOKENS (COMMAND_LENGTH / 2 + 1)
void read_command(char *buff, char *tokens[], _Bool *in_background) {
// to be implemented later
}
void createStr(char **str) {
if (*str != NULL) {
free(*str);
*str = NULL;
}
*str = (char*)malloc(sizeof(char) * COMMAND_LENGTH);
}
void delStr(char** str) {
if (*str != NULL) {
free(*str);
*str = NULL;
}
}
int main(int argc, char *argv[]) {
char input_buffer[COMMAND_LENGTH];
char *tokens[NUM_TOKENS];
char *inp = NULL;
while (true) {
write(STDOUT_FILENO, "> ", strlen("> "));
createStr(&inp);
read(STDIN_FILENO, input_buffer, sizeof(char) * COMMAND_LENGTH);
strcpy(inp, input_buffer);
write(STDOUT_FILENO, inp, strlen(inp));
_Bool in_background = false;
read_command(inp, tokens, &in_background);
}
delStr(&inp);
return 0;
}
My output for sample inputs is not what I expect. Here is a sample output:
> Peterson
Peterson
��> Makr
Makr
son
��> Mark
Mark
son
��> Jon
Jon
son
��>
I don't know what is going on: why the special characters are showing up, and why parts of my previous input appear in my new input. I need help with this.
You must store the return value of read, verify that read completed, and null-terminate the string before passing it to strcpy. read does not add a terminator and leaves the rest of the buffer untouched, which is why strcpy and strlen run past your input into stray bytes and into the leftovers of earlier, longer commands such as "son". read can indeed be interrupted by a signal and must be restarted in this case. read can also return a partial line: you should keep reading and concatenating into the command buffer, carefully reallocating the buffer if needed, until you receive either an end of file or the character '\n'.
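A minimal sketch of that loop, under simplifying assumptions: it uses a fixed-size buffer instead of reallocating, and it reads one byte at a time so it never consumes input past the newline (simple, but slow for large inputs). The name read_line_fd is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: reads from fd until '\n', end of file, or until buf
 * (capacity `size`) is full. Stores the line without its newline, always
 * null-terminated. Returns the line length, or -1 on a read error, if size
 * is 0, or if end of file is reached before any byte was read. */
ssize_t read_line_fd(int fd, char *buf, size_t size)
{
    size_t len = 0;

    if (size == 0)
        return -1;
    buf[0] = '\0';

    while (len + 1 < size) {
        char c;
        ssize_t n = read(fd, &c, 1);

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: restart */
            return -1;              /* real error */
        }
        if (n == 0) {               /* end of file */
            if (len == 0)
                return -1;
            break;
        }
        if (c == '\n')
            break;                  /* end of the command line */
        buf[len++] = c;
    }
    buf[len] = '\0';                /* terminate before any str* function sees it */
    return (ssize_t)len;
}
```

In the shell loop, the read and strcpy pair could then become a single call such as read_line_fd(STDIN_FILENO, inp, COMMAND_LENGTH), with a negative return value ending the loop.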