Ignore first word from the line in c - c

I am working on some code and need some help.
A line needs to be read from a file. The first word must be ignored and the remaining characters (white space included) have to be stored into a variable. How do I do it?

This will work if your word has no spaces in front of it and you use a space (' ') as the separating character.
#include <stdio.h>
#include <string.h>
int main()
{
char buffer[80];
char storage[80];
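if (fgets(buffer, sizeof buffer, stdin) == NULL) // enter: »hello nice world!\n«
return 1; // nothing could be read
char *rest = strchr(buffer, ' '); // rest becomes » nice world!\n«
if (rest == NULL) // no space found: the line holds a single word
return 1;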
printf("rest: »%s«\n", rest); // » nice world!\n«
printf("buffer: »%s«\n", buffer); // »hello nice world!\n«
strncpy( storage, rest, 80 ); // storage contains now » nice world!\n«
printf("storage: »%s«\n", storage); // » nice world!\n«
// if you'd like the separating character after the "word" to be any white space
char *rest2 = buffer;
rest2 += strcspn( buffer, " \t\r\n" ); // rest2 now also points to » nice world!\n«
printf("rest2: »%s«\n", rest2); // » nice world!\n«
return 0;
}
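The limitation mentioned above (no spaces in front of the word) can be lifted by skipping leading whitespace first. The following is a minimal sketch, not part of the original answer; the helper name rest_after_first_word is hypothetical, and "word" is assumed to mean a run of characters other than space, tab, carriage return and newline.

```c
#include <string.h>

/* Hypothetical helper: returns a pointer into `line` just past the first
   word. Whitespace in front of the word is skipped first; the whitespace
   that follows the word is kept. */
const char *rest_after_first_word(const char *line)
{
    const char *ws = " \t\r\n";
    line += strspn(line, ws);   /* skip whitespace in front of the word */
    line += strcspn(line, ws);  /* skip the word itself */
    return line;                /* points at the separator, or at '\0' */
}
```

Called with a line obtained from fgets(), for example "  hello nice world!\n", it returns a pointer to " nice world!\n", which can then be copied into a separate buffer if needed.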

Some examples. Read the comments in the program to understand the effect. This will assume that words are delimited by whitespace characters (as defined by isspace()). Depending on your definition of "word", the solution may differ.
#include <stdio.h>
int main() {
char rest[1000];
// Remove first word and consume all space (ASCII = 32) characters
// after the first word
// This will work well even when the line contains only 1 word.
// rest[] contains only characters from the same line as the first word.
scanf("%*s%*[ ]");
fgets(rest, sizeof(rest), stdin);
printf("%s", rest);
// Remove first word and remove all whitespace characters as
// defined by isspace()
// The content of rest will be read from next line if the current line
// only has one word.
scanf("%*s ");
fgets(rest, sizeof(rest), stdin);
printf("%s", rest);
// Remove first word and leave spaces after the word intact.
scanf("%*s");
fgets(rest, sizeof(rest), stdin);
printf("%s", rest);
return 0;
}

Related

How to get each string within a buffer fetched with "getline" from a file in C

I'm trying to read every string separated with commas, dots or whitespaces from every line of a text from a file (I'm just receiving alphanumeric characters with scanf for simplicity). I'm using the getline function from <stdio.h> library and it reads the line just fine. But when I try to "iterate" over the buffer that was fetched with it, it always returns the first string read from the file. Let's suppose I have a file called "entry.txt" with the following content:
test1234 test hello
another test2
And my "main.c" contains the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_WORD 500
int main()
{
FILE *fp;
int currentLine = 1;
size_t characters, maxLine = MAX_WORD * 500;
/* Buffer can keep up to 500 words of 500 characters each */
char *word = (char *)malloc(MAX_WORD * sizeof(char)), *buffer = (char *)malloc((int)maxLine * sizeof(char));
fp = fopen("entry.txt", "r");
if (fp == NULL) {
return 1;
}
for (currentLine = 1; (characters = getline(&buffer, &maxLine, fp)) != -1; currentLine++)
{
/* This line gets "test1234" onto "word" variable, as expected */
sscanf(buffer, "%[a-zA-Z_0-9]", word);
printf("%s", word); // As expected
/* This line should get "test" string, but again it obtains "test1234" from the buffer */
sscanf(buffer, "%[a-zA-Z_0-9]", word);
printf("%s", word); // Not intended...
// Do some stuff with the "word" and "currentLine" variables...
}
return 0;
}
What happens is that I'm trying to get every alphanumeric string (namely word from now on) in sequence from the buffer, when the sscanf function just gives me the first occurrence of a word within the specified buffer string. Also, every line on the entry file can contain an unknown amount of words separated by either whitespaces, commas, dots, special characters, etc.
I'm obtaining every line from the file separately with "getline" because I need to get every word from every line and store it in other place with the "currentLine" variable, so I'll know from which line a given word would've come. Any ideas of how to do that?
fscanf has an input stream argument. A stream can change its state, so that the second call to fscanf reads a different thing. For example:
fscanf(stdin, "%s", str1); // str1 contains some string; stdin advances
fscanf(stdin, "%s", str2); // str2 contains some other string
scanf does not have a stream argument, but it has a global stream to work with, so it works exactly like fscanf(stdin, ...).
sscanf does not have a stream argument, nor is there any global state to keep track of what was read. There is an input string. You scan it, some characters get converted, and... nothing else changes. The string remains the same string (how could it possibly be otherwise?) and no information about how far the scan has advanced is stored anywhere.
sscanf(buffer, "%s", str1); // str1 contains some string; nothing else changes
sscanf(buffer, "%s", str2); // str2 contains the same string
So what does a poor programmer do?
Well, I lied. No information about how far the scan has advanced is stored anywhere only if you don't request it.
int nchars;
sscanf(buffer, "%s%n", str1, &nchars); // str1 contains some string;
// nchars contains number of characters consumed
sscanf(buffer+nchars, "%s", str2); // str2 contains some other string
Error handling and %s field widths omitted for brevity. You should never omit them in real code.
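With the checks and field widths put back in, the %n idea can be wrapped in a small helper that walks through a buffer word by word. This is only a sketch under assumptions: the name next_word and the 64-byte word limit are made up for illustration, and the a-z style ranges inside the scanset are the same ones used in the question (widely supported, but implementation-defined in standard C).

```c
#include <stdio.h>

#define WORD_MAX 64  /* assumed limit for this sketch */

/* Hypothetical helper: copies the next alphanumeric word from *cursor into
   word and advances *cursor past it. Returns 1 on success, 0 when no word
   is left in the string. */
int next_word(const char **cursor, char word[WORD_MAX])
{
    int consumed = 0;

    /* Skip separators (spaces, commas, dots, ...). %n stores how many
       characters were consumed; if nothing matches, consumed stays 0. */
    sscanf(*cursor, "%*[^a-zA-Z_0-9]%n", &consumed);
    *cursor += consumed;

    /* Read one word, at most WORD_MAX - 1 characters. */
    consumed = 0;
    if (sscanf(*cursor, "%63[a-zA-Z_0-9]%n", word, &consumed) != 1)
        return 0;
    *cursor += consumed;
    return 1;
}
```

Inside the getline loop, one would set a const char *cursor = buffer; and call next_word(&cursor, word) repeatedly until it returns 0, pairing each word with currentLine.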

access an argument in a string pointer char*

I need to try and fix sentences from an input in C, so I tried separating tokens and making new strings, and then I wanted to access the first char of each string and make it a capital letter.
Now I am having trouble understanding how to access only one char of each new string, for example accessing only the 'e' in hello, which is the second char of the string in str1[0].
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void main()
{
char str1[601], * str2[601];
int i = 0, j = 0;
printf_s("*************** Welcome to the text cleaner ***************\n\n");
printf_s("Please enter text:\n");
gets_s(str1, sizeof(str1));
char* sentence=NULL,*next_sentence=NULL;
sentence = strtok_s(str1,".",&next_sentence);
while (sentence != NULL)
{
printf(" %s\n", sentence);
str2[i++] = sentence;
sentence = strtok_s(NULL, ".", &next_sentence);
}
str2[i++] = '\0';
printf_s("%s", str2[1]);
}
Here is my take on what you are trying to do. I'm showing the code and the results. I have simplified your effort since you are mixing printf and printf_s. You use the _s variant for buffer overflow control. That does not seem to be your concern while simply learning about arrays.
Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char str1[601]; // This is an array of chars.
// If storing a string, final elem is 0x0
char *str2[601]; // This is an array of pointers to char,
// one pointer per sentence.
int i = 0;
int j = 0;
// I removed your _s variants of standard libraries. Let's keep
// things simple.
printf("*************** Welcome to the text cleaner ***************\n\n");
printf("Please enter text:\n");
// ditto for gets to fgets
//
// Excerpt from the manpage
//
// char *fgets(char *s, int size, FILE *stream);
//
// fgets() reads in at most one less than size characters from
// stream and stores them into the buffer pointed to by s.
// Reading stops after an EOF or a newline. If a newline is read,
// it is stored into the buffer. A terminating null byte ('\0')
// is stored after the last character in the buffer.
//
// fgets() returns s on success, and NULL on error or when end of
// file occurs while no characters have been read.
//str1 = fgets(str1, sizeof(str1), stdin);
// I would do a null check here.
if (NULL == fgets(str1, sizeof(str1), stdin)) {
return 1; // nothing was read; exit with an error status
}
// Notice on the bracket print of your text, the closing >
// is shown on the next line. This is because it's capturing the
// newline/carriage return character.
printf("You entered %zu chars and the text was:\n<%s>\n", strlen(str1), str1);
// These are for your strtok operation
// I would call them tokens or words, but
// whatever.
char *sentence=NULL;
char *next_sentence=NULL;
// wants to parse a string
// Excerpt from manpage
//
// char *strtok(char *str, const char *delim);
// Ahh, now I see why you named it sentence. You
// are looking for periods to separate sentences.
printf("Lets use strtok\n");
sentence = strtok(str1, ".");
while (sentence != NULL) {
printf("A sentence is:\n %s\n", sentence);
str2[i++] = sentence;
sentence = strtok(NULL, ".");
}
// So now, your individual sentences are stored
// in the array str2.
// str2[0] is the first sentence.
// str2[1] is the next sentence.
//
// To access the characters, specify a sentence and
// then specify the character.
//
// You can do the math, but do a man ascii, look at
// the difference between lowercase a and uppercase A in terms
// of ASCII. If it's not capitalized already, simply
// apply that offset, or error out if not in the set a-z.
//
// Here I will just make the first letter of the second
// sentence to be J.
str2[1][0] = 'J';
// Note, since you are delimiting on '.', the second
// sentence starts with the 'space' that followed the period,
// so this has the effect of replacing 'space' with 'J'.
printf("Sentence two is: \n%s\n", str2[1]);
}
Here is the code in action.
*************** Welcome to the text cleaner ***************
Please enter text:
John was here. and here.
You entered 25 chars and the text was:
<John was here. and here.
>
Lets use strtok
A sentence is:
John was here
A sentence is:
and here
A sentence is:
Sentence two is:
Jand here
I hope that helps. TLDR use str2[x][y] to access a string x at character y.
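To actually capitalize the first letter rather than overwrite it with a fixed character, the standard toupper() from <ctype.h> avoids doing the ASCII arithmetic by hand. A minimal sketch follows; the helper name capitalize_first_letter is hypothetical, and it assumes that the whitespace left in front of a sentence after splitting on '.' should be skipped rather than replaced.

```c
#include <ctype.h>

/* Hypothetical helper: upper-cases the first non-whitespace character of a
   sentence in place. Characters that are not lowercase letters are left
   unchanged by toupper(). */
void capitalize_first_letter(char *sentence)
{
    while (*sentence != '\0' && isspace((unsigned char)*sentence))
        ++sentence;                       /* skip leading whitespace */
    if (*sentence != '\0')
        *sentence = (char)toupper((unsigned char)*sentence);
}
```

In the program above this would be called as capitalize_first_letter(str2[k]) for every k below i, so " and here" becomes " And here".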

create string array with leading spaces

Is there a way to initialize an empty string array and later ask the user for input, which is saved into the array while keeping the empty leading spaces if the input is shorter?
I am planning on using a longer string array with additional spaces so that I can do in-place character replacements.
For example:
char foo[25];
scanf("%s", foo);
foo = "this is a test"
print foo;
The result would be like:
"this is a test "
Your question is inconsistent, you ask about leading whitespace but your example shows trailing whitespace. If you mean trailing whitespace, you could do it this way:
#include <stdio.h>
#include <string.h>
#define BUFFER_SIZE 25
int main() {
char string[BUFFER_SIZE];
memset(string, ' ', BUFFER_SIZE - 1); // initialize with spaces
string[BUFFER_SIZE - 1] = '\0'; // terminate properly
if (fgets(string, BUFFER_SIZE, stdin) != NULL) {
size_t length = strlen(string);
if (length > 0 && string[length - 1] == '\n') {
string[length - 1] = ' '; // replace the newline \n
}
if (length < BUFFER_SIZE - 1) {
string[length] = ' '; // replace extra '\0' as needed
}
printf("'%s'\n", string); // extra single quotes to visualize length
}
return 0;
}
USAGE
> ./a.out
this is a test
'this is a test '
>
The single quotes were only added so you can actually see that the spaces were preserved. The approach of @BLUEPIXY makes perfect sense, except that it appends new whitespace to the input, whereas you specifically asked about preserving existing whitespace.
If instead you want to preserve leading whitespace, that can probably be done as well.
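One possible reading of "leading whitespace" is that the input should sit at the end of the buffer with spaces in front of it. Under that assumption, a field width in snprintf() does the padding. This is only a sketch; the helper name pad_left is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: writes `input` into `out` right-aligned, so the
   unused part of the buffer becomes leading spaces. `size` is the full
   size of `out`, including the terminating '\0'. Input longer than
   size - 1 characters is truncated by snprintf(). */
void pad_left(char *out, size_t size, const char *input)
{
    if (size == 0)
        return;
    /* "%*s" takes the minimum field width from the argument list and
       pads with spaces on the left. */
    snprintf(out, size, "%*s", (int)(size - 1), input);
}
```

With an 11-byte buffer and the input "test", the buffer ends up holding six spaces followed by "test".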

Split string using more than one char as delimiter

Let's say I have a string "file1.h: file2.c,file3.cpp" and I want to split it into "file1.h" and "file2.c,file3.cpp", that is, using ": " (a colon followed by whitespace) as the delimiter. How can I do it?
I tried this code without success:
int main(int argc, char *argv[]) {
char str[] = "file1.h: file2.c,file3.cpp";
char name[100];
char depends[100];
sscanf(str, "%s: %s", name, depends);
printf("Name: %s\n", name);
printf("Deps: %s\n", depends);
}
And the output I get is:
Name: file1.h:
Deps:
What you seem to need is strtok(). Read about it in the man page. Related quote from C11, chapter §7.24.5.8
A sequence of calls to the strtok function breaks the string pointed to by s1 into a
sequence of tokens, each of which is delimited by a character from the string pointed to
by s2. [...]
In your case, you can use a delimiter string like
char * delim = ": "; // a colon and a space; strtok() treats each character as a delimiter on its own
to get the job done.
Some additional things to mention:
the input needs to be modifiable for strtok() (which it is, in your case),
and strtok() actually destroys the input fed to it, so keep a copy around if you need the original later.
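As a rough illustration of the strtok() approach described above, here is a minimal sketch. The helper name split_rule is hypothetical. Because strtok() treats every character of the delimiter string as a separator of its own, a dependency list that itself contains spaces or colons would be cut at the first one.

```c
#include <string.h>

/* Hypothetical helper: splits "name: deps" in place using strtok().
   On return *name and *deps point into `line`, which has been modified.
   Returns 1 if both parts were found, 0 otherwise. */
int split_rule(char *line, char **name, char **deps)
{
    *name = strtok(line, ": ");
    *deps = (*name != NULL) ? strtok(NULL, ": ") : NULL;
    return *name != NULL && *deps != NULL;
}
```

For a modifiable array holding "file1.h: file2.c,file3.cpp", this yields "file1.h" and "file2.c,file3.cpp".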
This is an alternative way to do it, it uses strchr(), but this assumes that the input string always has the format
name: item1,item2,item3,...,itemN
Here is the program
#include <ctype.h>
#include <string.h>
#include <stdio.h>
int
main(void)
{
const char *const string = "file1.h: file2.c,file3.cpp ";
const char *head;
const char *tail;
const char *next;
// Start at the beginning of the string; `tail' will
// point to the `:' below
head = string;
// If there is no `:' this string does not follow
// the assumption that the format is
//
// name: item1,item2,item3,...,itemN
//
if ((tail = strchr(head, ':')) == NULL)
return -1;
// Save a pointer to the next character after the `:'
next = tail + 1;
// Strip leading spaces
while (isspace((unsigned char) *head) != 0)
++head;
// Strip trailing spaces
while (tail > head && isspace((unsigned char) *(tail - 1)) != 0)
--tail;
fputc('*', stdout);
// Simply print the characters between `head' and `tail'
// you could as well copy them, or whatever
fwrite(head, 1, tail - head, stdout);
fputc('*', stdout);
fputc('\n', stdout);
head = next;
while (head != NULL) {
tail = strchr(head, ',');
if (tail == NULL) {
// This means there are no more `,'
// so we now try to point to the end
// of the string
tail = strchr(head, '\0');
}
// This is basically the same algorithm
// just with a different delimiter which
// will presumably be the same from
// here
next = tail + 1;
// Strip leading spaces
while (isspace((unsigned char) *head) != 0)
++head;
// Strip trailing spaces
while (tail > head && isspace((unsigned char) *(tail - 1)) != 0)
--tail;
// Here is where you can extract the string
// I print it surrounded by `*' to show that
// it's stripping white spaces
fputc('*', stdout);
fwrite(head, 1, tail - head, stdout);
fputc('*', stdout);
fputc('\n', stdout);
// Try to point to the next one
// or make head `NULL' if this is
// the end of the string
//
// Note that the original `tail' pointer
// that was pointing to the next `,' or
// the end of the string, has changed but
// we have saved its original value
// plus one, we now inspect what was
// there
if (*(next - 1) == '\0') {
head = NULL;
} else {
head = next;
}
}
fputc('\n', stderr);
return 0;
}
It's excessively commented to guide the reader.
As Sourav says, you really need to use strtok for tokenizing strings. But this doesn't explain why your existing code is not working.
The answer lies in the specification for sscanf and how it handles a '%s' in the format string.
From the man page:
s Matches a sequence of non-white-space characters;
So, the presence of a colon-space in your format string is largely irrelevant for matching the first '%s'. When sscanf sees the first %s it simply consumes the input string until a whitespace character is encountered, giving you your value for name of "file1.h:" (note the inclusion of the colon).
Next it tries to deal with the colon-space sequence in your format string.
Again, from the man page
The format string consists of a sequence of directives which describe how to process the sequence of input characters.
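The colon in the format string is an ordinary character, so it must match the next input character exactly. But %s has already consumed the colon, so the next input character is the space; the match fails, sscanf stops there, and depends is never assigned.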
If, instead, your format string was simply "%s%s", then sscanf will get you almost exactly what you want.
#include <stdio.h>
int main(int argc, char *argv[]) {
char str[] = "file1.h: file2.c,file3.cpp";
char name[100];
char depends[100];
sscanf(str, "%s%s", name, depends);
printf("str: '%s'\n", str);
printf("Name: %s\n", name);
printf("Deps: %s\n", depends);
return 0;
}
Which gives this output:
str: 'file1.h: file2.c,file3.cpp'
Name: file1.h:
Deps: file2.c,file3.cpp
At this point, you can simply check that sscanf gave a return value of 2 (i.e. it found two values), and that the last character of name is a colon. Then just truncate name and you have your answer.
Of course, by this logic, you aren't going to be able to use sscanf to parse your depends variable into multiple strings ... which is why others are recommending using strtok, strpbrk etc because you are both parsing and tokenizing your input.
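The check-and-truncate step described above could look roughly like this. It is a sketch only: the helper name parse_rule is hypothetical, both buffers are assumed to hold 100 bytes as in the question, and field widths are added so that %s cannot overflow them.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parses "name: deps" with sscanf, then strips the
   trailing ':' from name. Returns 1 on success, 0 if the input does not
   have the expected shape. */
int parse_rule(const char *str, char name[100], char depends[100])
{
    if (sscanf(str, "%99s%99s", name, depends) != 2)
        return 0;                 /* fewer than two fields were found */

    size_t len = strlen(name);
    if (len == 0 || name[len - 1] != ':')
        return 0;                 /* first field does not end in ':' */

    name[len - 1] = '\0';         /* truncate the colon */
    return 1;
}
```

For "file1.h: file2.c,file3.cpp" this leaves "file1.h" in name and "file2.c,file3.cpp" in depends.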
Well, I am pretty late. I do not have much knowledge of the built-in functions in C, so I started writing a solution for you. I don't think you need it now, but here it is anyway; modify it as per your need. If you find any bug, feel free to tell me.

Reading from text file with '/n'

Okay. So I'm reading and storing text from a text file into a char array, this is working as intended. However, the textfile contains numerous newline escape sequences. The problem then is that when I print out the string array with the stored text, it ignores these newline sequences and simply prints them out as "\n".
Here is my code:
char *strings[100];
void readAndStore(FILE *file) {
int count = 0;
char buffer[250];
while(!feof(file)) {
char *readLine = fgets(buffer, sizeof(buffer), file);
if(readLine) {
strings[count] = malloc(sizeof(buffer));
strcpy(strings[count], buffer);
++count;
}
}
}
int main() {
FILE *file1 = fopen("txts", "r");
readAndStore(&*file1);
printf("%s\n", strings[0]);
printf("%s\n", strings[1]);
return 0;
}
And the output becomes something like this:
Lots of text here \n More text that should be on a new line, but isn't \n And so \n on and
and on \n
Is there any way to make it read the "\n" as actual newline escape sequences or do I just need to remove them from my text file and figure out some other way to space out my text?
No. The fact is that \n is a special escape sequence for your compiler, which turns it into a single character, namely "LF" (line feed), with ASCII code 0x0A. So it's the compiler that gives a special meaning to that sequence.
Instead, when read from a file, \n is read as two distinct characters, ASCII codes 0x5c and 0x6e.
You will need to write a routine which replaces all occurrences of \\n (the string composed of the characters \ and n; the double escape is necessary to tell the compiler not to interpret it as an escape sequence) with \n (the single escape sequence, meaning new line).
If you only intend to replace '\n' by the actual character, use a custom replacement function like
#include <string.h>
void replacenewlines(char * str)
{
while(*str)
{
if (*str == '\\' && *(str+1) == 'n') //found \n in the string. Warning, \\n will be replaced also.
{
*str = '\n'; //this is one character to replace two characters
memmove(str+1, str+2, strlen(str+2)+1); //So we need to move the rest of the string (after the 'n') leftwards
//Note memmove instead of memcpy/strcpy. Note that '\0' will be moved as well
}
++str;
}
}
This code is not tested, but the general idea must be clear. It is not the only way to replace the string, you may use your own or find some other solution.
If you intend to replace all special characters, it might be better to lookup some existing implementation or sanitize the string and pass it as the format parameter to printf. As the very minimum you will need to duplicate all '%' signs in the string.
Do not pass the string as the first argument of printf as is, that would cause all kinds of funny stuff.
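To tie the two points together, here is a minimal sketch of a single-pass replacement plus a safe way to print the result. The helper names unescape_newlines and print_text are hypothetical; only the two-character sequence \ n is handled, and every other character is copied unchanged.

```c
#include <stdio.h>

/* Hypothetical helper: rewrites each two-character sequence \ n as a real
   newline, in place, in one pass. The result is never longer than the
   input, so no extra buffer is needed. */
void unescape_newlines(char *str)
{
    char *out = str;
    for (const char *in = str; *in != '\0'; ++in) {
        if (in[0] == '\\' && in[1] == 'n') {
            *out++ = '\n';
            ++in;               /* skip the 'n' as well */
        } else {
            *out++ = *in;
        }
    }
    *out = '\0';
}

/* Prints text verbatim: the text is passed as an argument, never as the
   format string, so any '%' in it is harmless. */
void print_text(const char *text)
{
    printf("%s", text);         /* fputs(text, stdout) works as well */
}
```

In readAndStore above, unescape_newlines(strings[count]) could be called right after the strcpy, and the stored lines printed with print_text.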
