Okay. So I'm reading text from a text file and storing it in a char array, and this is working as intended. However, the text file contains numerous newline escape sequences. The problem is that when I print out the string array with the stored text, it ignores these newline sequences and simply prints them out as "\n".
Here is my code:
char *strings[100];
void readAndStore(FILE *file) {
int count = 0;
char buffer[250];
while(!feof(file)) {
char *readLine = fgets(buffer, sizeof(buffer), file);
if(readLine) {
strings[count] = malloc(sizeof(buffer));
strcpy(strings[count], buffer);
++count;
}
}
}
int main() {
FILE *file1 = fopen("txts", "r");
readAndStore(&*file1);
printf("%s\n", strings[0]);
printf("%s\n", strings[1]);
return 0;
}
And the output becomes something like this:
Lots of text here \n More text that should be on a new line, but isn't \n And so \n on and
and on \n
Is there any way to make it read the "\n" as actual newline escape sequences or do I just need to remove them from my text file and figure out some other way to space out my text?
No. The fact is that \n is an escape sequence that is special to your compiler, which turns it into a single character, namely LF (line feed), with ASCII code 0x0A. So it is the compiler that gives a special meaning to that sequence.
Instead, when reading from a file, \n is read as two distinct characters, with ASCII codes 0x5C and 0x6E.
You will need to write a routine which replaces all occurrences of \\n (the string composed of the characters \ and n; the double escape is necessary to tell the compiler not to interpret it as an escape sequence) with \n (the single escape sequence, meaning new line).
If you only intend to replace '\n' by the actual character, use a custom replacement function like
void replacenewlines(char * str)
{
while(*str)
{
if (*str == '\\' && *(str+1) == 'n') //found \n in the string. Warning: the \n inside a literal \\n will be replaced as well.
{
*str = '\n'; //this is one character to replace two characters
memmove(str+1, str+2, strlen(str+2)+1); //So we need to move the rest of the string (everything after the 'n') leftwards
//Note memmove instead of memcpy/strcpy, because the regions overlap. The +1 moves the '\0' as well
}
++str;
}
}
This code is not tested, but the general idea should be clear. It is not the only way to do the replacement; you may write your own or find some other solution.
If you intend to replace all special characters, it might be better to look up an existing implementation, or to sanitize the string and pass it as the format parameter to printf. At the very minimum you will need to double every '%' sign in the string.
Do not pass the string as the first argument of printf as is: any '%' in the text would be treated as a conversion specification, and with no matching arguments that is undefined behavior.
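For the newline-only case discussed above, the replacement can also be done in a single pass with a read pointer and a write pointer, which avoids calling memmove for every match. This is only a sketch: the name unescape_newlines is made up for illustration, and it assumes the buffer is a writable, null-terminated string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: collapses each two-character sequence
 * backslash + 'n' into one newline character, in place.
 * Returns the new length of the string. */
size_t unescape_newlines(char *str)
{
    assert(str != NULL);
    char *src = str;   /* next character to read */
    char *dst = str;   /* next position to write */

    while (*src != '\0') {
        if (src[0] == '\\' && src[1] == 'n') {
            *dst++ = '\n';   /* two input characters become one */
            src += 2;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return (size_t)(dst - str);
}
```

In the question's program this could be called on strings[count] right after the strcpy. The result can then be printed with fputs(strings[0], stdout) or printf("%s", strings[0]), so the text is never used as a format string.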
In a program I am writing, I need to be able to tokenize a input text file into words, do some encoding, and then write to an output file. Problem is, I need to preserve the new lines.
The approach I was trying is to have strtok preserve the newlines at the end of a word, however, strtok will only include one newline character before moving on. If there is a following newline, it becomes its own token. How can I change this behavior so that tokens include all newlines before moving onto the next word?
int changeNewLine(char* p) {
p = p + (strlen(p)-1);
int newlines = 0;
while(*p == '\n') {
*p = '\0';
newlines++;
p--;
}
return newlines;
}
void main(int argc, char *argv[]) {
FILE *inputfile = fopen(argv[1],"rw");
FILE *outputfile = fopen("output.txt","wb");
char buffer[128];
char *token;
char words[MAX_CODE][WORDLEN];
int i = 0;
unsigned short newlines[MAX_CODE];
while(fgets(buffer, 128, inputfile)){
token = strtok(buffer," ");
while(token != NULL) {
newlines[i] = changeNewLine(token);
strcpy(words[i], token);
i++;
token = strtok(NULL," ");
}
}
...
}
Above is a fragment of my code. The idea is to count the number of newlines in a token, and then write them back out later.
strtok already does include newlines in the token, since you are using a delimiter string that does not contain the newline. But in your program as it now is, you will never have more than one in a token because fgets reads (at most) one line at a time. That's its whole purpose. It will never give you a string containing two or more newlines, nor containing a newline anywhere other than the last character.
Your general alternatives are
(1) to look ahead at subsequent lines in order to spot additional newlines, or
(2) to retrospectively update the previous line's newline count when you encounter a line starting with a newline (and, therefore, containing nothing else).
Alternative (1) could include employing an altogether different approach to reading input, too, such as a block read with fread() or a character-at-a-time read with fgetc().
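As a rough illustration of the character-at-a-time variant mentioned above, the sketch below reads with fgetc() and credits every newline to the word that precedes it. The function name read_words and the limits MAX_WORDS and WORD_LEN are assumptions made for this example, not taken from the question's code; newlines that appear before the first word are simply dropped, and overlong words are truncated.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_WORDS 64   /* hypothetical limits for this sketch */
#define WORD_LEN  32

/* Reads words separated by spaces or newlines from `in`. For each word,
 * newlines[i] counts the newline characters seen after word i and before
 * the next word starts. Returns the number of words stored. */
int read_words(FILE *in, char words[][WORD_LEN],
               unsigned short newlines[], int max_words)
{
    assert(in != NULL && max_words > 0);
    int count = 0;     /* words stored so far */
    int len = 0;       /* length of the word being built */
    int in_word = 0;   /* nonzero while inside a word */
    int c;

    while ((c = fgetc(in)) != EOF) {
        if (c == ' ' || c == '\n') {
            if (in_word) {                 /* close the current word */
                words[count][len] = '\0';
                newlines[count] = 0;
                count++;
                in_word = 0;
            }
            if (c == '\n' && count > 0)
                newlines[count - 1]++;     /* credit the preceding word */
        } else {
            if (!in_word) {
                if (count == max_words)
                    break;                 /* no room for another word */
                in_word = 1;
                len = 0;
            }
            if (len < WORD_LEN - 1)        /* truncate overlong words */
                words[count][len++] = (char)c;
        }
    }
    if (in_word) {                         /* input ended mid-word */
        words[count][len] = '\0';
        newlines[count] = 0;
        count++;
    }
    return count;
}
```

When writing the output, each encoded word would be followed by newlines[i] newline characters, or by a single space when that count is zero.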
I am trying to save one character and 2 strings into variables.
I use sscanf to read strings with the following form :
N "OldName" "NewName"
What I want : char character = 'N' , char* old_name = "OldName" , char* new_name = "NewName" .
This is how I am trying to do it :
sscanf(mystring,"%c %s %s",&character,old_name,new_name);
printf("%c %s %s",character,old_name,new_name);
The problem is, my program stops working without any output.
(I want to ignore the quotation marks too and save only its content)
When you do
char* new_name = "NewName";
you make the pointer new_name point to the read-only string array containing the constant string literal. The array contains exactly 8 characters (the letters of the string plus the terminator).
First of all, using that pointer as a destination for scanf will cause scanf to write to the read-only array, which leads to undefined behavior. And if you give a string longer than 7 characters, scanf will also attempt to write out of bounds, again leading to undefined behavior.
The simple solution is to use actual arrays, and not pointers, and to also tell scanf to not read more than can fit in the arrays. Like this:
char old_name[64]; // Space for 63 characters plus string terminator
char new_name[64];
sscanf(mystring,"%c %63s %63s",&character,old_name,new_name);
To skip the quotation marks you have a couple of choices: Either use pointers and pointer arithmetic to skip the leading quote, and then set the string terminator at the place of the last quote to "remove" it. Another solution is to move the string to overwrite the leading quote, and then do as the previous solution to remove the last quote.
Or you could rely on the limited pattern-matching capabilities of scanf (and family):
sscanf(mystring,"%c \"%63s\" \"%63s\"",&character,old_name,new_name);
Note that the above sscanf call will work iff the string actually includes the quotes.
Second note: As said in the comment by Cool Guy, the above won't actually work since scanf is greedy. It will read until the end of the file/string or a white-space, so it won't actually stop reading at the closing double quote. The only working solution using scanf and family is the one below.
Also note that scanf and family, when reading a string using "%s", stop reading at whitespace, so if the string is "New Name" then it won't work either. If this is the case, then you either need to manually parse the string, or use the odd "%[" format, something like
sscanf(mystring,"%c \"%63[^\"]\" \"%63[^\"]\"",&character,old_name,new_name);
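The "%[" call above can be wrapped in a small function that also checks the result. This is a sketch under a few assumptions: the name parse_rename is invented for the example, both destination buffers hold at least 64 bytes, and a trailing %n is added so the caller can tell whether the final closing quote was really matched (sscanf's return value alone does not show that).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parses a line of the form  N "Old Name" "New Name".
 * old_name and new_name must each have room for at least 64 bytes.
 * Returns 1 on success, 0 if the line does not match the pattern. */
int parse_rename(const char *line, char *letter,
                 char *old_name, char *new_name)
{
    assert(line && letter && old_name && new_name);
    int end = 0;   /* stays 0 unless the last closing quote is matched */
    int fields = sscanf(line, " %c \"%63[^\"]\" \"%63[^\"]\"%n",
                        letter, old_name, new_name, &end);
    return fields == 3 && end > 0;
}
```

One limitation to keep in mind: "%[" must match at least one character, so an empty pair of quotes ("") makes the call fail rather than produce an empty string.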
You must allocate space for your strings, e.g:
char* old_name = malloc(128);
char* new_name = malloc(128);
Or using arrays
char old_name[128] = {0};
char new_name[128] = {0};
In case of malloc you also have to free the space before the end of your program.
free(old_name);
free(new_name);
Updated:...
The other answers provide good methods of creating memory as well as how to read the example input into buffers. There are two additional items that may help:
1) You expressed that you want to ignore the quotation marks too.
2) Reading first & last names when separated with space. (example input is not)
As @Joachim points out, because scanf and family stop scanning on a space with the %s format specifier, a name that includes a space such as "firstname lastname" will not be read in completely. There are several ways to address this. Here are two:
Method 1: tokenizing your input.
Tokenizing a string breaks it into sections separated by delimiters. Your string input examples for instance are separated by at least 3 usable delimiters: space: " ", double quote: ", and newline: \n characters. fgets() and strtok() can be used to read in the desired content while at the same time strip off any undesired characters. If done correctly, this method can preserve the content (even spaces) while removing delimiters such as ". A very simple example of the concept below includes the following steps:
1) reading stdin into a line buffer with fgets(...)
2) parse the input using strtok(...).
Note: This is an illustrative, bare-bones implementation, sequentially coded to match your input examples (with spaces) and includes none of the error checking/handling that would normally be included.
int main(void)
{
char line[128];
char delim[] = {"\n\""};//parse using only newline and double quote
char *tok;
char letter;
char old_name[64]; // Space for 63 characters plus string terminator
char new_name[64];
fgets(line, 128, stdin);
tok = strtok(line, delim); //consume 1st " and get token 1
if(tok) letter = tok[0]; //assign letter
tok = strtok(NULL, delim); //consume 2nd " and get token 2
if(tok) strcpy(old_name, tok); //copy tok to old name
tok = strtok(NULL, delim); //consume 3rd " throw away token 3
tok = strtok(NULL, delim); //consume 4th " and get token 4
if(tok) strcpy(new_name, tok); //copy tok to new name
printf("%c %s %s\n", letter, old_name, new_name);
return 0;
}
Note: as written, this example (like most uses of strtok(...)) requires very narrowly defined input. In this case the input must be no longer than 127 characters, consisting of a single character followed by space(s), then a double-quoted string followed by more space(s), then another double-quoted string, as defined by your example:
N "OldName" "NewName"
The following input will also work in the above example:
N "old name" "new name"
N "old name" "new name"
Note also about this example, some consider strtok() broken, while others suggest avoiding its use. I suggest using it sparingly, and only in single threaded applications.
Method 2: walking the string.
A C string is just an array of char terminated with a null character ('\0'). By selectively copying some characters into another string, while bypassing the ones you do not want (such as the "), you can effectively strip unwanted characters from your input. Here is an example function that will do this:
char * strip_ch(char *str, char ch)
{
char *from, *to;
char *dup = strdup(str);//make a copy of input
if(dup)
{
from = to = dup;//set working pointers equal to pointer to input
for (from; *from != '\0'; from++)//walk through input string
{
*to = *from;//set destination pointer to original pointer
if (*to != ch) to++;//test - increment only if not char to strip
//otherwise, leave it so next char will replace
}
*to = '\0';//replace the NULL terminator
strcpy(str, dup);
free(dup);
}
return str;
}
Example use case:
int main(void)
{
char line[128] = {"start"};
while(strstr(line, "quit") == NULL)
{
printf("Enter string (\"quit\" to leave) and hit <ENTER>:");
fgets(line, 128, stdin);
strip_ch(line, '"');//strips in place; using line as both source and destination of sprintf is undefined behavior
printf("%s", line);
}
return 0;
}
I have a text file and I need to get it in a string but it has to show the "\n"
For example, this is hello.txt:
Hello,
World
And I need the string to return: "Hello,\nWorld\n"
Any idea how can I do this?
Perhaps you could read the file one character at a time.
Test each character that you read to see if it is a newline character ('\n').
If it is not a newline character, print the character that was read.
If it is a newline character, print "\n".
Good Luck!
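The character-at-a-time idea above can be sketched as a function that builds the escaped string in a buffer instead of printing it directly. The name escape_newlines and the choice to also double backslashes (so the result stays unambiguous) are assumptions of this sketch, not requirements from the question.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copies `in` into `out`, writing each newline as the
 * two characters '\' and 'n', and each backslash as two backslashes.
 * Returns the length written, or -1 if `out` (out_size bytes) is too small. */
long escape_newlines(FILE *in, char *out, size_t out_size)
{
    assert(in != NULL && out != NULL && out_size > 0);
    size_t len = 0;
    int c;

    while ((c = fgetc(in)) != EOF) {
        int escaped = (c == '\n' || c == '\\');
        size_t need = escaped ? 2 : 1;
        if (len + need >= out_size)   /* keep one byte for the terminator */
            return -1;
        if (escaped) {
            out[len++] = '\\';
            out[len++] = (c == '\n') ? 'n' : '\\';
        } else {
            out[len++] = (char)c;
        }
    }
    out[len] = '\0';
    return (long)len;
}
```

With a file opened by fopen("hello.txt", "r"), the buffer would then hold the characters Hello,\nWorld\n with each \n stored as a backslash followed by the letter n.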
Test each char of the string. Once the code uses "\n" to show '\n', it also needs to escape the backslash itself ('\\'), otherwise the output is ambiguous. To print a string like "Hello World" with its surrounding quotes, the code may also want to escape '\"' so that quotes inside the string can be told apart from the delimiters. If the string contains non-printable or non-ASCII characters, one option is to print an octal escape sequence like \377.
#include <ctype.h>
#include <string.h>
#include <stdio.h>
void EscapePrint(int ch) {
// Delete or adjust these 2 arrays per code's goals
// All simple-escape-sequence C11 6.4.4.4
static const char *escapev = "\a\b\t\n\v\f\r\"\'\?\\";
static const char *escapec = "abtnvfr\"\'\?\\";
char *p = strchr(escapev, ch);
if (p && *p) {
printf("\\%c", escapec[p - escapev]);
} else if (isprint(ch)) {
fputc(ch, stdout);
} else {
// Use octal as hex is problematic reading back
printf("\\%03o", ch);
}
}
More detail: Escape all special characters in printf()
I want to get a string as input by using scanf and if the string is just a space or blank I have to print error message.
This is what I've tried to do:
char string1[20]
scanf("%s",string1)
if(string1=='')
print error message
But that didn't work, actually I didn't expect it to work because string1 is an array of chars.
Any hint how to do it?
You should note that the scanf function will never scan a string with only blanks in it. Instead check the return value of the function, if it's (in your case) less than one it failed to read a string.
You may want to use fgets to read a line, remove the trailing newline, and then check if each character in the string is a space (with the isspace function).
Like this:
char string1[20];
if (fgets(string1, sizeof(string1), stdin) != NULL)
{
/* Remove the trailing newline left by the `fgets` function, if present.
 * strcspn returns the index of the first '\n', or the string length when
 * there is none, so this is safe for truncated or empty input.
 */
string1[strcspn(string1, "\n")] = '\0';
/* Now skip leading whitespace. `ptr` is declared outside the loop
 * because it is still needed after the loop ends. */
char *ptr = string1;
while (*ptr != '\0' && isspace((unsigned char)*ptr))
++ptr;
/* After the above loop, `*ptr` will either be the string terminator,
* in which case the string was all blanks, or else `ptr` will be
* pointing to the actual text
*/
if (*ptr == '\0')
{
/* Error, string was empty */
}
else
{
/* Success, `ptr` points to the input */
/* Note: The string may contain trailing whitespace */
}
}
scanf() does not always skip leading blanks.
Some format specifiers, like "%s", "%d" and "%f", do skip leading blanks (whitespace).
Other format specifiers, like "%c", "%[]" and "%n", do not skip leading whitespace.
Scan in line and look for spaces. (string1 may contain whitespace)
char string1[20];
// Scan in up to 19 non-LineFeed chars, then the next char (assumed \n)
int result = scanf("%19[^\n]%*c", string1);
if (result < 0) handle_IOError_or_EOF();
else if (result == 0) handle_nothing_entered();
else {
const char *p = string1;
while (isspace((unsigned char)*p)) p++; // cast avoids undefined behavior for negative char values
if (*p == '\0')
print error message
}
First, scanf will skip any leading whitespace if you put a space (or another whitespace character like '\n' or '\t') before the format specifier, like scanf(" %s", str). Note that str is passed without &, because an array already decays to a pointer to its first element.
Second, if(string1=='') does not even compile: '' is an empty character constant, and there is no "blank" char like that in C. Even with a valid constant such as '\0', the comparison would be between the address of string1 and a character value, and it would never be true because an existing variable's address is non-NULL. You need to read the line and then check whether it is empty or contains only spaces.
While I could use strings, I would like to understand why this small example I'm working on behaves in this way, and how can I fix it ?
int ReadInput() {
char buffer [5];
printf("Number: ");
fgets(buffer,5,stdin);
return atoi(buffer);
}
void RunClient() {
int number;
int i = 5;
while (i != 0) {
number = ReadInput();
printf("Number is: %d\n",number);
i--;
}
}
This should, in theory or at least in my head, let me read 5 numbers from input (albeit overwriting them).
However this is not the case, it reads 0, no matter what.
I understand printf puts a \0 null terminator ... but I still think I should be able to either read the first number, not just have it by default 0. And I don't understand why the rest of the numbers are OK (not all 0).
CLARIFICATION: I can only read 4/5 numbers, first is always 0.
EDIT:
I've tested and it seems that this was causing the problem:
main.cpp
scanf("%s",&cmd);
if (strcmp(cmd, "client") == 0 || strcmp(cmd, "Client") == 0)
RunClient();
somehow.
EDIT:
Here is the code if someone wishes to compile. I still don't know how to fix
http://pastebin.com/8t8j63vj
FINAL EDIT:
Could not get rid of the error. Decided to simply add this to ReadInput:
int ReadInput(BOOL check) {
...
if (check)
printf ("Number: ");
...
In RunClient():
void RunClient() {
...
ReadInput(FALSE); // a pseudo - buffer flush. Not really but I ignore
while (...) { // line with garbage data
number = ReadInput(TRUE);
...
}
And call it a day.
fgets reads the input as well as the newline character. So when you input a number, it's like: 123\n.
atoi doesn't report errors when the conversion fails.
Remove the newline character from the buffer:
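char buffer[5];
fgets(buffer, sizeof buffer, stdin);
size_t length = strlen(buffer);
if (length > 0 && buffer[length - 1] == '\n') // only strip a newline that is really there
buffer[length - 1] = '\0';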
Then use strtol to convert the string into number which provides better error detection when the conversion fails.
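A possible shape for the strtol-based conversion is sketched below. The name parse_int and its return convention are assumptions made for this example. Unlike atoi, it can tell "0" apart from a line with no digits at all, such as the lone "\n" that an earlier scanf("%s", ...) leaves in the input for the first fgets call to pick up.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts `text` to an int.
 * Returns 1 and stores the value in *out on success. Returns 0 if the text
 * contains no digits, has trailing characters other than whitespace, or
 * holds a value that does not fit in an int. */
int parse_int(const char *text, int *out)
{
    assert(text != NULL && out != NULL);
    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);

    if (end == text)
        return 0;                       /* no digits at all, e.g. "\n" */
    while (isspace((unsigned char)*end))
        end++;                          /* allow the newline fgets keeps */
    if (*end != '\0')
        return 0;                       /* trailing junk such as "12ab" */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                       /* out of range for int */
    *out = (int)value;
    return 1;
}
```

ReadInput could then loop, calling fgets again whenever parse_int returns 0, instead of returning whatever atoi produced.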
char * fgets ( char * str, int num, FILE * stream );
Get string from stream.
Reads characters from stream and stores them as a C string into str until (num-1) characters have been read or either a newline or the end-of-file is reached, whichever happens first.
A newline character makes fgets stop reading, but it is considered a valid character by the function and included in the string copied to str. (This means that you carry \n)
A terminating null character is automatically appended after the characters copied to str.
Notice that fgets is quite different from gets: not only fgets accepts a stream argument, but also allows to specify the maximum size of str and includes in the string any ending newline character.
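PS: Try a larger buffer. With char buffer[5], fgets stores at most 4 characters per call, so a number with more than 3 digits no longer fits together with its newline.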