converting new line character - c

I am trying to convert the newline character \n to the DOS-style line ending \r\n, without using libc.
Here is my attempt. What am I doing wrong?
for(i = 0; str[i]!='\0'; ++i)
{
if ('\r' == str[i] && '\n'==str[i+1])
++count;
}
strPtr = malloc(i + 1 + count);
for(i = j = 0; str[i]!='\0'; ++i)
{
if ('\r' == str[i])
strPtr[j++] = "";
}
strPtr[j] = 0;
output should now be
"Hi\r\n, How are you \r\n, are you okay\r\n"

There are many problems here. Firstly, you are modifying the string in place using the original buffer. However, the original buffer does not have enough space to store the additional \r characters. You'll need to allocate a larger buffer.
Secondly, a UNIX-style newline is not stored as two separate \ and n characters. It is a single ASCII character (line feed) with a value of 0xA, which can be represented using the escape sequence \n. So to check whether your current character is a newline character, you want to say strPtr[i] == '\n'.
Finally, you are overwriting the old buffer when you say strPtr[i-1] = '\r'. This will replace the character before the \n, (such as the i in Hi).
Basically, what you want to do is create a second buffer for output, and copy the string character by character to the output buffer. When you encounter a \n character, instead of copying a single \n to the new buffer, you copy \r\n.
In the worst case, where every character of the input string is \n, the output buffer needs twice the length of the input string, plus 1 for the NUL terminator. However, you can compute the exact size for the output buffer by counting the number of \n characters in the original string beforehand.

Every escape sequence in C denotes a single character that is stored in one byte of memory; don't think of it as two.
So you can compare a byte against \n directly, just as you do with \0.
If you want to replace \n (1 character) with \r\n (2 characters), str needs additional memory, which it does not have in your program.
char *a = "\n"; //One character: a newline (plus the terminating '\0')
char *b = "\\\n"; //Two characters: a backslash, then a newline
char *c = "\\\r\n"; //Three characters: backslash, carriage return, newline
char *d = "\r\n"; //Two characters: carriage return, newline
Each of the escape sequences below is a single-byte character in C.
\n – New line
\r – Carriage return
\t – Horizontal tab
\\ – Backslash
\' – Single quotation mark
\" – Double quotation mark

You can't do this in-place. You're adding a new character ('\r') for every '\n' which means the string must expand. The worst case scenario is that every character is a '\n' which means we would double the size of the string. Thus, let's make a buffer twice the size of the original string.
strtmp = malloc(strlen(str) * 2 + 1); /* +1 for the terminating NUL */
strptr = strtmp;
for (i = 0; str[i] != '\0'; i++)
{
    if (str[i] == '\n')
    {
        *strptr++ = '\r'; /* insert a carriage return before each newline */
    }
    *strptr++ = str[i];
}
*strptr = '\0'; /* terminate the output string */
printf("%s", strtmp); /* never pass data as the format string */
free(strtmp);

The \n in your string is an escape sequence and is represented by one character.
Your code should be like this:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    char str[] = "Hi\n, How are you \n, are you okay\n";
    char *strPtr;
    int i, j;
    int count = 0;
    for (i = 0; str[i] != '\0'; ++i)
    {
        if ('\n' == str[i]) ++count; /* count the newlines */
    }
    strPtr = malloc(i + 1 + count); /* one extra byte per newline */
    if (strPtr == NULL) return 1;
    for (i = j = 0; str[i] != '\0'; ++i)
    {
        if ('\n' == str[i]) strPtr[j++] = '\r';
        strPtr[j++] = str[i];
    }
    strPtr[j] = 0;
    printf("This many times we changed it: %d\n", count);
    free(strPtr);
    return 0;
}
EDIT
As you have decided to change the question (BTW, just add clarification to the question rather than deleting huge chunks of the original post, as the answers will not make any sense to future visitors), here is the code:
int main(void)
{
char str[] = "Hi\r\n, How are you \r\n, are you okay\r\n";
int i, j, count = 0;
for (i = j = 0; 0 != str[i]; ++i)
{
if ('\r' == str[i] && '\n' == str[i + 1])
{
++count;
}
else
{
str[j++] = str[i];
}
}
str[j] = 0;
.. etc - str is without \r\n but \n, count is the number of lines.

Related

fgets() and scanf() not working properly in unison. Buffer problem encountered

My assignment: -
Write a program that replaces the occurrence of a given character (say
c) in a primary string (say PS) with another string (say s).
Input: The first line contains the primary string (PS). The next line
contains a character (c). The next line contains a string (s).
Output: Print the string PS with every occurrence of c replaced by s.
Test case 1: -
Input: -
abcxy
b
mf
Expected output: -
amfcxy
Test case 2: -
Input: -
Al#bal#20owL
l
LL
Expected output: -
ALL#baLL#20owL
My code below: -
#include <stdio.h>
#include <stdlib.h>
int main() {
char PS[101];
char c;
char S[11];
fgets(PS, 101, stdin); //PS value input.
scanf("%c", &c);
if (c == '\n' || c == '\0') {
scanf("%c", &c); //Clearing the buffer. I want the real value of 'c' from STDIN not '\n'
}
fgets(S, 11, stdin); //S value input.
int i = 0;
while (PS[i] != '\0') { //Removing the '\n' from PS
if (PS[i] == '\n') {
PS[i] = '\0';
break;
}
i++;
}
i = i - 1; //i now holds the value of the size of the string PS (excluding '\0')
int j = 0;
while (S[j] != '\0') {
if (S[j] == '\n') {
S[j] = '\0';
break;
}
j++;
}
j = j - 1; //j now holds the value of the size of the string S (excluding '\0')
int k = 0; //work as an initializer
int move = 0; //work as an initializer.
while (PS[k] != '\0') { //This loops checks the whole array for the same character mentioned in char 'c'
if (PS[k] == c) {
for (move = i; move > k; move --) { //This loop advances the all the characters in PS by '(j - 1)' steps to make space for string S characters.
PS[move + (j - 1)] = PS[move];
}
for (move = 0; move < j; move++) { //This loop adds all the characters of string S into string PS at the relevant place.
PS[k + move] = S[move];
}
i = i + (j - 1); // 'i' now holds the new value of size of string PS after adding all the characters of string S.
}
k++;
}
puts(PS);
return 0;
}
Now the problem is that the code is not taking the input for string S.
After inputting first 2 inputs, it executes and gives a gibberish answer. I cannot figure out the bug, but what I do know is that there is some issue related to the buffer in C. Please help.
Edit: -
Thanks to @WeatherVane I have now edited the code as follows: -
scanf("%c", &c);
if (c == '\n' || c == '\0') {
scanf("%c", &c); //Clearing the buffer. I want the real value of 'c' from STDIN not '\n'
}
char x;
x = getchar(); //New addition. It eats the '\n' after scanf().
fgets(S, 11, stdin); //S value input.
Now my code is working fine but the output is still not correct. It is sometimes failing to copy the last char from string S or giving me gibberish output.
The problem with the code was: -
i = i - 1; //i now holds the value of the size of the string PS (excluding '\0')
j = j - 1; //j now holds the value of the size of the string S (excluding '\0')
The values of i and j after the loops are already the true lengths of string PS and string S; subtracting 1 from them was the mistake.
Lessons learnt from this assignment: -
scanf("%c") does not skip or consume the trailing '\n'. It WILL be left in the
buffer.
If possible, use fgets and then remove the '\n' from your array.
Be extra careful with the input buffer when dealing with chars and strings.
The final correct code is: -
#include <stdio.h>
#include <stdlib.h>
int main()
{
char PS[101];
char c;
char S[11];
fgets(PS, 101, stdin); //PS value input.
scanf("%c", &c);
if(c == '\n' || c == '\0')
{
scanf("%c", &c); //Clearing the buffer. I want the real value of 'c' from STDIN not '\n'
}
char x;
x = getchar(); //New addition. It eats the '\n' after scanf().
fgets(S, 11, stdin); //S value input.
int i = 0;
while(PS[i] != '\0') //Removing the '\n' from PS
{
if(PS[i] == '\n')
{
PS[i] = '\0';
break;
}
i++;
}
//i now holds the size of the string PS (excluding '\0'); no adjustment is needed
int j = 0;
while(S[j] != '\0')
{
if(S[j] == '\n')
{
S[j] = '\0';
break;
}
j++;
}
//j now holds the size of the string S (excluding '\0'); no adjustment is needed
int k = 0; //work as an initializer
int move = 0; //work as an initializer.
while(PS[k] != '\0') //This loops checks the whole array for the same character mentioned in char 'c'
{
if(PS[k] == c)
{
for(move = i; move > k; move --) //This loop advances the all the characters in PS by '(j - 1)' steps to make space for string S characters.
{
PS[move + (j - 1)] = PS[move];
}
for(move = 0; move < j; move++) //This loop adds all the characters of string S into string PS at the relevant place.
{
PS[k + move] = S[move];
}
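i = i + (j - 1); // 'i' now holds the new value of size of string PS after adding all the characters of string S.
k = k + (j - 1); // Skip the inserted characters so they are not scanned again if S itself contains 'c'.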
}
k++;
}
puts(PS);
return 0;
}
Warning: -
The above code is very unoptimised and unreadable. Do not use it for
long term projects. It just "works".
Any suggestions for improvements of the above code are welcomed in
the comments.
Further necessary reading material recommended if you face any issue regarding C buffer in the future: -
Read 1
Read 2

Why is this still counting spaces within a String?

I'm just starting to code and I need help figuring out why this loop counts spaces within a string.
To my understanding, this code should tell the computer to not count a space "/0" and increase count if the loop goes through the string and it's any other character.
int main(void)
{
string t = get_string("Copy & Past Text\n");
int lettercount = 0;
for (int i = 0; t[i] != '\0'; i++)
{
lettercount++;
}
printf("%i", lettercount);
printf("/n");
}
\0 represents the null character, not a space. It is found at the end of a string to mark where it ends. To skip spaces when counting, add a conditional statement inside the loop.
int main(void)
{
string t = get_string("Copy & Past Text\n");
int lettercount = 0;
for (int i = 0; t[i] != '\0'; i++)
{
if (t[i] != ' ')
lettercount++;
}
printf("%i", lettercount);
printf("\n");
}
A space is a character like any other. Your code goes through the string (an array of characters) and counts every character until it reaches the string-terminating character, '\0'.
Edit: put the condition if (t[i] != ' ') inside the loop and the spaces will no longer be counted.
You misunderstand the nature of C strings.
A string is an array of characters with a zero-valued byte ('\0') marking the end of the string. Within the string some of the characters could be spaces (' ' or 0x20).
So the " t[i] != '\0' " condition marks the end of the loop.
A simple change:
if ( t[i] != ' ') {
lettercount++;
}
Will get your program working.
This for loop
for (int i = 0; t[i] != '\0'; i++)
iterates until the current character is the terminating null character '\0', which itself is not counted.
In C, the standard function isalpha, declared in the header <ctype.h>, determines whether a character is a letter.
Note that the user can, for example, enter punctuation symbols in a string, or use the tab character '\t' instead of the space character ' '. The input could look like "~!@#$%^&", which contains no letters at all.
So it would be more correct to write the loop the following way
size_t lettercount = 0;
for ( string s = t; *s; ++s )
{
if ( isalpha( ( unsigned char )*s ) ) ++lettercount;
}
printf("%zu\n", lettercount );
This statement
printf("/n");
should be removed. I think you meant
printf("\n");
that is, you want to output the newline character '\n'. But this character can be placed in the previous call to printf, as shown above
printf("%zu\n", lettercount );
A null terminator is the last element of a string stored in a character array (e.g. Hello there!\0). It marks the end of the string, so a loop that scans the string stops there instead of reading further elements.
And remember, a null-terminator isn't a space character. Both could be represented in the following way:
\0 - null terminator | ' ' - a space
If you want to count the letters except the space, try this:
#include <stdio.h>
#define MAX_LENGTH 100
int main(void) {
char string[MAX_LENGTH];
int letters = 0;
printf("Enter a string: ");
fgets(string, MAX_LENGTH, stdin);
// string[i] as the loop condition is equivalent to string[i] != '\0',
// i.e. go on until the null terminator is reached
for (int i = 0; string[i]; i++)
// count the current char unless it is a space or the newline
// that fgets() stores when the user presses Enter
if (string[i] != ' ' && string[i] != '\n')
letters++;
printf("Total letters without space: %d\n", letters);
return 0;
}
You'll get something like:
Enter a string: Hello world, how are you today?
Total letters without space: 26
If a character array has no null terminator, reading it cannot stop on its own; the programmer has to supply the number of elements to read explicitly.

How to correctly add characters to a C String in C Language?

Currently I am coding a program that can go through a text file and analyze each character. If a character is alphanumeric and a valid identifier, I want to be able to add that character into a string.
My current code to do so is this:
char final[256]={'\0'};
unsigned int len = 0;
static int current = ' ';
static int temp = ' ';
if(isalpha(current)){
final[0]=current;
len = 1;
for (temp = fgetc(file); isalnum(temp) || temp == '_';){
for(int i = len; i <= len; i++){
final[i] = temp;
len++;
}
}
final[len] = '\0';
Am I correct to approach this problem the current way? Can you add characters to index positions of strings in C?
The code itself is simple.
char final[256];
unsigned int len = 0;
int ch = fgetc(file); //fgetc() returns an int so that EOF can be told apart from any character. We read the character but do not "approve" it yet.
//while (ch != EOF && !isalpha(ch)) ch = fgetc(file); //uncomment if you want to read the file until a valid identifier begins.
if (isalpha(ch)) {
    final[len++] = (char)ch; //We "approve" the first character.
    //|| is evaluated left to right, so ch already holds the new character when it is compared with '_'. The length check keeps room for the null terminator.
    while (len < sizeof final - 1 && (isalnum(ch = fgetc(file)) || ch == '_'))
        final[len++] = (char)ch; //We "approve" the next character.
}
final[len] = '\0'; //The last character read was not "approved", so the string ends here.
About the second question: yes, you can assign a character to an indexed position. But unless that position is the end of the string, the assignment overwrites the existing character. If you want to insert characters, first use memmove() to shift the rest of the string.

Why would a character array be unchanged after a for-loop?

I have built a function with the goal of taking text that is fed from elsewhere in the program and removing all whitespace and punctuation from it. I'm able to remove whitespace and punctuation, but the changes don't stay after they are made. For instance, I put the character array/string into a for-loop to remove whitespace and verify that the whitespace is removed by printing the current string to the screen. When I send the string through a loop to remove punctuation, though, it acts as though I did not remove whitespace from earlier. This is an example of what I'm talking about:
Example of output to screen
The function that I'm using is here.
//eliminates all punctuation, capital letters, and whitespace in plaintext
char *formatPlainText(char *plainText) {
int length = strlen(plainText);
//turn capital letters into lower case letters
for (int i = 0; i < length; i++)
plainText[i] = tolower(plainText[i]);
//remove whitespace
for (int i = 0; i < length; i++) {
if (plainText[i] == ' ')
plainText[i] = plainText[i++];
printf("%c", plainText[i]);
}
printf("\n\n");
//remove punctuation from text
for (int i = 0; i < length; i++) {
if (ispunct(plainText[i]))
plainText[i] = plainText[i++];
printf("%c", plainText[i]);
}
}
Any help as to why the text is unchanged after it exits the loop would be appreciated.
Separate for loops are not necessary. Your function can be modified as follows; the comments mark the changes:
char *formatPlainText(char *plainText)
{
    char *src = plainText; // read position
    char *dest = plainText; // write position; never gets ahead of src
    while (*src) // as long as *src is not '\0'
    {
        int k = tolower((unsigned char)*src);
        if (!ispunct(k) && k != ' ') // skip spaces and punctuation marks
            *dest++ = (char)k; // store the lower-case character and advance dest
        src++;
    }
    *dest = '\0'; // terminate the shortened string; the loop does not copy the '\0'
    return plainText; // return the start of the string, not its end
}
From main:
int main( void ){
char str[] = "Practice ????? &&!!! makes ??progress!!!!!";
char * res = formatPlainText(str);
printf("%s \n", str);
}
The code does convert the string to lower case, but the space and punctuation removal phases are broken: plainText[i] = plainText[i++]; has undefined behavior because you use i and modify it elsewhere in the same expression.
Furthermore, you do not return plainText from the function. Depending on how you use the function, this leads to undefined behavior if you store the return value to a pointer and later dereference it.
You can fix the problems by using 2 different index variables for reading and writing to the string when removing characters.
Note too that you should not use a length variable, as the string length changes in the second and third phases. Testing for the null terminator is simpler.
Also note that tolower() and ispunct() and other functions from <ctype.h> are only defined for argument values in the range 0..UCHAR_MAX and the special negative value EOF. char arguments must be cast as (unsigned char) to avoid undefined behavior on negative char values on platforms where char is signed by default.
Here is a modified version:
#include <ctype.h>
#include <stdio.h>
//eliminate all punctuation, capital letters, and whitespace in plaintext
char *formatPlainText(char *plainText) {
size_t i, j;
//turn capital letters into lower case letters
for (i = 0; plainText[i] != '\0'; i++) {
plainText[i] = tolower((unsigned char)plainText[i]);
}
printf("lowercase: %s\n", plainText);
//remove whitespace
for (i = j = 0; plainText[i] != '\0'; i++) {
if (plainText[i] != ' ')
plainText[j++] = plainText[i];
}
plainText[j] = '\0';
printf("no white space: %s\n", plainText);
//remove punctuation from text
for (i = j = 0; plainText[i] != '\0'; i++) {
if (!ispunct((unsigned char)plainText[i]))
plainText[j++] = plainText[i];
}
plainText[j] = '\0';
printf("no punctuation: %s\n", plainText);
return plainText;
}

Attach a String to another String in C WITHOUT any spaces

This is my first post in this forum, so please be patient.
I need to write a short program where the user can enter 2 strings, which should then be concatenated.
I already have the code below (I am not allowed to use other "includes").
What I need to know is: How can I discard any spaces that the user enters?
Example: 1. String "Hello " | 2. String "World" Result should be "HelloWorld" instead of "Hello World".
#include <stdio.h>
void main()
{
char eingabe1[100];
char eingabe2[100];
int i = 0;
int j = 0;
printf("Gib zwei Wörter ein, die aneinander angehängt werden sollen\n");
printf("1. Zeichenkette: ");
gets(eingabe1);
printf("\n");
printf("2. Zeichenkette: ");
gets(eingabe2);
printf("\n");
while (eingabe1[i] != '\0')
{
i++;
}
while (eingabe2[j] != '\0')
{
eingabe1[i++] = eingabe2[j++];
}
eingabe1[i] = '\0';
printf("Nach Verketten: ");
puts(eingabe1);
}
You have to filter out the spaces as you copy your strings.
You have two string indices, i for the first string and j for the second string. You could make better use of these indices if you used i for the reading position (of both strings, one after the other; you can "reuse" loop counters in independent loops) and j for the writing position.
Here's how. Note that the code attempts to prevent buffer overflow by only adding characters if there is space in the string. This check needs only to be done when copying the second string, because j <= i when you process the first string.
#include <stdio.h>
int main()
{
char str1[100] = "The quick brown fox jumps over ";
char str2[100] = "my big sphinx of quartz";
int i = 0;
int j = 0;
while (str1[i] != '\0') {
if (str1[i] != ' ') str1[j++] = str1[i];
i++;
}
i = 0;
while (str2[i] != '\0') {
if (str2[i] != ' ' && j + 1 < sizeof(str1)) str1[j++] = str2[i];
i++;
}
str1[j] = '\0';
printf("'%s'\n", str1);
return 0;
}
In addition to avoiding spaces between your two words, you also have to avoid the newline ('\n') character placed in the input buffer when the user presses Enter. You can do that with a simple test after you have read the line with fgets(), NOT gets(). gets() was removed from the C standard library in C11 and should not be used, because it cannot limit how much it writes into the buffer. fgets() also gives you simple control over the number of characters a user may enter.
Below, the trouble starts when you read eingabe1. After the read, eingabe1 contains a '\n' character at its end (as it would with any of the line-oriented input functions, e.g. getline(), fgets(), etc.). To handle the newline, you can simply check the character at index length minus 1 after you loop over the string to find the nul character, e.g.:
if (eingabe1[i-1] == '\n') i--; /* remove trailing '\n', update i */
Reducing the index 'i' past the newline, and past any trailing spaces, guarantees that the concatenation with eingabe2 will not have spaces or a newline character between the words.
Putting the pieces together, using fgets in place of the insecure gets and a #define MAX 100 constant to avoid hardcoding your array sizes, you could come up with something similar to:
#include <stdio.h>
#define MAX 100
int main (void)
{
char eingabe1[MAX] = {0};
char eingabe2[MAX] = {0};
int i = 0;
int j = 0;
printf("Gib zwei Wörter ein, die aneinander angehängt werden sollen\n");
printf("1. Zeichenkette: ");
/* do NOT use gets - it is no longer part of the C library */
fgets(eingabe1, MAX, stdin);
putchar ('\n');
printf("2. Zeichenkette: ");
/* do NOT use gets - it is no longer part of the C library */
fgets(eingabe2, MAX, stdin);
putchar ('\n');
while (eingabe1[i]) i++; /* set i (index) to terminating nul */
if (i > 0) {
if (eingabe1[i-1] == '\n') i--; /* remove trailing '\n' */
while (i && eingabe1[i-1] == ' ') /* remove trailing ' ' */
i--;
}
while (eingabe2[j] && i < MAX - 1) { /* concatenate string - no spaces, no '\n' */
if (eingabe2[j] != ' ' && eingabe2[j] != '\n')
eingabe1[i++] = eingabe2[j];
j++;
}
eingabe1[i] = 0; /* nul-terminate eingabe1 */
printf("Nach Verketten: %s\n", eingabe1);
return 0;
}
Output
$ ./bin/strcatsimple
Gib zwei Wörter ein, die aneinander angehängt werden sollen
1. Zeichenkette: Lars
2. Zeichenkette: Kenitsche
Nach Verketten: LarsKenitsche
Let me know if you have any further questions. I have highlighted the changes with comments above.
#include <stdio.h>
#include <string.h>
/**
return: the new len of the string;
*/
int removeChar(char* string, char c) {
int i, j;
int len = strlen(string)+1; // +1 to include '\0'
for(i = 0, j = 0 ; i < len ; i++){
if( string[i] == c )
continue; // avoid incrementing j and copying c
string[ j ] = string[ i ]; // shift characters
j++;
}
return j-1; // do not count '\0';
}
int main(){
char str1[] = "sky is flat ";
char str2[100] = "earth is small ";
strcat( str2, str1 );
printf("with spaces:\n\t'%s'\n", str2) ;
removeChar(str2, ' ');
printf("without spaces:\n\t'%s'\n", str2 );
}
/**
BONUS: this will remove many characters at once, eg "\n \r\t"
return: the new len of the string;
*/
int removeChars(char* string, char *chars) {
int i, j;
int len = strlen(string);
for(i = 0, j = 0 ; i < len ; i++){
if( strchr(chars,string[i]) )
continue; // avoid incrementing j and copying c
string[ j ] = string[ i ]; // shift characters
j++;
}
string[ j ]=0;
return j;
}
Thank you everyone for all the answers.
I have the solution now.
I read your advice and will try to remember it for the future.
See the code below:
(Excuse the strange variable names; I use German words.)
A few notes:
I am not allowed to use library functions.
As a trainee, I am not allowed to use fgets for some reason.
#include <stdio.h>
void main()
{
char eingabe1[100];
char eingabe2[100];
int i = 0;
int j = 0;
printf("gib zwei wörter ein, die aneinander angehängt werden sollen\n");
printf("1. zeichenkette: ");
gets(eingabe1);
printf("\n");
printf("2. zeichenkette: ");
gets(eingabe2);
printf("\n");
//Attach Strings
while (eingabe1[i] != '\0')
{
i++;
}
while (eingabe2[j] != '\0')
{
eingabe1[i++] = eingabe2[j++];
}
//Remove Space
eingabe1[i] = '\0';
i = 0;
j = 0;
while (eingabe1[i] != '\0')
{
if (eingabe1[i] != 32)
{
eingabe2[j++] = eingabe1[i];
}
i++;
}
eingabe2[j] = '\0';
printf("Nach verketten: ");
puts(eingabe2);
}
Sounds like homework to me.
I just wanted to mention that sizeof() gives the size of the array in bytes, not the length of the string stored in it; use strlen() for the length. Note that both count bytes, so with multibyte characters neither tells you the number of characters. sizeof() is appropriate when you need the capacity of a buffer, for example as a bound while copying or as the number of bytes to malloc().
I write little loops fairly often to do low-level text stuff one character at a time. Just be aware that strings in C have a 0 byte at the end: you have to expect to encounter one on input and be sure to put one on the output. Space is 0x20 or decimal 32 or ' '; it's just another character.
