How to remove ^M ^J characters in linux - c

I have an external machine that sends results to my Raspberry Pi. In my terminal emulator, CuteCom, the results appear line by line without problems. I use Code::Blocks and wrote my own C application to read the data every 10 seconds. But something strange happens: sometimes I get the results line by line, and sometimes I get the strange characters ^M ^J at the end of each line, which ruins my final results. I think these end-of-line characters appear because the external machine was developed on Windows.
The good results
+PARAMETERS: 45 BYTES FROM 0000:0000 (063)
MACHINE_1:(AN=23.45,H=34.56,D=12.34)
The bad results
+PARAMETERS: 45 BYTES FROM 0000:0000 (063)^M^JMACHINE_1:
(AN=21.45,H=33.56,D=10.34)
OK, up to this point the only problem is how the command line displays the result; the data itself is fine. But if I try to use strtok to extract tokens, these characters cause serious problems. What can I do? Can I add something to escape these characters? This is the part of the code I use to read data from the machine:
char buff[300];
memset(buff, 0, sizeof(buff));
for (;;)
{
    n=read(fd,buff,sizeof(buff));
    sleep(1);
    printf("%s", buff);
    printf("\n");
    ....
    ....

You're reading blocks of up to 300 bytes, and read() does not add a terminating \0 to what it stores.
You have to look at n to see how much data was actually read, and then process that data before printing it, i.e. look for the ^M^J, terminate the line there, and then continue with the rest of the data.
FYI, ^M^J (carriage return followed by line feed) is the Windows line termination; on Linux it is just ^J.
The following reads multiple messages, converts a ^ followed by J to \n, and ignores ^M.
Note that this uses STDIN, not a serial port.
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char** argv)
{
    int fd=STDIN_FILENO;
    int i,n;
    int c=0; // non-zero while a '^' is pending
    char buff[300];
    memset(buff, 0, sizeof(buff));
    for (;;)
    {
        n=read(fd,buff,sizeof(buff));
        if (n<=0)
        {
            // end of input or read error
            break;
        }
        for (i=0; i<n; i++)
        {
            switch(buff[i])
            {
            case '^':
                if(c)
                {
                    // ^^ so output first ^
                    putchar('^');
                }
                else
                {
                    // Possible ^M or ^J
                    c++;
                }
                break;
            case 'M':
                if (c)
                {
                    // ignore ^M
                    c=0;
                }
                else
                {
                    // just M
                    putchar(buff[i]);
                }
                break;
            case 'J':
                if (c)
                {
                    // ^J is \n
                    putchar('\n');
                    c=0;
                }
                else
                {
                    // just J
                    putchar(buff[i]);
                }
                break;
            default:
                if (c)
                {
                    // ^ followed by other than J or M
                    putchar('^');
                    c=0;
                }
                putchar(buff[i]);
            }
        }
    }
    return 0;
}
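The program above handles the two-character sequences "^M" and "^J" as literal text. If the machine instead sends real carriage-return and line-feed bytes, which many terminals merely display as ^M and ^J, the bytes can be filtered directly. The following is a minimal sketch under that assumption; strip_cr is a hypothetical helper name, not something from the question.

```c
#include <stddef.h>

/* Remove every '\r' from the first n bytes of buf, in place, and
 * NUL-terminate the result. buf must have room for n + 1 bytes.
 * Returns the new length. (strip_cr is a hypothetical helper.) */
size_t strip_cr(char *buf, size_t n)
{
    size_t in, out = 0;
    for (in = 0; in < n; in++) {
        if (buf[in] != '\r')
            buf[out++] = buf[in];
    }
    buf[out] = '\0';
    return out;
}
```

It would be called with the count returned by read(), for example after n = read(fd, buff, sizeof(buff) - 1) with n > 0, so that one byte stays free for the terminator.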

I think you can still use strtok() for this. Just include the carriage return and line feed (the bytes shown as ^M and ^J, i.e. '\r' and '\n') in the char *delimiters parameter.
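As a rough sketch of that idea, assuming the data has already been NUL-terminated and that the line endings are real '\r' and '\n' bytes, strtok() can split on both at once. split_lines is a hypothetical helper name used only for illustration.

```c
#include <string.h>

/* Split data into lines at any run of '\r' or '\n'.
 * data is modified in place; pointers into it are stored in lines[].
 * Returns the number of lines found (at most max). */
int split_lines(char *data, char *lines[], int max)
{
    int count = 0;
    char *tok = strtok(data, "\r\n");
    while (tok != NULL && count < max) {
        lines[count++] = tok;
        tok = strtok(NULL, "\r\n");
    }
    return count;
}
```

Because strtok() treats a run of delimiters as a single separator, "\r\n" does not produce an empty token between lines. Each returned line could then be tokenized further, for example on ':' or ','.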

Just execute the command "sed -e 's/\r$//' filename" (GNU sed understands \r), which deletes the carriage return at the end of each line.
Alternatively, I found this script on a website:
#!/usr/bin/python
# Python 2 script (it uses raw_input)
while True:
    file = raw_input('Input file name:(input "q" to quit)')
    if file == 'q':
        break
    file_ = open(file, 'rb').read()
    list_ = list(file_)
    new_file = ''
    for x in list_:
        # drop carriage returns, which editors display as ^M
        if x != '\r':
            new_file = new_file + x
    file_ = open(file, 'wb')
    file_.write(new_file)
    file_.close()

Related

How to stop or interrupt low-level read function fetching the rest of keyboard input buffer

What should I do to stop the program from looping by itself through the whole keyboard buffer?
I don't know very well how the low-level read() function operates, but it seems to behave like a separate non-blocking thread that I cannot force to stop receiving more than I want.
For example, in the code below I simply want to force the user to enter 4 characters from the keyboard in the terminal. But because of the '\n' line feed, even when I enter "123"+[Enter], ret=read(0,buffer,4) returns 4, not 3, so the while-loop condition holds and the program enters the loop although it shouldn't. ret is also 4 when I enter "1234"+[Enter], but then I hit the second problem: the loop executes twice, printing two consecutive "Enter text:" prompts.
Could you please help me organize the code correctly so that it does not keep processing what is left in the keyboard buffer, and so that "123"+[Enter] is not counted as 4 characters?
Thanks.
NOTE: In the output image, when 3 characters such as "123" and "456" are entered they are reprinted, which shows that the while loop was entered although it shouldn't have been. The second problem shows up when "1234" and "7898" are entered: they are reprinted, which is OK because we are inside the loop, but the "Enter text:" prompt is printed twice instead of once. The read function apparently goes around the loop once more for the extra line-feed character from the Enter key. And when many characters are entered, the loop shouldn't run multiple times, once for each 4 characters of input.
This is the code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#define BUFSIZE 1000
unsigned char end=0;
int main(int argc, char const *argv[])
{
    char *buffer=(char*)malloc(sizeof(char)*BUFSIZE);
    char inp,*bptr=buffer;
    int ret;
    for(;;)
    {
        printf("Enter text : \n");
        /* READ automatically enters this loop even when bytes read are
        more than 4, and it does it for each 4 next character in keyboard buffer
        */
        while((ret=read(0,buffer,4))==4)
        {
            if( !strcmp(bptr,"1111") || ret<0 )
            {
                end=1;
                break;
            }
            else {
                for(int n=0;*(bptr+n) && *(bptr+n)!='\n' && n<=ret;n++)
                {
                    putchar(*(bptr+n));
                }
                putchar('\n');
                buffer=bptr;
                break;
            }
        }
        /* end of while loop */
        if(end)
        {printf("End!");break;};
    }
    return 0;
}
[Screenshot: program output]
There are lots of ways to do this, but if you really want to use read, you can copy the contents of the buffer filled by read into a second buffer, skipping the newline, something like this:
/* needs #include <ctype.h> for isspace() */
char tempbuf[5]; /* 4 characters plus the terminating '\0' */
int index;
while((ret=read(0,buffer,4))==4)
{
    /* copy only the non-whitespace bytes into tempbuf */
    index = 0;
    for( int i = 0; i < ret; i++ )
    {
        if(!isspace( (unsigned char)buffer[i] )) // or if(buffer[i] != '\n')
            tempbuf[index++] = buffer[i];
    }
    tempbuf[index] = '\0';
    if( !strcmp(tempbuf,"1111") )
    {
        end=1;
        break;
    }
    else {
        puts(tempbuf);
        break;
    }
}
/* end of while loop */
if(end)
{printf("End!");break;};
} /* end of for(;;) loop */
int main(int argc, char const *argv[])
{
    char buf[BUFSIZE];
    int ret;
    for(;;)
    {
        printf("Enter text : \n");
        while((ret=read(0,buf,4))==4)
        {
            buf[ret]='\0';
            if(!strcmp(buf,"exit"))
                exit(0);
            if(buf[ret-1]=='\n') break;
            else {
                printf("-- entered : %s \n",buf);
            }
        }/* end of while loop */
    }/* end of infinite loop */
    return 0;
}
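In the terminal's default (canonical) mode, read() is handed a whole line including the '\n' produced by the Enter key, which is why "123"+[Enter] counts as 4 bytes. One possible way to deal with that, sketched here with a hypothetical helper named chomp_n, is to read the whole line in one call and drop the newline before counting characters.

```c
#include <stddef.h>

/* Terminate the n bytes that read() stored in buf and drop one trailing
 * '\n' if present. buf must have room for n + 1 bytes.
 * Returns the number of characters left. */
size_t chomp_n(char *buf, size_t n)
{
    if (n > 0 && buf[n - 1] == '\n')
        n--;
    buf[n] = '\0';
    return n;
}
```

With ret = read(0, buf, BUFSIZE - 1) and ret > 0, the test chomp_n(buf, (size_t)ret) == 4 would then accept "1234"+[Enter] and reject "123"+[Enter], and nothing from that line is left behind for the next read().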

Problem with passing ESCAPE key into stdin buffer in C Windows console

I want to read text into my string and process it in an infinite loop, and I want to break out of the loop if the first character of the input is the ESCAPE key. However, getchar returns 10 when I press ESCAPE followed by ENTER.
printf correctly prints the entered character, but if I press ESCAPE nothing happens. I also don't want to use _getch() or getche(), which would actually solve my problem, because they also remove c from the buffer and don't display it properly.
What can I do?
int c;
while( 1 )
{
    c = getchar();
    printf( "c = %d\n", c ); // just for debug
    if( c == 27 ) break;
    else ungetc( c, stdin );
    fgets( StrIn, BUF_SIZE - 1, stdin );
    // REST OF CODE ********
}
EDIT: I've just found the Microsoft function
if( GetAsyncKeyState( VK_ESCAPE ) )
    break;
that works fine for me. Thanks for reading and for your time, my problem is solved.
Pressing the ESCAPE key on a keyboard does not usually send an ESCAPE character to a console; the application or OS will often process the character and not pass it on. I'd use a different character to signal that you want to break, for example a '~'.
When you say it doesn't display right if you remove c from the buffer, perhaps you can just add the first character to the buffer manually.
Add a macro for the break character:
#define BREAK_CHAR '~'
The code snippet could then look like this:
int c;
while(1)
{
    c = getchar();
    /* check if we should break */
    if(c == BREAK_CHAR)
        break;
    /* assign the first character that was already read in */
    StrIn[0] = c;
    /* get the rest of the string */
    fgets( StrIn + 1, BUF_SIZE - 2, stdin );
}
If you use Windows, use Windows-specific functions as well. In my function, ESC cancels the input.
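#include <conio.h> /* _kbhit(), _getch() */
#include <stdio.h>
char *wingetsn(char *str, size_t maxlen)
{
    size_t len = 0;
    int ch;
    int exit = 0;
    while (len < maxlen - 1 && !exit)
    {
        /* busy-wait until a key is available */
        while(!_kbhit());
        switch((ch = _getch()))
        {
            case 27:
                /* ESC: discard what was typed so far */
                str[0] = 0;
                exit = 1;
                break;
            case '\r':
                /* ENTER: input is complete */
                exit = 1;
                break;
            case '\n':
                break;
            default:
                /* echo the character and store it */
                printf("%c", ch);
                str[len++] = ch;
                break;
        }
    }
    str[len++] = 0;
    return str;
}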

C Program - printf overlapping with other printf chars

I'm working on a program in C that will open, read, and close a file with Linux system calls, and print the contents of the file(s) to the screen. The command format is
$ mycat [-bens] f1 [f2 ...].
The switches are as follows:
-b displays the line number for each non-blank line, starting at 1
-e displays a '$' at the end of each line
-n displays the line number for every line
-s removes all empty lines from the output (effectively single-spacing the output)
The problem is that when I use the -b or -n switch, printf appears to be "overlapping" the line number with what the buffer is trying to print from the text file itself.
Here is the code I have written for the program:
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <unistd.h>
#include <fcntl.h>
#include <getopt.h>
#define BUFFERSIZE 4096
void oops(char *, char *);
int main(int ac, char *av[])
{
    int fd, numRead, curr, i, c;
    char buf[BUFFERSIZE] = {0};
    extern char *optarg;
    extern int optind;
    int tmpS = 0;
    int tmpB = 0;
    int bFlag = 0;
    int eFlag = 0;
    int nFlag = 0;
    int sFlag = 0;
    int bLineNum = 1;
    int nLineNum = 1;
    /* Flag processing in argument list */
    while( (c = getopt(ac, av, "bens")) != -1)
    {
        switch(c)
        {
            case 'b':
                bFlag = 1;
                break;
            case 'e':
                eFlag = 1;
                break;
            case 'n':
                nFlag = 1;
                break;
            case 's':
                sFlag = 1;
                break;
            default:
                exit(EXIT_FAILURE);
        }
    }
    /* Scan through each argument after flag */
    for(i = optind; i < ac; i++)
    {
        /* Error handling when opening each file */
        if((fd = open(av[i], O_RDONLY)) == -1)
            oops("Cannot open ", av[i]);
        /* Read from file to buffer, until end is reached */
        while( (numRead = read(fd, buf, BUFFERSIZE)) > 0)
        {
            /* Once buffer is filled, process each address in buffer */
            for(curr = 0; curr < BUFFERSIZE; curr++)
            {
                /* sFlag squeezes output, eliminating blank lines */
                if(sFlag && buf[curr] == '\n')
                {
                    tmpS = curr + 1;
                    while(buf[tmpS] != '\r')
                    {
                        if(isspace(buf[tmpS]))
                            tmpS++;
                        else
                            break;
                    }
                    curr = tmpS + 1;
                }
                /* nFlag numbers each line, starting from 1 */
                if(nFlag && buf[curr] == '\n')
                    printf("%d ", nLineNum++);
                /* eFlag puts a '$' at the end of every line */
                if(eFlag && buf[curr] == '\r')
                    printf(" $");
                /* bFlag numbers every non-blank line, starting from 1 */
                if(bFlag && buf[curr] == '\n')
                {
                    tmpB = curr + 1;
                    if(isEmptyLine(buf, tmpB))
                        printf("%d ", bLineNum++);
                }
                /* Print the current character in the buffer address */
                printf("%c", buf[curr]);
            }
        }
        if(numRead == -1)
            oops("Read error from ", av[i]);
    }
    return 0;
}
void oops(char *s1, char *s2)
{
    fprintf(stderr, "Error: %s ", s1);
    perror(s2);
    exit(1);
}
int isEmptyLine(char *buf, int tmp)
{
    while(buf[tmp] != '\n')
    {
        if(!isspace(buf[tmp]))
            return 0;
        tmp++;
    }
    return 1;
}
Sample input (file1.txt):
An excerpt from LEARNING DOS FOR THE COMPLETE NOVICE, by Steven Woas, copyright 1993.
1. Change to the compressed drive and then issue a CHKDSK
command like so:
c: <ENTER>
chkdsk /f <ENTER>
The /F tells DOS to fix errors.
Another option is to do it like so:
dblspace /chkdsk /f <ENTER>
A shortcut for the DBLSPACE /CHKDSK /F command is:
dblspace /chk /f <ENTER>
Output with -n flag on and running:
sh-4.2$ ./main -n file1.txt
1 excerpt from LEARNING DOS FOR THE COMPLETE NOVICE, by Steven Woas, copyright 1993.
2
3 Change to the compressed drive and then issue a CHKDSK
4 command like so:
5
6 c: <ENTER>
7
8 chkdsk /f <ENTER>
9
10 The /F tells DOS to fix errors.
11
12 Another option is to do it like so:
13
14 dblspace /chkdsk /f <ENTER>
15
16 A shortcut for the DBLSPACE /CHKDSK /F command is:
17
18 dblspace /chk /f <ENTER>
I'm having the same problem with the -b flag and I don't know why. Does it have to do with \r and \n not being read properly?
Your program exhibits the misbehavior you describe for files with Windows- (DOS-)style line endings (\r\n), but different misbehavior for files with UNIX-style line endings (\n alone), and yet different misbehavior for files with classic Mac OS-style line endings (\r alone). Inasmuch as you seem to be assuming Windows-style line endings overall, I'll focus on that.
Consider what happens when your program reaches the end of a line. It first processes the \r character, ultimately printing it. This causes the output position to return to the beginning of the line (which is possible because the standard output is line-buffered by default). You then print the line number, overwriting whatever may have been there before, and finally print the \n character, causing the buffer to be flushed and the output to move to the next line.
You probably ought to recognize the \r\n sequence as a line ending, instead of trying to handle these characters individually. That may prove to be a bit challenging, as you need to account for the possibility that the pair is split across two read()s, but that shouldn't be too hard. This will also give you the opportunity to consider what to do if you encounter a lone \n and/or a lone \r, which your program could handle more gracefully than it now does.
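One way to treat \r\n as a single line ending even when the pair is split across two read() calls is to carry a small piece of state from one chunk to the next. The sketch below is only an illustration of that idea; struct eol_state and normalize_eol are hypothetical names, and the choice to map a lone \r to a newline as well is an assumption, not something the answer prescribes.

```c
#include <stddef.h>

/* Carries the "previous byte was '\r'" flag from one chunk to the next. */
struct eol_state {
    int last_was_cr;
};

/* Copy n bytes from in to out, emitting a single '\n' for each "\r\n",
 * lone '\r', or lone '\n'. out must have room for n bytes.
 * Returns the number of bytes written. */
size_t normalize_eol(struct eol_state *st, const char *in, size_t n, char *out)
{
    size_t i, w = 0;
    for (i = 0; i < n; i++) {
        char ch = in[i];
        if (ch == '\r') {
            out[w++] = '\n';        /* the line ends here */
            st->last_was_cr = 1;
        } else if (ch == '\n' && st->last_was_cr) {
            st->last_was_cr = 0;    /* second half of "\r\n", already emitted */
        } else {
            out[w++] = ch;
            st->last_was_cr = 0;
        }
    }
    return w;
}
```

The state would be zero-initialized once per file and passed to every call, using numRead as n. Line numbers and the '$' marker can then be driven by the normalized '\n' alone.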

Get text while not EOF

Here's my code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#define N 256
int main(int argc, const char * argv[]) {
    char testo[N];
    int i;
    printf("PER TERMINARE L'INSERIMENTO PREMERE CTRL+Z oppure CTRL+D \n");
    for(i=0;i<N;i++)
    {
        scanf("%c",&testo[i]);
        /* if(testo[i]=='h' && testo[i-1]=='c')
        {
            i--;
            testo[i]='k';
        }
        if(testo[i]==testo[i-1])
        {
            i--;
        } */
        if(testo[i]==EOF)
        {
            break;
        }
    }
    puts(testo);
    return 0;
}
When the code in /* ... */ is compiled, I can't stop the insert of text with EOF, but when the code is built and run as shown here, the EOF works.
Does anyone have any idea what the problem is?
You're testing for EOF incorrectly. With scanf(), you need to look at the return value. In fact, with almost all input functions, you need to test, if not capture and test, the return value.
Superficially, you need:
for (i = 0; i < N; i++)
{
    if (scanf("%c", &testo[i]) == EOF)
        break;
    …
}
However, in general, you should check that scanf() made as many successful conversions as you requested, so it is better to write:
for (i = 0; i < N; i++)
{
    if (scanf("%c", &testo[i]) != 1)
        break;
    …
}
In this example, it really won't matter. If you were reading numeric data, though, it would matter. The user might type Z instead of a number, and scanf() would return 0, not EOF.
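The difference between a return value of 0 and EOF can be seen without typing anything by using sscanf() on fixed strings instead of scanf() on standard input. This is only an illustrative sketch; try_read_int is a hypothetical helper name.

```c
#include <stdio.h>

/* Return what sscanf() reports when asked to read one int from text:
 * 1 on success, 0 on a matching failure, EOF if the input ends first. */
int try_read_int(const char *text, int *value)
{
    return sscanf(text, "%d", value);
}
```

Called with "42" it reports one conversion, with "Z" it reports 0 because the input does not match, and with an empty string it reports EOF. A test against EOF alone would therefore miss the "Z" case, while a test against the expected count of 1 catches both.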
To detect EOF, check the result of scanf():
if (scanf("%c",&testo[i]) == EOF) break;
Note: testo[] may not be null-character terminated. To print it as a string, ensure that it is.
char testo[N];
int i;
// for(i=0;i<N;i++) {
for(i=0;i<(N-1);i++) {
    if (scanf("%c",&testo[i]) == EOF) break;
}
testo[i] = '\0'; // add
puts(testo);
To stop at end of file, check the return value from scanf:
scanf returns the number of inputs correctly parsed. In your case, %c reads a byte from the stream correctly as long as end of file has not been reached. if (scanf("%c",&testo[i]) != 1) break; will do.
Yet using scanf to read one byte at a time from the input stream is overkill. The idiomatic way to do this in C is using the getchar() or the getc() function. The return value must be stored in an int variable and has special value EOF upon end of file.
You should also make the array 1 byte longer and store a null byte at the end to make it a C string, as expected by puts.
Here is a modified version of your program:
int main(int argc, const char *argv[]) {
    char testo[N+1];
    int i;
    printf("PER TERMINARE L'INSERIMENTO PREMERE CTRL+Z oppure CTRL+D\n");
    for (i = 0; i < N; i++) {
        int c = getchar();
        if (c == EOF)
            break;
        testo[i] = c;
        /* ... further processing ... */
    }
    testo[i] = '\0';
    puts(testo);
    return 0;
}

Exercise 3-2 in K&R for C: convert escape sequences into visible ones

K&R: Exercise 3-2. Write a function escape(s,t) that converts characters like newline and tab into visible escape sequences like \n and \t as it copies the string t to s. Use a switch. Write a function for the other direction as well, converting escape sequences into the real characters.
Edit: Got it!! Thanks!! Do I need to add the s[s_index] = '\0'? It seems my program works fine without it (why, though? Shouldn't it cause an error or a memory problem?). Thanks again.
My question: I'm not really sure if my algorithm is on the right path. Can someone please check my code below? It is not printing out any visible escape sequences. My idea was to replace each \n or \t scanned in from t with a \ and then an n, or a \ and then a t (using 2 positions in the s array for every 1 position of t). Also, does anyone know how I can assign a '\n' to a character array? For instance, if I type in "hi" and then press Enter, a \n is scanned into the array if I use c=getchar(). Is there any other way for me to put a '\n' into an array manually before runtime? Thanks a lot, guys! Any help is greatly appreciated.
#include <stdio.h>
void escape(char s[], char t[]);
int main() {
    char s[50];
    char t[50] = "hello guys bye test bye\\n";
    escape(s, t);
    printf("%s\n", s);
}
void escape(char s[], char t[]) {
    int s_index = 0;
    int t_index = 0;
    while (t[t_index] != '\0') {
        switch (t[t_index]) {
            case ('\n'):
                s[s_index] = '\\';
                s[s_index + 1] = 'n';
                t_index++;
                s_index = s_index + 2;
                break;
            case ('\t'):
                s[s_index] = '\\';
                s[s_index + 1] = 't';
                t_index++;
                s_index = s_index + 2;
                break;
            default:
                s[s_index] = t[t_index];
                s_index++;
                t_index++;
                break;
        }
    }
    s[s_index] = '\0';
}
I'm not sure I'd be overly keen on learning C from K&R. It was a great book in its day, but there are better ones around now, and the language has changed quite a bit.
But, at a bare minimum, you should move toward meaningful variable names and learn how to refactor common code so that you're not repeating yourself unnecessarily.
The code I would start with for this task would be along the following lines. It has common code centralised (as macros in this case, to simplify the code), and also prevents buffer overflows. First the requisite headers and helper macros:
#include <stdio.h>
// Macros for output of character sequences,
// including buffer overflow detection.
// Output a single character.
#define OUT_NORM(p1) \
    if (sz < 1) return -1; \
    sz--; \
    *to++ = p1;
// Output a backslash followed by a single
// character.
#define OUT_CTRL(p1) \
    if (sz < 2) return -1; \
    sz -= 2; \
    *to++ = '\\'; \
    *to++ = p1;
Then the function itself, greatly simplified by the use of common code, and immune to buffer overflows:
static int escape (char *from, char *to, size_t sz) {
    // Process every character in source string.
    while (*from != '\0') {
        // Output control or normal character.
        switch (*from) {
            case '\n': OUT_CTRL ('n'); break;
            case '\t': OUT_CTRL ('t'); break;
            default: OUT_NORM (*from);
        }
        from++;
    }
    // Finish off string.
    OUT_NORM ('\0');
    return 0;
}
And, finally, a test program for checking:
int main (void) {
    char src[] = "Today is a good\n\tday to die.\n";
    char dest[100];
    printf ("Original: [%s]\n", src);
    if (escape (src, dest, sizeof(dest)) != 0)
        puts ("Error found");
    else
        printf ("Final: [%s]\n", dest);
    return 0;
}
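The exercise also asks for the other direction, converting visible escape sequences back into the real characters. A possible sketch follows; it is written without the macros above, unescape is a hypothetical name, and passing any unrecognised backslash sequence through unchanged is an assumption rather than something the exercise specifies.

```c
#include <stddef.h>

/* Copy from to to, turning the two-character sequences \n and \t into
 * real newline and tab characters. Any other backslash sequence is
 * copied unchanged. to has room for sz bytes including the '\0'.
 * Returns 0 on success, -1 if the output would not fit. */
int unescape(const char *from, char *to, size_t sz)
{
    while (*from != '\0') {
        char ch = *from++;
        if (ch == '\\') {
            switch (*from) {
            case 'n': ch = '\n'; from++; break;
            case 't': ch = '\t'; from++; break;
            default:  break;        /* keep the backslash itself */
            }
        }
        if (sz < 2)
            return -1;              /* need room for ch and the '\0' */
        *to++ = ch;
        sz--;
    }
    if (sz < 1)
        return -1;
    *to = '\0';
    return 0;
}
```

Like the escape function above, it takes the destination size so that it can refuse to write past the end of the buffer.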

Resources