Saving file as UTF-8 - c

I am trying to write a program for translating English to Greek. I take the ASCII code of the English character (e.g. a), add 128 to it, and save the new char to a file. However, it still saves 'á' and not 'α', because they have the same decimal number.
#include <stdio.h>

int main(int argc, char *argv[]) {
    FILE *fp1, *fp2;
    int ch;     /* int, not char, so EOF can be told apart from real bytes */
    char demo;
    int i;
    if (argc < 2)
        return 1;
    fp1 = fopen(argv[1], "r");
    fp2 = fopen("Translated.txt", "w");
    if (fp1 == NULL || fp2 == NULL)
        return 1;
    while (1) {
        ch = fgetc(fp1);
        if (ch == EOF)
            break;
        else {
            i = ch + 128;
            demo = i;
            putc(demo, fp2);
        }
    }
    printf("File copied Successfully!");
    fclose(fp1);
    fclose(fp2);
    return 0;
}
How can I save the file as UTF-8 so that the output is displayed as Greek characters?
Is there any other way of converting ISO8859-1 to ISO8859-7?
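As a minimal sketch of one way to get real Greek letters into the output, assume the `ch + 128` mapping in the code above lands on an ISO 8859-7 letter (bytes 0xC1 to 0xFE). In that range the Unicode code point is the ISO 8859-7 byte plus 0x2D0, and any code point below 0x800 takes exactly two bytes in UTF-8. The helper names `greek_codepoint` and `utf8_encode2` are hypothetical and not part of the original program.

```c
#include <stddef.h>

/* Map an ISO 8859-7 byte to its Unicode code point.
 * Only the letter range 0xC1..0xFE is handled (0xD2 is unassigned);
 * anything else returns 0 to signal "not handled". */
unsigned greek_codepoint(unsigned char b)
{
    if (b < 0xC1 || b > 0xFE || b == 0xD2)
        return 0;
    return b + 0x2D0u;            /* e.g. 0xE1 -> U+03B1 (alpha) */
}

/* Encode a code point in 0x80..0x7FF as two UTF-8 bytes.
 * Returns the number of bytes written (2), or 0 if out of range. */
size_t utf8_encode2(unsigned cp, unsigned char out[2])
{
    if (cp < 0x80 || cp > 0x7FF)
        return 0;
    out[0] = (unsigned char)(0xC0 | (cp >> 6));    /* 110xxxxx */
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));  /* 10xxxxxx */
    return 2;
}
```

In the copy loop, the two bytes in `out` could then be written with `fwrite(out, 1, 2, fp2)` in place of `putc(demo, fp2)`, with the output file opened in binary mode.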

Related

fgetc only reads UTF8 encoded file. not working for UTF16

My aim is to find the encoding of a text file by dividing the size of the file by the number of characters in it. But fgetc only reads UTF-8 encoded files; it does not work for UTF-16. Please help me solve this problem, or suggest a substitute for fgetc.
#include <stdio.h>
#include <stdlib.h>
void main()
{
findEncode("C:\\UTF-8_TestCase\\TestCase1.txt");
}
int findEncode(char *str){
int ch = NumberOfCharecter(str);
int size = SizeOfFile(str);
if(size/ch == 1){
printf("UTF-8");
}else if(size/ch == 2){
printf("UTF-16");
}else {
printf("UTF-32");
}
}
int NumberOfCharecter(char *str){
FILE *fptr;
char ch;
int character=1;
fptr=fopen(str,"r");
if(fptr==NULL)
{
printf("File does not exist or can not be opened.");
}
while(1)
{
ch = fgetc(fptr); //fgetc only reads UTF8 encoded file. not working for UTF16
if(ch==EOF)
break;
character++;
}
fclose(fptr);
printf("The number of characters in the file %s are : %d\n\n",str,character-1);
return character-1;
}
//SizeOfFile working well
int SizeOfFile(char *str) {
FILE *fptr;
char ch;
int sz;
fptr=fopen(str,"r+");
fseek(fptr, 0, SEEK_END);
sz = ftell(fptr);
printf("the size of the file is %d \n\n", sz);
fclose(fptr);
return sz;
}
char ch;
…
ch = fgetc(fptr); //…
if(ch==EOF)
You wrongly assign the return value of fgetc() to a char; in order to compare it to EOF, you have to define int ch. After this, you'll find that NumberOfCharecter() returns the same number as SizeOfFile(), since the character read by fgetc() is not a character in the sense of an encoding, it's independent from that.
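To illustrate the point that `fgetc()` returns bytes rather than encoded characters, here is a minimal sketch that counts both in one pass, assuming the input is valid UTF-8. The function name `count_bytes_and_chars` is hypothetical; the only encoding fact it relies on is that UTF-8 continuation bytes have the bit pattern 10xxxxxx.

```c
#include <stdio.h>

/* Count bytes and UTF-8 code points in a stream.
 * ch is an int so that EOF is distinguishable from every byte value. */
void count_bytes_and_chars(FILE *fp, long *bytes, long *chars)
{
    int ch;
    *bytes = 0;
    *chars = 0;
    while ((ch = fgetc(fp)) != EOF) {
        (*bytes)++;
        if ((ch & 0xC0) != 0x80)   /* not a continuation byte: starts a new character */
            (*chars)++;
    }
}
```

For a file that holds only ASCII the two counts are equal, so the ratio of size to character count cannot by itself separate ASCII from UTF-8.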

Putting txt File Into Array is Starting at 12th Element

I'm writing a program in C that reads data from a .txt file, and my goal is to put each element from the file into an array. When I compile and run the program, the values 50, 55, and 0 are returned. These are the ASCII values for 2, 7, and 0 (I'm not sure why the elements are being stored as ASCII codes, but that's okay for now); the 0 means nothing was initialized since we reached the end of the .txt file. Why is my program not reading the .txt file from the beginning?
...
int main(int argc, char *argv[]){
FILE *inputFile;
char *input = argv[1];
char magicSquareArray[257];
inputFile = fopen(input, "r");
if (inputFile == 0){
printf("Cannot open file for reading!\n");
return -1;
}
fscanf(inputFile, "%s", magicSquareArray);
while (!feof(inputFile)){
fscanf(inputFile, "%s", magicSquareArray);
}
printf("%i\n", magicSquareArray[0]);
int sideSize = magicSquareArray[0];
int squareSize = sideSize * sideSize;
printf("%i\n", squareSize);
fclose(inputFile);
The text file:
3
4,3,8
9,5,1
2,7,6
Perhaps you want code like the following.
(However, I would approach it differently:
read the first number and prepare an array of that size,
then read the remaining values into it as numbers.)
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]){
FILE *inputFile;
char *input = argv[1];
char magicSquareArray[257];
int ch, len;
inputFile = fopen(input, "r");
if (inputFile == 0){
printf("Cannot open file for reading!\n");
return -1;
}
len = 0;
while((ch = fgetc(inputFile)) != EOF && len < sizeof(magicSquareArray)-1){
magicSquareArray[len++] = ch;
}
magicSquareArray[len] = 0;
fclose(inputFile);
printf("%c\n", magicSquareArray[0]);
int sideSize = atoi(magicSquareArray);
int squareSize = sideSize * sideSize;
printf("%i\n", squareSize);
return 0;
}
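The approach described in parentheses above (read the first number, then read the values as numbers) could look like the following sketch. It assumes the file layout shown in the question: the side length on the first line, then comma-separated rows. The function name `read_square` and the `max_cells` parameter are hypothetical.

```c
#include <stdio.h>

/* Read the side length, then side*side integers separated by commas
 * or newlines. Returns the side length, or -1 on malformed input or
 * if the values would not fit in max_cells. */
int read_square(FILE *fp, int *cells, int max_cells)
{
    int side;
    if (fscanf(fp, "%d", &side) != 1 || side <= 0 || side > max_cells / side)
        return -1;
    for (int i = 0; i < side * side; i++) {
        /* The leading space skips newlines; %*[,] skips a comma if one is there. */
        fscanf(fp, " %*[,]");
        if (fscanf(fp, "%d", &cells[i]) != 1)
            return -1;
    }
    return side;
}
```

Because `%d` converts the digits to an `int`, the array holds 4, 3, 8 and so on rather than the character codes 52, 51, 56.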

Why is there a space after the first letter?

I have this code:
void text(){
FILE *fp;
fp = fopen( "Text.txt", "w" );
if (fp == NULL)printf("not open\n");
char txt[100];
char c;
while(1){
char c = getche();
if (c == '\e')
break;
txt[0] = c;
gets(txt+1);
fprintf(fp, "%s ",txt);
}
fclose(fp);
}
It works fine and the output would be like: Hello there.
void text(){
char fname[20] ;
puts("Write file name");
scanf("%s",&fname); //input of the file name
char ext[5] = ".txt";
char fileSpec[strlen(fname)+strlen(ext)+1]; //file name assambly
snprintf( fileSpec, sizeof( fileSpec ), "%s%s", fname, ext);
FILE *fp;
fp = fopen( fileSpec, "w" );
if (fp == NULL)printf("not open\n");
char txt[100];
char c;
while(1){
char c = getche();
if (c == '\e')
break;
txt[0] = c;
gets(txt+1);
fprintf(fp, "%s ",txt);
}
fclose(fp);
}
I took the top half from a website, tested it separately, and it worked. The bottom half is the same, but when I put the two together the output looks like this: H ello there.
Why is that space there?
You didn't provide enough context (i.e. What was put into stdin? What was in the file?).
But your fprintf(fp, "%s ",txt); has a space in it, and that seems like the likely culprit. (Note that the while loop can run more than one time, thus adding spaces between what it outputs to the file).
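A plausible reason the loop runs one extra time in the second version (an assumption, since the thread does not confirm it): `scanf("%s", ...)` leaves the newline typed after the file name in the input buffer, so the first `gets(txt+1)` returns an empty line and only the first letter is written, followed by the space from the format string. A minimal sketch of discarding the rest of the line after `scanf`; the helper name `discard_line` is hypothetical.

```c
#include <stdio.h>

/* Consume characters up to and including the next newline (or EOF),
 * so that a later line-based read starts on a fresh line. */
void discard_line(FILE *in)
{
    int c;
    while ((c = fgetc(in)) != '\n' && c != EOF)
        ;
}
```

Under that assumption, calling `discard_line(stdin);` right after the `scanf` would let the first `gets` see the text typed by the user instead of the leftover newline.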

print the first letter of a file in C

Hello guys, I am having a problem printing the first two letters/characters of a .txt file which contains "need help". I would like to print the first two letters, "ne". I tried with ch[] but couldn't get it to work, so I changed it back to the version that works:
int main() {
char ch, file_name[2];
int i;
FILE *fp;
printf("Enter the name of file you wish to see\n");
gets(file_name);
fp = fopen(file_name,"r");
if( fp == NULL )
{
printf("Error while opening the file.\n");
exit(1);
}
printf("The contents of %s file are :\n", file_name);
while( ( ch = fgetc(fp) ) != EOF )
printf("%c",ch);
fclose(fp);
return 0;
}
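#include <stdio.h>
int main(void) {
    char ch[2];
    FILE *fp = fopen("file.txt", "r");
    if (fp == NULL)
        return 1;
    /* Read one item of 2 bytes; fread returns the number of complete items. */
    if (fread(ch, 2, 1, fp) == 1)
        printf("(%c%c) (%2.2s)", ch[0], ch[1], ch);
    fclose(fp);
    return 0;
}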
stdout :
(ne) (ne)
I don't know why you need only the two first letters, but here's how to do it.
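char file_name[256];
/* fgets() bounds the read to the buffer, unlike gets() */
if (fgets(file_name, sizeof file_name, stdin) != NULL) {
    file_name[strcspn(file_name, "\n")] = '\0'; /* drop the trailing newline */
    size_t length = strlen(file_name) > 2 ? 2 : strlen(file_name);
    for (size_t i = 0; i < length; i++)
        printf("%c", file_name[i]);
}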
One piece of advice for strings in C (char arrays): always make the array bigger than you need. It doesn't cost much memory, and it is safer to have more than enough. Standard functions like printf() look for the terminating null character, and that is what defines the length of your string.
This is what I have come up with so far. It prints the first two characters, but then it prints question marks inside squares underneath.
Here is the code:
int main() {
char ch[2], file_name[100];
int i;
FILE *fp;
printf("Enter the name of file you wish to see\n");
gets(file_name);
fp = fopen(file_name,"r");
if( fp == NULL )
{
printf("Error while opening the file.\n");
exit(1);
}
printf("The contents of %s file are :\n", file_name);
fscanf(fp, "%2s", ch);
printf("%s\n", ch);
while( ( ch[i] = fgetc(fp) ) != EOF ){
printf("%c",ch);
}
fclose(fp);
return 0;
}
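The stray characters are consistent with two problems in the code above: `ch` has no room for a terminating null byte (`%2s` stores two characters plus `'\0'`, which needs three bytes), and the loop passes the whole array to `%c` while indexing with an uninitialized `i`. A minimal sketch of reading the first two characters into a properly sized buffer; the function name `read_first_two` is hypothetical.

```c
#include <stdio.h>

/* Read up to two bytes from the start of the stream into buf and
 * null-terminate it, so buf can be printed with "%s".
 * Returns the number of bytes actually read (0, 1 or 2). */
size_t read_first_two(FILE *fp, char buf[3])
{
    size_t n = fread(buf, 1, 2, fp);
    buf[n] = '\0';
    return n;
}
```

With a `char buf[3]` in `main`, `printf("%s\n", buf)` after this call prints only what was read, because the terminator sits right behind it.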

Reading a UTF-16 CSV file by char

Currently I am trying to read a UTF-16 encoded CSV file char by char and convert each char into ASCII so I can process it. I later plan to change my processed data back to UTF-16, but that is beside the point right now.
I know right off the bat that I am doing this completely wrong, as I have never attempted anything like this before:
int main(void)
{
FILE *fp;
int ch;
if(!(fp = fopen("x.csv", "r"))) return 1;
while(ch != EOF)
{
ch = fgetc(fp);
ch = (wchar_t) ch;
ch = (char) ch;
printf("%c", ch);
}
fclose(fp);
return 0;
}
Wishfully thinking, I was hoping that this would work by magic for some reason, but that was not the case. How can I read a UTF-16 CSV file and convert it to ASCII? My guess is that since each UTF-16 char is two bytes (I think?), I'm going to have to read two bytes at a time from the file into a variable of some datatype which I am not sure of. Then I guess I will have to check the bits of this variable to make sure it is valid ASCII and convert it from there? I don't know how I would do this, though, and any help would be great.
You should use fgetwc. The below code should work in the presence of a byte-order mark, and an available locale named en_US.UTF-16.
#include <stdio.h>
#include <wchar.h>
#include <locale.h>
int main(void) {
setlocale(LC_ALL, "en_US.UTF-16");
FILE *fp = fopen("x.csv", "rb");
if (fp) {
int order = fgetc(fp) == 0xFE;
order = fgetc(fp) == 0xFF;
wint_t ch;
while ((ch = fgetwc(fp)) != WEOF) {
putchar(order ? ch >> 8 : ch);
}
putchar('\n');
fclose(fp);
return 0;
} else {
perror("opening x.csv");
return 1;
}
}
This is my solution, thanks to the comments under my original question. Since every character in the CSV file is valid ASCII, the solution was as simple as this:
int main(void)
{
    FILE *fp;
    int ch, i = 1;
    if(!(fp = fopen("x.csv", "r"))) return 1;
    /* read first, then test, so ch is never used uninitialized */
    while((ch = fgetc(fp)) != EOF)
    {
        if(i % 2) {
            //ch is valid ascii
        }
        i++; /* advance on every byte, not only on the odd ones */
    }
    fclose(fp);
    return 0;
}
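A minimal sketch of the two-bytes-at-a-time idea from the question, assuming the file contains only characters in the ASCII range and may start with a byte-order mark. Without a BOM it falls back to little-endian, which is an assumption. The function name `utf16_to_ascii` is hypothetical.

```c
#include <stdio.h>

/* Copy a UTF-16 stream to out as ASCII. Code units above 0x7F are
 * replaced with '?'. Returns the number of characters written. */
long utf16_to_ascii(FILE *in, FILE *out)
{
    int b0, b1;
    int big_endian = 0;   /* assumption: little-endian unless a BOM says otherwise */
    int first = 1;
    long written = 0;

    while ((b0 = fgetc(in)) != EOF && (b1 = fgetc(in)) != EOF) {
        unsigned unit;
        if (first) {
            first = 0;
            if (b0 == 0xFE && b1 == 0xFF) { big_endian = 1; continue; }
            if (b0 == 0xFF && b1 == 0xFE) { continue; }
        }
        /* Combine the two bytes into one 16-bit code unit. */
        unit = big_endian ? (unsigned)(b0 << 8 | b1) : (unsigned)(b1 << 8 | b0);
        fputc(unit <= 0x7F ? (int)unit : '?', out);
        written++;
    }
    return written;
}
```

The input should be opened with `"rb"` so that no byte is translated on the way in. Characters outside the ASCII range, including surrogate pairs, come out as `?` in this sketch.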
