Here is a minimal "working" example:
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char* argv[])
{
int num = 10;
FILE* fp = fopen("test.txt", "r"); // test.txt contains character sequence
char* ptr = (char*) malloc(sizeof (char)*(num+1)); // +1 for '\0'
fread(ptr, sizeof(char), num, fp); // read bytes from file
ptr[num] = '\0';
printf("%s\n", ptr); // output: ´╗┐abcdefg
free(ptr);
fclose(fp);
return 0;
}
I would like to read some letters from a text file, containing all letters from the alphabet in a single line. I want my array to store the first 10 letters, but the first 3 shown in the output are weird symbols (see the comment at the printf statement).
What am I doing wrong?
The issue is that your file is encoded as UTF-8 with a byte order mark. UTF-8 is backwards-compatible with ASCII (which is what your code assumes), but only for the ASCII characters themselves.
In particular, many programs put a BOM (Byte Order Mark) at the start of the file to mark it as Unicode text. In UTF-8 the BOM is the three bytes EF BB BF, and if you print those bytes using the default Windows console code page, you get the three symbols you saw.
Whatever program you used to create your text file was automatically inserting that BOM at the start of the file. Notepad++ is notorious for doing this. Check the save options and make sure to save either as plain ASCII or as UTF-8 without BOM. That will solve your problem.
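If re-saving the file is not an option, the BOM can also be skipped in code. The following is a minimal sketch, not part of the original answer; skip_utf8_bom is a hypothetical helper name, and it assumes the stream was just opened and is seekable.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: skip a UTF-8 BOM (bytes EF BB BF) if the stream
   starts with one. Returns 1 if a BOM was skipped, 0 otherwise. */
static int skip_utf8_bom(FILE *fp)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };
    unsigned char head[3];
    size_t n = fread(head, 1, sizeof head, fp);

    if (n == sizeof head && memcmp(head, bom, sizeof bom) == 0)
        return 1;   /* BOM consumed; the stream now points at the real text */

    rewind(fp);     /* no BOM: go back to the start of the file */
    return 0;
}
```

Calling skip_utf8_bom(fp) right after a successful fopen() and before the fread() would leave the first 10 letters in the buffer whether or not the file has a BOM.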
I'm trying to open and read a CSV file that holds data that is needed to set capabilities. I need to set the capabilities based off what capabilities are read in from the CSV file. I open the file and read the data into a buffer. That's where I am stuck. I now want to use that data in the buffer to make string or character comparison to enter if else statements. For example my csv file looks like this:
1000, CAP_SETPCAP, CAP_NET_RAW, CAP_SYS_ADMIN
The first number is the euid and the rest are the capabilities that I want to set for a process. When I read it into a buffer the buffer holds ASCII decimals. I want to be able to convert the buffer into a string or array so I can make comparisons.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define BUFFER_SIZE 1024
int main(int argc, char *argv[]){
FILE *fp;
fp = fopen("csvtest.csv", "r");
char buff[BUFFER_SIZE];
char *newCaps[] = {};
if(!fp){
printf("did not open\n");
}
fgets(buff, 1024, fp);
int i = 0;
//I don't know how to just get the size of what was put into the buffer
while(i < size_of_buffer){
//these are the comparisons I'd like to make, I know this isn't right
if(buffer[1000] == "1000"){
printf("This is the correct euid\n");
newCaps[0] = buffer[CAP_SETPCAP];
newCaps[1] = buffer[CAP_NET_RAW];
newCaps[2] = buffer[CAP_SYS_ADMIN];
}
i++;
}
Something along these lines.
You may want to take a look at the strtok() function. It splits a string into tokens according to a set of delimiters you give it, and returns a pointer to each token in turn, which you can then compare to another string with strcmp().
It is a good tool for separating and handling standardized delimited information.
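As a rough sketch of how strtok() could be applied to the CSV line from the question: split_fields and MAX_FIELDS are illustrative names that do not appear in the original code, and the delimiter set (comma, space, newline) is an assumption based on the sample line.

```c
#include <string.h>

#define MAX_FIELDS 8   /* illustrative limit, not from the original question */

/* Splits `line` in place on commas, spaces and newlines.
   Stores pointers to the tokens in fields[] and returns how many were found. */
static int split_fields(char *line, char *fields[], int max_fields)
{
    int count = 0;
    char *tok = strtok(line, ", \n");

    while (tok != NULL && count < max_fields) {
        fields[count++] = tok;          /* each token points into `line` */
        tok = strtok(NULL, ", \n");     /* NULL means "continue with the same string" */
    }
    return count;
}
```

After the call, a test such as strcmp(fields[0], "1000") == 0 checks the euid. Note that strtok() modifies the buffer it is given, so the line has to be a writable array such as the one filled by fgets().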
I'm getting some issues with reading the content of my array. I'm not sure if I'm storing it correctly as my result for every line is '1304056712'.
#include <stdio.h>
#include <stdlib.h>
#define INPUT "Input1.dat"
int main(int argc, char **argv) {
int data_index, char_index;
int file_data[1000];
FILE *file;
int line[5];
file = fopen(INPUT, "r");
if(file) {
data_index = 0;
while(fgets(line, sizeof line, file) != NULL) {
//printf("%s", line); ////// the line seems to be ok here
file_data[data_index++] = line;
}
fclose(file);
}
int j;
for(j = 0; j < data_index; j++) {
printf("%i\n", file_data[j]); // when i display data here, i get '1304056712'
}
return 0;
}
I think you need to say something like
file_data[data_index++] = atoi(line);
From your results I assume the file is a plain-text file.
You cannot simply read a line from the file (a string, i.e. an array of characters) into an array of integers; this will not work. When you pass a pointer for data to be written through (as you do by passing line to fgets()), no conversion is done. Instead, you should read the line into an array of char and then convert it to an integer using sscanf(), atoi() or some other function of your choice.
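A minimal sketch of that read-then-convert approach, assuming the file holds one integer per line. read_ints is a hypothetical helper, and strtol() is used here instead of atoi() only because it makes a failed conversion detectable.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one integer per line from fp into out[], storing at most `max` values.
   Lines that do not start with a number are skipped. Returns the count stored.
   In this sketch, a line longer than the buffer is seen as several pieces. */
static size_t read_ints(FILE *fp, int out[], size_t max)
{
    char line[64];      /* char buffer: fgets() writes text, not integers */
    size_t count = 0;

    while (count < max && fgets(line, sizeof line, fp) != NULL) {
        char *end;
        long value = strtol(line, &end, 10);
        if (end != line)                /* at least one digit was converted */
            out[count++] = (int) value;
    }
    return count;
}
```

The count < max test also keeps the loop from writing past the end of the destination array.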
fgets reads newline terminated strings. If you're reading binary data, you need fread. If you're reading text, you should declare line as an array of char big enough for the longest line in the file.
file_data is declared as an array of int, so file_data[data_index] is a single integer. It is being assigned a pointer (the base address of the int line[5] buffer), which is why the same large number is printed for every line. If reading binary data, file_data should be an array of integers. If storing strings, it should be an array of strings, i.e. char pointers, like char * file_data[1000], with each line copied into its own storage.
You also need to initialize data_index = 0 outside the if (file) ... block, because the output loop reads it even if the file failed to open. And while looping and storing input, the loop should test that it has not reached the size of the array being stored into.
I wrote a program to test writing a char[128] array to file using write() function in C. The following is my code, however, after writing, I can see that the string "testseg" is followed by a "d" or "È" in the testFile.txt file. Is this a proper way of writing char[] array to file?
int main()
{
char pathFile[MAX_PATHNAME_LEN];
sprintf(pathFile, "testFile.txt");
int filedescriptor = open(pathFile, O_RDWR | O_APPEND | O_CREAT, 0777);
int num_segs = 10;
int mods = 200;
const char *segname = "testseg"; /* */
char real_segname[128];
strcpy(real_segname, segname);
write(filedescriptor, &num_segs, sizeof(int));
write(filedescriptor, real_segname, strlen(real_segname));
printf("real_segname length is %d \n", (int) strlen(real_segname));
write(filedescriptor, &mods, sizeof(int));
close(filedescriptor);
return 0;
}
"...writing a char[128] array to file ... I can see that the string "testseg" ..." is a contradiction.
In C, a string is an array of char up to and including a terminating '\0', whereas
a char[128] is a fixed 128 char in length.
When code does write(filedescriptor, real_segname, strlen(real_segname));, it does neither. It is not writing a C string, i.e. the 7 char of "testseg" terminated with a '\0'. Instead it just wrote the 7 char and no terminating '\0'. Neither did it write 128 char.
One could instead perform write(filedescriptor, real_segname, strlen(real_segname)+1); to write the 7 char and the terminating '\0'. Or write the length and then the interesting parts of the array. Or write the entire 128 char array. You need to identify how you want to read the data back, and your other coding goals, before anyone can advise well.
As @SGG suggests, the unusual characters are simply the result of write(filedescriptor, &mods, sizeof(int)); and are not part of your unterminated array.
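To illustrate the "write the length and then the interesting parts" option, here is a minimal POSIX sketch. write_string and read_string are hypothetical helpers, the 32-bit length prefix is an arbitrary choice, and the length is stored in the host's byte order, so the file is only portable between similar machines.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Writes a 32-bit length followed by that many bytes of `s` (no '\0').
   Returns 0 on success, -1 if any write fails or is short. */
static int write_string(int fd, const char *s)
{
    uint32_t len = (uint32_t) strlen(s);

    if (write(fd, &len, sizeof len) != (ssize_t) sizeof len)
        return -1;
    if (write(fd, s, len) != (ssize_t) len)
        return -1;
    return 0;
}

/* Reads a string written by write_string() into buf and terminates it.
   Returns 0 on success, -1 on error or if buf is too small. */
static int read_string(int fd, char *buf, size_t cap)
{
    uint32_t len;

    if (read(fd, &len, sizeof len) != (ssize_t) sizeof len)
        return -1;
    if (len >= cap)                     /* need room for the '\0' as well */
        return -1;
    if (read(fd, buf, len) != (ssize_t) len)
        return -1;
    buf[len] = '\0';
    return 0;
}
```

With this layout the reader never has to guess where the string ends, and the integers written before and after it cannot be mistaken for part of the text.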
after writing, I can see that the string "testseg" is followed by a "d" or "È" in the testFile.txt file
Why is it showing "d" or "È"?
Try only the write call below (in your code, comment out the other write calls and keep this one):
write(filedescriptor, &mods, sizeof(int));
Now look at the contents of testFile.txt (cat testFile.txt). It shows some junk value(s).
That is because a text viewer interprets every byte of the file as a character. The strings and characters you write are already character data, so they read back without any problem. mods and num_segs, however, are written as raw binary integers and then displayed as if they were text, which is where the junk values come from.
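A small sketch that makes those raw bytes visible; int_to_bytes and dump_int_bytes are hypothetical helpers. For mods = 200, one of the bytes is 0xC8, which is the character È in Latin-1, and the remaining bytes are zero. Which position the 0xC8 lands in depends on the machine's byte order.

```c
#include <stdio.h>
#include <string.h>

/* Copies the in-memory bytes of `value` into out[], which must hold
   sizeof(int) bytes. These are the bytes that
   write(fd, &value, sizeof(int)) would put in the file. */
static void int_to_bytes(int value, unsigned char *out)
{
    memcpy(out, &value, sizeof value);
}

/* Prints those bytes in hex so they can be compared with what a
   text viewer shows. */
static void dump_int_bytes(int value)
{
    unsigned char bytes[sizeof(int)];

    int_to_bytes(value, bytes);
    for (size_t i = 0; i < sizeof bytes; i++)
        printf("%02X ", bytes[i]);
    printf("\n");
}
```

If the numbers are meant to be readable in a text editor, formatting them as text first (for example with snprintf() and "%d") and writing that string is the usual alternative.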
Is this a proper way of writing char[] array to file?
Yes, according to the man pages you are calling write() correctly. Please do check the return value of each write() call, though. What to write, and where in the file to write it, depends on your requirements.
I am completely new to C and need help with this badly.
I'm reading a file with fopen(), then obtaining its contents using fgetc(). What I want to know is how I can access the line fgetc() returns so that I can put the 4th - 8th characters into a char array. Below is an example I found online, but I am having a hard time parsing the data it returns. I still don't have a firm understanding of C and don't get how an int can be used to store a line of characters.
FILE *fr;
fr = fopen("elapsed.txt", "r");
int n = fgetc(fr);
while(n!= EOF){
printf("%c", n);
n = fgetc(fr);
} printf("\n");
Here is one approach:
1 open the file
2 get the size of the file
3 allocate a buffer of that size (plus one byte for the terminator)
4 read the data from the file into the buffer
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
int main(void)
{
    struct stat stp = { 0 };
    /* stat() returns information about a file; no permissions are required on the file itself */
    if (stat("elapsed.txt", &stp) != 0) {
        perror("stat");
        return 1;
    }
    FILE *fr = fopen("elapsed.txt", "r");
    if (fr == NULL) {
        perror("fopen");
        return 1;
    }
    size_t filesize = (size_t) stp.st_size;
    /* allocate the buffer, with one extra byte for the terminating '\0' */
    char *message = malloc(filesize + 1);
    if (message == NULL) {
        fclose(fr);
        return 1;
    }
    /* fread() returns the number of items read; it never returns -1 */
    size_t nread = fread(message, 1, filesize, fr);
    if (ferror(fr)) {
        printf("\nerror in reading\n");
        fclose(fr);
        free(message);
        return 1;
    }
    message[nread] = '\0'; /* terminate the buffer so that %s is safe */
    printf("\n\tEntered Message for Encode is = %s", message);
    fclose(fr);
    free(message);
    return 0;
}
PS: stat() needs #include <sys/stat.h>.
You're not retrieving a line with fgetc. You are retrieving one character at a time from the file. That sample keeps retrieving characters until fgetc returns EOF, a special value that signals end of file (it is not a character stored in the file). Look at this description of fgetc.
http://www.cplusplus.com/reference/clibrary/cstdio/fgetc/
On each iteration of the while loop, fgetc retrieves a single character and places it into the variable "n". It helps to think of a "character" in C as just one byte, rather than an actual character. What you're missing is that an int is typically 4 bytes and a char is 1 byte, but both can hold the same value for the same ASCII character; the only difference is the size of the variable. fgetc returns an int so that EOF can be told apart from every valid character value.
The sample you have above shows a printf with "%c", which means to take the value in "n" and print it as a character.
http://www.cplusplus.com/reference/clibrary/cstdio/printf/
You can use a counter in the while loop to keep track of your position and pick out the 4th through 8th characters from the file. You should also think about what happens if the input file is shorter than that.
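A minimal sketch of the counter idea; copy_range is a hypothetical helper, and positions are counted from 1 to match the wording "4th through 8th".

```c
#include <stdio.h>

/* Copies characters number `first` through `last` (1-based, inclusive) from
   fp into out and terminates it. out must hold at least last - first + 2
   bytes. Returns the number of characters copied, which is smaller than
   requested if the file is short. */
static size_t copy_range(FILE *fp, size_t first, size_t last, char *out)
{
    size_t pos = 0;      /* how many characters have been read so far */
    size_t copied = 0;
    int c;               /* int, not char, so EOF can be told apart from data */

    while (pos < last && (c = fgetc(fp)) != EOF) {
        pos++;
        if (pos >= first)
            out[copied++] = (char) c;
    }
    out[copied] = '\0';
    return copied;
}
```

For the case in the question, a char buf[6] and a call to copy_range(fr, 4, 8, buf) would be enough. The return value tells the caller whether all five characters were actually present.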
Hope that helps.
OK, look at it as box sizes. Say I have a 30cm x 30cm box that can hold one of my foam letters. Now I am calling a function that 'could' return a 60cm x 60cm letter but is 99% likely to return a 30cm x 30cm letter, because I know what it's reading. I know that if I give it a 60cm x 60cm box, the result will always fit without surprises.
But if I am sure that the result will always be a 30cm x 30cm letter, then I know I can convert the result of a function that returns a 60cm x 60cm box without losing anything.
I have a binary file with variable length record that looks about like this:
12 economic10
13 science5
14 music1
15 physics9
16 chemistry9
17 history2
18 anatomy7
19 physiology7
20 literature3
21 fiction3
16 chemistry7
14 music10
20 literature1
The name of the course is the only variable length record in the file, the first number is the code of the course and it can be a number between 1 and 9999 and the 2nd number is the department and it can be between 1 and 10.
As you see in the file there is no space between the name of the course and the department number.
The question is how can I read from the binary file? There is no field in the file to tell me what is the size of the string which is the course name..
I can read the first int (course id) fine, but how do I know what is the size of the name of the course?
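Use fscanf() with the format string "%u %63[a-z]%u", and keep looping for as long as it converts all three fields.
Here's a complete example program:
#include <stdio.h>
#define NAME_MAX 64
int main(int argc, char ** argv)
{
    FILE * file = fopen("foo.txt", "rb");
    unsigned int course, department;
    char name[NAME_MAX];
    if (file == NULL)
    {
        perror("foo.txt");
        return 1;
    }
    // the width 63 leaves room for the '\0' in name[NAME_MAX];
    // stop as soon as a record fails to parse, not only at EOF
    while(fscanf(file, "%u %63[a-z]%u", &course, name, &department) == 3)
    {
        // do stuff with records
        printf("%u-%u %s\n", department, course, name);
    }
    fclose(file);
    return 0;
}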
You'd need to know how the file was written out in the first place.
To read variable-length records, you need some sort of convention. For example, a special character that indicates the end of a record. Inside every record, you could use another special character to indicate the end of a field.
DO_READ read from file
is END_OF_RECORD char present?
yes: GOTO DO_PROCESS
no : GOTO DO_READ
DO_PROCESS read into buffer
is END_OF_FILE mark present?
yes: GOTO DOSOMETHINGWITHIT
no: GOTO DO_PROCESS
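The convention described above could be sketched in C as follows. read_record is a hypothetical helper, the delimiter is passed in by the caller because the real file's convention is not known, and buf is assumed to hold at least 2 bytes.

```c
#include <stdio.h>

/* Reads bytes from fp into buf until `delim` or end of file.
   The delimiter is consumed but not stored; buf is '\0'-terminated.
   Bytes that do not fit in buf are discarded.
   Returns the number of bytes stored, or -1 if no more records remain. */
static long read_record(FILE *fp, char *buf, size_t cap, int delim)
{
    size_t len = 0;
    int c;

    while ((c = fgetc(fp)) != EOF && c != delim) {
        if (len + 1 < cap)          /* keep one byte free for the '\0' */
            buf[len++] = (char) c;
    }
    if (c == EOF && len == 0)
        return -1;                  /* nothing left to read */
    buf[len] = '\0';
    return (long) len;
}
```

Calling it in a loop until it returns -1 yields one record per call. The same helper could be reused with a different delimiter to split fields inside a record.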
As others have said this looks a lot like text, so a text parsing approach is likely to be the right way to go. Since this is homework, I'm not going to code it out for you, but here's the general approach I'd take:
Using fscanf, read the course code, and a combined string with the name and department code.
Starting from the end of the combined string, go backwards until you find the first non-digit. This is the end of the course name.
Read the integer starting just beyond the end of the course name (ie, the last digit we find scanning backwards).
Replace the first character of that integer's part of the string with a NUL ('\0') - this terminates the combined string immediately after the course name. So all we have left in the combined string is the course name, and we have the course code and department code in integer variables.
Repeat for the next line.
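The backwards scan in the steps above can be sketched like this. split_name_dept is a hypothetical helper that handles only the combined field, and it assumes course names never end in a digit, since otherwise the split point is ambiguous.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Splits a combined field such as "music10" in place.
   On success the string is cut down to the name ("music"), the trailing
   number is stored in *dept, and 0 is returned. Returns -1 if the field
   has no trailing digits or consists only of digits. */
static int split_name_dept(char *combined, int *dept)
{
    size_t i = strlen(combined);

    /* walk backwards over the trailing digits */
    while (i > 0 && isdigit((unsigned char) combined[i - 1]))
        i--;

    if (i == 0 || combined[i] == '\0')
        return -1;

    *dept = atoi(combined + i);   /* the digits start at index i */
    combined[i] = '\0';           /* terminate the name where the digits began */
    return 0;
}
```

Because the scan starts from the end of the field, music1 and music10 come out as department 1 and department 10 respectively.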
If there is a one to one correspondence between course code and course name (including department code), you can deduce the size of the course name from its code, with a predefined table somewhere in the code or in a configuration file.
If not, the main problem I see is to discriminate things like music1 and music10.
Assuming there are no carriage returns and each string is null terminated.
I have written a little program to create a binary file and then read it back, producing similar output.
// binaryFile.cpp
#include "stdafx.h"
#include <stdio.h>
#include <string.h>
#define BUFSIZE 64
int _tmain(int argc, _TCHAR* argv[])
{
FILE *f;
char buf[BUFSIZE+1];
// create dummy bin file
f = fopen("temp.bin","wb");
if (f)
{ // not writing all the data, just a few examples
sprintf(buf,"%04d%s",12,"economic10"); fwrite(buf,sizeof(char),strlen(buf)+1,f); // +1 writes the terminating '\0'
sprintf(buf,"%04d%s",13,"science5"); fwrite(buf,sizeof(char),strlen(buf)+1,f);
sprintf(buf,"%04d%s",14,"music1"); fwrite(buf,sizeof(char),strlen(buf)+1,f);
sprintf(buf,"%04d%s",15,"physics9"); fwrite(buf,sizeof(char),strlen(buf)+1,f);
fclose(f);
}
// read dummy bin file
f = fopen("temp.bin","rb");
if (f)
{
int classID;
char str[64];
char *pData;
long offset = 0;
do
{
fseek(f,offset,SEEK_SET);
pData = fgets(buf,BUFSIZE,f);
if (pData)
{ sscanf(buf,"%4d%63s",&classID,str); // str is already a char*; 63 leaves room for '\0'
printf("%d\t%s\r\n",classID,str);
offset +=strlen(pData)+1; // record + 1 null character
}
} while(pData);
fclose(f);
}
getchar();
return 0;
}