EDIT: I think the problem was that I had put my 2 text files on the desktop. After I moved them to the same folder as the source file, the files could be opened. But now the program crashes, and the line:
coK = 0;
shows "exception thrown".
// end EDIT
I have a school assignment to write a C program and to create 2 text files: one file stores 25 keywords, and the other stores a fake resume. The problem is that my program cannot read my keywords.txt file. Can anyone help me? Thank you so much.
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//*****************************************************
// MAIN FUNCTION
int main()
{
//File pointer for resume.txt file declared
FILE *fpR;
//File pointer for keywords.txt file declared and open it in read mode
FILE* fpK = fopen("keywords.txt", "r");
//To store character extracted from keyword.txt file
char cK;
//To store character extracted from resume.txt file
char cR;
//To store word extracted from keyword.txt file
char wordK[50];
//To store word extracted from resume.txt file
char wordR[50];
//To store the keywords
char keywords[10][50];
//To store the keywords counter and initializes it to zero
int keywordsCount[10] = { 0, 0, 0, 0 };
int coK, coR, r, r1;
coK = coR = r = r1 = 0;
//Checks if file is unable to open then display error message
if (fpK == NULL)
{
printf("Could not open files");
exit(0);
}//End of if
//Extracts a character from keyword.txt file and stores it in cK variable and Loops till end of file
while ((cK = fgetc(fpK)) != EOF)
{
//Checks if the character is comma
if (cK != ',')
{
//Store the character in wordK coK index position
wordK[coK] = cK;
//Increase the counter coK by one
coK++;
}//End of if
//If it is comma
else
{
//Stores null character
wordK[coK] = '\0';
//Copies the wordK to the keywords r index position and increase the counter r by one
strcpy(keywords[r++], wordK);
//Re initializes the counter to zero for next word
coK = 0;
fpR = fopen("resume.txt", "r");
//Extracts a character from resume.txt file and stores it in cR variable and Loops till end of file
while ((cR = fgetc(fpR)) != EOF)
{
//Checks if the character is space
if (cR != ' ')
{
//Store the character in wordR coR index position
wordR[coR] = cR;
//Increase the counter coR by one
coR++;
}//End of if
else
{
//Stores null character
wordR[coR] = '\0';
//Re initializes the counter to zero for next word
coR = 0;
//Compares word generated from keyword.txt file and word generated from resume.txt file
if (strcmp(wordK, wordR) == 0)
{
//If both the words are same then increase the keywordCounter arrays r1 index position value by one
keywordsCount[r1] += 1;
}//End of if
}//End of else
}//End of inner while loop
//Increase the counter by one
r1++;
//Close the file for resume
fclose(fpR);
}//End of else
}//End of outer while loop
//Close the file for keyword
fclose(fpK);
//Display the result
printf("\n Result \n");
for (r = 0; r < r1; r++)
printf("\n Keyword: %s %d time available", keywords[r], keywordsCount[r]);
}//End of main
I think the problem is with the text files, isn't it?
The name of my 1st test file is "keywords.txt", and its content is:
Java, CSS, HTML, XHTML, MySQL, College, University, Design, Development, Security, Skills, Tools, C, Programming, Linux, Scripting, Network, Windows, NT
The name of my 2nd test file is "resume.txt", and its content is:
Junior Web developer able to build a Web presence from the ground up -- from concept, navigation, layout, and programming to UX and SEO. Skilled at writing well-designed, testable, and efficient code using current best practices in Web development. Fast learner, hard worker, and team player who is proficient in an array of scripting languages and multimedia Web tools. (Something like this).
I don't see any problem with these 2 files. But my program still cannot open the file and the output keeps showing "Could not open files".
while ((cK = fgetc(fpK)) != EOF)
If you check the documentation, you can see that fgetc returns an int. But since cK is a char, you force a conversion to char, which can change the value. You then compare the possibly changed value to EOF, which is not correct. You need to compare the int that fgetc returns to EOF, because fgetc signals end of file by returning EOF. Declaring cK (and cR) as int fixes this.
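A minimal sketch of that fix, assuming nothing else about the program changes: the result of fgetc is kept in an int, so EOF stays distinguishable from every byte the file can contain. The helper name count_chars is hypothetical and exists only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters in a stream.
   The result of fgetc is stored in an int, not a char, so that
   EOF (a negative value) cannot be confused with a byte such as 0xFF. */
long count_chars(FILE *fp)
{
    int c;       /* int, because fgetc returns int */
    long n = 0;

    while ((c = fgetc(fp)) != EOF) {
        n++;     /* convert c to char here only if a char is needed */
    }
    return n;
}
```

With a char variable, a 0xFF byte in the file can compare equal to EOF on platforms where char is signed, and the loop may never end on platforms where char is unsigned.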
I'm developing some code in C that reads a file extension and stores it as a code in a byte, together with a flag that says whether a text file or a binary file is being processed. Later I wish to recover the file extension that is encoded in the byte.
As a test I created a loop in the main function where I can test out the function fileExtenCode(), which is in the second listing.
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#define EXLEN 9
#define EXNUM 8
typedef unsigned char BYTE;
bool fileExtenCode(char*, BYTE*, int*);
int main(void) {
char fileExten[EXLEN];
BYTE code;
int bin;
for (;;) {
printf("Type file extension: ");
scanf_s("%s", fileExten, EXLEN);
if (fileExten[0] == '.') break;
printf("%s\n", fileExten);
code = 0;
bin = 0;
bool extFound = fileExtenCode(fileExten, &code, &bin); // <== (1)
if (extFound) printf("Extension found: TRUE\n");
else printf("Extension found: FALSE\n");
printf("%s%d", "Code: ", code);
if (bin) printf(" binary file\n");
else printf(" text file\n");
printf("\n");
printf("Type code: ");
int icode;
scanf_s("%d", &icode);
code = icode;
bin = -1;
fileExtenCode(fileExten, &code, &bin); // <== (2)
printf("%s", fileExten); // <== (5)
printf("\n");
}
return 0;
}
The function that I'm trying to test is as follows:
bool fileExtenCode(char* ext, BYTE* code, int* binary) {
char *fileEx[EXNUM] = {
"jpg1", "txt0", "html0", "xml0", "exe1", "bmp1", "gif1", "png1"};
if (*binary < 0) { // <== (3)
ext = fileEx[*code]; // <== (4)
return true;
}
size_t extLen = strlen(ext);
for (BYTE i = 0; i < EXNUM; i++) {
if (strncmp(fileEx[i], ext, extLen) == 0) {
*binary = (fileEx[i][extLen] == '1') ? 1 : 0;
*code = i;
return true;
}
}
return false;
}
The idea is that you pass a string with the file extension to fileExtenCode() in statement (1) in main, and the function searches for that extension in an array. If it is found, the function returns true, with the code argument giving the position in the array of file extensions and the binary flag set to 0 or 1 to indicate whether the file is text or binary. A '0' or '1' immediately follows each file extension in the array. If the extension is not found, the function returns false and the values returned in the arguments have no meaning.
So far so good, and this part works correctly. However, in using the function in reverse to recover the file extension given the input value of code, it fails when called with statement (2) in main. In this case binary is set to -1, and then the function is called and the condition at (3) is now true and ext in (4) recovers the file extension. This is confirmed when inserting a temporary print statement immediately after (4), but this value is not returned in (5) back in main, and an old input value is instead printed.
Obviously there is a problem with pointers, but I cannot see an obvious way of fixing it. My question is how to correct this without messing up the rest of the code, which is working correctly? Note that char* ext and BYTE* code are used for both input and output, whilst int* binary is used as an input flag and returns no useful value when set to -1.
Once this problem is fixed, then it should be relatively easy to separate the binary flag from the extension when the binary flag is set to -1. Eventually I plan to have many more file extensions, but not until this is working correctly with a sample of 8.
Getting help in fixing this problem would be most appreciated.
OK, many thanks pmg, that works, except that I have to use:
strcpy_s(ext, EXLEN, fileEx[*code]);
as the Visual Studio 2022 compiler flags an error. This also solves a warning I was getting when I declared the array *fileEx[EXNUM] with the const keyword.
In my haste last night I omitted to include the statement:
if (*code >= EXNUM) return false;
immediately after (3) to trap the case when *code goes out of bounds of *fileEx[EXNUM].
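For readers without strcpy_s, the same idea can be sketched with standard functions only. This is an illustration under assumptions, not the code from the answer: the helper name codeToExten and its extSize parameter are hypothetical. The point is that assigning to the parameter ext only changes the function's local copy of the pointer, so the characters have to be copied into the caller's buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define EXLEN 9
#define EXNUM 8
typedef unsigned char BYTE;

/* Hypothetical helper: copies the table entry for `code` into the
   caller's buffer `ext`, which holds `extSize` bytes.
   Returns false if the code is out of range or the entry does not fit. */
bool codeToExten(BYTE code, char *ext, size_t extSize)
{
    static const char *const fileEx[EXNUM] = {
        "jpg1", "txt0", "html0", "xml0", "exe1", "bmp1", "gif1", "png1"};

    if (code >= EXNUM) return false;        /* code outside the table */

    size_t len = strlen(fileEx[code]);
    if (len + 1 > extSize) return false;    /* no room for text plus '\0' */

    memcpy(ext, fileEx[code], len + 1);     /* copy including the '\0' */
    return true;
}
```

In main, the call at (2) would then look like codeToExten(code, fileExten, sizeof fileExten). As in the original table, the copied string still ends with the '0' or '1' flag digit.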
I'm trying to program an HMI console to read a file from a USB pen drive and display its data on the screen. This is a CSV file, and the objective is to store the parsed data in the HMI console memory, which the console later interprets. The macros on these consoles run in C (not C++).
I have no issue with reading or parsing the file. The issue is that the existing function for writing to the console memory (not accessible to me, shown below) only interprets char.
int WriteLocal( const char *type, int addr, int nRegs, void *buf , int flag );
Parameter: type is the string of "LW","LB" etc;
address is the Operation address ;
nRegs is the length of read or write ;
buf is the buffer which store the reading or writing data
flag is 0,then codetype is BIN,is 1 then codetype is BCD;
return value : 1 , Operation success
0 , Operation fail.
As my luck would have it I need to write integer values. What are available to me are the variables for each memory position. These are preexisting and are named individually such as:
int WR_LW200;
int WR_LW202;
int WR_LW204;
...
int WR_LW20n;
Ideally we could have an array with all the names of the variables, but unfortunately this is not possible. I could write out every single variable manually, but I need to do 300 of these…
There must be a better way, right?
Just to give you a look on how it ended up looking:
int* arr[50][5] = { {&WR_LW200, &WR_LW400, &WR_LW600, &WR_LW800, &WR_LW1000},
{&WR_LW202, &WR_LW402, &WR_LW602, &WR_LW802, &WR_LW1002},
{&WR_LW204, &WR_LW404, &WR_LW604, &WR_LW804, &WR_LW1004},
{&WR_LW206, &WR_LW406, &WR_LW606, &WR_LW806, &WR_LW1006},
{&WR_LW208, &WR_LW408, &WR_LW608, &WR_LW808, &WR_LW1008},
{&WR_LW210, &WR_LW410, &WR_LW610, &WR_LW810, &WR_LW1010},
{&WR_LW212, &WR_LW412, &WR_LW612, &WR_LW812, &WR_LW1012},
{&WR_LW214, &WR_LW414, &WR_LW614, &WR_LW814, &WR_LW1014},
{&WR_LW216, &WR_LW416, &WR_LW616, &WR_LW816, &WR_LW1016},
{&WR_LW218, &WR_LW418, &WR_LW618, &WR_LW818, &WR_LW1018},
{&WR_LW220, &WR_LW420, &WR_LW620, &WR_LW820, &WR_LW1020},
{&WR_LW222, &WR_LW422, &WR_LW622, &WR_LW822, &WR_LW1022},
{&WR_LW224, &WR_LW424, &WR_LW624, &WR_LW824, &WR_LW1024},
{&WR_LW226, &WR_LW426, &WR_LW626, &WR_LW826, &WR_LW1026},
{&WR_LW228, &WR_LW428, &WR_LW628, &WR_LW828, &WR_LW1028},
{&WR_LW230, &WR_LW430, &WR_LW630, &WR_LW830, &WR_LW1030},
{&WR_LW232, &WR_LW432, &WR_LW632, &WR_LW832, &WR_LW1032},
{&WR_LW234, &WR_LW434, &WR_LW634, &WR_LW834, &WR_LW1034},
{&WR_LW236, &WR_LW436, &WR_LW636, &WR_LW836, &WR_LW1036},
{&WR_LW238, &WR_LW438, &WR_LW638, &WR_LW838, &WR_LW1038},
{&WR_LW240, &WR_LW440, &WR_LW640, &WR_LW840, &WR_LW1040},
{&WR_LW242, &WR_LW442, &WR_LW642, &WR_LW842, &WR_LW1042},
{&WR_LW244, &WR_LW444, &WR_LW644, &WR_LW844, &WR_LW1044},
{&WR_LW246, &WR_LW446, &WR_LW646, &WR_LW846, &WR_LW1046},
{&WR_LW248, &WR_LW448, &WR_LW648, &WR_LW848, &WR_LW1048},
{&WR_LW250, &WR_LW450, &WR_LW650, &WR_LW850, &WR_LW1050},
{&WR_LW252, &WR_LW452, &WR_LW652, &WR_LW852, &WR_LW1052},
{&WR_LW254, &WR_LW454, &WR_LW654, &WR_LW854, &WR_LW1054},
{&WR_LW256, &WR_LW456, &WR_LW656, &WR_LW856, &WR_LW1056},
{&WR_LW258, &WR_LW458, &WR_LW658, &WR_LW858, &WR_LW1058},
{&WR_LW260, &WR_LW460, &WR_LW660, &WR_LW860, &WR_LW1060},
{&WR_LW262, &WR_LW462, &WR_LW662, &WR_LW862, &WR_LW1062},
{&WR_LW264, &WR_LW464, &WR_LW664, &WR_LW864, &WR_LW1064},
{&WR_LW266, &WR_LW466, &WR_LW666, &WR_LW866, &WR_LW1066},
{&WR_LW268, &WR_LW468, &WR_LW668, &WR_LW868, &WR_LW1068},
{&WR_LW270, &WR_LW470, &WR_LW670, &WR_LW870, &WR_LW1070},
{&WR_LW272, &WR_LW472, &WR_LW672, &WR_LW872, &WR_LW1072},
{&WR_LW274, &WR_LW474, &WR_LW674, &WR_LW874, &WR_LW1074},
{&WR_LW276, &WR_LW476, &WR_LW676, &WR_LW876, &WR_LW1076},
{&WR_LW278, &WR_LW478, &WR_LW678, &WR_LW878, &WR_LW1078},
{&WR_LW280, &WR_LW480, &WR_LW680, &WR_LW880, &WR_LW1080},
{&WR_LW282, &WR_LW482, &WR_LW682, &WR_LW882, &WR_LW1082},
{&WR_LW284, &WR_LW484, &WR_LW684, &WR_LW884, &WR_LW1084},
{&WR_LW286, &WR_LW486, &WR_LW686, &WR_LW886, &WR_LW1086},
{&WR_LW288, &WR_LW488, &WR_LW688, &WR_LW888, &WR_LW1088},
{&WR_LW290, &WR_LW490, &WR_LW690, &WR_LW890, &WR_LW1090},
{&WR_LW292, &WR_LW492, &WR_LW692, &WR_LW892, &WR_LW1092},
{&WR_LW294, &WR_LW494, &WR_LW694, &WR_LW894, &WR_LW1094},
{&WR_LW296, &WR_LW496, &WR_LW696, &WR_LW896, &WR_LW1096},
{&WR_LW298, &WR_LW498, &WR_LW698, &WR_LW898, &WR_LW1098} };
Big, right? I had concerns that this HMI would have issues with such an approach, but it did the job. The code below runs through a string that comes from the CSV file. This code runs inside another while loop that cycles through the multidimensional array.
It's a little crude, but it works.
while (i <= 5)
{
    memset(lineTemp, 0, sizeof lineTemp); // clear lineTemp array
    while (lineFromFile[index] != delimiter)
    {
        if (lineFromFile[index] != delimiter && lineFromFile[index] != '\0')
        {
            lineTemp[j] = lineFromFile[index];
            index++;
            j++;
        }
        if (lineFromFile[index] == '\0') { i = 5; break; }
    }
    index++;
    lineTemp[j] = '\0'; // null termination
    j = 0;
    if (i == -1) { WriteLocal("LW", temp, 3, lineTemp, 0); }
    if (i >= 0 && i <= 5) { *(arr[x][i]) = atoi(lineTemp); }
    i++;
}
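The same idea, splitting a delimited line and storing each number through a row of pointers, can be sketched more compactly with strtol. This is only an illustration under assumptions: the helper name storeFields is hypothetical, the fields are assumed to be decimal integers, and the leading label field is assumed to have been consumed already. The field count is bounded by the number of pointers in the row, which matters here because arr has 5 columns, so index 5 is out of range.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parses up to `maxFields` integers separated by
   `delimiter` from `line` and stores each one through the matching
   pointer in `targets`. Returns the number of fields stored. */
size_t storeFields(const char *line, char delimiter,
                   int *const targets[], size_t maxFields)
{
    size_t n = 0;
    const char *p = line;

    while (n < maxFields && *p != '\0') {
        char *end;
        long value = strtol(p, &end, 10);  /* an empty field yields 0 */
        *targets[n++] = (int)value;

        /* Move to just past the next delimiter, or stop at the end. */
        p = end;
        while (*p != '\0' && *p != delimiter) p++;
        if (*p == delimiter) p++;
    }
    return n;
}
```

With the table above, a call such as storeFields(&lineFromFile[index], delimiter, arr[x], 5) would fill one row, since arr[x] is an array of 5 pointers to int.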
Thanks again for the tip.
Cheers
I am building an LZW encoding algorithm that uses a dictionary and hashing, so that words already stored in the dictionary can be looked up quickly.
The algorithm gives correct results on smaller files (roughly a few hundred symbols), but it crashes on larger files, especially those that contain few distinct symbols. The worst case is a file that consists of a single repeated symbol, say 'y': there it crashes when the dictionary is not even close to being full. When the large input file contains more than one distinct symbol, the dictionary gets to approximately 90% full, and then it crashes as well.
Considering the structure of my algorithm, I am not quite sure what is causing it to crash in general, or to crash so soon when a large file of just 1 symbol is given.
It must be something about the hashing (this is my first time doing it, so it might have some bugs).
The hash function I am using can be found here, and from what I have tested, it gives good results: oat_hash
The LZW encoding algorithm is based on this link, with one slight change: it runs only until the dictionary is full: LZW encoder
Let's get into code:
Note: oat_hash is changed so that it returns value % CAPACITY, so every index is a valid index into DICTIONARY
// Globals
#define CAPACITY 100000
char *DICTIONARY[CAPACITY];
unsigned short CODES[CAPACITY]; // CODES and DICTIONARY are linked via index: word from dictionary on index i, has its code in CODES on index i
int position = 0;
int code_counter = 0;
void encode(FILE *input, FILE *output){
int succ1 = fseek(input, 0, SEEK_SET);
if(succ1 != 0) printf("Error: file not open!");
int succ2 = fseek(output, 0, SEEK_SET);
if(succ2 != 0) printf("Error: file not open!");
//1. Working word = next symbol from the input
char *working_word = malloc(2048*sizeof(char));
char new_symbol = getc(input);
working_word[0] = new_symbol;
working_word[1] = '\0';
//2. WHILE(there are more symbols on the input) DO
//3. NewSymbol = next symbol from the input
while((new_symbol = getc(input)) != EOF){
char *workingWord_and_newSymbol= NULL;
char newSymbol[2];
newSymbol[0] = new_symbol;
newSymbol[1] = '\0';
workingWord_and_newSymbol = working_word_and_new_symbol(working_word, newSymbol);
int index = oat_hash(workingWord_and_newSymbol, strlen(workingWord_and_newSymbol));
//4. IF(WorkingWord + NewSymbol) is already in the dictionary THEN
if(DICTIONARY[index] != NULL){
// 5. WorkingWord += NewSymbol
working_word = working_word_and_new_symbol(working_word, newSymbol);
}
//6. ELSE
else{
//7. OUTPUT: code for WorkingWord
int idx = oat_hash(working_word, strlen(working_word));
fprintf(output, "%u", CODES[idx]);
//8. Add (WorkingWord + NewSymbol) into a dictionary and assign it a new code
if(!dictionary_full()){
DICTIONARY[index] = workingWord_and_newSymbol;
CODES[index] = code_counter + 1;
code_counter += 1;
working_word = strdup(newSymbol);
}else break;
}
//10. END IF
}
//11. END WHILE
//12. OUTPUT: code for WorkingWord
int index = oat_hash(working_word, strlen(working_word));
fprintf(output, "%u", CODES[index]);
free(working_word);
}
int index = oat_hash(workingWord_and_newSymbol, strlen(workingWord_and_newSymbol));
And later
int idx = oat_hash(working_word, strlen(working_word));
fprintf(output, "%u", CODES[idx]);
//8. Add (WorkingWord + NewSymbol) into a dictionary and assign it a new code
if(!dictionary_full()){
DICTIONARY[index] = workingWord_and_newSymbol;
CODES[index] = code_counter + 1;
code_counter += 1;
working_word = strdup(newSymbol);
}else break;
idx and index are unbounded, yet you use them to index fixed-size arrays, so you can access memory out of range. Here's a suggestion: reduce the hash modulo CAPACITY. This may skew the distribution, but if your hash range is much larger than CAPACITY it shouldn't be a problem. You also have another problem, which was already mentioned: collisions. You need to handle them, but that's a separate issue.
int index = oat_hash(workingWord_and_newSymbol, strlen(workingWord_and_newSymbol)) % CAPACITY;
// and
int idx = oat_hash(working_word, strlen(working_word)) % CAPACITY;
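As for the collisions, one common way to handle them is open addressing with linear probing. The sketch below is not the asker's code: the table size, the toy_hash function (a stand-in for oat_hash) and the names find_slot and insert_word are all assumptions made for illustration. The key difference from the question's code is that a non-NULL slot is compared with strcmp before it is treated as a match.

```c
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 8                   /* small hypothetical capacity */

const char *table[TABLE_SIZE];         /* NULL marks an empty slot */

/* Hypothetical stand-in for oat_hash; any string hash would do. */
static unsigned long toy_hash(const char *s)
{
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Returns the slot that holds `word`, or the empty slot where it
   would be stored, or -1 if the table is full and `word` is absent. */
int find_slot(const char *word)
{
    size_t start = toy_hash(word) % TABLE_SIZE;

    for (size_t i = 0; i < TABLE_SIZE; i++) {
        size_t slot = (start + i) % TABLE_SIZE;   /* probe the next slot */
        if (table[slot] == NULL || strcmp(table[slot], word) == 0)
            return (int)slot;
    }
    return -1;
}

/* Stores `word` if it is absent; returns its slot, or -1 if full. */
int insert_word(const char *word)
{
    int slot = find_slot(word);
    if (slot >= 0 && table[slot] == NULL) table[slot] = word;
    return slot;
}
```

A word is "in the dictionary" only when find_slot returns a slot whose entry is non-NULL, so two different words that hash to the same index are no longer mistaken for each other.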
LZW compression is certainly used to construct binary files and normally is capable of reading binary files.
The following code is problematic as it relies on new_symbol never being a \0.
newSymbol[0] = new_symbol; newSymbol[1] = '\0';
strlen(workingWord_and_newSymbol)
strdup(newSymbol)
Needs re-write to work with arrays of bytes rather than strings.
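A minimal sketch of what working with byte arrays could look like, assuming a fixed upper bound on the working word. The type ByteWord and the helpers byteword_append and byteword_equal are hypothetical names, not part of the question's code. The length is stored explicitly, so a 0 byte is ordinary data rather than a terminator.

```c
#include <stddef.h>
#include <string.h>

#define WORD_MAX 2048   /* assumed bound, same size as the original malloc */

/* Hypothetical byte-string type with an explicit length. */
typedef struct {
    unsigned char data[WORD_MAX];
    size_t len;
} ByteWord;

/* Appends one byte; returns 0 on success, -1 if the buffer is full. */
int byteword_append(ByteWord *w, unsigned char b)
{
    if (w->len >= WORD_MAX) return -1;
    w->data[w->len++] = b;
    return 0;
}

/* Returns nonzero if both words have the same length and contents. */
int byteword_equal(const ByteWord *a, const ByteWord *b)
{
    return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}
```

The hash would then be computed over data and len instead of using strlen, and the bounds check in byteword_append also guards against a working word that keeps growing, as it does on input made of a single repeated symbol.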
fopen() was not shown. Ensure the file is opened in binary mode: input = fopen(..., "rb");
@Wumpus Q. Wumbley is correct, use int newSymbol.
Minor:
new_symbol and newSymbol are confusing.
Consider:
// char *working_word = malloc(2048*sizeof(char));
#define WORKING_WORD_N (2048)
char *working_word = malloc(WORKING_WORD_N*sizeof(*working_word));
// or
char *working_word = malloc(WORKING_WORD_N);