I am currently building a symbol table program in C. It needs to stay as simple as possible while having the required functionality, as I am expected to produce a working compiler by the end of the semester. I have a working implementation that creates symbol table entries from user input, but it is not 100% where it needs to be, and I need some guidance based on the feedback I was given by my professor. I am new to coding in C, and I am also trying to learn Python and R at the same time, so I'm a little overwhelmed. I know I need separate initialize and print functions, that there should be no input or output in the create function, and that every entry should have a scope of 0. Where I'm stuck is creating the initialize and print functions without losing the functionality I already have. Any help is appreciated. Here is my current implementation:
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
struct ADT {
char name[18]; // lexeme name
char usage;
char type; // I is integer, S is type string, I for identifier
int scope; // scope of where it was declared, inserted for later use
int reference;
};
typedef struct ADT new_type;
new_type table[200];
int i = 0;
int read(char *name, char usage, char type, char scope) { //Read function to read input and check for duplicates
for (int j = sizeof(table) / sizeof(table[0]); j >= 0; --j) {
if (strcmp(table[j].name, name) == 0 &&
table[j].usage == usage &&
table[j].type == type &&
table[j].scope == scope)
return 1; // found
}
return -1; // not found! that's good
}
int create( char *name, char usage, char type, char scope) { //Create function to insert new input into symbol table
strcpy(table[i].name, name);
table[i].usage = usage;
table[i].type = type;
table[i].scope = scope;
if (table[i].usage == 'I' && table[i].type == 'L')
table[i].reference = atoi(name);
else
table[i].reference = -1;
return i++;
}
int initialize(char *name, char usage, char type, char scope) { // Function to initialize the symbol table and clear it. also creates the fred lexeme
create("Fred", 'I', 'I', '0');
}
int print(char *name, char usage, char type, char scope) { // Print function to print the symbol table
printf("Nate's Symbol Table\n");
printf("#\t\tName\tScope\tType\tUsage\tReference\n");
for (int j = 0; j < i; j++) {
if (table[j].name == NULL)
break;
printf("%*d\t\t%*s\t%*d\t%*c\t%*c\t%*d\n", j, table[j].name, table[j].scope, table[j].type, table[j].usage, table[j].reference);
}
}
int main() { // Main function to take input and produce the symbol table lexemes
printf("Course: CSCI 490 Name: Nathaniel Bennett NN: 02 Assignment: A03\n");
printf("\n");
create("Fred", 'I', 'I', 0);
for (int j = 0; j < i; j++) {
if (table[j].name == NULL)
break;
printf("#\t\tName\tScope\tType\tUsage\tReference\n");
printf("%*d\t\t%*s\t%*d\t%*c\t%*c\t%*d\n", j, table[j].name, table[j].scope, table[j].type, table[j].usage, table[j].reference);
}
// keep asking for a lexeme until we type STOP or stop
while (1) {
char lexeme[256];
char nUsage;
char nType;
char nScope;
printf("Enter a lexeme: \n"); //enter lexeme name
scanf("%s", lexeme);
if (strcmp(lexeme, "stop") == 0) break;
printf("Enter its usage: \n");
scanf(" %c", &nUsage);
printf("Enter its type: \n");
scanf(" %c", &nType);
printf("Enter its scope: \n");
scanf(" %c", &nScope);
printf("%s, %c, %c, %c\n", lexeme, nUsage, nType, nScope);
create(lexeme, nUsage, nType, nScope);
for (int j = 0; j < i; j++) {
if (table[j].name == NULL)
break;
printf("%*d\t\t%*s\t%*d\t%*c\t%*c\t%*d\n", j, table[j].name, table[j].scope, table[j].type, table[j].usage, table[j].reference);
}
}
printf("Nate's Symbol Table\n");
printf("#\t\tName\tScope\tType\tUsage\tReference\n");
for (int j = 0; j < i; j++) {
if (table[j].name == NULL)
break;
printf("%*d\t\t%*s\t%*d\t%*c\t%*c\t%*d\n", j, table[j].name, table[j].scope, table[j].type, table[j].usage, table[j].reference);
}
return 0;
}
...I think we're normally reluctant to get up in people's course assignments, but you seem like you have thought about this for a while Nate.
I can't quite make out what your instructor is suggesting. I do not see I/O in your code for the create() function. Unless the call to strcpy() is considered I/O in their view.
I do see some room for improvement in your print() function though. Your function relies upon a global entity (table) and then it ties your loop both to an imaginary value (what is "i" in your loop initialization?) AND to a condition where your logic asks effectively, "did I run out of table?"
Choose one condition or the other. There is a semantic elegance in simply printing everything you find in the table. You can make the function better if you pass a reference to the table rather than code to the existence of a static global value. So instead of passing all those values to your print() function, how about just one argument? Pass a reference to table, and your function could then be used for other similar dump operations. It becomes more generalized, and that's a good thing.
I would also say this. I prefer using sprintf() to stage my output in a string and then when everything is ready, I output it all at one time. This is easier to inspect and debug.
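The two suggestions above (pass the table in, and stage output in a string) can be sketched roughly as follows. This is only an illustration: `struct entry`, `format_entry` and `print_table` are hypothetical names, not part of the original program, and `snprintf` is used in place of `sprintf` so the staging buffer cannot overflow.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical entry layout, mirroring the fields used in the question. */
struct entry {
    const char *name;
    char usage;
    char type;
    int scope;
    int reference;
};

/* Stage one row in buf. Returns whatever snprintf returns:
 * the length the row needs, or a negative value on error. */
int format_entry(char *buf, size_t size, size_t index, const struct entry *e)
{
    return snprintf(buf, size, "%zu\t%s\t%d\t%c\t%c\t%d",
                    index, e->name, e->scope, e->type, e->usage, e->reference);
}

/* Print count rows of tbl to out. No globals are involved, so the same
 * function can dump any table. Returns the number of rows written. */
size_t print_table(FILE *out, const struct entry *tbl, size_t count)
{
    char line[128];
    size_t written = 0;

    fputs("#\tName\tScope\tType\tUsage\tReference\n", out);
    for (size_t i = 0; i < count; i++) {
        int n = format_entry(line, sizeof line, i, &tbl[i]);
        if (n < 0 || (size_t)n >= sizeof line)
            continue;               /* skip rows that did not fit */
        fprintf(out, "%s\n", line);
        written++;
    }
    return written;
}
```

A caller would pass its own array and count, for example `print_table(stdout, table, count)`. Because each row is staged in `line` first, it can be inspected in a debugger before anything reaches the terminal.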
Also, not related to your assignment I imagine, but be extra-vigilant every time you use scanf() -- it was often my number one suspect whenever I had a bad pointer.
Definitely try to isolate or eliminate calls to chaotic functions like that one.
Keep thinking about how to make your function stronger, keep refactoring. You'll do great!
There are a number of issues. This won't even compile:
read conflicts with the syscall (i.e. rename it)
read has UB (undefined behavior) because it starts the for loop at one beyond the end of the table array
The symbol printing code is replicated everywhere. Better to define a table printing function (e.g. tblprint) and a symbol printing function (e.g. symprint).
The format used to print a symbol misuses variable-width format specifiers: (e.g.) %*s expects two arguments: int len, char *str. With -Wall as a compile option, these statements are flagged.
AFAICT, ordinary format specifiers work fine.
The if (sym->name == NULL) can never be true because name is a fixed-length array inside the struct. For the test to be meaningful, name needs to be a char *.
Using i as a global for the count of the array is misleading. Try something more descriptive (e.g.) tabcount
Using table[i].whatever everywhere is cumbersome. Try using a pointer (e.g. sym->whatever)
initialize [and some others] need a return with a value.
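To make the point about `%*d` and `%*s` concrete, here is a small sketch of what a correct use looks like. The function name `format_row` and the column widths 3, 8 and 5 are made up for illustration; the only thing taken from the standard is that each `*` consumes one extra `int` argument placed before the value it formats.

```c
#include <stdio.h>

/* Format one row with explicit column widths passed through '*'.
 * Returns what snprintf returns (the length the row needs). */
int format_row(char *buf, size_t size, int index, const char *name, int scope)
{
    /* Each '*' takes its width from the int argument just before the value.
     * "%-*s" left-justifies the name within its width. */
    return snprintf(buf, size, "%*d|%-*s|%*d", 3, index, 8, name, 5, scope);
}
```

Calling `format_row(buf, sizeof buf, 7, "Fred", 0)` fills `buf` with the index right-aligned in 3 columns, the name padded to 8 and the scope right-aligned in 5. If the widths never change, plain `%3d` and `%-8s` do the same job with fewer arguments.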
I've used cpp conditionals to denote old code vs new code:
#if 0
// old code
#else
// new code
#endif
Here is the refactored code. It is annotated. It compiles cleanly and passes a rudimentary test:
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
struct ADT {
// NOTE/BUG: the if (sym->name == NULL) can never be true for an array member
#if 0
char name[18]; // lexeme name
#else
const char *name; // lexeme name
#endif
char usage;
// I is integer, S is type string, I for identifier
char type;
// scope of where it was declared, inserted for later use
int scope;
int reference;
};
#if 0
typedef struct ADT new_type;
new_type table[200];
#else
typedef struct ADT ADT;
ADT table[200];
#endif
int tabcount = 0;
// NOTE/BUG: "read" conflicts with a syscall name
#if 0
//Read function to read input and check for duplicates
int
read(char *name, char usage, char type, char scope)
#else
// find_entry -- find a matching entry (if it exists)
int
find_entry(char *name, char usage, char type, char scope)
#endif
{
// NOTE/BUG: this is UB (undefined behavior) because you're starting at one
// past the end of the array
#if 0
for (int j = sizeof(table) / sizeof(table[0]); j >= 0; --j) {
#else
for (int j = tabcount - 1; j >= 0; --j) {
#endif
ADT *sym = &table[j];
if (strcmp(sym->name, name) == 0 &&
sym->usage == usage &&
sym->type == type &&
sym->scope == scope)
return 1;
}
// not found! that's good
return -1;
}
//Create function to insert new input into symbol table
int
create(char *name, char usage, char type, char scope)
{
ADT *sym = &table[tabcount];
// NOTE/BUG: this needs to be a pointer to a string to allow long strings and
// for "if (sym->name == NULL)" to be valid
#if 0
strcpy(sym->name, name);
#else
sym->name = strdup(name);
#endif
sym->usage = usage;
sym->type = type;
sym->scope = scope;
if (sym->usage == 'I' && sym->type == 'L')
sym->reference = atoi(name);
else
sym->reference = -1;
return tabcount++;
}
// Function to initialize the symbol table and clear it. also creates the fred
// lexeme
int
initialize(char *name, char usage, char type, char scope)
{
create("Fred", 'I', 'I', 0);
return 0;
}
void
symprint(ADT *sym)
{
int j = sym - table;
// NOTE/BUG: with (e.g) %*d this is a variable width field -- it requires
// _two_ arguments: <int wid>,<int val>
#if 0
printf("%*d\t\t%*s\t%*d\t%*c\t%*c\t%*d\n",
j, sym->name, sym->scope, sym->type,
sym->usage, sym->reference);
#else
printf("%d\t\t%s\t%d\t%c\t%c\t%d\n",
j, sym->name, sym->scope, sym->type,
sym->usage, sym->reference);
#endif
}
void
tblprint(int title)
{
if (title)
printf("#\t\tName\tScope\tType\tUsage\tReference\n");
for (int j = 0; j < tabcount; j++) {
ADT *sym = &table[j];
if (sym->name == NULL)
break;
symprint(sym);
}
}
// Print function to print the symbol table
int
print(char *name, char usage, char type, char scope)
{
printf("Nate's Symbol Table\n");
tblprint(1);
return 0;
}
// Main function to take input and produce the symbol table lexemes
int
main()
{
printf("Course: CSCI 490 Name: Nathaniel Bennett NN: 02 Assignment: A03\n");
printf("\n");
create("Fred", 'I', 'I', 0);
tblprint(1);
// keep asking for a lexeme until we type stop
while (1) {
char lexeme[256];
char nUsage;
char nType;
char nScope;
// enter lexeme name
printf("Enter a lexeme: \n");
// limit the read to the buffer size and stop on EOF or a read error
if (scanf("%255s", lexeme) != 1)
break;
if (strcmp(lexeme, "stop") == 0)
break;
printf("Enter its usage: \n");
scanf(" %c", &nUsage);
printf("Enter its type: \n");
scanf(" %c", &nType);
printf("Enter its scope: \n");
scanf(" %c", &nScope);
printf("%s, %c, %c, %c\n", lexeme, nUsage, nType, nScope);
create(lexeme, nUsage, nType, nScope);
tblprint(0);
}
printf("Nate's Symbol Table\n");
tblprint(1);
return 0;
}
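For the professor's three points in the question (separate initialize and print functions, no I/O inside create, every entry at scope 0), one possible shape is sketched below. It is an illustration under assumptions, not the required design: the names `sym_create`, `sym_initialize` and `sym_print`, the `TABLE_MAX` and `NAME_LEN` limits, and the choice to drop the scope parameter entirely are all hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_MAX 200   /* assumption: same capacity as the question */
#define NAME_LEN 18     /* assumption: same name buffer as the question */

struct symbol {
    char name[NAME_LEN];
    char usage;
    char type;
    int scope;
    int reference;
};

static struct symbol symtab[TABLE_MAX];
static int symcount = 0;

/* Add an entry. No input or output happens here.
 * Returns the new index, or -1 if the table is full or the name is too long. */
int sym_create(const char *name, char usage, char type)
{
    if (symcount >= TABLE_MAX || strlen(name) >= NAME_LEN)
        return -1;
    struct symbol *s = &symtab[symcount];
    strcpy(s->name, name);          /* safe: length checked above */
    s->usage = usage;
    s->type = type;
    s->scope = 0;                   /* every entry has scope 0 for now */
    s->reference = (usage == 'I' && type == 'L') ? atoi(name) : -1;
    return symcount++;
}

/* Clear the table, then seed it with the "Fred" entry. */
void sym_initialize(void)
{
    memset(symtab, 0, sizeof symtab);
    symcount = 0;
    sym_create("Fred", 'I', 'I');
}

/* The only function that writes output. */
void sym_print(FILE *out)
{
    fprintf(out, "#\tName\tScope\tType\tUsage\tReference\n");
    for (int j = 0; j < symcount; j++) {
        const struct symbol *s = &symtab[j];
        fprintf(out, "%d\t%s\t%d\t%c\t%c\t%d\n",
                j, s->name, s->scope, s->type, s->usage, s->reference);
    }
}
```

With this split, `main` would call `sym_initialize()` once, keep all the `scanf` prompting for itself, call `sym_create` for each lexeme, and call `sym_print(stdout)` whenever the table should be shown.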
I'm using an array of strings in C to hold arguments given to a custom shell. I initialize the array of buffers using:
char *args[MAX_CHAR];
Once I parse the arguments, I send them to the following function to determine the type of IO redirection if there are any (this is just the first of 3 functions to check for redirection and it only checks for STDIN redirection).
int parseInputFile(char **args, char *inputFilePath) {
char *inputSymbol = "<";
int isFound = 0;
for (int i = 0; i < MAX_ARG; i++) {
if (strlen(args[i]) == 0) {
isFound = 0;
break;
}
if ((strcmp(args[i], inputSymbol)) == 0) {
strcpy(inputFilePath, args[i+1]);
isFound = 1;
break;
}
}
return isFound;
}
Once I compile and run the shell, it crashes with a SIGSEGV. Using GDB I determined that the shell is crashing on the following line:
if (strlen(args[i]) == 0) {
This is because the address of args[i] (the first empty string after the parsed commands) is inaccessible. Here is the error from GDB and all relevant variables:
(gdb) next
359 if (strlen(args[i]) == 0) {
(gdb) p args[0]
$1 = 0x7fffffffe570 "echo"
(gdb) p args[1]
$2 = 0x7fffffffe575 "test"
(gdb) p args[2]
$3 = 0x0
(gdb) p i
$4 = 2
(gdb) next
Program received signal SIGSEGV, Segmentation fault.
parseInputFile (args=0x7fffffffd570, inputFilePath=0x7fffffffd240 "") at shell.c:359
359 if (strlen(args[i]) == 0) {
I believe that the p args[2] returning $3 = 0x0 means that because the index has yet to be written to, it is mapped to address 0x0 which is out of the bounds of execution. Although I can't figure out why this is because it was declared as a buffer. Any suggestions on how to solve this problem?
EDIT: Per Kaylum's comment, here is a minimal reproducible example
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>
#include <sys/stat.h>
#include<readline/readline.h>
#include<readline/history.h>
#include <fcntl.h>
// Defined values
#define MAX_CHAR 256
#define MAX_ARG 64
#define clear() printf("\033[H\033[J") // Clear window
#define DEFAULT_PROMPT_SUFFIX "> "
char PROMPT[MAX_CHAR], SPATH[1024];
int parseInputFile(char **args, char *inputFilePath) {
char *inputSymbol = "<";
int isFound = 0;
for (int i = 0; i < MAX_ARG; i++) {
if (strlen(args[i]) == 0) {
isFound = 0;
break;
}
if ((strcmp(args[i], inputSymbol)) == 0) {
strcpy(inputFilePath, args[i+1]);
isFound = 1;
break;
}
}
return isFound;
}
int ioRedirectHandler(char **args) {
char inputFilePath[MAX_CHAR] = "";
// Check if any redirects exist
if (parseInputFile(args, inputFilePath)) {
return 1;
} else {
return 0;
}
}
void parseArgs(char *cmd, char **cmdArgs) {
int na;
// Separate each argument of a command to a separate string
for (na = 0; na < MAX_ARG; na++) {
cmdArgs[na] = strsep(&cmd, " ");
if (cmdArgs[na] == NULL) {
break;
}
if (strlen(cmdArgs[na]) == 0) {
na--;
}
}
}
int processInput(char* input, char **args, char **pipedArgs) {
// Parse the single command and args
parseArgs(input, args);
return 0;
}
int getInput(char *input) {
char *buf, loc_prompt[MAX_CHAR] = "\n";
strcat(loc_prompt, PROMPT);
buf = readline(loc_prompt);
if (strlen(buf) != 0) {
add_history(buf);
strcpy(input, buf);
return 0;
} else {
return 1;
}
}
void init() {
char *uname;
clear();
uname = getenv("USER");
printf("\n\n \t\tWelcome to Student Shell, %s! \n\n", uname);
// Initialize the prompt
snprintf(PROMPT, MAX_CHAR, "%s%s", uname, DEFAULT_PROMPT_SUFFIX);
}
int main() {
char input[MAX_CHAR];
char *args[MAX_CHAR], *pipedArgs[MAX_CHAR];
int isPiped = 0, isIORedir = 0;
init();
while(1) {
// Get the user input
if (getInput(input)) {
continue;
}
isPiped = processInput(input, args, pipedArgs);
isIORedir = ioRedirectHandler(args);
}
return 0;
}
Note: If I forgot to include any important information, please let me know and I can get it updated.
When you write
char *args[MAX_CHAR];
you allocate room for MAX_CHAR pointers to char. You do not initialise the array. If it were a global variable, all the pointers would be initialised to NULL, but you declare it inside a function, so the elements in the array can point anywhere. You should not dereference them before you have set the pointers to point at something you are allowed to access.
You do set them, though, in parseArgs(), where you do this:
cmdArgs[na] = strsep(&cmd, " ");
There are two potential issues here, but let's deal with the one you hit first. When strsep() is through the tokens you are splitting, it returns NULL. You test for that to get out of parseArgs() so you already know this. However, where your program crashes you seem to have forgotten this again. You call strlen() on a NULL pointer, and that is a no-no.
There is a difference between NULL and the empty string. An empty string is a pointer to a buffer that has the zero char first; the string "" is a pointer to a location that holds the character '\0'. The NULL pointer is a special value for pointers, often address zero, that means that the pointer doesn't point anywhere. Obviously, the NULL pointer cannot point to an empty string. You need to check if an argument is NULL, not if it is the empty string.
If you want to check both for NULL and the empty string, you could do something like
if (!args[i] || strlen(args[i]) == 0) {
If args[i] is NULL then !args[i] is true, so you will enter the if body if you have NULL or if you have a pointer to an empty string.
(You could also check the empty string with !(*args[i]); *args[i] is the first character that args[i] points at. So *args[i] is zero if you have the empty string; zero is interpreted as false, so !(*args[i]) is true if and only if args[i] is the empty string. Not that this is more readable, but it shows again the difference between empty strings and NULL).
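As a small illustration of the NULL versus empty string distinction, the two helpers below are a sketch only; `is_null_or_empty` and `count_args` are hypothetical names, not functions from the question.

```c
#include <stddef.h>

/* Returns 1 when s is NULL or points at an empty string, otherwise 0.
 * The NULL test comes first, so *s is never read through a NULL pointer. */
int is_null_or_empty(const char *s)
{
    return s == NULL || *s == '\0';
}

/* Count arguments up to the NULL sentinel, reading at most max slots. */
size_t count_args(char *const *args, size_t max)
{
    size_t n = 0;
    while (n < max && args[n] != NULL)
        n++;
    return n;
}
```

For the arguments in the GDB session (`"echo"`, `"test"`, then a NULL pointer), `count_args` stops at index 2 without ever calling a string function on the NULL entry.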
I mentioned another issue with the parsed arguments. Whether it is a problem or not depends on the application. When you parse a string with strsep(), you get pointers into the parsed string. You have to be careful not to free that string (it is input in your main() function) or modify it after you have parsed it. If you change the string, you change what all the parsed arguments point at. You do not do this in your program, so it isn't a problem here, but it is worth keeping in mind. If you want your parsed arguments to survive after the next command is read, you need to copy them; as it is now, reading the next command overwrites them.
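If the arguments do need to outlive the input buffer, copying them could look roughly like this. It is a sketch with hypothetical helper names (`copy_args`, `free_args`), and it uses `malloc` plus `memcpy` rather than `strdup` only to stay within standard C.

```c
#include <stdlib.h>
#include <string.h>

/* Copy count strings from src into dst; each copy is heap-allocated.
 * Returns count on success. On allocation failure it frees the copies
 * made so far and returns 0. */
size_t copy_args(char **dst, char *const *src, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(src[i]) + 1;    /* include the terminating NUL */
        dst[i] = malloc(len);
        if (dst[i] == NULL) {
            while (i > 0)
                free(dst[--i]);
            return 0;
        }
        memcpy(dst[i], src[i], len);
    }
    return count;
}

/* Release copies made by copy_args and reset the slots to NULL. */
void free_args(char **args, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        free(args[i]);
        args[i] = NULL;
    }
}
```

After `copy_args`, overwriting the original input buffer changes what the old pointers show but leaves the copies intact; the price is that someone has to call `free_args` later.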
In main
char input[MAX_CHAR];
char *args[MAX_CHAR], *pipedArgs[MAX_CHAR];
are all uninitialized. They contain indeterminate values. This could be a potential source of bugs, but is not the reason here, as
getInput modifies the contents of input to be a valid string before any reads occur.
pipedArgs is unused, so raises no issues (yet).
args is modified by parseArgs to (possibly!) contain a NULL sentinel value, without any indeterminate pointers being read first.
Firstly, in parseArgs it is possible to completely fill args without setting the NULL sentinel value that other parts of the program should rely on.
Looking deeper, in parseInputFile the following
if (strlen(args[i]) == 0)
is at odds with parseArgs, which never stores empty strings in the array, so this test can never be what ends the loop. More importantly, args[i] may be the sentinel NULL value, and strlen expects a non-NULL pointer to a valid string.
This termination condition should simply check if args[i] is NULL.
With
strcpy(inputFilePath, args[i+1]);
args[i+1] might also be the NULL sentinel value, and strcpy also expects non-NULL pointers to valid strings. You can see this in action when inputSymbol is a match for the final token in the array.
args[i+1] may also evaluate as args[MAX_ARG], an element that parseArgs never writes; it would be out of bounds if the array held only MAX_ARG elements.
Additionally, inputFilePath has a string length limit of MAX_CHAR - 1, and args[i+1] is (possibly!) a dynamically allocated string whose length might exceed this.
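Putting the points about parseInputFile together, a more defensive version might look like the sketch below. The name `find_input_file`, the extra size parameters and the three-way return value are assumptions made for illustration; they are not part of the original program.

```c
#include <stddef.h>
#include <string.h>

/* Look for "<" in an args list that ends at a NULL sentinel or after
 * max_args slots, whichever comes first.
 * Returns 1 and copies the following token into out when a file is found,
 * 0 when there is no "<", and -1 when "<" has no usable file name
 * (it is the last token, or the name does not fit in out). */
int find_input_file(char *const *args, size_t max_args,
                    char *out, size_t out_size)
{
    for (size_t i = 0; i < max_args && args[i] != NULL; i++) {
        if (strcmp(args[i], "<") != 0)
            continue;
        if (i + 1 >= max_args || args[i + 1] == NULL)
            return -1;                  /* "<" was the last token */
        if (strlen(args[i + 1]) >= out_size)
            return -1;                  /* copying would overflow out */
        strcpy(out, args[i + 1]);       /* safe: length checked above */
        return 1;
    }
    return 0;
}
```

The loop condition checks the sentinel before any string function sees `args[i]`, and the two guards before `strcpy` cover both the missing file name and the oversized file name cases.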
Some edge cases found in getInput:
Both arguments to
strcat(loc_prompt, PROMPT);
are of the size MAX_CHAR. Since loc_prompt starts with a length of 1, if PROMPT has the length MAX_CHAR - 1, the resulting string will have the length MAX_CHAR. This would leave no room for the NUL terminating byte.
readline can return NULL in some situations, so
buf = readline(loc_prompt);
if (strlen(buf) != 0) {
can again pass the NULL pointer to strlen.
A similar issue as before, on success readline returns a string of dynamic length, and
strcpy(input, buf);
can cause a buffer overflow by attempting to copy a string greater in length than MAX_CHAR - 1.
buf is a pointer to data allocated by malloc. It's unclear what add_history does, but this pointer must eventually be passed to free.
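The getInput edge cases above (a NULL result, an over-long line, and the missing free) can be handled in one small helper. This is a sketch only: `take_line` is a hypothetical name, and it assumes the line came from a reader such as readline that hands back malloc-allocated memory the caller owns. Where a history call would go is left out.

```c
#include <stdlib.h>
#include <string.h>

/* Copy a heap-allocated line into dest and release it.
 * Returns 0 on success, 1 if line is NULL or empty, and -1 if the line
 * does not fit in dest (dest is left untouched in that case).
 * line is freed on every path; free(NULL) is a no-op. */
int take_line(char *dest, size_t dest_size, char *line)
{
    int status;

    if (line == NULL || *line == '\0') {
        status = 1;
    } else if (strlen(line) >= dest_size) {
        status = -1;
    } else {
        strcpy(dest, line);         /* safe: length checked above */
        status = 0;
    }
    free(line);
    return status;
}
```

A call such as `take_line(input, MAX_CHAR, readline(prompt))` would then replace the unchecked `strlen` and `strcpy` pair, with the caller deciding what to do about the -1 case.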
Some considerations.
Firstly, it is a good habit to initialize your data, even if it might not matter.
Secondly, using constants (#define MAX_CHAR 256) might help to reduce magic numbers, but used everywhere as a hard-coded size, they can lead you to design your program too rigidly.
Consider building your functions to accept a limit as an argument, and return a length. This allows you to more strictly track the sizes of your data, and prevents you from always designing around the maximum potential case.
A slightly contrived example of designing like this. We can see that find does not have to concern itself with possibly checking MAX_ARGS elements, as it is told precisely how long the list of valid elements is.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_ARGS 100
char *get_input(char *dest, size_t sz, const char *display) {
char *res;
if (display)
printf("%s", display);
if ((res = fgets(dest, sz, stdin)))
dest[strcspn(dest, "\n")] = '\0';
return res;
}
size_t find(char **list, size_t length, const char *str) {
for (size_t i = 0; i < length; i++)
if (strcmp(list[i], str) == 0)
return i;
return length;
}
size_t split(char **list, size_t limit, char *source, const char *delim) {
size_t length = 0;
char *token;
while (length < limit && (token = strsep(&source, delim)))
if (*token)
list[length++] = token;
return length;
}
int main(void) {
char input[512] = { 0 };
char *args[MAX_ARGS] = { 0 };
puts("Welcome to the shell.");
// Stop at end of input or on a read error instead of looping forever
while (get_input(input, sizeof input, "$ ")) {
size_t argl = split(args, MAX_ARGS, input, " ");
size_t redirection = find(args, argl, "<");
puts("Command parts:");
for (size_t i = 0; i < redirection; i++)
printf("%zu: %s\n", i, args[i]);
puts("Input files:");
if (redirection == argl)
puts("[[NONE]]");
else for (size_t i = redirection + 1; i < argl; i++)
printf("%zu: %s\n", i, args[i]);
}
}
Here is the problem: we have to check whether two strings contain the same characters, regardless of order. For example, s1=akash and s2=ashka match.
My program prints NO for every pair of input strings.
s1 and s2 are the two input strings
t is the number of test cases
It would be really helpful if you could tell me where the error is; I am a beginner.
#include<stdio.h>
#include<string.h>
int main(){
int t,i,j;
scanf("%d",&t);
while(t>0){
char s1[100],s2[100];
scanf("%s ",s1);
scanf("%s",s2);
int count=0;
int found[100];
for(i=0;i<strlen(s1)-1;i++){
for(j=0;j<strlen(s1);j++){
if(s1[i]==s2[j]){
found[i]=1;
break;
}
}
}
for(i=0;i<strlen(s1);i++){
if(found[i]!=1){
count=1;
break;
}
}
if(count==1)
printf("NO");
else
printf("YES");
t--;
}
}
Some good answers above suggest sorting the strings first.
If you would rather modify your existing program to do the job, it needs a few changes. Below is a suggestion (in words) for how to do this, then modified code that works, and finally a couple of extra points.
I guess that the two strings aa and a would not be equal according to your definition, but your program would say that they are, because once you find a character you have no way of saying that it has been 'used up'.
I would suggest that you change your found[] array so that it records when a character in the second string is matched.
I suggest logic as follows.
Loop through all S1 characters
| Loop through S2 characters
| - if you get a match mark the S2 character as found
| - if you don't get a match by the end of the S2 loop then you are done - they are not equal
At the end of the S1 loop, if you have not finished early then every S1 character is matched, but you still need to go through the found[] array to check that every character in S2 was found.
Working code is below.
Notes:
you did not initialize found - it is initialized in the code below
the first loop needs to have < strlen(s1) not < strlen(s1)-1
the second loop you should have been going to strlen(s2).
logic changed as described above so that found records characters found in s2 not s1
logic also changed so that if a character in s1 is not found the loop breaks early. There are tests to see if the loop broke early to see if the values of i and j are what we expect at the end of the loop.
edited code below (at the bottom below the code are some extra comments)
#include<stdio.h>
#include<string.h>
int main(){
int t,i,j;
scanf("%d",&t);
while(t>0){
char s1[100],s2[100];
scanf("%s ",s1);
scanf("%s",s2);
int count=0;
int found[100]={ 0 };
for(i=0;i<strlen(s1);i++){
for(j=0;j<strlen(s2);j++){
if(found[j]==1) continue; // character S2[j] already found
if(s1[i]==s2[j]){
found[j]=1;
break;
}
}
if (j==strlen(s2)) {
break; // we get here if we did not find a match for S1[i]
}
}
if (i!=strlen(s1)) {
printf("NO"); // we get here if we did not find a match for S1[i]
}
else {
// matched all of S1 now check S2 all matched
for(i=0;i<strlen(s2);i++){
if(found[i]!=1){
count=1;
break;
}
}
if(count==1) {
printf("NO");
}
else {
printf("YES");
}
}
t--;
}
return 0;
}
Two extra points to make your code more efficient.
First, as suggested by @chux, it will probably be faster not to have strlen(s2) in the loop condition. You could instead write for (j=0;s2[j];j++). This works because the final character of a string has the value 0, and in C a value of 0 means false; the for loop runs while its condition is true and stops when it becomes false. Avoiding strlen(s2) in the condition helps because the compiler might recompute strlen(s2) on every pass through the loop, and each call costs about l2 steps if l2 is the length of s2. Since the two nested loops already run up to l1*l2 times, the strlen calls can make that l1*l2*l2 steps.
Second, you could speed up many tests by checking whether the lengths of the two strings differ before checking whether they contain the same number of each character.
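Both efficiency points can be folded into the matching logic, roughly as sketched here. `same_characters` and `MAX_LEN` are hypothetical names, and rejecting strings of `MAX_LEN` characters or more is an assumption made to keep the `used` array a fixed size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEN 100   /* assumption: same buffer size as the question */

/* True when s1 and s2 hold the same characters with the same counts. */
bool same_characters(const char *s1, const char *s2)
{
    size_t len1 = strlen(s1);       /* computed once, not per iteration */
    size_t len2 = strlen(s2);
    bool used[MAX_LEN] = { false };

    if (len1 != len2 || len1 >= MAX_LEN)
        return false;               /* different lengths can never match */

    for (size_t i = 0; i < len1; i++) {
        size_t j;
        for (j = 0; j < len2; j++) {
            if (!used[j] && s1[i] == s2[j]) {
                used[j] = true;     /* s2[j] is now "used up" */
                break;
            }
        }
        if (j == len2)
            return false;           /* no unused match for s1[i] */
    }
    return true;    /* lengths are equal, so every s2 character was used */
}
```

Because the lengths are known to be equal before the loops start, the separate pass over the found array is no longer needed: matching every character of `s1` to a distinct character of `s2` already uses up all of `s2`.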
As suggested in my comment, and since it's now a bit more clear, an easy way to compare two multisets represented as strings is to:
Sort the two strings (easy using the qsort() standard function)
Compare the result (using the strcmp() standard function)
This will work since it will map both "akash" and "ashka" to "aahks", before comparing.
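The sort-then-compare idea can be sketched as follows. The names `compare_chars` and `is_anagram_sorted` are made up for this example, and note that it sorts the caller's buffers in place, so it needs writable arrays rather than string literals.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for single characters, compared as unsigned char. */
static int compare_chars(const void *a, const void *b)
{
    unsigned char x = *(const unsigned char *)a;
    unsigned char y = *(const unsigned char *)b;
    return (x > y) - (x < y);
}

/* Sort both buffers in place, then compare them with strcmp. */
bool is_anagram_sorted(char *s1, char *s2)
{
    qsort(s1, strlen(s1), sizeof s1[0], compare_chars);
    qsort(s2, strlen(s2), sizeof s2[0], compare_chars);
    return strcmp(s1, s2) == 0;
}
```

The buffers `s1` and `s2` filled by `scanf` in the question are already writable arrays, so they could be passed straight in.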
Sort both strings using bubble sort or any other technique you know, then simply compare them using the strcmp() function.
for(i=0;i<strlen(s1)-1;i++){
for(j=0;j<strlen(s1);j++){
if(s1[i]==s2[j]){
found[i]=1;
break;
}
}
}
I am not able to understand why you are using j<strlen(s1) in the second loop.
I think a simple solution would be to sort the characters alphabetically and compare them one by one in a single loop.
First, note that found is never initialized. The values within it are unknown. It ought to be initialized by setting every element to zero before each test for equality. (Or, if not every element, every element up to strlen(s1)-1, as those are the ones that will be used.)
Once found is initialized, though, there is another problem.
The first loop on i uses for(i=0;i<strlen(s1)-1;i++). Within this, found[i] is set if a match is found to s1[i]. Note that i never reaches strlen(s1)-1 within the loop, since the loop terminates when it does.
The second loop on i uses for(i=0;i<strlen(s1);i++). Within this loop, found[i] is tested to see if it is set. Note that i does reach strlen(s1)-1, since the loop terminates only when i reaches strlen(s1). However, found[strlen(s1)-1] can never have been set by the first loop, since i never reaches strlen(s1)-1 in the first loop. Therefore, the second loop would always report failure.
Additionally, it is not clear whether two strings ought to be considered equal if and only if they are anagrams (the characters in one can be rearranged to form the other string, without adding or removing any characters) or if each character in one string is found at least once in the other (“aaabbc” would be equal to “abbccc”, because both strings contain a, b, and c).
As written, with the initialization and loop bugs fixed, your program tests whether each character in the first string appears in the second string. This is not an equivalence relation because it is not symmetric: it does not test whether each character in the second string appears in the first string. So, you need to think more about what property you want to test and how to test for it.
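The different properties discussed above can be written as separate predicates, which makes the choice explicit. This is a sketch with hypothetical names (`chars_subset`, `same_char_set`, `is_anagram`); it uses lookup tables indexed by character value instead of nested loops, which is a swapped-in technique rather than the approach in the question.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* One-directional test: does every character of a occur at least once in b? */
bool chars_subset(const char *a, const char *b)
{
    bool seen[UCHAR_MAX + 1] = { false };
    for (; *b != '\0'; b++)
        seen[(unsigned char)*b] = true;
    for (; *a != '\0'; a++)
        if (!seen[(unsigned char)*a])
            return false;
    return true;
}

/* Same set of characters, ignoring how often each one occurs. */
bool same_char_set(const char *a, const char *b)
{
    return chars_subset(a, b) && chars_subset(b, a);
}

/* Anagram test: every character occurs equally often in both strings. */
bool is_anagram(const char *a, const char *b)
{
    long balance[UCHAR_MAX + 1] = { 0 };
    for (; *a != '\0'; a++)
        balance[(unsigned char)*a]++;
    for (; *b != '\0'; b++)
        balance[(unsigned char)*b]--;
    for (size_t i = 0; i <= UCHAR_MAX; i++)
        if (balance[i] != 0)
            return false;
    return true;
}
```

With the example strings from above, `same_char_set("aaabbc", "abbccc")` holds while `is_anagram("aaabbc", "abbccc")` does not, and `chars_subset("a", "ab")` holds while `chars_subset("ab", "a")` does not.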
Complicated solutions I wrote as a training exercise. Two implementations, selected with a macro, are below.
The first implementation loops through every character in the first string, counts its occurrences in the first and second strings, and compares the values.
The second implementation allocates and builds a map from character to count for each string and then compares these maps.
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#include <assert.h>
#include <stdlib.h>
#include <errno.h>
// configuration
#define STRCHARSETCNTCMP_METHOD_FOREACH 0
#define STRCHARSETCNTCMP_METHOD_MAP 1
// eof configuration
//#define dbgln(fmt, ...) fprintf(stderr, "%s:%d: " fmt "\n", __func__, __LINE__, ##__VA_ARGS__)
#define dbgln(...) ((void)0)
/**
* STRing CHARacter SET CouNT CoMPare
* compare the count of set of characters in strings
* @param s1 first string
* @param s2 the other string
* @return true if each character in s1 is used as many times in s2
*/
bool strcharsetcntcmp(const char s1[], const char s2[]);
// Count how many times the character is in the string
size_t strcharsetcntcmp_count(const char s[], char c)
{
assert(s != NULL);
size_t ret = 0;
while (*s != '\0') {
if (*s == c) {
++ret;
}
++s;
}
return ret;
}
// foreach method implementation
bool strcharsetcntcmp_method_foreach(const char s1[], const char s2[])
{
const size_t s1len = strlen(s1);
const size_t s2len = strlen(s2);
if (s1len != s2len) {
return false;
}
for (size_t i = 0; i < s1len; ++i) {
const char c = s1[i];
const size_t cnt1 = strcharsetcntcmp_count(s1, c);
const size_t cnt2 = strcharsetcntcmp_count(s2, c);
// printf("'%s','%s' -> '%c' -> %zu %zu\n", s1, s2, c, cnt1, cnt2);
if (cnt1 != cnt2) {
return false;
}
}
return true;
}
// array of map elements
struct strcharsetcntcmp_map_s {
size_t cnt;
struct strcharsetcntcmp_map_cnt_s {
char c;
size_t cnt;
} *map;
};
// initialize empty map
void strcharsetcntcmp_map_init(struct strcharsetcntcmp_map_s *t)
{
assert(t != NULL);
dbgln("%p", t);
t->map = 0;
t->cnt = 0;
}
// free map memory
void strcharsetcntcmp_map_fini(struct strcharsetcntcmp_map_s *t)
{
assert(t != NULL);
dbgln("%p %p", t, t->map);
free(t->map);
t->map = 0;
t->cnt = 0;
}
// get the map element for character from map
struct strcharsetcntcmp_map_cnt_s *strcharsetcntcmp_map_get(const struct strcharsetcntcmp_map_s *t, char c)
{
assert(t != NULL);
for (size_t i = 0; i < t->cnt; ++i) {
if (t->map[i].c == c) {
return &t->map[i];
}
}
return NULL;
}
// check if the count for character c was already added into the map
bool strcharsetcntcmp_map_exists(const struct strcharsetcntcmp_map_s *t, char c)
{
return strcharsetcntcmp_map_get(t, c) != NULL;
}
// map element into map, without checking if it exists (only assertion)
int strcharsetcntcmp_map_add(struct strcharsetcntcmp_map_s *t, char c, size_t cnt)
{
assert(t != NULL);
assert(strcharsetcntcmp_map_exists(t, c) == false);
dbgln("%p %p %zu %c %zu", t, t->map, t->cnt, c, cnt);
void *pnt = realloc(t->map, sizeof(t->map[0]) * (t->cnt + 1));
if (pnt == NULL) {
return -errno;
}
t->map = pnt;
t->map[t->cnt].c = c;
t->map[t->cnt].cnt = cnt;
t->cnt++;
return 0;
}
// create map from string, map needs to be initialized by init and needs to be freed with fini
int strcharsetcntcmp_map_parsestring(struct strcharsetcntcmp_map_s *t, const char s[])
{
assert(t != NULL);
assert(s != NULL);
int ret = 0;
while (*s != '\0') {
const char c = *s;
if (!strcharsetcntcmp_map_exists(t, c)) {
const size_t cnt = strcharsetcntcmp_count(s, c);
ret = strcharsetcntcmp_map_add(t, c, cnt);
if (ret != 0) {
break;
}
}
++s;
}
return ret;
}
// compare two maps if they have same sets of characters and counts
bool strcharsetcntcmp_cmp(const struct strcharsetcntcmp_map_s *t, const struct strcharsetcntcmp_map_s *o)
{
assert(t != NULL);
assert(o != NULL);
if (t->cnt != o->cnt) {
return false;
}
for (size_t i = 0; i < t->cnt; ++i) {
const char c = t->map[i].c;
const size_t t_cnt = t->map[i].cnt;
struct strcharsetcntcmp_map_cnt_s *o_map_cnt = strcharsetcntcmp_map_get(o, c);
if (o_map_cnt == NULL) {
dbgln("%p(%zu) %p(%zu) %c not found", t, t->cnt, o, o->cnt, c);
return false;
}
const size_t o_cnt = o_map_cnt->cnt;
if (t_cnt != o_cnt) {
dbgln("%p(%zu) %p(%zu) %c %zu != %zu", t, t->cnt, o, o->cnt, c, t_cnt, o_cnt);
return false;
}
dbgln("%p(%zu) %p(%zu) %c %zu", t, t->cnt, o, o->cnt, c, t_cnt);
}
return true;
}
// map method implementation
bool strcharsetcntcmp_method_map(const char s1[], const char s2[])
{
struct strcharsetcntcmp_map_s map1;
strcharsetcntcmp_map_init(&map1);
if (strcharsetcntcmp_map_parsestring(&map1, s1) != 0) {
abort(); // <insert good error handler here>
}
struct strcharsetcntcmp_map_s map2;
strcharsetcntcmp_map_init(&map2);
if (strcharsetcntcmp_map_parsestring(&map2, s2) != 0) {
abort(); // <insert good error handler here>
}
const bool ret = strcharsetcntcmp_cmp(&map1, &map2);
strcharsetcntcmp_map_fini(&map1);
strcharsetcntcmp_map_fini(&map2);
return ret;
}
bool strcharsetcntcmp(const char s1[], const char s2[])
{
assert(s1 != NULL);
assert(s2 != NULL);
#if STRCHARSETCNTCMP_METHOD_FOREACH
return strcharsetcntcmp_method_foreach(s1, s2);
#elif STRCHARSETCNTCMP_METHOD_MAP
return strcharsetcntcmp_method_map(s1, s2);
#else
#error "define STRCHARSETCNTCMP_METHOD_FOREACH or STRCHARSETCNTCMP_METHOD_MAP"
#endif
}
// unittests. Should return 0
int strcharsetcntcmp_unittest(void)
{
struct {
const char *str1;
const char *str2;
bool eq;
} const tests[] = {
{ "", "", true, },
{ "a", "b", false, },
{ "abc", "bca", true, },
{ "aab", "abb", false, },
{ "aabbbc", "cbabab", true, },
{ "123456789012345678901234567890qwertyuiopqwertyuiopasdfghjklasdfghjklzxcvbnmzxcvbnm,./;", "123456789012345678901234567890qwertyuiopqwertyuiopasdfghjklasdfghjklzxcvbnmzxcvbnm,./;", true },
{ "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890", "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890", true },
{ "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890", "1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678900", false },
};
int ret = 0;
for (size_t i = 0; i < sizeof(tests)/sizeof(tests[0]); ++i) {
const bool is = strcharsetcntcmp(tests[i].str1, tests[i].str2);
if (is != tests[i].eq) {
fprintf(stderr,
"Error: strings '%s' and '%s' returned %d should be %d\n",
tests[i].str1, tests[i].str2, is, tests[i].eq);
ret = -1;
}
}
return ret;
}
int main()
{
return strcharsetcntcmp_unittest();
}
What I want to do is: The user inputs a string with commas; for example: 123456,44,55,,66
and I want to separate it and store in a new array without the commas; for example:
m[0][]={123456}, m[1][]={44}, m[2][]={55}, m[3][]={}, m[4][]={66}
123456 is the student ID number, 44 is the mark for 1st module, 55 is the mark for 2nd module, NULL means that the student didn't take that 3rd module, and 66 is the mark for 4th module.
How exactly can I do that? All I know is that a double comma means the student didn't take the module in that position (here, the 3rd one).
Here is what I have written so far:
#include <stdio.h>
#include <string.h>
#include <ctype.h>
void copystring(char m[],char temp[]);
main()
{
char temp[10000];
char m[10000][10000];
gets(temp);
copystring(m,temp);
printf("%s\n",m);
return 0;
}
void copystring(char m[],char temp[])
{
int i;
int j;
for (j=0;j<(strlen(temp));j++)
{
for (i=0;i<(strlen(temp));i++)
{
if (temp[i]!=*(",")&&temp[i]!=*(" "))
{
m[j][i]=temp[i];
}
}
}
}
I have edited your code to show how to declare a function prototype that takes a 2D array as a parameter. Also, use fgets() instead of gets(). The function returns the number of fields read (the ID plus the marks) as an integer. I think this might help. Run the code, and look at the man pages or search online to understand fgets() better.
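#include <stdio.h>
#include <string.h>
#define SIZE 1000
int stringcopy(char m[][SIZE], const char temp[]);
int main(void)
{
    char temp[SIZE], m[100][SIZE]; // assumes at most 100 fields per line
    if (fgets(temp, SIZE, stdin) == NULL)
        return 1;
    temp[strcspn(temp, "\n")] = '\0'; // fgets keeps the newline; drop it
    int num = stringcopy(m, temp);
    for (int i = 0; i < num; ++i)
        printf("%s\n", m[i]);
    return 0;
}
// Splits temp at each comma into the rows of m; returns the number of rows filled.
int stringcopy(char m[][SIZE], const char temp[]) {
    size_t len = strlen(temp);
    int j = 0, k = 0;
    for (size_t i = 0; i < len; ++i) {
        if (temp[i] != ',')
            m[j][k++] = temp[i];
        else {
            m[j][k] = '\0'; // end the current field and start the next one
            ++j;
            k = 0;
        }
    }
    m[j][k] = '\0'; // terminate the last field
    return j + 1;
}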
You have to solve your problem in several steps.
The first step is to count the tokens in your input string, so that you can allocate enough space to store an array of tokens.
Then you should extract the tokens of the input string into your tokens array. You can use the strtok function from <string.h> for this, but be aware that strtok treats consecutive separators as one, so it never returns the empty token between two adjacent commas.
Finally you can use your tokens however you want, like converting them to long in your case.
EDIT: given the requirements, here is a small implementation of what you could do. I don't check the return values of malloc; you probably should.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char input_string[] = "123456,44,55,,66"; /* example input; take yours from wherever you need */
    // count the separators; there is one more token than there are separators
    size_t tokens_count = 1;
    for (const char *p = input_string; *p != '\0'; p++) {
        if (*p == ',') {
            tokens_count++;
        }
    }
    // build our tokens array; strtok() would skip the empty token in ",,",
    // so each token is cut out by hand with strcspn()
    char **tokens = malloc(sizeof(*tokens) * tokens_count);
    char *input_ptr = input_string;
    for (size_t i = 0; i < tokens_count; i++) {
        size_t tmp_token_length = strcspn(input_ptr, ",");
        if (tmp_token_length != 0) {
            tokens[i] = malloc(tmp_token_length + 1); // +1 for the terminator
            memcpy(tokens[i], input_ptr, tmp_token_length);
            tokens[i][tmp_token_length] = '\0';
        }
        else {
            tokens[i] = NULL; // empty token
        }
        input_ptr += tmp_token_length;
        if (*input_ptr == ',') {
            input_ptr++; // step over the separator
        }
    }
    // populate your array of arrays of integers
    long **m = malloc(sizeof(*m) * tokens_count);
    for (size_t i = 0; i < tokens_count; i++) {
        if (tokens[i] == NULL) {
            m[i] = NULL; // module not taken
        }
        else {
            m[i] = malloc(sizeof(long));
            m[i][0] = strtol(tokens[i], NULL, 10);
        }
    }
    return 0;
}
However, you should probably change your data structure by using structures instead of a massive array.
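As a rough sketch of what "using structures" could look like, assuming a fixed number of modules per student and a sentinel value for a module that was not taken. The names struct student, MODULES, NOT_TAKEN and parse_student are made up for illustration and do not come from the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODULES 4      /* assumed number of modules per student */
#define NOT_TAKEN (-1) /* assumed marker for an empty field */

struct student {
    long id;
    int marks[MODULES];
};

/* Parse a line such as "123456,44,55,,66" into *st.
 * Returns 0 on success, -1 if the line does not start with an ID. */
int parse_student(const char *line, struct student *st)
{
    assert(line != NULL);
    assert(st != NULL);
    size_t len = strcspn(line, ",\n"); /* length of the ID field */
    if (len == 0) {
        return -1;
    }
    st->id = strtol(line, NULL, 10);
    line += len;
    for (int i = 0; i < MODULES; i++) {
        st->marks[i] = NOT_TAKEN;
        if (*line != ',') {
            continue; /* ran out of fields: the rest stay NOT_TAKEN */
        }
        line++; /* step over the comma */
        len = strcspn(line, ",\n");
        if (len != 0) {
            st->marks[i] = (int)strtol(line, NULL, 10);
        }
        line += len;
    }
    return 0;
}
```

With the input from the question, this leaves id at 123456 and marks at 44, 55, NOT_TAKEN, 66. An array of struct student then replaces the large 2D char array.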
Try using scanf to get the input. Your copystring function seems fine, but if there is a problem, debug it to see what the problem is.
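// assumes m is declared as a 2D array, e.g. char m[100][1000]
int pos = 0; // current read position in temp
for (j = 0; temp[pos] != '\0'; j++)
{
    for (i = 0; temp[pos] != ',' && temp[pos] != ' ' && temp[pos] != '\0'; i++, pos++)
    {
        m[j][i] = temp[pos];
    }
    m[j][i] = '\0'; // must end a string with null character.
    if (temp[pos] != '\0')
    {
        pos++; // step over the separator
    }
}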
and for printing use
printf("%s",m[0]);// make a loop for it
You can read entire string using fgets, or scanf and then use strtok(string, ",") to get substrings between commas.
There are many ways to detect whether a student has a missing entry; two of them are:
1) Count the number of sub-strings you get before strtok returns NULL.
2) Search the input string for the substring ,, using strstr.
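Both checks can be sketched roughly as follows. The helper names count_fields and has_empty_field are invented for this example. Note that strtok modifies the string it scans and skips empty fields, so the count tells you that something is missing but not which field; strstr only finds an empty field between two commas, not one at the very start or end of the line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Count the non-empty comma-separated fields of s.
 * strtok() writes terminators into s, so s must be a modifiable buffer. */
size_t count_fields(char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (char *tok = strtok(s, ","); tok != NULL; tok = strtok(NULL, ",")) {
        n++;
    }
    return n;
}

/* True if the line contains two adjacent commas, i.e. an empty field. */
bool has_empty_field(const char *s)
{
    assert(s != NULL);
    return strstr(s, ",,") != NULL;
}
```

Call has_empty_field before count_fields if you use both on the same buffer, because strtok overwrites the commas it finds.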
I'm trying to make a quick function that gets a word/argument in a string by its number:
char* arg(char* S, int Num) {
char* Return = "";
int Spaces = 0;
int i = 0;
for (i; i<strlen(S); i++) {
if (S[i] == ' ') {
Spaces++;
}
else if (Spaces == Num) {
//Want to append S[i] to Return here.
}
else if (Spaces > Num) {
return Return;
}
}
printf("%s-\n", Return);
return Return;
}
I can't find a way to put the characters into Return. I have found lots of posts that suggest strcat() or tricks with pointers, but every one segfaults. I've also seen people saying that malloc() should be used, but I'm not sure how I'd use it in a loop like this.
I will not claim to understand what it is that you're trying to do, but your code has two problems:
1) You're assigning a read-only string to Return; that string will be in your binary's data section, which is read-only, and if you try to modify it you will get a segfault.
2) Your for loop is O(n^2), because strlen() is O(n) and it is called on every iteration.
There are several different ways of solving the "how to return a string" problem. You can, for example:
1) Use malloc() / calloc() to allocate a new string, as has been suggested
2) Use asprintf(), which is similar but gives you formatting if you need it (note that it is a GNU/BSD extension, not standard C)
3) Pass an output string (and its maximum size) as a parameter to the function
The first two require the calling function to free() the returned value. The third allows the caller to decide how to allocate the string (stack or heap), but requires some sort of contract about the minimum size needed for the output string.
In your code, Return points to the string literal "", which must not be modified, so appending characters to it is undefined behavior. It might appear to work, but you should never rely on it.
Typically in C, you'd want to pass the "return" string as an argument instead, so that you don't have to free it all the time. Both require a local variable on the caller's side, but malloc'ing it will require an additional call to free the allocated memory and is also more expensive than simply passing a pointer to a local variable.
As for appending to the string, just use array notation (keep track of the current char/index) and don't forget to add a null character at the end.
Example:
int arg(char* ptr, char* S, int Num) {
int i, Spaces = 0, cur = 0;
for (i=0; i<strlen(S); i++) {
if (S[i] == ' ') {
Spaces++;
}
else if (Spaces == Num) {
ptr[cur++] = S[i]; // append char
}
else if (Spaces > Num) {
ptr[cur] = '\0'; // insert null char
return 0; // returns 0 on success
}
}
ptr[cur] = '\0'; // insert null char
return (cur > 0 ? 0 : -1); // returns 0 on success, -1 on error
}
Then invoke it like so:
char myArg[50];
if (arg(myArg, "this is an example", 3) == 0) {
printf("arg is %s\n", myArg);
} else {
// arg not found
}
Just make sure you don't overflow ptr (e.g.: by passing its size and adding a check in the function).
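One way that check might look, as a sketch only: the function name arg_bounded and the extra size parameter are assumptions made for this example, and words are taken to be separated by single spaces as in the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Copy word number Num (0-based) of S into ptr, writing at most size bytes
 * including the terminator.
 * Returns 0 on success, -1 if the word is missing or does not fit. */
int arg_bounded(char *ptr, size_t size, const char *S, int Num)
{
    assert(ptr != NULL);
    assert(S != NULL);
    if (size == 0) {
        return -1; /* no room even for the terminator */
    }
    int Spaces = 0;
    size_t cur = 0;
    for (size_t i = 0; S[i] != '\0' && Spaces <= Num; i++) {
        if (S[i] == ' ') {
            Spaces++;
        } else if (Spaces == Num) {
            if (cur + 1 >= size) {
                ptr[0] = '\0'; /* word does not fit: leave an empty string */
                return -1;
            }
            ptr[cur++] = S[i]; /* append char */
        }
    }
    ptr[cur] = '\0';
    return cur > 0 ? 0 : -1;
}
```

The caller passes sizeof on its own buffer, for example arg_bounded(myArg, sizeof myArg, "this is an example", 3). Testing S[i] against '\0' also avoids calling strlen() on every iteration.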
There are a number of ways you could improve your code, but let's just start by making it meet the standard. ;-)
P.S.: Don't malloc unless you need to. And in this case you don't.
char * Return; //by the way horrible name for a variable.
Return = malloc(<some size>);
......
......
*(Return + index) = *(S+i);
You can't assign anything to a string literal such as "".
You may want to use your loop to determine the offset of the start of the word you're looking for. Then find its length by continuing through the string until you encounter the end or another space. Then you can malloc an array of chars with size equal to that length + 1 (for the null terminator). Finally, copy the substring into this new buffer and return it.
Also, as mentioned above, you may want to move the strlen call out of the loop condition. An optimizing compiler may be able to hoist it, but otherwise it is a linear scan repeated for every character in the array, making the loop O(n**2).
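The steps above could be sketched like this. The function name word_dup is made up for the example, words are assumed to be separated by single spaces, and returning NULL for a missing word is a design choice of this sketch and not something the question asks for.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'ed copy of word number num (0-based), or NULL if there is
 * no such word or the allocation fails. The caller must free() the result. */
char *word_dup(const char *s, unsigned num)
{
    assert(s != NULL);
    /* walk to the start of the word; testing *s avoids calling strlen() */
    while (num > 0 && *s != '\0') {
        if (*s == ' ') {
            num--;
        }
        s++;
    }
    if (num > 0) {
        return NULL; /* fewer words than requested */
    }
    size_t len = strcspn(s, " "); /* length up to the next space or the end */
    if (len == 0) {
        return NULL; /* empty word */
    }
    char *out = malloc(len + 1); /* +1 for the null terminator */
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, s, len);
    out[len] = '\0';
    return out;
}
```

The string is scanned once, so the whole lookup is linear in the length of the input.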
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *arg(const char *S, unsigned int Num) {
char *Return = "";
const char *top, *p;
unsigned int Spaces = 0;
int i = 0;
Return=(char*)malloc(sizeof(char));
*Return = '\0';
if(S == NULL || *S=='\0') return Return;
p=top=S;
while(Spaces != Num){
if(NULL!=(p=strchr(top, ' '))){
++Spaces;
top=++p;
} else {
break;
}
}
if(Spaces < Num) return Return;
if(NULL!=(p=strchr(top, ' '))){
int len = p - top;
Return=(char*)realloc(Return, sizeof(char)*(len+1));
strncpy(Return, top, len);
Return[len]='\0';
} else {
free(Return);
Return=strdup(top);
}
//printf("%s-\n", Return);
return Return;
}
int main(){
char *word;
word=arg("make a quick function", 2);//quick
printf("\"%s\"\n", word);
free(word);
return 0;
}