Comparing elements from two 2d-arrays in C without strcmp - c

I am creating a spell-checker in C. I have a dictionary array, which is a 2D array, so each word in the dictionary takes a row. In the same way, my input array is also a 2D array. I want to check the spelling of the rows/words in my input array. I cannot use strcmp.
An example of an input array:
['boy','girll','.','friend',' '] - can contain spaces, punctuation and words. We only care about spelling words;
if punctuation or a space is compared against a word, we ignore it and move on to the next word.
An example of the dictionary:
['boy','girl','cow'...] - all are words
My code is:
for (int a = 0; a < MAX_INPUT_SIZE + 1; a++)
{
    for (int b = 0; b < MAX_DICTIONARY_WORDS; b++)
    {
        if(tokens[a]==dict_token[b])
        {
            printf("correct");
        }
        else
        {
            printf("wrong");
        }
    }
}
The output is all "wrong", though 5 of the 6 input words should be correct.

Every test returns false because the comparison you're using,
if(tokens[a]==dict_token[b])
is comparing two pointers that are never going to point at the same address, because the tokens you are testing live in a completely separate area of memory from the dict_token dictionary that you are comparing them with.
You need to pass the two pointers tokens[a] and dict_token[b] to a comparison function that will perform a letter-by-letter comparison, and which will return one value when it finds a difference between them, and another when it gets to the end of both without finding a difference. In other words, you need to write an implementation of strcmp.
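A minimal sketch of such a comparison, assuming each row of both arrays is a NUL-terminated string. The names words_equal, in_dictionary and MAX_WORD_LEN are hypothetical and do not come from the question; skipping punctuation and spaces is left out.

```c
#include <stddef.h>

#define MAX_WORD_LEN 32  /* assumed row width of both 2D arrays */

/* Hypothetical helper: returns 1 if a and b hold the same characters, 0 otherwise. */
static int words_equal(const char *a, const char *b)
{
    /* Walk both strings while they agree and a has not ended. */
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    /* Equal only if both stopped on the same character, i.e. both on '\0'. */
    return *a == *b;
}

/* Hypothetical helper: returns 1 if word matches any of the n rows of dict. */
static int in_dictionary(const char *word,
                         const char dict[][MAX_WORD_LEN], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (words_equal(word, dict[i])) {
            return 1;
        }
    }
    return 0;
}
```

With this shape, the outer loop prints one verdict per input token, for example printf("%s", in_dictionary(tokens[a], dict_token, count) ? "correct" : "wrong"), instead of one per token/dictionary pair.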

Related

Sorting integer array by ascending order, segmentation fault: 11

The function is supposed to sort an array of random integers in ascending order. I found a method for solving this problem, the bubble sort, swapping a and b if b < a. However, my implementation, or the lack of it, keeps returning a segmentation fault: 11. Could it have something to do with the parameter "int *tab" or the subscripts I'm using while swapping elements?
void ft_sort_integer_table(int *tab, int size)
{
    int i;
    int j;
    int t;
    i = 1;
    j = 0;
    t = 0;
    while (tab[j] != '\0')
    {
        if (tab[i] < tab[j])
        {
            t = tab[i];
            tab[i] = tab[j];
            tab[j] = t;
        }
        i++;
        j++;
    }
}
Unless you are guaranteed beforehand that the value 0 terminates your buffer and doesn’t appear elsewhere in the array (like you are with null terminated strings) you can’t test for tab[i] being zero to determine that you have reached the end of the array. Your function takes size as a parameter too; why not use that?
EDIT: Also, no comparison-based sorting algorithm runs in O(n) in the general case. Bubble sort, which looks like what you’re trying to implement, requires two nested loops.
Leaving aside the correctness of this implementation of the sorting algorithm (it seems wrong), the segmentation fault is caused by the null-termination check that you are doing. The NUL character ('\0') terminates strings, i.e. char arrays, in C and is used to signal where they end. It doesn't work with int arrays. You should be using the size argument to iterate over the array.
You do not use the size parameter. Instead you are trying to find a null terminator, which an int array is not supposed to have (unlike a C string). So you have to compare j with size and keep swapping until the array is fully sorted.
Also, it is better to use size_t size instead of int size if you want to be pedantic.
You pass an array and a size to your sorting function but do not use the size anywhere so potentially i and j could go out of bounds causing undefined behavior.
An int array can contain 0s, so you need another criterion for when your sorting is finished. E.g. when you go through all the elements in the array [0..size-1] and do not do a swap, then it is sorted.
Firstly, your while loop logic is wrong. The character '\0' refers to the null character at the end of a string. Comparing it with elements of an int array doesn't make sense.
Secondly, the logic you implemented compares adjacent elements of the array in a single pass, rather than comparing each element with all the others and placing it. I would recommend you study bubble sort. GeeksforGeeks is the best source for CSE guys. Hope this helps. Cheers! Feel free to ask questions.
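Putting the answers above together, a bubble sort bounded by size might look like the sketch below. It keeps the signature from the question; the swapped flag is an addition that implements the "stop after a pass with no swaps" criterion.

```c
#include <stdbool.h>

/* Bubble sort sketch: the loops are bounded by size, never by a sentinel value. */
void ft_sort_integer_table(int *tab, int size)
{
    bool swapped = true;

    /* Repeat passes until one completes without swapping anything. */
    while (swapped) {
        swapped = false;
        for (int i = 0; i + 1 < size; i++) {
            if (tab[i + 1] < tab[i]) {
                int t = tab[i];
                tab[i] = tab[i + 1];
                tab[i + 1] = t;
                swapped = true;
            }
        }
    }
}
```

Because the inner loop only reads tab[i] and tab[i + 1] with i + 1 < size, it stays in bounds even when the array contains zeros or size is 0.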

Find longest suffix of string in given array

Given a string and an array of strings, find the longest suffix of the string that appears in the array.
For example:
string = google.com.tr
array = tr, nic.tr, gov.nic.tr, org.tr, com.tr
returns com.tr
I have tried to use binary search with specific comparator, but failed.
C-code would be welcome.
Edit:
I should have said that I'm looking for a solution where I can do as much work as possible in a preparation step (when I only have an array of suffixes, and I can sort it in every way possible, build any data structure around it, etc.), and then for a given string find its suffix in this array as fast as possible. Also, I know that I can build a trie out of this array, and this will probably give me the best possible performance, BUT I'm very lazy, and keeping a trie in raw C in a huge piece of tangled enterprise code is no fun at all. So some binsearch-like approach would be very welcome.
Assuming constant-time addressing of characters within strings, this problem is isomorphic to finding the largest prefix (reverse every string first).
1. Let i = 0.
2. Let S = null.
3. Let c = prefix[i].
4. Remove strings a from A if a[i] != c and if A. Replace S with a if a.Length == i + 1.
5. Increment i.
6. Go to step 3.
Is that what you're looking for?
Example:
prefix = rt.moc.elgoog
array = rt.moc, rt.org, rt.cin.vog, rt.cin, rt
Pass 0: prefix[0] is 'r' and array[j][0] == 'r' for all j so nothing is removed from the array. i + 1 -> 0 + 1 -> 1 is our target length, but none of the strings have a length of 1, so S remains null.
Pass 1: prefix[1] is 't' and array[j][1] == 't' for all j so nothing is removed from the array. However there is a string that has length 2, so S becomes rt.
Pass 2: prefix[2] is '.' and array[j][2] == '.' for the remaining strings so nothing changes.
Pass 3: prefix[3] is 'm' and array[j][3] != 'm' for rt.org, rt.cin.vog, and rt.cin so those strings are removed.
etc.
Another naïve, pseudo-answer.
Set boolean "found" to false. While "found" is false, iterate over the array comparing the source string to the strings in the array. If there's a match, set "found" to true and break. If there's no match, use something like strchr() to get to the segment of the string following the first period. Iterate over the array again. Continue until there's a match, or until the last segment of the source string has been compared to all the strings in the array and failed to match.
Not very efficient....
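The segment-by-segment idea above can be combined with the "binsearch-like" preparation step the question asks for: sort the array once, then look up each dot-delimited tail of the string with bsearch(). This is only a sketch under the assumption that matches must start at the beginning of the string or right after a '.'; the names cmp_str, prepare_suffixes and longest_suffix are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Comparator for qsort/bsearch over an array of char pointers. */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Preparation step: sort the suffix array once. */
void prepare_suffixes(const char **suffixes, size_t n)
{
    qsort(suffixes, n, sizeof *suffixes, cmp_str);
}

/* Returns the longest entry of the sorted array that equals a
   dot-aligned tail of s, or NULL if there is none. */
const char *longest_suffix(const char *s, const char *const *sorted, size_t n)
{
    const char *tail = s;

    while (tail != NULL) {
        /* Tails are tried longest first, so the first hit is the answer. */
        const char *const *hit =
            bsearch(&tail, sorted, n, sizeof *sorted, cmp_str);
        if (hit != NULL) {
            return *hit;
        }
        tail = strchr(tail, '.');
        if (tail != NULL) {
            tail++;  /* continue with the segment after the dot */
        }
    }
    return NULL;
}
```

Each lookup costs one binary search per segment of the input string, so the work per query depends on the number of dots rather than on the number of suffixes scanned linearly.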
Naive, pseudo-answer:
Sort the array of suffixes by length, longest first (yes, there may be strings of the same length, which I think is a problem with the question you are asking).
Iterate over the array and check whether each entry is a suffix of the given string.
If it is, exit the loop because you are done! If not, continue.
Alternatively, you could skip the sorting and just iterate, assigning the biggestString if the currentString is bigger than the biggestString that has matched.
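The no-sorting variant can be sketched as a single pass that remembers the longest entry the string ends with. The names ends_with and longest_matching_suffix are hypothetical. Note the assumption: this is a plain string-suffix test, so "nic.tr" would also match "xnic.tr"; add a check for a preceding '.' if that matters.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if s ends with suffix, 0 otherwise. */
static int ends_with(const char *s, const char *suffix)
{
    size_t ls = strlen(s);
    size_t lx = strlen(suffix);

    return lx <= ls && strcmp(s + (ls - lx), suffix) == 0;
}

/* Returns the longest entry of arr that s ends with, or NULL if none matches. */
const char *longest_matching_suffix(const char *s,
                                    const char *const *arr, size_t n)
{
    const char *best = NULL;
    size_t best_len = 0;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(arr[i]);
        if (ends_with(s, arr[i]) && (best == NULL || len > best_len)) {
            best = arr[i];
            best_len = len;
        }
    }
    return best;
}
```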
Edit 0:
Maybe you could improve this by looking at your array beforehand and considering "minimal" elements that need to be checked.
For instance, if .com appears in 20 members you could just check .com against the given string to potentially eliminate 20 candidates.
Edit 1:
On second thought, in order to compare elements in the array you will need to use a string comparison. My feeling is that any gain you get out of an attempt at optimizing the list of strings for comparison might be negated by the expense of comparing them before doing so, if that makes sense. Would appreciate if a CS type could correct me here...
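If your array of strings is something along the following:
char string[STRINGS][MAX_STRING_LENGTH];
strcpy(string[0], "google.com.tr");
strcpy(string[1], "nic.tr");
etc., then you can simply do this:
/* needs <string.h> */
size_t x, longest = 0, max = 0;
/* find the index of the longest string */
for (x = 0; x < STRINGS; x++) {
    size_t len = strlen(string[x]);
    if (len > max) {
        max = len;
        longest = x;
    }
}
/* advance to the first '.' (or to the end of the string) */
x = 0;
while (string[longest][x] != '\0' && string[longest][x] != '.') {
    x++;
}
if (string[longest][x] == '.') {
    x++; /* skip the dot itself */
}
/* copy everything after the first '.' */
char output[MAX_STRING_LENGTH];
size_t y = 0;
while (string[longest][x] != '\0') {
    output[y++] = string[longest][x++];
}
output[y] = '\0';
(The above code may not actually work (errors, etc.), but you should get the general idea.)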
Why don't you use suffix arrays? They work well when you have a large number of suffixes.
Complexity is O(n (log n)^2); there are O(n log n) versions too.
Implementation in c here. You can also try googling suffix arrays.

C - How can I sort and print an array in a method but have the prior unsorted array not be affected

This is for a Deal or No Deal game.
So in my main function I'm calling my casesort method as such:
casesort(cases);
My method looks like this, I already realize it's not the most efficient sort but I'm going with what I know:
void casesort(float cases[10])
{
    int i;
    int j;
    float tmp;
    float zero = 0.00;
    for (i = 0; i < 10; i++)
    {
        for (j = 0; j < 10; j++)
        {
            if (cases[i] < cases[j])
            {
                tmp = cases[i];
                cases[i] = cases[j];
                cases[j] = tmp;
            }
        }
    }
    //Print out box money amounts
    printf("\n\nHidden Amounts: ");
    for (i = 0; i < 10; i++)
    {
        if (cases[i] != zero)
            printf("[$%.2f] ", cases[i]);
    }
}
So when I get back to my main it turns out the array is sorted. I thought void would prevent the method from returning a sorted array. I need to print out the actual case numbers, which I do by skipping over any case that is populated with 0.00. But after the first round of case picks I get "5, 6, 7, 8, 9, 10" printing out back in my main. I need it to print the cases according to what has been picked. I feel like it's a simple fix; it's just that my knowledge of the specifics of C is still growing. Any ideas?
A return type of void has nothing to do with preventing the array from being sorted. It just says that the function does not return anything.
You see that the passed array itself is affected because an array decays to a pointer when passed to a function. Make a copy of the array and then pass it. That way you have the original list.
In C, arrays are effectively passed by reference, i.e. they're passed as a pointer to the first element (the pointer itself is passed by value). So when you pass cases into your function, you're actually giving it the original array to modify. Try creating a copy and sorting the copy rather than the actual array. Creating a copy wouldn't be bad as you have only 10 floats.
Instead of rolling your own sort, consider using qsort() or std::sort() if you are actually using c++
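A sketch of the qsort() route applied to a copy, so the caller's array is left alone. The names NUM_CASES, cmp_float and sorted_copy are hypothetical; the size 10 is taken from the question.

```c
#include <stdlib.h>
#include <string.h>

#define NUM_CASES 10  /* array length used in the question */

/* qsort comparator for floats, ascending. */
static int cmp_float(const void *a, const void *b)
{
    float x = *(const float *)a;
    float y = *(const float *)b;

    return (x > y) - (x < y);
}

/* Writes a sorted copy of cases into sorted; cases itself is not modified. */
void sorted_copy(const float cases[NUM_CASES], float sorted[NUM_CASES])
{
    /* sizeof cases[0] is the element size; sizeof cases would be a pointer size. */
    memcpy(sorted, cases, NUM_CASES * sizeof cases[0]);
    qsort(sorted, NUM_CASES, sizeof sorted[0], cmp_float);
}
```

The printing loop from casesort can then run over sorted, while main keeps using the untouched cases array for the case numbers.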
There are 2 obvious solutions. 1) Make a copy of the array and sort the copy (easy, wastes some memory, likely not a problem these days). 2) Create a parallel array of integers and perform an index sort, i.e., instead of sorting the original, you sort the index array and then access the array through the index when you want the sorted version, and use the raw unsorted array otherwise.
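The index-sort option might look like the sketch below. It uses a simple insertion sort on the index array because qsort() has no way to pass the values array to its comparator; index_sort is a hypothetical name.

```c
#include <stddef.h>

/* Fills order[0..n-1] so that values[order[0]] <= values[order[1]] <= ...
   The values array itself is never modified. */
void index_sort(const float *values, size_t *order, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        order[i] = i;
    }
    /* Insertion sort on the indices, comparing the values they refer to. */
    for (size_t i = 1; i < n; i++) {
        size_t key = order[i];
        size_t j = i;
        while (j > 0 && values[order[j - 1]] > values[key]) {
            order[j] = order[j - 1];
            j--;
        }
        order[j] = key;
    }
}
```

Printing values[order[i]] for i from 0 to n - 1 gives the sorted amounts, and order[i] + 1 is still the original case number.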
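Well, make a local copy of your input and sort it. Something like this:
void casesort(float cases[10])
{
    float localCases[10];
    /* cases is really a float * here, so sizeof(cases) would be the size of a pointer */
    memcpy(localCases, cases, sizeof(localCases)); /* memcpy is declared in <string.h> */
    ...
Then use localCases to do your sorting.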
If you don't want the array contents to be affected, then you'll have to create a copy of the array and pass that to your sorting routine (or create the copy within the routine itself).
Arrays Are Different™ in C; see my answer here for a more detailed explanation.

C - Returning the most repeated/occurring string in an array of char pointers

I have almost completed the code for this problem, which I shall state as under:
Given:
Array of length 'n' (say n = 10000) declared as below,
char **records = malloc(10000*sizeof(*records));
Each record[i] is a char pointer and points to a non-empty string.
records[i] = malloc(11);
The strings are of fixed length (10 chars + '\0').
Requirement:
Return the most frequently occurring string in the above array.
But now, I am interested in obtaining a slightly less brutal algorithm than the primitive one which I have currently, which is to sift through the entire array in two for loops :(, storing strings encountered by the two loops in a temporary array of similar size ('n' - in case all are unique strings) for comparison with the next strings. The inner loop iterates from 'outer loop position + 1' to 'n'. At the same time, I have an integer array, of similar size - 'n', for counting repeat occurrences, with each i th element corresponding to the i th (unique) string in the comparison array. Then find the largest integer and use its index in the comparison array to return the most frequently occurring string.
I hope I am clear enough. I am quite ashamed of the algo myself, but it had to be done. I am sure there is a much smarter way to do this in C.
Have a great Sunday,
Cheers!
Without being good at nice algorithms (Google, Wikipedia and Stackoverflow are good enough for me), one solution that comes out at the top of my head is to sort the array, then use a single loop to go through the entries. As long as the current string is the same as the previous, increase a counter for that string. When done you have a "list" of strings and their occurrence, which can then be sorted if needed.
In most languages, the usual approach would be to construct a hashtable, mapping strings to counts. This has O(N) complexity.
For example, in Python (although usually you would use collections.Counter for this, and even this code can be made more concise using more specialised Python knowledge, but I've made it explicit for demonstration).
def most_common(strings):
    counts = {}
    for s in strings:
        if s not in counts:
            counts[s] = 0
        counts[s] += 1
    return max(counts, key=counts.get)
But in C, you don't have a hashtable in the standard library (although in C++ you can use std::unordered_map, or the non-standard hash_map on pre-C++11 toolchains), so a sort and scan can be done instead. It's O(N log N) complexity, which is worse than optimal, but quite practical.
Here's some C (actually C99) code that implements this.
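#include <stdlib.h>
#include <string.h>

// qsort hands the comparator pointers to the array elements (char **),
// not the strings themselves, so dereference once before comparing.
int compare_strings(const void *s0, const void *s1) {
    return strcmp(*(const char *const *)s0, *(const char *const *)s1);
}

// Note: this sorts records in place.
const char *most_common(const char **records, size_t n) {
    qsort(records, n, sizeof(records[0]), compare_strings);
    const char *best = 0; // The most common string found so far.
    size_t max = 0; // The longest run found.
    size_t run = 0; // The length of the current run.
    for (size_t i = 0; i < n; i++) {
        // records[i - run] is the first string of the current run.
        if (strcmp(records[i], records[i - run]) == 0) {
            run += 1;
        } else {
            run = 1;
        }
        if (run > max) {
            best = records[i];
            max = run;
        }
    }
    return best;
}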
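For comparison, the hashtable approach mentioned above can also be hand-rolled in C. The following is only a sketch, not a library-grade table: it assumes an open-addressing layout with linear probing and an FNV-1a style hash, and the names hash_str and most_common_hashed are hypothetical. It does not reorder the input.

```c
#include <stdlib.h>
#include <string.h>

/* FNV-1a style string hash (32-bit constants). */
static size_t hash_str(const char *s)
{
    size_t h = 2166136261u;

    while (*s != '\0') {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Returns the most frequent string in records[0..n-1], or NULL if n == 0
   or memory could not be allocated. */
const char *most_common_hashed(const char *const *records, size_t n)
{
    const char *best = NULL;
    size_t best_count = 0;
    size_t cap = 1;

    if (n == 0) {
        return NULL;
    }
    while (cap < 2 * n) {
        cap *= 2;  /* power of two, so the table is at most half full */
    }
    /* calloc zero-fills; this assumes all-bits-zero is a null pointer,
       which holds on mainstream platforms. */
    const char **keys = calloc(cap, sizeof *keys);
    size_t *counts = calloc(cap, sizeof *counts);

    if (keys != NULL && counts != NULL) {
        for (size_t i = 0; i < n; i++) {
            size_t slot = hash_str(records[i]) & (cap - 1);
            /* Linear probing: step past slots holding a different string. */
            while (keys[slot] != NULL && strcmp(keys[slot], records[i]) != 0) {
                slot = (slot + 1) & (cap - 1);
            }
            keys[slot] = records[i];
            counts[slot]++;
            if (counts[slot] > best_count) {
                best_count = counts[slot];
                best = keys[slot];
            }
        }
    }
    free(keys);
    free(counts);
    return best;
}
```

The expected running time is O(N) for N strings of bounded length, at the cost of two temporary arrays of roughly 2N to 4N slots each.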

verify each element to one other in a array pointer

C Experts,
I have an array of pointers to strings. I need to compare each array element with all other array elements and throw error if they are same. Here is the piece of code I have written and got stuck. Please help me.
# define FOUND 1
# define NOTFOUND 0
int k,flag,a;
char cmp_string[10]; //used to get one array element to compare with all other array elements
char *values[]={010,020,030,040}; //valid case that's how it should be
char *vales[]={010,020,020,030}; //wrong or throw error because in array i should have only unique values
int size=4;
for(k=0; k<=size;k++){
    strcpy(values[k],cmp_string);
    flag=NOTFOUND;
    int counter=k+1;
    for(int n=counter;n<=size;n++)
    {
        a=((strcmp(values[n],cmp_string) || (strcmp(values[k-1],cmp_string)))
        // stuck here what if k value is 2 I wont be able to compare with zero or first element of array.
        if(a==0){
            throw error same name for the operation
            flag=FOUND;
            break;
        }
    }//for int n;
}//for int k;
if(flag==NOTFOUND){
    True or PASS
}
}
Quick solution: sort the array (using e.g. the builtin qsort function), then scan it comparing adjacent elements; if two are the same, you have a repetition.
You can also know before completing the sort that you have duplicates if in the comparison function you find that the two compared items are the same.
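A sketch of the sort-then-scan check, assuming the array holds pointers to NUL-terminated strings such as "010" and "020". The names cmp_str and has_duplicates are hypothetical. It sorts a temporary copy of the pointer array so the caller's order is preserved.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of char pointers. */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Returns 1 if any two of the n strings are equal, 0 if all are unique,
   and -1 if the temporary copy could not be allocated. */
int has_duplicates(const char *const *values, size_t n)
{
    int found = 0;

    if (n < 2) {
        return 0;
    }
    const char **copy = malloc(n * sizeof *copy);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, values, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_str);
    /* After sorting, equal strings are adjacent. */
    for (size_t i = 1; i < n && !found; i++) {
        found = strcmp(copy[i - 1], copy[i]) == 0;
    }
    free(copy);
    return found;
}
```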
If I understand your question correctly, you're trying to turn strcmp into something that returns nonzero if the strings are the same and zero otherwise. Since strcmp returns 0 for equal strings, that means testing its result against 0:
a = (strcmp(whatever) == 0) || (strcmp(whatever else) == 0);
