Hash function in C

I am having trouble implementing my hash function for my hash table.
I want to hash my words such that A = 1, B = 2, C = 3, and so on. The position of the letter in the word is irrelevant, since we will consider permutations of the word. Moreover, the case of the letter will be irrelevant in this problem as well, so the value of a = the value of A = 1.
And for strings, abc = 1 + 2 + 3 = 6, bc = 2 + 3 = 5, etc.
For collisions such as ab = 3 and aaa = 3, I already have a way to handle that situation. Right now I just want to get the hash value.
The problem I am having right now is that aaa is giving me 1, and ab is giving me 2.
Below is my code:
int hash(char *word)
{
int h = 1;
int i, j;
char *A;
char *a;
// an array of 26 slots for 26 uppercase letters in the alphabet
A = (char *)malloc(26 * sizeof(char));
// an array of 26 slots for 26 lowercase letters in the alphabet
a = (char *)malloc(26 * sizeof(char));
for (i = 0; i < 26; i++) {
A[i] = (char)(i + 65); // fill the array from A to Z
a[i] = (char)(i + 97); // fill the array from a to z
}
for (i = 0; i < strlen(word); i++) {
//printf("HIT\n");
for (j = 0; j < 26; j++) {
// upper and lower case have the same hash value
if (word[i] == A[j] || word[i] == a[j]) {
h = h + j; // get the hash value of the word
//printf("HIT 2\n");
break;
}
}
}
printf("H: %d\n", h);
return h;
}

I think that changing
int h = 1;
to
int h = 0;
and
h = h + j;
to
h = h + j + 1;
will fix the issue.
Another problem is that you never free the memory obtained from malloc. Also, there is no need to cast the result of malloc (and family) in C.
This
for (i = 0; i < strlen(word); i++) {
will call strlen in every iteration of the loop. This will reduce the performance of your program. Use
int len = strlen(word);
for (i = 0; i < len; i++) {
instead, which is much faster as strlen isn't called in every iteration. Lastly, sizeof(char) is 1. So you can omit it.
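Putting the suggestions above together, a minimal sketch could look like the following. The name letter_sum_hash is made up for illustration, and skipping non-letter characters is an assumption, since the question does not say how they should be treated.

```c
#include <stddef.h>
#include <string.h>

/* Sum of letter positions (A/a = 1 ... Z/z = 26).
   Assumption: characters other than ASCII letters are skipped. */
int letter_sum_hash(const char *word)
{
    int h = 0;                        /* start from 0, not 1 */
    size_t len = strlen(word);        /* computed once, outside the loop */

    for (size_t i = 0; i < len; i++) {
        char c = word[i];
        if (c >= 'A' && c <= 'Z')
            h += c - 'A' + 1;         /* +1 because 'A' is position 1 */
        else if (c >= 'a' && c <= 'z')
            h += c - 'a' + 1;
    }
    return h;
}
```

With this version aaa and ab both hash to 3, as the question expects, and since nothing is allocated on the heap there is nothing to free.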

Change h = h + j to h = h + j + 1
and h = 1 to h = 0.
Also, you should free the allocated memory, so include these lines just before the return:
free(A);
free(a);
However, I don't understand why such complicated code was written for such a simple task.
Much simpler code can be written:
int hash(char *word)
{
int sum=0;
while(*word != '\0')
{
if(*word >='A' && *word < 'A'+26)
sum=sum+(*word -'A' + 1);
else if(*word >='a' && *word < 'a'+26)
sum=sum+(*word -'a' + 1);
else
return -1;
word++;
}
return sum;
}

Multiple issues:
You still aren't freeing the arrays you allocated
Initial value of 1 for h makes no sense
You add the index to the hash. 'A' and 'a' are at index 0, so you're adding 0 in that case (so no matter how many 'a's you pass, your code returns 1)
Why a dynamic array? You know the size, it isn't going to change. You could use
char A[26];
char a[26]; // you can also add initialisation, e.g. = {'a', 'b', ...
Why an array in the first place?
So, here is the quick fix, staying close to your code.
Taking all of the above into account, you could simplify to:
int hash(char const * string) {
int h = 0;
for (; *string; ++string) {
int index = tolower((unsigned char)*string) - 'a' + 1; /* tolower() is declared in <ctype.h> */
if ((index > 0) && (index < 27)) {
h += index;
}
}
return h;
}
If only words made up entirely of letters should be hashed, the caller needs a way to detect the words that must be ignored:
char hash(char const * string, int * h) {
*h = 0;
for (; *string; ++string) {
int index = tolower((unsigned char)*string) - 'a' + 1; /* tolower() is declared in <ctype.h> */
if ((index > 0) && (index < 27)) {
*h += index;
} else {
return 0;
}
}
return 1;
}
That way you can use the return value to test if the word should be ignored.

Related

print string based on the frequency of character in C

I was solving a LeetCode question in C.
Question:
Given a string s, sort it in decreasing order based on the frequency of the characters. The frequency of a character is the number of times it appears in the string.
Return the sorted string. If there are multiple answers, return any of them.
Example 1:
Input: s = "tree"
Output: "eert"
Explanation: 'e' appears twice while 'r' and 't' both appear once.
So 'e' must appear before both 'r' and 't'. Therefore "eetr" is also a valid answer.
I tried a different approach instead of taking an array[255] and incrementing the value at each character's ASCII index, but I am getting a segmentation fault and I do not understand why. The only assumption I make is that the input string is always in upper case.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *frequencySort(char *s)
{
int strlenn = strlen(s);
int fq[strlenn];
// init with 1 because at least 1 time char occur in input str.
for (int i = 0; i < strlenn; i++)
{
fq[i] = 1;
}
// count the frequency of each char and replace duplicate chars with *
// ex: ABCDA => ABCD*
for (int i = 0; i < strlenn - 1; i++)
{
if (s[i] == '*')
{
continue;
}
for (int j = (i + 1); j < strlenn; j++)
{
if (s[i] == s[j])
{
fq[i]++;
s[j] = (char)'*'; // segmentation violation error shows here
fq[j] = 0;
}
}
}
// sort frequencies in descending order, together with the string chars
// ex: ABCDDDAA = freq[3, 3, 1, 1] = sorted str = [ A, D, B, C ]
for (int i = 0; i < strlenn - 1; i++)
{
for (int j = i + 1; j < strlenn; j++)
{
if (fq[i] < fq[j])
{
// swap
int temp = fq[i];
fq[i] = fq[j];
fq[j] = temp;
// swap string char
char temp1 = s[i];
s[i] = s[j]; // segmentation violation error shows here
s[j] = temp1;
}
}
}
char *result = (char *)calloc(strlenn + 1, sizeof(char));
int l = 0;
for (int i = 0; fq[i] != 0 && i < strlenn; i++)
{
int k = fq[i];
while (k > 0)
{
printf("%c", s[i]);
result[l++] = s[i];
k--;
}
}
result[l] = '\0';
return result;
}
int main()
{
char *s = "ABCDDDAA";
frequencySort(s);
return 0;
}
Thanks!
In frequencySort, you are modifying s (e.g.):
s[j] = (char)'*'; // segmentation violation error shows here
In main, the s argument comes from:
char *s = "ABCDDDAA";
frequencySort(s);
Here the pointer s itself is a local variable and lives on the stack.
But because it points to a string literal, the actual string data typically goes into the .rodata section, which is mapped read-only. Modifying a string literal is undefined behavior in C; on most systems the write triggers a protection exception, which is the segmentation violation you see.
To fix this, define a string array:
char s[] = "ABCDDDAA";
frequencySort(s);
Now the literal data is [still] in the .rodata section. But, when main starts up, it is copied into the s array which is on the stack [which is writable].
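Another option is to leave the input untouched, so that string literals can be passed safely. The sketch below is not the code from the question: it swaps in the count-array approach the question mentions, and the name frequency_sort_copy is hypothetical. The caller is assumed to free the returned buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string with the characters of s ordered by
   decreasing frequency, or NULL if allocation fails. s is never modified. */
char *frequency_sort_copy(const char *s)
{
    size_t counts[256] = {0};
    size_t len = strlen(s);
    char *result = malloc(len + 1);
    if (result == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++)
        counts[(unsigned char)s[i]]++;

    size_t out = 0;
    while (out < len) {
        /* pick the remaining character with the highest count;
           ties go to the lower character code */
        int best = 0;
        for (int c = 1; c < 256; c++)
            if (counts[c] > counts[best])
                best = c;
        for (size_t k = counts[best]; k > 0; k--)
            result[out++] = (char)best;
        counts[best] = 0;             /* this character is consumed */
    }
    result[out] = '\0';
    return result;
}
```

Because the function only reads s, calling it as frequency_sort_copy("ABCDDDAA") with a literal is fine; the price is one extra allocation per call.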

My system shows wrong output for correct code, while the same code gives correct output on other devices. How do I rectify this issue?

I have written this code in C (it computes n factorial):
#include <stdio.h>
#include <string.h>
int main()
{
int n;
scanf("%d", &n);
char str[200];
str[0] = '1';
int k;
int t = 0;
int carry = 0;
for (int i = 1; i <= n; i++)
{
carry = 0;
for (int j = 0;; j++)
{
int arr = str[j] - 48;
k = arr * i + carry;
arr = k % 10;
str[j] = arr + 48;
carry = k / 10;
if (carry == 0 && str[j + 1] == '\0')
{
break;
}
if (carry != 0 && str[j + 1] == '\0')
{
for (int r = j;; r++)
{
str[r + 1] = (carry % 10) + 48;
carry = carry / 10;
if (carry == 0)
{
str[r + 2] = '\0';
t = 1;
break;
}
}
break;
}
}
}
int len = strlen(str);
// // printf("%d\n",len);
char prr[200];
for (int i = 0; i < len; i++)
{
int b = len - i - 1;
printf("%c", str[b]);
}
// printf(" %s\n", str);
return 0;
}
On other systems (including online C compilers) it shows the correct answer:
Input=7
Output=5040
On my system (laptop):
Input=7
Output=+,*)'040
My laptop is an HP Envy 13-ab070TU
OS: Windows 10 Home
System type: 64-bit operating system, x64-based processor
I have also tried my code in virtual machines on my laptop, running Ubuntu and Kali, but the result is the same: the output is wrong.
What is the reason for this, and how can I rectify it?
You're filling digits in to your str array, but str is not necessarily a proper, null-terminated string.
At the end, you call strlen(str) to discover how many digits you computed in your result. But since str is not necessarily a null-terminated string, strlen doesn't necessarily get the right answer.
str is a local (stack allocated) variable, and you don't give it an initializer, so it starts out containing unpredictable garbage.
If str happens to start out containing zeroes (which it might), your program will happen to work. But if it contains one or more nonzero bytes, strlen might compute too long a length, so your digit-printing loop at the end might print some extra characters, as you saw on your laptop.
There are two or three ways to fix this.
Call memset(str, '\0', sizeof(str)); to fill the array with 0.
Initialize the array: char str[200] = "";. (It turns out that will fill the whole array with 0.)
Keep track of the number of digits some other way. I suspect it's the maximum value ever taken on by k or r, or something like that.
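The third option could be sketched as follows. This is a rewrite under assumptions rather than a patch of the original program: the function name factorial_string, the 200-digit limit and the return convention are all hypothetical. The point is that the digit count lives in its own variable, so no strlen call and no terminator are needed while multiplying.

```c
#include <stddef.h>

/* Writes n! in decimal into out (capacity cap, including the terminator).
   Returns the number of digits written, or 0 if n is negative or a
   buffer is too small. */
size_t factorial_string(int n, char *out, size_t cap)
{
    unsigned char digits[200];        /* least significant digit first */
    size_t ndigits = 1;               /* explicit digit count */
    digits[0] = 1;

    if (n < 0)
        return 0;

    for (int i = 2; i <= n; i++) {
        int carry = 0;
        for (size_t j = 0; j < ndigits; j++) {
            int k = digits[j] * i + carry;
            digits[j] = (unsigned char)(k % 10);
            carry = k / 10;
        }
        while (carry != 0) {          /* append the leftover carry digits */
            if (ndigits == sizeof digits)
                return 0;
            digits[ndigits++] = (unsigned char)(carry % 10);
            carry /= 10;
        }
    }

    if (ndigits + 1 > cap)
        return 0;
    for (size_t j = 0; j < ndigits; j++)   /* reverse into the output */
        out[j] = (char)('0' + digits[ndigits - 1 - j]);
    out[ndigits] = '\0';
    return ndigits;
}
```

For the input 7 from the question this produces the string 5040 regardless of what the stack happened to contain beforehand.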

Compute the most frequent digit in a string of digits in C

I am trying to compute the most frequent digit in a string of characters, and I need to use pointers but I am not sure how to go about this with pointers.
int most(char* string){
int counter = 0;
int* array =(int*) malloc(sizeof(int)*10);
char* original = string;
while(*original){
counter++;
string++;
//Not sure what to put in this loop
}
}
for example, I want to call the code
char nums[] = "132433423";
printf("%d \n",most(nums));
// 3
The specification for your function is incomplete:
can the string contain non-digit characters?
what should be returned if there are no digits at all?
which value should be returned if there are multiple digits with the same maximum number of occurrences?
should the function return the digit or its numeric value? Your main() function uses the latter, but it is not clear from the text of the question.
The most function receives a pointer to the string. You can write a loop where you handle one character at a time and increment the pointer for the next iteration until you reach the end of the string. You must also decide what to return if the string contains no digits.
Here is a simple example:
int most(const char *s) {
int count[10] = { 0 };
int res, i, max;
while (*s) {
if (*s >= '0' && *s <= '9')
count[*s - '0']++;
s++;
}
res = -1; /* return -1 if no digits */
max = 0;
for (i = 0; i < 10; i++) {
if (count[i] > max) {
max = count[i]; /* remember the highest count seen so far */
res = i;
}
}
return res;
}
If you are restricted from using any array at all, allocating a block of memory seems indeed a good solution:
int most(const char *s) {
int *count = calloc(sizeof(*count), 10);
int res, i, max;
while (*s) {
if (*s >= '0' && *s <= '9')
*(count + *s - '0') += 1;
s++;
}
res = -1; /* return -1 if no digits */
max = 0;
for (i = 0; i < 10; i++) {
if (*(count + i) > max) {
max = *(count + i); /* remember the highest count seen so far */
res = i;
}
}
free(count);
return res;
}
The notation *(count + *s - '0') += 1 works this way: count is a pointer to an array of int allocated and initialized to 0 by calloc. *s - '0' is the digit value n of the character pointed to by s, that has been tested to be a digit. count + *s - '0' is a pointer to the n-th entry in the array. *(count + *s - '0') += 1 increments this value by one.
There are ways to do this without memory allocation, with 10 variables and explicit tests for the different digits, but I doubt this is the expected solution.
If you can explain your choices to your teacher, there are 2 ways to use arrays without the [ and ] characters. These are obsolescent features of the C Standard, which most programmers are not familiar with, and which you can ignore unless you are curious:
int most(const char *s) { /* version with C99 digraphs */
int count<:10:> = { 0 };
int res, i, max;
while (*s) {
if (*s >= '0' && *s <= '9')
count<:*s - '0':>++;
s++;
}
res = -1; /* return -1 if no digits */
max = 0;
for (i = 0; i < 10; i++) {
if (count<:i:> > max) {
max = count<:i:>; /* remember the highest count seen so far */
res = i;
}
}
return res;
}
Or
int most(const char *s) { /* version with old-style trigraphs */
int count??(10??) = { 0 };
int res, i, max;
while (*s) {
if (*s >= '0' && *s <= '9')
count??(*s - '0'??)++;
s++;
}
res = -1; /* return -1 if no digits */
max = 0;
for (i = 0; i < 10; i++) {
if (count??(i??) > max) {
max = count??(i??); /* remember the highest count seen so far */
res = i;
}
}
return res;
}
You could first sort the string so that the smaller digit characters appear first in nums. You could use qsort() (from stdlib.h) like this:
int cmpfn(const void *a, const void *b)
{
int x = *(char *)a;
int y = *(char *)b;
return x-y;
}
int main()
{
char nums[] = "132433423";//"111222223333";//
qsort(nums, sizeof(nums)/sizeof(nums[0]) -1, sizeof(nums[0]), cmpfn);
printf("\nAfter sorting: %s", nums);
. . . . . . . . . .
. . . . . . . . . .
}
Declare variables to store the mode (ie, the value that appears most frequently in the data) and the frequency of the mode value.
int mode=-1, modecount=-1, n;
Now find the frequency of each digit character. Since this is a sorted character array, the same value will appear consecutively.
for(char *ptr=nums, *lptr=NULL; *ptr; ptr+=n)
{
lptr = strrchr(ptr, *ptr);
n = lptr - ptr + 1;
if(n>modecount)
{
printf("\n%c", *ptr);
modecount = n;
mode = *ptr;
}
}
printf("\nMode is: %c", mode);
strrchr() (from string.h) will find the last occurrence of a character in a string.
#include<string.h>
int most (char* nums) {
int i, max_index = 0;
int digit_dictionary[10]={0,0,0,0,0,0,0,0,0,0};
for (i=0; i< strlen(nums); i++) {
digit_dictionary[nums[i]-'0'] ++;
}
for (i=1; i<10; i++) {
if (digit_dictionary[i]> digit_dictionary[max_index])
max_index = i;
}
return max_index;
}
I will try to be as elaborate as I can:
You create a dictionary in which each index corresponds to a digit that can occur (0-9). Then, iterate through the string (which is basically an array of characters) and count each digit at its corresponding index in the dictionary.
Note: [nums[i]-'0'] is calculated into the index of the dictionary since each char has an integer value (look up ASCII table). The counter at that index is incremented to keep count of number of occurrences of that digit.
After that, go through the dictionary to determine at which position is the digit with most occurrences, and return that digit.
I'm not sure what you mean by "using pointers", but here's a version which doesn't use pointers except for walking the input string:
char most_frequent_character(char *s)
{
int freq[10] = {0}; /* counters must start at zero */
int max_freq;
int max_idx;
int idx;
while(*s)
freq[*s++ - '0']++; /* compute character freqs */
max_idx = 0;
max_freq = freq[0];
for(idx = 1 ; idx < 10 ; ++idx)
if(freq[idx] > max_freq)
{
max_freq = freq[idx];
max_idx = idx;
}
return '0' + max_idx;
}
Have fun.
EDIT
To convert the above to "use pointers":
A. Change freq to a pointer-to-int and initialize it using malloc; in addition, initialize the memory pointed to by freq using memset:
int *freq = malloc(sizeof(int) * 10);
memset(freq, 0, sizeof(int)*10);
B. In the "compute character freqs" loop, use pointer references instead of indexing:
while(*s)
{
*(freq + (*s - '0')) = *(freq + (*s - '0')) + 1;
s++;
}
C. Use a pointer ref to set the initial value of max_freq:
max_freq = *freq;
D. In the for loop, use pointer math instead of indexing:
for(idx = 1 ; idx < 10 ; ++idx)
if( *(freq + idx) > max_freq)
{
max_freq = *(freq + idx);
max_idx = idx;
}
E. Free the memory allocated earlier before the return statement:
free(freq);
return '0' + max_idx;
Now, sit down and understand why things are done in the way that they are here. For example, why didn't I do the following when computing the character frequencies?
while(*s++)
*(freq + (*s - '0')) = *(freq + (*s - '0')) + 1;
or
while(*s)
*(freq + (*s++ - '0')) = *(freq + (*s++ - '0')) + 1;
Each of the above would save several lines of code - why shouldn't they be used? (The obvious answer is "because they won't work as intended" - but WHY?)
Best of luck.
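For reference, steps A to E assembled into one function might look like the sketch below. It substitutes calloc for the malloc plus memset pair, and it adds two things the steps do not cover (both are assumptions): non-digit characters are skipped, and '\0' is returned if the allocation fails. The name most_frequent_digit is hypothetical.

```c
#include <stdlib.h>

/* Returns the most frequent digit character in s ('0' if s has no digits,
   '\0' if memory could not be allocated). Ties go to the smaller digit. */
char most_frequent_digit(const char *s)
{
    int *freq = calloc(10, sizeof *freq);   /* ten zeroed counters */
    int max_freq, max_idx = 0, idx;

    if (freq == NULL)
        return '\0';

    while (*s) {
        if (*s >= '0' && *s <= '9')
            *(freq + (*s - '0')) += 1;      /* same as freq[*s - '0']++ */
        s++;
    }

    max_freq = *freq;
    for (idx = 1; idx < 10; ++idx) {
        if (*(freq + idx) > max_freq) {
            max_freq = *(freq + idx);
            max_idx = idx;
        }
    }

    free(freq);
    return (char)('0' + max_idx);
}
```

Called on the string from the question, most_frequent_digit("132433423") returns '3'.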

How to avoid duplicates when finding all k-length substrings

I want to display all substrings with k letters, one per line, but avoid duplicate substrings. I managed to write to a new string all the k length words with this code:
void subSent(char str[], int k) {
int MaxLe, i, j, h, z = 0, Length, count;
char stOu[1000] = {'\0'};
Length = (int)strlen(str);
MaxLe = maxWordLength(str);
if((k >= 1) && (k <= MaxLe)) {
for(i = 0; i < Length; i++) {
if((int)str[i] == 32) {
j = i = i + 1;
} else {
j = i;
}
for(; (j < i + k) && (Length - i) >= k; j++) {
if((int)str[j] != 32) {
stOu[z] = str[j];
} else {
stOu[z] = str[j + 1];
}
z++;
}
stOu[z] = '\n';
z++;
}
}
}
But I'm struggling with the part that should keep only one copy of each substring.
For example, for the string HAVE A NICE DAY
and k = 1, it should print:
H
A
V
E
N
I
C
D
Y
Your subSent() routine poses a couple of challenges: first, it neither returns nor prints its result, so you can only see it in the debugger; second, it calls maxWordLength(), which you didn't supply.
Although avoiding duplicates can be complicated, in the case of your algorithm, it's not hard to do. Since all your words are fixed length, we can walk the output string with the new word, k letters (plus a newline) at a time, doing strncmp(). In this case the new word is the last word added so we quit when the pointers meet.
I've reworked your code below and added a duplication elimination routine. I didn't know what maxWordLength() does so I just aliased it to strlen() to get things running:
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#define maxWordLength strlen
// does the last (fixed size) word in string appear previously in string
bool isDuplicate(const char *string, const char *substring, size_t n) {
for (const char *pointer = string; pointer != substring; pointer += (n + 1)) {
if (strncmp(pointer, substring, n) == 0) {
return true;
}
}
return false;
}
void subSent(const char *string, int k, char *output) {
int z = 0;
size_t length = strlen(string);
int maxLength = maxWordLength(string);
if (k >= 1 && k <= maxLength) {
for (int i = 0; i < length - k + 1; i++) {
int start = z; // where does the newly added word begin
for (int j = i; (z - start) < k; j++) {
output[z++] = string[j];
while (string[j + 1] == ' ') {
j++; // assumes leading spaces already dealt with
}
}
output[z++] = '\n';
if (isDuplicate(output, output + start, k)) {
z -= k + 1; // last word added was a duplicate so back it out
}
while (string[i + 1] == ' ') {
i++; // assumes original string doesn't begin with a space
}
}
}
output[z] = '\0'; // properly terminate the string
}
int main() {
char result[1024];
subSent("HAVE A NICE DAY", 1, result);
printf("%s", result);
return 0;
}
I somewhat cleaned up your space avoidance logic but it can be tripped by leading spaces on the input string.
OUTPUT
subSent("HAVE A NICE DAY", 1, result);
H
A
V
E
N
I
C
D
Y
subSent("HAVE A NICE DAY", 2, result);
HA
AV
VE
EA
AN
NI
IC
CE
ED
DA
AY
subSent("HAVE A NICE DAY", 3, result);
HAV
AVE
VEA
EAN
ANI
NIC
ICE
CED
EDA
DAY

Caesar code in C extra letter for result

#include <stdio.h>
void caesar(char bemenet[], char eredmeny[], int n){
int i = 0;
for(i = 0; bemenet[i] != '\0'; i++) {
if(bemenet[i] == 'z') {
eredmeny[i] = 'a';
eredmeny[i] += n-1;
}
else
{
eredmeny[i] += n;
}
}
eredmeny[i] = '\0';
}
int main(){
char tomb1[]="caesarkodolas";
char tomb2[]="";
caesar(tomb1,tomb2,1);
printf("%s \n",tomb2);
return 0;
}
The output I expect for eredmeny (the result) is
"dbftbslpepmb", but tomb2 comes out as ☺dbftbslpepmb, which is not OK because it has an extra character (☺) at the start.
Allocate enough memory for the second parameter, and change this line
eredmeny[i] += n;
to this:
eredmeny[i] = bemenet[i] + n;
Note that this is not a bulletproof implementation of Caesar cipher: it would work for n==1, but it will break for larger n.
You need to think of a different way of implementing the "wrap-around": rather than testing for 'z' and replacing it with 'a', compute the new position of a letter modulo 26, and then add 'a' to it:
void caesar(char bemenet[], char eredmeny[], int n){
int i;
for(i = 0; bemenet[i] != '\0'; i++) {
// Compute the position of the replacement letter
int pos = (bemenet[i] - 'a' + n) % 26;
// Place the letter into the output.
eredmeny[i] = 'a' + pos;
}
eredmeny[i] = '\0';
}
First of all, tomb2 must be big enough to store the result.
For example, as mentioned above:
char tomb2[255] = {0};
You also have an error here:
else
{
eredmeny[i] += n;
}
You have to assign a valid ASCII value to eredmeny[i], so change this line to
eredmeny[i] = bemenet[i] + n;
Also, it is usually bad practice to pass a pointer to an array without passing its size; that makes buffer overflows easy.
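A minimal sketch of a variant that takes the output size as a parameter is shown below. The name caesar_n, the return convention and the decision to copy non-lowercase characters unchanged are assumptions made for illustration, not part of the original code.

```c
#include <stddef.h>

/* Shifts lowercase ASCII letters in `in` by n positions (n may be negative
   or larger than 26) and writes the result to `out`, which holds `out_size`
   bytes. Other characters are copied unchanged (an assumption).
   Returns 0 on success, -1 if `out` is too small. */
int caesar_n(const char *in, char *out, size_t out_size, int n)
{
    size_t i;
    int shift = ((n % 26) + 26) % 26;   /* normalise to 0..25 */

    for (i = 0; in[i] != '\0'; i++) {
        if (i + 1 >= out_size)
            return -1;                   /* no room for char + terminator */
        if (in[i] >= 'a' && in[i] <= 'z')
            out[i] = (char)('a' + (in[i] - 'a' + shift) % 26);
        else
            out[i] = in[i];
    }
    if (i >= out_size)
        return -1;                       /* no room for the terminator */
    out[i] = '\0';
    return 0;
}
```

A call such as caesar_n(tomb1, tomb2, sizeof tomb2, 1) then fails cleanly with -1 when tomb2 is declared as char tomb2[] = "", instead of writing past the end of the array.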
You're not doing the maths right.
If you are using just lower case letters then you need to add n, but then many letters will be "after" z, so you need to start again at a.
You want something more like this:
for(i = 0; bemenet[i] != '\0'; i++) {
int encrypted = bemenet[i] + n;
if (encrypted > 'z') encrypted -= 26; // wrap around so that 'z' + 1 becomes 'a'
eredmeny[i] = (char)encrypted;
}
(and also fix the output array size as described in other answers here).
