Word count in C, learning more CS


After about 5 years of programming in dynamic languages such as Python and JS, I am starting to feel I'm missing out on what happens under the hood. Languages such as these are really great because they let you focus on what you have to do, sparing you the trouble of working with pointers, memory allocation and the many searching, sorting and inserting algorithms. I never regret using these languages, as I really feel they are ridiculously powerful, but I feel that in order to become a better programmer I need to take a step back and understand what happens under the hood!
I decided to do this by writing a simple word counter: the app takes all its command-line arguments and outputs each unique word with a count. For example, "Hello world Hello" would give "Hello: 2", "world: 1" (ignoring the actual output format). The Python equivalent of this program is:
import sys
from collections import defaultdict
def main():
    results = defaultdict(int)
    for word in sys.argv[1:]:
        results[word] += 1
    print results
Writing it in C is a bit different. I feel like I'm getting something utterly wrong with pointers, arrays of pointers and all that stuff! I want to get better, help me get better!!
#include <stdio.h>
#include <stdlib.h>
// This is what a key-value pair: <int, string>
typedef struct {
int counter;
unsigned char* word;
} hashmap;
// Checks if inside the array of results, hashmap->word is equals to word paramter
hashmap* get_word_from_results(hashmap* results[], int count, const char* word) {
int i;
hashmap* result;
for (i = 0; i < count; i++) {
result = results[i];
if (result->word == (unsigned char *)word)
return result;
}
return NULL;
}
int main(int argc, const char *argv[])
{
hashmap* results;
int results_counter = 0;
int i;
const char* word;
for (i = 1; i < argc; i++) {
word = argv[i];
hashmap* result = get_word_from_results(&results, results_counter, word);
// If result is NULL, means word is not inserted yet, let's create a new hashmap and insert it inside the array
if (result == NULL) {
hashmap h;
h.counter = 1;
h.word = (unsigned char *)word;
results = realloc(NULL, (results_counter + 1) * sizeof(hashmap) );
// NOTE: potential memory leak? would h be deallocated?
results[results_counter] = h;
results_counter++;
printf("NEW\n");
} else {
// The word already exists in the hashmap array, let's increase it by 1
result->counter++;
printf("INCREMENTED\n");
}
}
return 0;
}
Can anyone give me some advice? what am I doing wrong here? Are my pointers okay? also I think I spotted a memory leak (see comments), would anyone like to submit their version??
Thanks!! you guys are so cool!!
Daniel

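The major pointer issue in your program is the realloc call. Because it is always given NULL, it allocates a fresh block each time, so earlier entries are lost and the old block leaks; pass results instead. For the first call to then behave like malloc, results must start out as NULL, since handing realloc an uninitialized pointer is undefined behavior. Initialize the pointer like this:
hashmap* results = NULL;
The other problem is comparing strings: you need to use strcmp (from <string.h>) rather than ==, which only compares addresses. Remember that strcmp returns zero when the strings are equal.
There are also memory leaks at the end of your program. You should free results. The words themselves point into argv, so they must not be freed unless you start storing copies made with malloc or strdup.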
Of course the thing that you call hashmap behaves precisely like a dynamic array. Programming a hash table in C presents a different level of challenge, however, so I would encourage you to make your current approach work.
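Putting those fixes together, a minimal sketch of the dynamic-array approach might look like this. The names word_count, find_word and count_words are mine, not from the question, and the sketch assumes the words are borrowed from argv (so only the array itself is ever freed). It also passes the array itself to the lookup function rather than &results.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One entry: a word (borrowed from the caller, not owned) and its count. */
typedef struct {
    const char *word;
    int counter;
} word_count;

/* Linear search by content; returns the matching entry or NULL. */
word_count *find_word(word_count *entries, size_t count, const char *word) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(entries[i].word, word) == 0)
            return &entries[i];
    }
    return NULL;
}

/* Counts words[0..n-1]. Returns a malloc'd array and stores its length in
   *out_count. Returns NULL (with *out_count == 0) if n is 0 or if an
   allocation fails. */
word_count *count_words(char *const *words, size_t n, size_t *out_count) {
    assert(out_count != NULL);
    word_count *entries = NULL;   /* NULL so the first realloc acts like malloc */
    size_t count = 0;
    *out_count = 0;
    for (size_t i = 0; i < n; i++) {
        word_count *hit = find_word(entries, count, words[i]);
        if (hit != NULL) {
            hit->counter++;
            continue;
        }
        /* Grow through a temporary so the old block is not lost on failure. */
        word_count *grown = realloc(entries, (count + 1) * sizeof *grown);
        if (grown == NULL) {
            free(entries);
            return NULL;
        }
        entries = grown;
        entries[count].word = words[i];
        entries[count].counter = 1;
        count++;
    }
    *out_count = count;
    return entries;
}
```

From main you would call count_words(argv + 1, argc - 1, &n), print each entry's word and counter, and finish with a single free on the returned array.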

Related

Can I convert char*[20] to char[][20]?

I've corrected the program by myself now.
this is still the -Never- answered question:
I have a 2D array of chars in which each row will contain one word. I split a char* word by word with a function that places the words in the array. My problem is that it doesn't print the words but random characters. Could it be a problem with pointers? I'm not sure about the conversion of char*[20] to char[][20],
because I want to filter a char* spamArray[20] into a char[][20].
I need to pass a char*[20] to the filter, which takes a char[][20] argument.
This is the call:
char* spam = "this is a string";
//spam isn't actually initialized this way, but this is just for explaining what it contains
//OLD QUESTION CODE:char (*spamArray)[20] = (char(*)[20])malloc((sizeof(char) * 20) * nSpam);
//new:
char spamArray[nSpam][20];
//nSpam is the number of words
splitstring(spam, &spamArray[0], nSpam);
This is the function splitstring into words
inline void splitstring(char *str, char (*arr)[20], size_t t)
{
size_t i = 0; //character index within the current string of the array
while(t != 0)
{
if (*str != ' ' && *str != '\0')
{
(*arr)[i] = *str;
*str++;
i++;
}
else
{
t--;
*str++;
(*arr)[i] = '\0';
*arr++;
i = 0;
}
}
}
then I'll call a function which is for testing and printing the words in the 2D array (spamArray)
filter_mail(&txt, spamArray) //i call the function this way
void filter_mail(char **line, char spam[][20], int nSpam)
{
char *line_tmp = *line;
bool isSpam = 0;
int a = 0;
int c = 0;
while(nSpam!= 0)
{
if (spam[a][c] != '\0')
{
printf("%c", spam[a][c]);
c++;
}
else
{
a++;
c = 0;
nSpam--;
}
}
}
Then it prints random things every time and the program crashes.
Also, how should I free a spamArray?
is it correct to free it this way?
free(spamArray)
I haven't got any answer right now because everyone pointed out that using char[][] doesn't work. Well of course it doesn't. I don't even use it in the source code. That was just the title of the question. Please read everything before any other answer.

i have a 2D Array
No, you don't. 2D arrays don't exist in C99 or C11, and don't exist in C++11. BTW, even if C++17 added more containers to the C++11 and C++14 standards, they did not add matrixes.
Arrays (both in C and in C++) are always unidimensional. In some weird cases, you could have arrays of arrays (each component should have the same type, so same dimension, same size, same alignment), but this is so confusing that you should not even try.
(and your code and your numerous comments show that you are very confused. It is OK, programming is difficult to learn; you need years of work)
Can i convert char*[] to char[][]?
No, because the char[][] type does not exist and cannot exist (both in C++11 or C++14 and in C99 or C11) because arrays should have elements of the same fixed and known size and type.
Look into existing libraries (such as Glib), at least for inspiration. Study the source code of relevant free software projects (e.g. on github).
Beware of undefined behavior; code that is wrong (like yours) may still fail to crash cleanly. Be scared of UB!
Then it prints random things every time and the program crashes
Typical case of UB (probably elsewhere in your code). You are lucky to observe a crash. Sometimes UB is much more insidious.
coding in C99 or C11
First, spend more time in reading documentation. Read several books first. Then look into some reference site. At last, read carefully the n1570 standard of C11. Allocate a week of dense work for that purpose (and don't touch your code at all during that time; perhaps carry on some tiny experiments on toy code unrelated to your project and use the debugger to understand what is going on in the computer).
You may have an array of 16-byte wide strings; I rarely do that, but when I do I prefer to name intermediate types:
typedef char sixteenbytes_ty[16];
extern sixteenbytes_ty array[];
You might code extern char array[][16]; but that is so confusing that I got it wrong -because I never do that- and you really should never code that.
This declares a global array containing elements of 16 bytes arrays. Again, I don't recommend doing that.
As a rule of thumb: never use so called "2D arrays" (in reality arrays of arrays) in C. If you need matrixes of variable dimensions (and you probably don't) implement them as an abstract data type like here.
If you manipulate data which happens to have 16 byte, make a struct of them:
struct mydata_st {
char bytes[16];
};
it is much more readable.
You may have an array of pointers, e.g. char*[] (each pointer has a fixed size, 8 bytes on my Linux/x86-64 machine, which is not the same as the allocated size of the memory zone pointed by it).
You probably should start over entirely (throwing away your current code) and think in terms of abstract data types. I strongly recommend reading SICP (freely downloadable). So first, write on paper the specification (the complete list of operations, or the interface or API of your library), using some natural language like English or Italian.
Perhaps you want some kind of vector of strings, or matrix of chars (I don't understand what you want, and you probably did not specify it clearly enough on paper).
If coding in C99, consider using some flexible array members in some of your (internal) type implementations.
Perhaps you decide that you handle some array of dynamically allocated strings (each obtained by strdup or asprintf etc...).
So perhaps you want some kind of dynamic vector of dynamically allocated types. Then define first the exact list of operations on them. Read the wikipage on flexible array members. It could be very useful.
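As a rough illustration of the flexible array member idea, here is a sketch of a length-prefixed string type. The names counted_string and counted_string_new are hypothetical and not taken from any library; the only point is that the header and the characters live in one allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A string that stores its own length; bytes[] is a C99 flexible array member. */
struct counted_string {
    size_t length;
    char bytes[];   /* storage follows the header in the same allocation */
};

/* Allocates header and characters in one block; returns NULL on failure. */
struct counted_string *counted_string_new(const char *text) {
    assert(text != NULL);
    size_t length = strlen(text);
    struct counted_string *s = malloc(sizeof *s + length + 1);
    if (s == NULL)
        return NULL;
    s->length = length;
    memcpy(s->bytes, text, length + 1);   /* copy including the terminator */
    return s;
}
```

A single free(s) releases both the header and the characters, which is the main convenience over a struct holding a separate char* field.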
BTW, compile with all warnings and debug info, so compile with gcc -Wall -Wextra -g your C code (if using GCC). Use the debugger gdb to understand better the behavior of your program on your system, run it step by step, query the state of the debugged process.
coding in C++11 or newer
If coding in C++11 (which is not the same language as C99) use existing types and containers. Again, read some good book (like this) more documentation and reference. C++ is a very difficult programming language, so spend several weeks in reading.

No, you can't. That's because char[][] is an array of incomplete type, thus is invalid (so it doesn't exist at all). Array elements must be of complete types, that is, the size, alignment and layout of the type must be determined at compile time.
Please stop arguing the existence of char[][]. I can assure you that it doesn't exist, at all.
Or go Google.
Fixed-length array is a candidate solution:
char[][32]
but dynamic memory allocation (with an array of pointers) is better because the size of the allocated memory is flexibly changeable. Then you can declare the function like:
void filter_mail(char **line, char **spam);
Or as suggested. At the very least, you should do it like this (but you can't omit m):
void foo(size_t m, char (*)[m]);
You can never declare char[][] because pointer-array conversion can be done only at the top level, i.e. char*[] and char[][] (that's because of operator precedence).
I bet you don't know at all what you're doing here in splitstring():
while(t != 0)
{
    if (*str != ' ' && *str != '\0')
    {
        *arr[c] = *str;
        *str++;
        c++;
    }
    else
    {
        t--;
        *str++;
        *arr[c] = '\0';
        *arr++;
        c = 0;
    }
}
Because *arr[c] is equivalent to arr[c][0], you're just copying str to the first element in each string of arr. Get parentheses so it looks like (*arr)[c]. Then remove the asterisk before pointer increment (you don't use the value from dereferencing at all):
while(t != 0)
{
    if (*str != ' ' && *str != '\0')
    {
        (*arr)[c] = *str;
        str++;
        c++;
    }
    else
    {
        t--;
        str++;
        (*arr)[c] = '\0';
        arr++;
        c = 0;
    }
}
It should be fine now.
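To see the precedence difference concretely, here is a small sketch. The two helper names are made up for illustration. Since [] binds tighter than unary *, the expression *arr[c] is parsed as *(arr[c]), which is arr[c][0], whereas (*arr)[c] is arr[0][c].

```c
#include <assert.h>
#include <stddef.h>

/* *arr[c] is *(arr[c]): writes the FIRST character of row c. */
void write_without_parens(char (*arr)[20], size_t c, char ch) {
    *arr[c] = ch;
}

/* (*arr)[c] is arr[0][c]: writes character c of the row arr points to. */
void write_with_parens(char (*arr)[20], size_t c, char ch) {
    assert(c < 20);
    (*arr)[c] = ch;
}
```

With a char rows[3][20] array, write_without_parens(rows, 2, 'x') changes rows[2][0], while write_with_parens(rows, 2, 'y') changes rows[0][2].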
Finally, don't cast the result of malloc. If spamArray came from malloc, freeing it with free(spamArray) is the standard way and is fine; an array declared as char spamArray[nSpam][20] must not be freed at all.

This is a program that's a version of your program that does what you seem to want to do:
#include <stdlib.h>
#include <ctype.h>
#include <assert.h>
#include <stdio.h>
const int nSpam = 30;
char* spam = "this is a string";
char spamArray[30][20];
//nSpam is the number of words
void splitstring(char *str, char arr[][20], size_t nSpam)
{
int word_num = 0;
int char_num = 0;
int inspace = 0;
for (char *i = str; *i != '\0'; ++i)
{
if (!isspace(*i))
{
inspace = 0;
assert(word_num < nSpam);
arr[word_num][char_num++] = *i;
assert(char_num < 20);
}
else
{
if (!inspace)
{
arr[word_num++][char_num] = '\0';
char_num = 0;
inspace = 1;
}
}
}
if (!inspace)
{
arr[word_num++][char_num] = '\0';
}
while (word_num < nSpam)
{
arr[word_num++][0] = '\0';
}
}
void filter_mail(char const * const *line, char spam[][20], size_t nSpam)
{
int a = 0;
while (a < nSpam)
{
int c = 0;
while (spam[a][c] != '\0')
{
printf("%c", spam[a][c++]);
}
printf("\n");
++a;
}
}
char const * const mail_msg[] = {
"This is a line.\n",
"This is another line.\n",
0
};
int main()
{
splitstring(spam, spamArray, 30);
filter_mail(mail_msg, spamArray, 30);
return 0;
}
I warn you that this is a poor design that will cause you no end of problems. It's very much the wrong approach.
There is no need to free anything here because it's all statically allocated.

Program for finding the minimum value in an array of integers (in C) won't compile

I'm trying to come up with a program that reads in numbers from the command line, turns the argv array into integers, and then finds the smallest integer in the array of those integers.
Below is my code for this program, can anyone help me out?
#include <stdio.h>
#include <stdlib.h>
int *integerizeArgs(int, char **);
int *findMin(int, int *);
int *integerizeArgs(int argc, char **argv)
{
int i = 0;
int *a = malloc(sizeof(int) * (argc-1));
for(i= 1; i < argc; ++i){
a[i-1] = atoi(argv[1]);
return a;
}
return 0;
}
int *findMin(int itemCount, int *a) {
int i, smallest = a[0];
for (i=0; i < itemCount; i++) {
if(a[i] < smallest) {
smallest = a[i];
return smallest;
}
return 0;
}
return 0;
}
int main(int argc, char **argv){
int *a = integerizeArgs(argc, argv);
int b = findMin(argc, a[0]);
printf("%d", b);
return 0;
}

Write proper code:
You should check whether malloc() was successful.
Watch out for typos: argv[1] should be argv[i] here.
Do not use return when you don't want to return from the function yet.
Use proper types. Distinguish between "normal" integers and pointers.
Watch out for off-by-one errors.
In this case, argc-1 elements are allocated, not argc elements.
You should free whatever you allocated.
Corrected code:
#include <stdio.h>
#include <stdlib.h>
int *integerizeArgs(int, char **);
int findMin(int, int *);
int *integerizeArgs(int argc, char **argv)
{
int i = 0;
int *a = malloc(sizeof(int) * (argc-1));
if (a == NULL){ /* add error check */
perror("malloc");
exit(1);
}
for(i= 1; i < argc; ++i){
a[i-1] = atoi(argv[i]); /* convert each arguments instead of only the first one */
/* don't return when the process is not done */
}
return a; /* return the result */
}
/* use proper return type */
int findMin(int itemCount, int *a) {
int i, smallest = a[0];
for (i=0; i < itemCount; i++) {
if(a[i] < smallest) {
smallest = a[i];
/* don't return when the process is not done */
}
/* don't return when the process is not done */
}
return smallest; /* return the result */
}
int main(int argc, char **argv){
int *a = integerizeArgs(argc, argv);
/* pass the (pointer to) the array instead of the first element of the array (&a[0] is also OK) */
/* pass correct itemCount (there are argc-1 items because the first argument typically is the command) */
int b = findMin(argc - 1, a);
printf("%d", b);
free(a); /* free whatever you allocated */
return 0;
}

There are multiple problems with this code. Such significant misunderstandings tend to suggest that your resources aren't working for you. Have you considered trying other resources?
If you don't yet understand the basics of procedural programming, I recommend learning a different language first. Unfortunately I haven't come across a decent C programming book that teaches both procedural programming in general and C programming. The only books I know of seem to require that you already understand procedural programming. I'll try to help a bit with that with this post.
I can highly recommend K&R 2E, providing you've understood the procedural programming basics first; remember to do the exercises as you encounter them as they're a valuable part of the learning experience.
You seem to be quite confused about the effect upon the flow of execution caused by return, for and if. C is a procedural language, meaning it has a structure similar to a recipe in a cookbook (a procedure), for example.
Okay, that's grossly over-simplified, but if you think of steps like "preheat the oven to 180C (described in page 42)" and "caramelise the onion" as though they are separate procedures then we can establish a use for keywords like return, for and if, so bear with me.
As you turn to page 42 you might notice the procedure is simple but disjoint from the recipe, something like this:
Ensure the oven is empty, and if necessary install an oven thermometer onto the front of one of the racks.
if the oven is a gas oven:
Strike a match.
Kneel in front of the oven, and turn the knob to the appropriate temperature.
Bring the lit match into contact with the gas stream, keeping your fingers well clear of the gas stream at all times.
else turn the knob to the appropriate temperature.
Close the oven.
for the duration beginning when you close the oven and ending when the thermometer reaches the appropriate value, periodically check the thermometer
return to the previous procedure.
Here, the words if and else are clearly meant to ensure that you choose the appropriate path for your oven. The same goes for your computer. You're telling it to choose which one of the paths based on the condition in your code.
The return keyword is used to tell you that the procedure is over, and you should resume the procedure you were using before.
There is a loop embedded into the procedure, too. That could be expressed using for language, i.e. "for the duration beginning when you close the oven and ending when the thermometer reaches the appropriate value, periodically check the thermometer" would loosely translate to something like:
close(oven);
for (actual_temp = check_temp(oven); actual_temp < desired_temp; actual_temp = check_temp(oven)) {
sleep(15 minutes);
}
Recipes are easy to understand because they're written in natural language. Computer programs, however, aren't. Programming languages have idioms that aren't so commonly used in natural language, such as variable scope and memory location, so it's important to use those idioms consistently.
I recommend designing software as though it's meant to fit in with the environment. I can see you've given that some thought by forwarding argc and argv (or its translated equivalent) to each of your functions, but a deeper analysis of the environment is required.
It goes against the grain for a function to perform allocation and expect the caller to perform cleanup for that allocation. If you analyse the example set by the C standard library, the only functions that allocate memory are allocation functions, thread creation and file creation. Everything else lets the caller choose how memory is allocated. Thus, your integerizeArgs function should probably be refactored:
int *integerize_arguments(int *destination, char **source, size_t length) {
    for (size_t x = 0; x < length; x++) {
        if (sscanf(source[x], "%d", &destination[x]) != 1) {
            return NULL;
        }
    }
    return destination;
}
Did you notice how closely this resembles memcpy, for example? Using this pattern the caller can choose what kind of allocation the array should have, which is very nice for maintenance should you decide you don't need malloc.
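For instance, a caller could use a plain automatic array instead of malloc. The sketch below repeats the function so that it compiles on its own; sum_of_arguments and its limit of 16 values are assumptions of mine, chosen only to show a caller that never needs free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same function as above, repeated so this sketch is self-contained. */
int *integerize_arguments(int *destination, char **source, size_t length) {
    for (size_t x = 0; x < length; x++) {
        if (sscanf(source[x], "%d", &destination[x]) != 1) {
            return NULL;   /* this argument was not a number */
        }
    }
    return destination;
}

/* Hypothetical caller: storage is an automatic array, so there is nothing
   to free. Returns 1 and stores the sum on success, 0 otherwise. */
int sum_of_arguments(char **args, size_t count, int *sum_out) {
    int values[16];
    assert(sum_out != NULL);
    if (count > sizeof values / sizeof values[0])
        return 0;          /* too many arguments for the fixed buffer */
    if (integerize_arguments(values, args, count) == NULL)
        return 0;
    int sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += values[i];
    *sum_out = sum;
    return 1;
}
```

Switching to malloc later would only change the caller; integerize_arguments itself stays untouched.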
Also notice how I used size_t for index variables. This is important because it doesn't usually make sense to allow negative numbers for arrays.
As for your findMin function, the best advice I can give there is to think about the procedural language we discussed earlier. I don't think the instructions you're giving your computer are the same instructions you have in your head. If they are, they don't make sense and you might need to run through that procedure a few times by hand to see what's going wrong.
Unless the caller (main, in this case) needs to know the address of the minimum value, findMin doesn't need to return int *; it should return int instead. If, on the other hand, main needs to know where the minimum value is located, then your algorithm needs to be adapted because you're currently not storing that position in your logic.
As covered by MikeCAT, you shouldn't be returning to the previous procedure until you've inspected the entire array. Hence your return smallest; should be below the for loop.
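If main does need the position, one possible adaptation is to track a pointer to the smallest element instead of its value. find_min_ptr is a hypothetical name, and the sketch assumes the array has at least one element.

```c
#include <assert.h>
#include <stddef.h>

/* Returns a pointer to the first smallest element of a[0..count-1].
   count must be greater than zero. */
const int *find_min_ptr(const int *a, size_t count) {
    assert(a != NULL && count > 0);
    const int *smallest = &a[0];
    for (size_t i = 1; i < count; i++) {
        if (a[i] < *smallest)
            smallest = &a[i];
    }
    return smallest;   /* only after the whole array has been inspected */
}
```

The caller gets the value with *result and the index with result - array.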
This post is getting quite lengthy, kind of turning into a book of its own... so I'll wrap up here, and recommend in summary that you purchase a book about programming.

To know the size of an array in c

I am learning the C language. I want to know the size of an array inside a function. This function receives a pointer to the first element of the array. I don't want to pass the size as a function parameter.
My code is:
#include <stdio.h>
void ShowArray(short* a);
int main (int argc, char* argv[])
{
short vec[] = { 0, 1, 2, 3, 4 };
short* p = &vec[0];
ShowArray(p);
return 0;
}
void ShowArray(short* a)
{
short i = 0;
while( *(a + i) != NULL )
{
printf("%hd ", *(a + i) );
++i;
}
printf("\n");
}
My code doesn't show any number. How can I fix it?
Thanks.

Arrays in C are simply ways to allocate contiguous memory locations and are not "objects" as you might find in other languages. Therefore, when you allocate an array (e.g. int numbers[5];) you're specifying how much physical memory you want to reserve for your array.
However, that doesn't tell you how many valid entries you have in the (conceptual) list for which the physical array is being used at any specific point in time.
Therefore, you're required to keep the actual length of the "list" as a separate variable (e.g. size_t numbers_cnt = 0;).
I don't want to send the size value like a function parameter.
Since you don't want to do this, one alternative is to use a struct and build an array type yourself. For example:
struct int_array_t {
int *data;
size_t length;
};
This way, you could use it in a way similar to:
struct int_array_t array;
array.data = // malloc for array data here...
array.length = 0;
// ...
some_function_call(array); // send the "object", not multiple arguments
Now you don't have to write: some_other_function(data, length);, which is what you originally wanted to avoid.
To work with it, you could simply do something like this:
void display_array(struct int_array_t array)
{
size_t i;
printf("[");
for(i = 0; i < array.length; ++i)
printf("%d, ", array.data[i]);
printf("]\n");
}
I think this is a better and more reliable alternative than another suggestion of trying to fill the array with sentinel values (e.g. -1), which would be more difficult to work with in non-trivial programs (e.g. understand, maintain, debug, etc) and, AFAIK, is not considered good practice either.
For example, your current array is an array of shorts, which would mean that the proposed sentinel value of -1 can no longer be considered a valid entry within this array. You'd also need to zero out everything in the memory block, just in case some of those sentinels were already present in the allocated memory.
Lastly, as you use it, it still wouldn't tell you what the actual length of your array is. If you don't track this in a separate variable, then you'll have to calculate the length at runtime by looping over all the data in your array until you come across a sentinel value (e.g. -1), which is going to impact performance.
In other words, to find the length, you'd have to do something like:
size_t len = 0;
while (arr[len] != -1) // this is O(N)
    len++;
printf("Length is %zu\n", len);
The strlen function already suffers from this performance problem, having a time complexity of O(N), because it has to scan the entire string until it finds the terminating null character ('\0') before it can return the length.
Relying on sentinel values is also unsafe and has produced countless bugs and security vulnerabilities in C and C++ programs, to the point where even Microsoft recommends banning their use as a way to help prevent more security holes.
I think there's no need to create this kind of problem. Compare the above, with simply writing:
// this is O(1), does not rely on sentinels, and makes a program safer
printf("Length is %zu\n", array.length);
As you add/remove elements into array.data you can simply write array.length++ or array.length-- to keep track of the actual amount of valid entries. All of these are constant-time operations.
You should also keep the maximum size of the array (what you used in malloc) around so that you can make sure that array.length never goes beyond said limit. Otherwise you'd write past the end of the allocation, which is undefined behavior (often a segfault).
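Keeping the capacity next to the length could look like the following sketch. It extends the struct above with a capacity field; int_array_init and int_array_push are names I'm inventing here, and the push simply refuses to write when the array is full.

```c
#include <assert.h>
#include <stdlib.h>

struct int_array_t {
    int *data;
    size_t length;     /* number of valid entries */
    size_t capacity;   /* number of entries allocated */
};

/* Allocates room for capacity ints. Returns 1 on success, 0 on failure. */
int int_array_init(struct int_array_t *array, size_t capacity) {
    assert(array != NULL && capacity > 0);
    array->data = malloc(capacity * sizeof *array->data);
    array->length = 0;
    array->capacity = array->data != NULL ? capacity : 0;
    return array->data != NULL;
}

/* Appends value. Returns 0 and changes nothing when the array is full. */
int int_array_push(struct int_array_t *array, int value) {
    assert(array != NULL);
    if (array->length >= array->capacity)
        return 0;
    array->data[array->length++] = value;
    return 1;
}
```

A growing variant would call realloc inside int_array_push instead of returning 0, but the bounds check stays the same.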

One way, is to use a terminator that is unique from any value in the array. For example, you want to pass an array of ints. You know that you never use the value -1. So you can use that as your terminator:
#define TERM (-1)
void print(int *arr)
{
for (; *arr != TERM; ++arr)
printf("%d\n", *arr);
}
But this approach is usually not used, because the sentinel could be a valid number. So normally, you will have to pass the length.
You can't use sizeof inside of the function, because as soon as you pass the array, it decays into a pointer to the first element. Thus, sizeof arr will be the size of a pointer on your machine.
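A small sketch of that decay, with names of my own choosing: COUNT_OF only works where the operand is still a real array, while inside the function sizeof sees nothing but a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a real array; wrong if given a pointer. */
#define COUNT_OF(arr) (sizeof (arr) / sizeof (arr)[0])

/* Inside a function the parameter is just a pointer. */
size_t size_seen_by_callee(short *a) {
    assert(a != NULL);
    return sizeof a;   /* size of a pointer, not of the caller's array */
}
```

In main, COUNT_OF(vec) gives 5 for short vec[] = { 0, 1, 2, 3, 4 }, but size_seen_by_callee(vec) returns sizeof(short *), whatever that is on your platform.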

#include <stdio.h>
void ShowArray(const short* a, size_t n);
int main (int argc, char* argv[])
{
    short vec[] = { 0, 1, 2, 3, 4 };
    /* vec is still an array here, so sizeof covers the whole array */
    size_t n = sizeof(vec) / sizeof(vec[0]);
    ShowArray(vec, n);
    return 0;
}
void ShowArray(const short* a, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i)
    {
        printf("%hd ", a[i]);
    }
    printf("\n");
}
Note that computing sizeof(*a) / sizeof(short) inside ShowArray would always give 1, because a is only a pointer there. The count has to come from main, where vec is still an array.

How to declare an array with an arbitrary size

Ok, this is a C programming homework question. But I'm truly stuck.
I ask the user to input words, and then I insert the input into an array, but I can't have any control over the number of words the user types.
I guess what I'm asking is: how do you declare an array in C without declaring its length and without asking the user what the length should be?
I know this has something to do with malloc, but if you could give me some examples of how to do this, I would really appreciate it.

You can malloc a block of memory large enough to hold a certain number of array items.
Then, before you exceed that number, you can use realloc to make the memory block bigger.
Here's a bit of C code that shows this in action, reallocating an integer array whenever it's too small to hold the next integer.
#include <stdio.h>
#include <stdlib.h>
int main (void) {
int *xyzzy = NULL; // Initially NULL so first realloc is a malloc.
int currsz = 0; // Current capacity.
int i;
// Add ten integers.
for (i = 0; i < 10; i++) {
// If this one will exceed capacity.
if (i >= currsz) {
// Increase capacity by four and re-allocate.
currsz += 4;
xyzzy = realloc (xyzzy, sizeof(int) * currsz);
// Should really check for failure here.
}
// Store number.
xyzzy[i] = 100 + i;
}
// Output capacity and values.
printf ("CurrSz = %d, values =", currsz);
for (i = 0; i < 10; i++) {
printf (" %d", xyzzy[i]);
}
printf ("\n");
return 0;
}
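The failure check that the comment mentions could look something like this sketch. The helper name grow_ints is hypothetical. The key point is to keep the old pointer until realloc has succeeded, because on failure the original block is still valid and still has to be freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Grows *block to hold new_count ints. Returns 1 on success. On failure
   returns 0 and leaves *block untouched (still valid, still owned by the
   caller). */
int grow_ints(int **block, size_t new_count) {
    assert(block != NULL && new_count > 0);
    int *grown = realloc(*block, new_count * sizeof **block);
    if (grown == NULL)
        return 0;
    *block = grown;
    return 1;
}
```

In the loop above, the assignment to xyzzy would become a call such as grow_ints(&xyzzy, currsz), with the program bailing out (after free(xyzzy)) if it returns 0.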

You can realloc it every time like:
int size = 0;
char **array = NULL; // realloc(NULL, n) behaves like malloc(n)
while(/* something */)
{
    char *string = // get input
    size++;
    array = realloc(array, size * sizeof(char*)); // check for NULL in real code
    array[size - 1] = string;
}
Or in chunks if you care about speed.
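Growing in chunks might look like the sketch below, which doubles the capacity whenever the array is full so that realloc runs far less often than once per word. string_list, string_list_add and string_list_free are names I invented, and the starting capacity of 4 is arbitrary. Each string is copied, so the list owns its contents.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct string_list {
    char **items;
    size_t size;       /* strings stored */
    size_t capacity;   /* slots allocated */
};

/* Appends a copy of text. Returns 1 on success, 0 on allocation failure. */
int string_list_add(struct string_list *list, const char *text) {
    assert(list != NULL && text != NULL);
    if (list->size == list->capacity) {
        /* Full: double the capacity (or start at 4). */
        size_t new_capacity = list->capacity ? list->capacity * 2 : 4;
        char **grown = realloc(list->items, new_capacity * sizeof *grown);
        if (grown == NULL)
            return 0;
        list->items = grown;
        list->capacity = new_capacity;
    }
    size_t bytes = strlen(text) + 1;
    char *copy = malloc(bytes);
    if (copy == NULL)
        return 0;
    memcpy(copy, text, bytes);
    list->items[list->size++] = copy;
    return 1;
}

/* Frees every stored string and then the array itself. */
void string_list_free(struct string_list *list) {
    assert(list != NULL);
    for (size_t i = 0; i < list->size; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->size = 0;
    list->capacity = 0;
}
```

Start from a zeroed struct, for example struct string_list list = {NULL, 0, 0};, call string_list_add once per word, and call string_list_free when done.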

Yes, you want malloc. Checkout this tut.
http://www.cprogramming.com/tutorial/dynamic_memory_allocation.html
This site is good in general for learning.
Here is an example of using realloc, it is basically exactly what you are asking to do.
http://www.cplusplus.com/reference/clibrary/cstdlib/realloc/

0) Obviously you will need multiple buffers, so you will need a list-like structure: perhaps a record with a 100-char array and a pointer to the next record.
1) You need to capture the words char by char and store them in your buffer.
2) Once the buffer is full, you allocate another record, chain it to the previous one and keep going until you are out of memory or the input is over.
That may perform better than the realloc function. I believe realloc has to provide one contiguous block of memory, which can mean copying everything to a new location, whereas the list-like structure never moves existing data.

What's the difference between initializing a struct as pointer or not?

I have the following for my HashTable structure:
typedef char *HashKey;
typedef int HashValue;
typedef struct sHashElement {
HashKey key;
HashValue value;
} HashElement;
typedef struct sHashTable {
HashElement *items;
float loadFactor;
} HashTable;
I never really thought about it until now but I just realized there's two ways how I can use this:
Alternative 1:
void hashInitialize(HashTable *table, int tabSize) {
table->items = malloc(sizeof(HashElement) * tabSize);
if(!table->items) {
perror("malloc");
exit(1);
}
table->items[0].key = "AAA";
table->items[0].value = 45;
table->items[1].key = "BBB";
table->items[1].value = 82;
table->loadFactor = (float)2 / tabSize;
}
int main(void) {
HashTable t1;
int i;
hashInitialize(&t1, HASHSIZE);
for(i = 0; i < HASHSIZE - 1; i++) {
printf("PAIR(%d): %s, %d\n", i+1, t1.items[i].key, t1.items[i].value);
}
printf("LOAD FACTOR: %.2f\n", t1.loadFactor);
return 0;
}
Alternative 2:
void hashInitialize(HashTable **table, int tabSize) {
*table = malloc(sizeof(HashTable));
if(!*table) {
perror("malloc");
exit(1);
}
(*table)->items = malloc(sizeof(HashElement) * tabSize);
if(!(*table)->items) {
perror("malloc");
exit(1);
}
(*table)->items[0].key = "AAA";
(*table)->items[0].value = 45;
(*table)->items[1].key = "BBB";
(*table)->items[1].value = 82;
(*table)->loadFactor = (float)2 / tabSize;
}
int main(void) {
HashTable *t1 = NULL;
int i;
hashInitialize(&t1, HASHSIZE);
for(i = 0; i < HASHSIZE - 1; i++) {
printf("PAIR(%d): %s, %d\n", i+1, t1->items[i].key, t1->items[i].value);
}
printf("LOAD FACTOR: %.2f\n", t1->loadFactor);
return 0;
}
Question 1: They both seem to produce the same result. In main, both examples print the right key/value pairs. So, what exactly is the difference between them besides the syntax change (using (*table) instead of just table), the extra code to allocate memory for the HashTable structure and the declaration of the HashTable pointer?
I've been writing a few data structures lately like stacks, linked lists, binary search trees and now hash tables. And for all of them, I've always used alternative 2. But now I'm wondering whether I could have used alternative 1 and simplified the code, removing most of the * and & that are all over the place.
But I'm asking this question to understand the differences between the two methods and whether, and why, I should use one over the other.
Question 2: As you can see in the structures code, HashKey is a pointer. However, I'm not using strdup nor malloc to allocate space for that string. How and why is this working? Is this OK to do? I've always used malloc or strdup where appropriate when handling dynamic strings or I would get lots of segmentation faults. But this code is not giving me any segmentation faults and I don't understand why and if I should do it like this.

First, both solutions are perfectly valid!
Alternative 1:
Your HashTable is declared in main, which means the struct lives on the call stack. The struct is destroyed when you leave its scope. Note: in your case that can't happen before the program ends, because the declaration is in main, so the scope ends on process exit.
Alternative 2:
You've got a HashTable* (pointer) on the call stack, so you need to allocate the memory for the struct itself. To do so you use malloc.
In both cases your struct is correctly allocated. The main difference is performance. Allocating on the stack is far cheaper, but the object's lifetime is tied to the enclosing scope. If the struct has to outlive the function that creates it, you need malloc.
So sometimes you have to use malloc, but try to avoid calling it a lot if you want a high-performance application.
Is that clear enough? :)

In alternative 1, the caller would allocate table but your function would allocate the contents thereof, which is not always a good idea in terms of memory management. Alternative 2 keeps all allocations in the same place.

As answered previously, the differences between the two alternatives is memory management. In alternative 1 you expect the caller to allocate the memory for table prior to the call; whereas, in alternative 2 just a pointer declaration is required to give you a place to put the memory after you've created it.
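To make the ownership difference concrete, here is a sketch of both styles with matching cleanup functions. The function names table_init, table_release, table_create and table_destroy are hypothetical, the struct is simplified from the question, and items are zero-filled with calloc purely for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { char *key; int value; } HashElement;
typedef struct { HashElement *items; float loadFactor; } HashTable;

/* Alternative 1: the caller owns the struct; we allocate only items. */
int table_init(HashTable *table, size_t size) {
    assert(table != NULL && size > 0);
    table->items = calloc(size, sizeof *table->items);
    table->loadFactor = 0.0f;
    return table->items != NULL;
}

/* Cleanup for alternative 1: frees items, never the struct itself. */
void table_release(HashTable *table) {
    assert(table != NULL);
    free(table->items);
    table->items = NULL;
}

/* Alternative 2: we allocate the struct and its items. */
HashTable *table_create(size_t size) {
    HashTable *table = malloc(sizeof *table);
    if (table == NULL)
        return NULL;
    if (!table_init(table, size)) {
        free(table);
        return NULL;
    }
    return table;
}

/* Cleanup for alternative 2: frees items and then the struct. */
void table_destroy(HashTable *table) {
    if (table == NULL)
        return;
    table_release(table);
    free(table);
}
```

Building alternative 2 on top of alternative 1 lets callers pick either style: a local HashTable with table_init and table_release, or a heap-allocated one with table_create and table_destroy.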
To question 2, the simple answer is that you are assigning the address of a string literal (a constant) to the pointer. According to the following site, the storage for the literal is set up at compile time, not at runtime.
http://publications.gbdirect.co.uk/c_book/chapter6/initialization.html

for question 2:
(*table)->items[0].key = "AAA";
The string "AAA" is actually placed in a read-only part of memory and char *key points to it, so the contents pointed to by key must not be changed.
(*table)->items[0].key[0]='a' is undefined behavior and typically crashes.
Here you can find further discussion about it.
What is the difference between char s[] and char *s?
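A short sketch of that difference, using function names of my own: an array initialized from a literal is a private, writable copy, whereas a pointer to a literal must be treated as read-only.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Lower-cases the first character in place. Only call this on writable
   storage, never on a string literal. */
void lower_first(char *s) {
    assert(s != NULL);
    s[0] = (char)tolower((unsigned char)s[0]);
}

/* Returns 1 if editing the array copy left the literal untouched. */
int array_copy_is_independent(void) {
    char writable[] = "AAA";          /* array: its own modifiable copy */
    const char *read_only = "AAA";    /* pointer to the literal itself */
    lower_first(writable);            /* fine: modifies the copy */
    /* lower_first((char *)read_only) would be undefined behavior. */
    return strcmp(writable, "aAA") == 0 && strcmp(read_only, "AAA") == 0;
}
```

Declaring such pointers as const char * lets the compiler reject accidental writes instead of leaving them to crash at run time.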

The only difference is where the memory comes from -- local variables are typically on the stack whereas mallocs typically come from the heap.
