Undesirable Zero on the Output of Single Linked-list - c

I am trying to do a simple linked-list in order to print the numbers inserted as arguments on the call of the program. However, it prints an undesirable zero on the final of the output. I guess it is a NULL that is printed, but I don't know how to get rid of it. I am still understanding the basics of linked-lists. Thank you.
/* */
#include <stdio.h>
#include <stdlib.h>
/* */
#define NUMERO_DE_ARGUMENTOS_MINIMO 3
#define EOS '\0'
/* */
#define OK 0
#define ARGUMENTO_NULO 1
#define ARGUMENTO_VAZIO 2
#define PONTEIRO_NULO 3
#define NUMERO_DE_ARGUMENTOS_INVALIDO 101
/* */
typedef struct estruturaNumeros
{
unsigned numero;
struct estruturaNumeros *proximaEstrutura;
} tipoNumeros;
/* */
int
main(int argc, char **argv)
{
/* */
tipoNumeros *numeroInicial, *proximoNumero;
char *validacao;
unsigned indiceArgumento;
/* */
numeroInicial = (tipoNumeros *) malloc(sizeof(tipoNumeros));
/* */
if (argc < NUMERO_DE_ARGUMENTOS_MINIMO)
{
printf("\n\n\nNumero de argumentos invalido.\n\n\n\n");
exit(NUMERO_DE_ARGUMENTOS_INVALIDO); /* Programa abortado. */
} /* if */
/* */
if (!numeroInicial)
{
printf("\n\n\nPonteiro nulo.\n\n\n\n");
exit(PONTEIRO_NULO); /* Programa abortado. */
} /* if */
/* */
proximoNumero = numeroInicial;
/* */
for (indiceArgumento = 1; indiceArgumento < argc; indiceArgumento++)
{
proximoNumero->numero = strtoul(*(argv + indiceArgumento), &validacao, 10);
proximoNumero->proximaEstrutura = (tipoNumeros *) malloc(sizeof(tipoNumeros));
proximoNumero = proximoNumero->proximaEstrutura;
} /* for */
/* */
proximoNumero->proximaEstrutura = NULL;
proximoNumero = numeroInicial;
/* */
printf("\n\n\n");
/* */
while (proximoNumero != NULL)
{
printf("%u\n", proximoNumero->numero);
proximoNumero = proximoNumero->proximaEstrutura;
} /* while */
/* */
printf("\n\n\n");
return OK; /* Codigo retornado com sucesso. */
} /* main */
/* output */
UBUNTU 05 --> ./exemplo_lista_encadeada_004 1 2 3
1
2
3
0

The while test should be testing the structure pointer proximaEstrutura.
Your code uses a final (or terminal) node. It is the terminal node's proximaEstrutura member that is initialised to NULL.
while (proximoNumero->proximaEstrutura != NULL)
{
printf("%u\n", proximoNumero->numero);
proximoNumero = proximoNumero->proximaEstrutura;
}

Related

What is the safest way to hash a list of non-repeating integers without taking order into account?

I am looking for a hash function, that can hash a list of non-repeating integers while ignoring the order of them.
Example
I want the two lists
l1 = [0, 1, 3, 7]
l2 = [7, 3, 1, 0]
to have the same hash.
Background
I have an algorithm that finds a list of vertices on a graph. In an undirected graph, the algorithm will find certain lists multiple times in different orders. With my current understanding of the algorithm, it is easier to filter out the duplicates rather than re-inventing the algorithm. For performance reasons, I understand it to be easier to hash the found lists of vertices rather than comparing the whole lists.
Possible answers
Now, I see that
an XOR or a simple sum might be an answer.
Unfortunately, both offer too much potential for hash collisions, as I see it.
The not-very-efficient working method is to sort a list, and then use this sorted list to compare the new list (also sorted) against.
Other Thoughts
Given that
The lists contain only integers.
The integers will be the vertex indices, and the graph can have billions of vertices.
The integers in a list are non-repeating, and their order doesn't matter.
The lists can and will consist of between 2 and 100 (and in some cases > 1000) entries.
No need for cryptographically-secure randomness.
I have this feeling that there should be a relatively easy and straight-forward answer, and I just have not found it.
Use a combination of the product, sum and ^. All are communitive (order independent) with unsigned math.
unsigned long long product = 1;
unsigned sum = 0; // Maybe unsigned long long
unsigned x = 0;
for (i=0; i < array_element_count; i++) {
product *= l[i];
sum += l[i];
x ^= l[i];
}
unsigned long long pre_hash = product + sum + ((unsigned long long) x << 32));
unsigned hash = pre_hash % hash_table_size;
Tip: hash_table_size should be a prime to effectively use all pre_hash bits.
If array_element_count was high, I would consider p *= shift_right_until_odd(l[i]), else p will too often become 0.
If l[i] == 0 p *= l[i] deserves something different. A simple mitigation is p *= l[i] | 1, but that is something pulled out of the air.
Hashing takes time for good design and the above are candidate building blocks for OP.
Any CRC will do the job. Just XOR (I have used 64bit numbers, but 32bits crc, but it should work also with full 64 xor/crc or 32bit xor/crc) the elements together (to eliminate any order between them, as the XOR operation is conmutative, you eliminate the dependency on the order) mod 2&31, then take a CRC32 of the result (that will spread the set of values uniformly, as it warrants ---or tries to--- that a change in one bit will affect half of the bits in the result) See here for sample code and several crc tables. The repository is BSD license, so you can use it as desired.
Below is a sample implementation that generates random lists, and reorders them, comparing their hashes:
crc32ieee8023.h
#ifndef CRC32IEEE8023_H
#define CRC32IEEE8023_H
#include "crc.h"
extern CRC_STATE crc32ieee8023[];
#endif /* CRC32IEEE8023_H */
crc.h
#ifndef CRC_H
#define CRC_H
#include <stdlib.h>
#include <stdint.h>
#define CRC_TABLE_SIZE 256
#define CRC_BYTE_SIZE 8
#define CRC_BYTE_MASK 0xff
typedef uint8_t CRC_BYTE;
typedef uint64_t CRC_STATE;
CRC_STATE do_crc(
CRC_STATE state,
CRC_BYTE *buff,
size_t nbytes,
CRC_STATE *table);
#endif /* CRC_H */
test_xor_crc_hash.c
(This is the important file, where all the stuff is included.)
/* test_crc_table -- program to test a crc hash algorithm that
* checks a list of numbers and generates the same crc in a form
* that is independent on the list order presented.
* Program generates a list of random numbers (32bit) then it
* generates a random permutation of the list and a sorted list,
* calculates the hash over the three lists, and compares them.
*/
#include <errno.h>
#include <fcntl.h>
#include <getopt.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include "crc.h"
#include "crc32ieee8023.h"
#define DFLT_N 10
#define RANDOM_DEV "/dev/urandom"
int long_compare(
const void *_a,
const void *_b);
void print(
const char *name,
const uint64_t *v,
int vsz,
CRC_STATE crc,
uint64_t xor);
int
main(int argc, char **argv)
{
int opt;
int n = DFLT_N,
res;
/* process options */
while ((opt = getopt(argc, argv, "n:")) != EOF) {
switch (opt) {
case 'n': res = sscanf(optarg, "%u", &n);
if (res != 1) {
fprintf(stderr,
"%s: invalid format (-n)\n",
optarg);
}
break;
} /* switch */
} /* while */
/* initialization of random number generator */
unsigned short random_state[3];
int fd = open(RANDOM_DEV, O_RDONLY);
if (fd < 0) {
fprintf(stderr,
"open: %s: %s\n",
RANDOM_DEV, strerror(errno));
exit(EXIT_FAILURE);
}
res = read(fd, random_state, sizeof random_state);
if (res < 0) { /* error */
fprintf(stderr,
"read: %s: %s\n",
RANDOM_DEV, strerror(errno));
exit(EXIT_FAILURE);
}
if (res < sizeof random_state) {
fprintf(stderr,
"read: %s: incomplete read (%d/%zd)\n",
RANDOM_DEV, res, sizeof random_state);
exit(EXIT_FAILURE);
}
seed48(random_state);
close(fd);
/* generate a list of random numbers and make two copies */
uint64_t *original = calloc(n, sizeof *original),
*copy_sorted = calloc(n, sizeof *copy_sorted),
*random_sort = calloc(n, sizeof *random_sort);
/* make two copies */
for (int i = 0; i < n; i++) {
original[i] = copy_sorted[i]
= random_sort[i]
= (long)lrand48() | ((long)lrand48() << 32);
}
/* sort the numbers */
qsort(copy_sorted, n, sizeof *copy_sorted, long_compare);
/* and random permutation */
for (int i = 0; i < n-1; i++) {
int j = lrand48() % (n - i);
if (i != j) {
uint64_t temp = random_sort[i];
random_sort[i] = random_sort[j];
random_sort[j] = temp;
}
}
/* calculate the sorts */
uint64_t xor_original = 0, xor_sorted = 0, xor_random = 0;
for (int i = 0; i < n; i++) {
xor_original ^= original[i];
xor_sorted ^= copy_sorted[i];
xor_random ^= random_sort[i];
}
/* now, calculate the crc's (a crc64 would be better for long) */
CRC_STATE
crc_original = do_crc(0xffffffff, (unsigned char *)&xor_original,
sizeof xor_original, crc32ieee8023),
crc_sorted = do_crc(0xffffffff, (unsigned char *)&xor_sorted,
sizeof xor_sorted, crc32ieee8023),
crc_random = do_crc(0xffffffff, (unsigned char *)&xor_random,
sizeof xor_random, crc32ieee8023);
print("original", original, n, crc_original, xor_original);
print(" sorted", copy_sorted, n, crc_sorted, xor_sorted);
print(" random", random_sort, n, crc_random, xor_random);
if (crc_original != crc_sorted || crc_sorted != crc_random) {
fprintf(stderr, "crc's don't match (crc_original == 0x%08lx, "
"crc_sorted == 0x%08lx, crc_random == 0x%08lx)\n",
crc_original, crc_sorted, crc_random);
}
/* change only one bit in one element to see how it changes the hash */
int bit_to_change = lrand48() % (n * 64),
elem_to_change = bit_to_change % n;
bit_to_change %= 64;
original[elem_to_change] ^= (1UL << bit_to_change); /* change the bit */
/* we should do the calculation over all elements, but just
* changing a bit in one element will change just the same bit in the
* xor_original accumulation variable */
uint64_t xor_original_new = xor_original;
xor_original_new ^= (1UL << bit_to_change);
printf("element=%d, bit=%d\n", elem_to_change, bit_to_change);
uint64_t crc_original_new = do_crc(0xffffffff, (unsigned char *)&xor_original_new, sizeof xor_original_new, crc32ieee8023);
print(" chg1bit", original, n, crc_original_new, xor_original_new);
}
int long_compare(const void *_a, const void *_b)
{
const uint64_t *a = _a, *b = _b;
return *a == *b
? 0
: *a > *b
? +1
: -1;
}
void print(const char *name, const uint64_t *v, int vsz, CRC_STATE crc, uint64_t xor)
{
printf("%s: { ", name);
char *sep = "";
for (int i = 0; i < vsz; i++) {
printf("%s0x%016lx", sep, v[i]);
sep = ", ";
}
printf(" }\n"
" xor = 0x%016lx, crc = 0x%08lx\n",
xor, crc);
}
crc.c
#include <sys/types.h>
#include "crc.h"
/* table based CRC calculation */
CRC_STATE do_crc(
CRC_STATE state,
CRC_BYTE *buff,
size_t nbytes,
CRC_STATE *table)
{
CRC_STATE index;
while (nbytes--) {
state ^= *buff++;
index = state & CRC_BYTE_MASK;
state >>= CRC_BYTE_SIZE;
state ^= table[index];
} /* while */
return state;
} /* do_crc */
crc32ieee8023.c
#include "crc.h"
/* variables */
CRC_STATE crc32ieee8023[] = {
/* Comando usado: mkcrc -gpedb88320 */
/* Polinomio: x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1 */
/* 0 */ 0x0, 0x77073096, 0xee0e612c, 0x990951ba,
/* 4 */ 0x76dc419, 0x706af48f, 0xe963a535, 0x9e6495a3,
/* 8 */ 0xedb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988,
/* 12 */ 0x9b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91,
/* 16 */ 0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de,
/* 20 */ 0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7,
/* 24 */ 0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec,
/* 28 */ 0x14015c4f, 0x63066cd9, 0xfa0f3d63, 0x8d080df5,
/* 32 */ 0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172,
/* 36 */ 0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b,
/* 40 */ 0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940,
/* 44 */ 0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59,
/* 48 */ 0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116,
/* 52 */ 0x21b4f4b5, 0x56b3c423, 0xcfba9599, 0xb8bda50f,
/* 56 */ 0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924,
/* 60 */ 0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d,
/* 64 */ 0x76dc4190, 0x1db7106, 0x98d220bc, 0xefd5102a,
/* 68 */ 0x71b18589, 0x6b6b51f, 0x9fbfe4a5, 0xe8b8d433,
/* 72 */ 0x7807c9a2, 0xf00f934, 0x9609a88e, 0xe10e9818,
/* 76 */ 0x7f6a0dbb, 0x86d3d2d, 0x91646c97, 0xe6635c01,
/* 80 */ 0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e,
/* 84 */ 0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457,
/* 88 */ 0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c,
/* 92 */ 0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65,
/* 96 */ 0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2,
/* 100 */ 0x4adfa541, 0x3dd895d7, 0xa4d1c46d, 0xd3d6f4fb,
/* 104 */ 0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0,
/* 108 */ 0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9,
/* 112 */ 0x5005713c, 0x270241aa, 0xbe0b1010, 0xc90c2086,
/* 116 */ 0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f,
/* 120 */ 0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4,
/* 124 */ 0x59b33d17, 0x2eb40d81, 0xb7bd5c3b, 0xc0ba6cad,
/* 128 */ 0xedb88320, 0x9abfb3b6, 0x3b6e20c, 0x74b1d29a,
/* 132 */ 0xead54739, 0x9dd277af, 0x4db2615, 0x73dc1683,
/* 136 */ 0xe3630b12, 0x94643b84, 0xd6d6a3e, 0x7a6a5aa8,
/* 140 */ 0xe40ecf0b, 0x9309ff9d, 0xa00ae27, 0x7d079eb1,
/* 144 */ 0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe,
/* 148 */ 0xf762575d, 0x806567cb, 0x196c3671, 0x6e6b06e7,
/* 152 */ 0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc,
/* 156 */ 0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5,
/* 160 */ 0xd6d6a3e8, 0xa1d1937e, 0x38d8c2c4, 0x4fdff252,
/* 164 */ 0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b,
/* 168 */ 0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60,
/* 172 */ 0xdf60efc3, 0xa867df55, 0x316e8eef, 0x4669be79,
/* 176 */ 0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236,
/* 180 */ 0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f,
/* 184 */ 0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04,
/* 188 */ 0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d,
/* 192 */ 0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x26d930a,
/* 196 */ 0x9c0906a9, 0xeb0e363f, 0x72076785, 0x5005713,
/* 200 */ 0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0xcb61b38,
/* 204 */ 0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0xbdbdf21,
/* 208 */ 0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e,
/* 212 */ 0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777,
/* 216 */ 0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c,
/* 220 */ 0x8f659eff, 0xf862ae69, 0x616bffd3, 0x166ccf45,
/* 224 */ 0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2,
/* 228 */ 0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db,
/* 232 */ 0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0,
/* 236 */ 0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9,
/* 240 */ 0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6,
/* 244 */ 0xbad03605, 0xcdd70693, 0x54de5729, 0x23d967bf,
/* 248 */ 0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94,
/* 252 */ 0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d,
}; /* crc32ieee8023 */
Makefile
targets = test_xch
toclean = $(targets)
test_xch_deps =
test_xch_objs = crc32ieee8023.o crc.o test_xor_crc_hash.o
test_xch_libs =
test_xch_ldfl =
toclean += $(test_xch_objs)
all: $(targets)
clean:
$(RM) $(toclean)
test_xch: $(test_xch_deps) $(test_xch_objs)
$(CC) $(LDFLAGS) $($#_ldfl) -o $# $($#_objs) $($#_libs) $(LIBS)
To make the program, just run:
$ make
and to run it, you can use option -n that allows you to specify the number of random elements to generate.
I think you will have to invent one to avoid the slow sorting option. In addition to XOR and arithmetic addition, there are bit rotations, and bit masks you could use. If you need high collision resistance, you could just combine more than one of the hash functions. e.g. Assuming the d_i and arithmetic are modular like with uint32_t for example,
H_1 = sum_{i = 1 to n} d_i
H_2 = xor_{i = 1 to n} d_i
H_3 = xor_{i = 1 to n} (rotl(d_i, d_i & 0x1f) + c)
Then take H1H2H3 as a 12 byte hash.

How do I return a pointer to location in a buffer? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 5 years ago.
Improve this question
I'm currently working on a project in C which I'm required to create a buffer program. For the most part my program is finished but I'm just missing one required function: a location function to return the distance between the head (beginning of array) to the desired location designated by short loc_offset.
Given what I already have, how do I return a pointer to an index or location from my short?
Here is my given and created code:
Main (plat_bt.c):
/* File name: platy_bt.c
* Purpose:This is the main program for Assignment #1, CST8152
* Version: 1.18.1
* Author: Svillen Ranev
* Date: 16 January 2018
*/
/* The #define _CRT_SECURE_NO_WARNINGS should be used in MS Visual Studio projects
* to suppress the warnings about using "unsafe" functions like fopen()
* and standard sting library functions defined in string.h.
* The define directive does not have any effect on other compiler projects (gcc, Borland).
*/
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <limits.h>
#include "buffer.h"
/* constant definitions */
#define INIT_CAPACITY 200 /* initial buffer capacity */
#define INC_FACTOR 15 /* increment factor */
/*check for ANSI C compliancy */
#define ANSI_C 0
#if defined(__STDC__)
#undef ANSI_C
#define ANSI_C 1
#endif
/* Declaration of an error printing function with
* variable number of arguments
*/
void err_printf(char *fmt, ...);
/* Declaration of a buffer contents display function */
void display (Buffer *ptr_Buffer);
long get_filesize(char *fname);
int main(int argc, char **argv){
pBuffer ptr_Buffer; /* pointer to Buffer structure */
FILE *fi; /* input file handle */
int loadsize = 0; /*the size of the file loaded in the buffer */
int ansi_c = !ANSI_C; /* ANSI C compliancy flag */
/* Check if the compiler option is set to compile ANSI C */
/* __DATE__, __TIME__, __LINE__, __FILE__, __STDC__ are predefined preprocessor macros*/
if(ansi_c){
err_printf("Date: %s Time: %s",__DATE__, __TIME__);
err_printf("ERROR: Compiler is not ANSI C compliant!\n");
exit(1);
}
/* missing file name or/and mode parameter */
if (argc <= 2){
err_printf("\nDate: %s Time: %s",__DATE__, __TIME__);
err_printf("\nRuntime error at line %d in file %s\n", __LINE__, __FILE__);
err_printf("%s\b\b\b\b%s%s",argv[0],": ","Missing parameters.");
err_printf("Usage: platybt source_file_name mode");
exit(1);
}
/* create a source code input buffer */
switch(*argv[2]){
case 'f': case 'a': case 'm': break;
default:
err_printf("%s%s%s",argv[0],": ","Wrong mode parameter.");
exit(1);
}
/*create the input buffer */
ptr_Buffer = b_allocate(INIT_CAPACITY,INC_FACTOR,*argv[2]);
if (ptr_Buffer == NULL){
err_printf("%s%s%s",argv[0],": ","Cannot allocate buffer.");
exit(1);
}
/* open the source file */
if ((fi = fopen(argv[1],"r")) == NULL){
err_printf("%s%s%s%s",argv[0],": ", "Cannot open file: ",argv[1]);
exit (1);
}
/* load a source file into the input buffer */
printf("Reading file %s ....Please wait\n",argv[1]);
loadsize = b_load (fi,ptr_Buffer);
if(loadsize == RT_FAIL1)
err_printf("%s%s%s",argv[0],": ","Error in loading buffer.");
/* close the source file */
fclose(fi);
/*find the size of the file */
if (loadsize == LOAD_FAIL){
printf("The input file %s %s\n", argv[1],"is not completely loaded.");
printf("Input file size: %ld\n", get_filesize(argv[1]));
}
/* display the contents of the input buffer */
display(ptr_Buffer);
/* compact the buffer
* add end-of-file character (EOF) to the buffer
* display again
*/
if(!b_compact(ptr_Buffer,EOF)){
err_printf("%s%s%s",argv[0],": ","Error in compacting buffer.");
}
display(ptr_Buffer);
/* free the dynamic memory used by the buffer */
b_free(ptr_Buffer);
/* make the buffer invalid
It is not necessary here because the function terminates anyway,
but will prevent run-time errors and crashes in future expansions
*/
ptr_Buffer = NULL;
/*return success */
return (0);
}
/* error printing function with variable number of arguments*/
void err_printf( char *fmt, ... ){
/*Initialize variable list */
va_list ap;
va_start(ap, fmt);
(void)vfprintf(stderr, fmt, ap);
va_end(ap);
/* Move to new line */
if( strchr(fmt,'\n') == NULL )
fprintf(stderr,"\n");
}
void display (Buffer *ptr_Buffer){
printf("\nPrinting buffer parameters:\n\n");
printf("The capacity of the buffer is: %d\n",b_capacity(ptr_Buffer));
printf("The current size of the buffer is: %d\n",b_limit(ptr_Buffer));
printf("The operational mode of the buffer is: %d\n",b_mode(ptr_Buffer));
printf("The increment factor of the buffer is: %lu\n",b_incfactor(ptr_Buffer));
printf("The current mark of the buffer is: %d\n", b_mark(ptr_Buffer, b_limit(ptr_Buffer)));
/*printf("The reallocation flag is: %d\n",b_rflag(ptr_Buffer));*/
printf("\nPrinting buffer contents:\n\n");
b_rewind(ptr_Buffer);
b_print(ptr_Buffer);
}
long get_filesize(char *fname){
FILE *input;
long flength;
input = fopen(fname, "r");
if(input == NULL){
err_printf("%s%s","Cannot open file: ",fname);
return 0;
}
fseek(input, 0L, SEEK_END);
flength = ftell(input);
fclose(input);
return flength;
}
That main file has no errors or warnings, it was given to me to be used as my main.
Here is my edited source file (buffer.c):
/* File name: buffer.c
* Purpose: This is the main program for Assignment #01, CST8152
* Version: 1.0.3
* Author: Jack Loveday
* Date: 7 Febuary, 2018
*/
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <limits.h>
#include "buffer.h"
/* b_allocate: Creates new buffer on heap
Tries to allocate memory for:
- one Buffer (calloc)
- one dynamic character buffer (char array) (malloc)
Sets buffer operational mode and inc factor
Copies given init_capacity into Buffer capacity val */
Buffer * b_allocate(short init_capacity, char inc_factor, char o_mode) {
Buffer *pBuffer;
if (init_capacity < 0 || (unsigned char)inc_factor < 0)
return NULL;
pBuffer = (Buffer *)calloc(1, sizeof(Buffer));
if (!pBuffer) {
free(pBuffer);
return NULL;
}
pBuffer->cb_head = (char *)malloc(init_capacity);
if (o_mode == 'f' || (unsigned char)inc_factor == 0) {
pBuffer->mode = 0;
pBuffer->inc_factor = 0;
}
else if (o_mode == 'a') {
pBuffer->mode = ADD_MODE;
pBuffer->inc_factor = (unsigned char)inc_factor;
}
else if (o_mode == 'm' && (unsigned char)inc_factor <= 100) {
pBuffer->mode = MULTI_MODE;
pBuffer->inc_factor = (unsigned char)inc_factor;
}
else {
free(pBuffer->cb_head);
free(pBuffer);
return NULL;
}
pBuffer->capacity = init_capacity;
return pBuffer;
}
/* b_addc: resets r_flag to 0 & tries to add symbol to char array pointed by pBD */
pBuffer b_addc(pBuffer const pBD, char symbol) {
char *tBuffer;
short aSpace = 0, newInc = 0, newCapacity = 0;
if (!pBD)
return NULL;
pBD->r_flag = 0;
if ((short)(pBD->addc_offset * sizeof(char)) < pBD->capacity) {
pBD->cb_head[pBD->addc_offset] = symbol;
pBD->addc_offset++;
}
else if ((short)(pBD->addc_offset * sizeof(char)) == pBD->capacity) {
if (pBD->mode == 0)
return NULL;
else if (pBD->mode == 1) {
newCapacity = pBD->capacity + (unsigned char)pBD->inc_factor * sizeof(char);
if (newCapacity <= 0)
return NULL;
pBD->capacity = newCapacity;
}
else if (pBD->mode == -1) {
if (pBD->capacity == SHRT_MAX)
return NULL;
else {
aSpace = SHRT_MAX - pBD->capacity;
newInc = (short)((aSpace * pBD->inc_factor) / 100);
if (!newInc)
newInc = NEW_ONE;
newCapacity = (short)((pBD->capacity + newInc));
if (newCapacity <= 0)
newCapacity = SHRT_MAX;
}
}
tBuffer = (char *)realloc(pBD->cb_head, newCapacity);
if (tBuffer != pBD->cb_head)
pBD->r_flag = SET_R_FLAG;
pBD->cb_head = tBuffer;
pBD->cb_head[pBD->addc_offset] = symbol;
pBD->addc_offset++;
pBD->capacity = newCapacity;
}
return pBD;
}
/* b_clear: retains memory space currently allocated to buffer, but
re-initializes all appropriate members of given Buffer */
int b_clear(Buffer * const pBD){
if (!pBD)
return RT_FAIL1;
pBD->capacity = 0;
pBD->addc_offset = 0;
pBD->getc_offset = 0;
pBD->eob = 0;
return 1;
}
/* b_free: de-allocates memory for char array & Buffer struct */
void b_free(Buffer * const pBD) {
if (pBD) {
free(pBD->cb_head);
free(pBD);
}
}
/* b_isfull: returns 1 if buffer is full, otherwise returns 0, if rt error return -1 */
int b_isfull(Buffer * const pBD) {
if (!pBD)
return RT_FAIL1;
if ((short)(pBD->addc_offset * sizeof(char)) == pBD->capacity)
return 1;
else
return 0;
}
/* b_limit: returns current limit of buffer, return -1 on error */
short b_limit(Buffer * const pBD) {
if (!pBD)
return RT_FAIL1;
return pBD->addc_offset;
}
/* b_capacity: returns current capacity of buffer, return -1 on error */
short b_capacity(Buffer * const pBD) {
if (!pBD)
return RT_FAIL1;
return pBD->capacity;
}
/* b_mark: sets markc_offset to mark, must be between 0 and addc_offset, return -1 on error */
short b_mark(pBuffer const pBD, short mark) {
if (!pBD || mark < 0 || mark > pBD->addc_offset)
return RT_FAIL1;
pBD->markc_offset = mark;
return pBD->markc_offset;
}
/* b_mode: returns value of mode, or -1 on error */
int b_mode(Buffer * const pBD) {
if (!pBD)
return RT_FAIL1;
return pBD->mode;
}
/* b_infactor: returns non-negative value of inc_factor, return 0x100 (256) */
size_t b_incfactor(Buffer * const pBD) {
if (!pBD)
return NEW_FULL;
return (unsigned char)pBD->inc_factor;
}
/* b_load: returns value of mode */
int b_load(FILE * const fi, Buffer * const pBD) {
char c;
int cNum = 0;
if (!fi || !pBD)
return RT_FAIL1;
for (;;) {
c = (char)fgetc(fi);
if (feof(fi))
break;
if (!b_addc(pBD, c))
return LOAD_FAIL;
++cNum;
}
fclose(fi);
return cNum;
}
/* b_isempty: if addc_offset is 0 return 1, otherwise return 0, return -1 on error */
int b_isempty(Buffer * const pBD) {
if (!pBD)
return RT_FAIL1;
if (!pBD->addc_offset)
return 1;
else
return 0;
}
/* b_eob: returns eob (end of buffer), returns -1 on error */
int b_eob(Buffer * const pBD) {
if (!pBD)
return RT_FAIL1;
return pBD->eob;
}
/* b_getc: checks logic, if not valid returns -2, if getc_offset and addc_offset
are equal, set eob to 1 and returns -1, otherwise set eob to 0 */
char b_getc(Buffer * const pBD) {
char returner;
if (!pBD || !pBD->cb_head)
return RT_FAIL2;
if (pBD->getc_offset == pBD->addc_offset) {
pBD->eob = 1;
return RT_FAIL1;
}
else {
pBD->eob = 0;
}
returner = *(pBD->cb_head + pBD->getc_offset);
pBD->getc_offset++;
return returner;
}
/* b_print: used for diagnostic purposes only, returns -1 on error */
int b_print(Buffer * const pBD) {
int numOfChars = 0;
char c;
if (!pBD->addc_offset) {
printf("The Buffer is empty.\n");
return numOfChars;
}
pBD->getc_offset = 0;
while (1) {
c = b_getc(pBD);
if (b_eob(pBD))
break;
printf("%c", c);
++numOfChars;
}
numOfChars = pBD->getc_offset;
pBD->getc_offset = 0;
printf("\n");
return numOfChars;
}
/* b_compact: shrinks buffer to new capacity, before returning to a pointer add symbol
to the end of buffer, must set r_flag appropriatley */
Buffer * b_compact(Buffer * const pBD, char symbol) {
char *tempBuffer;
short tempCap;
if (!pBD)
return NULL;
tempCap = (pBD->addc_offset + 1) * sizeof(char);
if (tempCap == SHRT_MAX)
return pBD;
tempBuffer = (char *)realloc(pBD->cb_head, (unsigned short)tempCap);
if (tempCap == 0)
return NULL;
if (tempBuffer != pBD->cb_head)
pBD->r_flag = SET_R_FLAG;
pBD->cb_head = tempBuffer;
pBD->capacity = tempCap;
return pBD;
}
/* b_rflag: returns r_flag, returns -1 on error */
char b_rflag(Buffer * const pBD) {
if (!pBD)
return RT_FAIL1;
return pBD->r_flag;
}
/* b_retract: decrements getc_offset by 1, returns -1 on error, otherwise returns getc_offset */
short b_retract(Buffer * const pBD) {
if (!pBD)
return RT_FAIL1;
pBD->getc_offset--;
return pBD->getc_offset;
}
/* b_reset: sets getc_offset to current markc_offset, returns -1 on error and getc_offset otherwise */
short b_reset(Buffer * const pBD) {
if (!pBD)
return RT_FAIL1;
pBD->getc_offset = pBD->markc_offset;
return pBD->getc_offset;
}
/* b_getcoffset: returns getc_offset, or returns -1 on error */
short b_getcoffset(Buffer * const pBD) {
if (!pBD)
return RT_FAIL1;
return pBD->getc_offset;
}
/* b_rewind: set getc_offset and markc_offset to 0, return -1 on error and 0 otherwise */
int b_rewind(Buffer * const pBD) {
if (!pBD)
return RT_FAIL1;
pBD->addc_offset = 0;
pBD->markc_offset = 0;
return 0;
}
/* b_location: returns a pointer to location indicated by loc_offset, returns null on error */
char * b_location(Buffer * const pBD, short loc_offset) {
if (!pBD)
return NULL;
return *loc_offset;
}
And finally my buffer.h:
/*File Name: buffer.h
* Version: 1.18.1
* Author: S^R
* Date: 16 January 2018
* Preprocessor directives, type declarations and prototypes necessary for buffer implementation
* as required for CST8152-Assignment #1.
* The file is not completed.
* You must add your function declarations (prototypes).
* You must also add your constant definitions and macros,if any.
*/
#ifndef BUFFER_H_
#define BUFFER_H_
/*#pragma warning(1:4001) *//*to enforce C89 type comments - to make //comments an warning */
/* standard header files */
#include <stdio.h> /* standard input/output */
#include <malloc.h> /* for dynamic memory allocation*/
#include <limits.h> /* implementation-defined data type ranges and limits */
/* constant definitions */
/* You may add your own constant definitions here */
#define RT_FAIL1 -1 /* fail return value */
#define RT_FAIL2 -2 /* fail return value */
#define LOAD_FAIL -2 /* load fail error */
#define SET_R_FLAG 1 /* realloc flag set value */
#define ADD_MODE 1; /* named constant for additive mode */
#define MULTI_MODE -1; /* named constant for multiplicative mode */
#define NEW_ONE 1; /* generic named constant value of 1 */
#define NEW_FULL 256; /* generic named constant value of 256 */
/* user data type declarations */
typedef struct BufferDescriptor {
char *cb_head; /* pointer to the beginning of character array (character buffer) */
short capacity; /* current dynamic memory size (in bytes) allocated to character buffer */
short addc_offset; /* the offset (in chars) to the add-character location */
short getc_offset; /* the offset (in chars) to the get-character location */
short markc_offset; /* the offset (in chars) to the mark location */
char inc_factor; /* character array increment factor */
char r_flag; /* reallocation flag */
char mode; /* operational mode indicator*/
int eob; /* end-of-buffer flag */
} Buffer, *pBuffer;
/* function declarations */
/*
Place your function declarations here.
Do not include the function header comments here.
Place them in the buffer.c file
*/
Buffer * b_allocate(short init_capacity, char inc_factor, char o_mode);
pBuffer b_addc(pBuffer const pBD, char symbol);
int b_clear(Buffer * const pBD);
void b_free(Buffer * const pBD);
int b_isfull(Buffer * const pBD);
short b_limit(Buffer * const pBD);
short b_capacity(Buffer * const pBD);
short b_mark(pBuffer const pBD, short mark);
int b_mode(Buffer * const pBD);
size_t b_incfactor(Buffer * const pBD);
int b_load(FILE * const fi, Buffer * const pBD);
int b_isempty(Buffer * const pBD);
int b_eob(Buffer * const pBD);
char b_getc(Buffer * const pBD);
int b_print(Buffer * const pBD);
Buffer * b_compact(Buffer * const pBD, char symbol);
char b_rflag(Buffer * const pBD);
short b_retract(Buffer * const pBD);
short b_reset(Buffer * const pBD);
short b_getcoffset(Buffer * const pBD);
int b_rewind(Buffer * const pBD);
char * b_location(Buffer * const pBD, short loc_offset);
#endif
So again, I need to find the location of loc_offset in my b_location function and return it as a pointer. Thanks in advance!
char * b_location(Buffer * const pBD, short loc_offset) {
if (!pBD)
return NULL;
/* Make sure loc_offset doesn't go beyond the length of the array */
if( ! in_bounds )
return NULL;
return &(pBD->cb_head[loc_offset]);
}

"stray '#' in program assert" error when compiling

I keep getting the above error message for blocks of code that use the assert function and I cannot see why this is. I have search the internet for potential answers to this error .. with no success. So any/all assistance would be very much appreciated.
Below is the aforementioned block of code that throws up this error:
#include <stdio.h>
#include <assert.h>
typedef struct WORD
{
char *Word;
size_t Count;
struct WORD *Left;
struct WORD *Right;
} WORD;
#define SUCCESS 0
#define NO_MEMORY_FOR_WORDNODE 3
#define NO_MEMORY_FOR_WORD 4
int AddToTree(WORD **DestTree, size_t *Treecount, char *Word)
{
int Status = SUCCESS;
int CompResult = 0;
/* safety check */
assert(NULL != DestTree);
assert(NULL != Treecount);
assert(NULL != Word);
/* DestTree is NULL or it isn't */
if(NULL == *DestTree){ /* this is the place to add it then */
*DestTree = malloc(sizeof **DestTree);
if(NULL == *DestTree){
/* out of memory */
Status = NO_MEMORY_FOR_WORDNODE;
}
else {
(*DestTree)->Left = NULL;
(*DestTree)->Right = NULL;
(*DestTree)->Count = 1;
(*DestTree)->Word = dupstr(Word);
if(NULL == (*DestTree)->Word){
/* out of memory in the middle */
Status = NO_MEMORY_FOR_WORD;
free(*DestTree);
*DestTree = NULL;
}
else {
/* everything worked, add one to the tree nodes count */
++*Treecount;
}
}
}
else { /* we need to make a decision */
CompResult = strcmp(Word, (*DestTree)->Word);
if(0 < CompResult){
Status = AddToTree(&(*DestTree)->Left, Treecount, Word);
}
else if(0 > CompResult){
Status = AddToTree(&(*DestTree)->Left, Treecount, Word);
}
else {
/* add one to the count - this is the same node */
++(*DestTree)->Count;
}
} /* end of else we need to make a decision */
return Status;
}
Please note that each one of the assert lines throws up the same error.
I took #alk's advice to check the preprocessor outputs and this is what I get:
# 2 "ex6-4.c" 2
# 1 "c:\\mingw\\include\\assert.h" 1 3
# 38 "c:\\mingw\\include\\assert.h" 3
void __attribute__((__cdecl__)) __attribute__ ((__nothrow__)) _assert (const ch
ar*, const char*, int) __attribute__ ((__noreturn__));
# 16 "ex6-4.c" 2`
I have to admit that this makes as much sense to me as the error in question.

Duplicated declarations. error: conflicting types for ‘functionname’

I'm trying to include these files to my main C code:
who.c:
/* who3.c - who with buffered reads
* - surpresses empty records
* - formats time nicely
* - buffers input (using utmplib)
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <utmp.h>
#include <fcntl.h>
#include <time.h>
#include "utmplib.h"
#define SHOWHOST
void show_info(struct utmp *);
void showtime(time_t);
int Who()
{
struct utmp *utbufp, /* holds pointer to next rec */
*utmp_next(); /* returns pointer to next */
if ( utmp_open( UTMP_FILE ) == -1 ){
perror(UTMP_FILE);
exit(1);
}
while ( ( utbufp = utmp_next() ) != ((struct utmp *) NULL) )
show_info( utbufp );
utmp_close( );
return 0;
}
/*
* show info()
* displays the contents of the utmp struct
* in human readable form
* * displays nothing if record has no user name
*/
void show_info( struct utmp *utbufp )
{
printf("%-8.8s", utbufp->ut_name); /* the logname */
printf(" "); /* a space */
printf("%-8.8s", utbufp->ut_line); /* the tty */
printf(" "); /* a space */
showtime( utbufp->ut_time ); /* display time */
#ifdef SHOWHOST
if ( utbufp->ut_host[0] != '\0' )
printf(" (%s)", utbufp->ut_host); /* the host */
#endif
printf("\n"); /* newline */
}
void showtime( time_t timeval )
/*
* displays time in a format fit for human consumption
* uses ctime to build a string then picks parts out of it
* Note: %12.12s prints a string 12 chars wide and LIMITS
* it to 12chars.
*/
{
char *ctime(); /* convert long to ascii */
char *cp; /* to hold address of time */
cp = ctime( &timeval ); /* convert time to string */
/* string looks like */
/* Mon Feb 4 00:46:40 EST 1991 */
/* 0123456789012345. */
printf("%12.12s", cp+4 ); /* pick 12 chars from pos 4 */
}
who.h:
#ifndef WHO_H
#define WHO_H
/* This file was automatically generated. Do not edit! */
int Who();
void showtime(time_t);
void showtime(time_t timeval);
void show_info(struct utmp *);
void show_info(struct utmp *utbufp);
#endif
utmplib.c:
/* utmplib.c - functions to buffer reads from utmp file
*
* functions are
* utmp_open( filename ) - open file
* returns -1 on error
* utmp_next( ) - return pointer to next struct
* returns NULL on eof
* utmp_close() - close file
*
* reads NRECS per read and then doles them out from the buffer
*/
#include <stdio.h>
#include <fcntl.h>
#include <sys/types.h>
#include <utmp.h>
#define NRECS 1
#define NULLUT ((struct utmp *)NULL)
#define UTSIZE (sizeof(struct utmp))
static char utmpbuf[NRECS * UTSIZE]; /* storage */
static int num_recs; /* num stored */
static int cur_rec; /* next to go */
static int fd_utmp = -1; /* read from */
utmp_open( char *filename )
{
fd_utmp = open( filename, O_RDONLY ); /* open it */
cur_rec = num_recs = 0; /* no recs yet */
return fd_utmp; /* report */
}
struct utmp *utmp_next()
{
struct utmp *recp;
//struct utmp *nextRecp;
int match = 0;
if(recp->ut_type == USER_PROCESS)
{
recp = ( struct utmp *) &utmpbuf[cur_rec * UTSIZE];
}
while(recp->ut_type!= USER_PROCESS)
{
if ( fd_utmp == -1 ) /* error ? */
return NULLUT;
if ( cur_rec==num_recs && utmp_reload()==0 ) /* any more ? */
return NULLUT;
/* get address of next record */
recp = ( struct utmp *) &utmpbuf[cur_rec * UTSIZE];
cur_rec++;
}
return recp;
}
int utmp_reload()
/*
* read next bunch of records into buffer
*/
{
int amt_read;
/* read them in */
amt_read = read( fd_utmp , utmpbuf, NRECS * UTSIZE );
/* how many did we get? */
num_recs = amt_read/UTSIZE;
/* reset pointer */
cur_rec = 0;
return num_recs;
}
utmp_close()
{
if ( fd_utmp != -1 ) /* don't close if not */
close( fd_utmp ); /* open */
}
utmplib.h:
#ifndef UTMPLIB_H
#define UTMPLIB_H
/* This file was automatically generated. Do not edit! */
utmp_close();
int utmp_reload();
struct utmp *utmp_next();
utmp_open(char *filename);
#endif
But I can not compile these files, I'm getting this error:
who.h:9:6: error: conflicting types for ‘show_info’
void show_info(struct utmp *utbufp);
I didn't write these two file. First of all why there are duplicated declarations? And how can I fix it?
I generated header files using makeheaders.
Actually these files are working example for who command in Linux or basic version of who.c in GNU core utils. Taken from here
who.c compiling without any error and working seamlessly(if you change Who function name with main):
cc who.c utmplib.c -o who
who
You can override flags when using automake, e.g., 'make CFLAGS=-O0 file.o' to disable optimization for file.c

searching an lzw encoded file

I am looking to alter the LZW compressor to enable it to search for a word in an LZW encoded file and finds the number of matches for that search term.
For example if my file is used as
Prompt:>lzw "searchterm" encoded_file.lzw
32
Any suggestions on how to achive this?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BITS 12 /* Setting the number of bits to 12, 13*/
#define HASHING_SHIFT (BITS-8) /* or 14 affects several constants. */
#define MAX_VALUE (1 << BITS) - 1 /* Note that MS-DOS machines need to */
#define MAX_CODE MAX_VALUE - 1 /* compile their code in large model if*/
/* 14 bits are selected. */
#if BITS == 16
#define TABLE_SIZE 99991
#endif
#if BITS == 14
#define TABLE_SIZE 18041 /* The string table size needs to be a */
#endif /* prime number that is somewhat larger*/
#if BITS == 13 /* than 2**BITS. */
#define TABLE_SIZE 9029
#endif
#if BITS <= 12
#define TABLE_SIZE 5021
#endif
void *malloc();
int *code_value; /* This is the code value array */
unsigned int *prefix_code; /* This array holds the prefix codes */
unsigned char *append_character; /* This array holds the appended chars */
unsigned char decode_stack[4000]; /* This array holds the decoded string */
/*
* Forward declarations
*/
void compress(FILE *input,FILE *output);
void expand(FILE *input,FILE *output);
int find_match(int hash_prefix,unsigned int hash_character);
void output_code(FILE *output,unsigned int code);
unsigned int input_code(FILE *input);
unsigned char *decode_string(unsigned char *buffer,unsigned int code);
/********************************************************************
**
** This program gets a file name from the command line. It compresses the
** file, placing its output in a file named test.lzw. It then expands
** test.lzw into test.out. Test.out should then be an exact duplicate of
** the input file.
**
*************************************************************************/
main(int argc, char *argv[])
{
FILE *input_file;
FILE *output_file;
FILE *lzw_file;
char input_file_name[81];
char command;
command=(argv==3);
/*
** The three buffers are needed for the compression phase.
*/
code_value=(int*)malloc(TABLE_SIZE*sizeof(int));
prefix_code=(unsigned int *)malloc(TABLE_SIZE*sizeof(unsigned int));
append_character=(unsigned char *)malloc(TABLE_SIZE*sizeof(unsigned char));
if (code_value==NULL || prefix_code==NULL || append_character==NULL)
{
printf("Fatal error allocating table space!\n");
exit(-1);
}
/*
** Get the file name, open it up, and open up the lzw output file.
*/
if (argc>1)
strcpy(input_file_name,argv[1]);
else
{
printf("Input file name? ");
scanf("%s",input_file_name);
}
input_file=fopen(input_file_name,"rb");
lzw_file=fopen("test.lzw","wb");
if (input_file==NULL || lzw_file==NULL)
{
printf("Fatal error opening files.\n");
exit(-1);
};
/*
** Compress the file.
*/
if(command=='r')
{
compress(input_file,lzw_file);
}
fclose(input_file);
fclose(lzw_file);
free(c-ode_value);
/*
** Now open the files for the expansion.
*/
lzw_file=fopen("test.lzw","rb");
output_file=fopen("test.out","wb");
if (lzw_file==NULL || output_file==NULL)
{
printf("Fatal error opening files.\n");
exit(-2);
};
/*
** Expand the file.
*/
expand(lzw_file,output_file);
fclose(lzw_file);
fclose(output_file);
free(prefix_code);
free(append_character);
}
/*
** This is the compression routine. The code should be a fairly close
** match to the algorithm accompanying the article.
**
*/
void compress(FILE *input,FILE *output)
{
unsigned int next_code;
unsigned int character;
unsigned int string_code;
unsigned int index;
int i;
next_code=256; /* Next code is the next available string code*/
for (i=0;i<TABLE_SIZE;i++) /* Clear out the string table before starting */
code_value[i]=-1;
i=0;
printf("Compressing...\n");
string_code=getc(input); /* Get the first code */
/*
** This is the main loop where it all happens. This loop runs util all of
** the input has been exhausted. Note that it stops adding codes to the
** table after all of the possible codes have been defined.
*/
while ((character=getc(input)) != (unsigned)EOF)
{
if (++i==1000) /* Print a * every 1000 */
{ /* input characters. This */
i=0; /* is just a pacifier. */
printf("*");
}
index=find_match(string_code,character);/* See if the string is in */
if (code_value[index] != -1) /* the table. If it is, */
string_code=code_value[index]; /* get the code value. If */
else /* the string is not in the*/
{ /* table, try to add it. */
if (next_code <= MAX_CODE)
{
code_value[index]=next_code++;
prefix_code[index]=string_code;
append_character[index]=character;
}
output_code(output,string_code); /* When a string is found */
string_code=character; /* that is not in the table*/
} /* I output the last string*/
} /* after adding the new one*/
/*
** End of the main loop.
*/
output_code(output,string_code); /* Output the last code */
output_code(output,MAX_VALUE); /* Output the end of buffer code */
output_code(output,0); /* This code flushes the output buffer*/
printf("\n");
}
/*
** This is the hashing routine. It tries to find a match for the prefix+char
** string in the string table. If it finds it, the index is returned. If
** the string is not found, the first available index in the string table is
** returned instead.
*/
int find_match(int hash_prefix,unsigned int hash_character)
{
int index;
int offset;
index = (hash_character << HASHING_SHIFT) ^ hash_prefix;
if (index == 0)
offset = 1;
else
offset = TABLE_SIZE - index;
while (1)
{
if (code_value[index] == -1)
return(index);
if (prefix_code[index] == hash_prefix &&
append_character[index] == hash_character)
return(index);
index -= offset;
if (index < 0)
index += TABLE_SIZE;
}
}
/*
** This is the expansion routine. It takes an LZW format file, and expands
** it to an output file. The code here should be a fairly close match to
** the algorithm in the accompanying article.
*/
void expand(FILE *input,FILE *output)
{
unsigned int next_code;
unsigned int new_code;
unsigned int old_code;
int character;
int counter;
unsigned char *string;
next_code=256; /* This is the next available code to define */
counter=0; /* Counter is used as a pacifier. */
printf("Expanding...\n");
old_code=input_code(input); /* Read in the first code, initialize the */
character=old_code; /* character variable, and send the first */
putc(old_code,output); /* code to the output file */
/*
** This is the main expansion loop. It reads in characters from the LZW file
** until it sees the special code used to inidicate the end of the data.
*/
while ((new_code=input_code(input)) != (MAX_VALUE))
{
if (++counter==1000) /* This section of code prints out */
{ /* an asterisk every 1000 characters */
counter=0; /* It is just a pacifier. */
printf("*");
}
/*
** This code checks for the special STRING+CHARACTER+STRING+CHARACTER+STRING
** case which generates an undefined code. It handles it by decoding
** the last code, and adding a single character to the end of the decode string.
*/
if (new_code>=next_code)
{
*decode_stack=character;
string=decode_string(decode_stack+1,old_code);
}
/*
** Otherwise we do a straight decode of the new code.
*/
else
string=decode_string(decode_stack,new_code);
/*
** Now we output the decoded string in reverse order.
*/
character=*string;
while (string >= decode_stack)
putc(*string--,output);
/*
** Finally, if possible, add a new code to the string table.
*/
if (next_code <= MAX_CODE)
{
prefix_code[next_code]=old_code;
append_character[next_code]=character;
next_code++;
}
old_code=new_code;
}
printf("\n");
}
/*
** This routine simply decodes a string from the string table, storing
** it in a buffer. The buffer can then be output in reverse order by
** the expansion program.
*/
unsigned char *decode_string(unsigned char *buffer,unsigned int code)
{
int i;
i=0;
while (code > 255)
{
*buffer++ = append_character[code];
code=prefix_code[code];
if (i++>=MAX_CODE)
{
printf("Fatal error during code expansion.\n");
exit(-3);
}
}
*buffer=code;
return(buffer);
}
/*
** The following two routines are used to output variable length
** codes. They are written strictly for clarity, and are not
** particularyl efficient.
*/
unsigned int input_code(FILE *input)
{
unsigned int return_value;
static int input_bit_count=0;
static unsigned long input_bit_buffer=0L;
while (input_bit_count <= 24)
{
input_bit_buffer |=
(unsigned long) getc(input) << (24-input_bit_count);
input_bit_count += 8;
}
return_value=input_bit_buffer >> (32-BITS);
input_bit_buffer <<= BITS;
input_bit_count -= BITS;
return(return_value);
}
void output_code(FILE *output,unsigned int code)
{
static int output_bit_count=0;
static unsigned long output_bit_buffer=0L;
output_bit_buffer |= (unsigned long) code << (32-BITS-output_bit_count);
output_bit_count += BITS;
while (output_bit_count >= 8)
{
putc(output_bit_buffer >> 24,output);
output_bit_buffer <<= 8;
output_bit_count -= 8;
}
}
Here's a document on an algorithm to do regex searching directly in LZW compressed bytes :
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.9.1434&rep=rep1&type=pdf
It contains references to efficient algorithms to search for exact strings as well.

Resources