Using zlib with the Win32 API CreateFile and ReadFile - zlib

I saw the 'zlib Usage Example' at www.zlib.net, but it uses the fread function from stdio.h.
Performance matters for my program, so I have to use the Win32 API function ReadFile instead.
But the example also contains this note:
This is an ugly hack required to avoid corruption of the input and output data on
Windows/MS-DOS systems. Without this, those systems would assume that the input and
output files are text, and try to convert the end-of-line characters from one standard
to another. That would corrupt binary data, and in particular would render the
compressed data unusable. This sets the input and output to binary which suppresses the
end-of-line conversions. SET_BINARY_MODE() will be used later on stdin and stdout, at
the beginning of main().
#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(__CYGWIN__)
# include <fcntl.h>
# include <io.h>
# define SET_BINARY_MODE(file) setmode(fileno(file), O_BINARY)
#else
# define SET_BINARY_MODE(file)
#endif
What should I do to use ReadFile with zlib?

Nothing. ReadFile reads raw bytes and never translates end-of-line characters, so no equivalent of SET_BINARY_MODE() is needed.

Related

getline implementation in C for Windows

Statement
I know there is a function called getline() on Linux/Unix operating systems.
I want to know which other functions are available on Linux/Unix operating systems but not on Windows.
Question
Is there a self-written getline() function that can replace the missing one on Windows?
What resources are available for reference and reading?
ssize_t getline(char **lineptr, size_t *n, FILE *stream);
The relevant standard is IEEE Std 1003.1, also called POSIX.1, specifically its System Interfaces, with the list of affected functions here.
I recommend Linux man pages online. Do not be deterred by its name, because the C functions (described in sections 2, 3) have a section Conforming to, which specifies which standards standardize the feature. If it is C89, C99, etc., then it is included in the C standard; if POSIX.1, then in POSIX; if SuS, then in Single Unix Specification which preceded POSIX; if 4.3BSD, then in old BSD version 4.3; if SVr4, then in Unix System V release 4, and so on.
Windows implements its own extensions to C, and avoids supporting anything POSIX or SuS, but has ported some details from 4.3BSD. Use Microsoft documentation to find out.
In Linux, the C libraries expose these features if certain preprocessor macros are defined before any #include statements are done. These are described at man 7 feature_test_macros. I typically use #define _POSIX_C_SOURCE 200809L for POSIX, and occasionally #define _GNU_SOURCE for GNU C extensions.
getline() is an excellent interface, and not "leaky" except perhaps when used by programmers used to Microsoft/Windows inanities, like not being able to do wide character output to console without Microsoft-only extensions (because they just didn't want to put that implementation inside fwide(), apparently).
The most common use pattern is to initialize an unallocated buffer, and a suitable line length variable:
char *line_buf = NULL;
size_t line_max = 0;
ssize_t len;
Then, when you read a line, the C library is free to reallocate the buffer to whatever size is needed to contain the line. For example, your read file line-by-line loop might look like this:
while (1) {
    len = getline(&line_buf, &line_max, stdin);
    if (len < 0)
        break;
    // line_buf has len characters of data in it, and line_buf[len] == '\0'.
    // If the input contained embedded '\0' bytes in it, then strlen(line_buf) < len.
    // Normally, strlen(line_buf) == len.
}
free(line_buf);
line_buf = NULL;
line_max = 0;
if (!feof(stdin) || ferror(stdin)) {
    // Not all of input was processed, or there was an error.
} else {
    // All input processed without I/O errors.
}
Note that free(NULL) is safe, and does nothing. This means that we can safely use free(line_buf); line_buf = NULL; line_max = 0; after the loop (in fact, at any point we want) to discard the current line buffer. If one is needed, the next getline() or getdelim() call with the same variables will allocate a new one.
The above pattern never leaks memory, and correctly detects all errors during file processing, from I/O errors to not having enough RAM available (or allowed for the current process), although it cannot distinguish between them: it only tells you that an error occurred. It also won't report false errors, unless you break out of the loop in your own added processing code.
Thus, any claims of getline() being "leaky" are anti-POSIX, pro-Microsoft propaganda. For some reason, Microsoft has steadfastly refused to implement these in their own C library, even though they easily could.
If you want to copy parts of the line, I do recommend using strdup() or strndup(), also POSIX.1-2008 functions. They return a dynamically allocated copy of the string, the latter only copying up to the specified number of characters (if the string does not end before that); in all cases, if the functions return a non-NULL pointer, the dynamically allocated string is terminated with a nul '\0', and should be freed with free() just like the getline() buffer above, when no longer needed.
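As a small illustration of copying part of a line, here is a minimal sketch that uses strndup() to extract the first word. The helper name first_word and the choice of whitespace characters are assumptions made for this example. The feature test macro has to appear before any #include.

```c
#define _POSIX_C_SOURCE 200809L   /* before any #include, to expose strndup() */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of the first
 * whitespace-delimited word in line (an empty string if there is none),
 * or NULL if the allocation fails. The caller frees it with free(). */
char *first_word(const char *line)
{
    size_t start = strspn(line, " \t\r\n");         /* skip leading blanks */
    size_t len = strcspn(line + start, " \t\r\n");  /* length of the word */

    return strndup(line + start, len);
}
```

Inside the getline() loop above, char *word = first_word(line_buf); would give a copy that stays valid after the next getline() call overwrites line_buf.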
If your code also has to run on Windows, a good option is to implement your own getline() on the architectures and OSes that do not provide one. (You can use the Pre-defined Compiler Macros Wiki to see how you can detect the code being compiled on a specific architecture, OS, or compiler.)
An example getline() implementation can be written on top of fgets(), growing the buffer and reading more (appending to existing buffer), until the buffer ends with a newline. It, however, cannot really handle embedded '\0' bytes in the data; to do that, and properly implement getdelim(), you need to read the data character-by-character, using e.g. fgetc().
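The character-by-character approach could look roughly like the sketch below. It is not the glibc implementation; the name my_getline, the ptrdiff_t return type (standing in for ssize_t, which Microsoft's C library does not define) and the growth policy are assumptions made for this example.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for POSIX getline() on platforms that lack it.
 * Returns the number of bytes stored (including the newline, if any),
 * or -1 at end of input with nothing read, or on error. */
ptrdiff_t my_getline(char **lineptr, size_t *n, FILE *stream)
{
    size_t used = 0;
    int c;

    if (!lineptr || !n || !stream)
        return -1;
    if (!*lineptr)
        *n = 0;                 /* no buffer yet, so ignore any stale size */

    while ((c = fgetc(stream)) != EOF) {
        /* Keep room for this byte plus the terminating '\0'. */
        if (used + 2 > *n) {
            size_t new_size = (*n < 64) ? 128 : *n * 2;
            char *tmp = realloc(*lineptr, new_size);
            if (!tmp)
                return -1;      /* the caller still owns the old buffer */
            *lineptr = tmp;
            *n = new_size;
        }
        (*lineptr)[used++] = (char)c;
        if (c == '\n')
            break;
    }

    if (used == 0)
        return -1;              /* end of input, or error before any data */

    (*lineptr)[used] = '\0';
    return (ptrdiff_t)used;
}
```

Because the length is returned separately from the terminating '\0', embedded '\0' bytes in the input survive, which the fgets()-based variant cannot offer.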

Initialize a `FILE *` variable in C?

I've got several oldish code bases that initialize variables (to be able to redirect input/output at will) like
FILE *usrin = stdin, *usrout = stdout;
However, gcc (8.1.1, glibc 2.27.9000 on Fedora rawhide) balks at this (initializer is not a compile time constant). Rummaging in /usr/include/stdio.h I see:
extern FILE *stdin;
/* C89/C99 say they're macros. Make them happy */
#define stdin stdin
First, it makes no sense to me that you can't initialize variables this (rather natural) way for such use. Sure, you can do it in later code, but it is a nuisance.
Second, why is the macro expansion not a constant?
Third, what is the rationale for having them be macros?
First, it makes no sense to me that you can't initialize variables this (rather natural) way for such use.
Second, why is the macro expansion not a constant?
stdin, stdout, and stderr are pointers which are initialized during C library startup, possibly as the result of a memory allocation. Their values aren't known at compile time -- depending on how your C library works, they might not even be constants. (For instance, if they're pointers to statically allocated structures, their values will be affected by ASLR.)
Third, what is the rationale for having them be macros?
It guarantees that #ifdef stdin will be true. This might have been added for compatibility with some very old programs which needed to handle systems which lacked support for stdio.
Classically, the values for stdin, stdout and stderr were variations on the theme of:
#define stdin (&__iob[0])
#define stdout (&__iob[1])
#define stderr (&__iob[2])
These are address constants and can be used in initializers for variables at file scope:
static FILE *def_out = stdout;
However, the C standard does not guarantee that the values are address constants that can be used like that. C11 §7.21 Input/output <stdio.h> says only that the macros are:
stderr, stdin, stdout
which are expressions of type "pointer to FILE" that point to the FILE objects associated, respectively, with the standard error, input, and output streams.
Sometime a decade or more ago, the GNU C Library changed their definitions so that you could no longer use stdin, stdout or stderr as initializers for variables at file scope, or static variables with function scope (though you can use them to initialize automatic variables in a function). So, old code that had worked for ages on many systems stopped working on Linux.
The macro expansion of stdin etc is either a simple identity expansion (#define stdin stdin) or equivalent (on macOS, #define stdout __stdoutp). These are variables, not address constants, so you can't copy the value of the variable in the file scope initializer. It is a nuisance, but the standard doesn't say they're address constants, so it is legitimate.
They're required to be macros because they always were macros, so it retains that much backwards compatibility with the dawn of the standard I/O library (circa 1978, long before there was a standard C library per se).
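One portable way to keep the old usrin/usrout idiom is to leave the file-scope pointers uninitialized and assign them at run time. The sketch below assumes a one-time call at the start of main() is acceptable; init_user_streams is a hypothetical name.

```c
#include <stdio.h>

/* Names follow the question; file-scope pointers start out as NULL. */
FILE *usrin;
FILE *usrout;

/* Call once at the start of main(), before any I/O through usrin/usrout.
 * Streams that were already redirected elsewhere are left alone. */
void init_user_streams(void)
{
    if (!usrin)
        usrin = stdin;
    if (!usrout)
        usrout = stdout;
}
```

This works with any C library because the assignment happens at run time, where stdin and stdout only need to be expressions, not address constants.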

Why do all the C files written by my lecturer start with a single # on the first line?

I'm going through some C course notes, and every C program source file begins with a single # on the first line of the program.
Then there are blank lines, and following that other stuff followed by the main function.
What is the reason for the #?
(It's out of term now and I can't really ask the chap.)
Here's an example:
#
#include <stdio.h>
int main() {
    printf("Hello, World!");
    return 0;
}
Wow, this requirement goes way back to the 1970s.
In the very early days of pre-standardised C, if you wanted to invoke the preprocessor, then you had to write a # as the first thing in the first line of a source file. Writing only a # at the top of the file affords flexibility in the placement of the other preprocessor directives.
From an original C draft by the great Dennis Ritchie himself:
12. Compiler control lines
[...] In order to cause [the] preprocessor to be invoked, it is necessary that the very
first line of the program begin with #. Since null lines are ignored by the preprocessor, this line need contain no other
information.
That document makes for great reading (and allowed me to jump on this question like a mad cat).
I suspect it's the lecturer simply being sentimental - it hasn't been required certainly since ANSI C.
It Does Nothing
As of the ISO standard of C/C++:
A preprocessing directive of the form
# new-line
has no effect.
So in today's compilers, the empty hash does nothing (much like a lone ; on a line, which also has no effect).
PS: In pre-standardized C, # new-line had an important role: it was used to invoke the C preprocessor (as pointed out by @Bathsheba). So the code here was either written in that period or comes from the programmer's habit.
Edit: recently I have come across code like this:
#ifdef ANDROID
#
#define DEVICE_TAG "ANDROID"
#define DEBUG_ENABLED
#
#else
#
#define DEVICE_TAG "NOT_ANDROID"
#
#endif /* ANDROID */
Here, those empty hashes are there only for making the code look good. It also improves readability by indicating that it is a preprocessor block.
To understand this, you need to know how C compilation works, that is, how source code is turned into an executable binary.
During compilation, the source code first passes through the preprocessor. The # symbol was introduced to mark the lines that the preprocessor should act on.
For example, if the source contains #define PI 3.141, then after preprocessing every occurrence of PI has been replaced with 3.141.
Likewise, #include <stdio.h> inserts the declarations of the standard I/O functions into your source code.
If you have a Linux machine, compile with gcc -save-temps source_code.c and look at the intermediate files the compiler writes.

wprintf doesn't print special characters and fgetws returns NULL when one is found

I have an issue with fgetws and wprintf.
fgetws returns NULL when it finds a special character in the file opened earlier. I don't have this problem with fgets.
I tried to use setlocale, as recommended here: fgetws fails to get the exact wide char string from FILE*
but it doesn't change anything.
Moreover, wprintf(L"éé"); prints ?? in the terminal (on Ubuntu 12); I don't have this problem with printf either. What can be done to avoid this?
Edit: as requested in the comments, here is the very simple code:
# include "sys.h"
#define MAX_LINE_LENGTH 1024
int main (void){
    FILE *File = fopen("D.txt", "r");
    wchar_t line[MAX_LINE_LENGTH];
    while (fgetws(line, MAX_LINE_LENGTH, File))
        wprintf(L"%S", line);
    fclose(File);
    return 0;
}
By default, when a program starts, it is running in the C locale, which is not guaranteed to support any characters except those needed for translating C programs. (It can contain more as an implementation detail, but you cannot rely on this.) In order to use wchar_t to store other characters and process them with the wide character conversion functions or wide stdio functions, you need to set a locale in which those characters are supported.
The locales available, and how they are named, vary by system, so you should not attempt to set a locale by name. Instead, pass "" to setlocale to request the "default" locale for the user or the system. On POSIX-like systems, this uses the LANG and LC_* environment variables to determine the preferred locale. As long as the characters you're trying to use exist in the user's locale, your wprintf should work.
The call to setlocale should look like:
setlocale(LC_CTYPE, "");
or:
setlocale(LC_ALL, "");
The former only applies the locale settings to character encoding/character type functions (things that process wchar_t). The latter also causes locale to be set for, and affect, a number of other things like message language, formatting of numbers and time, ...
One detail to note is that wide stdio functions bind the character encoding of the locale that's in use at the time the stream "becomes wide-oriented", i.e. on the first wide operation that's performed on it. So you need to call setlocale before using wprintf.
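Put together, the ordering described above could be sketched as follows. The helper names use_user_locale and copy_wide_lines are hypothetical, and the sketch uses fputws rather than the question's wprintf(L"%S", ...) so that it does not depend on how a particular C library interprets %S.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

#define MAX_LINE_LENGTH 1024

/* Selects the user's locale for character handling.
 * Returns 1 on success, 0 if the program stays in the "C" locale. */
int use_user_locale(void)
{
    return setlocale(LC_CTYPE, "") != NULL;
}

/* Hypothetical helper: copies in to out with wide stdio.
 * Returns the number of successful fgetws() calls (one per line when
 * every line fits in the buffer), or -1 if either stream reports an error. */
long copy_wide_lines(FILE *in, FILE *out)
{
    wchar_t line[MAX_LINE_LENGTH];
    long count = 0;

    while (fgetws(line, MAX_LINE_LENGTH, in)) {
        if (fputws(line, out) < 0)
            return -1;
        count++;
    }
    return (ferror(in) || ferror(out)) ? -1 : count;
}
```

In main(), call use_user_locale() first, then open the file and call copy_wide_lines(file, stdout), so that both streams become wide-oriented only after the locale is in place.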

Designing a C-like compiler

I'm developing a C-like compiler and I want to know how a compiler handles system includes.
Does the compiler read the entire source, store all the includes it finds in a list, and parse those includes only after it has finished reading the current file?
// file main.c
#include <stdio.h> // store in one list
// continue the parse ...
int main()
{
return 0;
}
// now, read the includes
// after finish the includes parse, gen code of sources
// just a sample
// file stdio.h
#include <types.h> // store in list
#include <bios.h> // store in list
void printf(...)
{
}
void scanf(...)
{
}
Btw, I have developed a system (only a test) that reads the includes by stopping the parse to read each include... (it's disgusting code, but it works...)
(link to sample) -> https://gist.github.com/4399601
Btw, what is the best way to read the includes and work with include files?
#include, #define, #ifdef and the like are processed by a separate pass called the preprocessor. It replaces the lines with #include with the included files. The resulting temporary source text is then fed to later passes like the tokenizer and parser.
Any line in C that begins with # is handled by the preprocessor, not the compiler. The preprocessor generates a file that the compiler then compiles. The contents of the file depend on whatever is #defined by the developer and the SDK.
Anything that begins with # is a preprocessor directive; the corresponding text is substituted at compile time. Preprocessing is the first stage of compilation.
The output of the preprocessor (a .i file) is then passed to the later stages of compilation.
Those later stages include the lexical analyzer, parser, optimizer and code generator.
If I were writing a compiler from scratch, I would first of all consider whether handling includes is a necessary part of the language - and if so, whether YOU have to write it, or could use an already existing one (such as the cpp part of gcc). The "fun" part of a compiler, after all, is the real compiling of the code, not reading files and replacing strings with other strings through macro expansion [although that can be quite fun too, of course - but you can write that once you have a compiler that works!].
The tricky part with include files isn't the including itself (a fairly trivial recursive function), but the parsing of #define/#ifdef/#if/#undef, and more importantly, the macro replacement that follows from them.
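To show how small the recursive part is, here is a rough sketch that handles only the #include "name" form relative to the current directory. The function name expand_includes and the depth limit are assumptions for illustration; a real preprocessor also needs search paths, the <name> form, include guards and conditional compilation.

```c
#include <stdio.h>
#include <string.h>

#define MAX_INCLUDE_DEPTH 16   /* assumed limit, guards against include cycles */

/* Hypothetical sketch: copies the file at path to out, replacing each
 * line of the form  #include "name"  with the expanded contents of name.
 * Returns 0 on success, -1 if a file cannot be opened, a write fails,
 * or the nesting is too deep. */
int expand_includes(const char *path, FILE *out, int depth)
{
    char line[1024], name[512];
    FILE *in;
    int status = 0;

    if (depth > MAX_INCLUDE_DEPTH)
        return -1;
    in = fopen(path, "r");
    if (!in)
        return -1;

    while (status == 0 && fgets(line, sizeof line, in)) {
        /* %511[^\"] reads the file name up to the closing quote. */
        if (sscanf(line, " # include \"%511[^\"]\"", name) == 1)
            status = expand_includes(name, out, depth + 1);
        else if (fputs(line, out) == EOF)
            status = -1;
    }

    fclose(in);
    return status;
}
```

The expanded text written to out is what the tokenizer and parser would then read, so the parser itself never has to stop and switch files.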
Have fun!
