Self contained C routine to print string - c

I would like to make a self contained C function that prints a string. This would be part of an operating system, so I can't use stdio.h. How would I make a function that prints the string I pass to it without using stdio.h? Would I have to write it in assembly?

Assuming you're doing this on an X86 PC, you'll need to read/write directly to video memory located at address 0xB8000. For color monitors you need to specify an ASCII character byte and an attribute byte, which can indicate color. It is common to use macros when accessing this memory:
#define VIDEO_BASE_ADDR 0xB8000
#define VIDEO_ADDR(x,y) (unsigned short *)(VIDEO_BASE_ADDR + 2 * ((y) * SCREEN_X_SIZE + (x)))
Then, you write your own IO routines around it. Below is a simple function I used to write from a screen buffer. I used this to help implement a crude scrolling ability.
void c_write_window(unsigned int x, unsigned int y, unsigned short c)
{
if ((win_offset + y) >= BUFFER_ROWS) {
int overlap = ((win_offset + y) - BUFFER_ROWS);
*VIDEO_ADDR(x,y) = screen_buffer[overlap][x] = c;
} else {
*VIDEO_ADDR(x,y) = screen_buffer[win_offset + y][x] = c;
}
}
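A string printer can be built on the same idea. The following is only a minimal sketch, assuming an 80x25 colour text mode and a grey-on-black attribute; print_string, SCREEN_Y_SIZE and ATTR_GREY_ON_BLACK are names made up for this example. The buffer base is passed in as a parameter so the routine can also be tried against an ordinary array.

```c
#include <stddef.h>

#define SCREEN_X_SIZE 80          /* assumption: 80 columns */
#define SCREEN_Y_SIZE 25          /* assumption: 25 rows */
#define ATTR_GREY_ON_BLACK 0x07   /* assumption: light grey on black */

/* Write a NUL-terminated string into a text-mode cell buffer, starting at
 * column x, row y. On real hardware `video` would be
 * (volatile unsigned short *)0xB8000. Output stops at the end of the screen.
 * Returns the number of characters written. */
size_t print_string(volatile unsigned short *video, unsigned int x,
                    unsigned int y, const char *s)
{
    size_t written = 0;
    size_t pos = (size_t)y * SCREEN_X_SIZE + x;
    const size_t limit = (size_t)SCREEN_X_SIZE * SCREEN_Y_SIZE;

    while (*s != '\0' && pos < limit) {
        /* low byte: ASCII character, high byte: attribute */
        video[pos++] = (unsigned short)((ATTR_GREY_ON_BLACK << 8) |
                                        (unsigned char)*s++);
        written++;
    }
    return written;
}
```

In a kernel you would call it as print_string((volatile unsigned short *)0xB8000, 0, 0, "Hello"). The sketch does not handle newlines or scrolling.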
To learn more about this, and other osdev topics, see http://wiki.osdev.org/Printing_To_Screen

You will probably want to look at, or possibly just use, the source to the stdio functions in the FreeBSD C library, which is BSD-licensed.
To actually produce output, you'll need at least some function that can write characters to your output device. To do this, the stdio routines end up calling write, which performs a syscall into the kernel.
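To illustrate that last step on a hosted POSIX system (not inside your own kernel), here is a rough sketch of a print routine that avoids stdio and calls write directly. print_str is a hypothetical name; in your own OS you would replace write with whatever output primitive your kernel provides.

```c
#include <string.h>
#include <unistd.h>

/* Print a string to standard output without stdio.
 * Returns the number of bytes written, or -1 on error. */
ssize_t print_str(const char *s)
{
    size_t len = strlen(s);
    size_t done = 0;

    while (done < len) {
        /* write may accept fewer bytes than requested, so loop */
        ssize_t n = write(STDOUT_FILENO, s + done, len - done);
        if (n <= 0)
            return -1;
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```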

Related

Place function instructions successively in program memory

Say I have a program which controls some Christmas lights (this isn't the actual application, only an example). These lights have a few different calculations to determine whether a light, i, will be lit in a given frame, t. Each of i and t is a uint8_t, so it can be assumed that there are 256 lights and t will loop each 256 frames. Some light patterns could be the following:
int flash(uint8_t t, uint8_t i) {
return t&1;}
int alternate(uint8_t t, uint8_t i) {
return (i&1) == (t&1);}
int loop(uint8_t t, uint8_t i) {
return i == t;}
If I then wanted to implement a mode-changing system that would loop through these modes, I could use a function pointer array int (*modes[3])(uint8_t, uint8_t). But, since these are all such short functions, is there any way I could instead force the compiler to place the functions directly after one another in program memory, sort of like an inline array?
The idea would be that to access one of these functions wouldn't require evaluating the pointer, and you could instead tell the processor the correct function is at modes + pitch*mode where pitch is the spacing between functions (at least the length of the longest).
I ask more out of curiosity than requirement, because I doubt this would actually cause much of a speed improvement.
What you are asking for is not directly available in C. Such logic is possible in assembler, though, and C compilers may use different assembler tricks depending on CPU, optimization level, etc. Try to keep the logic small and compact, mark the different functions as static, use a switch() block in C, and look at the assembler the compiler generates.
You could use a switch statement, like:
#define FLASH 1
#define ALTERNATE 2
#define LOOP 3
int patternexecute(uint8_t t, uint8_t i, int pattern)
{
switch (pattern) {
case FLASH: return t&1;
case ALTERNATE: return (i&1) == (t&1);
case LOOP: return i == t;
}
return 0;
}
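For comparison with the switch version, the function pointer table mentioned in the question could look roughly like this. It is only a sketch; mode_fn, MODE_COUNT and light_is_lit are names invented for the example, and the modulo is there just to keep an out-of-range mode from indexing past the table.

```c
#include <stdint.h>

typedef int (*mode_fn)(uint8_t t, uint8_t i);

static int flash(uint8_t t, uint8_t i)     { (void)i; return t & 1; }
static int alternate(uint8_t t, uint8_t i) { return (i & 1) == (t & 1); }
static int loop(uint8_t t, uint8_t i)      { return i == t; }

/* One pointer per mode; the compiler decides where the functions live. */
static const mode_fn modes[] = { flash, alternate, loop };
#define MODE_COUNT (sizeof modes / sizeof modes[0])

/* Returns nonzero if light i is lit in frame t for the given mode. */
int light_is_lit(unsigned mode, uint8_t t, uint8_t i)
{
    return modes[mode % MODE_COUNT](t, i);
}
```

Comparing the generated assembler for this and for the switch is probably the quickest way to see which one your compiler handles better.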

C preprocessor evaluate sin() of constant

Is there a way to convince the C preprocessor to evaluate a transcendental function of a constant at compile-time?
For example, replace (int)(256*sin(PI/4)) with 181. This will help me keep magic numbers out of my code.
If it makes a difference, I'm using MSPGCC 4.5.3 and I have no sin() or cos() available at runtime.
The C preprocessor can't provide sin() or cos().
For my applications, I use a perl script to create a separate .h file containing the needed precalculations. There are probably sexier ways to do it, but this integrates into my workflow well enough.
The preprocessor can only expand macros, which is quite different from evaluating a function. The closest solution I can think of to reduce your magic numbers is creating a header with the most used sine or cosine values:
#define SIN_PI 0
#define SIN_PI2 1
#define SIN_PI4 0.707106781186548
...
Then you can write:
256*SIN_PI2
And let the compiler optimization reduce it to a single constant.
As long as your arguments are all within the range [-π/4,+π/4], you can use the same formula that standard libm implementations use to compute sin. On that range it is accurate to within about 1 ulp, although IEEE 754 itself does not mandate that accuracy for sin:
static const double
half = 5.00000000000000000000e-01, /* 0x3FE00000, 0x00000000 */
S1 = -1.66666666666666324348e-01, /* 0xBFC55555, 0x55555549 */
S2 = 8.33333333332248946124e-03, /* 0x3F811111, 0x1110F8A6 */
S3 = -1.98412698298579493134e-04, /* 0xBF2A01A0, 0x19C161D5 */
S4 = 2.75573137070700676789e-06, /* 0x3EC71DE3, 0x57B1FE7D */
S5 = -2.50507602534068634195e-08, /* 0xBE5AE5E6, 0x8A2B9CEB */
S6 = 1.58969099521155010221e-10; /* 0x3DE5D93A, 0x5ACFD57C */
double __kernel_sin(double x, double y, int iy)
{
double z,r,v;
int ix;
ix = __HI(x)&0x7fffffff; /* high word of x */
if(ix<0x3e400000) /* |x| < 2**-27 */
{if((int)x==0) return x;} /* generate inexact */
z = x*x;
v = z*x;
r = S2+z*(S3+z*(S4+z*(S5+z*S6)));
if(iy==0) return x+v*(S1+z*r);
else return x-((z*(half*y-v*r)-y)-v*S1);
}
Source: http://www.netlib.org/fdlibm/k_sin.c
While not what I'd call pleasant, you definitely can convert that whole function into a macro that will evaluate to a (compile-time) floating point constant expression. (Ignore the bit hackery at the beginning that has nothing to do with the value, and as far as I know you should assume iy is 0.)
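A sketch of what such a macro could look like, assuming iy is 0 and |x| is at most π/4. The KSIN_* names and PI are made up for this example; the coefficients are the ones from k_sin.c above. Every operand is a floating constant, so the result can be used to initialise a static object.

```c
/* Coefficients copied from fdlibm's k_sin.c. */
#define KSIN_S1 (-1.66666666666666324348e-01)
#define KSIN_S2 ( 8.33333333332248946124e-03)
#define KSIN_S3 (-1.98412698298579493134e-04)
#define KSIN_S4 ( 2.75573137070700676789e-06)
#define KSIN_S5 (-2.50507602534068634195e-08)
#define KSIN_S6 ( 1.58969099521155010221e-10)

/* z = x*x, r = S2 + z*(S3 + z*(S4 + z*(S5 + z*S6))) */
#define KSIN_Z(x) ((x) * (x))
#define KSIN_R(x) (KSIN_S2 + KSIN_Z(x) * (KSIN_S3 + KSIN_Z(x) * \
                  (KSIN_S4 + KSIN_Z(x) * (KSIN_S5 + KSIN_Z(x) * KSIN_S6))))

/* sin(x) ~= x + x^3 * (S1 + z*r); only valid for |x| <= pi/4 */
#define KSIN(x) ((x) + (KSIN_Z(x) * (x)) * (KSIN_S1 + KSIN_Z(x) * KSIN_R(x)))

#define PI 3.14159265358979323846

/* Both initialisers are constant expressions, evaluated by the compiler. */
static const double sin_quarter_pi = KSIN(PI / 4);
static const int sin_quarter_pi_q8 = (int)(256 * KSIN(PI / 4));
```

The argument is expanded many times, so only pass it constants, never an expression with side effects.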
C++ is much more ambitious in the range of initialisation it will do.
Is there any possibility of switching to g++? Does mspgcc include it?
Edit: After a bit of searching their website and email archives, AFAICT mspgcc does not support g++ :-(
That'd be such an easy fix.

C: Overwrite another function byte by byte

Let's suppose I have a function:
int f1(int x){
// some more or less complicated operations on x
return x;
}
And that I have another function
int f2(int x){
// we simply return x
return x;
}
I would like to be able to do something like the following:
char* _f1 = (char*)f1;
char* _f2 = (char*)f2;
int i;
for (i=0; i<FUN_LENGTH; ++i){
_f1[i] = _f2[i];
}
I.e. I would like to interpret f1 and f2 as raw byte arrays and "overwrite f1 byte by byte" and thus, replace it by f2.
I know that usually callable code is write-protected, however, in my particular situation, you can simply overwrite the memory location where f1 is located. That is, I can copy the bytes over onto f1, but afterwards, if I call f1, the whole thing crashes.
So, is my approach possible in principle? Or are there some machine/implementation/whatsoever-dependent issues I have to take into consideration?
It would be easier to replace the first few bytes of f1 with a machine jump instruction to the beginning of f2. That way, you won't have to deal with any possible code relocation issues.
Also, the information about how many bytes a function occupies (FUN_LENGTH in your question) is normally not available at runtime. Using a jump would avoid that problem too.
For x86, the relative jump opcode you need is E9. This is a jump with a 32-bit relative offset, which means you need to calculate the offset between f2 and f1. On a 32-bit build, this code might do it:
int offset = (int)f2 - ((int)f1 + 5); // 5 bytes for size of instruction
char *pf1 = (char *)f1;
pf1[0] = 0xe9;
pf1[1] = offset & 0xff;
pf1[2] = (offset >> 8) & 0xff;
pf1[3] = (offset >> 16) & 0xff;
pf1[4] = (offset >> 24) & 0xff;
The offset is taken from the end of the JMP instruction, so that's why there is 5 added to the address of f1 in the offset calculation.
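The encoding step on its own can be sketched as a small helper that only fills a 5-byte buffer and never touches code memory, which makes it easy to check before poking anything. encode_jmp_rel32 is a hypothetical name, and the addresses are assumed to be 32-bit.

```c
#include <stdint.h>

/* Fill out[0..4] with a `JMP rel32` instruction that, when placed at
 * address `from`, jumps to address `to`. */
void encode_jmp_rel32(unsigned char out[5], uint32_t from, uint32_t to)
{
    /* The offset is relative to the end of the 5-byte instruction.
     * Unsigned arithmetic wraps, which gives the right two's complement
     * value for backward jumps. */
    uint32_t offset = to - (from + 5u);

    out[0] = 0xE9;                                   /* JMP rel32 opcode */
    out[1] = (unsigned char)(offset & 0xff);         /* little-endian offset */
    out[2] = (unsigned char)((offset >> 8) & 0xff);
    out[3] = (unsigned char)((offset >> 16) & 0xff);
    out[4] = (unsigned char)((offset >> 24) & 0xff);
}
```

For example, a jump placed at 0x1000 that targets 0x2000 encodes as E9 FB 0F 00 00.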
It's a good idea to step through the result with an assembly level debugger to make sure you're poking the correct bytes. Of course, this is all not standards compliant so if it breaks you get to keep both pieces.
Your approach is undefined behavior for the C standard.
And on many operating systems (e.g. Linux), your example will crash: the function code is inside the read only .text segment (and section) of the ELF executable, and that segment is (sort-of) mmap-ed read-only by execve (or by dlopen or by the dynamic linker), so you cannot write inside it.
Instead of trying to overwrite the function (which you've already found is fragile at best), I'd consider using a pointer to a function:
int complex_implementation(int x) {
// do complex stuff with x
return x;
}
int simple_implementation(int x) {
return x;
}
int (*f1)(int) = complex_implementation;
You'd use this something like:
for (int i=0; i<limit; i++) {
a = f1(a);
if (whatever_condition)
f1 = simple_implementation;
}
...and after the assignment, calling f1 would just return the input value.
Calling a function via a pointer does impose some overhead, but (thanks to that being common in OO languages) most compilers and CPUs do a pretty good job of minimizing that overhead.
Most memory architectures will stop you from writing over the function code; it will crash. On some embedded devices you can do this kind of thing, but it is dangerous unless you know there's enough space, the call will still work, the stack is going to be OK, etc.
Most likely there is a WAY better way to solve the problem.

Designing Around a Large Number of Discrete Functions in C

Greetings and salutations,
I am looking for information regarding design patterns for working with a large number of functions in C99.
Background:
I am working on a complete G-Code interpreter for my pet project, a desktop CNC mill. Currently, commands are sent over a serial interface to an AVR microcontroller. These commands are then parsed and executed to make the milling head move. A typical example of a line might look like
N01 F5.0 G90 M48 G1 X1 Y2 Z3
where G90, M48, and G1 are "action" codes and F5.0, X1, Y2, Z3 are parameters (N01 is the optional line number and is ignored). Currently the parsing is coming along swimmingly, but now it is time to make the machine actually move.
For each of the G and M codes, a specific action needs to be taken. This ranges from controlled motion to coolant activation/deactivation, to performing canned cycles. To this end, my current design features a function that uses a switch to select the proper function and return a pointer to that function which can then be used to call the individual code's function at the proper time.
Questions:
1) Is there a better way to resolve an arbitrary code to its respective function than a switch statement? Note that this is being implemented on a microcontroller and memory is EXTREMELY tight (2K total). I have considered a lookup table but, unfortunately, the code distribution is sparse leading to a lot of wasted space. There are ~100 distinct codes and sub-codes.
2) How does one go about function pointers in C when the names (and possibly signatures) may change? If the function signatures are different, is this even possible?
3) Assuming the functions have the same signature (which is where I am leaning), is there a way to typedef a generic type of that signature to be passed around and called from?
My apologies for the scattered questioning. Thank you in advance for your assistance.
1) Perfect hashing may be used to map the keywords to token numbers (opcodes) , which can be used to index a table of function pointers. The number of required arguments can also be put in this table.
2) You don't want overloaded / heterogeneous functions. Optional arguments might be possible.
3) Your only choice is to use varargs, IMHO.
I'm not an expert on embedded systems, but I have experience with VLSI. So sorry if I'm stating the obvious.
The function-pointer approach is probably the best way. But you'll need to either:
Arrange all your action codes to be consecutive in address.
Implement an action code decoder similar to an opcode decoder in a normal processor.
The first option is probably the better way (simple and small memory footprint). But if you can't control your action codes, you'll need to implement a decoder via another lookup table.
I'm not entirely sure on what you mean by "function signature". Function pointers should just be a number - which the compiler resolves.
EDIT:
Either way, I think two lookup tables (1 for function pointers, and one for decoder) is still going to be much smaller than a large switch statement. For varying parameters, use "dummy" parameters to make them all consistent. I'm not sure what the consequences of force casting everything to void-pointers to structs will be on an embedded processor.
EDIT 2:
Actually, a decoder can't be implemented with just a lookup table if the opcode space is too large. My mistake there. So 1 is really the only viable option.
Is there a better way ... than a switch statement?
Make a list of all valid action codes (a constant in program memory, so it doesn't use any of your scarce RAM), and sequentially compare each one with the received code. Perhaps reserve index "0" to mean "unknown action code".
For example:
// Warning: untested code.
typedef int (*ActionFunctionPointer)( int, int, char * );
struct parse_item{
const char action_letter;
const int action_number; // you might be able to get away with a single byte here, if none of your actions are above 255.
// alas, http://reprap.org/wiki/G-code mentions a "M501" code.
const ActionFunctionPointer action_function_pointer;
};
int unrecognized_action( int speed, int extrude_rate, char * message ); // special error-handler, defined elsewhere
int m0_handler( int speed, int extrude_rate, char * message ){ // M0: Stop
speed_x = 0; speed_y = 0; speed_z = 0; speed_e = 0;
return 0;
}
int g4_handler ( int dwell_time, int extrude_rate, char * message ){ // G4: Dwell
delay(dwell_time);
return 0;
}
const struct parse_item parse_table[] = {
{ '\0', 0, unrecognized_action }, // index 0: reserved for "unknown action code"
{ 'M', 0, m0_handler }, // M0: Stop
// ...
{ 'G', 4, g4_handler }, // G4: Dwell
{ '\0', 0, unrecognized_action } // end-of-table sentinel
};
ActionFunctionPointer get_action_function_pointer( char * buffer ){
char letter = get_letter( buffer );
int action_number = get_number( buffer );
int index = 0;
ActionFunctionPointer f = 0;
do{
index++; // index 0 is the reserved "unknown" entry, so the search starts at 1
if( (letter == parse_table[index].action_letter ) &&
(action_number == parse_table[index].action_number) ){
f = parse_table[index].action_function_pointer;
}
else if('\0' == parse_table[index].action_letter ){
// reached the end-of-table sentinel without finding a match
f = unrecognized_action;
}
}while(0 == f);
return f;
}
How does one go about function pointers in C when the names (and
possibly signatures) may change? If the function signatures are
different, is this even possible?
It's possible to create a function pointer in C that (at different times) points to functions with more or less parameters (different signatures) using varargs.
Alternatively, you can force all the functions that might possibly be pointed to by that function pointer to all have exactly the same parameters and return value (the same signature) by adding "dummy" parameters to the functions that require fewer parameters than the others.
In my experience, the "dummy parameters" approach seems to be easier to understand and use less memory than the varargs approach.
Is there a way to typedef a generic type of that signature
to be passed around and called from?
Yes.
Pretty much all the code I've ever seen that uses function pointers
also creates a typedef to refer to that particular type of function.
(Except, of course, for Obfuscated contest entries).
See the above example and Wikibooks: C programming: pointers to functions for details.
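To make the typedef and "dummy parameters" points concrete, here is a small self-contained sketch. It is not taken from any of the firmwares mentioned here; action_fn, action_entry, find_action and the stub handlers are all hypothetical, and the handlers only return a value so the dispatch can be checked.

```c
#include <stddef.h>

/* One shared signature for every handler; unused parameters are dummies. */
typedef int (*action_fn)(int arg1, int arg2, const char *message);

struct action_entry {
    char letter;        /* 'G' or 'M' */
    int number;         /* e.g. 4 for G4 */
    action_fn handler;
};

/* Stub handlers for illustration only. */
static int stop_handler(int a, int b, const char *m)
{ (void)a; (void)b; (void)m; return 0; }

static int dwell_handler(int dwell_ms, int unused, const char *m)
{ (void)unused; (void)m; return dwell_ms; }

static int unknown_handler(int a, int b, const char *m)
{ (void)a; (void)b; (void)m; return -1; }

static const struct action_entry action_table[] = {
    { 'M', 0, stop_handler },   /* M0: Stop */
    { 'G', 4, dwell_handler },  /* G4: Dwell */
};

/* Linear search; falls back to unknown_handler when nothing matches. */
action_fn find_action(char letter, int number)
{
    for (size_t i = 0; i < sizeof action_table / sizeof action_table[0]; i++) {
        if (action_table[i].letter == letter &&
            action_table[i].number == number)
            return action_table[i].handler;
    }
    return unknown_handler;
}
```

A caller would then write something like find_action('G', 4)(250, 0, NULL), and every handler can be stored, passed around and called through the same action_fn type.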
p.s.:
Is there some reason you are re-inventing the wheel?
Could one of the following pre-existing G-code interpreters for the AVR work for you, perhaps with a little tweaking?
FiveD,
Sprinter,
Marlin,
Teacup Firmware,
sjfw,
Makerbot,
or
Grbl?
(See http://reprap.org/wiki/Comparison_of_RepRap_Firmwares ).

Best way to convert whole file to lowercase in C

I was wondering if there's a really good (performant) way to convert a whole file to lowercase in C.
I use fgetc, convert each char to lowercase, and write it to a temp file with fputc. At the end I remove the original and rename the temp file to the original's name. But I think there must be a better solution.
This doesn't really answer the question (community wiki), but here's an (over?)-optimized function to convert text to lowercase:
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
int fast_lowercase(FILE *in, FILE *out)
{
char buffer[65536];
size_t readlen, wrotelen;
char *p, *e;
char conversion_table[256];
int i;
for (i = 0; i < 256; i++)
conversion_table[i] = tolower(i);
for (;;) {
readlen = fread(buffer, 1, sizeof(buffer), in);
if (readlen == 0) {
if (ferror(in))
return 1;
assert(feof(in));
return 0;
}
for (p = buffer, e = buffer + readlen; p < e; p++)
*p = conversion_table[(unsigned char) *p];
wrotelen = fwrite(buffer, 1, readlen, out);
if (wrotelen != readlen)
return 1;
}
}
This isn't Unicode-aware, of course.
I benchmarked this on an Intel Core 2 T5500 (1.66GHz), using GCC 4.6.0 and i686 (32-bit) Linux. Some interesting observations:
It's about 75% as fast when buffer is allocated with malloc rather than on the stack.
It's about 65% as fast using a conditional rather than a conversion table.
I'd say you've hit the nail on the head. Temp file means that you don't delete the original until you're sure that you're done processing it which means upon error the original remains. I'd say that's the correct way of doing it.
As suggested by another answer (if file size permits) you can do a memory mapping of the file via the mmap function and have it readily available in memory (no real performance difference if the file is less than the size of a page as it's probably going to get read into memory once you do the first read anyway)
You can usually get a little bit faster on big inputs by using fread and fwrite to read and write big chunks of the input/output. Also you should probably convert a bigger chunk (whole file if possible) into memory and then write it all at once.
Edit: I just remembered one more thing. Sometimes programs can be faster if you select a prime number (or at the very least not a power of 2) as the buffer size. I seem to recall this has to do with specifics of the caching mechanism.
If you're processing big files (big as in, say, multi-megabytes) and this operation is absolutely speed-critical, then it might make sense to go beyond what you've inquired about. One thing to consider in particular is that a character-by-character operation will perform less well than using SIMD instructions.
I.e. if you used SSE2, you could code a tolower_parallel like this (pseudocode):
for (cur_parallel_word = begin_of_block;
cur_parallel_word < end_of_block;
cur_parallel_word += parallel_word_width) {
/*
* in SSE2, parallel compares are either about 'greater' or 'equal'
* so '>=' and '<=' have to be constructed. This would use 'PCMPGTB'.
* The 'ALL' macro is supposed to replicate into all parallel bytes.
*/
mask1 = parallel_compare_greater_than(*cur_parallel_word, ALL('A' - 1));
mask2 = parallel_compare_greater_than(ALL('Z'), *cur_parallel_word);
/*
* vector op - and all bytes in two vectors, 'PAND'
*/
mask = mask1 & mask2;
/*
* vector op - add a vector of bytes. Would use 'PADDB'.
*/
new = parallel_add(*cur_parallel_word, ALL('a' - 'A'));
/*
* vector op - zero bytes in the original vector that will be replaced
*/
*cur_parallel_word &= ~mask; // that'd become 'PANDN'
/*
* vector op - extract characters from new that replace old, then or in.
*/
*cur_parallel_word |= (new & mask); // PAND / POR
}
I.e. you'd use parallel comparisons to check which bytes are uppercase, and then mask both the original value and the 'lowercased' version (one with the mask, the other with its inverse) before you OR them together to form the result.
If you use mmap'ed file access, this could even be performed in-place, saving on the bounce buffer, and saving on many function and/or system calls.
There is a lot to optimize when your starting point is a character-by-character 'fgetc' / 'fputc' loop; even shell utilities are highly likely to perform better than that.
But I agree that if your need is very special-purpose (i.e. something as clear-cut as ASCII input to be converted to lowercase) then a handcrafted loop as above, using vector instruction sets (like SSE intrinsics/assembly, or ARM NEON, or PPC Altivec), is likely to make a significant speedup possible over existing general-purpose utilities.
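For what it's worth, the pseudocode above maps fairly directly onto SSE2 intrinsics. The following is a sketch assuming an x86 target with SSE2 and plain ASCII input; lowercase_sse2 is a hypothetical name. Instead of the mask/or sequence it adds the masked delta, which gives the same result with one operation fewer.

```c
#include <emmintrin.h>
#include <stddef.h>

/* Convert ASCII uppercase letters in buf[0..len) to lowercase, in place. */
void lowercase_sse2(unsigned char *buf, size_t len)
{
    const __m128i below_A = _mm_set1_epi8('A' - 1);
    const __m128i above_Z = _mm_set1_epi8('Z' + 1);
    const __m128i delta   = _mm_set1_epi8('a' - 'A');
    size_t i = 0;

    for (; i + 16 <= len; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(buf + i));
        /* PCMPGTB is a signed compare, so bytes >= 0x80 count as negative
         * and fail the first test; they are left untouched. */
        __m128i ge_A = _mm_cmpgt_epi8(v, below_A);
        __m128i le_Z = _mm_cmpgt_epi8(above_Z, v);
        __m128i mask = _mm_and_si128(ge_A, le_Z);
        /* add 'a' - 'A' only where the mask is set */
        v = _mm_add_epi8(v, _mm_and_si128(mask, delta));
        _mm_storeu_si128((__m128i *)(buf + i), v);
    }

    /* scalar tail for the last len % 16 bytes */
    for (; i < len; i++) {
        if (buf[i] >= 'A' && buf[i] <= 'Z')
            buf[i] = (unsigned char)(buf[i] + ('a' - 'A'));
    }
}
```

Whether this beats a table lookup on a given machine is something you would have to measure.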
Well, you can definitely speed this up a lot, if you know what the character encoding is. Since you're using Linux and C, I'm going to go out on a limb here and assume that you're using ASCII.
In ASCII, we know A-Z and a-z are contiguous and always 32 apart. So, what we can do is ignore the safety checks and locale checks of the tolower() function and do something like this:
(pseudo code)
foreach (int) char c in the file:
c += 32.
Or, if there may be upper and lowercase letters, do a check like
if (c > 64 && c < 91) // the upper case ASCII range
then do the addition and write it out to the file.
Also, batch writes are faster, so I would suggest first writing to an array, then all at once writing the contents of the array to the file.
This should be considerably faster.
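In real C the check and the addition could look like the sketch below, which converts a whole in-memory buffer so that it can afterwards be written out in one go. ascii_lowercase is a made-up name, and the code assumes ASCII input.

```c
#include <stddef.h>

/* Lowercase ASCII letters in buf[0..len) in place; other bytes are kept. */
void ascii_lowercase(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] >= 'A' && buf[i] <= 'Z')
            buf[i] = (char)(buf[i] + ('a' - 'A'));   /* 'a' - 'A' is 32 */
    }
}
```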

Resources