Print int from signal handler using write or async-safe functions - c

I want to print a number to a log or to a terminal using write (or any async-signal-safe function) inside a signal handler. I would prefer not to use buffered I/O.
Is there an easy and recommended way to do that?
For example, in place of the printf below I would prefer write (or any async-signal-safe function).
void signal_handler(int sig)
{
pid_t pid;
int stat;
int old_errno = errno;
while((pid = waitpid(-1, &stat, WNOHANG)) > 0)
printf("child %d terminated\n", pid);
errno = old_errno;
return;
}
Printing strings is easy. In place of the printf above I can use (without printing pid):
write(STDOUT_FILENO, "child terminated", 16);

If you really insist on doing the printing from a signal handler, you basically have 2 options:
Block the signal except in a dedicated thread you create for handling the signal. This special thread can simply perform for (;;) pause(); and since pause is async-signal-safe, the signal handler is allowed to use any functions it wants; it's not restricted to only async-signal-safe functions. On the other hand, it does have to access shared resources in a thread-safe way, since you're now dealing with threads.
Write your own code for converting integers to decimal strings. It's just a simple loop of using %10 and /10 to peel off the last digit and storing them to a short array.
However, I would highly recommend getting this operation out of the signal handler, using the self-pipe trick or similar.
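The self-pipe trick mentioned above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the names self_pipe, on_signal, install_self_pipe and wait_for_signal are hypothetical, and it assumes a POSIX system. The handler only calls write(), which is async-signal-safe; all formatting happens later in normal program context.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical names; self_pipe[0] is the read end, self_pipe[1] the write end. */
static int self_pipe[2] = { -1, -1 };

/* Handler: forwards the signal number as one byte through the pipe. */
static void on_signal(int sig)
{
    int saved_errno = errno;
    unsigned char b = (unsigned char)sig;
    (void)write(self_pipe[1], &b, 1);   /* write() is async-signal-safe */
    errno = saved_errno;
}

/* Creates the pipe and installs the handler. Returns 0 on success, -1 on error. */
static int install_self_pipe(int sig)
{
    struct sigaction sa;
    int flags;

    if (pipe(self_pipe) == -1)
        return -1;
    /* Make the write end non-blocking so the handler never blocks on a full pipe. */
    flags = fcntl(self_pipe[1], F_GETFL);
    if (flags == -1 || fcntl(self_pipe[1], F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(sig, &sa, NULL);
}

/* Blocks until a byte arrives; returns the signal number, or -1 on error/EOF. */
static int wait_for_signal(void)
{
    unsigned char b;
    ssize_t n;

    do {
        n = read(self_pipe[0], &b, 1);
    } while (n == -1 && errno == EINTR);
    return n == 1 ? (int)b : -1;
}
```

The main loop can then call wait_for_signal() (or poll the read end together with its other file descriptors) and use printf freely, because that code no longer runs inside the handler.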

Implement your own async-signal-safe snprintf("%d") and use write
It is not as bad as I thought: How to convert an int to string in C? has several implementations.
The POSIX program below counts to stdout the number of times it received SIGINT so far, which you can trigger with Ctrl + C.
You can exit the program with Ctrl + \ (SIGQUIT).
main.c:
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
/* Calculate the minimal buffer size for a given type.
*
* Here we overestimate and reserve 8 chars per byte.
*
* With this size we could even print a binary string.
*
* - +1 for NUL terminator
* - +1 for '-' sign
*
* A tight limit for base 10 can be found at:
* https://stackoverflow.com/questions/8257714/how-to-convert-an-int-to-string-in-c/32871108#32871108
*
* TODO: get tight limits for all bases, possibly by looking into
* glibc's atoi: https://stackoverflow.com/questions/190229/where-is-the-itoa-function-in-linux/52127877#52127877
*/
#define ITOA_SAFE_STRLEN(type) (sizeof(type) * CHAR_BIT + 2)
/* async-signal-safe implementation of integer to string conversion.
*
* Null terminates the output string.
*
* The input buffer size must be large enough to contain the output,
* the caller must calculate it properly.
*
* @param[in] value Input integer value to convert.
* @param[out] result Buffer to output to.
* @param[in] base Base to convert to.
* @return Pointer to the end of the written string.
*/
char *itoa_safe(intmax_t value, char *result, int base) {
intmax_t tmp_value;
char *ptr, *ptr2, tmp_char;
if (base < 2 || base > 36) {
return NULL;
}
ptr = result;
do {
tmp_value = value;
value /= base;
*ptr++ = "ZYXWVUTSRQPONMLKJIHGFEDCBA9876543210123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"[35 + (tmp_value - value * base)];
} while (value);
if (tmp_value < 0)
*ptr++ = '-';
ptr2 = result;
result = ptr;
*ptr-- = '\0';
while (ptr2 < ptr) {
tmp_char = *ptr;
*ptr--= *ptr2;
*ptr2++ = tmp_char;
}
return result;
}
volatile sig_atomic_t global = 0;
void signal_handler(int sig) {
char buf[ITOA_SAFE_STRLEN(sig_atomic_t)];
enum { base = 10 };
char *end;
end = itoa_safe(global, buf, base);
*end = '\n';
write(STDOUT_FILENO, buf, end - buf + 1);
global += 1;
signal(sig, signal_handler);
}
int main(int argc, char **argv) {
/* Unit test itoa_safe. */
{
typedef struct {
intmax_t n;
int base;
char out[1024];
} InOut;
char result[1024];
size_t i;
InOut io;
InOut ios[] = {
/* Base 10. */
{0, 10, "0"},
{1, 10, "1"},
{9, 10, "9"},
{10, 10, "10"},
{100, 10, "100"},
{-1, 10, "-1"},
{-9, 10, "-9"},
{-10, 10, "-10"},
{-100, 10, "-100"},
/* Base 2. */
{0, 2, "0"},
{1, 2, "1"},
{10, 2, "1010"},
{100, 2, "1100100"},
{-1, 2, "-1"},
{-100, 2, "-1100100"},
/* Base 35. */
{0, 35, "0"},
{1, 35, "1"},
{34, 35, "Y"},
{35, 35, "10"},
{100, 35, "2U"},
{-1, 35, "-1"},
{-34, 35, "-Y"},
{-35, 35, "-10"},
{-100, 35, "-2U"},
};
for (i = 0; i < sizeof(ios)/sizeof(ios[0]); ++i) {
io = ios[i];
itoa_safe(io.n, result, io.base);
if (strcmp(result, io.out)) {
printf("%jd %d %s\n", io.n, io.base, io.out);
assert(0);
}
}
}
/* Handle the signals. */
if (argc > 1 && !strcmp(argv[1], "1")) {
signal(SIGINT, signal_handler);
while(1);
}
return EXIT_SUCCESS;
}
Compile and run:
gcc -std=c99 -Wall -Wextra -o main main.c
./main 1
After pressing Ctrl + C fifteen times, the terminal shows:
^C0
^C1
^C2
^C3
^C4
^C5
^C6
^C7
^C8
^C9
^C10
^C11
^C12
^C13
^C14
Here is a related program that creates a more complex format string: How to avoid using printf in a signal handler?
Tested on Ubuntu 18.04. GitHub upstream.

You can use string handling functions (e.g. strcat) to build the string and then write it in one go to the desired file descriptor (e.g. STDERR_FILENO for standard error).
To convert integers (up to 64-bit wide, signed or unsigned) to strings I use the following functions (C99), which support minimal formatting flags and common number bases (8, 10 and 16).
#include <stdbool.h>
#include <inttypes.h>
#define STRIMAX_LEN 21 // = ceil(log10(INTMAX_MAX)) + 2
#define STRUMAX_LEN 25 // = ceil(log8(UINTMAX_MAX)) + 3
static int strimax(intmax_t x,
char buf[static STRIMAX_LEN],
const char mode[restrict static 1]) {
/* safe absolute value */
uintmax_t ux = (x == INTMAX_MIN) ? (uintmax_t)INTMAX_MAX + 1
: (uintmax_t)imaxabs(x);
/* parse mode */
bool zero_pad = false;
bool space_sign = false;
bool force_sign = false;
for(const char *c = mode; '\0' != *c; ++c)
switch(*c) {
case '0': zero_pad = true; break;
case '+': force_sign = true; break;
case ' ': space_sign = true; break;
case 'd': break; // decimal (always)
}
int n = 0;
char sign = (x < 0) ? '-' : (force_sign ? '+' : ' ');
buf[STRIMAX_LEN - ++n] = '\0'; // NUL-terminate
do { buf[STRIMAX_LEN - ++n] = '0' + ux % 10; } while(ux /= 10);
if(zero_pad) while(n < STRIMAX_LEN - 1) buf[STRIMAX_LEN - ++n] = '0';
if(x < 0 || force_sign || space_sign) buf[STRIMAX_LEN - ++n] = sign;
return STRIMAX_LEN - n;
}
static int strumax(uintmax_t ux,
char buf[static STRUMAX_LEN],
const char mode[restrict static 1]) {
static const char lbytes[] = "0123456789abcdefx";
static const char ubytes[] = "0123456789ABCDEFX";
/* parse mode */
int base = 10; // default is decimal
int izero = 4;
bool zero_pad = false;
bool alternate = false;
const char *bytes = lbytes;
for(const char *c = mode; '\0' != *c; ++c)
switch(*c) {
case '#': alternate = true; if(base == 8) izero = 1; break;
case '0': zero_pad = true; break;
case 'd': base = 10; izero = 4; break;
case 'o': base = 8; izero = (alternate ? 1 : 2); break;
case 'x': base = 16; izero = 8; break;
case 'X': base = 16; izero = 8; bytes = ubytes; break;
}
int n = 0;
buf[STRUMAX_LEN - ++n] = '\0'; // NUL-terminate
do { buf[STRUMAX_LEN - ++n] = bytes[ux % base]; } while(ux /= base);
if(zero_pad) while(n < STRUMAX_LEN - izero) buf[STRUMAX_LEN - ++n] = '0';
if(alternate && base == 16) {
buf[STRUMAX_LEN - ++n] = bytes[base];
buf[STRUMAX_LEN - ++n] = '0';
} else if(alternate && base == 8 && '0' != buf[STRUMAX_LEN - n])
buf[STRUMAX_LEN - ++n] = '0';
return STRUMAX_LEN - n;
}
They can be used like this:
#include <unistd.h>
int main (void) {
char buf[STRIMAX_LEN]; int buf_off;
buf_off = strimax(12345,buf,"+");
write(STDERR_FILENO,buf + buf_off,STRIMAX_LEN - buf_off);
}
that outputs:
+12345

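If you insist on using a printf()-style function inside a signal handler, you can always roll your own version that does not rely on buffered stdio output. Note that vsnprintf() is not on the POSIX list of async-signal-safe functions, so this is only as safe as your C library's implementation:
#include <stdarg.h> /* va_list, va_start(), va_end() */
#include <stdio.h> /* vsnprintf(), fileno() */
#include <unistd.h> /* write() */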
void myprintf(FILE *fp, char *fmt, ...)
{
char buff[512];
int rc,fd;
va_list argh;
va_start (argh, fmt);
rc = vsnprintf(buff, sizeof buff, fmt, argh);
va_end (argh);
if (rc < 0 || (size_t)rc >= sizeof buff) {
rc = sprintf(buff, "Argh!: %d:\n", rc);
}
if (!fp) fp = stderr;
fd = fileno(fp);
if (fd < 0) return;
if (rc > 0) write(fd, buff, rc);
return;
}

Related

How to execute faster than "snprintf(mystr, 22, "{%+0.4f,%+0.4f}", (double)3.14159265, (double) 2.718281828459);" on a 32 bit mcu

I've tried a few things, and it seems that at best I'm 1.5x slower than the printf() family of functions, which boggles my mind a bit. I think what I'm up against in this situation is that the addressing of my device is 32-bit, and I don't have an FPU. I've tried a couple of "ftoa()" implementations and constrained them to only look for 2 digits on the left of the decimal point, and left myself some breadcrumbs as to what the total length is of a larger overall string that I'm trying to build. At the end of the day, it seems like the nature of an array of 8-bit elements on a 32-bit system is leading to a bunch of hidden shift operations, bitwise "OR" and bitwise NAND operations that are just slowing things down ridiculously...
Anyone have any general tips for this situation? (other than a re-architect to an 8.24 fixed point design) I've tried the compiler optimizations from wysiwyg to execution speed focused, nothing seems to beat snprintf.
Here's the fastest one that I had tried:
#if (__DEBUG)
#define DATA_FIFO_SIZE (8)
#else
#define DATA_FIFO_SIZE (1024)
#endif
typedef struct
{
int32_t rval[4];
double cval[4];
uint16_t idx;
uint16_t padding; //@attention the compiler was padding with 2 bytes to align to 32bit
} data_fifo_entry;
const char V_ERR_MSG[7] = "ERROR,\0";
static data_fifo_entry data_fifo[DATA_FIFO_SIZE];
static char embed_text[256];
/****
* float to ASCII, adapted from
* https://stackoverflow.com/questions/2302969/how-to-implement-char-ftoafloat-num-without-sprintf-library-function-i#7097567
*
****/
//@attention the following floating point #defs are linked!!
#define MAX_DIGITS_TO_PRINT_FLOAT (6)
#define MAX_SUPPORTED_PRINTABLE_FLOAT (+999999.99999999999999999999999999)
#define MIN_SUPPORTED_PRINTABLE_FLOAT (-999999.99999999999999999999999999)
#define FLOAT_TEST6 (100000.0)
#define FLOAT_TEST5 (10000.0)
#define FLOAT_TEST4 (1000.0)
#define FLOAT_TEST3 (100.0)
#define FLOAT_TEST2 (10.0)
#define FLOAT_TEST1 (1.0)
static inline int ftoa(char *s, const float f_in, const uint8_t precision)
{
float f_p = 0.0001;
float n = f_in;
int neg = (n < 0.0);
int length = 0;
switch (precision)
{
case (1):
{
f_p = 0.1;
break;
}
case (2):
{
f_p = 0.01;
break;
}
case (3):
{
f_p = 0.001;
break;
}
//case (4) is the default assumption
case (5):
{
f_p = 0.00001;
break;
}
case (6):
{
f_p = 0.000001;
break;
}
default: //already assumed, no assignments here
{
break;
}
} /* switch */
// handle special cases
if (isnan(n))
{
strcpy(s, "nan\0");
length = 4;
}
else if ((isinf(n)) || (n >= MAX_SUPPORTED_PRINTABLE_FLOAT) ||
(n <= MIN_SUPPORTED_PRINTABLE_FLOAT))
{
strcpy(s, "inf\0");
length = 4;
}
else if (n == 0.0)
{
int idx;
s[length++] = '+';
s[length++] = '0';
s[length++] = '.';
for (idx = 0; idx < precision; idx++)
{
s[length++] = '0';
}
s[length++] = '\0';
}
else if (((n > 0.0) && (n < f_p)) || ((n < 0.0) && ((-1.0 * n) < f_p)))
{
int idx;
if (n >= 0.0)
{
s[length++] = '+';
}
else
{
s[length++] = '-';
}
s[length++] = '0';
s[length++] = '.';
for (idx = 1; idx < precision; idx++)
{
s[length++] = '0';
}
s[length++] = '\0';
}
else
{
int digit, m;
if (neg)
{
n = -n;
}
// calculate magnitude
if (n >= FLOAT_TEST6)
{
m = 6;
}
else if (n >= FLOAT_TEST5)
{
m = 5;
}
else if (n >= FLOAT_TEST4)
{
m = 4;
}
else if (n >= FLOAT_TEST3)
{
m = 3;
}
else if (n >= FLOAT_TEST2)
{
m = 2;
}
else if (n >= FLOAT_TEST1)
{
m = 1;
}
else
{
m = 0;
}
if (neg)
{
s[length++] = '-';
}
else
{
s[length++] = '+';
}
// set up for scientific notation
if (m < 1.0)
{
m = 0;
}
// convert the number
while (n > f_p || m >= 0)
{
double weight = pow(10.0, m);
if ((weight > 0) && !isinf(weight))
{
digit = floor(n / weight);
n -= (digit * weight);
s[length++] = '0' + digit;
}
if ((m == 0) && (n > 0))
{
s[length++] = '.';
}
m--;
}
s[length++] = '\0';
}
return (length - 1);
} /* ftoa */
static inline void print2_and_idx(int8_t idx1, int8_t idx2, uint16_t fifo_idx)
{
//@attention 10 characters already in the buffer, idx does NOT start at zero
uint8_t idx = V_PREFIX_LENGTH;
char scratch[16] = {'\0'};
char * p_fifo_id;
if ((idx1 >= 0) && (idx1 < MAX_IDX) && (idx2 >= 0) && (idx2 < MAX_IDX) &&
(fifo_idx >= 0) && (fifo_idx < DATA_FIFO_SIZE))
{
ftoa(scratch, data_fifo[fifo_idx].cval[idx1], 4);
memcpy((void *)&embed_text[idx += 7], (void *)scratch, 7);
embed_text[idx++] = ',';
ftoa(scratch, data_fifo[fifo_idx].cval[idx2], 4);
memcpy((void *)&embed_text[idx += 7], (void *)scratch, 7);
embed_text[idx++] = ',';
//!\todo maybe print the .idx as fixed width, zero pad to 5 digits
p_fifo_id = utoa((char *)&embed_text[idx], (unsigned int)data_fifo[fifo_idx].idx, 10);
idx += strlen(p_fifo_id);
embed_text[idx++] = ',';
}
else
{
memcpy((void *)&embed_text[idx], (void *)V_ERR_MSG, 7);
}
} /* print2_and_idx */
Instead of using *printf() with FP arguments, first convert the FP values into scaled integers.
Even while still calling snprintf(), but with only integer and character arguments, my code was about 20x faster than the baseline.
Your mileage may vary.
//baseline
void format2double_1(char *mystr, double pi, double e) {
snprintf(mystr, 22, "{%+0.4f,%+0.4f}", pi, e);
//puts(mystr);
}
void format2double_2(char *mystr, double pi, double e) {
int pi_i = (int) lrint(pi * 10000.0);
int api_i = abs(pi_i);
int e_i = (int) lrint(e * 10000.0);
int ae_i = abs(e_i);
snprintf(mystr, 22, "{%c%d.%04d,%c%d.%04d}", //
"+-"[pi_i < 0], api_i / 10000, api_i % 10000, //
"+-"[e_i < 0], ae_i / 10000, ae_i % 10000);
//puts(mystr);
}
[edit]
For a proper -0.0 text, use "+-"[!!signbit(pi)]
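To illustrate why signbit matters here, the sketch below compares the two ways of picking the sign character. The helper names sign_by_compare and sign_by_signbit are hypothetical, and it assumes IEEE 754 doubles, where -0.0 compares equal to 0.0 but still carries a set sign bit.

```c
#include <math.h>

/* Picks '-' only when x compares less than zero; -0.0 is treated as positive. */
static char sign_by_compare(double x)
{
    return "+-"[x < 0];
}

/* Picks '-' whenever the sign bit is set, which includes -0.0. */
static char sign_by_signbit(double x)
{
    return "+-"[!!signbit(x)];
}
```

Which behaviour is wanted depends on whether the consumer of the text distinguishes -0.0000 from +0.0000.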
[edit]
Some idea for OP to consider as a ftoa() replacement. Central code is lrint(f_in * fscale[precision]); which rounds and scales. Untested.
#define PRINTABLE_MAGNITUDE_LIMIT 1000000
int ftoa_1(char *s, const float f_in, const uint8_t precision) {
int n;
sprintf(s, "%+.*f%n", precision, f_in, &n);
return n;
}
int ftoa_2(char *s, const float f_in, const uint8_t precision) {
float fscale[] = { 1, 10, 100, 1000, 10000, 100000, 1000000 };
long iscale[] = { 1, 10, 100, 1000, 10000, 100000, 1000000 };
assert(precision > 0 && precision < sizeof fscale / sizeof fscale[0]);
// gross range check
if (f_in > -PRINTABLE_MAGNITUDE_LIMIT && f_in < PRINTABLE_MAGNITUDE_LIMIT) {
long value = lrint(f_in * fscale[precision]);
value = labs(value);
long scale = iscale[precision];
long ipart = value / scale;
long fpart = value % scale;
// fine range check
if (ipart < PRINTABLE_MAGNITUDE_LIMIT) {
int n;
sprintf(s, "%c%ld.%0*ld%n", signbit(f_in) ? '-' : '+', ipart, precision,
fpart, &n);
return n;
}
}
// Out of range values need not be of performance concern for now.
return ftoa_1(s, f_in, precision);
}
[edit]
To convert a positive or 0 integer to a string quickly without the need to shift the buffer or reverse it, see below. It also returns the string length for subsequent string building.
// Convert an unsigned to a decimal string and return its length
size_t utoa_length(char *dest, unsigned u) {
size_t len = 0;
if (u >= 10) {
len = utoa_length(dest, u/10);
dest += len;
}
dest[0] = '0' + u%10;
dest[1] = '\0';
return len + 1;
}
In a similar vein to @chux's answer, if the remaining snprintf is still slow you can go down the rabbit hole of hand-composing strings/hand-rendering integers.
char *fmtp04f(char *buf, char *lim, double d) {
// if there's no space at all don't bother
if(buf==lim) return buf;
// 10 characters in maximum 32 bit integer, one for the dot,
// one for the terminating NUL in debug prints
char b[12];
// current position in the buffer
char *bp = b;
// scale and round
int32_t i = lrint(d * 10000.);
// write sign and fix i sign
// (we do have at least one character available in buf)
if(signbit(d)) {
*buf++='-';
i = -i;
} else {
*buf++='+';
}
// *always* write down the last 4 digits, even if they are zeroes
// (they'll become the 4 digits after the decimal dot)
for(; bp!=b+4; ) {
*bp++ = '0' + i%10;
i/=10;
}
*bp++='.';
// write down the remaining digits, writing at least one
do {
*bp++ = '0' + i%10;
i/=10;
} while(i != 0);
// bp is at the character after the last, step back
--bp;
// data is now into b *in reversed order*;
// reverse-copy it into the user-provided buffer
while(buf!=lim) {
*buf++ = *bp;
// check before decrementing, as a pointer to one-before-first
// is not allowed in C
if(bp == b) break;
--bp;
}
if(buf!=lim) *buf=0; // "regular" case: terminate *after*
else lim[-1]=0; // bad case: truncate
return buf;
}
void doformat(char *buf, char *lim, double a, double b) {
if(buf==lim) return; // cannot do anything
*buf++='{';
if(buf==lim) goto end;
buf = fmtp04f(buf, lim, a);
if(buf==lim) return; // already terminated by fmtp04f
*buf++=',';
if(buf==lim) goto end;
buf = fmtp04f(buf, lim, b);
if(buf==lim) return; // idem
*buf++='}';
if(buf==lim) goto end;
*buf++=0;
end:
lim[-1]=0; // always terminate
}
It passes some random tests, so I'm reasonably confident that it is not too wrong.
For some reason, @chux's version on my machine (64-bit Linux, gcc 6.3) is generally 2 to 3 times faster than the baseline, while my version is usually 10 to 30 times faster than the baseline. I don't know if this is because my snprintf is particularly good or particularly bad. As said above, YMMV.

Instead of printing the binary number out how would I store it as a variable?

I pass a hex number into hex2bin and it prints the binary number correctly, but I don't want it printed. I want the function to return the number so I can use it to find its cardinality. How would I store the number instead of printing it out?
int hex2bin (int n){
int i,k,mask;
for(i = sizeof(int) * 8 - 1; i >= 0; i--){
mask = 1 << i;
k = n & mask;
k == 0 ? printf("0"):printf("1");
}
return 0;
}
Perhaps something like this?
int result = 0;
int i, k...
...
result = result | (((k == 0) ? 0 : 1) << i);
...
return result;
Instead of being clever with an int, you could of course also simply use an array of variables instead.
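If the goal is only the cardinality (the number of set bits), the bits can also be counted directly from the integer, without building a string or array first. A minimal sketch, using the hypothetical helper name cardinality and the well-known trick that n & (n - 1) clears the lowest set bit:

```c
/* Counts the set bits ("cardinality") of n.
 * Each iteration clears the lowest set bit, so the loop
 * runs once per set bit rather than once per bit position. */
static unsigned cardinality(unsigned n)
{
    unsigned count = 0;

    while (n != 0) {
        n &= n - 1;   /* clear the lowest set bit */
        ++count;
    }
    return count;
}
```

Taking an unsigned argument avoids the questions that shifting or masking the sign bit of a signed int would raise.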
Store the number in a string whose space is provided by a compound literal (Available since C99).
It works like OP's flow: Loop up to sizeof(int) * 8 times, finding the value of 1 bit and print/save it.
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
// Maximum buffer size needed
#define UTOA_BASE_2 (sizeof(unsigned)*CHAR_BIT + 1)
char *utoa_base2(char *s, unsigned x) {
s += UTOA_BASE_2 - 1;
*s = '\0';
do {
*(--s) = "01"[x % 2];
x /= 2;
} while (x);
return s;
}
#define TO_BASE2(x) utoa_base2((char [UTOA_BASE_2]){0} , (x))
void test(unsigned x) {
printf("base10:%10u base2:%5s ", x, TO_BASE2(x));
char *s = TO_BASE2(x);
// do stuff with `s`; it is valid until the end of this block
printf("%s\n", s);
}
int main(void) {
test(0);
test(25);
test(UINT_MAX);
}
Sample output
base10:         0 base2:    0 0
base10:        25 base2:11001 11001
base10:4294967295 base2:11111111111111111111111111111111 11111111111111111111111111111111
This is a variation of this base-n answer.
You can use the strcat function to do that.
Note that the new hex2bin function in this answer assumes that the parameter char *buf has already been allocated and can hold at least 1+sizeof(int)*8 bytes including the null terminator:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// assume: buf is at least length 33
int hex2bin (int n, char *buf)
{
int i,k,mask;
for(i = sizeof(int) * 8 - 1; i >= 0; i--){
mask = 1 << i;
k = n & mask;
k == 0 ? strcat(buf, "0") : strcat(buf, "1");
}
return 0;
}
int main()
{
int n = 66555;
char buffer[1+sizeof(int)*8] = { 0 } ;
hex2bin(n, buffer);
printf("%s\n", buffer);
return 0;
}
I hope you will find this helpful :)
bool convertDecimalBNR(INT32 nDecimalValue, UINT32 * punFieldValue, INT32 nBitCount, DecimalBNRType * pDecimalSpecification)
{
bool bBNRConverted = false;
INT32 nBitIndex = nBitCount - 1;
INT32 nBitValue = anTwoExponents[nBitIndex];
*punFieldValue = 0;
if ((nDecimalValue >= pDecimalSpecification->nMinValue) && (nDecimalValue <= pDecimalSpecification->nMaxValue))
{
// if the value is negative, then add (-1 * (2 ^ (nBitCount - 1))) on itself and go on just like a positive value calculation.
if (nDecimalValue < 0)
{
nDecimalValue += nBitValue;
nBitIndex--;
nBitValue /= 2;
*punFieldValue |= BIT_0_ONLY_ONE;
}
while (nBitIndex >= 0)
{
*punFieldValue = (*punFieldValue << 1);
if (nDecimalValue >= nBitValue)
{
nDecimalValue -= nBitValue;
*punFieldValue |= BIT_0_ONLY_ONE;
}
nBitIndex--;
nBitValue /= 2;
}
if (nDecimalValue <= nBitValue)
{
bBNRConverted = true;
}
}
return (bBNRConverted);
}

How to store a digit string into an arbitrary large integer?

Let ib be the input base and ob the output base. str is the ASCII representation of some arbitrary large integer x. I need to define f such as:
f(str="1234567890", ib=10, ob=16) = {4, 9, 9, 6, 0, 2, 13, 2}
... where the return type of f is an int array containing the base ob digits of this integer. We assume that 2 <= ob <= INT_MAX and 2 <= ib <= 10, and that str will always be a valid string (no negative numbers needed).
Something to get OP started, but enough to leave OP to enjoy the coding experience.
// form (*d) = (*d)*a + b
static void mult_add(int *d, size_t *width, int ob, int a, int b) {
// set b as the carry
// for *width elements,
// x = (Multiply d[] by `a` (using wider than int math) and add carry)
// d[] = x mod ob
// carry = x/ob
// while (carry <> 0)
// widen d
// x = carry
// d[] = x mod ob
// carry = x/ob
}
int *ql_f(const char *src, int ib, int ob) {
// Validate input
assert(ib >= 2 && ib <= 10);
assert(ob >= 2 && ob <= INT_MAX);
assert(src);
// Allocate space
size_t length = strlen(src);
// + 2 + 4 is overkill, OP to validate and right-size later
size_t dsize = (size_t) (log(ib)/log(ob)*length + 2 + 4);
int *d = malloc(sizeof *d * dsize);
assert(d);
// Initialize d to zero
d[0] = 0;
size_t width = 1;
while (*src) {
mult_add(d, &width, ob, ib, *src - '0');
src++;
}
// add -1 to end, TBD code
return d;
}
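For readers who want to see the outline above filled in, here is one possible completion. It is a sketch under stated assumptions, not the answerer's code: the name convert_base, the extra cap parameter, and returning the digit count through out_len (instead of a -1 terminator) are my own choices, and it assumes long long is wider than int. The buffer bound avoids floating point: since ib <= 10 < 16, each input digit adds fewer than 4 bits, so 4 * length + 1 digits suffice even for ob = 2.

```c
#include <stdlib.h>
#include <string.h>

/* d[] holds *width base-ob digits, least significant first.
 * Computes d = d * a + b in place. Returns 0, or -1 if cap would be exceeded. */
static int mult_add(int *d, size_t *width, size_t cap, int ob, int a, int b)
{
    long long carry = b;

    for (size_t i = 0; i < *width; ++i) {
        long long x = (long long)d[i] * a + carry;
        d[i] = (int)(x % ob);
        carry = x / ob;
    }
    while (carry != 0) {            /* widen d for the remaining carry */
        if (*width == cap)
            return -1;
        d[(*width)++] = (int)(carry % ob);
        carry /= ob;
    }
    return 0;
}

/* Returns a malloc'd array of base-ob digits, most significant first,
 * and stores the digit count in *out_len. The caller frees the array.
 * Returns NULL on invalid input or allocation failure. */
static int *convert_base(const char *src, int ib, int ob, size_t *out_len)
{
    if (src == NULL || out_len == NULL || ib < 2 || ib > 10 || ob < 2)
        return NULL;

    size_t cap = 4 * strlen(src) + 1;
    int *d = malloc(cap * sizeof *d);
    if (d == NULL)
        return NULL;

    d[0] = 0;
    size_t width = 1;
    for (; *src != '\0'; ++src) {
        int digit = *src - '0';
        if (digit < 0 || digit >= ib ||
            mult_add(d, &width, cap, ob, ib, digit) != 0) {
            free(d);
            return NULL;
        }
    }
    /* Reverse so the most significant digit comes first. */
    for (size_t i = 0, j = width - 1; i < j; ++i, --j) {
        int t = d[i];
        d[i] = d[j];
        d[j] = t;
    }
    *out_len = width;
    return d;
}
```

Returning an explicit length keeps every int value available as a digit, which matters when ob is close to INT_MAX.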
I wrote this with older specifications, so it's not valid any more, but it might be useful as a starting point.
The code can handle long long magnitudes. Going to arbitrary precision numbers in C is a big leap!
Note using -1 as the ending marker instead of 0. Can accept ib from 2 to 36 and any ob.
Includes example main.
Function f is not reentrant as-is. To make it thread-safe, it could allocate the required memory then return a pointer to it. The simplest protocol would be having the caller responsible for freeing the memory afterwards.
#include <stdlib.h>
#include <limits.h>
#include <stdio.h>
int *f(const char *str, int ib, int ob) {
static int result[CHAR_BIT * sizeof(long long) + 1];
int i = sizeof(result) / sizeof(int) - 1;
long long l = strtoll(str, NULL, ib);
result[i--] = -1;
while (l) {
result[i] = l % ob;
l /= ob;
i--;
}
return result + i + 1;
}
int main()
{
int *x = f("1234567890", 16, 10);
while (*x > -1) {
printf("%d ", *x);
x++;
}
return 0;
}

How to de-obfuscate the ctk.c code the winner of 2001's IOCCC?

I have seen the obfuscated ctk.c code, but how can I start to de-obfuscate it?
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#include <signal.h>
#define m(b)a=b;z=*a;while(*++a){y=*a;*a=z;z=y;}
#define h(u)G=u<<3;printf("\e[%uq",l[u])
#define c(n,s)case n:s;continue
char x[]="((((((((((((((((((((((",w[]=
"\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b";char r[]={92,124,47},l[]={2,3,1
,0};char*T[]={" |"," |","%\\|/%"," %%%",""};char d=1,p=40,o=40,k=0,*a,y,z,g=
-1,G,X,**P=&T[4],f=0;unsigned int s=0;void u(int i){int n;printf(
"\233;%uH\233L%c\233;%uH%c\233;%uH%s\23322;%uH#\23323;%uH \n",*x-*w,r[d],*x+*w
,r[d],X,*P,p+=k,o);if(abs(p-x[21])>=w[21])exit(0);if(g!=G){struct itimerval t=
{0,0,0,0};g+=((g<G)<<1)-1;t.it_interval.tv_usec=t.it_value.tv_usec=72000/((g>>
3)+1);setitimer(0,&t,0);f&&printf("\e[10;%u]",g+24);}f&&putchar(7);s+=(9-w[21]
)*((g>>3)+1);o=p;m(x);m(w);(n=rand())&255||--*w||++*w;if(!(**P&&P++||n&7936)){
while(abs((X=rand()%76)-*x+2)-*w<6);++X;P=T;}(n=rand()&31)<3&&(d=n);!d&&--*x<=
*w&&(++*x,++d)||d==2&&++*x+*w>79&&(--*x,--d);signal(i,u);}void e(){signal(14,
SIG_IGN);printf("\e[0q\ecScore: %u\n",s);system("stty echo -cbreak");}int main
(int C,char**V){atexit(e);(C<2||*V[1]!=113)&&(f=(C=*(int*)getenv("TERM"))==(
int)0x756E696C||C==(int)0x6C696E75);srand(getpid());system("stty -echo cbreak"
);h(0);u(14);for(;;)switch(getchar()){case 113:return 0;case 91:case 98:c(44,k
=-1);case 32:case 110:c(46,k=0);case 93:case 109:c(47,k=1);c(49,h(0));c(50,h(1
));c(51,h(2));c(52,h(3));}}
http://www.ioccc.org/2001/ctk.hint:
This is a game based on an Apple ][ Print Shop Companion easter
egg named 'DRIVER', in which the goal is to drive as fast as
you can down a long twisty highway without running off the
road. Use ',./', '[ ]', or 'bnm' to go left, straight, and
right respectively. Use '1234' to switch gears. 'q' quits. The
faster you go and the thinner the road is, the more points you
get. Most of the obfuscation is in the nonsensical if statements
among other things. It works best on the Linux console: you
get engine sound (!) and the * Lock keyboard lights tell you
what gear you're in (none lit=4th). The 'q' argument (no
leading '-') will silence the sound. It won't work on a terminal
smaller than 80x24, but it works fine with more (try it in an
XTerm with the "Unreadable" font and the window maximized
vertically!).
1st step
Using:
sed -e'/#include/d' ctk.c | gcc -E - | sed -e's/;/;\n/g' -e's/}/}\n/g' -e '/^#/d' | indent
I was able to generate the following output, which, while not perfect, already seems a lot more readable:
char x[] = "((((((((((((((((((((((", w[] =
"\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b";
char r[] = { 92, 124, 47 }
, l[] =
{
2, 3, 1, 0}
;
char *T[] = { " |", " |", "%\\|/%", " %%%", "" }
;
char d = 1, p = 40, o = 40, k = 0, *a, y, z, g = -1, G, X, **P = &T[4], f = 0;
unsigned int s = 0;
void
u (int i)
{
int n;
printf ("\233;
%uH\233L%c\233;
%uH%c\233;
%uH%s\23322;
%uH#\23323;
%uH \n", *x - *w, r[d], *x + *w, r[d], X, *P, p += k, o);
if (abs (p - x[21]) >= w[21])
exit (0);
if (g != G)
{
struct itimerval t = { 0, 0, 0, 0 }
;
g += ((g < G) << 1) - 1;
t.it_interval.tv_usec = t.it_value.tv_usec = 72000 / ((g >> 3) + 1);
setitimer (0, &t, 0);
f && printf ("\e[10;
%u]", g + 24);
}
f && putchar (7);
s += (9 - w[21]) * ((g >> 3) + 1);
o = p;
a = x;
z = *a;
while (*++a)
{
y = *a;
*a = z;
z = y;
}
;
a = w;
z = *a;
while (*++a)
{
y = *a;
*a = z;
z = y;
}
;
(n = rand ()) & 255 || --*w || ++*w;
if (!(**P && P++ || n & 7936))
{
while (abs ((X = rand () % 76) - *x + 2) - *w < 6);
++X;
P = T;
}
(n = rand () & 31) < 3 && (d = n);
!d && --*x <= *w && (++*x, ++d) || d == 2 && ++*x + *w > 79 && (--*x, --d);
signal (i, u);
}
void
e ()
{
signal (14, SIG_IGN);
printf ("\e[0q\ecScore: %u\n", s);
system ("stty echo -cbreak");
}
int main (int C, char **V)
{
atexit (e);
(C < 2 || *V[1] != 113)
&& (f = (C = *(int *) getenv ("TERM")) == (int) 0x756E696C
|| C == (int) 0x6C696E75);
srand (getpid ());
system ("stty -echo cbreak");
G = 0 << 3;
printf ("\e[%uq", l[0]);
u (14);
for (;;)
switch (getchar ())
{
case 113:
return 0;
case 91:
case 98:
case 44:
k = -1;
continue;
case 32:
case 110:
case 46:
k = 0;
continue;
case 93:
case 109:
case 47:
k = 1;
continue;
case 49:
G = 0 << 3;
printf ("\e[%uq", l[0]);
continue;
case 50:
G = 1 << 3;
printf ("\e[%uq", l[1]);
continue;
case 51:
G = 2 << 3;
printf ("\e[%uq", l[2]);
continue;
case 52:
G = 3 << 3;
printf ("\e[%uq", l[3]);
continue;
}
}
... and now?
I don't think there's much more an automated process will be able to perform at this point, as what counts as "more" readable or "less" readable from now on might depend on the specific preferences of the reader.
One step that could be performed is removing escape sequences from the strings and placing them somewhere separately. As it turns out the whole
char l[] = {2, 3, 1, 0}
has no other purpose than to be utilized in the escape sequences of the main loop:
printf ("\e[%uq", l[0]);
and so on. Looking up their meaning:
ESC [ 0 q: clear all LEDs
ESC [ 1 q: set Scroll Lock LED
ESC [ 2 q: set Num Lock LED
ESC [ 3 q: set Caps Lock LED
depending on taste you might want to exchange them with a macro or a function call more meaningful to you like clear_all_LEDs and so on.
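One way to give those sequences names is sketched below. The identifiers keyboard_led, gear_led, format_led_sequence and set_led are hypothetical; the numeric values are the ones from the table above, and gear_led simply restates the original l[] array. The sketch spells the escape character as \033 because \e is a compiler extension rather than standard C.

```c
#include <stdio.h>

/* Parameter values for the Linux console sequence ESC [ n q. */
enum keyboard_led {
    LEDS_CLEAR_ALL  = 0,
    LED_SCROLL_LOCK = 1,
    LED_NUM_LOCK    = 2,
    LED_CAPS_LOCK   = 3
};

/* Same contents as the original l[] = {2, 3, 1, 0}, indexed by gear 0..3. */
static const enum keyboard_led gear_led[] = {
    LED_NUM_LOCK, LED_CAPS_LOCK, LED_SCROLL_LOCK, LEDS_CLEAR_ALL
};

/* Formats the escape sequence into buf; returns what snprintf returns. */
static int format_led_sequence(char *buf, size_t size, enum keyboard_led led)
{
    return snprintf(buf, size, "\033[%uq", (unsigned)led);
}

/* Sends the sequence for the given LED setting to stdout. */
static void set_led(enum keyboard_led led)
{
    char seq[8];

    if (format_led_sequence(seq, sizeof seq, led) > 0)
        fputs(seq, stdout);
}
```

With this in place, a call such as set_led(gear_led[0]) would stand in for printf ("\e[%uq", l[0]).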
I strongly doubt a machine would agree on this being a simplification. As it turns out the whole main loop just seems to be working with keys entered by the user, so probably turning numbers into their corresponding characters might add to readability, like in replacing:
case 113:
return 0;
case 91:
case 98:
case 44:
k = -1;
// ...
case 49:
G = 0 << 3;
printf ("\e[%uq", l[0]);
with something like:
case 'q':
return 0;
case '[':
case 'b':
case ',':
k = -1;
// ...
case '1':
G = 0 << 3;
set_Num_Lock_LED ();
Oh - and while we are at it already why wouldn't we want to change the name from this rather strange G to gear. Again I strongly doubt an automated process would have found renaming from G to gear any better than renaming it to butterfly. Well maybe it even isn't.
While beautifying names maybe this function referenced by a single u is another candidate:
u (14);
with a more meaningful name, probably update. And as we already included <signal.h>, why not deobfuscate the code further by replacing 14 with SIGALRM, like this:
update (SIGALRM);
As you can see "deobfuscating" here requires the exact opposite step of that taken before. Replacing the expansion with a macro this time. How would a machine try to decide which one is more useful?
Another spot where we might want to replace a bare number with something else is this one in the update function:
f && putchar (7);
Why not replace the 7 with \a as it will turn out to be the same in the end. Maybe we should even change the bare f with something more "meaningful".
Again I vote against butterfly and would rather call it play_sound:
if (play_sound)
putchar ('\a');
might be the more readable version we are looking for. Of course we shouldn't forget to replace f in all other spots. The one right at the beginning of our main function being such a culprit:
Holy mess
int main (int C, char **V)
{
atexit (e);
(C < 2 || *V[1] != 113)
&& (f = (C = *(int *) getenv ("TERM")) == (int) 0x756E696C
|| C == (int) 0x6C696E75);
While happily renaming f to play_sound and e to (no, still no butterfly; this time I'll rather call it) end, we spot that the function signature looks a bit strange in terms of naming conventions: argc instead of C and argv instead of V would be more conventional here. This gives us:
int main (int argc, char* argv[])
{
atexit (end);
(argc < 2 || *argv[1] != 113)
&& (play_sound = (argc = *(int *) getenv ("TERM")) == (int) 0x756E696C
|| argc == (int) 0x6C696E75);
As this is still not a beauty we ask our standards guy and he informs us that it's pretty OK to replace
(A || B) && (C)
with
if (A || B) { C }
and
E = (x=F)==H || x==I
with
x = F;
if (x==H || x==I)
E=1;
else
E=0;
So maybe this should be a more readable version of the whole code:
if (argc < 2 || *argv[1] != 'q') {
argc = *(int*) getenv ("TERM");
if (argc == (int) 0x756E696C || argc == (int) 0x6C696E75)
play_sound = 1;
/* skip the else branch here as play_sound is already initialized to 0 */
}
Now still another guy turns up and informs us that, depending on something called endianness, those strange-looking numbers 0x6C696E75 and 0x756E696C, if stored in memory, would (when interpreting the raw byte values as ASCII codes) just look like "linu" or "unil": one reads as "unil" and the other as "linu" on one architecture type, and the other way round on an architecture with different endianness.
So taking a closer look, what's essentially happening here is:
we get a pointer to a string from getenv ("TERM"), which we typecast to a pointer to an int before dereferencing it, thus reading the bit pattern stored at the string location as an int.
next we compare this value with the one we would get if we had performed the same with either "unil" or "linu" stored at that specific location.
Probably we just want to check whether the TERM environment variable is set to "linux", so our deobfuscated version might want to perform a string comparison here.
On the other hand, we can't be sure whether also allowing terminals with names starting with "unil" to play sound is a special feature of this software, so I decided to leave it intact.
What now ?
While renaming and re-encoding variable names and values those strange char arrays could be our next victims. The following mess doesn't look too nice:
char x[] = "((((((((((((((((((((((", w[] =
"\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b";
char r[] = { 92, 124, 47 };
So maybe they could be changed to:
char x_offset[] = {
40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
40, 40, 0 };
char width[] = {
8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
8, 8, 0 };
const char border[] = "\\|/";
As you can see, I switched the way the values are described from x as a string constant to x_offset written down as an array of numbers, as this way the purpose of the values stored here seemed a little bit clearer to me.
On the other hand, I changed the way r is written down in exactly the opposite direction, as again this seemed a lot clearer to me.
While hunting down all those refs to x, w and r the time could be used to rename p and o to - sorry again no butterfly - pos and old_pos while renaming s to score.
Changing for example:
s += (9 - w[21]) * ((g >> 3) + 1);
o = p;
a = x;
z = *a;
while (*++a)
{
y = *a;
*a = z;
z = y;
}
;
a = w;
z = *a;
while (*++a)
{
y = *a;
*a = z;
z = y;
}
;
to:
/* update score */
score += (9 - width[NEXT_LINE]) * ((g >> 3) + 1);
old_pos = pos;
/* shift x_offset */
a = x_offset;
z = *a;
while (*++a) {
y = *a;
*a = z;
z = y;
};
/* shift width */
a = width;
z = *a;
while (*++a) {
y = *a;
*a = z;
z = y;
};
Besides the possibility of turning them into some other kind of loop, there's not much beautification possible for the two shifting loops, so adding an appropriate comment is probably the most you can do. Removing the magic number 21 is another idea; NEXT_LINE didn't seem to be the worst choice of name here.
The single-character variable g still doesn't look too good. By renaming it to something like update_interval (and G to gear) there's also the chance to eliminate another weird terminal escape sequence:
if (g != G)
{
struct itimerval t = { 0, 0, 0, 0 }
;
g += ((g < G) << 1) - 1;
t.it_interval.tv_usec = t.it_value.tv_usec = 72000 / ((g >> 3) + 1);
setitimer (0, &t, 0);
f && printf ("\e[10;%u]", g + 24);
}
Maybe looks a little bit more confusing than:
/* update simulation speed */
if (update_interval != gear) {
struct itimerval t = { 0, 0, 0, 0 } ;
update_interval += ((update_interval < gear) << 1) - 1;
t.it_interval.tv_usec = t.it_value.tv_usec = 72000 / ((update_interval >> 3) + 1);
setitimer (0, &t, 0);
if (play_sound)
change_bell_frequency (update_interval + 24);
}
Last fixes
Although the code should look a lot more readable by now there are still some nasty parts left:
!d && --*x <= *w && (++*x, ++d) || d == 2 && ++*x + *w > 79 && (--*x, --d);
Choosing another (hopefully) more meaningful name for d and breaking operator precedence down you might end up with something like:
if (curve == CURVE_LEFT) {
--*x_offset;
if (*x_offset <= *width) {
++*x_offset;
curve = CURVE_NONE;
}
}
else if (curve == CURVE_RIGHT) {
++*x_offset;
if (*x_offset + *width > 79) {
--*x_offset;
curve = CURVE_NONE;
}
}
instead, after adding appropriate macros for all those CURVE_... values.
Now there are still those X, P and T names hanging around that also might be changed. As it makes its purpose a little more visible in the code, I decided to flip the line order of T, which I renamed to tree; this of course means the pointer calculation also has to be fixed. All in all it goes from:
char *T[] = { " |", " |", "%\\|/%", " %%%", "" };
char X, **P = &T[4];
// ...
if (!(**P && P++ || n & 7936))
{
while (abs ((X = rand () % 76) - *x + 2) - *w < 6);
++X;
P = T;
}
To something like:
char *tree[] = {
"",
" %%%",
"%\\|/%",
" |",
" |",
};
char **tree_line = tree;
char tree_position;
// ...
/* update tree line pointer */
if (!(**tree_line && tree_line-- || n & 7936)) {
/* find the right spot to grow */
while (abs ((tree_position = rand () % 76) - *x_offset + 2) - *width < 6)
;
++tree_position;
tree_line = &tree[4];
}
Keeping the best part until the end
Although the code already looks a lot prettier to me now, there's still one part missing: the one that's doing all the output. It's this line I'm talking about:
printf ("\233;%uH\233L%c\233;%uH%c\233;%uH%s\23322;%uH#\23323;%uH \n",
*x - *w, r[d], *x + *w, r[d], X, *P, p += k, o);
Apart from looking pretty hard to read, that was even too obfuscated for my computer to produce any usable result.
I tried a lot of different things: running in other terminal emulators, changing terminal settings and switching locales back and forth, all without success.
So besides the fact that this kind of obfuscation seemed to be more than perfect, as it even confuses my computer, I still can't tell what trick the author intended here.
The octal code \233 has the same bit pattern as the escape character (\033) with the 8th bit additionally set, which is probably in some way related to the effect that was intended here. Unfortunately, as I already said, it didn't work for me.
Fortunately enough the escape sequences still seemed easy enough to guess, so I came up with the following replacement:
pos += move_x;
/* draw street */
printf ("\e[1;%uH" "\e[L" "%c"
"\e[1;%uH" "%c",
*x_offset - *width, border[curve],
*x_offset + *width, border[curve]);
/* draw tree */
printf ("\e[1;%uH" "%s",
tree_position, *tree_line);
/* redraw car */
printf ("\e[22;%uH" "#"
"\e[23;%uH" " " "\n",
pos,
old_pos);
Breaking the drawing down into separate calls (hopefully) makes it a little bit more readable. The current line and the previous line are still hard-coded here as in the original version. Maybe extracting them as shown below would improve readability even more:
/* draw street */
printf ("\e[1;%uH" "\e[L" "%c"
"\e[1;%uH" "%c",
*x_offset - *width, border[curve],
*x_offset + *width, border[curve]);
/* draw tree */
printf ("\e[1;%uH" "%s",
tree_position, *tree_line);
/* redraw car */
printf ("\e[%u;%uH" "#"
"\e[%u;%uH" " " "\n",
NEXT_LINE +1, pos,
NEXT_LINE +2, old_pos);
This finally brought me to the first usable version which I then "tested" a lot. While probably not 100% state of the art it still seems to be very addictive.
Last words
Here is the final unobfuscated version that I came up with. As you'll see, I didn't implement the LED setting functions and the clear screen function, but it shouldn't be too hard to find the needed escape sequences scattered throughout the obfuscated version. In fact I already mentioned the LED sequences in this post. Note that "\e[0q" is the Linux console sequence that clears all LEDs; to clear the screen you can use "\e[2J" (or "\ec" for a full terminal reset). Happy hacking.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#include <signal.h>
#define NEXT_LINE 21
#define CURVE_LEFT 0
#define CURVE_NONE 1
#define CURVE_RIGHT 2
char x_offset[] = {
40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
40, 40, 0 };
char width[] = {
8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
8, 8, 0 };
const char border[] = "\\|/";
void change_bell_frequency () {}
void clear_screen () {}
void clear_all_LEDs () {}
void set_Num_Lock_LED () {}
void set_Scroll_lock_LED () {}
void set_Caps_Lock_LED () {}
char *tree[] = {
"",
" %%%",
"%\\|/%",
" |",
" |",
};
char **tree_line = tree;
char tree_position;
char curve = CURVE_NONE;
char *a, y, z;
char move_x = 0;
char update_interval = -1;
char pos = 40;
char old_pos = 40;
char play_sound = 0;
char gear;
unsigned int score = 0;
void move (char x, char y) {
printf ("\e[%u;%uH", x, y);
}
void insert () {
printf ("\e[L");
}
void update (int i) {
int n;
pos += move_x;
/* draw street */
printf ("\e[1;%uH" "\e[L" "%c"
"\e[1;%uH" "%c",
*x_offset - *width, border[curve],
*x_offset + *width, border[curve]);
/* draw tree */
printf ("\e[1;%uH" "%s",
tree_position, *tree_line);
/* redraw car */
printf ("\e[%u;%uH" "#"
"\e[%u;%uH" " " "\n",
NEXT_LINE + 1, pos,
NEXT_LINE +2, old_pos);
/* did we leave the road ? */
if (abs (pos - x_offset[NEXT_LINE]) >= width[NEXT_LINE])
exit (0);
/* update simulation speed */
if (update_interval != gear) {
struct itimerval t = { 0, 0, 0, 0 } ;
update_interval += ((update_interval < gear) << 1) - 1;
t.it_interval.tv_usec = t.it_value.tv_usec = 72000 / ((update_interval >> 3) + 1);
setitimer (0, &t, 0);
if (play_sound)
change_bell_frequency (update_interval + 24);
}
/* play sound */
if (play_sound)
putchar ('\a');
/* update score */
score += (9 - width[NEXT_LINE]) * ((update_interval >> 3) + 1);
old_pos = pos;
/* shift x_offset */
a = x_offset;
z = *a;
while (*++a) {
y = *a;
*a = z;
z = y;
};
/* shift width */
a = width;
z = *a;
while (*++a) {
y = *a;
*a = z;
z = y;
};
/* generate new road */
n = rand ();
if (!(n & 255) && *width > 1)
--*width;
/* set tree line pointer */
if (!(**tree_line && tree_line-- || n & 7936)) {
/* find the right spot to grow */
while (abs ((tree_position = rand () % 76) - *x_offset + 2) - *width < 6)
;
++tree_position;
tree_line = &tree[4];
}
/* new offset */
n = rand () & 31;
if (n < 3)
curve = n;
if (curve == CURVE_LEFT) {
--*x_offset;
if (*x_offset <= *width) {
++*x_offset;
curve = CURVE_NONE;
}
}
else if (curve == CURVE_RIGHT) {
++*x_offset;
if (*x_offset + *width > 79) {
--*x_offset;
curve = CURVE_NONE;
}
}
signal (SIGALRM, update);
}
void end () {
signal (SIGALRM, SIG_IGN);
clear_all_LEDs ();
clear_screen ();
printf ("Score: %u\n", score);
system ("stty echo -cbreak");
}
int main (int argc, char **argv) {
atexit (end);
if (argc < 2 || *argv[1] != 'q') {
argc = *(int*) getenv ("TERM");
if (argc == (int) 0x6C696E75 || argc == (int) 0x756E696C)
play_sound = 1;
}
srand (getpid ());
system ("stty -echo cbreak");
gear = 0 << 3;
clear_all_LEDs ();
update (14);
for (;;)
switch (getchar ())
{
case 'q':
return 0;
case '[':
case 'b':
case ',':
move_x = -1;
continue;
case ' ':
case 'n':
case '.':
move_x = 0;
continue;
case ']':
case 'm':
case '/':
move_x = 1;
continue;
case '1':
gear = 0 << 3;
set_Num_Lock_LED ();
continue;
case '2':
gear = 1 << 3;
set_Caps_Lock_LED ();
continue;
case '3':
gear = 2 << 3;
set_Scroll_lock_LED ();
continue;
case '4':
gear = 3 << 3;
clear_all_LEDs ();
continue;
}
}

Reading an integer and / or character without an array

I need to read integers one by one until I read a '$', and then determine the largest, smallest and so on. I could use a character variable and do it, but that only works for numbers from 0 to 9. But how do I read integers of two or more digits and, at the same time, detect a '$'? I used a char *, but I guess it is equivalent to an array, which I should not use here. Also, a char holds a single digit / character, hence it is not suitable for larger numbers. What should I do?
No arrays, no pointers, no tricky char-by-char read & convert. Just plain scanf and getchar.
#include <stdio.h>
int main()
{
int newValue=0; /* value being acquired */
int max; /* current maximum value */
int min; /* current minimum value */
int firstAcquired=0; /* boolean flag set to 1 after first acquisition */
int ch; /* used as temporary storage for the getchar() */
for(;;)
{
/* scanf returns the number of successfully acquired fields, or EOF if the
input ended; anything other than 1 means the value couldn't be acquired */
if(scanf("%d",&newValue)!=1)
{
/* scanf failed, but it's guaranteed it put the offending character
back into the stream, from where we can get it */
ch=getchar();
if(ch=='$' || ch==EOF)
break;
else
/* from here to the break it's just to handle invalid input and EOF
gracefully; if you are not interested you can replace this stuff
with a random curse to the user */
{
puts("Invalid input, retry.");
/* Empty the buffer */
while((ch=getchar())!='\n' && ch!=EOF)
;
}
/* if it's EOF we exit */
if(ch==EOF)
break;
}
else
{
/* Everything went better than expected */
if(!firstAcquired || newValue>max)
max=newValue;
if(!firstAcquired || newValue<min)
min=newValue;
firstAcquired=1;
}
}
if(firstAcquired)
{
printf("The maximum value was %d\n", max);
printf("The minimum value was %d\n", min);
}
return 0;
}
In the interest of spoiling all the fun, showing off, outright overkill and darn tooting fun:
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/support_istream_iterator.hpp>
#include <algorithm>
#include <iostream>
#include <numeric>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;
template <typename V>
void show_statistics(const V& data)
{
using namespace boost::spirit::karma;
std::cout << "data:\t"<< format('{' << auto_ % ", " << '}', data) << std::endl;
std::cout << "min:\t" << *std::min_element(data.begin(), data.end()) << std::endl;
std::cout << "max:\t" << *std::max_element(data.begin(), data.end()) << std::endl;
auto sum = std::accumulate(data.begin(), data.end(), 0);
std::cout << "sum:\t" << sum << std::endl;
std::cout << "avg:\t" << (1.0*sum) / data.size() << std::endl;
}
void dostats(const std::vector<int>& data) { show_statistics(data); }
int main()
{
std::cin.unsetf(std::ios::skipws);
auto f = boost::spirit::istream_iterator(std::cin);
decltype(f) l;
bool ok = qi::phrase_parse(f, l, +(+qi::int_ > "$") [ dostats ], qi::space);
if (f!=l)
std::cout << "Remaining input unparsed: " << std::string(f,l) << std::endl;
return ok? 0:255;
}
Sample run:
sehe#natty:/tmp$ ./test2 <<< "1 2 3 4 5 $ 3 -9 0 0 0 $ 900 9000 $ unparsed trailing text"
data: {1, 2, 3, 4, 5}
min: 1
max: 5
sum: 15
avg: 3
data: {3, -9, 0, 0, 0}
min: -9
max: 3
sum: -6
avg: -1.2
data: {900, 9000}
min: 900
max: 9000
sum: 9900
avg: 4950
Remaining input unparsed: unparsed trailing text
You can use scanf("%s") to read a group of characters. You can then check if the first character is a '$' and terminate if so. Otherwise, call atoi to convert to an integer. Store the largest and smallest in integer types, not character types.
Basically, the only time you have to deal with characters is when you read them in and check if it's a '$'. Otherwise, use integers all the way through.
If I'm getting what you want correctly it should be something like this:
int i = 0;
int c = getchar(); /* int, so that EOF can be told apart from a character */
while (c != '$' && c != EOF)
{
i = i * 10 + (c - '0'); /* assumes c is a digit */
c = getchar();
}
Hope it helped.
You can read char by char in a loop, check values and so on...
int i = 0; /* number currently being assembled */
char c = 0;
int size = 10; /* capacity of the integers array */
int currentIndex = 0;
int* integers = malloc(sizeof(int) * size);
int counter = 0; /* digits seen for the current number */
do
{
if (scanf("%c", &c) != 1) // Stop on EOF or read error
break;
if (c == ' ' || c == '\n' || c == '$') // Separator or terminator ends the current number
{
if (counter != 0)
{
if (currentIndex >= size)
{
size += 5;
integers = realloc(integers, sizeof(int) * size);
}
integers[currentIndex] = i;
currentIndex++;
}
counter = 0;
i = 0;
}
else if (c >= '0' && c <= '9')
{
i = (i * 10) + (c - '0');
counter++;
}
}
while(c != '$');
Don't forget to free integers in the end!
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#define BUFF_SIZE 16
#define DATA_MAX_SIZE 64
int main(){
char buff[BUFF_SIZE];
int data[DATA_MAX_SIZE];
int i,value,counter = 0;
char *buffp,*p;
while(NULL!=fgets(buff,BUFF_SIZE,stdin)){
buff[BUFF_SIZE - 1]='\0';
buffp = buff;
next: while(isspace(*buffp))
++buffp;
if(*buffp == '\0')
continue;
value = strtol(buffp, &p, 0);
if(counter == DATA_MAX_SIZE){
printf("over data max size!\n");
break;
} else if(p != buffp){
data[counter++]=value;
if(*p == '\0' || *p == '\r'|| *p == '\n')
continue;
buffp = p;
goto next;
} else {
if(*p == '$')
break;
printf("format error\n");
break;
}
}
//check code
for(i=0;i<counter;++i){
printf("data[%d]=%d\n",i, data[i]);
}
return 0;
}
OUTPUT:
1 2 3
123
456
99 $
data[0]=1
data[1]=2
data[2]=3
data[3]=123
data[4]=456
data[5]=99
12345
4
$
data[0]=12345
data[1]=4

Resources