How tabs and spaces affect code size in C

When working on an embedded system, every byte of memory matters. In a C/C++ program, is there any difference in the resulting code size when you use 4 spaces instead of 1 tab?

No.
The emitted binary doesn't change based on what spacing you use in your program.
The amount of space the source file takes up does change, though. Spaces and tabs are each one character, so 1 tab vs 4 spaces take up different amounts of memory. It's important to note that this matters only for the source file itself and while it is being compiled, not for the resulting executable.

Formatting the source code itself with spaces or tabs makes no difference to the executable code size. It is a preference; mine is never to use tab formatting - please read this.
As for the program itself, tabs only make a difference inside string literals. The control character '\t' is one byte in the executable, while any run of spaces will be one or more bytes.
But I prefer to use a field width specifier such as printf("%4d", i) to format the output.
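To make both points concrete, here is a minimal sketch; the sizeof values count the bytes of each literal, including the terminating NUL:

#include <stdio.h>

int main(void)
{
    /* "\t" is stored as a single byte in the literal,
       while four spaces are stored as four bytes. */
    printf("with tab:    %zu bytes\n", sizeof "col1\tcol2");   /* 10: 9 chars + NUL */
    printf("with spaces: %zu bytes\n", sizeof "col1    col2"); /* 13: 12 chars + NUL */

    /* A field-width specifier keeps the padding out of the literal entirely. */
    for (int i = 1; i <= 3; i++)
        printf("%4d\n", i);

    return 0;
}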

Related

How can I specify the color of particular characters in a String in c?

I am experimenting with colorized text output to the console in c. I know that you are able to change the color of entire printf statements, but I was wondering if I am able to change the text color of individual characters within a printf statement.
In summary, I would like to be able to print out "asdf" with the a being red, the s being green, the d being blue, and the f being orange.
Thank you in advance.
First of all, in C, characters have no color. They are just numbers. So your question is not really about C, which doesn't care at all about how you print things.
What you are referring to is the fact that some terminals accept control characters that specify how they should render what is sent to them.
Those are just special characters that are not meant to be printed, but to modify the terminal's behavior. There is no guarantee that your terminal understands those control characters, nor that other terminals use the same ones.
Some libraries (such as ncurses) have knowledge of the terminal you are using and provide helper functions that make this transparent.
All that being said, the way to print in red (well, the most common one, at least) is
printf("\033[31m");
That switches the terminal to red,
and
printf("\033[m");
That switches it back to normal.
Those are control characters. From C's point of view, they are just characters like any others, to be printed, that is, sent to the terminal. It is then up to the terminal to do whatever it sees fit with them.
Being just characters like any others, nothing prevents you from mixing them with normal characters.
So your example becomes:
printf("\033[31ma\033[32ms\033[34md\033[33mf\033[m\n");
But there is no guarantee it works; you can't really count on it. There is not even a guarantee that it won't print some unwanted characters.
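For reference, a self-contained version of the example might look like this (a sketch assuming an ANSI-capable terminal; plain ANSI has no orange, so yellow stands in for it, as in the line above):

#include <stdio.h>

int main(void)
{
    /* Each \033[..m sequence changes the colour of everything printed
       after it until the next sequence; \033[m resets to the default. */
    printf("\033[31m" "a"     /* red */
           "\033[32m" "s"     /* green */
           "\033[34m" "d"     /* blue */
           "\033[33m" "f"     /* yellow, standing in for orange */
           "\033[m"   "\n");  /* back to normal */
    return 0;
}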
I have found a temporary workaround: using multiple printf statements, since they do not go to the next line. I do not like this solution very much, though.

Using C I would like to format my output such that the output in the terminal stops once it hits the edge of the window

If you type ps aux into your terminal and make the window really small, the output of the command will not wrap and the format is still very clear.
When I use printf and output my 5 or 6 strings, sometimes the length of my output exceeds that of the terminal window and the strings wrap to the next line which totally screws up the format. How can I write my program such that the output continues to the edge of the window but no further?
I've tried searching for an answer to this question, but I'm having trouble narrowing it down, so my search results never seem to have anything to do with it.
Thanks!
There are functions that can let you know information about the terminal window, and some others that will allow you to manipulate it. Look up the "ncurses" or the "termcap" library.
A simple approach to solving your problem would be to get the terminal window size (especially the width) and then format your output accordingly.
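For example, on POSIX systems one common way to read the window size without pulling in a full curses library is the TIOCGWINSZ ioctl; a minimal sketch (not portable to every platform, and it fails when stdout is redirected):

#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>

int main(void)
{
    struct winsize w;

    /* Ask the terminal driver for the current window dimensions. */
    if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &w) == -1) {
        perror("ioctl");   /* e.g. stdout is a pipe or a file */
        return 1;
    }

    printf("columns: %d, rows: %d\n", w.ws_col, w.ws_row);
    return 0;
}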
There are two possible ways to fix your problem:
Turn off line wrapping in your terminal emulator (if it supports it).
Look into the Curses library. Applications like top or vim use the Curses library for screen formatting.
You can find, or at least guess, the width of the terminal using methods that other answers describe. That's only part of the problem however -- the tricky bit is formatting the output to fit the console. I don't believe there's any alternative to reading the text word by word, and moving the output to the next line when a word would overflow the width. You'll need to implement a method to detect where the white-space is, allowing for the fact that there could be multiple white spaces in a row. You'll need to decide how to handle line-breaking white-space, like CR/LF, if you have any. You'll need to decide whether you can break a word on punctuation (e.g, a hyphen). My approach is to use a simple finite-state machine, where the states are "At start of line", "in a word", "in whitespace", etc., and the characters (or, rather character classes) encountered are the events that change the state.
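A minimal, ASCII-only sketch of that kind of wrapping (breaking only at whitespace, and ignoring the multi-byte issues discussed below) might look like:

#include <stdio.h>
#include <ctype.h>

/* Print text wrapped at `width` columns, breaking only at whitespace.
   Words longer than the width are printed unbroken and will overflow. */
static void wrap_print(const char *text, int width)
{
    int col = 0;

    while (*text) {
        /* Collapse any run of whitespace into a single potential break. */
        while (*text && isspace((unsigned char)*text))
            text++;
        if (!*text)
            break;

        /* Measure the next word. */
        int len = 0;
        while (text[len] && !isspace((unsigned char)text[len]))
            len++;

        /* Start a new line if the word (plus a separating space) won't fit. */
        if (col > 0 && col + 1 + len > width) {
            putchar('\n');
            col = 0;
        }
        if (col > 0) {
            putchar(' ');
            col++;
        }
        fwrite(text, 1, (size_t)len, stdout);
        col += len;
        text += len;
    }
    putchar('\n');
}

int main(void)
{
    wrap_print("The quick brown fox jumps over the lazy dog, again and again, "
               "until the line is long enough to need wrapping.", 20);
    return 0;
}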
A particular complication when working in C is that there is little-to-no built-in support for multi-byte characters. That's fine for text which you are certain will only ever be in English, and use only the ASCII punctuation symbols, but with any kind of internationalization you need to be more careful. I've found that it's easiest to convert the text into some wide format, perhaps UTF-32, and then work with arrays of 32-bit integers to represent the characters. If your text is UTF-8, there are various tricks you can use to avoid having to do this conversion, but they are a bit ugly.
I have some code I could share, but I don't claim it is production quality, or even comprehensible. This simple-seeming problem is actually far more complicated than first impressions suggest. It's easy to do badly, but difficult to do well.

BBC Basic: Inserting a control character without occupying space in Mode 7

I'm using mode 7 ("Teletext mode") on my Beeb. I'd like to print a string of unbroken characters with a coloured text control character in the middle, as per this mock-up:
However, I can't work out how this can be done. The control character needs to occupy space in the output:
PRINT CHR$129;"STACK"CHR$132;"OVERFLOW"
I read up on held graphics mode, but this only seems to allow me to repeat the last used graphics symbol, instead of inserting a space when I print a control character. When I do try this with text I just get an additional space for the held graphics character:
PRINT CHR$129;"STACK"CHR$158;CHR$132;"OVERFLOW"
Is this possible? Can I print a control character without getting a visible space?
Or perhaps there is a way to insert a control character followed by a backspace, to claim back the occupied space but retain the control code effect?
It is not possible to treat the text characters as graphics characters when using the 'held graphics' character. A good example of using 'held graphics' can be found here: http://www.riscos.com/support/developers/bbcbasic/part2/teletext.html
You also cannot use the backspace character to go back one space as each control code takes up one space on the screen.
OK, so this is a bit of a fudge, but it was an answer to my problem, so I will share it here for all those BBC Micro / Teletext developers struggling with the same problem...
My challenge was to avoid a noticeable space between the two coloured words. Control characters must exist in the text and occupy a character (either as a space or a copy of the last used block graphic).
Therefore, by inserting a space between every character I was able to make the text appear as one word (albeit with slightly excessive letter spacing):
PRINT CHR$129;"S T A C K"CHR$132;"O V E R F L O W"
This had the desired effect for me - it may not for some others. The only other route I could see available was to render the whole text in block graphics, which would occupy significantly more screen space than the approach I settled for.
This is from memory, but I recall that CHR$(8) moves the cursor one place to the left.
Put that just before the "O":
PRINT CHR$(129);"STACK";CHR$(132);CHR$(8);"OVERFLOW"
Sadly my BBC Model B is, I believe, in my parents' attic, so I can't test this.

How to show arbitrary characters in C?

warning C4566: character represented by universal-character-name '\u2E81' cannot be represented in the current code page (936)
Sometimes we need to display text in various languages, such as Russian, Japanese, and so on.
But it seems a single code page can only show the characters of one language. How can I show characters from various languages at the same time?
Since you're (apparently) using VC++, you probably want to switch to the UTF-8 code page. You'll also need to set the font to one that has glyphs for all the code points you care about (many have few if any beyond the first 256).
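A minimal sketch of that approach (assuming MSVC built with the /utf-8 option, a Windows console, and a font that actually has the glyphs):

/* Build with the MSVC /utf-8 flag so the source and execution character
   sets are UTF-8 instead of code page 936. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* Switch the console output code page to UTF-8 (65001). */
    SetConsoleOutputCP(CP_UTF8);

    /* One UTF-8 string mixing several scripts; the console font must
       still supply a glyph for each character. */
    printf("Русский, 日本語, \u2E81\n");

    return 0;
}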

How do I check if a file is text-based?

I am working on a small text replacement application that basically lets the user select a file and replace text in it without ever having to open the file itself. However, I want to make sure that the function only runs for files that are text-based. I thought I could accomplish this by checking the encoding of the file, but I've found that Notepad .txt files use Unicode UTF-8 encoding, and so do MS Paint .bmp files. Is there an easy way to check this without placing restrictions on the file extensions themselves?
Unless you get a huge hint from somewhere, you're stuck. Purely by examining the bytes there's a non-zero probability you'll guess wrong given the plethora of encodings ("ASCII", Unicode, UTF-8, DBCS, MBCS, etc). Oh, and what if the first page happens to look like ASCII but the next page is a btree node that points to the first page...
Hints can be:
extension (not likely that foo.exe is editable)
something in the stream itself (like BOM [byte-order-marker])
user direction (just edit the file, goshdarnit)
Windows used to provide an API IsTextUnicode that would do a probabilistic examination, but there were well-known false-positives.
My take is that trying to be smarter than the user has some issues...
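As an illustration of the BOM hint above, a check along these lines is cheap (a sketch only; plenty of perfectly good text files have no BOM, so a miss proves nothing):

#include <stdio.h>

/* Report which byte-order mark, if any, the file starts with. */
static const char *bom_name(FILE *f)
{
    unsigned char b[3] = {0, 0, 0};
    size_t n = fread(b, 1, sizeof b, f);

    if (n >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF)
        return "UTF-8 BOM";
    if (n >= 2 && b[0] == 0xFF && b[1] == 0xFE)
        return "UTF-16 LE BOM";
    if (n >= 2 && b[0] == 0xFE && b[1] == 0xFF)
        return "UTF-16 BE BOM";
    return "no BOM";
}

int main(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        FILE *f = fopen(argv[i], "rb");
        if (!f)
            continue;
        printf("%s: %s\n", argv[i], bom_name(f));
        fclose(f);
    }
    return 0;
}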
Honestly, given the Windows environment that you're working with, I'd consider a whitelist of known text formats. Windows users are typically trained to stick with extensions. However, I would personally relax the requirement that it not function on non-text files, instead checking with the user for goahead if the file does not match the internal whitelist. The risk of changing a binary file would be mitigated if your search string is long - that is assuming you're not performing Y2K conversion (a la sed 's/y/k/g').
It's pretty costly to determine if a file is text-based or not (i.e. a binary file). You would have to examine each byte in the file to determine if it is a valid character, irrespective of the file encoding.
Others have said to look at all the bytes in the file and see if they're alphanumeric. Some UNIX/Linux utils do this, but just check the first 1K or 2K of the file as an "optimistic optimization".
Well, a text file contains text, right? So a really easy way to check whether a file contains only text is to read it and check that it contains only alphanumeric characters.
So basically, the first thing you have to do is check the file encoding. If it's pure ASCII, you have an easy task: just read the whole file into a char array (I'm assuming you are doing it in C/C++ or similar) and check every char in that array with the functions isalpha and isdigit. Of course, you have to take care of special exceptions like tabs ('\t'), spaces (' '), and newlines ('\n' on Linux, "\r\n" on Windows).
In the case of a different encoding, the process is the same, except that you have to use different functions to check whether the current character is alphanumeric. Also note that in the case of UTF-16 or wider encodings, a simple char array is simply too small, but if you are doing it in, for example, C#, you don't have to worry about the size :)
You can write a function that will try to determine whether a file is text-based. While this will not be 100% accurate, it may be just enough for you. Such a function does not need to go through the whole file; about a kilobyte should be enough (or even less). One thing to do is to count how many whitespace characters and newlines there are. Another is to consider individual bytes and check whether they are alphanumeric or not. With some experiments you should be able to come up with a decent function. Note that this is just a basic approach and text encodings might complicate things.
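A rough sketch of such a function (sampling the first kilobyte and counting "texty" bytes; the 95% threshold and the byte classes are arbitrary choices):

#include <stdio.h>
#include <ctype.h>

/* Heuristic: treat the file as text if its first kilobyte contains no NUL
   bytes and is mostly printable ASCII or common whitespace. It will misjudge
   some files and knows nothing about UTF-16 and other wide encodings. */
static int looks_like_text(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return 0;

    unsigned char buf[1024];
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);

    if (n == 0)
        return 1;   /* an empty file is as text-like as anything */

    size_t texty = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == '\0')
            return 0;   /* NUL bytes almost never appear in plain text */
        if (isprint(buf[i]) || buf[i] == '\t' || buf[i] == '\n' || buf[i] == '\r')
            texty++;
    }
    return texty * 100 >= n * 95;
}

int main(int argc, char **argv)
{
    for (int i = 1; i < argc; i++)
        printf("%s: %s\n", argv[i],
               looks_like_text(argv[i]) ? "probably text" : "probably binary");
    return 0;
}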
