BIOS Data Area: Meaning of values for Cursor Start (61h) and Cursor End (60h) - cursor

I have been looking for a clear definition of memory addresses 461h (Cursor Start) and 460h (Cursor End) in the BIOS Data Area.
But I have only found references that say it is the "size" of the cursor. Others say the values refer to scan lines, and the values actually used also differ widely (63h and 00h for MCGA; 67h and 00h for 80x25 text mode; or 01h and 00h, for instance). That is still not specific enough, and it seems those values need to be adjusted if the VGA registers are programmed manually for a standard graphics or text mode (more specifically text modes, which actually use a cursor) instead of going through INT 10h, so that DOS or the BIOS itself can keep a consistent configuration.
I have seen the first 2 pages of Google results (Ralf Brown, mcamafia.de, Wikipedia, TinyVGA, BIOS Central, etc.) but the information they contain doesn't allow me to describe precisely what those values do, or to which configuration they correspond in the VGA registers, so I would be programming those values blindly.
So what would be the right definition for them?

My understanding:
460h-461h is "Cursor Shape", size = word. The low byte (at 40:60h) holds the ending scan line number, while the upper byte (at 40:61h) holds the starting scan line. With video mode 3 the character cell is normally 16 scan lines high on a VGA. To turn off the cursor, set the starting scan line number to a value greater than the ending scan line number (as in the blank-cursor example below).
Sample Cursor Shapes:
two line cursor at bottom: 0607h
lower half cursor: 0307h
upper half or quarter cursor: 0003h
full box cursor: 0007h
blank cursor: 0100h
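For completeness, here is a minimal sketch of how these values are normally programmed and then mirrored into the BDA, assuming a 16-bit DOS compiler such as Turbo C (int86() and MK_FP() from dos.h):

/* Minimal sketch, assuming a 16-bit DOS compiler (e.g. Turbo C) where
   dos.h provides int86() and MK_FP(). Sets the cursor shape through
   INT 10h AH=01h and then reads the BIOS Data Area word at 0040:0060h
   to show that Cursor End (460h) and Cursor Start (461h) mirror what
   was programmed. */
#include <dos.h>
#include <stdio.h>

void set_cursor_shape(unsigned char start, unsigned char end)
{
    union REGS r;
    r.h.ah = 0x01;      /* INT 10h: set text-mode cursor shape */
    r.h.ch = start;     /* starting scan line                  */
    r.h.cl = end;       /* ending scan line                    */
    int86(0x10, &r, &r);
}

int main(void)
{
    /* word at 0040:0060 -> low byte = end (460h), high byte = start (461h) */
    unsigned int far *shape = (unsigned int far *)MK_FP(0x0040, 0x0060);

    set_cursor_shape(0x06, 0x07);                      /* two-line cursor at the bottom */
    printf("BDA cursor shape word: %04Xh\n", *shape);  /* expect 0607h */
    return 0;
}

The point is that the BIOS keeps 460h/461h up to date only when the shape is set through INT 10h; if you program the CRTC cursor registers directly, nothing updates the BDA for you, which is why those bytes have to be written manually in that case.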

Related

Why is the first byte shown like that in the Beyond Compare tool?

I have two binary files that are supposed to be the same, but they are not, so I use binary diff tools to look at them. Two different tools, Beyond Compare and UltraCompare, give me different results for one file at the first byte.
I used the HxD tool to verify the content, and it seems that HxD agrees with UltraCompare.
Can anybody tell me what that means in Beyond Compare? Does this mean Beyond Compare is not reliable in some cases?
In Beyond Compare, spaces with the cross-hatched ▨ background indicate missing (added or deleted) bytes. In your image the file on the left starts with a 0x00 byte that the one on the right doesn't have. BC will show a gap in the file content to make the rest of the bytes line up visually. That's also indicated by the hex addresses shown as "line numbers" being different on the two sides, and it is the reason the rest of the file shows as black (exact matches). Gaps don't have any effect on the content of the files; they're just a way of presenting the alignment more clearly.
UltraCompare apparently isn't adjusting the alignment in this case, so every 0xC8 byte is lined up with a 0x00 one and vice versa, which is why the entire comparison is shown as a difference (red).
HxD is just showing a single file, not a comparison, so it doesn't need to use gaps to show the alignment. Whether UltraCompare is better or not depends on what you want the comparison to do. It is just comparing byte 1 to byte 1, byte 2 to byte 2, etc., while BC is aligning the files taking adds and deletes into account. In this case, it's showing that byte 1 on the left was added, so it doesn't match anything on the right, and that byte 2 on the left matches byte 1 on the right, byte 3 on the left matches byte 2 on the right, and so on.
If the binary data can have inserts and deletes (e.g., if it contains textual strings or variable length headers), then BC's approach is better because it avoids showing the entire file as different if one side just has an added byte (as in this case).
If the binary data is fixed size, for example a bitmap, then what UltraCompare is doing is better, since it's not adjusting offsets to line things up better. Since your filenames are labeled "pixelData" I assume that's the behavior you would prefer. In that case, in Beyond Compare you can change that by using the Session menu's Session Settings... command and switching the "Comparison" alignment setting from "Complete" to "None".
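If it helps to picture the difference, here is a small sketch (the file names are just placeholders) of the purely positional comparison UltraCompare appears to do: byte N against byte N, with no realignment for inserted or deleted bytes.

/* Sketch of a plain positional comparison: byte N of file A against
   byte N of file B, with no realignment for inserted/deleted bytes.
   The file names are placeholders. */
#include <stdio.h>

int main(void)
{
    FILE *a = fopen("left.bin", "rb");
    FILE *b = fopen("right.bin", "rb");
    long offset = 0;
    int ca, cb;

    if (!a || !b) { perror("fopen"); return 1; }

    for (;;) {
        ca = fgetc(a);
        cb = fgetc(b);
        if (ca == EOF || cb == EOF)
            break;
        if (ca != cb)
            printf("difference at offset 0x%lX: %02X vs %02X\n", offset, ca, cb);
        offset++;
    }
    if (ca != cb)                       /* exactly one file ended early */
        printf("files differ in length after offset 0x%lX\n", offset);

    fclose(a);
    fclose(b);
    return 0;
}

With one byte inserted at the front of the left file, a loop like this reports a difference at every offset, which is exactly the all-red result you saw in UltraCompare.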

ASCII characters in RGB565 format

I want to show some text on a 640*480 screen. Where can I get the codes for ASCII characters in RGB565 format for a C program, such that I get a natural look and feel, like a command-line terminal, on such a screen?
1- What would be the best width-height for a character?
2- Where can I get the 16-bit hex code (known as Bitmap Font or Raster Font) for each character?
e.g. const unsigned short myChar[] = {0x0001, 0x0002, 0x0003, 0x0004 ...}
"... the 16-bit hex code ..." is a misconception. You must have meant 16 bytes – one byte (8 pixels) per character line. A 640*480 screen resolution with 'natural' sized text needs 8x16 bitmaps. That will show as 30 lines of 80 columns (the original MCGA screens actually showed only 25 lines, but that was with the equivalent of 640*400 – stretched a bit).
Basic Google-fu turns up this page: https://fossies.org/dox/X11Basic-1.23/8x16_8c_source.html, and the character set comes pretty close to how I remember it from ye olde monochrome monitors: [a]
................................................................
................................................................
................................................................
................................................................
...XXXX.........................................................
....XX..........................................................
....XX..........................................................
....XX...XXXXX..XX.XXX...XXX.XX.XX...XX..XXXX...XX.XXX...XXXXX..
....XX..XX...XX..XX..XX.XX..XX..XX...XX.....XX...XXX.XX.XX...XX.
....XX..XX...XX..XX..XX.XX..XX..XX.X.XX..XXXXX...XX..XX.XXXXXXX.
XX..XX..XX...XX..XX..XX.XX..XX..XX.X.XX.XX..XX...XX.....XX......
XX..XX..XX...XX..XX..XX..XXXXX..XXXXXXX.XX..XX...XX.....XX...XX.
.XXXX....XXXXX...XX..XX.....XX...XX.XX...XXX.XX.XXXX.....XXXXX..
........................XX..XX..................................
.........................XXXX...................................
................................................................
Since this is a simple monochrome bitmap pattern, you don't need "RGB565 format for a C program" (another misconception). It is way easier to loop over each bitmap and use your local equivalent of PutPixel to draw each character in any color you want. You can choose between not drawing the background (the 0 pixels) at all, or having a "background color". The space at the bottom of the bitmap is large enough to put in an underline.
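As a rough illustration of that loop (put_pixel() and the font8x16[] table name are assumptions here, not something taken from the linked file), drawing one 8x16 character in a given RGB565 colour could look like this:

/* Sketch of drawing one character from an 8x16 monochrome bitmap font.
   put_pixel() and font8x16[] are assumed to exist elsewhere: the table
   holds 16 bytes per character, one byte per row of 8 pixels, as in
   the 8x16.c file linked above. Colours are RGB565 values. */
#include <stdint.h>

extern const uint8_t font8x16[256][16];                 /* assumed font table */
extern void put_pixel(int x, int y, uint16_t rgb565);   /* assumed primitive  */

void draw_char(int x, int y, unsigned char c,
               uint16_t fg, uint16_t bg, int draw_bg)
{
    int row, col;
    for (row = 0; row < 16; row++) {
        uint8_t bits = font8x16[c][row];
        for (col = 0; col < 8; col++) {
            if (bits & (0x80 >> col))           /* leftmost pixel is the MSB */
                put_pixel(x + col, y + row, fg);
            else if (draw_bg)
                put_pixel(x + col, y + row, bg);
        }
    }
}

/* Example: draw_char(0, 0, 'A', 0xFFFF, 0x0000, 1);  white on black */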
That said: I've used such bitmaps for years but I recently switched to a fully antialiased gray shade format. The bitmaps are thus larger (a byte per pixel instead of a single bit) but you don't have to loop over individual bits anymore, which is a huge plus. Another is, I now can use the shades of gray as they are (thus drawing 'opaque') or treat them as alpha, and get nicely antialiased text in any color and over any background.
That looks like this:
I did not draw this font; I liked the way it looked on my terminal, so I wrote a C program to dump a basic character set and grabbed a copy of the screen. Then I converted the image to pure grayscale and wrote a quick-and-dirty program to convert the raw data into a proper C structure.
[a] Not entirely true. The font blitter in the VGA video card added another column at the right of each character, so effectively the text was 9x16 pixels. For the small set of border graphics – ╔╦╤╕╩ and so on – the extra column got copied from the rightmost one.
Not the most elegant solution, but I created an empty BMP image and filled it with characters.
Then I used this tool to convert the BMP file to a C bitmap array.
You should then be able to distinguish the characters in your array.
If you can access some type of 16-bit DOS mode, you might be able to get the fonts from a BIOS INT 10 (hex 10) call. In this example, the address of the font table is returned in ES:BP (ES is usually 0xC000). This works for 16-bit programs in the Windows DOS console on 32-bit versions of Windows. For 64-bit versions of Windows, DOSBox may work, or using a virtual PC should also work. If this doesn't work, do a web search for "8 by 16 font", which should turn up some example fonts.
INT 10 - VIDEO - GET FONT INFORMATION (EGA, MCGA, VGA)
AX = 1130h
BH = pointer specifier
00h INT 1Fh pointer
01h INT 43h pointer
02h ROM 8x14 character font pointer
03h ROM 8x8 double dot font pointer
04h ROM 8x8 double dot font (high 128 characters)
05h ROM alpha alternate (9 by 14) pointer (EGA,VGA)
06h ROM 8x16 font (MCGA, VGA)
07h ROM alternate 9x16 font (VGA only) (see #0020)
11h (UltraVision v2+) 8x20 font (VGA) or 8x19 font (autosync EGA)
12h (UltraVision v2+) 8x10 font (VGA) or 8x11 font (autosync EGA)
Return: ES:BP = specified pointer
CX = bytes/character of on-screen font (not the requested font!)
DL = highest character row on screen
Note: for UltraVision v2+, the 9xN alternate fonts follow the corresponding
8xN font at ES:BP+256N
BUG: the IBM EGA and some other EGA cards return in DL the number of rows on
screen rather than the highest row number (which is one less).
SeeAlso: AX=1100h,AX=1103h,AX=1120h,INT 1F"SYSTEM DATA",INT 43"VIDEO DATA"
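A sketch of making that call from 16-bit C, assuming a Borland-style compiler whose dos.h provides intr() and struct REGPACK (needed here because the pointer comes back in ES:BP):

/* Sketch for a 16-bit DOS compiler such as Turbo C: call INT 10h
   AX=1130h with BH=06h to get the ROM 8x16 font pointer in ES:BP,
   then copy 256 characters * 16 bytes into a local buffer. */
#include <dos.h>

static unsigned char font8x16[256 * 16];

void grab_rom_font(void)
{
    struct REGPACK r;
    unsigned char far *rom;
    int i;

    r.r_ax = 0x1130;                    /* get font information    */
    r.r_bx = 0x0600;                    /* BH = 06h: ROM 8x16 font */
    intr(0x10, &r);

    rom = (unsigned char far *)MK_FP(r.r_es, r.r_bp);
    for (i = 0; i < 256 * 16; i++)      /* copy it out of ROM      */
        font8x16[i] = rom[i];
}

After this, font8x16[] holds 16 bytes per character, the same layout discussed in the previous answer.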

I cannot understand the abstraction between the characters we see and how computers treat them

This is pretty low level and English is not my mother tongue so please be easy on me.
So imagine you are in bash and a command prompt is in front of you on the screen.
When you type ls and hit enter, you are actually sending some bytes to the CPU, 01101100 01110011 00001010 (that is: l, s, linefeed), right? The keyboard controller sends the bytes to the CPU, and the CPU tells the operating system what bytes have been received.
So we have an application that is called 01101100 01110011 on our hard drive (or in memory...), if I understand correctly? That is a file, and it is an executable file. But how does the operating system find 01101100 01110011 on a drive or in memory?
I also want to extend this question to functions. We say the C standard library has a function called printf, for example. How can a function have a name that is in a file? OK, I understand that the implementation of printf is CPU and operating system specific, and is a number of machine instructions lying somewhere in memory or on the hard drive. But I do not understand how we get to it.
When I link code that requires the implementation of printf, how is it found? I am assuming the operating system knows nothing about the name of the function, or does it?
Koray, user #DrKoch gave a good answer, but I'd like to add some abstractions.
First, ASCII is a code. It is a table with bit patterns in one column and a letter in the next column. The bit patterns are exactly one byte long (leaving aside 'wide chars' and the like). If we know a byte is supposed to represent a character, then we can look up the byte's bit pattern in the table. A print function (remember the matrix printers?) receives a character (a byte) and instructs the needles of the matrix printer to hammer onto the paper in some orderly way, and behold, a letter is formed that humans can read. The ASCII code was devised because computers don't think in letters. There are also other codes, such as EBCDIC, which only means the table is different.
Now, if we don't know that the byte is a representation of a letter in a certain code, then we are lost and the byte could just as well mean a number. We can multiply the byte by another byte. So you can multiply 'a' by 'p', which gives 97 * 112 = 10864. Does that make sense? Only if we know the bytes represent numbers; it is nonsense if the bytes represent characters.
The next level is that we call a sequence of bytes that are all supposed to represent letters (characters) a 'string', and we have developed functions that can search, get and append from/to strings. How long is a string? In C we agreed that the end of the string is reached when we see a byte whose bit pattern is all zeroes, the null character. In other languages, a string representation can have a length member and so doesn't need a terminating null character.
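A tiny C illustration of both points – a byte is just a number until we agree to read it as a character, and a C string is simply bytes followed by a zero byte:

/* A byte is just a number until we agree to treat it as a character,
   and a C string is just bytes terminated by a zero byte. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    char a = 'a', p = 'p';
    char s[] = { 'l', 's', '\0' };      /* the string "ls" plus its terminator */

    /* Treated as numbers (ASCII 97 and 112), multiplication "works": */
    printf("'a' * 'p' = %d\n", a * p);  /* prints 10864 */

    /* Treated as characters, strlen() stops at the zero byte: */
    printf("the string is \"%s\", length %u\n", s, (unsigned)strlen(s));
    return 0;
}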
This is an example of a "stacking of agreements". Another example (referring to a question you asked before) is interrupts: the hardware defines a physical line on the circuit board as an interrupt line (agreement). It gets connected to the interrupt pin (agreement) of the processor. A signal on the line (e.g. from an external device) causes the processor to save the current state of the registers (agreement) and transfer control to a pre-defined memory location (agreement) where an interrupt handler is placed (agreement) which handles the request from the external device. In this example of stacking we can go many levels up to the functional application, and many levels down to the individual gates and transistors (and the basal definition of how many volts is a '1' and how many volts is a '0', and of how long that voltage must be observed before a one or a zero has definitely been seen).
Only when understanding that all these levels are only agreements, can one understand a computer. And only when understanding that all these levels are only agreements made between humans, can one abstract from it and not be bothered with these basics (the engineers take care of them).
You'll hardly find an answer if you look at the individual bits or bytes and the CPU.
In fact, when you type l and s, the ASCII codes of these characters are read by the shell and combined into the string "ls". At that point the shell has built a dictionary with string keys, in which it finds the key "ls" and sees that it points to a specific executable "ls" in a path like "/usr/bin".
You see, even the shell thinks in strings, not in characters, bytes or even bits.
Something very similar happens inside the linker when it tries to build an executable from your code and a collection of library files (*.lib, *.dll). It has built a dictionary with "printf" as one of the keys, which points to the correct library file and a byte offset into that file. (This is rather simplified, to demonstrate the principle.)
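A toy sketch of such a dictionary (the entries here are invented purely for illustration): a table of name/value pairs searched by string comparison, which is, in spirit, what the shell does with command names and what the linker does with symbol names.

/* Toy sketch of a string-keyed lookup, in the spirit of the shell's
   command lookup or a linker's symbol table. The entries are invented
   for illustration only. */
#include <stdio.h>
#include <string.h>

struct entry {
    const char *name;       /* the key: a command or symbol name            */
    const char *location;   /* what it maps to: a path, or file plus offset */
};

static const struct entry table[] = {
    { "ls",     "/usr/bin/ls" },
    { "printf", "somewhere in the C library, at some byte offset" },
};

static const char *lookup(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].name, name) == 0)   /* compare whole strings */
            return table[i].location;
    return NULL;
}

int main(void)
{
    printf("ls -> %s\n", lookup("ls"));
    return 0;
}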
There are several layers of libraries (and BIOS code) before all this gets to the CPU. Don't make your life too hard, don't think too much about these layers in detail.

GIF LZW decompression hints?

I've read through numerous articles on GIF LZW decompression, but I'm still confused as to how it works and how to handle, in code, the more fiddly parts.
As I understand it, when I get to the byte stream in the GIF for the LZW compressed data, the stream tells me:
Minimum code size, AKA number of bits the first byte starts off with.
Now, as I understand it, I have to either add one to this for the clear code, or add two for clear code and EOI code. But I'm confused as to which of these it is?
So say I have 3 colour codes (01, 10, 11), with the EOI code assumed to be 00: will the code that follows the minimum code size (of 2) be 2 bits, or will it be 3 bits to factor in the clear code? Or are the clear code and EOI code both already factored into the minimum size?
The second question is: what is the easiest way to read dynamically sized bit groups from a file? Because reading an odd number of bits (3 bits, 12 bits, etc.) from a stream of 8-bit bytes sounds like it could be messy and buggy.
To start with your second question: yes, you have to read the dynamically sized codes from an 8-bit byte stream. You have to keep track of the code size you are currently reading and of the number of unused bits left over from previous read operations (so that the next byte from the file is merged in at the right position).
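A minimal sketch of such a bit reader, assuming the GIF image data sub-blocks have already been concatenated into one buffer (GIF packs LZW codes least significant bit first):

/* Minimal sketch of reading variable-width LZW codes from a byte
   buffer. Assumes the GIF data sub-blocks have already been
   concatenated into buf[]. GIF packs codes least significant bit
   first, so new bytes are added above the bits still pending. */
#include <stdint.h>
#include <stddef.h>

struct bitreader {
    const uint8_t *buf;     /* concatenated image data      */
    size_t len;             /* number of bytes in buf       */
    size_t byte_pos;        /* next byte to consume         */
    uint32_t bit_buffer;    /* bits not yet handed out      */
    int bits_in_buffer;     /* how many of them are valid   */
};

/* Return the next code of 'code_size' bits, or -1 when out of data. */
int read_code(struct bitreader *br, int code_size)
{
    int code;
    while (br->bits_in_buffer < code_size) {
        if (br->byte_pos >= br->len)
            return -1;
        br->bit_buffer |= (uint32_t)br->buf[br->byte_pos++] << br->bits_in_buffer;
        br->bits_in_buffer += 8;
    }
    code = (int)(br->bit_buffer & ((1u << code_size) - 1));
    br->bit_buffer >>= code_size;
    br->bits_in_buffer -= code_size;
    return code;
}

The caller starts with code_size = minimum code size + 1 and bumps it by one each time the next dictionary entry would no longer fit in the current width (up to a maximum of 12 bits), which is where the "dynamically sized" part comes from.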
IIRC, for a 256-colour image the minimum code size is 8 bits, which gives you a clear code of 256 (decimal) and an End Of Information code of 257. The first stored code is then 258.
I am not sure why you haven't looked at the source of one of the public domain graphics libraries. I know I did not, because back in 1989 (!) there were no libraries to use and no internet with complete descriptions. I had to implement a decoder from an example executable (for MS-DOS, from CompuServe) that could display images, plus a few GIF files, so I know it can be done (but it is not the most efficient way of spending your time).

TSR Program to change case of characters on screen in windows/dos

I would like to know how we can change the case of the characters on the screen using C. It is a TSR program using the dos.h header file.
I might be able to help partially, from what I remember of my early undergrad days.
In DOS, the address 0xB8000000 (0xB800:0 as segment:offset, as rightly pointed out in the comments) is the starting address of text-mode video memory (0xA0000000, i.e. 0xA000:0, being that for graphics). Anything written into this area is displayed directly by the VGA card. Every character on the screen is made up of two bytes: the first byte is the ASCII character and the second is the colour attribute.
So effectively you take a far pointer in 16-bit C (since a normal near pointer won't do) and assign it the above address. Then your screen size (25*80, or whatever) times 2 is the total number of bytes covering your screen.
I remember having written the equivalent of a trivial printf function using above.
Getting back to your problem, you have to write code which loops through all the even addresses, starting from the address above, up to the screen size. Even addresses, because the odd ones hold the colour attribute. There it checks whether the stored character is an ASCII letter and adds or subtracts 32 as needed, e.g. 'A' + 32 gives you 'a', and so on.
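A rough sketch of that loop, assuming a 16-bit DOS compiler such as Turbo C (far pointers and MK_FP() from dos.h); the TSR / interrupt-hooking part discussed below is left out:

/* Rough sketch for a 16-bit DOS compiler such as Turbo C: walk the
   80x25 colour text screen at B800:0000 and flip the case of every
   letter, skipping the attribute bytes. The TSR / interrupt-hooking
   part is not shown. */
#include <dos.h>
#include <ctype.h>

void flip_screen_case(void)
{
    unsigned char far *video = (unsigned char far *)MK_FP(0xB800, 0x0000);
    int i;

    for (i = 0; i < 80 * 25 * 2; i += 2) {      /* even bytes hold the characters */
        unsigned char c = video[i];
        if (isupper(c))
            video[i] = c + 32;                  /* 'A' + 32 == 'a' */
        else if (islower(c))
            video[i] = c - 32;
    }
}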
The question remains of when your program should do this. I believe you can hook some interrupt, or something similar via dos.h, which triggers every time a screen character is changed, but this part is not very clear in my memory.
See if that works for you.

Resources