How to get the last recognized key sequence with libreadline?

Basically, I want readline to recognize cursor keys for me. Their byte identifiers are multi-character, e.g. \e[D, so hooking into rl_getc isn't enough. However, I didn't see any variables like rl_key_sequence or rl_current_key_seq.
Is there really no way to get the fully processed information, i.e. the multibyte key ID, out of readline?

Related

Reverse op for GetKeyNameText

We can use GetKeyNameText() to retrieve a string that represents the name of a key. Is there any way to do the reverse, i.e. get a scancode or virtual key for a given key name?
I want to write key names to a config file so users can edit them easily. When I read in the config file, I would need to do the reverse of GetKeyNameText().
Not a good idea, these names are not constant:
The key name is translated according to the layout of the currently installed keyboard, thus the function may give different results for different input locales.
If you are desperate I suppose you can call GetKeyNameText in a loop trying all possible scancodes.
MapVirtualKey can convert a scancode to a virtual key if you also need to do that. Virtual keys are stable across all keyboards and all versions of Windows.
I would suggest that you hardcode some of the names that are common and have virtual keys like A-Z, 0-9, Ctrl, Alt, Shift, Home, Insert etc. and store those as English text. Only store the scancode for other strange keys.
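As a rough illustration of the hardcoded-table idea, here is a minimal C sketch. The function name key_from_name, the KEY_ constants, and the table contents are made up for this example; the numeric values are the documented Windows virtual-key codes, redefined here so the sketch compiles without <windows.h>. Falling back to a stored scancode for names that are not in the table is not shown.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names; the values match the documented Windows VK_* codes. */
enum {
    KEY_SHIFT  = 0x10,
    KEY_CTRL   = 0x11,
    KEY_ALT    = 0x12,
    KEY_HOME   = 0x24,
    KEY_INSERT = 0x2D
};

struct key_name {
    const char *name;   /* English text stored in the config file */
    int vk;             /* virtual-key code */
};

static const struct key_name key_names[] = {
    { "Shift",  KEY_SHIFT  },
    { "Ctrl",   KEY_CTRL   },
    { "Alt",    KEY_ALT    },
    { "Home",   KEY_HOME   },
    { "Insert", KEY_INSERT },
};

/* Returns the virtual-key code for a config-file key name, or -1 if the
   name is not one of the hardcoded ones. */
int key_from_name(const char *name)
{
    size_t i;

    /* The virtual keys for A-Z and 0-9 equal their ASCII codes. */
    if (name[0] != '\0' && name[1] == '\0') {
        char c = name[0];
        if ((c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9'))
            return (unsigned char)c;
    }

    for (i = 0; i < sizeof key_names / sizeof key_names[0]; i++) {
        if (strcmp(name, key_names[i].name) == 0)
            return key_names[i].vk;
    }
    return -1;
}
```

Because the names in the table are fixed English strings, the config file reads the same regardless of the installed keyboard layout.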

How to write a keyboard layout dll using the KbdLayerDescriptor symbol?

Looking at example source code wasn't enough, and I couldn't find any official documentation about the KbdLayerDescriptor symbol. So I still have some questions about it:
What is the purpose of the ligature table, or more precisely, how does it work? Is it for writing pre-composed characters? If not, does it automatically insert the ZERO WIDTH JOINER character, or does it simply write several characters without a ligature?
Is it possible to define three or more shift states for keys of the numeric pad?
I saw that KBD_TYPE needs to be defined. What is the purpose of each integer value?
Is it possible to use Unicode values larger than 16 bits, like the mathematical 𝚤?
I saw that keyboard layouts use [HKLM\SYSTEM\CurrentControlSet\Control\Keyboard Layout\DosKeybCodes] and [HKLM\SYSTEM\CurrentControlSet\Control\Keyboard Layouts], but these do not seem to be the only registry keys that need to be filled in to register a system-wide keyboard. So what are the required registry keys for installing a system-wide keyboard layout?

Should flags only be binary?

I may be erring towards pedantry here, but say I have a field in a database that currently has two values (but may contain more in future). I know I could name this as a flag (e.g. MY_FLAG) containing values 0 and 1, but should more values be required (e.g. 0,1,2,3,4), is it still correct to call the field a flag?
I seem to recall reading something previously, that a flag should always be binary, and anything else should be labelled more appropriately, but I may be mistaken. Does anyone know if my thinking is correct? If so, can you point me to any information on this please? My googling has turned nothing up!!
Thanks very much :o)
Flags are usually binary, because when we say "flag" we mean that it is either up (1) or down (0).
It is like the military practice of raising and lowering a flag to signal; the concept of flagging is taken from there.
Regarding the case you describe, where more values are required (e.g. 0,1,2,3,4):
In such a situation, use an enum. Enumerations are built for such cases. Alternatively, we sometimes document the meaning of the numeric values in comments or in a separate file, so that a small type (a tinyint or bit field) can be used to save memory. But never name such a field a flag.
Flags have a standard meaning: either up or down. Using the name for something else won't cause an error, but it is not good practice. Hope that helps.
It's all a matter of conventions and the ability to maintain your database/code effectively. Technically, you can have a column called my_flag defined as a varchar and hold values like "batman" and "barak obama".
By convention, flags are boolean. If you intend to have other values there, it's probably a better idea to call the column something else, like some_enum, or my_code.
Very occasionally, people talk about (for example) tri-state flags, but Wikipedia and most of the dictionary definitions that I read reserve "flag" for binary / two-state uses [1].
Of course, neither Wikipedia nor any dictionary has the authority to say some English usage is "incorrect". "Correct" usage is really "conventional" usage; i.e. what other people say / write.
I would argue that saying or writing "tri-state flag" is unconventional, but it is unambiguous and serves its purpose of communicating a concept adequately. (And the usage can be justified ...)
[1] Most, but not all; see http://www.oxforddictionaries.com/definition/english/flag.
Don't call anything "flag". Or "count" or "mark" or "int" or "code". Name it like everything else in code: after what it means.
workday {mon..fri}
tall {yes,no}
zip_code {00000..99999}
state {AL..WY}
Notice that (something like) yes/no plays the 'flag' role of indicating a permanent dichotomy (in lieu of boolean, which does that in the rest of the universe outside SQL). Use it when the specification/contract really is whether something is so. If a design might add more values, you should use a different type.
Of course if you want to add more info to a name you can. Add distinctions that are meaningful if you can.
workday {monday..friday}
workday_abbrev {mon..fri}
is_tall {yes,no}
zip_plus_5 {00000-99..99999-99}
state_name {Alabama..Wyoming}
state_2 {AL..WY}

Parsing a domain name

I am parsing the domain name out of a string by strchr() the last . (dot) and counting back until the dot before that (if any), then I know I have my domain.
This is a rather nasty piece of code and I was wondering if anyone has a better way.
The possible strings I might get are:
domain.com
something.domain.com
some.some.domain.com
You get the idea. I need to extract the "domain.com" part.
Before you tell me to go search in google, I already did. No answer, hence I am asking here.
Thank you for your help
EDIT:
The string I have contains a full hostname. This usually is in the form of whatever.domain.com but can also take other forms and as someone mentioned it can also have whatever.domain.co.uk. Either way, I need to parse the domain part of the hostname: domain.com or domain.co.uk
Did you mean strrchr()?
I would probably approach this by doing:
1. strrchr to get the last dot in the string, save a pointer here, replace the dot with a NUL ('\0').
2. strrchr again to get the next to last dot in the string. The character after this is the start of the name you are looking for (domain.com).
3. Using the pointer you saved in #1, put the dot back where you set it to NUL.
Beware that names can sometimes end with a dot, if this is a valid part of your input set, you'll need to account for it.
Edit: To handle the flexibility you need in terms of example.co.uk and others, the function described above would take an additional parameter telling it how many components to extract from the end of the name.
You're on your own for figuring out how to decide how many components to extract -- as Philip Potter mentions in a comment below, this is a Hard Problem.
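A minimal sketch of the "take the last n components" function, assuming the caller already knows n. Note that this is a variant of the steps above: instead of temporarily overwriting a dot with NUL and calling strrchr twice, it scans backward and counts dots, so the input string is never modified. The name last_components is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Returns a pointer into host at the start of its last n dot-separated
   components. If host has n or fewer components, host itself is returned.
   One trailing dot (as in "domain.com.") is skipped when counting but is
   still part of the returned string. */
const char *last_components(const char *host, int n)
{
    size_t len = strlen(host);
    const char *p;

    if (n <= 0)
        return host + len;          /* empty string */

    if (len > 0 && host[len - 1] == '.')
        len--;                      /* do not count a trailing dot */

    /* Walk backward; each dot passed completes one more component. */
    for (p = host + len; p > host; p--) {
        if (p[-1] == '.' && --n == 0)
            return p;
    }
    return host;
}
```

With n = 2, "something.domain.com" yields "domain.com"; with n = 3, "whatever.domain.co.uk" yields "domain.co.uk". Choosing n correctly is still the hard part.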
This isn't a reply to the question itself, but an idea for an alternate approach:
In the context of already very nasty code, I'd argue that a good way to make it less nasty, and to get a good facility for parsing domain names and the like, is to use PCRE or a similar library for regular expressions. That will definitely help you out if you also want to validate that the TLD exists, for instance.
It may take some effort to learn initially, but if you need to make changes to existing matching/parsing code, or create more code for string matching - I'd argue that a regex-lib may simplify this a lot in the long term. Especially for more advanced matching.
Another library I recall which supports regex, is glib.
Not sure what flavor of C, but you probably want to tokenize the domain using "." as the separator.
Try this: http://www.metalshell.com/source_code/31/String_Tokenizer.html
As for the domain name, I'm not sure what your end goal is, but domains can have lots and lots of nodes; you could have a domain name like foo.baz.biz.boz.bar.co.uk.
If you just want the last 2 nodes, then use above and get the last two tokens.

WPF RichTextBox TextChanged event - how to find deleted or inserted text?

While creating a customized editor with RichTextBox, I've faced the problem of finding the deleted/inserted text from the information provided with the TextChanged event.
The instance of TextChangedEventArgs has some useful data, but I guess it does not cover all the needs. Suppose a scenario in which multiple paragraphs are inserted and, at the same time, the selected text (which itself spanned multiple paragraphs) is deleted.
With the instance of TextChangedEventArgs, you have a collection of text changes, and each change only provides you with the number of removed or added symbols and the position of it.
The only solution I have in mind is to keep a copy of the document and apply the given list of changes to it. But the instances of TextChange only give us the number of inserted/removed symbols (and not the symbols themselves), so we need to put in some special symbol (for example, '?') to denote unknown symbols while we transform our original copy of the document.
After applying all changes to the original copy of the document, we can then compare it with the RichTextBox's updated document and find the mappings between the unknown symbols and the real ones. And finally, get what we want!
Has anybody tried this before? I need your suggestions on the whole strategy, and what you think about this approach.
Regards
It primarily depends on your use of the text changes. When the sequence includes both inserts and deletes it is theoretically impossible to know the details of each insert, since some of the symbols inserted may have subsequently been deleted. Therefore you have to choose what results you really want:
For some purposes you must know the exact sequence of changes even if some of the inserted symbols must be left as "?".
For other purposes you must know exactly how the new text differs from the old but not the exact sequence in which the changes were made.
I will describe techniques to achieve each of these results. I have used both techniques in the past, so I know they are effective.
To get the exact sequence
This is more appropriate if you are implementing a history or undo log or searching for specific actions.
For these uses, the process you describe is probably best, with one possible change: Instead of "finding the mappings between the unknown symbols and the real ones", simply run the scan forward to find the text of each "Delete" then run it backward to find the text of each "Insert".
In other words:
1. Start with the initial text and process the changes in order. For each insert, insert '?' symbols. For each delete, remove the specified number of symbols and record them as the text deleted.
2. Start with the final text and process the changes in reverse order. For each delete, insert '?' symbols. For each insert, remove the specified number of symbols and record them as the text inserted.
When this is complete, all of your "Insert" and "Delete" change entries will have the associated text to the best of our knowledge, and any text that was inserted and immediately deleted will be '?' symbols.
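The two passes can be sketched in C on plain NUL-terminated strings. This is only an illustration under several assumptions: the struct and function names are made up, each change's offset is assumed to refer to the text as it stood when that change was made, a single change may both remove and add symbols at its offset, and allocation failures and out-of-range offsets are not handled.

```c
#include <stdlib.h>
#include <string.h>

struct change {
    size_t offset;        /* position in the text when the change was made */
    size_t removed;       /* number of symbols deleted at offset */
    size_t added;         /* number of symbols inserted at offset */
    char *deleted_text;   /* filled in by the forward pass */
    char *inserted_text;  /* filled in by the backward pass */
};

/* Removes `cut` symbols at `pos` from buf, puts `fill` '?' symbols in their
   place, and returns the removed symbols as a newly allocated string. */
static char *splice(char *buf, size_t pos, size_t cut, size_t fill)
{
    size_t len = strlen(buf);
    char *taken = malloc(cut + 1);

    memcpy(taken, buf + pos, cut);
    taken[cut] = '\0';
    /* Shift the tail (including the NUL) to make room for the '?' run. */
    memmove(buf + pos + fill, buf + pos + cut, len - pos - cut + 1);
    memset(buf + pos, '?', fill);
    return taken;
}

void recover_texts(const char *old_text, const char *new_text,
                   struct change *ch, size_t n)
{
    size_t i;
    size_t cap = strlen(old_text) + strlen(new_text) + 1;
    char *buf;

    /* Upper bound on how long the working copy can get in either pass. */
    for (i = 0; i < n; i++)
        cap += ch[i].added + ch[i].removed;
    buf = malloc(cap);

    /* Forward pass: inserts become '?', deletes yield the deleted text. */
    strcpy(buf, old_text);
    for (i = 0; i < n; i++)
        ch[i].deleted_text = splice(buf, ch[i].offset, ch[i].removed, ch[i].added);

    /* Backward pass: undo each change starting from the final text, so
       deletes become '?' and inserts yield the inserted text. */
    strcpy(buf, new_text);
    for (i = n; i-- > 0; )
        ch[i].inserted_text = splice(buf, ch[i].offset, ch[i].added, ch[i].removed);

    free(buf);
}
```

For example, going from "hello world" to " hello there" by replacing "world" with "there", inserting "oh " at the start, and then deleting "oh", the second change's inserted text comes back as "?? " and the third change's deleted text as "??", because those two symbols exist in neither the initial nor the final text.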
To get the difference
This is more appropriate for revision marking or version comparison.
For these uses, simply use the text change information to compute a set of integer ranges in which changes might be found, then use a standard diff algorithm to find the actual changes. This tends to be very efficient in processing incremental changes but still gives you the best updates.
This is particularly nice when you paste in a replacement paragraph that is almost identical to the original: the text change information alone will indicate that the whole paragraph is new, but using diff (i.e. this technique) will mark only those symbol runs that are actually different.
The code for computing the change range is simple: Represent the change as four integers (oldstart, oldend, newstart, newend). Run through each change:
If changestart is before newstart, reduce newstart to changestart and reduce oldstart an equal amount.
If changeend is after newend, increase newend to changeend and increase oldend an equal amount.
Once this is done, extract range [oldstart, oldend] from the old document and the range [newstart, newend] from the new document, then use the standard diff algorithm to compare them.
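A minimal C sketch of the range computation follows. It is one interpretation, not the exact code behind the description: the names are made up, the ranges are half-open ([start, end) rather than inclusive), the range starts out empty at the first change's offset, and changeend is taken to be the end of the removed span. It also adds one step the description leaves implicit, namely adjusting newend by the length difference after each change, which is my assumption about how the range keeps tracking the new document.

```c
#include <stddef.h>

struct edit {
    size_t offset;    /* position in the text when the change was made */
    size_t removed;   /* number of symbols deleted at offset */
    size_t added;     /* number of symbols inserted at offset */
};

struct range {
    size_t oldstart, oldend;   /* half-open range in the old document */
    size_t newstart, newend;   /* half-open range in the new document */
};

/* Computes one range pair covering every change. Returns 0 if there are
   no changes (r is left untouched), 1 otherwise. */
int change_range(const struct edit *ed, size_t n, struct range *r)
{
    size_t i;

    if (n == 0)
        return 0;

    /* Start with an empty range at the first change. */
    r->oldstart = r->oldend = r->newstart = r->newend = ed[0].offset;

    for (i = 0; i < n; i++) {
        size_t start = ed[i].offset;
        size_t end = start + ed[i].removed;   /* end of the replaced span */

        if (start < r->newstart) {            /* grow both ranges to the left */
            size_t d = r->newstart - start;
            r->newstart -= d;
            r->oldstart -= d;
        }
        if (end > r->newend) {                /* grow both ranges to the right */
            size_t d = end - r->newend;
            r->newend += d;
            r->oldend += d;
        }
        /* Assumed step: the replaced span now lies inside the range, so the
           new-document end moves by the change in length. */
        r->newend = r->newend - ed[i].removed + ed[i].added;
    }
    return 1;
}
```

The two substrings selected by the resulting range are then what gets handed to the diff algorithm.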

Resources