I am creating a new Tcl_ObjType, so I need to define the four functions setFromAnyProc, updateStringProc, dupIntRepProc and freeIntRepProc. When it comes to testing my code, I see something interesting/mysterious.
In my testing code, when I do the following:
Tcl_GetString(p_New_Tcl_obj);
updateStringProc() for the new Tcl object is called; I can see it in gdb. This is expected.
The weird thing is when I do the following testing code:
Tcl_SetStringObj(p_New_Tcl_obj, p_str, strlen(p_str));
I expect setFromAnyProc() to be called, but it is not!
I am confused. Why is it not called?
The setFromAnyProc is not nearly as useful as you might think. Its role is to convert a value[*] from something with a populated bytes field into something with a populated bytes field and a valid internalRep and typePtr. It's called when something wants a generic conversion to a particular format, and is in particular the core of the Tcl_ConvertToType function. You probably won't have used that; Tcl itself certainly doesn't!
This is because it turns out that the point when you want to do the conversion is in a type-specific accessor or manipulator function (examples from Tcl's API include Tcl_GetIntFromObj and Tcl_ListObjAppendElement, which are respectively an accessor for the int type[**] and a manipulator for the list type). At that point, you're in code that has to know the full details of the internals of that specific type, so using a generic conversion is not really all that useful: you can do the conversion directly if necessary (or factor that out to a conversion function).
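To make that concrete, here is a minimal sketch of such a type-specific accessor for a hypothetical "foo" type (Tcl 8 style). The names Foo, fooType, SetFooFromAny and GetFooFromObj are illustrative, not part of Tcl's API; only the Tcl_Obj fields, the Tcl_ObjType layout and ckalloc/ckfree are real:

#include <tcl.h>

typedef struct Foo { int value; } Foo;

static void FreeFooIntRep(Tcl_Obj *objPtr)
{
    ckfree((char *) objPtr->internalRep.otherValuePtr);
}

/* dup/updateString/setFromAny left NULL to keep the sketch short */
static const Tcl_ObjType fooType = {
    "foo", FreeFooIntRep, NULL, NULL, NULL
};

static int SetFooFromAny(Tcl_Interp *interp, Tcl_Obj *objPtr)
{
    int value;
    Foo *fooPtr;

    /* Parse the string rep; a real type would have its own parser
     * (this toy type just wraps an integer). */
    if (Tcl_GetIntFromObj(interp, objPtr, &value) != TCL_OK) {
        return TCL_ERROR;
    }
    fooPtr = (Foo *) ckalloc(sizeof(Foo));
    fooPtr->value = value;

    /* Throw away whatever internal rep was there before... */
    if (objPtr->typePtr != NULL && objPtr->typePtr->freeIntRepProc != NULL) {
        objPtr->typePtr->freeIntRepProc(objPtr);
    }
    /* ...and install ours. */
    objPtr->internalRep.otherValuePtr = fooPtr;
    objPtr->typePtr = &fooType;
    return TCL_OK;
}

/* The accessor: the same shape as Tcl_GetIntFromObj. */
static int GetFooFromObj(Tcl_Interp *interp, Tcl_Obj *objPtr, Foo **fooPtrPtr)
{
    if (objPtr->typePtr != &fooType
            && SetFooFromAny(interp, objPtr) != TCL_OK) {
        return TCL_ERROR;
    }
    *fooPtrPtr = (Foo *) objPtr->internalRep.otherValuePtr;
    return TCL_OK;
}

The point is that GetFooFromObj converts in place the first time it is asked, exactly as Tcl_GetIntFromObj does, without Tcl_ConvertToType ever being involved.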
Tcl_SetStringObj works by throwing away the internal representation of your object (with the freeIntRepProc callback), disposing of the old bytes string representation (through Tcl_InvalidateStringRep, or rather its internal analog) and then installing the new bytes you've supplied.
I find that I can leave the setFromAnyProc field of a Tcl_ObjType set to NULL with no problems.
[*] The Tcl_Obj type is mis-named for historic reasons. It's a value. Tcl_Value was taken for something else that's now obsolete and virtually unused.
[**] Integers are actually represented by a cluster of internal types, depending on the number of bits required. You don't need to know the details if you're just using them, as the accessor functions completely hide the complexity.
Apologies if the answer is obvious, but I don't get it. I have a function that accepts a FloatArray, so I passed an Array<Float> to it, but it rejects it! I thought FloatArray was just another way of creating Array<Float>. What's the difference?
Short answer: one is an array of primitives, the other an array of references to Float objects.
The difference is mostly hidden from you in Kotlin, so to explain it's probably best to go back to Java…
Java has nine basic types (if I've counted correctly). Eight of them hold a value directly: boolean, byte, short, char, int, long, float, and double — those are called ‘primitives’. The other type is a reference, which can point to an instance of an object or array.
Because there are cases when you need to pass one of those primitive values around as an object, Java also provides some objects which simply wrap a primitive value: java.lang.Boolean, java.lang.Byte, and so on. There's one for each primitive type.
Most code uses primitives directly, but sometimes it's handy to be able to pass an object reference. (For one thing, primitives are not nullable, so if you need to support a null, then you'll need an object reference. For another, generic code such as List and the other classes in the collections framework can handle only object references.)
However, object wrappers are less efficient, because each instance is a full object and takes a certain amount of memory (e.g. 16–32 bytes, depending on the Java runtime) — and that's in addition to the size of references to it (perhaps 8 bytes). The JVM caches commonly-used wrappers (e.g. true and false for booleans, and some small numbers), but for anything else you'll be creating new objects on the heap.
The wrappers are clearly distinguished from the primitive types — they're capitalised (and, in the case of Integer, spelled differently). In early versions of Java, they were not interchangeable; you needed to explicitly wrap (e.g. new Integer(someValue)) and unwrap (e.g. someReference.intValue()) when needed. Java 5 added ‘autoboxing’, where in many cases the compiler would do that for you. This blurs the distinction a bit, but most of the time you still need to be aware of it.
One of the benefits of Kotlin is that it removes some of Java's unnecessary complexity. One of the ways it does this is by hiding that distinction almost completely. The Kotlin language has no primitives: everything looks like an object. However, for reasons of efficiency, compiled Kotlin uses primitives ‘under the hood’ where possible. For example:
var i: Int
That declares an Int value — which will be stored as a primitive field. However:
var i: Int?
That declares a reference to an integer wrapper. (That's because primitives are not nullable, and so a primitive can't store a null value.)
This is an implementation detail: most of the time, when you're writing Kotlin, you don't need to be aware of this. But the distinction is still there at runtime, and arrays are one of the rare times it becomes visible:
FloatArray is an array of primitives. It uses the minimum of memory, and interoperates with Java code that uses a float[] type.
Array<Float> is an array of references to Float objects. It's more flexible, and interoperates with Java code that uses a Float[] type.
So you can see that these are two different types, even though they do similar things.
If you're interoperating with existing code, that will control which one you should use. If you're writing new code, then you have the choice: FloatArray is likely to be more efficient and use less memory — but Array<Float> tends to be better supported in other code (which may be able to process all the relevant types just by accepting a generic Array, instead of having to support FloatArray and IntArray and LongArray and all the others).
Some information about arrays in Kotlin is available here: https://kotlinlang.org/docs/basic-types.html#primitive-type-arrays
Kotlin also has classes that represent arrays of primitive types without boxing overhead: ByteArray, ShortArray, IntArray, and so on. These classes have no inheritance relation to the Array class, but they have the same set of methods and properties.
So FloatArray and Array<Float> are not the same; the difference is that the first has no boxing overhead.
Look at how FloatArray is declared in the documentation. It is just another class, not related to the Array<T> class at all. Sure, they represent very similar things, with the difference being that one of them would box Float values, and the other doesn't, as explained by the other answer. But from the perspective of the type system, they are totally unrelated. It's as if I declared:
class A
class B
and tried to pass an instance of A to a parameter expecting a B.
There are built-in methods to convert between these types though:
floatArrayOf(1f,2f,3f).toTypedArray() // FloatArray to Array<Float>
arrayOf(1f,2f,3f).toFloatArray() // Array<Float> to FloatArray
It's just that there is no implicit conversion between them, because these are unrelated types, unlike if you have subclasses and superclasses for example.
I'm writing some generic functions in Zig, but using GTK's C API more or less directly (no language bindings). Say I have a widget pointer that I want to cast to a window pointer. How do I determine whether the widget is in fact a window?
What I want to do is test whether the widget is also another type of widget before attempting to do the cast. If it's valid, I do the cast and return the pointer. If it isn't valid, I return null.
@Joseph-Sible-Reinstate-Monica put me on the right path here, although I had to do a bit of extra work because I'm using Zig.
Zig imports C code by translating it to Zig, and unfortunately one of the areas where this can fail is when macros are heavily involved, as is the case here. However, by looking at what the macros actually do I was able to find the functions which return a widget identifier for each given type of widget, and then just match that up with the identifier which is embedded inside the widget data structure. And it's way in there...
ptr.*.parent_instance.g_type_instance.g_class.*.g_type;
In Zig, .* means dereference the pointer, so we deref the widget pointer, get the parent_instance field, then g_type_instance, then g_class, deref that, and finally arrive at the destination. Fun. I'm glad it at least works out to a one-liner. Then match that up with (for a GtkWindow) gtk_window_get_type(). Of course it's slightly more complicated, because in that case I'd also have to check against GtkApplicationWindow and others, but at any rate I found a workable solution.
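For comparison, this is roughly what the check looks like on the plain C side. It's a sketch (the helper name as_window is mine), but g_type_check_instance_is_a() is the real GLib function that GTK's GTK_IS_WINDOW() macro ultimately calls, and because it walks the type hierarchy it also accepts subclasses such as GtkApplicationWindow, unlike an exact comparison against gtk_window_get_type():

#include <gtk/gtk.h>

/* Returns the widget as a GtkWindow*, or NULL if it isn't one. */
static GtkWindow *as_window(GtkWidget *widget)
{
    if (g_type_check_instance_is_a((GTypeInstance *) widget,
                                   gtk_window_get_type())) {
        return (GtkWindow *) widget;  /* safe: the instance is-a GtkWindow */
    }
    return NULL;
}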
Thank you all for answering.
The standard return type for functions in Windows C/C++ APIs is called HRESULT.
What does the H mean?
Result handle as stated here at MSDN Error Handling in COM
The documentation only says:
The return value of COM functions and methods is an HRESULT, which is not a handle to an object, but is a 32-bit value with several fields encoded in a single 32-bit ULONG variable.
Which seems to indicate that it stands for "handle", but is misused in this case.
Hex Result.
HRESULTs are listed in the form 0x80070005. They are numbers returned by COM/OLE calls to indicate various types of SUCCESS or FAILURE. The code itself is composed of a bit-field structure, for those who want to delve into the details.
Details of the bit field structure can be found here at Microsoft Dev Center's topic Structure of COM Error Codes and here at MSDN HRESULT Structure.
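As a small illustration, the winerror.h macros will pull those fields apart; the macros and the example value are real (0x80070005 is E_ACCESSDENIED), the program is just a sketch:

#include <stdio.h>
#include <windows.h>

int main(void)
{
    HRESULT hr = (HRESULT) 0x80070005;  /* E_ACCESSDENIED */

    /* The severity lives in the top bit, the facility in the middle bits,
     * and the error code in the low 16 bits. */
    printf("severity: %u\n", (unsigned) HRESULT_SEVERITY(hr)); /* 1 = failure */
    printf("facility: %u\n", (unsigned) HRESULT_FACILITY(hr)); /* 7 = FACILITY_WIN32 */
    printf("code:     %u\n", (unsigned) HRESULT_CODE(hr));     /* 5 = ERROR_ACCESS_DENIED */
    printf("FAILED:   %d\n", FAILED(hr) ? 1 : 0);              /* 1 */
    return 0;
}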
The H-prefix in Windows data types generally designates handle types1 (such as HBRUSH or HWND). The documentation seems to be in agreement, sort of:
The HRESULT (for result handle) is a way of returning success, warning, and error values. HRESULTs are really not handles to anything; they are only values with several fields encoded in the value.
In other words: Result handles are really not handles to anything. Clearly, things cannot possibly have been designed to be this confusing. There must be something else going on here.
Luckily, historian Raymond Chen is incessantly conserving this kind of knowledge. In the entry aptly titled Why does HRESULT begin with H when it’s not a handle to anything? he writes:
As I understand it, in the old days it really was a handle to an object that contained rich error information. For example, if the error was a cascade error, it had a link to the previous error. From the result handle, you could extract the full history of the error, from its origination, through all the functions that propagated or transformed it, until it finally reached you.
The document concludes with the following:
The COM team decided that the cost/benefit simply wasn’t worth it, so the HRESULT turned into a simple number. But the name stuck.
In summary: HRESULT values used to be handle types, but aren't handle types any more. The entire information is now encoded in the value itself.
Bonus reading:
Handle types losing their reference semantics over time is not without precedent. What is the difference between HINSTANCE and HMODULE? covers another prominent example.
1 Handle types store values where the actual value isn't meaningful by itself; it serves as a reference to other data that's private to the implementation.
In any programming environment, whatever data type I choose, the CPU will ultimately perform only arithmetic and logical operations.
How does this translation (from user-defined data types/operations to the CPU's instruction set) happen, and what are the roles of the compiler, interpreter, assembler and linker in this life cycle?
Also, how does OOP handle this mapping, since in the worst case almost everything is an object (I mean in the Java language)?
Java source --> native code translation actually happens in two distinct steps: the conversion from source code to bytecode at compile time (that's what javac does), and the conversion from bytecode to native CPU instructions at runtime (that's what java does).
When the source code is being "compiled", the fields and methods get condensed into entries in a symbol table. You say "System.out.println()", and javac turns it into something like "get the static field referenced by symbol #2004, and invoke the method referred to by symbol #300 on it" (where #2004 might be "System.out" and #300 might be "void java.io.PrintStream.println()"). (Note, I'm way oversimplifying -- the symbols look nothing like that, and they're split up a bit more. But they do contain that kind of info.)
At runtime, the JVM looks at those symbols, loads the classes referred to in them, and runs (or generates, if it's JITting) the native instructions necessary to find and execute the method. There's no real "linker" in Java; all the linking is done at runtime, based on the classes referenced. It's a lot like how DLLs work in Windows.
JIT is about the closest thing there is to an "assembler". It takes the bytecode and generates equivalent native code on the fly. The bytecode isn't in human-readable form, though, so I wouldn't normally count the translation as "assembling".
...
In languages like C and C++ (not C++/CLI), the story is quite different. All of the translation (and a good bit of linking) happens at compile time. Access to members of a struct gets converted into something like "give me the int 4 bytes from the beginning of this particular bunch of bytes". There's no flexibility there; if the struct's layout changes, generally the whole app has to be recompiled.
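A tiny C sketch of that fixed-offset access; the struct is illustrative, and offsetof is standard C:

#include <stddef.h>
#include <stdio.h>

struct Pair {
    int first;
    int second;
};

int main(void)
{
    struct Pair p = { 10, 20 };

    /* Reading p.second compiles down to "load the int located
     * offsetof(struct Pair, second) bytes past the start of p";
     * that offset is fixed at compile time. */
    printf("offset of second: %zu\n", offsetof(struct Pair, second)); /* typically 4 */
    printf("p.second = %d\n", p.second); /* 20 */
    return 0;
}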
Consider, as a starting point, a language that has only integers and floats of various sizes, plus a pointer type that lets us refer to values of those types in memory.
The correspondence between this and the machine code the CPU uses would be relatively clear (though in fact we might well optimise beyond that).
Characters we can add by storing code-points in some encoding, and strings we build as arrays of such characters.
Now let's say we want to move this to the point where we can have something like:
class User
{
    int _id;
    char* _username;

    public User(int id, char* username)
    {
        _id = id;
        _username = username;
    }

    public virtual bool IsDefaultUser()
    {
        return _id == 0;
    }
}
The first thing we need to add to our language is some sort of struct/class construct that contains the members. Then we can get as far as:
class User
{
    int _id;
    char* _username;
}
Our compiling process knows that this means storing an integer followed by a pointer to an array of characters. It therefore knows that accessing _id means accessing the integer at the address of the start of the structure, and accessing _username means accessing the pointer to char at a given offset from the address of the start of the structure.
Given this, the constructor can exist as a function that does something like:
User* _ctor_User(int id, char* username)
{
    User* toMake = ObtainMemoryForUser();
    toMake->_id = id;
    toMake->_username = ObtainMemoryAndCopyString(username);
    return toMake;
}
Obtaining memory and cleaning it up when appropriate is complicated; take a look at the sections in K&R on pointers to structures and on the malloc storage allocator for one way this could be done.
From this point we can also implement IsDefaultUser with something like:
bool _impl_IsDefaultUser(User* this)
{
    return this->_id == 0;
}
This can't be overridden though. To allow for overriding we change User to be:
class User
{
    UserVTable* _vTable;
    int _id;
    char* _username;
}
Then _vTable points at a table of pointers to functions, which in this case contains a single entry, which is a pointer to the function above. Then calling the virtual member becomes a matter of looking at the correct offset into that table, and calling the appropriate function found. A derived class would have a different _vTable that would be the same except for having different function pointers for those methods that are overridden.
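To make the v-table concrete, here is a small self-contained C sketch; all the names are illustrative, and real compilers lay this out with more care:

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct User User;

typedef struct UserVTable {
    bool (*IsDefaultUser)(User *self);
} UserVTable;

struct User {
    const UserVTable *_vTable;
    int _id;
    char *_username;
};

static bool User_IsDefaultUser(User *self) { return self->_id == 0; }
static const UserVTable userVTable = { User_IsDefaultUser };

/* A "derived" type overrides the method by pointing at a different
 * table with a different function pointer. */
static bool Admin_IsDefaultUser(User *self) { (void) self; return false; }
static const UserVTable adminVTable = { Admin_IsDefaultUser };

static User *User_new(const UserVTable *vt, int id, const char *name)
{
    User *u = malloc(sizeof *u);
    u->_vTable = vt;
    u->_id = id;
    u->_username = strdup(name);
    return u;
}

int main(void)
{
    User *guest = User_new(&userVTable, 0, "guest");
    User *admin = User_new(&adminVTable, 0, "root");

    /* A virtual call: fetch the function pointer from the object's own
     * table at a fixed offset, then call it. */
    printf("%d\n", guest->_vTable->IsDefaultUser(guest)); /* prints 1 */
    printf("%d\n", admin->_vTable->IsDefaultUser(admin)); /* prints 0 */

    free(guest->_username); free(guest);
    free(admin->_username); free(admin);
    return 0;
}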
This is glossing over an awful lot, and not the only possibility in each case (e.g. v-tables are not the only way to implement overridable methods), but does show how we can build an object-oriented language which can be compiled down to more primitive operations on more primitive data types.
It also glosses over the possibility of doing something like the way C# is compiled to IL, which is then in turn compiled to machine code, so that there are two steps between the OO language and the machine code that will actually be executed.
An Example
Suppose we have some text to write that could be converted to "uppercase or lowercase", and can be printed "at left, center or right".
Specific-case implementation (too many functions)
void writeInUpperCaseAndCentered(char *str){ /*..*/ }
void writeInLowerCaseAndCentered(char *str){ /*..*/ }
void writeInUpperCaseAndLeft(char *str){ /*..*/ }
and so on...
vs
Many-argument function (bad readability, and hard to code without a nice autocompletion IDE)
void write( char *str , int toUpper, int centered ){ /*..*/ }
vs
Context-dependent (hard to reuse, hard to code, uses ugly globals, and sometimes it's even impossible to "detect" a context)
void writeComplex (char *str)
{
    // analyze str and perhaps some global variables and
    // (under who knows what rules) put it center/left/right and upper/lowercase
}
And perhaps there are other options... (suggestions are welcome)
The question is:
Is there any good practice or experience/academic advice for this (recurrent) trilemma?
EDIT:
What I usually do is combine the "specific case" implementation with an internal (I mean not in the header) general many-argument function, implementing only the used cases and hiding the ugly code, but I don't know if there is a better way. This kind of thing makes me realize why OOP was invented.
I'd avoid your first option because, as you say, the number of functions you end up having to implement (though possibly only as macros) can grow out of control. The count doubles when you decide to add italic support, and doubles again for underline.
I'd probably avoid the second option as well. Again, consider what happens when you find it necessary to add support for italics or underlines. Now you need to add another parameter to the function, find all of the cases where you called it, and update those calls. In short, annoying, though once again you could probably simplify the process with appropriate use of macros.
That leaves the third option. You can actually get some of the benefits of the other alternatives with this by using bitflags. For example:
#define WRITE_FORMAT_LEFT 1
#define WRITE_FORMAT_RIGHT 2
#define WRITE_FORMAT_CENTER 4
#define WRITE_FORMAT_BOLD 8
#define WRITE_FORMAT_ITALIC 16
....
void write(char *string, unsigned int format)
{
    if (format & WRITE_FORMAT_LEFT)
    {
        // write left
    }
    ...
}
EDIT: To answer Greg S.
I think that the biggest improvement is that if I decide, at this point, to add support for underlined text, it takes two steps:
Add #define WRITE_FORMAT_UNDERLINE 32 to the header
Add the support for underlines in write().
At this point I can call write(..., ... | WRITE_FORMAT_UNDERLINE) wherever I like. More to the point, I don't need to modify pre-existing calls to write, which I would have to do if I added a parameter to its signature.
Another potential benefit is that it allows you do something like the following:
#define WRITE_ALERT_FORMAT (WRITE_FORMAT_CENTER | \
WRITE_FORMAT_BOLD | \
WRITE_FORMAT_ITALIC)
I prefer the argument way. There's going to be some code that all the different scenarios need to use, and making a separate function out of each scenario will produce code duplication, which is bad.
Instead of using an argument for each different case (toUpper, centered etc..), use a struct. If you need to add more cases then you only need to alter the struct:
typedef struct {
    int toUpper;
    int centered;
    // etc...
} cases;

void write( char *str , cases c ){ /*..*/ }
I'd go for a combination of methods 1 and 2.
Code a method (A) that has all the arguments you need/can think of right now and a "bare" version (B) with no extra arguments. This version can call the first method with the default values. If your language supports it add default arguments. I'd also recommend that you use meaningful names for your arguments and, where possible, enumerations rather than magic numbers or a series of true/false flags. This will make it far easier to read your code and what values are actually being passed without having to look up the method definition.
This gives you a limited set of methods to maintain and 90% of your usages will be the basic method.
If you need to extend the functionality later add a new method with the new arguments and modify (A) to call this. You might want to modify (B) to call this as well, but it's not necessary.
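A quick C sketch of that pairing (the names and default values are illustrative):

/* (A) the full version, with all the arguments you can think of right now */
void write_full(const char *str, int to_upper, int centered)
{
    /* ... actual output logic ... */
    (void) str; (void) to_upper; (void) centered;
}

/* (B) the bare version simply forwards sensible defaults, so the 90%
 * of call sites that want plain output stay short */
void write_basic(const char *str)
{
    write_full(str, 0, 0);
}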
I've run into exactly this situation a number of times -- my preference is none of the above, but instead to use a single formatter object. I can supply it with the number of arguments necessary to specify a particular format.
One major advantage of this is that I can create objects that specify logical formats instead of physical formats. This allows, for example, something like:
Format title = {upper_case, centered, bold};
Format body = {lower_case, left, normal};
write(title, "This is the title");
write(body, "This is some plain text");
Decoupling the logical format from the physical format gives you roughly the same kind of capabilities as a style sheet. If you want to change all your titles from italic to bold-face, change your body style from left justified to fully justified, etc., it becomes relatively easy to do that. With your current code, you're likely to end up searching through all your code and examining "by hand" to figure out whether a particular lower-case, left-justified item is body-text that you want to re-format, or a foot-note that you want to leave alone...
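Here's a minimal C sketch of this formatter-object idea, with hypothetical enum/struct names; only the casing is actually implemented, to keep it short:

#include <ctype.h>
#include <stdio.h>

typedef enum { LOWER_CASE, UPPER_CASE } Casing;
typedef enum { LEFT, CENTERED, RIGHT } Alignment;
typedef enum { NORMAL, BOLD } Weight;

typedef struct { Casing casing; Alignment align; Weight weight; } Format;

/* Logical styles, defined in one place like a style sheet. */
static const Format title = { UPPER_CASE, CENTERED, BOLD };
static const Format body  = { LOWER_CASE, LEFT,     NORMAL };

static void write_fmt(Format f, const char *str)
{
    /* Real code would honour f.align and f.weight as well. */
    for (; *str; str++) {
        putchar(f.casing == UPPER_CASE ? toupper((unsigned char) *str)
                                       : tolower((unsigned char) *str));
    }
    putchar('\n');
}

int main(void)
{
    write_fmt(title, "This is the title");
    write_fmt(body, "This is some plain text");
    return 0;
}

Changing every title from bold to something else then means touching one Format definition rather than every call site.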
As you already mentioned, one striking point is readability: writeInUpperCaseAndCentered("Foobar!") is much easier to understand than write("Foobar!", true, true), although you could eliminate that problem by using enumerations. On the other hand, having arguments avoids awkward constructions like:
if(foo)
    writeInUpperCaseAndCentered("Foobar!");
else if(bar)
    writeInLowerCaseAndCentered("Foobar!");
else
    ...
In my humble opinion, this is a very strong argument (no pun intended) for the argument way.
I suggest more cohesive functions as opposed to superfunctions that can do all kinds of things unless a superfunction is really called for (printf would have been quite awkward if it only printed one type at a time). Signature redundancy should generally not be considered redundant code. Technically speaking it is more code, but you should focus more on eliminating logical redundancies in your code. The result is code that's much easier to maintain with very concise, well-defined behavior. Think of this as the ideal when it seems redundant to write/use multiple functions.