`System.CopyArray` vs `System.Copy`? - arrays

Apart from the obvious difference that the first deals only with arrays, don't they do the same thing? I keep reading the help pages for the two functions, and can't understand when should I use one over the other and why. Internet search seems to indicate only the second is used whatsoever for array copying purposes if not written using loops.

System.Copy is really compiler magic. It's applicable both to strings and to dynamic arrays.
Compiler chooses needed version of intrinsic routine (different for short strings, long strings, dynamic arrays) and substitutes your call of Copy.
For dynamic arrays _DynArrayCopyRange prepares memory, provides reference counting, and calls System.CopyArray for deep copy of elements.
Usually you don't need to call the last procedure expicitly, it is compiler prerogative.

Related

Difference between Array and List in Vala

As stated in the title, what is major difference between GLib.Array and GLib.List in Vala language? They look very similar in the tutorial pages.
GLib.Array is an array. GLib.List is a (doubly linked) list. Here is an article comparing (singly linked) lists and arrays: https://www.geeksforgeeks.org/linked-list-vs-array/
It's also worth mentioning that with GLib.List (and GLib.SList), the pointer actually points to the head of the list. That means that if you're storing a reference to the list in one place and modifying the list somewhere else you'll likely end up with some memory management problems. Unless you understand what's going on under the hood you're probably much better off using Gee.List (or one of the other data structures in libgee).
In Vala you'll generally want to use GLib.GenericArray instead of GLib.Array, unless you're interfacing with something which requires a GLib.Array. GLib.Array just doesn't work quite as well with how generics work in Vala; GArray is actually a bit weird in C as well, and just doesn't translate very well to Vala. GLib.GenericArray is an alternate binding for GLib.PtrArray (which is a GPtrArray at the C level, not a GArray).

Tcl String function that does "shimmering" ruined my customized tcl type defined in c

I have defined a customized tcl type using tcl library in c/c++. I basically make the Tcl_Obj.internalRep.otherValuePtr point to my own data structure. The problem happens by calling [string length myVar] or other similar string functions that does so called shimmering behaviour which replace my internalRep with it's own string structure. So that after the string series tcl function, myVar cannot convert back! because it's a complicate data structure cannot be converted back from the Tcl_Obj.bytes representation plus the type is no longer my customized type. How can I avoid that.
The string length command converts the internal representation of the values it is given to the special string type, which records information to allow many string operations to be performed rapidly. Apart from most of the string command's various subcommands, the regexp and regsub commands are the main ones that do this (for their string-to-match-the-RE-against argument). If you have a precious internal representation of your own and do not wish to lose it, you should avoid those commands; there are some operations that avoid the trouble. (Tcl mostly assumes that internal representations are not fragility, and therefore that they can be regenerated on demand. Beware when using fragility!)
The key operations that are mostly safe (as in they generate the bytes/length rep through calling the updateStringProc if needed, but don't clear the internal rep) are:
substitution into a string; the substituted value won't have the internal rep, but it will still be in the original object.
comparison with the eq and ne expression operators. This is particularly relevant for checks to see if the value is the empty string.
Be aware that there are many other operations that spoil the internal representation in other ways, but most don't catch people out so much.
[EDIT — far too long for a comment]: There are a number of relatively well-known extensions that work this way (e.g., TCOM and Tcl/Java both do this). The only thing you can really do is “be careful” as the values really are fragile. For example, put them in an array and then pass the indexes into the array around instead, as those need not be fragile. Or keep things as elements in a list (probably in a global variable) and pass around the list indices; those are just plain old numbers.
The traditional, robust approach is to put a map (e.g., a Tcl_HashTable or std::map) in your C or C++ code and have the indices into that be short strings with not too much meaning (I like to use the name of the type of value followed by either a sequence number or a serialisation of the pointer, such as you might get with the %p conversion in sprintf(); the printed pointer reveals more of the implementation details, is a little more helpful if you're debugging, and generally doesn't actually make that much difference in practice). You then have the removal of things from the map be an explicit deletion operation, and it is also easy to provide operations like listing all the known current values. This is safe, but prone to “leaking” (though it's not formally a memory leak if you provide the listing operation). It can be accelerated by caching the lookup in a Tcl_Obj*'s internal representation (a cheap way to handle deletion is to use a sequence number that you increment when you delete something, and only bypass the map lookup if the sequence number that you cache in the intrep is equal to the main sequence number) but it's not usually a big deal; only go to that sort of thing if you've measured a bottleneck in the lookups.
But I'd probably just live with fragility in my own code, and would just take care to ensure that I never bust the assumptions. The problem is really that you're being incautious about how you use the values; the Tcl code should just pass them around and nothing else really. Also, I've experimented a fair bit with wrapping such things up inside a TclOO object; it's far too heavyweight (by the design of TclOO) for values that you're making a lot of, but if you've only got a few of them and you're wanting to treat them as objects with methods, this can work very well indeed (and gives many more options for automatic cleanup).

Why do C written libraries use so many structs?

I've looked to some open source Libraries in some places. And, I've realized which that Libraries are basically a great stack of structs. I've seen few methods.
Why does C written libraries uses so much structs? What's the basis behind this? This, for me, looked like a attempt to simulate object orientation, 'cause a fast searching told me that each struct is "instantiated" by the using program to make something, per example, in some Desktop enviroments for linux that I've seen that each window was a struct in the used GUI library.
Anyway, the question is that.
Structs are a great way to organize data. And data is fundamental, as Fred Brooks knew decades ago:
Show me your flowcharts and conceal your tables, and I shall continue
to be mystified. Show me your tables, and I won't usually need your
flowcharts; they'll be obvious.
Object-oriented programming doesn't have to be merely simulated in C, it can be realized. For example, did you know that inside your structs you can store function pointers which operate on those same structs, and then you are a little bit closer to C++'s classes?
Also consider extensibility: even a function taking many arguments may be improved by taking a single struct, because then its signature does not need to change when a new argument is added.
Finally, C does not have multiple return values from a single function call. But it can return a struct, which is about the same thing. C is a lot about building your own tools from the raw language, and being able to stash a bunch of related data and/or functions together in one place is a good building block.
With or without object orientation, structures are a useful way to group aggregate data into a single symbol. You can copy the structure wherever you like without having to write out all the members each time, and this makes the structure easier to change if you have to.
It also makes it easier to reference certain members using pointer arithmetic, if you're careful (see sockaddr).
Same argument as with arrays.
Simply put, there's no reason not to use structures.
Structures are useful while retrieving data using a pointer. Because single pointer is enough for complete bunch of data with in a structure.
One, it keeps the APIs clean. Instead of passing N separate arguments to a function, you pass a single argument containing N members.
Two, it allows the library to hide implementation details from the programmer. For example, the C FILE type abstracts away some details of stream I/O, details which vary from implementation to implementation. We don't need to know those details, so they're not exposed to us; we just use the FILE type to pass that information around.

How does the Length() function in Delphi work?

In other languages like C++, you have to keep track of the array length yourself - how does Delphi know the length of my array? Is there an internal, hidden integer?
Is it better, for performance-critical parts, to not use Length() but a direct integer managed by me?
There are three kinds of arrays, and Length works differently for each:
Dynamic arrays: These are implemented as pointers. The pointer points to the first array element, but "behind" that element (at a negative offset from the start of the array) are two extra integer values that represent the array's length and reference count. Length reads that value. This is the same as for the string type.
Static arrays: The compiler knows the length of the array, so Length is a compile-time constant.
Open arrays: The length of an open array parameter is passed as a separate parameter. The compiler knows where to find that parameter, so it replaces Length with that a read of that parameter's value.
Don't forget that the layout of dynamic arrays and the like would change in a 64-bit version of Delphi, so any code that relies on finding the length at a particular offset would break.
I advise just using Length(). If you're working with it in a loop, you might want to cache it, but don't forget that a for loop already caches the terminating bounds of the loop.
Yes, there are in fact two additional fields with dynamic arrays. First is the number of elements in the array at -4 bytes offset to the first element, and at -8 bytes offset there's the reference count. See Rudy's article for a detailed explanation.
For the second question, you'd have to use SetLength for sizing dynamic arrays, so the internal 'length' field would be available anyway. I don't see much use for additional size tracking.
Since Rob Kennedy gave such a good answer to the first part of your question, I'll just address the second one:
Is it better, for performance-critical parts, to not use Length() but a direct integer managed by me?
Absolutely not. First, as Rob mentioned, the compiler does it's thing to access the information extremely quickly, either by reading a fixed offset before the start of the array in the case of dynamic ones, using a compile-time constant in the case of static ones, and passing a hidden parameter in the case of open arrays, you're not going to gain any improvement in performance.
Secondly, the direct integer managed by you wouldn't be any faster, but would actually use more memory (an additional integer allocated along with the one Delphi already provides for dynamic and open arrays, and an extra integer entirely in the case of static arrays).
Even if you directly read the value Delphi stores already for dynamic arrays, you wouldn't gain any performance over Length(), and would risk your code breaking if the internal representation of that hidden header for arrays changes in the future.
Is there an internal, hidden integer
Yes.
to not use Length() but a direct integer managed by me?
Doesn't matter.
See Dynamic arrays item in Addressing pointers article by Rudy Velthuis.
P.S. You can also hit F1 button.

Avoiding string copying in Lua

Say I have a C program which wants to call a very simple Lua function with two strings (let's say two comma separated lists, returning true if the lists intersect at all, false if not).
The obvious way to do this is to push them onto the stack with lua_pushstring, which works fine, however, from the doc it looks like lua_pushstring but makes a copy of the string for Lua to work with.
That means that to cross over to the Lua function is going to require two string copies which I could avoid by rewriting the Lua function in C. Is there any way to arrange things so that the existing C strings could be reused on the Lua side for the sake of performance (or would the strcpy cost pale into insignificance anyway)?
From my investigation so far (my first couple of hours looking seriously at Lua), lite userdata seems like the sort of thing I want, but in the form of a string.
No. You cannot forbid Lua making a copy of the string when you call lua_pushstring().
Reason: Unless, the internal garbage collector would not be able to free unused memory (like your 2 input strings).
Even if you use the light user data functionality (which would be an overkill in this case), you would have to use lua_pushstring() later, when the Lua program asks for the string.
Hmm.. You could certainly write some C functions so that the work is being done on the C side, but as the other answer points out, you might get stuck pushing the string or sections of it in anyways.
Of note: Lua only stores strings once when they are brought in. i.e.: if I have 1 string containing "The quick brown fox jumps over the lazy dog" and I push it into Lua, and there are no other string objects that contain that string, it will make a new copy of it. If, on the other hand, I've already inserted it, you'll just get a pointer to that first string so equality checks are pretty cheap. Importing those strings can be a little expensive, I would guess, if this is done at a high frequency, however comparisons, again, are cheap.
I would try profiling what you're implementing and see if the performance is up to your expectations or not.
As the Lua Performance Tips document (which I recommend reading if you are thinking about maximizing performance with Lua), the two programming maxims related to optimizing are:
Rule #1: Don’t do it.
Rule #2: Don’t do it yet. (for experts only)

Resources