How to push a table onto the Lua stack in C

Please explain: I need to push the equivalent of this table onto the Lua stack from C:
table_ =
{
    param_1 = "Y1234",
    param_2 = "XXX",
    param_3 = "M"
}
I found the "lua_createtable" function:
https://www.lua.org/manual/5.3... a_newtable
void lua_createtable (lua_State *L, int narr, int nrec);
Creates a new empty table and pushes it onto the stack. Parameter
narr is a hint for how many elements the table will have as a sequence; parameter nrec is a hint for how many OTHER elements
the table will have. Lua may use these hints to preallocate memory for
the new table. This preallocation is useful for performance when you
know in advance how many elements the table will have. Otherwise you
can use the function lua_newtable.
I can't understand the parameters narr and nrec.
narr is how many elements the table will have as a sequence. What is "as a sequence"? What sequence?
nrec is a hint about how many OTHER elements the table will have. What does "other" mean? Which others?
And it is not clear: after I create an empty table and it is pushed onto the stack, how do I fill it with key-value pairs?

From same manual:
A table with exactly one border is called a sequence. For instance, the table {10, 20, 30, 40, 50} is a sequence, as it has only one border (5). The table {10, 20, 30, nil, 50} has two borders (3 and 5), and therefore it is not a sequence. The table {nil, 20, 30, nil, nil, 60, nil} has three borders (0, 3, and 6), so it is not a sequence, too. The table {} is a sequence with border 0. Note that non-natural keys do not interfere with whether a table is a sequence.
As for filling that table, you use lua_setfield (manual), so you can do something like this:
lua_createtable(L, 0, 2);               /* narr = 0, nrec = 2: two non-sequence fields */
lua_pushinteger(L, some_number_value);  /* push the value first... */
lua_setfield(L, -2, "field_1");         /* ...then table["field_1"] = value; pops the value */
lua_pushstring(L, some_string_value);
lua_setfield(L, -2, "field_2");         /* -2 is the table, sitting one below the value */

Related

pyspark: From an array of structs, extract a scalar from the struct for even or odd index, then postprocess the array

I have a dataframe row that contains an ArrayType column named moves. moves is an array of StructType with a few fields in it. I can use a dotpath to fetch just one field from that struct and make a new array of just that field e.g. .select(df.moves.other) will create a same-length array as moves but only with the values of the other field. This is the result:
[null, [{null, null, [0:10:00]}], null, null, [{null, null, [0:10:00]}], [{null, null, [0:09:57]}], [{null, null, [0:09:56]}], [{null, null, [0:09:54]}], ...
So clearly other is not simple. Each element in the array is either null (idx 0, 2, and 3 above) when 'other' is not present in the struct (which is permitted), or an array of structs, where each struct contains a field clk that is itself an array. (Note that plain Spark output does not list the field names, just the values; the nulls inside the struct are unset fields.) This is a two-player alternating move sequence; we need to do two things:
Extract the even idx elements and the odd idx elements.
From each, "simplify" the array so that each entry is either null or the value of the zeroth entry of the clk field.
This is the target:
even list: [null, null, "0:10:00", "0:09:56", ...
odd list: ["0:10:00", null, "0:09:57", ...
Lastly, we wish to walk these arrays (individually) and compute the delta time (n+1 - n) iff both n+1 and n are not null.
This is fairly straightforward in "regular" Python using slicing, e.g. [::2] for evens and [1::2] for odds, plus map and list comprehensions, etc. But I cannot seem to assemble the right functions in pyspark to create the simplified arrays (forget about converting 0:10:00 to something for the moment). For example, unlike regular Python, pyspark's slice does not accept a step argument, and pyspark needs more conditional logic around nulls. transform is promising, but I cannot get it to skip entries to arrive at a shorter list.
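For what it's worth, the plain-Python version would look roughly like this (a sketch, assuming the move entries are shaped like the sample above and the clock strings were already converted to seconds):
# evens by slicing, then simplify each entry to the first clk value (or None)
evens = moves[::2]
simplified = [m[0]["clk"][0] if m is not None else None for m in evens]
# delta to the next neighbour, only where both values are present
deltas = [b - a if a is not None and b is not None else None
          for a, b in zip(simplified, simplified[1:])]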
I tried going the other direction with a UDF. To start, my UDF returned the array that was passed to it:
moves_udf = F.udf(lambda z: z, ArrayType(StructType()))
df.select( moves_udf(df.moves.other) )
But this yielded a grim exception, possibly because the other array contains nulls:
py4j.protocol.Py4JJavaError: An error occurred while calling o55.showString.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1) (192.168.0.5 executor driver): net.razorvine.pickle.PickleException: expected zero arguments for construction of ClassDict (for pyspark.sql.types._create_row)
at net.razorvine.pickle.objects.ClassDictConstructor.construct(ClassDictConstructor.java:23)
at net.razorvine.pickle.Unpickler.load_reduce(Unpickler.java:773)
...
I know the UDF machinery works for simple scalars. I tested a toUpper() function on a different column and the UDF worked fine.
Almost all of the other move data is much more "SPARK friendly". It is the other field and the array-of-array substructure that is vexing.
Any guidance most appreciated.
P.S. All my pyspark logic is pipelined functions; no SQL. I would greatly prefer not to mix and match.
The trick is to use the higher-order functions filter and transform as a general loop, exploiting the binary form of the lambda, which also passes the current index into the array. Here is a solution:
# Translation:
# select 1: Get only the even moves and call the output a "temporary" column 'X'
# select 2: X will look like this: [[{null,null,["0:03:00"]}], null, [{null,null,["0:02:57"]}], ...]
#           This is because dfg.moves.a is an array in the moves
#           array. In this example, we do not further filter on the
#           entries in the 'a' array; we know we want a[0] (the first one).
#           We just want ["0:03:00", null, "0:02:57", ...].
#           x.clk will get the clk field, but the value there is *also*
#           an array, so we must use subscript [0] *twice* to dig through
#           to the string we seek.
# select 3: Convert all the "h:mm:ss" strings into timestamps
# select 4: Walk the deltas and return the diff to the next neighbor.
#           The last entry is always null.
dfx = dfg\
    .select( F.filter(dfg.moves.a, lambda x,i: i % 2 == 0).alias('X'))\
    .select( F.transform(F.col('X'), lambda x: x.clk[0][0]).alias('X'))\
    .select( F.transform(F.col('X'), lambda x: F.to_timestamp(x, "H:m:s").cast("long")).alias('X'))\
    .select( F.transform(F.col('X'), lambda x,i: x - F.col('X')[i+1]).alias('delta'))
dfx.show(truncate=False)
dfx.printSchema()
+---------------------------------------------------------------+
|delta |
+---------------------------------------------------------------+
|[2, 1, 2, 1, 4, 9, 16, 0, 6, 3, 8, 5, 2, 12, 4, 4, 10, 0, null]|
+---------------------------------------------------------------+
root
|-- delta: array (nullable = true)
| |-- element: long (containsNull = true)
If you want to compactify it, you can:
dfx = dfg\
    .select( F.transform(F.filter(dfg.moves.a, lambda x,i: i % 2 == 0), lambda x: F.to_timestamp(x.clk[0][0], "H:m:s").cast("long")).alias('X') )\
    .select( F.transform(F.col('X'), lambda x,i: x - F.col('X')[i+1]).alias('delta'))

Find overlapping ranges between two int arrays and split/insert them

There are two arrays, each of which will always contain an even (though not necessarily equal) number of integers, so that each pair forms a range, e.g. 1..5, 8..12, etc.
var defaultArray: [Int] = [1, 5, 8, 12]
var priorityArray: [Int] = [1, 3, 5, 10, 13, 20]
What I'm looking for is a generic algorithm that will find each occurrence of where a range from priorityArray overlaps a range from defaultArray and will insert the priorityRange into the defaultArray while splitting the defaultRange apart if necessary.
The goal is to have a combined array of ranges while maintaining their original "types" like so:
var result: [Int] = [
    1, 3,   // priority
    3, 5,   // default
    5, 10,  // priority
    10, 12, // default
    13, 20  // priority
]
I'll use a simple struct to illustrate the final desired result:
var result: [Range] = [
    Range(from: 1, until: 3, key: "priority"),
    Range(from: 3, until: 5, key: "default"),
    Range(from: 5, until: 10, key: "priority"),
    Range(from: 10, until: 12, key: "default"),
    Range(from: 13, until: 20, key: "priority")
]
We start with those arrays def and prio and first check that the intervals themselves are sorted with respect to their start/end points; each array then contains its smallest number in the first position. Also ensure the arrays are simple/correct (= no overlapping intervals within one array); if they are not, you can simplify/sanitise them first.
We then initialise
array index d=0 to index the def array
array index p=0 to index the prio array
a new array result to hold all your newly created intervals.
a variable s=none to hold the current status
We now determine the relation between def[d] and prio[p].
If def[d] < prio[p], we set t=def[d], increment d and set s=def.
If def[d] > prio[p], we set t=prio[p], increment p and set s=prio.
If they are equal, we set t=prio[p], increment p and d, and set s=both.
We can now initialise a new entry for the result array with start=t. Its key is either default (if s==def) or priority (if s was prio or both). To determine the end, you can again compare def[d] with prio[p]. At this point, you should adjust s again, taking care to track the proper state you are in (going from both to def, prio or none, depending on the relation between def[d] and prio[p]). As mentioned in the comments on the OP, the different possibilities might require more clarification, but you should be able to encode them in the state.
Going from there, you can keep iterating and adjusting your variables until both are done (with d=len(def) and p=len(prio)). You should end up with a nice array containing all the desired consolidated intervals.
This is basically a stateful sweep through the two arrays, keeping track of the current position in the integer range and advancing one (maybe two) position(s) at a time.
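For concreteness, here is a compact Swift sketch that produces the consolidated list. It takes a shortcut relative to the two-pointer sweep described above: it keeps every priority interval as-is, subtracts the priority coverage from each default interval, and sorts the pieces. It assumes both inputs are sorted and internally non-overlapping, as required above; the Interval struct mirrors the Range struct from the question:
struct Interval {
    var from: Int
    var until: Int
    var key: String
}

func combine(defaults: [Int], priorities: [Int]) -> [Interval] {
    // turn a flat [start, end, start, end, ...] array into (start, end) pairs
    func pairs(_ a: [Int]) -> [(Int, Int)] {
        stride(from: 0, to: a.count - 1, by: 2).map { (a[$0], a[$0 + 1]) }
    }
    let prio = pairs(priorities)
    // every priority interval survives unchanged
    var result = prio.map { Interval(from: $0.0, until: $0.1, key: "priority") }
    // default intervals keep only the parts not covered by a priority interval
    for (dStart, dEnd) in pairs(defaults) {
        var s = dStart
        for (pStart, pEnd) in prio where pStart < dEnd && pEnd > s {
            if pStart > s {
                result.append(Interval(from: s, until: pStart, key: "default"))
            }
            s = max(s, pEnd)
        }
        if s < dEnd {
            result.append(Interval(from: s, until: dEnd, key: "default"))
        }
    }
    return result.sorted { $0.from < $1.from }
}

// combine(defaults: [1, 5, 8, 12], priorities: [1, 3, 5, 10, 13, 20]) yields
// (1,3) priority, (3,5) default, (5,10) priority, (10,12) default, (13,20) priority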

Given a set of arrays, find out which arrays are subset of another

I am working on a project where we have a set of arrays of the same (and fixed) length, whose entries can be Null or a number. Some arrays are "somehow" subsets of other arrays. In this problem, we define subset in another way:
Array "x" is a subset of array "y" iff for each index i in x, x[i] ≠ Null => x[i] = y[i]
For example, array [2, Null, 1, Null] is a subset of [2, 5, 1, 8]
But array [3, 2, Null, Null] isn't a subset of [3, 1, Null, Null] or [3, Null, Null, Null].
I want to find, for each array, whether it is a subset of another array (and if it is, find that array's index). It doesn't have to be the optimal answer; it is sufficient to find a maximal one.
At the moment, I have implemented an O(N^2) algorithm with two nested loops: for each array "a", I check whether "a" is a subset of any array "b" that is not itself a subset of another array. (The subset check requires iterating over all indices of the arrays, which is O(Length of arrays).) So the whole algorithm is O(N^2 * Length of arrays).
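For concreteness, the current baseline looks roughly like this (a sketch; None stands in for Null, and arrays is the list of input arrays):
def is_subset(x, y):
    # x is a subset of y iff every non-Null entry of x equals y at that index
    return all(xi is None or xi == yi for xi, yi in zip(x, y))

# two nested loops: for each array, record some superset if one exists
supersets = {}
for i, x in enumerate(arrays):
    for j, y in enumerate(arrays):
        if i != j and is_subset(x, y):
            supersets[i] = j
            break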
I have an idea that we could add the arrays one by one, checking whether each new array is a subset of the arrays added so far, and adding it to our list only if it has no superset yet.
Also, we could store the arrays in a trie whose first level groups them by their first index (those which are Null, 0, 1, etc.), whose second level groups each first-level group by their second index, and so on. When we want to find the arrays that could be supersets of an array x, we check its first index: if it isn't Null, we only check the group whose first index equals x[1]; but if x[1] = Null, we have to check all groups at this level. And we go on with this BFS-like algorithm. But I'm not sure whether this algorithm works efficiently in the worst case or the average case.
Would you please recommend me some efficient algorithms for solving this problem?

Sort a Julia 1.1 matrix by one of its columns, that contains strings

As the title suggests, I need to sort the rows of a certain matrix by one of its columns, preferably in place if at all possible. Said column contains Strings (the array being of type Array{Union{Float64,String}}), and ideally the rows should end up in alphabetical order, determined by this column. The line
sorted_rows = sort!(data, by = i -> data[i,2]),
where data is my matrix, produces the error ERROR: LoadError: UndefKeywordError: keyword argument dims not assigned. Specifying which part of the matrix I want sorted and adding the parameter dims=2 (which I assume is the dimension I want to sort along), namely
sorted_rows = sort!(data[2:end-1,:], by = i -> data[i,2],dims=2)
simply changes the error message to ERROR: LoadError: ArgumentError: invalid index: 01 Suurin yhteinen tekijä ja pienin yhteinen jaettava of type String. So the compiler is complaining about a string being an invalid index.
Any ideas on how this type of sorting could be done? I should say that in this case the strings in the column can be expected to start with a number, but I wouldn't mind finding a solution that works in the general case.
I'm using Julia 1.1.
You want sortslices, not sort — the latter just sorts all columns independently, whereas the former rearranges whole slices. Secondly, the by function doesn't take an index, it takes the value that is about to be compared (and allows you to transform it in some way). Thus:
julia> using Random
julia> data = Union{Float64, String}[randn(100) [randstring(10) for _ in 1:100]]
100×2 Array{Union{Float64, String},2}:
0.211015 "6VPQbWU5f9"
-0.292298 "HgvHLkufqI"
1.74231 "zTCu1U5Vdl"
0.195822 "O3j43sbhKV"
⋮
-0.369007 "VzFH2OpWfU"
-1.30459 "6C68G64AWg"
-1.02434 "rldaQ3e0GE"
1.61653 "vjvn1SX3FW"
julia> sortslices(data, by=x->x[2], dims=1)
100×2 Array{Union{Float64, String},2}:
0.229143 "0syMQ7AFgQ"
-0.642065 "0wUew61bI5"
1.16888 "12PUn4V4gL"
-0.266574 "1Z2ONSBP04"
⋮
1.85761 "y2DDANcFCe"
1.53337 "yZju1uQqMM"
1.74231 "zTCu1U5Vdl"
0.974607 "zdiU0sVOZt"
Unfortunately we don't have an in-place sortslices! yet, but you can easily construct a sorted view with sortperm. This probably won't be as fast to use, but if you need the in-place-ness for semantic reasons it'll do just the trick.
julia> p = sortperm(data[:,2]);
julia> @view data[p, :]
100×2 view(::Array{Union{Float64, String},2}, [26, 45, 90, 87, 6, 96, 82, 75, 12, 27 … 53, 69, 100, 93, 36, 37, 39, 8, 3, 61], :) with eltype Union{Float64, String}:
0.229143 "0syMQ7AFgQ"
-0.642065 "0wUew61bI5"
1.16888 "12PUn4V4gL"
-0.266574 "1Z2ONSBP04"
⋮
1.85761 "y2DDANcFCe"
1.53337 "yZju1uQqMM"
1.74231 "zTCu1U5Vdl"
0.974607 "zdiU0sVOZt"
(If you want the in-place-ness for performance reasons, I'd recommend using a DataFrame or similar structure that holds its columns as independent homogenous vectors — a Union{Float64, String} will be slower than two separate well-typed vectors, and sort!ing a DataFrame works on whole rows like you want.)
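If you need the original matrix itself reordered, a small sketch using the same sortperm idea is a broadcast assignment; note this is semantic in-place only, since the right-hand side still allocates a permuted copy:
julia> p = sortperm(data[:, 2]);

julia> data .= data[p, :];   # overwrite data with its rows in alphabetical order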
You may want to look at SortingLab.jl's fast string sort functions.
]add SortingLab
using SortingLab
idx = fsortperm(data[:,2])
new_data = data[idx, :]   # idx indexes rows, so keep all columns

Save integers into array given by first integer

I need to know how to save integers from stdin into the array given by the first integer on the line... Ehm... I hope you understand. I will give you an example.
On stdin I have:
0 : [ 1, 2, 3 ]
5 : [ 10, 11, 12, 13]
6 : [ 2, 4, 9 ]
0 : [ 4, 9, 8 ]
5 : [ 9, 6, 7 ]
5 : [ 1 ]
And I need save these integers to the arrays like this:
0={1, 2, 3, 4, 9, 8}
5={10, 11, 12, 13, 9, 6, 7, 1}
6={2, 4, 9}
I absolutely don't know how to do it. The problem is that the number of arrays (in this case 0, 5, 6, so 3 arrays) can be very high, and I need to work efficiently with memory... So I guess I will need something like malloc and free to solve this problem, or am I wrong? The names of the arrays (0, 5, 6) can be changed. The number of integers in brackets has no maximum limit.
Thank you for any help.
I'll go with the assumption that this is homework, and that it isn't your first homework, so I won't present you a solution but instead some tips that will help you solve it yourself.
Given the input line
5 : [ 10, 11, 12, 13]
I will call "5" the "array name" and 10, 11, 12 and 13 the values to add.
You should implement some system to map array names to indices. A trivial approach would be like this:
size_t num_arrays;
size_t * array_names;
Here, with your example input, num_arrays will end up being 3, with array_names holding { 0, 5, 6 }. If you find a new array name, realloc and append it. You also need the actual arrays for the values:
int * * array;
You need to realloc array for each new array name (just like you realloc array_names). array[0] will represent array array_names[0] (here array 0), array[1] will represent array array_names[1] (here array 5), and array[2] will represent array array_names[2] (here array 6).
To access an array, find its index like so:
size_t index;
for (index = 0; index < num_arrays && array_names[index] != search; ++index) ;
The second step is easy. Once you have the index, you use array[index] to add elements: realloc that one (array[index] = realloc(array[index], new_size)) and add the elements there: array[index][old_size + i] = new_value[i].
Obviously, you need to keep track of the number of elements in your separate arrays as well ;)
Hint: if searching for the array names takes too long, you will have to replace that trivial mapping part with some more sophisticated data structure, like a hash map or a binary search tree. The rest of the concept may stay more or less the same.
Should you have problems parsing the input lines, I suggest you open a new question specific to that parsing part.
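If it helps to see the bookkeeping in one place, here is a minimal sketch of just the find-or-create step (not a full solution). The function name find_or_create and the extra sizes array (tracking the per-array element counts mentioned above) are illustrative, and error checks are omitted for brevity:
#include <stdlib.h>

size_t num_arrays = 0;
size_t * array_names = NULL;  /* array_names[k] is the "name" of array k */
int * * array = NULL;         /* array[k] holds the values of array k */
size_t * sizes = NULL;        /* sizes[k] is the element count of array k */

/* return the index for name, growing the bookkeeping arrays if it is new */
size_t find_or_create(size_t name)
{
    size_t index;
    for (index = 0; index < num_arrays && array_names[index] != name; ++index) ;
    if (index == num_arrays) {
        array_names = realloc(array_names, (num_arrays + 1) * sizeof *array_names);
        array = realloc(array, (num_arrays + 1) * sizeof *array);
        sizes = realloc(sizes, (num_arrays + 1) * sizeof *sizes);
        array_names[index] = name;
        array[index] = NULL;  /* no values stored yet */
        sizes[index] = 0;
        ++num_arrays;
    }
    return index;
}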
In algorithmic terms, you need a map (associative array) from ints to arrays. This was solved long ago in most higher-level languages.
If you have to implement it manually, you have a few options:
simple "master" array where you store your 0, 5, 6, 1000000 and then map them to indices 0, 1, 2, 3 by doing search in for each time you have to access it (it's too time consuming when ;
a hash table: write a simple hash function to map 0, 5, 6, 1000000 (they're called keys) to values less than 1000, allocate an array of 1000 elements, and then keep the "master" array structures per hash result;
some kind of tree (e.g. a red-black tree); this may be a bit complex to implement manually.
The last two structures are classics and are well described in various articles and books.
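A sketch of the hash-table variant, just to make the bucket idea concrete (the 1000-bucket modulo hash and the names here are purely illustrative):
#define NUM_BUCKETS 1000

/* one entry per array name; entries whose names hash alike are chained */
struct entry {
    size_t name;          /* the "array name", e.g. 0, 5, 6, 1000000 */
    int * values;         /* the growing value array */
    size_t count;         /* number of values stored so far */
    struct entry * next;  /* next entry in the same bucket */
};

struct entry * buckets[NUM_BUCKETS];

/* map any key to a bucket index less than NUM_BUCKETS */
size_t hash_name(size_t name)
{
    return name % NUM_BUCKETS;
}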
