How to store an array as the value in Tokyo Cabinet? - tokyo-cabinet

Is there any way I can store an array of numbers in a Tokyo Cabinet db? For example, I have predictable arrays of values such as
1 => [1, 2, 444, 0.987],
2 => [2, 23, 123, -0.234],
3 => [3, 1, 34, 1.456]
I would like to store the above in a TC fixed length db. Is there a way to store the above as arrays instead of as strings?

Tokyo Cabinet allows arbitrary byte sequences as both key and value, so the schema is really up to you. The first step is to decide how to store each number. This could be float, double, or fixed point (e.g. BigDecimal).
Then, you decide how to serialize the array. This could be contiguous:
num => 1 2 444 0.987
The TC value is simply all the numeric values concatenated together. E.g. using 32-bit floats:
num => 0x 3f 80 00 00 40 00 00 00 43 de 00 00 3f 7c ac 08
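The contiguous layout above can be reproduced with Python's standard struct module. This is a minimal sketch, assuming big-endian 32-bit floats (the byte order the hex dump above uses); the variable names are illustrative and the resulting bytes would be stored as the Tokyo Cabinet value.

```python
import struct

values = [1.0, 2.0, 444.0, 0.987]

# Pack the array as four big-endian 32-bit floats, laid out back to back.
packed = struct.pack(">4f", *values)
print(packed.hex())  # 3f8000004000000043de00003f7cac08

# Reading the array back is the inverse operation.
restored = struct.unpack(">4f", packed)
```

Because every record has the same size (16 bytes here), this layout suits a fixed-length database. Note that 0.987 comes back only approximately, since it is not exactly representable as a 32-bit float.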
Another possibility is a linked list:
key => num next_key
1 => 1 2
2 => 2 3
3 => 444 4
4 => 0.987 0
Each stored value is the current number concatenated with the key of the next element; in the example, a next key of 0 marks the end of the list.
This provides the traditional benefits of a linked list, including easy insertion in the middle.
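The linked-list layout can be sketched in the same way. In this sketch a plain dict stands in for the database, and the record format (a 32-bit float followed by a 32-bit unsigned next key) and the helper names put_node and read_list are assumptions made for illustration, not part of Tokyo Cabinet.

```python
import struct

# Hypothetical stand-in for the database: integer key -> packed bytes.
db = {}

def put_node(key, num, next_key):
    # Assumed record layout: big-endian float32 value, then uint32 next key.
    db[key] = struct.pack(">fI", num, next_key)

def read_list(head):
    # Follow next_key links until the terminating key 0.
    out = []
    key = head
    while key != 0:
        num, key = struct.unpack(">fI", db[key])
        out.append(num)
    return out

put_node(1, 1.0, 2)
put_node(2, 2.0, 3)
put_node(3, 444.0, 4)
put_node(4, 0.987, 0)

# Insert 7.5 between the nodes under keys 2 and 3: only node 2 is rewritten.
put_node(5, 7.5, 3)
put_node(2, 2.0, 5)
items = read_list(1)
```

The trade-off is one lookup per element when reading, against a single record rewrite when inserting.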

Related

Reference for how Python handles data?

I have a list that is <class 'bytes'> that is comprised of a 16-bit PCM value of <class 'int'>. The list is the result of a direct read of a segment of a 16-bit PCM wave file. I then create a numpy array from that built up list to save it as a separate wave file for training but wavfile.write() always fails because the 16-bit PCM data is wrong somehow, such as:
wavfile.write(savepath + 'wave_speechsegment_' + str(wavecnt) + '.wav', sr, nparray.astype(np.int16)) generates a ValueError: invalid literal for int() with base 10: b'z\xfe' error
And trying nparray directly: wavfile.write(savepath + 'wave_speechsegment_' + str(wavecnt) + '.wav', sr, nparray) I get ValueError: Unsupported data type '|S2'
I try to set the list as 16-bit PCM values with:
hexval = struct.pack('<BB', val[0], val[1])
waveform.append(hexval)
nparray = np.array(waveform)
but when I save the 16-bit PCM values to the numpy file, python reports:
nparray is type: <class 'numpy.ndarray'> and nparray[0] is: b'z\xfe' and is type: <class 'numpy.bytes_'>
Saving the numpy array segment to a file produces precisely the data set found for that segment in the source wave file, such as:
7A FE DE FE C5 FF 75 00 2F 01 76 01 99 01 55 01 05 01 74 00 05 00 9D FF 79 FF 65 FF 8C FF C9 FF
Can someone point me to information about how python deals with data, so that I can keep my 16-bit PCM data as 16-bit PCM data?
In [73]: astr = b'z\xfe'
In [74]: type(astr)
Out[74]: bytes
In [75]: len(astr)
Out[75]: 2 # 2 bytes
This is not a list. It's a string, more specifically a byte string, as opposed to the default (for Python 3) unicode string.
An array created from such a string will have an S dtype:
In [76]: arr= np.array(astr)
In [77]: arr
Out[77]: array(b'z\xfe', dtype='|S2')
In [78]: arr= np.array(astr+astr+astr) # + joins strings into one
In [79]: arr
Out[79]: array(b'z\xfez\xfez\xfe', dtype='|S6')
The data buffer of the array contains those bytes, and it can be viewed as other compatible dtypes.
In [87]: arr= np.array([astr+astr+astr])
In [88]: arr
Out[88]: array([b'z\xfez\xfez\xfe'], dtype='|S6')
In [89]: arr.view('S1')
Out[89]: array([b'z', b'\xfe', b'z', b'\xfe', b'z', b'\xfe'], dtype='|S1')
In [94]: arr.view('int16')
Out[94]: array([-390, -390, -390], dtype=int16)
In [95]: arr.view('uint16')
Out[95]: array([65146, 65146, 65146], dtype=uint16)
In [98]: arr.view('>i2')
Out[98]: array([31486, 31486, 31486], dtype=int16)
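Applied to the original problem, the same reinterpretation can be done on the raw bytes directly, rather than building an array of 2-byte strings. A minimal sketch using only the standard library, assuming little-endian signed samples and taking the first bytes of the hex dump in the question as input (the variable names are illustrative):

```python
import struct

# First eight bytes of the hex dump shown in the question.
raw = bytes.fromhex("7A FE DE FE C5 FF 75 00")

# "<h" is a little-endian signed 16-bit integer: one sample per two bytes.
count = len(raw) // 2
samples = struct.unpack("<%dh" % count, raw)
print(samples)  # (-390, -290, -59, 117)
```

With NumPy, np.frombuffer(raw, dtype='<i2') should give the same numbers as an int16 array, which is the kind of data wavfile.write expects for 16-bit PCM.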

Multi-Dimensional Arrays Julia

I am new to using Julia and have little experience with the language. I am trying to understand how multi-dimensional arrays work in it and how to access the array at the different dimensions. The documentation confuses me, so maybe someone here can explain it better.
I created an array (m = Array{Int64}(6,3)) and am trying to access the different parts of that array. Clearly I am understanding it wrong so any help in general about Arrays/Multi-Dimensional Arrays would help.
Thanks
Edit I am trying to read a file in that has the contents
58 129 10
58 129 7
25 56 10
24 125 25
24 125 15
13 41 10
0
The purpose of the project is to take these fractions (58/129) and round the fractions using farey sequence. The last number in the row is what both numbers need to be below. Currently, I am not looking for help on how to do the problem, just how to create a multidimensional array with all the numbers except the last row (0). My trouble is how to put the numbers into the array after I have created it.
So I want m[0][0] = 58, so on. I'm not sure how syntax works for this and the manual is confusing. Hopefully this is enough information.
Julia's arrays are not lists-of-lists or arrays of pointers. They are a single container, with elements arranged in a rectangular shape. As such, you do not access successive dimensions with repeated indexing calls like m[j][i] — instead you use one indexing call with multiple indices: m[i, j].
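If you trim off that last 0 in your file, you can just use the built-in readdlm to load that file into a matrix (in Julia 1.0 and later, readdlm lives in the DelimitedFiles standard library, so run using DelimitedFiles first). I've copied those first six rows into my clipboard to make it a bit easier to follow here: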
julia> str = clipboard()
"58 129 10\n58 129 7\n25 56 10\n24 125 25\n24 125 15\n13 41 10"
julia> readdlm(IOBuffer(str), Int) # or readdlm("path/to/trimmed/file", Int)
6×3 Array{Int64,2}:
58 129 10
58 129 7
25 56 10
24 125 25
24 125 15
13 41 10
That's not very helpful in teaching you how Julia's arrays work, though. Constructing an array like m = Array{Int64}(6,3) (written Array{Int64}(undef, 6, 3) in Julia 1.0 and later) creates an uninitialized matrix with 18 elements arranged in 6 rows and 3 columns. It's a bit easier to see how things work if we fill it with a sensible pattern:
julia> m .= [10,20,30,40,50,60] .+ [1 2 3]
6×3 Array{Int64,2}:
11 12 13
21 22 23
31 32 33
41 42 43
51 52 53
61 62 63
This has set up the values of the array to have the row number in their tens place and the column number in the ones place. Accessing m[r,c] returns the value in m at row r and column c.
julia> m[2,3] # second row, third column
23
Now, r and c don't have to be integers — they can also be vectors of integers to select multiple rows or columns:
julia> m[[2,3,4],[1,2]] # Selects rows 2, 3, and 4 across columns 1 and 2
3×2 Array{Int64,2}:
21 22
31 32
41 42
Of course ranges like 2:4 are just vectors themselves, so you can more easily and efficiently write that example as m[2:4, 1:2]. A : by itself is a shorthand for a vector of all the indices within the dimension it indexes into:
julia> m[1, :] # the first row of all columns
3-element Array{Int64,1}:
11
12
13
julia> m[:, 1] # all rows of the first column
6-element Array{Int64,1}:
11
21
31
41
51
61
Finally, note that Julia's Array is column-major and arranged contiguously in memory. This means that if you just use one index, like m[2], you're just going to walk down that first column. As a special extension, we support what's commonly referred to as "linear indexing", where we allow that single index to span into the higher dimensions. So m[7] accesses the 7th contiguous element, wrapping around into the first row of the second column:
julia> m[5],m[6],m[7],m[8]
(51, 61, 12, 22)
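The arithmetic behind linear indexing can be spelled out explicitly. The sketch below is in Python rather than Julia and only models the index calculation for a column-major matrix with 1-based indices; the function names are made up for illustration.

```python
def linear_to_row_col(k, nrows):
    """Map a 1-based column-major linear index to a 1-based (row, col) pair."""
    col, row = divmod(k - 1, nrows)
    return row + 1, col + 1

def value_at(k, nrows=6):
    # Same pattern as the example matrix: row number in the tens place,
    # column number in the ones place.
    row, col = linear_to_row_col(k, nrows)
    return 10 * row + col

print([value_at(k) for k in (5, 6, 7, 8)])  # [51, 61, 12, 22]
```

Only the number of rows enters the calculation, which is why a single index can run past the end of the first column into the next one.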

Converting a hexadecimal array into Binary and then into Decimal

I have a sample structure which has two sets of data. The first data contains the following Hex array '00 7F 3F FF 08 FF 60 26' and then when I convert it into binary and then decimal I get a correct answer which is '0 127 63 255 8 255 96 38'.
However, I have some data arrays which are not exactly arranged as the first one, they look something like this '1 40 0 F 00 40 00 47' and when I try to convert these kind of data sets the result is inaccurate. I get something like this '64 0 64 0 71' while the expected result is '1 64 0 15 0 64 0 71'.
This is my code with a sample data:
%% Structure
a(1).Id = 118;
a(1).Data = '00 7F 3F FF 08 FF 60 26';
a(2).Id = 108;
a(2).Data = '1 40 0 F 00 40 00 47';
%% Hexadecimal (Data) --> Binary --> Decimal
Data = a(2).Data;
str = regexp(Data,' ','split');
Ind = cellfun(@length,str);
str = str(Ind==2);
%Hex to Binary
binary = hexToBinaryVector(str,8,'MSBFirst');
%Binary to Decimal
Decimal = bi2de(binary,'left-msb');
Any help will be really appreciated!
Adding 2 lines should do the trick:
str = regexp(Data,' ','split');
Ind = cellfun(@length,str);
str(Ind==1) = strcat('0',str(Ind==1) );
Ind = cellfun(@length,str);
str = str(Ind==2);
All this does is prepend a 0 to any hex string that is only 1 character long, which puts it into the expected two-digit format. You could also do this inside the cellfun call.
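For comparison, the same conversion can be sketched in Python, where the padding step is unnecessary because each token is parsed on its own. This is an illustration of the idea, not a translation of the MATLAB toolbox calls.

```python
data = "1 40 0 F 00 40 00 47"

# Split on whitespace and parse every token as a base-16 number;
# one-digit tokens such as "1" and "F" need no zero padding.
decimal = [int(token, 16) for token in data.split()]
print(decimal)  # [1, 64, 0, 15, 0, 64, 0, 71]
```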

Array contents display in pairs

I have an array for example: A=[01 255 03 122 85 107]; and I want to print the contents as
A=
FF 01
7A 03
6B 55
Basically a readout from memory. Is there a function for this in the MATLAB library? I need to do this with minimal use of loops.
Use this (the outputs shown are for single-digit input, A = [1 2 3 4 5 6 7 8]) -
str2num(num2str(fliplr(reshape(A,2,[])'),'%1d'))
Output -
ans =
21
43
65
87
If you only want to print it as characters, use it without str2num, like this -
num2str(fliplr(reshape(A,2,[])'),'%1d')
Output -
ans =
21
43
65
87
General case with zero padding -
A=[1 2 3 4 5 6 7 8 9 3] %// Input array
N = 3; %// groupings, i.e. 2 for pairs and so on
A = [A zeros(1,mod(-numel(A),N))]; %// pad with zeros only up to the next multiple of N
out = str2num(num2str(fliplr(reshape(A,N,[])'),'%1d'))
Output -
out =
321
654
987
3
Edit for hex numbers :
Ar = A(flipud(reshape(1:numel(A),2,[])))
out1 = reshape(cellstr(dec2hex(Ar))',2,[])'
out2 = [char(out1(:,1)) repmat(' ',[numel(A)/2 1]) char(out1(:,2))]
Output -
out1 =
'FF' '01'
'7A' '03'
'6B' '55'
out2 =
FF 01
7A 03
6B 55
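The pairing logic of the hex version can also be written out step by step. The sketch below is in Python rather than MATLAB and assumes an even number of elements; it swaps each pair and formats both values as two-digit uppercase hex.

```python
a = [1, 255, 3, 122, 85, 107]

# Walk the array two elements at a time and print the second element
# of each pair first, as in a little-endian memory readout.
lines = [f"{a[i + 1]:02X} {a[i]:02X}" for i in range(0, len(a), 2)]
print("\n".join(lines))
# FF 01
# 7A 03
# 6B 55
```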

Efficiently iterating successive elements of a transposed matrix (via bit operators)

Let's consider matrices which are internally represented as a 1 dimensional array.
For instance a matrix(3, 4) is really an array (of say type double) of 3*4 elements. Here is the 'memory layout' of the matrix:
00 01 02 03
04 05 06 07
08 09 10 11
As such it's very easy to iterate (row by row, left to right) over all the elements of the matrix: it's just a 32-bit integer going from 0 to 11. This is what the transpose looks like:
00 04 08
01 05 09
02 06 10
03 07 11
What is a (fast) algorithm that, taking as input a single 32-bit integer representing the i-th element of the transposed matrix (row by row, left to right), returns the index corresponding to the internal representation? By single I mean that an 'incremental' algorithm is not what I'm looking for: the function just takes as input a single 32-bit integer (plus the number of rows and columns) and outputs a single 32-bit integer. I mentioned bit-wise operators as they are likely to be the fastest way to solve the problem, but any efficient solution suffices really.
In the example above:
0 --> 0
1 --> 4
2 --> 8
3 --> 1
4 --> 5
5 --> 9
6 --> 2
...
Also, what restrictions (if any) need to be imposed on the number of rows and columns (we already have that num_row*num_col fits in a 32-bit integer) so that the algorithm is guaranteed to work?
Thank you!
As long as the dimensions remain small, you can use a constant as a lookup table. For the 3x4 example, each 4-bit nibble of a 64-bit constant holds the result for one value of x:
(0x4cd0b73a62951840 >> (x*4)) & 15
If they get slightly larger, you could split this into e.g. generating the upper and lower bits of the result separately:
((0x00fea540 >> (x*2)) & 3) | (((0x00924924 >> (x*2)) & 3) << 2)
Eventually though, the straightforward approach (one division and one remainder) will be faster.
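The answer does not spell out the straightforward approach. One reading of it, assuming row-major storage as in the question, is a single divmod followed by a multiply-add. The sketch below is in Python with illustrative function names and checks that formula against both packed lookup tables for the 3x4 example.

```python
def transposed_to_storage(i, num_rows, num_cols):
    """Storage index (row-major) of the i-th element of the transpose."""
    # The transpose has num_cols rows and num_rows columns.
    t_row, t_col = divmod(i, num_rows)
    # Element (t_row, t_col) of the transpose is (t_col, t_row) of the original.
    return t_col * num_cols + t_row

def lookup_3x4(x):
    # Packed table: one 4-bit entry per x, valid for 0 <= x < 12.
    return (0x4CD0B73A62951840 >> (x * 4)) & 15

def split_lookup_3x4(x):
    # Lower two bits and upper two bits of the result come from separate tables.
    low = (0x00FEA540 >> (x * 2)) & 3
    high = (0x00924924 >> (x * 2)) & 3
    return low | (high << 2)

mapping = [transposed_to_storage(i, 3, 4) for i in range(12)]
print(mapping)  # [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]
```

The general formula needs no restriction on the dimensions beyond the product fitting in the integer type, whereas each packed constant is tied to one specific matrix shape.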
