The end result I'm looking for is to implement T-SQL CHECKSUM in BigQuery with a JavaScript UDF. I would settle for having the C/C++ source code to translate but if someone has already done this work then I'd love to use it.
Alternatively, if someone can think of a way to create an equivalent hash code between strings stored in Microsoft SQL Server compared to those in BigQuery then that would help me too.
UPDATE: Through HABO's link in the comments I found some T-SQL source code that performs the same CHECKSUM, but I'm having difficulty converting it to JavaScript, which cannot natively handle 64-bit integers. Playing with some small examples, I have found that the algorithm works on the low nibble of each byte only.
UPDATE 2: I got really curious about replicating this algorithm and I can see some definite patterns but my brain isn't up to the task of distilling that into a reverse engineered solution. I did find that BINARY_CHECKSUM() and CHECKSUM() return different things so the work done on the former didn't help me with the latter.
I spent the day reverse engineering this by first dumping all results for single ASCII characters as well as pairs. This showed that each character has its own distinct "XOR code" and letters have the same one regardless of case. The algorithm was remarkably simple to figure out after that: rotate 4 bits left and xor by the code stored in a lookup table.
var xorcodes = [
0, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 26, 27, 28, 29, 30, 31,
0, 33, 34, 35, 36, 37, 38, 39, // !"#$%&'
40, 41, 42, 43, 44, 45, 46, 47, // ()*+,-./
132, 133, 134, 135, 136, 137, 138, 139, // 01234567
140, 141, 48, 49, 50, 51, 52, 53, 54, // 89:;<=>?@
142, 143, 144, 145, 146, 147, 148, 149, // ABCDEFGH
150, 151, 152, 153, 154, 155, 156, 157, // IJKLMNOP
158, 159, 160, 161, 162, 163, 164, 165, // QRSTUVWX
166, 167, 55, 56, 57, 58, 59, 60, // YZ[\]^_`
142, 143, 144, 145, 146, 147, 148, 149, // abcdefgh
150, 151, 152, 153, 154, 155, 156, 157, // ijklmnop
158, 159, 160, 161, 162, 163, 164, 165, // qrstuvwx
166, 167, 61, 62, 63, 64, 65, 66, // yz{|}~
];
function rol(x, n) {
// simulate a 32-bit rotate left (>>> is a zero-fill shift, so no sign bits leak in)
return (x<<n) | (x>>>(32-n));
}
function checksum(s) {
var checksum = 0;
for (var i = 0; i < s.length; i++) {
checksum = rol(checksum, 4);
var c = s.charCodeAt(i);
var xorcode = 0;
if (c < xorcodes.length) {
xorcode = xorcodes[c];
}
checksum ^= xorcode;
}
return checksum;
};
See https://github.com/neilodonuts/tsql-checksum-javascript for more info.
DISCLAIMER: I've only worked on compatibility with VARCHAR strings in SQL Server with collation set to SQL_Latin1_General_CP1_CI_AS. This won't work with multiple columns or integers but I'm sure the underlying algorithm uses the same codes so it wouldn't be hard to figure out. It also seems to differ from db<>fiddle possibly due to collation: https://github.com/neilodonuts/tsql-checksum-javascript/blob/master/data/dbfiddle-differences.png ... mileage may vary!
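For readers who want to experiment outside JavaScript, here is a rough Python sketch of the same rotate-and-XOR loop. It rebuilds the lookup table from the ranges visible in the array above rather than listing every entry, and the names `build_xorcodes`, `rol32` and `checksum` are only illustrative. Python integers are unbounded, so the sketch masks to 32 bits and converts to a signed value at the end. The results follow from the table alone and have not been checked against SQL Server.

```python
def build_xorcodes():
    """Rebuild the 129-entry table from the ranges seen in the JavaScript array."""
    codes = list(range(129))          # control chars and !"#...,-./ map to themselves
    codes[32] = 0                     # space contributes nothing
    for i in range(10):               # digits 0-9 -> 132..141
        codes[ord("0") + i] = 132 + i
    for i in range(26):               # letters -> 142..167, same code for both cases
        codes[ord("A") + i] = 142 + i
        codes[ord("a") + i] = 142 + i
    for i in range(7):                # : ; < = > ? @ -> 48..54
        codes[58 + i] = 48 + i
    for i in range(6):                # [ \ ] ^ _ ` -> 55..60
        codes[91 + i] = 55 + i
    for i in range(6):                # { | } ~ and the next two entries -> 61..66
        codes[123 + i] = 61 + i
    return codes


XORCODES = build_xorcodes()


def rol32(x, n):
    """Rotate a 32-bit value left by n bits."""
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def checksum(s):
    """Rotate-and-XOR checksum; returns a signed 32-bit integer like the JS version."""
    h = 0
    for ch in s:
        h = rol32(h, 4)
        c = ord(ch)
        if c < len(XORCODES):         # characters outside the table XOR with 0
            h ^= XORCODES[c]
    return h - (1 << 32) if h & 0x80000000 else h


if __name__ == "__main__":
    for text in ("a", "A", "ab", "Hello", "HELLO"):
        print(repr(text), checksum(text))
```

With this table, `checksum("a")` and `checksum("A")` both give 142, which matches the case-insensitive behaviour described above.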
FYI, for those of you who are stuck in T-SQL legacy mode, here's a C# implementation that I tested; it looks good for most of the strings/ints that I've been working with:
public static int[] xorcodes = {
0, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 26, 27, 28, 29, 30, 31,
0, 33, 34, 35, 36, 37, 38, 39, // !"#$%&'
40, 41, 42, 43, 44, 45, 46, 47, // ()*+,-./
132, 133, 134, 135, 136, 137, 138, 139, // 01234567
140, 141, 48, 49, 50, 51, 52, 53, 54, // 89:;<=>?@
142, 143, 144, 145, 146, 147, 148, 149, // ABCDEFGH
150, 151, 152, 153, 154, 155, 156, 157, // IJKLMNOP
158, 159, 160, 161, 162, 163, 164, 165, // QRSTUVWX
166, 167, 55, 56, 57, 58, 59, 60, // YZ[\]^_`
142, 143, 144, 145, 146, 147, 148, 149, // abcdefgh
150, 151, 152, 153, 154, 155, 156, 157, // ijklmnop
158, 159, 160, 161, 162, 163, 164, 165, // qrstuvwx
166, 167, 61, 62, 63, 64, 65, 66, // yz{|}~
};
public static int rol(int x, int n) {
// simulate a 32-bit rotate left (the uint cast makes >> a logical shift, so no sign bits leak in)
return ((int)x << n) | ((int)((uint)x >> (32 - n)));
}
public static int checksum(string s) {
int checksum = 0;
for (var i = 0; i < s.Length; i++) {
checksum = rol(checksum, 4);
var c = ((int)s[i]);
int xorcode = 0;
if (c < xorcodes.Length) {
xorcode = xorcodes[c];
}
checksum ^= xorcode;
}
return checksum;
}
I have implemented both a randomized quicksort and a tail-recursive quicksort in Go and logged the running times. I found that the tail-recursive quicksort takes more time to sort the array. The input array lengths are 250, 500, 750, 1000, 1250, 1500, 1750, 2000, 2250, 2500.
Below are my Go implementations.
Randomized Quicksort:
// this method will sort the array and place the pivot at the correct position.
// we will then run another randomizedquicksort on the partitioned array.
// It takes both float and int arrays as input
func randomizedquicksort(arr []interface{}, left int, right int) {
if left < right {
pivot := randomizedPartition(arr, left, right)
randomizedquicksort(arr, left, pivot-1)
randomizedquicksort(arr, pivot+1, right)
}
}
func partition(a []interface{}, left int, right int) int {
pivot := a[right]
i := left - 1
for j := left; j <= right-1; j++ {
switch piv := pivot.(type) {
case float64:
if a[j].(float64) < piv {
i++
//swap
a[i], a[j] = a[j], a[i]
}
case int:
if a[j].(int) < piv {
i++
//swap
a[i], a[j] = a[j], a[i]
}
}
}
//swap pivot with pth index
a[right], a[i+1] = a[i+1], a[right]
return i + 1
}
Tail Recursion Quicksort:
// this method will sort the array and place the pivot at the correct position.
// we will then run another quicksort on the partitioned array.
// The last thing we do is the recursive call (tail recursion)
func tailRecursivequicksort(a []interface{}, left int, right int) {
for left < right {
pi := tailPartition(a, left, right)
tailRecursivequicksort(a, left, right-1)
left = pi + 1
}
}
// this method will partition the array around pivot and return pivot's index
func tailPartition(a []interface{}, left int, right int) int {
pivot := a[right]
p := left - 1
for i := left; i < right; i++ {
switch piv := pivot.(type) {
case float64:
// if element is found lower than pivot swap it with pth element
if a[i].(float64) <= piv {
//swap
p++
a[p], a[i] = a[i], a[p]
}
case int:
// if element is found lower than pivot swap it with pth element
if a[i].(int) <= piv {
//swap
p++
a[p], a[i] = a[i], a[p]
}
}
}
//swap pivot with pth index
a[right], a[p+1] = a[p+1], a[right]
return p + 1
}
One sample unit test case for the tail recursion quicksort:
package main
import (
"fmt"
"testing"
"time"
)
// 250 numbers
func TestTailRecursionQuickSort1(t *testing.T) {
startTime := time.Now()
actualArray := []interface{}{123, 39, 2, 198, 236, 5, 214, 195, 100, 86, 162, 16, 233, 34, 197, 209, 173, 174, 238, 75, 6, 12, 191, 4, 44, 108, 85, 72, 216, 210, 248, 152, 226, 155, 38, 103, 45, 136, 206, 19, 181, 107, 81, 133, 118, 35, 190, 154, 212, 193, 232, 106, 196, 43, 243, 63, 245, 165, 60, 124, 36, 235, 137, 176, 228, 234, 183, 22, 187, 128, 142, 42, 29, 224, 131, 112, 110, 117, 217, 98, 178, 13, 74, 146, 122, 109, 1, 121, 78, 229, 46, 127, 150, 114, 28, 95, 8, 237, 32, 207, 166, 227, 144, 120, 15, 17, 94, 151, 47, 88, 247, 192, 82, 230, 31, 41, 138, 56, 21, 97, 53, 164, 126, 30, 67, 91, 66, 105, 71, 148, 125, 10, 218, 99, 203, 25, 119, 40, 250, 246, 153, 51, 84, 102, 186, 33, 37, 93, 104, 68, 18, 50, 139, 80, 205, 199, 20, 57, 27, 249, 145, 223, 168, 83, 140, 90, 201, 23, 184, 221, 156, 163, 202, 204, 157, 175, 241, 219, 116, 54, 149, 129, 194, 49, 64, 167, 211, 62, 87, 89, 59, 169, 14, 244, 200, 79, 24, 141, 171, 77, 189, 147, 134, 180, 225, 185, 73, 111, 213, 215, 158, 52, 69, 70, 11, 135, 7, 115, 101, 177, 76, 61, 208, 242, 48, 132, 26, 188, 182, 92, 239, 3, 58, 9, 172, 113, 160, 220, 222, 143, 130, 65, 170, 240, 231, 55, 159, 96, 179, 161}
expectedArray := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250}
tailRecursivequicksort(actualArray, 0, 249)
timeDiff := time.Now().Sub(startTime)
fmt.Printf("Time take to sort %v numbers is %v", len(actualArray), timeDiff)
for i := 0; i < len(actualArray); i++ {
if actualArray[i] != expectedArray[i] {
t.Fail()
break
}
}
}
Running time logs:
=== RUN TestRandomizedQuickSort1
Time take to sort 250 numbers is 93.1µs--- PASS: TestRandomizedQuickSort1 (0.00s)
=== RUN TestRandomizedQuickSort2
Time take to sort 500 numbers is 153.913µs--- PASS: TestRandomizedQuickSort2 (0.00s)
=== RUN TestRandomizedQuickSort3
Time take to sort 750 numbers is 249.653µs--- PASS: TestRandomizedQuickSort3 (0.00s)
=== RUN TestRandomizedQuickSort4
Time take to sort 1000 numbers is 299.693µs--- PASS: TestRandomizedQuickSort4 (0.00s)
=== RUN TestRandomizedQuickSort5
Time take to sort 1250 numbers is 452.812µs--- PASS: TestRandomizedQuickSort5 (0.00s)
=== RUN TestRandomizedQuickSort6
Time take to sort 1500 numbers is 577.646µs--- PASS: TestRandomizedQuickSort6 (0.00s)
=== RUN TestRandomizedQuickSort7
Time take to sort 1750 numbers is 793.466µs--- PASS: TestRandomizedQuickSort7 (0.00s)
=== RUN TestRandomizedQuickSort8
Time take to sort 2000 numbers is 2.100078ms--- PASS: TestRandomizedQuickSort8 (0.00s)
=== RUN TestRandomizedQuickSort9
Time take to sort 2250 numbers is 1.534166ms--- PASS: TestRandomizedQuickSort9 (0.00s)
=== RUN TestRandomizedQuickSort10
Time take to sort 2500 numbers is 1.488542ms--- PASS: TestRandomizedQuickSort10 (0.00s)
PASS
Debugger finished with the exit code 0
=== RUN TestTailRecursionQuickSort1
Time take to sort 250 numbers is 2.166925ms--- PASS: TestTailRecursionQuickSort1 (0.00s)
=== RUN TestTailRecursionQuickSort2
Time take to sort 500 numbers is 11.049337ms--- PASS: TestTailRecursionQuickSort2 (0.01s)
=== RUN TestTailRecursionQuickSort3
Time take to sort 750 numbers is 36.98923ms--- PASS: TestTailRecursionQuickSort3 (0.04s)
=== RUN TestTailRecursionQuickSort4
Time take to sort 1000 numbers is 213.94526ms--- PASS: TestTailRecursionQuickSort4 (0.21s)
=== RUN TestTailRecursionQuickSort5
Time take to sort 1250 numbers is 87.065747ms--- PASS: TestTailRecursionQuickSort5 (0.09s)
=== RUN TestTailRecursionQuickSort6
Time take to sort 1500 numbers is 105.232837ms--- PASS: TestTailRecursionQuickSort6 (0.11s)
PASS
Debugger finished with the exit code 0
=== RUN TestTailRecursionQuickSort7
Time take to sort 1750 numbers is 2.632979054s--- PASS: TestTailRecursionQuickSort7 (2.63s)
=== RUN TestTailRecursionQuickSort8
Time take to sort 2000 numbers is 1.082278134s--- PASS: TestTailRecursionQuickSort8 (1.08s)
=== RUN TestTailRecursionQuickSort9
Time take to sort 2250 numbers is 3.14799009s--- PASS: TestTailRecursionQuickSort9 (3.15s)
=== RUN TestTailRecursionQuickSort10
Time take to sort 2500 numbers is 4.877045862s--- PASS: TestTailRecursionQuickSort10 (4.88s)
After running the test cases for both algorithms, I found that the randomized quicksort is faster than the tail-recursive one. Isn't tail recursion an optimization of the randomized quicksort? Let me know your thoughts.
Thanks
The question's code does not include the randomizedPartition function. The tail recursion version only reduces the size by one on each loop iteration (the recursive call uses right-1 rather than pi-1 as its upper bound), which is the worst-case scenario. In addition, the Lomuto partition scheme used in the code degrades to worst-case time complexity if there are a lot of repeated elements:
https://en.wikipedia.org/wiki/Quicksort#Repeated_elements
Example C++ Hoare partition code that recurses on smaller partition, then loops back for the larger partition.
void QuickSort(int a[], int lo, int hi)
{
while (lo < hi){
int p = a[lo + (hi - lo) / 2];
int i = lo - 1;
int j = hi + 1;
while (1){
while (a[++i] < p);
while (a[--j] > p);
if (i >= j)
break;
std::swap(a[i], a[j]);
}
if(j - lo < hi - j){
QuickSort(a, lo, j);
lo = j+1;
} else {
QuickSort(a, j+1, hi);
hi = j;
}
}
}
A true C++ tail recursion example. Visual Studio release build will convert the tail recursion calls into loops.
void QuickSort(int a[], int lo, int hi)
{
if(lo < hi){
int p = a[lo + (hi - lo) / 2];
int i = lo - 1;
int j = hi + 1;
while (1){
while (a[++i] < p);
while (a[--j] > p);
if (i >= j)
break;
std::swap(a[i], a[j]);
}
if(j - lo < hi - j){
QuickSort(a, lo, j);
QuickSort(a, j+1, hi); // converted to loop
} else {
QuickSort(a, j+1, hi);
QuickSort(a, lo, j); // converted to loop
}
}
}
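To illustrate the first point of this answer, the following Python sketch keeps the Lomuto partition from the question but recurses on `pi - 1` instead of `right - 1`, and always recurses into the smaller side so that the stack depth stays logarithmic. It is a translation for illustration only: the function names are made up, and it sorts plain comparable values rather than a Go `[]interface{}`.

```python
import random


def lomuto_partition(a, left, right):
    """Partition a[left..right] around a[right]; return the pivot's final index."""
    pivot = a[right]
    p = left - 1
    for i in range(left, right):
        if a[i] <= pivot:
            p += 1
            a[p], a[i] = a[i], a[p]
    a[right], a[p + 1] = a[p + 1], a[right]
    return p + 1


def quicksort_loop(a, left, right):
    """Recurse on the smaller partition, loop on the larger one."""
    while left < right:
        pi = lomuto_partition(a, left, right)
        if pi - left < right - pi:
            quicksort_loop(a, left, pi - 1)   # note: pi - 1, not right - 1
            left = pi + 1
        else:
            quicksort_loop(a, pi + 1, right)
            right = pi - 1


if __name__ == "__main__":
    data = list(range(1, 251))
    random.shuffle(data)
    quicksort_loop(data, 0, len(data) - 1)
    print(data[:10])
```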
I am implementing a Pearson hash in order to create a lightweight dictionary structure for a C project which requires a table of file names paired with file data - I want the nice constant-time search property of hash tables. I'm no math expert, so I looked up good text hashes and Pearson came up, with it being claimed to be effective and to have a good distribution. I tested my implementation and found that no matter how I vary the table size or the maximum filename length, the hash is very inefficient, with for example 18/50 buckets being left empty. I trust Wikipedia not to be lying, and yes, I am aware I can just download a third-party hash table implementation, but I would dearly like to know why my version isn't working.
In the following code (a function to insert values into the table), "csString" is the filename, i.e. the string to be hashed, "cLen" is the length of the string, "pData" is a pointer to some data which is inserted into the table, and "pTable" is the table struct. The initial condition cHash = cLen - csString[0] is something I experimentally found to marginally improve uniformity. I should add that I am testing the table with entirely randomised strings (using rand() to generate ASCII values) with randomised lengths within a certain range - this is in order to easily generate and test large numbers of values.
typedef struct StaticStrTable {
unsigned int nRepeats;
unsigned char nBuckets;
unsigned char nMaxCollisions;
void** pBuckets;
} StaticStrTable;
static const char cPerm256[256] = {
227, 117, 238, 33, 25, 165, 107, 226, 132, 88, 84, 68, 217, 237, 228, 58, 52, 147, 46, 197, 191, 119, 211, 0, 218, 139, 196, 153, 170, 77, 175, 22, 193, 83, 66, 182, 151, 99, 11, 144, 104, 233, 166, 34, 177, 14, 194, 51, 30, 121, 102, 49,
222, 210, 199, 122, 235, 72, 13, 156, 38, 145, 137, 78, 65, 176, 94, 163, 95, 59, 92, 114, 243, 204, 224, 43, 185, 168, 244, 203, 28, 124, 248, 105, 10, 87, 115, 161, 138, 223, 108, 192, 6, 186, 101, 16, 39, 134, 123, 200, 190, 195, 178,
164, 9, 251, 245, 73, 162, 71, 7, 239, 62, 69, 209, 159, 3, 45, 247, 19, 174, 149, 61, 57, 146, 234, 189, 15, 202, 89, 111, 207, 31, 127, 215, 198, 231, 4, 181, 154, 64, 125, 24, 93, 152, 37, 116, 160, 113, 169, 255, 44, 36, 70, 225, 79,
250, 12, 229, 230, 76, 167, 118, 232, 142, 212, 98, 82, 252, 130, 23, 29, 236, 86, 240, 32, 90, 67, 126, 8, 133, 85, 20, 63, 47, 150, 135, 100, 103, 173, 184, 48, 143, 42, 54, 129, 242, 18, 187, 106, 254, 53, 120, 205, 155, 216, 219, 172,
21, 253, 5, 221, 40, 27, 2, 179, 74, 17, 55, 183, 56, 50, 110, 201, 109, 249, 128, 112, 75, 220, 214, 140, 246, 213, 136, 148, 97, 35, 241, 60, 188, 180, 206, 80, 91, 96, 157, 81, 171, 141, 131, 158, 1, 208, 26, 41
};
void InsertStaticStrTable(char* csString, unsigned char cLen, void* pData, StaticStrTable* pTable) {
unsigned char cHash = cLen - csString[0];
for (int i = 0; i < cLen; ++i) cHash ^= cPerm256[cHash ^ csString[i]];
unsigned short cTableIndex = cHash % pTable->nBuckets;
long long* pBucket = pTable->pBuckets[cTableIndex];
// Inserts data and records how many collisions there are - it may look weird as the way in which I decided to pack the data into the table buffer is very compact and arbitrary
// It won't affect the hash though, which is the key issue!
for (int i = 0; i < pTable->nMaxCollisions; ++i) {
if (i == 1) {
pTable->nRepeats++;
}
long long* pSlotID = pBucket + (i << 1);
if (pSlotID[0] == 0) {
pSlotID[0] = csString;
pSlotID[1] = pData;
break;
}
}
}
FYI (This is not an answer, I just need the formatting)
These are just single runs from a simulation, YMMV.
distributing 50 elements randomly over 50 bins:
kalender_size=50 nperson = 50
E/cell | Ncell | frac       | Nelem | frac       | h/cell   | hops | Cumhops
-------+-------+------------+-------+------------+----------+------+--------
    0: |    18 | (0.360000) |     0 | (0.000000) |        0 |    0 |       0
    1: |    18 | (0.360000) |    18 | (0.360000) |        1 |   18 |      18
    2: |    10 | (0.200000) |    20 | (0.400000) |        3 |   30 |      48
    3: |     4 | (0.080000) |    12 | (0.240000) |        6 |   24 |      72
-------+-------+------------+-------+------------+----------+------+--------
    4: |    50 |            |    50 |            | 1.440000 |      |      72
Similarly: distribute persons over a birthday calendar (ignoring leap days; note that this run used 356 rather than 365 slots and persons):
kalender_size=356 nperson = 356
E/cell | Ncell | frac       | Nelem | frac       | h/cell   | hops | Cumhops
-------+-------+------------+-------+------------+----------+------+--------
    0: |   129 | (0.362360) |     0 | (0.000000) |        0 |    0 |       0
    1: |   132 | (0.370787) |   132 | (0.370787) |        1 |  132 |     132
    2: |    69 | (0.193820) |   138 | (0.387640) |        3 |  207 |     339
    3: |    19 | (0.053371) |    57 | (0.160112) |        6 |  114 |     453
    4: |     6 | (0.016854) |    24 | (0.067416) |       10 |   60 |     513
    5: |     1 | (0.002809) |     5 | (0.014045) |       15 |   15 |     528
-------+-------+------------+-------+------------+----------+------+--------
    6: |   356 |            |   356 |            | 1.483146 |      |     528
For N items over N slots, the expected number of empty slots and the expected number of slots holding a single item are approximately equal for large N. The expected density approaches 1/e for both.
The final number (1.483146) is the number of ->next pointer traversals per found element (when using a chained hash table). Any optimal hash function will almost reach 1.5.
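These figures can be reproduced from the binomial model without running a simulation. The Python sketch below assumes each of the N items lands in a uniformly random slot, independently of the others, and that a chain holding k items costs 1 + 2 + ... + k = k(k+1)/2 hops in total (the h/cell column above). The function names are illustrative.

```python
import math


def expected_empty_fraction(n):
    """P(a given slot receives none of the n items)."""
    return (1 - 1 / n) ** n


def expected_single_fraction(n):
    """P(a given slot receives exactly one of the n items)."""
    return (1 - 1 / n) ** (n - 1)


def expected_hops_per_element(n):
    """Mean of k(k+1)/2 for k ~ Binomial(n, 1/n), i.e. hops per stored element."""
    mean_k = 1.0
    mean_k_squared = (1 - 1 / n) + 1      # variance + mean^2
    return (mean_k_squared + mean_k) / 2  # equals 1.5 - 1/(2n)


if __name__ == "__main__":
    print("1/e =", 1 / math.e)
    for n in (50, 356):
        print(n,
              expected_empty_fraction(n),
              expected_single_fraction(n),
              expected_hops_per_element(n))
```

For N = 50 this predicts roughly 18.2 empty slots and 18.6 singly occupied slots, in line with the 18 and 18 of the simulated run, so 18 empty buckets out of 50 is what a uniformly random hash is expected to produce.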
My constant is too long and MASM won't compile my source code. How can I fix it?
C25 byte 51, 135, 173, 160, 231, 165, 173, 168, 165, 32, 162, 235, 224, 160, 166, 165, 173, 168, 239, 32, 115, 117, 109, 91, 49, 48, 48, 48, 44, 32, 49, 48, 48, 48, 93, 32, 109, 111, 100, 32, 49, 53, 48, 48, 32, 100, 105, 118, 32, 51, 58, 32
error image
You can put at most 48 elements per line. So split the line into two or more lines that each contains 48 elements or less, e.g.:
C25 byte 51, 135, 173, 160, 231, 165, 173, 168, 165, 32, 162, 235, 224, 160, 166, 165, 173, 168, 239, 32, 115, 117, 109, 91, 49
byte 48, 48, 48, 44, 32, 49, 48, 48, 48, 93, 32, 109, 111, 100, 32, 49, 53, 48, 48, 32, 100, 105, 118, 32, 51, 58, 32
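If the data is generated by a script, the splitting can be automated. The Python sketch below is only an illustration (the helper name `masm_byte_lines` is made up, and the limit of 48 values per line is taken from the answer above rather than verified independently); it emits one `byte` directive per chunk and puts the label on the first line only.

```python
def masm_byte_lines(label, values, per_line=48):
    """Format values as MASM byte directives with at most per_line values each."""
    lines = []
    for start in range(0, len(values), per_line):
        chunk = ", ".join(str(v) for v in values[start:start + per_line])
        # Only the first directive carries the label; the rest are indented to match.
        prefix = label if start == 0 else " " * len(label)
        lines.append(f"{prefix} byte {chunk}")
    return lines


if __name__ == "__main__":
    data = [51, 135, 173, 160, 231, 165, 173, 168, 165, 32] * 6
    print("\n".join(masm_byte_lines("C25", data)))
```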
I have a 3d numpy array and want to generate a secondary array consisting of the minimum of each value and the values in the 10 rows directly above and 10 rows directly below (i.e each entry is the minimum value from 21 values) for each 2d array.
I've been trying to use 'numpy.clip' to deal with the edges of the array - here the range of values which the minimum is taken from should simply reduce to 10 at the values on the top/bottom of the array. I think something like 'scipy.signal.argrelmin' seems to be what I'm after.
Here's my code so far, definitely not the best way to go about it:
import numpy as np
array_3d = np.random.random_integers(50, 80, (3, 50, 18))
minimums = np.zeros(array_3d.shape)
for array_2d_index in range(len(array_3d)):
for row_index in range(len(array_3d[array_2d_index])):
for col_index in range(len(array_3d[array_2d_index][row_index])):
minimums[array_2d_index][row_index][col_index] = min(array_3d[array_2d_index][np.clip(row_index-10, 0, 49):np.clip(row_index+10, 0, 49)][col_index])
The main issue I think is that this is taking the minimum from the columns either side of each entry instead of the rows, which has been giving index errors.
Would appreciate any advice, thanks.
Approach #1
Here's one approach with np.lib.stride_tricks.as_strided -
def strided_3D_axis1(array_3d, L):
s0,s1,s2 = array_3d.strides
strided = np.lib.stride_tricks.as_strided
m,n,r = array_3d.shape
nL = n-L+1
return strided(array_3d, (m,nL,L,r),(s0,s1,s1,s2))
out = strided_3D_axis1(array_3d, L=21).min(axis=-2)
Sample run -
1) Input :
In [179]: array_3d
Out[179]:
array([[[73, 65, 51, 76, 59],
[74, 57, 75, 53, 70],
[60, 74, 52, 54, 60],
[54, 52, 62, 75, 50],
[68, 56, 68, 63, 77]],
[[62, 70, 60, 79, 74],
[70, 68, 50, 74, 57],
[63, 57, 69, 65, 54],
[63, 63, 68, 58, 60],
[70, 66, 65, 78, 78]]])
2) Strided view :
In [180]: strided_3D_axis1(array_3d, L=3)
Out[180]:
array([[[[73, 65, 51, 76, 59],
[74, 57, 75, 53, 70],
[60, 74, 52, 54, 60]],
[[74, 57, 75, 53, 70],
[60, 74, 52, 54, 60],
[54, 52, 62, 75, 50]],
[[60, 74, 52, 54, 60],
[54, 52, 62, 75, 50],
[68, 56, 68, 63, 77]]],
[[[62, 70, 60, 79, 74],
[70, 68, 50, 74, 57],
[63, 57, 69, 65, 54]],
[[70, 68, 50, 74, 57],
[63, 57, 69, 65, 54],
[63, 63, 68, 58, 60]],
[[63, 57, 69, 65, 54],
[63, 63, 68, 58, 60],
[70, 66, 65, 78, 78]]]])
3) Strided view based min :
In [181]: strided_3D_axis1(array_3d, L=3).min(axis=-2)
Out[181]:
array([[[60, 57, 51, 53, 59],
[54, 52, 52, 53, 50],
[54, 52, 52, 54, 50]],
[[62, 57, 50, 65, 54],
[63, 57, 50, 58, 54],
[63, 57, 65, 58, 54]]])
Approach #2
Here's another with broadcasting upon creating all sliding indices along the second axis -
array_3d[:,np.arange(array_3d.shape[1]-L+1)[:,None] + range(L)].min(-2)
Approach #3
Here's another using Scipy's 1D minimum filter -
from scipy.ndimage import minimum_filter1d as minf
L = 21
hL = (L-1)//2
out = minf(array_3d,L,axis=1)[:,hL:-hL]
Runtime test -
In [231]: array_3d = np.random.randint(50, 80, (3, 50, 18))
In [232]: %timeit strided_3D_axis1(array_3d, L=21).min(axis=-2)
10000 loops, best of 3: 54.2 µs per loop
In [233]: %timeit array_3d[:,np.arange(array_3d.shape[1]-L+1)[:,None] + range(L)].min(-2)
10000 loops, best of 3: 81.3 µs per loop
In [234]: L = 21
...: hL = (L-1)//2
...:
In [235]: %timeit minf(array_3d,L,axis=1)[:,hL:-hL]
10000 loops, best of 3: 32 µs per loop
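Note that the three approaches above only return the rows where a full window of L rows fits, so the output has n-L+1 rows along the second axis. If, as the question describes, the window should simply shrink at the top and bottom edges, one straightforward (and not especially fast) option is to loop over the rows and clip the slice bounds. A sketch, with `windowed_min_axis1` and `half_width` as made-up names:

```python
import numpy as np


def windowed_min_axis1(array_3d, half_width):
    """Per-row minimum over rows [r - half_width, r + half_width], clipped at the edges."""
    n_rows = array_3d.shape[1]
    out = np.empty_like(array_3d)
    for row in range(n_rows):
        lo = max(row - half_width, 0)
        hi = min(row + half_width + 1, n_rows)  # slice end is exclusive
        out[:, row, :] = array_3d[:, lo:hi, :].min(axis=1)
    return out


if __name__ == "__main__":
    array_3d = np.random.randint(50, 80, (3, 50, 18))
    minimums = windowed_min_axis1(array_3d, half_width=10)
    print(minimums.shape)
```

The output has the same shape as the input, and its interior rows (those at least `half_width` away from either edge) use the same full windows as the approaches above.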
I'm building a keyboard light with AVR micro controller.
There are two buttons, BRIGHT and DIM, and a white LED.
The LED isn't really linear, so I need to use a logarithmic scale (increase brightness faster in higher values, and use tiny steps in lower).
To do that, I adjust the delay between each step of adding 1 to or subtracting 1 from the PWM compare match control register.
while (1) {
if (btn_high() && OCR0A < 255) OCR0A += 1;
if (btn_low() && OCR0A > 0) OCR0A -= 1;
if (OCR0A < 25)
_delay_ms(30);
else if (OCR0A < 50)
_delay_ms(25);
else if (OCR0A < 128)
_delay_ms(17);
else
_delay_ms(5);
}
It works nice, but there's a visible step when it goes from one speed to another. It'd be much better if the delay adjusted smoothly.
Is there some simple formula I can use?
It must not contain division, modulo, sqrt, log or any other advanced math. I can use multiplication, add, sub, and bit operations. Also, I can't use float in it.
Or perhaps just some kind of lookup table? I'm not really happy with adding more branches to this if-else mess.
The posted transfer function is quite linear. Suggest a linear delay calculation.
delay = 32 - OCR0A/8;
After accept edit
Various look-up tables lend themselves to a close fit by simple equations (constructed to avoid intermediate values > 65535), such as
BRIGHTNESS_60 = ((((index*index)>>2) + 128)*index)>>8;
The scaling isn't quite logarithmic so simply using log() isn't enough.
I have tackled this problem in the past by using a LUT with 18 entries and going an entire step at a time (i.e. the control variable varies from 0 to 17 and then is shoved through the LUT), but if finer control is required then having 52 or more is certainly doable. Make sure to put it in flash so that it doesn't consume any SRAM though.
Edit by MightyPork
Here are the arrays I used in the end - obtained from the original array by linear interpolation.
Basic
#define BRIGHTNESS_LEN 61
const uint8_t BRIGHTNESS[] PROGMEM = {
0, 1, 1, 2, 2, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9,
10, 11, 13, 14, 16, 18, 21, 24, 27, 30, 32,
35, 38, 40, 42, 45, 48, 50, 54, 58, 61, 65,
69, 72, 76, 80, 85, 90, 95, 100, 106, 112,
119, 125, 134, 142, 151, 160, 170, 180, 190,
200, 214, 228, 241, 255
};
Smoother
#define BRIGHTNESS_LEN 121
const uint8_t BRIGHTNESS[] PROGMEM = {
0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5,
6, 6, 6, 7, 7, 8, 8, 8, 9, 10, 10, 10, 11, 12, 13, 14, 14,
15, 16, 17, 18, 20, 21, 22, 24, 26, 27, 28, 30, 31, 32, 34,
35, 36, 38, 39, 40, 41, 42, 44, 45, 46, 48, 49, 50, 52, 54,
56, 58, 59, 61, 63, 65, 67, 69, 71, 72, 74, 76, 78, 80, 82,
85, 88, 90, 92, 95, 98, 100, 103, 106, 109, 112, 116, 119,
122, 125, 129, 134, 138, 142, 147, 151, 156, 160, 165, 170,
175, 180, 185, 190, 195, 200, 207, 214, 221, 228, 234, 241,
248, 255
};
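As a rough sanity check on the closed-form approximation quoted earlier in this thread, the Python sketch below evaluates it for indices 0 to 60 and compares it with the 61-entry table labelled Basic above. It assumes the intended grouping is shift first, then add 128 (in C, `+` binds tighter than `>>`, so the parentheses matter). `brightness_poly` and `largest_intermediate` are illustrative names.

```python
BASIC = [
    0, 1, 1, 2, 2, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9,
    10, 11, 13, 14, 16, 18, 21, 24, 27, 30, 32,
    35, 38, 40, 42, 45, 48, 50, 54, 58, 61, 65,
    69, 72, 76, 80, 85, 90, 95, 100, 106, 112,
    119, 125, 134, 142, 151, 160, 170, 180, 190,
    200, 214, 228, 241, 255,
]


def brightness_poly(index):
    """Integer-only cubic approximation using multiply, add and shifts."""
    return ((((index * index) >> 2) + 128) * index) >> 8


def largest_intermediate(max_index):
    """Largest value held before the final shift, to check it fits in 16 bits."""
    return max((((i * i) >> 2) + 128) * i for i in range(max_index + 1))


if __name__ == "__main__":
    worst = max(abs(brightness_poly(i) - BASIC[i]) for i in range(len(BASIC)))
    print("entries:", len(BASIC))
    print("largest intermediate:", largest_intermediate(60))
    print("worst deviation from table:", worst)
```

Under that assumption the largest intermediate value is 61680, which stays below 65536, and the formula tops out at 240 rather than 255 for index 60.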
It sounds like you really want to use some linear function of a logarithm, but without the overhead of the floating point math library. A crude fixed point logarithm can be coded as
#include <stdint.h>
uint8_t log2fix(uint8_t in)
{
if(in == 0)
return 0;
uint8_t out = 0;
while(in > 0)
{
in = in >> 1;
out++;
}
return out - 1;
}
This will give you a rough approximation. If you want more precision there is a fast fixed point algorithm that you should be able to modify for Q8.0 to Q3.5.
You have over-complicated the issue. You have already turned the logarithmic problem into a linear one by defining a variable update rate rather than a variable PWM step - so you have essentially solved the problem, but not seen the simple arithmetic relationship.
If you take the OCR0A vs delay points you have selected, (25,30), (50,25), (128,17), it can be seen that they follow an approximately linear relationship described by (approximately) y = -0.125x + 32, which can be rearranged as y = 32 - x / 8.
So what you need is:
while (1)
{
if (btn_high() && OCR0A < 255) OCR0A += 1;
if (btn_low() && OCR0A > 0) OCR0A -= 1;
_delay_ms( 32 - OCR0A / 8 ) ;
}
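Since OCR0A is unsigned, dividing by 8 is the same as shifting right by 3, so the expression needs no real division. The small Python model below (function names are made up for illustration) compares the stepped delays from the question with the linear rule, just to show how the two line up at the original breakpoints.

```python
def original_delay_ms(ocr):
    """Stepped delays from the question's if/else chain."""
    if ocr < 25:
        return 30
    if ocr < 50:
        return 25
    if ocr < 128:
        return 17
    return 5


def linear_delay_ms(ocr):
    """32 - ocr/8, written with a shift instead of a division."""
    return 32 - (ocr >> 3)


if __name__ == "__main__":
    for ocr in (0, 25, 50, 128, 255):
        print(ocr, original_delay_ms(ocr), linear_delay_ms(ocr))
```

Over the full 0 to 255 range the linear rule runs from 32 ms down to 1 ms, so it never reaches zero or goes negative.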