Eigen SIGSEGV while doing make_pair - eigen3

I'm updating ~3yr old code and am crashing trying to make a pair out of Vector2d and Matrix2d. Roughly, the code is
vector<pair<Vector2d, Matrix2d>> list;
list.resize(points.size());
Vector2d md;
Matrix2d Sd;
for(int i=0; i<n; i++) {
// do stuff to assign elements of md and Sd
// ...
list[i] = make_pair(md, Sd); // SIGSEGV on this line
}
I found an old bug that looks similar at the final two levels of the stack but that bug is marked as fixed and I confirmed that the fixes from that patch are still in place. I tried making a simple test case but it runs without issue. Does anyone have an idea what the problem might be?
Misc info:
Arch linux
gcc 6.1.1
eigen 3.2.9
Here's the backtrace:
Thread 7 "flightControlle" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd7152700 (LWP 12024)]
0x00000000005e846b in _mm_load_pd (__P=0x0) at /usr/lib/gcc/x86_64-pc-linux-gnu/6.1.1/include/emmintrin.h:119
119 return *(__m128d *)__P;
(gdb) bt
#0 0x00000000005e846b in _mm_load_pd (__P=0x0) at /usr/lib/gcc/x86_64-pc-linux-gnu/6.1.1/include/emmintrin.h:119
#1 Eigen::internal::pload<double __vector(2)>(Eigen::internal::unpacket_traits<double __vector(2)>::type const*) (from=0x0) at /usr/include/eigen3/Eigen/src/Core/arch/SSE/PacketMath.h:220
#2 0x000000000060310b in Eigen::internal::ploadt<double __vector(2), 1>(Eigen::internal::unpacket_traits<double __vector(2)>::type const*) (from=0x0) at /usr/include/eigen3/Eigen/src/Core/GenericPacketMath.h:290
#3 0x0000000000659523 in Eigen::PlainObjectBase<Eigen::Matrix<double, 2, 1, 0, 2, 1> >::packet<1> (this=0x0, rowId=0, colId=0) at /usr/include/eigen3/Eigen/src/Core/PlainObjectBase.h:184
#4 0x00000000006a40fe in Eigen::SwapWrapper<Eigen::Matrix<double, 2, 1, 0, 2, 1> >::copyPacket<Eigen::Matrix<double, 2, 1, 0, 2, 1>, 1, 1> (this=0x7fffd7150a30, rowId=0, colId=0, other=...)
at /usr/include/eigen3/Eigen/src/Core/Swap.h:100
#5 0x00000000006a1da3 in Eigen::DenseCoeffsBase<Eigen::SwapWrapper<Eigen::Matrix<double, 2, 1, 0, 2, 1> >, 1>::copyPacketByOuterInner<Eigen::Matrix<double, 2, 1, 0, 2, 1>, 1, 1> (this=0x7fffd7150a30, outer=0,
inner=0, other=...) at /usr/include/eigen3/Eigen/src/Core/DenseCoeffsBase.h:548
#6 0x000000000069fa0e in Eigen::internal::assign_innervec_CompleteUnrolling<Eigen::SwapWrapper<Eigen::Matrix<double, 2, 1, 0, 2, 1> >, Eigen::Matrix<double, 2, 1, 0, 2, 1>, 0, 2>::run (dst=..., src=...)
at /usr/include/eigen3/Eigen/src/Core/Assign.h:206
#7 0x000000000069d1f7 in Eigen::internal::assign_impl<Eigen::SwapWrapper<Eigen::Matrix<double, 2, 1, 0, 2, 1> >, Eigen::Matrix<double, 2, 1, 0, 2, 1>, 2, 2, 0>::run (dst=..., src=...)
at /usr/include/eigen3/Eigen/src/Core/Assign.h:342
#8 0x000000000069af38 in Eigen::DenseBase<Eigen::SwapWrapper<Eigen::Matrix<double, 2, 1, 0, 2, 1> > >::lazyAssign<Eigen::Matrix<double, 2, 1, 0, 2, 1> > (this=0x7fffd7150a30, other=...)
at /usr/include/eigen3/Eigen/src/Core/Assign.h:506
#9 0x000000000069856f in Eigen::DenseBase<Eigen::Matrix<double, 2, 1, 0, 2, 1> >::swap<Eigen::Matrix<double, 2, 1, 0, 2, 1> > (this=0x7fffd7150c90, other=...)
at /usr/include/eigen3/Eigen/src/Core/DenseBase.h:384
#10 0x000000000069591a in Eigen::internal::matrix_swap_impl<Eigen::Matrix<double, 2, 1, 0, 2, 1>, Eigen::Matrix<double, 2, 1, 0, 2, 1>, false>::run (a=..., b=...)
at /usr/include/eigen3/Eigen/src/Core/PlainObjectBase.h:805
#11 0x0000000000693785 in Eigen::PlainObjectBase<Eigen::Matrix<double, 2, 1, 0, 2, 1> >::_swap<Eigen::Matrix<double, 2, 1, 0, 2, 1> > (this=0x7fffd7150c90, other=...)
at /usr/include/eigen3/Eigen/src/Core/PlainObjectBase.h:682
#12 0x00000000006923c6 in Eigen::Matrix<double, 2, 1, 0, 2, 1>::swap<Eigen::Matrix<double, 2, 1, 0, 2, 1> > (this=0x7fffd7150c90, other=...) at /usr/include/eigen3/Eigen/src/Core/Matrix.h:334
#13 0x00000000006909f9 in Eigen::Matrix<double, 2, 1, 0, 2, 1>::operator=(Eigen::Matrix<double, 2, 1, 0, 2, 1>&&) (this=0x0,
other=<unknown type in /home/ryantr/Software/FlightControl/Rover/build/flightController, CU 0x5b3bd1, DIE 0x73ce2f>) at /usr/include/eigen3/Eigen/src/Core/Matrix.h:224
#14 0x000000000068f440 in std::pair<Eigen::Matrix<double, 2, 1, 0, 2, 1>, Eigen::Matrix<double, 2, 2, 0, 2, 2> >::operator=(std::pair<Eigen::Matrix<double, 2, 1, 0, 2, 1>, Eigen::Matrix<double, 2, 2, 0, 2, 2> >&&) (this=0x0, __p=<unknown type in /home/ryantr/Software/FlightControl/Rover/build/flightController, CU 0x5b3bd1, DIE 0x73e9e5>) at /usr/include/c++/6.1.1/bits/stl_pair.h:319
#15 0x000000000068a452 in ICSL::Quadrotor::VelocityEstimator::calcPriorDistributions (mDeltaList=std::vector of length 1, capacity 1 = {...}, SDeltaList=std::vector of length 1, capacity 1 = {...},
priorDistList=std::vector of length 0, capacity 0, points=std::vector of length 1, capacity 1 = {...}, mv=..., Sv=..., mz=0.050000000000000003, varz=1, focalLength=262.0029296875, dt=0.033000000000000002,
omega=...) at /home/ryantr/Software/FlightControl/Rover/src/VelocityEstimator.cpp:566
Here's info on the state of md and Sd if that makes a difference:
$1 = {<Eigen::PlainObjectBase<Eigen::Matrix<double, 2, 1, 0, 2, 1> >> = {<Eigen::MatrixBase<Eigen::Matrix<double, 2, 1, 0, 2, 1> >> = {<Eigen::DenseBase<Eigen::Matrix<double, 2, 1, 0, 2, 1> >> = {<Eigen::internal::special_scalar_op_base<Eigen::Matrix<double, 2, 1, 0, 2, 1>, double, double, Eigen::DenseCoeffsBase<Eigen::Matrix<double, 2, 1, 0, 2, 1>, 3>, false>> = {<Eigen::DenseCoeffsBase<Eigen::Matrix<double, 2, 1, 0, 2, 1>, 3>> = {<Eigen::DenseCoeffsBase<Eigen::Matrix<double, 2, 1, 0, 2, 1>, 1>> = {<Eigen::DenseCoeffsBase<Eigen::Matrix<double, 2, 1, 0, 2, 1>, 0>> = {<Eigen::EigenBase<Eigen::Matrix<double, 2, 1, 0, 2, 1> >> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, m_storage = {m_data = {array = {-6.5748748779296875,
-96.418296813964844}}}}, <No data fields>}
(gdb) p Sd
$2 = {<Eigen::PlainObjectBase<Eigen::Matrix<double, 2, 2, 0, 2, 2> >> = {<Eigen::MatrixBase<Eigen::Matrix<double, 2, 2, 0, 2, 2> >> = {<Eigen::DenseBase<Eigen::Matrix<double, 2, 2, 0, 2, 2> >> = {<Eigen::internal::special_scalar_op_base<Eigen::Matrix<double, 2, 2, 0, 2, 2>, double, double, Eigen::DenseCoeffsBase<Eigen::Matrix<double, 2, 2, 0, 2, 2>, 3>, false>> = {<Eigen::DenseCoeffsBase<Eigen::Matrix<double, 2, 2, 0, 2, 2>, 3>> = {<Eigen::DenseCoeffsBase<Eigen::Matrix<double, 2, 2, 0, 2, 2>, 1>> = {<Eigen::DenseCoeffsBase<Eigen::Matrix<double, 2, 2, 0, 2, 2>, 0>> = {<Eigen::EigenBase<Eigen::Matrix<double, 2, 2, 0, 2, 2> >> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, m_storage = {m_data = {array = {8947423.3324944824, 110733.5419973651,
110733.5419973651, 13614569.654632445}}}}, <No data fields>}

This is an alignment issue. See this intro, and in your case this page will give you the solution. In short, either use non-aligned types:
typedef Matrix<double,2,1,DontAlign> Vector2du;
typedef Matrix<double,2,2,DontAlign> Matrix2du;
vector<pair<Vector2du, Matrix2du>> list;
or keep the aligned types and pass an aligned allocator to std::vector:
vector<pair<Vector2d, Matrix2d>, Eigen::aligned_allocator<pair<Vector2d, Matrix2d>>> list;

Related

Python Numpy repeating an arange array

Say I do this:
x = np.arange(0, 3)
which gives
array([0, 1, 2])
but is there something I can write, along the lines of
x = np.arange(0, 3)*repeat(N=3)times
to get
array([0, 1, 2, 0, 1, 2, 0, 1, 2])
I've seen several recent questions about resize. It isn't used often, but here's one case where it does just what you want:
In [66]: np.resize(np.arange(3),3*3)
Out[66]: array([0, 1, 2, 0, 1, 2, 0, 1, 2])
There are many other ways of doing this.
In [67]: np.tile(np.arange(3),3)
Out[67]: array([0, 1, 2, 0, 1, 2, 0, 1, 2])
In [68]: (np.arange(3)+np.zeros((3,1),int)).ravel()
Out[68]: array([0, 1, 2, 0, 1, 2, 0, 1, 2])
np.repeat doesn't repeat in the way we want
In [70]: np.repeat(np.arange(3),3)
Out[70]: array([0, 0, 0, 1, 1, 1, 2, 2, 2])
but even that can be reworked (this is a bit advanced):
In [73]: np.repeat(np.arange(3),3).reshape(3,3,order='F').ravel()
Out[73]: array([0, 1, 2, 0, 1, 2, 0, 1, 2])
EDIT: Refer to hpaulj's answer. It is frankly better.
The simplest way is to convert back into a list and use:
list(np.arange(0,3))*3
Which gives:
>> [0, 1, 2, 0, 1, 2, 0, 1, 2]
Or if you want it as a numpy array:
np.array(list(np.arange(0,3))*3)
Which gives:
>> array([0, 1, 2, 0, 1, 2, 0, 1, 2])
how about this one?
arr = np.arange(3)
res = np.hstack((arr, ) * 3)
Output
array([0, 1, 2, 0, 1, 2, 0, 1, 2])
Not much overhead I would say.

How to create a random mask array?

I have an array of 128 values, each set to 1:
length = 128
partials = Array.new length
partials.each_index do |i|
partials[i] = 1
end
I want to set the value to 0 at some random positions (for example, at positions 1, 6, 50, 70, 100, 112, 120).
Of course, the number of positions could be different every time, and if I choose 7 positions, I want to end up with 7 distinct positions changed.
What's the fastest way to do this in Ruby?
Assuming you want n elements with value 0, you can do the following:
n = 5
partials[0,n] = [0]*n
partials.shuffle!
Alternatively, as a single expression (this returns the shuffled array rather than shuffling partials in place):
partials.tap{|p| p[0,n] = [0]*n}.shuffle
You can incorporate the zeros into the array creation:
length = 128
zeros = 7
partials = Array.new(length) { |i| i < zeros ? 0 : 1 }.shuffle
#=> [1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1,
# 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
A way:
array = 128.times.map{1}
Or with randomly sprayed 0s:
array = 128.times.map{rand(2)}
or put a number of 0s in later (rand can pick the same index twice, so this may zero fewer than 10 distinct positions):
10.times{array[rand(128)]=0}
etc. Play with it and see what you need.
Another alternative:
length = 10
zeros = 2
([1]*(length-zeros)+[0]*zeros).shuffle

Why is Flash doing array operations wrongly.

It was running fine and then it threw this error:
1125 Error #: 117 index is beyond the scope of 115.
It doesn't list a line number, but the function below is the only place where a long array is referred to.
The error means it's trying to access an index beyond the end of the Vector, which shouldn't be possible.
Relevant code (the rest, i.e. the public functions and the other functions not included, all works fine):
public class Main extends Sprite
{
internal var oneoff:Boolean = true;
internal var kanaList:Vector.<String> = new <String>["あ/ア", "あ/ア", "え/え", "え/え", "い/イ", "い/イ", "お/オ", "お/オ", "う/ウ", "う/ウ", "う/ウ", "う/ウ", "か/カ", "か/カ", "け/ケ", "け/ケ", "き/キ", "き/キ", "く/ク", "く/ク", "こ/コ", "こ/コ", "さ/サ", "さ/サ", " し/シ", " し/シ", "す/ス", "す/ス", "そ/ソ", "そ/ソ", "す/ス", "す/ス", "た/タ", "た/タ", "て/テ", "て/テ", " ち/チ", " ち/チ", "と/ト", "と/ト", "つ/ツ", "つ/ツ", "ら/ラ", "ら/ラ", "れ/レ", "れ/レ", "り/リ", "り/リ", "ろ/ロ", "ろ/ロ", "る/ル", "る/ル", "だ/ダ", "で/デ", "じ/ジ", "ど/ド", "ず/ズ", "ざ/ザ", "ぜ/ゼ", "ぞ/ゾ", "な/ナ", "ね/ネ", "に/二", "の/ノ", "ぬ/ヌ", "じゃ/ジャ", "じゅ/ジュ", "じょ/ジョ", "ん/ン", "しゃ/シャ", "しゅ/シュ", "しょ/ショ", "や/ヤ", "ゆ/ユ", "よ/ヨ", "は/ハ", "ひ/ヒ", "ふ/フ", "へ/ヘ", "ほ/ホ", "ば/バ", "ば/バ", "ぶ/ブ", "ぶ/ブ", "び/ビ", "び/ビ", "ぼ/ボ", "ぼ/ボ", "べ/ベ", "べ/ベ", "ぱ/パ", "ぴ/ピ", "ぷ/プ", "ぺ/ペ", "ぽ/ポ", "ま/マ", "み/ミ", " む/ム", "め/メ", "も/モ", "を/ヲ", "みゃ/ミャ", "みゅ/ミャ", "みょ/ミョ", "きゃ/キャ", "きゅ/キュ", "きょ/キョ", "にゃ/ニャ", "にゅ/ニュ", "にょ/ニョ", "びゃ/びゃ", "びゅ/ビュ", "びょ/ビョ", "  ひゃ/ヒャ", "ひゅ/ヒュ", "ひょ/ヒョ", "ぴゃ/ピャ", "ぴゅ/ピュ", "ぴょ/ピョ", "っ/ッ", "っ/ッ"];
internal var valueList:Vector.<uint>= new <uint>[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 10, 10, 5, 5, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 20, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 1, 1];
// Lists of Kana that can be replaced in the replace mode and the substitute Kana and Values
internal var selectghostList:Vector.<String>=new<String>["ま/マ","む/ム","も/モ","か/カ","く/ク","こ/コ","な/ナ","ぬ/ヌ","の/ノ","ば/バ","ぶ/ブ","ぼ/ボ","は/ハ","ふ/フ","ほ/ホ","ぱ/パ","ぷ/プ","ぽ/ポ"];
internal var selectkanaList:Vector.<String>=new <String>["みゃ/ミャ", "みゅ/ミャ", "みょ/ミョ", "きゃ/キャ", "きゅ/キュ", "きょ/キョ", "にゃ/ニャ", "にゅ/ニュ", "にょ/ニョ", "びゃ/びゃ", "びゅ/ビュ", "びょ/ビョ", "  ひゃ/ヒャ", "ひゅ/ヒュ", "ひょ/ヒョ", "ぴゃ/ピャ", "ぴゅ/ピュ", "ぴょ/ピョ"];
internal var selectghostvalueList:Vector.<uint>=new <uint>[2, 2, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2];
//Start list of playerHand contents as I don't know if Null is 0
internal var playernumber:uint;
internal var allplayersHand:Array = [[0], [0], [0], [0],[0], [0]];
internal var playerRound:uint = 1;
internal var round:uint = 1;
internal var aplayersHand:Array;
internal function create():void
{ var listLength:uint;
var row:uint
listLength = kanaList.length;
aplayersHand = allplayersHand[playerRound];
for (var i:uint = (aplayersHand.length); i <= 7; i+=1)
{row = int(Math.random() * listLength);  
trace (row);
trace(i);
aplayersHand[i] = [0, kanaList[row], valueList[row],]
trace (aplayersHand);
trace (aplayersHand[i]);
kanaList.splice(row,1);
valueList.splice(row,1);
}
deal();
}
I'm assuming it's throwing the error intermittently. The reason I think it's happening is that you stored the long array's length in listLength, but didn't decrement its value after
kanaList.splice(row,1);
valueList.splice(row,1);
which is why, I think, the row value calculated like
row = int(Math.random() * listLength);
would sometimes be an index at or beyond the array's current length at that iteration.
On a side note, it would help to see everything that was traced up to the point where you got the error. Also, the exception should show a stack trace if you compile a debug version of the SWF and run it in a debug Flash Player. The stack trace is very useful for tracking down bugs like these.
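The stale-length problem described above is not specific to Flash. Here is a minimal Python sketch of the same draw-without-replacement loop, not the original ActionScript; the names deal_hand, deck and hand_size are made up for illustration. It recomputes the list length on every iteration, so the random index always stays in range:

```python
import random

def deal_hand(deck, hand_size, rng=random):
    """Draw hand_size items from deck without replacement (hypothetical helper).

    deck is modified in place, mirroring the splice() calls in the question.
    """
    hand = []
    for _ in range(hand_size):
        # Use the *current* length: the deck shrinks by one on every draw,
        # so a length cached before the loop would allow out-of-range indexes.
        row = rng.randrange(len(deck))
        hand.append(deck.pop(row))
    return hand

deck = list(range(20))           # stand-in for kanaList
hand = deal_hand(deck, 7)
print(len(hand), len(deck))      # 7 13
```

In the ActionScript from the question, the equivalent change would be to compute row from kanaList.length inside the loop, or to decrement listLength after each splice.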

Is there anyone that knows what the following code possibly does?

/* utf-8: 0xc0, 0xe0, 0xf0, 0xf8, 0xfc */
static unsigned char _mblen_table_utf8[] =
{
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 1, 1
};
I bet it has something to do with the encodings,
but how exactly does it work?
UPDATE
while (str < ptr)
{
j = mblen[(*str)];
tree_nput(r->tree, cr, sizeof(struct rule_item), str, j);
str += j;
}
}
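Because a character in a multibyte string has a variable length, this table maps each possible first byte to the length of the character it starts.
The last 64 entries (0xC0 to 0xFF) are lead bytes of characters wider than one byte, with lengths of 2 to 6; the final two (0xFE and 0xFF) are never valid in UTF-8 and map to 1.
The usage would be something like this: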
unsigned char current_char = *mbstr;
for (i = 0; i < _mblen_table_utf8[current_char]; i++) {
/* treat *mbstr++ as a part of the current character */
}
Historically, each character was coded on 7 bits (then 8 bits), which was more than enough to encode the alphabets of European languages.
Only the first 128 characters were common to everyone; the remaining 128 were standardized through code pages (ISO-8859-1 is an example).
The need to encode languages with larger character sets, such as Chinese, resulted in the Unicode effort, where a character may be coded on several bytes.
UTF-8 is a way to encode Unicode characters in an efficient, variable-length way. This means that the first byte you read determines the length of the character's byte sequence.
Basically, your table is a lookup table giving how many bytes make up a character that starts with the byte you use as the table index. You will see another version of this table here with explanations.
I added the table indexes as comments to make it clearer:
/* utf-8: 0xc0, 0xe0, 0xf0, 0xf8, 0xfc */
static unsigned char _mblen_table_utf8[] =
{
/*0x00*/ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
/*0x10*/ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
/*0x20*/ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
/*0x30*/ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
/*0x40*/ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
/*0x50*/ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
/*0x60*/ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
/*0x70*/ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
/*0x80*/ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
/*0x90*/ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
/*0xA0*/ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
/*0xB0*/ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
/*0xC0*/ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
/*0xD0*/ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
/*0xE0*/ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
/*0xF0*/ 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 1, 1
};
The array appears to be a lookup table for determining the number of bytes in a UTF-8 character, given the first byte. Basically the first byte (as an unsigned value) is used as an index into the array, and the element at that index gives the length of the byte sequence for the UTF-8 character.
Invalid and mid-sequence bytes seem to map to 1-byte in this table, so if encountered out of place the code using this table would probably treat them as single characters (unless it specifically ignores them).
One use for a table like this is for counting characters in a UTF-8 string (not bytes, but Unicode characters). Each time you count a character, you look up the length and move ahead by the length of the character's byte sequence instead of moving ahead one byte... it works well as long as you start at the beginning of a character and the string is valid UTF-8 all the way through.
Without any further details, the code above does exactly this: it declares a static unsigned char array and initializes it with the values inside the curly brackets.
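As a sketch of the character-counting use mentioned in the answers above, here is a small Python version. It rebuilds the same 256 values from the ranges in the commented table rather than copying them out; the names build_mblen_table and count_chars are made up for this example, and like the table it assumes the input is valid UTF-8 and starts on a character boundary:

```python
def build_mblen_table():
    """Rebuild the 256-entry lead-byte length table from the question."""
    table = [1] * 256                 # ASCII, continuation bytes, 0xFE/0xFF
    for lead in range(0xC0, 0xE0):
        table[lead] = 2
    for lead in range(0xE0, 0xF0):
        table[lead] = 3
    for lead in range(0xF0, 0xF8):
        table[lead] = 4
    for lead in range(0xF8, 0xFC):
        table[lead] = 5               # legacy 5-byte form, not valid in modern UTF-8
    for lead in range(0xFC, 0xFE):
        table[lead] = 6               # legacy 6-byte form, not valid in modern UTF-8
    return table

MBLEN_TABLE = build_mblen_table()

def count_chars(data):
    """Count characters in UTF-8 bytes by hopping from lead byte to lead byte."""
    count = 0
    pos = 0
    while pos < len(data):
        pos += MBLEN_TABLE[data[pos]]  # skip the whole byte sequence
        count += 1
    return count

text = "héllo, 世界"
print(count_chars(text.encode("utf-8")), len(text))  # 9 9
```

The loop has the same shape as the while loop in the question's update: look up the length for the current byte, process that many bytes, then advance by the same amount.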

Leading zeros calculation with intrinsic function

I'm trying to optimize some code working in an embedded system (FLAC decoding, Windows CE, ARM 926 MCU).
The default implementation uses a macro and a lookup table:
/* counts the # of zero MSBs in a word */
#define COUNT_ZERO_MSBS(word) ( \
(word) <= 0xffff ? \
( (word) <= 0xff? byte_to_unary_table[word] + 24 : \
byte_to_unary_table[(word) >> 8] + 16 ) : \
( (word) <= 0xffffff? byte_to_unary_table[word >> 16] + 8 : \
byte_to_unary_table[(word) >> 24] ) \
)
static const unsigned char byte_to_unary_table[] = {
8, 7, 6, 6, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
};
However, most CPUs already have a dedicated instruction, bsr on x86 and clz on ARM (http://www.devmaster.net/articles/fixed-point-optimizations/), that should be more efficient.
On Windows CE we have the intrinsic function _CountLeadingZeros, which should just emit that instruction. However, it is 4 times slower than the macro (measured over 10 million iterations).
How is it possible that an intrinsic function, which should rely on a dedicated ASM instruction, is 4 times slower?
Check the disassembly. Are you sure that the compiler inserted the instruction? In the Remarks section there is this text:
This function can be implemented by calling a runtime function.
I suspect that's what's happening in your case.
Note that the CLZ instruction is only available in ARMv5 and later. You need to tell the compiler if you want ARMv5 code:
/QRarch5 ARM5 Architecture
/QRarch5T ARM5T Architecture
(Microsoft incorrectly uses "ARM5" instead of "ARMv5")
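For reference, here is a small Python sketch of what the lookup-table macro computes, checked against a direct leading-zero count. It says nothing about the speed of the intrinsic; it only makes the four branches of the macro explicit. The function names are made up for this example, and words are assumed to be unsigned 32-bit values:

```python
# Same values as byte_to_unary_table: number of leading zero bits in a byte.
BYTE_TO_UNARY = [8 - n.bit_length() for n in range(256)]

def count_zero_msbs_table(word):
    """Table-based count for a 32-bit word, following the macro's four branches."""
    if word <= 0xFFFF:
        if word <= 0xFF:
            return BYTE_TO_UNARY[word] + 24
        return BYTE_TO_UNARY[word >> 8] + 16
    if word <= 0xFFFFFF:
        return BYTE_TO_UNARY[word >> 16] + 8
    return BYTE_TO_UNARY[word >> 24]

def count_zero_msbs_direct(word):
    """Reference result: 32 minus the position of the highest set bit."""
    return 32 - word.bit_length()

for w in (0, 1, 0xFF, 0x100, 0xFFFF, 0x10000, 0xFFFFFF, 0x1000000, 0xFFFFFFFF):
    assert count_zero_msbs_table(w) == count_zero_msbs_direct(w)
print(count_zero_msbs_table(0x00012345))  # 15
```

The table version costs at most two comparisons, one shift and one memory load, which is why a runtime function call can end up slower than the macro even though a single clz instruction would not.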
