Interpreting branch coverage on a function call - C

I'm trying to understand a branch coverage report generated with gcov/lcov. Below is a section of the output. The issue is on line 84, where there is a glaring minus sign next to a call to a stub function. The stub does not contain any branching statements.
How does one interpret missing branch coverage on a function call?
81 [ + + ][ + + ]: 28 : if(SerialIO_response_count > 0 && SerialIO_tx_read != SerialIO_tx_write ){
82 : :
83 : 16 : tx_char = SerialIO_get_next_from_buffer(TX_BUFFER);
84 [ + - ]: 16 : usart_write_job(SerialIO_usart_module, &tx_char);
85 : :
86 : : // Decrement response count if CR transmitted
87 [ + + ]: 16 : if(tx_char == '\r')
88 : 4 : SerialIO_response_count--;
Thanks!

Turns out this was a symptom of exception handling: with exceptions enabled, the compiler inserts a hidden branch at each call site for the unwind path, which a normal test run never takes.
See the related post for a quick fix: turning on -fno-exceptions when compiling.

Deleting rows where the difference between column 1 and column 2 is greater than 1

Some of the values in columns Molecular.Weight and m.z are quite similar, often differing only by 1.0 or less. But there are some instances where the difference is greater than 1.0. I would like to generate a new dataset that only includes the rows with a difference less than or equal to 1.0. However, either column can hold the higher number, so I am struggling to write a condition that works.
'data.frame': 544 obs. of 48 variables:
$ X : int 1 2 3 4 5 6 7 8 9 10 ...
$ No. : int 2 32 34 95 114 141 169 234 236 278 ...
$ RT..min. : num 0.89 3.921 0.878 2.396 0.845 ...
$ Molecular.Weight : num 70 72 72 78 80 ...
$ m.z : num 103 145 114 120 113 ...
$ HMDB.ID : chr "HMDB0006804" "HMDB0031647" "HMDB0006112" "HMDB0001505" ...
$ Name : chr "Propiolic acid" "Acrylic acid" "Malondialdehyde" "Benzene" ...
$ Formula : chr "C3H2O2" "C3H4O2" "C3H4O2" "C6H6" ...
$ Monoisotopic_Mass: num 70 72 72 78 80 ...
$ Delta.ppm. : num 1.295 0.833 1.953 1.023 0.102 ...
$ X1 : num 288.3 16.7 1130.9 3791.5 33.5 ...
$ X2 : num 276.8 13.4 1069.1 3228.4 44.1 ...
$ X3 : num 398.6 19.3 794.8 2153.2 15.8 ...
$ X4 : num 247.6 100.5 1187.5 1791.4 33.4 ...
$ X5 : num 98.4 162.1 1546.4 1646.8 45.3 ...
I had to do it in 2 parts because I couldn't figure out how to combine them, but it's still not giving me the right result.
The first section is supposed to filter out the rows where Molecular.Weight is greater than m.z by more than 1, and the second then filters out the rows where m.z is greater than Molecular.Weight by more than 1. The first part seems to work and gives me a new dataset with around half the number of rows, but when I run the second part on it, it gives me 1 row (and it's not even correct, because that one compound does fall within the 1.0 difference). Any help is super appreciated, thanks!
rawdata <- read.csv("Analysis negative + positive minus QC.csv")

filtered_data <- c()
for (i in 1:nrow(rawdata)) {
  if (rawdata$m.z[i] - rawdata$Molecular.Weight[i] < 1)
    filtered_data <- rbind(filtered_data, rawdata[i, ])
}

newdata <- c()
for (i in 1:row(filtered_data)) {
  if ((filtered_data$Molecular.Weight[i] - filtered_data$m.z[i]) > 1)
    newdata <- rbind(newdata, filtered_data[i, ])
}
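For what it's worth, both loops collapse into a single rule: keep a row when the absolute difference between the two columns is at most 1.0, which handles either column being the larger one (in R that condition would be abs(Molecular.Weight - m.z) <= 1). A minimal sketch of that rule, in Python rather than R, with made-up rows:

```python
# Toy rows standing in for the R data frame (column names follow the
# question; the numbers are invented for illustration).
rows = [
    {"Name": "A", "Molecular.Weight": 70.0, "m.z": 70.5},   # diff 0.5 -> keep
    {"Name": "B", "Molecular.Weight": 72.0, "m.z": 145.0},  # diff 73  -> drop
    {"Name": "C", "Molecular.Weight": 78.0, "m.z": 77.2},   # diff 0.8 -> keep
]

# One pass, one condition: abs() covers both "Molecular.Weight higher"
# and "m.z higher", which is what the two separate loops were trying to do.
kept = [r for r in rows if abs(r["Molecular.Weight"] - r["m.z"]) <= 1.0]

print([r["Name"] for r in kept])  # -> ['A', 'C']
```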

Why can't I use increment operators with useState? [duplicate]

I was browsing Google Code when I chanced upon this project called JSpeed - optimization for Javascript.
I noticed one of the optimizations was to change i++ to ++i in for loop statements.
Before Optimization
for (i=0;i<1;i++) {}
for (var i = 0, j = 0; i < 1000000; i++, j++) {
if (i == 4) {
var tmp = i / 2;
}
if ((i % 2) == 0) {
var tmp = i / 2;
i++;
}
}
var arr = new Array(1000000);
for (i = 0; i < arr.length; i++) {}
After optimization
for(var i=0;i<1;++i){}
for(var i=0,j=0;i<1000000;++i,++j){if(i==4){var tmp=i>>1;}
if((i&1)==0){var tmp=i>>1;i++;}}
var arr=new Array(1000000);for(var i=0,arr_len=arr.length;i<arr_len;++i){}
I know what pre- and post-increments do, but any idea how this speeds the code up?
Here is something I read that may answer your question: "preincrement (++i) adds one to the value of i, then returns i; in contrast, i++ returns i and then adds one to it, which in theory results in the creation of a temporary variable storing the value of i before the increment operation was applied".
This is a faux optimization. As far as I understand it, you're saving one opcode. If you're looking to optimize your code with this technique, then you've gone down the wrong path. Also, most compilers/interpreters will optimize this for you anyway (reference 1). In short, I wouldn't worry about it. But if you're really worried, you can use i += 1.
Here's the quick-and-dirty benchmark I just did
var MAX = 1000000, t=0,i=0;
t = (new Date()).getTime();
for ( i=0; i<MAX;i++ ) {}
t = (new Date()).getTime() - t;
console.log(t);
t = (new Date()).getTime();
for ( i=0; i<MAX;++i ) {}
t = (new Date()).getTime() - t;
console.log(t);
t = (new Date()).getTime();
for ( i=0; i<MAX;i+=1 ) {}
t = (new Date()).getTime() - t;
console.log(t);
Raw results
Post Pre +=
1071 1073 1060
1065 1048 1051
1070 1065 1060
1090 1070 1060
1070 1063 1068
1066 1060 1064
1053 1063 1054
Removed lowest and highest
Post Pre +=
1071 ---- 1060
1065 ---- ----
1070 1065 1060
---- 1070 1060
1070 1063 ----
1066 1060 1064
---- 1063 1054
Averages
1068.4 1064.2 1059.6
Notice that this is over one million iterations and the results are within 9 milliseconds on average. Not really much of an optimization considering that most iterative processing in JavaScript is done over much smaller sets (DOM containers for example).
In theory, using a post-increment operator may produce a temporary. In practice, JavaScript compilers are smart enough to avoid that, especially in such a trivial case.
For example, let's consider that sample code:
sh$ cat test.js
function preInc(){
for(i=0; i < 10; ++i)
console.log(i);
}
function postInc(){
for(i=0; i < 10; i++)
console.log(i);
}
// force lazy compilation
preInc();
postInc();
In that case, the V8 compiler in NodeJS produces exactly the same bytecode (look esp. at opcodes 39-44 for the increment):
sh$ node --version
v8.9.4
sh$ node --print-bytecode test.js | sed -nEe '/(pre|post)Inc/,/^\[/p'
[generating bytecode for function: preInc]
Parameter count 1
Frame size 24
77 E> 0x1d4ea44cdad6 # 0 : 91 StackCheck
87 S> 0x1d4ea44cdad7 # 1 : 02 LdaZero
88 E> 0x1d4ea44cdad8 # 2 : 0c 00 03 StaGlobalSloppy [0], [3]
94 S> 0x1d4ea44cdadb # 5 : 0a 00 05 LdaGlobal [0], [5]
0x1d4ea44cdade # 8 : 1e fa Star r0
0x1d4ea44cdae0 # 10 : 03 0a LdaSmi [10]
94 E> 0x1d4ea44cdae2 # 12 : 5b fa 07 TestLessThan r0, [7]
0x1d4ea44cdae5 # 15 : 86 23 JumpIfFalse [35] (0x1d4ea44cdb08 # 50)
83 E> 0x1d4ea44cdae7 # 17 : 91 StackCheck
109 S> 0x1d4ea44cdae8 # 18 : 0a 01 0d LdaGlobal [1], [13]
0x1d4ea44cdaeb # 21 : 1e f9 Star r1
117 E> 0x1d4ea44cdaed # 23 : 20 f9 02 0f LdaNamedProperty r1, [2], [15]
0x1d4ea44cdaf1 # 27 : 1e fa Star r0
121 E> 0x1d4ea44cdaf3 # 29 : 0a 00 05 LdaGlobal [0], [5]
0x1d4ea44cdaf6 # 32 : 1e f8 Star r2
117 E> 0x1d4ea44cdaf8 # 34 : 4c fa f9 f8 0b CallProperty1 r0, r1, r2, [11]
102 S> 0x1d4ea44cdafd # 39 : 0a 00 05 LdaGlobal [0], [5]
0x1d4ea44cdb00 # 42 : 41 0a Inc [10]
102 E> 0x1d4ea44cdb02 # 44 : 0c 00 08 StaGlobalSloppy [0], [8]
0x1d4ea44cdb05 # 47 : 77 2a 00 JumpLoop [42], [0] (0x1d4ea44cdadb # 5)
0x1d4ea44cdb08 # 50 : 04 LdaUndefined
125 S> 0x1d4ea44cdb09 # 51 : 95 Return
Constant pool (size = 3)
Handler Table (size = 16)
[generating bytecode for function: get]
[generating bytecode for function: postInc]
Parameter count 1
Frame size 24
144 E> 0x1d4ea44d821e # 0 : 91 StackCheck
154 S> 0x1d4ea44d821f # 1 : 02 LdaZero
155 E> 0x1d4ea44d8220 # 2 : 0c 00 03 StaGlobalSloppy [0], [3]
161 S> 0x1d4ea44d8223 # 5 : 0a 00 05 LdaGlobal [0], [5]
0x1d4ea44d8226 # 8 : 1e fa Star r0
0x1d4ea44d8228 # 10 : 03 0a LdaSmi [10]
161 E> 0x1d4ea44d822a # 12 : 5b fa 07 TestLessThan r0, [7]
0x1d4ea44d822d # 15 : 86 23 JumpIfFalse [35] (0x1d4ea44d8250 # 50)
150 E> 0x1d4ea44d822f # 17 : 91 StackCheck
176 S> 0x1d4ea44d8230 # 18 : 0a 01 0d LdaGlobal [1], [13]
0x1d4ea44d8233 # 21 : 1e f9 Star r1
184 E> 0x1d4ea44d8235 # 23 : 20 f9 02 0f LdaNamedProperty r1, [2], [15]
0x1d4ea44d8239 # 27 : 1e fa Star r0
188 E> 0x1d4ea44d823b # 29 : 0a 00 05 LdaGlobal [0], [5]
0x1d4ea44d823e # 32 : 1e f8 Star r2
184 E> 0x1d4ea44d8240 # 34 : 4c fa f9 f8 0b CallProperty1 r0, r1, r2, [11]
168 S> 0x1d4ea44d8245 # 39 : 0a 00 05 LdaGlobal [0], [5]
0x1d4ea44d8248 # 42 : 41 0a Inc [10]
168 E> 0x1d4ea44d824a # 44 : 0c 00 08 StaGlobalSloppy [0], [8]
0x1d4ea44d824d # 47 : 77 2a 00 JumpLoop [42], [0] (0x1d4ea44d8223 # 5)
0x1d4ea44d8250 # 50 : 04 LdaUndefined
192 S> 0x1d4ea44d8251 # 51 : 95 Return
Constant pool (size = 3)
Handler Table (size = 16)
Of course, other JavaScript compilers/interpreters may do otherwise, but I doubt it.
As a last word, for what it's worth, I nevertheless consider it a best practice to use pre-increment when possible: since I frequently switch languages, I prefer using the syntax with the correct semantics for what I want, instead of relying on compiler smartness. Modern C compilers won't make any difference either. But in C++, this can have a significant impact with an overloaded operator++.
Sounds like premature optimization. When you're nearly done with your app, check where the bottlenecks are and optimize those as needed. But if you want a thorough guide to loop performance, check this out:
http://blogs.oracle.com/greimer/entry/best_way_to_code_a
But you never know when this will become obsolete because of JS engine improvements and variations between browsers. The best choice is to not worry about it until it's a problem, and to make your code clear to read.
Edit: According to this guy, the pre vs. post difference is statistically insignificant (with pre possibly being worse).
Anatoliy's test included a post-decrement (i--) inside the pre-increment test function :(
Here are the results without this side effect...
function test_post() {
console.time('postIncrement');
var i = 1000000, x = 0;
do x++; while(i--);
console.timeEnd('postIncrement');
}
function test_pre() {
console.time('preIncrement');
var i = 1000000, x = 0;
do ++x; while(--i);
console.timeEnd('preIncrement');
}
test_post();
test_pre();
test_post();
test_pre();
test_post();
test_pre();
test_post();
test_pre();
Output
postIncrement: 3.21ms
preIncrement: 2.4ms
postIncrement: 3.03ms
preIncrement: 2.3ms
postIncrement: 2.53ms
preIncrement: 1.93ms
postIncrement: 2.54ms
preIncrement: 1.9ms
That's a big difference.
The optimization isn't the pre versus post increment. It's the use of bitwise 'shift' and 'and' operators rather than divide and mod.
There is also the optimization of minifying the javascript to decrease the total size (but this is not a runtime optimization).
This is probably cargo-cult programming.
It shouldn't make a difference when you're using a decent compiler/interpreter for a language that doesn't have arbitrary operator overloading.
This optimization made sense for C++ where
T x = ...;
++x
could modify a value in place whereas
T x = ...;
x++
would have to create a copy by doing something under-the-hood like
T x = ...;
T copy;
(copy = T(x), ++x, copy)
which could be expensive for large struct types or for types that do lots of computation in their copy constructor.
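The difference between the two semantics can be sketched outside C++ as well; here is an illustrative Python stand-in (the Counter class and method names are made up) showing where the temporary comes from:

```python
class Counter:
    """Illustrative stand-in for a C++ type with overloaded increments."""
    def __init__(self, value=0):
        self.value = value

    def pre_inc(self):
        # ++x: bump in place, return the (new) value -- no temporary needed.
        self.value += 1
        return self.value

    def post_inc(self):
        # x++: copy the old value, bump, return the copy -- the temporary
        # that a compiler may or may not be able to optimize away.
        old = self.value
        self.value += 1
        return old

c = Counter()
print(c.post_inc(), c.value)  # -> 0 1 (returns old value)
print(c.pre_inc(), c.value)   # -> 2 2 (returns new value)
```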
Just tested it in Firebug and found no difference between post- and pre-increments. Maybe this optimization matters on other platforms?
Here is my code for firebug testing:
function test_post() {
console.time('postIncrement');
var i = 1000000, x = 0;
do x++; while(i--);
console.timeEnd('postIncrement');
}
function test_pre() {
console.time('preIncrement');
var i = 1000000, x = 0;
do ++x; while(i--);
console.timeEnd('preIncrement');
}
test_post();
test_pre();
test_post();
test_pre();
test_post();
test_pre();
test_post();
test_pre();
Output is:
postIncrement: 140ms
preIncrement: 160ms
postIncrement: 136ms
preIncrement: 157ms
postIncrement: 148ms
preIncrement: 137ms
postIncrement: 136ms
preIncrement: 148ms
Using post-increment here causes a stack overflow. Why? Because start++ and end-- return the old values, so the recursive call always receives the same arguments and never makes progress.
function reverseString(string = [], start = 0, end = string.length - 1) {
    if (start >= end) return string
    let temp = string[start]
    string[start] = string[end]
    string[end] = temp
    // don't do this:
    // reverseString(string, start++, end--)
    reverseString(string, ++start, --end)
    return string
}

let array = ["H","a","n","n","a","h"]
console.log(reverseString(array))
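The same pitfall, sketched in Python for illustration (the function name is mine): the recursion only terminates because the call receives the updated indices, which is what ++start/--end supplies and start++/end-- does not.

```python
def reverse_in_place(chars, start=0, end=None):
    # Swap the outermost pair, then recurse inward on *updated* indices.
    if end is None:
        end = len(chars) - 1
    if start >= end:
        return chars
    chars[start], chars[end] = chars[end], chars[start]
    # Passing start + 1 / end - 1 is the ++start / --end version.
    # Passing start / end unchanged (the start++ / end-- version) would
    # recurse forever with the same arguments.
    return reverse_in_place(chars, start + 1, end - 1)

print(reverse_in_place(list("Hannah")))  # -> ['h', 'a', 'n', 'n', 'a', 'H']
```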

how to do memory blocking for this code snippet

I have this piece of code and I am trying to optimize it for temporal and spatial locality using cache blocking. (https://www.intel.com/content/www/us/en/developer/articles/technical/cache-blocking-techniques.html)
void randFunction1(int *arrayb, int dimension)
{
    int i, j;
    for (i = 0; i < dimension; ++i)
        for (j = 0; j < dimension; ++j) {
            arrayb[j * dimension + i] = arrayb[j * dimension + i] || arrayb[i * dimension + j];
        }
}
This is how I have optimised it, but I was told it doesn't seem to make use of memory blocking techniques.
for (int i = 0; i < dimension; ++i) {
    int j = i;
    for (; j < dimension; ++j) {
        // access 2 times
        arrayb[j * dimension + i] = arrayb[j * dimension + i] || arrayb[i * dimension + j];
        arrayb[i * dimension + j] = arrayb[i * dimension + j] || arrayb[j * dimension + i];
    }
}
Could someone tell me how I can make use of the cache blocking (using locality for smaller tiles) for this sample piece of code? Any help is appreciated thank you!
I think you have a fundamental misunderstanding of cache blocking, misunderstood what you were being asked to do, or whoever asked you to do it doesn't understand it. I am also hesitant to give you the full answer, because this smells like a contrived homework problem.
The idea is to block/tile/window up the data you're operating on, so the data you're operating on stays in the cache as you operate on it. To do this effectively you need to know the size of the cache and the size of the objects. You didn't give us enough details to know these answers but I can make some assumptions to illustrate how you might do this with the above code.
First, a note on how the array is laid out in memory, just so we can reference it later. Say dimension is 3.
That means we have a grid layout where i is the first number and j is the second, like...
[0,0][0,1][0,2]
[1,0][1,1][1,2]
[2,0][2,1][2,2]
which is really in memory like:
[0,0][0,1][0,2][1,0][1,1][1,2][2,0][2,1][2,2]
We can also treat this like a 1d array where:
[0,0][0,1][0,2][1,0][1,1][1,2][2,0][2,1][2,2]
[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ]
If our cache line could hold, say, 3 of these elements, then there would be 3 'blocks': 0-2, 3-5, and 6-8. If we access them in order, blocking just happens (assuming correct byte alignment of index 0 of the array... but let's keep it simple for now - this is likely taken care of already anyway). That is, when we access 0, then 0, 1, and 2 are loaded into the cache. Next we access 1: it's already there. Then 2: already there. Then 3 loads 3, 4, and 5 into the cache, and so on.
Let's take a look at the original code for a second.
arrayb[j * dimension+ i] = arrayb[j * dimension+ i] || arrayb[i * dimension+ j];
Let's do just a couple of iterations, but take out the indexing variables and replace them with their values. I'll use ^ to point to the indexes you access and | to show the boundaries of our imaginary cache lines.
arrayb[0] = arrayb[0] || arrayb[0]
[ 0 ][ 1 ][ 2 ] | [ 3 ][ 4 ][ 5 ] | [ 6 ][ 7 ][ 8 ]
  ^
arrayb[3] = arrayb[3] || arrayb[1]
[ 0 ][ 1 ][ 2 ] | [ 3 ][ 4 ][ 5 ] | [ 6 ][ 7 ][ 8 ]
       ^            ^
arrayb[6] = arrayb[6] || arrayb[2]
[ 0 ][ 1 ][ 2 ] | [ 3 ][ 4 ][ 5 ] | [ 6 ][ 7 ][ 8 ]
            ^                         ^
arrayb[1] = arrayb[1] || arrayb[3]
[ 0 ][ 1 ][ 2 ] | [ 3 ][ 4 ][ 5 ] | [ 6 ][ 7 ][ 8 ]
       ^            ^
So you see, other than on the first iteration, you cross a cache-line boundary every time, jumping all over the place.
I think you noticed that the operation you're performing is a logical OR. That means you do not have to preserve the original order of operations as you go through the loop; your answer will be the same. That is, it doesn't matter whether you do arrayb[1] = arrayb[1] || arrayb[3] first or arrayb[3] = arrayb[3] || arrayb[1] first.
In your proposed solution you might think you're doing a little better, because you noticed the pattern where on the second and fourth iterations we access the same indexes (just flipping where we're reading and writing), but you didn't adjust the loops at all, so you actually just did twice the work:
0 = 0 || 0
0 = 0 || 0
3 = 3 || 1
1 = 1 || 3
6 = 6 || 2
2 = 2 || 6
1 = 1 || 3
3 = 3 || 1
4 = 4 || 4
4 = 4 || 4
7 = 7 || 5
5 = 5 || 7
2 = 2 || 6
6 = 6 || 2
5 = 5 || 7
7 = 7 || 5
8 = 8 || 8
8 = 8 || 8
If you fix the double work you're on your way, but you're not really using a blocking strategy. And to be honest, you can't. It's almost like the problem was designed to be not-real-world and to purposely cause caching problems. The problem with your example is that you're using a single array, and each pair of memory locations is only accessed together (twice). Other than their swap, they're never reused.
You can kind of optimize some of the accesses, but you'll always be stuck with a majority of accesses that cross boundaries. I think this is what you've been asked to do, but this is not a very good example problem for it. If we keep in mind how the memory in your array is actually being accessed and never really reused, then increasing the size of the example makes it really obvious.
Say dimension was 8 and your cache was big enough to hold 16 items (x86_64 can hold 16 ints in a cache line). Then the most optimal access groupings would be operations where all indexes fell within 0-15, 16-31, 32-47, or 48-63. There aren't that many of them.
Not crossing a cache line:
0 = 0 || 0
1 = 1 || 8
8 = 8 || 1
9 = 9 || 9
18 = 18 || 18
19 = 19 || 26
26 = 26 || 19
27 = 27 || 27
36 = 36 || 36
37 = 37 || 44
44 = 44 || 37
54 = 54 || 54
55 = 55 || 62
62 = 62 || 55
63 = 63 || 63
Always crossing a cache line:
2 = 2 || 16
3 = 3 || 24
4 = 4 || 32
5 = 5 || 40
6 = 6 || 48
7 = 7 || 56
10 = 10 || 17
11 = 11 || 25
12 = 12 || 33
13 = 13 || 41
14 = 14 || 49
15 = 15 || 57
16 = 16 || 2
17 = 17 || 10
20 = 20 || 34
21 = 21 || 42
22 = 22 || 50
23 = 23 || 58
24 = 24 || 3
25 = 25 || 11
28 = 28 || 35
29 = 29 || 43
30 = 30 || 51
31 = 31 || 59
32 = 32 || 4
33 = 33 || 12
34 = 34 || 20
35 = 35 || 28
38 = 38 || 52
39 = 39 || 60
40 = 40 || 5
41 = 41 || 13
42 = 42 || 21
43 = 43 || 29
45 = 45 || 45
46 = 46 || 53
47 = 47 || 61
48 = 48 || 6
49 = 49 || 14
50 = 50 || 22
51 = 51 || 30
52 = 52 || 38
53 = 53 || 46
56 = 56 || 7
57 = 57 || 15
58 = 58 || 23
59 = 59 || 31
60 = 60 || 39
61 = 61 || 47
This really gets terrible as the number of items outpaces the number that'll fit in the cache. Your only hope of saving anything at this point is the pattern you noticed where you can do half the memory accesses, which, while smart, is not blocking/tiling.
The link you provided is similarly bad, imo, for illustrating cache blocking. It doesn't do a good job of describing what is actually taking place in its loops, but at least it tries.
They tile the inner loop to keep memory accesses more local, which I think is what you've been asked to do, but you've been given a problem it can't really apply to.
It smells like your teacher meant to give you 2 or 3 arrays, but accidentally gave you just one. It's very close to matrix multiplication, but it's missing an inner loop and two other arrays.
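That said, here is what a tiled version of this particular loop could look like: since OR is order-independent, each (i, j)/(j, i) pair can be combined once, and the pairs can be grouped into B x B tile pairs so both tiles involved stay cache-resident while they're worked on. This is a sketch in Python just to show the loop structure (the function name and B are mine; in C you would derive B from the cache-line and cache sizes):

```python
def symmetric_or_blocked(a, n, B=4):
    # Walk tile pairs (bi, bj) with bj >= bi; within a tile pair, update
    # a[i][j] and a[j][i] together so both tiles stay in cache.
    for bi in range(0, n, B):
        for bj in range(bi, n, B):
            for i in range(bi, min(bi + B, n)):
                # Start j at max(i, bj) so each unordered pair is
                # visited exactly once (upper triangle only).
                for j in range(max(i, bj), min(bj + B, n)):
                    v = a[i * n + j] or a[j * n + i]
                    a[i * n + j] = v
                    a[j * n + i] = v
```

The result matches the original double loop because a || b is symmetric and idempotent; the tiling only changes the visit order, not the final values.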

Trouble implementing crib drag

So, I'm enrolled in Stanford's Cryptography class on Coursera and I've been struggling with the first programming assignment (I'm a few weeks behind).
I've been playing around with different variations of this code for a few weeks to try and remove the issues mentioned below...
At first, I thought I was getting a successful crib-drag, but then I realized a multitude of issues, none of which I've been able to solve:
The "deciphered" results shorten from the wrong end of the word as the crib is dragged across the XOR of the two ciphertexts ("The" > "Th" > "T" instead of "The" > "he" > "e")
The result of crib dragging isn't the text of the other message, but the crib itself... in other words, no matter what crib I choose, the first X indices ALWAYS return the crib itself
Here's the code:
def string_xor(a, b):
    return "".join([chr(ord(x) ^ ord(y)) for (x, y) in zip(a, b)])

def manual_crib_drag(word):
    with open("two_ciphers.txt", "r") as f:
        cipher1 = f.readline()
        cipher2 = f.readline()
    xor = string_xor(cipher1, cipher2)
    word_hex = word.encode("hex")
    for x in range(len(xor)):
        try:
            result = string_xor(xor[x:x+len(word_hex)], word_hex)\
                .strip().decode("hex")
            print x, ":", result
        except TypeError, e:
            print
Here are the results when running manual_crib_drag("The "):
0 : The
1 : The
2 : The
3 : The
4 : The
5 : The
6 : The
7 : The
8 : The
9 : The
10 : The
11 : The
12 : The
13 : The
14 : The
15 : The
16 : The
17 : The
18 : The
19 : The
20 : The
21 : The
22 : The
23 : The
24 : The
25 : The
26 : The
27 : The
28 : The
29 : The
30 : The
31 : The
32 : The
33 : The
34 : The
35 : The
36 : The
37 : The
38 : The
39 : The
40 : The
41 : The
43 : The?
46 : Tcn$
53 : ??S
71 : PN?
80 : CT"#
83 : ?Q?
88 : `n$
94 : P'e<
99 : U??
118 : b}l
123 : Ǹd?
132 : Tokf
138 : X6%
148 : YW0-
155 : ??4d
161 : ???
171 : ??!
173 : ??d1
177 : Uy?G
200 : hm
202 : de*t
218 : Xn q
238 : Ti0:
249 : 4|5!
253 : i?u
258 : ;G+
263 : t?Qq
269 : )?
275 : t??
282 : Z
285 : G?d
313 : sLtU
319 : !9u?
320 : yo
325 : ?kv0
329 : Gx??
331 : Dﺹ?
For completeness, here are the two cipher texts used in the example:
32510bfbacfbb9befd54415da243e1695ecabd58c519cd4bd2061bbde24eb76a19d84aba34d8de287be84d07e7e9a30ee714979c7e1123a8bd9822a33ecaf512472e8e8f8db3f9635c1949e640c621854eba0d79eccf52ff111284b4cc61d11902aebc66f2b2e436434eacc0aba938220b084800c2ca4e693522643573b2c4ce35050b0cf774201f0fe52ac9f26d71b6cf61a711cc229f77ace7aa88a2f19983122b11be87a59c355d25f8e4
32510bfbacfbb9befd54415da243e1695ecabd58c519cd4bd90f1fa6ea5ba47b01c909ba7696cf606ef40c04afe1ac0aa8148dd066592ded9f8774b529c7ea125d298e8883f5e9305f4b44f915cb2bd05af51373fd9b4af511039fa2d96f83414aaaf261bda2e97b170fb5cce2a53e675c154c0d9681596934777e2275b381ce2e40582afe67650b13e72287ff2270abcf73bb028932836fbdecfecee0a3b894473c1bbeb6b4913a536ce4f9b13f1efff71ea313c8661dd9a4ce
And the result of XORing these two cipher texts is:
PRX]
TS\TW]SW\[\VTS\^W[
TVSPZWSQV
[TZ[P\Q[PZRUS[TVTU[ZUQT[][SZRTWV
h
I'm not sure why it's broken into separate lines, but my guess is that it has to do with whitespace in the ciphertext XOR result... Wrapping the return of the string_xor function in another join seems to do the trick, but because it doesn't affect the results of the crib drag, I left it out of the provided code:
" ".join("".join([chr(ord(x) ^ ord(y)) for (x, y) in zip(a ,b)]).split())
I'd appreciate any help! Thanks in advance.
Try converting the hex ciphertexts to raw bytes first, and then do the XOR.
from binascii import unhexlify
# strxor is a byte-wise XOR helper (e.g. from the assignment's starter code)

x = strxor(unhexlify(ciphertexts[0]), unhexlify(target))
print "Ciphertext[0] xor Target\n"
crib = raw_input("Enter Crib:>")
print "Crib\n-->%s<--" % crib
# Crib Drag
for i in range(len(x)):
    z = x[i:]
    print "\n[%d]" % i
    print "%s" % strxor(z, crib)
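For anyone landing here on Python 3, here is a self-contained sketch of the same approach (the plaintexts and keystream in the demo are made up; hexlify/unhexlify come from the standard binascii module). The key point is to unhexlify before XORing: XORing the hex characters themselves is what made the first several dozen offsets echo the crib back in the question's version, because the two hex strings share a long common prefix that XORs to NUL bytes.

```python
from binascii import hexlify, unhexlify

def xor_bytes(a, b):
    # XOR raw bytes, truncating to the shorter input.
    return bytes(x ^ y for x, y in zip(a, b))

def crib_drag(c1_hex, c2_hex, crib):
    # Decode the hex text to raw bytes *before* XORing.
    xored = xor_bytes(unhexlify(c1_hex.strip()), unhexlify(c2_hex.strip()))
    crib_bytes = crib.encode()
    for i in range(len(xored) - len(crib_bytes) + 1):
        yield i, xor_bytes(xored[i:i + len(crib_bytes)], crib_bytes)

# Toy demo: two plaintexts "encrypted" with the same made-up keystream.
key = bytes(range(32, 47))
p1, p2 = b"attack at dawn!", b"the eagle flies"
c1 = hexlify(xor_bytes(p1, key)).decode()
c2 = hexlify(xor_bytes(p2, key)).decode()

# Dragging "the " across c1 xor c2 reveals a fragment of the *other*
# plaintext wherever the crib lines up (here, "atta" at offset 0).
for offset, fragment in crib_drag(c1, c2, "the "):
    if fragment == b"atta":
        print(offset, fragment)
```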

XBee packet format

I have two IEEE 802.15.4 devices running. The question is about the XBee-PRO.
Firmware: XBEE PRO 802.15.4 (Version: 10e6)
Hardware: XBEE (Version: 1744)
Both units are configured to the same channel (15) and the same PAN ID (0x1234). It's hooked to my machine's COM port and can actually transmit data when I connect picocom to it. (It responds to AT commands properly and can be configured fully through moltosenso Network Manager - I'm on a Mac.) All other registers are at their defaults, apart from the serial baud rate.
The XBee side source address is at 0x1, destination address is 0x2. Now when I type an ASCII character into picocom, this is what I see received on the other device, running in promiscous mode:
-- Typing "a"
E 61 88 7E 34 12 2 0 1 0 2B 0 61 E1
E 61 88 7E 34 12 2 0 1 0 2B 0 61 E1
E 61 88 7E 34 12 2 0 1 0 2B 0 61 E1
E 61 88 7E 34 12 2 0 1 0 2B 0 61 E1
-- Typing "b"
E 61 88 7F 34 12 2 0 1 0 2C 0 62 58
E 61 88 7F 34 12 2 0 1 0 2C 0 62 58
E 61 88 7F 34 12 2 0 1 0 2C 0 62 58
E 61 88 7F 34 12 2 0 1 0 2C 0 62 58
--- Typing "a" again
E 61 88 80 34 12 2 0 1 0 2D 0 61 A9
E 61 88 80 34 12 2 0 1 0 2D 0 61 A9
...
ln pc pan da sa ct pl ck
So for every data payload sent, I see four frames go out (nobody is picking them up, of course). I suppose three of these are 802.15.4 retries, and the XBee adds another one for kicks (although the RR register is clearly zero...).
What's the packet format here and where is this specified?
I've looked at XBee API packets and this does look vaguely similar, but I don't see 0x7e delimiters or anything like that here.
I guess what I am seeing is:
ln = length
61 = ??
88 = ??
pc = some sort of packet counter
pan = 16 bits of PAN ID
da = 16 bits of destination address
sa = 16 bits of source address
ct = another counter?
0 = ??
pl = my ASCII character payload
ck = probably a checksum
I tried setting PAN to 0xFFFF and setting the destination address to 0xFF or broadcast, and saw pretty much the same thing. These 0x61 and 0x88 don't seem to correspond to much of anything in the XBee documentation...
It doesn't directly look like 802.15.4 MAC level data frame either - or if it does, what are the missing fields and where are they specified? Pointers?
EDIT:
Actually, hmm. After importing a hex-formatted dump into Wireshark, it told me exactly that: this is an 802.15.4 MAC frame, and here is how to read it.
IEEE 802.15.4 Data, Dst: 0x0002, Src: 0x0001, Bad FCS
Frame Control Field: Data (0x8861)
.... .... .... .001 = Frame Type: Data (0x0001)
.... .... .... 0... = Security Enabled: False
.... .... ...0 .... = Frame Pending: False
.... .... ..1. .... = Acknowledge Request: True
.... .... .1.. .... = Intra-PAN: True
.... 10.. .... .... = Destination Addressing Mode: Short/16-bit (0x0002)
..00 .... .... .... = Frame Version: 0
10.. .... .... .... = Source Addressing Mode: Short/16-bit (0x0002)
Sequence Number: 126
Destination PAN: 0x1234
Destination: 0x0002
Source: 0x0001
I still don't know where the second 16-bit counter comes from in front of the actual data byte, and why FCS is messed up (I had to strip the beginning len field to get Wireshark to read it - that's probably it.)
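For reference, the Wireshark breakdown can be reproduced with a few lines of struct unpacking over the captured bytes (leading length byte stripped); a sketch, with field offsets following the 802.15.4 MAC data-frame header layout (little-endian frame control and addresses):

```python
import struct

def parse_mac_header(frame):
    # 802.15.4 data-frame header: FCF (2 bytes, LE), sequence number (1),
    # dest PAN (2, LE), dest addr (2, LE), src addr (2, LE) -- 9 bytes total.
    fcf, seq, pan, dst, src = struct.unpack_from("<HBHHH", frame, 0)
    return {
        "frame_type": fcf & 0x7,          # 1 == data frame
        "ack_request": bool(fcf & 0x20),
        "intra_pan": bool(fcf & 0x40),
        "seq": seq,
        "dest_pan": pan,
        "dest": dst,
        "src": src,
        "rest": frame[9:],                # everything after the MAC header
    }

# The first captured frame, minus its leading length byte (0x0E).
frame = bytes([0x61, 0x88, 0x7E, 0x34, 0x12, 0x02, 0x00, 0x01, 0x00,
               0x2B, 0x00, 0x61, 0xE1])
print(parse_mac_header(frame))
```

This reports frame type 1 (data), ack request and intra-PAN set, sequence 126, destination PAN 0x1234, destination 0x0002, source 0x0001 - matching the Wireshark decode - and leaves 2B 00 61 E1 as the two mystery counter bytes, the data byte 0x61 ("a"), and the trailing check byte.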
I think the second counter ct is a counter for the application layer in the ZigBee protocol, so it can notice when it should update its data because it is receiving new data :)
For more information about frame formats in the ZigBee stack, try downloading this:
Newnes.ZigBee.Wireless.Networks.and.Transceivers.Sep.2008.eBook-DDU.pdf
Have a nice day :)
Have you tried reading the packets with the X-CTU software?
I suggest you read this post: http://www.tunnelsup.com/xbee-guide/
The PDF with the "Quick Reference Guide" is really useful and has some of the data formats described.
Also, it's always good to study the real documentation from the developer (Digi in this case).
The frame format is like:
API Frame
But only if you have previously configured the XBee to work in API mode, with the command:
ATAP 1
Or with XCTU.
Try monitoring communication between two XBee modules to see what the acknowledgement frame looks like.
Try sending a sequence of bytes.
Try performing a Node Discovery (ATND) to see what those frames look like.
Try sending a remote AT command from X-CTU to see what those frames and responses look like.
When reverse engineering a protocol, it's useful to see both sides of the conversation. You can test various theories by emulating each side of the protocol, and trying out variations on what you see. For example, "What if I change this byte, does the remote end still respond?".
My guess is that you're correct about the ct byte being a counter. The following zero byte could be flags, or it could identify the type of packet sent (serial data, remote AT command/response, node discovery/response, etc.).
As you build up an understanding of the structure, you can write a program to parse and dump the contents of the frames. Dump an interpreted version of what you know, and leave the unknown bytes as a sequence of hex bytes. Continue to experiment until you can narrow down the meaning of the remaining bytes.
The extra 2 bytes in the payload (0x2D 0x00) are a MaxStream header (the MM command in XCTU). If you disable the MaxStream headers by setting the MM command to "without MaxStream headers", then these two bytes become part of the 802.15.4 payload, so your full payload would become 2B 0 61 instead of just 61.
