Premise
Hi,
I received multiple reports from a Redis user who experienced server crashes while running the latest stable Redis release (2.4.6). The bug is strange because the user is not doing anything esoteric, just working a lot with the sorted set type, and only with the ZADD, ZREM, and ZREVRANK commands. It is also strange that a bug like this, causing crashes after a few billion operations, has been experienced by only a single user. Fortunately the user in question is extremely helpful and collaborated a lot in tracking down the issue, so on several occasions I was able to obtain logs with the exact sequence of operations performed by Redis, which I replayed locally without result. I also tried writing scripts that closely mimic the workload, performing in-depth code reviews of the skip list implementation, and so forth.
Even after all these efforts, I found no way to reproduce the issue locally.
It is also worth mentioning that at some point the user started sending the exact same traffic to another box running the same Redis version, but compiled with a different gcc and running on different hardware: so far there have been no issues with this second instance. Still, I want to understand what is happening.
So finally I set up a different strategy with the user and asked him to run Redis under gdb in order to obtain a core file. Redis eventually crashed again, and I now have both the core file and the executable. Unfortunately I forgot to ask the user to compile Redis without optimizations.
I need the help of the Stack Overflow community because with GDB I reached some conclusions, but I really have no idea what could be happening here: at some point a function computes a pointer, and when it calls another function that pointer is magically different, pointing to a memory location that does not hold the right kind of data.
GDB session
The original executable was compiled with GCC 4.4.5-8. This GDB session shows my investigation:
gdb ./redis-server core.16525
GNU gdb (GDB) 7.1-ubuntu
[snip]
Program terminated with signal 11, Segmentation fault.
#0 0x00007f3d9ecd216c in __pthread_rwlock_tryrdlock (rwlock=0x1)
at pthread_rwlock_tryrdlock.c:46
46 pthread_rwlock_tryrdlock.c: No such file or directory.
in pthread_rwlock_tryrdlock.c
Actually the stack trace shown is for a secondary thread doing nothing (you can safely consider Redis a single-threaded app; the other threads are only used to perform things like fsync() against a file descriptor without blocking). Let's select the right one.
(gdb) info threads
3 Thread 16525 zslGetRank (zsl=0x7f3d8d71c360, score=19.498544884710096,
o=0x7f3d4cab5760) at t_zset.c:335
2 Thread 16527 0x00007f3d9ecd216c in __pthread_rwlock_tryrdlock (
rwlock=0x6b7f5) at pthread_rwlock_tryrdlock.c:46
* 1 Thread 16526 0x00007f3d9ecd216c in __pthread_rwlock_tryrdlock (rwlock=0x1)
at pthread_rwlock_tryrdlock.c:46
(gdb) thread 3
[Switching to thread 3 (Thread 16525)]#0 zslGetRank (zsl=0x7f3d8d71c360,
score=19.498544884710096, o=0x7f3d4cab5760) at t_zset.c:335
335 t_zset.c: No such file or directory.
in t_zset.c
(gdb) bt
#0 zslGetRank (zsl=0x7f3d8d71c360, score=19.498544884710096, o=0x7f3d4cab5760)
at t_zset.c:335
#1 0x000000000042818b in zrankGenericCommand (c=0x7f3d9dcdc000, reverse=1)
at t_zset.c:2046
#2 0x00000000004108d4 in call (c=0x7f3d9dcdc000) at redis.c:1024
#3 0x0000000000410c1c in processCommand (c=0x7f3d9dcdc000) at redis.c:1130
#4 0x0000000000419d3f in processInputBuffer (c=0x7f3d9dcdc000)
at networking.c:865
#5 0x0000000000419e1c in readQueryFromClient (el=<value optimized out>,
fd=<value optimized out>, privdata=0x7f3d9dcdc000,
mask=<value optimized out>) at networking.c:908
#6 0x000000000040d4a3 in aeProcessEvents (eventLoop=0x7f3d9dc47000,
flags=<value optimized out>) at ae.c:342
#7 0x000000000040d6ee in aeMain (eventLoop=0x7f3d9dc47000) at ae.c:387
#8 0x0000000000412a4f in main (argc=2, argv=<value optimized out>)
at redis.c:1719
We also generated a backtrace. As you can see, call() is dispatching the ZREVRANK command, so zrankGenericCommand() is called with the client structure and the reverse=1 argument (since it is the REV variant of rank). We can easily inspect the arguments of the ZREVRANK command.
(gdb) up
#1 0x000000000042818b in zrankGenericCommand (c=0x7f3d9dcdc000, reverse=1)
at t_zset.c:2046
2046 in t_zset.c
(gdb) print c->argc
$8 = 3
(gdb) print (redisClient*)c->argc
$9 = (redisClient *) 0x3
(gdb) print (char*)(redisClient*)c->argv[0]->ptr
$10 = 0x7f3d8267ce28 "zrevrank"
(gdb) print (char*)(redisClient*)c->argv[1]->ptr
$11 = 0x7f3d8267ce48 "pc_stat.hkperc"
(gdb) print (long)(redisClient*)c->argv[2]->ptr
$12 = 282472606
So the actual command generating the crash was: ZREVRANK pc_stat.hkperc 282472606
This is consistent with the client logs obtained by the user. Note that I cast the pointer to a long integer for the last argument, since Redis encodes integers this way to save memory when possible.
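The cast works because an integer-encoded object keeps the number itself in the pointer-sized ptr field instead of an address. A minimal sketch of that idea, assuming simplified names (obj_sketch, ENC_RAW and ENC_INT are hypothetical stand-ins, not the Redis definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real Redis object and encodings. */
enum { ENC_RAW = 0, ENC_INT = 1 };

typedef struct {
    int encoding;   /* ENC_RAW: ptr is a string; ENC_INT: ptr holds the value */
    void *ptr;
} obj_sketch;

/* Store the integer directly in the pointer field; no allocation is needed. */
static void set_int(obj_sketch *o, long value) {
    o->encoding = ENC_INT;
    o->ptr = (void *)(intptr_t)value;
}

/* Read it back the same way the (long) cast in gdb does. */
static long get_int(const obj_sketch *o) {
    assert(o->encoding == ENC_INT);
    return (long)(intptr_t)o->ptr;
}
```

Printing such a field as a char* would dereference the number as if it were an address, which is why the last argument has to be printed with a long cast.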
That's fine. It is now time to investigate zrankGenericCommand(), which called zslGetRank(), where the actual crash happened. This is the C source code of zrankGenericCommand() around line 2046:
2036 } else if (zobj->encoding == REDIS_ENCODING_SKIPLIST) {
2037 zset *zs = zobj->ptr;
2038 zskiplist *zsl = zs->zsl;
2039 dictEntry *de;
2040 double score;
2041
2042 ele = c->argv[2] = tryObjectEncoding(c->argv[2]);
2043 de = dictFind(zs->dict,ele);
2044 if (de != NULL) {
2045 score = *(double*)dictGetEntryVal(de);
2046 rank = zslGetRank(zsl,score,ele);
2047 redisAssert(rank); /* Existing elements always have a rank. */
2048 if (reverse)
2049 addReplyLongLong(c,llen-rank);
2050 else
2051 addReplyLongLong(c,rank-1);
2052 } else {
2053 addReply(c,shared.nullbulk);
2054 }
2055 }
Ok this is how it works:
We look up a Redis key containing a sorted set data type (the lookup is not included in the code above). The Redis object associated with the key is stored in the zobj local variable.
The ptr field of zobj is a pointer to a structure of type zset representing the sorted set.
In turn the zset structure has two pointers: one points to a hash table and one to a skip list. Both are needed because we provide element-to-score lookups in O(1), for which we need a hash table, but we also keep the elements ordered, so we use a skip list. In line 2038 the pointer to the skip list (represented by a zskiplist structure) is assigned to the zsl variable.
Later we encode the third argument (line 2042); this is why we cast the value to a long to print it from the client structure.
In line 2043 we look up the element in the dictionary, and the operation succeeds, since we know that zslGetRank(), inside the if branch, gets called.
Finally in line 2046 we call zslGetRank() with three arguments: the pointer to the skip list, the score of the element, and the element itself.
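The two-pointer layout described above can be sketched as follows. This is a simplified illustration, not the Redis header: the _sketch type names are hypothetical and the dictionary and skip list types are left opaque.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real dict and zskiplist types are opaque here. */
typedef struct dict_sketch dict_sketch;
typedef struct zskiplist_sketch zskiplist_sketch;

typedef struct {
    dict_sketch *dict;       /* element -> score lookups in O(1) */
    zskiplist_sketch *zsl;   /* the same elements, kept ordered by score */
} zset_sketch;

/* Byte offset of the skip list pointer inside the zset structure. */
static size_t zsl_offset(void) {
    assert(offsetof(zset_sketch, dict) == 0); /* first member starts the struct */
    return offsetof(zset_sketch, zsl);
}
```

On x86-64, zsl_offset() is 8, which matches the mov 0x8(%r13),%r14 load in the disassembly further down, while the dictionary pointer at offset 0 matches mov 0x0(%r13),%rdi before the call to dictFind.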
Fine... now what is the pointer that zslGetRank() should receive in theory? We can easily investigate this manually by looking up the key in the Redis hash table. I hashed the key manually and it maps to bucket 62 of the hash table; let's see if that is true:
(gdb) print (char*)c->db->dict->ht->table[62]->key
$13 = 0x7f3d9dc0f6c8 "pc_stat.hkperc"
Exactly as expected. Let's check the object associated:
(gdb) print *(robj*)c->db->dict->ht->table[62]->val
$16 = {type = 3, storage = 0, encoding = 7, lru = 557869, refcount = 1,
ptr = 0x7f3d9de574b0}
Type = 3, Encoding = 7, it means: it is a sorted set, encoded as a skip list. Fine again.
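The same type and encoding values show up in the disassembly further down as a mask-and-compare on the first 16 bits of the object (and $0x3c0, followed by cmp $0x140 or cmp $0x1c0). A minimal sketch of that decoding, assuming the 4-bit type, 2-bit storage, 4-bit encoding layout that the 0x3c0 mask implies on this little-endian x86-64 build (the helper names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Extract the type: the lowest 4 bits of the header word. */
static unsigned header_type(uint16_t header) {
    return header & 0xfu;
}

/* Extract the encoding: 4 bits starting at bit 6, i.e. mask 0x3c0. */
static unsigned header_encoding(uint16_t header) {
    return (header & 0x3c0u) >> 6;
}

/* Build a header word from its fields, for experimenting. */
static uint16_t make_header(unsigned type, unsigned storage, unsigned encoding) {
    assert(type < 16 && storage < 4 && encoding < 16);
    return (uint16_t)(type | (storage << 4) | (encoding << 6));
}
```

With this layout, 0x1c0 corresponds to encoding 7 (the skip list branch) and 0x140 to encoding 5, which is the branch that calls the ziplist functions.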
The sorted set address (ptr field) is 0x7f3d9de574b0, so we can inspect this as well:
(gdb) print *(zset*)0x7f3d9de574b0
$17 = {dict = 0x7f3d9dcf6c20, zsl = 0x7f3d9de591c0}
So we have:
The object associated to the key that points to a sorted set that is stored at address 0x7f3d9de574b0
In turn this sorted set is implemented with a skiplist, at address 0x7f3d9de591c0 (zsl field)
Now let's check if our two variables are set to the right values:
2037 zset *zs = zobj->ptr;
2038 zskiplist *zsl = zs->zsl;
(gdb) info locals
zs = 0x7f3d9de574b0
zsl = 0x7f3d9de591c0
de = <value optimized out>
ele = <value optimized out>
zobj = <value optimized out>
llen = 165312
rank = <value optimized out>
Everything is perfect so far: the variable zs is set to 0x7f3d9de574b0 as expected, and so is the variable zsl pointing to the skiplist, that is set to 0x7f3d9de591c0.
These variables are not touched afterwards.
These are the only lines of code between the assignment of the variables and the call to zslGetRank():
2042 ele = c->argv[2] = tryObjectEncoding(c->argv[2]);
2043 de = dictFind(zs->dict,ele);
2044 if (de != NULL) {
2045 score = *(double*)dictGetEntryVal(de);
2046 rank = zslGetRank(zsl,score,ele);
Nobody is touching zsl. However, if we check the stack trace we see that zslGetRank() gets called not with the address 0x7f3d9de591c0 as its first argument, but with a different one:
#0 zslGetRank (zsl=0x7f3d8d71c360, score=19.498544884710096, o=0x7f3d4cab5760)
at t_zset.c:335
If you read all this you are a hero, and the reward is very small, consisting of this question: do you have any idea, even considering that hardware failure is an option, how this argument gets modified? It seems very unlikely that the object encoding function or the hash table lookup could corrupt the stack of the caller (and apparently the argument is already in registers at the time of the call). My assembly is not great, so if you have some clue, it is very welcome. I'll leave you with the info registers output and a disassembly:
(gdb) info registers
rax 0x6 6
rbx 0x7f3d9dcdc000 139902617239552
rcx 0xf742d0b6 4148351158
rdx 0x7f3d95efada0 139902485245344
rsi 0x7f3d4cab5760 139901256030048
rdi 0x7f3d8d71c360 139902342775648
rbp 0x7f3d4cab5760 0x7f3d4cab5760
rsp 0x7fffe61a8040 0x7fffe61a8040
r8 0x7fffe61a7fd9 140737053884377
r9 0x1 1
r10 0x7f3d9dcd4ff0 139902617210864
r11 0x6 6
r12 0x1 1
r13 0x7f3d9de574b0 139902618793136
r14 0x7f3d9de591c0 139902618800576
r15 0x7f3d8267c9e0 139902157572576
rip 0x42818b 0x42818b <zrankGenericCommand+251>
eflags 0x10206 [ PF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) disassemble zrankGenericCommand
Dump of assembler code for function zrankGenericCommand:
0x0000000000428090 <+0>: mov %rbx,-0x30(%rsp)
0x0000000000428095 <+5>: mov %r12,-0x20(%rsp)
0x000000000042809a <+10>: mov %esi,%r12d
0x000000000042809d <+13>: mov %r14,-0x10(%rsp)
0x00000000004280a2 <+18>: mov %rbp,-0x28(%rsp)
0x00000000004280a7 <+23>: mov %rdi,%rbx
0x00000000004280aa <+26>: mov %r13,-0x18(%rsp)
0x00000000004280af <+31>: mov %r15,-0x8(%rsp)
0x00000000004280b4 <+36>: sub $0x58,%rsp
0x00000000004280b8 <+40>: mov 0x28(%rdi),%rax
0x00000000004280bc <+44>: mov 0x23138d(%rip),%rdx # 0x659450 <shared+80>
0x00000000004280c3 <+51>: mov 0x8(%rax),%rsi
0x00000000004280c7 <+55>: mov 0x10(%rax),%rbp
0x00000000004280cb <+59>: callq 0x41d370 <lookupKeyReadOrReply>
0x00000000004280d0 <+64>: test %rax,%rax
0x00000000004280d3 <+67>: mov %rax,%r14
0x00000000004280d6 <+70>: je 0x4280ec <zrankGenericCommand+92>
0x00000000004280d8 <+72>: mov $0x3,%edx
0x00000000004280dd <+77>: mov %rax,%rsi
0x00000000004280e0 <+80>: mov %rbx,%rdi
0x00000000004280e3 <+83>: callq 0x41b270 <checkType>
0x00000000004280e8 <+88>: test %eax,%eax
0x00000000004280ea <+90>: je 0x428110 <zrankGenericCommand+128>
0x00000000004280ec <+92>: mov 0x28(%rsp),%rbx
0x00000000004280f1 <+97>: mov 0x30(%rsp),%rbp
0x00000000004280f6 <+102>: mov 0x38(%rsp),%r12
0x00000000004280fb <+107>: mov 0x40(%rsp),%r13
0x0000000000428100 <+112>: mov 0x48(%rsp),%r14
0x0000000000428105 <+117>: mov 0x50(%rsp),%r15
0x000000000042810a <+122>: add $0x58,%rsp
0x000000000042810e <+126>: retq
0x000000000042810f <+127>: nop
0x0000000000428110 <+128>: mov %r14,%rdi
0x0000000000428113 <+131>: callq 0x426250 <zsetLength>
0x0000000000428118 <+136>: testw $0x3c0,0x0(%rbp)
0x000000000042811e <+142>: jne 0x4282b7 <zrankGenericCommand+551>
0x0000000000428124 <+148>: mov %eax,%eax
0x0000000000428126 <+150>: mov %rax,0x8(%rsp)
0x000000000042812b <+155>: movzwl (%r14),%eax
0x000000000042812f <+159>: and $0x3c0,%ax
0x0000000000428133 <+163>: cmp $0x140,%ax
0x0000000000428137 <+167>: je 0x4281c8 <zrankGenericCommand+312>
0x000000000042813d <+173>: cmp $0x1c0,%ax
0x0000000000428141 <+177>: jne 0x428299 <zrankGenericCommand+521>
0x0000000000428147 <+183>: mov 0x28(%rbx),%r15
0x000000000042814b <+187>: mov 0x8(%r14),%r13
0x000000000042814f <+191>: mov 0x10(%r15),%rdi
0x0000000000428153 <+195>: mov 0x8(%r13),%r14
0x0000000000428157 <+199>: callq 0x41bcc0 <tryObjectEncoding>
0x000000000042815c <+204>: mov 0x0(%r13),%rdi
0x0000000000428160 <+208>: mov %rax,0x10(%r15)
0x0000000000428164 <+212>: mov %rax,%rsi
0x0000000000428167 <+215>: mov %rax,%rbp
0x000000000042816a <+218>: callq 0x40ede0 <dictFind>
0x000000000042816f <+223>: test %rax,%rax
0x0000000000428172 <+226>: je 0x428270 <zrankGenericCommand+480>
0x0000000000428178 <+232>: mov 0x8(%rax),%rax
0x000000000042817c <+236>: mov %rbp,%rsi
0x000000000042817f <+239>: mov %r14,%rdi
0x0000000000428182 <+242>: movsd (%rax),%xmm0
0x0000000000428186 <+246>: callq 0x427fd0 <zslGetRank>
=> 0x000000000042818b <+251>: test %rax,%rax
0x000000000042818e <+254>: je 0x4282d5 <zrankGenericCommand+581>
0x0000000000428194 <+260>: test %r12d,%r12d
0x0000000000428197 <+263>: je 0x4281b0 <zrankGenericCommand+288>
0x0000000000428199 <+265>: mov 0x8(%rsp),%rsi
0x000000000042819e <+270>: mov %rbx,%rdi
0x00000000004281a1 <+273>: sub %rax,%rsi
0x00000000004281a4 <+276>: callq 0x41a430 <addReplyLongLong>
0x00000000004281a9 <+281>: jmpq 0x4280ec <zrankGenericCommand+92>
0x00000000004281ae <+286>: xchg %ax,%ax
0x00000000004281b0 <+288>: lea -0x1(%rax),%rsi
0x00000000004281b4 <+292>: mov %rbx,%rdi
0x00000000004281b7 <+295>: callq 0x41a430 <addReplyLongLong>
0x00000000004281bc <+300>: nopl 0x0(%rax)
0x00000000004281c0 <+304>: jmpq 0x4280ec <zrankGenericCommand+92>
0x00000000004281c5 <+309>: nopl (%rax)
0x00000000004281c8 <+312>: mov 0x8(%r14),%r14
0x00000000004281cc <+316>: xor %esi,%esi
0x00000000004281ce <+318>: mov %r14,%rdi
0x00000000004281d1 <+321>: callq 0x417600 <ziplistIndex>
0x00000000004281d6 <+326>: test %rax,%rax
0x00000000004281d9 <+329>: mov %rax,0x18(%rsp)
0x00000000004281de <+334>: je 0x428311 <zrankGenericCommand+641>
0x00000000004281e4 <+340>: mov %rax,%rsi
0x00000000004281e7 <+343>: mov %r14,%rdi
0x00000000004281ea <+346>: callq 0x4175c0 <ziplistNext>
0x00000000004281ef <+351>: test %rax,%rax
0x00000000004281f2 <+354>: mov %rax,0x10(%rsp)
0x00000000004281f7 <+359>: je 0x4282f3 <zrankGenericCommand+611>
0x00000000004281fd <+365>: mov 0x18(%rsp),%rdi
0x0000000000428202 <+370>: mov $0x1,%r13d
0x0000000000428208 <+376>: lea 0x10(%rsp),%r15
0x000000000042820d <+381>: test %rdi,%rdi
0x0000000000428210 <+384>: jne 0x428236 <zrankGenericCommand+422>
0x0000000000428212 <+386>: jmp 0x428270 <zrankGenericCommand+480>
0x0000000000428214 <+388>: nopl 0x0(%rax)
0x0000000000428218 <+392>: lea 0x18(%rsp),%rsi
0x000000000042821d <+397>: mov %r14,%rdi
0x0000000000428220 <+400>: mov %r15,%rdx
0x0000000000428223 <+403>: callq 0x425610 <zzlNext>
0x0000000000428228 <+408>: mov 0x18(%rsp),%rdi
0x000000000042822d <+413>: test %rdi,%rdi
0x0000000000428230 <+416>: je 0x428270 <zrankGenericCommand+480>
0x0000000000428232 <+418>: add $0x1,%r13
0x0000000000428236 <+422>: mov 0x8(%rbp),%rsi
0x000000000042823a <+426>: movslq -0x8(%rsi),%rdx
0x000000000042823e <+430>: callq 0x417a40 <ziplistCompare>
0x0000000000428243 <+435>: test %eax,%eax
0x0000000000428245 <+437>: je 0x428218 <zrankGenericCommand+392>
0x0000000000428247 <+439>: cmpq $0x0,0x18(%rsp)
0x000000000042824d <+445>: je 0x428270 <zrankGenericCommand+480>
0x000000000042824f <+447>: test %r12d,%r12d
0x0000000000428252 <+450>: je 0x428288 <zrankGenericCommand+504>
0x0000000000428254 <+452>: mov 0x8(%rsp),%rsi
0x0000000000428259 <+457>: mov %rbx,%rdi
0x000000000042825c <+460>: sub %r13,%rsi
0x000000000042825f <+463>: callq 0x41a430 <addReplyLongLong>
0x0000000000428264 <+468>: jmpq 0x4280ec <zrankGenericCommand+92>
0x0000000000428269 <+473>: nopl 0x0(%rax)
0x0000000000428270 <+480>: mov 0x2311d9(%rip),%rsi # 0x659450 <shared+80>
0x0000000000428277 <+487>: mov %rbx,%rdi
0x000000000042827a <+490>: callq 0x419f60 <addReply>
0x000000000042827f <+495>: jmpq 0x4280ec <zrankGenericCommand+92>
0x0000000000428284 <+500>: nopl 0x0(%rax)
0x0000000000428288 <+504>: lea -0x1(%r13),%rsi
0x000000000042828c <+508>: mov %rbx,%rdi
0x000000000042828f <+511>: callq 0x41a430 <addReplyLongLong>
0x0000000000428294 <+516>: jmpq 0x4280ec <zrankGenericCommand+92>
0x0000000000428299 <+521>: mov $0x44939f,%edi
0x000000000042829e <+526>: mov $0x808,%edx
0x00000000004282a3 <+531>: mov $0x44a674,%esi
0x00000000004282a8 <+536>: callq 0x432010 <_redisPanic>
0x00000000004282ad <+541>: mov $0x1,%edi
0x00000000004282b2 <+546>: callq 0x40c3a0 <_exit#plt>
0x00000000004282b7 <+551>: mov $0x44a7d0,%edi
0x00000000004282bc <+556>: mov $0x7da,%edx
0x00000000004282c1 <+561>: mov $0x44a674,%esi
0x00000000004282c6 <+566>: callq 0x432090 <_redisAssert>
0x00000000004282cb <+571>: mov $0x1,%edi
0x00000000004282d0 <+576>: callq 0x40c3a0 <_exit#plt>
0x00000000004282d5 <+581>: mov $0x448982,%edi
0x00000000004282da <+586>: mov $0x7ff,%edx
0x00000000004282df <+591>: mov $0x44a674,%esi
0x00000000004282e4 <+596>: callq 0x432090 <_redisAssert>
0x00000000004282e9 <+601>: mov $0x1,%edi
0x00000000004282ee <+606>: callq 0x40c3a0 <_exit#plt>
0x00000000004282f3 <+611>: mov $0x44a6e5,%edi
0x00000000004282f8 <+616>: mov $0x7e2,%edx
0x00000000004282fd <+621>: mov $0x44a674,%esi
0x0000000000428302 <+626>: callq 0x432090 <_redisAssert>
0x0000000000428307 <+631>: mov $0x1,%edi
0x000000000042830c <+636>: callq 0x40c3a0 <_exit#plt>
0x0000000000428311 <+641>: mov $0x44a6bd,%edi
0x0000000000428316 <+646>: mov $0x7e0,%edx
0x000000000042831b <+651>: mov $0x44a674,%esi
0x0000000000428320 <+656>: callq 0x432090 <_redisAssert>
0x0000000000428325 <+661>: mov $0x1,%edi
0x000000000042832a <+666>: callq 0x40c3a0 <_exit#plt>
End of assembler dump.
As requested, this is the tryObjectEncoding function:
/* Try to encode a string object in order to save space */
robj *tryObjectEncoding(robj *o) {
long value;
sds s = o->ptr;
if (o->encoding != REDIS_ENCODING_RAW)
return o; /* Already encoded */
/* It's not safe to encode shared objects: shared objects can be shared
* everywhere in the "object space" of Redis. Encoded objects can only
* appear as "values" (and not, for instance, as keys) */
if (o->refcount > 1) return o;
/* Currently we try to encode only strings */
redisAssert(o->type == REDIS_STRING);
/* Check if we can represent this string as a long integer */
if (!string2l(s,sdslen(s),&value)) return o;
/* Ok, this object can be encoded...
*
* Can I use a shared object? Only if the object is inside a given
* range and if this is the main thread, since when VM is enabled we
* have the constraint that I/O thread should only handle non-shared
* objects, in order to avoid race conditions (we don't have per-object
* locking).
*
* Note that we also avoid using shared integers when maxmemory is used
* because very object needs to have a private LRU field for the LRU
* algorithm to work well. */
if (server.maxmemory == 0 && value >= 0 && value < REDIS_SHARED_INTEGERS &&
pthread_equal(pthread_self(),server.mainthread)) {
decrRefCount(o);
incrRefCount(shared.integers[value]);
return shared.integers[value];
} else {
o->encoding = REDIS_ENCODING_INT;
sdsfree(o->ptr);
o->ptr = (void*) value;
return o;
}
}
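The key step above is the string2l() check, which decides whether the argument can be stored as a long. Its source is not shown here, so the following is only a sketch of what such a strict conversion might look like using strtol; the name parse_long_strict is hypothetical and Redis's own implementation may differ in detail:

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Return 1 and store the value if s is entirely a decimal long, else 0. */
static int parse_long_strict(const char *s, long *out) {
    char *end;
    long v;

    assert(s != NULL && out != NULL);
    /* strtol would silently skip leading whitespace, so reject it up front. */
    if (*s == '\0' || isspace((unsigned char)*s)) return 0;

    errno = 0;
    v = strtol(s, &end, 10);
    /* Reject overflow and any trailing characters after the digits. */
    if (errno == ERANGE || *end != '\0') return 0;

    *out = v;
    return 1;
}
```

For the crashing command, "282472606" passes this kind of check, which is consistent with the argument showing up as an integer stored in the ptr field.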
I think I can answer my own question now...
Basically this is what happens: zslGetRank() is called by zrankGenericCommand() with its first argument in the %rdi register. However, later this function reuses the %rdi register to hold an object (and indeed %rdi is set to a valid object):
(gdb) print *(robj*)0x7f3d8d71c360
$1 = {type = 0, storage = 0, encoding = 1, lru = 517611, refcount = 2,
ptr = 0x1524db19}
The instruction pointer actually pointed to zslGetRank+64 at the time of the crash; I did something wrong with gdb and modified the register before posting the original question.
How can we verify that zslGetRank() gets the right address as its first argument? %r14 gets saved on the stack by zslGetRank(), so we can inspect the stack to check whether the right address is there. So we dump memory near the stack pointer:
0x7fffe61a8000: 0x40337fa0a3376aff 0x00007f3d9dcdc000
0x7fffe61a8010: 0x00007f3d9dcdc000 0x00007f3d4cab5760
0x7fffe61a8020: 0x0000000000000001 0x00007f3d9de574b0
---> 0x7fffe61a8030: 0x00007f3d9de591c0 0x000000000042818b
0x7fffe61a8040: 0x0000000000000000 0x00000000000285c0
0x7fffe61a8050: 0x0000000000000000 0x00007f3d9dcdc000
0x7fffe61a8060: 0x0000000000000000 0x00007f3d9dcdc000
0x7fffe61a8070: 0x0000000000000000 0x0004b6b413e12d9a
0x7fffe61a8080: 0x00000000000003d8 0x0000000000000001
As you can see the right address is here in the stack.
So long story short, the function is called with the right address, it is just that gdb can't dump the right stack trace because the %rdi register gets modified and used for another thing inside the function.
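The same effect can be reproduced in plain C: once a function reuses its parameter as a working variable, the value a debugger reports for that parameter partway through the call is no longer the value the caller passed. A minimal sketch with a hypothetical list type (nothing here is Redis code):

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    struct node *next;
} node;

/* Small helper to build a list for experiments. */
static void link_nodes(node *a, node *b) {
    assert(a != NULL);
    a->next = b;
}

/* Counts nodes by advancing the parameter itself. With optimizations on,
 * the compiler is free to keep n in the register it arrived in (%rdi on
 * x86-64 System V) and overwrite it on every step. */
static size_t list_length(const node *n) {
    size_t len = 0;
    while (n != NULL) {
        len++;
        n = n->next;   /* a backtrace taken here shows this n, not the caller's */
    }
    return len;
}
```

A crash inside the loop would make the backtrace print whatever n holds at that moment, even though the caller passed the head of the list.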
So this can be a memory corruption thing, possibly. What I'll do now is to walk the sorted set by hand simulating the work of zslGetRank() so that I can isolate the node that is broken, and check hopefully in which way it is corrupted.
Thanks for your help.
Edit: here you can find a manually annotated disassembled version of zslGetRank() function -> https://gist.github.com/1641112 (I used it to both learn some more assembler and to make my inspection simpler).
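For readers who want to follow the walk by hand, here is a rough sketch of a rank lookup in the spirit of zslGetRank(). It is a simplification under stated assumptions, not the Redis source: the list has a fixed height of two levels, elements are plain C strings, and the names sl_node and sl_rank are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEVEL 2   /* tiny fixed height, enough for a hand-built example */

typedef struct sl_node {
    const char *ele;               /* NULL for the header node */
    double score;
    struct {
        struct sl_node *forward;   /* next node on this level */
        unsigned long span;        /* how many level-0 steps this link skips */
    } level[MAX_LEVEL];
} sl_node;

/* Returns the 1-based rank of (score, ele), or 0 if it is not found. */
static unsigned long sl_rank(const sl_node *header, double score, const char *ele) {
    const sl_node *x = header;
    unsigned long rank = 0;

    assert(header != NULL && ele != NULL);
    for (int i = MAX_LEVEL - 1; i >= 0; i--) {
        const sl_node *f;
        /* Advance while the next node sorts at or before the target. */
        while ((f = x->level[i].forward) != NULL &&
               (f->score < score ||
                (f->score == score && strcmp(f->ele, ele) <= 0))) {
            rank += x->level[i].span;
            x = f;
        }
        /* The header has no element, so check for NULL before comparing. */
        if (x->ele != NULL && strcmp(x->ele, ele) == 0)
            return rank;
    }
    return 0;
}
```

Every step dereferences a forward pointer and reads a score from the node it reaches, so a single corrupted node is enough to send the walk into memory that does not hold a node at all.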
In this situation, the first thing I would do is use valgrind. The drawback is that valgrind is about 10x slower than a native run, and it may change the behaviour because it appears to serialize the threads. But it has saved me so many times!
Anyway, concerning this crash: it occurs in thread 3, and pthread_rwlock_tryrdlock() receives a bad pointer (rwlock is 0x1). It is perhaps a memory corruption caused by other threads. If possible, try to put a "watch" on this poi
Hope it helps.
Update: the RAM in this box was broken, we found many problems in user's RAM after this one, and now Redis even implements a --test-memory option... Thanks.
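To give an idea of what a memory test does in principle, here is a minimal pattern test sketch. It is not the Redis implementation, and the function name pattern_test is hypothetical; a real tester covers far more memory, uses several patterns and passes, and has to work around CPU caches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill the buffer with a pattern, read it back, and count mismatching words. */
static size_t pattern_test(volatile uint64_t *buf, size_t words, uint64_t pattern) {
    size_t errors = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < words; i++)
        buf[i] = pattern ^ (uint64_t)i;          /* vary the value per word */
    for (size_t i = 0; i < words; i++)
        if (buf[i] != (pattern ^ (uint64_t)i))   /* a bad cell reads back wrong */
            errors++;
    return errors;
}
```

On healthy RAM the function returns 0 for any pattern; a nonzero count on a large buffer would point at the kind of hardware fault found here.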
It's a long shot and an intuitive guess, but the only thing I can see that could cause this error is the assignment to pointers at line 2042:
2042 ele = c->argv[2] = tryObjectEncoding(c->argv[2]);
Hope that helps
Can you add the disassembly for zslGetRank ?
If you look at the register dump, r14 has the right value and rdi has the wrong value, but right before the call there is a "mov r14, rdi", so presumably zslGetRank was called with the correct values.
Related
I have the following assembly program from the binary-bomb lab. The goal is to determine the keyword needed to run the binary without triggering the explode_bomb function. I commented my analysis of the assembly for this program but I am having trouble piecing everything together.
I believe I have all the information I need, but I still am unable to see the actual underlying logic and thus I am stuck. I would greatly appreciate any help!
The following is the disassembled program itself:
0x08048c3c <+0>: push %edi
0x08048c3d <+1>: push %esi
0x08048c3e <+2>: sub $0x14,%esp
0x08048c41 <+5>: movl $0x804a388,(%esp)
0x08048c48 <+12>: call 0x80490ab <string_length>
0x08048c4d <+17>: add $0x1,%eax
0x08048c50 <+20>: mov %eax,(%esp)
0x08048c53 <+23>: call 0x8048800 <malloc#plt>
0x08048c58 <+28>: mov $0x804a388,%esi
0x08048c5d <+33>: mov $0x13,%ecx
0x08048c62 <+38>: mov %eax,%edi
0x08048c64 <+40>: rep movsl %ds:(%esi),%es:(%edi)
0x08048c66 <+42>: movzwl (%esi),%edx
0x08048c69 <+45>: mov %dx,(%edi)
0x08048c6c <+48>: movzbl 0x11(%eax),%edx
0x08048c70 <+52>: mov %dl,0x10(%eax)
0x08048c73 <+55>: mov %eax,0x4(%esp)
0x08048c77 <+59>: mov 0x20(%esp),%eax
0x08048c7b <+63>: mov %eax,(%esp)
0x08048c7e <+66>: call 0x80490ca <strings_not_equal>
0x08048c83 <+71>: test %eax,%eax
0x08048c85 <+73>: je 0x8048c8c <phase_3+80>
0x08048c87 <+75>: call 0x8049363 <explode_bomb>
0x08048c8c <+80>: add $0x14,%esp
0x08048c8f <+83>: pop %esi
0x08048c90 <+84>: pop %edi
0x08048c91 <+85>: ret
The following block contains my analysis
<phase_3>
0x08048c3c <+0>: push %edi // push value in edi to stack
0x08048c3d <+1>: push %esi // push value of esi to stack
0x08048c3e <+2>: sub $0x14,%esp // grow stack by 0x14 (move stack ptr -0x14 bytes)

0x08048c41 <+5>: movl $0x804a388,(%esp) // put 0x804a388 into loc esp points to

0x08048c48 <+12>: call 0x80490ab <string_length> // check string length, store in eax
0x08048c4d <+17>: add $0x1,%eax // increment val in eax by 0x1 (str len + 1)
// at this point, eax = str_len + 1 = 77 + 1 = 78

0x08048c50 <+20>: mov %eax,(%esp) // get val in eax and put in loc on stack
//**** at this point, 0x804a388 should have a value of 78? ****

0x08048c53 <+23>: call 0x8048800 <malloc#plt> // malloc --> base ptr in eax

0x08048c58 <+28>: mov $0x804a388,%esi // 0x804a388 in esi
0x08048c5d <+33>: mov $0x13,%ecx // put 0x13 in ecx (counter register)
0x08048c62 <+38>: mov %eax,%edi // put val in eax into edi
0x08048c64 <+40>: rep movsl %ds:(%esi),%es:(%edi) // repeat 0x13 (19) times
// **** populate malloced memory with first 19 (edit: 76) chars of string at 0x804a388 (this string is 77 characters long)? ****

0x08048c66 <+42>: movzwl (%esi),%edx // put val in loc esi points to into edx
***** // at this point, edx should contain the string at 0x804a388?

0x08048c69 <+45>: mov %dx,(%edi) // put val in dx to loc edi points to
***** // not sure what effect this has or what is in edi at this point
0x08048c6c <+48>: movzbl 0x11(%eax),%edx // edx = [eax + 0x11]
0x08048c70 <+52>: mov %dl,0x10(%eax) // [eax + 0x10] = dl
0x08048c73 <+55>: mov %eax,0x4(%esp) // [esp + 0x4] = eax
0x08048c77 <+59>: mov 0x20(%esp),%eax // eax = [esp + 0x20]
0x08048c7b <+63>: mov %eax,(%esp) // put val in eax into loc esp points to
***** // not sure what effect these movs have

// edi --> first arg
// esi --> second arg
// compare value in esi to edi
0x08048c7e <+66>: call 0x80490ca <strings_not_equal> // store result in eax
0x08048c83 <+71>: test %eax,%eax
0x08048c85 <+73>: je 0x8048c8c <phase_3+80>
0x08048c87 <+75>: call 0x8049363 <explode_bomb>
0x08048c8c <+80>: add $0x14,%esp
0x08048c8f <+83>: pop %esi
0x08048c90 <+84>: pop %edi
0x08048c91 <+85>: ret
Update:
Upon inspecting the registers before strings_not_equal is called, I get the following:
eax 0x804d8aa 134535338
ecx 0x0 0
edx 0x76 118
ebx 0xffffd354 -11436
esp 0xffffd280 0xffffd280
ebp 0xffffd2b8 0xffffd2b8
esi 0x804a3d4 134521812
edi 0x804f744 134543172
eip 0x8048c7b 0x8048c7b <phase_3+63>
eflags 0x282 [ SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99
and I get the following disassembled pseudocode using Hopper:
I even tried using both the number found in eax and the string seen earlier as my keyword but neither of them worked.
The function makes a modified copy of a string from static storage, into a malloced buffer.
This looks weird. The malloc size is dependent on strlen+1, but the memcpy size is a compile-time constant? Your decompilation apparently shows that address was a string literal so it seems that's fine.
Probably that missed optimization happened because of a custom string_length() function that was maybe only defined in another .c (and the bomb was compiled without link-time optimization for cross-file inlining). So size_t len = string_length("some string literal"); is not a compile-time constant and the compiler emitted a call to it instead of being able to use the known constant length of the string.
But probably they used strcpy in the source and the compiler did inline that as a rep movs. Since it's apparently copying from a string literal, the length is a compile-time constant and it can optimize away that part of the work that strcpy normally has to do. Normally if you've already calculated the length it's better to use memcpy instead of making strcpy calculate it again on the fly, but in this case it actually helped the compiler make better code for that part than if they'd passed the return value of string_length to a memcpy, again because string_length couldn't inline and optimize away.
<+0>: push %edi // push value in edi to stack
<+1>: push %esi // push value of esi to stack
<+2>: sub $0x14,%esp // grow stack by 0x14 (move stack ptr -0x14 bytes)
Comments like that are redundant; the instruction itself already says that. This is saving two call-preserved registers so the function can use them internally and restore them later.
Your comment on the sub is better; yes, grow the stack is the higher level semantic meaning here. This function reserves some space for locals (and for function args to be stored with mov instead of pushed).
The rep movsd copies 0x13 * 4 bytes, incrementing ESI and EDI to point past the end of the copied region. So another movsd instruction would copy another 4 bytes contiguous with the previous copy.
The code actually copies another 2, but instead of using movsw, it uses a movzw word load and a mov store. This makes a total of 78 bytes copied.
...
# at this point EAX = malloc return value which I'll call buf
<+28>: mov $0x804a388,%esi # copy src = a string literal in .rodata?
<+33>: mov $0x13,%ecx
<+38>: mov %eax,%edi # copy dst = buf
<+40>: rep movsl %ds:(%esi),%es:(%edi) # memcpy 76 bytes and advance ESI, EDI
<+42>: movzwl (%esi),%edx
<+45>: mov %dx,(%edi) # copy another 2 bytes (not moving ESI or EDI)
# final effect: 78-byte memcpy
On some (but not all) CPUs it would have been efficient to just use rep movsb or rep movsw with appropriate counts, but that's not what the compiler chose in this case. movzx aka AT&T movz is a good way to do narrow loads without partial-register penalties. That's why compilers do it, so they can write a full register even though they're only going to read the low 8 or 16 bits of that reg with a store instruction.
After that copy of a string literal into buf, we have a byte load/store that copies a character within buf. Remember at this point EAX is still pointing at buf, the malloc return value. So it's making a modified copy of the string literal.
<+48>: movzbl 0x11(%eax),%edx
<+52>: mov %dl,0x10(%eax) # buf[16] = buf[17]
Perhaps if the source hadn't defeated constant-propagation, with high enough optimization level the compiler might have just put the final string into .rodata where you could find it, trivializing this bomb phase. :P
Then it stores pointers as stack args for string compare.
<+55>: mov %eax,0x4(%esp) # 2nd arg slot = EAX = buf
<+59>: mov 0x20(%esp),%eax # function arg = user input?
<+63>: mov %eax,(%esp) # first arg slot = our incoming stack arg
<+66>: call 0x80490ca <strings_not_equal>
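Putting the pieces together, the function behaves roughly like the C below. This is a hedged reconstruction from the disassembly, not the lab's source: the names phase_3_sketch and SECRET are hypothetical, SECRET is a placeholder rather than the real literal at 0x804a388, and strcmp stands in for strings_not_equal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder only; the real bomb uses its own literal. Needs >= 18 chars. */
static const char SECRET[] = "placeholder string literal";

/* Returns 1 if input would defuse the phase, 0 if the bomb would explode. */
static int phase_3_sketch(const char *input) {
    size_t len = strlen(SECRET);          /* string_length(literal) */
    char *buf;
    int ok;

    assert(input != NULL && len >= 18);   /* buf[17] must be a real character */
    buf = malloc(len + 1);                /* malloc(string_length + 1) */
    if (buf == NULL) return 0;

    memcpy(buf, SECRET, len + 1);         /* the rep movsl plus the 2-byte tail */
    buf[16] = buf[17];                    /* the movzbl / mov %dl byte copy */

    ok = strcmp(input, buf) == 0;         /* strings_not_equal(input, buf) == 0 */
    free(buf);                            /* the disassembly shows no free; added here */
    return ok;
}
```

With the placeholder, the accepted input is the literal with character 16 overwritten by character 17, which is why typing the string exactly as it appears at 0x804a388 does not work.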
How to "cheat": looking at the runtime result with GDB
Some bomb labs only let you run the bomb online, on a test server, which would record explosions. You couldn't run it under GDB, only use static disassembly (like objdump -drwC -Mintel). So the test server could record how many failed attempts you had, e.g. CS 3330 at cs.virginia.edu, which I found with Google, where full credit requires fewer than 20 explosions.
Using GDB to examine memory / registers part way through a function makes this vastly easier than only working from static analysis, in fact trivializing this function where the single input is only checked at the very end. e.g. just look at what other arg is being passed to strings_not_equal. (Especially if you use GDB's jump or set $pc = ... commands to skip past the bomb explosion checks.)
Set a breakpoint or single-step to just before the call to strings_not_equal. Use p (char*)$eax to treat EAX as a char* and show you the (0-terminated) C string starting at that address. At that point EAX holds the address of the buffer, as you can see from the store to the stack.
Copy/paste that string result and you're done.
Other phases with multiple numeric inputs typically aren't this easy to cheese with a debugger and do require at least some math, but linked-list phases that require you to have a sequence of numbers in the right order for list traversal also become trivial if you know how to use a debugger to set registers to make compares succeed as you get to them.
rep movsl copies 32-bit longwords from address %esi to address %edi, incrementing both by 4 each time, a number of times equal to %ecx. Think of it as memcpy(edi, esi, ecx*4).
See https://felixcloutier.com/x86/movs:movsb:movsw:movsd:movsq (it's movsd in Intel notation).
So this is copying 19*4=76 bytes.
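As a rough C model of that instruction (a sketch only: the function name rep_movsl and its parameters are made up here to mirror the registers, and it assumes the direction flag is clear so that addresses count upward):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of `rep movsl` with DF=0: copy `ecx` 32-bit words
 * from `esi` to `edi`, advancing both pointers by 4 bytes per word.
 * Returns the number of bytes copied. */
size_t rep_movsl(uint32_t *edi, const uint32_t *esi, size_t ecx)
{
    size_t bytes = 0;
    while (ecx != 0) {
        *edi++ = *esi++;            /* movsl: one 4-byte move, both pointers += 4 */
        ecx--;                      /* rep: count down until ecx reaches 0 */
        bytes += sizeof(uint32_t);
    }
    return bytes;
}
```

Called with ecx = 19 it returns 76, which is the byte count worked out above.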
I am taking this software security class, but I have never done C before. I have taken a computer organization class, but I'm not confident in assembly at all. I commented all the lines in the file generated by objdump to help myself understand it, but several things still don't make sense to me.
What I got from gdb is at the end. Based on that, can someone explain to me:
Based on my understanding so far, the format string should be %d %d, the number of arguments converted by sscanf should be larger than 1, and the first argument should be smaller than 5, so I typed 2 3. But the arrow in GDB's disas output shows I am still stuck on the first line of the code, and I don't know what I did wrong that keeps me from proceeding.
Which line is telling me the rule for the switch? In other people's assembly code the pattern is, for example, *0x402470(,%rax,8), meaning 0x402470 + %rax*8, and then you can print out the content at the corresponding address. I don't know where to find this pattern here. All I can see is *%rax, but when I print it out, it's just the string I typed in.
What are # 0x555555556cf5 in line <+28> and # 0x555555556a80 in line <+58>? They always turn out to be very useful; from reading other people's posts I know what I am looking for, but I don't know what they are.
I learned %rax and (%rax), but what is *%rax? I can't imagine a case beyond using the value directly or using the value as an address.
Based on what I read in a GDB tutorial, x displays memory content and p prints a value. But a value is always stored somewhere in memory, so if I am using an address, are the two just the same? When should I use which one?
Any suggestion or guide would be much appreciated!!! I am taking an online class on ARM assembly too, so suggestions on what more specific material I should look into would be much appreciated as well. Thank you!!!!
That's number 2. Keep going!
2 3
Breakpoint 1, 0x00005555555552cd in phase_3 ()
(gdb) disas
Dump of assembler code for function phase_3:
=> 0x00005555555552cd <+0>: sub $0x18,%rsp
0x00005555555552d1 <+4>: mov %fs:0x28,%rax
0x00005555555552da <+13>: mov %rax,0x8(%rsp)
0x00005555555552df <+18>: xor %eax,%eax
0x00005555555552e1 <+20>: lea 0x4(%rsp),%rcx
0x00005555555552e6 <+25>: mov %rsp,%rdx
0x00005555555552e9 <+28>: lea 0x1a05(%rip),%rsi # 0x555555556cf5
0x00005555555552f0 <+35>: callq 0x555555554f20 <__isoc99_sscanf@plt>
0x00005555555552f5 <+40>: cmp $0x1,%eax
0x00005555555552f8 <+43>: jle 0x555555555317 <phase_3+74>
0x00005555555552fa <+45>: cmpl $0x7,(%rsp)
0x00005555555552fe <+49>: ja 0x55555555539d <phase_3+208>
0x0000555555555304 <+55>: mov (%rsp),%eax
0x0000555555555307 <+58>: lea 0x1772(%rip),%rdx # 0x555555556a80
0x000055555555530e <+65>: movslq (%rdx,%rax,4),%rax
0x0000555555555312 <+69>: add %rdx,%rax
0x0000555555555315 <+72>: jmpq *%rax
0x0000555555555317 <+74>: callq 0x5555555559d3 <explode_bomb>
0x000055555555531c <+79>: jmp 0x5555555552fa <phase_3+45>
0x000055555555531e <+81>: mov $0x2ad,%eax
0x0000555555555323 <+86>: jmp 0x55555555532a <phase_3+93>
0x0000555555555325 <+88>: mov $0x0,%eax
0x000055555555532a <+93>: sub $0x228,%eax
0x000055555555532f <+98>: add $0x29e,%eax
0x0000555555555334 <+103>: sub $0xee,%eax
0x0000555555555339 <+108>: add $0xee,%eax
0x000055555555533e <+113>: sub $0xee,%eax
0x0000555555555343 <+118>: add $0xee,%eax
0x0000555555555348 <+123>: sub $0xee,%eax
0x000055555555534d <+128>: cmpl $0x5,(%rsp)
0x0000555555555351 <+132>: jg 0x555555555359 <phase_3+140>
0x0000555555555353 <+134>: cmp %eax,0x4(%rsp)
0x0000555555555357 <+138>: je 0x55555555535e <phase_3+145>
0x0000555555555359 <+140>: callq 0x5555555559d3 <explode_bomb>
0x000055555555535e <+145>: mov 0x8(%rsp),%rax
0x0000555555555363 <+150>: xor %fs:0x28,%rax
0x000055555555536c <+159>: jne 0x5555555553a9 <phase_3+220>
0x000055555555536e <+161>: add $0x18,%rsp
0x0000555555555372 <+165>: retq
0x0000555555555373 <+166>: mov $0x0,%eax
0x0000555555555378 <+171>: jmp 0x55555555532f <phase_3+98>
0x000055555555537a <+173>: mov $0x0,%eax
0x000055555555537f <+178>: jmp 0x555555555334 <phase_3+103>
0x0000555555555381 <+180>: mov $0x0,%eax
0x0000555555555386 <+185>: jmp 0x555555555339 <phase_3+108>
0x0000555555555388 <+187>: mov $0x0,%eax
0x000055555555538d <+192>: jmp 0x55555555533e <phase_3+113>
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) i r
rax 0x555555758760 93824994346848
rbx 0x0 0
rcx 0x5 5
rdx 0x555555758760 93824994346848
rsi 0x3 3
rdi 0x555555758760 93824994346848
rbp 0x0 0x0
rsp 0x7fffffffdf78 0x7fffffffdf78
r8 0x7ffff7ff7006 140737354100742
r9 0x0 0
r10 0x5 5
r11 0x246 582
r12 0x555555554fe0 93824992235488
r13 0x7fffffffe060 140737488347232
r14 0x0 0
r15 0x0 0
rip 0x5555555552cd 0x5555555552cd <phase_3>
eflags 0x206 [ PF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) x/s $rdx
0x555555758760 <input_strings+160>: "2 3"
(gdb) x/s 0x555555556cf5
0x555555556cf5: "%d %d"
(gdb) x/s $rsp
0x7fffffffdf78: "\206QUUUU"
(gdb) x 0x555555556a80
0x555555556a80: 0xffffe89e
(gdb) p 0x555555556a80
$1 = 93824992242304
(gdb) x/8a 0x555555556a80
0x555555556a80: 0xffffe8a5ffffe89e 0xffffe8faffffe8f3
0x555555556a90: 0xffffe908ffffe901 0xffffe916ffffe90f
0x555555556aa0 <array.3415>: 0xa00000002 0x100000006
0x555555556ab0 <array.3415+16>: 0x100000000c 0x300000009
(gdb) x/s $r8
0x7ffff7ff7006: "8 16 32\no give Tina Fey more material.\n"
(gdb) x/s $r12
0x555555554fe0 <_start>: "1\355I\211\321^H\211\342H\203\344\360PTL\215\005\252\030"
(gdb) x/s $r13
0x7fffffffe060: "\001"
(gdb) x/s $rip
0x5555555552cd <phase_3>: "H\203\354\030dH\213\004%("
You put a breakpoint at that location. To proceed, use stepi/nexti or set another breakpoint. Note that the call is to sscanf, which reads from a string. Your input has already been read from stdin by that point; it's passed as an argument into this function.
+65 to +72 is the switch; it's just broken up into parts because of an extra addition.
That's a friendly service of your disassembler. It shows the actual calculated address so you don't have to figure out what e.g. 0x1a05(%rip) will be.
* means indirect jump in AT&T syntax. jmp *%rax is "jump to the address stored in rax". It's needed to differentiate jmp foo and jmp *foo. Register operands are unambiguous but the notation is still used (gas will issue a warning otherwise).
Not all values are in memory. To print a register, for example, you must use p. You can use p to print the content of memory too by dereferencing, but x is more flexible for that purpose.
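To make the switch part concrete, here is a small C sketch of what <+58> to <+72> compute. The function name switch_target is mine; the table address and the entries come from the x/8a 0x555555556a80 output in the question (each 8-byte word there holds two little-endian 32-bit entries, low half first). It assumes the entries are signed 32-bit offsets relative to the table itself, which is what movslq followed by add %rdx,%rax implies:

```c
#include <stdint.h>

/* Address of the jump table, from the disassembler comment on <+58>. */
#define TABLE_BASE UINT64_C(0x555555556a80)

/* The eight entries dumped by GDB, written as signed offsets
 * (0xffffe89e is -0x1762 as a 32-bit signed value, and so on). */
static const int32_t jump_table[8] = {
    -0x1762, -0x175b, -0x170d, -0x1706,
    -0x16ff, -0x16f8, -0x16f1, -0x16ea
};

/* Returns the address that `jmpq *%rax` would jump to for a given first
 * input number, or 0 when the `ja` at <+49> would be taken instead
 * (its target, <+208>, is not shown in the listing). */
uint64_t switch_target(uint32_t index)
{
    if (index > 7)
        return 0;
    /* movslq (%rdx,%rax,4),%rax ; add %rdx,%rax */
    return TABLE_BASE + (uint64_t)(int64_t)jump_table[index];
}
```

Index 0 gives 0x55555555531e (<+81>), index 1 gives 0x555555555325 (<+88>), and index 2 gives 0x555555555373 (<+166>), all of which are the mov instructions that start the case bodies in the listing.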
If you view the binary in a disassembler like Binary Ninja or IDA Pro, it will show you the addresses of the switch cases.
I have an exam coming up, and I'm struggling with assembly. I have written some simple C code, gotten its assembly code, and am now trying to comment the assembly code as practice. The C code:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char const *argv[])
{
int x = 10;
char const* y = argv[1];
printf("%s\n",y );
return 0;
}
Its assembly code:
0x00000000000006a0 <+0>: push %rbp # Creating stack
0x00000000000006a1 <+1>: mov %rsp,%rbp # Saving base of stack into base pointer register
0x00000000000006a4 <+4>: sub $0x20,%rsp # Allocate 32 bytes of space on the stack
0x00000000000006a8 <+8>: mov %edi,-0x14(%rbp) # First argument stored in stackframe
0x00000000000006ab <+11>: mov %rsi,-0x20(%rbp) # Second argument stored in stackframe
0x00000000000006af <+15>: movl $0xa,-0xc(%rbp) # Value 10 stored in x's address in the stackframe
0x00000000000006b6 <+22>: mov -0x20(%rbp),%rax # Second argument stored in return value register
0x00000000000006ba <+26>: mov 0x8(%rax),%rax # ??
0x00000000000006be <+30>: mov %rax,-0x8(%rbp) # ??
0x00000000000006c2 <+34>: mov -0x8(%rbp),%rax # ??
0x00000000000006c6 <+38>: mov %rax,%rdi # Return value copied to 1st argument register - why??
0x00000000000006c9 <+41>: callq 0x560 # printf??
0x00000000000006ce <+46>: mov $0x0,%eax # Value 0 is copied to return register
0x00000000000006d3 <+51>: leaveq # Destroying stackframe
0x00000000000006d4 <+52>: retq # Popping return address, and setting instruction pointer equal to it
Can a friendly soul help me out wherever I have "??" (meaning I don't understand what is happening or I'm unsure)?
0x00000000000006ba <+26>: mov 0x8(%rax),%rax # get argv[1] to rax
0x00000000000006be <+30>: mov %rax,-0x8(%rbp) # move argv[1] to local variable
0x00000000000006c2 <+34>: mov -0x8(%rbp),%rax # move local variable to rax (for move to rdi)
0x00000000000006c6 <+38>: mov %rax,%rdi # now rdi has argv[1]
0x00000000000006c9 <+41>: callq 0x560 # it is puts (optimized)
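In C terms, mov 0x8(%rax),%rax is just argv[1] spelled out as pointer arithmetic. A small sketch, assuming pointers are 8 bytes as on x86-64 so that sizeof(char *) matches the 0x8 displacement (the helper name load_argv1 is made up for illustration):

```c
#include <string.h>

/* Hypothetical helper that mimics the two loads at <+22> and <+26>. */
const char *load_argv1(const char *const *argv)
{
    /* mov -0x20(%rbp),%rax : rax now holds argv, the address of the array */
    const unsigned char *rax = (const unsigned char *)argv;
    const char *result;

    /* mov 0x8(%rax),%rax : read the pointer stored one slot past argv[0] */
    memcpy(&result, rax + sizeof(char *), sizeof result);
    return result;
}
```

The value returned is the same pointer as argv[1], which is what ends up in rdi for the call.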
I will try to make a guess:
mov -0x20(%rbp),%rax # retrieve argv (the pointer to the argument array)
mov 0x8(%rax),%rax # load argv[1] into rax
mov %rax,-0x8(%rbp) # store argv[1] (which now is in rax) into y
mov -0x8(%rbp),%rax # put y back into rax (which might look dumb, but possibly it has its reasons)
mov %rax,%rdi # copy y to rdi, possibly to prepare the context for the printf
When you deal with assembler, please specify which architecture you are using. An Intel processor uses a different instruction set from an ARM one, and even similar-looking instructions may behave differently or rely on different assumptions. As you might know, optimisations change the sequence of instructions generated by the compiler, so you might want to specify whether you are using them as well (looks like not?) and which compiler you are using, since each one has its own policy for generating assembler.
Maybe we will never know why the compiler must prepare the context for printf by copying from rax; it could be the compiler's choice or an obligation imposed by the specific architecture. For all those annoying reasons, most people prefer to use a "high level language" such as C, so that the set of instructions is always right, although it might look very dumb to a human (as we know, computers are dumb by design) and is not always the most efficient choice. That's why there are still many compilers around.
I can give you two more tips:
your IDE must have a way to interleave assembler instructions with C code, and to single-step within the assembler. Try to find it and explore it yourself
the IDE should also have a function to explore the memory of your program. If you find it, try to enter the 0x560 address and look where it leads you. It is very likely that it will be the entry point of your printf
I hope that my answer will help you work it out, good luck
My hello & regards to all. I have a C program, basically written for testing buffer overflow.
#include<stdio.h>
void display()
{
char buff[8];
gets(buff);
puts(buff);
}
main()
{
display();
return(0);
}
Now I disassembled the display and main functions of it using GDB. The code:
Dump of assembler code for function main:
0x080484ae <+0>: push %ebp # saving ebp to stack
0x080484af <+1>: mov %esp,%ebp # saving esp in ebp
0x080484b1 <+3>: call 0x8048474 <display> # calling display function
0x080484b6 <+8>: mov $0x0,%eax # move 0 into eax , but WHY ????
0x080484bb <+13>: pop %ebp # remove ebp from stack
0x080484bc <+14>: ret # return
End of assembler dump.
Dump of assembler code for function display:
0x08048474 <+0>: push %ebp #saves ebp to stack
0x08048475 <+1>: mov %esp,%ebp # saves esp to ebp
0x08048477 <+3>: sub $0x10,%esp # making 16 bytes space in stack
0x0804847a <+6>: mov %gs:0x14,%eax # what does it mean ????
0x08048480 <+12>: mov %eax,-0x4(%ebp) # move eax contents to 4 bytes lower in stack
0x08048483 <+15>: xor %eax,%eax # xor eax with itself (but WHY??)
0x08048485 <+17>: lea -0xc(%ebp),%eax #Load effective address of 12 bytes lower placed value ( WHY???? )
0x08048488 <+20>: mov %eax,(%esp) #make esp point to the address inside of eax
0x0804848b <+23>: call 0x8048374 <gets@plt> # calling get, what is "@plt" ????
0x08048490 <+28>: lea -0xc(%ebp),%eax # LEA of 12 bytes lower to eax
0x08048493 <+31>: mov %eax,(%esp) # make esp point to eax contained address
0x08048496 <+34>: call 0x80483a4 <puts@plt> # again what is "@plt" ????
0x0804849b <+39>: mov -0x4(%ebp),%eax # move (ebp - 4) location's contents to eax
0x0804849e <+42>: xor %gs:0x14,%eax # # again what is this ????
0x080484a5 <+49>: je 0x80484ac <display+56> # Not known to me
0x080484a7 <+51>: call 0x8048394 <__stack_chk_fail@plt> # not known to me
0x080484ac <+56>: leave # a new instruction, not known to me
0x080484ad <+57>: ret # return to MAIN's next instruction
End of assembler dump.
So folks, please consider this my homework. The rest of the code is known to me, except for a few lines. I have included a big "WHY ????" and some more questions in the comments next to each line. The first hurdle for me is the "mov %gs:0x14,%eax" instruction; I can't make a flow chart past this instruction. Could somebody please explain what these few instructions are meant for and what they do in the program? Thanks...
0x080484b6 <+8>: mov $0x0,%eax # move 0 into eax , but WHY ????
Don't you have this?:
return(0);
They are probably related. :)
0x0804847a <+6>: mov %gs:0x14,%eax # what does it mean ????
It means reading 4 bytes into eax from memory at address gs:0x14. gs is a segment register. Most likely thread-local storage (AKA TLS) is referenced through this register.
0x08048483 <+15>: xor %eax,%eax # xor eax with itself (but WHY??)
Don't know. Could be optimization-related.
0x08048485 <+17>: lea -0xc(%ebp),%eax #Load effective address of 12 bytes lower placed value ( WHY???? )
It makes eax point to a local variable that lives on the stack. sub $0x10,%esp allocated some space for them.
0x08048488 <+20>: mov %eax,(%esp) #make esp point to the address inside of eax
Wrong. It writes eax to the stack, to the stack top. It will be passed as an on-stack argument to the called function:
0x0804848b <+23>: call 0x8048374 <gets@plt> # calling get, what is "@plt" ????
I don't know. Could be some name mangling.
By now you should've guessed what local variable that was. buff, what else could it be?
0x080484ac <+56>: leave # a new instruction, not known to me
Why don't you look it up in the CPU manual?
Now, I can probably explain you the gs/TLS thing...
0x08048474 <+0>: push %ebp #saves ebp to stack
0x08048475 <+1>: mov %esp,%ebp # saves esp to ebp
0x08048477 <+3>: sub $0x10,%esp # making 16 bytes space in stack
0x0804847a <+6>: mov %gs:0x14,%eax # what does it mean ????
0x08048480 <+12>: mov %eax,-0x4(%ebp) # move eax contents to 4 bytes lower in stack
...
0x0804849b <+39>: mov -0x4(%ebp),%eax # move (ebp - 4) location's contents to eax
0x0804849e <+42>: xor %gs:0x14,%eax # # again what is this ????
0x080484a5 <+49>: je 0x80484ac <display+56> # Not known to me
0x080484a7 <+51>: call 0x8048394 <__stack_chk_fail@plt> # not known to me
0x080484ac <+56>
So, this code takes a value from the TLS (at gs:0x14) and stores it right below the saved ebp value (at ebp-4). Then there's your stuff with gets() and puts(). Then this code checks whether the copy of the value from the TLS is unchanged. xor %gs:0x14,%eax does the compare.
If XORed values are the same, the result of the XOR is 0 and flags.zf is 1. Else, the result isn't 0 and flags.zf is 0.
je 0x80484ac <display+56> checks flags.zf and skips call 0x8048394 <__stack_chk_fail@plt> if flags.zf = 1. IOW, this call is skipped if the copy of the value from the TLS is unchanged.
What is that all about? That's a way to try to catch a buffer overflow. If you write beyond the end of the buffer, you will overwrite that value copied from the TLS to the stack.
Why do we take this value from the TLS, why not just a constant, hard-coded value? We probably want to use different, non-hard-coded values to catch overflows more often (and so the value in the TLS will change from one run of your program to the next and it will be different in different threads of your program). That also lowers the chances of an attacker successfully exploiting the buffer overflow if the value is chosen randomly each time your program runs.
Finally, if the copy of the value is found to have been overwritten due to a buffer overflow, call 0x8048394 <__stack_chk_fail@plt> will call a special function dedicated to doing whatever's necessary, e.g. reporting a problem and terminating the program.
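The whole mechanism can be imitated in plain C without actually smashing a real stack. The sketch below is only a model: the frame is a byte array laid out like display()'s locals (8 bytes of buff, then the 4-byte guard copy at ebp-4), the names frame_model and guard_survives are invented, and GUARD_VALUE is a made-up constant standing in for the value that really comes from gs:0x14:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 8u                        /* char buff[8] in display() */
#define GUARD_VALUE UINT32_C(0x5ca1ab1e)   /* hypothetical; the real one is not a constant */

/* Model of the locals: buff first, then the guard copy directly above it. */
struct frame_model {
    unsigned char bytes[BUF_SIZE + sizeof(uint32_t)];
};

/* Copies `len` input bytes starting at the buffer, the way an unchecked
 * gets() would, then repeats the epilogue check.
 * Returns 1 if the guard survived, 0 if __stack_chk_fail would be called. */
int guard_survives(const unsigned char *input, size_t len)
{
    struct frame_model frame;
    uint32_t guard = GUARD_VALUE;
    uint32_t copy;

    memset(frame.bytes, 0, sizeof frame.bytes);
    memcpy(frame.bytes + BUF_SIZE, &guard, sizeof guard);  /* mov %eax,-0x4(%ebp) */

    if (len > sizeof frame.bytes)        /* keep the model itself in bounds */
        len = sizeof frame.bytes;
    memcpy(frame.bytes, input, len);     /* overflows into the guard if len > 8 */

    memcpy(&copy, frame.bytes + BUF_SIZE, sizeof copy);    /* mov -0x4(%ebp),%eax */
    return (copy ^ guard) == 0;          /* xor %gs:0x14,%eax ; je */
}
```

Inputs of up to 8 bytes leave the guard alone; anything longer changes at least one of its bytes here, so the XOR is nonzero and the model reports the failure path.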
0x0804849e <+42>: xor %gs:0x14,%eax # # again what is this ????
0x080484a5 <+49>: je 0x80484ac <display+56> # Not known to me
0x080484a7 <+51>: call 0x8048394 <__stack_chk_fail@plt> # not known to me
0x080484ac <+56>: leave # a new instruction, not known to me
0x080484ad <+57>: ret # return to MAIN's next instruction
The gs segment can be used for thread local storage. E.g. it's used for errno, so that each thread in a multi-threaded program effectively has its own errno variable.
The function name above is a big clue. This must be a stack canary.
(leave is some CISC instruction that does everything you need to do before the actual ret. I don't know the details).
Others already explained the GS thing (has to do with threads)..
0x08048483 <+15>: xor %eax,%eax # xor eax with itself (but WHY??)
Explaining this requires some history of the X86 architecture:
The xor eax, eax instruction clears out all bits in register eax (loads a zero), but as you've already noticed this seems to be unnecessary because the register gets loaded with a new value in the next instruction.
However, xor eax, eax does something else on the x86 as well. You probably know that you are able to access parts of the register eax by using al, ah and ax. It has been that way since the 386, and it was okay back then when eax really was a single register.
However, this is no longer the case. The registers that you see and use in your code are just placeholders. Internally the CPU works with many more registers and a completely different instruction set. Instructions that you write are translated into this internal instruction set.
If you use AL, AH and EAX, for example, you are using three different registers from the CPU's point of view.
Now if you access EAX after you have used AL or AH, the CPU has to merge these different registers back together to build a valid EAX value.
The line:
0x08048483 <+15>: xor %eax,%eax # xor eax with itself (but WHY??)
does not only clear out register eax. It also tells the CPU that all renamed sub-registers (AL, AH and AX) can now be considered invalidated (set to zero), so the CPU does not have to do any sub-register merging.
Why is the compiler emitting this instruction?
Because the compiler does not know in which context display() will get called. You may call it from a piece of code that does lots of byte arithmetic using AL and AH. If it would not clear out the EAX register via XOR, the CPU would have to do the costly register merging which takes a lot of cycles.
So doing this extra work at the function start improves performance. It is unnecessary in your case, but since the compiler can't know that, it emits the instruction to be sure.
__stack_chk_fail is part of gcc's buffer overflow check. It uses libssp (stack-smash-protection): your mov at the beginning sets up a guard for the stack, and the xor %gs:0x14... checks whether the guard is still OK. When it is OK, the code jumps to the leave (check the assembler docs for it; it's a helper instruction for stack handling) and skips the call to __stack_chk_fail, which would abort the program and emit an error message.
You can disable the emitting of this overflow check with the gcc option -fno-stack-protector.
And as already mentioned in the comments, the xor x,x is just a quick way to clear x, and the final mov $0x0,%eax is for the return value of your main.
I've been working on the bufbomb lab from CS:APP and I've gotten stuck on one of the phases.
I won't get into the gory details of the project since I just need a nudge in the right direction. I'm having a hard time finding the starting address of the array called "buf" in the given assembly.
We're given a function called getbuf:
#define NORMAL_BUFFER_SIZE 32
int getbuf()
{
char buf[NORMAL_BUFFER_SIZE];
Gets(buf);
return 1;
}
And the assembly dumps:
Dump of assembler code for function getbuf:
0x08048d92 <+0>: sub $0x3c,%esp
0x08048d95 <+3>: lea 0x10(%esp),%eax
0x08048d99 <+7>: mov %eax,(%esp)
0x08048d9c <+10>: call 0x8048c66 <Gets>
0x08048da1 <+15>: mov $0x1,%eax
0x08048da6 <+20>: add $0x3c,%esp
0x08048da9 <+23>: ret
End of assembler dump.
Dump of assembler code for function Gets:
0x08048c66 <+0>: push %ebp
0x08048c67 <+1>: push %edi
0x08048c68 <+2>: push %esi
0x08048c69 <+3>: push %ebx
0x08048c6a <+4>: sub $0x1c,%esp
0x08048c6d <+7>: mov 0x30(%esp),%esi
0x08048c71 <+11>: movl $0x0,0x804e100
0x08048c7b <+21>: mov %esi,%ebx
0x08048c7d <+23>: jmp 0x8048ccf <Gets+105>
0x08048c7f <+25>: mov %eax,%ebp
0x08048c81 <+27>: mov %al,(%ebx)
0x08048c83 <+29>: add $0x1,%ebx
0x08048c86 <+32>: mov 0x804e100,%eax
0x08048c8b <+37>: cmp $0x3ff,%eax
0x08048c90 <+42>: jg 0x8048ccf <Gets+105>
0x08048c92 <+44>: lea (%eax,%eax,2),%edx
0x08048c95 <+47>: mov %ebp,%ecx
0x08048c97 <+49>: sar $0x4,%cl
0x08048c9a <+52>: mov %ecx,%edi
0x08048c9c <+54>: and $0xf,%edi
0x08048c9f <+57>: movzbl 0x804a478(%edi),%edi
0x08048ca6 <+64>: mov %edi,%ecx
---Type <return> to continue, or q <return> to quit---
0x08048ca8 <+66>: mov %cl,0x804e140(%edx)
0x08048cae <+72>: mov %ebp,%ecx
0x08048cb0 <+74>: and $0xf,%ecx
0x08048cb3 <+77>: movzbl 0x804a478(%ecx),%ecx
0x08048cba <+84>: mov %cl,0x804e141(%edx)
0x08048cc0 <+90>: movb $0x20,0x804e142(%edx)
0x08048cc7 <+97>: add $0x1,%eax
0x08048cca <+100>: mov %eax,0x804e100
0x08048ccf <+105>: mov 0x804e110,%eax
0x08048cd4 <+110>: mov %eax,(%esp)
0x08048cd7 <+113>: call 0x8048820 <_IO_getc@plt>
0x08048cdc <+118>: cmp $0xffffffff,%eax
0x08048cdf <+121>: je 0x8048ce6 <Gets+128>
0x08048ce1 <+123>: cmp $0xa,%eax
0x08048ce4 <+126>: jne 0x8048c7f <Gets+25>
0x08048ce6 <+128>: movb $0x0,(%ebx)
0x08048ce9 <+131>: mov 0x804e100,%eax
0x08048cee <+136>: movb $0x0,0x804e140(%eax,%eax,2)
0x08048cf6 <+144>: mov %esi,%eax
0x08048cf8 <+146>: add $0x1c,%esp
0x08048cfb <+149>: pop %ebx
0x08048cfc <+150>: pop %esi
0x08048cfd <+151>: pop %edi
---Type <return> to continue, or q <return> to quit---
0x08048cfe <+152>: pop %ebp
0x08048cff <+153>: ret
End of assembler dump.
I'm having a difficult time locating where the starting address of buf is (or where buf is at all in this mess!). If someone could point that out to me, I'd greatly appreciate it.
Attempt at a solution
Reading symbols from /home/user/CS247/buflab/buflab-handout/bufbomb...(no debugging symbols found)...done.
(gdb) break getbuf
Breakpoint 1 at 0x8048d92
(gdb) run -u user < firecracker-exploit.bin
Starting program: /home/user/CS247/buflab/buflab-handout/bufbomb -u user < firecracker-exploit.bin
Userid: ...
Cookie: ...
Breakpoint 1, 0x08048d92 in getbuf ()
(gdb) print buf
No symbol table is loaded. Use the "file" command.
(gdb)
As has been pointed out by some other people, buf is allocated on the stack at run time. See these lines in the getbuf() function:
0x08048d92 <+0>: sub $0x3c,%esp
0x08048d95 <+3>: lea 0x10(%esp),%eax
0x08048d99 <+7>: mov %eax,(%esp)
The first line subtracts 0x3c (60) bytes from the stack pointer, effectively allocating that much space. The extra bytes beyond the 32 for buf are probably for parameters for Gets (it's hard to tell precisely what the calling convention for Gets is, so it's hard to say). The second line computes the address 16 bytes above the new stack pointer. This leaves 44 bytes (60 - 16) of the frame above that address, which is where the 32 bytes of buf live. The third line stores that address at the top of the stack, presumably as the argument for the Gets call (remember the stack grows down, so the stack pointer points at the last item on the stack). I am not sure why the compiler generated such strange offsets (60 bytes and then 44), but there is probably a good reason. If I figure it out I will update here.
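The frame arithmetic can be written down as a few lines of C. This is only a sketch of the bookkeeping, with made-up names; it relies on the fact that getbuf pushes nothing before the sub, so right after it the return address sits at 0x3c(%esp):

```c
#include <stddef.h>

#define FRAME_BYTES 0x3c   /* sub $0x3c,%esp */
#define BUF_OFFSET  0x10   /* lea 0x10(%esp),%eax */
#define BUF_BYTES   32     /* NORMAL_BUFFER_SIZE */

/* Bytes from buf[0] up to the saved return address. */
size_t bytes_to_return_address(void)
{
    return FRAME_BYTES - BUF_OFFSET;
}

/* Bytes of the frame that lie above the end of buf but below the return address. */
size_t bytes_above_buf(void)
{
    return bytes_to_return_address() - BUF_BYTES;
}
```

That gives 44 bytes from the start of buf to the return address, of which 12 lie beyond the 32 bytes of buf itself.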
Inside the gets function we have the following lines:
0x08048c66 <+0>: push %ebp
0x08048c67 <+1>: push %edi
0x08048c68 <+2>: push %esi
0x08048c69 <+3>: push %ebx
0x08048c6a <+4>: sub $0x1c,%esp
0x08048c6d <+7>: mov 0x30(%esp),%esi
Here we see that Gets saves four registers, which adds up to 16 bytes, and then reserves 28 (0x1c) bytes on the stack. The last line is key: it grabs the value 0x30 bytes up the stack and loads it into %esi. This value is the address of buf that getbuf put on the stack. Why? 4 bytes for the return address plus 16 for the saved registers plus 28 reserved = 48, and 0x30 = 48, so it is grabbing the last item getbuf() placed on the stack before calling Gets.
To get the address of buf you have to actually run the program in the debugger, because the address will probably be different every time you run the program, or even every time the function is called. You can set a breakpoint at any of the lines above and either dump the %eax register once it holds the address (after the second line of getbuf), or dump the %esi register after it has been loaded from the stack. Either will be the pointer to your buffer.
To be able to see debugging info while using gdb, you must use the -g3 switch with gcc when you compile. See man gcc for more details on the -g switch.
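The same sum as a tiny C helper, in case you want to redo it for another function. The name first_arg_offset is invented for this sketch, and it assumes 32-bit x86 where each push and the return address take 4 bytes:

```c
#include <stddef.h>

/* Offset from %esp (after the prologue) to the first stack argument:
 * local space, then the pushed registers, then the return address. */
size_t first_arg_offset(size_t pushed_registers, size_t local_bytes)
{
    const size_t word = 4;   /* size of a push and of the return address */
    return local_bytes + pushed_registers * word + word;
}
```

For Gets that is first_arg_offset(4, 0x1c), which comes out to 0x30, the displacement used in mov 0x30(%esp),%esi.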
Only then, gcc will add debugging info (symbol table) into the executable.
0x08048cd4 <+110>: mov %eax,(%esp)
0x08048cd7 <+113>: call 0x8048820 <_IO_getc@plt>
0x08048cdc <+118>: cmp $0xffffffff,%eax
0x08048cdf <+121>: je 0x8048ce6 <Gets+128>
0x08048ce1 <+123>: cmp $0xa,%eax
0x08048ce4 <+126>: jne 0x8048c7f <Gets+25>
0x08048ce6 <+128>: movb $0x0,(%ebx)
0x08048ce9 <+131>: mov 0x804e100,%eax
0x08048cee <+136>: movb $0x0,0x804e140(%eax,%eax,2)
0x08048cf6 <+144>: mov %esi,%eax
0x08048cf8 <+146>: add $0x1c,%esp
0x08048cfb <+149>: pop %ebx
0x08048cfc <+150>: pop %esi
0x08048cfd <+151>: pop %edi
---Type <return> to continue, or q <return> to quit---
0x08048cfe <+152>: pop %ebp
0x08048cff <+153>: ret
End of assembler dump.
I don't know your flavour of asm, but there's a call in there which may use the start address
The end of the program pops various pointers
That's where I'd start looking
If you can tweak the asm for these functions you can input your own routines to dump data as the function runs and before those pointers get popped
buf is allocated on the stack. Therefore, you will not be able to spot its address from an assembly listing. In other words, buf is allocated (and its address therefore known) only when you enter the function getbuf() at runtime.
If you must know the address, one option would be to use gdb (but make sure you compile with the -g flag to enable debugging support) and then:
gdb a.out # I'm assuming your binary is a.out
break getbuf # Set a breakpoint where you want gdb to stop
run # Run the program. Supply args if you need to
# WAIT FOR your program to reach getbuf and stop
print buf
If you want to go this route, a good gdb tutorial is essential.
You could also place a printf inside getbuf and debug that way - it depends on what you are trying to do.
One other point leaps out from your code. Upon return from getbuf, the result of Gets will be trashed. This is because Gets is presumably writing its results into the stack-allocated buf. When you return from getbuf, your stack is blown and you cannot reliably access buf.