gdb weird backtrace - c

My program is statically linked against dietlibc. It is compiled on Ubuntu x64 (targeting x86 with the -m32 flag) and runs on CentOS x86.
The compiled size is only about 100KB. I compile it with -ggdb3 and no optimization flags.
My program uses signal.h to handle a SIGSEGV signal and then calls abort().
The program runs without problems for days but sometimes segfaults. This is when I get weird backtraces that I do not understand:
username@ubuntu:~/Desktop$ gdb -c core.28569 program-name
GNU gdb (GDB) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-linux-gnu --target=i386-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from program-name...done.
[New Thread 28569]
Core was generated by `program-name'.
Program terminated with signal 6, Aborted.
#0 0x00914410 in __kernel_vsyscall ()
Setting up the environment for debugging gdb.
Function "internal_error" not defined.
Make breakpoint pending on future shared library load? (y or [n]) [answered N; input not from terminal]
Function "info_command" not defined.
Make breakpoint pending on future shared library load? (y or [n]) [answered N; input not from terminal]
.gdbinit:8: Error in sourced command file:
Argument required (one or more breakpoint numbers).
(gdb) bt
#0 0x00914410 in __kernel_vsyscall ()
During symbol reading, incomplete CFI data; unspecified registers (e.g., eax) at 0x914411.
#1 0x0804d7f4 in __unified_syscall ()
#2 0xbf8966c0 in ?? ()
#3 <signal handler called>
#4 0x2054454e in ?? ()
#5 0x20524c43 in ?? ()
#6 0x2e352e33 in ?? ()
#7 0x32373033 in ?? ()
#8 0x2e203b39 in ?? ()
#9 0x2054454e in ?? ()
#10 0x20524c43 in ?? ()
#11 0x2e302e33 in ?? ()
#12 0x32373033 in ?? ()
#13 0x4d203b39 in ?? ()
#14 0x61696465 in ?? ()
#15 0x6e654320 in ?? ()
#16 0x20726574 in ?? ()
#17 0x36204350 in ?? ()
#18 0x203b302e in ?? ()
#19 0x54454e2e in ?? ()
#20 0x43302e34 in ?? ()
#21 0x00000029 in ?? ()
#22 0xbf8989a8 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) bt full
#0 0x00914410 in __kernel_vsyscall ()
No symbol table info available.
#1 0x0804d7f4 in __unified_syscall ()
No symbol table info available.
#2 0xbf8966c0 in ?? ()
No symbol table info available.
#3 <signal handler called>
No symbol table info available.
#4 0x2054454e in ?? ()
No symbol table info available.
#5 0x20524c43 in ?? ()
No symbol table info available.
#6 0x2e352e33 in ?? ()
No symbol table info available.
#7 0x32373033 in ?? ()
No symbol table info available.
#8 0x2e203b39 in ?? ()
No symbol table info available.
#9 0x2054454e in ?? ()
No symbol table info available.
#10 0x20524c43 in ?? ()
No symbol table info available.
#11 0x2e302e33 in ?? ()
No symbol table info available.
#12 0x32373033 in ?? ()
No symbol table info available.
#13 0x4d203b39 in ?? ()
No symbol table info available.
#14 0x61696465 in ?? ()
No symbol table info available.
#15 0x6e654320 in ?? ()
No symbol table info available.
#16 0x20726574 in ?? ()
No symbol table info available.
#17 0x36204350 in ?? ()
No symbol table info available.
#18 0x203b302e in ?? ()
No symbol table info available.
#19 0x54454e2e in ?? ()
No symbol table info available.
#20 0x43302e34 in ?? ()
No symbol table info available.
#21 0x00000029 in ?? ()
No symbol table info available.
#22 0xbf8989a8 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) quit

It's a stack overrun.
#4 0x2054454e in ?? ()
That looks like text, " TEN" or "NET "
#5 0x20524c43 in ?? ()
" RLC" or "CLR "
And so on.
Treat the addresses as if they were text - see if you can identify where this text overwrites your stack.
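As a rough illustration of treating the addresses as text, the following Python sketch packs frames #4 to #21 from the backtrace above as 32-bit little-endian words (the byte order on x86) and prints the resulting bytes as ASCII. The helper name addresses_to_text is made up for this example; only the address values come from the question.

```python
import struct

# Frame addresses #4..#21, copied from the backtrace in the question.
frame_addresses = [
    0x2054454E, 0x20524C43, 0x2E352E33, 0x32373033, 0x2E203B39,
    0x2054454E, 0x20524C43, 0x2E302E33, 0x32373033, 0x4D203B39,
    0x61696465, 0x6E654320, 0x20726574, 0x36204350, 0x203B302E,
    0x54454E2E, 0x43302E34, 0x00000029,
]


def addresses_to_text(addresses):
    """Hypothetical helper: view 32-bit values as the bytes they occupy in memory."""
    # x86 is little-endian, so the lowest byte of each word comes first.
    raw = b"".join(struct.pack("<I", address) for address in addresses)
    # Replace non-printable bytes with '.' so the output stays readable.
    return "".join(chr(byte) if 32 <= byte < 127 else "." for byte in raw)


decoded = addresses_to_text(frame_addresses)
print(decoded)
```

Read this way, the "return addresses" spell out "NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C)", which resembles the tail of a browser User-Agent string. A buffer that receives such a string would be a reasonable first place to look for the overrun.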

Your stack trace is actually very easy to understand:
You got SIGSEGV somewhere.
Your signal handler did whatever it does, then called abort(),
which issued the raise(2) system call by calling __unified_syscall().
The reason you get no stack trace in GDB is that __unified_syscall
is implemented in assembly,
does not use a frame pointer, and
does not have proper CFI directives to describe how to unwind from it.
I would consider this a bug in dietlibc, quite easy to fix, actually. See if this (untested) patch fixes it for you:
--- dietlibc-0.31/i386/unified.S.orig 2011-03-13 10:16:23.000000000 -0700
+++ dietlibc-0.31/i386/unified.S 2011-03-13 10:21:32.000000000 -0700
@@ -31,8 +31,14 @@ __unified_syscall:
movzbl %al, %eax
.L1:
push %edi
+ cfi_adjust_cfa_offset (4)
+ cfi_rel_offset (edi, 0)
push %esi
+ cfi_adjust_cfa_offset (4)
+ cfi_rel_offset (esi, 0)
push %ebx
+ cfi_adjust_cfa_offset (4)
+ cfi_rel_offset (ebx, 0)
movl %esp,%edi
/* we use movl instead of pop because otherwise a signal would
destroy the stack frame and crash the program, although it
@@ -61,8 +67,11 @@ __unified_syscall:
#endif
.Lnoerror:
pop %ebx
+ cfi_adjust_cfa_offset (-4)
pop %esi
+ cfi_adjust_cfa_offset (-4)
pop %edi
+ cfi_adjust_cfa_offset (-4)
/* here we go and "reuse" the return for weak-void functions */
#include "dietuglyweaks.h"
If you can't rebuild dietlibc, or if the patch is incorrect, you may still be able to analyze the stack trace better. As far as I can tell, __unified_syscall does not touch %ebp. So you might be able to get a reasonable stack trace by doing this:
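# Walk the saved-%ebp chain; each frame stores [saved ebp, return address].
define xbt
  set $xbp = (void **)$arg0
  # Stop cleanly at the end of the chain instead of faulting on address 0.
  while $xbp != 0
    x/2a $xbp
    set $xbp = (void **)$xbp[0]
  end
end
xbt $ebp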
Note: if xbt works, it is likely to go into the weeds around the SIGSEGV signal frame (that frame does not use a frame pointer either). This may result in complete garbage, or in a skipped frame or two (which would be exactly the frames where SIGSEGV happened).
So you really are much better off getting proper unwind descriptors into dietlibc.
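To make the idea behind xbt concrete, here is a minimal Python sketch of the same saved-%ebp walk over a raw stack image. It is only an illustration: the function name walk_frame_chain, the base address, and the toy stack contents are all hypothetical. It assumes a 32-bit little-endian layout where each frame stores the caller's %ebp followed by the return address.

```python
import struct


def walk_frame_chain(memory, base, ebp, max_frames=64):
    """Follow saved-EBP links in a 32-bit little-endian stack image.

    memory is a raw dump whose first byte lives at address base.
    Returns a list of (frame_pointer, return_address) pairs.
    """
    frames = []
    while len(frames) < max_frames:
        offset = ebp - base
        # Stop if the frame pointer leaves the dumped region.
        if offset < 0 or offset + 8 > len(memory):
            break
        saved_ebp, return_address = struct.unpack_from("<II", memory, offset)
        frames.append((ebp, return_address))
        # The stack grows down, so each caller's frame must sit at a higher
        # address; a null or non-increasing link means the chain has ended
        # or is corrupt.
        if saved_ebp <= ebp:
            break
        ebp = saved_ebp
    return frames


# Toy stack image with two frames (all values are made up).
BASE = 0xBF000000
stack = bytearray(32)
struct.pack_into("<II", stack, 0, BASE + 16, 0x08048111)  # inner frame
struct.pack_into("<II", stack, 16, 0, 0x08048222)         # outermost frame

frames = walk_frame_chain(bytes(stack), BASE, BASE)
for frame_pointer, return_address in frames:
    print(hex(frame_pointer), hex(return_address))
```

The non-increasing check mirrors the reason GDB gives up with "previous frame inner to this frame": once the stack has been overwritten with text, the saved values no longer form a chain that moves toward higher addresses.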

Related

Selenium webscraping TimeoutException with stacktrace of pointers

I am currently using selenium to web-scrape for articles on the RSC corpus. I keep running into an error due to the line:
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '.capsule.capsule--article')))
I have tried both EC.presence_of_element_located and EC.visibility_of_element_located, and both give me the same TimeoutException error. I am not sure what else I can try to get rid of the TimeoutException, as the CSS selector is correct, and the URL properly loads the query on RSC.
Here is my stacktrace:
Traceback (most recent call last):
File "rsc.py", line 40, in <module>
main(url, query, page, location)
File "rsc.py", line 22, in main
dois = scraper.get_doi(query=query, page=page)
File "/home/ssarrouf/Documents/GitHub/WaterRemediationParser/batterydataextractor-main/batterydataextractor/scrape/rsc.py", line 62, in get_doi
_ = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '.capsule.capsule--article')))
File "/home/ssarrouf/.pyenv/versions/3.8.16/lib/python3.8/site-packages/selenium/webdriver/support/wait.py", line 95, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
Stacktrace:
#0 0x55e10fae5d93 <unknown>
#1 0x55e10f8b42d7 <unknown>
#2 0x55e10f8f0caa <unknown>
#3 0x55e10f8f0db1 <unknown>
#4 0x55e10f92e8f4 <unknown>
#5 0x55e10f91461d <unknown>
#6 0x55e10f92c619 <unknown>
#7 0x55e10f914353 <unknown>
#8 0x55e10f8e3e40 <unknown>
#9 0x55e10f8e5038 <unknown>
#10 0x55e10fb398be <unknown>
#11 0x55e10fb3d8f0 <unknown>
#12 0x55e10fb1df90 <unknown>
#13 0x55e10fb3eb7d <unknown>
#14 0x55e10fb0f578 <unknown>
#15 0x55e10fb63348 <unknown>
#16 0x55e10fb634d6 <unknown>
#17 0x55e10fb7d341 <unknown>
#18 0x7f0fad5a9b43 <unknown>
My code in the "get_doi" method:
def get_doi(self, query, page):
    """
    Get a list of dois from query messages and the exact page.
    :param query: the query text (e.g. battery materials)
    :param page: the number of page
    :return: a list of dois of the relevant query text and page.
    """
    if self.driver is None:
        driver = webdriver.Chrome()
    else:
        driver = self.driver
    if self.url is None:
        url = "http://pubs.rsc.org/en/results?searchtext="
        url = url + query
    else:
        url = self.url
    driver.get(url)
    wait = WebDriverWait(driver, self.max_wait_time)
    # To make sure we don't overload the server
    sleep(1)
    next_button = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a[class^=paging__btn]")))[1]
    page_string = """document.querySelectorAll("a[class^=paging__btn]")[1].setAttribute("data-pageno", \""""\
                  + str(page) + """\")"""
    driver.execute_script(page_string)
    next_button.click()
    sleep(1)
    _ = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '.capsule.capsule--article')))
    doi_lists = driver.find_elements(By.PARTIAL_LINK_TEXT, 'https://doi.org')
    dois = [doi.text for doi in doi_lists]
    return dois
My code for downloading the articles:
def download_doi(self, doi, file_location):
    """
    Download the html paper of the doi
    :param doi: doi of the paper
    :param file_location: the saving location
    :return:
    """
    doi = doi.split("org/")[-1]
    r = requests.get('http://doi.org/' + doi, headers={'User-Agent': 'Mozilla/5.0'})
    result = re.findall(r'https://pubs.rsc.org/en/content/articlehtml/.*?"', r.text)
    url = result[0][:-1]
    web_content = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}).content
    result = self.get_rsc_abstract(web_content)
    exact_date = result['date'].split("/")
    doi = result['doi'].replace("/", "_")
    if len(exact_date) == 3:
        name = exact_date[0] + exact_date[1] + exact_date[2] + '_' + doi
    else:
        name = result['online_date'].replace("/", "") + '_' + doi
    with open(file_location + name + '.html', 'wb') as f:
        f.write(web_content)
    return
And the main method I am calling for the web-scraping:
def main(url, query, page, file_location):
    """
    RSC web-scraper runner
    :param url: the scraping url (or default)
    :param query: query text (e.g. battery materials)
    :param page: the page number of the query pages
    :param file_location: saving location
    :return:
    """
    scraper = RSCWebScraper(url=url)
    dois = scraper.get_doi(query=query, page=page)
    for doi in dois:
        try:
            scraper.download_doi(doi, file_location)
        # Some papers don't have html access
        except:
            continue
    return
if __name__ == "__main__":
    # Download papers within a certain date range
    url = "https://pubs.rsc.org/en/results/all?Category=All&AllText=water%20remediation&IncludeReference=false&Select" \
          "Journal=false&DateRange=true&SelectDate=true&DateToYear={}&DateFromYear={}&DateFromMonth={}&DateTo" \
          "Month={}&PriceCode=False&OpenAccess=false".format("2022", "2021", "06", "01")
    query = "water remediation"
    location = r"/home/ssarrouf/Documents/webscrape/to_date_papers/rsc/"
    for page in range(1, 120):
        main(url, query, page, location)
Changing the method used in
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, '.capsule.capsule--article')))
did not give me any results. I also changed the query / url text to various months and years to ensure that the request was loading properly. None of this helped, and I got a TimeoutException on each attempt.

GDB Backtrace with Shared Library Symbols

I am debugging an application crash using the net-snmp library. The backtrace is as follows:
#0 0x001a1c7e in _snmp_parse (sessp=0x0, session=<value optimized out>, pdu=0xb6183858, data=0xa3401248 "\301", length=12097984) at snmp_api.c:4408
#1 0x001a20bb in snmp_resend_request (slp=0x0, rp=0xb6184818, incr_retries=1) at snmp_api.c:6383
#2 0x00b89944 in send_trap_to_sess (sess=0xb6122848, template_pdu=0xa3401440) at agent_trap.c:945
#3 0x00b8ab46 in netsnmp_send_traps (trap=-1, specific=-1, enterprise=0xbbd080, enterprise_length=10, vars=0xa5400468, context=0x0, flags=0)
at agent_trap.c:839
#4 0x00b8b0fa in send_enterprise_trap_vars (trap=-1, specific=-1, enterprise=0xbbd080, enterprise_length=10, vars=0xa5400468) at agent_trap.c:863
#5 0x00b8b153 in send_trap_vars (trap=-1, specific=-1, vars=0xa5400468) at agent_trap.c:975
#6 0x00b8b1fe in send_v2trap (vars=0xa5400468) at agent_trap.c:1049
#7 0x00288382 in applicationBaseClass::sendTraps (temp=0x161d3018) at appBaseClass.cpp:750
#8 0x00288311 in applicationBaseClass::sendTrapsByTimer (temp=0x161d3018) at appBaseClass.cpp:736
#9 0x00ac05f3 in check_timers () at exec_timer.c:383
I installed the net-snmp-debuginfo to get the source files. The details:
net-snmp-libs-5.5-57.el6_8.1.i686
net-snmp-utils-5.5-57.el6_8.1.i686
net-snmp-debuginfo-5.5-57.el6_8.1.i686
net-snmp-5.5-57.el6_8.1.i686
We can see that the source files correspond to the net-snmp version installed - 5.5-57
Now my application is linking to net-snmp :
[root@stg blr]# lsof -p 4043 | grep snmp
serv_trans 4043 root mem REG 8,2 1760620 4031 /usr/lib/libnetsnmpmibs.so.20.0.0
serv_trans 4043 root mem REG 8,2 701880 4978 /usr/lib/libnetsnmp.so.20.0.0
serv_trans 4043 root mem REG 8,2 313568 3982 /usr/lib/libnetsnmpagent.so.20.0.0
serv_trans 4043 root mem REG 8,2 157008 4017 /usr/lib/libnetsnmphelpers.so.20.0.0
So the application is rightly linking with the net-snmp libraries of version 5.5-57
Now, to the backtrace. The stack seems to be incorrect: it does not show the exact call sequence. For example, in frame #1, the line number that gdb shows (6383) is actually a declaration statement.
#1 0x001a20bb in snmp_resend_request (slp=0x0, rp=0xb6184818, incr_retries=1) at snmp_api.c:6383
6383 u_char *pktbuf = NULL, *packet = NULL;
What could I be missing? I seem to have the right source files for gdb. Why isn't the stack trace from gdb pointing to the exact sequence of events?

Realtime output in CakePHP

I'd like to print the output of a program in php in "real time" (buffers are not important). The process takes a long time and having the (partial) data earlier would be very helpful.
Usually I'd use plain passthru() but this is done in CakePHP and it doesn't output anything until I do this:
$this->response->file($file, array('download' => true));
return $this->response;
If I just remove these lines and swap the exec() with a passthru() I get a MissingViewException
Error: [MissingViewException] View file "Songs/download.ctp" is missing.
And if I do this
$this->response=$out; #$out being the output of exec()
return $this->response;
I get this
2015-08-10 01:18:06 Error: Fatal Error (1): Call to a member function body() on string in [/storage/www/sonerezh/lib/Cake/Controller/Controller.php, line 960]
2015-08-10 01:18:06 Error: [InternalErrorException] Internal Server Error
Request URL: /songs/download/2307
Stack Trace:
#0 /storage/www/sonerezh/lib/Cake/Error/ErrorHandler.php(213): ErrorHandler::handleFatalError(1, 'Call to a membe...', '/storage/www/so...', 960)
#1 [internal function]: ErrorHandler::handleError(1, 'Call to a membe...', '/storage/www/so...', 960, Array)
#2 /storage/www/sonerezh/lib/Cake/Core/App.php(931): call_user_func('ErrorHandler::h...', 1, 'Call to a membe...', '/storage/www/so...', 960, Array)
#3 /storage/www/sonerezh/lib/Cake/Core/App.php(904): App::_checkFatalError()
#4 [internal function]: App::shutdown()
#5 {main}
What can I do?
You could try this (not tested):
$this->response->body(function () {
    passthru('./program');
});
return $this->response;
More information here.
Note: I assumed you were using CakePHP 3 since CakeResponse::file does not exist in CakePHP 2.

Understand elements of the gdb core print

I have a core generated. /var/log/messages displays this line:
Jan 29 07:50:40 NetAcc-02 kernel: LR.exe[15326]: segfault at 51473861 ip 081e2dba sp 00240030 error 4 in LR.exe[8048000+34c000]
Jan 29 07:50:52 NetAcc-02 abrt[20696]: saved core dump of pid 15252 (/home/netacc/active/LR.exe) to /var/spool/abrt/ccpp-2015-01-29-07:50:40-15252.new/coredump (1642938368 bytes)
Jan 29 07:50:52 NetAcc-02 abrtd: Directory 'ccpp-2015-01-29-07:50:40-15252' creation detected
Jan 29 07:50:54 NetAcc-02 abrtd: Executable '/home/netacc/active/LR.exe' doesn't belong to any package
Jan 29 07:50:54 NetAcc-02 abrtd: Corrupted or bad dump /var/spool/abrt/ccpp-2015-01-29-07:50:40-15252 (res:2), deleting
Does the last line mean that the core is corrupted? Because a bt of my corefile seems to be corrupted:
#0 0x081e2dba in CfaPepDecision (pBuf=0xa0d6735, pIp=0x5147384d, u2DirectFlag=1, ppepserver=0x67684e6f, paccl=0x45517377, pPepMode=0x6a31396c "") at /home/TAN/release/rel/idu-sw/pep/pep/src/pepcfa.c:498
#1 0x52367331 in ?? ()
#2 0x0a0d6735 in gProfileVsatTable ()
#3 0x5147384d in ?? ()
#4 0x75417875 in ?? ()
#5 0x38000200 in ?? ()
Strangely the gProfileVsatTable is a global array!
The address pIp = 0x5147384d is out of bounds in gdb.
Any inputs are helpful.
Regarding "a bt of my corefile seems to be corrupted":
This is usually the result of analyzing the wrong binary. Invoke GDB like this:
gdb /home/netacc/active/LR.exe \
/var/spool/abrt/ccpp-2015-01-29-07:50:40-15252.new/coredump
Make sure that you have not updated the binary since Jan 29 07:50:52. In particular, make sure you did not rebuild the binary with different options after the crash.
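One quick way to check the advice above is to compare the binary's modification time with the time of the crash. The Python sketch below is a rough illustration only: the helper name binary_modified_after is made up, the year 2015 is taken from the abrt directory name in the log, and the comparison assumes the log timestamp and the file system use the same local time zone.

```python
import os
from datetime import datetime


def binary_modified_after(binary_path, crash_time):
    """Hypothetical helper: True if the file changed after crash_time (local time)."""
    modified = datetime.fromtimestamp(os.path.getmtime(binary_path))
    return modified > crash_time


# Crash time from /var/log/messages; the year comes from the abrt directory name.
crash_time = datetime(2015, 1, 29, 7, 50, 52)
binary = "/home/netacc/active/LR.exe"

if not os.path.exists(binary):
    print("binary not found; adjust the path for your system")
elif binary_modified_after(binary, crash_time):
    print("binary was modified after the crash; the core may not match it")
else:
    print("binary is older than the crash; it is a plausible match for the core")
```

An unchanged modification time does not prove the binary matches the core, but a newer one is a strong hint that GDB is being pointed at a rebuilt executable.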

Magento Join Query AddProductFilter To Collection Fail

$q = Mage::getModel('qa/post')->getCollection()->addProductFilter(1)->printLogQuery(true);
public function addProductFilter($productId)
{
    $this->addFieldToFilter('qa_post_relation.product_id', $productId);
    return $this;
}
Returns an error:
a:5:{i:0;s:94:"SQLSTATE[42S22]: Column not found: 1054 Unknown column 'qa_post_relation.product_id' in 'where clause'";i:1;s:2016:"#0 C:\xampp\htdocs\ahw\lib\Varien\Db\Statement\Pdo\Mysql.php(110): Zend_Db_Statement_Pdo->_execute(Array)
#1 C:\xampp\htdocs\ahw\lib\Zend\Db\Statement.php(300): Varien_Db_Statement_Pdo_Mysql->_execute(Array)
#2 C:\xampp\htdocs\ahw\lib\Zend\Db\Adapter\Abstract.php(479): Zend_Db_Statement->execute(Array)
#3 C:\xampp\htdocs\ahw\lib\Zend\Db\Adapter\Pdo\Abstract.php(238): Zend_Db_Adapter_Abstract->query('SELECT `main_ta...', Array)
#4 C:\xampp\htdocs\ahw\lib\Varien\Db\Adapter\Pdo\Mysql.php(419): Zend_Db_Adapter_Pdo_Abstract->query('SELECT `main_ta...', Array)
#5 C:\xampp\htdocs\ahw\lib\Zend\Db\Adapter\Abstract.php(734): Varien_Db_Adapter_Pdo_Mysql->query('SELECT `main_ta...', Array)
#6 C:\xampp\htdocs\ahw\lib\Varien\Data\Collection\Db.php(734): Zend_Db_Adapter_Abstract->fetchAll('SELECT `main_ta...', Array)
#7 C:\xampp\htdocs\ahw\app\code\core\Mage\Core\Model\Resource\Db\Collection\Abstract.php(521): Varien_Data_Collection_Db->_fetchAll('SELECT `main_ta...', Array)
#8 C:\xampp\htdocs\ahw\lib\Varien\Data\Collection\Db.php(566): Mage_Core_Model_Resource_Db_Collection_Abstract->getData()
#9 C:\xampp\htdocs\ahw\lib\Varien\Data\Collection.php(741): Varien_Data_Collection_Db->load()
#10 C:\xampp\htdocs\ahw\app\code\local\AJW\Qa\controllers\IndexController.php(25): Varien_Data_Collection->getIterator()
#11 C:\xampp\htdocs\ahw\app\code\core\Mage\Core\Controller\Varien\Action.php(419): AJW_Qa_IndexController->indexAction()
#12 C:\xampp\htdocs\ahw\app\code\core\Mage\Core\Controller\Varien\Router\Standard.php(250): Mage_Core_Controller_Varien_Action->dispatch('index')
#13 C:\xampp\htdocs\ahw\app\code\core\Mage\Core\Controller\Varien\Front.php(176): Mage_Core_Controller_Varien_Router_Standard->match(Object(Mage_Core_Controller_Request_Http))
#14 C:\xampp\htdocs\ahw\app\code\core\Mage\Core\Model\App.php(354): Mage_Core_Controller_Varien_Front->dispatch()
#15 C:\xampp\htdocs\ahw\app\Mage.php(683): Mage_Core_Model_App->run(Array)
#16 C:\xampp\htdocs\ahw\index.php(87): Mage::run('', 'store')
#17 {main}";s:3:"url";s:5:"/qas/";s:11:"script_name";s:10:"/index.php";s:4:"skin";s:7:"default";}
The Query: SELECT main_table.* FROM qa_posts AS main_table WHERE (qa_post_relation.product_id = '1')
Fails Unknown column 'qa_post_relation.product_id' in 'where clause'
However:
SELECT *
FROM qa_posts_relation
WHERE product_id =1
returns one result
I'm stumped.
I figured it out: the query was missing a join between the relation table and the main table.
public function joinRel()
{
    $this->setFlag('relation', true);
    $this->getSelect()->joinLeft(
        array('relation' => $this->getTable('qa/relation')),
        'main_table.postid=relation.post_id'
    );
    return $this;
}
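The effect of the missing join can be reproduced outside Magento. The Python sketch below uses an in-memory SQLite database as a stand-in for MySQL, so the error text differs from the one in the question. The table names qa_posts and qa_posts_relation and the join condition come from the posts above; the title column and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE qa_posts (postid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE qa_posts_relation (post_id INTEGER, product_id INTEGER);
    INSERT INTO qa_posts VALUES (10, 'hypothetical question');
    INSERT INTO qa_posts_relation VALUES (10, 1);
""")

# Without a join, the relation table is not part of the query,
# so a filter on one of its columns cannot be resolved.
try:
    conn.execute(
        "SELECT main_table.* FROM qa_posts AS main_table "
        "WHERE qa_post_relation.product_id = ?",
        (1,),
    )
    unjoined_error = None
except sqlite3.OperationalError as exc:
    unjoined_error = str(exc)

# With the join in place, the filter can refer to the alias 'relation'.
joined_rows = conn.execute(
    "SELECT main_table.* FROM qa_posts AS main_table "
    "LEFT JOIN qa_posts_relation AS relation "
    "ON main_table.postid = relation.post_id "
    "WHERE relation.product_id = ?",
    (1,),
).fetchall()

print(unjoined_error)
print(joined_rows)
```

Note that once the table is joined under the alias relation, the filter has to name that alias (relation.product_id) rather than a table name that never appears in the FROM clause.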

Resources