GWAN Floating point exception - c

When trying to run GWAN on Ubuntu 12.04 LTS I sometimes get the "Floating point exception" error. Sometimes it happens many times in a row, and sometimes the server starts and runs fine several times in a row. It keeps recurring, though, and seems to be random.

Nagi's questions above are relevant: it would help to know under which conditions you are using G-WAN.
The link Nagi provided is also relevant but if you were facing a script error then the crash would be constant.
There's another known case which may lead to an erratic floating point error at startup, and that's hypervisors.
You are most probably experiencing the second case, for which workarounds have been implemented in May last year (G-WAN v4.5+).
Taking place between the hardware and the OS, many hypervisors are breaking the CPU description reported by the Linux kernel, and they also happen to break CPU features(!) and system features like timers, memory allocation, etc.
In short, you are probably using G-WAN v4.3.28 and need to upgrade to a more recent release. We give more recent releases to registered users and people who contribute to the project with code, ideas, etc.
The next public release will be made available after our G-WAN-based Cloud services are shipped (later this year).
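G-WAN's source is not shown in this thread, so the following is only an illustration of the general failure mode, not of what G-WAN actually does. On Linux/x86 an integer division by zero raises SIGFPE, which the shell reports as "Floating point exception", so a program that divides by a CPU count the hypervisor has misreported as zero can crash this way. The helper names `items_per_cpu` and `online_cpus` are hypothetical.

```c
#include <unistd.h>

/* Hypothetical helper: split `total` work items across the CPUs the OS
 * reports. A reported count below 1 is clamped to 1 instead of being
 * used as a divisor, which would raise SIGFPE on Linux/x86 when it is 0. */
long items_per_cpu(long total, long reported_cpus)
{
    if (reported_cpus < 1)
        reported_cpus = 1;
    return total / reported_cpus;
}

/* Number of online CPUs as seen by the C library (a glibc extension);
 * may return -1 on error, which the clamp above also covers. */
long online_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}
```

A caller would write `items_per_cpu(total, online_cpus())` and never divide by the raw value.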

Related

Can bad C code cause a Blue Screen of Death?

I am a new coder in C, recently moved over from Python, but still like to challenge myself with fairly ambitious projects (like a chess program), and have found that my computer suffers an unusual number of BSODs, both when I am running a program and when I am not (admittedly, attempting to use the entirety of my memory as a hash table may not have been the greatest idea).
So my question is, are these most likely caused by my crappy C code, or is it more likely that my 3 year old, overworked laptop is the culprit?
If it could be the code, what are the big things I should avoid doing so as to prevent this?
A BSOD usually contains some information as to what caused it.
What information it contains, and how exactly it is displayed depends on the version of Windows you are running.
As can be seen from the list here:
https://hetmanrecovery.com/recovery_news/bsod-errors
Most BSOD errors come from device / driver / kernel code, and not from your typical userland program.
That said, it might be possible to trigger a BSOD if your code uses particularly low-level Windows APIs, especially if you run it with administrator privileges.
Note that simply filling up memory will make allocations in your program fail, and possibly crash your program, but it will not crash the whole OS.
Also, Windows does place limits on how much an individual process can allocate.
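As a minimal sketch of what "allocations failing" looks like from the program's side, the function below tries to allocate a hash table and reports failure instead of using a NULL pointer. The name `alloc_table` and the choice of `uint64_t` slots are assumptions made for illustration; they are not taken from the asker's chess program.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a zeroed table of `entries` 64-bit slots.
 * Returns NULL when the request cannot be satisfied, so the caller can
 * fall back to a smaller table instead of crashing. */
uint64_t *alloc_table(size_t entries)
{
    /* Reject empty requests and sizes whose byte count would overflow. */
    if (entries == 0 || entries > SIZE_MAX / sizeof(uint64_t))
        return NULL;

    uint64_t *table = calloc(entries, sizeof *table);
    if (table == NULL)
        fprintf(stderr, "could not allocate %zu entries\n", entries);
    return table;
}
```

A caller would check the result, for example by halving `entries` and retrying until the allocation succeeds, and release the table with `free` when done.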
One final note:
"3 year old laptop" does not provide enough information to tell anything about your hardware, since there are different tiers of laptops available, and some of the high-end 3 year old ones will still perform better than a mid-tier one bought yesterday.
As a troubleshooting measure, I would recommend backing up your data, making a clean install of your OS (aka "format the machine"), then making sure all your drivers are up to date.
You may also want to try hardware diagnostic tools such as MemTest86, check the SMART data on your storage, etc.
It's not supposed to be possible for anything you do in an ordinary "user space" program to crash the whole computer. Something else must also be wrong. Here are some possibilities:
If you are making the computer do CPU- and RAM-intensive work for long periods, you may stress the hardware to the point where a marginally defective component fails. Usually it's either the RAM, the power supply, or the cooling fans at fault.
Make sure your power supply is rated for all of the kit you have, running simultaneously. Make sure you have enough airflow for the amount of heat you're generating; check for dust-clogged heatsinks and fans that aren't actually spinning. If you have more than one RAM stick, take one out at a time and see if that makes the problem disappear.
I'd like to tell you to get error-correcting RAM if you don't have it already, but for infuriating market differentiation reasons you'd have to replace the motherboard and CPU as well. It's still worth doing, in the long run, but it amounts to replacing the whole computer.
You may be tickling a bug in the OS or the drivers. The most probable culprit is the GPU driver, particularly if your program does anything graphical. Regrettably, all you can do about this is make sure you're fully patched up.

Debug printf hanging kernel — possible causes and solutions?

There are numerous places in a kernel where adding a printf can hang the kernel, especially during early boot. (As a side note, in my experience the same can happen with printk in the early stages of booting Linux.) There can be many reasons, but sometimes the reason is not obvious: you add a printf to a function that cannot possibly run more than once during boot (or at most a few times, not hundreds), and the kernel still hangs. Could it be a timing issue?
What are the typical causes of kernel boot hanging by adding printf in code?
As for a solution, you could create your own data structure, push your "stack trace" into it, and print it at some later point. But finding the right point to print it could be problematic. Is there a general policy that can be applied in these early stages of boot, or are we doomed to a detailed step-by-step analysis every time we hit such a problem?
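The "own data structure" idea above can be sketched as a fixed-size ring buffer that records events without any I/O, locking, or allocation, and is dumped once printing is known to be safe. This is a userspace illustration under the assumption of a single thread of execution (as in very early boot); the names `trace_record`, `trace_dump`, and `TRACE_SLOTS` are hypothetical, and a real kernel would use its own output routine instead of `fprintf`.

```c
#include <stddef.h>
#include <stdio.h>

#define TRACE_SLOTS 64  /* power of two, so the index wraps with a mask */

struct trace_entry {
    const char *tag;      /* must point to a string literal or static data */
    unsigned long value;
};

static struct trace_entry trace_buf[TRACE_SLOTS];
static unsigned long trace_count;  /* total number of records ever written */

/* Record one event: two stores and an increment, nothing that can block. */
void trace_record(const char *tag, unsigned long value)
{
    struct trace_entry *e = &trace_buf[trace_count & (TRACE_SLOTS - 1)];
    e->tag = tag;
    e->value = value;
    trace_count++;
}

/* Print the surviving records, oldest first; returns how many were printed. */
size_t trace_dump(FILE *out)
{
    unsigned long start =
        trace_count > TRACE_SLOTS ? trace_count - TRACE_SLOTS : 0;
    size_t printed = 0;

    for (unsigned long i = start; i < trace_count; i++, printed++) {
        const struct trace_entry *e = &trace_buf[i & (TRACE_SLOTS - 1)];
        fprintf(out, "%lu %s %lu\n", i, e->tag, e->value);
    }
    return printed;
}
```

Because old entries are overwritten, the buffer always holds the most recent events, which is usually what matters when the hang happens shortly after the last recorded one.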

CLion uses system memory excessively

I recently started to use CLion, on Windows 7 64-bit, for editing C files.
One thing that bothers me a lot is that it uses too much system memory. It doesn't cause an out-of-memory error, as asked in another question. In fact, CLion reports much lower memory consumption in the IDE (~500 MB out of ~2000 MB) than it takes from the system (~1000 MB). You can see a snapshot of the system memory usage and CLion's memory display below:
I use CLion not for C++ but for C projects. My project isn't that big (~5 .c files of under 300 lines and ~10 .h files). I don't use it to compile the project, I just use it for editing. During the snapshot it was not running any user program, and CLion wasn't showing any background processes (indexing etc). This is its general behaviour.
I'm not sure if what I experience is expected/normal, or whether it is caused by my system setup, project settings or the way I use the IDE.
Are there any known causes of excessive memory usage? Can you suggest practices to decrease memory usage?
The post is 2 years old, but I am also having this issue with CLion 2018.1, and I imagine, others do, too. Some tips that worked for me:
Excluding directories from indexing.
Deleting source files I don't need.
Resolving a circular dependency between two classes. (Note: I can't vouch it was exactly that, because I tried several things at once, and it seems odd that such a powerful IDE would be affected by such an issue, but I can't rule it out.)
If it's really bad, the indexing can be paused. Guaranteed to reduce the memory usage. Of course, the intelligent completion won't work then.
Currently the RAM usage is stable at ~1 GB with RocksDB, RapidJson, and ~50 classes.
UPDATE: tweaking clion64.exe.vmoptions reduced the consumption radically.
Same issue here. I hadn't even been using CLion; it was just sitting open so that I wouldn't have to reopen it, with 2 projects and a few files open, nothing major. Eating up 3+ GB like that is not something I can accept, so I am switching back to Sublime, which works fine. As others have mentioned, I use it only for editing/refactoring; compilation happens in the terminal.
(PyCharm has similar issues)
CLion needs to index and keep all the information about the system headers to provide smart completion, auto-import and symbol resolution. Your project is the smallest part of the code base it analyzes.
I have heard that version 2020.3 brings an option to switch off refreshing files.
https://intellij-support.jetbrains.com/hc/en-us/community/posts/360007093580-How-to-disable-refreshing-files-after-build
Unfortunately I cannot try it out in my professional development environment.

What Can Cause a C Program to Crash Operating System

I recently found that a fairly large image manipulation program I'm writing in C on a Windows 8 machine has a bug when used in very particular circumstances. Unfortunately, the bug is causing my entire computer to come to a standstill so that my only option is to pull the plug on the computer (especially annoying when I'm working remotely...)
Because it's an image manipulation program, I can't just flood it with print statements to isolate the problematic section - the problem occurs somewhere in a loop that's called billions of times, so adding a printf slows it down to the point that it would take days to get to a failing iteration.
I understand, therefore, if this question is too broad, as it isn't really reasonable for me to put down all of the code that could cause my problem. I'm simply asking:
What are the circumstances in which C code can, instead of seg faulting or halting the program, actually freeze the entire OS?
When I search the problem, I see code golf questions like this
A C program which crashes the system(shuts down the system)
This is not what I'm asking - obviously I haven't written system("shutdown") anywhere in my loop.
Being most familiar with Python and Java, this problem is not what I'm used to, but in my experience,
Dividing by zero produces a seg fault
Accessing memory by accident that is slightly outside an intended array causes a seg fault (sometimes down the road a little)
Accessing protected memory causes the program to hang
Stack overflow causes a seg fault
Dereferencing a non-initialized pointer causes a seg fault
Is this impression false - could those cases cause the whole system to crash? What cases am I missing? Is it dependent on my version of gcc, or my permission status?
I haven't been able to try to reproduce it on a different operating system yet, as it requires a few dependencies to run the entire program.
If my only option is to sit for days waiting for the program to run with print statements, or avoid weird situations, then, of course, so be it. I'm looking for key places to look for the bug.
On modern systems with hardware-enforced privilege separation between user-mode and kernel-mode, and an operating system that functions to correctly configure these mechanisms, you simply cannot crash the system from a user mode process.
Any of those errors is trapped by the CPU, which calls exception handlers in the OS, which will quickly pull the plug on your program.
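The claim above can be checked with a small experiment. The question is about Windows, but the sketch below uses POSIX calls (`fork`, `waitpid`) because they make the mechanism easy to observe: a child process writes through a null pointer, the kernel terminates only that child, and the parent carries on and reads the cause of death. The helper names `signal_that_killed`, `null_write`, and `do_nothing` are hypothetical.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `fn` in a child process. Returns the number of the signal that
 * killed the child, 0 if it exited normally, or -1 on fork/wait failure. */
int signal_that_killed(void (*fn)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {            /* child: run the possibly faulty code */
        fn();
        _exit(0);
    }

    int status = 0;            /* parent: unaffected by the child's fault */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Deliberately faulty: writing through a null pointer is undefined
 * behaviour in C; on typical Linux systems the CPU traps the access and
 * the kernel delivers SIGSEGV to this process only. */
void null_write(void)
{
    volatile int *p = NULL;
    *p = 1;
}

void do_nothing(void)
{
}
```

Calling `signal_that_killed(null_write)` from an ordinary program returns to the caller with the signal number, which is the point: the fault is contained in the process that caused it.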
If I had to guess, a piece of hardware is overheating or malfunctioning:
Overheating CPU due to poor thermal conductivity with heatsink
Failing / under-sized power supply
Failing DIMMs
Failing hard drive
Failing CPU
Failing / overheating GPU
I've seen cryptocoin-mining software bring a system to its knees because it was pushing the limits of the GPU. When the card would lock up or reset, the driver would get confused or lock up, and the system would end up needing to be rebooted.
Your system is doing next to nothing when you're just sitting there browsing the web, etc. But if your system locks up when you start running a CPU-intensive application, that load can be bringing out problems that you didn't know were there.
While this is a little out-of-place on Stack Overflow, it falls into one of those grey areas between hardware and software. I would stress-test your system, keeping an eye on CPU/GPU/memory temperatures, and power supply voltages. Check out MemTest86, Stresslinux.
The most trivial cause of an OS freezing is "memory full". If you have processes that use a lot of memory, your system is going to swap from main memory (typically RAM) to secondary memory (typically disk), which leads to a huge overhead. As a user, what you usually observe is an almost frozen computer, sometimes so frozen that you think it has crashed. If your OS is badly designed, it sometimes does crash!

Does attaching to a process make it behave differently?

While I am aware of the differences between debug and release builds, I am curious whether attaching the debugger to a process (built release or debug) changes that process's behaviour.
For reference, I'm developing on HP-UX 11.31 Itanium but am still curious about the general case.
http://en.wikipedia.org/wiki/Heisenbug#Heisenbug
Of course, attaching a debugger will change the timing (which can change e.g. thread race conditions), and also some system calls can detect if a debugger is attached.
It certainly can, depending on the platform and the method of debugging. For example, when debugging on Windows, there is actually the IsDebuggerPresent function. As noted that function can be circumvented, but then there are other means. So basically, it's complicated.
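`IsDebuggerPresent` is specific to Windows. As a rough analogue on Linux (an assumption for illustration; it does not apply to the asker's HP-UX system), a process can read the `TracerPid` field of `/proc/self/status`, which is non-zero while a tracer such as a debugger is attached. The function names `parse_tracer_pid` and `current_tracer_pid` are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Extract the "TracerPid:" value from /proc/<pid>/status-style text.
 * Returns the PID found (0 means no tracer) or -1 if the field is absent. */
long parse_tracer_pid(const char *status_text)
{
    const char *field = strstr(status_text, "TracerPid:");
    long pid = -1;

    if (field == NULL || sscanf(field, "TracerPid: %ld", &pid) != 1)
        return -1;
    return pid;
}

/* Linux only: PID of the process tracing us, 0 if none, -1 if the
 * information is unavailable (for example on a non-Linux system). */
long current_tracer_pid(void)
{
    char buf[4096];
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return -1;

    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return parse_tracer_pid(buf);
}
```

A program that branches on `current_tracer_pid() > 0` behaves differently under a debugger by construction, which is one concrete way the effect discussed here can arise.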
Yep, lots of things inside the Windows data structures change when a debugger is attached. It changes how memory is allocated and freed, and it adds additional housekeeping code and "markers" on the stack (ever noticed the F00D values in newly allocated memory?). In fact, many of these changes are used by anti-debugging techniques to detect whether an application is being debugged.
In interpreted languages (Java, .NET) the runtime will often generate different machine instructions when running under a debugger to help it trap and display exceptions, show the original code, etc. It will usually generate unoptimized code as well when a debugger is attached.
Some of these changes affect the way the software behaves and can mask or complicate transient bugs that are caused by optimizations or extremely fine timing dependencies.
Yes, I've often found that attaching a debugger to a process instantly makes bugs disappear, only to have them reappear when I compile my app in release mode. Unfortunately I usually can't really ask all my users to open a debugger just to run my app, so it can be quite frustrating.
Another thing to keep in mind is that for multithreaded apps attaching the debugger definitely can yield very different results. These are the kind of things referred to as "Heisenbugs."
Sure, in multithreaded apps, attaching a debugger can yield different results.
But what about code that is not related to threads?
I have seen a release build that has no problems when it is started under a debugger, but does have problems when no debugger is attached.
If it is launched first and a debugger is attached to it afterwards, it shows the same problems.
