Why are the errors reported by Coverity versions 7.7 and 2020.12 different for the same code-base?

I am working on migrating a Coverity server; the old server runs Coverity 7.7 and the new one runs 2020.12.
I can see that Coverity reports different errors for the same code-base on these two servers. Most of the errors are common to both, but some errors are not reported by the latest version of Coverity (2020.12) yet were reported by the earlier version (7.7), and vice versa.
How do I decide whether the new Coverity server is reporting legitimate errors?
I also want to know what kinds of differences we can expect when the Coverity version is different, and what the reason for such differences is.

Q1. How do we decide if the new Coverity is reporting legitimate errors?
You have to inspect the differences individually to see if each one is a false positive (incorrect) or true positive (correct). If there are too many to do so in a reasonable time, sample them at random to get a statistical estimate. Be aware that accurately inspecting static analysis findings is time-consuming; I usually budget 5-10 minutes for each one.
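If you do sample, here is a minimal sketch of the bookkeeping (in Python; the CID range, sample size, and verdicts are made-up placeholders - in practice you would export the differing findings from Coverity Connect and fill in the verdicts during manual triage):

    import random

    # Hypothetical: finding IDs reported by only one of the two servers.
    differing_cids = list(range(10001, 10401))      # 400 differing findings

    sample = random.sample(differing_cids, 40)      # 40 findings * 5-10 min each of triage

    # After manually triaging each sampled finding, record "TP" or "FP" for it.
    verdicts = {cid: "TP" for cid in sample}        # placeholder verdicts

    tp_rate = sum(1 for v in verdicts.values() if v == "TP") / len(verdicts)
    print(f"Estimated true-positive rate among the differing findings: {tp_rate:.0%}")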
Q2. What kinds of differences can we see if the Coverity version is different?
It's not really possible to state any limit on the qualitative kinds of differences that may appear.
Quantitatively, historically at least, the Coverity developers have attempted to ensure that there is no more than 5% "churn", where churn is defined as:
new false positives + lost true positives
-----------------------------------------
number of old findings
This 5% rule is applied to successive major releases. The churn when upgrading past multiple major releases can be higher.
There is no bound on the number of new true positives or lost false positives.
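For instance, with made-up numbers: if the 7.7 analysis produced 1,000 findings and the new version introduces 30 new false positives while losing 15 previously reported true positives, the churn is (30 + 15) / 1000 = 4.5%, just inside the 5% target - even though the total number of differences you see may be larger once new true positives and lost false positives are added in.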
Q3. What is the reason for such differences?
The Coverity tool is continually being developed, with the intention (among other goals) of finding more true positives and fewer false positives. To the extent this intention is realized, the new results should be "better" (overall more accurate) than the old.
However, as with any piece of software, mistakes may be made (or intentional tradeoffs made) that result in less accurate results in some cases.
Aside from these general statements, the release notes may have additional details about what has changed.

Related

How to deploy fixes for several critical bugs simultaneously, and as soon as possible, with the Gitflow process?

So we use the Gitflow process for a safe and sound CI/CD
(https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow)
But something's wrong.
Let's say we have 10 critical bugs on prod - they must be fixed asap.
So 10 bugs are assigned to 10 devs. Dev 1 creates hotfix 0.0.1, commits his fix to it, the fix gets uploaded to the staging env and handed to QA for verification. Meanwhile dev 2 wants to create hotfix 0.0.2 for his bug, but he can't, as Gitflow forbids creating a new hotfix branch while one already exists.
So he either has to wait for hotfix/0.0.1 to be closed or commit to the same hotfix. Of course the normal choice is to commit to hotfix/0.0.1, because his bug is critical and can't wait for the fix for bug 1.
And so do the rest of the 10 devs with their 10 critical bugs.
QA verifies bug 1 and confirms it for deployment. But the hotfix can't be closed and deployed, because the other 9 fixes are untested or simply not working.
So for the hotfix to be closed, every fix in it must be working and confirmed.
In our case this takes time, a lot of time - one or two weeks if we're lucky - as more and more things need to be fixed asap on production. So devs commit to that branch more and more, putting the earliest fixes on hold even longer.
Eventually, after all those bugs are fixed and confirmed, it is time to finally close the hotfix and deploy those critical fixes to prod... with quite a big delay. But there is also another big problem - THE MERGE with the dev branch, where other devs do their work (like minor bug fixes for the next release). A horrible merge that lasts several hours.
So we are obviously doing something wrong - what could be a solution, so that our hotfixes ship on time and we don't have terrible merges to dev?
One solution we were considering is to limit the number of bugs a hotfix may contain - for example 5 bugs max; other bugs must wait until those are fixed. This, though, means that only the lead dev should commit to the hotfix, as he must enforce the limit (it is hard to control otherwise). But this way bugs are still put on hold and fast deployment is not achieved.
What would be the normal procedure? Please share your ideas.
If you are spending several hours on a merge, then you are probably doing it wrong. One approach is to get a version control expert in for a day to train your team. That is a relatively small cost compared to salaries. I should think you could find a very good person for £500 ($600 US) per day, and you might only need them for a day or two.
I wonder also whether your testing (with a QA team) is too manual. Can bug fixes be accompanied by unit tests to prove they are an improvement? If so, the developer should be able to merge master into their bugfix branch, spin up a new instance for some simple QA checking, get QA to check the system, get a team lead to review the pull request with the fix and unit tests, and then merge straight into master and deploy that fix on its own.
Also, make sure your deployment to live is fully automated. Doing several live (small) releases per day is a good target to have.
Updates
The advice above still stands, but since you have now said this is an old project:
Get some automated tests in place. Try to set up functional/browser tests for the bulk of the project, so you can have confidence in changes.
Don't reject unit tests out of hand. Perhaps they could be just for new code or new changes, at least to start with? Yes, the project is old, but don't let yourself off the hook easily. Unit tests will pay dividends in the long run. If you don't have dependency injection in the project, get that added, even if it is not used for all instantiation immediately.
Repeated from above: make your releases smaller.

Versioning with SemVer

I need a bit of help/advice with versioning with SemVer.
I'm working on a client's website; the client has sent several related amends, both large and small, in a Word document (like they always do).
I have a branch based off my master branch for these new amends, and have created a commit for each amend I have completed so far.
The idea was that I would complete all of the amends and then release them together in the next release (v2.0.0), because I think all of these changes are related and, combined, are significant enough to warrant a bump in the major version number.
The issue I have is that the client wants a few of these amends to go live immediately, before the release of 2.0.0. What would be the best way of handling this - would I release the few completed amends on top of the existing version and increment the minor number, or would I bump it to 2.0.0 even though all of the amends aren't complete?
I am a bit of a noob when it comes to versioning, but I'm trying to learn as best I can by reading and trying to make sense of the Semantic Versioning site.
You should always consider these two things:
What are the real changes? If there are no visible changes, and/or no major changes underneath, it may be better not to increment the major version number.
What should the customer perceive of your changes? Saying "version 1.1" or "version 2.0" may make some difference in how the changes are perceived.
So if the modifications are limited and/or nothing visible has changed, it may make sense to increment only the minor version now, and wait for all of the amends to be complete before bumping to 2.0.0.
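For example (with hypothetical version numbers): if the site is currently at 1.4.0, the few urgent amends could go live as 1.5.0 (or 1.4.1 if they are purely fixes), and 2.0.0 can be reserved for the point at which the full set of related changes - including anything breaking - is released.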

SQLite's test code to production code ratio

SQLite claims to have 679 times as much test code as production code.
http://www.sqlite.org/testing.html
Does anyone know how this is possible? Do they generate any test code automatically? What are the major parts of these "45678.3 KSLOC" of test code?
"Does anyone know how this is possible?"
"It is possible" to have 679 times as much test code because a single feature can be used in many different ways. Consider just a single function that takes two parameters. I can write a lot of test code for that one function, testing boundary conditions and many other combinations of conditions. When you consider the setup/teardown of the tests, there is additional code there. Depending on the testing framework, this overhead may add significantly to the amount of test code.
What it really boils down to is the fact that a piece of software can be used in so many different ways, which means that you have many different scenarios to test for. This is the beauty of elegant software: a simple program can be applied to numerous scenarios, but that is the same thing that makes verifying and testing software so challenging.
It's presumably possible if the developers spent 679 times as much time writing test code as they spent writing production code. Just think: if they'd opted instead for 339 times as much test code, they could have had two entire database engines, each still with a ludicrous amount of test coverage.
I once watched a fellow developer trying to placate a furious customer about slipped deadlines by informing them that he had written 5 times as much test code as production code. The customer was not placated, if you can imagine. At least I don't think 5X coverage is extreme anymore.
It uses Tcl to power the test framework so it's much easier to write tests than it is to write the implementation. This encourages thorough testing, which is what you want in a database, yes? Moreover, a fair fraction of those tests are proprietary, aimed at testing in embedded environments; I imagine some corporate user (or users) paid for that sort of thing. It's also quite possible that the same feature is tested multiple times.
Looking at section 3.1 (OOM):
OOM testing is accomplished by simulating OOM errors. SQLite allows an application to substitute an alternative malloc() implementation using the sqlite3_config(SQLITE_CONFIG_MALLOC,...) interface. The TCL and TH3 test harnesses are both capable of inserting a modified version of malloc() that can be rigged to fail after a certain number of allocations. These instrumented mallocs can be set to fail only once and then start working again, or to continue failing after the first failure. OOM tests are done in a loop. On the first iteration of the loop, the instrumented malloc is rigged to fail on the first allocation. Then some SQLite operation is carried out and checks are done to make sure SQLite handled the OOM error correctly. Then the time-to-failure counter on the instrumented malloc is increased by one and the test is repeated. The loop continues until the entire operation runs to completion without ever encountering a simulated OOM failure. Tests like this are run twice, once with the instrumented malloc set to fail only once, and again with the instrumented malloc set to fail continuously after the first failure.
Note that section 7 explicitly states 100% core coverage as determined by gcov. I agree with Donal Fellows that the test framework is largely responsible for the test coverage beyond what a call graph would suggest. It's a much different thing to see malloc() entered N times and write a test for it than it is to write dozens of tests geared to simulate environments where malloc() is likely to fail.
Yes, the resulting coverage is an artifact of diligence; however, so is the selection of a test framework that enables that kind of diligence.
Finally, reiterating the obvious, malloc() takes only a single size argument. This suggests that the tests written around it are there by deliberate design, not automatically generated.
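To make that loop concrete, here is a minimal sketch of the fail-after-N injection pattern described in the quoted section (written in Python purely for brevity; the names are invented, and this is not the actual Tcl/TH3 harness, which instruments the real malloc() via sqlite3_config()):

    class FailingAllocator:
        """Toy stand-in for an instrumented malloc() that fails after N allocations."""
        def __init__(self, fail_after, fail_once=True):
            self.fail_after = fail_after    # which allocation should fail first
            self.fail_once = fail_once      # fail once, or keep failing afterwards
            self.count = 0

        def alloc(self, size):
            self.count += 1
            failing = self.count == self.fail_after or (self.count > self.fail_after and not self.fail_once)
            return None if failing else bytearray(size)   # None simulates OOM

    def run_operation(allocator):
        """Toy operation needing several allocations; returns False if it hit (and handled) OOM."""
        for size in (64, 128, 256):
            if allocator.alloc(size) is None:
                return False
        return True

    # The OOM loop: push the failure point forward until the operation completes cleanly.
    for fail_once in (True, False):
        n = 1
        while not run_operation(FailingAllocator(fail_after=n, fail_once=fail_once)):
            # the real tests also verify here that the OOM error was handled correctly
            n += 1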

Does large SQL Server Memory Usage cause errors?

We just started getting a bunch of errors in our C# .NET app that seemed to be happening for no reason - things like System.IndexOutOfRangeException on a SqlDataReader object, for an index that should be returned and had been returning fine for a while.
Anyway, I looked at the Task Manager and saw that sqlservr.exe was running at around 1,500,000 K Mem Usage. I am by no means a DBA, but that much memory usage looked wrong to me on a Win Server 2003 R2 Enterprise box with an Intel Xeon 3.33 GHz and 4 GB of RAM. So I restarted the SQL Server instance. After the restart, everything went back to normal and the errors suddenly stopped occurring. So does this large main-memory usage eventually cause errors?
Additionally, I did a quick Google for "high memory usage mssql". I found that, if left at default settings, SQL Server can grow to be that large. I also found a Microsoft link about how to adjust memory usage by using configuration options in SQL Server.
The question now is... how much main memory should SQL Server be limited to?
I'd certainly be very surprised if it's the database itself; SQL Server is an extremely solid product - far better than anything in Office or Windows itself - and can generally be relied on absolutely and completely.
1.5 GB is nothing for an RDBMS - all of them will just keep filling up their available buffers with cached data. Reads from memory are typically 1000x or more faster than disk access, so using every scrap of memory available to it is optimal design. In fact, if you look at any RDBMS design theory, you'll see that the algorithms used to decide what to evict from memory are given considerable prominence, as they have a major impact on performance.
Most dedicated DB servers will be running with 4 GB of memory (assuming 32-bit) with 90% dedicated to SQL Server, so you are certainly not looking at any sort of edge condition here.
Your most likely problem by far is a coding error or a structural issue (such as locking).
I do have one caveat though. Very (very, very - like twice in 10 years) occasionally I have seen SQL Server return torn-page errors due to corruption in its database files, both times caused by an underlying intermittent hardware failure. As luck would have it, on both occasions these were in pages holding indexes, and by dropping the index, repairing the database, and backing up and restoring to a new disk, I was able to recover without falling back to backups. I am uncertain how a torn-page error would feed through to the C# API, but conceivably, if you have a disk error that only manifests itself after memory is full (i.e. it's somewhere in some swap space), then an index-out-of-range error does seem like the sort of manifestation I would expect, as a call could be returning junk - hence falling outside an array range.
There are a lot of different factors that can come into play as to what limit to set. Typically you want to limit it in a manner that prevents it from using up too much of the RAM on the system.
If the box is a dedicated SQL box, it isn't uncommon to set it to use 90% or so of the RAM on the box....
However, if it is a shared box that has other purposes, there might be other considerations.
how much main memory should MSSQL be limited to?
As much as you can give it, while ensuring that other system services can function properly. Yes, it's a vague answer, but on a dedicated DB box, MSSQL will be quite happy with 90% of the RAM or such. By design it will take as much RAM as it can.
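If you do decide to cap it, the limit is just a server configuration option. Here is a minimal sketch of setting it (assuming pyodbc and the classic {SQL Server} ODBC driver are available; the server name and the 3584 MB figure - roughly 90% of a dedicated 4 GB box - are assumptions to adjust for your environment):

    import pyodbc

    # Hypothetical connection details; point this at your own server.
    conn = pyodbc.connect(
        "DRIVER={SQL Server};SERVER=myserver;DATABASE=master;Trusted_Connection=yes;",
        autocommit=True,   # RECONFIGURE should not run inside an open transaction
    )
    cur = conn.cursor()

    # 'max server memory (MB)' is an advanced option, so expose advanced options first.
    cur.execute("EXEC sp_configure 'show advanced options', 1")
    cur.execute("RECONFIGURE")

    # Cap the buffer pool at ~3.5 GB, leaving headroom for the OS and other services.
    cur.execute("EXEC sp_configure 'max server memory (MB)', 3584")
    cur.execute("RECONFIGURE")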
1.5 GB of 4.0 GB is hardly taxing... One of our servers typically runs at 1.6 GB of 2.5 GB with no problems. I think I'd be more concerned if it wasn't using that much.
I don't mean to sound harsh, but I wouldn't be so quick to blame SQL Server for application errors. From my experience, every time I've tried to pass the buck to SQL Server, it's bitten me in the ass. It's usually sys admins or rogue queries that have brought our server to its knees.
There were several times where the solution to a slow-running query was to restart the server instead of inspecting the query, which was almost always at fault. I know I personally rewrote about a dozen queries where the cost was well above 100.
This really sounds like a case of "'select' is broken", so I'm curious whether you could find any improvements in your code.
SQL Server needs the RAM it is taking. If it was using 1.5 GB, it's using that for the data cache, procedure cache, etc. It's generally better left alone - if you set the cap too low, you'll end up hurting performance. If it's using 1.5 GB on a 4 GB web box, I wouldn't call that abnormal at all.
Your errors could very likely have been caused by locking - I'd have a hard time saying that the SQL memory usage you described in the question was causing the errors you were getting.

If you were asked if a system could sustain double growth, what 3 things would you do to answer?

Let's say at your job your boss says,
That system over there, for which we've lost all institutional knowledge but which seems to run pretty well right now - could we dump double the data into it and survive?
You're completely unfamiliar with the system.
It's in SQL Server 2000 (primarily a database app).
There's no test environment.
You might be able to hijack it on the weekends if you needed to run a benchmark.
What would be the 3 things you'd do to convince yourself, and then your manager, that you could take on that extra load? And if you couldn't do it on the same hardware... what extra hardware (measured in dollars) would it take to satisfy that request?
To address the response from doofledorfer: your assumptions are almost all 180 degrees off. But that's my fault for an ambiguous question.
One of the main servers runs 7x24 at a 70% baseline and spikes from there, and no one knows what it is doing.
This isn't an issue of buy-in or whining... Our company may not have much of a choice in the matter.
Because this is being externally mandated, delays in implementation could result in huge fines, so large meetings to assess risk are almost impossible. There is one risk: that dumping double the data would take the system down for the existing customers.
I was hoping someone would say something like: see if you can take the system offline Sunday night at midnight and run SQLIO tests to see how close the storage subsystem is to saturation. Things like that.
Set up a test environment, even if I have to do it on my laptop.
Enable some kind of logging on the production system to get an idea of the volume of transactions in addition to the volume of data.
Read the source code as I run stress tests on my laptop with increasing amounts of data.
Having said that, I sympathize with this assignment, because it's unfair. It's like asking someone in a boat if the boat can float with twice the cargo -- but you can't get out of the boat or take it out of its regular service.
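For what it's worth, here is the rough shape of the laptop stress test mentioned above (a sketch only: the connection string, table, and queries are placeholders, it assumes pyodbc is available, and it should only ever be pointed at a restored copy of the database, never at production):

    import time
    import pyodbc

    conn = pyodbc.connect("DRIVER={SQL Server};SERVER=testbox;DATABASE=CopyOfProd;Trusted_Connection=yes;")
    cur = conn.cursor()

    # Placeholder queries; in practice, use the most common queries captured by the production logging.
    REPRESENTATIVE_QUERIES = [
        "SELECT COUNT(*) FROM Orders WHERE OrderDate >= '2008-01-01'",
        "SELECT TOP 100 * FROM Customers ORDER BY LastName",
    ]

    def time_queries(label):
        for sql in REPRESENTATIVE_QUERIES:
            start = time.perf_counter()
            cur.execute(sql).fetchall()
            print(f"{label}: {time.perf_counter() - start:.2f}s  {sql}")

    time_queries("baseline")

    # Crudely double the data volume in the copy (adjust for identity/key columns), then re-measure.
    cur.execute("INSERT INTO Orders SELECT * FROM Orders")
    conn.commit()
    time_queries("doubled")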
You've just described a typical Agile project. Your answer should be:
I don't know, and I won't be able to tell without testing.
In addition to data volume, there might be issues with usage patterns, application interactions, database and server tuning, etc.
So let's work through a basic list of risk factors, and how we might resolve them.
Once we've done that, let's work through them in inverse order of risk, and make a stop/continue decision as we develop the results.
etc.
Without management buy-in and participation at least at that level, any other answer you might give is high-risk wishing, and "3 most important" is a non sequitur.
I'd be optimistic unless your current system is substantially loaded already. Most servers should run at less than 50% capacity on all resources, or else be on life-support.
And I expect you wouldn't be having the conversation if the existing server were already dealing with load issues; although "seems to run pretty good right now" is imprecise enough to be worrisome.
It mostly depends on the current level. If doubling means going from 2 GB to 4 GB, just do it. If it's going from 1 TB to 2 TB, you've got some planning to do.
I'd collect some info using Performance Monitor and provide it to help make an educated decision.
It depends what you mean by "double the data".
If it is going to affect only one table (say, the product table), then you are probably safe, as most queries referring to that one table will most likely just double in execution time (assuming you do not reference the same table twice in a query).
The problem will arise if you double the amount of data in all the tables, as execution time may then grow much faster than linearly (joins, sorts, and scans whose working sets no longer fit in memory), and that can lead to some serious issues.
But in general I would support the answer by doofledorfer.
