Historic reason for using periods in version numbers? - versioning

Is there a historic reason that periods are used instead of any other separator for software versions?
One of our products was previously version 3.5, and now it's 3.08 -- I'm sure this was management saying that putting a leading zero would make it less confusing for our customers once we hit 3.10. But as a software developer, version 3.08 looks strange to me.
If we didn't use periods, the difference between version 3:9 and 3:10 or 3-9 to 3-10 would be more apparent, because it wouldn't be read as a decimal number. Moreover, to someone who is generally unfamiliar with software versioning, the decimal number seems to imply that version 3.5 is halfway to the next major release, when in reality we can't make any assumptions about the number of minor releases until the next major release.
I understand that now we typically use periods as a convention because that's what everyone else is doing - but was there a reason for using periods in the first place?

As DVK suggested, it almost certainly derives from SCCS, the original Source Code Control System. The numbers it used were 1.1, 1.2, ... 3.14, 3.15, ... etc.
If you want a deeper reason than that, you might want to ask Marc Rochkind (who created SCCS).
Edit: okay, I emailed Marc Rochkind myself, and he said:
I think this started by analogy with decimal numbers. Version 1, version 2, version 2.1, etc., etc. Then adding more decimals, which makes no mathematical sense at all, but it's just a string anyway.
I don't think it originated with SCCS. I think this scheme was already in use by 1972 when I first started work on SCCS, so for us at Bell Labs it would have been the normal thing. So it's "earlier convention that SCCS used as its own inspiration".
... So, I wonder if ALGOL had been coded to use the European convention for the radix point, if we would all be using commas for our version separators instead ...

My guess is it has something to do with early operating system naming conventions. The first thing you want to do when you have a second version, is label any files and directories that are specific to that version.
Looking at Wikipedia, "/", "\", ":" and even "%" and "#" have implications for the location of the file, and would therefore be problematic in file names, particularly in a fairly primitive operating system.
"-", "_" and "." are all regularly used in filenames, so they'd be available for version naming.
But "-" has been used in date formats for a long time.
I'd actually argue that the decimal model isn't such a bad one. While it does suggest that 1.5 is halfway between 1 and 2, it also suggests that the 1.1 version is not as big a change from the 1.0 version as the 2.0 version will be. And it makes it possible to point out noticeable shifts in the baseline.

I'm not certain of the exact reason, but one possible influence may have been the versioning imposed by code repository systems (such as RCS/CVS) - which of course find the numbers much easier to manipulate than strings.
In addition, whoever came up with using decimal notation probably wasn't thinking at the time of either greater-than-nine subversions or sub-subversioning. Those two limitations aside, decimal notation does serve as a decently intuitive approximation of a software version's status.

The only interesting thing I found is this part of the Wikipedia entry on software versioning, which states (quoting):
When printed, the sequences may be separated with characters. The choice of characters and their usage varies by scheme. The following list shows hypothetical examples of separation schemes for the same release (the thirteenth third-level revision to the fourth second-level revision to the second first-level revision):
A scheme may use the same character between all sequences: 2.4.13, 2/4/13, 2-4-13
A scheme's choice of which sequences to separate may be inconsistent, separating some sequences but not others: 2.413
A scheme's choice of characters may be inconsistent within the same identifier: 2.4_13
When a period is used to separate sequences, it does not represent a decimal point, and the sequences do not have positional significance. An identifier of 2.5, for instance, is not "two and a half" or "half way to version three"; it is the fifth second-level revision of the second first-level revision, and would not be appropriate unless there had been a 2.1, 2.2, 2.3, and 2.4.

You can also wonder why the decimal point is the sign used to terminate sentences.08 How confusing!

It's not a decimal point. It's just a version separator.
People in continental Europe still use a period for version separation, even though they write decimal numbers with a comma.

That's not a decimal point, just a separator. Why they used this symbol is unknown and of little interest as a programming question...
But too many people think this is a decimal point, leading to confusion. And what about Firefox 3.0.13 or whatever 1.9.0.5213? Major number, minor number, revision number and build number are not uncommon (at least in the Microsoft world...).

I think it's easier to just come up with a new number than a new name. Even Microsoft has returned to their regular numbering scheme again, going from Windows 2, 3, 3.1, 3.11 to 95, 98, ME, 2000, XP, Vista and now back to 7.
Besides, using letter codes might result in unwanted associations. E.g. we have Windows CE, ME and NT, which are three different Windows systems that were all operational at about the same time. (Just put the letters together.) At least with numbers you don't run the risk of accidentally spelling out some strange words. (Then again, Borland/Codegear/Embarcadero did skip version 13 of the Delphi RAD studio.) People also tend to avoid version numbers like 6.66 or 6.6.6 for some devilish reasons...

Related

What does the 4th number mean in Java 9's version string scheme?

According to this blog on Java 9's new version string scheme, the version is supposed to be like MAJOR.MINOR.SECURITY, i.e., there are supposed to be 3 numbers and 2 periods in between.
However, with Azul's Zulu 9, when I print the Java version, it has 4 numbers and 3 periods:
./jdk/bin/java -version
openjdk version "9.0.0.15"
OpenJDK Runtime Environment (Zulu build 9.0.0.15+181)
OpenJDK 64-Bit Server VM (Zulu build 9.0.0.15+181, mixed mode)
What do the 4 numbers represent?
That blog posting is a bit out of date. The scheme actually implemented in Java 9 is documented in JEP 223: New Version-String Scheme.
The meaning of the first three numbers is standardized. The meaning of the 4th and any subsequent numbers is left to the vendor to specify.
Note also the interesting relationship between the 2nd and 3rd numbers.
Here are the relevant parts of the JEP.
"The sequence may be of arbitrary length but the first three elements are assigned specific meanings, as follows:
$MAJOR.$MINOR.$SECURITY
$MAJOR - The major version number, incremented for a major release that contains significant new features as specified in a new edition of the Java SE Platform Specification, e.g., JSR 337 for Java SE 8. Features may be removed in a major release, given advance notice at least one major release ahead of time, and incompatible changes may be made when justified. The $MAJOR version number of JDK 8 is 8; the $MAJOR version number of JDK 9 is 9. When $MAJOR is incremented, all subsequent elements are removed.
$MINOR - The minor version number, incremented for a minor update release that may contain compatible bug fixes, revisions to standard APIs mandated by a Maintenance Release of the relevant Platform Specification, and implementation features outside the scope of that Specification such as new JDK-specific APIs, additional service providers, new garbage collectors, and ports to new hardware architectures.
$SECURITY - The security level, incremented for a security-update release that contains critical fixes including those necessary to improve security. $SECURITY is not reset to zero when $MINOR is incremented. A higher value of $SECURITY for a given $MAJOR value, therefore, always indicates a more secure release, regardless of the value of $MINOR.
The fourth and later elements of a version number are free for use by downstream consumers of the JDK code base. Such a consumer may, e.g., use the fourth element to identify patch releases which contain a small number of critical non-security fixes in addition to the security fixes in the corresponding security release."
i.e., there are supposed to be 3 numbers and 2 periods in between.
Not necessarily and you can validate the versions using the JDK itself as detailed below.
In addition to the JEP (which holds true, as linked by @Stephen in the other answer), the JDK also gained a Runtime.Version API that can be used to validate a given version string. This can be done with a sample stub such as the following.
(Using JShell could be interesting here; no IDE needed!)
Runtime.Version version = Runtime.Version.parse("9");          // $MAJOR only
version = Runtime.Version.parse("9.0.1");                      // $MAJOR.$MINOR.$SECURITY
version = Runtime.Version.parse("9.0.0.15");                   // with a vendor-specific fourth element
version = Runtime.Version.parse("9.0.0.15+181");               // ...plus build information
The code makes use of Version.parse, which
Parses the given string as a valid version string containing a version
number followed by pre-release and build information.
The parsed Version can then be used (primarily) to get information such as the major, minor, pre-release and security number of the (runtime) version.
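For instance, a minimal snippet-style sketch (as it could be typed into JShell) using the Java 9 accessor methods of the parsed Runtime.Version:
Runtime.Version v = Runtime.Version.parse("9.0.0.15+181");
System.out.println(v.major());     // 9  (deprecated in favour of feature() from JDK 10 onwards)
System.out.println(v.minor());     // 0
System.out.println(v.security());  // 0
System.out.println(v.version());   // [9, 0, 0, 15] -- all numeric elements, including the vendor-specific 4th
System.out.println(v.build());     // Optional[181]
System.out.println(v.pre());       // Optional.empty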

SAT solving with more than 2^32 clauses

I'm trying to solve a big CNF formula using a SAT solver. The formula (in DIMACS format) has 4,697,898,048 = 2^32 + 402,930,752 clauses, and all SAT solvers I could find are having trouble with it:
(P)lingeling reports that there are "too many clauses" (i.e. more clauses than the header line specifies, but this is not the case)
CryptoMiniSat4 & picosat claim to read the header line as saying 402,930,752 clauses, which is 2^32 too few
Glucose seems to parse 98,916,961 clauses and then reports having solved the formula as UNSAT using simplification, but this is unlikely to be correct (an initial segment of the formula this short is very likely to be SAT).
Is anyone aware of a SAT solver that can handle files this large? Or is there something like a compiler switch that can sidestep this sort of behaviour? I believe all solvers are compiled for 64-bit Linux. (I'm a bit of a noob when it comes to handling numbers this big, sorry.)
I'm the developer of CryptoMiniSat. In most cases where the CNF is so huge, the issue is not the SAT solver but that the translation of the original problem into CNF wasn't done carefully enough. I assume you didn't write that CNF by hand -- you had a problem which you translated to CNF using an automated tool.
The act of translating a problem into CNF is called encoding and it has a huge literature in academia. It's a whole topic to itself, or more appropriately, whole topics to themselves. Please see the research papers on Constraint Programming (CP), Pseudo-boolean constraints (PB), ANF-to-CNF translation techniques (see crypto workshops/conferences) and electronic circuit encoding (search for AIG, Tseitin encoding and its variants and look at the references). These are the big topics but there are many others. Taking a peek at these will reduce your CNF by at least 3 orders of magnitude, probably more.
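As a tiny illustration of why the encoding matters: the standard Tseitin encoding of a single AND gate c <-> (a AND b) needs only three clauses, (~a OR ~b OR c), (a OR ~c) and (b OR ~c). In DIMACS form, with the illustrative variable numbering a=1, b=2, c=3, that is:
-1 -2 3 0
1 -3 0
2 -3 0
Applied gate by gate, this keeps the CNF linear in the size of the original circuit, whereas a naive expansion into an equivalent plain formula can blow up exponentially.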

Why are version numbers typically "2.1.3" instead of 2.13?

Why are version numbers typically "2.1.3" instead of 2.13?
Seems like the latter makes more sense, since you can run numerical comparators over it.
For the same reason that when the version number is encoded into an integer (e.g. Python's sys.hexversion), it's padded with zeroes:
You frequently have to go beyond 10.
Many projects adopt a major.minor.bugfix scheme (e.g. Semantic Versioning). Maybe version 2.1.9 has a security hole which you need to patch; you'd need to call it 2.1.10 (because calling it 2.2.0 implies new features and possible minor incompatibilities). Maybe version 3 completely changes the syntax, so you want to continue to add features to version 2.
Maybe your project simply releases so frequently that you have more than 100 minor/bugfix versions per major version (kernel.org lists 2.6.34.14 and 3.0.60).
Finally, it's a string. Sure, you can parse it into a double for comparison purposes, but plenty of languages/libraries support "numeric" string comparisons (so "Document 9" comes before "Document 10"); Apache's mod_autoindex even calls it "VersionSort".
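As a rough sketch of that zero-padded integer idea (a hypothetical pack helper, one byte per component, assuming every component stays below 256):
// Pack each component into its own byte, analogous to a zero-padded integer version number.
static int pack(int major, int minor, int patch) {
    return (major << 16) | (minor << 8) | patch;
}
// pack(2, 1, 13) > pack(2, 1, 9), whereas the string "2.1.13" sorts before "2.1.9"
// lexicographically, and reading the versions as decimals (2.113 vs 2.19) gets the order wrong too.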
The three-level versioning scheme typically reflects major.minor.build (or major.minor.revision).
The three levels allow for decreasing significance of the changes between levels. The difference between software with major 1, minor 13 and major 1, minor 12 should be significantly greater than the difference between major 1, minor 1, build 3 and major 1, minor 1, build 2.
You can split the version numbers on the . and compare each level individually.
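A minimal sketch of that split-and-compare approach (a hypothetical helper, not taken from any particular library):
import java.util.Arrays;

// Compare dotted version strings numerically, component by component.
static int compareVersions(String a, String b) {
    int[] pa = Arrays.stream(a.split("\\.")).mapToInt(Integer::parseInt).toArray();
    int[] pb = Arrays.stream(b.split("\\.")).mapToInt(Integer::parseInt).toArray();
    int len = Math.max(pa.length, pb.length);
    for (int i = 0; i < len; i++) {
        int x = i < pa.length ? pa[i] : 0;   // a missing component counts as zero
        int y = i < pb.length ? pb[i] : 0;
        if (x != y) {
            return Integer.compare(x, y);
        }
    }
    return 0;
}
// compareVersions("1.13", "1.2") > 0, so 1.13 is correctly treated as newer than 1.2.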
Even if there is only one decimal point, you cannot use a numerical comparator on it.
Here is an example.
Consider versions 1.13 and 1.2: obviously 1.13 is the later version, but as a decimal number 1.2 > 1.13.

What does number of lines of code tell you about your application?

Recently, our managers asked us to find the number of lines of code in our application. I have been pondering ever since: what does this metric signify?
Is it to measure the average lines of code a developer has written over time?
If no refactoring happens, then this could be a possibility.
Does it tell how good your application is?
Does it help in marketing the product?
I don't know how it helps. Can someone please guide me in the right direction or explain what this metric signifies?
Thanks.
Something I found recently http://folklore.org/StoryView.py?project=Macintosh&story=Negative_2000_Lines_Of_Code.txt&sub=HN0
The number of lines of code is a popular but problematic metric.
Advantages
Number of lines of code shows a moderate (0.4-0.5) correlation with the number of bugs [Rosenberg 1997, Zhang 2009], i.e., larger modules usually have more bugs and, which might be more interesting, more bugs per line [Fenton and Ohlsson 2000, Zhang 2009]. I would like to stress that there are better (but more complex) ways to predict the number of bugs.
Number of lines of code can be used to predict the development effort, i.e., there are effort prediction models (e.g., COCOMO) that take the number of source lines of code as one of the input parameters.
Some of the more complex OO-metrics show strong correlation with class size [El Emam et al. 2001].
Disadvantages
Using lines of code as a productivity measure is extremely problematic since it becomes difficult to compare modules in different languages or written by different developers. Indeed, some languages are more verbose due to, e.g., presence/absence of “built-in” functionality or structural verbosity (e.g., .h in C). Moreover, as already mentioned above, some developers are paid per line of code which necessarily leads to ridiculously complicated code. Finally, code generation should be taken into account.
While "lines of code" is a common metrics, one has to be careful with distinguishing different kinds of "lines of code": with blank lines or without, with comments or without, counting logical statements of physical lines...
What does number of lines of code tell you about your application?
The number of lines of code will tell you roughly how much disk space you need to store the uncompressed source files. Even this is rough, as each line will have a different number of characters and different encodings could be used (UTF-8 can take up to twice the disk space of Latin-1 for accented text).
Is it to measure the average lines of code a developer has written over time?
No.
Does it tell how good your application is?
No.
Does it help in marketing the product?
No.
It signifies that your managers are incompetent
If you were being measured by number of lines of code, as a developer what would you do to achieve the target...
Google for this metric, it will tell you it's the dumbest strategy since Adolf decided to win the war in Europe by invading Russia.

C style guide tips for a <80 char line

I can't find many recommendations/style guides for C that mention how to split up lines so you have fewer than 80 characters per line.
About the only thing I can find is PEP 7, the style guide for the main Python implementation (CPython).
Does a link exist to a comprehensive C style guide which includes recommendations for wrapping? Or failing that, at least some good personal advice on the matter?
P.S.: What do you do with really_long_variable_names_that_go_on_forever (besides shortening them)? Do you put them on the left edge or let them spill over?
Here is Linus' original article about the (Linux) kernel coding style. The document has probably evolved since then; it is part of the kernel source distribution.
You can have a look at the GNU Coding Standards which covers much more than coding style, but are pretty interesting nonetheless.
The 80 characters per line "rule" is obsolete.
http://richarddingwall.name/2008/05/31/is-the-80-character-line-limit-still-relevant/
http://en.wikipedia.org/wiki/Characters_per_line
http://news.ycombinator.com/item?id=180949
We don't use punched cards much anymore. We have huge displays with great resolutions that will only get larger as time goes on (obviously hand-helds, tablets, and netbooks are a big part of modern computing, but I think most of us are coding on desktops and laptops, and even laptops have big displays these days).
Here are the rules that I feel we should consider:
One line of code does one thing.
One line of code is written as one line of code.
In other words, make each line as simple as possible and do not split a logical line into several physical lines. The first part of the rule helps to ensure reasonable brevity so that conforming to the second part is not burdensome.
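A small hypothetical sketch of the idea (in Java rather than C, since the principle is language-agnostic; all names are invented):
public class OneThingPerLine {
    public static void main(String[] args) {
        int[] prices = {12, 305, 48, 7};
        int minimumPrice = 10;

        // Several ideas crammed into one long physical line:
        int total = 0; for (int p : prices) { if (p >= minimumPrice) { total += p; } } System.out.println(total);

        // One thing per line: each statement does one job, so every line stays short
        // without ever splitting a logical line across physical lines.
        int total2 = 0;
        for (int p : prices) {
            if (p >= minimumPrice) {
                total2 += p;
            }
        }
        System.out.println(total2);
    }
}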
Some people believe that certain languages encourage complex "one-liners." Perl is an example of a language that is considered by some to be a "write once, read never" language, but you know what? If you don't write obfuscated Perl, if instead you do one thing per line, Perl code can be just as manageable as anything else... ok, maybe not APL ;)
Besides complex one-liners, another drawback that I see with conforming to some artificial character limit is the shortening of identifiers to conform to the rule. Descriptive identifiers that are devoid of abbreviations and acronyms are often clearer than shortened alternatives. Clear identifiers move us that much closer to literate programming.
Perhaps the best "modern" argument that I've heard for keeping the 80, or some other value, character limit is "side-by-side" comparison of code. Side-by-side comparison is useful for comparing different versions of the same source file, as in source code version control system merge operations. Personally, I've noticed that if I abide by the rules I've suggested, the majority of my lines of code are sufficiently short to view them in their entirety when two source files (or even three, for three-way merges) are viewed side-by-side on a modern display. Sure, some of them overrun the viewport. In such cases, I just scroll a little bit if I need to see more. Also, modern comparison tools can easily tell you which lines are different, so you know which lines you should be looking at. If your tooling tells you that there's no reason to scroll, then there's no reason to scroll.
I think the old recommendation of 80 chars per line comes from a time when monitors were 80x25; nowadays 128 or more should be fine.
