What is this `enum`ish notation in an RFC on TLS?

This is from RFC 3749 (Transport Layer Security Protocol Compression Methods):
Compression Methods
TLS [2] includes the following compression method structure in
sections 6.1 and 7.4.1.2 and Appendix sections A.4.1 and A.6:
enum { null(0), (255) } CompressionMethod;
I'm not really familiar with C, but I know enough to recognize that this resembles a C enum. What I can't understand, though, are the null(0) and (255) parts. I can't seem to find anywhere what the parentheses and null mean in this context.
(It seems hard to even come up with a (Google?) search pattern consisting of something less ubiquitous than "rfc", "null", "c", "parentheses" that would lead anywhere other than questions on "null function pointer" or the most fundamental basics.)
So what do these notations mean syntactically?
Why is 255 in parentheses?
Why does null look like a function call?
Is this even supposed to be C? Or is it a common notation shared across RFCs? And if it's C, is it specific to enum?
How is this different from enum { 0, 255 } CompressionMethod; or enum { NULL, 255 } CompressionMethod;?

You may be overreasoning a bit here :)
You should have quoted the lines that follow your quote:
which allows for later specification of up to 256 different
compression methods.
That already explains what the line means. Now, if you follow the [2] to the list of references, you'll notice it refers to RFC 2246. And that document contains the following paragraph:
4. Presentation language
This document deals with the formatting of data in an external
representation. The following very basic and somewhat casually
defined presentation syntax will be used. The syntax draws from
several sources in its structure. Although it resembles the
programming language "C" in its syntax and XDR [XDR] in both its
syntax and intent, it would be risky to draw too many parallels. The
purpose of this presentation language is to document TLS only, not to
have general application beyond that particular goal.
So, the authors of that RFC seem to have concocted a simple syntax from familiar elements to simplify the representation of the subject of the RFC, namely TLS. For enumerateds, they specify the notation in section 4.5:
4.5. Enumerateds
An additional sparse data type is available called enum. A field of
type enum can only assume the values declared in the definition.
Each definition is a different type. Only enumerateds of the same
type may be assigned or compared. Every element of an enumerated must
be assigned a value, as demonstrated in the following example. Since
the elements of the enumerated are not ordered, they can be assigned
any unique value, in any order.
enum { e1(v1), e2(v2), ... , en(vn) [[, (n)]] } Te;
Enumerateds occupy as much space in the byte stream as would its
maximal defined ordinal value. The following definition would cause
one byte to be used to carry fields of type Color.
enum { red(3), blue(5), white(7) } Color;
One may optionally specify a value without its associated tag to
force the width definition without defining a superfluous element.
In the following example, Taste will consume two bytes in the data
stream but can only assume the values 1, 2 or 4.
enum { sweet(1), sour(2), bitter(4), (32000) } Taste;
The names of the elements of an enumeration are scoped within the
defined type. In the first example, a fully qualified reference to
the second element of the enumeration would be Color.blue. Such
qualification is not required if the target of the assignment is well
specified.
Color color = Color.blue; /* overspecified, legal */
Color color = blue; /* correct, type implicit */
For enumerateds that are never converted to external representation,
the numerical information may be omitted.
enum { low, medium, high } Amount;

What it's saying is that CompressionMethod.null has the value 0, and the remaining 255 slots are reserved:
which allows for later specification of up to 256 different
compression methods

(255) informs you of the size of this field, so for encoding you know how many bytes you need. If it were (400), you would need 2 bytes to encode the 0 for CompressionMethod.null, i.e. 0x00 0x00. Because 255 can be represented in 1 byte, you only need 0x00.
Essentially it lets you know the width of the enum field.
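As a rough illustration in C (my own sketch, not from the RFC; the type names and helper functions are made up), the maximal defined ordinal value dictates how many bytes an encoder writes for the field:

#include <stdint.h>
#include <stddef.h>

/* enum { null(0), (255) } CompressionMethod;
   maximal ordinal 255 fits in one byte on the wire. */
typedef uint8_t CompressionMethod;
#define COMPRESSION_NULL 0

/* enum { sweet(1), sour(2), bitter(4), (32000) } Taste;
   maximal ordinal 32000 needs two bytes on the wire. */
typedef uint16_t Taste;

static size_t encode_compression_method(uint8_t *out, CompressionMethod m)
{
    out[0] = m;                   /* one byte: 0x00 for null */
    return 1;
}

static size_t encode_taste(uint8_t *out, Taste t)
{
    out[0] = (uint8_t)(t >> 8);   /* TLS uses network (big-endian) byte order */
    out[1] = (uint8_t)(t & 0xff);
    return 2;
}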

Maximum of 2 INTEGERS?

What's the syntax for the maximum of 2 INTEGERS?
Eiffel documentation is so bad, literally could not find the answer to this simple question anywhere!
Or does it not exist and I have to use if-statements?
In most cases in Eiffel, the source is all the documentation you need. By right-clicking on a class' name in EiffelStudio, you can see its ancestor tree. You can also use the flat view feature to see all the inherited features of a class within the same editor.
Typically, INTEGER is an alias for INTEGER_32. INTEGER_32 inherits from COMPARABLE (through INTEGER_32_REF). COMPARABLE provides the max and min features. Their signature is
max (other: like Current): like Current
meaning all descendants of COMPARABLE take and return another value of the same type as themselves.
Therefore:
local
    a, b, maximum: INTEGER
do
    a := <some value>
    b := <some value>
    maximum := a.max(b) -- or b.max(a)
end
Eiffel has a unified type system, which means every type is defined as a class, even 'primitive' types that get special treatment in most other languages. INTEGER, CHARACTER, BOOLEAN, ARRAY and other such basic types thus come with a rich set of features you can consult in their own class files, just as you would with any other type. Since operators are defined as regular features too, this is also the way to figure out exactly which operators exist for any given class.

Variant UUID format

RFC 4122 defines UUID as having a specific format:
Timestamp: 8 bytes; Clock sequence and variant: 2 bytes; Node: 6 bytes
In my application, I use UUIDs for different types of entities, say foo, bar and baz. IOW, every object of type foo gets a UUID, so does every bar and baz. This results in a slew of UUIDs in the input and output (logs, CLI) of the application.
To validate the input and to make the output easier to handle, I am considering devoting one nibble (4 bits) to the type of the entity: foo, bar, baz, etc. For example, if that nibble were '5', it would indicate it is a UUID of a foo object. Each UUID would be self-describing in this scheme, can be validated on input and can be automatically prefixed with category names in the output, like "foo:5ace6432-..."
Is it a good idea to tweak the UUID format in this way? Randomness is not a source of worry, as we still have 118 bits (128 bits total - 6 bits used by the standard for variants - 4 bits used by me).
Where should I place this nibble? If I place it up front, it would overwrite part of the timestamp. If I place it at the end, it would be less visible. If I place it somewhere in the middle, it would be even less visible. Is overwriting part of the timestamp a problem?
Thanks.
UUIDs are 128-bit numbers; the format is just for reading.
It is bad practice to try to interpret their internal parts. A UUID should be opaque: one single unit that identifies something. If you need other information about your entities, you need other fields.
In short: self-describing UUIDs are a bad idea.
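A minimal sketch of that advice in C (my own illustration; the struct and names are hypothetical): keep the entity kind in its own field next to the opaque identifier, rather than stealing bits from it.

#include <stdint.h>

enum entity_kind { ENTITY_FOO, ENTITY_BAR, ENTITY_BAZ };

struct entity_id {
    uint8_t uuid[16];       /* opaque 128-bit identifier, never inspected */
    enum entity_kind kind;  /* the type information lives here instead */
};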

Shouldn't these types be closely related?

I am trying to analyze the following file which is supposed to be VHDL-2008 compatible.
entity closely_related is
end;
architecture example of closely_related is
    type integer_vector is array (natural range <>) of integer;
    type real_vector is array (natural range <>) of real;
begin
    process
        variable int_vect: integer_vector(1 to 3);
        variable real_vect: real_vector(1 to 3);
    begin
        real_vect := ( 1.0, 2.0, 3.0 );
        int_vect := integer_vector( real_vect );
        wait;
    end process;
end;
This is supposed to be an experiment about closely related types. According to the LRM, there are two cases of closely related types:
— Abstract numeric types—Any abstract numeric type is closely related to any other abstract numeric
type.
— Array types—Two array types are closely related if and only if the types have the same
dimensionality and the element types are closely related
I understand that reals and integers are closely related; type conversion (aka type casting) between them works ok. Then why doesn't it work for the above array types?
GHDL gives the following error:
conversion not allowed between not closely related types
And Modelsim Altera 10.1e (with -2008 switch) is no better:
Illegal type conversion from std.STANDARD.REAL_VECTOR to std.STANDARD.INTEGER_VECTOR
(array element type difference).
Just to be thorough, I tried to do the same operation one element at a time:
int_vect(1) := integer( real_vect(1) );
int_vect(2) := integer( real_vect(2) );
int_vect(3) := integer( real_vect(3) );
And it works perfectly. Any ideas?
Shouldn't these types be closely related?
Not for ghdl, which is strictly -1993 compliant by default.
This is from IEEE Std 1076-1993, 7.3.5 Type conversions:
A type conversion provides for explicit conversion between closely related types.
...
b. Array Types -- Two array types are closely related if and only if
-- The types have the same dimensionality;
-- For each index position, the index types are either the same or are closely related; and
-- The element types are the same.
...
No other types are closely related.
So the issue is that the element types are not the same.
In -2008, 9.3.6 Type conversions:
— Abstract numeric types—Any abstract numeric type is closely related to any other abstract numeric type.
— Array types — Two array types are closely related if and only if the types have the same dimensionality and the element types are closely related
This tells us that any two abstract numeric types (integer, real) are closely related, and that array types are now closely related when their element types are closely related.
So it looks like either the Modelsim version you specify isn't compliant with the change, or something stopped your -2008 switch from taking effect.
I don't have the -2008 versions of the libraries loaded for ghdl on my Mac. I wouldn't bet it would work with the --std=08 flag, either.
I checked out the latest ghdl source code from ghdl-updates on Sourceforge. The change implementing closely related array types with closely related elements has not been incorporated. See sem_names.adb lines 1024 - 1047.
When things like this don't get implemented after revisions to the standard, it's generally because there isn't a test case that fails or passes when it should, and because there is no easy way to see the changes between versions of the standard.
You'd need a diff PDF and a way of correlating requirements between the various clauses and subclauses, as well as of determining whether a statement in the standard is testable, and if so by what code. It's safe to say that copyright is getting in the way of implementation.
The thud factor (page count) of the -2008 standard also increases the likelihood that compliance gotchas will occur.
Any ideas?
As a workaround, you can convert element by element inside an aggregate:
int_vect := integer_vector'(integer(real_vect(1)), integer(real_vect(2)), integer(real_vect(3)));

Multiple data types in bison/flex

I'm writing a bison/flex parser, with multiple data types, all compatible with ANSI C. It won't be a C language, but will retain its data types.
Thing is... I am not sure how to do this correctly.
For example, in an expression, say 'n1' + 'n2', if 'n1' is a double and 'n2' is a 32-bit integer, I will need to do type conversion, right? How do I do it correctly?
i.e. I will logically need to evaluate which type is bigger (here it's double), then convert the int32 to double and then perform the add operation, which would result in a double of value n1 + n2.
I also want to provide support for type casting.
What's the best way to do it correctly? Is there a way to do it nicely, or will I have to write a billion conversion functions like uint32todouble, int32todouble, int32tolongdouble, int64tolongdouble, etc.?
Thanks!
EDIT: I have been asked to clarify my question, so I will.
I agree this is not directly related to bison/flex, but I would like hints from people experienced in this context.
Say I have the following operation in my own 'programming' language (I would say it's more of a scripting language, but anyway), i.e. the one I would parse:
int64 b = 237847823435ll
int64 a = int64(82 + 3746.3746434 * 265.345 + b)
Here, the int64() pseudo-function is a type cast. First, we can see that 82 is an int constant, followed by the double constants 3746.3746434 and 265.345, and b is an int64. So when I evaluate the expression assigned to a, I will have to:
Change the type of 82 to double
Change the type of b to double
Do the calculations
Since we have a double and that we want to cast it to an int64, convert the double to an int64, and store the result in variable 'a'
As you can see, that's quite a lot of type changes... And I wonder how I can do them in the most elegant way and with the least work possible. I'm talking about the internal implementation.
I could for example write things like :
int64_t double_to_int64(double k) {
    return (int64_t) k; // make specific double to int64 conversion
}
For each of the types, so I'd have a function specific to each conversion. But it would take quite a lot of time to achieve, and besides, it's an ugly way of doing things. Since some of the variables and number tokens in my parser/lexer are stored in buffers (for different reasons), I don't really see how I could convert from one type to another without such functions. Not to mention that with all the unsigned/signed types, it will double the number of required functions.
Thanks
This has nothing to do with flex or bison. It is a language design question.
I suggest you have a look at the type promotion features of other languages. For example, C and Java promote byte, char, and short to int whenever used in an expression. So that cuts a lot of cackle straight away.
These operations are single instructions on the hardware. You don't need to write any functions at all; just generate the appropriate code. If you're designing an interpretive system, design the p-code accordingly.
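To make the promotion idea concrete, here is a rough sketch in C (my own illustration, not from the answer; all names are made up). Ordering the type tags by rank lets a couple of widening helpers replace the N-by-N matrix of conversion functions:

#include <stdint.h>

/* tags ordered by rank: an operation promotes to the larger of the two */
typedef enum { T_INT32, T_INT64, T_DOUBLE } type_tag;

typedef struct {
    type_tag tag;
    union { int32_t i32; int64_t i64; double d; } v;
} value;

static double as_double(value x)
{
    switch (x.tag) {
    case T_INT32: return (double)x.v.i32;
    case T_INT64: return (double)x.v.i64;
    default:      return x.v.d;
    }
}

static int64_t as_int64(value x)
{
    /* only reached when neither operand is a double */
    return x.tag == T_INT32 ? (int64_t)x.v.i32 : x.v.i64;
}

static value add(value a, value b)
{
    value r;
    r.tag = a.tag > b.tag ? a.tag : b.tag;  /* usual-arithmetic-style promotion */
    switch (r.tag) {
    case T_DOUBLE: r.v.d   = as_double(a) + as_double(b); break;
    case T_INT64:  r.v.i64 = as_int64(a)  + as_int64(b);  break;
    default:       r.v.i32 = a.v.i32      + b.v.i32;      break;
    }
    return r;
}

An explicit cast like int64(...) can then reuse the same widening and narrowing helpers on the finished value, rather than needing its own set of functions.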

When to use enums?

I'm currently reading about enums in C. I understand how they work, but can't figure out situations where they could be useful.
Can you give me some simple examples where the usage of enums is appropriate?
They're often used to group related values together:
enum errorcode {
    EC_OK = 0,
    EC_NOMEMORY,
    EC_DISKSPACE,
    EC_CONNECTIONBROKE,
    EC_KEYBOARD,
    EC_PBCK
};
Enums are just a way to declare constant values with more maintainability. The benefits include:
Compilers can assign the values automatically when the actual numbers don't matter.
They mitigate programmers doing bad things like int monday = SUNDAY + 1.
They make it easy to declare all of your related constants in a single spot.
Use them when you have a finite list of distinct, related values, as in the suits of a deck of cards. Avoid them when you have an effectively unbounded list or a list that could often change, as in the set of all car manufacturers.
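For instance, the card-suit case might look like this (an illustrative sketch):

enum suit { CLUBS, DIAMONDS, HEARTS, SPADES };  /* finite, distinct, related */

enum suit trump = HEARTS;  /* a variable restricted, by convention, to those four values */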
Sometimes you want to express something that is finite and discrete. An example from the GNU C Programming tutorial is compass directions.
enum compass_direction
{
    north,
    east,
    south,
    west
};
Another example, where the ability of enums to correspond to integers comes in handy, could be status codes.
Usually you give the OK code the value 0, so it can be tested directly in if constructs.
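Something along these lines (my own illustrative sketch; the status names and the do_work callback are made up):

enum status { STATUS_OK = 0, STATUS_ERR_IO, STATUS_ERR_MEM };

void report(enum status (*do_work)(void))
{
    enum status st = do_work();
    if (st) {
        /* non-zero means some error occurred; STATUS_OK (0) falls through */
    }
}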
The concept behind an enum is also sometimes called an "enumerated type". That is to say, it's a type all of whose possible values are named and listed as part of the definition of the type.
C actually diverges from that a bit, since if a and b are values in an enum, then a|b is also a valid value of that type, regardless of whether it's listed and named or not. But you don't have to use that fact; you can use an enum just as an enumerated type.
They're appropriate whenever you want a variable whose possible values each represent one of a fixed list of things. They're also sometimes appropriate when you want to define a bunch of related compile-time constants, especially since in C (unlike C++), a const int is not a compile-time constant.
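To illustrate that last point (my own sketch, not from the answer): an enum constant is a genuine compile-time constant in C, whereas a const int is not, so only the former can size a struct member array.

const int n1 = 10;
enum { N2 = 10 };

struct s {
    /* int bad[n1]; */  /* error: n1 is not a constant expression in C */
    int ok[N2];         /* fine: enum constants are integer constant expressions */
};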
I think that their most useful properties are:
1) You can concisely define a set of distinct constants where you don't really care what the values are as long as they're different. typedef enum { red, green, blue } color; is better than:
#define red 0
#define green 1
#define blue 2
2) A programmer, on seeing a parameter / return value / variable of type color, knows what values are legal and what they mean (or anyway knows how to look that up). Better than making it an int but documenting "this parameter must be one of the color values defined in some header or other".
3) Your debugger, on seeing a variable of type color, may be able to do a reverse lookup, and give red as the value. Better than you looking that up yourself in the source.
4) The compiler, on seeing a switch of an expression of type color, might warn you if there's no default and any of the enumerated values is missing from the list of cases. That helps avoid errors when you're doing something different for each value of the enum.
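For example (an illustrative sketch of point 4, reusing the color type from point 1): with no default and a missing case, a compiler such as gcc with -Wall can warn that blue is not handled:

typedef enum { red, green, blue } color;

const char *color_name(color c)
{
    switch (c) {        /* warning: enumeration value 'blue' not handled */
    case red:   return "red";
    case green: return "green";
    }
    return "?";
}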
Enums are primarily used because they are easier for programmers to read than integer numbers. For example, you can create an enum that represents the days of the week.
enum DAY { Monday, Tuesday, ... };
Monday is equivalent to the integer 0, Tuesday to Monday + 1, and so on.
So instead of integers that represent each weekday, you can use the new type DAY to represent the same concept. It is easier for the programmer to read, but it means the same thing to the compiler.
