ISO 9660 Level 1 compliant directory names - filesystems

I am confused about the exact limitations on folder names in an ISO 9660 (Level 1) compliant filesystem. I read through the Wikipedia page and it says:
File names are limited to eight characters with a three-character extension, using upper case letters, numbers and underscore only. - Wikipedia
When it says 'File Names', does it really mean file or folder names? If not, then what are the restrictions on folder names?
Thanks!

You can get the original ISO 9660 standard and its 1987 revision from Ecma International. The precise text is
10.1 Level 1
At Level 1 the following restrictions shall apply:
Each file shall consist of only one File Section;
a File Name shall not contain more than 8 [characters];
a File Name Extension shall not contain more than 3 [characters];
a Directory Identifier shall not contain more than 8 [characters].
(I've elided the distinction between "d-characters" and "d1-characters" which is irrelevant here.)
So the answer to your question is, at level 1, file names are restricted to 8+3 characters a la DOS, but directory ("folder") names are restricted to eight characters with no extension (unlike DOS, if I remember correctly).
Note that the standard has always included Level 2, which allows 31-character filenames (but still, if I'm reading it right, with only one dot); Level 1 is only for interop with pre-VFAT DOS, and shouldn't be necessary in a CDROM mastered today. (The restrictions on the size and depth of a CDROM directory hierarchy are, unfortunately, still relevant.)
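To make the two Level 1 rules concrete, here is a minimal sketch of a name checker in C. The function names are made up for illustration, and it simplifies on purpose: it takes "d-characters" to be upper-case letters, digits and underscore, it requires a non-empty name, it treats the dot and extension as optional, and it ignores the version suffix.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* d-characters: upper-case letters, digits, and underscore. */
static bool is_d_char(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') || c == '_';
}

/* True if the first len characters of s are all d-characters. */
static bool all_d_chars(const char *s, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (!is_d_char(s[i]))
            return false;
    }
    return true;
}

/* Level 1 directory identifier: 1 to 8 d-characters, no dot, no extension. */
bool iso9660_l1_dir_ok(const char *name)
{
    size_t len = strlen(name);
    return len >= 1 && len <= 8 && all_d_chars(name, len);
}

/* Level 1 file identifier: up to 8 characters, then optionally '.' and up
 * to 3 more. Simplification: the name part must not be empty. */
bool iso9660_l1_file_ok(const char *name)
{
    const char *dot = strchr(name, '.');
    size_t base_len = dot ? (size_t)(dot - name) : strlen(name);
    const char *ext = dot ? dot + 1 : "";
    size_t ext_len = strlen(ext);

    /* A second dot ends up inside ext and fails the d-character test. */
    return base_len >= 1 && base_len <= 8 && ext_len <= 3
        && all_d_chars(name, base_len) && all_d_chars(ext, ext_len);
}
```

With this sketch, `iso9660_l1_file_ok("README.TXT")` is accepted, while `iso9660_l1_dir_ok("DOCS.OLD")` is rejected because a directory identifier has no extension part.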

Related

variable names in C by Dennis Ritchie [duplicate]

When taken literally, it makes sense, but what exactly does it mean to be a significant character of a variable name?
I'm a beginning learner of C using K&R. Here's a direct quote from the book:
"At least the first 31 characters of an internal name are significant. For function names and external variables, the number may be less than 31, because external names may be used by assemblers and loaders over which the language has no control. For external names, the standard guarantees uniqueness only for 6 characters and a single case."
By the way, what does it mean by "single case"?
Single Case usually means "lower case". Except in some OS's where it means "upper case". The point is that mixed case is not guaranteed to work.
abcdef
ABCDEF
differ only in case. This is not guaranteed to work.
The "Significance" issue is one of how many letters can be the same.
Let's say we only have 6 significant characters.
a_very_long_name
a_very_long_name_thats_too_similar
These look different, but the first 16 characters are the same. If only 6 are significant, the two names may be treated as the same variable.
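The comparison a limited linker effectively performs can be sketched like this. `names_may_collide` is a hypothetical helper, not anything from the standard or from a real linker; it only models "look at the first N characters, optionally ignoring case".

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * True if a linker that looks at only the first `significant` characters,
 * and optionally folds case, could not tell the two names apart.
 */
bool names_may_collide(const char *a, const char *b,
                       size_t significant, bool fold_case)
{
    for (size_t i = 0; i < significant; i++) {
        unsigned char ca = (unsigned char)a[i];
        unsigned char cb = (unsigned char)b[i];
        if (fold_case) {
            ca = (unsigned char)toupper(ca);
            cb = (unsigned char)toupper(cb);
        }
        if (ca != cb)
            return false;   /* differ inside the significant prefix */
        if (ca == '\0')
            return true;    /* both names ended: identical */
    }
    return true;            /* identical over the whole significant prefix */
}
```

With 6 significant characters and case folding, `a_very_long_name` and `a_very_long_name_thats_too_similar` collide, and so do `abcdef` and `ABCDEF`. With 31 case-sensitive characters, neither pair does.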
It means what you fear it means. For external names, the C standard at the time K&R 2nd ed. was written really does give only six case-insensitive characters! So you can't have afoobar and aFooBaz as independent entities.
This absurd limitation (which was to accommodate legacy linkers now long gone) is no longer relevant to almost any environment. The C99 standard offers 31 case-sensitive characters for external names and 63 internally, and commonly used linkers in practice support much longer names.
It just means that if you have two variables named
abcdefghijklmnopqrstuvwxyz78901A,
and
abcdefghijklmnopqrstuvwxyz78901B,
then there is no guarantee that they will be treated as different, separate variables...
It means that :
foobar1
foobar2
might be the same external name, because only the first 6 characters need be considered. The single case means that upper and lower case names need not be distinguished.
Please note that almost all modern linkers will consider much longer names, though there will still be a limit, dependent on the linker.
G'day,
One of the problems with this limited symbol resolution occurs at link time.
Multiple symbols with the same name can exist across several libraries and the link editor usually only takes the first one it finds that matches what it is looking for.
So, using S.Lott's example from above, if your link editor is searching for the symbol "a_very_long_name" and it finds a library on its search path that contains the symbol "a_very_long_name_thats_too_similar" it will take this one. This will happen even if the library that contains the symbol that you want, i.e. "a_very_long_name" has been specified in your command. For example specifying the libraries as:
-L/my/library/path -lmy_wrong_lib -lmy_correct_lib
There are now compiler options, or more correctly compile time options which are passed through to the link editor, which enforce a search for multiple symbols in your link path. These are then usually raised as errors at link time.
In addition, many compilers, e.g. gcc, will default to such behaviour. You have to explicitly enable multiple definitions to allow the link editor to proceed without raising a fatal error if it finds multiple definitions for a symbol.
BTW I'd highly recommend working through the exercises in conjunction with Clovis Tondo's book "The C Answer Book 2nd ed.".
Doing this really helps make C stick in your mind.
HTH
cheers,

K & R C Variable Names

I am confused by the passage about variable names in K&R C. The original text is below:
At least the first 31 characters of an internal name are significant. For function names and external variables, the number may be less than 31, because external names may be used by assemblers and loaders over which the language has no control. For external names, the standard guarantees uniqueness only for 6 characters and a single case. Keywords like if, else, int, float, etc., are reserved: you can't use them as variable names. They must be in lower case.
It's wise to choose variable names that are related to the purpose of the variable, and that are unlikely to get mixed up typographically. We tend to use short names for local variables, especially loop indices, and longer names for external variables.
What confused me was the part about external names: the standard guarantees uniqueness only for 6 characters and a single case. Does it mean that for external names, only the 6 leading chars are valid and the remaining chars are all ignored? For example, if we define two external variables myexvar1 and myexvar2, will the compiler treat these two variables as one? If this is true, why do they advise us to use longer names for external variables?
Does it mean that for external names, only the 6 leading chars are valid and the remaining chars are all ignored? For example, if we define two external variables myexvar1 and myexvar2, will the compiler treat these two variables as one?
Yes, this was true in 1990. Or rather, 6 significant leading characters for external identifiers was the minimum the C90 standard required a compiler to support. This was of course madness, which is why the limit was increased to 31 in C99.
In practice, most C90 compilers had at least 31 unique characters for internal and external identifiers both.
If this is true, why do they advise us to use longer names for external variables?
Not sure if they advise it. But the coding style used in K&R is often plain horrible, so it is definitely not a book you should consult for coding style advice.
In modern C, it is required (C17 5.2.4.1) that we have:
63 significant initial characters in an internal identifier or a macro name
31 significant initial characters in an external identifier
So don't worry too much about which limitations the dinosaurs faced, but follow modern standard C.
As pointed out in another answer, even the restriction to 31 significant initial characters for external identifiers is listed as obsolescent, meaning the limit might be raised further, to 255, in future standards.
Truth be told, K&R is pretty old, so I assume things have changed since then.
I really don't know the reason why they give exactly 6 characters here:
For external names, the standard guarantees uniqueness only for 6 characters and a single case.
But you have to understand that all a compiler does is translate a translation unit (usually a *.c file) into an object file (*.o). That's it. The compiler does not produce a ready-to-run program.
Those object files might contain references to unresolved symbols to be found in other object files as well as a table of their own external symbols, the ones they provide to be referenced from the outside. The symbols do have textual names, which are the names you've given to your external variables.
Linkers and dynamic loaders still have to do their jobs to build the program and get it running. Along the way they have to resolve all unresolved symbols, so they perform a textual lookup for those symbols in object files. Linkers and loaders are not the compiler. They might have had their own rules about treating those names (back in the days of K&R, I guess). That's what this ...
because external names may be used by assemblers and loaders over which the language has no control.
... is about.
These days though all your K&R concerns sound outdated and irrelevant. Pick a newer standard to follow.
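As a rough illustration of which names actually reach the linker, here is a small translation unit. All identifiers in it are invented for the example. Names with external linkage end up in the object file's symbol table, which is where the old 6-character limit applied; a `static` name stays inside the translation unit.

```c
/* Internal linkage: `static` keeps this name private to the translation
 * unit, so other object files cannot refer to it by name. */
static int call_count = 0;

/* External linkage: this name is exported from the object file and is the
 * kind of identifier the K&R passage is talking about. */
int shared_total = 0;

/* External linkage as well: other translation units can call this by name. */
int add_to_total(int amount)
{
    call_count++;
    shared_total += amount;
    return shared_total;
}

/* Exposes the internal counter's value without exporting its name. */
int calls_so_far(void)
{
    return call_count;
}
```

On many Unix-like systems you can compile this with `cc -c` and run `nm` on the object file: `shared_total`, `add_to_total` and `calls_so_far` should be listed as global symbols, while `call_count` shows up only as a local one, if at all.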
This is due to the historical background concerning the length of exported symbols to the linker of the system.
I quote from The New C Standard -- An Economic and Cultural Commentary.
The values of 6 and 10 were chosen so that the encodings \u1234 and
\U12345678 could be used.
The Fortran significant character limit of six was followed by many
suppliers of linkers for a long time. The need for longer identifiers
to support name mangling in C++ ensured that most modern linkers
support many more significant characters in an external identifier.
Common Implementations
Historically, the number of significant
characters in an external identifier was driven by the behavior of the
host vendor-supplied linker. Only since the success of MS-DOS have
developers become used to translator vendors supplying their own
linker. Previously, most linkers tended to be supplied by the hardware
vendor. The mainframe world tended to be driven by the requirements of
Fortran, which had six significant characters in an internal or
external identifier. In this environment it was not always possible to
replace the system linker by one supporting more significant
characters. The importance of the mainframe environment waned in the
1990s. In modern environments it is very often possible to obtain
alternative linkers.
So the main issue was to be able to link together libraries compiled in C with libraries compiled in Fortran, and Fortran imposed the limit of 6.
You can read more at the given reference.
That's a legacy of the past that no longer matters. No current compiler has those limits; they date from the time the old Unix was written. The reasons were the limit the compiler imposed on names in its symbol table (31) and the limit the linker used at that time (6).
But that's not applicable anymore. You can be fairly sure that today's linkers will keep identifiers distinct even when they share a common prefix of 100 characters or more.

Is the remove function guaranteed to delete the file?

The wording of the C99 standard seems a bit ambiguous regarding the behavior of the remove function.
In section 7.19.4.1 paragraph 2:
The remove function causes the file whose name is the string pointed to by filename
to be no longer accessible by that name. A subsequent attempt to open that file using that
name will fail, unless it is created anew.
Does the C99 standard guarantee that the remove function will delete the file on the filesystem, or could an implementation simply ignore the file -- leaving the file on filesystem, but just inaccessible to the current program via that filename-- for the remainder of the program?
I don't think you're guaranteed anything by the C standard, which says (N1570, 7.21.4.1 2):
The remove function causes the file whose name is the string pointed to by filename
to be no longer accessible by that name. A subsequent attempt to open that file using that
name will fail, unless it is created anew. If the file is open, the behavior of the remove
function is implementation-defined.
So, if you had a pathological implementation, it could be interpreted, I suppose, to mean that calling remove() merely has the effect of making the file invisible to this running instance of this program, but that would be, as I said, pathological.
However, all is not utterly stupid! The POSIX specification for remove() says,
If path does not name a directory, remove(path) shall be equivalent to unlink(path).
If path names a directory, remove(path) shall be equivalent to rmdir(path).
And the POSIX documentation for unlink() is pretty clear:
The unlink() function shall remove a link to a file.
Therefore, unless your implementation (a) does not conform to POSIX requirements, and (b) is extremely pathological, you can be assured that the remove() function will actually try to delete the file, and will return 0 only if the file is actually deleted.
Of course, on most filesystems currently in use, filenames are decoupled from the actual files, so if you've got five links to an inode, that file's going to keep existing until you delete all five of them.
References:
The Open Group Base Specifications Issue 6, IEEE Std 1003.1, 2004 Edition
The Open Group Base Specifications Issue 7, IEEE Std 1003.1™, 2013 Edition
Note: "IEEE Std 1003.1 2004 Edition" is "IEEE Std 1003.1-2001 with corrigenda incorporated". "IEEE Std 1003.1 2013 Edition" is "IEEE Std 1003.1-2008 with corrigendum incorporated".
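What the C standard does promise can be checked from inside a program. The sketch below uses a hypothetical helper that creates a scratch file, removes it, and confirms that a later fopen() by that name fails. It says nothing about whether the data is still on disk, which is exactly the gap the question is about.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Creates `path`, removes it, and reports whether the name is gone.
 * Returns true only if remove() reported success and a later fopen()
 * for reading fails, which is the behaviour the C standard describes.
 */
bool remove_makes_name_unavailable(const char *path)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return false;
    fputs("scratch\n", f);
    if (fclose(f) != 0)
        return false;

    if (remove(path) != 0)   /* nonzero means the removal failed */
        return false;

    f = fopen(path, "r");
    if (f != NULL) {         /* the name still resolves to a file */
        fclose(f);
        return false;
    }
    return true;
}
```

Always check the return value of remove(); the standard only describes the effect of a call that succeeds.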
The C99 standard does not guarantee anything.
The file could remain there for any of the reasons unlink(2) can fail. For example, you may not have permission to remove it.
Consult http://linux.die.net/man/2/unlink for examples of everything that can go wrong.
On Unix / Linux, there are several reasons for the file not to be removed:
You don't have write permission on the file's directory (in that case, remove() will return a nonzero value, of course)
There is another hard link to the file. Then the file will remain on disk but only be accessible by the other path name(s)
The file is kept open by some process. In that case the directory entry is removed immediately, so that no subsequent open() can access the file (or an appropriate call will create a new file), but the file itself will remain on disk as long as any process keeps it open.
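The hard link case is easy to reproduce on a POSIX system. This is only a sketch: the function name and the file names you pass in are arbitrary, and it assumes the directory's filesystem supports hard links.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Creates `first`, hard-links it as `second`, then removes `first`.
 * Returns the link count seen through `second` afterwards, or -1 on error.
 */
long links_left_after_remove(const char *first, const char *second)
{
    struct stat st;
    long result = -1;

    FILE *f = fopen(first, "w");
    if (f == NULL)
        return -1;
    fputs("still here\n", f);
    fclose(f);

    (void)unlink(second);               /* drop a stale name from an earlier run */
    if (link(first, second) != 0) {     /* second name for the same file */
        (void)remove(first);
        return -1;
    }

    /* Remove the first name; the file stays reachable through the second. */
    if (remove(first) == 0 && stat(second, &st) == 0)
        result = (long)st.st_nlink;
    else
        (void)remove(first);            /* best-effort cleanup on failure */

    (void)remove(second);               /* now the last link goes away too */
    return result;
}
```

A result of 1 means remove() succeeded on the first name while the file itself was still on disk under the second name.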
Typically, that only unlinks the file from the file system. This means all the data that was in the file, is still there. Given enough experience or time, someone would be able to get that data back.
There are some options to not have the file be read again, ever. The *nix utility shred will do that. If you are looking to do it from within a program, open the file to write, and write nonsense data over what you are looking to 'remove'.
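A minimal sketch of the "overwrite, then remove" idea in standard C follows. It writes zeros rather than random data, and the function name is invented for the example. Treat it as an illustration only: on journaling or copy-on-write filesystems and on SSDs, the old blocks may survive an in-place overwrite, so this is not a substitute for a tool like shred.

```c
#include <stdio.h>

/*
 * Overwrites every byte of `path` with zeros, then removes the name.
 * Returns 0 on success, -1 on any failure.
 */
int overwrite_and_remove(const char *path)
{
    FILE *f = fopen(path, "r+b");       /* update mode: do not truncate */
    if (f == NULL)
        return -1;

    /* Find the file size by seeking to the end. */
    if (fseek(f, 0L, SEEK_END) != 0) { fclose(f); return -1; }
    long size = ftell(f);
    if (size < 0 || fseek(f, 0L, SEEK_SET) != 0) { fclose(f); return -1; }

    static const char zeros[4096] = {0};
    long left = size;
    while (left > 0) {
        size_t chunk = left < (long)sizeof zeros ? (size_t)left : sizeof zeros;
        if (fwrite(zeros, 1, chunk, f) != chunk) { fclose(f); return -1; }
        left -= (long)chunk;
    }

    if (fclose(f) != 0)                 /* fclose flushes the stdio buffer */
        return -1;
    return remove(path) == 0 ? 0 : -1;
}
```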

same name but with different case variable and function names in c

I have a variable named setlocal and a function named void SetLocal(void).
I am using the Keil C51 compiler to build the code and the linker gives the following error:
"EXTERNAL ATTRIBUT DO NOT MATCH PUBLIC"
Is it not possible to use the same name, differing only in case, for a function and a variable?
That particular compiler is for embedded systems (using the 8051 chips) and is really targeted for those environments. I've seen compilers in that arena that don't even support floating point, and Keil make it clear that, while it's based on C90, there are deviations from that standard.
As per the compiler limitations listed on the Keil website:
Names may be up to 255 characters long. The C language provides for case sensitivity in regard to function and variable names. However, for compatibility reasons, all names in the object file appear in capital letters. It is therefore irrelevant if an external object name within the source program is written in capital or small letters.
So it's a safe bet that, as far as the linker is concerned, you have a conflict between the setlocal variable and the SetLocal function, both of which would be seen as SETLOCAL.
That also explains (as stated in one of your comments) why changing the variable name to setlocal1 fixes your problem. While the symbols are not case sensitive, they are unique to 255 characters.
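The effect of the upper-casing can be modelled with a few lines of C. `same_after_uppercasing` is a made-up helper that mimics what the quoted Keil documentation describes; it is not part of any Keil tool, and it ignores the 255-character limit.

```c
#include <ctype.h>
#include <stdbool.h>

/* True if the two names become identical once every letter is upper-cased,
 * which is how they would appear in the object file. */
bool same_after_uppercasing(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return false;
        a++;
        b++;
    }
    return *a == *b;   /* both names must end at the same point */
}
```

Here `setlocal` and `SetLocal` compare equal (both become SETLOCAL), while `setlocal1` and `SetLocal` do not, which matches the behaviour you observed.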

Resources