What are the limits on pact named identifiers? - pact-lang

When I write a function name like create-account or a module name MyModule, what are the allowed characters?
Convention seems to suggest dull lower-case dashed alphanumeric names but apparently you can use most ASCII characters except "().:[]'\ and space. Are there any other limitations? Eg length

Related

Regex to find all calls of the certain function and NOT its declaration

I need to write a regex which will match only lines with the C function call, not its declaration.
So, I need it to match only lines, where funcName() is not preceeded by int, double, float, char etc. and an arbitrary number of spaces.
The problem is, I can run into following expressions:
printf("Hello"); int f() {return 1;};
So I must consider even the situation, where there are some other characters before the date-type name.
myStruct f();
In this situation I want regex to match it, ONLY basic data-types should be excluded.
So far I've got to this expression:
^(?!(void|int|double|char))\s*f\(\).*$
But I have no idea, how to take care of the situation with characters before the type name.
The following regex meets your specs:
(^|((^|\s)(?!(void|int|double|char))[^\s]+)\s+)([a-zA-Z_]+\(\)?)
The function name is defined by a character class containing letters and the underscore.
The line starts with the function call, or
the line contains at least one non-whitespace character before the function name. In that case ...
this non-WS sequence does not match the excluded keywords
there is at least 1 WS character before the function name
See the live demo at regex101.
Caveat
As several commentors have noted, this is not a robust solution. It will work for a tightly constrained set of function call and declaration patterns only.
A general regex-based solution (if possible at all, which would heavily depend on the regex engine features available) will be of theoretical interest only as it had to mimic completely the C preprocessor.

Database name restrictions

What I wonder what are the database, table, and column naming restrictions in databases? (MySQL, MSSQL, Oracle, etc.)
To give an example:
database_name
columnName
TableName
column_1
table.name
we can use the symbol "_" and also there no problem with numbers, etc.
but we can not use "." when we are naming.
The rules for the format of regular identifiers depend on the database compatibility level. This level can be set by using ALTER DATABASE . When the compatibility level is 100 , the following rules apply:
The first character must be one of the following:
A letter as defined by the Unicode Standard 3.2. The Unicode definition of letters includes Latin characters from a through z, from A through Z, and also letter characters from other languages.
The underscore (_), at sign (#), or number sign (#).
Certain symbols at the beginning of an identifier have special meaning in SQL Server. A regular identifier that starts with the at sign always denotes a local variable or parameter and cannot be used as the name of any other type of object. An identifier that starts with a number sign denotes a temporary table or procedure. An identifier that starts with double number signs (##) denotes a global temporary object. Although the number sign or double number sign characters can be used to begin the names of other types of objects, we do not recommend this practice.
Some Transact-SQL functions have names that start with double at signs (##). To avoid confusion with these functions, you should not use names that start with ##.
Subsequent characters can include the following:
Letters as defined in the Unicode Standard 3.2.
Decimal numbers from either Basic Latin or other national scripts.
The at sign, dollar sign ($), number sign, or underscore.
The identifier must not be a Transact-SQL reserved word. SQL Server reserves both the uppercase and lowercase versions of reserved words.
Embedded spaces or special characters are not allowed.
Supplementary characters are not allowed.
When identifiers are used in Transact-SQL statements, the identifiers that do not comply with these rules must be delimited by double quotation marks or brackets.
Please read this article about DataBase Naming Conventions:
https://launchbylunch.com/posts/2014/Feb/16/sql-naming-conventions/

what does the `TEXT` around the format string mean in "printf"

The following prints the percentage of memory used.
printf (TEXT("There is %*ld percent of memory in use.\n"),
WIDTH, statex.dwMemoryLoad);
WIDTH is defined to be equal to 7.
What does TEXT mean, and where is this sort of syntax defined in printf?
As others already said, TEXT is probably a macro.
To see what they become, simply look at the preprocessor output. If are using gcc:
gcc -E file.c
Just guessing but TEXT is a char* to char* function that takes care of translating a text string for internationalization support.
Note that if this is the case then may be you are also required to always use TEXT with a string literal (and not with expressions or variables) to allow an external tool to detect all literals that need translations by a simple scan of the source code. For example may be you should never write:
puts(TEXT(flag ? "Yes" : "No"));
and you should write instead
puts(flag ? TEXT("Yes") : TEXT("No"));
Something that is instead standard but not used very often is the parameteric width of a field: for example in printf("%*i", x, y) the first parameter x is the width used to print the second parameter y as a decimal value.
When used with scanf instead the * special char can be used to specify that you don't want to store the field (i.e. to "skip" it instead of reading it).
TEXT() is probably a macro or function which returns a string value. I think it is user defined and does some manner of formatting on that string which is passed as an argument to the TEXT function. You should go to the function declaration for TEXT() to see what exactly it does.
TEXT() is a unicode support macro defined in winnt.h. If UNICODE is defined then it prepends L to the string making it wide.
Also see TEXT vs. _TEXT vs. _T, and UNICODE vs. _UNICODE blog post.
_TEXT() or _T() is a microsoft specific macro.
This MSDN link says
To simplify code development for various international markets,
the Microsoft run-time library provides Microsoft-specific "generic-text" mappings for many data types, routines, and other objects.
These mappings are defined in TCHAR.H.
You can use these name mappings to write generic code that can be compiled for any of the three kinds of character sets:
ASCII (SBCS), MBCS, or Unicode, depending on a manifest constant you define using a #define statement.
Generic-text mappings are Microsoft extensions that are not ANSI compatible.
_TEXT is a macro to make a strings "character set neutral".
For example _T("HELLO");
Characters can either be denoted by 8 bit ANSI standards or the 16 bit Unicode notation.
If you define _TEXT for all strings and define a preprocessor symbol "_UNICODE", all such strings will follow UNICODE encoding. If you don’t define _UNICODE, the strings will all be ANSI.
Hence the macro _TEXT allows you to have all strings as UNICODE or ANSI.
So no need to change every time you change your character set.

Which ASCII characters are forbidden for use in SGML attributes?

Apart from whitespace, quotation mark, equal sign, and tab, which other characters of the printable subset of ASCII are forbidden to be used as attribute names in SGML?
By default, SGML allows only alphanumeric values for SGML names. What additional characters are allowed for SGML names is controlled by the SGML declaration; specifically UCNMCHAR and LCNMCHAR under NAMING.
For example, if you look at the SGML declaration for HTML 4, you'll see:
LCNMCHAR ".-_:"
UCNMCHAR ".-_:"
This means that the characters ., -, _, and : are also allowed in SGML names (element/attribute/entity/etc).
NOTE: Only a letter is allowed as the first character of an SGML name.

C universal macro names - gcc -fextended-identifiers

I'm looking for how can I write identifiers name with characters like [ ' " or #.
Everytime that I try to do that, I give the error:
error: macro names must be identifiers
But learning about gcc, I found this option:
-fextended-identifiers
But it seems not working like I wanted, please, somebody know how to accomplish that?
Identifiers can't include such characters. It is defined that way in the language syntax, identifiers are letters, digits or underline (and mustn't begin with a digit to avoid ambiguity with litteral numbers).
If it was possible this would conflict with the C compiler (that uses [ for arrays) and C preprocessor syntax (that uses #). Extended identifiers extension only allow using characters non forbidden by the language syntax inside identifiers (basically unicode foreign letters, etc.).
But if you really, really want to do this, nothings forbids you to preprocess your source files with your own "extended macro preprocessor", practically creating a new "C like" language. That looks like a terrible idea, but it's not really hard to do. Then you'll see soon enough by yourself why it's not a good idea...
According to this link, -fextended-identifiers only enables UTF-8 support for identifiers, so it won't help in your case.
So, answer is: You can't use such characters in macro identifiers.
Even if the extended identifier characters support was fully enabled, it wouldn't help you get characters such as:
[ ' " #
enabled for identifiers. The standard allows 'universal character names' or 'other implementation-defined characters' to be part of an identifier, but they cannot be part of the basic character set. Out of the basic character set, only _, letters and digits can be part of an identifier name (6.4.2.1 Identifiers/General).

Resources