How do I insert a date in Datomic db.type/instant?

Is there a neat way to save a Date into a Datomic attribute of type db.type/instant? For instance, there are d/tempid and d/squuid functions to produce a tempid and a squuid.

Datomic doesn't provide a function for generating dates, unlike tempid (which Datomic itself makes specific use of) and squuid (whose value differs from a standard random UUID: it embeds time information, which rules out some security-sensitive uses but gives better indexing performance).
In Clojure code you can use the #inst reader literal or (java.util.Date.). In Java code you can use the java.util.Date constructor as well (or a library that produces the same type).

Related

UUID for representing physical objects in a database

I am building a database for physical objects and trying to make sure I represent them in the most intelligent way possible.
I would like to generate a QR code with a UUID, and would like to have the following attributes:
Guaranteed to be unique
Ideally would allow there to be a canonical UUID for something, and then an iterable component of an instance of it that would allow for knowing that without looking at the database. For example, I might have 10 identical lamps; ideally the encoding would have one UUID for the lamp, then a way to identify each one. Maybe this would be a UUID with something additional tagged onto it? Or is it better to just link them together on the backend and use conventional UUIDs?
My understanding is that the best way to generate the UUID would be to do a version-5 using my domain to namespace it. I would also like to have a human-readable shortened version similar to the git short hash; I assume I could take the first 7-8 characters for that purpose then have a collision likelihood for that shortened version that is XXX (what would that likelihood be???).
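A minimal Python sketch of the scheme described above, using only the standard library. The domain example.com, the model name and all helper names are placeholders for illustration, not anything given in the question. It derives a version-5 namespace from a domain, one canonical UUID per model, a per-instance UUID from a serial number, and estimates the chance that at least two of n objects share the same short hex prefix using the standard birthday approximation.

```python
import math
import uuid

# Placeholder domain (assumption); substitute the domain you control.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")


def model_uuid(model_name: str) -> uuid.UUID:
    """Canonical version-5 UUID for a kind of object (e.g. one lamp model)."""
    return uuid.uuid5(NAMESPACE, model_name)


def instance_uuid(model_name: str, serial: int) -> uuid.UUID:
    """Version-5 UUID for one physical instance, derived from model + serial."""
    return uuid.uuid5(model_uuid(model_name), str(serial))


def short_id(u: uuid.UUID, length: int = 8) -> str:
    """Git-style short form: the first `length` hex characters."""
    return u.hex[:length]


def prefix_collision_probability(n_objects: int, length: int = 8) -> float:
    """Birthday approximation: chance that at least two of n_objects share
    the same `length`-character hex prefix, assuming uniform prefixes."""
    space = 16 ** length
    return 1.0 - math.exp(-n_objects * (n_objects - 1) / (2 * space))
```

Because version-5 UUIDs are deterministic, instance_uuid("lamp", 3) can be recomputed from the model name and serial without a database lookup. The model cannot be read back out of an instance UUID, though, so if the QR code alone must reveal the model, encoding the model UUID plus a serial number is the alternative.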

How to not expose base64 encoded UUIDs

I have a doubt regarding the exposure of internal database primary keys.
I have decided to use UUIDs in place of auto-increment longs (see here for details). This way, among other things, people cannot discover the relative size of my data or their growth over time.
Now, the UUID doesn't provide any internal information but it is not very URL friendly, although it is URL safe. Furthermore if long PKs shouldn't be exposed, then UUIDs shouldn't either.
Usually to make UUIDs more user friendly, people base64 encode them.
Example:
- UUID: 7b3149e7-bdab-4895-b659-a5f5b0d0
- base64: ezFJ572rSJW2WQAApfWw0A
My point is: anyone could still take that base64 string from the URL and decode it to obtain the original UUID. This means that even in this case the UUIDs would end up being exposed as well.
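To make the point concrete, here is a short Python sketch (standard library only; the helper names are illustrative) showing that unpadded URL-safe base64 of a UUID is a reversible encoding rather than a secret. It decodes the base64 string from the example above back into 16 raw bytes and prints them as a UUID.

```python
import base64
import uuid


def uuid_to_b64(u: uuid.UUID) -> str:
    """Encode the 16 raw UUID bytes as unpadded URL-safe base64 (22 chars)."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def b64_to_uuid(text: str) -> uuid.UUID:
    """Reverse the encoding: restore the padding, decode, rebuild the UUID."""
    padded = text + "=" * (-len(text) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


# Anyone who sees the URL can do this; no key or secret is involved.
recovered = b64_to_uuid("ezFJ572rSJW2WQAApfWw0A")
print(recovered)
```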
Should I use another type of encoding? Is there something already known, or should I create my own custom encoding? If so, should I follow any guidelines?
Thank you
At first glance, to give those identifiers a small degree of secrecy you could use a one-way hash function such as SHA-2 (a cryptographic function, not an encoding). However, this buys you no real security advantage.
If you rely only on object reference IDs for access control and try to keep them secret, I suggest you rethink your access control and authorization model.
It is good to have random, non-guessable, collision-free object reference IDs. However, relying on the secrecy of a reference ID for security is a serious flaw (the older OWASP Top 10 lists called this Insecure Direct Object References; the OWASP Top 10 2017 lists it under Broken Access Control). You need to consider the full AAA chain: authentication, authorization, and audit/accountability. Access should rely on a random, unique token with a short validity period, which is tied to a subject and then used to decide the authorization and access levels that permit the subject to interact with the objects they are entitled to.
The reason you aren't supposed to expose PKs is that they may (a) leak information and (b) allow people to guess other values. Neither is true of UUIDs (at least v3/4/5), which is one of the main reasons to use them in the first place. The human factor you mention is why so many folks use base64 (or other) encoding; it's not for security.
That said, you should never rely on URL secrecy as security; there are far too many ways that URLs leak, and your users may even do it intentionally--but they'd be very upset if sending a link to their friend meant that friend had full access to their account.

How to store "meta" source code in a database

I would like to store a computer program in a database instead of a number of text files. It should contain the structure and all objects, methods, dependencies etc. of the program. I do not want to store a specific language in the database but some kind of "meta" programming language. In a second step I would like to transform/export this structure in the database into either source code of a classic language (C#, Java, etc.) or compile it directly for CLR/JVM.
I think I am not the first person with this idea. I searched the internet and I think what I am looking for is called "source code in a database (SCID)" - unfortunately I could not find an implementation of this idea.
So my question is:
Is there any program that stores "meta" program code inside a database and lets you generate traditional text source code from it that can be compiled/executed?
Short remarks:
- It can also be a noSQL database
- I currently don't care how the program is imported/entered into the database
It sounds like you're looking for some kind of common markup language that adequately describes the common semantics of each target language - e.g. objects, functions, inputs, return values, etc.
This is less about storing in a database, and more about having a single (I imagine, XML-like) structure that can subsequently be parsed and eval'd by the target language to produce native source/bytecode. If there was such a thing, storing it in a database would be trivial -- that's not the hard part. Even a key/value database could handle that.
The hard part will be finding something that can abstract away the nuances of multiple languages and attempt to describe them in a common format.
Similar questions have already been asked, without satisfying solutions.
It may be that you don't need the full source, but instead just a description of the runtime data-- formats like XML and JSON are intended exactly for this purpose and provide a simplified description of Objects that can be parsed and mapped to native equivalents, with your source code built around the dynamic parsing of that data.
It may be possible to go further in certain languages. For example, if your language of choice converts to bytecode first, you might technically be able to store the binary bytecode in a BLOB and then run it directly. Languages that offer reflection and dynamic evaluation can probably handle this -- then your DB is simply a wrapper for storing that data on compilation, and retrieving it prior to running it. That'd depend on your target language and how compilation is handled.
Of course, if you're only working with interpreted languages, you can simply store the full source and eval it (in whatever manner is preferred by the target language).
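For the interpreted-language case, a minimal Python sketch (standard library only; the table, column names and stored snippet are made up for illustration) stores source text in SQLite, then loads, compiles and executes it. It also shows the bytecode-in-a-BLOB variant using marshal, whose format is tied to the Python version that produced it.

```python
import marshal
import sqlite3

# Illustrative program text; in practice this would come from your tooling.
SOURCE = "def greet(name):\n    return 'hello ' + name\n"

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE program (name TEXT PRIMARY KEY, source TEXT, bytecode BLOB)"
)

# Store both the source text and the compiled code object.
code = compile(SOURCE, "<db:greet>", "exec")
db.execute(
    "INSERT INTO program VALUES (?, ?, ?)",
    ("greet", SOURCE, marshal.dumps(code)),
)


def load_from_source(name):
    """Fetch source text, compile it, and run it in a fresh namespace."""
    (source,) = db.execute(
        "SELECT source FROM program WHERE name = ?", (name,)
    ).fetchone()
    namespace = {}
    exec(compile(source, f"<db:{name}>", "exec"), namespace)
    return namespace


def load_from_bytecode(name):
    """Fetch the marshalled code object and run it without recompiling."""
    (blob,) = db.execute(
        "SELECT bytecode FROM program WHERE name = ?", (name,)
    ).fetchone()
    namespace = {}
    exec(marshal.loads(blob), namespace)
    return namespace
```

Either loader returns a namespace from which the stored function can be called, for example load_from_source("greet")["greet"]("db"). Executing code loaded this way is only reasonable when everything written to the table is trusted.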
If you give more info on your intended use case, I'm sure you'll get some decent suggestions on how to handle it without having to invent a sourcecode Babelfish.

How do I correctly use libsodium so that it is compatible between versions?

I'm planning on storing a bunch of records in a file, where each record is then signed with libsodium. However, I would like future versions of my program to be able to check signatures the current version has made, and ideally vice-versa.
For the current version of Sodium, signatures are made using the Ed25519 algorithm. I imagine that the default primitive can change in new versions of Sodium (otherwise libsodium wouldn't expose a way to choose a particular one, I think).
Should I...
1. Always use the default primitive (i.e. crypto_sign)
2. Use a specific primitive (i.e. crypto_sign_ed25519)
3. Do (1), but store the value of sodium_library_version_major() in the file (either in a dedicated 'sodium version' field or a general 'file format revision' field) and quit if the currently running version is lower
4. Do (3), but also store crypto_sign_primitive()
5. Do (4), but also store crypto_sign_bytes() and friends
...or should I do something else entirely?
My program will be written in C.
Let's first identify the set of possible problems and then try to solve it. We have some data (a record) and a signature. The signature can be computed with different algorithms. The program can evolve and change its behaviour, the libsodium can also (independently) evolve and change its behaviour. On the signature generation front we have:
crypto_sign(), which uses some default algorithm to produce signatures (at the time of writing it just invokes crypto_sign_ed25519())
crypto_sign_ed25519(), which produces signatures using the specific Ed25519 algorithm
I assume that for one particular algorithm given the same input data and the same key we'll always get the same result, as it's math and any deviation from this rule would make the library completely unusable.
Let's take a look at the two main options:
Using crypto_sign_ed25519() all the time and never changing this. Not that bad an option, because it's simple: as long as crypto_sign_ed25519() exists in libsodium and its output is stable, you have nothing to worry about, with a stable fixed-size signature and zero management overhead. Of course, in the future someone could discover a horrible problem with this algorithm, and if you're not prepared to change the algorithm, that could mean a horrible problem for you.
Using crypto_sign(). With this we suddenly have a lot of problems, because the algorithm can change, so you must store some metadata along with the signature, which opens up a set of questions:
what to store?
should this metadata be record-level or file-level?
What do we have in mentioned functions for the second approach?
sodium_library_version_major() is a function to tell us the library API version. It's not directly related to changes in supported/default algorithms so it's of little use for our problems.
crypto_sign_primitive() is a function that returns a string identifying the algorithm used in crypto_sign(). That's a perfect match for what we need, because supposedly its output will change at exactly the time when the algorithm would change.
crypto_sign_bytes() is a function that returns the size of signature produced by crypto_sign() in bytes. That's useful for determining the amount of storage needed for the signature, but it can easily stay the same if algorithm changes, so it's not the metadata we need to store explicitly.
Now that we know what to store, there is the question of processing that stored data. You need to get the algorithm name and use it to invoke the matching verification function. Unfortunately, from what I can see, libsodium itself doesn't provide any simple way to get the proper function given the algorithm name (like EVP_get_cipherbyname() or EVP_get_digestbyname() in OpenSSL), so you need to write one yourself (which of course should fail for an unknown name). And if you have to write one yourself, it may be even easier to store a numeric identifier instead of the name from the library (more code, though).
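The lookup described above could look roughly like the following, sketched in Python for brevity even though the program in question is C (where the same idea would be a table of name and function-pointer pairs). The registry, the function names and the verifier body are hypothetical; only the string "ed25519", which is what crypto_sign_primitive() reports for the current default, comes from libsodium.

```python
def verify_ed25519(message: bytes, signature: bytes, public_key: bytes) -> bool:
    """Placeholder (assumption): a real program would call its libsodium
    binding here, e.g. the Ed25519-specific verification function."""
    raise NotImplementedError("wire this to your libsodium binding")


# Keys are the strings crypto_sign_primitive() reported when the file was written.
VERIFIERS = {
    "ed25519": verify_ed25519,
}


def verifier_for(primitive_name: str):
    """Return the verification function for a stored algorithm name,
    failing loudly for names this build does not know."""
    try:
        return VERIFIERS[primitive_name]
    except KeyError:
        raise ValueError(
            f"unsupported signature primitive: {primitive_name!r}"
        ) from None
```

Failing on an unknown name, rather than falling back to the current default, keeps an old build from silently misverifying files written by a newer one.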
Now let's get back to file-level vs record-level. To solve that there are another two questions to ask — can you generate new signatures for old records at any given time (is that technically possible, is that allowed by policy) and do you need to append new records to old files?
If you can't generate new signatures for old records or you need to append new records and don't want the performance penalty of signature regeneration, then you don't have much choice and you need to:
have dynamic-size field for your signature
store the algorithm (dynamic string field or internal (for your application) ID) used to generate the signature along with the signature itself
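A record-level layout along those lines might be sketched as follows (Python for brevity; the numeric ID, the header format and the helper names are assumptions made for this example, not anything defined by libsodium). Each signature is stored as a one-byte application-defined algorithm ID, a two-byte length, and then the signature bytes.

```python
import struct

ALG_ED25519 = 1  # hypothetical application-defined ID, not a libsodium constant

# Little-endian header: 1-byte algorithm ID, 2-byte signature length.
_HEADER = struct.Struct("<BH")


def pack_signature(alg_id: int, signature: bytes) -> bytes:
    """Serialize one signature field with its algorithm ID and length."""
    return _HEADER.pack(alg_id, len(signature)) + signature


def unpack_signature(blob: bytes):
    """Parse one signature field; return (alg_id, signature, remaining bytes)."""
    alg_id, sig_len = _HEADER.unpack_from(blob)
    start = _HEADER.size
    signature = blob[start:start + sig_len]
    if len(signature) != sig_len:
        raise ValueError("truncated signature field")
    return alg_id, signature, blob[start + sig_len:]
```

Storing the length explicitly means a reader can skip over a signature whose algorithm it does not recognise and still parse the rest of the file.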
If you can generate new signatures, or especially if you don't need to append new records, then you can get away with a simpler file-level approach: store the algorithm used in a dedicated file-level field and, if the signature algorithm changes, regenerate all signatures when saving the file (or keep using the old algorithm when appending new records; that's more of a compatibility policy question).
Other options? Well, what's so special about crypto_sign()? Its behaviour is not under your control: the libsodium developers choose the algorithm for you (no doubt they choose a good one). But if you have any versioning information in your file structure (not signature-specific, I mean), nothing prevents you from making your own choice and using one algorithm with one file version and another algorithm with another version (with conversion code when needed, of course). Again, that relies on the assumption that you can generate new signatures and that policy allows it.
Which brings us back to the original two choices, and to the question of whether all that is worth the trouble compared to just using crypto_sign_ed25519(). That mostly depends on your program's life span. I'd say (just as an opinion) that if it's less than 5 years, it's easier to just use one particular algorithm. If it can easily be more than 10 years, then no, you really need to be able to survive algorithm (and probably even whole crypto library) changes.
Just use the high-level API.
Functions from the high-level API are not going to use a different algorithm without the major version of the library being bumped.
The only breaking change one can expect in libsodium 1.x.y is the removal of deprecated/undocumented functions (that don't even exist in current releases compiled with the --enable-minimal switch). Everything else will remain backward compatible.
New algorithms might be introduced in 1.x.y versions without high-level wrappers, and will be stabilized and exposed via a new high-level API in libsodium 2.
Therefore, do not bother calling crypto_sign_ed25519(). Just use crypto_sign().

google app engine computedProperty: when to use? When not to use?

When would using a ComputedProperty (ndb) in google app engine give you a distinct advantage over just computing when needed, on the backend (such as in a handler), without the datastore being involved?
Everything I'm reading seems to indicate that it's mostly useless, and would just slow queries down (at least the put operation if nothing else).
Thoughts?
I did see this:
"Note: Use ComputedProperty if the application queries for the computed value. If you just want to use the derived version in Python code, define a regular method or use Python's @property built-in."
but that doesn't really explain any advantage (why query if you can derive?)
The documentation is quite clear in that regard, and I'll cite it again for reference, from the Computed Properties section:
Note: Use ComputedProperty if the application queries for the computed value. If you just want to use the derived version in Python code, define a regular method or use Python's @property built-in.
When to use it? When you need to query some derived data, it needs to be written to the datastore so it gets indexed.
First example that came to mind: You're already storing the birthday of a user, but also need to filter by actual age, adding a property to derive that value might be the easiest and most efficient solution:
age = ndb.ComputedProperty(lambda self: calc_age(self.birthday))
Of course you could just have a function that returns the age, but that's only useful after you have fetched the entity; it can't be used in queries.
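The answer does not define calc_age; one plausible definition, as a plain-Python sketch (the optional today parameter is an addition here to make the function testable), counts whole years and subtracts one if this year's birthday has not happened yet.

```python
from datetime import date
from typing import Optional


def calc_age(birthday: date, today: Optional[date] = None) -> int:
    """Whole years elapsed between birthday and today (defaults to date.today())."""
    today = today or date.today()
    # True once this year's birthday has been reached.
    had_birthday = (today.month, today.day) >= (birthday.month, birthday.day)
    return today.year - birthday.year - (0 if had_birthday else 1)
```

One caveat worth checking against the ndb documentation: as far as I know, the value stored and indexed for a ComputedProperty is the one computed when the entity is written, so a stored age would go stale until the entity is put again.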
