Serialise and export the transaction to a base64 format and use this as the input to the next step - hedera-hashgraph

I need to make a scheduled transfer from one account to another and then serialise it to base64. A second step then reads the serialised format, provides the required signature, and submits it.
I've created the scheduled transfer, but I don't know how to turn it into a serialised base64 format.

Using the SDKs, you can create a transaction object (e.g. a CryptoTransferTransaction) and convert it to bytes, which you can then encode as base64 (or any encoding of your choice) using available libraries.
The receiver of the base64 payload can then convert back to bytes and deserialize (e.g. const transaction = Transaction.fromBytes(transactionBytes);) the transaction back to an object.
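As a minimal sketch with the JavaScript SDK (@hashgraph/sdk) - the account IDs, operator credentials and amounts below are placeholders, and method names should be checked against your SDK version:

const {
    Client,
    TransferTransaction,
    ScheduleCreateTransaction,
    Transaction,
    Hbar,
} = require("@hashgraph/sdk");

// Placeholder operator credentials.
const client = Client.forTestnet().setOperator(operatorId, operatorKey);

// Wrap a transfer in a schedule-create transaction and freeze it
// so it can be serialised.
const scheduleTx = new ScheduleCreateTransaction()
    .setScheduledTransaction(
        new TransferTransaction()
            .addHbarTransfer("0.0.1111", new Hbar(-10))
            .addHbarTransfer("0.0.2222", new Hbar(10))
    )
    .freezeWith(client);

// Serialise to bytes, then encode as base64 for transport.
const base64 = Buffer.from(scheduleTx.toBytes()).toString("base64");

// --- receiving side ---
const restored = Transaction.fromBytes(Buffer.from(base64, "base64"));
// restored can now be signed and submitted:
// await (await restored.sign(receiverKey)).execute(client);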
You may find these examples useful:
https://github.com/hashgraph/hedera-sdk-js/blob/develop/examples/schedule-example.js
and
https://github.com/hashgraph/hedera-sdk-js/blob/develop/examples/multi-sig-offline.js
Also note that ScheduleGetInfoQuery will return the body of the transaction that's been scheduled, so you can technically share the scheduleId alone; the other party can pull the transaction and sign it after verifying that it corresponds to their expectations.
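If you go the scheduleId route, the counterparty's side might look roughly like this sketch (ScheduleInfoQuery and ScheduleSignTransaction are the JavaScript SDK counterparts of ScheduleGetInfoQuery; client, scheduleId and receiverKey are placeholders, and the exact shape of the returned info varies between SDK versions):

const { ScheduleInfoQuery, ScheduleSignTransaction } = require("@hashgraph/sdk");

// Inspect the scheduled transaction before signing.
const info = await new ScheduleInfoQuery()
    .setScheduleId(scheduleId) // shared out of band
    .execute(client);
// info exposes the scheduled transaction body for verification
// (e.g. info.scheduledTransaction in recent versions - check your SDK).

// If it matches expectations, add the missing signature.
const signTx = await new ScheduleSignTransaction()
    .setScheduleId(scheduleId)
    .freezeWith(client)
    .sign(receiverKey);
await (await signTx.execute(client)).getReceipt(client);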

Related

Is there a reason why not to store encrypted data as binary in a database?

I have to store AES-GCM encrypted data in a database. Currently we use MariaDB, with the option to change to PostgreSQL later (however, other databases should be considered as well).
Since the algorithm does not actually encrypt strings, but bytes and the output of an encryption algorithm is also a byte[], why not store the encrypted data directly in a binary column?
For MariaDB/MySql that would be as a BLOB. I understand PostgreSQL even has a preferred special data type for encrypted data called bytea.
However most programmers seem to encode the encrypted bytes as Base64 instead and store the resulting string in a VARCHAR.
Encoding to and decoding from Base64 seems counterintuitive to me. It makes the data about 33% longer and is an extra step each time. It also forces the database to apply a character encoding when storing and retrieving the data. This is an extra step and surely costs extra time and resources, while all we really need to store are some bytes. The encrypted data makes no sense in any character encoding anyway.
Question:
Is there any good reason for or against storing encrypted data as binary in a database? Is there a security, data integrity or performance reason why I may not want to store the encrypted data directly as binary?
(I assume this question will shortly be closed as "opinion based" - but nevertheless)
Is there any good reason for or against storing encrypted data as binary in a database
No. I don't see any reason against using a proper "blob" type (BLOB, bytea, varbinary(max), ....)
The general rule of thumb is: use the data type that matches the data. So BLOB (or the equivalent type) is the right choice.
Using base64-encoded strings might be justified because not all libraries (obfuscation layers like ORMs) are able to deal with "blobs" correctly, so people choose something that is universally applicable (ignoring the overhead in storage and processing).
Note that Postgres' bytea is not "a special type for encrypted data". It's a general purpose data type for binary data (images, documents, music, ...)
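To put numbers on the overhead point, a quick Node.js sketch with random bytes standing in for AES-GCM ciphertext:

const crypto = require("crypto");

// 1 KiB of "ciphertext" - random bytes stand in for AES-GCM output.
const cipherBytes = crypto.randomBytes(1024);
const asBase64 = cipherBytes.toString("base64");

console.log(cipherBytes.length); // 1024 bytes in a BLOB/bytea column
console.log(asBase64.length);    // 1368 characters (~33% more) in a VARCHAR

// The round trip is lossless, but it is an extra step in each direction.
console.log(Buffer.from(asBase64, "base64").equals(cipherBytes)); // true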

what is the format of this image value?

It has the image data type in the database:
0xFFD8FFE000104A4649460001020101C201C20000FFE11A004578696600004D4D002A000000080007011200030000000100010000011A00050000000100000062011B0005000000010000006A012800030000000100020000013100020000001C0000007201320002000000140000008E8769000400000001000000A4000000D00044AA20000027100044AA200000271041646F62652050686F746F73686F70204353332057696E646F777300323031373A31313A32382031353A33353A30350000000003A00100030000000100010000A002000400000001000001F4A003000400000001000001F40000000000000006010300030000000100060000011A0005000000010000011E011B0005000000010000012601280003000000010002000002010004000000010000012E0202000400000001000018CA0000000000000048000000010000004800000001FFD8FFE000104A46494600010200004800480000FFED000C41646F62655F434D0001FFEE000E41646F626500648000000001FFDB0084000C08080809080C09090C110B0A0B11150F0C0C0F1518131315131318110C0C0C0C0C0C110C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C010D0B0B0D0E0D100E0E10140E0E0E14140E0E0E0E14110C0C0C0C0C11110C0C0C0C0C0C110C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0C0CFFC000110800A000A003012200021101031101FFDD0004000AFFC4013F0000010501010101010100000000000000030001020405060708090A0B0100010501010101010100000000000000010002030405060708090A0B1000010401030204020507060805030C33010002110304211231054151611322718132061491A1B14223241552C16233347282D14307259253F0E1F163733516A2B283264493546445C2A3743617D255E265F2B384C3D375E3F3462794A485B495C4D4E4F4A5B5C5D5E5F55666768696A6B6C6D6E6F637475767778797A7B7C7D7E7F711000202010204040304050607070605350100021103213112044151617122130532819114A1B14223C152D1F0332462E1728292435315637334F1250616A2B283072635C2D2449354A317644555367465E2F2B384C3D375E3F34694A485B495C4D4E4F4A5B5C5D5E5F55666768696A6B6C6D6E6F62737475767778797A7B7C7FFDA000C03010002110311003F00D149249586BA92492494A49243C8C8A31A87E464D82AA6B12FB1DC0FFC939DFBA9292004900092780164E7FD68E938363A9DCEC9B99A3994C6D07C1D6BBD9FE66F5CF75DFADB7E61FB3F4F2FC5C5FCE7F16593FBDB3F9BAFFE0DAEFF008C5CF7B8C1EC784C33ECAA25EAAFFAF57FFDA7C4ADA3C6C739DF93D359F6FD6BEB963CBD9946B9FCCADAD0D1FF004566D3819570058C304F3DBF1471D133E34638C41DA019FEB6D519CBDE4178C332341229FF00E73F5F000FB6BFBEB0D3FF007D57F17EBB750AC3464D55E48D64C7A6F3FDA67B3FF0358B6F4CCBAE7756E1CC8DA7440B68B2BFA4D2207088C9E2838E437043E8581F587A467C0AEF14DA449AAF861F93CFE8DCB4A383D8EA0F623C9793C763C1F9ADBFABFF00596EE984E3DE1F918A46956ED5841FA74EE9DBFF00169E27DD6EA1EF5241C4CAC7CCC6AF2B19DBE9B44B49D0823E931EDFCD7B1193D4A4924925292492494FFFD0D149249586BA92492494A5C4FD71EA8FC8CF3D3D8628C33B5C01D1D6912F7BBFE2BF9A62EDD8407B49E0193F01AAF2BCD7B6CCBBDEC3BD8EB1EE6B8E8482E27726CCE8AEA85AD71700D04B8C468BA1E8DF56ACBDCDB721BEC3F4587BFDC81F577A59BEDFB4BC7B1BA3477927FF0022BBDC3A006EE81CC2A79
The image data type is an old data type from the Sybase days. Applications aren't supposed to use it any more.
However, all it was used for was storing BLOBs of any type, not just actual images (as in photos). There's no built-in function that's going to tell you what it was. It could be a .jpg, a .wav, a word document, anything.
If it really was an image, you could pull it in to a .NET object and use an image library to try to identify it. But you wouldn't want to be trying to do that in SQL Server. Even in SQL CLR, most of the libraries that you'd need can't be loaded. For example, you can't load System.Drawing.
As other comments and answers have said, the IMAGE data type is used to store arbitrary binary data, and neither restricts nor stores any information about the meaning of that data.
The representation you have there is not base64 (as implied by your tags) but simply hexadecimal: each pair of characters after the leading 0x represents a single byte of data.
If you extracted the value as raw bytes, or converted the hexadecimal representation back to the raw bytes, you could save it to disk as a file and use a utility to try to guess the format. For instance, there is a standard Unix utility called file, which has a lookup table of "signatures" for different file formats, usually based on the first few bytes of the file.
However, without context, this could be a binary format specific to a particular application, such as a game save state, in which case no generic tool will be able to tell you much about it. Ultimately, it's just a way of representing a long series of bits, and could be absolutely anything.
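That said, the value in the question starts with FF D8 FF E0 ... 4A 46 49 46 ("JFIF"), which matches the JPEG/JFIF signature, so in this case it very likely is a photo. A minimal Node.js sketch of such a signature check (the table below is deliberately tiny; a real tool like file knows hundreds of formats):

// Decode a SQL Server-style 0x... hex literal and guess the format
// from its leading "magic" bytes.
const knownSignatures = [
    { name: "JPEG", magic: Buffer.from([0xff, 0xd8, 0xff]) },
    { name: "PNG",  magic: Buffer.from([0x89, 0x50, 0x4e, 0x47]) },
    { name: "GIF",  magic: Buffer.from("GIF8", "ascii") },
    { name: "PDF",  magic: Buffer.from("%PDF", "ascii") },
];

function guessFormat(hexLiteral) {
    const bytes = Buffer.from(hexLiteral.replace(/^0x/i, ""), "hex");
    const match = knownSignatures.find(
        (s) => bytes.subarray(0, s.magic.length).equals(s.magic)
    );
    return match ? match.name : "unknown";
}

console.log(guessFormat("0xFFD8FFE000104A464946")); // "JPEG"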

How to store strings in database that can be in different encodings

I have an application that processes email and stores the subject / message body in a database.
This information is used to preview the emails in a web application before opening the email from disk in Outlook.
Emails can be sent with different text-encodings.
To store the subject and body content in the database, I feel I need to store the encoding used in the email, and then I have two options:
store them in binary form
store them base64 encoded
(from: Safely Store HTML in DB without affecting Character encoding)
This is because I fear I will lose data due to the collation on the column/table in SQL Server (since the encoding is row-specific in my case).
In this particular case I don't care much for issues regarding searching / sorting the data in the ui.
Another option might be to convert the incoming characters to a specific encoding (UTF-8, for instance).
Since I don't feel I fully understand the problem domain (and yes I did read http://www.joelonsoftware.com/articles/Unicode.html) I'd like to know if there is a better way or maybe I'm concerned for nothing.
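If you go with normalising to UTF-8 on the way in, the conversion step might look like this Node.js sketch (assuming the widely used iconv-lite package; the charset comes from the email part's Content-Type header, and rawBody is a hypothetical name):

const iconv = require("iconv-lite"); // npm install iconv-lite

// rawBody: Buffer with the decoded MIME part's bytes.
// charset: declared in the part's Content-Type header,
// e.g. "iso-8859-1" or "windows-1252".
function toUnicode(rawBody, charset) {
    // Decode from the declared charset into an ordinary JS string,
    // which can then be stored in a Unicode (NVARCHAR) column
    // regardless of what encoding the email arrived in.
    return iconv.decode(rawBody, charset || "utf-8");
}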

How to store write-once, read-rarely data

I have an application which produces a large amount of data, that is all written once and then unchangeable (by law), and is rarely ever read. When it is read, it is always read in its entirety, as in, all the data for 2012 is read in one shot, and either processed for reporting or output in a different format for export (or gasp printed). The only way to access the data is to access an entire day's worth of data, or more than one day.
This data is easily represented as either two or three relational tables, or as a long list of self-contained documents.
What is the most storage-space-efficient way to store such data? Specifically, we're thinking of using Amazon S3 (object storage), though we could use something like Amazon RDS (their managed MySQL service).
My current best bet is a gzipped file with JSON data for the entire day, one file per day.
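A minimal sketch of that approach with Node's built-in zlib (the file naming and record shape are made up for illustration; newline-delimited JSON keeps one record per line):

const fs = require("fs");
const zlib = require("zlib");

// One gzipped file per day, one JSON record per line.
function writeDailyArchive(date, records) {
    const ndjson = records.map((r) => JSON.stringify(r)).join("\n");
    fs.writeFileSync(`${date}.ndjson.gz`, zlib.gzipSync(ndjson));
}

// Reads are always a whole day at once, matching the access pattern.
function readDailyArchive(date) {
    return zlib.gunzipSync(fs.readFileSync(`${date}.ndjson.gz`))
        .toString("utf-8")
        .split("\n")
        .map(JSON.parse);
}

writeDailyArchive("2012-06-01", [{ id: 1, amount: 42 }]);
console.log(readDailyArchive("2012-06-01"));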
Unless my data was pure ASCII (and even if it was), I would probably choose a binary storage method like one of
BSON
Protocol Buffers
Bencode
I would use Windows Azure's Table Storage because it allows heterogeneous structured data to be stored in a single table. Having database-like storage will allow you to append data as needed. You can easily create a new table for each year.

Does Google AppEngine store binary data in the datastore as plain 8-bit or as UTF-8?

I guess the title says it all, but let me elaborate a little:
Basically, if I store (uniformly distributed) 8-bit data into the AppEngine Datastore, can I expect to use up 1 Byte of storage for every 1 byte in my byte[], or is there some encoding overhead, like for example 50% overhead if it was UTF-8 encoded (since values 128..255 take up 2 bytes)?
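To make the feared worst case concrete, here is a quick Node.js sketch, independent of the datastore itself:

const crypto = require("crypto");

// Uniformly distributed 8-bit data: about half the bytes are >= 0x80.
const raw = crypto.randomBytes(1000);

// Treat each byte as a code point (latin1) and re-encode as UTF-8:
// bytes 0x80..0xFF become two bytes each.
const asUtf8 = Buffer.from(raw.toString("latin1"), "utf-8");

console.log(raw.length);    // 1000
console.log(asUtf8.length); // ~1500, i.e. the ~50% overhead in question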
In the docs it says:
BlobProperty - Uninterpreted byte string
Not sure if this means 8-bit bytes (uninterpreted byte) or if this implies some encoding (string)...
Unfortunately I can't find the source code for the setStringValueAsBytes(...) method in the PropertyValue class. It probably holds the answer...
PS: This question is probably independent of whether you use Python, Java, Go, or PHP, but in case it matters: I'm using the Java API.
This info isn't entirely published, but Google has mentioned that the HRD is built on top of BigTable, and we also know that internally Google isn't really using BigTable, but Megastore, a more advanced development.
Now we don't really know exactly what the HRD is running on, but I think if you read up on how BigTable and the Megastore work, you might get a pretty good idea.
http://research.google.com/archive/bigtable.html
http://research.google.com/pubs/pub36971.html
Now to answer your question: I have no idea why you'd think Google would store binary data as UTF-8. It would be a nonsensical design to begin with, and it would be very surprising for Google to implement it that way. So I highly doubt they do.
More realistically, most data storage systems have some minimum block size that is allocated for any block of data. The BigTable whitepaper mentions that it's configurable but it defaults to 64KB. We don't know if the HRD uses this default or some other tuned value.
In any case, I don't think you should worry about your binary data being stored as UTF-8. That's extremely unlikely. However, it's highly likely that your data will take up some minimum block size. Keep in mind that your entities are stored along with their attribute names, so there's going to be that overhead. But most likely your overhead will be your entity squeezed into the lowest block size rather than any UTF-8 worries. It is realistic to worry that your attribute names might be stored in UTF-8, so I'd avoid having extended characters in the attribute names.
FINAL UPDATE:
I had some time to actually test this out by first creating 1024 entities, each with a Blob property consisting of 100 KB of random 8-bit data, waiting FIVE days for all the statistics to update in the AppEngine console, and then replacing the properties on each entity with 200 KB of random 8-bit data. The difference in Datastore Stored Data under Quota Details and the difference in Total Size under Datastore Statistics were both exactly 100 MB, so there is no overhead. If the data were UTF-8 encoded, the difference would have been about 150 MB.
So the answer is:
The AppEngine Datastore stores 8-bit binary data as plain 8-bit bytes WITHOUT encoding.
Good...
One side note: "1 GByte" of Datastore Stored Data in the quotas corresponds to 1024³ bytes (the original definition of GB, which is now often called GiB), not 10⁹ (the metric interpretation of giga). Yay! 7.3% more storage for our money! And that's why I like Google better than Western Digital... :)
ORIGINAL ANSWER:
I finally did find some sparse documentation to shed some light on this question:
The datastore defines a set of data types that it supports: str, int32, int64, double, and bool.
This part of the documentation states that "[i]nternally, all str values are UTF-8 encoded and sorted by UTF-8 code points". [1]
Now, the Python documentation [2] of the Types and Property Classes defines the class Blob as "a subclass of the built-in str type" and says that "this value is stored as a byte string and is not encoded as text".
While this is far from 100% clear (does "not encoded as text" really mean "not encoded as UTF-8"?), it seems to suggest that the data does remain as 8-bit bytes in the datastore when physically saved.
Rather than saying that this is a better answer than #dragonx's, I will take this as further evidence that he is correct, especially since I completely agree with his point that it would be a nonsensical design to begin with, and very surprising for Google to implement it that way.
Maybe one day I'll do an actual test. Until then, I will hope that Google is indeed not doing this. A 50% storage-cost overhead on any binary data in the datastore should be painful enough for them to want to avoid it...
Asides:
[1] This is why I was worried about this topic in the first place.
[2] This is probably why I missed this info. The equivalent Java docs don't really mention this.
UPDATE:
Found some more supporting evidence:
Halfway through the "Entities Table" section (at the end of the "Properties" section), it says:
"Internally, App Engine stores entities in protocol buffers, efficient mechanisms for serializing structured data; see the open source project page for more details."
The open source project contains a file called CodedOutputStream.java that (I think) is responsible for actually assembling the binary data that makes up the stored protocol buffer. Among other methods, it defines writeString(...) and writeByteArray(...), which both call their respective write...NoTag(...) methods.
In writeStringNoTag(...) we finally find the line that does the UTF-8 encoding for Strings:
final byte[] bytes = value.getBytes("UTF-8");
This line does not exist in writeByteArrayNoTag(...).
I think this implies the following:
When I store a Blob, it ends up using the writeByteArray(...) method as I see no other reason for its existence and the Blob class stores its data internally in a byte[] rather than a String.
=> So no encoding here...
As writeStringNoTag(...) performs the UTF-8 encoding for Strings, it is likely that any encoding is done by the protocol buffer library.
=> So no further encoding later either...
Now, is all this enough to contradict "[i]nternally, ALL str values are UTF-8 encoded" and "Binary data [...] is a subclass of the built-in str type", which taken together would imply that binary data is also UTF-8 encoded?
I think so...
