Undocumented transaction types - coinbase-api

While integrating with the Transactions API I noticed the following undocumented types:
pro_deposit: seems to be equivalent to exchange_deposit
pro_withdrawal: seems to be equivalent to exchange_withdrawal
order: similar to a send but tracks a payment to a merchant
delayed_canceled: not sure what it represents
Can anyone point me to an updated list of transaction types and/or explain what delayed_canceled transactions represent?

Related

Google Datastore app architecture questions

I'm working on a Google AppEngine app connecting to the Google Cloud Datastore via its JSON API (I'm using PHP).
I'm reading all the documentation provided by Google and I still have questions:
In the documentation about Transactions, there is the following mention: "Transactions must operate on entities that belong to a limited number (5) of entity groups" (BTW, a few lines later we can find: "All Datastore operations in a transaction can operate on a maximum of twenty-five entity groups"). I'm not sure what an entity group is. Let's say I have an object Country which is identified only by its kind (COUNTRY) and an auto-assigned key id, so there is no ancestor path, hierarchical relationships, etc. Do all the Country entities count as a single entity group, or does each country count as its own?
For the Country entity kind I need an incremental unique id (like SQL AUTOINCREMENT). It has to be absolutely unique and without gaps. Also, this kind of object won't be created more than a few times per minute, so there is no need to handle contention and sharding. I'm thinking about having a single counter that reflects the auto-increment and using it inside a transaction. Is the following pattern OK?
Start a transaction, get the counter, and commit the creation of the Country along with the update of the counter; roll back the transaction if the commit fails. Does this pattern prevent two entities from being assigned the same id? Can you confirm that if two processes get the counter at the same time (so the same value), the first one to commit will make the other fail (so it can restart and get the new counter value)?
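A sketch of what I mean, written with the Python google-cloud-datastore client just to illustrate (my app is PHP, and the kind/property names here are my own):

from google.cloud import datastore

client = datastore.Client()

def create_country(name):
    # One well-known counter entity; "Counter"/"country" are illustrative names.
    counter_key = client.key("Counter", "country")
    with client.transaction():
        counter = client.get(counter_key)
        if counter is None:
            counter = datastore.Entity(key=counter_key)
            counter["value"] = 0
        counter["value"] += 1
        country = datastore.Entity(key=client.key("Country", counter["value"]))
        country["name"] = name
        # Both writes commit atomically; if a concurrent transaction committed
        # the counter first, this commit fails with a conflict and can be retried.
        client.put_multi([counter, country])
        return counter["value"]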
The documentation also mentions: "If your application receives an exception when attempting to commit a transaction, it does not necessarily mean that the transaction has failed. It is possible to receive exceptions or error messages even when a transaction has been committed and will eventually be applied successfully"!? How are we supposed to handle that case? If this behavior occurs on the creation of my Country (question #2), won't I have an issue with my auto-increment id?
Since the Datastore requires all the write actions of a transaction to be done in a single call, and since the transaction ensures that either all or none of its actions are performed, why do we have to roll back?
Is the limit of 1 write per second on a single entity (i.e. something defined by its kind and key path), or on a whole entity group? (I'll only be reassured when I'm sure what exactly an entity group is ;-) see question #1.)
I'm stopping here so as not to make a huge post. I'll probably come back with other (or refined) questions after getting answers to these ones ;-)
Thanks for your help.
[UPDATE] Country is just used as a sample class object.
No, ('Country', 123123) and ('Country', 679621) are not in the same entity group. But ('Country', 123123, 'City', '1') and ('Country', 123123, 'City', '2') are in the same entity group. Entities with the same ancestor are in the same group.
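For example, with the Python client (illustrative only), this is how those key paths look:

from google.cloud import datastore

client = datastore.Client()

# Root entities: each one is its own entity group.
country_a = client.key("Country", 123123)
country_b = client.key("Country", 679621)

# Child entities sharing the ancestor ('Country', 123123): same entity group.
city_1 = client.key("Country", 123123, "City", "1")
city_2 = client.key("Country", 123123, "City", "2")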
Sounds like a really bad idea to use auto-increment for things like countries. Just generate an ID based on the country's name.
From the same paragraph:
Whenever possible, structure your Datastore transactions so that the end result will be unaffected if the same transaction is applied more than once.
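One common way to follow that advice (my suggestion, not from the docs) is an idempotency key: key the entity on a client-generated id, so applying the same transaction twice just overwrites the same entity instead of creating a duplicate. Note this trades away the gapless sequential id. A Python sketch:

import uuid
from google.cloud import datastore

client = datastore.Client()

# Generated once per logical create and reused on every retry of that create.
request_id = str(uuid.uuid4())

def create_country_idempotent(name):
    with client.transaction():
        country = datastore.Entity(key=client.key("Country", request_id))
        country["name"] = name
        client.put(country)  # re-applying writes the same entity, not a new one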
In internal Datastore APIs like db or ndb you don't have to worry about rolling back; it happens automatically.
It's about 1 write per second per whole entity group; that's why you need to keep groups as small as possible.

Adjustable, versioned graph database

I'm currently working on a project where I use natural language processing to extract emotions from text to correlate them with contextual information.
Definition of contextual information: any information that is relevant to describing an entity's situation in time and space.
Description of the data structure I'm looking for:
There is an arbitrary number of entities (an entity can be either a person or a group, for example Twitter hashtags) whose contextual information and conversations with other entities I want to track. Conversations between entities are processed in order to classify their emotional features. Basic emotional features consist of a vector that specifies each emotion's occurrence as a proportion: {fear: 0.1, happiness: 0.4, joy: 0.1, surprise: 0.9, anger: 0}
Entities can also submit any contextual information they'd like to share, for example: location, room temperature, blood pressure, and so on (I'll refer to these as contextual variables).
Because neither the number of conversations of an entity, nor the number of contextual variables they want to share is clear at any point in time, the data structure needs to be able to adjust accordingly.
Important: every change in the data must also represent its own state, as I'm looking to correlate certain changes in state with each other.
Example: Bob and Alice have a conversation that shows high magnitude of fear. A couple of hours later they have another conversation that shows no more fear, but happiness.
Now, one could argue that a high magnitude of fear followed by happiness could actually be interpreted as the emotion relief.
However, in order to be able to extract this very information I need to be able to correlate different states with each other.
The same goes for using contextual information and correlating it with the tracked emotions in conversations.
This is why every state change must be recorded and available.
To make this more clear to you, I've created a graphic and attached it to the question.
Now, the actual question I have is: Which database/data structure can I use to solve this problem?
I've looked into event-sourcing databases but wasn't quite convinced if I can easily recreate a graph structure with them. I also looked at graph databases but didn't find what I was looking for.
Therefore it would be nice if someone here could at least point me in the right direction or help me adjust my structure accordingly to solve the problem. If, however, there are data structures supporting what I'd call graph databases with snapshots, then ease of use is probably the most important feature to filter for.
There's a database called Datomic by Rich Hickey (of Clojure fame) that stores facts over time. Every entry in the database is a fact with a timestamp, append-only as in Event Sourcing.
These facts can be queried with a relational/logical language à la Datalog (reminiscent of Prolog). Please see this post by kisai for a quick overview. It has been used for querying graphs with some success in the past: Using Datomic as a Graph Database.
While I have no experience with Datomic, it does seem to be quite suitable for your particular problem.
You have an interesting project, I do not work on things like this directly but for my 2 cents -
It seems to me your picture is a bit flawed. You are trying to represent a graph database over time, but there isn't really a way to represent time this way.
If we examine the image, you have conversations and context data changing over time, but the facts of "Bob", "Alice" and "Malory" don't actually change over time. So let's remove them from the equation.
Instead, focus on the things you can model over time: a conversation, a context, a location. These things will change as new data comes in, which makes them excellent candidates for an event-sourced model. In your app, a conversation would be modeled as a series of individual events, which your aggregate would combine and factor to generate a final state, which would be your 'relief' determination.
For example, you could write logic where, if a conversation was angry and then a very happy event came in, the subject is now feeling relief.
What I would do is model these conversation states in your graph DB connected to your 'fact' objects "Bob", "Alice", etc. A query such as 'What is Alice feeling right now?' would then be a graph traversal through your conversation states, factoring in the context data connected to Alice.
To answer a question such as 'What was Alice feeling 5 minutes ago?' you would take all the event streams for the conversations, rewind them to the appropriate point, and then examine the state of the conversations.
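A toy Python sketch of that idea (the event fields and the 'relief' rule here are illustrative assumptions, not a real implementation):

from dataclasses import dataclass

@dataclass
class EmotionEvent:
    timestamp: float   # seconds since epoch
    emotions: dict     # e.g. {"fear": 0.9, "happiness": 0.0}

class Conversation:
    def __init__(self):
        self.events = []              # append-only event stream

    def record(self, event):
        self.events.append(event)

    def state_at(self, ts):
        # Replay the stream up to ts and derive the emotional state then.
        past = [e for e in self.events if e.timestamp <= ts]
        if not past:
            return None
        latest = past[-1].emotions
        # Toy rule: earlier high fear plus current happiness => relief.
        earlier_fear = any(e.emotions.get("fear", 0) > 0.7 for e in past[:-1])
        if earlier_fear and latest.get("happiness", 0) > 0.5:
            return {"relief": 1.0}
        return latest

conv = Conversation()
conv.record(EmotionEvent(1000.0, {"fear": 0.9}))
conv.record(EmotionEvent(8200.0, {"happiness": 0.8}))
print(conv.state_at(1000.0))  # {'fear': 0.9}
print(conv.state_at(9000.0))  # {'relief': 1.0}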
TLDR:
Separate the time dependent variables from the time independent variables and use event sourcing to model time.
There is an obvious 1:1 correspondence between your states at a given time and a relational database with a given schema. So there is an obvious 1:1 correspondence between your set of states over time and a changing-schema database, i.e. a variable whose value is a database plus metadata, manipulated by both DDL and DML update commands. So there is no evidence that you shouldn't just use a relational DBMS.
Relational DBMSs allow generic querying with automated implementation at a certain computational complexity with certain opportunities for optimization. Any application can have specialized queries that make a specialized data structure and operators a better choice. But you must design your application and know about such special aspects to justify this. As it is, with the obvious correspondences between your states and relational states, this has not been justified.
EAV is frequently used instead of DDL and a changing schema. But under EAV the DBMS does not know the real tables you are concerned with, which have columns that are EAV attributes, and which are explicit in the DDL/DML changing schema approach. So EAV foregoes simplicity, clarity, optimization and most of all integrity and ACID. It can only be justified (compared to DDL/DML, assuming a relational representation is otherwise appropriate) by demonstrating that DDL with schema updates (adding, deleting and changing columns and tables) is worse (per the above) than EAV in your particular application.
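As a toy illustration of that contrast (an example of my own, not from the post), here are the same facts stored both ways:

# A "real" row: the DBMS knows the columns and can type-check and index them.
real_row = {"conversation": "conv42", "fear": 0.1, "happiness": 0.4}

# The EAV encoding: the same facts as (entity, attribute, value) triples;
# the table the application actually cares about is now invisible to the DBMS.
eav_rows = [
    ("conv42", "fear", 0.1),
    ("conv42", "happiness", 0.4),
]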
Just because you can draw a picture of your application state at some time using a graph does not mean that you need a graph database. What matters is what specialized queries/expressions you will be evaluating. You should understand what these are in terms of your problem domain, which is probably most easily expressible per some specialized data structure and operators and relationally. Then you can compare the expressive and computational demands to a specialized data structure, a relational representation, and the models of particular graph databases. Be sure to google stackoverflow.
According to Wikipedia "Neo4j is the most popular graph database in use today".

Understanding vCloud statuses

I'm trying to wrap my mind around the statuses that vCloud returns in their SDK, but there seems to be very light documentation on them. A few of them I don't understand what they're about, and in practice I'm only seeing POWERED_ON, POWERED_OFF, and SUSPENDED. The only documentation on the statuses that I can find are here:
http://www.vmware.com/support/vcd/doc/rest-api-doc-1.5-html/operations/GET-VApp.html
What confuses me are things like "what is an 'entity'? And what does it mean when it's 'resolved'?" When I go to provision a VM and monitor its state, it starts at POWERED_OFF and goes to POWERED_ON, when I would expect to see some intermediary statuses while it's in the process of provisioning. Does anyone know where I can go to find out more about this?
This page from the vCD 5.1 documentation shows the possible values of the status field for various entities. The current doc uses numerical values but the API also has a few spots where string values are returned instead. The reference you found from the 1.5 API includes some of them; I think as part of the 5.1 doc update the string values were dropped from the schema reference.
An entity in the vCloud API is very similar to the likewise-named notion in database modeling. Wikipedia provides a fair definition of the term from entity-relationship modeling:
An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified.
The RESOLVED (numerical value 1) state means that most of the parts of the entity are present, but it isn't fully constructed yet. You typically see it when uploading an OVF: all of the bits have been transferred to vCD, but stuff is still happening in the background before it's usable.

Can this strange behaviour be explained by Eventual Consistency on the App Engine Datastore?

I have implemented two server-side HTTP endpoints which 1) store some data and 2) process it. Method 1) calls method 2) through App Engine Tasks since they are time-consuming tasks that I do not want the client to wait for. The process is illustrated in the sequence diagram below.
Now from time to time I experience that the processing task (named processSomething in the sequence diagram below) can't find the data when attempting to process - illustrated with the yellow throw WtfException() below. Can this be explained with the Eventual Consistency model described here?
The document says "strong consistency for reads but eventual consistency for writes". I'm not sure what exactly that means in relation to this case. Any clarification is appreciated.
edit: I realize I'm asking a boolean question here, but I guess I'm looking for an answer backed up with some documentation on what Eventual Consistency is in general and specifically on Google Datastore
edit 2: By request here are details on the actual read/write operations:
The write operation:
entityManager.persist(hand);
entityManager.close();
I'm using JPA for data persistence. The 'hand' object is received from the client and not previously stored in the db, so a new key id will be assigned.
The read operation:
SELECT p FROM Hand p WHERE p.GameId = :gid AND p.RoundNo = :rno
Neither GameId nor RoundNo is the primary key. GameId is a "foreign key" although the Datastore is oblivious of that by design.
It would help if you showed actual code, showing how you save the entity and how you retrieve it, but assuming that id is an actual Datastore ID (part of a Key) and that your load is a get using the id and not a query on some other property, then eventual consistency is not your issue.
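To illustrate the distinction (a sketch in Python with the google-cloud-datastore client, since the question uses JPA; property names are taken from the question, the values are placeholders):

from google.cloud import datastore

client = datastore.Client()
hand_id, gid, rno = 12345, "game-1", 1  # placeholder values

# Lookup by key: strongly consistent, always sees the committed write.
hand = client.get(client.key("Hand", hand_id))

# Query on non-key properties (like the JPQL above): eventually consistent,
# so an entity written moments ago may not show up in the results yet.
query = client.query(kind="Hand")
query.add_filter("GameId", "=", gid)
query.add_filter("RoundNo", "=", rno)
results = list(query.fetch())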
(The documentation on this is further down the page you linked.)

Understanding Transactions in databases

I'm new to database applications and I'm trying to use DataMapper to make a Ruby web application.
I stumbled across this piece of code which I don't understand:
transaction do |txn|
  # Both records are created inside one transaction: either the Link and
  # its Url are both saved, or neither is.
  link = Link.new(:identifier => custom)
  link.url = Url.create(:original => original)
  link.save
end
I have a few questions: What exactly are transactions? And why was this preferred instead of just doing:
link = Link.new(:identifier => custom)
link.url = Url.create(:original => original)
link.save
When should I consider using transactions? What are the best use cases? Is there any resource available online where I can read more about these concepts?
Thanks!
A transaction is an indivisible unit of work. The idea comes from the database world and is connected with the problems of data selection and update. Consider the following situation:
User A asks for object O in order to change it.
While A was doing his/her stuff, user B asked for the same object. Object O is currently identical for both users.
Then A writes an update to the database, changing property O1 of object O. User B doesn't have this change; his object O is still the same as it was before.
B writes his update to the database, changing property O2 of object O. A's change to O1 is effectively lost.
Basically, it has to do with multi-user access and changes - there are several kinds of problems that arise.
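Here's a toy Python sketch of that exact sequence, with plain dicts standing in for database rows (no real DBMS involved):

record = {"O1": "old", "O2": "old"}   # object O in the database

# A and B both read the same snapshot of O.
a_copy = dict(record)
b_copy = dict(record)

# A changes O1 and writes the whole object back.
a_copy["O1"] = "changed by A"
record = a_copy

# B, still holding the stale snapshot, changes O2 and writes it back,
# silently overwriting A's update to O1.
b_copy["O2"] = "changed by B"
record = b_copy

print(record)  # {'O1': 'old', 'O2': 'changed by B'} -- A's change is lost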
Transactions are also used to couple different operations together into one logical processing statement. For example, you might need to delete a User along with all of his/her associated Photos.
The topic is really too vast to cover in one post, so I'd recommend reading the following articles: wiki#1 and wiki#2.
A transaction is a series of instructions which, upon execution, are seen as one atomic instruction.
This means that all of the instructions must succeed in order for the transaction to succeed. If even one of them fails, you return to the state you were in before the beginning of the transaction. This is good for fault tolerance, for example.
One other field in which transactions are useful is in concurrent applications. Using a transaction avoids interference by other processes.
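A minimal all-or-nothing example using Python's built-in sqlite3 (not DataMapper, but the same idea; the schema and the deliberate failure are made up for illustration):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (identifier TEXT, url TEXT NOT NULL)")

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("INSERT INTO links VALUES (?, ?)",
                     ("custom", "http://example.com"))
        conn.execute("INSERT INTO links VALUES (?, ?)",
                     ("broken", None))  # violates NOT NULL and fails
except sqlite3.IntegrityError:
    pass

# Neither row is present: the failed statement undid the whole transaction.
print(conn.execute("SELECT COUNT(*) FROM links").fetchone()[0])  # prints 0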
Hope this helps.
