What does the ndb.EVENTUAL_CONSISTENCY option mean?

The documentation for the ndb.Query class states that it accepts a read_policy option that can be set to EVENTUAL_CONSISTENCY to allow faster queries that might not be strongly consistent, which implies that not using this option would return strongly consistent results.
However, global queries are always eventually consistent. So what does this flag actually do?

You can choose to have an ancestor query, which would normally be strongly consistent, use the eventually consistent policy instead for the stated speed improvement.
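For example, a minimal sketch of opting in on an ancestor query (the Task model and the key names are made up for illustration):

from google.appengine.ext import ndb

class Task(ndb.Model):
    description = ndb.StringProperty()

parent_key = ndb.Key('TaskList', 'default')

# Ancestor query: strongly consistent by default.
tasks = Task.query(ancestor=parent_key).fetch(20)

# Same query, but opting in to the faster, eventually consistent read policy.
tasks = Task.query(ancestor=parent_key).fetch(
    20, read_policy=ndb.EVENTUAL_CONSISTENCY)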
The old 'db' module docs explain this.
(If you've only ever used NDB, then the DB docs are definitely worth reading - there is a lot more detail on how things work, and how best to make use of datastore.)

Related

Information about TieredMergePolicy

I would like to understand Solr's merge behaviour well. I did some research on the different merge policies, and it seems that TieredMergePolicy is better than the older merge policies (LogByteSizeMergePolicy, etc.). That's why I use it, and it is the default policy in recent Solr versions.
First, here are some interesting links that I've read to get a better idea of the merge process:
http://java.dzone.com/news/merge-policy-internals-solr
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
Based on the official Lucene documentation, I would like to ask several questions about it:
http://lucene.apache.org/core/3_2_0/api/all/org/apache/lucene/index/TieredMergePolicy.html
Questions
1- In the official documentation, there is a method called setExpungeDeletesPctAllowed(double v). In Solr 4.3.0, I checked the TieredMergePolicy class and didn't find this method. There is another method that looks like it, called setForceMergeDeletesPctAllowed(double v). Is there any difference between the two methods?
2- Are the methods above called only when you do an expungeDeletes or an optimize, or are they also called during a normal merge?
3- I've read that merges between segments are weighted by the percentage of deleted documents in a segment. By default, this percentage is set to 10%. Is it possible to set this value to 0% to be sure that there are no more deleted documents in the index after merging?
I need to reduce the size of my index without calling the optimize() method, if possible. That's why any information about the merge process would be interesting to me.
Thanks
You appear to be mixing up your documentation. If you are using Lucene 4.3.0, use the documentation for it (see the correct documentation for TieredMergePolicy in 4.3.0), rather than for version 3.2.0.
Anyway, on these particular questions, see LUCENE-3577.
1 - It seems to be mainly a name change, for all intents and purposes.
2 - Firstly, IndexWriter.expungeDeletes no longer exists in 4.3.0. You can use IndexWriter.forceMergeDeletes(), if you must, though it is strongly recommended against, as it is very, very costly. I believe this setting will only impact a forceMergeDeletes() call. If you want to favor reclaiming deletions during normal merging, set that in the MergePolicy using TieredMergePolicy.setReclaimDeletesWeight.
3 - The percent allowed is right there in the method call you've indicated in your first question. Forcing all the deletions to be merged out when calling forceMergeDeletes() will make an already very expensive operation that much more expensive, though.
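For illustration, a rough sketch of wiring those knobs into an IndexWriter on Lucene 4.3.0 (the index path, analyzer choice and weight value are placeholders, not recommendations):

import java.io.File;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class MergePolicyConfig {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("/path/to/index"));
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_43);

        TieredMergePolicy tmp = new TieredMergePolicy();
        // Bias normal merge selection toward segments carrying deletions (default weight is 2.0).
        tmp.setReclaimDeletesWeight(3.0);
        // When forceMergeDeletes() runs, rewrite any segment whose deleted-doc percentage exceeds this.
        tmp.setForceMergeDeletesPctAllowed(0.0);

        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_43, analyzer);
        iwc.setMergePolicy(tmp);
        IndexWriter writer = new IndexWriter(dir, iwc);
        // writer.forceMergeDeletes();  // very expensive; only if you really must
        writer.close();
    }
}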
Just to venture a guess, if you need to save the disk space taken by your index, you'll likely have much more success looking more closely at how much data you are storing in the index. There's not enough information to say for sure, of course, but it seems a likely avenue to consider.

Is there a way to translate database table rows into Prolog facts?

After doing some research, I was amazed by the power of Prolog to express queries in a very simple way, almost like telling the machine verbally what to do. This happened because I've become really bored with Propel and PHP at work.
So, I've been wondering if there is a way to translate database table rows (Postgres, for example) into Prolog facts. That way, I could stop writing so many tedious joins and using an ORM, and instead write something like this to get what I want:
mantenedora_ies(ID_MANTENEDORA, ID_IES) :-
    papel_pessoa(ID_PAPEL_MANTENEDORA, ID_MANTENEDORA, 1),
    papel_pessoa(ID_PAPEL_IES, ID_IES, 6),
    relacionamento_pessoa(_, ID_PAPEL_IES, ID_PAPEL_MANTENEDORA, 3).
To see why I've become bored, look at this post. The code there would be replaced by the simple lines above, which are much easier to read and understand. I'm just curious about it, since it would be impossible to replace things around here anyway.
It would also be cool if something like that could be done from PHP. Does anyone know of anything like that?
Check the ODBC interface of SWI-Prolog (there may be something equivalent for other Prolog implementations too):
http://www.swi-prolog.org/pldoc/doc_for?object=section%280,%270%27,swi%28%27/doc/packages/odbc.html%27%29%29
I can think of a few approaches to this -
On initialization, call a predicate that selects all the data from a table and asserts it into the Prolog database. Do this for each table. You will need to declare the shape of each row, e.g. :- dynamic ies_row/4 (a rough sketch follows after these two options).
You could modify load_files/2 by overriding user:prolog_load_file/2. From this hook you could do something similar to #1. This has the benefit of looking like a load_files call. http://www.swi-prolog.org/pldoc/man?predicate=prolog_load_file%2F2 ... This documentation mentions library(http_load), but I cannot find it anywhere (I was interested in this recently)!
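A rough sketch of approach #1 using SWI-Prolog's ODBC package (the DSN, credentials, table and column names are made up; adapt them to your schema):

:- use_module(library(odbc)).

:- dynamic papel_pessoa/3.

load_papel_pessoa :-
    odbc_connect('my_dsn', Conn, [user('me'), password('secret')]),
    % odbc_query/3 backtracks over result rows; assert each one as a fact.
    forall(odbc_query(Conn,
                      'SELECT id_papel, id_pessoa, tipo FROM papel_pessoa',
                      row(IdPapel, IdPessoa, Tipo)),
           assertz(papel_pessoa(IdPapel, IdPessoa, Tipo))),
    odbc_disconnect(Conn).

After calling load_papel_pessoa once at startup, papel_pessoa/3 behaves like any other set of facts, so rules such as mantenedora_ies/2 above work unchanged.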
There is the Draxler Prolog-to-SQL compiler, which translates a pattern (like the conjunction you wrote) into the more verbose SQL joins. You can find more info in the related post (Prolog to SQL converter).
But beware that Prolog has its weaknesses too, especially regarding aggregates. Without a library, getting sums, counts and the like is not very easy, and such libraries aren't very common or easy to use.
I think you could try to specialize the PHP DB interface for equijoins, using the built-in features that allow you to shorten the query text (when this results in more readable code). Working in SWI-Prolog / ODBC, where (like in PHP) you need to compose SQL, I effectively found myself working that way to handle something very similar to what you showed in the other post.
Another approach I found useful: I wrote a parser for the subset of SQL used by the MySQL backup interface (phpMyAdmin, really). So I routinely dump my CMS's DB locally, load it into memory, apply whatever task I need, computing and writing (or applying) the insert/update/delete statements, and then upload those. This can be done because the DB is small enough to fit in memory. I developed, and now maintain, a small e-commerce site with this naive approach.
Writing Prolog from PHP shouldn't be too difficult: I'd try to modify an existing interface, like the awesome Adminer, which already offers a choice of basic serialization formats.

Instead of Triggers - Can they coexist with regular Triggers

Can Instead Of triggers co-exist with regular triggers? If so, are there any potential issues we should be aware of?
INSTEAD OF triggers can coexist with normal triggers. I've done this a good bit.
INSTEAD OF triggers have numerous potential issues, mainly around the fact that they replace the normal insert/update/delete behavior with whatever you define. A developer may think nothing of UPDATE User SET Address = 'foo' WHERE UserID = 4, but if your trigger is using that as a hook to touch a dozen authentication tables and maybe talk to a server around the world, you've bought yourself a lot of potential confusion.
Keep the behavior of these triggers in line with the expected behavior of insert/update/delete statements. Don't do too much.
INSTEAD OF triggers are a very powerful tool, easily misused. Use them appropriately and thoughtfully.
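For what it's worth, a minimal sketch of the two kinds coexisting on one table (the Users table, its columns and the UserAudit table are hypothetical, loosely based on the UPDATE example above):

CREATE TRIGGER trg_Users_InsteadOfUpdate ON dbo.Users
INSTEAD OF UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Re-issue the update ourselves; any custom behavior would go here.
    -- (Simplified to the Address column for brevity.)
    UPDATE u
    SET Address = i.Address
    FROM dbo.Users u
    JOIN inserted i ON i.UserID = u.UserID;
END;
GO

CREATE TRIGGER trg_Users_AfterUpdate ON dbo.Users
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Fires for the UPDATE issued inside the INSTEAD OF trigger above.
    INSERT INTO dbo.UserAudit (UserID, ChangedAt)
    SELECT UserID, GETDATE()
    FROM inserted;
END;
GO

The AFTER trigger fires off the UPDATE that the INSTEAD OF trigger performs; SQL Server does not re-enter the INSTEAD OF trigger for that inner UPDATE, which is exactly the coexistence being asked about.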
I haven't found anything to be concerned about with respect to using both INSTEAD OF and AFTER (AKA FOR) triggers at the same time. The main issues with INSTEAD OF triggers are:
You can only have one INSTEAD OF trigger per operation, per table;
They can mess with OUTPUT INTO clauses (i.e. you'll get identity values of 0);
If you make any schema changes to the table, things may mysteriously break at some point in the future if you weren't careful to maintain the trigger.
None of these caveats are related to AFTER triggers, so you don't really have anything to worry about in that regard. Although I will say that it's more common to write INSTEAD OF triggers on views as opposed to tables, because there's less chance of them interfering with table operations. They were primarily designed as a tool to help you create insertable/updatable views.
Anyway, you'll be fine if you're careful, but I would still recommend against using an INSTEAD OF trigger unless you actually need to, because they make ordinarily simple operations harder to reason about.

Should I use the keyword AS when aliasing columns in SQL Server?

Simple question: we're having an office debate about whether the keyword AS is necessary in our T-SQL statements. I've always used it in cases such as
SELECT mycol AS Something
FROM MYTABLE;
I know you don't NEED it but is there any benefit of using it (or not using it)? I think it makes statements easier to read but people seem to disagree with me.
Thanks!
Generally yes, as it makes it easier to see what is aliased to what.
I agree with you that including the AS keyword makes queries easier to read. It's optional, but not including it is lazy. It doesn't make a significant difference to the performance of the query - the query plan will be the same. I would always prefer to include it for clarity.
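One concrete case where the habit pays off (using the question's own column names): without AS, a missing comma silently turns a second column name into an alias, so writing every intentional alias with AS makes that accident easier to spot:

-- Intended two columns, but the missing comma makes Something an alias for mycol:
SELECT mycol Something
FROM MYTABLE;

-- With AS, an intentional alias is visually distinct from that mistake:
SELECT mycol AS Something
FROM MYTABLE;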
I think it depends upon how readable your schema is to start with. If the field names are cryptic, then yes, using an alias can make it easier to understand the output of the SQL statement. However, there can be a cost associated with this when debugging. In a complex schema it can be difficult to track down the source of a column unless you look at the SQL statement itself to understand what field the alias is referring to.
I have almost always aliased my table names, and sometimes aliased my column names.
For production queries, I suggest that you go with uniformity - if you do it, do it at all times, and use the same convention. If you do not, then just leave things as they are.
I don't think it makes much difference. It certainly makes no difference to the performance. I think the important thing is to be consistent.
Similarly with table aliases:
SELECT mycol AS Something
FROM MYTABLE AS m;
Personally, I prefer to omit the AS, because it is faster to write and fewer characters to read.
I think you should always keep using AS when aliasing columns, as other RDBMS engines, such as Oracle, can be stricter about alias syntax. So, if you are used to using it, you won't get tripped up by something as simple as that.

DAL using typed dataset

Are there any performance issues with using typed DataSets as a DAL? Is it a recommended approach? I am using it for listing purposes only (a repeater). It has paging and sorting functionality too.
It works through an untyped DataSet and just encapsulates the type casting.
A DataSet includes a lot of information other than the data that you need in the list. Therefore, if you can read the data out in a different way, you'll have less information to transfer and therefore better performance.
Having said that, depending on your app you may not notice the difference. Do whatever is easiest for you and then check the performance.
