I am studying relational algebra these days and I was wondering...
Don't you think it would be better if there were a compiler that compiled relational algebra instead of SQL?
Wouldn't a database programmer be more productive in that case?
Is there any research on relational algebra compilers?
Thanks in advance
See Tutorial D by C. J. Date; he also has a good rant somewhere on the evils of SQL.
Also see Datalog; it is not exactly relational algebra, but it is similar.
At my school, a student implemented a relational algebra parser as a bachelor's thesis. You can test it here:
http://mufin.fi.muni.cz/projects/PA152/relalg/index.cgi
It's in Czech, but I think you can get the point.
I tried to write some relational algebra queries, and it was much better than writing the equivalent queries in SQL! They were much shorter, simpler to write, more straightforward, and more understandable. I really enjoyed writing them.
So I don't understand why we are all using SQL when there is relational algebra.
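To make the comparison concrete, here is a minimal sketch of the three core relational algebra operators over plain Python data. The relations, attribute names, and function names (`select_`, `project`, `natural_join`) are all my own invention for illustration; the point is how naturally the operators compose.

```python
# Relations modeled as lists of dicts; operators compose like algebra terms.

def select_(relation, predicate):
    """sigma: keep only the tuples satisfying the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attrs):
    """pi: keep only the given attributes, removing duplicate tuples."""
    seen, out = set(), []
    for row in relation:
        key = tuple((a, row[a]) for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append({a: row[a] for a in attrs})
    return out

def natural_join(r, s):
    """join on all attribute names the two relations share."""
    shared = set(r[0]) & set(s[0]) if r and s else set()
    return [{**x, **y} for x in r for y in s
            if all(x[a] == y[a] for a in shared)]

employees = [{"name": "ann", "dept": 1}, {"name": "bob", "dept": 2}]
depts = [{"dept": 1, "dname": "hr"}, {"dept": 2, "dname": "it"}]

# pi_name( sigma_{dname='hr'}( employees |x| depts ) )
result = project(select_(natural_join(employees, depts),
                         lambda t: t["dname"] == "hr"), ["name"])
print(result)  # [{'name': 'ann'}]
```

The nested-expression form mirrors the algebra directly, which is a big part of why such queries feel shorter and more compositional than their SQL equivalents.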
There is indeed research on compiling relational algebra.
A good place to start:
Thomas Neumann: Efficiently Compiling Efficient Query Plans for Modern Hardware. PVLDB 4(9): 539-550 (2011)
Related
There are lots of databases of training sets where one can test one's machine learning algorithms. Is there also one where I could test my (Integer) Linear Programming Solver?
For LP problems there is NETLIB. MIP problems can be found in MIPLIB. Hans Mittelmann has a collection here.
My own answer to the question is yes, in my case it is convenient, but I am asking the experts here.
I have developed a lot of plpgsql functions and just one in C, but I have already seen that the learning curve for C is definitely steeper.
In my case I need a real development language, which plpgsql sometimes is not, but I also need performance; otherwise I would have looked at Python.
But here is the question.
Mainly I need to retrieve data with some selects and joins, process it, sometimes in complex ways, and return a table of data.
From an execution-time point of view, is a C function quicker for this kind of use?
I appreciate any comments.
luca
But here is the question. Mainly I need to retrieve data with some selects and joins, process it, sometimes in complex ways, and return a table of data.
I would go with pl/pgsql for this, as that's what it is designed for. In general, pl/pgsql performs very well within its problem domain, and I doubt you are likely to get significantly better performance by going with C. To the extent you can push your elaborations into the main query, all the better performance-wise.
This assumes that your elaborations can be done with existing functions and do not involve a huge amount of complex data manipulation (in particular, say, converting between datatypes, like arrays and sets). If that is not the case, I would still put the main query and light manipulation in the pl/pgsql, and put the specific operations that need to be tuned in C. There are two reasons for doing this:
It means less C code, which means the C code is easier to read, follow, and prove correct.
It separates concerns so that you can use similar manipulations elsewhere.
There's a lot of performance tuning that has gone into pl/pgsql for its problem domain and reinventing all of that would be a lot of work both in development and testing. To the extent you can leverage tools that are already there you can get the performance you need with a lot less effort and a lot more in the way of guarantees.
EDIT
If you want to write PL/pgSQL code that performs well, you want it to be one large main query with modest supporting logic. The more you can push into your query, the better, and the more of your elaborations you can do in SQL (with possible C functions as mentioned above), the better. Not only does this mean better performance, it means better maintainability. As ArtemGr mentioned, certain operations are very expensive in PL/pgSQL, and in these cases you want to supplement with C code in order to get the performance you need.
I know C/C++ well, and for me it's easier to write a PostgreSQL function in C++ than to learn the intricacies of pgSQL syntax and work around its limitations. I'd say go with the language you (and the rest of your team) are more familiar with. C should be faster than pgSQL (and Tcl, Perl, Python) for complex data manipulation, usually 5-10 times faster. JavaScript (http://code.google.com/p/plv8js/) might be nearly as fast as C if it has a chance to spin up its JIT. Python code can actually use a Cython extension under the hood, which might be nearly as fast as C.
You should probably measure how much time is spent in the data manipulation in question relative to the time spent in I/O before making a decision. In some domains C isn't faster; for example, Tcl and JavaScript have very good regular expression engines.
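As a sketch of the measurement suggested above, you can time the manipulation step in isolation before deciding whether a C rewrite is worth it. The rows and the `transform()` function below are placeholders standing in for your own fetched data and elaboration logic.

```python
# Time the per-row data manipulation by itself; compare the result against
# the wall-clock time of the full query (fetch + manipulation) to see what
# fraction a C rewrite could actually improve.
import time

rows = [{"id": i, "value": float(i)} for i in range(100_000)]

def transform(row):
    # stand-in for the "elaboration" done on each fetched row
    return {"id": row["id"], "value": row["value"] * 1.1 + 3.0}

start = time.perf_counter()
result = [transform(r) for r in rows]
elapsed = time.perf_counter() - start
print(f"manipulation took {elapsed:.4f}s for {len(result)} rows")
```

If the manipulation accounts for only a small fraction of the total query time, moving it to C will buy you correspondingly little.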
How do AI-based agents infer decisions that are not necessarily rational but logically correct, based on previous experience?
In the field of AI, how do expert systems perform inference, and what kinds of math and probability are involved?
I plan on creating an intelligent agent but don't know where to start. Pointers or links to any resources would be appreciated, preferably resources that describe the mathematical concepts for those who are not mathematically minded.
I don't understand your question. In AI parlance, rationality is taken to mean, "Acting in a way, given a situation and a history, that is expected to maximize some performance measure." One does not sacrifice rationality, because that would be acting in a way not expected to maximize performance.
Maybe you are thinking that rationality and predicate- or first order logic are the same thing; they're not.
In any case, your question is too broad to really answer. But, I believe you'll want to start with basic probability, then specifically Bayesian probability and statistics, and then (having the correct tools) you can look into probabilistic AI techniques: Markov chains, Markov decision processes, etc. You can also look at machine learning techniques.
Be aware: These are not simple mathematics. There is no way around that.
Note that this answer speaks to my personal biases; it is not an exhaustive list of techniques.
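As a tiny taste of one technique named above, here is a two-state Markov chain iterated until its state distribution settles. The transition probabilities are invented purely for the example.

```python
# P[i][j] = probability of moving from state i to state j.
P = [[0.9, 0.1],
     [0.5, 0.5]]

dist = [1.0, 0.0]  # start certain of being in state 0
for _ in range(100):
    # one step of the chain: multiply the distribution by P
    dist = [sum(dist[i] * P[i][j] for i in range(2)) for j in range(2)]

print(dist)  # converges toward the stationary distribution [5/6, 1/6]
```

The distribution stops depending on the starting state: solving pi = pi·P for this chain gives pi = (5/6, 1/6), which the iteration approaches quickly. That kind of long-run behavior is what Markov decision processes build on.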
One approach is to use propositional logic or first-order logic; the latter is more flexible.
First you define the current knowledge, and then you can perform inferences by applying rules. Prolog is a very powerful programming language for this purpose. In Prolog you define your current knowledge using facts, and then you create rules that denote relationships. You can then run queries against the facts and rules you defined.
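The facts-and-rules idea above can be sketched with naive forward chaining; Prolog does this far more generally with unification and backtracking, so this is only the flavor. The family facts and the single rule are made up for the example.

```python
# Facts as tuples: ("parent", X, Y) means X is a parent of Y.
facts = {("parent", "tom", "bob"), ("parent", "bob", "ann")}

def grandparent_rule(facts):
    """grandparent(X, Z) :- parent(X, Y), parent(Y, Z)."""
    derived = set()
    for (p1, x, y1) in facts:
        for (p2, y2, z) in facts:
            if p1 == p2 == "parent" and y1 == y2:
                derived.add(("grandparent", x, z))
    return derived

# Apply the rule until no new facts appear (a fixpoint).
while True:
    new = grandparent_rule(facts) - facts
    if not new:
        break
    facts |= new

# "query" the knowledge base
print(("grandparent", "tom", "ann") in facts)  # True
```

The equivalent Prolog program is just the two `parent/2` facts plus the rule in the docstring, after which `?- grandparent(tom, ann).` succeeds.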
I have a biology database that I would like to query. There is also a given terminology bank I have access to that has formalizable predicates. I would like to build a query language for this DB using the predicates mentioned. How would you go about it? My solution is the following:
formalize the predicates
translate into a query language (sql, sparql, depends)
Build a specific language with ANTLR or other such tools
Translate from 3 to 2.
Is this a valid approach? Are there better ones? Any pointers would be much appreciated.
Take a look at Booleano.
Use BNF to get a head start on the language semantics. GoldParser (http://www.devincook.com/) will let you play around with the semantics and syntax. Once you have the BNF sorted out, you can build up actions based on the inputs. For example, a grammar section might deal with extracting the composition of a limb's genetic-makeup classification (I don't know if such a thing exists; it's an abstract example, but you get the gist) for a particular query, say 'fetch stats on limb where limb is leg'; behind the scenes you would then issue a SQL SELECT on a column alias or name from a predefined table. I could be wrong on the approach. Hope it helps.
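To show the shape of that translation step, here is a toy sketch that turns the example query above into SQL with a single regular expression. The mini-grammar, table name, and column names are all invented; a real implementation would use a generated parser (GoldParser, ANTLR, ...) rather than a regex.

```python
# Translate 'fetch <cols> on <table> where <col> is <val>' into SQL.
import re

PATTERN = re.compile(
    r"fetch (?P<cols>\w+) on (?P<table>\w+) where (?P<col>\w+) is (?P<val>\w+)"
)

def to_sql(query):
    m = PATTERN.fullmatch(query)
    if m is None:
        raise ValueError(f"unrecognized query: {query!r}")
    # keep the value as a parameter instead of splicing it into the SQL text
    sql = f"SELECT {m['cols']} FROM {m['table']} WHERE {m['col']} = %s"
    return sql, (m["val"],)

sql, params = to_sql("fetch stats on limb where limb is leg")
print(sql)     # SELECT stats FROM limb WHERE limb = %s
print(params)  # ('leg',)
```

Identifiers (column and table names) can't be parameterized, so a real translator would also validate them against the terminology bank before emitting SQL.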
I suggest you take a look at the i2b2 framework; it's a graphical query language and query engine platform for patient databases.
It's probably hard to grasp it all at first, but do take a look at the CRC cell or web service in there; you'll see how they approached SQL generation from a clinical graphical query language in an interesting way (albeit not very performance-friendly :)).
Consider using Irony.NET.
If I wanted to create my own relational database with a modern language to replace TSQL, what language would that be? Or, if I ended up creating my own language, what features would I have to include to make it better than TSQL?
Chris Date and (to a somewhat lesser extent) Hugh Darwen have spent >20 years trying to expose all the flaws, fallacies and mistakes of the SQL language.
All flaws and fallacies of the SQL language are also flaws and fallacies of any language that has the character combination "SQL" in its name, so it applies to TSQL too.
Hugh Darwen has also spent a significant effort trying to expose the flaws, fallacies and mistakes of the TSQL2 language (that is, the 1990s proposal for a new SQL standard that attempted to incorporate temporal features; the proposal eventually didn't make it into the standard, and yet, despite all well-founded criticisms, it is still taken as the implementation basis for every implementation that calls itself "TSQL").
Read (no, I'll make that "study very, very carefully") their writings and you'll have more "drawbacks" than you ever dreamed possible.
Study their most recent TTM book ("Databases, Types and the Relational Model") plus its forthcoming sequel (not yet published - alas) too, and you'll know everything that is foundational and prerequisite for the "true" next-generation database programming language.
You'll also have the answer to the following question that was asked in a comment here: "Assume you can invent a new database from scratch, without worrying about standards. What language would you use?" Answer: D. Or, more precisely: a language that conforms to all the prescriptions/proscriptions for qualifying as a D.