Choice of database to insert data from Scala code


I have a project written in Scala where I want to save incoming data to some database. My mentor suggested Persistence (Akka), but from what I've read it seems like that just keeps track of the state so that a former state can be recovered if it crashes.
Sorry for my inexperience in this field, I just want to get some input regarding whether it is possible to use Persistence in this case. Otherwise, suggestions of alternative approaches would be much appreciated.

Using Persistence in this case would certainly be possible. As you noticed, it's a specific add-on for Akka actors, so it only really fits if your project is already built on actors.
That being said, "save incoming data to some database" is a bit too broad a mission to really say whether it's the best-suited solution for you!
I encourage you to dig into the subject with your mentor, since you're lucky to have one! :-D
And finally, if it turns out there was a misunderstanding about what you needed, I'd suggest looking at Slick, which is afaik a very classic choice for "writing data to some database"!

Related

datomic and the constant transferring of big data

I just watched this talk by Rich Hickey and it was very eye opening. Now I can't simply go back and use PostgreSQL and MongoDB without thinking how much more interesting it could be. There is just one thing that I don't quite understand.
If I understand correctly, the basic premise here is that your database has to be a persistent data structure and how you work with it is by using pure functions that receive it as an argument and return a new one, pretty standard functional programming.
But what if I have a huge amount of data? Is all of that just supposed to be constantly going back and forth between my code and the Datomic instance? How can that be efficient? Or am I misunderstanding something here?
Thanks

Creating and using databases

So the solid consensus I got from the answers to this question: Editing a single line in a large text file
was that instead of using a text file I should create a database and store my data there. While I think this is a great idea, I don't know the first thing about databases, the programming languages used for databases, or how to use a database once I have set it up. Could you guys give me a shove in the right direction and point me to an absolute noob tutorial that might help me with this?
UPDATE: Hey guys, so I was looking at MySQL and there are a whole bunch of versions! The Cluster CGE looks like the best one, and it is described as a "real-time open source transactional database designed for fast, always-on access to data under high throughput conditions", which just about hits the nail on the head of what I need. It says commercial next to it though, so I don't know if I would have to pay some god-awful fee for it. I tried it anyway, and it said I should have gotten a license already, and until I did I could only use it for 30 days. I'm confused...
Can I get this version for free? If so, where do I get the license?
Is this version way overpowered for what I need? I need:
1. A storage medium through which I can store large amounts of data
2. Read and write from in real time with simultaneous access
3. Have two different "keys" (I think I'm using that right, I need to be able to search for entries based on one of two criteria).

MySQL is a great choice, given your Python flair.
http://dev.mysql.com/tech-resources/articles/mysql_intro.html
Good luck!
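To make the "two keys" requirement a bit more concrete, here is a minimal sketch of a table that can be searched by either of two indexed columns. It uses Python's built-in sqlite3 module purely as a stand-in so it runs without a server; the same SQL idea carries over to MySQL. The table name `entries`, the columns `name` and `code`, and the helper functions are all made up for illustration.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one table and two lookup indexes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries ("
        "id INTEGER PRIMARY KEY, "
        "name TEXT NOT NULL, "
        "code TEXT NOT NULL, "
        "payload TEXT)"
    )
    # One index per search criterion, so lookups by either column stay fast.
    conn.execute("CREATE INDEX idx_entries_name ON entries (name)")
    conn.execute("CREATE INDEX idx_entries_code ON entries (code)")
    conn.commit()
    return conn


def add_entry(conn, name, code, payload):
    """Insert one row; the 'with' block commits it as a single transaction."""
    with conn:
        conn.execute(
            "INSERT INTO entries (name, code, payload) VALUES (?, ?, ?)",
            (name, code, payload),
        )


def find_by_name(conn, name):
    """Look up rows by the first criterion."""
    return conn.execute(
        "SELECT name, code, payload FROM entries WHERE name = ? ORDER BY id",
        (name,),
    ).fetchall()


def find_by_code(conn, code):
    """Look up rows by the second criterion."""
    return conn.execute(
        "SELECT name, code, payload FROM entries WHERE code = ? ORDER BY id",
        (code,),
    ).fetchall()
```

A quick way to try it: call `make_db()`, add a few rows with `add_entry`, then query with `find_by_name` or `find_by_code`. For heavy simultaneous access you would point the same kind of schema at a server database such as MySQL rather than SQLite.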

Change in database structure

We already have a database structure, but it is not normalized and is very confusing, so it needs to change. However, it already holds a large volume of data, for example all of the company's financial data, which the finance department staff are afraid of losing.
We are undecided whether to remodel the entire database structure and recover the most essential data plus whatever else is possible, or to continue with the same model along with its problems.
I wonder if anyone has made a change like this, and whether it is actually possible to transfer the data to a new structure.
Thanks

Before you do anything, BACK UP!!! Next, I would create a new database with the ideas that you have in mind. Remember, this is where all the real work should go, because once it is created it is hard to go back. Put a lot of thought into it and make the design bulletproof and tailored to your company. Next, create some procedures to transform the data you have into the new database as you see fit. It would help if you mentioned the platform(s) you are using and maybe provided some generic examples.
I have found SSIS packages work well for projects like this if you are using SQL Server. While you will still need to write your transforms, the packages make it easier for others to see what is happening.
Anything can be done by you, the developer. However, it might make business sense to check out various 3rd-party tools. There are many out there, and depending on exactly what you are doing you may benefit from doing some research.

Yes, it's called "database conversion". It is a very common practice, but it must be done carefully and methodically, ideally by someone who has done many of them and knows the pitfalls. It is not to be done casually by any means. Moreover, it is not unusual in the financial sector to run the "old system" in parallel with the new system for a couple of months, to reconcile month-end reports, before saying goodbye to the old system. Running parallel is a PITA, and can only be done if all of the conversion programs are in place, but it's better to be safe than sorry when the numbers must be correct to the penny.

I had the same problem. The way I solved it was by designing a new database and then writing a script that copies the data from the old schema to the new one. It's not an easy task, because you need to take care over what you are copying from the old model to the new one, but it's doable!
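A rough sketch of what such a copy script can look like, assuming a made-up denormalized table `old_payments` that gets split into `customers` and `payments`. It uses Python's built-in sqlite3 module so it runs anywhere; every table and column name is hypothetical, and a real migration would need far more validation than this single count-and-total check.

```python
import sqlite3


def build_old_db():
    """Create an in-memory database with a made-up old table and empty new tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE old_payments (customer_name TEXT, amount_cents INTEGER)")
    conn.executemany(
        "INSERT INTO old_payments VALUES (?, ?)",
        [("Acme", 1050), ("Acme", 2000), ("Globex", 999)],
    )
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)")
    conn.execute(
        "CREATE TABLE payments ("
        "id INTEGER PRIMARY KEY, "
        "customer_id INTEGER NOT NULL REFERENCES customers(id), "
        "amount_cents INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def totals(conn, table):
    """Return (row count, sum of amounts) for one of the two payment tables."""
    # The table name comes from a fixed set inside this sketch, never from user input.
    return conn.execute(
        f"SELECT COUNT(*), COALESCE(SUM(amount_cents), 0) FROM {table}"
    ).fetchone()


def migrate(conn):
    """Copy old rows into the new schema; roll everything back if totals differ."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO customers (name) "
            "SELECT DISTINCT customer_name FROM old_payments"
        )
        conn.execute(
            "INSERT INTO payments (customer_id, amount_cents) "
            "SELECT c.id, o.amount_cents "
            "FROM old_payments AS o JOIN customers AS c ON c.name = o.customer_name"
        )
        # Reconcile before committing: same number of rows, same total amount.
        if totals(conn, "old_payments") != totals(conn, "payments"):
            raise ValueError("migrated totals do not match the old table")
```

Running `migrate(build_old_db())` copies the three sample rows. Amounts are stored as integer cents here so the reconciliation compares exact values rather than floating-point sums.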

Absolutely, you can migrate the data to a new structure. The real question is 'how difficult (expensive/time consuming/reliable) will the migration be?' To answer that question one would have to know:
The accuracy of the existing data: does it have gaps, duplicates that disagree with each other with no way to resolve them, errors, etc.
What structure you imagine moving to, and whether it is going to introduce complexity into the migration
The skill level of the person/team doing the migration
How long the migration will take, and whether the platforms will be changing in the meantime (either the live system being modified or the new system design changing)

How to identify ideas and concepts in a given text

I'm working on a project at the moment where it would be really useful to be able to detect when a certain topic/idea is mentioned in a body of text. For instance, if the text contained:
Maybe if you tell me a little more about who Mr Jones is, that would help. It would also be useful if I could have a description of his appearance, or even better a photograph?
It'd be great to be able to detect that the person has asked for a photograph of Mr Jones. I could take a really naïve approach and just look for the word "photo" or "photograph", but this would obviously be no good if they wrote something like:
Please, never send me a photo of Mr Jones.
Does anyone know where to start with this? Is it even possible?
I've looked into things like nltk, but I've yet to find an example of someone doing something similar and am still not entirely sure what this kind of analysis is called. Any help that can get me off the ground would be great.
Thanks!

The best thing out there that might be useful to you is automatic sentiment analysis. This is used, for example, to judge whether, say, a customer review is positive or negative. I cannot give you direct pointers to available tools, but this is what you are looking for.
I must say, though, that this is a current hot topic in natural language processing and I’ve seen a number of papers at conferences. It’s definitely quite a complex matter and if you’re starting from scratch, it might take quite some time before you get the results that you want.

NLTK is not a bad framework for parsing natural language but beware that this is not a simple matter. Doing stuff like this is really research level programming.
A good thing that makes it much easier is if you have a very limited domain - say your application focuses on information about famous writers, then you can avoid some complexities of natural language like certain types of ambiguities.
Where to start? Good question. I don't know of any tutorials on the topic (and I presume you tried the Google option) but I'd imagine that iTunes U would have a course on the topic. If not I can post a link to a course I've done that mentions the subject and wasn't completely horrible: http://www.inf.ed.ac.uk/teaching/courses/inf2a/lecturematerials/index.html#lecture01

The problem that you are tackling is very challenging.
I would start by first identifying the entities in the text (a problem referred to as Named Entity Recognition, google it), and then I would try to identify concepts.
If you want to roughly identify what the text is about, I suggest that you start by using WordNet and use the words and their places in the hierarchy to identify the concepts involved.
If you want to produce a system which shows real intelligence, then you should start researching resources such as CYC (OpenCYC), which will allow you to convert the sentences into FOL (first-order logic) sentences.
This is the hardcore AI approach to solving your problem. For a simple chat bot, it would be easier to rely on simple statistical methods.
Good luck
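To give a feel for the very simplest end of that spectrum, here is a tiny rule-based heuristic (not a statistical model, and nowhere near real language understanding): it looks for a topic word and ignores it when a negation word appears shortly before it. The word lists and the window size are assumptions picked only to cover the two example sentences from the question.

```python
import re

# Hypothetical word lists, chosen only for this illustration.
TOPIC_WORDS = {"photo", "photograph", "picture"}
NEGATION_WORDS = {"never", "not", "no", "don't", "dont"}


def mentions_request(text, window=4):
    """Return True if a topic word occurs with no negation word
    among the `window` tokens immediately before it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    for i, token in enumerate(tokens):
        if token in TOPIC_WORDS:
            preceding = tokens[max(0, i - window):i]
            if not any(word in NEGATION_WORDS for word in preceding):
                return True
    return False
```

With these lists, the sentence asking for "a description of his appearance, or even better a photograph?" is flagged, while "Please, never send me a photo of Mr Jones." is not. It breaks easily on anything less direct, which is exactly why the approaches above exist.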

Is any group or foundation developing an algorithm for better storing massive amounts of data?

I've looked at several approaches to enterprise architecture for databases that store massive amounts of data, and it usually comes down to more hardware, database sharding, and storing JSON objects. Has any group been doing research, or does anyone have a more dynamic approach that processes the available data and tells you how to better store it, and then instructs you how to retrieve it given the new method of storage? I know it sounds a bit fanciful, but I figured I would ask anyway.

You might find this interesting:
http://en.wikipedia.org/wiki/BigTable

Very interesting question. It seems to me like the Semantic Web folks may have to deal with this issue before too long. It also seems to me that they've got some technologies that might provide at least part of the solution. Have a look at the OWL specs, for instance.
