Datomic and the constant transferring of big data

I just watched this talk by Rich Hickey and it was eye-opening. Now I can't simply go back to using PostgreSQL and MongoDB without thinking about how much more interesting things could be. There is just one thing that I don't quite understand.
If I understand correctly, the basic premise is that your database has to be a persistent data structure, and you work with it through pure functions that receive it as an argument and return a new one: pretty standard functional programming.
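For concreteness, here is roughly what that premise looks like with the Datomic peer API, as far as I understand it (a minimal sketch; the connection URI and the :user/name attribute are just made-up examples):

    ;; Minimal sketch with the Datomic peer API. The connection URI and
    ;; the :user/name attribute are hypothetical examples.
    (require '[datomic.api :as d])

    (def conn (d/connect "datomic:dev://localhost:4334/my-db"))

    ;; The database is obtained as an immutable value...
    (def db (d/db conn))

    ;; ...and queried with a pure function: same db value, same result.
    (d/q '[:find ?e ?name
           :where [?e :user/name ?name]]
         db)

    ;; Writes go through the connection and produce a *new* database
    ;; value; the `db` above is untouched.
    @(d/transact conn [{:user/name "alice"}])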
But what if I have a huge amount of data? Is all of that really supposed to be constantly going back and forth between my code and the Datomic instance? How can that be efficient? Or am I misunderstanding something here?
Thanks

Related

Choice of database to insert data from Scala code

I have a project written in Scala where I want to save incoming data to some database. My mentor suggested Persistence (Akka), but from what I've read it seems like that just keeps track of the state so that a former state can be recovered if it crashes.
Sorry for my inexperience in this field; I just want to get some input regarding whether it is possible to use Persistence in this case. Otherwise, suggestions of alternative approaches would be much appreciated.
Using Persistence in this case would certainly be possible. As you noticed, though, it's a specific add-on for Akka actors, so it's only a good fit if you're already using them.
That being said, "save incoming data to some database" is a bit too broad a goal to really say whether it's the best-suited solution for you!
I encourage you to dig into the subject with your mentor, since you're lucky to have one! :-D
And finally, if it turns out there was a misunderstanding of what you needed, I'd suggest looking at Slick, which is, as far as I know, a very standard choice for "writing data to some database"!

One Google Docs workbook as a database for another using a script?

Disclaimer: I started working with spreadsheets in depth this week; prior to that it was basic usage. I've read the rules and this does relate to programming; it's just that my ignorance of programming keeps me from asking a specific question. I'm new to this, I want to learn, and I have to start somewhere.
I want to create two separate spreadsheet documents, one as a database for another. I want one to be able to query the other in a way similar to the VLOOKUP() function or something along those lines.
These are very large files, hence the need for separate documents.
I am learning about scripting and think there might be a way there. If that's the case, please appreciate that I literally started reading about scripts this morning and know nothing (yet) about them.
All I need to know is whether it's possible and which functions to use; I'll figure out how to use them. I just don't have a working knowledge of all the script functions, and only a limited knowledge of spreadsheet functions.
The IMPORTRANGE() function is limited to 50 uses per spreadsheet, and given how I want to use it, that is not enough, unless you know a workaround. Also, I only want one cell of information at a time, and it doesn't need to be displayed, just usable.
Also, efficiency is king since I'm working with such large amounts of data. I used to have almost 1500 VLOOKUP functions as I was building what I already have and that sucker was starting to bog down. Then I realized I didn't need a dynamic database for that aspect of the sheet. I killed about two thirds of them and it runs much better. I'd like to keep it that way, or at least try.
Finally, I may have bitten off more than I can chew, but this has been a fun challenge for me, and I've met with success so far. Please don't dismiss me out of hand because I don't know the right questions to ask, or because I'm trying to fit a square peg in a round hole; everyone has to start somewhere.
Thanks!
This is totally possible, though you will quickly find that spreadsheet functions are too cumbersome for this sort of operation.
With Google Apps Script you can query and write to and from multiple workbooks with ease. You would be working in JavaScript, using JavaScript objects and arrays.
Start by reading the Google documentation and checking out their examples.

Creating and using databases

So the solid consensus I got from the answers to this question: Editing a single line in a large text file
was that instead of using a text file I should create a database and store my data there. While I think this is a great idea, I don't know the first thing about databases, the programming languages used with databases, or how to use a database once I have set it up. Could you guys give me a shove in the right direction and point me to an absolute-noob tutorial that might help me with this?
UPDATE: Hey guys, so I was looking at MySQL and there are a whole bunch of versions! The Cluster CGE looks like the best one, and it says it is a "real-time open source transactional database designed for fast, always-on access to data under high throughput conditions", which just about hits the nail on the head of what I need. It says commercial next to it though, so I don't know if I would have to pay some god-awful fee for it. I tried it anyway, and it said I should have gotten a license already, and until I did I could only use it for 30 days. I'm confused...
Can I get this version for free? If so, where do I get the license?
Is this version way overpowered for what I need? I need:
1. A storage medium through which I can store large amounts of data
2. Something I can read from and write to in real time, with simultaneous access
3. Support for two different "keys" (I think I'm using that term right; I need to be able to search for entries based on either of two criteria).
MySQL is a great choice, given your Python flair.
http://dev.mysql.com/tech-resources/articles/mysql_intro.html
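To make your requirement 3 concrete: "two keys" usually just means two indexed columns you can search on. Here is a minimal sketch of such a table (the schema is made up, and I'm sketching it with Clojure's clojure.java.jdbc, but the SQL statements are exactly what you'd run from Python as well):

    ;; Minimal sketch: a table searchable on either of two indexed
    ;; columns. All names here are hypothetical.
    (require '[clojure.java.jdbc :as jdbc])

    (def db-spec {:dbtype "mysql" :dbname "mydb"
                  :user "user" :password "secret"})

    (jdbc/db-do-commands db-spec
      ["CREATE TABLE entries (
          id      BIGINT AUTO_INCREMENT PRIMARY KEY,
          serial  VARCHAR(64) NOT NULL,
          label   VARCHAR(64) NOT NULL,
          payload TEXT)"
       "CREATE INDEX idx_entries_serial ON entries (serial)"
       "CREATE INDEX idx_entries_label  ON entries (label)"])

    ;; Rows can now be looked up efficiently by either criterion:
    (jdbc/query db-spec ["SELECT * FROM entries WHERE serial = ?" "A-123"])
    (jdbc/query db-spec ["SELECT * FROM entries WHERE label = ?" "widget"])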
Good luck!

Clojure database interaction - application design/approach

I hope this question isn't too general or doesn't make sense.
I'm currently developing a basic application that talks to an SQLite database, so naturally I'm using the clojure.java.jdbc library (link) to interact with the DB.
The trouble is, as far as I can tell, the way you insert data into the DB using this library is by simply passing a map (e.g. {:id 1 :name "stackoverflow"}) and a table name (e.g. :website).
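In code, that round trip looks something like this (a minimal sketch; the db-spec and the :website table are just examples):

    ;; Minimal sketch of the insert/read round trip with clojure.java.jdbc.
    ;; The db-spec and the :website table are just examples.
    (require '[clojure.java.jdbc :as jdbc])

    (def db-spec {:dbtype "sqlite" :dbname "app.db"})

    ;; Insert: table name as a keyword, row as a plain map.
    (jdbc/insert! db-spec :website {:id 1 :name "stackoverflow"})

    ;; Read: rows come back as the same kind of plain map.
    (jdbc/query db-spec ["SELECT * FROM website WHERE id = ?" 1])
    ;; => ({:id 1, :name "stackoverflow"})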
The thing that I'm concerned about is how I can make this more robust in the wider context of my application. What I mean by this is: when I write data to the database and retrieve it, I want to use the same formatted map EVERYWHERE in the application, from the data access layer (returning or passing in maps) all the way up to the application layer, where it works on the data and passes it back down again.
What I'm trying to get at is, is there an 'idiomatic' clojure equivalent of JavaBeans?
The problem I'm having right now is having to repeat myself by defining maps manually with column names etc.; if I change the structure of my table in the DB, my whole application has to be changed.
As far as I know, there really isn't such a library. There are various systems that make it easier to write queries, but not, AFAIK, anything that "fixes" your data objects.
I've messed around trying to write something like you propose myself, but I abandoned the project, since it became very obvious very quickly that this is not at all the right thing to do in a clojure system (and actually, I tend to think now that the approach has only very limited use even in languages that do have really "fixed" data structures).
Issues with the clojure collection system:
All the map access/alteration functions are really functional. That means that alterations on a map always return a new object, so it's nearly impossible to create a forcibly fixed map type that's also easy to use in idiomatic clojure.
General conceptual issues:
Your assumption that you can "use the same formatted map EVERYWHERE in the application, so from the data access layer (returning or passing in maps) all the way up to the application layer where it works on the data and passes it back down again" is wrong if your system is even slightly complex. At best, you can use the map from the DB up to the UI in some simple cases, but the other way around is pretty much always the wrong approach.
Almost every query will have its own result row "type"; you're probably not going to be able to re-use these "types" across queries, even in related code.
Also, forcing these types on the rest of the program is probably binding your application more strictly to the DB schema. If your business logic functions are sane and well written, they should only access as much data as they need and no more; they should probably not use the same data format everywhere.
My serious answer is: don't bother. Write your DB access functions for the kinds of queries you want to run, and let those functions check the values moving in and out of the DB in as much detail as you find comforting. Do not try to forcefully keep the data coming from the DB "the same" in the rest of your application. Use assertions and pre/post conditions if you want to check your data format in the rest of the application.
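As a sketch of what that checking might look like (the table and the checked keys are made up):

    ;; Sketch: DB access functions that check the values moving in and
    ;; out, via pre/post conditions. The table and keys are made up.
    (require '[clojure.java.jdbc :as jdbc])

    (defn insert-website!
      [db-spec website]
      {:pre [(integer? (:id website))
             (string? (:name website))]}
      (jdbc/insert! db-spec :website website))

    (defn find-website
      [db-spec id]
      {:post [(or (nil? %) (contains? % :name))]}
      (first (jdbc/query db-spec
                         ["SELECT * FROM website WHERE id = ?" id])))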
Clojure favours the concept of a few data structures and lots of functions that work on those few data structures. There are a few ways to create new data structures (which I guess internally use the basic data structures), like defrecord etc. But even if you use them, that won't really solve the problem of DB schema changes affecting the code: you will eventually have to go through the layers to add/remove the effects of a schema change, because the code needs to change anywhere that data is read or created.
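For reference, here is a minimal defrecord sketch (the Website type and its fields are made up); it gives you a named, fixed set of fields while still behaving like an ordinary map:

    ;; Sketch: a record with a fixed, named set of fields that still
    ;; behaves like a map. The Website type and fields are made up.
    (defrecord Website [id name])

    (def w (map->Website {:id 1 :name "stackoverflow"}))

    (:name w)              ;; => "stackoverflow" (keyword access works)
    (assoc w :note "hi")   ;; => a new value with the extra key; w unchanged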

Change in database structure

We already have a database structure, but it is unnormalized, very confusing, and in need of change; it already holds a large volume of stored data, for example all of the company's financial data, which the finance department staff are afraid of losing.
We are undecided between remodeling the entire structure of the database, recovering the essentials and everything else that is possible, or continuing with the same model along with its problems.
I wonder if someone has made a change like this, and whether you can actually transfer the data to a new structure.
thanks
Before you do anything, I would BACKUP!!! Next I would create a new database with the ideas that you had in mind. Remember, this is where all the real work should be; once it is created, it is hard to go back. Put a lot of thought in and make the design bulletproof for your company. Next, create some procedures to transform your existing data into the new database as you see fit. It would help if you mentioned the platform(s) you are using and maybe provided some generic examples.
I have found SSIS packages work well for projects like this if you are using SQL Server. While you will still need to write out your transforms, the packages make it easier for others to see what is happening.
Anything can be done by you, the developer. However, it might make business sense to check out various 3rd-party tools. There are many out there, and depending on exactly what you are doing, you may benefit from doing some research.
Yes, it's called "database conversion". It is a very common practice, but it must be done carefully and methodically, ideally by someone who has done many of them and knows the pitfalls. It is not to be done casually by any means. Moreover, it is not unusual in the financial sector to run the "old system" in parallel with the new system for a couple of months, to reconcile month-end reports, before saying goodbye to the old system. Running parallel is a PITA, and can only be done if all of the conversion programs are in place, but it's better to be safe than sorry when the numbers must be correct to the penny.
I had the same problem. The way I solved it was by designing a new database, and then writing a script that copies the data from the old schema to the new one. It's not an easy task, because you need to take care with what you are copying from the old model to the new one, but it's doable!
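A minimal sketch of what such a copy script can look like; I'm sketching it in Clojure with clojure.java.jdbc, but the same pattern works in any language (the db-specs, tables, and the renamed column are all made up):

    ;; Sketch of an old-schema -> new-schema copy script, row by row.
    ;; Both db-specs, the tables, and the column mapping are made up.
    (require '[clojure.java.jdbc :as jdbc])

    (def old-db {:dbtype "mysql" :dbname "legacy"  :user "u" :password "p"})
    (def new-db {:dbtype "mysql" :dbname "finance" :user "u" :password "p"})

    (defn old-row->new-row
      "Reshape one legacy row into the new schema."
      [{:keys [id amount descr]}]
      {:id          id
       :amount      amount
       :description descr})  ;; e.g. a column renamed in the new schema

    (doseq [row (jdbc/query old-db ["SELECT id, amount, descr FROM payments"])]
      (jdbc/insert! new-db :payments (old-row->new-row row)))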
Absolutely, you can migrate the data to a new structure. The real question is: how difficult will the migration be (how expensive, how time-consuming, how reliable)? To answer that question, one would have to know:
1. The accuracy of the existing data: does it have gaps, duplicated entries that disagree with each other with no way to resolve them, errors, etc.?
2. What structure you imagine moving to, and whether it will introduce complexity to the migration
3. The skill level of the person/team doing the migration
4. How long the migration will take, and whether the platforms will be changing in the meantime (either the live system being modified or the new system design changing)
