How to drop all the data in xtdb during local development

During development I find that a lot of resources get "left behind" on my local xtdb server.
What's the best way to clear this data locally without restarting my REPL?

While posing the question I thought of the answer:
;; Given some `xtdb-node`
(let [res (xt/q (xt/db xtdb-node)
                '{:find [id]
                  :where [[id :xt/id _]]})
      ids (map first res)]
  (->> ids
       (mapv (fn [id] [::xt/delete id]))
       (xt/submit-tx xtdb-node)))
Just search for all documents with the :xt/id key, since every document in xtdb is required to have one.
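Note that ::xt/delete only ends a document's validity; its history stays on the node. If the goal is to reclaim the data entirely, xtdb also offers ::xt/evict, which removes a document and its history. A minimal sketch of building the ops (the helper name is my own; the op-building is pure data, so it works for either op type):

```clojure
;; Build transaction operations for a collection of entity ids.
;; :xtdb.api/delete ends the document's current validity;
;; :xtdb.api/evict removes the document and its entire history.
(defn removal-ops [op ids]
  (mapv (fn [id] [op id]) ids))

;; Usage (assuming the same `xtdb-node` and `ids` as above):
;; (xt/submit-tx xtdb-node (removal-ops :xtdb.api/evict ids))
```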

Related

How do I partition mongodb datasets?

I'm stuck on MongoDB sharding and I need your help!
My first question is: how do I make my database show partitioned: true in sh.status()?
I've set up the shard servers and mongos, but I need to partition my documents based on datetime, so I used tags and zone ranges, but I couldn't make this option true.
I tried sh.shardCollection("db.coll", partitioned:true) but it doesn't work.
Create the index on the field you would like to shard/partition on:
use <database>
db.<collection>.createIndex({"<shard key field>": 1})
Enable sharding (partitioning) on the database:
sh.enableSharding("<database>")
Shard the collection:
sh.shardCollection("<database>.<collection>", { "<shard key field>": 1, ... })
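Filled in with concrete (made-up) names, and including the zone-range step the question was after, the sequence might look like this in mongosh; the database app, collection events, field ts, and the shard and zone names are all illustrative:

```shell
# Illustrative only: "app", "events", "ts", "shard0000" and "y2023" are made up.
mongosh --eval '
  // 1. Index the shard key field
  db.getSiblingDB("app").events.createIndex({ ts: 1 });
  // 2. Enable sharding (partitioning) on the database
  sh.enableSharding("app");
  // 3. Shard the collection on the datetime field
  sh.shardCollection("app.events", { ts: 1 });
  // 4. Optionally route a datetime range to a zone
  sh.addShardToZone("shard0000", "y2023");
  sh.updateZoneKeyRange("app.events",
    { ts: ISODate("2023-01-01") },
    { ts: ISODate("2024-01-01") },
    "y2023");
'
```

After sh.enableSharding, sh.status() lists the database with partitioned: true; it is not an option you pass to sh.shardCollection directly.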

Find all entities that are missing a particular attribute

In my schema I have the attribute :base/type that is supposed to exist for every entity created. To check that this is indeed true, I'm trying to find entities where it is missing:
[:find [?entities ...]
 :in $
 :where [(missing? $ ?entities :base/type)]]
Unfortunately this gives me back:
Execution error (Exceptions$IllegalArgumentExceptionInfo) at datomic.error/arg (error.clj:57).
:db.error/insufficient-binding [?entities] not bound in expression clause: [(missing? $ ?entities :base/type)]
How should this query be constructed?
This is because your query is too general. If you use the query API you need at least one positive clause in the :where statement. You can access the raw index to get the result, though. If you have enough RAM, you can:
(def snapshot (d/db connection)) ; your db snapshot
(def all-datoms (d/datoms snapshot :eavt))
(def base-type-id (:db/id (d/entity snapshot :base/type)))
(def entities (group-by #(.e %) all-datoms))
(def entities-without-base-type
  (map (comp #(into {} %) (partial d/entity snapshot) first)
       (filter (fn [[_ datoms]]
                 (empty? (filter #(= base-type-id (.a %)) datoms)))
               entities)))
(def only-relevant-entities
  (filter #(not (or (:db/ident %) (:db/txInstant %)))
          entities-without-base-type))
only-relevant-entities
The last filter is to get rid of attribute definitions and transactions (they are stored as datoms in the db as well!).
If you have too many entities you can chunk the datoms using the async API.
Using ::singer/songs as an example attribute, this is how to do the query:
[:find [?entities ...]
 :in $
 :where
 [?entities ::singer/songs ?s]
 [(missing? $ ?entities :base/type)]]
Unfortunately (in my answer) many such queries would be required to cover the whole database: another query with ::song/composers, and so on.

:result-set-fn in clojure jdbc returns the error "The result set is closed." Why?

Often I need to load huge amounts of data from a database server, sometimes a million rows or more, so I try to load the data lazily. Here is what I want to do: I want to get a lazy sequence and pull data from the server in chunks, i.e. if the row count is more than 500, I first want to get the first 500 elements via that lazy sequence, then receive the next 500 elements with another request, and so on until I have received all the data from the server.
But I have a problem: clojure.java.jdbc realizes the entire lazy sequence, while I want to obtain the data from it partially.
I researched the question and found a good reply about a similar problem:
clojure.java.jdbc lazy query
I took this example and wrote this:
(defn get_data
  [arg1 arg2]
  (let [full-db-spec (get ...)
        sql_query (get ...)
        row-n (atom 0)
        prepared-statement (-> full-db-spec
                               (jdbc/get-connection)
                               (jdbc/prepare-statement sql_query {:fetch-size 3}))]
    (jdbc/with-db-transaction [tx full-db-spec]
      (jdbc/query full-db-spec [prepared-statement arg1 arg2]
                  {:fetch-size 3
                   :row-fn (fn [r] (do (prn "r" @row-n) (swap! row-n inc) r))
                   :result-set-fn identity}))))
Here I want to get a lazy sequence so that I can extract data from it partially. But when :result-set-fn is identity or (take 500 ...), the code returns the error "The result set is closed." Why? Yet when I change :result-set-fn to first or doall or last it works fine, but it realizes the full lazy sequence!
I use:
MS SQL [com.microsoft.sqlserver/mssql-jdbc "6.3.3.jre8-preview"] (I also tested on PostgreSQL [org.postgresql/postgresql "9.4.1212.jre7"], same result)
[org.clojure/java.jdbc "0.7.3"]
That lazy sequence is reading values from your connection, but the connection is closed outside of the with-db-transaction scope. You need to realize the sequence, or do any further processing, inside that scope.
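One way to process rows incrementally while keeping the connection open is clojure.java.jdbc's reducible-query (available in the 0.7.x line used here): it defers execution until you reduce over it, and the connection stays open for the duration of the reduction, so all row processing must happen inside it. A sketch, where full-db-spec, sql_query, arg1, arg2 are the placeholders from the question and :some-column is a made-up projection:

```clojure
(require '[clojure.java.jdbc :as jdbc])

;; `reducible-query` returns a reducible collection. The connection is
;; opened when the reduction starts and closed when it finishes, so the
;; row handling lives inside the transducer rather than in a lazy seq
;; consumed after the fact.
(defn process-rows [full-db-spec sql_query arg1 arg2]
  (transduce (comp (map :some-column)  ; hypothetical per-row projection
                   (take 500))         ; stop after the first 500 rows
             conj
             []
             (jdbc/reducible-query full-db-spec
                                   [sql_query arg1 arg2]
                                   {:fetch-size 500
                                    :auto-commit? false})))
```

With :fetch-size set and :auto-commit? false, the driver streams rows in batches instead of materializing the whole result set.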

How to safely remove a duplicate index from a Rails 3 schema?

I'm working on a Rails 3 app, and we recently realized we have a duplicate index:
# from schema.rb
add_index "dogs", ["owner_id"], :name => "index_dogs_on_owner"
add_index "dogs", ["owner_id"], :name => "index_dogs_on_owner_id"
How can I check which index ActiveRecord is using for relevant queries? Or do I even need to? If one of the indices is removed will ActiveRecord happily just use the other?
I can play around with it locally, but I'm not sure our production environment behaves exactly the same at the DB level.
The name of the index is arbitrary: the database engine looks indexes up by the columns they cover, not by their human-readable names. Removing one will not affect ActiveRecord. I recommend removing whichever index name is least obvious, in this case index_dogs_on_owner, because the other name makes clear the index is on the owner_id column.
remove_index :dogs, :name => 'index_dogs_on_owner'
Cite: http://apidock.com/rails/ActiveRecord/ConnectionAdapters/SchemaStatements/remove_index
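To drop the index in a tracked way rather than by editing schema.rb, a migration along these lines could be used (the class name, file name, and timestamp are illustrative):

```ruby
# db/migrate/20120101000000_remove_duplicate_dogs_owner_index.rb
# Illustrative Rails 3 migration: drops the redundant index by name
# and restores it on rollback.
class RemoveDuplicateDogsOwnerIndex < ActiveRecord::Migration
  def up
    remove_index :dogs, :name => 'index_dogs_on_owner'
  end

  def down
    add_index :dogs, ['owner_id'], :name => 'index_dogs_on_owner'
  end
end
```

Because the two indexes cover the same column, queries continue to use the surviving index_dogs_on_owner_id with no application changes.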

Query to list all partitions in Datomic

What is a query to list all partitions of a Datomic database?
This should return
[[:db.part/db] [:db.part/tx] [:db.part/user] .... ]
where .... is all of the user defined partitions.
You should be able to get a list of all partitions in the database by searching for all entities associated with the :db.part/db entity via the :db.install/partition attribute:
(ns myns
  (:require [datomic.api :as d]))

(defn get-partitions [db]
  (d/q '[:find ?ident
         :where
         [:db.part/db :db.install/partition ?p]
         [?p :db/ident ?ident]]
       db))
Note
The current version of Datomic (build 0.8.3524) has a shortcoming such that :db.part/tx and :db.part/user (two of the three built-in partitions) are treated specially and aren't actually associated with :db.part/db via :db.install/partition, so the result of the above query function won't include the two.
This problem is going to be addressed in one of the future builds of Datomic. In the meantime, you should take care of including :db.part/tx and :db.part/user in the result set yourself.
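Until that build lands, one way to patch the result is to merge the built-ins in by hand; the query needs a live database, but the merge itself is plain data (the helper name is my own):

```clojure
;; Add the two built-in partitions that the :db.install/partition
;; query misses (see the note above) to a collection of idents.
(defn with-builtin-partitions [idents]
  (into #{:db.part/tx :db.part/user} idents))

;; Usage with the query above:
;; (with-builtin-partitions (map first (get-partitions db)))
```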
1st method - using query
=> (q '[:find ?i
        :where
        [:db.part/db :db.install/partition ?p]
        [?p :db/ident ?i]]
      (db conn))
2nd method - from db object
(filter #(instance? datomic.db.Partition %) (:elements (db conn)))
The second method returns a sequence of datomic.db.Partition objects, which may be useful if we want additional information about a partition.
Both methods have known bug/inconsistency: they don't return :db.part/tx and :db.part/user built-in partitions.