Datomic query - find all records (entities) with value - datomic

Query:
(d/q '[:find [?e ...]
       :in $ ?value
       :where [?e _ ?value]]
     db "Germany")
returns nothing, while:
(d/q '[:find [?e ...]
       :in $ ?value
       :where [?e :country/name ?value]]
     db "Germany")
returns list of entities as expected.
Shouldn't the _ serve as a wildcard for any attribute name and return every entity that holds the value?
I read this Datomic query: find all entities with some value, but I can't figure out how to pass an actual value in as a parameter.
Datomic version: datomic-pro-0.9.5966

I figured out this dirty, time-consuming method, but it does the job:
(defn all-by-value
  [db value]
  (reduce
   (fn [res ident]
     (try
       (->> (d/q '[:find [?e ...] :in $ ?a ?v :where [?e ?a ?v]] db ident value)
            (map #(d/entity db %))
            (concat res))
       (catch Exception _ res)))
   []
   (d/q '[:find [?e ...] :where [?e :db/ident]] db)))
Hope some of you will find it useful.

Related

Working with Python in Azure Databricks to Write DF to SQL Server

We just switched away from Scala and moved over to Python. I've got a dataframe that I need to push into SQL Server. I did this multiple times before, using the Scala code below.
var bulkCopyMetadata = new BulkCopyMetadata
bulkCopyMetadata.addColumnMetadata(1, "Title", java.sql.Types.NVARCHAR, 128, 0)
bulkCopyMetadata.addColumnMetadata(2, "FirstName", java.sql.Types.NVARCHAR, 50, 0)
bulkCopyMetadata.addColumnMetadata(3, "LastName", java.sql.Types.NVARCHAR, 50, 0)
val bulkCopyConfig = Config(Map(
  "url" -> "mysqlserver.database.windows.net",
  "databaseName" -> "MyDatabase",
  "user" -> "username",
  "password" -> "*********",
  "dbTable" -> "dbo.Clients",
  "bulkCopyBatchSize" -> "2500",
  "bulkCopyTableLock" -> "true",
  "bulkCopyTimeout" -> "600"
))
df.bulkCopyToSqlDB(bulkCopyConfig, bulkCopyMetadata)
That's documented here.
https://learn.microsoft.com/en-us/azure/sql-database/sql-database-spark-connector
I'm looking for an equivalent Python script to do the same job. I searched for the same, but didn't come across anything. Does someone here have something that would do the job? Thanks.
Please refer to the official PySpark document JDBC To Other Databases to directly write a PySpark dataframe to SQL Server via the MS SQL Server JDBC driver.
Here is the sample code.
spark_jdbcDF.write \
    .format("jdbc") \
    .option("url", "jdbc:sqlserver://yourserver.database.windows.net:1433") \
    .option("dbtable", "<your table name>") \
    .option("user", "username") \
    .option("password", "password") \
    .save()
Or
jdbcUrl = "jdbc:mysql://{0}:{1}/{2}".format(jdbcHostname, jdbcPort, jdbcDatabase)
connectionProperties = {
    "user": jdbcUsername,
    "password": jdbcPassword,
    "driver": "com.mysql.jdbc.Driver"
}
spark_jdbcDF.write \
    .jdbc(url=jdbcUrl, table="<your table name>",
          properties=connectionProperties)
Note that DataFrameWriter.jdbc performs the write itself, so no trailing .save() is needed.
Hope it helps.
Here is the complete PySpark code to write a Spark Data Frame to an SQL Server database including where to input database name and schema name:
df.write \
    .format("jdbc") \
    .option("url", "jdbc:sqlserver://<servername>:1433;databaseName=<databasename>") \
    .option("dbtable", "[<optional_schema_name>].<table_name>") \
    .option("user", "<user_name>") \
    .option("password", "<password>") \
    .save()
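For reference, the option strings above can be assembled with a small helper. This is only a sketch; sqlserver_jdbc_options is a hypothetical name, not part of PySpark or any library.

```python
# Hypothetical helper (not part of PySpark) that assembles the JDBC options
# used in the snippet above; all names and the port default are illustrative.
def sqlserver_jdbc_options(server, database, table, user, password, schema=None):
    # Bracket the schema only when one is supplied, matching the
    # "[<optional_schema_name>].<table_name>" form shown above.
    dbtable = "[{0}].{1}".format(schema, table) if schema else table
    return {
        "url": "jdbc:sqlserver://{0}:1433;databaseName={1}".format(server, database),
        "dbtable": dbtable,
        "user": user,
        "password": password,
    }

# With a real SparkSession this could then be used as:
#   df.write.format("jdbc").options(**sqlserver_jdbc_options(...)).save()
```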

Pyspark connection to the Microsoft SQL server?

I have a huge dataset in SQL Server, and I want to connect to SQL Server from Python and use PySpark to run queries.
I've seen the JDBC driver, but I can't find how to use it; I managed with PyODBC, but not with Spark.
Any help would be appreciated.
Please use the following to connect to Microsoft SQL:
def connect_to_sql(
    spark, jdbc_hostname, jdbc_port, database, data_table, username, password
):
    # The Microsoft JDBC driver expects ";databaseName=" rather than "/<db>"
    jdbc_url = "jdbc:sqlserver://{0}:{1};databaseName={2}".format(
        jdbc_hostname, jdbc_port, database
    )
    connection_details = {
        "user": username,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }
    df = spark.read.jdbc(url=jdbc_url, table=data_table, properties=connection_details)
    return df
spark is a SparkSession object, and the rest are pretty clear.
You can also pass pushdown queries to read.jdbc
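A pushdown query is passed by giving read.jdbc a parenthesized SELECT with an alias in place of the table name; Spark then runs the SELECT on the server side. A minimal sketch (the host, database, table, and column names are made up):

```python
# Build the JDBC URL the same way connect_to_sql does; values are illustrative.
jdbc_url = "jdbc:sqlserver://{0}:{1};databaseName={2}".format(
    "localhost", 1433, "MyDatabase"
)

# Spark treats the parenthesized SELECT plus alias as if it were a table,
# so only the filtered rows cross the wire.
pushdown_query = "(SELECT id, name FROM dbo.Clients WHERE active = 1) AS clients_subq"

# With a live connection this would be:
#   df = spark.read.jdbc(url=jdbc_url, table=pushdown_query,
#                        properties=connection_details)
```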
I use pissall's function (connect_to_sql) but I modified it a little.
from pyspark.sql import SparkSession

def connect_to_sql(
    spark, jdbc_hostname, jdbc_port, database, data_table, username, password
):
    jdbc_url = "jdbc:mysql://{0}:{1}/{2}".format(jdbc_hostname, jdbc_port, database)
    connection_details = {
        "user": username,
        "password": password,
        "driver": "com.mysql.jdbc.Driver",
    }
    df = spark.read.jdbc(url=jdbc_url, table=data_table, properties=connection_details)
    return df

if __name__ == '__main__':
    spark = SparkSession \
        .builder \
        .appName('test') \
        .master('local[*]') \
        .enableHiveSupport() \
        .config("spark.driver.extraClassPath", <path to mysql-connector-java-5.1.49-bin.jar>) \
        .getOrCreate()

    df = connect_to_sql(spark, 'localhost', <port>, <database_name>, <table_name>, <user>, <password>)
or you can use the SparkSession .read method:
df = spark.read.format("jdbc") \
    .option("url", "jdbc:mysql://localhost/<database_name>") \
    .option("driver", "com.mysql.jdbc.Driver") \
    .option("dbtable", <table_name>) \
    .option("user", <user>) \
    .option("password", <password>) \
    .load()

How do I pull all entities linked from another entity in Datomic?

I don't know how to word my question.
:host/id has a link to :server/id. I want to pull all servers linked to a specific host.
I've tried several approaches, but I get either an empty result, all results, or an IllegalArgumentExceptionInfo: :db.error/not-a-keyword Cannot interpret as a keyword.
I tried following the documentation but I keep getting lost. Here are my attempts so far:
All hosts
(d/q '[:find (pull ?server [{:host/id [:host/hostname]}])
       :in $ ?hostname
       :where
       [?host :host/hostname ?hostname]
       [?server :server/name]]
     db "myhost")
IllegalArgumentExceptionInfo
(d/q '[:find (pull ?server [{:host/id [:host/hostname]}])
       :in $ ?hostname
       :where
       [?server :server/name ?host]
       [?host :host/hostname ?hostname]]
     db "myhost")
[]
(d/q '[:find (pull ?host [{:host/id [:host/hostname]}])
       :in $ ?hostname
       :where
       [?host :host/hostname ?hostname]
       [?host :server/name]]
     db "myhost")
Assuming you have these entities in Datomic:
(d/transact conn [{:host/name "host1"}])
(d/transact conn [{:server/name "db1"
                   :server/host [:host/name "host1"]}
                  {:server/name "web1"
                   :server/host [:host/name "host1"]}])
And assuming each server has a reference to its host (please see the schema below), in order to query which servers are linked to a host, use the reverse reference syntax, an underscore prefixed to the attribute name (:server/_host):
(d/q '[:find (pull ?h [* {:server/_host [:server/name]}])
       :in $ ?hostname
       :where
       [?h :host/name ?hostname]]
     (d/db conn)
     "host1")
will give you:
[[{:db/id 17592186045418,
   :host/name "host1",
   :server/_host [#:server{:name "db1"} #:server{:name "web1"}]}]]
Here is the sample schema for your reference:
(def uri "datomic:free://localhost:4334/svr")
(d/delete-database uri)
(d/create-database uri)
(def conn (d/connect uri))

(d/transact conn [{:db/ident       :server/name
                   :db/cardinality :db.cardinality/one
                   :db/unique      :db.unique/identity
                   :db/valueType   :db.type/string}
                  {:db/ident       :server/host
                   :db/cardinality :db.cardinality/one
                   :db/valueType   :db.type/ref}
                  {:db/ident       :host/name
                   :db/cardinality :db.cardinality/one
                   :db/unique      :db.unique/identity
                   :db/valueType   :db.type/string}])

Query using bigint attribute return empty for certain values

I created a minimal entity with one attribute of bigint type; my problem is that the query fails for certain values. This is the schema:
[{:db/ident :home/area,
  :db/valueType :db.type/bigint,
  :db/cardinality :db.cardinality/one,
  :db/doc "the doc",
  :db.install/_attribute :db.part/db,
  :db/id #db/id[:db.part/db -1000013]}]
I inserted a sample value:
(d/transact (d/connect uri2)
            [{:db/id #db/id[:db.part/user]
              :home/area 123456789000000N}])
And I confirmed that it was created by using the Datomic console. However, the following query doesn't return the entity previously inserted, as I expected it to:
(d/q '[:find ?e
       :in $ ?h
       :where
       [?e :home/area ?h]]
     (d/db (d/connect uri2))
     123456789000000N)
;;--- #{}
Maybe I’m missing something in the way the value is expressed. Another test using a different value like 100N for the attribute :home/area returns the correct answer:
(d/transact (d/connect uri2)
            [{:db/id #db/id[:db.part/user]
              :home/area 100N}])

(d/q '[:find ?e
       :in $ ?h
       :where
       [?e :home/area ?h]]
     (d/db (d/connect uri2))
     100N)
;;-- #{[17592186045451]}
It also works fine with the value 111111111111111111111111111111111111N, which is confusing to me.
Datomic version: "0.9.5390", java version "1.8.0_05", Java(TM) SE Runtime Environment (build 1.8.0_05-b13), Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode), MySQL as storage service.
Thanks in advance for any suggestions.
To Clojure users, the name :db.type/bigint can be misleading, since it actually maps to java.math.BigInteger, not clojure.lang.BigInt.
I reproduced the same steps and I can't tell you why the Datalog query fails on 123456789000000N but not 100N and 111111111111111111111111111111111111N. It seems however that the following always works:
(d/q '[:find ?e
       :in $ ?h
       :where
       [?e :home/area ?h]]
     (d/db (d/connect uri2))
     (.toBigInteger 100N))
I ran your example and got different results (it worked in all cases). I am not sure why, but maybe adding my example will help. The only changes I made were to use uri instead of uri2, to slurp the schema, and to perform a (def conn (d/connect uri)) and a (d/create-database uri). I assume you performed similar steps, which is why I don't know why my example worked:
Clojure 1.8.0
user=> (use '[datomic.api :only [q db] :as d])
nil
user=> (use 'clojure.pprint)
nil
user=> (def uri "datomic:mem://bigint")
#'user/uri
user=> (d/create-database uri)
true
user=> (def conn (d/connect uri))
#'user/conn
user=> (def schema-tx (read-string (slurp "path/to/the/schema.edn")))
#'user/schema-tx
user=> @(d/transact conn schema-tx)
{:db-before datomic.db.Db#b8774875,
:db-after datomic.db.Db#321a2712,
:tx-data [#datom[13194139534312 50 #inst "2016-08-14T18:53:23.158-00:00" 13194139534312 true]
#datom[63 10 :home/area 13194139534312 true] #datom[63 40 60 13194139534312 true]
#datom[63 41 35 13194139534312 true] #datom[63 62 "the doc" 13194139534312 true]
#datom[0 13 63 13194139534312 true]],
:tempids {-9223367638809264717 63}}
(d/transact (d/connect uri)
            [{:db/id #db/id[:db.part/user]
              :home/area 123456789000000N}])
#object[datomic.promise$settable_future$reify__6480 0x5634d0f4
{:status :ready, :val {:db-before datomic.db.Db#321a2712,
:db-after datomic.db.Db#f6ef3cd8,
:tx-data [#datom[13194139534313 50 #inst "2016-08-14T18:53:34.325-00:00" 13194139534313 true]
#datom[17592186045418 63 123456789000000N 13194139534313 true]],
:tempids {-9223350046623220288 17592186045418}}}]
(d/q '[:find ?e
       :in $ ?h
       :where
       [?e :home/area ?h]]
     (d/db (d/connect uri))
     123456789000000N)
#{[17592186045418]}
(d/transact (d/connect uri)
            [{:db/id #db/id[:db.part/user]
              :home/area 100N}])
#object[datomic.promise$settable_future$reify__6480 0x3b27b497
{:status :ready, :val {:db-before datomic.db.Db#f6ef3cd8,
:db-after datomic.db.Db#2385c058,
:tx-data [#datom[13194139534315 50 #inst "2016-08-14T18:54:13.347-00:00" 13194139534315 true]
#datom[17592186045420 63 100N 13194139534315 true]],
:tempids {-9223350046623220289 17592186045420}}}]
(d/q '[:find ?e
       :in $ ?h
       :where
       [?e :home/area ?h]]
     (d/db (d/connect uri))
     100N)
#{[17592186045420]}
user=>
Can you run (first schema-tx) at the REPL to confirm your schema transacted? I noticed you were using the console, and I am wondering if /bigint did not get defined, or whether you were looking at the first uri (since you had a 2, I assume you have multiple examples).

VB.NET Data type mismatch, and data entry is too long

I have been working on this for days now, and any help would be great. I am trying to insert information into a database using VB.NET and an Access database. Currently I am having two problems with it. The first is that I have a memo field in my database (Response), and if I try to insert more than 250 characters into that field, I get an error saying my entry is too long. The other problem is that if I try to execute this code more than once while running my program, I get an error saying "Data type mismatch in criteria expression...". The data type mismatch is the issue I am stuck on.
here is the code in question
con = New OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=..\..\Backends\IncidentReport.mdb")
con.Open()
comStr = "INSERT INTO tblIncidentCommonItemsInfo(recid, Location, DescOrTypeInjIfOther, DateOf, TimeOf, TypeIncident, Doctor, " &
         "DateDocNotified, TimeDocNotified, DateRespPartyNotified, RespPartyNotified, " &
         "TimeRespPartyNotified, StateNotified, DateStateNotified, TimeStateNotified, Response) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"
cmd = New OleDbCommand(comStr, con)
cmd.Parameters.AddWithValue("#p1", IDLabel.Text)
cmd.Parameters.AddWithValue("#p2", LocTextBox.Text)
cmd.Parameters.AddWithValue("#p3", DescTextBox.Text)
cmd.Parameters.AddWithValue("#p4", DateOfTextBox.Text)
cmd.Parameters.AddWithValue("#p5", TimeOfTextBox.Text)
cmd.Parameters.AddWithValue("#p6", TypeTextBox.Text)
cmd.Parameters.AddWithValue("#p7", DocComboBox.SelectedItem)
cmd.Parameters.AddWithValue("#p8", DocDayDateTimePicker.Value)
cmd.Parameters.AddWithValue("#p9", DocTimeDateTimePicker.Value)
cmd.Parameters.AddWithValue("#p10", FamilyDayDateTimePicker.Value)
cmd.Parameters.AddWithValue("#p11", RespPtyTextBox.Text)
cmd.Parameters.AddWithValue("#p12", FamilyTimeDateTimePicker.Value)
cmd.Parameters.AddWithValue("#p13", IDPHYesNoComboBox.SelectedItem)
cmd.Parameters.AddWithValue("#p14", IDPHDayDateTimePicker.Value)
cmd.Parameters.AddWithValue("#p15", IDPHTimeDateTimePicker.Value)
cmd.Parameters.AddWithValue("#p16", ResidentWordsRichTextBox.Text)
Try
    cmd.ExecuteNonQuery()
    MsgBox("Incident Saved")
Catch ex As Exception
    MessageBox.Show(ex.Message & " - " & ex.Source)
End Try
SavedTextBox.Text = "Yes"
con.Close()
Any help would be much appreciated, Thank you.
For the size problem, you could try specifying exactly what kind of value you are passing through the parameter. I suspect that using AddWithValue results in a shorter parameter size:
cmd.Parameters.Add("#p16", OleDbType.LongVarWChar).Value = ResidentWordsRichTextBox.Text
The Data type mismatch in criteria expression error could be caused by the same problem.
The AddWithValue method determines the parameter's DataType by looking at the type of the value you pass.
In your code you have passed Text for fields that appear to be of different kinds. For example, recid seems to be an integer (numeric) field, but AddWithValue receives a TextBox.Text value, which is a string. You really should apply Convert.ToInt32(IDLabel.Text), and the same checking should be done for the potentially DateTime fields.
