SqlAlchemy - How to query a table based on a key property saved as NestedMutableJson

Let's suppose a Postgres user table contains a property of type NestedMutableJson:
first_name character varying (120)
last_name character varying (120)
country_info NestedMutableJson
...
from sqlalchemy_json import NestedMutableJson
country_info = db.Column(NestedMutableJson, nullable=True)
country_info = {"name": "UK", "code": "11"}
How can I query the user table based on a country_info key?
POSTGRES Query
SELECT * FROM user WHERE country_info ->> 'name' = 'UK'
Is there a SQLAlchemy way to get the same query result?
I tried several ways, for example:
Way 1:
User.query.filter(User.country_info['name'].astext == 'UK').all()
Error:
Operator 'getitem' is not supported on this expression
Way 2:
User.query.filter(User.country_info.op('->>')('name') == 'UK').all()
Issue:
Always getting an empty response
I'm wondering if the issue is caused by the column definition db.Column(NestedMutableJson, nullable=True).
I'm avoiding db.session.execute("SELECT * FROM user WHERE country_info ->> 'name' = 'UK'").fetchall() and am looking for something else.

Simply use text(), which allows you to write a plain-text filter inside the ORM query. The COALESCE acts like a dict's get() with a default, so it also handles rows where the field is None.
User.query.filter(text("COALESCE(user.country_info ->> 'name', '') = 'UK'")).all()
Note that the table name in the query must be the real table name in the database.
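For completeness, here is a minimal sketch of that approach with the import included; the User model and the physical table name "user" are assumptions based on the question (the quotes around "user" are only needed because user is a reserved word in PostgreSQL - substitute your real table name):
from sqlalchemy import text

# Filter on a JSON key via raw SQL embedded in the ORM query;
# COALESCE maps a missing or NULL key to '' so the comparison never yields NULL.
uk_users = (
    User.query
    .filter(text("COALESCE(\"user\".country_info ->> 'name', '') = 'UK'"))
    .all()
)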

Take a look at the JSON datatype documentation.
You should be able to use this filter clause:
select(CountryInfo).filter(CountryInfo.country_info['name'].astext == "UK")
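If the ['name'].astext form raises "Operator 'getitem' is not supported" on the NestedMutableJson column, one possible workaround (a sketch, not the sqlalchemy-json API) is to declare the column as PostgreSQL JSONB wrapped in SQLAlchemy's own mutation tracking; note that plain MutableDict only tracks top-level changes, unlike NestedMutableJson:
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.ext.mutable import MutableDict

# Column typed as JSONB so the PostgreSQL JSON operators (and .astext) are
# available on the ORM attribute; MutableDict tracks only top-level changes.
country_info = db.Column(MutableDict.as_mutable(JSONB), nullable=True)

# Indexed access plus .astext renders country_info ->> 'name'
uk_users = User.query.filter(User.country_info['name'].astext == 'UK').all()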

You can use .op('->>') to use the PostgreSQL operator ->> in the following way:
from sqlalchemy import Column, Integer, create_engine, String, select
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session
from sqlalchemy.dialects.postgresql import JSON
from sqlalchemy_json import NestedMutableJson

dburl = 'postgresql://...'
Base = declarative_base()

class CountryInfo(Base):
    __tablename__ = 'country_info'
    id = Column(Integer, unique=True, nullable=False, primary_key=True)
    name = Column(String)
    country_info = Column(NestedMutableJson, nullable=True)

    def __repr__(self):
        return f'CountryInfo({self.name!r}, {self.country_info!r})'

engine = create_engine(dburl, future=True, echo=True)
Base.metadata.drop_all(engine)
Base.metadata.create_all(engine)

with Session(engine) as session:
    test = CountryInfo(name='test', country_info={"name": "UK", "code": "11"})
    test2 = CountryInfo(name='test2', country_info={"name": "NL", "code": "12"})
    test3 = CountryInfo(name='test3', country_info={"name": "UK", "code": "13"})
    session.add(test)
    session.add(test2)
    session.add(test3)
    session.commit()

    stmt = select(CountryInfo).filter(CountryInfo.country_info.op('->>')('name') == "UK")
    query = session.execute(stmt).all()
    for row in query:
        print(row)
This results in the following SQL:
SELECT country_info.id, country_info.name, country_info.country_info
FROM country_info
WHERE (country_info.country_info ->> %(country_info_1)s) = %(param_1)s
with {'country_info_1': 'name', 'param_1': 'UK'}
Which results in:
(CountryInfo('test', {'name': 'UK', 'code': '11'}),)
(CountryInfo('test3', {'name': 'UK', 'code': '13'}),)
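Alternatively, if you prefer the ['name'].astext style from the earlier answer, a hedged sketch is to cast the column to JSONB inside the query so the PostgreSQL JSON comparator is available regardless of the declared column type (reusing the CountryInfo model and session above):
from sqlalchemy import cast, select
from sqlalchemy.dialects.postgresql import JSONB

# Cast the stored JSON to JSONB so indexed access and .astext are available
# in the expression, even though the column is declared as NestedMutableJson.
stmt = select(CountryInfo).filter(
    cast(CountryInfo.country_info, JSONB)['name'].astext == "UK"
)
rows = session.execute(stmt).all()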

Related

NIFI - upload binary.zip to SQL Server as varbinary

I am trying to upload a binary.zip to SQL Server as varbinary type column content.
Target Table:
CREATE TABLE myTable ( zipFile varbinary(MAX) );
My NIFI Flow is very simple:
-> GetFile:
filter:binary.zip
-> UpdateAttribute:
sql.args.1.type = -3 # varbinary, according to the JDBC types enumeration
sql.args.1.value = ??? # I don't know what to put here! (I've tried everything!)
sql.args.1.format = ??? # Is it required? I tried 'hex'
-> PutSQL:
SQLstatement= INSERT INTO myTable (zip_file) VALUES (?);
What should I put in sql.args.1.value?
I think it should be the flowfile payload, but would it work as part of the INSERT in PutSQL? Not so far!
Thanks!
SOLUTION UPDATE:
Based on https://issues.apache.org/jira/browse/NIFI-8052
(Note that I'm sending some of the data as flowfile attributes.)
import java.nio.charset.StandardCharsets
import org.apache.nifi.controller.ControllerService
import groovy.sql.Sql

def flowFile = session.get()
if (!flowFile) return

def lookup = context.controllerServiceLookup
def dbServiceName = flowFile.getAttribute('DatabaseConnectionPoolName')
def tableName = flowFile.getAttribute('table_name')
def fieldName = flowFile.getAttribute('field_name')
def dbcpServiceId = lookup.getControllerServiceIdentifiers(ControllerService).find {
    cs -> lookup.getControllerServiceName(cs) == dbServiceName
}
def conn = lookup.getControllerService(dbcpServiceId)?.getConnection()
def sql = new Sql(conn)

// Stream the flowfile content directly into the parameterized INSERT
flowFile.read { rawIn ->
    def parms = [rawIn]
    sql.executeInsert "INSERT INTO " + tableName + " (date, " + fieldName + ") VALUES (CAST(GETDATE() AS Date), ?)", parms
}
conn?.close()

session.transfer(flowFile, REL_SUCCESS)
session.commit()
Maybe there is a NiFi-native way to insert a blob; however, you could use ExecuteGroovyScript instead of UpdateAttribute and PutSQL.
Add an SQL.mydb parameter at the processor level and link it to the required DBCP pool.
Use the following script body:
def ff = session.get()
if (!ff) return

def statement = "INSERT INTO myTable (zip_file) VALUES (:p_zip_file)"
def params = [
    p_zip_file: SQL.mydb.BLOB(ff.read())  // pass the flowfile content as a BLOB sql type
]
SQL.mydb.executeInsert(params, statement)  // committed automatically on flowfile success

// transfer to success without changes
REL_SUCCESS << ff
Inside the script, SQL.mydb is a reference to a groovy.sql.Sql object.

sqlalchemy.orm.exc.UnmappedInstanceError: Class 'builtins.dict' is not mapped AND using marshmallow-sqlalchemy

I don't get it. I'm trying to start a brand new table in MS SQL Server 2012 with the following:
In SQL Server:
CREATE TABLE [dbo].[Inventory](
    [Index_No] [bigint] IDENTITY(1,1) NOT NULL,
    [Part_No] [varchar](150) NOT NULL,
    [Shelf] [int] NOT NULL,
    [Bin] [int] NOT NULL,
    PRIMARY KEY CLUSTERED
    (
        [Index_No] ASC
    ),
    UNIQUE NONCLUSTERED
    (
        [Part_No] ASC
    )
)
GO
NOTE: This is a BRAND NEW TABLE! There is no data in it at all
Next, this is the Database.py file:
import pymssql
from sqlalchemy import (create_engine, Table, MetaData, select, Column, Integer, Float,
                        String, text, func, desc, and_, or_, Date, insert)
from sqlalchemy.orm import Session
from marshmallow_sqlalchemy import SQLAlchemyAutoSchema
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

USERNAME = "name"
PSSWD = "none_of_your_business"
SERVERNAME = "MYSERVER"
INSTANCENAME = r"\SQLSERVER2012"
DB = "Inventory"

engine = create_engine(f"mssql+pymssql://{USERNAME}:{PSSWD}@{SERVERNAME}{INSTANCENAME}/{DB}")

class Inventory(Base):
    __tablename__ = "Inventory"
    Index_No = Column('Index_No', Integer, primary_key=True, autoincrement=True)
    Part_No = Column("Part_No", String, unique=True)
    Shelf = Column("Shelf", Integer)
    Bin = Column("Bin", Integer)

    def __repr__(self):
        return f'Drawing(Index_No={self.Index_No!r}, Part_No={self.Part_No!r}, Shelf={self.Shelf!r}, ' \
               f'Bin={self.Bin!r})'

class InventorySchema(SQLAlchemyAutoSchema):
    class Meta:
        model = Inventory
        load_instance = True
It's also worth noting that I'm using SQLAlchemy 1.4.3, if that helps.
And in main.py:
import Database as db

db.Base.metadata.create_all(db.engine)

data_list = [{"Part_No": "123A", "Shelf": 1, "Bin": 5},
             {"Part_No": "456B", "Shelf": 1, "Bin": 7},
             {"Part_No": "789C", "Shelf": 2, "Bin": 1}]

with db.Session(db.engine, future=True) as session:
    try:
        session.add_all(data_list)  # <--- FAILS HERE AND THROWS AN EXCEPTION
        session.commit()
    except Exception as e:
        session.rollback()
        print(f"Error! {e!r}")
        raise
    finally:
        session.close()
When I google "Class 'builtins.dict' is not mapped", most of the solutions point me to the marshmallow-sqlalchemy package, which I've tried, but I'm still getting the same error. I've also tried moving Base.metadata.create_all(engine) from Database.py into main.py, implementing an __init__ function in the Inventory class, and calling super().__init__(), none of which works.
So what's going on? Why is it failing, and is there a better solution to this problem?
Try creating Inventory objects instead of plain dicts:
data_list = [
    Inventory(Part_No='123A', Shelf=1, Bin=5),
    Inventory(Part_No='456B', Shelf=1, Bin=7),
    Inventory(Part_No='789C', Shelf=2, Bin=1)
]
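If you'd rather keep the dict literals, a minimal sketch (assuming the keys match the mapped attribute names) is to unpack each dict into the model constructor before handing the objects to the session:
# Unpack each dict into the declarative constructor so the Session receives
# mapped Inventory instances rather than plain dicts.
data_list = [{"Part_No": "123A", "Shelf": 1, "Bin": 5},
             {"Part_No": "456B", "Shelf": 1, "Bin": 7},
             {"Part_No": "789C", "Shelf": 2, "Bin": 1}]

with db.Session(db.engine, future=True) as session:
    session.add_all([db.Inventory(**row) for row in data_list])
    session.commit()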

How to redefine tables with the same name in SQLAlchemy using Classical Mapping

I am using SQLAlchemy classical mapping to define a table with the same name but different columns depending on the database. I have mapped the class as explained in the docs, but I get an error every time I try to redefine the class for another database. For instance:
from sqlalchemy import (Table, MetaData, String, Column, create_engine)
from sqlalchemy.orm import mapper, sessionmaker

class MyTable(object):
    def __init__(self, *args, **kwargs):
        [setattr(self, k, v) for k, v in kwargs.items()]

default_cols = (
    Column('column1', String(20), primary_key=True),
    Column('column2', String(20))
)

def myfunc1():
    engine = create_engine('connection_to_database1')
    session = sessionmaker(bind=engine)()
    metadata = MetaData()
    mytable = Table('mytable', metadata, *default_cols)
    mapper(MyTable, mytable)
    metadata.create_all(bind=engine)

def myfunc2():
    engine = create_engine('connection_to_database2')
    session = sessionmaker(bind=engine)()
    metadata = MetaData()
    columns = list(default_cols) + [Column('column3', String(20))]
    mytable = Table('mytable', metadata, *columns)
    mapper(MyTable, mytable)
    metadata.create_all(bind=engine)

myfunc1()
myfunc2()
The error I get:
Column object 'column1' already assigned to Table 'mytable'
How is this happening if I am using completely different instances of MetaData and engines? Is there a way to achieve this?
Using the default_cols variable was actually the problem; it seems this kind of setup doesn't work unless the columns are defined individually in each function:
def myfunc1():
    engine = create_engine('connection_to_database1')
    session = sessionmaker(bind=engine)()
    metadata = MetaData()
    mytable = Table('mytable', metadata,
        Column('column1', String(20), primary_key=True),
        Column('column2', String(20))
    )
    mapper(MyTable, mytable)
    metadata.create_all(bind=engine)

def myfunc2():
    engine = create_engine('connection_to_database2')
    session = sessionmaker(bind=engine)()
    metadata = MetaData()
    columns = [
        Column('column1', String(20), primary_key=True),
        Column('column2', String(20)),
        Column('column3', String(20))
    ]
    mytable = Table('mytable', metadata, *columns)
    mapper(MyTable, mytable)
    metadata.create_all(bind=engine)
Otherwise it will raise the exception:
Column object 'column1' already assigned to Table 'mytable'
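The underlying issue is that a Column object can only be attached to one Table, so reusing the same Column instances from default_cols across two Table() calls triggers the error. A minimal sketch of keeping a single shared definition (names mirror the question) is to turn default_cols into a factory that builds fresh Column objects on each call:
def default_cols():
    # Build brand-new Column objects on every call, so each Table gets its own copies.
    return [
        Column('column1', String(20), primary_key=True),
        Column('column2', String(20)),
    ]

metadata1 = MetaData()
table1 = Table('mytable', metadata1, *default_cols())

metadata2 = MetaData()
table2 = Table('mytable', metadata2, *default_cols(), Column('column3', String(20)))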
I couldn't reproduce your error. To get the code to work I had to swap the order of the mapper arguments and add primary keys to the table definitions. More significantly perhaps, I had to set one of the mappers as non-primary after getting this error:
sqlalchemy.exc.ArgumentError: Class '<class '__main__.MyTable'>' already has a primary
mapper defined. Use non_primary=True to create a non primary Mapper. clear_mappers()
will remove *all* current mappers from all classes.
from sqlalchemy import Table, MetaData, String, Column, create_engine, Integer
from sqlalchemy.orm import mapper

class MyTable(object):
    def __init__(self, *args, **kwargs):
        [setattr(self, k, v) for k, v in kwargs.items()]

def myfunc1():
    engine = create_engine("mysql+pymysql:///test")
    metadata = MetaData()
    mytable = Table(
        "mytable111",
        metadata,
        Column("id", Integer, primary_key=True),
        Column("column1", String(20)),
        Column("column2", String(20)),
    )
    mapper(MyTable, mytable)
    metadata.create_all(bind=engine)

def myfunc2():
    engine = create_engine("postgresql+psycopg2:///test")
    metadata = MetaData()
    mytable = Table(
        "mytable111",
        metadata,
        Column("id", Integer, primary_key=True),
        Column("column1", String(20)),
        Column("column2", String(20)),
        Column("column3", String(20)),
    )
    mapper(MyTable, mytable, non_primary=True)
    metadata.create_all(bind=engine)

myfunc1()
myfunc2()
Using Python 3.8 and SQLAlchemy 1.3.10.

df.sqlContext.sql() not recognizing DB table

I have the below code running in a Spark environment:
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import sqlContext.implicits._
import java.util.Properties
val conf = new SparkConf().setAppName("test").setMaster("local").set("spark.driver.allowMultipleContexts", "true");
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
val df = sqlContext.read.format("jdbc").option("url","jdbc:sqlserver://server_IP:port").option("databaseName","DB_name").option("driver","com.microsoft.sqlserver.jdbc.SQLServerDriver").option("dbtable","tbl").option("user","uid").option("password","pwd").load()
val df2 = df.sqlContext.sql("SELECT col1,col2 FROM tbl LIMIT 5")
exit()
When I try to execute the above code, I get the error "org.apache.spark.sql.AnalysisException: Table not found: tbl;". However, if I remove df2 and execute the code, I can see the content of the table tbl successfully. Is there anything I am doing wrong? I am using Spark 1.6.1, so I checked the documentation (https://spark.apache.org/docs/1.6.0/sql-programming-guide.html, under "Running SQL Queries Programmatically"), and the syntax I used to fire the SQL query through the SQLContext matches it.
The following is the relevant part of the full error trace:
conf: org.apache.spark.SparkConf = org.apache.spark.SparkConf@5eea8854
sc: org.apache.spark.SparkContext = org.apache.spark.SparkContext@7790a6fb
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@a9f4621
df: org.apache.spark.sql.DataFrame = [col1: int, col2: string, col3: string, col4: string, col5: string, col6: string, col7: string, col8: string, col9: timestamp, col10: timestamp, col11: string, col12: string]
org.apache.spark.sql.AnalysisException: Table not found: tbl;
The df in your code is a DataFrame.
If you want to do any select operations, do them directly on the DataFrame, e.g. df.select(...).
If you want to execute a query with sqlContext.sql(), you first have to register the DataFrame as a temporary table with df.registerTempTable(tableName: String).
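For illustration, here is a PySpark sketch of that fix (the Scala calls df.registerTempTable and sqlContext.sql are analogous), reusing the tbl name and columns from the question:
# Register the loaded DataFrame under a name the SQL parser can resolve,
# then query it through sqlContext.sql().
df.registerTempTable("tbl")
df2 = sqlContext.sql("SELECT col1, col2 FROM tbl LIMIT 5")
df2.show()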

SQLAlchemy Declarative - schemas in SQL Server and foreign/primary keys

I'm struggling to create tables that belong to a schema in a SQL Server database and to ensure that primary/foreign keys work correctly.
I'm looking for some code examples that illustrate how this is done.
The ingredients needed for this are __table_args__ and the schema prefix on the ForeignKey:
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

# engine = create_engine(...)  # your SQL Server engine
DBSession = sessionmaker(bind=engine)
session = DBSession()

Base = declarative_base()

class Table1(Base):
    __tablename__ = 'table1'
    __table_args__ = {"schema": 'my_schema'}
    id = Column(Integer, primary_key=True)
    col1 = Column(String(150))
    col2 = Column(String(100))
    reviews = relationship("Table2", cascade="delete")

class Table2(Base):
    __tablename__ = 'table2'
    __table_args__ = {"schema": 'my_schema'}
    id = Column(Integer, primary_key=True)
    col2 = Column(String(100))
    # the ForeignKey target must carry the schema prefix
    key = Column(Integer, ForeignKey("my_schema.table1.id"), index=True)
    premise = relationship("Table1")

Base.metadata.create_all(bind=engine)
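One caveat, as a hedged note: create_all() will not create the schema itself, so my_schema must already exist on the SQL Server side. For example, you could create it once up front:
from sqlalchemy.schema import CreateSchema

# Create the schema once, before Base.metadata.create_all(bind=engine);
# guard it with a check or try/except if it may already exist.
with engine.begin() as conn:
    conn.execute(CreateSchema("my_schema"))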
