sqlalchemy.orm.exc.UnmappedInstanceError: Class 'builtins.dict' is not mapped, while using marshmallow-sqlalchemy (SQL Server)

I don't get it. I'm trying to start a brand new table in MS SQL Server 2012 with the following:
In SQL Server:
CREATE TABLE [dbo].[Inventory](
    [Index_No] [bigint] IDENTITY(1,1) NOT NULL,
    [Part_No] [varchar](150) NOT NULL,
    [Shelf] [int] NOT NULL,
    [Bin] [int] NOT NULL,
    PRIMARY KEY CLUSTERED
    (
        [Index_No] ASC
    ),
    UNIQUE NONCLUSTERED
    (
        [Part_No] ASC
    )
)
GO
NOTE: This is a BRAND NEW TABLE! There is no data in it at all
Next, this is the Database.py file:
import pymssql
from sqlalchemy import (create_engine, Table, MetaData, select, Column, Integer, Float,
                        String, text, func, desc, and_, or_, Date, insert)
from sqlalchemy.orm import Session
from sqlalchemy.ext.declarative import declarative_base
from marshmallow_sqlalchemy import SQLAlchemyAutoSchema

Base = declarative_base()

USERNAME = "name"
PSSWD = "none_of_your_business"
SERVERNAME = "MYSERVER"
INSTANCENAME = "\\SQLSERVER2012"
DB = "Inventory"

engine = create_engine(f"mssql+pymssql://{USERNAME}:{PSSWD}@{SERVERNAME}{INSTANCENAME}/{DB}")

class Inventory(Base):
    __tablename__ = "Inventory"

    Index_No = Column('Index_No', Integer, primary_key=True, autoincrement=True)
    Part_No = Column("Part_No", String, unique=True)
    Shelf = Column("Shelf", Integer)
    Bin = Column("Bin", Integer)

    def __repr__(self):
        return f'Inventory(Index_No={self.Index_No!r}, Part_No={self.Part_No!r}, Shelf={self.Shelf!r}, ' \
               f'Bin={self.Bin!r})'

class InventorySchema(SQLAlchemyAutoSchema):
    class Meta:
        model = Inventory
        load_instance = True
It's also worth noting that I'm using SQLAlchemy 1.4.3, if that helps.
And in main.py:
import Database as db
db.Base.metadata.create_all(db.engine)
data_list = [{'Part_No': '123A', 'Shelf': 1, 'Bin': 5},
             {'Part_No': '456B', 'Shelf': 1, 'Bin': 7},
             {'Part_No': '789C', 'Shelf': 2, 'Bin': 1}]

with db.Session(db.engine, future=True) as session:
    try:
        session.add_all(data_list)  # <--- FAILS HERE AND THROWS AN EXCEPTION
        session.commit()
    except Exception as e:
        session.rollback()
        print(f"Error! {e!r}")
        raise
    finally:
        session.close()
Now, most of what I've googled on "Class 'builtins.dict' is not mapped" points me to the marshmallow-sqlalchemy package, which I've tried, but I'm still getting the same error. So I've tried moving Base.metadata.create_all(engine) from Database.py into main.py. I also tried implementing an __init__ method in the Inventory class and calling super().__init__(), which didn't work either.
So what's going on? Why is it failing, and is there a better solution to this problem?

session.add_all() expects mapped instances, not plain dicts, so try creating Inventory objects instead:
data_list = [
    Inventory(Part_No='123A', Shelf=1, Bin=5),
    Inventory(Part_No='456B', Shelf=1, Bin=7),
    Inventory(Part_No='789C', Shelf=2, Bin=1)
]
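Alternatively, since the question already defines an InventorySchema with load_instance=True, the dicts can be deserialized into mapped Inventory instances via marshmallow-sqlalchemy. A minimal sketch, assuming the Database.py module shown above:
import Database as db
from sqlalchemy.orm import Session

raw_rows = [{'Part_No': '123A', 'Shelf': 1, 'Bin': 5},
            {'Part_No': '456B', 'Shelf': 1, 'Bin': 7},
            {'Part_No': '789C', 'Shelf': 2, 'Bin': 1}]

schema = db.InventorySchema()

with Session(db.engine, future=True) as session:
    # load_instance=True makes schema.load() return Inventory objects,
    # which is what session.add_all() expects (plain dicts are not mapped)
    objects = [schema.load(row, session=session) for row in raw_rows]
    session.add_all(objects)
    session.commit()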

Related

SqlAlchemy - How to query a table based on a key property saved as NestedMutableJson

Suppose a Postgres user table contains a property of type NestedMutableJson:
first_name character varying (120)
last_name character varying (120)
country_info NestedMutableJson
...
from sqlalchemy_json import NestedMutableJson
country_info = db.Column(NestedMutableJson, nullable=True)
country_info = {"name": "UK", "code": "11"}
How can the user table be queried based on a country_info key?
POSTGRES Query
SELECT * FROM user WHERE country_info ->> 'name' = 'UK'
Is there a SQLAlchemy way to get the same query result?
I tried several ways, example:
Way 1:
User.query.filter(User.country_info['name'].astext == 'UK').all()
Error:
Operator 'getitem' is not supported on this expression
Way 2:
User.query.filter(User.country_info.op('->>')('name') == 'UK').all()
Issue:
Always getting an empty response
I'm wondering if the issue is caused by the column definition db.Column(NestedMutableJson, nullable=True).
I'd like to avoid db.session.execute("SELECT * FROM user WHERE country_info ->> 'name' = 'UK'").fetchall() and am looking for something else.
Simply use text(), which allows you to write a plain-text filter inside the ORM query; it works like a dict's get() and can also handle None fields.
User.query.filter(text("COALESCE(user.country_info ->> 'name', '') = 'UK'")).all()
Note that the table name used here should be the real name of the table in the database.
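A small variation on that filter (a sketch, not from the original answer, assuming the Flask-SQLAlchemy-style User model from the question): the same condition with the value bound as a parameter instead of inlined in the string.
from sqlalchemy import text

# bind the comparison value instead of embedding the literal in the SQL text
stmt = text("COALESCE(user.country_info ->> 'name', '') = :name")
users = User.query.filter(stmt.bindparams(name='UK')).all()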
Take a look at the JSON datatype documentation.
You should be able to use a filter clause like this:
select(CountryInfo).filter(CountryInfo.country_info['name'].astext == "UK")
You can use .op('->>') to use the PostgreSQL operator ->> in the following way:
from sqlalchemy import Column, Integer, create_engine, String, select
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session
from sqlalchemy.dialects.postgresql import JSON
from sqlalchemy_json import NestedMutableJson

dburl = 'postgresql://...'

Base = declarative_base()

class CountryInfo(Base):
    __tablename__ = 'country_info'

    id = Column(Integer, unique=True, nullable=False, primary_key=True)
    name = Column(String)
    country_info = Column(NestedMutableJson, nullable=True)

    def __repr__(self):
        return f'CountryInfo({self.name!r}, {self.country_info!r})'

engine = create_engine(dburl, future=True, echo=True)
Base.metadata.drop_all(engine)
Base.metadata.create_all(engine)

with Session(engine) as session:
    test = CountryInfo(name='test', country_info={"name": "UK", "code": "11"})
    test2 = CountryInfo(name='test2', country_info={"name": "NL", "code": "12"})
    test3 = CountryInfo(name='test3', country_info={"name": "UK", "code": "13"})
    session.add(test)
    session.add(test2)
    session.add(test3)
    session.commit()

    stmt = select(CountryInfo).filter(CountryInfo.country_info.op('->>')('name') == "UK")
    query = session.execute(stmt).all()
    for row in query:
        print(row)
This results in the following SQL:
SELECT country_info.id, country_info.name, country_info.country_info
FROM country_info
WHERE (country_info.country_info ->> %(country_info_1)s) = %(param_1)s
with {'country_info_1': 'name', 'param_1': 'UK'}
Which results in:
(CountryInfo('test', {'name': 'UK', 'code': '11'}),)
(CountryInfo('test3', {'name': 'UK', 'code': '13'}),)

How to redefine tables with the same name in SQLAlchemy using Classical Mapping

I am using SQLAlchemy classical mapping to define a table with the same name but different columns depending on the database. I have mapped the class as explained in the docs, but I get errors every time I try to redefine the class for another database. For instance:
from sqlalchemy import (Table, MetaData, String, Column, create_engine)
from sqlalchemy.orm import mapper, sessionmaker

class MyTable(object):
    def __init__(self, *args, **kwargs):
        [setattr(self, k, v) for k, v in kwargs.items()]

default_cols = (
    Column('column1', String(20), primary_key=True),
    Column('column2', String(20))
)

def myfunc1():
    engine = create_engine('connection_to_database1')
    session = sessionmaker(bind=engine)()
    metadata = MetaData()
    mytable = Table('mytable', metadata, *default_cols)
    mapper(MyTable, mytable)
    metadata.create_all(bind=engine)

def myfunc2():
    engine = create_engine('connection_to_database2')
    session = sessionmaker(bind=engine)()
    metadata = MetaData()
    columns = list(default_cols) + [Column('column3', String(20))]
    mytable = Table('mytable', metadata, *columns)
    mapper(MyTable, mytable)
    metadata.create_all(bind=engine)

myfunc1()
myfunc2()
The error I get:
Column object 'column1' already assigned to Table 'mytable'
How is this happening if I am using completely different instances of MetaData and engines? Is there a way to achieve this?
Using the default_cols variable was actually the problem; it seems this kind of setup doesn't work unless the columns are defined individually in each function:
def myfunc1():
    engine = create_engine('connection_to_database1')
    session = sessionmaker(bind=engine)()
    metadata = MetaData()
    mytable = Table('mytable', metadata,
                    Column('column1', String(20), primary_key=True),
                    Column('column2', String(20))
                    )
    mapper(MyTable, mytable)
    metadata.create_all(bind=engine)

def myfunc2():
    engine = create_engine('connection_to_database2')
    session = sessionmaker(bind=engine)()
    metadata = MetaData()
    columns = [
        Column('column1', String(20), primary_key=True),
        Column('column2', String(20)),
        Column('column3', String(20))
    ]
    mytable = Table('mytable', metadata, *columns)
    mapper(MyTable, mytable)
    metadata.create_all(bind=engine)
Otherwise it will raise the Exception:
Column object 'column1' already assigned to Table 'mytable'
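The underlying reason is that a Column object can be attached to only one Table, so reusing the same instances in a second Table fails. A possible workaround (a sketch, not from the original answer): turn default_cols into a factory that returns fresh Column objects on every call:
from sqlalchemy import Column, String

def default_cols():
    # build brand-new Column objects each time, since a Column instance
    # can only belong to a single Table
    return (
        Column('column1', String(20), primary_key=True),
        Column('column2', String(20)),
    )

# inside each function:
# mytable = Table('mytable', metadata, *default_cols())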
I couldn't reproduce your error. To get the code to work I had to swap the order of the Mapper arguments and add primary keys to the table definitions. More significantly perhaps, I had to set one of the mappers as non-primary after getting this error:
sqlalchemy.exc.ArgumentError: Class '<class '__main__.MyTable'>' already has a primary
mapper defined. Use non_primary=True to create a non primary Mapper. clear_mappers()
will remove *all* current mappers from all classes.
from sqlalchemy import Table, MetaData, String, Column, create_engine, Integer
from sqlalchemy.orm import mapper

class MyTable(object):
    def __init__(self, *args, **kwargs):
        [setattr(self, k, v) for k, v in kwargs.items()]

def myfunc1():
    engine = create_engine("mysql+pymysql:///test")
    metadata = MetaData()
    mytable = Table(
        "mytable111",
        metadata,
        Column("id", Integer, primary_key=True),
        Column("column1", String(20)),
        Column("column2", String(20)),
    )
    mapper(MyTable, mytable)
    metadata.create_all(bind=engine)

def myfunc2():
    engine = create_engine("postgresql+psycopg2:///test")
    metadata = MetaData()
    mytable = Table(
        "mytable111",
        metadata,
        Column("id", Integer, primary_key=True),
        Column("column1", String(20)),
        Column("column2", String(20)),
        Column("column3", String(20)),
    )
    mapper(MyTable, mytable, non_primary=True)
    metadata.create_all(bind=engine)

myfunc1()
myfunc2()
Using Python3.8, SQLAlchemy 1.3.10.
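Another arrangement worth noting (a sketch, not taken from either answer): give each database its own mapped class, derived from a shared unmapped base, so that each class receives its own primary mapper and the conflict never arises.
from sqlalchemy import Table, MetaData, String, Column, Integer
from sqlalchemy.orm import mapper

class _MyTableBase:                       # unmapped shared behaviour
    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)

class MyTableDB1(_MyTableBase): pass      # hypothetical per-database classes
class MyTableDB2(_MyTableBase): pass

metadata1, metadata2 = MetaData(), MetaData()

table1 = Table("mytable", metadata1,
               Column("id", Integer, primary_key=True),
               Column("column1", String(20)),
               Column("column2", String(20)))

table2 = Table("mytable", metadata2,
               Column("id", Integer, primary_key=True),
               Column("column1", String(20)),
               Column("column2", String(20)),
               Column("column3", String(20)))

# each class gets its own primary mapper, so non_primary=True is not needed
mapper(MyTableDB1, table1)
mapper(MyTableDB2, table2)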

Cannot Insert into SQL using PySpark, but works in SQL

I have created a table below in SQL using the following:
CREATE TABLE [dbo].[Validation](
[RuleId] [int] IDENTITY(1,1) NOT NULL,
[AppId] [varchar](255) NOT NULL,
[Date] [date] NOT NULL,
[RuleName] [varchar](255) NOT NULL,
[Value] [nvarchar](4000) NOT NULL
)
NOTE the identity key (RuleId)
When inserting values into the table as below in SQL it works:
Note: the primary key is not inserted; it autofills (starting from 1 if the table is empty) and increments automatically.
INSERT INTO dbo.Validation VALUES ('TestApp','2020-05-15','MemoryUsageAnomaly','2300MB')
However, when I create a temp view on Databricks and run the same insert through PySpark as below:
%python
driver = <Driver>
url = "jdbc:sqlserver:<URL>"
database = "<db>"
table = "dbo.Validation"
user = "<user>"
password = "<pass>"
#import the data
remote_table = spark.read.format("jdbc")\
.option("driver", driver)\
.option("url", url)\
.option("database", database)\
.option("dbtable", table)\
.option("user", user)\
.option("password", password)\
.load()
remote_table.createOrReplaceTempView("YOUR_TEMP_VIEW_NAMES")
sqlcontext.sql("INSERT INTO YOUR_TEMP_VIEW_NAMES VALUES ('TestApp','2020-05-15','MemoryUsageAnomaly','2300MB')")
I get the error below:
AnalysisException: 'unknown requires that the data to be inserted have the same number of columns as the target table: target table has 5 column(s) but the inserted data has 4 column(s), including 0 partition column(s) having constant value(s).;'
Why does it work on SQL but not when passing the query through databricks? How can I insert through pyspark without getting this error?
The most straightforward solution here is to use JDBC from a Scala cell, e.g.:
%scala
import java.util.Properties
import java.sql.DriverManager
val jdbcUsername = dbutils.secrets.get(scope = "kv", key = "sqluser")
val jdbcPassword = dbutils.secrets.get(scope = "kv", key = "sqlpassword")
val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
// Create the JDBC URL without passing in the user and password parameters.
val jdbcUrl = s"jdbc:sqlserver://xxxx.database.windows.net:1433;database=AdventureWorks;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
// Create a Properties() object to hold the parameters.
val connectionProperties = new Properties()
connectionProperties.put("user", s"${jdbcUsername}")
connectionProperties.put("password", s"${jdbcPassword}")
connectionProperties.setProperty("Driver", driverClass)
val connection = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
val stmt = connection.createStatement()
val sql = "INSERT INTO dbo.Validation VALUES ('TestApp','2020-05-15','MemoryUsageAnomaly','2300MB')"
stmt.execute(sql)
connection.close()
You could use pyodbc too, but the SQL Server ODBC drivers aren't installed by default, and the JDBC drivers are.
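For completeness, a pyodbc sketch (an assumption, not part of the original answer; the driver name "ODBC Driver 17 for SQL Server" is a guess and the ODBC driver must be installed on the cluster first):
import pyodbc

sql_user = dbutils.secrets.get(scope="kv", key="sqluser")
sql_password = dbutils.secrets.get(scope="kv", key="sqlpassword")

# open a direct ODBC connection and run the same INSERT as the SQL example
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=xxxx.database.windows.net,1433;"
    "DATABASE=AdventureWorks;"
    f"UID={sql_user};PWD={sql_password}"
)
cursor = conn.cursor()
cursor.execute("INSERT INTO dbo.Validation VALUES ('TestApp','2020-05-15','MemoryUsageAnomaly','2300MB')")
conn.commit()
conn.close()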
A Spark solution would be to create a view in SQL Server that omits the identity column and insert against that, so the column counts match, e.g.:
create view Validation2 as
select AppId,Date,RuleName,Value
from Validation
then
tableName = "Validation2"
df = spark.read.jdbc(url=jdbcUrl, table=tableName, properties=connectionProperties)
df.createOrReplaceTempView(tableName)
sqlContext.sql("INSERT INTO Validation2 VALUES ('TestApp','2020-05-15','MemoryUsageAnomaly','2300MB')")
If you want to encapsulate the Scala code and call it from another language (like Python), you can use a Scala package cell, e.g.:
%scala
package example

import java.util.Properties
import java.sql.DriverManager

object JDBCFacade
{
  def runStatement(url: String, sql: String, userName: String, password: String): Unit =
  {
    val connection = DriverManager.getConnection(url, userName, password)
    val stmt = connection.createStatement()
    try
    {
      stmt.execute(sql)
    }
    finally
    {
      connection.close()
    }
  }
}
and then you can call it like this:
jdbcUsername = dbutils.secrets.get(scope = "kv", key = "sqluser")
jdbcPassword = dbutils.secrets.get(scope = "kv", key = "sqlpassword")
jdbcUrl = "jdbc:sqlserver://xxxx.database.windows.net:1433;database=AdventureWorks;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
sql = "select 1 a into #foo from sys.objects"
sc._jvm.example.JDBCFacade.runStatement(jdbcUrl,sql, jdbcUsername, jdbcPassword)

Read error with spark.read against SQL Server table (via JDBC Connection)

I have a problem in Zeppelin when I try to create a DataFrame by reading directly from a SQL Server table: I don't know how to read a SQL column of the geography type.
SQL table
This is the code that I am using, and the error that I obtain.
Create JDBC connection
import org.apache.spark.sql.SaveMode
import java.util.Properties
val jdbcHostname = "XX.XX.XX.XX"
val jdbcDatabase = "databasename"
val jdbcUsername = "user"
val jdbcPassword = "XXXXXXXX"
// Create the JDBC URL without passing in the user and password parameters.
val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname};database=${jdbcDatabase}"
// Create a Properties() object to hold the parameters.
val connectionProperties = new Properties()
connectionProperties.put("user", s"${jdbcUsername}")
connectionProperties.put("password", s"${jdbcPassword}")
connectionProperties.setProperty("Driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
Read from SQL
import spark.implicits._
val table = "tablename"
val postcode_polygons = spark.
read.
jdbc(jdbcUrl, table, connectionProperties)
Error
import spark.implicits._
table: String = Lookup.Postcode50m_Lookup
java.sql.SQLException: Unsupported type -158
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$getCatalystType(JdbcUtils.scala:233)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:290)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:290)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:289)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:64)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:114)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:52)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:307)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:193)
Adding to thebluephantom's answer: have you tried changing the type to string as below and then loading the table?
val jdbcDF = spark.read.format("jdbc")
  .option("dbtable", "(select toString(SData) as s_sdata, toString(CentroidSData) as s_centroidSdata from table) t")
  .option("user", "user_name")
  .option("other options")
  .load()
This is the final solution in my case. The idea from moasifk is correct, but in my code I cannot use the "toString" function, so I applied the same idea with different syntax.
import spark.implicits._
val tablename = "Lookup.Postcode50m_Lookup"
val postcode_polygons = spark.
read.
jdbc(jdbcUrl, table=s"(select PostcodeNoSpaces, cast(SData as nvarchar(4000)) as SData from $tablename) as postcode_table", connectionProperties)
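For reference, the same cast-in-a-subquery idea written in PySpark (a sketch, assuming the jdbcUrl and credentials defined earlier in this question are redefined in the Python cell):
tablename = "Lookup.Postcode50m_Lookup"
query = (f"(select PostcodeNoSpaces, cast(SData as nvarchar(4000)) as SData "
         f"from {tablename}) as postcode_table")

postcode_polygons = (spark.read.format("jdbc")
    .option("url", jdbcUrl)
    .option("dbtable", query)              # push the cast down so geography arrives as text
    .option("user", jdbcUsername)
    .option("password", jdbcPassword)
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load())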

SQLAlchemy Declarative - schemas in SQL Server and foreign/primary keys

I'm struggling to create tables that belong to a schema in a SQL Server database and to ensure that primary/foreign keys work correctly.
I'm looking for some example code to illustrate how this is done.
The ingredients needed for this are __table_args__ and the use of the schema prefix on the ForeignKey:
from sqlalchemy import Column, Integer, String, ForeignKey
from sqlalchemy.orm import relationship, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

DBSession = sessionmaker(bind=engine)  # engine assumed to be created elsewhere
session = DBSession()

Base = declarative_base()

class Table1(Base):
    __tablename__ = 'table1'
    __table_args__ = {"schema": 'my_schema'}

    id = Column(Integer, primary_key=True)
    col1 = Column(String(150))
    col2 = Column(String(100))
    reviews = relationship("Table2", cascade="delete")

class Table2(Base):
    __tablename__ = 'table2'
    __table_args__ = {"schema": 'my_schema'}

    id = Column(Integer, primary_key=True)
    col2 = Column(String(100))
    key = Column(Integer, ForeignKey("my_schema.table1.id"), index=True)
    premise = relationship("Table1")

Base.metadata.create_all(bind=engine)
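A brief usage sketch (an addition, assuming the engine and session defined above): once the ForeignKey carries the my_schema. prefix, joins across the schema-qualified tables work as usual.
# hypothetical usage: join Table2 back to Table1 through the schema-qualified foreign key
rows = (
    session.query(Table2)
    .join(Table1, Table2.key == Table1.id)
    .all()
)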
