Why is sequence-identity not working in SQL Server sequences?
@GenericGenerator(
    name = "sequence",
    strategy = "sequence-identity",
    parameters = {
        @org.hibernate.annotations.Parameter(
            name = "sequence",
            value = "SEQ_PARTNER_TIMETABLE_ID"
        )
    })

Try this: first create a sequence in SQL Server:
CREATE SEQUENCE [schema_name.]sequence_name
    START WITH 1
    INCREMENT BY 1
    NO CYCLE;
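For instance, a concrete instantiation (the name dbo.users_seq is illustrative, not from the original answer), which you can verify directly before wiring it to Hibernate:
CREATE SEQUENCE dbo.users_seq
    START WITH 1
    INCREMENT BY 1
    NO CYCLE;
-- Draw a value to confirm the sequence works (returns 1, then 2, ...)
SELECT NEXT VALUE FOR dbo.users_seq;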
then reference that sequence from your entity's id:
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "users_seq_gen")
@SequenceGenerator(name = "users_seq_gen", sequenceName = "sequence_name", allocationSize = 1)
Note that @SequenceGenerator defaults allocationSize to 50; since the sequence above uses INCREMENT BY 1, set allocationSize = 1 so Hibernate and the database stay in step.

Related

JPA, SQL Server - column with prefix and sequence

Creation of a column with a prefix and a sequence using JPA and SQL Server.
I need a column (not the primary key) whose values combine a prefix with a sequence number, for example T1, T2, T3.
I've tried this:
CREATE SEQUENCE t_sequence START WITH 1 INCREMENT BY 1;
ALTER TABLE gate ADD T_NUM INT;
ALTER TABLE gate ADD T VARCHAR(30);
...
#Column(name = "T_NUM")
#GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "t_sequence ")
#SequenceGenerator(name = "t_sequence ", t_sequence = "mir_sequence", allocationSize = 1, initialValue = 1)
private int tNum;
#Column(name = "T")
private String t;
...
#PrePersist
public void onCreate() {
super.onCreate();
t= "T" + this.tNum;
}
As a result, I always get T0 in the T column.

linq2db - server side bulkcopy

I'm trying to do a "database side" bulk copy (i.e. SELECT INTO / INSERT INTO) using linq2db. However, my code is trying to bring the data set over the wire, which is not possible given the size of the DB in question.
My code looks like this:
using (var db = new MyDb()) {
var list = db.SourceTable.
Where(s => s.Year > 2012).
GroupBy(s => new { s.Column1, s.Column2 }).
Select(g => new DestinationTable {
Property1 = "Constant Value",
Property2 = g.First().Column1,
Property3 = g.First().Column2,
Property4 = g.Count(s => s.Column3 == 'Y')
});
db.Execute("TRUNCATE TABLE DESTINATION_TABLE");
db.BulkCopy(new BulkCopyOptions {
BulkCopyType = BulkCopyType.MultipleRows
}, list);
}
The generated SQL looks like this:
BeforeExecute
-- DBNAME SqlServer.2017
TRUNCATE TABLE DESTINATION_TABLE
DataConnection
Query Execution Time (AfterExecute): 00:00:00.0361209. Records Affected: -1.
DataConnection
BeforeExecute
-- DBNAME SqlServer.2017
DECLARE @take Int -- Int32
SET @take = 1
DECLARE @take_1 Int -- Int32
SET @take_1 = 1
DECLARE @take_2 Int -- Int32
...
SELECT
(
SELECT TOP (@take)
[p].[YEAR]
FROM
[dbo].[SOURCE_TABLE] [p]
WHERE
(([p_16].[YEAR] = [p].[YEAR] OR [p_16].[YEAR] IS NULL AND [p].[YEAR] IS NULL) AND ...
...)
FROM SOURCE_TABLE p_16
WHERE p_16.YEAR > 2012
GROUP BY
...
DataConnection
That is all that is logged, as the bulk copy fails with a timeout: SqlException "Execution Timeout Expired".
Please note that running this query as an INSERT INTO statement directly in the DB takes less than one second.
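For reference, a hand-written equivalent might look like the sketch below; the destination column names are assumptions taken from the property names in the LINQ query:
INSERT INTO DESTINATION_TABLE (Property1, Property2, Property3, Property4)
SELECT
    'Constant Value',
    Column1,
    Column2,
    SUM(CASE WHEN Column3 = 'Y' THEN 1 ELSE 0 END)
FROM SOURCE_TABLE
WHERE [YEAR] > 2012
GROUP BY Column1, Column2;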
PS: Does anyone have recommendations for good code-based ETL tools for large (1 TB+) databases? Given the DB size, I need things to run in the database and not bring data over the wire. I've tried pyspark, python bonobo, and c# etlbox, and they all move too much data around. I thought linq2db had potential, i.e. basically acting as a C#-to-SQL transpiler, but it is also trying to move data around.
I would suggest rewriting your query, because GROUP BY cannot return a first element. Also, Truncate is part of the library.
var sourceQuery =
    from s in db.SourceTable
    where s.Year > 2012
    select new
    {
        Source = s,
        Count = Sql.Ext.Count(s.Column3 == 'Y' ? (int?)1 : null).Over()
            .PartitionBy(s.Column1, s.Column2).ToValue(),
        RN = Sql.Ext.RowNumber().Over()
            .PartitionBy(s.Column1, s.Column2).OrderByDesc(s.Year).ToValue()
    };
db.DestinationTable.Truncate();
sourceQuery.Where(s => s.RN == 1)
    .Insert(db.DestinationTable,
        e => new DestinationTable
        {
            Property1 = "Constant Value",
            Property2 = e.Source.Column1,
            Property3 = e.Source.Column2,
            Property4 = e.Count
        });
After some investigation I stumbled onto this issue, which led me to the solution. The code above needs to change to:
db.Execute("TRUNCATE TABLE DESTINATION_TABLE");
db.SourceTable.
Where(s => s.Year > 2012).
GroupBy(s => new { s.Column1, s.Column2 }).
Select(g => new DestinationTable {
Property1 = "Constant Value",
Property2 = g.First().Column1,
Property3 = g.First().Column2,
Property4 = g.Count(s => s.Column3 == 'Y')
}).Insert(db.DestinationTable, e => e);
The linq2db project's documentation leaves a bit to be desired; however, in terms of functionality it's looking like a great project for ETLs (without horrible thousands-of-lines copy/paste SQL/SSIS scripts).

Cannot Insert into SQL using PySpark, but works in SQL

I have created the table below in SQL Server using the following:
CREATE TABLE [dbo].[Validation](
[RuleId] [int] IDENTITY(1,1) NOT NULL,
[AppId] [varchar](255) NOT NULL,
[Date] [date] NOT NULL,
[RuleName] [varchar](255) NOT NULL,
[Value] [nvarchar](4000) NOT NULL
)
Note the identity column (RuleId).
When inserting values into the table in SQL as below, it works:
Note: I am not inserting the primary key, since the identity column auto-fills and increments.
INSERT INTO dbo.Validation VALUES ('TestApp','2020-05-15','MemoryUsageAnomaly','2300MB')
However, when I create a temp view on Databricks and execute the same insert through PySpark as below:
%python
driver = <Driver>
url = "jdbc:sqlserver:<URL>"
database = "<db>"
table = "dbo.Validation"
user = "<user>"
password = "<pass>"
#import the data
remote_table = spark.read.format("jdbc")\
.option("driver", driver)\
.option("url", url)\
.option("database", database)\
.option("dbtable", table)\
.option("user", user)\
.option("password", password)\
.load()
remote_table.createOrReplaceTempView("YOUR_TEMP_VIEW_NAMES")
sqlContext.sql("INSERT INTO YOUR_TEMP_VIEW_NAMES VALUES ('TestApp','2020-05-15','MemoryUsageAnomaly','2300MB')")
I get the error below:
AnalysisException: 'unknown requires that the data to be inserted have the same number of columns as the target table: target table has 5 column(s) but the inserted data has 4 column(s), including 0 partition column(s) having constant value(s).;'
Why does it work on SQL but not when passing the query through databricks? How can I insert through pyspark without getting this error?
The most straightforward solution here is to use JDBC from a Scala cell, e.g.:
%scala
import java.util.Properties
import java.sql.DriverManager
val jdbcUsername = dbutils.secrets.get(scope = "kv", key = "sqluser")
val jdbcPassword = dbutils.secrets.get(scope = "kv", key = "sqlpassword")
val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
// Create the JDBC URL without passing in the user and password parameters.
val jdbcUrl = s"jdbc:sqlserver://xxxx.database.windows.net:1433;database=AdventureWorks;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
// Create a Properties() object to hold the parameters.
val connectionProperties = new Properties()
connectionProperties.put("user", s"${jdbcUsername}")
connectionProperties.put("password", s"${jdbcPassword}")
connectionProperties.setProperty("Driver", driverClass)
val connection = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
val stmt = connection.createStatement()
val sql = "INSERT INTO dbo.Validation VALUES ('TestApp','2020-05-15','MemoryUsageAnomaly','2300MB')"
stmt.execute(sql)
connection.close()
You could use pyodbc too, but the SQL Server ODBC drivers aren't installed by default, whereas the JDBC drivers are.
A Spark solution would be to create a view in SQL Server and insert against that, e.g.:
create view Validation2 as
select AppId,Date,RuleName,Value
from Validation
then
tableName = "Validation2"
df = spark.read.jdbc(url=jdbcUrl, table=tableName, properties=connectionProperties)
df.createOrReplaceTempView(tableName)
sqlContext.sql("INSERT INTO Validation2 VALUES ('TestApp','2020-05-15','MemoryUsageAnomaly','2300MB')")
If you want to encapsulate the Scala code and call it from another language (like Python), you can use a Scala package cell, e.g.:
%scala
package example
import java.util.Properties
import java.sql.DriverManager
object JDBCFacade
{
def runStatement(url : String, sql : String, userName : String, password: String): Unit =
{
val connection = DriverManager.getConnection(url, userName, password)
val stmt = connection.createStatement()
try
{
stmt.execute(sql)
}
finally
{
connection.close()
}
}
}
and then you can call it like this:
jdbcUsername = dbutils.secrets.get(scope = "kv", key = "sqluser")
jdbcPassword = dbutils.secrets.get(scope = "kv", key = "sqlpassword")
jdbcUrl = "jdbc:sqlserver://xxxx.database.windows.net:1433;database=AdventureWorks;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
sql = "select 1 a into #foo from sys.objects"
sc._jvm.example.JDBCFacade.runStatement(jdbcUrl,sql, jdbcUsername, jdbcPassword)

Snowflake JDBC parameter metadata returning VARCHAR for all data types

The Snowflake JDBC driver is reporting the parameter metadata for all data types as VARCHAR. Is there any way to overcome this problem?
DDL:
CREATE TABLE INTTABLE(INTCOL INTEGER)
Below is the output from the Snowflake ODBC driver:
SQLPrepare:
In:StatementHandle = 0x00000000021B1B50, StatementText = "INSERT INTO INTTABLE(INTCOL) VALUES(?)", TextLength = 42
Return: SQL_SUCCESS=0
SQLDescribeParam:
In:StatementHandle = 0x00000000021B1B50, ParameterNumber = 1, DataTypePtr = 0x00000000001294D0, ParameterSizePtr = 0x0000000000126950,DecimalDigits =0x0000000000126980, NullablePtr = 0x00000000001269B0
Return: SQL_SUCCESS=0
Out:*DataTypePtr = SQL_VARCHAR=12, *ParameterSizePtr = 16777216, *DecimalDigits = 0, *NullablePtr = SQL_NULLABLE=1
Below is the output with the Snowflake JDBC driver:
PreparedStatement ps = c.prepareStatement("INSERT INTO INTTABLE(INTCOL) VALUES(?)");
ParameterMetaData psmd = ps.getParameterMetaData();
for(int i=1 ;i<=psmd.getParameterCount(); i++) {
System.out.println(psmd.getParameterType(i)+ " " + psmd.getParameterTypeName(i));
}
Output:
12 text
Thank you for adding more information to your thread; I may still be doing a little guesswork, though.
If you are trying to change the column type from VARCHAR and there are no values in the table, you can drop the table and re-create it.
If you want to alter what is already in the table, try altering the table first: Manual Reference
There is also CREATE OR REPLACE TABLE (col1 ..., col2 ...), which takes care of both.
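For example, a minimal sketch against the table from the question, re-creating it with the desired column type in a single statement (the type shown is illustrative):
CREATE OR REPLACE TABLE INTTABLE (INTCOL INTEGER);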
Is this what you are looking for?

Cannot understand how Entity Framework will generate a SQL statement for an update operation using timestamp?

I have the following method inside my ASP.NET MVC web application:
var rack = IT.ITRacks.Where(a => !a.Technology.IsDeleted && a.Technology.IsCompleted);
foreach (var r in rack)
{
    long? it360id = technology[r.ITRackID];
    if (it360resource.ContainsKey(it360id.Value))
    {
        long? CurrentIT360siteid = it360resource[it360id.Value];
        if (CurrentIT360siteid != r.IT360SiteID)
        {
            r.IT360SiteID = CurrentIT360siteid.Value;
            IT.Entry(r).State = EntityState.Modified;
            count = count + 1;
        }
    }
    IT.SaveChanges();
}
When I checked SQL Server Profiler, I noted that EF generated the following SQL statement:
exec sp_executesql N'update [dbo].[ITSwitches]
set [ModelID] = @0, [Spec] = null, [RackID] = @1, [ConsoleServerID] = null, [Description] = null, [IT360SiteID] = @2, [ConsoleServerPort] = null
where (([SwitchID] = @3) and ([timestamp] = @4))
select [timestamp]
from [dbo].[ITSwitches]
where @@ROWCOUNT > 0 and [SwitchID] = @3',N'@0 int,@1 int,@2 bigint,@3 int,@4 binary(8)',@0=1,@1=539,@2=1502,@3=1484,@4=0x00000000000EDCB2
I cannot understand the purpose of the following section:
select [timestamp]
from [dbo].[ITSwitches]
where @@ROWCOUNT > 0 and [SwitchID] = @3',N'@0 int,@1 int,@2 bigint,@3 int,@4 binary(8)',@0=1,@1=539,@2=1502,@3=1484,@4=0x00000000000EDCB2
Can anyone advise?
Entity Framework uses timestamps to check whether a row has changed. If the row has changed since the last time EF retrieved it, then it knows it has a concurrency problem.
Here's an explanation:
http://www.remondo.net/entity-framework-concurrency-checking-with-timestamp/
This is because EF (and you) want to update the client-side object with the newly generated rowversion value.
First the update is executed. If this succeeds (because the rowversion is still the one you had on the client), the database generates a new rowversion and EF retrieves that value. Suppose you immediately wanted to make a second update: that would be impossible if you didn't have the new rowversion.
This happens with all properties that are marked as identity or computed (by DatabaseGeneratedOption).
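A minimal sketch of the rowversion round-trip the answer describes (the table and values are illustrative, not from the question):
CREATE TABLE dbo.Demo (
    Id int PRIMARY KEY,
    Name varchar(50),
    [timestamp] rowversion  -- regenerated by SQL Server on every update
);
INSERT INTO dbo.Demo (Id, Name) VALUES (1, 'a');
-- Optimistic-concurrency update: succeeds only if the row still carries
-- the rowversion we read earlier.
DECLARE @oldVersion binary(8) = (SELECT [timestamp] FROM dbo.Demo WHERE Id = 1);
UPDATE dbo.Demo SET Name = 'b' WHERE Id = 1 AND [timestamp] = @oldVersion;
-- Fetch the freshly generated rowversion so a second update is possible;
-- this is exactly what EF's trailing SELECT does.
SELECT [timestamp] FROM dbo.Demo WHERE @@ROWCOUNT > 0 AND Id = 1;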
