I'm working with Express, Node, and the mssql package to create a backend for an application, and I would like to do as much processing on the server as possible before sending data to the client.
I have two queries I need to run, but I need to combine the data in a specific way before sending to the client.
The first query gathers data with a one-to-one relationship, and the other has a one-to-many relationship. I would like to append the one-to-many results onto the one-to-one records.
First Query:
select updatedInfo.*,
nameInfo.*, nameInfo.updated as nameUpdated, nameInfo.alreadyCorrect as nameWasCorrect,
addressInfo.*, addressInfo.alreadyCorrect as addWasCorrect, addressInfo.updated as addUpdated,
phoneInfo.*, phoneInfo.alreadyCorrect as phoneWasCorrect, phoneInfo.updated as phoneUpdated,
emailInfo.*, emailInfo.alreadyCorrect as emailWasCorrect, emailInfo.updated as emailUpdated
from updatedInfo join nameInfo on updatedInfo.IndivId=nameInfo.nameInfoId
join addressInfo on updatedInfo.IndivId=addressInfo.addressInfoId
join emailInfo on updatedInfo.IndivId=emailInfo.emailInfoId
join phoneInfo on updatedInfo.IndivId=phoneInfo.phoneInfoId
where updatedInfo.correctedInFNV is not null
order by updatedInfo.IndivId
Second Query: ID is a variable passed to the query
select * from positionInfo where IndivId='${id}'
How would I go about appending the second query results to the first on the correct record?
I'm using the mssql package like this:
var sqlConfig = {
server: 'IP',
database: 'db',
user: 'sweeper',
password: 'pass'
}
const connPool = new mssql.ConnectionPool(sqlConfig, err => {
console.error(err);
});
var query = {
getAllUpdatedPool: () => {
connPool.Request().query(`----first query ----`)
.then((set) => {
console.log(set);
return set;
}).catch((err) => {
console.error(err);
return err;
});
},
getPositionByIdPool: (id) => {
connPool.Request().query(`----second query-----`)
.then((set) => {
console.log(set);
return set;
}).catch((err) => {
console.error(err);
return err;
});
}
};
How should I call these to add the results of the second query to the results of the first one as an additional property? Callbacks are making this confusing.
It looks like both queries execute on the same server; have you considered using subqueries (https://learn.microsoft.com/en-us/sql/relational-databases/performance/subqueries?view=sql-server-2017)? If you can express what you're trying to do in SQL, it will probably be 1) cleaner and 2) faster to use subqueries than to merge recordsets manually. If the tables live on different servers, you could use linked servers to achieve the same subquery result.
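If you do want to merge the two result sets in Node instead, here is a minimal sketch using async/await with the mssql package. It assumes the pool has already connected and that IndivId is numeric; getAllUpdated, getPositionsById, and the positions property are placeholder names, and request() (lowercase) is the ConnectionPool method:

const getAllUpdated = () =>
  connPool.request().query(`----first query----`);

const getPositionsById = (id) =>
  connPool.request()
    .input('id', mssql.Int, id) // parameterize instead of interpolating the id into the string
    .query('select * from positionInfo where IndivId = @id');

async function getUpdatedWithPositions() {
  const updated = await getAllUpdated();
  // For each one-to-one record, fetch its one-to-many rows and attach them as a property.
  return Promise.all(updated.recordset.map(async (row) => {
    const positions = await getPositionsById(row.IndivId);
    return { ...row, positions: positions.recordset };
  }));
}

The key point is that each helper returns the query promise (instead of swallowing it inside .then), so the caller can await both results and combine them.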
I'm using Sequelize with transactions, but I have to make a lot of inserts every night. My fear is that these inserts/changes are stored in memory until the transaction is committed, which could crash the server and lose it all.
Or are these changes stored and handled by the DBMS (in this case I'm using Aurora/PostgreSQL), so I don't have to worry about anything?
Help!
I'm using Express 4 and Sequelize 5, and this will probably run as a cron job.
This is an abstract example of my structure
const db = require('../database/models')
class Controller {
async test (req, res) {
let transaction = await db.sequelize.transaction()
try {
await this.storeData(req.body, transaction)
await transaction.commit()
res.status(200)
} catch (error) {
if (transaction) await transaction.rollback()
res.status(400)
}
}
async storeData (params, transaction = null) {
// Calculation of the data to insert
var records = []
await Promise.all(records.map(async item => {
await db.MyModel.create(item, { transaction })
}
))
}
}
The transaction feature in Sequelize is just a wrapper over a DB transaction, and, of course, a transactional DBMS has a transaction log and stores all ongoing transaction operations there, not in your application's memory.
One edge case would be trying to take too many objects and insert them all in one operation, so I'd recommend dividing a huge number of rows into smaller batches.
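For example, a minimal batching sketch using the question's own structure (the function name and batch size are just illustrative; db.MyModel comes from the question):

async function storeDataInBatches (records, transaction = null, batchSize = 1000) {
  // Insert the rows in chunks so no single statement or round trip gets too large.
  for (let i = 0; i < records.length; i += batchSize) {
    const batch = records.slice(i, i + batchSize);
    await db.MyModel.bulkCreate(batch, { transaction });
  }
}

bulkCreate issues one INSERT per batch instead of one per row, which also cuts down the number of round trips inside the transaction.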
I'm trying to achieve a query similar to this:
SELECT r.*, (SELECT COUNT(UserID) FROM RoleUsers ru WHERE ru.RoleId = r.Id) AS Assignments
FROM Roles r
To retrieve the number of users for each role.
The simplest and most straightforward option to produce the desired output:
this.DbContext.Set<Role>().Include(x => x.RoleUsers)
.Select(x => new { x, Assignments = x.RoleUsers.Count() });
This retrieves all the roles, and then issues N additional queries to retrieve the counts:
SELECT COUNT(*)
FROM [dbo].[RoleUsers] AS [r0]
WHERE @_outer_Id = [r0].[RoleId]
Which is not an option at all. I also tried GroupJoin, but it loads the whole required data set in one query and performs the grouping in memory:
this.DbContext.Set<Role>().GroupJoin(this.DbContext.Set<RoleUser>(), role => role.Id,
roleUser => roleUser.RoleId, (role, roleUser) => new
{
Role = role,
Assignments = roleUser.Count()
});
Generated query:
SELECT [role].[Id], [role].[CustomerId], [role].[CreateDate], [role].[Description], [role].[Mask], [role].[ModifyDate], [role].[Name], [assignment].[UserId], [assignment].[CustomerId], [assignment].[RoleId]
FROM [dbo].[Roles] AS [role]
LEFT JOIN [dbo].[RoleUser] AS [assignment] ON [role].[Id] = [assignment].[RoleId]
ORDER BY [role].[Id]
Also, I was looking into using window functions, where I could compute the count per partition and select distinct roles, but I have no idea how to wire up a window function in EF:
SELECT DISTINCT r.*, COUNT(ru.UserID) OVER(PARTITION BY ru.RoleId)
FROM RoleUsers ru
RIGHT JOIN Roles r ON r.Id = ru.RoleId
So, is there any way to avoid EntitySQL?
Currently there is a defect in the EF Core translation of query aggregates to SQL when the query projection contains a whole entity, like
.Select(role => new { Role = role, ... })
The only workaround I'm aware of is to project to a new entity (at least this is supported by EF Core), like
var query = this.DbContext.Set<Role>()
.Select(role => new
{
Role = new Role { Id = role.Id, Name = role.Name, /* all other Role properties */ },
Assignments = role.RoleUsers.Count()
});
This translates to a single SQL query. The drawback is that you have to project all the entity properties manually.
this.DbContext.Set<Role>()
.Select(x => new { x, Assignments = x.RoleUsers.Count() });
You don't need to add Include for RoleUsers since you are using a Select statement. Furthermore, I guess you are using lazy loading, where this is the expected behavior. If you use eager loading, your LINQ query will run as one query.
You can set context.Configuration.LazyLoadingEnabled = false; before your LINQ query to disable lazy loading specifically for this operation.
I have two schemas in my Postgres database:
public // the default schema
first_user
Now I have the same tables in both schemas.
I changed the table structure, so I want to run a sync now.
I sync the tables using:
const db = new Sequelize(postgres_db, postgres_user, postgres_pwd, {
host: postgres_host,
port: 5432,
dialect: 'postgres',
logging: false,
});
db.sync().then(() => {
console.log('Table Synced');
}, (err) => {
console.log(err);
});
After running this, the table structure in the public schema changed successfully, but the table structure in the first_user schema remains the same.
How do I solve this?
NOTE: I don't want to lose the data in my tables.
I finally implemented this using Sequelize migrations:
http://docs.sequelizejs.com/manual/tutorial/migrations.html
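A minimal migration sketch along those lines, assuming hypothetical my_table/new_column names, applies the same change to each schema by passing a { tableName, schema } object to the queryInterface methods:

'use strict';

module.exports = {
  up: async (queryInterface, Sequelize) => {
    // Apply the same structural change to the table in each schema.
    for (const schema of ['public', 'first_user']) {
      await queryInterface.addColumn(
        { tableName: 'my_table', schema }, // hypothetical table name
        'new_column',                      // hypothetical column name
        { type: Sequelize.STRING, allowNull: true }
      );
    }
  },
  down: async (queryInterface) => {
    for (const schema of ['public', 'first_user']) {
      await queryInterface.removeColumn({ tableName: 'my_table', schema }, 'new_column');
    }
  }
};

Unlike sync(), a migration like this alters the existing tables in place, so the data is preserved.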
If you can't use Sequelize migrations because of the lack of TypeScript support, you can fall back to Migra, which is easy to use:
https://djrobstep.com/docs/migra
You can try a CREATE TABLE AS TABLE query:
create table first_user.tableName as table public.tableName;
It will create the table with the updated structure as well as the data.
Thanks.
How do I select a certain set of rows from a SQL database by the time they were inserted? I don't see any related documentation on how to do this using the mssql module in Node.js... could anyone suggest some reading material or something else? So my question is how to create a timestamp column that is filled in when data is inserted into the database.
Thank you
The docs seem straightforward on how to achieve this:
const sql = require('mssql');
(async () => {
try {
const pool = await sql.connect('mssql://username:password@localhost/database')
const result = await sql.query`SELECT * FROM TABLE WHERE DATE BETWEEN '09/16/2010 05:00:00' and '09/21/2010 09:00:00'`
console.dir(result)
} catch (err) {
// ... error checks
}
})()
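For the other part of the question (recording when a row was inserted), a minimal sketch, assuming a hypothetical MyTable/CreatedAt, is to add a datetime column with a default and then filter on it with parameters:

const sql = require('mssql');

(async () => {
  try {
    const pool = await sql.connect('mssql://username:password@localhost/database');
    // One-time setup: a column that defaults to the time of insert (hypothetical table/column names).
    await pool.request().query(
      'ALTER TABLE MyTable ADD CreatedAt datetime2 NOT NULL DEFAULT SYSUTCDATETIME()'
    );
    // Select the rows inserted in a given window, using parameters instead of string interpolation.
    const result = await pool.request()
      .input('from', sql.DateTime2, new Date('2010-09-16T05:00:00Z'))
      .input('to', sql.DateTime2, new Date('2010-09-21T09:00:00Z'))
      .query('SELECT * FROM MyTable WHERE CreatedAt BETWEEN @from AND @to');
    console.dir(result.recordset);
  } catch (err) {
    console.error(err);
  }
})();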
I'd like to export the tom_test2 PostgreSQL table to Elasticsearch. The table has 176805 rows:
=> select count(*) from tom_test2;
count
--------
176805
(1 row)
The following Logstash conf file imports my data into Elasticsearch correctly:
input {
jdbc {
# Postgres jdbc connection string to our database, mydb
jdbc_connection_string => "xxx"
# The user we wish to execute our statement as
jdbc_user => "xxx"
jdbc_password => "xxx"
# The path to our downloaded jdbc driver
jdbc_driver_library => "xxx"
# The name of the driver class for Postgresql
jdbc_driver_class => "org.postgresql.Driver"
# our query
statement => "select * from tom_test2"
}
}
output {
elasticsearch {
hosts => ["xxx"]
index => "tom"
document_type => "tom_test"
}
}
In Elasticsearch:
GET tom/tom_test/_search
"hits": {
"total": 176805,
"max_score": 1,
}
I'm deleting my index in Elasticsearch:
delete tom
Now I would like to do the same operation using jdbc_page_size, in case my data becomes bigger. My Logstash conf file is now:
input {
jdbc {
# Postgres jdbc connection string to our database, mydb
jdbc_connection_string => "xxx"
# The user we wish to execute our statement as
jdbc_user => "xxx"
jdbc_password => "xxx"
# The path to our downloaded jdbc driver
jdbc_driver_library => "xxx"
# The name of the driver class for Postgresql
jdbc_driver_class => "org.postgresql.Driver"
# our query
statement => "select * from tom_test2"
jdbc_page_size => 1000
jdbc_paging_enabled => true
}
}
output {
elasticsearch {
hosts => ["xxx"]
index => "tom"
document_type => "tom_test"
}
}
My count is now wrong:
GET tom/tom_test/_search
"hits": {
"total": 106174,
"max_score": 1,
}
as 176805-106174=70631 rows are missing
The reason you are facing this is an ordering problem: your query doesn't control the order in which the data is received, and in general PostgreSQL does not guarantee that consecutive unordered paging calls won't fetch the same data. This produces a situation where some rows are not fetched at all and some rows are fetched multiple times :( Even when the data is not modified during these calls, the background vacuum worker may change the order of the rows in the physical file and thus reproduce the described situation.
Either add an ORDER BY to your statement, SELECT * FROM tom_test2 ORDER BY id, and page your data. But be aware: in this case your upload to Elasticsearch will not be an exact replica of the table at a single moment in time. The reason is that while Logstash processes consecutive paging requests, data in an upcoming page can be updated, i.e. you are currently uploading rows 1 to 10000 while an update happens to rows 10001 to 20000, and so on... so you can end up with a consistency problem in your data.
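With paging kept, only the statement changes (the connection settings stay as in the question):

input {
  jdbc {
    # ... connection settings as in the original config ...
    statement => "select * from tom_test2 order by id"
    jdbc_paging_enabled => true
    jdbc_page_size => 1000
  }
}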
Or, if you want to fetch all the data and are willing to use memory generously on the Logstash side, use the jdbc_fetch_size parameter instead: i.e. keep the same SELECT * FROM tom_test2. With this approach you create a single query result set but "pump" it in pieces, so data modifications during the "pumping" will not affect you: you will be fetching the state as of the moment the query started.
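In that case the jdbc block drops the paging options in favour of a fetch size (1000 is just an illustrative value):

input {
  jdbc {
    # ... connection settings as in the original config ...
    statement => "select * from tom_test2"
    jdbc_fetch_size => 1000
  }
}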
Because ordering is not guaranteed between queries when using jdbc_page_size, as warned in the documentation for jdbc_paging_enabled.
I recommend using jdbc_fetch_size instead of jdbc_page_size, which is also what the documentation suggests for large result sets.
P.S. Sometimes ;) questions asked at http://discuss.elastic.co are better answered by the Elastic maintainers.