Slow SQLite read speed (100 records a second)

I have a large SQLite database (~134 GB) with multiple tables, each with 14 columns, about 330 million records, and 4 indexes. The only operation used on the database is SELECT *, as I need all the columns (no inserts or updates). When I query the database, the response time is slow when the result set is big (it takes 160 seconds to fetch ~18,000 records).
I have improved the use of indexes multiple times, and this is the fastest response time I have achieved.
I am running the database as the back end for a web application on a server with 32 GB of RAM.
Is there a way to use RAM (or anything else) to speed up the query process?
Here is the code that performs the query:
async.each(proteins, function (item, callback) {
    PI[item] = [];  // Stores interacting proteins for all query proteins
    PS[item] = [];  // Stores scores for all interacting proteins
    PIS[item] = []; // Stores interaction sites for all interacting proteins
    var sites = {}; // A temporary holder for interaction sites
    var query_string = 'SELECT * FROM ' + organism + PIPE_output_table +
        ' WHERE ' + score_type + ' > ' + cutoff['range'] +
        ' AND (protein_A = "' + item + '" OR protein_B = "' + item + '") ORDER BY PIPE_score DESC';
    db.each(query_string, function (err, row) {
        if (row.protein_A == item) {
            PI[item].push(row.protein_B);
            // Add 1 to interaction sites so they are numbered from 1 rather than 0
            sites['S1AS'] = row.site1_A_start + 1;
            sites['S1AE'] = row.site1_A_end + 1;
            sites['S1BS'] = row.site1_B_start + 1;
            sites['S1BE'] = row.site1_B_end + 1;
            sites['S2AS'] = row.site2_A_start + 1;
            sites['S2AE'] = row.site2_A_end + 1;
            sites['S2BS'] = row.site2_B_start + 1;
            sites['S2BE'] = row.site2_B_end + 1;
            sites['S3AS'] = row.site3_A_start + 1;
            sites['S3AE'] = row.site3_A_end + 1;
            sites['S3BS'] = row.site3_B_start + 1;
            sites['S3BE'] = row.site3_B_end + 1;
            PIS[item].push(sites);
            sites = {};
        }
    });
});

The query you posted uses no variables, so it will always return the same thing no matter which protein you're looking up. You're then left filtering out the extra rows in JavaScript, fetching far more rows than you need.
Here's why...
If I'm understanding this query correctly, you have WHERE Score > [Score]. I've never encountered this syntax before, so I looked it up.
[keyword] A keyword enclosed in square brackets is an identifier. This is not standard SQL. This quoting mechanism is used by MS Access and SQL Server and is included in SQLite for compatibility.
An identifier is something like a column or table name, not a variable.
This means that this...
SELECT * FROM [TABLE]
WHERE Score > [Score] AND
(protein_A = [Protein] OR protein_B = [Protein])
ORDER BY [Score] DESC;
Is the same as this...
SELECT * FROM `TABLE`
WHERE Score > Score AND
(protein_A = Protein OR protein_B = Protein)
ORDER BY Score DESC;
You never pass any variables to the query. It will always return the same thing.
You can see this at the point where you run it:
db.each(query_string, function (err, row) {
Since you're checking that each column is equal to itself (or something very like itself), you're likely fetching far more rows than you want. That's why you have to filter all the rows again in JavaScript, and it's one of the reasons your query is so slow.
if (row.protein_A == item) {
BUT! WHERE Score > [Score] can never be true: a value cannot be greater than itself, and when Score is NULL the comparison yields NULL rather than true. Three-valued logic is weird, but WHERE only keeps rows where the condition is actually true. So the query isn't returning the rows you think it is, which is more evidence that the bracketed names aren't acting as variables.
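You can check SQLite's three-valued comparison behavior directly; a minimal sketch using Python's built-in sqlite3 module (table and values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (score REAL)")
conn.executemany("INSERT INTO t VALUES (?)", [(0.9,), (0.5,), (None,)])

# A column is never greater than itself; with NULL the comparison
# yields NULL, which WHERE treats as false, so no row survives.
rows = conn.execute("SELECT * FROM t WHERE score > score").fetchall()
print(rows)  # []
```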
Your query should incorporate variables (I'm assuming you're using node-sqlite3) and pass in their values when you execute the query.
var query = " \
SELECT * FROM `TABLE` \
WHERE Score > $score AND \
(protein_A = $protein OR protein_B = $protein) \
ORDER BY Score DESC; \
";
var stmt = db.prepare(query);
stmt.each({$score: score, $protein: protein}, function (err, row) {
PI[item].push(row.protein_B);
...
});
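Separately, to the question of using the 32 GB of RAM: since the database is effectively read-only, it may be worth experimenting with SQLite's cache and memory-map settings so hot pages are served from memory. A sketch (the values are illustrative, not tuned for your workload):

```sql
PRAGMA cache_size = -2000000;   -- page cache of ~2 GB (a negative value means size in KiB)
PRAGMA mmap_size = 8589934592;  -- memory-map up to 8 GB of the database file
PRAGMA temp_store = MEMORY;     -- keep temporary structures (e.g. for ORDER BY) in RAM
```

These pragmas are per-connection, so issue them right after opening the database.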

Related

How to generate Stackoverflow table markdown from Snowflake

Stackoverflow supports table markdown. For example, to display a table like this:
|N_NATIONKEY|N_NAME|N_REGIONKEY|
|---:|:---|---:|
|0|ALGERIA|0|
|1|ARGENTINA|1|
|2|BRAZIL|1|
|3|CANADA|1|
|4|EGYPT|4|
You can write code like this:
|N_NATIONKEY|N_NAME|N_REGIONKEY|
|---:|:---|---:|
|0|ALGERIA|0|
|1|ARGENTINA|1|
|2|BRAZIL|1|
|3|CANADA|1|
|4|EGYPT|4|
It would save a lot of time to generate the Stackoverflow table markdown automatically when running Snowflake queries.
The following stored procedure accepts either a query string or a query ID (it auto-detects which) and returns the results as Stack Overflow table markdown. It right-aligns numbers and dates, left-aligns strings, arrays, and objects, and centers other types. It supports any query you can pass to it. It's a good idea to terminate the string passed into the procedure with $$ in case the SQL contains single quotes. You can create the procedure and test it using this script:
create or replace procedure MARKDOWN("queryOrQueryId" string)
returns string
language javascript
execute as caller
as
$$
const MAX_ROWS = 50; // Maximum row count to fetch; markdown tables larger than this become hard to read.
var [rs, i, c, row, props] = [null, 0, 0, 0, {}];
if (!queryOrQueryId || queryOrQueryId == 0) {
    queryOrQueryId = `select * from table(result_scan(last_query_id())) limit ${MAX_ROWS}`;
}
queryOrQueryId = queryOrQueryId.trim();
if (isUUID(queryOrQueryId)) {
    rs = snowflake.execute({sqlText: `select * from table(result_scan('${queryOrQueryId}')) limit ${MAX_ROWS}`});
} else {
    rs = snowflake.execute({sqlText: `${queryOrQueryId}`});
}
props.columnCount = rs.getColumnCount();
for (i = 1; i <= props.columnCount; i++) {
    props["col" + i + "Name"] = rs.getColumnName(i);
    props["col" + i + "Type"] = rs.getColumnType(i);
}
var table = getHeader(props);
while (rs.next()) {
    row = "|";
    for (c = 1; c <= props.columnCount; c++) {
        row += escapeMarkup(rs.getColumnValueAsString(c)) + "|";
    }
    table += "\n" + row;
}
return table;
//------ End of main function. Start of helper functions.
function escapeMarkup(s) {
    s = s.replace(/\\/g, "\\\\"); // escape backslashes
    s = s.replace(/\|/g, "\\|");  // escape pipes, which would otherwise break the table
    s = s.replace(/\s+/g, " ");   // collapse whitespace, including newlines
    return s;
}
function getHeader(props) {
    var s = "|";
    for (var i = 1; i <= props.columnCount; i++) {
        s += props["col" + i + "Name"] + "|";
    }
    s += "\n";
    for (var i = 1; i <= props.columnCount; i++) {
        switch (props["col" + i + "Type"]) {
            case 'number':
            case 'date':
                s += '|---:'; // right-align numbers and dates
                break;
            case 'string':
            case 'json':
                s += '|:---'; // left-align strings, arrays, and objects
                break;
            default:
                s += '|:---:'; // center everything else
        }
    }
    return s + "|";
}
function isUUID(str) {
    const regexExp = /^[0-9a-fA-F]{8}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{12}$/i;
    return regexExp.test(str);
}
$$;
-- Usage type 1, a simple query:
call markdown($$ select * from SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.NATION limit 5 $$);
-- Usage type 2, a query ID:
select * from SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.NATION limit 5;
set quid = (select last_query_id());
call markdown($quid);
Edit: Based on Fieldy's helpful feedback, I modified the procedure to allow passing null, 0, or a blank string '' as the parameter; this uses the last query ID and is a helpful shortcut. I also added a constant that limits the result to a set number of rows. The limit is applied when using query IDs (or null, '', or 0, which use the last query ID), but not when the input is the text of a query, to avoid syntax errors if the query already has its own limit.
Greg Pavlik's JavaScript stored procedure solution made me wonder whether this would be any easier with the new Python language support in stored procedures (currently a public-preview feature). The Python Snowpark API can return a result as a Pandas dataframe, and Pandas can render a dataframe as markdown via the tabulate package. Here's the stored procedure:
CREATE OR REPLACE PROCEDURE markdown_table(query_id VARCHAR)
RETURNS VARCHAR
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-snowpark-python','pandas','tabulate','regex')
HANDLER = 'markdown_table'
EXECUTE AS CALLER
AS $$
import pandas as pd
import tabulate
import regex

def markdown_table(session, queryOrQueryId = None):
    # No argument: use the last query ID. A UUID: scan that query's result.
    # Anything else: treat it as the text of a query and run it.
    if queryOrQueryId is None:
        pandas_result = session.sql("""select * from table(result_scan(last_query_id()))""").to_pandas()
    elif bool(regex.match("^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", queryOrQueryId)):
        pandas_result = session.sql(f"""select * from table(result_scan('{queryOrQueryId}'))""").to_pandas()
    else:
        pandas_result = session.sql(queryOrQueryId).to_pandas()
    return pandas_result.to_markdown()
$$;
You can use it as follows:
-- Usage type 1: use the result of the query run immediately preceding the procedure call
select * from SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.NATION limit 5;
call markdown_table(NULL);
-- Usage type 2: pass in a query ID
select * from SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.NATION limit 5;
set quid = (select last_query_id());
select $quid;
call markdown_table($quid);
-- Usage type 3: pass the text of a query to the procedure call
call markdown_table('select * from SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.NATION limit 5');
The table markdown can also be written in a simpler form, without the outer pipes:
N_NATIONKEY|N_NAME|N_REGIONKEY
--|--|--
0|ALGERIA|0
1|ARGENTINA|1
2|BRAZIL|1
3|CANADA|1
4|EGYPT|4
which renders the same way, so it can be a simpler solution:
|N_NATIONKEY|N_NAME|N_REGIONKEY|
|---|---|---|
|0|ALGERIA|0|
|1|ARGENTINA|1|
|2|BRAZIL|1|
|3|CANADA|1|
|4|EGYPT|4|
I grab the result table, use Notepad++ to replace tabs (\t) with pipes (|), and then insert the header separator line by hand. I sometimes replace empty null results with the text null to make the results make more sense. The form you use, with the leading and trailing pipes, avoids the need for that.
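The manual conversion described above is easy to script. A minimal sketch in Python (assuming tab-separated input as copied from a result grid; the function name is mine):

```python
def tsv_to_markdown(tsv_text, null_text="null"):
    """Convert tab-separated query output to Stack Overflow table markdown."""
    rows = [line.split("\t") for line in tsv_text.strip("\n").splitlines()]
    header, data = rows[0], rows[1:]
    lines = ["|" + "|".join(header) + "|",
             "|" + "|".join("---" for _ in header) + "|"]  # header separator line
    for row in data:
        # Replace empty (null) cells with explicit text so the table makes sense.
        lines.append("|" + "|".join(cell if cell else null_text for cell in row) + "|")
    return "\n".join(lines)

print(tsv_to_markdown("N_NATIONKEY\tN_NAME\n0\tALGERIA\n1\tARGENTINA"))
```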
DBeaver IDE supports "data export as markdown" and "advanced copy as markdown" out-of-the-box:
Output:
|R_REGIONKEY|R_NAME|R_COMMENT|
|-----------|------|---------|
|0|AFRICA|lar deposits. blithely final packages cajole. regular waters are final requests. regular accounts are according to |
|1|AMERICA|hs use ironic, even requests. s|
|2|ASIA|ges. thinly even pinto beans ca|
|3|EUROPE|ly final courts cajole furiously final excuse|
|4|MIDDLE EAST|uickly special accounts cajole carefully blithely close requests. carefully final asymptotes haggle furiousl|
It is rendered as:
|R_REGIONKEY|R_NAME|R_COMMENT|
|---|---|---|
|0|AFRICA|lar deposits. blithely final packages cajole. regular waters are final requests. regular accounts are according to |
|1|AMERICA|hs use ironic, even requests. s|
|2|ASIA|ges. thinly even pinto beans ca|
|3|EUROPE|ly final courts cajole furiously final excuse|
|4|MIDDLE EAST|uickly special accounts cajole carefully blithely close requests. carefully final asymptotes haggle furiousl|

Stored procedure - get anticipated columns before fully executing statement?

I'm working through a stored procedure and wondering if there's a way to retrieve the anticipated result column list of a SQL statement before fully executing it.
Scenarios:
dynamic SQL
a UDF that might vary the columns outside of our control
EX:
//inbound parameter
SET QUERY_DEFINITION_ID = 12345;
//Initial statement pulls query text from bank of queries
var sqlText = getQueryFromQueryBank(QUERY_DEFINITION_ID);
//now we run our query
var cmd = {sqlText: sqlText };
stmt = snowflake.createStatement(cmd);
What I'd like to be able to do is say "right - before you run this, give me the anticipated column list" so I can compare it to what's expected.
EX:
Expected: [col1, col2, col3, col4]
Got: [col1]
Result: Oops. Don't run.
Rationale here is that I want to short-circuit the execution if something is missing - before it potentially runs for a while. I can validate all of this after the fact, but it would be really helpful to stop early.
Any ideas very much appreciated!
This sample SP code shows how to get the list of columns a query will project into the result, before you run the query. It should only be used for large, long-running queries, because getting the column list itself takes a few seconds.
There are a couple of caveats. 1) It only returns the names of the columns; it won't tell you how they were built, that is, whether they're aliased, calculated, or direct from a table. 2) The example query is straight from the Snowflake documentation (https://docs.snowflake.com/en/user-guide/sample-data-tpcds.html#functional-query-definition), minimized to a single line for convenience. The returned column names include object qualifiers in addition to the column names, e.g. V1.I_CATEGORY, V1.D_YEAR, V1.D_MOY. If you don't want them (to make comparing names easier), you can strip them off by splitting each name on the dot and taking index 1 of the resulting array.
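For example, stripping the qualifiers (the procedure below is JavaScript, but the split-on-dot idea is the same; sketched here in Python with made-up column names):

```python
qualified = ["V1.I_CATEGORY", "V1.D_YEAR", "V1.D_MOY"]
# Split each name on the dot and keep index 1, dropping the object qualifier.
bare = [name.split(".")[1] for name in qualified]
print(bare)  # ['I_CATEGORY', 'D_YEAR', 'D_MOY']
```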
create or replace procedure EXPLAIN_BEFORE_RUNNING()
returns string
language javascript
execute as caller
as
$$
// Set the context for the session to the TPC-DS sample data:
executeNonQuery("use schema snowflake_sample_data.tpcds_sf10tcl;");
// Here's a complex query from the Snowflake docs (minimized to one line for convenience):
var sql = `with v1 as( select i_category, i_brand, cc_name, d_year, d_moy, sum(cs_sales_price) sum_sales, avg(sum(cs_sales_price)) over(partition by i_category, i_brand, cc_name, d_year) avg_monthly_sales, rank() over (partition by i_category, i_brand, cc_name order by d_year, d_moy) rn from item, catalog_sales, date_dim, call_center where cs_item_sk = i_item_sk and cs_sold_date_sk = d_date_sk and cc_call_center_sk= cs_call_center_sk and ( d_year = 1999 or ( d_year = 1999-1 and d_moy =12) or ( d_year = 1999+1 and d_moy =1)) group by i_category, i_brand, cc_name , d_year, d_moy), v2 as( select v1.i_category ,v1.d_year, v1.d_moy ,v1.avg_monthly_sales ,v1.sum_sales, v1_lag.sum_sales psum, v1_lead.sum_sales nsum from v1, v1 v1_lag, v1 v1_lead where v1.i_category = v1_lag.i_category and v1.i_category = v1_lead.i_category and v1.i_brand = v1_lag.i_brand and v1.i_brand = v1_lead.i_brand and v1.cc_name = v1_lag.cc_name and v1.cc_name = v1_lead.cc_name and v1.rn = v1_lag.rn + 1 and v1.rn = v1_lead.rn - 1) select * from v2 where d_year = 1999 and avg_monthly_sales > 0 and case when avg_monthly_sales > 0 then abs(sum_sales - avg_monthly_sales) / avg_monthly_sales else null end > 0.1 order by sum_sales - avg_monthly_sales, 3 limit 100;`;
// Before actually running the query, generate an explain plan.
executeNonQuery("explain " + sql);
// Now read the projected column list out of the explain plan's result set.
var columnList = executeSingleValueQuery("COLUMN_LIST", `select "expressions" as COLUMN_LIST from table(result_scan(last_query_id())) where "operation" = 'Result';`);
// For now, just exit with the column list as the output...
return columnList;
// Your code here...
// Helper functions:
function executeNonQuery(queryString) {
    var cmd = {sqlText: queryString};
    var stmt = snowflake.createStatement(cmd);
    stmt.execute();
}
function executeSingleValueQuery(columnName, queryString) {
    var cmd = {sqlText: queryString};
    var stmt = snowflake.createStatement(cmd);
    try {
        var rs = stmt.execute();
        rs.next();
        return rs.getColumnValue(columnName);
    }
    catch (err) {
        if (err.message.substring(0, 18) == "ResultSet is empty") {
            throw "ERROR: No rows returned in query.";
        } else {
            throw "ERROR: " + err.message.replace(/\n/g, " ");
        }
    }
}
$$;
call Explain_Before_Running();

SQL Update statement updates one row multiple times, but is inconsistent

This is my first question on here.
I have an update statement which seemed to be working perfectly fine previously.
Out of nowhere, it now updates a single row multiple times, but not for all rows, if that makes sense.
The statement is run via SQL Agent and scheduled every 10 seconds (along with other steps).
Sorry if it's big or messy; I'm self-taught!
The first step inserts the data that needs to be updated on the main table into a temp table, using a view.
For the audit trail, I then insert IDs into another table to track what's being updated.
The problematic part of this script is the update of the main table (DT_V_POTATTENDANCE).
Below that is an insert into the main table for rows whose ID cannot be found in the view.
The rest of the script is part of the audit trail, which is how I found it updating a single row multiple times (but not every time, if that makes sense), and it marks the records as updated so they don't keep updating.
Thanks for your help, really appreciate it.
SELECT DISTINCT * INTO _TEMPTABLEAPPROVEDATTENDANCE
from BE_RPT_PA_ATTENDANCE_TO_UPDATE
SET IDENTITY_INSERT _TEMPTABLEAPPROVEDATTENDANCEUPDATEDSIGNIDS ON
INSERT INTO _TEMPTABLEAPPROVEDATTENDANCEUPDATEDSIGNIDS (SIGNID)
SELECT SIGNID FROM DT_PA_POTATTENDANCE WHERE APPROVED = 1 AND UPDATED IS NULL
SET IDENTITY_INSERT _TEMPTABLEAPPROVEDATTENDANCEUPDATEDSIGNIDS OFF
declare @time datetime
set @time = (select getdate())
UPDATE DT_V_POTATTENDANCE
SET
DT_V_POTATTENDANCE.CB_H_MON = DT_V_POTATTENDANCE.CB_H_MON + Y.CB_H_MON,
DT_V_POTATTENDANCE.CB_M_MON = DT_V_POTATTENDANCE.CB_M_MON + Y.CB_M_MON,
DT_V_POTATTENDANCE.CB_H_TUE = DT_V_POTATTENDANCE.CB_H_TUE + Y.CB_H_TUE,
DT_V_POTATTENDANCE.CB_M_TUE = DT_V_POTATTENDANCE.CB_M_TUE + Y.CB_M_TUE,
DT_V_POTATTENDANCE.CB_H_WED = DT_V_POTATTENDANCE.CB_H_WED + Y.CB_H_WED,
DT_V_POTATTENDANCE.CB_M_WED = DT_V_POTATTENDANCE.CB_M_WED + Y.CB_M_WED,
DT_V_POTATTENDANCE.CB_H_THU = DT_V_POTATTENDANCE.CB_H_THU + Y.CB_H_THU,
DT_V_POTATTENDANCE.CB_M_THU = DT_V_POTATTENDANCE.CB_M_THU + Y.CB_M_THU,
DT_V_POTATTENDANCE.CB_H_FRI = DT_V_POTATTENDANCE.CB_H_FRI + Y.CB_H_FRI,
DT_V_POTATTENDANCE.CB_M_FRI = DT_V_POTATTENDANCE.CB_M_FRI + Y.CB_M_FRI,
DT_V_POTATTENDANCE.H_H_MON = DT_V_POTATTENDANCE.H_H_MON + Y.H_H_MON,
DT_V_POTATTENDANCE.H_M_MON = DT_V_POTATTENDANCE.H_M_MON + Y.H_M_MON,
DT_V_POTATTENDANCE.H_H_TUE = DT_V_POTATTENDANCE.H_H_TUE + Y.H_H_TUE,
DT_V_POTATTENDANCE.H_M_TUE = DT_V_POTATTENDANCE.H_M_TUE + Y.H_M_TUE,
DT_V_POTATTENDANCE.H_H_WED = DT_V_POTATTENDANCE.H_H_WED + Y.H_H_WED,
DT_V_POTATTENDANCE.H_M_WED = DT_V_POTATTENDANCE.H_M_WED + Y.H_M_WED,
DT_V_POTATTENDANCE.H_H_THU = DT_V_POTATTENDANCE.H_H_THU + Y.H_H_THU,
DT_V_POTATTENDANCE.H_M_THU = DT_V_POTATTENDANCE.H_M_THU + Y.H_M_THU,
DT_V_POTATTENDANCE.H_H_FRI = DT_V_POTATTENDANCE.H_H_FRI + Y.H_H_FRI,
DT_V_POTATTENDANCE.H_M_FRI = DT_V_POTATTENDANCE.H_M_FRI + Y.H_M_FRI,
DT_V_POTATTENDANCE.AA_H_MON = DT_V_POTATTENDANCE.AA_H_MON + Y.AA_H_MON,
DT_V_POTATTENDANCE.AA_M_MON = DT_V_POTATTENDANCE.AA_M_MON + Y.AA_M_MON,
DT_V_POTATTENDANCE.AA_H_TUE = DT_V_POTATTENDANCE.AA_H_TUE + Y.AA_H_TUE,
DT_V_POTATTENDANCE.AA_M_TUE = DT_V_POTATTENDANCE.AA_M_TUE + Y.AA_M_TUE,
DT_V_POTATTENDANCE.AA_H_WED = DT_V_POTATTENDANCE.AA_H_WED + Y.AA_H_WED,
DT_V_POTATTENDANCE.AA_M_WED = DT_V_POTATTENDANCE.AA_M_WED + Y.AA_M_WED,
DT_V_POTATTENDANCE.AA_H_THU = DT_V_POTATTENDANCE.AA_H_THU + Y.AA_H_THU,
DT_V_POTATTENDANCE.AA_M_THU = DT_V_POTATTENDANCE.AA_M_THU + Y.AA_M_THU,
DT_V_POTATTENDANCE.AA_H_FRI = DT_V_POTATTENDANCE.AA_H_FRI + Y.AA_H_FRI,
DT_V_POTATTENDANCE.AA_M_FRI = DT_V_POTATTENDANCE.AA_M_FRI + Y.AA_M_FRI
FROM _TEMPTABLEAPPROVEDATTENDANCE Y
WHERE DT_V_POTATTENDANCE.ATTENDANCEWEEKID = Y.ATTENDANCEWEEKID
AND Y.ATTENDANCEWEEKID IS NOT NULL
AND Y.TRAINEEID <> '0683-0001-107827'
INSERT INTO DT_V_POTATTENDANCE
([TRAINEEID]
,[POT]
,[WEEKSTARTDATE]
,[CB_H_MON]
,[CB_M_MON]
,[CB_H_TUE]
,[CB_M_TUE]
,[CB_H_WED]
,[CB_M_WED]
,[CB_H_THU]
,[CB_M_THU]
,[CB_H_FRI]
,[CB_M_FRI]
,[H_H_MON]
,[H_M_MON]
,[H_H_TUE]
,[H_M_TUE]
,[H_H_WED]
,[H_M_WED]
,[H_H_THU]
,[H_M_THU]
,[H_H_FRI]
,[H_M_FRI]
,[AA_H_MON]
,[AA_M_MON]
,[AA_H_TUE]
,[AA_M_TUE]
,[AA_H_WED]
,[AA_M_WED]
,[AA_H_THU]
,[AA_M_THU]
,[AA_H_FRI]
,[AA_M_FRI])
SELECT [TRAINEEID]
,[POT]
,[WEEKSTARTDATE]
,[CB_H_MON]
,[CB_M_MON]
,[CB_H_TUE]
,[CB_M_TUE]
,[CB_H_WED]
,[CB_M_WED]
,[CB_H_THU]
,[CB_M_THU]
,[CB_H_FRI]
,[CB_M_FRI]
,[H_H_MON]
,[H_M_MON]
,[H_H_TUE]
,[H_M_TUE]
,[H_H_WED]
,[H_M_WED]
,[H_H_THU]
,[H_M_THU]
,[H_H_FRI]
,[H_M_FRI]
,[AA_H_MON]
,[AA_M_MON]
,[AA_H_TUE]
,[AA_M_TUE]
,[AA_H_WED]
,[AA_M_WED]
,[AA_H_THU]
,[AA_M_THU]
,[AA_H_FRI]
,[AA_M_FRI]
FROM _TEMPTABLEAPPROVEDATTENDANCE
WHERE ATTENDANCEWEEKID IS NULL
AND TRAINEEID <> '0683-0001-107827'
UPDATE DT_PA_POTATTENDANCE
SET UPDATED = 1
FROM DT_PA_POTATTENDANCE
WHERE APPROVEDTIMESTAMP < @time and
TRAINEEID <> '0683-0001-107827' AND
APPROVED = 1 AND UPDATED IS NULL AND SIGNID IN (SELECT SIGNID FROM _TEMPTABLEAPPROVEDATTENDANCEUPDATEDSIGNIDS)
INSERT INTO _TEMPTABLEAPPROVEDATTENDANCEUPDATED
([ATTENDANCEWEEKID],[TRAINEEID]
,[POT]
,[WEEKSTARTDATE]
,[CB_H_MON]
,[CB_M_MON]
,[CB_H_TUE]
,[CB_M_TUE]
,[CB_H_WED]
,[CB_M_WED]
,[CB_H_THU]
,[CB_M_THU]
,[CB_H_FRI]
,[CB_M_FRI]
,[H_H_MON]
,[H_M_MON]
,[H_H_TUE]
,[H_M_TUE]
,[H_H_WED]
,[H_M_WED]
,[H_H_THU]
,[H_M_THU]
,[H_H_FRI]
,[H_M_FRI]
,[AA_H_MON]
,[AA_M_MON]
,[AA_H_TUE]
,[AA_M_TUE]
,[AA_H_WED]
,[AA_M_WED]
,[AA_H_THU]
,[AA_M_THU]
,[AA_H_FRI]
,[AA_M_FRI])
SELECT [ATTENDANCEWEEKID],[TRAINEEID]
,[POT]
,[WEEKSTARTDATE]
,[CB_H_MON]
,[CB_M_MON]
,[CB_H_TUE]
,[CB_M_TUE]
,[CB_H_WED]
,[CB_M_WED]
,[CB_H_THU]
,[CB_M_THU]
,[CB_H_FRI]
,[CB_M_FRI]
,[H_H_MON]
,[H_M_MON]
,[H_H_TUE]
,[H_M_TUE]
,[H_H_WED]
,[H_M_WED]
,[H_H_THU]
,[H_M_THU]
,[H_H_FRI]
,[H_M_FRI]
,[AA_H_MON]
,[AA_M_MON]
,[AA_H_TUE]
,[AA_M_TUE]
,[AA_H_WED]
,[AA_M_WED]
,[AA_H_THU]
,[AA_M_THU]
,[AA_H_FRI]
,[AA_M_FRI]
FROM _TEMPTABLEAPPROVEDATTENDANCE
WHERE TRAINEEID <> '0683-0001-107827'
drop table _TEMPTABLEAPPROVEDATTENDANCE
I found the reason why it was doing it!
The part of my script below was making it run again and again.
For the user in question, our MIS sets the field APPROVED to 1, and the script above then updates the master table DT_V_POTATTENDANCE. But it wasn't setting the field UPDATED to 1, because the user's clock was 3 minutes out, so the timestamp comparison below never passed:
UPDATE DT_PA_POTATTENDANCE
SET UPDATED = 1
FROM DT_PA_POTATTENDANCE
WHERE APPROVEDTIMESTAMP < @time and
TRAINEEID <> '0683-0001-107827' AND
APPROVED = 1 AND UPDATED IS NULL AND SIGNID IN
(SELECT SIGNID FROM _TEMPTABLEAPPROVEDATTENDANCEUPDATEDSIGNIDS)
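The failure mode can be sketched like this (the times are illustrative):

```python
from datetime import datetime, timedelta

server_now = datetime(2023, 1, 1, 12, 0, 0)      # the time captured via getdate() on the server
approved_ts = server_now + timedelta(minutes=3)  # timestamp written by a client whose clock ran fast

# The filter APPROVEDTIMESTAMP < <server time> is false for this row, so UPDATED
# never gets set to 1 and the row is picked up again on the next run.
print(approved_ts < server_now)  # False
```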
Thanks for all your help guys, appreciate it!

Get random result from JPQL query over large table

I'm currently using JPQL queries to retrieve information from a database. The purpose of the project is testing a sample environment through randomness with different elements, so I need the queries to retrieve random single results throughout the project.
The problem is that JPQL implements no proper function for random retrieval, and post-calculating the random selection takes too long (14 seconds for the function below to return a random result).
public Player getRandomActivePlayerWithTransactions(){
List<Player> randomPlayers = entityManager.createQuery("SELECT pw.playerId FROM PlayerWallet pw JOIN pw.playerId p"
+ " JOIN p.gameAccountCollection ga JOIN ga.iDAccountStatus acs"
+ " WHERE (SELECT count(col.playerWalletTransactionId) FROM pw.playerWalletTransactionCollection col) > 0 AND acs.code = :status")
.setParameter("status", "ACTIVATED")
.getResultList();
return randomPlayers.get(random.nextInt(randomPlayers.size()));
}
As ORDER BY NEWID() is not allowed because of JPQL restrictions, I have tested the following inline conditions; all of them returned a syntax error on compilation.
WHERE (ABS(CAST((BINARY_CHECKSUM(*) * RAND()) as int)) % 100) < 10
WHERE Rnd % 100 < 10
FROM TABLESAMPLE(10 PERCENT)
Have you considered generating a random number and skipping to that result?
I mean something like this:
String q = "SELECT COUNT(*) FROM Player p";
Query query=entityManager.createQuery(q);
Number countResult=(Number) query.getSingleResult();
int random = (int) (Math.random() * countResult.intValue());
List<Player> randomPlayers = entityManager.createQuery("SELECT pw.playerId FROM PlayerWallet pw JOIN pw.playerId p"
+ " JOIN p.gameAccountCollection ga JOIN ga.iDAccountStatus acs"
+ " WHERE (SELECT count(col.playerWalletTransactionId) FROM pw.playerWalletTransactionCollection col) > 0 AND acs.code = :status")
.setParameter("status", "ACTIVATED")
.setFirstResult(random)
.setMaxResults(1)
.getSingleResult();
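The same count-then-offset pattern, sketched with Python's built-in sqlite3 module (the table and data are illustrative stand-ins for the JPA entities):

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO player (name) VALUES (?)",
                 [("player%d" % i,) for i in range(100)])

# 1) Count the candidate rows, 2) pick a random offset, 3) fetch just that row.
(count,) = conn.execute("SELECT COUNT(*) FROM player").fetchone()
offset = random.randrange(count)
row = conn.execute("SELECT id, name FROM player ORDER BY id LIMIT 1 OFFSET ?",
                   (offset,)).fetchone()
print(row)
```

The ORDER BY matters: without a stable ordering, the same offset can address different rows on different runs.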
I have figured it out. When retrieving the player, I was also retrieving another unused related entity, and all the entities related to that one, and so on.
After adding fetch=FetchType.LAZY (don't fetch the entity until required) to the problematic relation, the performance of the query increased dramatically.

How to count number of rows in SQL Server database?

I am trying to count the number of unread messages in my DB table, but it is proving to be very difficult. I've even read tutorials online, but to no avail.
What I'm doing should be simple.
Here's what I'm trying to do:
COUNT NUMBER OF ROWS IN NOTIFICATIONSTABLE
WHERE USERID = #0 AND MESSAGEWASREAD = FALSE
Can somebody please point me in the right direction? Any help will be appreciated.
Thank you
@helper RetrievePhotoWithName(int userid)
{
    var database = Database.Open("SC");
    var name = database.QuerySingle("select FirstName, LastName, ProfilePicture from UserProfile where UserId = @0", userid);
    var notifications = database.Query("SELECT COUNT(*) as 'counter' FROM Notifications WHERE UserID = @0 AND [Read] = @1", userid, false);
    var DisplayName = "";
    if (notifications["counter"] < 1)
    {
        DisplayName = name["FirstName"] + " " + name["LastName"];
    }
    else
    {
        DisplayName = name["FirstName"] + ", you have " + notifications["counter"] + " new messages.";
    }
    <img src="@Href("~/Shared/Assets/Images/" + name["ProfilePicture"] + ".png")" id="MiniProfilePicture" /> @DisplayName
    database.Close();
}
SELECT COUNT(*) FROM NotificationsTable WHERE
UserID = @UserID AND MessageWasRead = 0;
Sql Count Function
Okay, this is based on what I think should be done. I don't know the underlying types, so it is my best guess.
var notifications = database.QuerySingle("Select COUNT(*) as NumRecs....");
if ((int)notifications["NumRecs"] > 0) .......
I changed the notifications query to QuerySingle. You don't need a recordset, only a scalar value, so that should hopefully remove the implicit-conversion problem you were having in the comparison.
I would also check whether your database object implements IDisposable (and place it in a using statement if so): you are calling Close, but it won't actually be called if you hit an exception before reaching it.
int unreadMessageCount = db.Query("SELECT * FROM Notification WHERE UserId=@0 AND Read=@1", UserId, false).Count();
string displayname = name["FirstName"] + " " + name["LastName"] + (unreadMessageCount > 0 ? ", you have " + unreadMessageCount : "");
