I'm trying to use sqlite3 to access data in a database based on a value for the datestamp. Please consider the following code:
live_db_conn = sqlite3.connect('/Users/user/Documents/database.db')
time_period = (dt.now() - timedelta(seconds=time)).strftime('%H:%M:%S')
time_period_data = pd.read_sql_query('SELECT * FROM table1 WHERE Datestamp > {}'.format(str(time_period)), live_db_conn)
When I run this code I get the following error:
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT * FROM table1 WHERE Datestamp > 12:33:33': near ":33": syntax error
I don't understand where this error comes from, because if I run the following code:
df = pd.read_sql_query('SELECT Datestamp FROM table1 LIMIT 10', live_db_conn)
print(df)
I get the following output:
Datestamp
0 10:46:54
1 10:46:59
2 10:47:04
3 10:47:09
4 10:47:14
5 10:47:19
6 10:47:24
7 10:47:29
8 10:47:34
9 10:47:39
So it seems (to me at least) that my SQL query is correct. I've tried .format(time_period) instead of .format(str(time_period)), but I can't figure out what I'm doing wrong.
Question: How do I select the portion of the data that corresponds to the selected time period?
Edit: It seems that something is going wrong with the minutes in the timestamp. When I ran the code again I got the same error but with a different timestamp:
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT * FROM table1 WHERE Datestamp > 12:49:10': near ":49": syntax error
So I'd say that the syntax error has something to do with the minutes in the timestamp.
Instead of
time_period_data = pd.read_sql_query('SELECT * FROM table1 WHERE Datestamp > {}'.format(str(time_period)), live_db_conn)
I did:
time_period_data = pd.read_sql_query('SELECT * FROM table1 WHERE Datestamp > "{}"'.format(time_period), live_db_conn)
which solved the problem!
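Note that in SQL the standard delimiter for string literals is a single quote; SQLite accepts the double-quoted value above as a string only as a fallback (because it does not match a column name). A safer form of the formatted query, using the timestamp from the error message as an example, would be:
SELECT * FROM table1 WHERE Datestamp > '12:49:10'
Alternatively, pd.read_sql_query accepts a params argument, so the value can be bound as a query parameter instead of being formatted into the SQL string.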
Related question (*):
The total number of cases and deaths as a percentage of the population, for each country (with country, % cases of population, % deaths of population as columns)
I have two tables:
countriesAffected(countriesAndTerritories, geoId, countryterritoryCode, popData2019, continentExp)
victimsCases(dateRep, cases, deaths, geoId)
where geoId is the primary key.
I tried to do (*) by this method:
SELECT countriesAndTerritories, (100 * SUM(victimsCases.cases) / popData2019) AS "cases", (100 * SUM(deaths) / popData2019) AS "deaths"
FROM countriesAffected
INNER JOIN victimsCases ON victimsCases.geoId = countriesAffected.geoId
GROUP BY countriesAndTerritories
ORDER BY countriesAndTerritories DESC;
Error: near line 2: near "SELECT countriesAndTerritories": syntax error
But for some reason I get all kinds of syntax errors. I tried to sort it out but with no results, and I'm not sure where I went wrong.
If you are getting the error Error: near line 2: near "SELECT countriesAndTerritories": syntax error then the issue is with LINE 1 (perhaps no ; at the end of line 1).
Otherwise your query works, albeit probably not as intended (as you may well want decimal places for the percentages).
Consider the following, which shows your SQL with additional SQL added to work as intended (see casesV2 and deathsV2, which utilise CAST to force INTEGER to REAL).
DROP TABLE If EXISTS victimsCases;
DROP TABLE IF EXISTS countriesAffected;
CREATE TABLE IF NOT EXISTS countriesAffected (countriesAndTerritories TEXT,geoId INTEGER PRIMARY KEY,countryterritoryCode TEXT,popData2019 INTEGER,continentExp TEXT);
CREATE TABLE IF NOT EXISTS victimsCases (dateRep TEXT,cases INTEGER ,deaths INTEGER,geoId INTEGER);
INSERT INTO countriesAffected VALUES
('X',1,'XXX',10000,'?'),('Y',2,'YYY',20000,'?'),('Z',3,'ZZZ',30000,'?')
;
INSERT INTO victimsCases VALUES
('2019-01-01',100,20,1),('2019-01-02',100,25,1),('2019-01-03',100,15,1),
('2019-01-01',30,5,2),('2019-01-02',33,2,2),
('2019-01-01',45,17,3),('2019-01-02',61,4,3),('2019-01-03',75,7,3)
;
SELECT countriesAndTerritories,
(100 *SUM(victimsCases.cases) / popData2019)as "cases", /* ORIGINAL */
(100 * SUM(deaths) / popData2019) as "deaths", /* ORIGINAL */
CAST(SUM(victimsCases.cases) AS FLOAT) / popData2019 * 100 AS "casesV2",
CAST(SUM(victimscases.deaths) AS FLOAT) / popData2019 * 100 as "deathsV2"
FROM countriesAffected
INNER JOIN victimsCases ON victimsCases.geoId = countriesAffected.geoId
GROUP BY countriesAndTerritories
ORDER BY countriesAndTerritories DESC;
DROP TABLE If EXISTS victimsCases;
DROP TABLE IF EXISTS countriesAffected;
The result of the above is :-
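To illustrate the difference: with the sample data above, country X has 300 total cases and 60 total deaths against a population of 10000, so the ORIGINAL integer-division columns come out as 100 * 300 / 10000 = 3 and 100 * 60 / 10000 = 0 (the 0.6 is truncated), while casesV2 and deathsV2 give 3.0 and 0.6.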
This is on a Windows SQL Server cluster.
The query comes from a 3rd party application, so I cannot modify the query permanently.
The query is:
DECLARE @FromBrCode INT = 1001
DECLARE @ToBrCode INT = 1637
DECLARE @Cdate DATE = '31-mar-2017'
SELECT
a.PrdCd, a.Name, SUM(b.Balance4) as Balance
FROM
D009021 a, D010014 b
WHERE
a.PrdCd = LTRIM(RTRIM(SUBSTRING(b.PrdAcctId, 1, 8)))
AND substring(b.PrdAcctId, 9, 24) = '000000000000000000000000'
AND a.LBrCode = b.LBrCode
AND a.LBrCode BETWEEN @FromBrCode AND @ToBrCode
AND b.CblDate = (SELECT MAX(c.CblDate)
FROM D010014 c
WHERE c.PrdAcctId = b.PrdAcctId
AND c.LBrCode = b.LBrCode
AND c.CblDate <= @Cdate)
GROUP BY
a.PrdCd, a.Name
HAVING
SUM(b.Balance4) <> 0
ORDER BY
a.PrdCd
This particular query takes too much time to complete. The same problem happens on a different SQL Server.
No table lock was found, and processor and memory usage are normal while the query is running.
A plain "SELECT TOP 1000" works and shows output instantly on both tables (D009021, D010014).
Reindexing / rebuilding and updating statistics on both tables (D009021, D010014) did not resolve the problem.
The same query works, but slowly, if we reduce the number of branches:
(
DECLARE @FromBrCode INT = 1001
DECLARE @ToBrCode INT = 1001
)
The same query runs faster, giving output within 2 minutes, if we replace any one variable and use the value directly:
AND a.LBrCode BETWEEN @FromBrCode AND @ToBrCode
changed to
AND a.LBrCode BETWEEN 1001 AND @ToBrCode
The same query also runs faster, giving output within 2 minutes, if we add "OPTION (RECOMPILE)" at the end.
I tried clearing the cached query execution plan and optimizing a new one, but the problem still exists.
I found that the estimated query plan and the actual execution plan are different (see screenshots).
Table D010014 is aliased twice, once as b and once as c, and the two aliases are joined to the same table.
Try to remove the sub query below and create a temp table to store the values you need (see the sketch after it). I added * to mark the fields you self-join on.
SELECT MAX(c.CblDate)
FROM D010014 c
WHERE c.PrdAcctId = b.PrdAcctId
AND c.LBrCode = b.LBrCode
AND c.CblDate <= @Cdate
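A minimal sketch of the temp-table idea (the #LatestCbl name and the final join are illustrative, not taken from the original query):
-- pre-aggregate the latest CblDate per account/branch once, into a temp table
SELECT c.PrdAcctId, c.LBrCode, MAX(c.CblDate) AS MaxCblDate
INTO #LatestCbl
FROM D010014 c
WHERE c.CblDate <= @Cdate
GROUP BY c.PrdAcctId, c.LBrCode;
-- then join #LatestCbl (e.g. aliased as l) in the main query on PrdAcctId and LBrCode,
-- and replace the correlated sub query with: AND b.CblDate = l.MaxCblDate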
If you can't do that, then try:
SELECT TOP 1 c.CblDate
FROM D010014 c
WHERE c.PrdAcctId = b.PrdAcctId
AND c.LBrCode = b.LBrCode
AND c.CblDate <= @Cdate
ORDER BY c.CblDate DESC
I am sending this query to a SQL Server from R using RODBC::sqlQuery:
MERGE "mytable" AS Target USING ( VALUES ('myname','POLYGON ((148.0000000000000000 -20.0000000000000000, 148.0000000000000000 -20.0000000000000000, 148.0000000000000000 -20.0000000000000000, 148.0000000000000000 -20.0000000000000000, 148.0000000000000000 -20.0000000000000000))')) AS Source ("name","polygon")
ON (Target."name" = Source."name")
WHEN MATCHED THEN
UPDATE SET Target."polygon" = Source."polygon"
WHEN NOT MATCHED BY TARGET THEN
INSERT ("name","polygon")
VALUES (Source."name", Source."polygon")
OUTPUT $action, Inserted.*, Deleted.*
It fails when the rows_at_time argument of sqlQuery is more than 10:
Error in odbcQuery(channel, query, rows_at_time) :
'Calloc' could not allocate memory (107374182400 of 1 bytes)
but works if rows_at_time < 10. (Still, the query takes quite a few seconds, which is surprising as the table is indexed and very small: fewer than 100 rows.)
Any idea why?
Thank you
EDIT: This is the structure of the table I am writing on:
I have a SQL command that works great in SQL Server. Here's the query:
SELECT TOP 1000
(
SELECT COUNT(LINENUM)
FROM OEORDD D1
WHERE D1.ORDUNIQ = OEORDD.ORDUNIQ
)
- (SELECT COUNT(LINENUM)
FROM OEORDD D1
WHERE D1.ORDUNIQ = OEORDD.ORDUNIQ
AND D1.LINENUM > OEORDD.LINENUM)
FROM OEORDD
ORDER BY ORDUNIQ, LINENUM
The query looks at the total number of lines on an order and at the current "LINENUM" field. Using the value of the LINENUM field, it counts how many lines on the order have a greater LINENUM value and subtracts that from the total number of lines to get the correct line number.
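For example, if an order has 5 lines and 3 of them have a greater LINENUM than the current row, the expression returns 5 - 3 = 2, i.e. the current row is line 2.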
When I try to add it as a SQL expression in version 14.0.2.364 as follows:
(
(
SELECT COUNT("OEORDD"."LINENUM")
FROM "OEORDD" "D1"
WHERE "D1"."ORDUNIQ" = "OEORDD"."ORDUNIQ"
)
- (SELECT COUNT("OEORDD"."LINENUM")
FROM "OEORDD" "D1"
WHERE "D1"."ORDUNIQ" = "OEORDD"."ORDUNIQ"
AND "D1"."LINENUM" > "OEORDD"."LINENUM"
)
)
I get the error "Column 'SAMDB.dbo.OEORDD.ORDUNIQ' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
If I try to add GROUP BY "OEORDD"."ORDUNIQ" at the end, I get "Incorrect syntax near the keyword 'GROUP'." I've tried adding "FROM OEORDD" at the end of the query and it errors out on the word "FROM". I have the correct tables linked in the Database Expert.
EDIT --------------
I was able to get the first query working by getting rid of the alias; it's as follows:
(
SELECT COUNT(LINENUM)
FROM OEORDD
WHERE OEORDH.ORDUNIQ = OEORDD.ORDUNIQ
)
However, I believe I need to use the alias in the second query to compare line numbers. I'm still stuck on that one.
I am trying to run this query, but it keeps failing on me:
Update StockInvoiceInfo set Quantity = Quantity - 2 where p_id = 5 AND ProductDate = convert(Cast('31-5-2015' as datetime)) ;
After running this code, it returns the error below:
Incorrect syntax near '31-5-2015'
The datatype of the ProductDate column is Date. I am using SQL Server 2012.
You have used the CONVERT function but didn't supply it with parameters. There is also no need for this function here. Also take care of the date format; I have changed it to the standard format:
Update StockInvoiceInfo set Quantity = Quantity - 2
where p_id = 5 AND ProductDate = Cast('2015-05-31' as datetime)
If all you are trying to do is compare a SQL Date, then just use an agnostic format like '20150531' or, easier to read, '2015-05-31'. No need for CAST or CONVERT at all, i.e.
WHERE ... AND ProductDate = '2015-05-31'
However, if ProductDate isn't a date, but one of the DATETIME data types, and you are looking to update any time on the same day, then I believe you are looking for something like:
Update StockInvoiceInfo
set Quantity = Quantity - 2
where
p_id = 5
AND CAST(ProductDate AS DATE) = '2015-05-31';
However, the performance will be lousy as the clause isn't likely to be SARGable. You're better off simply doing:
AND ProductDate >= '2015-05-31' AND ProductDate < '2015-06-01';
(Resist the temptation to use BETWEEN, or hacks like '23:59:59', as there will be corner cases in the data which will bite you.)
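For instance, with a DATETIME column a value of '2015-05-31 23:59:59.997' is missed by BETWEEN '2015-05-31 00:00:00' AND '2015-05-31 23:59:59', but is matched by the half-open range above.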
Use CAST('5-31-2015' AS DATETIME).
With the above update statement, you started a CONVERT but with incomplete syntax; here is the CONVERT syntax:
Update StockInvoiceInfo
set Quantity = Quantity - 2
where p_id = 5
AND ProductDate = convert(datetime,Cast('31-5-2015' as varchar),103) ;