How to do GROUP BY with COUNT() and ordering by COUNT in influxdb? - analytics

I'm using InfluxDB to record user visits to dictionary pages, and I'm trying to get some queries working.
For example, I'm trying to get a set of headwords sorted by the number of visits to each word's definition within some timeframe. So basically, words sorted by number of visits.
I'm trying something like this:
SELECT COUNT(*) FROM lookup GROUP BY word ORDER BY count_value LIMIT 100
But it doesn't work. The error message is "Server returned error: error parsing query: only ORDER BY time supported at this time".
Is what I'm trying to do not achievable in InfluxDB?

As noted by the returned error:
Server returned error: error parsing query: only ORDER BY time supported at this time
InfluxDB only supports ORDER BY time at the moment. To achieve the result that you're looking for you'd need to do the ORDER BY client side.
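A minimal sketch of that client-side ORDER BY in Python. The rows here are invented sample data standing in for whatever your InfluxDB client returns for the GROUP BY word counts; only the sort-and-limit step is the point:

```python
# Sample (word, count) pairs as they might come back from
# SELECT COUNT(value) FROM lookup GROUP BY word
# (the data is invented for illustration)
rows = [
    ("aardvark", 12),
    ("zephyr", 45),
    ("quixotic", 3),
]

# Sort by count descending and keep the top 100 -- the client-side
# equivalent of ORDER BY count_value DESC LIMIT 100.
top_words = sorted(rows, key=lambda r: r[1], reverse=True)[:100]

print(top_words)  # [('zephyr', 45), ('aardvark', 12), ('quixotic', 3)]
```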

Related

How do I achieve this output in a Google Data Studio report, given the raw data?

I have a small set of test records in Google Data Studio and am attempting to create a table that gives me breakdowns of particular values relative to the total number of values, by dimension. I think the image here below describes the need clearly. I have tried using an approach I saw online entailing creating a calculated field like this:
case when Action = 'Clicked' then 1 else 0 end
and then creating a metric based on that field using 'Percentage of Total - Relative to Corresponding Data', but this produces incorrect numbers and seems really cumbersome (needing one calculated field per distinct value). My client wants this exact tabular presentation, not charts.
How do I achieve the desired report?
Thanks!
The solution entails creating calculated fields like 'Opened', which outputs 1 when Action = 'Opened' and 0 otherwise. Then create fields like 'Opened Rate' as e.g. SUM(Opened) / Record Count, and format those fields as percentages.
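The logic behind those calculated fields can be sketched in Python. The records below are invented sample data; the point is that each per-value rate is just a 0/1 indicator field summed and divided by the record count:

```python
# Invented sample records, one per event
records = [
    {"Action": "Opened"},
    {"Action": "Clicked"},
    {"Action": "Opened"},
    {"Action": "Bounced"},
]

# 'Opened' calculated field: 1 when Action = 'Opened', else 0
opened = [1 if r["Action"] == "Opened" else 0 for r in records]

# 'Opened Rate' = SUM(Opened) / Record Count
opened_rate = sum(opened) / len(records)

print(f"{opened_rate:.0%}")  # 50%
```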

Running a Query with Too Much Raw Data

I am trying to run a query to give me the results of past incidents over a period of time. The raw data source for the query has too much data to return any of the newer information. I have tried nesting the query in an array to split up the way the query runs, but that didn't correct the problem.
=query(IMPORTRANGE("Spread Sheet Key","Coaching Responses!A:Z"),"select Col1,Col14,Col4,Col3,Col7,Col6,Col8,Col15,Col10,Col11,Col13,Col17,Col18,Col19,Col20,Col21,Col22,Col26 where Col23 contains '"&Cover!E2&"' Order by Col1 desc",1)
I also tried this formula:
=Query({Importrange("Spread Sheet Key","Coaching Responses!A:Z5000"),Importrange("Spread Sheet Key","Coaching Responses!A5001:Z")} "select Col1,Col14,Col4,Col3,Col7,Col6,Col8,Col15,Col10,Col11,Col13,Col17,Col18,Col19,Col20,Col21,Col22,Col26 where Col23 contains '"&Cover!E2&"')
The first formula will not return anything past row 5000; the second keeps giving me a parse error.
The correct formula syntax should be:
=QUERY({IMPORTRANGE("ID1", "Coaching Responses!A1:Z5000");
IMPORTRANGE("ID1", "Coaching Responses!A5001:Z")},
"select Col1,Col14,Col4,Col3,Col7,Col6,Col8,Col15,Col10,Col11,Col13,Col17,Col18,Col19,Col20,Col21,Col22,Col26
where Col23 contains '"&Cover!E2&"'")

Gmail API messages/list q after:{timestamp} does not work properly

Hello!
I'm trying to get the list of messages, and to filter them I use q=after:{timestamp}.
I do the following query
After getting the message id I do a query to get the details of the message:
As you can see timestamp in the query and internalDate of the message are the same.
When I increment the timestamp value to 1559717792 and run the query, I get the same result:
In my view, the result should be empty because the internalDate is less than 1559717792. Is this an issue, or is it my mistake?
Thank you!
The Gmail API uses the same search syntax as the web interface, and it's documented here:
https://support.google.com/mail/answer/7190
Specifically, it never says that "after:<epochSeconds>" works; it only gives the option of a formatted date, "after:YYYY/MM/DD". Empirically, <epochSeconds> does seem to work, but it's not documented (so beware that it's not guaranteed to be supported and may break at any time). There also seem to be some rounding issues within the same second, so you may have to add or subtract a second to reliably get the results you want if you need that level of accuracy.
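One way to handle that same-second rounding is to widen the window by a second when building the search string. This is a hypothetical helper, not part of the Gmail API; the function name and the one-second margin are assumptions:

```python
def after_query(epoch_seconds, margin=1):
    """Build a Gmail search 'q' value from an epoch timestamp,
    widened by `margin` seconds to sidestep same-second rounding.
    Undocumented behavior: after:<epochSeconds> may break at any time."""
    return f"after:{epoch_seconds - margin}"

q = after_query(1559717791)
print(q)  # after:1559717790
```

You would then pass the result as the `q` parameter of messages.list and, if needed, filter the returned messages by their exact internalDate client-side.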

Log Parser: HAVING Wildcards

I have a log parser query that gets the top 200 uris, however I don't want any cs-uri-stem entries that have a dot (.) in them.
This is as close as I've come, but it seems like the wildcards are not acting as I expected:
"SELECT TOP 200 cs-uri-stem, COUNT(*) AS Total INTO \Top200URIs_NoDots.csv
FROM "\2015-01\U*.log"
GROUP BY cs-uri-stem
HAVING cs-uri-stem NOT LIKE '%.%'
ORDER BY Total DESC"
When I run this I get an Error:
... HAVING cs-uri-stem NOT LIKE ''...
Error: Syntax Error: <having-clause>: not a valid <expression>
Why is it ignoring the '%'s and everything between?
HAVING is for filtering group results, using aggregate functions on the grouped data. Filtering grouped data is more processing-intensive because the grouping must complete first, so in this case your query is better served by a WHERE clause anyway. Also, remember to use %% if this query lives in a batch file: a single % denotes a batch variable and won't make it through to the program's arguments, which is why your pattern arrives as an empty string ('').
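The filter-before-group idea can be sketched in Python. The stems below are invented sample cs-uri-stem values; the shape mirrors WHERE (filter rows first), GROUP BY (count), and TOP 200 ... ORDER BY Total DESC (most_common):

```python
from collections import Counter

# Invented sample of cs-uri-stem values
stems = ["/home", "/img/logo.png", "/about", "/home", "/app.js"]

# WHERE cs-uri-stem NOT LIKE '%.%': filter BEFORE grouping,
# so rows containing a dot never reach the aggregation step.
counts = Counter(s for s in stems if "." not in s)

# TOP 200 ... ORDER BY Total DESC
top = counts.most_common(200)

print(top)  # [('/home', 2), ('/about', 1)]
```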

How to force table select to go over blocks

How can I make Sybase's database engine return an unsorted list of records in non-numeric order?
~~~
I need to reproduce an application error: I select from a table whose IDs are generated in sequence, but the ID returned is not the last one in the selection.
Let me explain.
ID STATUS
_____________
1234 C
1235 C
1236 O
Above are 3 IDs. I had code where these would be the results of
select #p_id = ID from table where (conditions)
However, there wasn't a clause checking for status = 'O' (open). Remember that Sybase saves the last returned record into the variable.
~~~
I'm being asked to give the testing team something that will make the results not work. If Sybase selects the above as an unordered list, it could appear in ascending order, or, if the database engine needs to change blocks of stored data or some such technical magic, the order could be messed up. The original error was that the procedure would return, say, 1234 instead of 1236.
Is there a way that I can have a 100% guarantee that Sybase will search over a block of data and have to double back, effectively 'breaking' the ascending search, and returning not the last record, but any other one? (all records except the maximum will end up erroring, because they are all 'Closed')
I want some sort of magical SQL that will make sure the table isn't searched in exactly numeric order. Ideally I'd like to not have to change the procedure, as the test team want to see the exact same procedure breaking (as easily as plonking an ORDER BY id DESC in would fudge the results).
If you don't specify an order, there is no way to guarantee the return order of the results. It will be whatever order the index produces, which can depend on the order of insertion, the type of index, and the content of the index keys.
It's generally a bad idea to do those sorts of singleton SELECTs. You should always specify a specific record with the WHERE clause, or use a cursor, or TOP n or similar. The problem comes when someone tries to understand your code: when they see multiple hits, some databases take the first value, some take the last value, some take a random value (they call that "implementation-defined"), and some throw an error.
Is this by any chance related to 1156837? :)
