Limit the number of rows counted in Google Data Studio - google-data-studio

I have a scorecard that looks at the number of URL clicks driven by all queries, and it works as expected. I am now trying to display the number of clicks driven by the top 10 queries in the scorecard. I was able to limit the number of rows in my table by disabling pagination to show only the top 10 queries, but now I'm looking to sum those clicks in a scorecard to provide a quick summary rather than a table.

I don't think what you want to do is possible dynamically via the Search Console connector alone. Google Data Studio does not provide any way to calculate rankings via calculated fields, so there is no way to know which queries are in the top 10 without looking at a sorted table. A few imperfect alternatives (roughly in order of increasing complexity):
You apply a filter so that the scorecard only aggregates values above a certain threshold. This would be hardcoded, so you would be filtering on Clicks (i.e. aggregate all URL clicks above 100).
You apply a filter to the scorecard so that it only aggregates clicks from the top 10 URLs. This would not be a dynamically updating filter, so you'd have to look at the table to see which URLs are in the top 10, and that will change as time goes on. The filter would end up looking like: "Include URLs Contains www.google.com,www.stackoverflow.com"
If you do not mind using Google Sheets as an intermediary, you could dump your Search Console data into a spreadsheet so that you can manipulate it however you like, and then use the spreadsheet as the data source for Data Studio (as opposed to the Search Console connector). There appear to be some add-ons you can use out of the box, although I haven't used them myself, so I'm not sure how difficult they are. Alternatively, you can build something yourself with Apps Script and the Search Console API; a rough sketch of that approach follows this list.
You could build a custom Data Studio Community Visualization. (Just because they are called 'Community Visualizations' does not mean you have to make them publicly available.) Essentially, you would be building a scorecard-like component that aggregates the data according to your own rules, although this requires more coding experience. (Before you build one, check whether something that meets your needs already exists in the gallery; at a quick glance, I don't see anything that does.)
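If you take the do-it-yourself route, here is a rough sketch of the idea in Python via the Search Console API (the answer above mentions Apps Script; Python is used here purely for illustration). The service-account key file, property URL, and date range are placeholders you would replace with your own:

from google.oauth2 import service_account
from googleapiclient.discovery import build

# Placeholder credentials and property: replace with your own.
credentials = service_account.Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
)
service = build("searchconsole", "v1", credentials=credentials)

response = service.searchanalytics().query(
    siteUrl="https://www.example.com/",
    body={
        "startDate": "2024-01-01",   # placeholder date range
        "endDate": "2024-01-31",
        "dimensions": ["query"],
        "rowLimit": 10,              # rows come back sorted by clicks, so this is the top 10
    },
).execute()

# Sum the clicks of the top 10 queries: the single number the scorecard needs.
top_10_clicks = sum(row["clicks"] for row in response.get("rows", []))
print(top_10_clicks)

That number could then be written to a Google Sheet on a schedule, and the Sheet used as the Data Studio data source.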

Related

Google Search API Wildcard

I have a Python project running on Google App Engine. I have a set of data currently stored in the Datastore. On the user side, I fetch it from my API and show it to the user in a Google Visualization table with client-side search. Because of the limitations, I can only fetch 1000 records per query. I want my users to search across all the records I have. I could fetch them with multiple queries before showing them, but fetching 1000 records already takes 5-6 seconds, so this process could exceed the 30-second timeout, and I don't think putting around 20,000 records in a table is a good idea.
So I decided to put my records into the Google Search API. I wrote a script to sync the important data between the Datastore and the Search API index. When performing a search, I couldn't find anything like a wildcard character. For example, let's say I have a user field that stores a string containing the value "Ilhan". When a user searches for "Ilha", that record does not show up. I want records containing "Ilhan" to show up even if the term is only partially typed. So basically the SQL equivalent of my search would be something like "select * from users where user like '%ilh%'".
I wonder if there is a way to do that, or is this not how the Search API works?
I set up similar functionality purely within the Datastore. I have a repeated computed property that contains all the search substrings that can be formed for a given object.
from google.appengine.ext import ndb

class User(ndb.Model):
    # ... other fields

    # all_substrings is a helper (defined elsewhere) that returns every
    # substring of the given strings; see the sketch further below.
    search_strings = ndb.ComputedProperty(
        lambda self: [i.lower() for i in all_substrings(strings=[
            self.email,
            self.first_name,
            self.last_name,
        ])],
        repeated=True)
Your search query would then look like this:
User.query(User.search_strings == search_text.strip().lower()).fetch_page(20)
If you don't need the other features of the Google Search API, and if the number of substrings per entity won't put you at risk of hitting the 900-property limit, then I'd recommend doing this instead, as it's pretty simple and straightforward.
As for taking 5-6 seconds to fetch 1000 records: do you need to fetch that many? Why not fetch only 100, or even 20, and use a query cursor so the user pulls the next page only if they need it.
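The all_substrings helper referenced in the model above is not a built-in; you would write it yourself. A minimal sketch of what such a helper might look like (my own illustration, not the original author's code):

def all_substrings(strings, min_length=3):
    """Return every substring of length >= min_length taken from the given strings."""
    result = set()
    for s in strings:
        if not s:
            continue
        for start in range(len(s)):
            for end in range(start + min_length, len(s) + 1):
                result.add(s[start:end])
    return list(result)

# Example: all_substrings(strings=["Ilhan"]) contains "Ilh", "Ilha", "Ilhan", "lha", "lhan", "han".

Keeping min_length small enough for typical prefixes, but not too small, bounds how many substrings each entity stores, which matters for the property limit mentioned above.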

Find out the popularity of values

I have a table with 1000 rows. Each row represents a prompt text for an application. To start with, I only want to translate the most used 20% of the prompts. In daily use, some dialogs appear more often than others, so the prompt texts for the most displayed dialogs get fetched more often than the others.
However, it looks to me like there is no built-in mechanism to analyse the data by how often it is selected.
There are no triggers on SELECT. There is no way to filter the data in the Profiler. There is no way to filter data in an audit. Is that true?
Are there any options to do that inside SQL Server?
No. There is no way to track how frequently data is selected.
This sounds like application metrics. You will have to write the metrics logic yourself.
For example, you might create a MissingTranslations table that tracks the frequency of requests. If your application detects a missing translation, it inserts a row into this table with a frequency of 1, or increments the counter if the row already exists.
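For illustration only, here is a minimal sketch of that insert-or-increment step from Python with pyodbc, using a hypothetical MissingTranslations(PromptId, Frequency) table; the table, columns, and connection string are assumptions, not details from the original answer:

import pyodbc

# Hypothetical table: MissingTranslations(PromptId INT PRIMARY KEY, Frequency INT)
UPSERT_SQL = """
MERGE MissingTranslations AS target
USING (SELECT ? AS PromptId) AS source
    ON target.PromptId = source.PromptId
WHEN MATCHED THEN
    UPDATE SET Frequency = target.Frequency + 1
WHEN NOT MATCHED THEN
    INSERT (PromptId, Frequency) VALUES (source.PromptId, 1);
"""

def record_missing_translation(conn, prompt_id):
    # Call this from the application whenever a prompt has no translation.
    cursor = conn.cursor()
    cursor.execute(UPSERT_SQL, prompt_id)
    conn.commit()
    cursor.close()

# Usage (connection string is a placeholder):
# conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};SERVER=...;DATABASE=...")
# record_missing_translation(conn, prompt_id=42)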
You could then write another application that sorts the missing translations by frequency descending. When a user enters the translation, the translation app removes the entry from the list of missing translations or marks it as complete.
All that being said, you could abuse some SQL Server features to get some information. For example, a stored procedure that returns these translations could generate a user-configurable trace event with the translation info. A SQL Profiler session could listen for these events and write them to a table. This would get you a basic frequency.
It might be possible to get the same information from implementing auditing and then calling sys.fn_get_audit_file, but that sounds cumbersome at best.
In my opinion, it is easier and more stable to write this logic yourself.
As @TabAlleman put it: "no, there's nothing you can do"

How to fetch thousands of records from a database without slowing down?

I want an auto-search option in a textbox, with the data fetched from the database. I have thousands of rows in my database table (almost 8-10,000 rows). I know how to achieve this, but since I am fetching thousands of rows it takes a lot of time. How can I achieve this without it slowing down? Should I follow some other methodology apart from simple fetching methods? I am using Oracle SQL Developer for the database.
Besides the obvious solutions involving indexes and caching: if this is web technology, then depending on your tool you can sometimes set a minimum length before the server call is made. Here is a jQuery UI example: https://api.jqueryui.com/autocomplete/#option-minLength
"The minimum number of characters a user must type before a search is performed. Zero is useful for local data with just a few items, but a higher value should be used when a single character search could match a few thousand items."
It depends on your web interface, but you can use two techniques:
Paginate your data: if your requirements are to accept empty values and to show all the results, load them in blocks of a predefined size. Google, for example, paginates search results. On Oracle, pagination is done using the ROWNUM pseudocolumn (see this response); a sketch of the pattern follows this list. Beware: you must first issue a query with an ORDER BY and then enclose it in a new one that uses ROWNUM. Other databases that use the LIMIT keyword behave differently. If you apply the pagination technique to a drop-down, you end up with an infinite scroll (see this response for example).
Limit your data by imposing a filter that restricts the number of rows returned; your search only displays results after the user has typed at least n characters in the field.
You can combine 1 and 2, but unless you find an existing web component (a jQuery one, for example) it may be a difficult task if you don't have JavaScript knowledge.
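As mentioned in point 1, here is a rough sketch of the ROWNUM pagination pattern driven from Python with cx_Oracle; the table name (prompts), column (prompt_text), and page size are placeholders, not details from the question:

import cx_Oracle  # assumes the Oracle client libraries are available

# Classic ROWNUM pagination: order in the innermost query, then window on ROWNUM.
PAGE_SQL = """
SELECT prompt_text
FROM (
    SELECT t.*, ROWNUM rn
    FROM (
        SELECT prompt_text
        FROM prompts
        WHERE LOWER(prompt_text) LIKE :term
        ORDER BY prompt_text
    ) t
    WHERE ROWNUM <= :last_row
)
WHERE rn > :first_row
"""

def fetch_page(conn, search_term, page, page_size=20):
    cursor = conn.cursor()
    cursor.execute(PAGE_SQL, {
        "term": search_term.lower() + "%",
        "first_row": page * page_size,
        "last_row": (page + 1) * page_size,
    })
    rows = [r[0] for r in cursor.fetchall()]
    cursor.close()
    return rows

# fetch_page(conn, "err", page=0) returns the first 20 matches; page=1 returns the next 20.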

Using a search server with CakePHP

I am trying to implement a customized search in my application. The table structure is given below.
main table:
teacher
sub tables:
skills
skill_values
cities
city_values
The search is triggered by location, which is stored in the table city_values with the reference fields user_id and city_id. The name of the city and its latitude and longitude are found in the table cities.
The search also includes skills; the table relations are similar to those for cities. The users table and the skill_values table are related by the field user_id in skill_values, and the tables skills and skill_values are related by the field skill_id in skill_values.
We need to find the location of the user who performs the search and filter the results to within a 20-mile radius. There are a few other filters as well.
My problem is that I need to filter these results without a page reload, so I am using AJAX, but if the number of records increases, my AJAX request will take a long time to get a response.
Would it be a good idea to use an open-source search server like Sphinx or Solr for fetching the results?
I am using CakePHP for development and my application is hosted on a cloud server.
... but if the number of records increases, my AJAX request will take a long time to get a response.
Regardless of the search technology, there should be a pagination mechanism of some kind.
You should therefore be able to set the limit or maximum number of results returned per page.
When a user performs a search query, you can use Javascript to request the first page of results.
You can then simply increment the page number and request the second, third, fourth page, and so on.
This should mean that the top N results always appear in roughly the same amount of time.
It's then up to you to decide whether you want to request each page of search results sequentially (i.e. in the callback for each successful response), or wait for some kind of user input (i.e. clicking a 'more' link or scrolling to the end of the results).
The timeline/newsfeed pages on Twitter or Facebook are good examples of this technique.

Millions of rows auto-complete field - implementation ideas?

I have a location autocomplete field which offers autocomplete for all countries, cities, neighborhoods, villages, and zip codes. This is part of a location tracking feature I am building for my website, so you can imagine this list will run into the multi-millions of rows; I'm expecting over 20 million at least, with all the villages and postal codes included. To make the autocomplete work well I will use memcached so we don't always hit the database to get this list. It will be used a lot, as this is the primary feature of the site. But the question is:
Is only one instance of the list stored in memcached irrespective of the users pulling the info, or does it need to maintain a separate instance for each user? So if, say, 20 million people are using it at the same time, will that differ from just one person using the location autocomplete? I am also open to other ideas on how to implement this location autocomplete so that it performs well.
Or can I do something like this: when a user logs in, I send them the list in the background anyway, so by the time they reach the autocomplete text field their computer already has it and can load it instantly?
Take a look at Solr (or Lucene itself); using NGram (or EdgeNGram) tokenizers you can get good autocomplete performance on massive datasets.
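To make the suggestion concrete: edge n-gram indexing stores every leading prefix of each term at index time, so a typed prefix becomes an exact lookup at query time. Solr's EdgeNGram filter does this for you; the toy Python below only illustrates the idea and is not a Solr configuration:

def edge_ngrams(term, min_gram=2, max_gram=20):
    # Leading prefixes ("edge n-grams") of a term, e.g. "paris" -> ["pa", "par", "pari", "paris"].
    term = term.lower()
    return [term[:i] for i in range(min_gram, min(len(term), max_gram) + 1)]

# At index time, store the n-grams alongside each location; at query time, an
# exact lookup on the typed prefix returns the candidate completions.
index = {}
for place in ["Paris", "Parma", "Porto"]:
    for gram in edge_ngrams(place):
        index.setdefault(gram, set()).add(place)

print(sorted(index["par"]))   # ['Paris', 'Parma']

A search server like Solr keeps one shared index on the server, so the data is not duplicated per user.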
