How to retrieve more than 50 records using the Spotipy API

I'm using the Spotipy API to retrieve song data from Spotify. Here's my code:
import pandas as pd
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id='<my_client_id>',
                                                           client_secret='<my_client_secret>'))

results = sp.search(q="artist:guns n' roses", limit=50)

d = []
for idx, track in enumerate(results['tracks']['items']):
    d.append({
        'Track': track['name'],
        'Album': track['album']['name'],
        'Artist': track['artists'][0]['name'],
        'Release Date': track['album']['release_date'],
        'Track Number': track['track_number'],
        'Popularity': track['popularity'],
        'Explicit': track['explicit'],
        'Duration': track['duration_ms'],
        'Audio Preview URL': track['preview_url'],
        'Album URL': track['album']['external_urls']['spotify']
    })

pd.DataFrame(d)
Per the docs, Spotify limits each search request to 50 records.
Is it possible to retrieve all records for a given string search? (e.g. by chunking requests, etc.)
Thanks!

The Spotify Web API's search endpoint can return a maximum of 1,000 items per query. (In this example the search found 390 tracks, so the loop below retrieves all of them.)
Here is the code to get them:
import pandas as pd
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id='<my_client_id>',
                                                           client_secret='<my_client_secret>'))

d = []
total = 1   # temporary value so the loop runs at least once
offset = 0
while offset < total:
    results = sp.search(q="artist:guns n' roses", type='track', offset=offset, limit=50)
    total = results['tracks']['total']  # actual number of matches, reported by the API
    offset += 50  # advance to the next page of 50
    for idx, track in enumerate(results['tracks']['items']):
        d.append({
            'Track': track['name'],
            'Album': track['album']['name'],
            'Artist': track['artists'][0]['name'],
            'Release Date': track['album']['release_date'],
            'Track Number': track['track_number'],
            'Popularity': track['popularity'],
            'Explicit': track['explicit'],
            'Duration': track['duration_ms'],
            'Audio Preview URL': track['preview_url'],
            'Album URL': track['album']['external_urls']['spotify']
        })

pd.DataFrame(d)
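One caveat: if a search matches more than 1,000 tracks, the loop above will eventually request an offset past the endpoint's cap and the API will reject the call. Here is a minimal sketch of one way to guard against that, clamping the reported total; it reuses the sp client from above, and MAX_ITEMS is my own name for the documented ceiling, not a spotipy constant:

MAX_ITEMS = 1000  # assumed ceiling on offset + limit for the search endpoint

d = []
total = 1  # temporary value so the loop runs at least once
offset = 0
while offset < total:
    results = sp.search(q="artist:guns n' roses", type='track', offset=offset, limit=50)
    # clamp so we never page past the API's ceiling, even if more matches exist
    total = min(results['tracks']['total'], MAX_ITEMS)
    offset += 50
    d.extend(results['tracks']['items'])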

Related

Find() takes no keyword arguments (web scraping)

Please help me find the error, as I didn't write the for loop correctly:
from bs4 import BeautifulSoup
import requests
import pandas as pd

url = 'https://www.imdb.com/chart/top/?ref_=nv_mv_250'
response = requests.get(url)
with open("imdb_top_250_movies.html", mode='wb') as file:
    file.write(response.content)
soup = BeautifulSoup(response.content, 'lxml')

df_list = []
for movie in soup:
    title = movie.find('td', class_="titleColumn").find('a').contents[0]
    year = movie.find('td', class_="titleColumn").find('span').contents[0][1:-1]
    user_rating = movie.find('td', class_="ratingColumn imdbRating").find('strong').contents[0]
    df_list.append({'title': title,
                    'year': int(year),
                    'user_ratings': float(user_rating)})

df = pd.DataFrame(df_list, columns=['title', 'year', 'user_ratings'])
df
This is the error I got:
TypeError                                 Traceback (most recent call last)
Input In [125], in <cell line: 8>()
      9 soup = BeautifulSoup(response.content, 'lxml')
     10 df_list = []
---> 11 title = movie.find('td' , class_="titleColumn").find('a').contents[0]
     12 year = soup.find('td' , class_="titleColumn").find('span').contents[0][1:-1]
     13 user_rating = soup.find('td' , class_="ratingColumn imdbRating").find('strong').contents[0]

TypeError: find() takes no keyword arguments
Someone helped me with this answer, as I had written the for loop incorrectly. (Iterating directly over soup yields the document's top-level nodes, including NavigableString objects; their find() is the plain str.find(), which takes no keyword arguments. You have to iterate over the table rows instead.)
from bs4 import BeautifulSoup
import requests
import pandas as pd

url = 'https://www.imdb.com/chart/top'
response = requests.get(url)
with open("imdb_top_250_movies.html", mode='wb') as file:
    file.write(response.content)
soup = BeautifulSoup(response.content, 'lxml')

df_list = []
for movie in soup.find('tbody', class_="lister-list").find_all('tr'):
    place = movie.find('td', class_="titleColumn").contents[0][1:-len('.\n ')]
    title = movie.find('td', class_="titleColumn").find('a').contents[0]
    year = movie.find('td', class_="titleColumn").find('span').contents[0][1:-1]
    user_rating = movie.find('td', class_="ratingColumn imdbRating").find('strong').contents[0]
    df_list.append({'place': place,
                    'title': title,
                    'year': int(year),
                    'user_ratings': float(user_rating)})

df = pd.DataFrame(df_list, columns=['place', 'title', 'year', 'user_ratings'])
df.style.hide(axis='index')

Matomo API "Actions.getPageUrls" returns only 100 rows on REST API call

I am trying to fetch data from the Matomo API method "Actions.getPageUrls" using the code below:
import requests
import pandas as pd

api_url = "baseapi"
PARAMS = {'module': 'API',
          'method': 'Actions.getPageUrls',
          'period': 'range',
          'date': '2019-01-01,2020-01-01',
          'filter_limit': '-1',
          'idSite': '1',
          'format': 'JSON',
          'expanded': '1',
          'token_auth': "tocken"}

r = requests.post(url=api_url, params=PARAMS, verify=False)
print(r.url)

matomo_df = pd.DataFrame(r.json())
matomo_df.head()
matomo_df['label']

matomo_df = pd.DataFrame(r.json()[0]['subtable'])
matomo_df
But it returns only 100 rows.
I want to get more than 100 rows. Could you please help me?
By default the API returns only 100 rows. Setting the filter_limit parameter to -1 is supposed to return all rows; if that doesn't work, try setting filter_limit to a large explicit value such as 10000.
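For illustration, a minimal sketch of that suggestion, reusing the PARAMS dict and api_url from the question (note the key is filter_limit with an underscore, matching the question's code):

# same request as above, but with an explicit large row limit instead of -1
PARAMS['filter_limit'] = '10000'
r = requests.post(url=api_url, params=PARAMS, verify=False)
matomo_df = pd.DataFrame(r.json())
print(len(matomo_df))  # should now exceed 100 if more rows exist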

Paytm checksum mismatch in ios swift4

I am integrating the Paytm gateway in an iOS app. I generate a checksum through an API call and then pass this checksum, along with the parameters below, to a web view.
The web view is loading with these params:
{
"CALLBACK_URL" = "https://securegw-stage.paytm.in/theia/paytmCallback?ORDER_ID=SJ-1552029210215";
"CHANNEL_ID" = WAP;
CHECKSUMHASH = "FRMuR8LLFvg3wkIf6gp4BqVnNigr6WvUaSm9EVJIJo6Z5RicUvU7acRGZlEfK1FGoeNSqN53R0OphttHFnJuZ0lKeAZDvrG7Pr5ZnlzTEUw=";
"CUST_ID" = 64;
EMAIL = "jitu123#gmail.com";
"INDUSTRY_TYPE_ID" = Retail;
MID = rxazcv89315285244163;
"MOBILE_NO" = 566789877;
"ORDER_ID" = "SJ-1552029210215";
"TXN_AMOUNT" = "246.0";
WEBSITE = WEBSTAGING;
}
I am getting back the response below:
{
"ORDERID" : "SJ-1552029210215",
"MID" : "rxazcv89315285244163",
"TXNAMOUNT" : "246.00",
"BANKTXNID" : "",
"RESPCODE" : "330",
"STATUS" : "TXN_FAILURE",
"CURRENCY" : "INR",
"RESPMSG" : "Paytm checksum mismatch."
}
How can I solve the Paytm checksum mismatch issue?

How to use cursors for search in GAE?

I've read the manual, but I can't understand how to specify paginated searches using the technique it describes. Here's my code:
def find_documents(query_string, limit, cursor):
    try:
        subject_desc = search.SortExpression(
            expression='date',
            direction=search.SortExpression.DESCENDING,
            default_value=datetime.now().date())
        # Sort up to 1000 matching results by date in descending order
        sort = search.SortOptions(expressions=[subject_desc], limit=1000)
        # Set query options
        options = search.QueryOptions(
            limit=limit,  # the number of results to return
            cursor=cursor,
            sort_options=sort,
            #returned_fields=['author', 'subject', 'summary'],
            #snippeted_fields=['content']
        )
        query = search.Query(query_string=query_string, options=options)
        index = search.Index(name=_INDEX_NAME)
        # Execute the query
        return index.search(query)
    except search.Error:
        logging.exception('Search failed')
        return None
class MainAdvIndexedPage(SearchBaseHandler):
    """Handles search requests for comments."""

    def get(self):
        """Handles a get request with a query."""
        regionname = 'Delhi'
        region = Region.all().filter('name = ', regionname).get()
        uri = urlparse(self.request.uri)
        query = ''
        if uri.query:
            query = parse_qs(uri.query)
            query = query['query'][0]
        results = find_documents(query, 50, search.Cursor())
        next_cursor = results.cursor
        template_values = {
            'results': results, 'next_cursor': next_cursor,
            'number_returned': len(results.results),
            'url': url, 'user': users.get_current_user(),
            'url_linktext': url_linktext, 'region': region, 'city': '',
            'request': self.request, 'form': SearchForm(), 'query': query
        }
        self.render_template('indexed.html', template_values)
The code above works and performs a search, but it doesn't page the results. I wonder about the following code in the manual:
next_cursor = results.cursor
next_cursor_urlsafe = next_cursor.web_safe_string
# save next_cursor_urlsafe
...
# restore next_cursor_urlsafe
results = find_documents(query_string, 20,
                         search.Cursor(web_safe_string=next_cursor_urlsafe))
What is next_cursor used for? How do I save it, and what is the purpose of saving it? How do I get a cursor in the first place? Should the code look something like this instead, using memcache to save and restore the cursor?
class MainAdvIndexedPage(SearchBaseHandler):
    """Handles search requests for comments."""

    def get(self):
        """Handles a get request with a query."""
        regionname = 'Delhi'
        region = Region.all().filter('name = ', regionname).get()
        uri = urlparse(self.request.uri)
        query = ''
        if uri.query:
            query = parse_qs(uri.query)
            query = query['query'][0]
        # restore next_cursor_urlsafe
        next_cursor_urlsafe = memcache.get('results_cursor')
        if last_cursor:
            results = find_documents(query_string, 50,
                                     search.Cursor(web_safe_string=next_cursor_urlsafe))
        results = find_documents(query, 50, search.Cursor())
        next_cursor = results.cursor
        next_cursor_urlsafe = next_cursor.web_safe_string
        # save next_cursor_urlsafe
        memcache.set('results_cursor', results.cursor)
        template_values = {
            'results': results, 'next_cursor': next_cursor,
            'number_returned': len(results.results),
            'url': url, 'user': users.get_current_user(),
            'url_linktext': url_linktext, 'region': region, 'city': '',
            'request': self.request, 'form': SearchForm(), 'query': query
        }
        self.render_template('indexed.html', template_values)
Update
From what I understand of the answer, I'm supposed to use an HTTP GET query string to pass the cursor, but I still don't know exactly how. Please tell me how.
Update 2
This is my new effort.
def get(self):
    """Handles a get request with a query."""
    regionname = 'Delhi'
    region = Region.all().filter('name = ', regionname).get()
    cursor = self.request.get("cursor")
    uri = urlparse(self.request.uri)
    query = ''
    if uri.query:
        query = parse_qs(uri.query)
        query = query['query'][0]
    logging.info('search cursor: %s', search.Cursor())
    if cursor:
        results = find_documents(query, 50, cursor)
    else:
        results = find_documents(query, 50, search.Cursor())
    next_cursor = None
    if results and results.cursor:
        next_cursor = results.cursor.web_safe_string
    logging.info('next cursor: %s', str(next_cursor))
    template_values = {
        'results': results, 'cursor': next_cursor,
        'number_returned': len(results.results),
        'user': users.get_current_user(),
        'region': region, 'city': '', 'request': self.request,
        'form': SearchForm(), 'query': query
    }
I think I've understood how it's supposed to work with the above, and it outputs a cursor on the first request, so now I know how to get a cursor in the first place. But I get this error message: cursor must be a Cursor, got unicode
No, you should not use memcache for that, especially with a constant key like 'results_cursor': that would mean all users get the same cursor, which would be bad.
You are already passing the cursor to the template context (although you should convert it to the web_safe_string, as you do in the second example). In the template, make sure the cursor string is included in the GET parameters of your "next" button; then, back in the view, extract it from there and pass it into the find_documents call.
Apart from the memcache issue, you're almost there with the second example, but you should obviously make sure the second call to find_documents is inside an else block so it doesn't overwrite the cursored results.
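Putting those points together, here is a minimal sketch of the corrected handler under those assumptions ('cursor' is the GET parameter name from Update 2; the key detail is wrapping the web-safe string back into a search.Cursor, which is also what resolves the "cursor must be a Cursor, got unicode" error):

def get(self):
    """Handles a get request with a query."""
    query = self.request.get('query', '')
    cursor_string = self.request.get('cursor')  # web-safe string from the "next" link, if any
    if cursor_string:
        # rebuild a real Cursor object; passing the raw unicode string
        # is what raises "cursor must be a Cursor, got unicode"
        results = find_documents(query, 50, search.Cursor(web_safe_string=cursor_string))
    else:
        results = find_documents(query, 50, search.Cursor())
    next_cursor = None
    if results and results.cursor:
        next_cursor = results.cursor.web_safe_string
    self.render_template('indexed.html', {
        'results': results,
        'query': query,
        'cursor': next_cursor,  # the template puts this in the "next" link's query string
    })

In the template, the "next" link would then carry both parameters, for example: <a href="/search?query={{ query }}&cursor={{ cursor }}">Next</a>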

Django ORM order_by

I have two models, Task and TaskComment:
class Task(models.Model):
    title = models.CharField(max_length=200)
    creationDate = models.DateTimeField('date created')
    lastUpdateDate = models.DateTimeField('date updated')
    description = models.CharField(max_length=5000)


class TaskComment(models.Model):
    task = models.ForeignKey(Task, related_name='comments')
    message = models.CharField(max_length=5000)
    creationDate = models.DateTimeField('date created')
Imagine a page where all tasks are listed. The thing I want to do is order the tasks by the number of comments linked to each task. I've tried several things like:
Task.objects.all().order_by("comments__count")
But it didn't work.
Can you help me?
You need annotation:
from django.db.models import Count

Task.objects.annotate(num_comments=Count('comments')).order_by('-num_comments')
Note that because the ForeignKey declares related_name='comments', the reverse relation is queried as 'comments', not 'taskcomment'.
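For illustration, a minimal sketch of using the annotated queryset (the loop is hypothetical, not part of the answer):

from django.db.models import Count

# most-commented tasks first; each Task gains a num_comments attribute
tasks = Task.objects.annotate(num_comments=Count('comments')).order_by('-num_comments')
for task in tasks:
    print(task.title, task.num_comments)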
