Browsing old versions of Wikipedia using the Wikipedia API

I would like to parse Wikipedia data through time snapshots of the site, using the Wikipedia API.
Although it seems possible to browse the different versions of a single article, I cannot find a way to browse articles as of a specific date or timespan.
Is there a way to do something like this with the API?
For instance, the following Python code returns the first 500 current categories.
import requests as rq
S = rq.Session()
url = "https://fr.wikipedia.org/w/api.php"
PARAMS = {
    "action": "query",
    "format": "json",
    "list": "allcategories",
    "acmin": 100,
    "aclimit": 500
}
R = S.get(url=url, params=PARAMS)
DATA = R.json()
However, if I wanted the first 500 categories that existed on Wikipedia in January 2015, how would I do that?
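For the per-article case mentioned in the question, the MediaWiki revisions query accepts a start timestamp, so one request can ask for the latest revision at or before a given date. The sketch below only builds the parameters; the function name and the example title are illustrative, not part of the question. It covers a single article only and does not reconstruct what list=allcategories would have returned on that date.

```python
def revision_params(title, as_of):
    """Build query parameters for the latest revision of `title` at or before `as_of`.

    `as_of` is an ISO 8601 UTC timestamp such as "2015-01-31T23:59:59Z".
    """
    return {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "titles": title,
        "rvlimit": 1,          # only the first revision found
        "rvdir": "older",      # walk backwards in time from rvstart
        "rvstart": as_of,
        "rvprop": "ids|timestamp|content",
        "rvslots": "main",
    }


# "Paris" is a hypothetical example title.
params = revision_params("Paris", "2015-01-31T23:59:59Z")
```

The resulting dict can be passed as params to the S.get(url=url, params=...) call from the question.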

Related

OCAPI Batch request to get all Orders after a Certain Date

I'm new to SFCC OCAPI. My goal is to export all orders from "development.demandware.net" created after a certain date, and this may run quite frequently, for example once every 2 days. I'm currently doing this in Python with the endpoint "s/{{SITEID}}/dw/shop/v18_1/order_search". The problem is that one call returns only 25 records, so for the next call I have to change the query dynamically to start from record 26. If I have 10,000 records, that makes up to 400 calls every time the script runs. The alternative options I'm aware of are:
OCAPI batch requests
OCAPI export job (I tried this, but I don't know enough to set it up)
So, I'd like to know whether my goal is achievable using a batch request. I tried to do this by following the documentation, and the code below returned a 200 response with no response body.
url = f"https://{DOMAIN}/s/-/dw/batch"
url_param = {'client_id': CLIENT_ID}
header = {'Authorization': 'Bearer ' + token,
          'Origin': f'https://{DOMAIN}',
          'Content-Type': 'multipart/mixed;boundary=23dh3f9f4',
          'x-dw-http-method': 'POST',
          'x-dw-resource-path': 's/{{SITEID}}/dw/shop/v18_8/order_search'}
body = """
{
    "query": {
        "filtered_query": {
            "query": { "match_all_query": {} },
            "filter": {
                "range_filter": {
                    "field": "creation_date",
                    "from": "%s",
                    "from_inclusive": true
                }
            }
        }
    },
    "select": "(**)",
    "sorts": [{
        "field": "order_no",
        "sort_order": "asc"
    }],
    "start": %s
}""" % (RETRIVE_RECORDS_FROM, startRecordFrom)
response = requests.post(url, params=url_param, headers=header, data=body)
My code doesn't have an x-dw-content-id because the above is the initial request. If it is possible to achieve my goal:
How should my sub-requests look?
After that, how do I retrieve the data for my request? Is there an endpoint I should use to get the batch results?
I may be asking for too much information, but I couldn't find much about this online, so I had to put every question I have in one post.
My question might look similar to "Salesforce Commerce Cloud/Demandware - OCAPI query orders by date range", but I'm looking for information about batch requests and about reducing the number of API calls.
Thanks in advance.
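Independently of batch requests, the number of calls depends directly on the page size. The sketch below reproduces the arithmetic from the question (10,000 records at 25 per call) and shows how a larger page would change it. It assumes that order_search accepts a count field next to start, that start is a zero-based offset, and that a page size of 200 is allowed; all three should be checked against the OCAPI documentation for your version. The function names are illustrative.

```python
def page_starts(total, page_size):
    """Return the `start` offsets needed to cover `total` records."""
    return list(range(0, total, page_size))


def order_search_body(created_from, start, count):
    """Build one order_search request body (as a dict) for a single page."""
    return {
        "query": {
            "filtered_query": {
                "query": {"match_all_query": {}},
                "filter": {
                    "range_filter": {
                        "field": "creation_date",
                        "from": created_from,
                        "from_inclusive": True,
                    }
                },
            }
        },
        "select": "(**)",
        "sorts": [{"field": "order_no", "sort_order": "asc"}],
        "start": start,
        "count": count,  # assumption: page size is configurable via `count`
    }


calls_at_25 = len(page_starts(10_000, 25))    # 400 calls, as in the question
calls_at_200 = len(page_starts(10_000, 200))  # 50 calls, if 200 per page is allowed
```

Each dict can be serialized with json.dumps and sent as the request body in place of the hand-formatted string.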

response.map is not a function

I'm trying to loop over a response to extract the IDs from a previous request. I followed the steps in this video https://www.youtube.com/watch?v=4wuvgX-egdc but I can't get it to work. As I see it, the problem is that {} is not an array, but I would like to search within "campaigns", which seems to be an array. (As you can probably tell, I'm new to this.)
Here's the response to the request I sent, which I would like to loop through to extract the IDs that I wish to use in the next request. (There are several hundred IDs.)
{
    "campaigns": [
        {
            "id": 373894,
            "name": "Benriach",
            "created_at": "2022-01-21 13:37:34",
            "sent_at": "2022-01-21 13:37:53",
            "status": "sent",
            "type": "text_message"
        },
Here's the test that I'm trying to run.
const response = pm.response.json();
const campaignids = response.map (campaignid => campaigns.id);
console.log(campaignids);
pm.variables.set('campaignids', campaignids);
Here's how it looks: [screenshot]
The end goal is to use Postman to extract campaign statistics from an e-mail marketing tool and then send it on into Google Data Studio where I want to create a dashboard for e-mail-campaigns using both data from the e-mail marketing tool as well as website data.
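The problem is in this line:
const campaignids = response.map (campaignid => campaigns.id);
The callback parameter is named campaignid, but the body reads campaigns.id, which is not defined. Also, response is an object ({}), so map has to be called on the campaigns array inside it. The space between map and ( is only a style issue.
const response = pm.response.json();
const campaignids = response.campaigns.map(campaign => campaign.id);
console.log(campaignids);
pm.variables.set('campaignids', campaignids);
Make sure the value you call map on is an array.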
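The response shown in the question is an object whose "campaigns" key holds the array, so the IDs have to be read from that key rather than from the top level. As a language-neutral illustration of the data shape, here is a minimal Python sketch of the same extraction; the function name is illustrative and the sample payload reuses values from the question.

```python
import json

# Sample payload shaped like the response in the question.
payload = json.loads("""
{
  "campaigns": [
    {"id": 373894, "name": "Benriach", "status": "sent", "type": "text_message"}
  ]
}
""")


def campaign_ids(response):
    """Collect the id of every entry under the top-level "campaigns" key."""
    campaigns = response.get("campaigns", [])
    if not isinstance(campaigns, list):
        raise TypeError("expected 'campaigns' to be a list")
    return [campaign["id"] for campaign in campaigns]


ids = campaign_ids(payload)
print(ids)  # [373894]
```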

google cloud online glossary creation returning "empty resource name" error

I am following the exact steps described here
https://cloud.google.com/translate/docs/glossary#create-glossary
to create an online glossary.
I am getting the following error:
madan@cloudshell:~ (focused-pipe-251317)$ ./rungcglossary
{
    "error": {
        "code": 400,
        "message": "Empty resource name.; Resource type: glossary",
        "status": "INVALID_ARGUMENT"
    }
}
Here is the body of my request.json
{
    "languageCodesSet": {
        "languageCodes": ["en", "en-GB", "ru", "fr", "pt-BR", "pt-PT", "es"]
    },
    "inputConfig": {
        "gcsSource": {
            "inputUri": "gs://focused-pipe-251317-vcm/testgc.csv"
        }
    }
}
I copied the inputUri path from the file URI box of the Google Cloud bucket.
I can't work out what the issue is. All I know is that something is wrong with the inputUri string.
Please help.
Thanks.
I am a Google Cloud Technical Support Representative. We are aware that there is currently an issue with the REST API, and it is being tracked. I tried to reproduce your situation, and when I created the glossary by calling the API directly I got the same error as you.
After that, I created the glossary programmatically using an HTTP-triggered Python Cloud Function, and everything worked. This way the API is called with the Cloud Functions service account.
Here is the code of my Python Cloud Function:
from google.cloud import translate_v3beta1 as translate
def create_glossary(request):
    request_json = request.get_json()
    client = translate.TranslationServiceClient()
    ## Set your project name
    project_id = 'your-project-id'
    ## Set your wished glossary-id
    glossary_id = 'your-glossary-id'
    ## Set your location
    location = 'your-location'  # The location of the glossary
    name = client.glossary_path(
        project_id,
        location,
        glossary_id)
    language_codes_set = translate.types.Glossary.LanguageCodesSet(
        language_codes=['en', 'es'])
    ## SET YOUR BUCKET URI
    gcs_source = translate.types.GcsSource(
        input_uri='your-gcs-source-uri')
    input_config = translate.types.GlossaryInputConfig(
        gcs_source=gcs_source)
    glossary = translate.types.Glossary(
        name=name,
        language_codes_set=language_codes_set,
        input_config=input_config)
    parent = client.location_path(project_id, location)
    operation = client.create_glossary(parent=parent, glossary=glossary)
    result = operation.result(timeout=90)
    print('Created: {}'.format(result.name))
    print('Input Uri: {}'.format(result.input_config.gcs_source.input_uri))
The requirements.txt should include the following dependencies:
google-cloud-translate==1.4.0
google-cloud-storage==1.14.0
Do not forget to replace the placeholders in the code with your own values.
Basically, I followed the same tutorial as you, but for Python, and I used Cloud Functions. My guess is that you can use App Engine Standard as well. This may be an issue with the service account used to call the API. If this doesn't work for you, let me know and I will update my answer.
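One visible difference between the two attempts is that the Python code sets a full resource name on the glossary (via glossary_path), while the request.json in the question has no name field. Whether that alone explains the "Empty resource name" error is an assumption. The sketch below builds a request body that includes the name in the projects/.../locations/.../glossaries/... format; the function names, the us-central1 location, and the bucket path are placeholders.

```python
import json


def glossary_name(project_id, location, glossary_id):
    """Format the full glossary resource name."""
    return f"projects/{project_id}/locations/{location}/glossaries/{glossary_id}"


def glossary_request_body(project_id, location, glossary_id, language_codes, input_uri):
    """Build a create-glossary request body that carries the resource name."""
    return {
        "name": glossary_name(project_id, location, glossary_id),
        "languageCodesSet": {"languageCodes": language_codes},
        "inputConfig": {"gcsSource": {"inputUri": input_uri}},
    }


# All values below are placeholders, not taken from a real project.
body = glossary_request_body(
    "your-project-id",
    "us-central1",
    "your-glossary-id",
    ["en", "es"],
    "gs://your-bucket/your-file.csv",
)
print(json.dumps(body, indent=2))
```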

Wrong currency info being returned when calling spot price endpoint using Python

I have noticed that the data endpoint for getting the spot price is returning the wrong currency information when using Python. I am using a currency_pair of BTC-USD but getting results for GBP.
Example:
price = client.get_spot_price(currency_pair = 'BTC-USD')
Response:
{
"amount": "5578.85",
"base": "BTC",
"currency": "GBP"
}
Any ideas on what's causing this problem?
A workaround, though not using the official Coinbase client, would be as follows:
import requests
import json
# Send the API version header to avoid a warning
headers = {
    'CB-VERSION': '2017-12-08'
}
# Make the request
data = requests.get('https://api.coinbase.com/v2/prices/BTC-USD/sell/', headers=headers).text
# Parse the response and get the amount
price = json.loads(data)['data']['amount']
Obviously this is not very robust: it has none of the error handling, exception handling, or assertions that the official client should have and that you would need before trusting it for actual buys, sells, or transfers.
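As one way to add a little of that checking, the sketch below parses the response text and verifies that the returned currency is the one that was requested, which is exactly the mismatch reported in the question. It assumes the payload has the {"data": {"amount", "base", "currency"}} shape used by the workaround; parse_price is an illustrative name, not part of any Coinbase client.

```python
import json
from decimal import Decimal, InvalidOperation


def parse_price(text, expected_currency):
    """Return the amount as a Decimal, or raise ValueError if the payload looks wrong."""
    try:
        data = json.loads(text)["data"]
        amount = Decimal(data["amount"])
        currency = data["currency"]
    except (json.JSONDecodeError, KeyError, TypeError, InvalidOperation) as exc:
        raise ValueError(f"unexpected price payload: {exc}") from exc
    if currency != expected_currency:
        # Catches the case from the question: asked for USD, got GBP.
        raise ValueError(f"expected {expected_currency}, got {currency}")
    return amount
```

With the workaround's request, this would be called as parse_price(data, 'USD').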
EDIT: UPDATE
Apparently this is a known issue:
Read here:
https://github.com/coinbase/coinbase-python/issues/32
Supposedly it is already fixed on the GitHub master branch, though the fix is obviously not reflected in the pip version yet.
Quoting user kflecki:
I fixed this by going into the client.py file and modifying the code to look like this. Works just fine now, however would be nice for the files to come like this. But it's a simple fix that you can do on your own.
def get_spot_price(self, **params):
    """https://developers.coinbase.com/api/v2#get-spot-price"""
    if 'currency_pair' in params:
        currency_pair = params['currency_pair']
    else:
        currency_pair = 'BTC-USD'
    response = self._get('v2', 'prices', currency_pair, 'spot', data=params)
    return self._make_api_object(response, APIObject)
And now the command works like so:
eth_price = client.get_spot_price(currency_pair = 'ETH-USD')

Google speech API v1beta1 (syncrecognize and asyncrecognize API call)

I am a Java developer and I have a couple of questions about the Google Speech API v1beta1.
Question 1 (syncrecognize case):
I uploaded a small audio file (less than one minute long) through GCS to the Google Speech API. It works, but the confidence level in the output is only 0.32497215, and the transcript does not exactly match my audio input.
How can I increase the confidence level of the output?
Question 2 (asyncrecognize case):
I tried a large audio file (more than one minute long). In this case I used the API call:
https://speech.googleapis.com/v1beta1/speech:asyncrecognize?key=XXXXXXXXXXXXXXXXXXXX
and the payload:
"{"config":{"encoding":"LINEAR16","sample_rate": 16000},"audio":{"uri":"gs://" + bucketName +"/"+ objectName + ""}}"
Here I got output JSON like
{"name": "57...........................95"}.
After getting this output, I made a new API call (Operations interface) with this name value:
https://speech.googleapis.com/v1beta1/operations/57.................................95?key=XXXXXXXXXXXXXXXXX
and got the output
{
    "name": "57....................................95",
    "done": true,
    "response": {
        "@type": "type.googleapis.com/google.cloud.speech.v1beta1.AsyncRecognizeResponse"
    }
}
How do I proceed with this value? I need to get the transcribed text of the audio.
Please help me fix these issues. Thanks in advance.
Ideas for Question 1:
You should give more details in the RecognitionConfig object, for example by specifying the languageCode and adding hints via the SpeechContext object.
Answer to Question 2:
Check the sample rate of the audio file; it must be equal to the rate you gave in the request. You can check it, for example, with the command soxi audio_file.flac (sox is needed for this one).
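If sox is not available, the same check can be sketched in Python with the standard wave module. This only works for WAV files (such as LINEAR16 audio stored as .wav), not for FLAC, and the function names are illustrative.

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def matches_request_rate(path, requested_rate):
    """True if the file's sample rate equals the rate sent in the request config."""
    return wav_sample_rate(path) == requested_rate
```

For the request in the question, the value to compare against would be the 16000 passed as sample_rate.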

Resources