aiohttp request times out when using a proxy, but is fine when I run without a proxy

I am attempting to scrape a site using a proxy in aiohttp with asyncio but whenever I run the code with a proxy it times out. I have removed the proxy from the code because it is a DC proxy. Does anyone know why my request would be timing out? It does work when I run without a proxy.
Code:
import aiohttp
import asyncio
import random

async def fetch(session):
    UAList = ['Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) '
              'Version/13.1.1 Safari/605.1.15',
              'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:77.0) Gecko/20100101 Firefox/77.0',
              'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) '
              'Chrome/83.0.4103.97 '
              'Safari/537.36',
              'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:77.0) Gecko/20100101 Firefox/77.0',
              'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
              'Chrome/83.0.4103.97 Safari/537.36']
    # choose a random User-Agent (from the list above) for the header
    rUserAgent = random.choice(UAList)
    rHeader = {
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,'
                  'application/signed-exchange;v=b3;q=0.9',
        'accept-encoding': 'gzip, deflate, br',
        'accept-language': 'en-US,en;q=0.9',
        'cache-control': 'max-age=0',
        'cookie': 'cookie',
        'sec-fetch-dest': 'document',
        'sec-fetch-mode': 'navigate',
        'sec-fetch-site': 'same-origin',
        'sec-fetch-user': '?1',
        'upgrade-insecure-requests': '1',
        'user-agent': rUserAgent
    }
    proxy = 'http://'
    url = 'http://www.yeezysupply.com/product/GY7657'
    async with session.get(url=url, allow_redirects=False, headers=rHeader, proxy=proxy) as resp:
        assert resp.status == 200
        return await resp.text()

async def main():
    conn = aiohttp.TCPConnector()
    async with aiohttp.ClientSession() as session:
        html = await fetch(session)
        print(html)

policy = asyncio.WindowsSelectorEventLoopPolicy()
asyncio.set_event_loop_policy(policy)
loop = asyncio.get_event_loop()
loop.run_until_complete(main())

Since I cannot tell from the post: your proxy URL should have one of the following structures:
http://username:password@proxy_url:proxy_port or http://proxy_url:proxy_port
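As a quick sanity check (the credentials below are hypothetical), the standard library's URL parser only recognises the user:password pair when '@' separates it from the host; with '#' everything after the hash is treated as a URL fragment and the credentials are lost:

```python
from urllib.parse import urlsplit

# Correct form: user:pass, then '@', then host:port.
good = urlsplit("http://username:password@proxy.example.com:8080")
print(good.username, good.hostname, good.port)  # username proxy.example.com 8080

# With '#' the parser sees no credentials at all: the text after the
# hash becomes the fragment, not the proxy host.
bad = urlsplit("http://username:password#proxy.example.com:8080")
print(bad.username, bad.fragment)  # None proxy.example.com:8080
```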
Please try to redefine your main() like this:
async def main():
    conn = aiohttp.TCPConnector()
    async with aiohttp.ClientSession(connector=conn, timeout=aiohttp.ClientTimeout()) as session:
        html = await fetch(session)
        print(html)
During my tests, I found that the assert resp.status == 200 check would fail if the target URL used http:// instead of https://, since http:// resulted in a redirect (status code 301) for me.


Scrape HTML data in order to create a monitor. How do I find the availability button so I can scrape it and use it for my product monitor?

I already found the endpoint with the products on it. Since there is no bot protection, it shouldn't be too difficult, I hope. I'm new to monitors, so I would be glad to receive some help.
import requests
import json
from bs4 import BeautifulSoup

headers = {
    'authority': 'www.otto.de',
    'accept': '*/*',
    'accept-language': 'de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7',
    # Requests sorts cookies= alphabetically
# 'cookie': f"visitorId=adf2777f-77ae-4140-a70a-94bca78fb3cf.v1; BrowserId=e0869ad3-79c9-4a7b-bdb3-1dac1c0b6724.v1; consentId=d8050879-021f-472c-a892-5a3e2d5ca942; cb=1; odn_id=e0869ad3-79c9-4a7b-bdb3-1dac1c0b6724.v1; _ga=GA1.2.1701684572.1640776274; aditionUserId=7046727957523724431; lId=fdd5b7d4bb3cdb20cb7392f1871c91f83ca5d221bfcc51f3e4a094587b3f15602a5310b86b217239056fdaae98fb0c4bf69bf0bebff649fd65b07cd93d0277381f6594acb5491b0c553d57d9662502d4ca29eb5d2d8ddb1f5684ebf7ccc0b769; cb=1; iabConsent=CPR8-VgPcGIPAEcACBENCXCgAP_AAAAAAAYgF7wBYATgBQACwBF4C8wF7gAAAGCQAQF5ioAIC8xkAEBeY6AEAi8BeZKAEAi8BeZSAEAi8BeY.f_gAAAAAAAAA||V1; mpt_rate_comparator_3399=0.2796272063841032|1660386738956; mpt_saw_Artikeldetailseite=0|1659004338965; mpt_visit_count=1|1659004338966; mpt_visit_timeout=1657794738966|1659004338967; mpt_initial_referrer=https%3A%2F%2Fwww.google.com%2F|1659004338967; mpt_referrer_forPoll=https%3A%2F%2Fwww.google.com%2F|1659004338968; FP_NITRO_SID=43c1122a-9c4b-4586-b2e4-53b02d24ebaa; mpt_tracking_active_polls_3399=1|session; mpt_poll.diced.customrate.664=0|1659004339435; mpt_poll.diced.customrate.3778=0|1659004339444; mpt_poll.diced.customrate.4195=0|1659004339454; mpt_poll.diced.customrate.4211=0|1659004339459; mpt_poll.diced.customrate.4216=1|1659004339460; mpt_poll.diced.customrate.4234=0|1659004339463; mpt_poll.diced.customrate.4241=0|1659004339465; mpt_poll.diced.customrate.4242=0|1659004339470; mpt_vid=165779473948161600|1720866739481; _gid=GA1.2.762140440.1657794740; dprt_main=v_id:0181fc443d8400153751a094fb730506f0068067007e8{_sn:1$_se:1$_ss:1$_st:1657796539590$ses_id:1657794739590%3Bexp-session$_pn:1%3Bexp-session;} ADS_NITRO=1.adex.1657794737915; FP_NITRO=1.abfa5a8e-5a21-416d-80dd-4a1b3aa7bba4.0.43c1122a-9c4b-4586-b2e4-53b02d24ebaa.1640694206368.1657794738856.1657794752645.1657794739112.10; mpt_pi_count=2|1659004352678; mpt_saw_ADS=0|1659004352679; mpt_Baumarkt=0|1659004352681; mpt_Multimedia=0|1659004352681; mpt_usedNavigation=0|1659004352680; 
    # mpt_tracking_active_3399=0|1660386752682; mpt_poll.diced.customrate.1802=0|1659004352689; mpt_poll.diced.customrate.3440=0|1659004352695; mpt_poll.diced.customrate.3950=0|1659004352699; mpt_poll.diced.customrate.4165=0|1659004352704; mpt_sawADS_directly=0|1659004352710; mpt_sawCheckout_4244=0|1659004352720; mpt_sawCheckout_4245=0|1659004352722; _uetsid=3e59d2c0036011edaa9d5722a379d455; _uetvid=09b9add0689811ecabc3691baa958505; devSpecs=w=601&h=609&bp=m",
    'referer': 'https://www.otto.de/suche/ventilator/?verkaeufer=garosa&l=gq&sortiertnach=preis-aufsteigend',
    'sec-ch-ua': '".Not/A)Brand";v="99", "Google Chrome";v="103", "Chromium";v="103"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'sec-fetch-dest': 'empty',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'same-origin',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
    'x-find-trackable': 'san_ListEventType=tilelistview&ot_PageCluster=Suchergebnisseite&ot_VisibleUrl=%2Fsuche%2Fventilator%2F%3Fverkaeufer%3Dgarosa%26l%3Dgq%26sortiertnach%3Dpreis-aufsteigend&ts_Type=event&eventMergeId=7b3764fc-934f-f9f3-abef-d07883d8ebba&san_Interaction=sorting_change&san_SortingChange=preis-aufsteigend&ts_RemoveLabels=wk.nav_UnfilteredSelectionRule&san_SortingInitialVisible=true',
    'x-requested-with': 'XMLHttpRequest',
}
response = requests.get(
    'https://www.otto.de/centipede/tilelist?rule=(und.(ist.verkaeufer.garosa).(suchbegriff.ventilator).(~.(v.1)))&l=gq&sortiertnach=preis-aufsteigend',
    headers=headers,
)
data = response.json()
soup = BeautifulSoup(response.text, 'html.parser')
print(response.text)
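Since the goal is a stock monitor, the next step after the request is filtering the JSON for available products. The real shape of the /centipede/tilelist response has to be checked in the browser's network tab, so the field names below (tiles, title, available) are assumptions purely for illustration:

```python
import json

# Hypothetical payload shape standing in for the real endpoint response.
sample = json.loads("""
{
  "tiles": [
    {"title": "Ventilator A", "price": "19,99", "available": true},
    {"title": "Ventilator B", "price": "24,99", "available": false}
  ]
}
""")

def in_stock(payload):
    # Keep only the tiles flagged as available: the core of a monitor loop.
    return [t["title"] for t in payload.get("tiles", []) if t.get("available")]

print(in_stock(sample))  # ['Ventilator A']
```

A monitor would poll the endpoint on an interval and alert when this list changes.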

Filepond React: Revert call does not contain any id (solved)

Bear with me, I've looked at the similar questions but they are of no help.
The React filepond project works great for uploading images, but reverting is not working. The DELETE call does not send to the server which file is actually being deleted. That makes it impossible to clean up files that the user does not want to be uploaded.
Here is my React Code.
return <FilePond
    labelIdle={label}
    name={name}
    files={files}
    allowMultiple={maxFiles !== 1}
    maxFiles={maxFiles}
    server={{
        process: {
            url: baseURL + `/${name}/image`,
            method: 'POST',
            headers: {
                Authorization: 'Bearer ' + token,
                userid
            },
        },
        revert: baseURL + `/${name}/image/revert`
    }}
    onremovefile={(err: any, file: any) => {
        debugger
    }}
    onupdatefiles={(fileItems: any) => onUpdateFiles(fileItems.map((fileItem: any) => fileItem.file))}
/>
The cURL request looks like this:
curl 'http://localhost:4000/tenant/image/revert' -X DELETE -H 'Connection: keep-alive' -H 'Origin: http://localhost:3000' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36' -H 'Accept: */*' -H 'Sec-Fetch-Site: same-site' -H 'Sec-Fetch-Mode: cors' -H 'Referer: http://localhost:3000/tenant' -H 'Accept-Encoding: gzip, deflate, br' -H 'Accept-Language: en-US,en;q=0.9' --compressed
As you can see in the cURL request there is no ID being sent to the server.
It's now solved. The back-end didn't send a proper ID back to the component after the upload, so FilePond had nothing to include in the revert request. Returning the ID fixed it:
return res.send(imageId)
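For context on why the missing ID broke revert: FilePond sends the ID returned by the process endpoint back as the plain-text body of the revert DELETE request, not as a URL parameter. A framework-agnostic sketch of the server side (the names here are hypothetical, not from the question's back-end):

```python
def handle_revert(request_body: str, pending_uploads: dict) -> bool:
    """Delete a temporary upload identified by the revert request body."""
    # FilePond puts the server id (whatever the process endpoint returned)
    # in the raw DELETE body, so an empty process response means an empty body.
    file_id = request_body.strip()
    # Remove the pending upload if we know it; report whether anything was deleted.
    return pending_uploads.pop(file_id, None) is not None

pending = {"abc123": "/tmp/uploads/abc123.png"}
print(handle_revert("abc123", pending))  # True, and pending is now empty
```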

Coinbase Pro API - invalid signature

This is edited from original post:
From the docs:
Signing a Message The CB-ACCESS-SIGN header is generated by creating a
sha256 HMAC using the base64-decoded secret key on the prehash string
timestamp + method + requestPath + body (where + represents string
concatenation) and base64-encode the output. The timestamp value is
the same as the CB-ACCESS-TIMESTAMP header.
Here is information from a key I deleted. This is from Coinbase Pro Sandbox:
publicKey: 06057d5b5e03d0f8587a248330402b21
passPhrase: gcgs6k6rp0f
secretKey: EFAToD5heo66GIgZlT2TIZzJf8TYlmxyeRxRYDHTBv3lTt9XN6uaNS0RNAy0os/caR47x6EiPDOV3Ik+YzrfEA==
I'm using Angular, specifically the crypto-js library:
private generateSignaturePro(timestamp: string, method: string, resourceUrl: string, requestBody: string): string {
    var prehash: string = timestamp + method + resourceUrl + requestBody;
    var key = (Buffer.from(this.secretKey, 'base64')).toString();
    return crypto.enc.Base64.stringify(crypto.HmacSHA256(prehash, key));
}
Server time is Time: 2019-05-20T19:01:38.711Z Epoch: 1558378898.711 (from /time endpoint)
Here is my request and the server response:
Request:
Request URL: https://api-public.sandbox.pro.coinbase.com/accounts
Request Method: GET
Status Code: 400
Remote Address: 104.16.161.226:443
Referrer Policy: no-referrer-when-downgrade
Request Headers:
Provisional headers are shown
Accept: application/json, text/plain, */*
CB-ACCESS-KEY: 06057d5b5e03d0f8587a248330402b21
CB-ACCESS-PASSPHRASE: gcgs6k6rp0f
CB-ACCESS-SIGN: 0cc2BnQYdUhLucXSPwMTjpHjJ32G3RXSH44rSsEopvjAtY90uRCMVy6xUrzg/A/aRJBLqx390fcZc7lmJeP++g==
CB-ACCESS-TIMESTAMP: 1558378899
Referer: https://localhost:44342/dashboard
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36
Response Headers:
access-control-allow-headers: Content-Type, Accept, cb-session, cb-fp
access-control-allow-methods: GET,POST,DELETE,PUT
access-control-allow-origin: *
access-control-expose-headers: cb-before, cb-after, cb-gdpr
access-control-max-age: 7200
cache-control: no-store
cf-cache-status: MISS
cf-ray: 4da08f74ba97cf68-IAD
content-length: 31
content-type: application/json; charset=utf-8
date: Mon, 20 May 2019 19:01:38 GMT
etag: W/"1f-4RjKVp8I05+xcnQ5/G16yRoMSKU"
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
status: 400
strict-transport-security: max-age=15552000; includeSubDomains
vary: Accept-Encoding
x-content-type-options: nosniff
x-dns-prefetch-control: off
x-download-options: noopen
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
Response:
{"message":"invalid signature"}
What am I doing wrong?
EDIT: Changed method to the SHA 256 version. Still doesn't work.
I ran into the same issue, and my code was basically the same as yours. I changed to the following (C#) and it finally worked. Oddly, Coinbase Pro is the only exchange I've had signature trouble with so far. In any case, here is the code that worked for me. Hope this helps; it would have saved me hours.
public string ComputeSignature(
    HttpMethod httpMethod,
    string secret,
    double timestamp,
    string requestUri,
    string contentBody = "")
{
    var convertedString = System.Convert.FromBase64String(secret);
    var prehash = timestamp.ToString("F0", CultureInfo.InvariantCulture) + httpMethod.ToString().ToUpper() + requestUri + contentBody;
    return HashString(prehash, convertedString);
}

private string HashString(string str, byte[] secret)
{
    var bytes = Encoding.UTF8.GetBytes(str);
    using (var hmaccsha = new HMACSHA256(secret))
    {
        return System.Convert.ToBase64String(hmaccsha.ComputeHash(bytes));
    }
}
From the gdax-java library (as it was named before "Coinbase Pro"), the generate-signature method is:
String prehash = timestamp + method.toUpperCase() + requestPath + body;
byte[] secretDecoded = Base64.getDecoder().decode(secretKey);
keyspec = new SecretKeySpec(secretDecoded, "HmacSHA256");
sha256 = (Mac) GdaxConstants.SHARED_MAC.clone();
sha256.init(keyspec);
return Base64.getEncoder().encodeToString(sha256.doFinal(prehash.getBytes()));
At least on initial inspection, the code you're using specifies SHA512 rather than HmacSHA256, so I'd suspect that to be a probable cause.
There is also more help with NodeJS in the right hand column here for generating the signatures. https://docs.pro.coinbase.com/#creating-a-request
Had the same issue here. For me the answer was to use luxon DateTime instead of the native js Date functions as shown in the coinbase docs.
Here is the typescript that works for me. You can use the results of this function to populate your request headers.
import crypto from 'crypto';
import { DateTime } from 'luxon';

export const auth = (
    method: 'GET' | 'POST',
    path: string,
    body?: Record<string, unknown>
) => {
    const timestamp = DateTime.utc().toMillis() / 1000;
    let message = timestamp + method + path;
    if (body) {
        message += JSON.stringify(body);
    }
    const secret = Buffer.from('YOUR_SECRET', 'base64');
    const hmac = crypto.createHmac('sha256', secret);
    return {
        'CB-ACCESS-KEY': 'YOUR_KEY',
        'CB-ACCESS-PASSPHRASE': 'YOUR_PASSPHRASE',
        'CB-ACCESS-SIGN': hmac.update(message).digest('base64'),
        'CB-ACCESS-TIMESTAMP': timestamp.toString()
    };
};
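For completeness, the same signing scheme sketched in Python. The key point, which matches the bug in the question's code, is that the base64-decoded secret must stay raw bytes: Buffer.from(secret, 'base64').toString() decodes the key and then re-encodes it as a string, corrupting it.

```python
import base64
import hashlib
import hmac

def cb_sign(secret_b64: str, timestamp: str, method: str, path: str, body: str = "") -> str:
    # Prehash per the docs quoted above: timestamp + METHOD + requestPath + body.
    prehash = timestamp + method.upper() + path + body
    # Decode the secret to raw bytes; do NOT convert it back to a string.
    key = base64.b64decode(secret_b64)
    digest = hmac.new(key, prehash.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")

# Example with a throwaway secret (not a real key):
sig = cb_sign(base64.b64encode(b"demo-secret").decode(), "1558378899", "GET", "/accounts")
print(len(sig))  # 44, a base64-encoded 32-byte SHA-256 digest
```

The result goes in CB-ACCESS-SIGN, with the same timestamp in CB-ACCESS-TIMESTAMP.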

"Download Network Failed" - Chrome failes on 160kb tiff file in React Redux app

So I have been looking around. Obviously there's a fair amount on the web about these "Download Network Failed" problems. I feel our problem is somewhat unique in that it's one file: a 160kb tiff file (really it's a blob that we append a tiff extension to). I just stumbled on this when testing. It's a random image on my machine. I have much bigger and smaller files that process fine through the app. When debugging, the response looks good in Fiddler, like any other good response. Also, tracking the response through our React app, it looks good all the way through. So the problem happens somewhere in Chrome, and just for this one file. We've tried all the standard stuff found here:
https://productforums.google.com/forum/#!topic/chrome/7XBU6g6_Ktc
Mainly fiddling with extensions (disabling them), download locations, reinstalling, etc. But the fact that it's one smaller jpg file we are sending for conversion (the app is a basic converter) has me perplexed. Has anyone ever seen something like this?
So here is how we handle the file in our redux action.
We use these packages:
import dataURLtoBlob from 'dataurl-to-blob';
import FileSaver from 'file-saver';
And we have a dispatch function we pass in for a response in our thunk (the fetch)
export function saveFile(data, fileName) {
    return (dispatch) => {
        var ie = navigator.userAgent.match(/MSIE\s([\d.]+)/),
            ie11 = navigator.userAgent.match(/Trident\/7.0/) && navigator.userAgent.match(/rv:11/),
            ieEDGE = navigator.userAgent.match(/Edge/g),
            ieVer = (ie ? ie[1] : (ie11 ? 11 : (ieEDGE ? 12 : -1)));
        if (ie && ieVer < 10) {
            console.log("No blobs on IE ver<10");
            return;
        }
        var mimeType = data.split(',')[0].split(':')[1].split(';')[0];
        var extension = '';
        if (mimeType.includes("zip")) {
            extension = "zip"
        } else {
            extension = mimeType.substr(mimeType.lastIndexOf('/') + 1);
        }
        var npmBlob = dataURLtoBlob(data);
        if (ieVer > -1) {
            FileSaver.saveAs(npmBlob, fileName + "." + extension);
        } else {
            var downloadLink = document.createElement("a");
            document.body.appendChild(downloadLink);
            downloadLink.style.display = "none";
            downloadLink.href = data;
            downloadLink.download = fileName;
            downloadLink.click();
        }
    }
}
Relevant part of the fetch itself
}).then(response => {
    //debugger;
    var responseObj = JSON.parse(response);
    // handle multi-retrieve
    if (targetExtension.includes("/File/Retrieve")) {
        for (let array of responseObj) {
            if (array.ReturnDocument) {
                if (responseObj.length > 1) {
                    dispatch(saveFile(responseObj[0].ReturnDocument, "testFiles_download"));
                } else {
                    dispatch(saveFile(responseObj[0].ReturnDocument, responseObj[0].ticketID));
                }
            }
        }
    }
    var returnObject = { returnResult: responseObj, loading: false };
    return callback(returnObject);
Everything looks good. HTTP status codes are 200 and all other files are working. There is really nothing special about this jpg we send in as far as we can tell, and it looks good coming back.
Here is the request sent in:
POST http://redacted/api/File/Convert HTTP/1.1
Host: redacted-dev
Connection: keep-alive
Content-Length: 168078
Origin: http://redacted-dev
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryhgZddb45UOHBhsgs
Accept: */*
Referer: http://redacted-dev/ui/Convert
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
Here is the raw response
HTTP/1.1 200 OK
Cache-Control: no-cache
Pragma: no-cache
Content-Type: application/json; charset=utf-8
Expires: -1
Server: Microsoft-IIS/8.5
X-AspNet-Version: 4.0.30319
Persistent-Auth: true
X-Powered-By: ASP.NET
Date: Tue, 02 Jan 2018 14:12:17 GMT
Content-Length: 3707173
Here is what the blob looks like when we get it back (abbreviated):
ReturnDocument=data:image/tiff;base64,SUkqAAg+............
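The mimeType split in saveFile above is what maps that data-URL prefix to an extension. The same parsing, sketched in Python just to make the steps explicit (the sample data URL below is illustrative, not the actual payload):

```python
def mime_from_data_url(data_url: str) -> str:
    # "data:image/tiff;base64,SUkq..." -> header "data:image/tiff;base64"
    # -> "image/tiff;base64" after the colon -> "image/tiff" before the ';'
    return data_url.split(',')[0].split(':')[1].split(';')[0]

def extension_from_mime(mime_type: str) -> str:
    # "image/tiff" -> "tiff", mirroring the substr/lastIndexOf branch.
    return mime_type[mime_type.rfind('/') + 1:]

mime = mime_from_data_url("data:image/tiff;base64,SUkqAAg=")
print(mime, extension_from_mime(mime))  # image/tiff tiff
```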
You can use the file-saver package to download a blob object. A usage example is below:
// FileSaver usage
import FileSaver from 'file-saver';

fetch('/records/download', {
    method: 'post',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(data)
}).then(function(response) {
    return response.blob();
}).then(function(blob) {
    FileSaver.saveAs(blob, 'fileName.zip');
})
One more way to download a file is to make a GET request to an endpoint that sends the file from the server. Then you can simply do the following:
window.open('full server link');
and the file will start downloading.

Set header in AngularJS for GET request

I am trying to send a header using the GET method, but the header is not sent every time.
Here is my app.js:
var config = {
    headers: {
        'Accept': 'application/json;odata=verbose',
        "authId": "236643d1-b4c8-4211-97ba-1f24e93b8568"
    }
};
$http.get("http://52.36.44.187:8080/maistylz-webservice/me", config).then(function(data) {
    console.log(data);
}, function(error) {
    console.log(angular.toJson(error));
});
Here are the request headers:
Accept:*/*
Accept-Encoding:gzip, deflate, sdch
Accept-Language:en-US,en;q=0.8
Access-Control-Request-Headers:accept, authid
Access-Control-Request-Method:GET
Cache-Control:no-cache
Connection:keep-alive
Host:52.36.44.187:8080
Origin:http://localhost
Pragma:no-cache
Referer:http://localhost/swagger/login.php
User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36
Request URL:http://52.36.44.187:8080/maistylz-webservice/me
Request Method:OPTIONS
Status Code:403 Forbidden
Remote Address:52.36.44.187:8080
Response Headers
Access-Control-Allow-Credentials:true
Access-Control-Allow-Headers:origin, content-type, accept, authId
Access-Control-Allow-Methods:GET,POST,DELETE,PUT,HEAD
Access-Control-Allow-Origin:http://localhost
Access-Control-Max-Age:1800
Content-Length:17
Content-Type:application/vnd.sun.wadl+xml
Date:Wed, 20 Apr 2016 06:13:06 GMT
Server:Apache-Coyote/1.1
It is showing me a 403 status. What is wrong with the code? I have to pass authId in the header; it is necessary.
Here is the console log:
catcher.js:197 OPTIONS http://52.36.44.187:8080/maistylz-webservice/me XMLHttpRequest.send # catcher.js:197(anonymous function) # angular.js:11756m # angular.js:11517g # angular.js:11227(anonymous function) # angular.js:15961n.$eval # angular.js:17229n.$digest # angular.js:17045n.$apply # angular.js:17337(anonymous function) # angular.js:1749invoke # angular.js:4665c # angular.js:1747yc # angular.js:1767ee # angular.js:1652(anonymous function) # angular.js:30863b # angular.js:3166Qf # angular.js:3456Pf.d # angular.js:3444
login.php#0:1 XMLHttpRequest cannot load http://52.36.44.187:8080/maistylz-webservice/me. Response for preflight has invalid HTTP status code 403
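That last console line is the actual failure: the custom authId header makes the browser send a preflight OPTIONS request first, and the server is answering that OPTIONS with 403, so the real GET is never sent. Nothing in the Angular code can fix this; the server must answer the preflight with a 2xx status and echo the requested header names back. A minimal sketch (the server framework is left abstract; this function just builds the headers a successful preflight response needs):

```python
def preflight_response_headers(origin: str, requested_headers: str) -> dict:
    # The browser's preflight carries Access-Control-Request-Headers
    # (here "accept, authid"); the server must echo those names back
    # and answer the OPTIONS request itself with a 2xx status.
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Headers": requested_headers,
        "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
        "Access-Control-Max-Age": "1800",
    }

print(preflight_response_headers("http://localhost", "accept, authid"))
```

Note the 403 response above already lists authId in Access-Control-Allow-Headers, so the remaining problem is most likely the non-2xx status the server returns for OPTIONS (for example, an auth filter rejecting the preflight because it carries no credentials).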
