How to find the Cloudflare human verification element using Selenium - selenium-webdriver

The browser is Firefox and the language is Python. I am unable to complete the Cloudflare human verification.
On this website (https://chat.openai.com/chat), I'm unable to find the "mark" element with this code:
verify=WebDriverWait(driver, 10,0.1).until(EC.presence_of_element_located((By.CLASS_NAME, 'mark')))
HTML:
Error Message:
Traceback (most recent call last):
File ,
verify=WebDriverWait(driver, 10,0.1).until(EC.presence_of_element_located((By.CLASS_NAME, 'mark')))
File "...Python310\lib\site-packages\selenium\webdriver\support\wait.py", line 90, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
Stacktrace:
RemoteError#chrome://remote/content/shared/RemoteError.jsm:12:1
WebDriverError#chrome://remote/content/shared/webdriver/Errors.jsm:192:5
NoSuchElementError#chrome://remote/content/shared/webdriver/Errors.jsm:404:5
element.find/</<#chrome://remote/content/marionette/element.js:291:16
Why does this happen, and how can I fix it?

The element <span class="mark">...</span> has visible text in it. So to identify the element, instead of presence_of_element_located() you need to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following locator strategies:
Using CSS_SELECTOR:
element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "label.ctp-checkbox-label span.mark")))
Using XPATH:
element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//label[@class='ctp-checkbox-label']//span[@class='mark']")))
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
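Putting it together, here is a minimal sketch of the whole flow (the Firefox driver, the page URL and the locators come from the question and the answer above; the 20-second timeout is an assumption):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("https://chat.openai.com/chat")
# Wait until the checkbox label's span is actually visible, not merely present in the DOM
verify = WebDriverWait(driver, 20).until(
    EC.visibility_of_element_located((By.CSS_SELECTOR, "label.ctp-checkbox-label span.mark"))
)
print(verify.is_displayed())
Waiting for visibility rather than mere presence avoids matching the element while the challenge widget is still hidden or rendering.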

Related

selenium.common.exceptions.InvalidSelectorException: Message: Given xpath expression is invalid using By.XPATH through Selenium Python

I'm trying to develop an auto-login for Instagram and I ran into the following problem.
Here is my code:
from time import sleep
from selenium import webdriver
from selenium.webdriver.common.by import By
browser = webdriver.Firefox()
browser.implicitly_wait(5)
browser.get('https://www.instagram.com/')
sleep(2)
login_link = browser.find_element(By.XPATH,"//button[text()=´Allow essential and optional cookies`]")
Here is the Error Message:
Traceback (most recent call last):
File "C:\Users\justu\PycharmProject\botinsta\main.py", line 18, in
login_link = browser.find_element(By.XPATH,"//button[text()=´Allow essential and optional cookies]")
File "C:\Users\justu\PycharmProject\botinsta\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 857, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "C:\Users\justu\PycharmProject\botinsta\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 435, in execute
self.error_handler.check_response(response)
File "C:\Users\justu\PycharmProject\botinsta\venv\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 247, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidSelectorException: Message: Given xpath expression "//button[text()=´Allow essential and optional cookies]" is invalid: SyntaxError: Document.evaluate: The expression is not a legal expression
Stacktrace:
WebDriverError#chrome://remote/content/shared/webdriver/Errors.jsm:188:5
InvalidSelectorError#chrome://remote/content/shared/webdriver/Errors.jsm:348:5
find_#chrome://remote/content/marionette/element.js:320:11
element.find/</findElements<#chrome://remote/content/marionette/element.js:274:24
evalFn#chrome://remote/content/marionette/sync.js:136:7
PollPromise/<#chrome://remote/content/marionette/sync.js:156:5
PollPromise#chrome://remote/content/marionette/sync.js:127:10
element.find/<#chrome://remote/content/marionette/element.js:272:24
element.find#chrome://remote/content/marionette/element.js:271:10
findElement#chrome://remote/content/marionette/actors/MarionetteCommandsChild.jsm:245:25
receiveMessage#chrome://remote/content/marionette/actors/MarionetteCommandsChild.jsm:101:31
Can anyone help?
Try single quotes
'Allow essential and optional cookies'
instead of
´Allow essential and optional cookies`
P.S. Since you use browser.implicitly_wait(5), there is no need for time.sleep(2).
You need to take care of a couple of things.
If you pass the XPath within double quotes, i.e. "...", then you need to wrap the attribute values in single quotes, i.e. '...'.
The <button> element is a clickable element, so you need to wait until the element is interactable before clicking it.
Solution
Incorporating the above mentioned points and removing the sleep(2), your effective lines of code will be:
browser.get('https://www.instagram.com/')
login_link = WebDriverWait(browser, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[text()='Allow essential and optional cookies']")))
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
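A minimal end-to-end sketch of the corrected flow (the Instagram URL and the button text come from the question, the 20-second wait from the answer above; the final click() is an assumption about the next step):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Firefox()
browser.get('https://www.instagram.com/')
# Straight single quotes around the button text inside the double-quoted XPath
login_link = WebDriverWait(browser, 20).until(
    EC.element_to_be_clickable((By.XPATH, "//button[text()='Allow essential and optional cookies']"))
)
login_link.click()  # assumption: the cookie banner is dismissed by clicking the located button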

Create a loop for web scraping data with selenium in python

I am new to Python. I got an error when scraping data on a web page: it works the first one or two times, but it returns an error when I run it over one big array. Can anyone please tell me why and how to fix it? Thanks a lot.
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.chrome.service import Service
# load the driver for Chrome
path = Service('C:/Program Files (x86)/chromedriver.exe')
driver = webdriver.Chrome(service=path)
url = 'https://diemthi.mobiedu.vn/?typeExam=TOAN'
driver.get(url)
SBD = 25000001
# for i in range(1,5):
sbd = driver.find_element(By.XPATH, '/html/body/app-root/app-full-layout/app-home/div/div[1]/div/div[3]/input')
search = driver.find_element(By.XPATH, '/html/body/app-root/app-full-layout/app-home/div/div[1]/div/div[3]/button')
sbd.send_keys(SBD)
search.click()
sbd = driver.find_element(By.XPATH, '/html/body/app-root/app-full-layout/app-home/div/div[1]/div/div[3]/input')
sbd.clear()
time.sleep(2)
The error displayed is:
Traceback (most recent call last):
File "D:\pythonProject\ScrapingWeb\Main.py", line 19, in
sbd = driver.find_element(By.XPATH, '/html/body/app-root/app-full-layout/app-home/div/div[1]/div/div[3]/input')
File "D:\pythonProject\ScrapingWeb\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 857, in find_element
return self.execute(Command.FIND_ELEMENT, {
File "D:\pythonProject\ScrapingWeb\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 435, in execute
self.error_handler.check_response(response)
File "D:\pythonProject\ScrapingWeb\venv\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 247, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/app-root/app-full-layout/app-home/div/div[1]/div/div[3]/input"}
(Session info: chrome=103.0.5060.114)
Stacktrace:
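A sketch of how the lookups could be wrapped in a loop with explicit waits instead of bare find_element calls (the URL, XPaths and starting SBD come from the question; the five iterations, the 10-second timeout and the wait-and-retry structure are assumptions):
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.service import Service

path = Service('C:/Program Files (x86)/chromedriver.exe')
driver = webdriver.Chrome(service=path)
driver.get('https://diemthi.mobiedu.vn/?typeExam=TOAN')
wait = WebDriverWait(driver, 10)

input_xpath = '/html/body/app-root/app-full-layout/app-home/div/div[1]/div/div[3]/input'
button_xpath = '/html/body/app-root/app-full-layout/app-home/div/div[1]/div/div[3]/button'

for i in range(5):
    sbd = 25000001 + i
    # Re-locate the input on every pass: the page re-renders after each search,
    # so a handle from the previous iteration may be stale or briefly absent
    box = wait.until(EC.presence_of_element_located((By.XPATH, input_xpath)))
    box.clear()
    box.send_keys(str(sbd))
    wait.until(EC.element_to_be_clickable((By.XPATH, button_xpath))).click()
    time.sleep(2)  # crude pause for the results to load; waiting on a result element would be better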

response status is not 200 on running webdriver to get url and on running beautiful soup to extract content, it throws attribute error

I have been trying to web scrape hotel reviews, but across multiple page jumps the URL of the webpage doesn't change, so I am using webdriver from Selenium to work this out. It is not showing any error, but on checking whether the response status is 200, it shows False. In addition to that, running the line of code I have mentioned below generates an error. If anyone can fix the issue, the effort will be highly appreciated!
!pip install selenium
from selenium import webdriver
import requests
from bs4 import BeautifulSoup
import pandas as pd
# install chromium, its driver, and selenium
!apt-get update
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
# set options to be headless, ..
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
# open it, go to a website, and get results
wd = webdriver.Chrome('chromedriver',options=options)
code = wd.get('https://www.goibibo.com/hotels/highland-park-hotel-in-trivandrum-1383427384655815037/?hquery={%22ci%22:%2220211209%22,%22co%22:%2220211210%22,%22r%22:%221-2-0%22,%22ibp%22:%22v15%22}&hmd=766931490eb7863d2f38f56c6185a1308de782c89dfeeea59d262b827ca15441bf50472cbfdc1ee84aeed8af756809a2e89cfd6eaea0fa308c1ca839e8c313d016ac0f5948658353cf30f1cd83050fd8e6adb2e55f2a5470cadeb0c28b7becc92ac44d81966b82408effde826d40fbff47525e09b5f145e321fe6d104e12933c066323798e33a911e0cbed7312fc1634f8f92fe502c8602556c9a02f34c047d04ff1400c995799156776c1a04e218d6486493edad5b0f7e51a5ea25f5f1cb4f5ed497ee9368137f6ec73b3b1166ee7c1a885920b90c98542e0270b4fa9004005cfe87a4d1efeaedc8e33a848f73345f09bec19153e8bf625cc7f9216e692a1bcc313e7f13a7fc091328b1fb43598bd236994fdc988ab35e70cf3a5d1856c0b0fa9794b23a1a958a5937ac6d258d121a75b7ce9fc70b9a820af43a8e9a3f279be65b5c6fbfff2ba20bfb0f3e3ee425f0b930bf671c50878a540c6a9003b197622b6ab22ae39e07b5174cb12bebbcd2a132bb8570e01b9e253c1bd83cb292de97a&cc=IN&reviewType=gi&vcid=3877384277955108166&srpFilters={%22type%22:[%22Hotel%22]}')
str(code) == "<Response [200]>"
Output: False
soup = BeautifulSoup(code.content,'html.parser')
On running the above line of code, the following error comes up:
AttributeError Traceback (most recent call last)
in ()
----> 1 soup = BeautifulSoup(code.content,'html.parser')
AttributeError: 'NoneType' object has no attribute 'content'
get()
get(url: str) loads a web page in the current browser session and doesn't return anything.
Hence, as per your code, code will always be None.
Solution
To validate the Response you can adopt any of the two approaches:
Using requests.head():
import requests
request_response = requests.head("https://www.goibibo.com/hotels/highland-park-hotel-in-trivandrum-1383427384655815037/?hquery={%22ci%22:%2220211209%22,%22co%22:%2220211210%22,%22r%22:%221-2-0%22,%22ibp%22:%22v15%22}&hmd=766931490eb7863d2f38f56c6185a1308de782c89dfeeea59d262b827ca15441bf50472cbfdc1ee84aeed8af756809a2e89cfd6eaea0fa308c1ca839e8c313d016ac0f5948658353cf30f1cd83050fd8e6adb2e55f2a5470cadeb0c28b7becc92ac44d81966b82408effde826d40fbff47525e09b5f145e321fe6d104e12933c066323798e33a911e0cbed7312fc1634f8f92fe502c8602556c9a02f34c047d04ff1400c995799156776c1a04e218d6486493edad5b0f7e51a5ea25f5f1cb4f5ed497ee9368137f6ec73b3b1166ee7c1a885920b90c98542e0270b4fa9004005cfe87a4d1efeaedc8e33a848f73345f09bec19153e8bf625cc7f9216e692a1bcc313e7f13a7fc091328b1fb43598bd236994fdc988ab35e70cf3a5d1856c0b0fa9794b23a1a958a5937ac6d258d121a75b7ce9fc70b9a820af43a8e9a3f279be65b5c6fbfff2ba20bfb0f3e3ee425f0b930bf671c50878a540c6a9003b197622b6ab22ae39e07b5174cb12bebbcd2a132bb8570e01b9e253c1bd83cb292de97a&cc=IN&reviewType=gi&vcid=3877384277955108166&srpFilters={%22type%22:[%22Hotel%22]}")
status_code = request_response.status_code
if status_code == 200:
print("URL is valid/up")
else:
print("URL is invalid/down")
Using urlopen():
import urllib.request
status_code = urllib.request.urlopen("https://www.goibibo.com/hotels/highland-park-hotel-in-trivandrum-1383427384655815037/?hquery={%22ci%22:%2220211209%22,%22co%22:%2220211210%22,%22r%22:%221-2-0%22,%22ibp%22:%22v15%22}&hmd=766931490eb7863d2f38f56c6185a1308de782c89dfeeea59d262b827ca15441bf50472cbfdc1ee84aeed8af756809a2e89cfd6eaea0fa308c1ca839e8c313d016ac0f5948658353cf30f1cd83050fd8e6adb2e55f2a5470cadeb0c28b7becc92ac44d81966b82408effde826d40fbff47525e09b5f145e321fe6d104e12933c066323798e33a911e0cbed7312fc1634f8f92fe502c8602556c9a02f34c047d04ff1400c995799156776c1a04e218d6486493edad5b0f7e51a5ea25f5f1cb4f5ed497ee9368137f6ec73b3b1166ee7c1a885920b90c98542e0270b4fa9004005cfe87a4d1efeaedc8e33a848f73345f09bec19153e8bf625cc7f9216e692a1bcc313e7f13a7fc091328b1fb43598bd236994fdc988ab35e70cf3a5d1856c0b0fa9794b23a1a958a5937ac6d258d121a75b7ce9fc70b9a820af43a8e9a3f279be65b5c6fbfff2ba20bfb0f3e3ee425f0b930bf671c50878a540c6a9003b197622b6ab22ae39e07b5174cb12bebbcd2a132bb8570e01b9e253c1bd83cb292de97a&cc=IN&reviewType=gi&vcid=3877384277955108166&srpFilters={%22type%22:[%22Hotel%22]}").getcode()
if status_code == 200:
print("URL is valid/up")
else:
print("URL is invalid/down")

Cannot run the code and print out the result, using xpath and webdriver to click the pulldown menu

I cannot run the code and print out the result when using XPath and WebDriver to click the pull-down menu with the following code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Firefox()
driver.get('URL')
driver.maximize_window()
wait = WebDriverWait(driver,40)
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,'div.combobox-input-wrap a[data-value="rbAll"]'))).click()
wait.until(EC.element_to_be_clickable((By.XPATH,'//div[#class="droplist-item"]/a[contains(.,"Headline Category")]'))).click()
wait.until(EC.element_to_be_clickable((By.XPATH,'//div[#id="rbAfter2006"]//div[#class="combobox-input-wrap"]/a[contains(.,"ALL")]'))).click()
wait.until(EC.element_to_be_clickable((By.XPATH,'//div[#class="droplist-group"]//ul[#class="droplist-items"]//li/a[contains(.,"Announcements and Notices")]'))).click()
ele=wait.until(EC.presence_of_element_located((By.XPATH,'//div[#class="droplist-group droplist-submenu level2"]//ul//li/a[contains(.,"New Listings (Listed Issuers/New Applicants)")]')))
ele.location_once_scrolled_into_view
ele.click()
ele2=wait.until(EC.presence_of_element_located((By.XPATH,'//div[@class="droplist-group droplist-submenu level3"]//ul//li/a[contains(.,"Allotment Results")]')))
ele2.location_once_scrolled_into_view
ele2.click()
html = driver.page_source
print html
The error log shows as below when running it.
File "run.py", line 6, in <module>
driver = webdriver.Firefox()
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/webdriver.py", line 167, in __init__
keep_alive=True)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 156, in __init__
self.start_session(capabilities, browser_profile)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 251, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 320, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 1
You should be printing the value inside parentheses, i.e.
html = driver.page_source
print(html)

mobile devices requests emulation using JSR223 in JMeter - No such property: driver for class

Scenario:
open the main page and click on "Accept All Cookies" (JSR223 Sampler1 in a Once Only Controller);
open pages from the set of parametrized URLs (JSR223 Sampler2 in another controller).
JSR223 Sampler1 code for main page:
import org.apache.jmeter.samplers.SampleResult;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;
import java.util.concurrent.TimeUnit;

System.setProperty("webdriver.chrome.driver", "vars.get("webdriver_path")");

Map<String, Object> mobileEmulation = new HashMap<>();
mobileEmulation.put("userAgent", "vars.get("userAgent")");

Map<String, Object> chromeOptions = new HashMap<>();
chromeOptions.put("mobileEmulation", mobileEmulation);

ChromeOptions options = new ChromeOptions();
options.setExperimentalOption("mobileEmulation", mobileEmulation);

ChromeDriver driver = new ChromeDriver(options);
driver.get("https://vars.get("main_page")");

WebDriverWait wait = new WebDriverWait(driver, 20);
wait.until(ExpectedConditions.visibilityOfElementLocated(By.xpath("xpath")));
driver.findElement(By.xpath("xpath")).click();
log.info(driver.getTitle());
JSR223 Sampler2 code for any page from the set of urls:
driver.get("https://${url}");
Error message:
Response message:javax.script.ScriptException: groovy.lang.MissingPropertyException: No such property: driver for class
Problem:
If I just copy all the code from JSR223 Sampler1 to JSR223 Sampler2 and change the destination URL, the URLs open, but improperly: each run launches a new browser instance, and I can't get a realistic response time (for driver.get("url") only), because the result reflects the whole sampler's work, which includes driver initialization and starting a new browser instance, and that takes several seconds.
Could you please propose any ideas on how to resolve this problem? How can I keep all requests in one browser instance and get a realistic response time in JSR223 Sampler2 for driver.get("url") only?
I will appreciate any help.
In the first JSR223 Sampler you need to store your driver instance into JMeter Variables like:
vars.putObject("driver", driver)
It should be the last line of your script.
In the second JSR223 Sampler you need to get the driver instance from JMeter Variables like:
driver = vars.getObject("driver")
It should be the first line of your script.
vars is the shorthand for the JMeterVariables class instance; see the JavaDoc for all available functions, and the "Top 8 JMeter Java Classes You Should Be Using with Groovy" article for more information on the JMeter API shorthands available to JSR223 Test Elements.
P.S. You should follow the same approach with vars when executing the driver.get() function, like:
driver.get("https://" + vars.get("url"))
