Recently I started learning web scraping with Python, in order to extract some basic information from Instagram using Beautiful Soup.
I wrote some simple code, shown below:
from bs4 import BeautifulSoup
import selenium.webdriver as webdriver
url = 'http://instagram.com/umnpics/'
driver = webdriver.Firefox()
driver.get(url)
soup = BeautifulSoup(driver.page_source)
for x in soup.findAll('li', {'class':'photo'}):
    print(x)
But after running it, some exceptions occurred:
Traceback (most recent call last):
File "C:\Users\Mhdn\AppData\Roaming\Python\Python37\site-packages\selenium\webdriver\common\service.py", line 76, in start
stdin=PIPE)
File "C:\Program Files (x86)\Python37-32\lib\subprocess.py", line 775, in __init__
restore_signals, start_new_session)
File "C:\Program Files (x86)\Python37-32\lib\subprocess.py", line 1178, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Mhdn\Desktop\test2.py", line 5, in <module>
driver = webdriver.Firefox()
File "C:\Users\Mhdn\AppData\Roaming\Python\Python37\site-packages\selenium\webdriver\firefox\webdriver.py", line 164, in __init__
self.service.start()
File "C:\Users\Mhdn\AppData\Roaming\Python\Python37\site-packages\selenium\webdriver\common\service.py", line 83, in start
os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'geckodriver' executable needs to be in PATH.
You need to download geckodriver to your local system.
In your code, you need to provide executable_path for geckodriver (alternatively, add the folder that contains it to PATH, as the error message says).
Adding executable_path to your code:
from bs4 import BeautifulSoup
import selenium.webdriver as webdriver
url = 'http://instagram.com/umnpics/'
driver = webdriver.Firefox(executable_path= 'path/to/geckodriver') #<---Add path to your geckodriver
#example: driver = webdriver.Firefox(executable_path= 'home/downloads/geckodriver')
driver.get(url)
soup = BeautifulSoup(driver.page_source)
for x in soup.findAll('li', {'class':'photo'}):
    print(x)
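The error message says the 'geckodriver' executable needs to be in PATH. A minimal sketch, using only the standard library, to check whether a driver can be found on PATH before starting the browser. The function name find_driver is my own, not part of Selenium:

```python
import shutil


def find_driver(name="geckodriver"):
    """Return the full path of the driver executable found on PATH, or None."""
    # shutil.which searches the directories listed in PATH for an
    # executable with the given name.
    return shutil.which(name)


path = find_driver()
if path is None:
    print("geckodriver is not on PATH; pass its location explicitly")
else:
    print("geckodriver found at", path)
```

If this prints a path, the driver's folder is already on PATH; if not, either extend PATH or pass the location explicitly as in the code above.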
I'm new to coding but have been using Selenium with Chrome in Python without problems for a few weeks. I don't know what I changed, but I am now getting an error message.
I simplified it as much as I could to see what I'm doing wrong, but I can't work it out.
from selenium import webdriver
driver = webdriver.Chrome(r"C:\Users\smim1\PycharmProjects\test\chromedriver.exe")
Error:
Traceback (most recent call last):
File "C:\Users\smim1\PycharmProjects\test\venv\lib\site-packages\selenium\webdriver\common\service.py", line 71, in start
self.process = subprocess.Popen(cmd, env=self.env,
File "C:\Users\smim1\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 832, in __init__
errread, errwrite) = self._get_handles(stdin, stdout, stderr)
File "C:\Users\smim1\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 1294, in _get_handles
c2pwrite = msvcrt.get_osfhandle(self._get_devnull())
File "C:\Users\smim1\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 1077, in _get_devnull
self._devnull = os.open(os.devnull, os.O_RDWR)
FileNotFoundError: [Errno 2] No such file or directory: 'nul'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\smim1\PycharmProjects\test\test5.py", line 43, in <module>
driver = webdriver.Chrome(service = s)
File "C:\Users\smim1\PycharmProjects\test\venv\lib\site-packages\selenium\webdriver\chrome\webdriver.py", line 70, in __init__
super(WebDriver, self).__init__(DesiredCapabilities.CHROME['browserName'], "goog",
File "C:\Users\smim1\PycharmProjects\test\venv\lib\site-packages\selenium\webdriver\chromium\webdriver.py", line 90, in __init__
self.service.start()
File "C:\Users\smim1\PycharmProjects\test\venv\lib\site-packages\selenium\webdriver\common\service.py", line 81, in start
raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://chromedriver.chromium.org/home
But when I open that file location in the terminal it starts successfully:
PS C:\Users\smim1\PycharmProjects\test> C:\Users\smim1\PycharmProjects\test\chromedriver.exe
Starting ChromeDriver 101.0.4951.41 (93c720db8323b3ec10d056025ab95c23a31997c9-refs/branch-heads/4951#{#904}) on port 9515
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
My Chrome is the same build:
Chrome is up to date
Version 101.0.4951.54 (Official Build) (64-bit)
I have also tried moving it to the project folder and removing the path, but it comes up with the same error.
I have tried webdriver-manager 3.5.4 but get an error:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
Error:
====== WebDriver manager ======
Traceback (most recent call last):
File "C:\Users\smim1\PycharmProjects\test\test5.py", line 51, in <module>
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
File "C:\Users\smim1\PycharmProjects\test\venv\lib\site-packages\webdriver_manager\chrome.py", line 32, in install
driver_path = self._get_driver_path(self.driver)
File "C:\Users\smim1\PycharmProjects\test\venv\lib\site-packages\webdriver_manager\manager.py", line 19, in _get_driver_path
binary_path = self.driver_cache.find_driver(driver)
File "C:\Users\smim1\PycharmProjects\test\venv\lib\site-packages\webdriver_manager\driver_cache.py", line 74, in find_driver
driver_version = driver.get_version()
File "C:\Users\smim1\PycharmProjects\test\venv\lib\site-packages\webdriver_manager\driver.py", line 39, in get_version
self.get_latest_release_version()
File "C:\Users\smim1\PycharmProjects\test\venv\lib\site-packages\webdriver_manager\driver.py", line 65, in get_latest_release_version
self.browser_version = get_browser_version_from_os(self.chrome_type)
File "C:\Users\smim1\PycharmProjects\test\venv\lib\site-packages\webdriver_manager\utils.py", line 144, in get_browser_version_from_os
OSType.WIN: windows_browser_apps_to_cmd(
File "C:\Users\smim1\PycharmProjects\test\venv\lib\site-packages\webdriver_manager\utils.py", line 125, in windows_browser_apps_to_cmd
powershell = determine_powershell()
File "C:\Users\smim1\PycharmProjects\test\venv\lib\site-packages\webdriver_manager\utils.py", line 245, in determine_powershell
with subprocess.Popen(
File "C:\Users\smim1\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 832, in __init__
errread, errwrite) = self._get_handles(stdin, stdout, stderr)
File "C:\Users\smim1\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 1276, in _get_handles
p2cread = msvcrt.get_osfhandle(self._get_devnull())
File "C:\Users\smim1\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 1077, in _get_devnull
self._devnull = os.open(os.devnull, os.O_RDWR)
FileNotFoundError: [Errno 2] No such file or directory: 'nul'
I'm sure there is something simple that I am doing wrong, but I have tried everything I can find online.
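Both tracebacks above end in the same standard-library call, os.open(os.devnull, os.O_RDWR), failing with "No such file or directory: 'nul'". A minimal sketch that repeats that call on its own, to confirm whether the failure happens outside Selenium and webdriver-manager as well. The function name is my own, and this only diagnoses the problem, it does not fix it:

```python
import os


def null_device_is_usable():
    """Return True if the OS null device can be opened for reading and writing."""
    try:
        # The same call that fails inside subprocess._get_devnull above.
        fd = os.open(os.devnull, os.O_RDWR)
    except OSError as exc:
        print("cannot open", os.devnull, "->", exc)
        return False
    os.close(fd)
    return True


print("null device usable:", null_device_is_usable())
```

If this reports False in a plain Python session, the problem lies with the system's null device rather than with chromedriver or the PATH.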
I'm getting the following error when I try to run the tcms-api module while following the steps given at
https://tcms-api.readthedocs.io/en/latest/modules/tcms_api.html#module-tcms_api
I'm using Python 3 on CentOS, and applied our own domain and certificates by mounting the certificates into the Docker container.
Can you please tell me how to solve the SSL certificate verification failure?
[root@KiwiTCMS-Testcase-Portal docker-compose]# python3 test-api.py
Traceback (most recent call last):
File "test-api.py", line 5, in <module>
rpc_client = TCMS()
File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/tcms_api/__init__.py", line 123, in __init__
config['tcms']['url']).server
File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/tcms_api/xmlrpc.py", line 124, in __init__
self.login(username, password, url)
File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/tcms_api/xmlrpc.py", line 131, in login
self.server.Auth.login(username, password)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/xmlrpc/client.py", line 1112, in __call__
return self.__send(self.__name, args)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/xmlrpc/client.py", line 1452, in __request
verbose=self.__verbose
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/xmlrpc/client.py", line 1154, in request
return self.single_request(host, handler, request_body, verbose)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/xmlrpc/client.py", line 1166, in single_request
http_conn = self.send_request(host, handler, request_body, verbose)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/xmlrpc/client.py", line 1279, in send_request
self.send_content(connection, request_body)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/xmlrpc/client.py", line 1309, in send_content
connection.endheaders(request_body)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/http/client.py", line 1282, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/http/client.py", line 1042, in _send_output
self.send(msg)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/http/client.py", line 980, in send
self.connect()
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/http/client.py", line 1448, in connect
server_hostname=server_hostname)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/ssl.py", line 407, in wrap_socket
_context=self, _session=session)
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/ssl.py", line 817, in __init__
self.do_handshake()
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/ssl.py", line 1077, in do_handshake
self._sslobj.do_handshake()
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/ssl.py", line 689, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)
Adding the following lines works around the issue by disabling HTTPS certificate verification for the whole process:
import ssl
try:
    _create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
    # Legacy Python that doesn't verify HTTPS certificates by default
    pass
else:
    # Handle target environment that doesn't support HTTPS verification
    ssl._create_default_https_context = _create_unverified_https_context
I see that you are using Python 3.6 from Red Hat Software Collections. That version contains a bug (or arguably a security feature): it does not respect the settings documented in upstream Python that allow you to accept untrusted SSL certificates. There are lots of reports about this on bugzilla.redhat.com, but I don't think they will change it!
This is how we do it in our test suite:
https://github.com/kiwitcms/tcms-api/blob/master/tests/krb5/integration_test.py#L18
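As an alternative to switching verification off, here is a minimal sketch of building an SSL context that keeps verification on and can trust a specific CA bundle. The function name and the ca_file parameter are my own assumptions; ssl.create_default_context and its cafile argument are standard library. Whether tcms-api offers a way to pass such a context through is not confirmed anywhere in this thread:

```python
import ssl


def build_verified_context(ca_file=None):
    """Create an SSL context that still verifies server certificates.

    ca_file is a hypothetical path to a PEM bundle containing the CA that
    signed the server certificate; None falls back to the system defaults.
    """
    return ssl.create_default_context(cafile=ca_file)


context = build_verified_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

In plain Python 3, xmlrpc.client.ServerProxy accepts such an object through its context keyword argument, so a hand-written XML-RPC client could use it directly.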
I am simply trying to run
from moviepy.editor import VideoFileClip
but I receive this error:
File "/Users/macbook/python/main_video.py", line 3, in <module>
from moviepy.editor import VideoFileClip
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/moviepy/editor.py", line 22, in <module>
from .video.io.VideoFileClip import VideoFileClip
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/moviepy/video/io/VideoFileClip.py", line 3, in <module>
from moviepy.video.VideoClip import VideoClip
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/moviepy/video/VideoClip.py", line 20, in <module>
from .io.ffmpeg_writer import ffmpeg_write_image, ffmpeg_write_video
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/moviepy/video/io/ffmpeg_writer.py", line 19, in <module>
from moviepy.config import get_setting
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/moviepy/config.py", line 38, in <module>
FFMPEG_BINARY = get_exe()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/imageio/plugins/ffmpeg.py", line 86, in get_exe
raise NeedDownloadError('Need ffmpeg exe. '
imageio.core.fetching.NeedDownloadError: Need ffmpeg exe. You can download it by calling:
imageio.plugins.ffmpeg.download()
If I then try to call
imageio.plugins.ffmpeg.download()
the output is:
Imageio: 'ffmpeg.osx' was not found on your computer; downloading it now.
Error while fetching file: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)>.
Error while fetching file: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)>.
Error while fetching file: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)>.
Error while fetching file: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)>.
Traceback (most recent call last):
File "/Users/macbook/python/test.py", line 29, in <module>
imageio.plugins.ffmpeg.download()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/imageio/plugins/ffmpeg.py", line 55, in download
get_remote_file('ffmpeg/' + FNAME_PER_PLATFORM[plat])
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/imageio/core/fetching.py", line 121, in get_remote_file
_fetch_file(url, filename)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/imageio/core/fetching.py", line 177, in _fetch_file
os.path.basename(file_name))
OSError: Unable to download 'ffmpeg.osx'. Perhaps there is a no internet connection? If there is, please report this problem.
What can I do?
Try:
import imageio
imageio.plugins.ffmpeg.download()
Include the lines above in your code. I faced the same problem, and this solved it.
Otherwise, check your internet connection.
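The output in the question shows the download failing with CERTIFICATE_VERIFY_FAILED, which points at certificate verification rather than at connectivity. A minimal sketch, using only the standard library, to see which CA certificate locations Python's ssl module consults and whether they exist on disk. The function name is my own:

```python
import os
import ssl


def describe_ca_locations():
    """Map each CA location name to (path, exists_on_disk)."""
    paths = ssl.get_default_verify_paths()
    report = {}
    for label in ("cafile", "capath", "openssl_cafile", "openssl_capath"):
        location = getattr(paths, label)
        # A location can be None when nothing is configured for it.
        report[label] = (location, bool(location) and os.path.exists(location))
    return report


for name, (location, exists) in describe_ca_locations().items():
    print(name, location, exists)
```

If none of the listed locations exist, Python has no CA certificates to verify against, and any HTTPS download from that interpreter is likely to fail in the same way.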
I believe this is related to a number of Python packages I have recently installed and environment variables that I have changed. I have re-installed NumPy and GAE, which did not help. Any suggestions? Thanks!
The GAE log indicates that the failure is linked to a module import:
2013-12-11 11:45:20 Running command: "['C:\\Python27\\pythonw.exe', 'C:\\Program Files (x86)\\Google\\google_appengine\\dev_appserver.py', '--skip_sdk_update_check=yes', '--port=8094', '--admin_port=8004', 'D:\\Dropbox\\ubertool_src']"
Traceback (most recent call last):
File "C:\Program Files (x86)\Google\google_appengine\dev_appserver.py", line 197, in <module>
_run_file(__file__, globals())
File "C:\Program Files (x86)\Google\google_appengine\dev_appserver.py", line 193, in _run_file
execfile(script_path, globals_)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\tools\devappserver2\devappserver2.py", line 27, in <module>
import tempfile
File "C:\Python27\Lib\tempfile.py", line 34, in <module>
from random import Random as _Random
File "C:\Python27\Lib\site-packages\numpy\random\__init__.py", line 102, in <module>
ranf = random = sample = random_sample
NameError: name 'random_sample' is not defined
2013-12-11 11:45:21 (Process exited with code 1)
Update
I just did a little test.
I can run from numpy import random,
but if I run import random,
the error message is:
>>> import random
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\Lib\site-packages\numpy\random\__init__.py", line 102, in <module>
ranf = random = sample = random_sample
NameError: name 'random_sample' is not defined
Update
The problem was solved after removing C:\Python27\Lib\site-packages\numpy (I am not sure how this was added) from PYTHONPATH in the environment variables.
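The tracebacks show a plain import random being answered by numpy\random\__init__.py, which is what happens when the numpy package directory itself is on the module search path. A minimal sketch (written for Python 3, function name my own) that flags search-path entries which are themselves package directories:

```python
import os
import sys


def package_dirs_on_path(entries):
    """Return the entries that are themselves package directories.

    A directory containing __init__.py is a package. Putting it on the
    module search path exposes its submodules as top-level names, so a
    subpackage called random can answer a plain "import random".
    """
    return [entry for entry in entries
            if entry and os.path.isfile(os.path.join(entry, "__init__.py"))]


print(package_dirs_on_path(sys.path))
```

An empty list is the expected result; any directory it prints is a candidate for removal from PYTHONPATH.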
I wrote a download script.
When it runs, it throws an error.
Code:
import urllib2, shutil
ftpfile = urllib2.urlopen("ftp://user:password@domain.com/file.txt")
localfile = open("C:\\dtmp", "wb")
shutil.copyfileobj(ftpfile, localfile)
Error:
Traceback (most recent call last):
File "download.py", line 4, in <module>
localfile = open("C:\\dtmp", "wb")
IOError: [Errno 13] Permission denied: 'C:\\dtmp'
You do not have write access to the path you tried to open.
In general, it is not good style to write directly to C:\. Instead, you can write to your user directory or to a temporary directory.
import os.path
import shutil
homedir = os.path.expanduser('~')
# Open in binary write mode ('wb'); the default mode is read-only
with open(os.path.join(homedir, 'filename'), 'wb') as localfile:
    shutil.copyfileobj(ftpfile, localfile)
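The temporary-directory option mentioned above can be sketched in the same way. This is a Python 3 sketch; save_stream and the file name are hypothetical, and io.BytesIO stands in for the FTP response so the example runs without a server:

```python
import io
import os
import shutil
import tempfile


def save_stream(stream, filename):
    """Copy a binary file-like object into the temp directory; return the path."""
    target = os.path.join(tempfile.gettempdir(), filename)
    with open(target, "wb") as localfile:
        shutil.copyfileobj(stream, localfile)
    return target


# io.BytesIO stands in for the object returned by urlopen.
saved = save_stream(io.BytesIO(b"example data"), "download_example.txt")
print(saved)
```

tempfile.gettempdir() returns a directory the current user can normally write to, which avoids the permission error seen with C:\.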