Friday, January 18, 2019

Python: Web Crawling with Selenium

0. Prerequisites

0.1 Install the package

pip install selenium
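
After installing, it can help to confirm that the package is visible to the interpreter that will run the examples. The following is a minimal sketch using only the standard library (Python 3.8 or later); the helper name `installed_version` is made up for this illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("selenium")
if version is None:
    print("selenium is not installed; run: pip install selenium")
else:
    print("selenium", version)
```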
0.2 Download the web driver for your browser

Chrome: https://sites.google.com/a/chromium.org/chromedriver/downloads
Edge: https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/
Firefox: https://github.com/mozilla/geckodriver/releases
Safari: https://webkit.org/blog/6900/webdriver-support-in-safari-10/

1. Practice

1.1 Crawl http://www.python.org with the Chrome driver

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Selenium 4 takes the chromedriver path through a Service object
driver = webdriver.Chrome(service=Service('/Users/teom/Downloads/chromedriver'))

driver.get("http://www.python.org")

# Select the tag you want.
# find_element(By.X, value) returns the first match;
# find_elements(By.X, value) returns a list of all matches.
# Locator strategies (replacing the older find_element_by_* helpers):
# By.ID
# By.NAME
# By.XPATH
# By.LINK_TEXT
# By.PARTIAL_LINK_TEXT
# By.TAG_NAME
# By.CLASS_NAME
# By.CSS_SELECTOR
elem = driver.find_element(By.ID, "top")

print(elem.text)

driver.quit()
Result
Skip to content
Python
PSF
Docs
PyPI
Jobs
Community
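
Older tutorials use the `find_element_by_*` method names, while recent Selenium 4 releases expect `find_element(by, value)` with a locator strategy. As a rough sketch of how the two styles correspond, the table below uses the string values behind Selenium's `By` constants (for example, `By.ID` is `"id"`). `LEGACY_TO_BY` and `to_locator` are hypothetical names for this illustration, not part of Selenium.

```python
# String values behind the selenium.webdriver.common.by.By constants
LEGACY_TO_BY = {
    "id": "id",                                # By.ID
    "name": "name",                            # By.NAME
    "xpath": "xpath",                          # By.XPATH
    "link_text": "link text",                  # By.LINK_TEXT
    "partial_link_text": "partial link text",  # By.PARTIAL_LINK_TEXT
    "tag_name": "tag name",                    # By.TAG_NAME
    "class_name": "class name",                # By.CLASS_NAME
    "css_selector": "css selector",            # By.CSS_SELECTOR
}


def to_locator(legacy_method, value):
    """Translate e.g. ("find_element_by_id", "top") into ("id", "top")."""
    for prefix in ("find_elements_by_", "find_element_by_"):
        if legacy_method.startswith(prefix):
            suffix = legacy_method[len(prefix):]
            if suffix in LEGACY_TO_BY:
                return (LEGACY_TO_BY[suffix], value)
    raise ValueError("not a known find_element_by_* name: " + legacy_method)


print(to_locator("find_element_by_id", "top"))            # ('id', 'top')
print(to_locator("find_elements_by_class_name", "menu"))  # ('class name', 'menu')
```

The resulting tuple can be unpacked straight into the call, as in `driver.find_element(*to_locator("find_element_by_id", "top"))`.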
1.2 Emulate a mobile environment
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Emulate a 360x640 phone screen and send an Android user agent
mobile_emulation = {
    "deviceMetrics": { "width": 360, "height": 640, "pixelRatio": 3.0 },
    "userAgent": "Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 5 Build/JOP40D) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.166 Mobile Safari/535.19" }

chrome_options = Options()
chrome_options.add_experimental_option("mobileEmulation", mobile_emulation)
driver = webdriver.Chrome(service=Service('/Users/teom/Downloads/chromedriver'), options=chrome_options)

driver.get("http://www.python.org")

elem = driver.find_element(By.ID, "top")

print(elem.text)

driver.quit()
Result
Skip to content
 Close
Python
PSF
Docs
PyPI
Jobs
Community
 The Python Network
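
To try several screen sizes, the `mobileEmulation` dictionary can be built by a small helper. This is only a sketch: `build_mobile_emulation` and `make_mobile_options` are hypothetical helper names, and the user agent string in the usage line is a placeholder. The Selenium import is kept inside the second function so that the dictionary-building part runs without the package or a browser.

```python
def build_mobile_emulation(width, height, pixel_ratio, user_agent):
    """Build the dict passed to the "mobileEmulation" experimental option."""
    if width <= 0 or height <= 0 or pixel_ratio <= 0:
        raise ValueError("width, height and pixel_ratio must be positive")
    return {
        "deviceMetrics": {"width": width, "height": height, "pixelRatio": pixel_ratio},
        "userAgent": user_agent,
    }


def make_mobile_options(emulation):
    """Wrap the dict in Chrome Options (requires the selenium package)."""
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_experimental_option("mobileEmulation", emulation)
    return options


# Placeholder user agent, for illustration only
emulation = build_mobile_emulation(360, 640, 3.0, "ExampleMobileAgent/1.0")
print(emulation["deviceMetrics"])
```

The object returned by `make_mobile_options(emulation)` would then be passed to `webdriver.Chrome` in place of `chrome_options`.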
1.3 Click, then crawl
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome(service=Service('/Users/teom/Downloads/chromedriver'))

driver.get("http://www.python.org")

# Wait up to 10 seconds for the element with class "psf-meta" to become clickable, then click it
element = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.CLASS_NAME, "psf-meta"))
)
element.click()

# Select the "about" item inside the navigation menu
elem = driver.find_element(By.XPATH, '//ul[@class="navigation menu"]//li[@id="about"]')

print(elem.text)

driver.quit()
Result
about
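
`WebDriverWait(driver, 10).until(...)` calls the condition repeatedly until it returns a truthy value or the timeout expires. The plain-Python sketch below imitates that polling loop to show the idea; it is not Selenium's implementation. `wait_until` and `WaitTimeout` are hypothetical names, the 0.5 second default interval mirrors Selenium's default poll frequency, and the condition here takes no arguments, whereas Selenium passes the driver to it.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy within the timeout."""


def wait_until(condition, timeout, poll_interval=0.5):
    """Call condition() repeatedly and return its first truthy result."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within %s seconds" % timeout)
        time.sleep(poll_interval)


# Example: a stand-in condition that becomes truthy on the third call
calls = []


def ready_on_third_call():
    calls.append(1)
    return "clickable" if len(calls) >= 3 else False


print(wait_until(ready_on_third_call, timeout=2, poll_interval=0.01))  # clickable
```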
