Preface

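While scraping a website I ran into Cloudflare (CF) blocking. I tried adding the options I found through a Baidu search, but I still could not get past it.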

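from selenium import webdriver
from selenium.webdriver.edge.options import Options
from selenium.webdriver.edge.service import Service

service = Service('msedgedriver.exe')
options = Options()
# Drop the "enable-automation" switch that Selenium adds by default
options.add_experimental_option('excludeSwitches', ['enable-automation'])
# Disable the Blink runtime feature that flags automated control
options.add_argument('--disable-blink-features=AutomationControlled')
# Pass the options to the driver; without options=options they are ignored
driver = webdriver.Edge(service=service, options=options)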
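
One way to see whether these options had any effect is to ask the page what it sees in `navigator.webdriver`. The sketch below is only an illustration: `automation_flag` and `report` are hypothetical helper names, and the only Selenium call used is `driver.execute_script`. Anti-bot services look at many other signals, so a hidden flag does not mean the block will go away.

```python
def automation_flag(driver):
    """Return the value the page sees in navigator.webdriver."""
    # execute_script runs JavaScript in the current page and returns the result
    return driver.execute_script("return navigator.webdriver")


def report(driver):
    """Turn the flag into a short human-readable message."""
    if automation_flag(driver):
        return "navigator.webdriver is true: the page can tell it is automated"
    return "navigator.webdriver is not set: this particular signal is hidden"
```

After creating the driver and loading a page, `print(report(driver))` shows which case applies.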

undetected-chromedriver

Optimized Selenium Chromedriver patch which does not trigger anti-bot services like Distill Network / Imperva / DataDome / Botprotect.io. Automatically downloads the driver binary and patches it.
Tested until current chrome beta versions
Works also on Brave Browser and many other Chromium based browsers, some tweaking
Python 3.6++

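I mainly use Edge. I read the introduction as saying Chrome would be downloaded automatically, but that did not happen for me (the text above only mentions the driver binary), so I installed the Chrome browser myself.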

The code is much the same as the earlier Selenium version. It solved the problem, and I have not hit a CF block since.

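from pyquery import PyQuery as pq
import re
import time
from undetected_chromedriver import ChromeOptions
import undetected_chromedriver as uc

options = ChromeOptions()
# Run without a visible browser window
options.add_argument('--headless')
options.add_argument('--disable-gpu')
# uc.Chrome patches the driver binary before launching Chrome
driver = uc.Chrome(options=options)


# Load the target page and parse the rendered HTML with PyQuery
driver.get('http://...')
html_source = driver.page_source
doc = pq(html_source)
# Replace 'tag' with the selector for the elements you want
titles = doc.find('tag')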
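
To confirm that `driver.page_source` is the real page and not a CF interstitial, a rough check on the page title can help. This is a minimal sketch under stated assumptions: `page_title` and `looks_like_challenge` are hypothetical helpers, and the two title fragments are strings commonly seen on Cloudflare challenge pages, not an official or complete list.

```python
import re

# Assumed markers: title fragments often seen on Cloudflare challenge pages.
TITLE_MARKERS = ("just a moment", "attention required")


def page_title(html):
    """Return the text inside the first <title> element, or '' if none."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else ""


def looks_like_challenge(html):
    """Guess whether the HTML is a challenge page, based on its title only."""
    title = page_title(html).lower()
    return any(marker in title for marker in TITLE_MARKERS)
```

Calling `looks_like_challenge(html_source)` before parsing gives an early warning if the block comes back.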