Installing Scrapy

# Activate the virtual environment first
workon py3scrapy
# Install Scrapy
pip3 install scrapy  # if the download crawls along at a few KB/s, use a proxy or the Douban PyPI mirror

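Once Scrapy is installed it is ready to use. Note that the scrapy command is only available while the virtual environment is active; after you leave the environment it can no longer be found.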

(py3scrapy) ➜  ~ scrapy
Scrapy 1.6.0 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command
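
Because Scrapy lives inside the virtual environment it was installed into, one quick way to confirm the setup is to ask the interpreter directly. The following is a minimal sketch rather than part of the original notes: the helper names `in_virtualenv` and `scrapy_available` are made up for illustration, and only the standard library is used.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # make sys.prefix differ from sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def scrapy_available() -> bool:
    """Return True if the scrapy package can be imported by this interpreter."""
    return importlib.util.find_spec("scrapy") is not None


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("scrapy importable:", scrapy_available())
```

If the second line prints `False`, the most likely cause is that the script was run with an interpreter outside `py3scrapy`.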

scrapy startproject ArticleSpider
#created in:
#    /Users/scottxiong/ArticleSpider
cd ArticleSpider
# General form: scrapy genspider <name> <domain>, e.g. scrapy genspider example example.com
scrapy genspider cnblogs news.cnblogs.com

Running Scrapy

Method 1: start the spider from the terminal

# Change into the Scrapy project directory first
scrapy crawl spiderName
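
`crawl` is one of the commands that only becomes available inside a project, which is why the directory matters. Scrapy recognizes a project by the `scrapy.cfg` file that `startproject` places in the project root. The sketch below imitates that lookup by walking upward from a starting directory; `find_project_root` is a hypothetical helper written for this note, not a Scrapy API.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path) -> Optional[Path]:
    """Return the closest directory at or above `start` that holds scrapy.cfg."""
    start = start.resolve()
    # Check the starting directory first, then each parent up to the filesystem root
    for candidate in (start, *start.parents):
        if (candidate / "scrapy.cfg").is_file():
            return candidate
    return None


if __name__ == "__main__":
    root = find_project_root(Path.cwd())
    if root is None:
        print("Not inside a Scrapy project; cd into the project first.")
    else:
        print("Scrapy project root:", root)
```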

Method 2: start the spider from a script

Create a Python file in the project root; here it is named main.py.

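# -*- coding: utf-8 -*-
__author__ = 'scott'

import os
import sys

from scrapy.cmdline import execute

# Add the project root (the folder containing this file) to the import path
sys.path.append(os.path.dirname(os.path.abspath(__file__)))
# Equivalent to running "scrapy crawl cnblogs" in the terminal
execute(["scrapy", "crawl", "cnblogs"])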
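
The list passed to `execute` is simply the command line split into arguments, so the script can be generalized a little. The following is one possible variation, offered as a sketch: `build_crawl_argv` and `run_spider` are hypothetical helpers, and the `LOG_LEVEL` override is only an illustration of the `-s NAME=VALUE` option that `scrapy crawl` accepts.

```python
import os
import sys


def build_crawl_argv(spider_name, settings=None):
    """Build the argument list that scrapy.cmdline.execute expects."""
    argv = ["scrapy", "crawl", spider_name]
    # Each override becomes "-s NAME=VALUE", the same flag the CLI accepts
    for name, value in (settings or {}).items():
        argv += ["-s", f"{name}={value}"]
    return argv


def run_spider(spider_name, settings=None):
    """Start a crawl in-process, as main.py does (requires Scrapy installed)."""
    # Imported lazily so build_crawl_argv stays usable without Scrapy
    from scrapy.cmdline import execute

    # Make the project package importable
    sys.path.append(os.path.dirname(os.path.abspath(__file__)))
    execute(build_crawl_argv(spider_name, settings))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python main.py cnblogs
    run_spider(sys.argv[1], {"LOG_LEVEL": "INFO"})
```

Running the script from an IDE this way makes it possible to set breakpoints inside the spider, which is the usual reason for preferring a script over the terminal command.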
