Scraping

  • Scraping is automated browsing: a program fetches web pages and extracts data from them

  • Steps

    • set the URL
    • download the HTML
    • parse the HTML
    • extract useful information
    • transform or aggregate the data
    • save the data
  • CSS selectors and XPath

  • Libraries

    • requests
    • Beautiful Soup
    • Scrapy
      • scrapy shell
      • CLI
      • ...
    • Selenium and requests-html for dynamically generated pages
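
The steps listed above can be sketched end to end as follows. This is a minimal illustration, not a recommended implementation: it uses only the Python standard library (urllib.request and html.parser) in place of requests and Beautiful Soup so that it runs without extra installs. The helper names (download_html, LinkExtractor, extract_links, transform, save_csv) and the sample HTML are hypothetical, and the example runs on an inline page rather than a live URL.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.request import urlopen


def download_html(url: str) -> str:
    """Steps 1-2: fetch the page at `url` and return its HTML as text."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkExtractor(HTMLParser):
    """Steps 3-4: parse HTML and collect (text, href) for each <a href=...>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html: str):
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def transform(links):
    """Step 5: drop duplicate links and sort them by href."""
    return sorted(set(links), key=lambda pair: pair[1])


def save_csv(rows, out):
    """Step 6: write the rows to a file-like object as CSV."""
    writer = csv.writer(out)
    writer.writerow(["text", "href"])
    writer.writerows(rows)


# Hypothetical sample page, used here instead of a live download.
SAMPLE_HTML = """
<html><body>
  <a href="/b">Second</a>
  <a href="/a">First</a>
  <a href="/b">Second</a>
</body></html>
"""

rows = transform(extract_links(SAMPLE_HTML))
buffer = io.StringIO()
save_csv(rows, buffer)
print(buffer.getvalue())
```

To scrape a real page, the inline sample would be replaced by a call such as download_html(url), and the io.StringIO buffer by a file opened with newline="". With Beautiful Soup, the LinkExtractor class would typically be replaced by a CSS selector query over the parsed document.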