If you haven't signed up for the Zyte Developers Community newsletter yet, you can sign up here.
In this issue:
- Scrapy 2.5.0 is out
- Recipe scraping app (with source code)
- Web scraping in Elixir
- Easy table scraping with R
Scrapy 2.5.0 is out
The first new Scrapy release of the year is here!
- Official Python 3.9 support
- Experimental HTTP/2 support
- New get_retry_request() function to retry requests from spider callbacks
- New headers_received signal that allows stopping downloads early
- New Response.protocol attribute
Recipe scraping app
Web scraping in Elixir
If you use Elixir for web development and are considering a web scraping project, you might want to check out Crawly, a high-level web crawling and scraping framework for Elixir. Check out the documentation and the quickstart guide.
Easy table scraping with R
Extracting data from HTML tables can be messy. For one-off jobs, though, there's an easy alternative. If you're using RStudio, there's an addin that makes it easy to scrape tables: datapasta. You just copy the table from the page, paste it into the tool, and get the data in structured form. Here's a tutorial video.