Scrapy is the most popular Python web scraping framework for a reason — it's fast, extensible, and built for scale. But Scrapy's built-in proxy support only handles HTTP/HTTPS CONNECT proxies. For SOCKS5, you need a little extra wiring.

Here are two battle-tested methods to route Scrapy through Snowpad's SOCKS5 gateway.

Method 1: Custom Downloader Middleware

This is the approach I use in production. It gives you full control over proxy selection per request.

Create a middleware file middlewares.py:

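class SnowpadProxyMiddleware:
    """Attach the Snowpad SOCKS5 gateway to every outgoing request."""

    def process_request(self, request, spider):
        # Replace the placeholder credentials with your own Snowpad account.
        request.meta['proxy'] = 'socks5://your_username:your_password@gw.snowpad.io:9999'
        # Returning None lets Scrapy continue down the middleware chain.
        return None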

Then enable it in settings.py:

DOWNLOADER_MIDDLEWARES = {
    'myproject.middlewares.SnowpadProxyMiddleware': 350,
}

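The priority value 350 places it early in the middleware chain, ahead of Scrapy's built-in HttpProxyMiddleware (750 by default), so the proxy is set before the request reaches the downloader. Keep in mind that Scrapy's default download handler only speaks HTTP/HTTPS proxies: a socks5:// URL in request.meta['proxy'] takes effect only once a SOCKS-capable download handler or a local HTTP-to-SOCKS bridge is in place. With that wiring done, every request routes through Snowpad's mobile IP pool.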
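Hardcoding credentials in middlewares.py is fine for a first test, but in practice you may prefer to keep them in settings. The sketch below is one possible variant, not part of Scrapy or Snowpad: the class name SettingsProxyMiddleware, the helper build_proxy_url, and the setting names SNOWPAD_USERNAME and SNOWPAD_PASSWORD are all assumptions made for illustration. It relies on Scrapy's from_crawler hook and percent-encodes the credentials so characters like '@' or ':' in a password do not break the proxy URL. The stand-in crawler and request objects at the bottom only demonstrate the flow without a running crawl.

```python
from types import SimpleNamespace
from urllib.parse import quote


def build_proxy_url(username, password, host='gw.snowpad.io', port=9999, scheme='socks5'):
    """Build a proxy URL, percent-encoding credentials so '@' or ':' cannot break parsing."""
    return f"{scheme}://{quote(username, safe='')}:{quote(password, safe='')}@{host}:{port}"


class SettingsProxyMiddleware:
    """Hypothetical variant of SnowpadProxyMiddleware that reads credentials from settings."""

    def __init__(self, proxy_url):
        self.proxy_url = proxy_url

    @classmethod
    def from_crawler(cls, crawler):
        # SNOWPAD_USERNAME / SNOWPAD_PASSWORD are assumed setting names, not Scrapy built-ins.
        username = crawler.settings.get('SNOWPAD_USERNAME')
        password = crawler.settings.get('SNOWPAD_PASSWORD')
        if not username or not password:
            raise ValueError('Set SNOWPAD_USERNAME and SNOWPAD_PASSWORD in settings.py')
        return cls(build_proxy_url(username, password))

    def process_request(self, request, spider):
        request.meta['proxy'] = self.proxy_url
        return None


# Stand-ins for Scrapy's crawler and request objects, just to show the flow.
fake_crawler = SimpleNamespace(
    settings={'SNOWPAD_USERNAME': 'user', 'SNOWPAD_PASSWORD': 'p@ss:word'}
)
middleware = SettingsProxyMiddleware.from_crawler(fake_crawler)
fake_request = SimpleNamespace(meta={})
middleware.process_request(fake_request, spider=None)
print(fake_request.meta['proxy'])
```

In a real project you would register this class in DOWNLOADER_MIDDLEWARES the same way as above and drop the stand-in objects.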

For rotation across multiple Snowpad accounts, maintain a list and cycle through it:

import itertools

class RotatingSnowpadMiddleware:
    def __init__(self):
        self.proxies = itertools.cycle([
            'socks5://user1:pass1@gw.snowpad.io:9999',
            'socks5://user2:pass2@gw.snowpad.io:9999',
        ])

    def process_request(self, request, spider):
        # Each request takes the next account in round-robin order.
        request.meta['proxy'] = next(self.proxies)
        return None
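Plain round-robin works for stateless pages, but multi-step flows such as a login followed by account pages usually need to stay on one account. As a rough sketch of how that could be handled, the variant below rotates by default and lets a spider pin a proxy per request. The class name PinnableRotatingMiddleware and the meta key 'pinned_proxy' are hypothetical names chosen for this example, not Scrapy conventions.

```python
import itertools
from types import SimpleNamespace


class PinnableRotatingMiddleware:
    """Rotate through proxies, unless the request asks for a specific one."""

    def __init__(self, proxy_urls):
        if not proxy_urls:
            raise ValueError('proxy_urls must not be empty')
        self.proxies = itertools.cycle(proxy_urls)

    def process_request(self, request, spider):
        # 'pinned_proxy' is a hypothetical meta key a spider sets to keep a session
        # on one account; without it, the next proxy in the cycle is used.
        pinned = request.meta.get('pinned_proxy')
        request.meta['proxy'] = pinned or next(self.proxies)
        return None


# Stand-in request objects to show the behaviour without running a crawl.
PROXY_A = 'socks5://user1:pass1@gw.snowpad.io:9999'
PROXY_B = 'socks5://user2:pass2@gw.snowpad.io:9999'
rotator = PinnableRotatingMiddleware([PROXY_A, PROXY_B])

requests = [SimpleNamespace(meta={}) for _ in range(3)]
pinned_request = SimpleNamespace(meta={'pinned_proxy': PROXY_B})
for req in requests + [pinned_request]:
    rotator.process_request(req, spider=None)
    print(req.meta['proxy'])
```

Because the pin lives in its own meta key rather than in 'proxy', requests without a pin still pick up a fresh proxy each time they pass through the middleware.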

Method 2: scrapy-proxy-middleware Package

If you prefer a drop-in solution, the scrapy-proxy-middleware package handles rotation out of the box:

pip install scrapy-proxy-middleware

Configure in settings.py:

PROXY_LIST = [
    'socks5://user1:pass1@gw.snowpad.io:9999',
    'socks5://user2:pass2@gw.snowpad.io:9999',
]
PROXY_MODE = 0  # 0 = rotate, 1 = random, 2 = sticky

DOWNLOADER_MIDDLEWARES = {
    'scrapy_proxy_middleware.middlewares.ProxyMiddleware': 100,
}

Why Snowpad for Scrapy?

Snowpad's mobile IPs are ideal for Scrapy at scale. Each request comes from a real Indian mobile carrier (Jio, Airtel, BSNL), which means target websites see genuine mobile traffic — not datacenter IPs that trigger blocks. The SOCKS5 gateway handles authentication and routing, so your middleware stays clean.

For more on proxy types, check out what is a proxy server and the SOCKS5 vs HTTP proxy comparison.

Production Tips

  • Retry on failure: Scrapy's built-in RetryMiddleware works with SOCKS5. Set RETRY_TIMES = 3 and RETRY_HTTP_CODES = [403, 429, 500, 502, 503].
  • Concurrency: Scrapy's CONCURRENT_REQUESTS default of 16 works well. Snowpad's gateway handles parallel connections without issue.
  • DNS routing: If your SOCKS client supports it, use socks5h:// (not socks5://) so hostnames are resolved by the proxy rather than on your machine. This prevents DNS leaks that reveal your true location.
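Put together, the tips above might look like the following settings.py sketch. The three Scrapy settings are the ones named in the list. The helper with_remote_dns and the setting name SNOWPAD_PROXY_URL are assumptions made for this example, and whether socks5h:// is honored depends on the SOCKS client your download path uses.

```python
# settings.py (sketch)

def with_remote_dns(proxy_url):
    """Swap socks5:// for socks5h:// so the proxy resolves hostnames.

    URLs with any other scheme are returned unchanged.
    """
    prefix = 'socks5://'
    if proxy_url.startswith(prefix):
        return 'socks5h://' + proxy_url[len(prefix):]
    return proxy_url


# Retry failed or blocked responses a few times before giving up.
RETRY_TIMES = 3
RETRY_HTTP_CODES = [403, 429, 500, 502, 503]

# Scrapy's default, stated explicitly so it is easy to tune later.
CONCURRENT_REQUESTS = 16

# SNOWPAD_PROXY_URL is a hypothetical setting name that a custom middleware would read.
SNOWPAD_PROXY_URL = with_remote_dns(
    'socks5://your_username:your_password@gw.snowpad.io:9999'
)
```

Scrapy only loads uppercase names from settings.py as settings, so the helper function sits alongside them without being treated as one.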

For Node.js alternatives, see the Playwright and Puppeteer SOCKS5 guide.