How to Scrape Amazon India, Flipkart & Myntra: The Complete 2026 Guide

My first attempt to scrape Flipkart lasted 47 seconds. I sent 12 requests from a single datacenter IP. Request 13 returned a 403. By request 20, the entire /24 subnet was blacklisted. I thought I was being careful. I wasn't even close.

Indian e-commerce platforms run some of the most aggressive anti-bot infrastructure in the world. Not because they're paranoid — because scraping is endemic. Price monitoring, inventory tracking, counterfeit detection, and competitor analysis drive massive automated traffic. These platforms have evolved sophisticated defenses.

Why Indian E-Commerce is Different

Scraping Amazon.com is hard. Scraping Amazon.in is harder. Here's why:

Geo-pricing complexity: Indian e-commerce uses dynamic pricing based on location, device type, user history, and inventory levels. The price you see in Delhi differs from Mumbai. Mobile users see different prices than desktop.

Mobile-first infrastructure: Indian e-commerce is predominantly mobile, with more than 75% of traffic coming from smartphones, so the platforms' anti-bot systems are tuned for mobile signatures. Send a desktop User-Agent from a datacenter IP and you're flagged instantly.

Aggressive rate limiting: Flipkart starts rate-limiting after 5-10 requests per IP per minute. Amazon.in uses behavioral analysis — mouse movement patterns, scroll depth, time-on-page. Bot traffic that loads a page in 200ms and immediately requests the next is obvious.

IP reputation scoring: Indian platforms maintain IP reputation databases. Datacenter ranges (AWS, GCP, Azure) are pre-flagged. Even residential proxies from outside India get extra scrutiny.
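
To stay under a per-IP limit like the one described above, it helps to count requests on the client side before sending them. The following is a minimal sketch, not a description of how any platform enforces its limits: the class name `PerIpBudget`, the default of 5 requests per 60 seconds (the low end of the Flipkart figure), and the injectable clock are all assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class PerIpBudget:
    """Sliding-window counter: at most `max_requests` per `window` seconds per IP."""

    def __init__(self, max_requests=5, window=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock  # injectable so the logic can be checked without sleeping
        self._sent = defaultdict(deque)  # ip -> timestamps of recent requests

    def try_acquire(self, ip):
        """Record a request for `ip` and return True, or return False if over budget."""
        now = self.clock()
        sent = self._sent[ip]
        # Drop timestamps that have aged out of the window.
        while sent and now - sent[0] >= self.window:
            sent.popleft()
        if len(sent) >= self.max_requests:
            return False
        sent.append(now)
        return True
```

A scraper would call `try_acquire(ip)` before each request and either wait or switch to another IP when it returns False.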

Scraping Amazon India

Amazon.in uses AWS WAF, behavioral detection, and IP reputation scoring. Here's what actually works:

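import random
import time

import requests  # SOCKS proxy support needs: pip install "requests[socks]"
from bs4 import BeautifulSoup

PROXY = "socks5h://user:pass@gw.snowpad.io:9999"

def scrape_amazon_in(asin):
    url = f"https://www.amazon.in/dp/{asin}"

    headers = {
        # Full mobile Chrome UA string; the Chrome version here is illustrative
        "User-Agent": (
            "Mozilla/5.0 (Linux; Android 14; SM-S928B) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/124.0.0.0 Mobile Safari/537.36"
        ),
        "Accept-Language": "en-IN,en;q=0.9",
        "Accept": "text/html,application/xhtml+xml",
        # "br" is left out: requests only decodes Brotli if the brotli package is installed
        "Accept-Encoding": "gzip, deflate",
    }

    session = requests.Session()
    session.proxies = {"http": PROXY, "https": PROXY}

    # Add a realistic, jittered delay before the request
    time.sleep(random.uniform(3, 7))

    try:
        resp = session.get(url, headers=headers, timeout=15)
    except requests.RequestException as exc:
        # Proxy failures and timeouts raise instead of returning a status code
        return {"error": f"Request failed: {exc}"}

    if resp.status_code != 200:
        return {"error": f"Status {resp.status_code}"}

    soup = BeautifulSoup(resp.text, "html.parser")

    # Extract data; a selector returns None if Amazon serves a CAPTCHA page or changes its markup
    title = soup.select_one("#productTitle")
    price = soup.select_one(".a-price-whole")
    rating = soup.select_one("[data-hook='average-star-rating'] .a-icon-alt")

    return {
        "asin": asin,
        "title": title.text.strip() if title else None,
        "price": price.text.strip() if price else None,
        "rating": rating.text.split()[0] if rating else None,
    }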
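
The price field returned above is raw text, and Indian prices use lakh-style digit grouping (for example 1,29,999), so it usually needs normalising before it can be compared or stored. Here is a small sketch of that step. The helper name `parse_inr` is hypothetical, and the sample strings in the comments are assumed shapes of scraped text rather than captured output.

```python
import re
from decimal import Decimal, InvalidOperation


def parse_inr(text):
    """Convert a scraped price string such as '1,29,999.' to a Decimal, or None."""
    if not text:
        return None
    # Keep ASCII digits and the decimal point; drop the rupee sign, commas and spaces.
    cleaned = re.sub(r"[^0-9.]", "", text).strip(".")
    if not cleaned:
        return None
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None  # e.g. more than one decimal point
```

Using `Decimal` rather than `float` avoids rounding surprises when prices are later compared or summed.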

Critical techniques:

Use socks5h:// (not socks5://) to route DNS through the proxy
Set Accept-Language: en-IN — Amazon serves different content based on this
Add 3-7 second delays between requests
Rotate IPs every 5-10 requests
Parse JSON-LD embedded in the page for structured data
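
The last tip, parsing JSON-LD, can be sketched with the standard library alone. This is one possible approach, assuming the page embeds schema.org data in `<script type="application/ld+json">` elements; the names `JsonLdCollector` and `extract_products` are made up for this example, and whether a given product page actually carries a `Product` block has to be checked per page.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append("".join(self._buffer))
            self._in_json_ld = False


def extract_products(html):
    """Return every JSON-LD object whose @type is 'Product'."""
    collector = JsonLdCollector()
    collector.feed(html)
    products = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than failing the whole page
        # A block may hold one object or a list of objects.
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and item.get("@type") == "Product":
                products.append(item)
    return products
```

Passing `resp.text` from the scraper to `extract_products` gives structured fields that do not depend on CSS selectors.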

What doesn't work:

Headless browsers without stealth plugins (detected via the navigator.webdriver property)
Consistent request timing (add jitter)
Missing headers (Amazon checks for realistic header sets)

Scraping Flipkart

Flipkart is harder than Amazon.in. They use:

Aggressive IP-based rate limiting (5-10 req/min/IP)
CAPTCHA challenges after threshold
Dynamic class names on product pages (obfuscation)
Mobile app API endpoints that require signed requests

Working approach:

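import json
import random
import re
import time

import requests  # SOCKS proxy support needs: pip install "requests[socks]"

PROXY = "socks5h://user:pass@gw.snowpad.io:9999"

# DOTALL lets the pattern match JSON-LD that spans several lines
JSON_LD_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)

def scrape_flipkart(product_id):
    # A mobile User-Agent is sent; swap the host for m.flipkart.com to target the mobile site
    url = f"https://www.flipkart.com/p/{product_id}"

    headers = {
        "User-Agent": "Mozilla/5.0 (Linux; Android 14) AppleWebKit/537.36",
        "Accept-Language": "en-IN",
        "X-Requested-With": "XMLHttpRequest"
    }

    session = requests.Session()
    session.proxies = {"http": PROXY, "https": PROXY}

    # Flipkart needs longer delays than Amazon
    time.sleep(random.uniform(5, 10))

    try:
        resp = session.get(url, headers=headers, timeout=20)
    except requests.RequestException as exc:
        # Proxy failures and timeouts raise instead of returning a status code
        return {"error": f"Request failed: {exc}"}

    if resp.status_code != 200:
        return {"error": f"Status {resp.status_code}"}

    # Parse JSON-LD for product data; a page can hold several blocks
    for match in JSON_LD_RE.finditer(resp.text):
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue
        # A block may be a single object or a list of objects
        for item in data if isinstance(data, list) else [data]:
            if not isinstance(item, dict) or item.get("@type") != "Product":
                continue
            offers = item.get("offers") or {}
            if isinstance(offers, list):
                offers = offers[0] if offers else {}
            return {
                "name": item.get("name"),
                "price": offers.get("price"),
                "availability": offers.get("availability")
            }

    # A 200 response without product data is often a CAPTCHA or interstitial page
    return {"error": "No Product JSON-LD found"}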

Flipkart-specific tips:

Target the mobile site (m.flipkart.com) — less protected
Extract from JSON-LD instead of HTML selectors (class names change)
Use 5-10 second delays (longer than for Amazon)
Rotate IPs every 3-5 requests
If you hit CAPTCHA, rotate immediately and increase delays
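
The rotation and delay rules above can be kept in one small state object. What follows is a sketch under stated assumptions: `RotationPolicy` is a hypothetical helper, treating HTTP 403 and 429 as block signals is an assumption, and the 1.5x slowdown capped at 4x is an arbitrary choice. How the IP is actually rotated depends on the proxy provider and is not shown.

```python
import random


class RotationPolicy:
    """Decide when to switch proxy sessions and how long to wait between requests."""

    def __init__(self, min_per_ip=3, max_per_ip=5, base_delay=(5.0, 10.0), rng=None):
        self.min_per_ip = min_per_ip
        self.max_per_ip = max_per_ip
        self.base_delay = base_delay
        self.rng = rng or random.Random()
        self.delay_factor = 1.0
        self._reset_budget()

    def _reset_budget(self):
        # Pick a random budget so rotations do not happen at a fixed interval.
        self.remaining = self.rng.randint(self.min_per_ip, self.max_per_ip)

    def record(self, status_code, saw_captcha=False):
        """Update state after a response; return True if the caller should rotate now."""
        if saw_captcha or status_code in (403, 429):
            # Blocked: rotate immediately and slow down, capped at 4x the base delay.
            self.delay_factor = min(self.delay_factor * 1.5, 4.0)
            self._reset_budget()
            return True
        self.remaining -= 1
        if self.remaining <= 0:
            self._reset_budget()
            return True
        return False

    def next_delay(self):
        """Seconds to sleep before the next request, with jitter."""
        low, high = self.base_delay
        return self.rng.uniform(low, high) * self.delay_factor
```

The scraping loop would sleep for `next_delay()` before each request and switch proxy sessions whenever `record(...)` returns True.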

Scraping Myntra, Nykaa, AJIO

These fashion/beauty platforms use similar anti-bot measures, with platform-specific twists:

Myntra: Heavy JavaScript rendering. Use Playwright with a Snowpad SOCKS5 proxy, and extract product data from the GraphQL API responses (visible in the browser's Network tab).

Nykaa: Product data loaded via XHR. Monitor /api/product endpoints. Requires session cookies.

AJIO: Reliance-owned (Jio ecosystem). Mobile-optimized. Their anti-bot is less aggressive than Flipkart but still requires Indian IPs.
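
For the Playwright route mentioned under Myntra, the proxy URL used with requests has to be split into the separate fields that Playwright's `proxy` launch option expects (`server`, `username`, `password`). A minimal sketch of that conversion follows; `playwright_proxy` is a hypothetical helper name, and mapping the `socks5h` scheme to `socks5` is an assumption about what the browser will accept.

```python
from urllib.parse import unquote, urlsplit


def playwright_proxy(proxy_url):
    """Split a proxy URL into the dict shape used by Playwright's `proxy` option."""
    parts = urlsplit(proxy_url)
    if not parts.hostname or not parts.port:
        raise ValueError(f"proxy URL needs a host and port: {proxy_url!r}")
    # "socks5h" is the spelling requests uses for proxy-side DNS; assume plain socks5 here.
    scheme = "socks5" if parts.scheme == "socks5h" else parts.scheme
    settings = {"server": f"{scheme}://{parts.hostname}:{parts.port}"}
    if parts.username:
        settings["username"] = unquote(parts.username)
        settings["password"] = unquote(parts.password or "")
    return settings
```

The resulting dict would be passed as `proxy=` when launching the browser. Browser support for authenticated SOCKS5 proxies varies, so it is worth confirming that the chosen browser accepts the credentials before building on this.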

Scaling Your Scraping Operation

For production-scale scraping (10K+ products/day):

Use rotating mobile proxies — Snowpad's Indian mobile IPs bypass geo-restrictions and anti-bot
Implement proxy rotation — Change IP every 5-10 requests
Add realistic delays — 3-10 seconds between requests with jitter
Monitor success rates — Track per-domain success rates. Drop below 80%? Investigate immediately
Use multiple scraper instances — Distribute across IPs and add delays
Cache aggressively — Don't re-scrape unchanged pages
Handle failures gracefully — Retry with backoff, rotate proxy on failure
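
Two items from this list, success-rate monitoring and retry with backoff, are sketched below. Treat it as an illustration rather than a production design: `SuccessMonitor` and `backoff_delay` are hypothetical names, and the window of 100 requests, the minimum of 20 samples, and the 2-second base are assumed defaults. Only the 80% threshold comes from the list above.

```python
import random
from collections import defaultdict, deque


class SuccessMonitor:
    """Track the success rate of the most recent requests for each domain."""

    def __init__(self, window=100, threshold=0.80, min_samples=20):
        self.threshold = threshold
        self.min_samples = min_samples
        # Each domain keeps only its last `window` outcomes.
        self._results = defaultdict(lambda: deque(maxlen=window))

    def record(self, domain, ok):
        self._results[domain].append(bool(ok))

    def success_rate(self, domain):
        results = self._results.get(domain)
        if not results:
            return None
        return sum(results) / len(results)

    def needs_attention(self, domain):
        """True once enough samples exist and the rate has fallen below the threshold."""
        results = self._results.get(domain)
        if not results or len(results) < self.min_samples:
            return False
        return self.success_rate(domain) < self.threshold


def backoff_delay(attempt, base=2.0, cap=60.0, rng=random):
    """Exponential backoff with full jitter: a wait in [0, min(cap, base * 2**attempt)]."""
    return rng.uniform(0, min(cap, base * 2 ** attempt))
```

On each failure the scraper would rotate its proxy, sleep for `backoff_delay(attempt)`, and record the outcome so that a sustained drop shows up in `needs_attention`.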

Legal Considerations in India

Scraping publicly available data (prices, product listings) is generally considered legal in India and falls outside the DPDP Act 2023, provided you don't collect personal data. However:

Respect robots.txt
Don't overwhelm servers (rate limit yourself)
Don't scrape behind login walls without permission
Comply with website Terms of Service

The DPDP Act focuses on personal data processing. Product prices and listings are business data, not personal data.
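
Respecting robots.txt can be automated with Python's standard `urllib.robotparser`. The sketch below parses rules from a string so it runs offline; the helper name `build_robots_checker` and the sample rules in the usage note are invented for illustration and are not any platform's real robots.txt.

```python
from urllib.robotparser import RobotFileParser


def build_robots_checker(robots_txt, user_agent="*"):
    """Return a function that reports whether `user_agent` may fetch a given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    def allowed(url):
        return parser.can_fetch(user_agent, url)

    return allowed
```

For example, with the rules `User-agent: *`, `Disallow: /checkout/`, `Allow: /`, the checker rejects URLs under /checkout/ and accepts other paths. Against a live site, `RobotFileParser.set_url()` followed by `read()` loads the real file instead.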

FAQ

Can I scrape Flipkart without getting blocked? Yes, with Indian mobile proxies, realistic delays (5-10 sec), IP rotation every 3-5 requests, and mobile User-Agent headers. Expect an 85-92% success rate.

What's the best proxy for scraping Amazon India? Indian mobile proxies from Jio/Airtel networks. They carry genuine mobile IP signatures that Amazon's anti-bot trusts. Datacenter IPs get blocked within minutes.

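How many requests per minute can I make? With proper rotation through Snowpad's pool and realistic delays, you can reach hundreds of requests per minute in aggregate by distributing them across multiple proxy sessions. Each individual IP should still stay under the 5-10 requests per minute at which Flipkart starts rate-limiting.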

Is scraping e-commerce legal in India? Scraping publicly available product data (prices, listings) is generally legal. Avoid personal data, respect robots.txt, and don't overwhelm servers.
