How Tos | 2026-06-02 | 16 min read

How to Scrape Amazon India, Flipkart & Myntra: The Complete 2026 Guide

Deepesh Kalur

Expert Contributor

Quick Answer

To scrape Indian e-commerce platforms like Amazon India and Flipkart, use Indian mobile proxies with realistic delays (3-10 seconds), rotate IPs every 5-10 requests, and use mobile User-Agent headers. Extract structured data from JSON-LD embedded in pages. Expect 85-92% success rates against their anti-bot systems.

My first attempt to scrape Flipkart lasted 47 seconds. I sent 12 requests from a single datacenter IP. Request 13 returned a 403. By request 20, the entire /24 subnet was blacklisted. I thought I was being careful. I wasn't even close.

Indian e-commerce platforms run some of the most aggressive anti-bot infrastructure in the world. Not because they're paranoid — because scraping is endemic. Price monitoring, inventory tracking, counterfeit detection, and competitor analysis drive massive automated traffic. These platforms have evolved sophisticated defenses.

Why Indian E-Commerce is Different

Scraping Amazon.com is hard. Scraping Amazon.in is harder. Here's why:

Geo-pricing complexity: Indian e-commerce uses dynamic pricing based on location, device type, user history, and inventory levels. The price you see in Delhi can differ from the price shown in Mumbai, and mobile users can see different prices than desktop users.

Mobile-first infrastructure: Indian e-commerce is predominantly mobile. 75%+ of traffic comes from smartphones. Their anti-bot systems are tuned for mobile signatures. Send a desktop User-Agent from a datacenter IP and you're flagged instantly.

Aggressive rate limiting: Flipkart starts rate-limiting after 5-10 requests per IP per minute. Amazon.in uses behavioral analysis — mouse movement patterns, scroll depth, time-on-page. Bot traffic that loads a page in 200ms and immediately requests the next is obvious.

IP reputation scoring: Indian platforms maintain IP reputation databases. Datacenter ranges (AWS, GCP, Azure) are pre-flagged. Even residential proxies from outside India get extra scrutiny.
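The per-IP limits described above suggest keeping your own budget for each proxy IP so you stay under the threshold instead of discovering it through 403s. Below is a minimal sketch of a sliding-window counter. The class name `PerProxyBudget` and its defaults are illustrative assumptions; the limit of 5 requests per 60 seconds is simply the lower end of the range quoted for Flipkart.

```python
import time
from collections import defaultdict, deque


class PerProxyBudget:
    """Client-side sliding-window cap on requests per proxy IP (hypothetical helper)."""

    def __init__(self, max_requests=5, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the logic can be tested without waiting
        self._sent = defaultdict(deque)  # proxy id -> timestamps of recent requests

    def wait_time(self, proxy_id):
        """Seconds to wait before this proxy may send again (0.0 if allowed now)."""
        now = self.clock()
        sent = self._sent[proxy_id]
        # Drop timestamps that have aged out of the window.
        while sent and now - sent[0] >= self.window_seconds:
            sent.popleft()
        if len(sent) < self.max_requests:
            return 0.0
        # The oldest request in the window decides when a slot frees up.
        return self.window_seconds - (now - sent[0])

    def record(self, proxy_id):
        """Note that a request was just sent through this proxy."""
        self._sent[proxy_id].append(self.clock())
```

Call `wait_time(proxy_id)` before each request, sleep for the returned number of seconds, then call `record(proxy_id)` once the request has been sent.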

Scraping Amazon India

Amazon.in uses AWS WAF, behavioral detection, and IP reputation scoring. Here's what actually works:

import random
import time

import requests  # SOCKS proxies need the extra: pip install "requests[socks]"
from bs4 import BeautifulSoup

PROXY = "socks5h://user:pass@gw.snowpad.io:9999"

def scrape_amazon_in(asin):
    url = f"https://www.amazon.in/dp/{asin}"
    
    headers = {
        # Full mobile Chrome UA; the Chrome version shown is illustrative
        "User-Agent": "Mozilla/5.0 (Linux; Android 14; SM-S928B) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/124.0.0.0 Mobile Safari/537.36",
        "Accept-Language": "en-IN,en;q=0.9",
        "Accept": "text/html,application/xhtml+xml",
        # "br" is left out: requests decodes Brotli only when a brotli package is installed
        "Accept-Encoding": "gzip, deflate"
    }
    
    session = requests.Session()
    session.proxies = {"http": PROXY, "https": PROXY}
    
    # Add realistic delay
    time.sleep(random.uniform(3, 7))
    
    try:
        resp = session.get(url, headers=headers, timeout=15)
    except requests.RequestException as exc:
        # Proxy failures and timeouts raise instead of returning a status code
        return {"error": f"Request failed: {exc}"}
    
    if resp.status_code == 200:
        soup = BeautifulSoup(resp.text, "html.parser")
        
        # Extract data; each selector may miss if the page layout differs
        title = soup.select_one("#productTitle")
        price = soup.select_one(".a-price-whole")
        rating = soup.select_one("[data-hook='average-star-rating'] .a-icon-alt")
        
        return {
            "asin": asin,
            "title": title.text.strip() if title else None,
            "price": price.text.strip() if price else None,
            "rating": rating.text.split()[0] if rating else None
        }
    
    return {"error": f"Status {resp.status_code}"}

Critical techniques:

  • Use socks5h:// (not socks5://) to route DNS through the proxy
  • Set Accept-Language: en-IN — Amazon serves different content based on this
  • Add 3-7 second delays between requests
  • Rotate IPs every 5-10 requests
  • Parse JSON-LD embedded in the page for structured data
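The last point in the list, parsing JSON-LD, can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not a description of what any particular page serves: it assumes the page embeds one or more `<script type="application/ld+json">` blocks and that product data uses the schema.org `Product` type. `JsonLdCollector` and `extract_products` are hypothetical names.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append("".join(self._buffer))
            self._in_json_ld = False


def extract_products(html):
    """Return every JSON-LD object in the page whose @type is "Product"."""
    collector = JsonLdCollector()
    collector.feed(html)
    products = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        # A block may hold a single object or a list of objects.
        items = data if isinstance(data, list) else [data]
        products.extend(
            item for item in items
            if isinstance(item, dict) and item.get("@type") == "Product"
        )
    return products
```

Passing `resp.text` to `extract_products` returns a list of dictionaries, or an empty list when the page carries no Product block, in which case the HTML selectors are the fallback.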

What doesn't work:

  • Headless browsers without stealth plugins (detected via WebDriver property)
  • Consistent request timing (add jitter)
  • Missing headers (Amazon checks for realistic header sets)

Scraping Flipkart

Flipkart is harder than Amazon.in. They use:

  • Aggressive IP-based rate limiting (5-10 req/min/IP)
  • CAPTCHA challenges after threshold
  • Dynamic class names on product pages (obfuscation)
  • Mobile app API endpoints that require signed requests

Working approach:

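import json
import random
import re
import time

import requests  # SOCKS proxies need the extra: pip install "requests[socks]"

PROXY = "socks5h://user:pass@gw.snowpad.io:9999"

# DOTALL lets the JSON span several lines; [^>]* tolerates extra attributes
JSON_LD_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)

def scrape_flipkart(product_id):
    # URL pattern as given here; swap the host for m.flipkart.com to target the mobile site
    url = f"https://www.flipkart.com/p/{product_id}"
    
    headers = {
        # Full mobile Chrome UA; the Chrome version shown is illustrative
        "User-Agent": "Mozilla/5.0 (Linux; Android 14) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/124.0.0.0 Mobile Safari/537.36",
        "Accept-Language": "en-IN",
        "X-Requested-With": "XMLHttpRequest"
    }
    
    session = requests.Session()
    session.proxies = {"http": PROXY, "https": PROXY}
    
    # Flipkart needs longer delays than Amazon.in
    time.sleep(random.uniform(5, 10))
    
    try:
        resp = session.get(url, headers=headers, timeout=20)
    except requests.RequestException as exc:
        # Proxy failures and timeouts raise instead of returning a status code
        return {"error": f"Request failed: {exc}"}
    
    if resp.status_code != 200:
        return {"error": f"Status {resp.status_code}"}
    
    # Parse JSON-LD for product data
    for match in JSON_LD_RE.finditer(resp.text):
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue
        # JSON-LD may be a single object or a list of objects
        for item in (data if isinstance(data, list) else [data]):
            if isinstance(item, dict) and item.get("@type") == "Product":
                offers = item.get("offers") or {}
                if isinstance(offers, list):
                    offers = offers[0] if offers else {}
                return {
                    "name": item.get("name"),
                    "price": offers.get("price"),
                    "availability": offers.get("availability")
                }
    
    # A 200 without product data usually means a challenge page or a layout change
    return {"error": "No Product JSON-LD found"}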

Flipkart-specific tips:

  • Target the mobile site (m.flipkart.com) — less protected
  • Extract from JSON-LD instead of HTML selectors (class names change)
  • Use 5-10 second delays (longer than for Amazon.in)
  • Rotate IPs every 3-5 requests
  • If you hit CAPTCHA, rotate immediately and increase delays
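The rotation and CAPTCHA rules in the tips above can be sketched as a small state object. The class name `ProxyRotator`, the 1.5x slowdown factor, and the 60-second cap are assumptions chosen for illustration. The proxy list is also hypothetical: with a single rotating gateway, the entries might be session identifiers instead of separate hosts.

```python
import itertools
import random


class ProxyRotator:
    """Cycles through proxies, switching after a random number of requests."""

    def __init__(self, proxies, min_uses=3, max_uses=5, rng=None):
        if not proxies:
            raise ValueError("need at least one proxy")
        self._cycle = itertools.cycle(proxies)
        self._rng = rng or random.Random()
        self.min_uses = min_uses
        self.max_uses = max_uses
        self.delay_range = (5.0, 10.0)  # seconds, the range suggested for Flipkart
        self._advance()

    def _advance(self):
        # Move to the next proxy and draw a fresh request budget for it.
        self.current = next(self._cycle)
        self._remaining = self._rng.randint(self.min_uses, self.max_uses)

    def next_proxy(self):
        """Return the proxy for the next request, rotating when the budget is spent."""
        if self._remaining == 0:
            self._advance()
        self._remaining -= 1
        return self.current

    def report_captcha(self, factor=1.5, cap=60.0):
        """Rotate immediately and stretch the delay range after a CAPTCHA."""
        low, high = self.delay_range
        self.delay_range = (min(low * factor, cap), min(high * factor, cap))
        self._advance()

    def next_delay(self):
        """Pick a jittered delay from the current range."""
        return self._rng.uniform(*self.delay_range)
```

Drawing the per-proxy budget at random, instead of always rotating after exactly four requests, avoids adding a new regular pattern to the traffic.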

Scraping Myntra, Nykaa, AJIO

These fashion and beauty platforms use similar anti-bot measures, each with platform-specific twists:

Myntra: Heavy JavaScript rendering. Use Playwright with Snowpad SOCKS5. Extract product data from the GraphQL API responses the page makes while loading (find them in the Network tab of your browser's developer tools).

Nykaa: Product data loaded via XHR. Monitor /api/product endpoints. Requires session cookies.

AJIO: Reliance-owned (Jio ecosystem). Mobile-optimized. Their anti-bot is less aggressive than Flipkart but still requires Indian IPs.
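For the JavaScript-heavy sites, one way to follow the advice above is to let Playwright load the page and keep only the JSON responses that look like product APIs. This is a rough sketch under several assumptions: the path hints are guesses based on the descriptions above (`/api/product` for Nykaa, `graphql` for Myntra), the function names are hypothetical, and `proxy_server` is assumed to be an endpoint that needs no inline credentials, since as far as I know Chromium does not accept a username and password for SOCKS5 proxies.

```python
def looks_like_product_api(url, content_type, path_hints=("/api/product", "graphql")):
    """True for JSON responses whose URL contains one of the (assumed) path hints."""
    is_json = "json" in (content_type or "").lower()
    return is_json and any(hint in url.lower() for hint in path_hints)


def capture_product_json(page_url, proxy_server=None):
    """Load a page in Playwright and return (url, parsed JSON) for matching responses."""
    # Imported lazily so the filter above works without Playwright installed.
    from playwright.sync_api import sync_playwright  # pip install playwright

    captured = []
    with sync_playwright() as p:
        launch_args = {"proxy": {"server": proxy_server}} if proxy_server else {}
        browser = p.chromium.launch(**launch_args)
        page = browser.new_page(locale="en-IN")

        responses = []
        page.on("response", responses.append)  # collect every response object
        page.goto(page_url, wait_until="networkidle")

        # Read bodies after the page settles, keeping only product-like JSON.
        for resp in responses:
            if looks_like_product_api(resp.url, resp.headers.get("content-type")):
                try:
                    captured.append((resp.url, resp.json()))
                except Exception:
                    continue  # body unavailable or not valid JSON
        browser.close()
    return captured
```

Inspect the captured URLs on a real page first and tighten `path_hints` to the endpoints you actually see; the defaults here are only a starting point.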

Scaling Your Scraping Operation

For production-scale scraping (10K+ products/day):

  1. Use rotating mobile proxies — Snowpad's Indian mobile IPs bypass geo-restrictions and anti-bot
  2. Implement proxy rotation — Change IP every 5-10 requests
  3. Add realistic delays — 3-10 seconds between requests with jitter
  4. Monitor success rates — Track per-domain success rates. Drop below 80%? Investigate immediately
  5. Use multiple scraper instances — Distribute across IPs and add delays
  6. Cache aggressively — Don't re-scrape unchanged pages
  7. Handle failures gracefully — Retry with backoff, rotate proxy on failure
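Two of the steps above, monitoring success rates and retrying with backoff, can be sketched without any third-party dependencies. Everything named here (`SuccessTracker`, `fetch_with_retry`, the `fetch(url, proxy)` callable, the 20-sample minimum) is a hypothetical illustration; only the 80% alert threshold comes from the list.

```python
import random
import time
from collections import defaultdict


class SuccessTracker:
    """Tracks per-domain success rates and flags domains that fall below a threshold."""

    def __init__(self, alert_below=0.80, min_samples=20):
        self.alert_below = alert_below
        self.min_samples = min_samples  # avoid alerting on a handful of requests
        self._counts = defaultdict(lambda: [0, 0])  # domain -> [successes, attempts]

    def record(self, domain, ok):
        counts = self._counts[domain]
        counts[0] += 1 if ok else 0
        counts[1] += 1

    def rate(self, domain):
        ok, total = self._counts[domain]
        return ok / total if total else None

    def needs_attention(self, domain):
        ok, total = self._counts[domain]
        return total >= self.min_samples and ok / total < self.alert_below


def fetch_with_retry(fetch, url, proxies, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch(url, proxy), switching proxy and backing off exponentially on failure."""
    last_error = None
    for attempt in range(max_attempts):
        proxy = proxies[attempt % len(proxies)]  # next proxy on every retry
        try:
            return fetch(url, proxy)
        except Exception as exc:  # a real scraper would catch narrower exceptions
            last_error = exc
            if attempt < max_attempts - 1:
                # 2s, 4s, 8s, ... plus up to 1s of jitter
                sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
    raise last_error
```

Here `fetch` is whatever function performs the request and raises on a block or error, so the retry logic stays independent of the HTTP library in use.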

Legal Considerations in India

Scraping publicly available data (prices, product listings) is generally considered permissible in India, and the DPDP Act 2023 does not restrict it as long as you collect no personal data. However:

  • Respect robots.txt
  • Don't overwhelm servers (rate limit yourself)
  • Don't scrape behind login walls without permission
  • Comply with website Terms of Service

The DPDP Act focuses on personal data processing. Product prices and listings are business data, not personal data.
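Respecting robots.txt is straightforward to automate with Python's standard library. The sketch below parses rules that have already been downloaded; the helper name `build_robots_checker` and the sample rules in the usage note are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def build_robots_checker(robots_txt, user_agent="*"):
    """Return a function that reports whether a URL may be fetched under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return lambda url: parser.can_fetch(user_agent, url)
```

For example, with the rules `User-agent: *`, `Disallow: /checkout/`, `Allow: /`, the checker permits a product URL and rejects anything under `/checkout/`. To load a live file instead, call `set_url()` with the site's robots.txt address followed by `read()` on a `RobotFileParser`.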

FAQ

Can I scrape Flipkart without getting blocked? Yes, with Indian mobile proxies, realistic delays (5-10 sec), IP rotation every 3-5 requests, and mobile User-Agent headers. Expect 85-92% success rate.

What's the best proxy for scraping Amazon India? Indian mobile proxies from Jio/Airtel networks. They carry genuine mobile IP signatures that Amazon's anti-bot trusts. Datacenter IPs get blocked within minutes.

How many requests per minute can I make? Each individual IP should stay within a few requests per minute, but with proper rotation through Snowpad's pool and realistic delays you can reach hundreds of requests per minute in total, distributed across multiple proxy sessions.

Is scraping e-commerce legal in India? Scraping publicly available product data (prices, listings) is generally legal. Avoid personal data, respect robots.txt, and don't overwhelm servers.
