How to Scrape Amazon India, Flipkart & Myntra: The Complete 2026 Guide
Deepesh Kalur
Expert Contributor
To scrape Indian e-commerce platforms like Amazon India and Flipkart, use Indian mobile proxies with realistic delays (3-10 seconds), rotate IPs every 5-10 requests, and use mobile User-Agent headers. Extract structured data from JSON-LD embedded in pages. Expect 85-92% success rates against their anti-bot systems.
My first attempt to scrape Flipkart lasted 47 seconds. I sent 12 requests from a single datacenter IP. Request 13 returned a 403. By request 20, the entire /24 subnet was blacklisted. I thought I was being careful. I wasn't even close.
Indian e-commerce platforms run some of the most aggressive anti-bot infrastructure in the world. Not because they're paranoid — because scraping is endemic. Price monitoring, inventory tracking, counterfeit detection, and competitor analysis drive massive automated traffic. These platforms have evolved sophisticated defenses.
Why Indian E-Commerce is Different
Scraping Amazon.com is hard. Scraping Amazon.in is harder. Here's why:
Geo-pricing complexity: Indian e-commerce uses dynamic pricing based on location, device type, user history, and inventory levels. The price you see in Delhi differs from Mumbai. Mobile users see different prices than desktop.
Mobile-first infrastructure: Indian e-commerce is predominantly mobile. 75%+ of traffic comes from smartphones. Their anti-bot systems are tuned for mobile signatures. Send a desktop User-Agent from a datacenter IP and you're flagged instantly.
Aggressive rate limiting: Flipkart starts rate-limiting after 5-10 requests per IP per minute. Amazon.in uses behavioral analysis — mouse movement patterns, scroll depth, time-on-page. Bot traffic that loads a page in 200ms and immediately requests the next is obvious.
IP reputation scoring: Indian platforms maintain IP reputation databases. Datacenter ranges (AWS, GCP, Azure) are pre-flagged. Even residential proxies from outside India get extra scrutiny.
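To make the per-IP rate limit concrete, here is a minimal sketch of a sliding-window budget a scraper could check before each request. The class name RequestBudget, the default of 5 requests per 60 seconds (the low end of the Flipkart range above), and the injectable clock are my assumptions for illustration, not values published by any platform.

```python
import time
from collections import deque


class RequestBudget:
    """Sliding-window budget: at most `limit` requests per `window` seconds per IP."""

    def __init__(self, limit=5, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the logic can be exercised without sleeping
        self._sent = {}  # ip -> deque of request timestamps

    def wait_time(self, ip):
        """Seconds to wait before `ip` may send another request (0.0 if none)."""
        now = self.clock()
        sent = self._sent.setdefault(ip, deque())
        # Drop timestamps that have aged out of the window
        while sent and now - sent[0] >= self.window:
            sent.popleft()
        if len(sent) < self.limit:
            return 0.0
        return self.window - (now - sent[0])

    def record(self, ip):
        """Note that a request was just sent from `ip`."""
        self._sent.setdefault(ip, deque()).append(self.clock())
```

Call wait_time(ip) before a request, sleep for the returned value plus some jitter, then call record(ip). Staying under the budget only addresses the rate limit; it does nothing about IP reputation or behavioral checks.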
Scraping Amazon India
Amazon.in uses AWS WAF, behavioral detection, and IP reputation scoring. Here's what actually works:
import random
import time

import requests  # SOCKS proxy support needs: pip install "requests[socks]"
from bs4 import BeautifulSoup

PROXY = "socks5h://user:pass@gw.snowpad.io:9999"

def scrape_amazon_in(asin):
    url = f"https://www.amazon.in/dp/{asin}"
    headers = {
        # Example of a complete Chrome-on-Android string; keep the version current
        "User-Agent": "Mozilla/5.0 (Linux; Android 14; SM-S928B) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/124.0.0.0 Mobile Safari/537.36",
        "Accept-Language": "en-IN,en;q=0.9",
        "Accept": "text/html,application/xhtml+xml",
        # "br" is left out: requests only decodes Brotli when a brotli package is installed
        "Accept-Encoding": "gzip, deflate",
    }
    session = requests.Session()
    session.proxies = {"http": PROXY, "https": PROXY}
    # Add a realistic, jittered delay before the request
    time.sleep(random.uniform(3, 7))
    resp = session.get(url, headers=headers, timeout=15)
    if resp.status_code != 200:
        return {"error": f"Status {resp.status_code}"}
    soup = BeautifulSoup(resp.text, "html.parser")
    # Extract data; any selector can come back empty if the markup changes
    title = soup.select_one("#productTitle")
    price = soup.select_one(".a-price-whole")
    rating = soup.select_one("[data-hook='average-star-rating'] .a-icon-alt")
    return {
        "asin": asin,
        "title": title.text.strip() if title else None,
        "price": price.text.strip() if price else None,
        # Rating text looks like "4.3 out of 5 stars"; keep the leading number
        "rating": rating.text.split()[0] if rating and rating.text.strip() else None,
    }
Critical techniques:
- Use socks5h:// (not socks5://) to route DNS through the proxy
- Set Accept-Language: en-IN, because Amazon serves different content based on this
- Add 3-7 second delays between requests
- Rotate IPs every 5-10 requests
- Parse JSON-LD embedded in the page for structured data
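The last point is worth a sketch, since the selector-based code above never touches JSON-LD. This version uses only the standard library. The names JsonLdCollector and extract_product are mine, and it assumes the page carries a schema.org Product block with an offers field; check a live response to confirm that before relying on it.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append("".join(self._buffer))
            self._in_json_ld = False


def extract_product(html):
    """Return name/price/availability from the first Product JSON-LD object, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks
        # A block may hold a single object or a list of objects
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and item.get("@type") == "Product":
                offers = item.get("offers") or {}
                if isinstance(offers, list):
                    offers = offers[0] if offers else {}
                if not isinstance(offers, dict):
                    offers = {}
                return {
                    "name": item.get("name"),
                    "price": offers.get("price"),
                    "availability": offers.get("availability"),
                }
    return None
```

Pass resp.text from a successful request to extract_product and fall back to the CSS selectors when it returns None.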
What doesn't work:
- Headless browsers without stealth plugins (detected via the navigator.webdriver property)
- Consistent request timing (add jitter)
- Missing headers (Amazon checks for realistic header sets)
Scraping Flipkart
Flipkart is harder than Amazon.in. They use:
- Aggressive IP-based rate limiting (5-10 req/min/IP)
- CAPTCHA challenges after threshold
- Dynamic class names on product pages (obfuscation)
- Mobile app API endpoints that require signed requests
Working approach:
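import json
import random
import re
import time

import requests  # SOCKS proxy support needs: pip install "requests[socks]"

PROXY = "socks5h://user:pass@gw.snowpad.io:9999"

# JSON-LD blocks often span several lines, so "." must match newlines too
JSON_LD_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>', re.DOTALL
)

def scrape_flipkart(product_id):
    # A mobile User-Agent is sent because the mobile site is less protected than desktop
    url = f"https://www.flipkart.com/p/{product_id}"
    headers = {
        # Example of a complete Chrome-on-Android string; keep the version current
        "User-Agent": "Mozilla/5.0 (Linux; Android 14) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/124.0.0.0 Mobile Safari/537.36",
        "Accept-Language": "en-IN",
        "X-Requested-With": "XMLHttpRequest",
    }
    session = requests.Session()
    session.proxies = {"http": PROXY, "https": PROXY}
    # Flipkart needs longer delays than Amazon.in
    time.sleep(random.uniform(5, 10))
    resp = session.get(url, headers=headers, timeout=20)
    if resp.status_code != 200:
        return {"error": f"Status {resp.status_code}"}
    # Parse JSON-LD for product data
    match = JSON_LD_RE.search(resp.text)
    if not match:
        return {"error": "No JSON-LD block found"}
    try:
        data = json.loads(match.group(1))
    except json.JSONDecodeError:
        return {"error": "Malformed JSON-LD"}
    # JSON-LD may be a single object or a list of objects
    if isinstance(data, list):
        data = data[0] if data else {}
    if not isinstance(data, dict):
        return {"error": "Unexpected JSON-LD shape"}
    offers = data.get("offers") or {}
    if isinstance(offers, list):
        offers = offers[0] if offers else {}
    return {
        "name": data.get("name"),
        "price": offers.get("price"),
        "availability": offers.get("availability"),
    }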
Flipkart-specific tips:
- Target the mobile site (m.flipkart.com) — less protected
- Extract from JSON-LD instead of HTML selectors (class names change)
- Use 5-10 second delays (longer than for Amazon.in)
- Rotate IPs every 3-5 requests
- If you hit CAPTCHA, rotate immediately and increase delays
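The rotation and CAPTCHA rules above can be sketched as a small state machine. ProxyRotator and its method names are hypothetical, as are the doubling factor and the 60-second ceiling; the list of proxy URLs is whatever your provider gives you. Treat it as one way to encode the tips, not a tuned implementation.

```python
import itertools


class ProxyRotator:
    """Hands out proxies, switching every `max_uses` requests or when a block is reported."""

    def __init__(self, proxies, max_uses=3, base_delay=5.0, max_delay=60.0):
        self._cycle = itertools.cycle(proxies)
        self.max_uses = max_uses
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.delay = base_delay  # current minimum delay between requests
        self.current = next(self._cycle)
        self._uses = 0

    def get(self):
        """Return the proxy for the next request, rotating if the quota is used up."""
        if self._uses >= self.max_uses:
            self._rotate()
        self._uses += 1
        return self.current

    def report_block(self):
        """Call on a CAPTCHA or 403/429: rotate immediately and double the delay."""
        self._rotate()
        self.delay = min(self.delay * 2, self.max_delay)

    def report_success(self):
        """Ease the delay back toward the base after a clean response."""
        self.delay = max(self.base_delay, self.delay * 0.9)

    def _rotate(self):
        self.current = next(self._cycle)
        self._uses = 0
```

Before each request, sleep for something like random.uniform(rotator.delay, rotator.delay * 2), then pass rotator.get() as the proxy. With a single gateway URL that rotates IPs on the provider side, the same class can hand out session identifiers instead of URLs.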
Scraping Myntra, Nykaa, AJIO
These fashion/beauty platforms use similar anti-bot measures, each with platform-specific twists:
Myntra: Heavy JavaScript rendering. Use Playwright routed through a Snowpad SOCKS5 proxy, and extract product data from the GraphQL API responses (find them in the browser's Network tab).
Nykaa: Product data loaded via XHR. Monitor /api/product endpoints. Requires session cookies.
AJIO: Reliance-owned (Jio ecosystem). Mobile-optimized. Their anti-bot is less aggressive than Flipkart but still requires Indian IPs.
Scaling Your Scraping Operation
For production-scale scraping (10K+ products/day):
- Use rotating mobile proxies — Snowpad's Indian mobile IPs bypass geo-restrictions and anti-bot
- Implement proxy rotation — Change IP every 5-10 requests
- Add realistic delays — 3-10 seconds between requests with jitter
- Monitor success rates — Track per-domain success rates. Drop below 80%? Investigate immediately
- Use multiple scraper instances — Distribute across IPs and add delays
- Cache aggressively — Don't re-scrape unchanged pages
- Handle failures gracefully — Retry with backoff, rotate proxy on failure
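Two items on this list, monitoring success rates and retrying with backoff, are easy to get subtly wrong, so here is a minimal sketch of both. SuccessMonitor and backoff_delay are names I made up; the 80% threshold comes from the list above, while the window size, minimum sample count, base, and cap are placeholder assumptions to tune against your own traffic.

```python
import random
from collections import defaultdict, deque


class SuccessMonitor:
    """Rolling per-domain success rate over the last `window` requests."""

    def __init__(self, window=100, threshold=0.80, min_samples=20):
        self.threshold = threshold
        self.min_samples = min_samples
        self._results = defaultdict(lambda: deque(maxlen=window))

    def record(self, domain, ok):
        self._results[domain].append(bool(ok))

    def rate(self, domain):
        """Success rate in [0, 1], or None if nothing has been recorded yet."""
        results = self._results[domain]
        return sum(results) / len(results) if results else None

    def needs_attention(self, domain):
        """True once enough samples exist and the rate is below the threshold."""
        results = self._results[domain]
        if len(results) < self.min_samples:
            return False
        return sum(results) / len(results) < self.threshold


def backoff_delay(attempt, base=5.0, cap=120.0, rng=random):
    """Exponential backoff with jitter: uniform in the upper half of the capped ceiling."""
    ceiling = min(cap, base * 2 ** attempt)
    return rng.uniform(ceiling / 2, ceiling)
```

Record every response with monitor.record(domain, resp.status_code == 200), pause the domain when needs_attention returns True, and sleep for backoff_delay(attempt) before retry number attempt (starting at 0) on a fresh proxy.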
Legal Considerations in India
Scraping publicly available data (prices, product listings) is generally legal in India, and it falls outside the DPDP Act 2023 provided you don't collect personal data. However:
- Respect robots.txt
- Don't overwhelm servers (rate limit yourself)
- Don't scrape behind login walls without permission
- Comply with website Terms of Service
The DPDP Act focuses on personal data processing. Product prices and listings are business data, not personal data.
FAQ
Can I scrape Flipkart without getting blocked? Yes, with Indian mobile proxies, realistic delays (5-10 sec), IP rotation every 3-5 requests, and mobile User-Agent headers. Expect 85-92% success rate.
What's the best proxy for scraping Amazon India? Indian mobile proxies from Jio/Airtel networks. They carry genuine mobile IP signatures that Amazon's anti-bot trusts. Datacenter IPs get blocked within minutes.
How many requests per minute can I make? With proper rotation through Snowpad's pool and realistic delays, hundreds of requests per minute distributed across multiple proxy sessions.
Is scraping e-commerce legal in India? Scraping publicly available product data (prices, listings) is generally legal. Avoid personal data, respect robots.txt, and don't overwhelm servers.