Scraping Google Search Results from India: The Complete Technical Guide
Deepesh Kalur
Expert Contributor
To scrape Google search results from Indian IPs, route requests through Snowpad's SOCKS5 mobile proxies and add the gl=IN and hl=en URL parameters. Rotate IPs every 10-20 queries, add 3-8 second delays with jitter, and use mobile User-Agent headers. Verify data consistency across multiple IPs to detect poisoned results. With aggressive rotation, expect 2000-5000 queries/day per proxy session.
In early 2025, our SEO agency noticed something strange. Rankings we tracked for 12,000 keywords started showing impossible fluctuations. Keywords that had been stable for months would jump 50 positions overnight, then back the next day.
After investigating, we discovered Google was serving "poisoned" SERPs — deliberately inaccurate results to suspected scrapers. Instead of blocking us outright, they were feeding us garbage data. We spent 3 weeks collecting worthless ranking reports before catching on.
That experience taught me a great deal about how Google detects scrapers and how to scrape their results reliably.
Why Scrape Google Instead of Using the API?
Google's Custom Search JSON API seems like the obvious choice. It's official, documented, and legal. But it has crippling limitations:
- 100 queries/day free tier — Useless for any serious SEO work
- $5 per 1000 queries beyond that — $50/day for 10K queries, $1,500/month
- Limited SERP features — No featured snippets, People Also Ask, image packs, video carousels
- Different results — API results differ from what real users see
- No geo-targeting granularity — Country-level only, not city-level
For serious SEO monitoring, scraping is the only viable option.
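The cost figures above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the prices quoted in the list (100 free queries/day, $5 per 1000 after that); the function name and the 30-day month are assumptions for illustration. Subtracting the free tier gives slightly less than the rounded $50/day and $1,500/month.

```python
def custom_search_cost(queries_per_day, free_per_day=100, usd_per_1000=5.0, days=30):
    """Estimate API spend from the prices quoted above.

    Returns (daily_usd, monthly_usd). A 30-day month is assumed.
    """
    billable = max(0, queries_per_day - free_per_day)
    daily = billable / 1000 * usd_per_1000
    return daily, daily * days


daily, monthly = custom_search_cost(10_000)
print(f"${daily:.2f}/day, ${monthly:.2f}/month")  # $49.50/day, $1485.00/month
```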
How Google Detects Scrapers
Google's anti-scraping is sophisticated but not infallible. They use:
1. IP Reputation
- Datacenter IPs: Blocked or CAPTCHA'd within 10-20 queries
- Residential IPs: Work for 50-100 queries before challenges
- Mobile IPs: Hundreds of queries before any friction
2. Query Patterns
- Sequential queries (keyword1, keyword2, keyword3...) are obvious
- High frequency (>1 query/second) triggers rate limiting
- Unusual query combinations flag accounts
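One way to avoid the sequential, fixed-rate pattern described above is to shuffle the keyword order and draw a fresh random delay for every query. The sketch below is illustrative only: plan_queries is a hypothetical helper, and the 3-8 second range is the one recommended elsewhere in this guide rather than a threshold confirmed by Google.

```python
import random


def plan_queries(keywords, min_delay=3.0, max_delay=8.0, rng=None):
    """Return (keyword, delay_seconds) pairs in shuffled order.

    Each delay is drawn independently, so the gap between queries
    never settles into a fixed rhythm.
    """
    if rng is None:
        rng = random.Random()
    order = list(keywords)  # copy so the caller's list is not reordered
    rng.shuffle(order)
    return [(kw, rng.uniform(min_delay, max_delay)) for kw in order]


plan = plan_queries(["proxy india", "mobile proxy", "socks5 proxy"])
for kw, delay in plan:
    print(f"{kw}: wait {delay:.1f}s before sending")
```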
3. Browser Fingerprinting
- Google checks for headless browser indicators
- The navigator.webdriver flag and an empty or missing plugins array
- Canvas and WebGL fingerprints
4. Behavioral Analysis
- Time on page (bots often leave immediately)
- Mouse movements (or lack thereof)
- Click patterns
5. The Poisoned Data Trap
This is Google's most insidious technique. Instead of blocking suspected scrapers, they serve slightly altered results:
- Swapped rankings (positions 3 and 4 reversed)
- Different featured snippets
- Modified meta descriptions
- Incorrect local pack results
The goal: make scrapers think they're getting real data while actually collecting garbage.
Scraping Google from Indian IPs
Indian mobile proxies provide unique advantages for Google scraping:
Lower scrutiny: Western anti-bot systems are optimized for US/European traffic patterns. Indian mobile traffic looks different — different peak hours, different query patterns, different device mix. This makes detection harder.
Geo-targeting: Scrape localized Indian results with gl=IN and hl=en parameters. See exactly what Indian users see.
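Higher quotas: Mobile carriers share each IP address among many real subscribers through carrier-grade NAT, so blocking one address risks blocking legitimate users. Mobile IPs therefore carry higher trust scores, meaning more queries before any friction.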
Working Code Example
import random
import time

import requests  # SOCKS proxies need the extra: pip install "requests[socks]"
from bs4 import BeautifulSoup

PROXY = "socks5h://user:pass@gw.snowpad.io:9999"


def search_google(query, num_results=10, gl="IN", hl="en"):
    """Scrape Google search results with Indian mobile proxy."""
    # Construct URL with geo parameters
    url = (
        f"https://www.google.com/search?q={requests.utils.quote(query)}"
        f"&num={num_results}&hl={hl}&gl={gl}&gws_rd=cr"
    )
    headers = {
        # Shortened UA; a real Chrome on Android string also carries
        # "(KHTML, like Gecko) Chrome/<version> Mobile Safari/537.36"
        "User-Agent": "Mozilla/5.0 (Linux; Android 14; SM-S928B) AppleWebKit/537.36",
        "Accept-Language": f"{hl}-{gl},{hl};q=0.9",
        "Accept": "text/html,application/xhtml+xml",
        # "br" is left out: requests only decodes Brotli if a brotli package is installed
        "Accept-Encoding": "gzip, deflate",
    }
    session = requests.Session()
    session.proxies = {"http": PROXY, "https": PROXY}
    # Realistic delay
    time.sleep(random.uniform(3, 8))
    resp = session.get(url, headers=headers, timeout=15)
    if resp.status_code != 200:
        return {"error": f"Status {resp.status_code}"}
    soup = BeautifulSoup(resp.text, "html.parser")
    results = []
    seen_urls = set()
    # Parse organic results. The two selectors can match nested copies of the
    # same result, so duplicates are skipped by URL. Google changes this markup often.
    for g in soup.select("div.g, div[data-hveid]"):
        title_elem = g.select_one("h3")
        link_elem = g.select_one("a[href]")
        snippet_elem = g.select_one("div.VwiC3b, span.aCOpRe")
        if not (title_elem and link_elem):
            continue
        href = link_elem["href"]
        if href in seen_urls:
            continue
        seen_urls.add(href)
        results.append({
            "position": len(results) + 1,
            "title": title_elem.get_text(),
            "url": href,
            "snippet": snippet_elem.get_text() if snippet_elem else "",
        })
    return {
        "query": query,
        "results": results,
        "total_results": len(results),
    }


# Example usage
results = search_google("best mobile proxies India", gl="IN")
for r in results.get("results", [])[:5]:  # empty when an error dict came back
    print(f"{r['position']}. {r['title']}")
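The status check in search_google only catches non-200 responses. A minimal sketch of a broader block check follows, assuming that Google signals rate limiting with HTTP 429, a redirect to a /sorry/ URL, or a page mentioning "unusual traffic". Those markers are assumptions drawn from commonly observed behaviour and may change; looks_blocked and backoff_delay are hypothetical helper names.

```python
def looks_blocked(status_code, final_url, body):
    """Heuristic: does this response look like a rate limit or CAPTCHA page?

    The three markers checked here are assumptions, not a documented contract.
    """
    if status_code == 429:
        return True
    if "/sorry/" in final_url:
        return True
    return "unusual traffic" in body.lower()


def backoff_delay(attempt, base=30.0, cap=600.0):
    """Exponential backoff in seconds: base, 2*base, 4*base, ... up to cap."""
    return min(cap, base * 2 ** attempt)


print(looks_blocked(200, "https://www.google.com/search?q=x", "<html>ok</html>"))  # False
print(backoff_delay(2))  # 120.0
```

With requests, the arguments would be resp.status_code, resp.url, and resp.text. When the check fires, it is usually safer to pause and switch IP than to retry immediately.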
Advanced Techniques
1. Detect Poisoned Data
Compare results across multiple IPs and time windows. If rankings differ significantly between queries sent within seconds of each other, you're likely getting poisoned data.
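def verify_results(query, num_checks=3):
    """Verify SERP consistency across repeated fetches.

    The fetches only come from different IPs if the proxy rotates between requests.
    """
    all_results = []
    result = None
    for _ in range(num_checks):
        result = search_google(query)
        if "error" in result:
            # A failed fetch has no "results" key, so stop instead of raising KeyError
            return {"status": "error", "warning": result["error"]}
        rankings = [r["url"] for r in result["results"]]
        all_results.append(rankings)
        time.sleep(5)
    # Check consistency
    if all(r == all_results[0] for r in all_results):
        return {"status": "consistent", "data": result}
    else:
        return {"status": "inconsistent", "warning": "Possible poisoned data"}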
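The strict equality test in verify_results treats any difference as inconsistent, even though two honest fetches can differ slightly. A softer comparison, sketched below, reports how much two rankings overlap and how far the shared URLs moved. The function name and the top_n default are assumptions, and the thresholds you act on are a judgment call.

```python
def ranking_agreement(a, b, top_n=10):
    """Compare two ranked URL lists.

    Returns (overlap, max_shift):
      overlap   - Jaccard overlap of the two top_n sets (1.0 means same URLs)
      max_shift - largest position change among URLs present in both lists
    """
    a, b = a[:top_n], b[:top_n]
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    overlap = len(set_a & set_b) / len(union) if union else 1.0
    pos_b = {url: i for i, url in enumerate(b)}
    shifts = [abs(i - pos_b[url]) for i, url in enumerate(a) if url in pos_b]
    return overlap, max(shifts, default=0)


run_1 = ["a.com", "b.com", "c.com", "d.com"]
run_2 = ["a.com", "b.com", "d.com", "c.com"]  # positions 3 and 4 swapped
print(ranking_agreement(run_1, run_2))  # (1.0, 1)
```

A swap like the one above keeps overlap at 1.0 with a shift of 1, so a single comparison cannot separate it from normal variation. Repeating the comparison across several IPs and looking for one fetch that disagrees with the rest is more informative.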
2. Extract SERP Features
def extract_serp_features(soup):
    features = {}
    # Featured snippet
    featured = soup.select_one("div.xpdopen, div[class*='featured-snippet']")
    if featured:
        features['featured_snippet'] = featured.get_text()
    # People Also Ask
    paa = []
    for item in soup.select("div[jsname='Cpkphb']"):
        question = item.select_one("span")
        if question:
            paa.append(question.get_text())
    if paa:
        features['people_also_ask'] = paa
    # Knowledge panel
    knowledge = soup.select_one("div.kp-blk")
    if knowledge:
        features['knowledge_panel'] = True
    return features
3. Scale to Thousands of Queries
For large-scale SEO monitoring:
import concurrent.futures


def scrape_keyword_batch(keywords, max_workers=5):
    """Scrape multiple keywords concurrently.

    Delays happen inside search_google. Every worker shares the same PROXY,
    so 5 workers with 3-8 second delays send roughly one query per second overall.
    """
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_keyword = {
            executor.submit(search_google, kw): kw
            for kw in keywords
        }
        for future in concurrent.futures.as_completed(future_to_keyword):
            keyword = future_to_keyword[future]
            try:
                results[keyword] = future.result()
            except Exception as e:
                # Network and parsing failures are recorded per keyword
                results[keyword] = {"error": str(e)}
    return results
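The batch function above sends every request through the single PROXY constant. If you manage rotation yourself instead of relying on the gateway, a thread-safe rotator along these lines is one option. It is a sketch under stated assumptions: ProxyRotator is a hypothetical class, the example.com proxy URLs are placeholders, and the 10-20 uses per IP comes from the rotation advice in this guide.

```python
import itertools
import random
import threading


class ProxyRotator:
    """Hand out proxy URLs, moving to the next one after 10-20 uses."""

    def __init__(self, proxy_urls, min_uses=10, max_uses=20, rng=None):
        if not proxy_urls:
            raise ValueError("need at least one proxy URL")
        self._cycle = itertools.cycle(proxy_urls)
        self._rng = rng if rng is not None else random.Random()
        self._min, self._max = min_uses, max_uses
        self._lock = threading.Lock()  # workers in a thread pool share one rotator
        self._advance()

    def _advance(self):
        # Pick the next proxy and a fresh random budget of uses for it
        self._current = next(self._cycle)
        self._remaining = self._rng.randint(self._min, self._max)

    def get(self):
        """Return the proxy URL to use for the next request."""
        with self._lock:
            if self._remaining == 0:
                self._advance()
            self._remaining -= 1
            return self._current


rotator = ProxyRotator([
    "socks5h://user:pass@proxy-a.example.com:1080",  # placeholder URL
    "socks5h://user:pass@proxy-b.example.com:1080",  # placeholder URL
])
print(rotator.get())
```

To use it, search_google would need to accept a proxy argument and build session.proxies from rotator.get() rather than from PROXY; that change is left to the reader.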
Best Practices
- Rotate IPs every 10-20 queries — Use Snowpad's auto-rotation
- Add 3-8 second delays — With jitter (random variation)
- Use mobile User-Agents — Match your proxy type
- Set Accept-Language — Critical for localized results
- Verify data consistency — Cross-check across multiple IPs
- Monitor for poisoned data — Sudden impossible ranking changes
- Distribute queries across time — Don't batch all at once
- Use realistic query patterns — Mix with benign queries
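The "sudden impossible ranking changes" check can be automated against your stored rank history. The sketch below flags days where a keyword jumps by a large amount and returns to roughly its old position the next day, the pattern described in the introduction. The function name and both thresholds are assumptions to tune against your own data.

```python
def find_suspicious_jumps(history, jump_threshold=20, revert_tolerance=5):
    """Flag spike-and-revert days in one keyword's rank history.

    history: daily positions, oldest first.
    Returns the indices of days whose position differs from the previous
    day by at least jump_threshold while the following day is back within
    revert_tolerance of the previous day's position.
    """
    flagged = []
    for i in range(1, len(history) - 1):
        before, day, after = history[i - 1], history[i], history[i + 1]
        jumped = abs(day - before) >= jump_threshold
        reverted = abs(after - before) <= revert_tolerance
        if jumped and reverted:
            flagged.append(i)
    return flagged


print(find_suspicious_jumps([4, 5, 55, 5, 4]))  # [2]
```

A flagged day is a prompt to re-fetch that keyword from a different IP, not proof of poisoning, since real ranking volatility can produce the same shape.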
Scaling Limits
With proper configuration using Snowpad's Indian mobile proxies:
- Conservative: 500-1000 queries/day per proxy session
- Moderate: 2000-5000 queries/day with aggressive rotation
- Aggressive: 10000+ queries/day with multiple proxy sessions and delays
Beyond these limits, expect increased CAPTCHA frequency and potential temporary IP blocks.
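For planning, it helps to turn these tiers into session counts. The sketch below assumes each query costs the mean of the 3-8 second delay plus about one second of fetch time; the one-second figure and both function names are assumptions. The session counts use the 12,000-keyword workload from the introduction and the moderate tier above.

```python
import math


def daily_capacity(min_delay=3.0, max_delay=8.0, fetch_seconds=1.0, active_hours=24.0):
    """Queries one sequential worker can send per day, limited by pacing alone."""
    mean_cycle = (min_delay + max_delay) / 2 + fetch_seconds
    return int(active_hours * 3600 / mean_cycle)


def sessions_needed(target_per_day, per_session_per_day):
    """Proxy sessions required to reach a daily target, rounded up."""
    return math.ceil(target_per_day / per_session_per_day)


print(daily_capacity())               # 13292
print(sessions_needed(12_000, 5000))  # 3
print(sessions_needed(12_000, 2000))  # 6
```

If the one-second fetch assumption holds, pacing alone would allow roughly 13,000 queries/day per worker, well above the 2000-5000 quoted for a session. That suggests IP reputation, rather than the delay between queries, is what sets the practical ceiling.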
FAQ
Can I scrape Google with mobile proxies?
Yes. Mobile proxies are among the most effective for Google scraping because they carry genuine carrier IPs with high reputation scores. Indian mobile IPs have the added advantage of lower Western anti-bot scrutiny.
How do I get Indian Google search results?
Use gl=IN (geolocation) and hl=en (language) URL parameters. Route requests through Snowpad's Indian mobile IPs to see exactly what Indian users see.
How many Google searches can I scrape per day?
With proper rotation and 3-8 second delays: 2000-5000 queries/day per proxy session. Scale horizontally with multiple sessions for higher volume.
Is scraping Google legal?
Scraping publicly available search results is generally legal. However, Google's Terms of Service prohibit automated access. The legal landscape is complex. For business-critical applications, consider the risks and consult legal counsel.