Scraping Google Search Results from India: The Complete Technical Guide

In early 2025, our SEO agency noticed something strange. Rankings we tracked for 12,000 keywords started showing impossible fluctuations. Keywords that had been stable for months would jump 50 positions overnight, then back the next day.

After investigating, we discovered Google was serving "poisoned" SERPs — deliberately inaccurate results to suspected scrapers. Instead of blocking us outright, they were feeding us garbage data. We spent 3 weeks collecting worthless ranking reports before catching on.

That experience taught us a great deal about how Google detects scrapers and how to scrape its results reliably.

Why Scrape Google Instead of Using the API?

Google's Custom Search JSON API seems like the obvious choice. It's official, documented, and legal. But it has crippling limitations:

100 queries/day free tier — Useless for any serious SEO work
$5 per 1000 queries beyond that — $50/day for 10K queries, $1,500/month
Limited SERP features — No featured snippets, People Also Ask, image packs, video carousels
Different results — API results differ from what real users see
No geo-targeting granularity — Country-level only, not city-level
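
To make the cost arithmetic above concrete, here is a minimal sketch. It assumes the 100 free daily queries are subtracted before billing; the function name and its parameters are illustrative, not part of any Google client library.

```python
def custom_search_daily_cost(queries_per_day, free_per_day=100, usd_per_1000=5.0):
    """Estimate the daily API bill in USD from the pricing listed above."""
    # Assumption: the free daily quota is deducted before anything is billed.
    billable = max(0, queries_per_day - free_per_day)
    return billable / 1000 * usd_per_1000


daily = custom_search_daily_cost(10_000)
print(f"${daily:.2f}/day, ${daily * 30:.2f}/month")
```

Under that assumption the totals land slightly below the rounded $50/day and $1,500/month figures, because the first 100 queries each day are free.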

For serious SEO monitoring, scraping is the only viable option.

How Google Detects Scrapers

Google's anti-scraping is sophisticated but not infallible. They use:

1. IP Reputation

Datacenter IPs: Blocked or CAPTCHA'd within 10-20 queries
Residential IPs: Work for 50-100 queries before challenges
Mobile IPs: Hundreds of queries before any friction

2. Query Patterns

Sequential queries (keyword1, keyword2, keyword3...) are obvious
High frequency (>1 query/second) triggers rate limiting
Unusual query combinations flag accounts
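
One way to avoid the sequential, fixed-rate pattern described above is to shuffle the keyword order and draw a random gap before each request. The sketch below only plans the schedule. `plan_queries` is a hypothetical helper, and the 3-8 second range is the delay range used elsewhere in this guide.

```python
import random


def plan_queries(keywords, min_gap=3.0, max_gap=8.0, seed=None):
    """Return (keyword, delay_seconds) pairs in shuffled order with jittered gaps."""
    rng = random.Random(seed)
    shuffled = list(keywords)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return [(kw, rng.uniform(min_gap, max_gap)) for kw in shuffled]


plan = plan_queries(["keyword1", "keyword2", "keyword3", "keyword4"], seed=7)
for keyword, delay in plan:
    print(f"wait {delay:.1f}s, then search {keyword!r}")
```

A caller would sleep for `delay` seconds before each search. Every gap is at least 3 seconds, so the rate stays well under one query per second.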

3. Browser Fingerprinting

Google checks for headless browser indicators
The navigator.webdriver property and the navigator.plugins array
Canvas and WebGL fingerprints

4. Behavioral Analysis

Time on page (bots often leave immediately)
Mouse movements (or lack thereof)
Click patterns

5. The Poisoned Data Trap

This is Google's most insidious technique. Instead of blocking suspected scrapers, they serve slightly altered results:

Swapped rankings (positions 3 and 4 reversed)
Different featured snippets
Modified meta descriptions
Incorrect local pack results

The goal: make scrapers think they're getting real data while actually collecting garbage.

Scraping Google from Indian IPs

Indian mobile proxies provide unique advantages for Google scraping:

Lower scrutiny: Western anti-bot systems are optimized for US/European traffic patterns. Indian mobile traffic looks different — different peak hours, different query patterns, different device mix. This makes detection harder.

Geo-targeting: Scrape localized Indian results with gl=IN and hl=en parameters. See exactly what Indian users see.

Higher quotas: Mobile IPs have higher trust scores, meaning more queries before any friction.
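
As a small illustration of the geo-targeting parameters, the sketch below builds the search URL with the standard library's `urlencode`, so the query is escaped correctly. `build_search_url` and `accept_language` are hypothetical helper names. The `gl` and `hl` values are the ones named above.

```python
from urllib.parse import urlencode


def build_search_url(query, gl="IN", hl="en", num=10):
    """Build a Google search URL with country (gl) and language (hl) parameters."""
    params = {"q": query, "num": num, "hl": hl, "gl": gl}
    return "https://www.google.com/search?" + urlencode(params)


def accept_language(gl="IN", hl="en"):
    """Build a matching Accept-Language header value, e.g. en-IN,en;q=0.9."""
    return f"{hl}-{gl},{hl};q=0.9"


print(build_search_url("best mobile proxies India"))
print(accept_language())
```

Keeping the header and the URL parameters derived from the same two values avoids sending a request whose language signals contradict each other.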

Working Code Example

import random
import time

import requests  # SOCKS proxy URLs need PySocks: pip install "requests[socks]"
from bs4 import BeautifulSoup

PROXY = "socks5h://user:pass@gw.snowpad.io:9999"

def search_google(query, num_results=10, gl="IN", hl="en"):
    """Scrape Google search results with Indian mobile proxy."""

    # Construct URL with geo parameters
    url = (
        f"https://www.google.com/search?q={requests.utils.quote(query)}"
        f"&num={num_results}&hl={hl}&gl={gl}&gws_rd=cr"
    )

    headers = {
        # Full mobile Chrome UA; the Chrome version is an example, keep it current
        "User-Agent": (
            "Mozilla/5.0 (Linux; Android 14; SM-S928B) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/124.0.0.0 Mobile Safari/537.36"
        ),
        "Accept-Language": f"{hl}-{gl},{hl};q=0.9",
        "Accept": "text/html,application/xhtml+xml",
        # No "br": requests only decodes Brotli when the brotli package is installed
        "Accept-Encoding": "gzip, deflate",
    }

    session = requests.Session()
    session.proxies = {"http": PROXY, "https": PROXY}

    # Realistic delay
    time.sleep(random.uniform(3, 8))

    try:
        resp = session.get(url, headers=headers, timeout=15)
    except requests.RequestException as exc:
        return {"error": f"Request failed: {exc}"}

    if resp.status_code != 200:
        return {"error": f"Status {resp.status_code}"}

    soup = BeautifulSoup(resp.text, "html.parser")
    results = []
    seen_urls = set()

    # Parse organic results; the selectors can match nested containers,
    # so skip any URL that was already recorded
    for g in soup.select("div.g, div[data-hveid]"):
        title_elem = g.select_one("h3")
        link_elem = g.select_one("a[href]")
        snippet_elem = g.select_one("div.VwiC3b, span.aCOpRe")

        if not (title_elem and link_elem):
            continue
        href = link_elem["href"]
        if href in seen_urls:
            continue
        seen_urls.add(href)
        results.append({
            "position": len(results) + 1,
            "title": title_elem.get_text(),
            "url": href,
            "snippet": snippet_elem.get_text() if snippet_elem else ""
        })

    return {
        "query": query,
        "results": results,
        "total_results": len(results)
    }

# Example usage (an error response has no "results" key)
results = search_google("best mobile proxies India", gl="IN")
for r in results.get("results", [])[:5]:
    print(f"{r['position']}. {r['title']}")
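
A status code of 200 does not by itself mean the response is a results page. The heuristic sketch below flags likely block or CAPTCHA responses. It assumes Google's interstitial is served from a `/sorry/` path and mentions "unusual traffic", which may change. `looks_blocked` is a hypothetical helper, not part of requests.

```python
def looks_blocked(status_code, final_url, body):
    """Heuristically decide whether a response is a block page, not a SERP."""
    if status_code == 429:  # explicit rate limiting
        return True
    if "/sorry/" in final_url:  # assumed path of the CAPTCHA interstitial
        return True
    # Assumed wording of the block page; treat a match as a hint, not proof.
    return "detected unusual traffic" in body.lower()


print(looks_blocked(200, "https://www.google.com/search?q=test", "<html>results</html>"))
print(looks_blocked(200, "https://www.google.com/sorry/index?continue=x", ""))
```

With requests, the three arguments would come from `resp.status_code`, `resp.url` (the URL after redirects), and `resp.text`.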

Advanced Techniques

1. Detect Poisoned Data

Compare results across multiple IPs and time windows. If rankings differ significantly between concurrent queries, you're likely getting poisoned data.

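def verify_results(query, num_checks=3):
    """Verify SERP consistency across repeated fetches.

    The fetches only use different IPs if the proxy gateway
    rotates the exit IP between requests.
    """
    all_results = []
    last_result = None

    for _ in range(num_checks):
        result = search_google(query)
        # A failed fetch has no "results" key, so stop instead of raising KeyError
        if "error" in result:
            return {"status": "error", "warning": result["error"]}
        all_results.append([r['url'] for r in result['results']])
        last_result = result
        time.sleep(5)

    # Check consistency: every fetch must return the same URL order
    if all(r == all_results[0] for r in all_results):
        return {"status": "consistent", "data": last_result}
    return {"status": "inconsistent", "warning": "Possible poisoned data"}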
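
The exact-equality check above treats any difference as suspicious, yet genuine SERPs can vary slightly between fetches. A softer comparison, sketched below, scores how many positions agree and lists adjacent swaps such as positions 3 and 4 being reversed. Both function names are hypothetical, and any threshold applied to the score would need tuning against real data.

```python
def positional_agreement(a, b, top_n=10):
    """Fraction of the top_n positions where two rankings list the same URL."""
    a, b = a[:top_n], b[:top_n]
    slots = max(len(a), len(b))
    if slots == 0:
        return 1.0  # two empty rankings agree trivially
    return sum(1 for x, y in zip(a, b) if x == y) / slots


def adjacent_swaps(a, b):
    """1-based positions where two neighbouring results of a are reversed in b."""
    n = min(len(a), len(b))
    return [
        i + 1
        for i in range(n - 1)
        if a[i] != a[i + 1] and a[i] == b[i + 1] and a[i + 1] == b[i]
    ]


first = ["u1", "u2", "u3", "u4", "u5"]
second = ["u1", "u2", "u4", "u3", "u5"]
print(positional_agreement(first, second))  # 0.6
print(adjacent_swaps(first, second))        # [3]
```

The inputs are the URL lists that `verify_results` already builds, so the two helpers could replace its equality test without changing how results are fetched.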

2. Extract SERP Features

def extract_serp_features(soup):
    features = {}
    
    # Featured snippet
    featured = soup.select_one("div.xpdopen, div[class*='featured-snippet']")
    if featured:
        features['featured_snippet'] = featured.get_text()
    
    # People Also Ask
    paa = []
    for item in soup.select("div[jsname='Cpkphb']"):
        question = item.select_one("span")
        if question:
            paa.append(question.get_text())
    if paa:
        features['people_also_ask'] = paa
    
    # Knowledge panel
    knowledge = soup.select_one("div.kp-blk")
    if knowledge:
        features['knowledge_panel'] = True
    
    return features

3. Scale to Thousands of Queries

For large-scale SEO monitoring:

import concurrent.futures

def scrape_keyword_batch(keywords, max_workers=5):
    """Scrape multiple keywords in parallel; search_google adds the per-request delay."""
    results = {}
    
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_keyword = {
            executor.submit(search_google, kw): kw 
            for kw in keywords
        }
        
        for future in concurrent.futures.as_completed(future_to_keyword):
            keyword = future_to_keyword[future]
            try:
                results[keyword] = future.result()
            except Exception as e:
                results[keyword] = {"error": str(e)}
    
    return results
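
The thread pool above does not decide when to switch IPs. One possible approach, sketched here, is to split the keyword list into batches of 10-20 (the rotation interval recommended under Best Practices) and handle one batch per exit IP. `rotation_batches` is a hypothetical helper, and how a rotation is actually triggered depends on the proxy provider and is not shown.

```python
import random


def rotation_batches(keywords, min_size=10, max_size=20, seed=None):
    """Split keywords into consecutive batches of random size, one per exit IP."""
    rng = random.Random(seed)
    batches = []
    start = 0
    while start < len(keywords):
        size = rng.randint(min_size, max_size)  # inclusive on both ends
        batches.append(keywords[start:start + size])
        start += size
    return batches


keywords = [f"keyword {i}" for i in range(45)]
for number, batch in enumerate(rotation_batches(keywords, seed=1), start=1):
    print(f"batch {number}: {len(batch)} keywords")
```

Randomizing the batch size avoids rotating at a perfectly regular interval. Each batch could then be passed to `scrape_keyword_batch` after the IP changes.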

Best Practices

Rotate IPs every 10-20 queries — Use Snowpad's auto-rotation
Add 3-8 second delays — With jitter (random variation)
Use mobile User-Agents — Match your proxy type
Set Accept-Language — Critical for localized results
Verify data consistency — Cross-check across multiple IPs
Monitor for poisoned data — Sudden impossible ranking changes
Distribute queries across time — Don't batch all at once
Use realistic query patterns — Mix with benign queries
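
The "sudden impossible ranking changes" check can be automated against stored daily positions. The sketch below flags days where a keyword jumps by at least 50 positions and returns to roughly its earlier rank the next day. `find_spikes` and its `tolerance` parameter are illustrative assumptions. The 50-position threshold mirrors the fluctuations described in the introduction.

```python
def find_spikes(positions, jump=50, tolerance=5):
    """Return indices of days whose rank jumped by >= jump and reverted the next day."""
    spikes = []
    for i in range(1, len(positions) - 1):
        prev, cur, nxt = positions[i - 1], positions[i], positions[i + 1]
        jumped = abs(cur - prev) >= jump
        reverted = abs(nxt - prev) <= tolerance
        if jumped and reverted:
            spikes.append(i)
    return spikes


daily_positions = [4, 5, 4, 58, 5, 4]
print(find_spikes(daily_positions))  # [3]
```

A flagged day is a candidate for re-scraping from a different IP, not proof of poisoned data, since genuine ranking volatility also exists.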

Scaling Limits

With proper configuration using Snowpad's Indian mobile proxies:

Conservative: 500-1000 queries/day per proxy session
Moderate: 2000-5000 queries/day with aggressive rotation
Aggressive: 10000+ queries/day with multiple proxy sessions and delays

Beyond these limits, expect increased CAPTCHA frequency and potential temporary IP blocks.
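
These tiers translate into a simple capacity calculation. The sketch below estimates how many proxy sessions a daily workload needs at a given per-session budget, and the ceiling that 3-8 second delays place on a single sequential worker. Both helpers are hypothetical, and request latency is ignored.

```python
import math


def sessions_needed(queries_per_day, per_session_budget):
    """Number of proxy sessions required to stay within a per-session budget."""
    return math.ceil(queries_per_day / per_session_budget)


def max_sequential_queries(seconds=86_400, min_delay=3.0, max_delay=8.0):
    """Upper bound on queries one worker can send in the given time window."""
    mean_delay = (min_delay + max_delay) / 2  # average of a uniform delay
    return int(seconds // mean_delay)


print(sessions_needed(12_000, 1_000))  # 12 sessions at the conservative tier
print(sessions_needed(12_000, 5_000))  # 3 sessions at the moderate tier
print(max_sequential_queries())        # 15709 queries per day at most
```

For the 12,000 keywords mentioned in the introduction, the conservative tier implies a dozen sessions. The delay-based ceiling also shows that a single sequential worker cannot exceed roughly 15,700 queries a day, whatever the IP allows.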

FAQ

Can I scrape Google with mobile proxies? Yes. Mobile proxies are among the most effective for Google scraping because they carry genuine carrier IPs with high reputation scores. Indian mobile IPs have the added advantage of lower Western anti-bot scrutiny.

How do I get Indian Google search results? Use the gl=IN (country) and hl=en (interface language) URL parameters. Route requests through Snowpad's Indian mobile IPs to see exactly what Indian users see.

How many Google searches can I scrape per day? With proper rotation and 3-8 second delays: 500-1000 queries/day per proxy session on a conservative setup, or 2000-5000 queries/day with aggressive rotation. Scale horizontally with multiple sessions for higher volume.

Is scraping Google legal? Scraping publicly available search results is generally legal. However, Google's Terms of Service prohibit automated access. The legal landscape is complex. For business-critical applications, consider the risks and consult legal counsel.
