When we started Optimal Shopping, we quickly ran into the same wall every price-comparison tool hits: retail websites and APIs aggressively block datacenter IP ranges. Our first approach — a cloud VM behind a proxy service — lasted about two weeks before Walmart started returning garbage data and Kroger's API began rate-limiting us to uselessness.
The Residential IP Problem
Retail chains invest heavily in distinguishing human shoppers from automated requests. Datacenter IPs are trivially identifiable: each address belongs to an ASN registered to AWS, DigitalOcean, or a known proxy provider, and that alone is enough to get a request flagged. Residential proxy services solve this on paper, but at $15–50/GB they become prohibitively expensive at our request volumes.
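To make the ASN point concrete, here is a minimal sketch of the kind of lookup a retailer's bot filter could run. It is an illustration, not any retailer's actual logic: the prefix table, the category names, and the functions `classify` and `looks_automated` are all hypothetical, and the prefixes are RFC 5737 documentation ranges standing in for real provider allocations.

```python
import ipaddress

# Hypothetical prefix table. These are RFC 5737 documentation ranges used as
# placeholders, not real provider allocations; a real filter would consult
# an IP-to-ASN database instead.
PREFIX_OWNERS = {
    ipaddress.ip_network("192.0.2.0/24"): "cloud-hosting",
    ipaddress.ip_network("198.51.100.0/24"): "proxy-provider",
    ipaddress.ip_network("203.0.113.0/24"): "residential-isp",
}

# Owner categories that get a request flagged regardless of its content.
BLOCKED_CATEGORIES = {"cloud-hosting", "proxy-provider"}


def classify(ip: str) -> str:
    """Return the owner category for an address, or "unknown"."""
    addr = ipaddress.ip_address(ip)
    for network, category in PREFIX_OWNERS.items():
        if addr in network:
            return category
    return "unknown"


def looks_automated(ip: str) -> bool:
    """Flag a request purely from who owns the source address."""
    return classify(ip) in BLOCKED_CATEGORIES
```

The check never looks at headers or behavior, which is why rotating user agents from a cloud VM does not help: the address alone gives the request away.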
Raspberry Pi as Edge Node
The solution was to treat the scraper infrastructure as a distributed IoT problem rather than a cloud one. A Raspberry Pi 4 (4GB) costs $55. A home internet connection gives it a genuine residential IP registered to a real ISP. We ship nodes to contributors, who plug them in like a smart home device.
Each node runs our scraper agent inside a Docker container. Nodes authenticate to our coordination server via Tailscale, receive job assignments from a Supabase queue, execute the scrapes, and push results back. The coordination layer handles deduplication, retries failed jobs, and circuit-breaks around any node that goes offline.
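The circuit-breaking part of the coordination layer can be sketched roughly as follows. This is a simplified illustration under assumptions, not our production code: the class `NodeBreaker`, the function `pick_node`, the failure threshold, and the cooldown are all hypothetical, and the Supabase queue and Tailscale transport are left out entirely.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class NodeBreaker:
    """Tracks consecutive failures for one edge node (hypothetical design)."""

    failure_threshold: int = 3        # assumed: failures before the node is skipped
    cooldown_seconds: float = 300.0   # assumed: how long a tripped node is skipped
    clock: Callable[[], float] = time.monotonic
    failures: int = 0
    opened_at: Optional[float] = None

    def available(self) -> bool:
        """A node is usable if it never tripped or its cooldown has elapsed."""
        if self.opened_at is None:
            return True
        # After the cooldown, let a probe job through (half-open state).
        return self.clock() - self.opened_at >= self.cooldown_seconds

    def record_success(self) -> None:
        """A successful job closes the breaker and resets the count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        """Trip (or re-trip) the breaker once the threshold is reached."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def pick_node(breakers: dict[str, NodeBreaker]) -> Optional[str]:
    """Return the first node whose breaker allows traffic, if any."""
    for node_id, breaker in breakers.items():
        if breaker.available():
            return node_id
    return None
```

A dispatcher would call `pick_node` before assigning each job and report the outcome back through `record_success` or `record_failure`. A failed probe after the cooldown re-trips the breaker, so an unplugged Pi costs one wasted job per cooldown window rather than a steady stream of failures.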
Results
At 12 active nodes, we see a 99.2% success rate on Walmart and Kroger API calls that previously failed more than 60% of the time. Our per-request cost dropped from ~$0.04 (proxy) to ~$0.0008 (amortized hardware + electricity), roughly a 50x reduction.
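For a sense of scale, the arithmetic behind those figures can be checked with a few lines. The sketch uses only the numbers quoted in this post; the function names `cost_ratio` and `requests_equal_to_hardware` are made up for the illustration.

```python
PROXY_COST_PER_REQUEST = 0.04    # USD, approximate figure quoted above
NODE_COST_PER_REQUEST = 0.0008   # USD, amortized hardware + electricity
PI_PRICE = 55.0                  # USD, Raspberry Pi 4 (4GB)


def cost_ratio(old_cost: float, new_cost: float) -> float:
    """How many times cheaper the new per-request cost is."""
    return old_cost / new_cost


def requests_equal_to_hardware(hardware_price: float, proxy_cost: float) -> float:
    """Number of proxy-priced requests that cost as much as one node."""
    return hardware_price / proxy_cost


print(round(cost_ratio(PROXY_COST_PER_REQUEST, NODE_COST_PER_REQUEST)))
print(round(requests_equal_to_hardware(PI_PRICE, PROXY_COST_PER_REQUEST)))
```

Put differently, the purchase price of one Pi is about what 1,375 requests cost at the proxy rate.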