What You Can Actually Scrape from Shopify Stores

Shopify stores are publicly accessible websites. That means store names, product listings, and contact information published on the storefront are generally available for collection. Most stores display an email in their footer, contact page, or "About Us" section.

What you typically get from scraping: the store's name and domain, its publicly listed contact email, and whatever product and contact-page details are published on the storefront.

What you won't get from storefront scraping alone: the store owner's personal email, revenue data, or internal team contacts. Those require different approaches.

Method 1: Apify Shopify Scraper (Recommended)

Apify offers pre-built scrapers specifically designed for Shopify stores. The most popular one is the Shopify Shops Email Leads Scraper by xmiso_scrapers.

Setup Steps

  1. Create an Apify account — Free tier includes $5 in compute credits monthly.
  2. Find the scraper — Search for "Shopify shops email leads scraper" on Apify's marketplace.
  3. Configure filters — set whichever options the actor exposes (country, niche, maximum results) to narrow the store set.
  4. Run the actor — Results download as JSON or CSV.
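The steps above can also be scripted with Apify's official Python client (`apify-client`), which is handy once you're running the actor on a schedule. The actor slug below is a placeholder (look up the exact ID on the scraper's marketplace page), and the input fields are illustrative assumptions, since each actor defines its own input schema:

```python
API_TOKEN = "your-apify-token"  # placeholder -- substitute your own token
# Hypothetical slug; copy the real one from the actor's marketplace page.
ACTOR_ID = "xmiso_scrapers/shopify-shops-email-leads-scraper"

def build_run_input(max_stores: int = 1000) -> dict:
    # Illustrative input; check the actor's input schema for real field names.
    return {"maxResults": max_stores}

def run_scraper(token: str = API_TOKEN) -> list[dict]:
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    # Start the actor and block until the run finishes.
    run = client.actor(ACTOR_ID).call(run_input=build_run_input())
    # Pull every scraped record from the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())

if __name__ == "__main__":
    for item in run_scraper():
        print(item)
```

The same results are always downloadable as JSON or CSV from the Apify console, so the client is a convenience, not a requirement.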

Cost

Expect around $0.60 per 1,000 stores scraped. Verification adds another $2-4 per 1,000 emails via ZeroBounce or similar services.
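A quick sanity check on those numbers (the $0.60/1,000 scrape figure from above, with $3/1,000 as a midpoint of the $2-4 verification range):

```python
def campaign_cost(n_stores: int, scrape_per_k: float = 0.60,
                  verify_per_k: float = 3.00) -> float:
    """Estimated USD cost to scrape and verify emails for n_stores stores.

    Assumes one email found per store; verify_per_k is a midpoint of the
    $2-4 per 1,000 range quoted above.
    """
    return n_stores / 1000 * (scrape_per_k + verify_per_k)

print(campaign_cost(10_000))  # -> 36.0 ($6 scraping + $30 verification)
```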

Pros and Cons

Apify handles the technical complexity. You don't need to write code or manage proxies. The scraper respects rate limits and handles pagination automatically. On the downside, you're limited by the scraper's configuration options and Apify's pricing structure for high volumes.

Method 2: Custom Python Script

If you want full control, a custom script gives you flexibility that pre-built tools can't match.

Basic Approach

The workflow goes like this: discover Shopify stores through search engines, visit each store's website, extract the contact email from the page HTML, and save results to a CSV file.

Discovery Phase

Use Google's site: operator with Shopify-specific footprints. Every Shopify store starts life on a myshopify.com subdomain, and many remain reachable there even after a custom domain is attached. Searching for site:myshopify.com "contact" "@gmail.com" or site:myshopify.com "contact us" can surface relevant stores.

Alternatively, use Google's Custom Search API to automate discovery at scale.

Extraction Phase

For each discovered URL, use Python with requests and BeautifulSoup to:

  1. Fetch the homepage HTML
  2. Search for email patterns using regex: [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
  3. Check common locations: footer, contact page, meta tags
  4. Visit the /pages/contact route if the homepage doesn't reveal an email
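Steps 1-4 can be sketched as follows. The parsing logic runs on any HTML string, with the network fetch kept separate; the User-Agent string and timeout are arbitrary choices:

```python
import re

from bs4 import BeautifulSoup  # pip install beautifulsoup4

EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def extract_emails(html: str) -> set[str]:
    """Pull candidate emails from page text, mailto: links, and meta tags."""
    soup = BeautifulSoup(html, "html.parser")
    found = set(EMAIL_RE.findall(soup.get_text(" ")))
    for a in soup.select('a[href^="mailto:"]'):
        found.update(EMAIL_RE.findall(a["href"]))
    for meta in soup.find_all("meta", content=True):
        found.update(EMAIL_RE.findall(meta["content"]))
    return found

def scrape_store(base_url: str) -> set[str]:
    """Check the homepage first, then fall back to /pages/contact."""
    import requests  # pip install requests

    headers = {"User-Agent": "Mozilla/5.0 (research script)"}
    for path in ("", "/pages/contact"):
        resp = requests.get(base_url.rstrip("/") + path,
                            headers=headers, timeout=10)
        emails = extract_emails(resp.text)
        if emails:
            return emails
    return set()
```

Writing each result to a CSV row as you go (store URL, email found) completes the workflow described earlier.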

Rate Limiting

Don't hammer servers. Add random delays between 2-5 seconds between requests. Use rotating proxies if you're scraping more than a few hundred stores. Most web hosts will temporarily block IPs making rapid sequential requests.
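A polite-crawl loop implementing that delay might look like this (the 2-5 second window matches the recommendation above):

```python
import random
import time

def polite_delay(low: float = 2.0, high: float = 5.0) -> float:
    """Sleep a random interval between requests; return the delay used."""
    delay = random.uniform(low, high)
    time.sleep(delay)
    return delay

def crawl(urls, fetch, low: float = 2.0, high: float = 5.0) -> dict:
    """Apply fetch() to each URL with a randomized pause between requests."""
    results = {}
    for url in urls:
        results[url] = fetch(url)
        polite_delay(low, high)  # randomized gap defeats naive rate detection
    return results
```

Randomizing the interval, rather than sleeping a fixed 3 seconds, makes the traffic pattern look less mechanical; proxies address the per-IP blocking separately.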

Method 3: Browser Extensions

For manual prospecting at smaller scale, browser-based email finder extensions (Hunter and Snov.io are common choices) work fine: open a store's site, trigger the extension, and copy whatever contact it surfaces.

These tools work well for building lists of 50-200 contacts. Beyond that, the manual effort becomes prohibitive.

Email Verification: Non-Negotiable

Raw scraped emails have a 20-40% bounce rate depending on the source. Sending to unverified addresses will destroy your domain reputation within hours.

Run every list through a verification service (ZeroBounce or a comparable tool) before using it.

Expect to lose 15-25% of scraped emails during verification. Budget accordingly.

Legal Considerations

Scraping publicly available data from websites is generally legal in the US and EU, with some caveats: a site's terms of service may prohibit automated collection, the GDPR treats business emails tied to identifiable individuals as personal data, and anti-spam laws such as CAN-SPAM govern how you may contact the addresses you collect.

When to Buy Instead of Scrape

Building your own scraping pipeline takes technical knowledge, time, and ongoing maintenance. If your core business isn't data collection, the cost of engineering time usually exceeds the cost of buying a verified list.

Pre-verified lists start around $29 per 1,000 contacts. Compare that to the hours spent setting up scrapers, maintaining proxies, running verification, and cleaning data — and the economics of buying become clear for most outreach teams.