Official Python wrapper for Scrappey.com - Web scraping API
Responsible use: This wrapper is intended for collecting publicly available data in compliance with applicable laws. See the Disclaimer.
- Browser Automation - Full browser control with actions like click, type, scroll
- Session Management - Maintain cookies and state across requests
- Proxy Support - Built-in proxy rotation with country selection
- Async Support - Both sync and async clients included
- Type Hints - Full type annotations for IDE support and AI assistants
- Familiar API - Modeled after the popular requests library
Scrappey offers strong value for web scraping with JavaScript rendering and residential proxies:
| Feature | Description | Scrappey |
|---|---|---|
| Price per 1K Scrapes | JS render + residential proxies | From €1 |
| Concurrent Requests | Simultaneous scraping | Up to 200 |
| Browser Automation | Actions and interactions | 30+ Actions |
| Billing Model | Payment flexibility | Pay-as-you-go |
| Success Rate | Successful scrapes | High |
Why Scrappey?
- Cost-effective for JS rendering at scale
- High concurrency for large workloads
- 30+ browser actions for rich automation
- Pay-as-you-go - no monthly commitments
graph TB
A[Your Application] -->|1. Send Request| B[Scrappey API]
B -->|2. Route Request| C{Request Type?}
C -->|Browser Mode| D[Headless Browser]
C -->|Request Mode| E[HTTP Library + TLS]
D -->|3. Execute| F[Browser Actions]
E -->|3. Execute| G[HTTP Request]
F -->|4. Return| J[HTML/JSON Response]
G -->|4. Return| J
J -->|5. Deliver| A
style A fill:#e1f5ff
style B fill:#4CAF50,color:#fff
style D fill:#2196F3,color:#fff
style E fill:#FF9800,color:#fff
style J fill:#4CAF50,color:#fff
Request Flow:
- Your application sends a request to the Scrappey API
- Scrappey routes to browser or HTTP mode based on requestType
- Browser/HTTP engine executes the request
- Response returned with HTML, JSON, or extracted data
- Delivered back to your application
pip install scrappey

You can provide your Scrappey API key in two ways:
Set the SCRAPPEY_API_KEY environment variable:
Windows (PowerShell):
# Temporary (current session only)
$env:SCRAPPEY_API_KEY = "your_api_key_here"
# Permanent (user-level)
[System.Environment]::SetEnvironmentVariable('SCRAPPEY_API_KEY', 'your_api_key_here', [System.EnvironmentVariableTarget]::User)

Windows (Command Prompt):
REM Temporary (current session only)
set SCRAPPEY_API_KEY=your_api_key_here
REM Permanent (user-level)
setx SCRAPPEY_API_KEY "your_api_key_here"

Linux/macOS (Bash/Zsh):
# Temporary (current session only)
export SCRAPPEY_API_KEY="your_api_key_here"
# Permanent (add to ~/.bashrc or ~/.zshrc)
echo 'export SCRAPPEY_API_KEY="your_api_key_here"' >> ~/.bashrc
source ~/.bashrc

Linux/macOS (Fish):
# Temporary (current session only)
set -x SCRAPPEY_API_KEY "your_api_key_here"
# Permanent (add to ~/.config/fish/config.fish)
echo 'set -x SCRAPPEY_API_KEY "your_api_key_here"' >> ~/.config/fish/config.fish

Or pass the API key directly when creating the client:
from scrappey import Scrappey
scrappey = Scrappey(api_key="your_api_key_here")

Note: Get your API key from https://app.scrappey.com
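The two configuration routes can be combined in application code. The following is a minimal sketch, assuming an explicitly passed key should take precedence over the SCRAPPEY_API_KEY environment variable. `resolve_api_key` is a hypothetical helper, not part of the wrapper, and the wrapper's own lookup order may differ.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Return the API key from an explicit argument or the environment.

    Hypothetical helper for illustration; not part of the scrappey package.
    """
    # An explicit argument wins; otherwise fall back to the environment variable.
    key = (explicit or os.environ.get("SCRAPPEY_API_KEY", "")).strip()
    if not key:
        raise RuntimeError(
            "No API key found: pass one explicitly or set SCRAPPEY_API_KEY"
        )
    return key
```

The returned value could then be passed along as `Scrappey(api_key=resolve_api_key())`.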
from scrappey import Scrappey
# Initialize with your API key
scrappey = Scrappey(api_key="YOUR_API_KEY")
# Simple GET request
result = scrappey.get(url="https://example.com")
print(result["solution"]["response"])
# Don't forget to close the client when done
scrappey.close()

Or use as a context manager:
from scrappey import Scrappey
with Scrappey(api_key="YOUR_API_KEY") as scrappey:
    result = scrappey.get(url="https://example.com")
    print(result["solution"]["statusCode"])

Scrappey supports two request modes with different trade-offs:
| Mode | Description | Cost | Speed |
|---|---|---|---|
| browser | Headless browser (default) | 1 balance | Slower, more capable |
| request | HTTP library with TLS | 0.2 balance | Faster, cheaper |
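The cost column lends itself to a quick back-of-the-envelope estimate. The sketch below assumes the per-request figures in the table (1 balance for browser, 0.2 for request) apply uniformly to every request, which may not hold once proxies or other add-ons are involved. `choose_request_type` and `estimate_balance` are hypothetical helpers, not part of the wrapper.

```python
from fractions import Fraction

# Per-request cost in balance units, taken from the table above.
COST_PER_REQUEST = {"browser": Fraction(1), "request": Fraction(1, 5)}


def choose_request_type(needs_javascript: bool = False,
                        needs_actions: bool = False,
                        needs_screenshot: bool = False) -> str:
    """Pick the cheaper mode unless a browser-only feature is required."""
    if needs_javascript or needs_actions or needs_screenshot:
        return "browser"
    return "request"


def estimate_balance(num_requests: int, request_type: str = "browser") -> float:
    """Estimate the balance consumed by a batch of requests."""
    if num_requests < 0:
        raise ValueError("num_requests must be non-negative")
    # Fractions avoid floating-point drift when multiplying 0.2 by large counts.
    return float(COST_PER_REQUEST[request_type] * num_requests)


print(estimate_balance(1000, "browser"))  # 1000.0
print(estimate_balance(1000, "request"))  # 200.0
```

Under these assumptions, moving a workload that needs no JavaScript, actions, or screenshots to request mode cuts the estimated balance to one fifth.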
Uses a real headless browser. Best for:
- Sites with JavaScript rendering
- Pages that require a full browser environment
- Browser actions and screenshots
# Browser mode is the default
result = scrappey.get(url="https://example.com")

Uses an HTTP library with TLS fingerprinting. Best for:
- Simple API calls
- High-volume scraping
- When you need speed and low cost
Limitations:
- No browser actions - JavaScript execution not available
- No screenshots - Visual rendering not supported
# Request mode - cheaper and faster
result = scrappey.get(url="https://api.example.com", requestType="request")
# Works with all HTTP methods
result = scrappey.post(
    url="https://api.example.com/data",
    postData={"key": "value"},
    requestType="request",
)

import asyncio
from scrappey import AsyncScrappey
async def main():
    async with AsyncScrappey(api_key="YOUR_API_KEY") as scrappey:
        # Parallel requests
        urls = ["https://example1.com", "https://example2.com"]
        results = await asyncio.gather(*[
            scrappey.get(url=url) for url in urls
        ])
        for result in results:
            print(result["solution"]["statusCode"])

asyncio.run(main())

Scrappey provides an interface modeled after the popular requests library.
# Before (using requests)
import requests
response = requests.get("https://example.com")
print(response.text)
# After (using Scrappey) - just change the import!
from scrappey import requests
response = requests.get("https://example.com")
print(response.text)

That's it!
Note: Set the SCRAPPEY_API_KEY environment variable with your API key.
The Response object works much like requests.Response:
from scrappey import requests
response = requests.get("https://httpbin.org/get")
# Standard attributes
print(response.status_code) # 200
print(response.ok) # True
print(response.text) # Response body as text
print(response.content) # Response body as bytes
print(response.headers) # Response headers
print(response.cookies) # Response cookies
print(response.url) # Final URL
print(response.elapsed) # Time elapsed
# Methods
data = response.json() # Parse JSON
response.raise_for_status() # Raise on 4xx/5xx

Sessions maintain cookies and headers across requests:
from scrappey import requests
session = requests.Session()
try:
    # Authenticate
    session.post("https://example.com/login", data={"user": "test"})
    # Subsequent requests include cookies from the session
    response = session.get("https://example.com/dashboard")
    # Session-level headers
    session.headers.update({"Authorization": "Bearer token"})
finally:
    session.close() # Clean up Scrappey session

Or use as a context manager:
from scrappey import requests
with requests.Session() as session:
    session.get("https://example.com")
    # Session automatically closed when exiting

| Parameter | Supported | Notes |
|---|---|---|
| params | Yes | Query parameters |
| data | Yes | Form data |
| json | Yes | JSON data |
| headers | Yes | Custom headers |
| cookies | Yes | Request cookies |
| timeout | Yes | Request timeout |
| proxies | Yes | Proxy configuration |
| request_type | Yes | "browser" (default) or "request" (faster) |
result = scrappey.get(
    url="https://example.com"
)
if result["data"] == "success":
    print("Request completed successfully!")
    print(result["solution"]["response"])

result = scrappey.browser_action(
    url="https://example.com/login",
    actions=[
        {"type": "wait_for_selector", "cssSelector": "#login-form"},
        {"type": "type", "cssSelector": "#email", "text": "user@example.com"},
        {"type": "type", "cssSelector": "#password", "text": "password123"},
        {"type": "click", "cssSelector": "#submit-btn", "waitForSelector": ".dashboard"},
        {"type": "execute_js", "code": "document.querySelector('.user-name').innerText"},
    ],
)
# Get JavaScript return values
print(result["solution"]["javascriptReturn"])

# Form data
result = scrappey.post(
    url="https://httpbin.org/post",
    postData="username=user&password=pass",
)
# JSON data
result = scrappey.post(
    url="https://api.example.com/data",
    postData={"key": "value"},
    customHeaders={"Content-Type": "application/json"},
)

result = scrappey.screenshot(
    url="https://example.com",
    width=1920,
    height=1080,
)
# Save screenshot
import base64
with open("screenshot.png", "wb") as f:
    f.write(base64.b64decode(result["solution"]["screenshot"]))

Copy directly from the Request Builder:
result = scrappey.request({
    "cmd": "request.get",
    "url": "https://example.com",
    "browserActions": [
        {"type": "wait", "wait": 2000},
        {"type": "scroll", "cssSelector": "footer"}
    ],
    "screenshot": True
})

Scrappey(
    api_key: str, # Your API key (required)
    base_url: str = "...", # API URL (optional)
    timeout: float = 300, # Request timeout in seconds
)

| Method | Description |
|---|---|
| get(url, **options) | Perform GET request |
| post(url, postData, **options) | Perform POST request |
| put(url, postData, **options) | Perform PUT request |
| delete(url, **options) | Perform DELETE request |
| patch(url, postData, **options) | Perform PATCH request |
| request(options) | Send request with full options dict |
| create_session(**options) | Create a new session |
| destroy_session(session) | Destroy a session |
| browser_action(url, actions, **options) | Execute browser actions |
| screenshot(url, **options) | Capture screenshot |
| Option | Type | Description |
|---|---|---|
| requestType | str | "browser" (default) or "request" (faster, cheaper) |
| session | str | Session ID for state persistence |
| proxy | str | Custom proxy (http://user:pass@ip:port) |
| proxyCountry | str | Proxy country (e.g., "UnitedStates") |
| premiumProxy | bool | Use premium residential proxies |
| mobileProxy | bool | Use mobile carrier proxies |
| browserActions | list | Browser automation actions |
| screenshot | bool | Capture screenshot |
| cssSelector | str | Extract content by CSS selector |
| customHeaders | dict | Custom HTTP headers |
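These options compose naturally into a single dict of the kind accepted by request(options). The following is a minimal sketch, assuming the option names from the table and the "cmd" key seen in the Request Builder example. `build_get_options` is a hypothetical helper, not part of the wrapper, and it performs no network call.

```python
from typing import Any, Dict, Optional


def build_get_options(url: str,
                      request_type: str = "browser",
                      proxy_country: Optional[str] = None,
                      session: Optional[str] = None,
                      custom_headers: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
    """Assemble an options dict for a GET request (hypothetical helper)."""
    if request_type not in ("browser", "request"):
        raise ValueError(f"Unknown request type: {request_type!r}")
    options: Dict[str, Any] = {
        "cmd": "request.get",
        "url": url,
        "requestType": request_type,
    }
    # Only include optional keys that were actually provided.
    if proxy_country is not None:
        options["proxyCountry"] = proxy_country
    if session is not None:
        options["session"] = session
    if custom_headers:
        options["customHeaders"] = dict(custom_headers)
    return options


options = build_get_options(
    "https://example.com",
    request_type="request",
    proxy_country="UnitedStates",
)
print(options)
```

The resulting dict could then be handed to `scrappey.request(options)`; keeping construction separate from sending makes the options easy to log or unit-test.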
{
    "solution": {
        "verified": True,
        "response": "<html>...</html>",
        "statusCode": 200,
        "currentUrl": "https://example.com",
        "cookies": [...],
        "cookieString": "session=abc; token=xyz",
        "userAgent": "Mozilla/5.0...",
        "screenshot": "base64...",
        "javascriptReturn": [...],
    },
    "timeElapsed": 1234,
    "data": "success", # or "error"
    "session": "session-id",
    "error": "error message if failed"
}

Examples are provided for:
- Python - examples/python/
- Node.js - examples/nodejs/
- TypeScript - examples/typescript/
- Go - examples/go/
- Java - examples/java/
- C# - examples/csharp/
- PHP - examples/php/
- Ruby - examples/ruby/
- Rust - examples/rust/
- Kotlin - examples/kotlin/
- cURL - examples/curl/
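Because every call returns the same dictionary shape (a "data" status alongside a "solution" payload), unpacking it can be centralized in one place. A minimal sketch, assuming the "data", "error", and "solution" fields behave as described in this README. `extract_html` and `ScrapeFailed` are hypothetical names, not part of the wrapper.

```python
from typing import Any, Dict


class ScrapeFailed(Exception):
    """Raised when the API reports a non-success result (hypothetical)."""


def extract_html(result: Dict[str, Any]) -> str:
    """Return the response body, or raise if the API reported an error."""
    if result.get("data") != "success":
        raise ScrapeFailed(result.get("error", "Unknown error"))
    return result["solution"]["response"]


# Hand-written sample mirroring the documented response structure.
sample = {
    "data": "success",
    "solution": {"response": "<html>ok</html>", "statusCode": 200},
}
print(extract_html(sample))  # <html>ok</html>
```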
The API returns errors in the response body. Check the data field:
result = scrappey.get(url="https://example.com")
if result["data"] == "success":
    html = result["solution"]["response"]
else:
    error = result.get("error", "Unknown error")
    print(f"Request failed: {error}")

Client-side errors raise exceptions:
from scrappey import (
    ScrappeyError,
    ScrappeyConnectionError,
    ScrappeyTimeoutError,
    ScrappeyAuthenticationError,
)
try:
    result = scrappey.get(url="https://example.com")
except ScrappeyConnectionError:
    print("Could not connect to API")
except ScrappeyTimeoutError:
    print("Request timed out")
except ScrappeyAuthenticationError:
    print("Invalid API key")
except ScrappeyError as e:
    print(f"API error: {e}")

Visit https://scrappey.com/pricing for detailed pricing information and plans.
Key Benefits:
- Pay-as-you-go - Only pay for what you use
- No monthly commitments - Cancel anytime
- Transparent pricing - See costs before you scrape
- Volume discounts - Better rates for high-volume users
- Website: https://scrappey.com
- Documentation: https://wiki.scrappey.com/getting-started
- Request Builder: https://app.scrappey.com/#/builder
- API Reference: https://wiki.scrappey.com/api-reference
- Pricing: https://scrappey.com/pricing
- GitHub: https://github.com/pim97/scrappey-wrapper-python
MIT License - see LICENSE for details.
Please ensure that your web scraping activities comply with each website's terms of service and applicable laws and regulations. Scrappey is intended for lawful use only, and is not responsible for any misuse. Always obtain proper authorization before scraping, respect robots.txt and rate limits, and handle any collected data responsibly.