
🕷️ LinkedIn Spider


A professional CLI tool for scraping LinkedIn profiles via Google Search

📦 PyPI Package Name: This project is available on PyPI as linkedin-tarantula because linkedin-spider was already taken by another project. The GitHub repository remains linkedin-spider.


📖 Overview

LinkedIn Spider is a powerful, user-friendly command-line tool that helps you collect and analyze LinkedIn profiles at scale. By leveraging Google Search instead of direct LinkedIn scraping, it significantly reduces the risk of account restrictions while providing comprehensive profile data.

✨ Features

  • 🔍 Smart Search - Find profiles via Google Search to avoid LinkedIn rate limits
  • 🎨 Beautiful CLI - Interactive arrow-key menu navigation with ASCII art
  • 📊 Data Export - Export to CSV, JSON, or Excel formats
  • 🔐 Secure - Environment-based configuration for credentials
  • 🌐 VPN Support - Optional IP rotation for enhanced privacy
  • ⚡ Fast & Efficient - Progress tracking and batch processing
  • 🛡️ Anti-Detection - Random delays, user agents, and human-like behavior
  • 🤖 CAPTCHA Handler - Automatic CAPTCHA detection with auto-resume
  • 🎮 Interactive Menu - Navigate with arrow keys (↑↓) and Enter
  • 💾 Resume Support - Interrupt and resume scraping sessions
  • 🔍 Dry Run Mode - Preview URLs before scraping
  • 📊 Smart Statistics - Detailed progress and success rate tracking
  • 🔧 Debug Mode - Verbose logging for troubleshooting

📦 Installation

Option 1: PyPI Installation (Recommended)

Note: The PyPI package is named linkedin-tarantula (not linkedin-spider) because the latter name was already taken.

# Install from PyPI
pip install linkedin-tarantula

# Or with Excel export support
pip install linkedin-tarantula[excel]

This installs the linkedin-spider command globally.

Option 2: Quick Install from Source

# Clone the repository
git clone https://github.com/alexcolls/linkedin-spider.git
cd linkedin-spider

# Run the installation script
./install.sh

The installation script provides three options:

  1. System Installation - Installs globally as linkedin-spider command
  2. Development Installation - Installs locally with Poetry for testing
  3. Both - Installs both system and development modes

Option 3: Development Installation

# Install from source with Poetry
poetry install

# Optional: Install with Excel support
poetry install -E excel

# Activate the virtual environment
poetry shell

Option 4: Install from GitHub (Direct)

# Install directly from GitHub
pip install git+https://github.com/alexcolls/linkedin-spider.git

# Or with Excel support
pip install "linkedin-tarantula[excel] @ git+https://github.com/alexcolls/linkedin-spider.git"

⚙️ Configuration

1. Environment Variables

cp .env.sample .env
# Edit .env with your LinkedIn credentials

2. Configuration File

Edit config.yaml for advanced settings (delays, VPN, export format, etc.)
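For illustration, a config.yaml covering those settings might look like the fragment below. The key names are assumptions made for this sketch, not the project's actual schema; consult the config.yaml shipped with the repository for the real options.

```yaml
# Illustrative fragment only -- key names are assumptions, not the real schema
scraper:
  min_delay_seconds: 3     # lower bound for random anti-detection delays
  max_delay_seconds: 8     # upper bound
vpn:
  enabled: false           # optional IP rotation
export:
  format: csv              # csv, json, or xlsx
```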

🎯 Usage

Quick Start

# If installed from PyPI (pip install linkedin-tarantula)
linkedin-spider

# If installed from source with system mode
linkedin-spider

# If installed with development mode
./run.sh

# Or with Poetry directly
poetry run python -m linkedin_spider

Interactive Mode

The CLI provides an interactive menu with ASCII art and arrow-key navigation:

linkedin-spider  # or ./run.sh for development

Navigation:

  • Use ↑↓ arrow keys to navigate
  • Press Enter to select
  • Or type the number directly

Menu options:

  1. 🔍 Search & Collect Profile URLs
  2. 📊 Scrape Profile Data
  3. 🤝 Auto-Connect to Profiles
  4. 📁 View/Export Results
  5. ⚙️ Configure Settings
  6. ❓ Help
  7. 🚪 Exit

Command-Line Mode

# Search for profiles
linkedin-spider search "Python Developer" "San Francisco" --max-pages 10

# Scrape profiles from file
linkedin-spider scrape --urls data/profile_urls.txt --output results --format csv

# Dry run - preview URLs without scraping
linkedin-spider scrape --urls data/profile_urls.txt --dry-run

# Resume interrupted scraping
linkedin-spider scrape --resume session_20250110_143022

# Custom session name with debug mode
linkedin-spider scrape --urls data/profile_urls.txt --session my-scrape --debug

# List active sessions
linkedin-spider sessions

# Clean up session files
linkedin-spider sessions --cleanup

# Show version
linkedin-spider version

🚀 Advanced Features

Resume Interrupted Scraping

If your scraping session is interrupted (Ctrl+C, network issues, etc.), you can resume exactly where you left off:

# Start scraping
linkedin-spider scrape --urls data/profile_urls.txt

# If interrupted, list sessions to find your session name
linkedin-spider sessions

# Resume from where you left off
linkedin-spider scrape --resume session_20250110_143022

Progress is saved after every profile, so you never lose work!
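Conceptually, the resume behavior is a checkpoint loop: record each completed URL so a rerun can skip it. The sketch below illustrates the idea only; the function names and on-disk format are invented for this example and are not linkedin-spider's actual session format.

```python
# Conceptual sketch of "save after every profile" resume logic; the real
# session format used by linkedin-spider is internal and may differ.
import json
from pathlib import Path

def scrape_with_resume(urls, session_file, scrape_one):
    """Scrape urls, recording each completed URL so a rerun can skip it."""
    path = Path(session_file)
    done = set(json.loads(path.read_text())) if path.exists() else set()
    for url in urls:
        if url in done:
            continue                       # already scraped in a previous run
        scrape_one(url)
        done.add(url)
        path.write_text(json.dumps(sorted(done)))  # persist after every profile
```

Because the checkpoint is written after each profile, an interrupted run restarts exactly at the first unfinished URL.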

Dry Run Mode

Preview what will be scraped before starting:

linkedin-spider scrape --urls data/profile_urls.txt --dry-run

This shows:

  • ✅ List of URLs to scrape
  • ✅ Total count
  • ✅ No actual scraping or login

Debug Mode

Get detailed logs for troubleshooting:

linkedin-spider scrape --urls data/profile_urls.txt --debug

Session Management

View all active progress sessions:

# List all sessions with statistics
linkedin-spider sessions

# Clean up completed sessions
linkedin-spider sessions --cleanup

Custom Session Names

Use memorable session names instead of timestamps:

linkedin-spider scrape --urls data/profile_urls.txt --session my-project-name
linkedin-spider scrape --resume my-project-name

🗑️ Uninstallation

To remove LinkedIn Spider from your system:

./uninstall.sh

This will:

  • Remove the system command (if installed)
  • Clean up Poetry virtual environments
  • Optionally remove .env and data files

🔧 Key Features Explained

CAPTCHA Handling

LinkedIn Spider automatically detects and handles Google CAPTCHA challenges:

  • Automatic Detection: Instantly detects when CAPTCHA appears
  • Clear Instructions: Shows what to do in the terminal
  • Auto-Resume: Automatically continues when CAPTCHA is solved (no manual Enter press needed!)
  • Progress Updates: Shows elapsed time every 10 seconds
  • Smart Polling: Checks every 2 seconds for resolution
  • Timeout Protection: 5-minute maximum wait with fallback
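The polling behavior described above (check every 2 seconds, report every 10, give up after 5 minutes) can be sketched as a small wait loop. This is an illustration of the technique, not the project's actual code; the function and parameter names are invented for the example.

```python
# Sketch of the auto-resume CAPTCHA wait loop described above. The real
# implementation lives inside linkedin-spider; names here are illustrative.
import time

def wait_for_captcha(is_solved, timeout=300, poll=2, report=10,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_solved() every `poll` seconds, print elapsed time every
    `report` seconds, and fall back after `timeout` seconds."""
    start = clock()
    next_report = report
    while True:
        elapsed = clock() - start
        if is_solved():
            return True                    # auto-resume: no key press needed
        if elapsed >= timeout:
            return False                   # timeout protection fallback
        if elapsed >= next_report:
            print(f"Waiting for CAPTCHA... {int(elapsed)}s elapsed")
            next_report += report
        sleep(poll)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting, which is also why the sketch takes them as parameters.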

Data Directory

All data is saved in the data/ folder in your current working directory:

  • Profile URLs: data/profile_urls.txt
  • Exported profiles: data/profiles_YYYYMMDD_HHMMSS.csv, .json, or .xlsx
  • Logs: logs/linkedin-spider.log
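Because exports are plain files, they are easy to post-process. The sketch below reads a CSV export with Python's standard library; the column names ("name", "headline", "url") are assumptions for the example, so check the header row of your actual export before relying on them.

```python
# Sketch of post-processing an export with the standard library. Column
# names are assumptions -- check the header row of your actual export.
import csv
from pathlib import Path

def load_profiles(path):
    """Read a profiles CSV exported by linkedin-spider into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

# Stand-in for a real data/profiles_*.csv export:
sample = Path("profiles_sample.csv")
sample.write_text(
    "name,headline,url\n"
    "Ada,Python Developer,https://www.linkedin.com/in/ada\n"
)
profiles = load_profiles(sample)
print(f"Loaded {len(profiles)} profile(s)")  # -> Loaded 1 profile(s)
```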

⚠️ Legal & Ethical Considerations

  • Terms of Service: This tool is for educational purposes. Always comply with LinkedIn's Terms of Service.
  • Rate Limiting: Use appropriate delays to avoid overwhelming servers.
  • Privacy: Respect privacy. Only collect publicly available information.
  • Usage: Use this tool responsibly and ethically.

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

Built with Selenium, Typer, Rich, and Poetry.


📞 Support

For questions, bug reports, or feature requests, please open an issue on the GitHub repository.

⭐ Show Your Support

If this project helped you, please consider:

  • ⭐ Starring the repository
  • 🐛 Reporting bugs
  • 💡 Suggesting features
  • 🤝 Contributing code
  • 📢 Sharing with others

                                                |
                                                |
                                                |
                                                |
                                                |
                                                |
                                                |
                                    ____        |              ,
                                   /---.'.__    |        ____//
                                        '--.\   |       /.---'
                                   _______  \\  |      //
                                 /.------.\  \| |    .'/  ______
                                //  ___  \ \ ||/|\  //  _/_----.\__
                               |/  /.-.\  \ \:|< >|// _/.'..\   '--'
                                  //   \'. | \'.|.'/ /_/ /  \\
                                 //     \ \_\/" ' ~\-'.-'    \\
                                //       '-._| :H: |'-.__     \\
                               //           (/'==='\)'-._\     ||
                               ||                        \\    \|
                               ||                         \\    '
                               |/                          \\
                                                            ||
                                                            ||
                                                            \\
                                                             '
               ╔════════════════════════════════════════════════════╪═╪═╪══════════╗
               ║                                                                   ║
               ║    ██╗     ██╗███╗   ██╗██╗  ██╗███████╗██████╗ ██╗███╗   ██╗     ║
               ║    ██║     ██║████╗  ██║██║ ██╔╝██╔════╝██╔══██╗██║████╗  ██║     ║
               ║    ██║     ██║██╔██╗ ██║█████╔╝ █████╗  ██║  ██║██║██╔██╗ ██║     ║
               ║    ██║     ██║██║╚██╗██║██╔═██╗ ██╔══╝  ██║  ██║██║██║╚██╗██║     ║
               ║    ███████╗██║██║ ╚████║██║  ██╗███████╗██████╔╝██║██║ ╚████║     ║
               ║    ╚══════╝╚═╝╚═╝  ╚═══╝╚═╝  ╚═╝╚══════╝╚═════╝ ╚═╝╚═╝  ╚═══╝     ║
               ║                                                                   ║
               ║    ███████╗██████╗ ██╗██████╗ ███████╗██████╗      ^.-.^          ║
               ║    ██╔════╝██╔══██╗██║██╔══██╗██╔════╝██╔══██╗    '^\+/^`         ║
               ║    ███████╗██████╔╝██║██║  ██║█████╗  ██████╔╝    '/`"'\`         ║
               ║    ╚════██║██╔═══╝ ██║██║  ██║██╔══╝  ██╔══██╗                    ║
               ║    ███████║██║     ██║██████╔╝███████╗██║  ██║                    ║
               ║    ╚══════╝╚═╝     ╚═╝╚═════╝ ╚══════╝╚═╝  ╚═╝                    ║
               ║    ═══╪══╪═╪═╪═══╪════════════════════════════════════════════    ║
               ║                                                                   ║
               ║               Professional Network Profile Scraper                ║
               ║                ━━━ Weaving Through Networks ━━━                   ║
               ║                                                                   ║
               ╚═══════════════════════════════════════════════════════════════════╝

Made with ❤️ and 🐍 Python
Get LinkedIn profiles at scale

© 2022 LinkedIn Spider | MIT License
