We are seeking a skilled and innovative Python Developer (Web Scraping Specialist) to join our data acquisition team. The ideal candidate will develop and maintain robust web scraping systems that collect high-quality business data from a variety of online sources.
Responsibilities
- Design, develop, and maintain efficient and scalable web scraping systems
- Create and optimize web crawlers to extract data from diverse websites and web applications
- Implement techniques to bypass anti-scraping measures and ensure consistent data collection
- Develop strategies to handle dynamic content, AJAX-loaded data, and complex website structures
- Ensure that all web scraping activities are legal and ethical
- Monitor and maintain the performance and reliability of scraping systems
- Collaborate with data engineers to integrate scraped data into our data pipeline
- Stay updated on scraping techniques
- Troubleshoot and resolve issues related to data collection and crawler performance
- Document scraping processes and maintain code repositories
Requirements
- Bachelor’s degree in Computer Science, Software Engineering, or a related field
- 3+ years of experience in web scraping and data extraction
- Strong programming skills in Python, with expertise in scraping libraries (e.g., Scrapy, Beautiful Soup, Selenium)
- Proficiency in HTML, CSS, and JavaScript
- Experience handling CAPTCHAs, IP rotation, and other techniques for working around anti-bot detection
- Familiarity with proxies, VPNs, and other tools to maintain anonymity while scraping
- Knowledge of web protocols (HTTP/HTTPS) and experience with RESTful APIs
- Understanding of web architecture and common web technologies
- Ability to reverse-engineer websites and web applications
- Experience with version control systems (e.g., Git)
- Strong problem-solving skills and attention to detail
Preferred Qualifications
- Experience with cloud-based scraping solutions (e.g., AWS, GCP)
- Knowledge of distributed scraping systems and scalable architectures
- Familiarity with data privacy regulations and ethical scraping practices
- Experience with scraping social media platforms and professional networks
- Understanding of B2B data structures and business information
- Knowledge of additional programming languages (e.g., JavaScript, Go)
- Experience with database systems (SQL and NoSQL)