We are seeking a skilled and motivated Data Engineer to join our team. In this role, you will design, implement, and maintain the data infrastructure that supports our B2B intelligence platform.
Responsibilities
- Design, build, and maintain scalable data pipelines for collecting, processing, and storing large volumes of business data
- Develop ETL processes to integrate data from various sources, including web scraping, APIs, and third-party data providers
- Implement data quality checks and monitoring systems to ensure data accuracy and integrity
- Optimize data storage and retrieval processes for high performance and scalability
- Collaborate with data scientists to develop and deploy machine learning models in production environments
- Work with the backend team to design and implement APIs for data access
- Implement data security and privacy measures to protect sensitive information
- Stay up to date with the latest big data technologies and best practices
Requirements
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
- 4+ years of experience in data engineering roles
- Strong programming skills in Python, Scala, and/or Java
- Expertise in SQL and experience with NoSQL databases (e.g., MongoDB, Cassandra)
- Proficiency with big data technologies such as Apache Spark, Hadoop, and Kafka
- Experience with cloud platforms (AWS, GCP, or Azure) and their data services
- Familiarity with data warehousing concepts and ETL processes, plus hands-on experience with data warehousing solutions (e.g., Amazon Redshift, Snowflake)
- Knowledge of data modelling, data architecture, and data pipeline design
- Experience with version control systems (e.g., Git) and CI/CD practices
- Excellent problem-solving skills and attention to detail
Preferred Qualifications
- Experience in the B2B data or sales intelligence industry
- Familiarity with web scraping techniques and tools
- Knowledge of data privacy regulations (e.g., GDPR, CCPA)
- Experience with real-time data processing and streaming architectures