Senior Data Engineer with AWS, Python (FastAPI)

DataArt

Remote

B2B
Permanent employment

Hexjobs Insights

DataArt seeks a Senior Data Engineer to design and build data pipelines for a legal data platform. The role requires skills in AWS, Python, and SQL. The position is remote and based in Łódź.

Keywords

data pipelines
AWS
Python
SQL
web scraping
data APIs
Docker
data quality
PostgreSQL
Apache Spark

Client

Our client is a leading legal recruiting company aiming to build a data-driven platform specifically designed for lawyers and law firms. The platform brings everything together in one place: news and analytics, real-time deal and case tracking from multiple sources, firm and lawyer profiles enriched with cross-linked insights, rankings, and more.

Project overview

The platform aggregates data from hundreds of public sources, including law firm websites, deal announcements, legal databases, and media publications, creating a unified ecosystem of structured and interconnected legal data. It combines AI-driven enrichment, automated data processing, and scalable infrastructure to ensure comprehensive and reliable coverage of the legal market.

Position overview

We are seeking a Senior Data Engineer to join our team to design, build, and scale robust data pipelines for collecting, transforming, and structuring large volumes of legal and financial data collected via scrapers. You will collaborate closely with AI/ML engineers, DevOps, Front-end and Back-end teams to ensure smooth and efficient data workflows integral to the platform.

Responsibilities

Design and implement data ingestion pipelines to collect and process structured and unstructured data from multiple online sources (web scraping, APIs, feeds, etc.).
Develop and optimize ETL/ELT workflows using Python and SQL.
Build and orchestrate scalable data workflows leveraging AWS services such as Batch and S3.
Develop and deploy internal data APIs and utilities supporting platform data access and manipulation.
Implement robust text extraction and parsing logic to handle diverse data formats.
Ensure data quality through validation, deduplication, normalization, and lineage tracking across Raw ➝ Curated ➝ Enriched data layers.
Containerize and orchestrate data workloads using Docker and native AWS solutions.
Collaborate closely with AI, Back-end, and Front-end teams to ensure efficient data integration and flow.

Requirements

Experience with AWS services (AWS Batch, S3, Step Functions)
Data Quality experience
AWS Batch and Amazon S3
AWS Step Functions
Amazon SQS
Master Data Management (MDM) experience
Relational databases, specifically PostgreSQL
Proven expertise in Python programming
Solid understanding of the AWS ecosystem
Practical experience with Docker and containerized development workflows
Experience with web scraping, text extraction, or other data-ingestion techniques from diverse online sources
Strong analytical mindset, effective communication skills, and ability to collaborate across multiple teams

Nice to have

Hands-on experience with Apache Spark and SQL for distributed data processing.
Experience with EMR, SageMaker.

Views: 4
Published: 17 days ago
Expires: in 27 days
Contract type: B2B, permanent employment
