We are looking for a versatile developer with experience in both data processing and backend development to join our team. The role involves building data ingestion pipelines, normalizing large datasets, and integrating them into a backend system. You will help design and implement scalable data solutions, keep data up to date in real time, and maintain the web applications that rely on it.
Responsibilities:
Build and maintain data pipelines using Dagster and Python for data ingestion, transformation, and storage (Parquet in S3).
Analyze data from various sources and decide between batch and streaming processing (using tools such as Dagster or Flink).
Ensure data is processed and stored efficiently, keeping it up to date and in sync with business needs.
Develop backend services in Java (JDK 17, Spring Boot or other frameworks) for connecting data pipelines with web applications.
Work with event-driven architectures, ensuring proper handling of customer and transaction data at the entity level.
Optimize Postgres databases, writing and tuning SQL queries for efficient data storage and retrieval.
Support the integration of GraphQL and gRPC APIs to connect services.
Contribute to frontend development using React to support customer-facing features like reporting, analytics, and workflow management.
Collaborate with cross-functional teams to improve product features and build out analytics and reporting capabilities.
Requirements:
Strong experience with Dagster, Python, and Pandas/NumPy for data processing and transformation.
Familiarity with Flink or other real-time data processing tools.
Proficiency in SQL for database management and data manipulation.
Backend development experience with Java, preferably using Spring Boot or similar frameworks.
Experience with AWS.
Comfortable working in a Kubernetes environment.
Familiarity with event-driven architectures and systems like Apache Kafka or similar.
Frontend experience with React.
Generalist approach to development, with a focus on data engineering and backend integration.
Desirable Skills:
Experience working with S3 and Parquet for data storage.
Familiarity with data integration and processing in cloud environments.
Comfortable with API design and integration (GraphQL, gRPC).