The data infrastructure team is responsible for all things data, including the data lake, data pipelines, distributed query engines, experimentation framework, data dashboards, and visualization tools. The data ecosystem is composed of open-source technologies such as Kafka, Hive, Trino, Spark, and Airflow, along with a few internally built systems. As a member of the team, you would spend time designing and scaling infrastructure, improving system reliability, working closely with other teams to improve existing tooling or create new tooling, and promoting the correct use of data across the organization. We are looking for someone excited by the prospect of optimizing, enhancing, or even redesigning data architecture and pipelines to support next-generation data initiatives.
Responsibilities
1. Design and implement scalable, efficient data architectures that meet the data needs of data scientists, engineers, and business partners
2. Identify and implement improvements to system cost efficiency, reducing data storage and processing costs
3. Bring your expertise to help model structured and unstructured data. Own these data models at a high level and act as a data consultant for partner teams
4. Champion best practices for system usage and uphold high code-quality standards
5. Participate in the team’s on-call rotation, helping resolve reliability incidents in a timely manner
Perks
• Comprehensive medical coverage for employees and dependents (100% covered for employee-only plans)
• Generous paid parental leave (country-dependent)
• Employee Assistance Program for mental health, legal, financial, and counseling support
• Equity (RSUs) for eligible employees
• Generous PTO and paid holidays
• Professional development support for training, licenses, and books
• Employee activity clubs (language learning, trivia, chess, and more)
• Work from almost anywhere (remote-first)
• Flexible working hours with async collaboration outside coordination hours
• Home office & wellness stipend
• Optional coworking space access (short- or long-term)