Slack is expanding its Core Infrastructure team. We are looking for experts to drive the way we architect, deliver, and operate the services that run at the heart of our infrastructure.
Teams of all sizes — from the world’s largest public companies to the smallest of startups — use Slack to get work done. We work at a tremendous scale, aiming for “five nines” of uptime for all services.
- 1M Slack messages sent per minute via the API
- 300M events per minute broadcast to Slack clients
- 90B database queries per day
- 6B background jobs enqueued per day
- 2T metrics collected per day
Our systems provide the trust and scale that people all over the world rely on to communicate and collaborate. They also support internal capabilities such as the company's monitoring fabric and detection/response security systems. Our Core Infrastructure team builds and maintains the lower levels of our stack, including:
- Edge services
- Data stores and caches
- Real-time messaging and event fanout
- Asynchronous background job processing
We’ve done our job correctly when none of our users think about us, much like a vital utility. We don’t typically ship new user-facing features, but rather ensure our systems are exceptionally performant, highly available, reliable, and scalable. In other words, we make Slack work seamlessly. Slack's API and web backend are built on PHP/Hack, and our backend services are written in Java and Go. Our data infrastructure is built on Kafka, Hadoop, Hive, Presto, Spark, and MySQL/Vitess.
The Data Streaming Team
We are forming a new Data Streaming team that will focus on providing Kafka as a service for the company, at a scale of trillions of messages per day across dozens of clusters in Amazon data centers. The team will work on extensions to the Kafka ecosystem that bridge clusters and store streams, and on highly reliable automation for tuning, operating, and scaling Kafka clusters.
As the manager of a growing team, you will be responsible for building and operating distributed services that work with massive amounts of data. You will have the opportunity to explore cutting-edge open source and proprietary technologies to meet the ever-changing needs of the company’s growth. You will drive innovation at the intersection of distributed computing, security, and data science.
Required skills
- 3+ years of experience managing people
- Track record of managing the performance and potential of employees to deliver results as a team
- Experience with and passion for service ownership, including building reliable, self-healing services
- Experience with distributed systems, with exposure to Kubernetes, Kafka, and real-time streaming and analytics systems
- Experience in cloud computing technologies (AWS, GCP, Azure)
Preferred skills
- Experience building and running distributed services at scale
- History of driving build vs. buy decisions, and a track record of integrating open source and proprietary stacks
- Experience with Scrum or other agile development methodologies, with attention to code quality and delivering secure code
- Experience crafting multi-year strategy and driving cross-team alignment behind it
Come join us!
Slack is registered as an employer in many, but not all, states. If you are not located in or able to work from a state where Slack is registered, you will not be eligible for employment. Visa sponsorship may not be available in certain remote locations.
Visa sponsorship is not available for candidates living outside the country of this position.