
Member of Technical Staff – AI Inference Platform, Features

Zürich
Full-time
Permanent employee

Your mission

You will expand the capabilities of Lyceum's AI inference platform, the first EU-sovereign inference cloud. You'll own the features that customers interact with directly: model serving configurations, API surface, framework integrations, and developer experience. This means understanding what customers need, building it fast, and making sure it works reliably at scale.


Your focus

- Feature development: Design and ship new platform capabilities, from supporting new model architectures and serving frameworks to building out API features that customers are asking for.

- Customer-facing engineering: Work closely with customers and the commercial team to understand real-world usage patterns, translate feature requests into technical designs, and iterate based on feedback.

- Developer experience: Improve the end-to-end experience of deploying and running inference on Lyceum, from initial setup through to monitoring and debugging in production.


Your KPIs

- Number of platform features shipped

- Time from customer request to feature availability

- Breadth of supported models, frameworks, and deployment configurations

- Customer feedback on platform usability and capability

Your profile

We consider candidates from diverse backgrounds who share a deep love for technical challenges and the desire to take ownership beyond what's reasonably expected. You're someone who stays close to the rapidly evolving open-source AI ecosystem and gets energy from turning emerging tools into production-grade platform capabilities.


Requirements

- 3+ years of experience in software engineering, with a focus on backend or infrastructure systems

- Strong proficiency in Go and Python

- Hands-on experience with at least one ML inference serving framework (e.g., vLLM, TGI)

- Solid understanding of how large language models and other AI models are deployed and served in production

- Experience working with REST/gRPC APIs and designing developer-facing interfaces


Nice to have

- Familiarity with GPU scheduling, batching strategies, or inference optimisation (quantisation, speculative decoding, etc.)

- Experience with Kubernetes and container orchestration in a production setting

- Knowledge of AI model formats and conversion pipelines (GGUF, SafeTensors, ONNX)

- Background in developer tools, platform engineering, or API design

Why us?

- Outstanding team: Work with some of the best engineers in the world, coming from hedge funds, big tech, AI startups, and top universities

- Once-in-a-lifetime opportunity: Join an early-stage company in the fastest-growing market in the world

- Ownership: Shape how European AI companies access GPU compute

- European mission: Build sovereign, GDPR-compliant AI infrastructure for the next generation of deep-tech


About us

Lyceum is building AI-native GPU infrastructure for the next generation of deep-tech companies. We’re an early-stage team moving fast, obsessed with customer value, and focused on building a category-defining product.