Natural Language Processing Expert (m/f/d)
The project aims to develop a machine learning system for the automated classification of assay protocol documents. The work involves building a model or agent, tracking and improving its performance on challenging, imbalanced classes, and ensuring it is robust and ready for integration into a production environment. The outcome will help scientists focus more on research, accelerate data access for project teams, and improve overall data quality.
The ideal candidate is an expert AI/NLP consultant with hands-on experience fine-tuning Transformer models and strong proficiency in PyTorch. They should have demonstrable experience with multi-task and multi-label classification, expertise in handling severe class imbalance in text data, and proficiency in deploying ML models as REST APIs. Strong software engineering fundamentals are also essential.
Tasks & Responsibilities:
- Review and optimize a transformer architecture and training pipeline.
- Implement and experiment with advanced techniques to improve performance on fields with severe class imbalance.
- Conduct in-depth error analysis to identify patterns in misclassifications and propose data-driven improvements.
- Refine and validate the data processing, label mapping, and stratified data splitting procedures to ensure maximum reliability.
- Collaborate with our software architect to integrate the final, optimized model into a production-ready API for inference.
- Document the final model architecture, training procedures, performance benchmarks, and best practices for future development.
Must Haves:
- Educational background: advanced degree in AI/NLP or a related field
- Min. 3 years hands-on experience fine-tuning Transformer models
- Demonstrable experience with multi-task and multi-label classification problems
- Expertise in handling severe class imbalance in text data
- Proficiency in deploying machine learning models as REST APIs
- Strong proficiency in PyTorch, including creating custom model architectures (e.g., multi-head classifiers) and custom loss functions
- Strong software engineering fundamentals and the ability to write clean, modular, and well-documented Python code
- Experience with Docker
- Professional proficiency in English
- Strong analytical, problem-solving, and collaboration skills
Nice to Haves:
- Direct experience working with biomedical, scientific, or other technical document formats.
- Familiarity with advanced data splitting techniques for multi-label datasets.
- Experience with MLOps principles and tools.
Apply now
Sounds like a great job?
Then we look forward to receiving your complete application documents through our online application form.
When applying by email, the sender agrees that his or her data will be used in accordance with our data privacy policy.
Find more vacancies at: coopers.ch