Rohail Taimour
Summary
Seasoned Python Software Engineer with a Master's degree in Statistics. Proficient in building applications that follow software best practices such as CI/CD and containerization, and in operating in AWS and Azure cloud environments.
Job History
- Python Backend Engineer at Engie GEMS, Brussels, Belgium | Jan 2024 - Present
- Python Software Engineer at Illumina, Mechelen, Belgium | April 2023 - October 2023
- Machine Learning Engineer at GSK Vaccines, Brussels, Belgium | October 2022 - Feb 2023
- Lead Data Scientist, AI Developer at UCB BioPharma, Brussels, Belgium | August 2016 - October 2022
Relevant Experience
Backend Engineering for Green Energy Trading Platform using Flask and Python Ecosystem
Python Backend Engineer, Engie Green energy Management solutions (GEMS), Brussels, Belgium
Jan 2024 - present
- Support processes for trading wind and solar assets on the European electricity market and for managing the contract lifecycle, acting as first-line support for incidents and feature requests within a team of developers.
- Diagnose incidents using Kibana and Sentry, and cover features end to end by testing client workflows rather than relying on unit tests with mocked objects.
- Implement endpoints to integrate different services, modularizing legacy code to improve maintainability and reliability.
- Enhanced the resilience of a scheduled Celery task that makes an API call by adding a retry mechanism. Extended our internal mock for testing Celery task execution, by monkeypatching Celery internals, so that tests can verify that tasks are retried appropriately.
- Led the migration of two codebases from SQLAlchemy v1 to v2. Updated the contributing guidelines to include preferred querying syntax by conferring with the development team.
- Set up a scheduled GitHub Actions workflow for integration testing across two services and their database and message queue dependencies.
- Increased the reliability and modularity of the CI/CD pipeline in GitHub Actions, preventing pull requests from being merged into the main branch if they fail required steps.
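The retry behaviour mentioned above can be illustrated with a minimal sketch in plain Python. This is not Celery's API and not the production code: the name `call_with_retry`, its parameters, and the exponential backoff policy are assumptions chosen for illustration. The `sleep` argument is injectable so that a test can check the retries without actually waiting.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on transient errors with exponential backoff.

    Hypothetical helper: waits base_delay * 2**attempt seconds between
    attempts and re-raises the last error once max_retries is exhausted.
    """
    attempt = 0
    while True:
        try:
            return func()
        except retry_on:
            if attempt >= max_retries:
                raise  # give up: all retries have been used
            sleep(base_delay * 2 ** attempt)
            attempt += 1


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_api_call() -> str:
        # Simulated API call that fails twice before succeeding.
        calls["count"] += 1
        if calls["count"] < 3:
            raise ConnectionError("temporary failure")
        return "ok"

    recorded_delays = []
    result = call_with_retry(flaky_api_call, sleep=recorded_delays.append)
    print(result, calls["count"], recorded_delays)
```

Passing a recording function as `sleep` plays the same role as the mock described above: the test asserts that a retry was scheduled instead of waiting for it.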
Python solution for launching and managing analysis pipelines and automating report downloads for Customer-Uploaded Data
Python Software Engineer and Data Pipeline Architect, Illumina, Mechelen, Belgium
April 2023 - October 2023
- Designed a Python service that automates the monitoring and processing of customer-uploaded sequencing data, initiating further analysis or report generation based on predefined criteria.
- Implemented a dual-layered approach: the first layer handles the initiation and tracking of analysis pipelines, while the second layer is registered as a Docker image in the analytics backend to perform post-processing on the output files and create comprehensive summary reports for the customer.
- Scheduled the Python service to run every 30 minutes to check for new data and updates, ensuring seamless progression from data upload to final report delivery to the customer environment.
- Implemented comprehensive systems integration, utilizing a combination of CLI tools and API calls for effective coordination and automation across various software components.
- Applied Object-Oriented Programming (OOP) principles to organize API interactions, database interactions and endpoint processing, reducing code duplication and using self-documenting object names.
- Wrote unit tests with pytest and added fail-safe mechanisms for robust error handling.
Automated SQL Script Generation to facilitate PostgreSQL Data Migration in multiple environments
Python Software Engineer, Illumina, Mechelen, Belgium
July-Aug 2023
- Designed and implemented a data ingestion framework to parse and validate input files for generating and validating SQL update statements.
- Tested the generated SQL scripts against mock copies of the production database tables to verify that they run as expected.
- Utilized SQLAlchemy for database schema management, creating and populating mock tables in a test environment to ensure the integrity and functionality of SQL scripts.
- Implemented the solution as a Python package encapsulating the entire data migration logic within a Docker entrypoint for portability and ease of deployment.
- Leveraged Jinja2 templating to generate dynamic, parameterized SQL scripts, enabling the script to adapt seamlessly across different deployment environments, such as Development, Integration, and Production.
Design and implement information retrieval methods using Natural language processing (NLP)
Machine Learning Engineer, IT Supply Quality, GSK Belgium
Oct 2022‑Feb 2023
- Improved performance of information retrieval by 20% on unseen test data using a custom named entity recognition (NER) model built with spaCy.
- Performed POCs in an Azure Databricks environment to improve model performance using rule-based techniques and NER, with annotated data used to train a custom NER model.
- Added text preprocessing features to the NLP pipeline, such as spaCy tokenization, part-of-speech (POS) tagging, better handling of non-English emails, and splitting emails into sentences.
Unit Commitment Solver for Power Grid Optimization via FastAPI
- Developed a REST API using FastAPI for optimizing energy distribution among powerplants based on load requirements and fuel costs.
- Implemented multiple algorithms to solve the unit-commitment problem, considering factors like fuel cost, powerplant efficiency, and environmental constraints.
- Utilized Pydantic for data validation and schema definition, ensuring data integrity and streamlined request handling.
- Packaged and containerized the application using Docker, with detailed documentation and a Dockerfile for easy deployment and scalability.
- Employed pytest, along with Python best practices such as typing and linting.
- Managed project dependencies using Poetry, facilitating efficient workflow and package management.
- Deployed the API service using Uvicorn and integrated a Swagger UI for interactive API documentation and testing.
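As an illustration of the kind of problem this API solves, the sketch below shows a greedy merit-order dispatch, one simple heuristic for the unit-commitment problem. It is an assumption-laden example, not the project's actual algorithm: the `PowerPlant` fields and the `merit_order_dispatch` function are hypothetical, and minimum output levels and environmental constraints are ignored.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PowerPlant:
    """Hypothetical plant description used only for this sketch."""

    name: str
    pmax: float        # maximum output (MWh)
    fuel_cost: float   # fuel price (euros per MWh of fuel)
    efficiency: float  # fraction of fuel energy converted to electricity

    @property
    def marginal_cost(self) -> float:
        # Cost of producing one MWh of electricity.
        return self.fuel_cost / self.efficiency


def merit_order_dispatch(plants: List[PowerPlant], load: float) -> Dict[str, float]:
    """Allocate the load to the cheapest plants first, up to each plant's pmax."""
    remaining = load
    dispatch: Dict[str, float] = {}
    for plant in sorted(plants, key=lambda p: p.marginal_cost):
        output = min(plant.pmax, remaining)
        dispatch[plant.name] = output
        remaining -= output
    if remaining > 1e-9:
        raise ValueError("load exceeds total available capacity")
    return dispatch


if __name__ == "__main__":
    fleet = [
        PowerPlant("gas_old", pmax=80, fuel_cost=20, efficiency=0.4),
        PowerPlant("wind", pmax=50, fuel_cost=0, efficiency=1.0),
        PowerPlant("gas_efficient", pmax=100, fuel_cost=20, efficiency=0.5),
    ]
    print(merit_order_dispatch(fleet, load=120))
```

The greedy ordering is only optimal when plants have no minimum output; handling such constraints typically requires a more involved search or a mixed-integer formulation.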
Yield optimization for batch and continuous production processes using Machine Learning in Python
Lead Data Scientist, Supply and Manufacturing, UCB Switzerland/Belgium
Aug 2020‑Oct 2022
- The production setting proposed by the model led directly to a 20% increase in throughput, yielding recurring annual cost savings of 1.5 million euros
- Analyzed time series data collected from equipment sensors and visually summarized golden batch insights
- Created Bayesian and tree-based regression models to quantify the impact of process changes and predict batch performance
- Performed a thorough model validation and hyperparameter tuning exercise before recommending model insights be tested in a live production environment
- Supported delivery of workshops demystifying the process of conducting AI projects and machine learning to process experts
Marketing Mix Optimization and Customer Segmentation modelling in EU5
AI/ML engineer, Lead Data Scientist, Go to Market/Commercial EU5, US and Japan, UCB
June 2019‑June 2021
- Developed customer segmentation models that identify key segments with high growth and revenue potential, integrating multiple data sources to build a comprehensive view of customer interactions and to investigate the relationship between customer segments and marketing channel responsiveness.
- Optimized marketing resource allocation based on the model’s insights, tailoring marketing strategies to enhance customer engagement and maximize ROI across diverse channels.
- Performed feature engineering using PySpark and validated ingested data using data visualization methods and discussions with subject-matter experts.
- Adapted data science methodologies to address country and product specificities, delivering tailored solutions for up to ten different use cases across various products and countries.
- Developed a Python package with Cookiecutter templates that abstract the complexities of the data science workflow, enabling configurable deployments across diverse scenarios such as different countries and disease areas.
- Enhanced the package to seamlessly wrap over scikit-learn, thereby simplifying key data science tasks from preprocessing to model training and tuning.
- Incorporated MLflow into the package for robust artifact management, allowing for the tracking of model versions, data inputs, and predictions.
Personal projects
Web Scraper to analyse Property Purchase and Rental Trends in Belgium
- Developed a web scraper using Beautiful Soup to collect apartment data such as price and area.
- Used SQLite for data storage, with `Pydantic` for data validation and `SQLAlchemy` for database interactions.
- Encapsulated the concerns into a Python package with dependency management using Poetry.
- Employed Prefect for job orchestration, managing the workflow's scheduling and monitoring of scraping tasks.
Personal Portfolio and blogging website built using Hugo and hosted using GitHub Pages
- Created the website using Hugo and implemented features such as a contact form and visitor comments.
- Hosted the static website on GitHub Pages and automated the deployment process using GitHub Actions.
- Codebase hosted on GitHub
Automated Resume Builder and Continuous Deployment System with GitHub Pages Hosting
- Engineered an automated system for generating, versioning, and hosting a dynamic CV using Markdown, HTML, Jinja templating and CSS.
- Set up a trio of GitHub repositories to separately manage the CV's content, styling, and public hosting on GitHub Pages.
- Developed a Python package for automating the styling and generation of the CV, integrating with Markdown and HTML/CSS.
- Implemented version control for CV content using a private GitHub repository, ensuring secure and organized data management.
- Leveraged GitHub Actions for automating the CV's generation and deployment process, enabling updates through git pushes.
- Hosted the final CV on GitHub Pages, providing a live, online version that can be easily updated