Author of “Software Engineering for Data Scientists”, a guide for data scientists who want to level up their coding skills, published by O’Reilly in May 2024. I’m currently consulting for GenAI startups and looking for my next role bridging the software engineering and data science spaces.
Previously, I was a Principal Data Scientist at SAP Concur. I have extensive experience deploying NLP models to production and evaluating machine learning models. I’m also co-author of the book “Building Machine Learning Pipelines”, published by O’Reilly in 2020. I’ve given talks at many conferences, including PyCon US, PyData Global and the Grace Hopper Conference.
2023-present: Consultant and author, self-employed
- Authored “Software Engineering for Data Scientists” (O’Reilly, 2024).
- Developed methods to evaluate LLM applications for a GenAI startup.
- Developed a workflow for AI-powered apps for a developer tools company and wrote it up as a tutorial.
2021-2023: Principal Data Scientist, SAP Concur
- Developed NLP models for extracting expense information from receipts and deployed them to production, including upgrading RNN models to Transformer models.
- Led development of internal processes for MLOps and model retraining.
- Created a new product feature to calculate carbon emissions from business travel receipts: carried out research, found product fit, built a prototype, and provided technical leadership on the ML backend. Patent filed in 2021.
- Built a full-stack data labeling tool using Python and JavaScript to allow the team to inspect their data more easily.
2018-2021: Senior Data Scientist, SAP Concur
- Data scientist for Concur Labs, a small innovation team trying out new technologies and making recommendations to the rest of the company. Consulted with product teams on how best to apply ML.
- Co-authored “Building Machine Learning Pipelines” published by O’Reilly in 2020.
- Demonstrated proofs of concept in the areas of privacy-preserving ML, MLOps, and active learning for ML.
- Gave conference talks at Grace Hopper Conference, PyCon US, and PyData Global.
2017-2018: Data Scientist, SAP
- Explored the effects of data anonymization on machine learning.
- Worked on proof-of-concept projects building ML models across industries including agriculture and retail.
2016: Data Scientist, SAP Concur
- Applied deep learning to the extraction of information from receipt text and achieved a 15% error reduction.
2011-2014: Geophysicist, Cairn Energy
- Analyzed and interpreted various types of geophysical data to drive decisions on where to drill wells.
- Evaluated global data on oil industry success rates and visualized the results.