Phillip J. Paine, PhD. CV e-mail: pmxpp88@gmail.com

Data scientist with over 5 years of experience working in both start-up and larger organisations, in addition to 2 years as a statistical consultant and 2 years as a post-doctoral researcher in applied statistics.

Programming Languages: Python, R, Java, PySpark, Matlab
Data Science Skills: SQL, Cloud Services (AWS + GCP), Visualisation Tools (PowerBI, Dash, Shiny), Kubernetes, DB Services (Postgres, SQLite)

Professional Experience

Alloy.ai Data Scientist - Forecasting (May 2022 - Current)

Primarily responsible for improving and maintaining an end-to-end forecasting engine in a continuous-integration environment, and for developing the team roadmap in coordination with Managers, Product, Marketing and Sales teams.

  • Increased demand forecast accuracy by over 10% on industry-standard metrics using traditional time series methods and modern machine learning techniques across the forecasting pipeline in a continuous-integration environment.
  • Designed and implemented a testing framework for evaluating proposed forecast engine improvements, enabling faster iterations on vendor forecast backtesting for use by customer support and sales teams.
  • Contributed to the forecast roadmap vision by creating long-term objectives and developing short-term projects in coordination with Product, Sales and C-suite personnel, with the aim of simplifying the offering to deliver forecast improvements.

BCAA Data Scientist - Underwriting Analytics (May 2020 - Apr 2022); Senior Data Analyst - Underwriting Analytics (May 2019 - May 2020)

Main duties included developing best-in-class insurance pricing models, creating BI tools for use by C-suite, Marketing and Finance teams, and providing statistical analysis for marketing campaigns across the insurance product.

  • Established an end-to-end data pipeline across insurance products, increasing the visibility of business information across multiple teams including C-suite. Leveraged PySpark, AWS Glue and PowerBI to create a reliable BI framework across the organisation.
  • Implemented early detection of large insurance claim clusters to create awareness of potential extreme event risks, using pre-trained word embeddings with CNN models implemented in TensorFlow to predict insurance claim outcomes from early customer conversation transcripts.
  • Improved assessment of residential insurance risk, leading to a more competitive pricing structure, through more accurate prediction of policy counts using ARIMA time series models and of policy costs using ML algorithms including K-Nearest Neighbours and GBMs. Built Shiny dashboards in R to demonstrate the business value of proposed changes to the executive finance team.

University of Sheffield Statistical Consultant (Sept 2016 - Dec 2018)

Statistical consultant working with external research institutes to provide statistical analysis, reports on best practices and advanced courses for customers.

Bayesian Analysis of Clonogenic Survival Assay Data: Used Python and R to create a dose-response curve with uncertainty contours from data on the irradiation of cancer cells. A Bayesian hierarchical model was used to create a heat map around the mean dose-response curve, quantifying the level of uncertainty from multiple sources of error. The outcome was a paper published in the Journal of Radiotherapy and Oncology.

University of Nottingham Post-doctoral Research Fellow - Applied Statistics (Sept 2014 - Sept 2016)

The purpose of the grant was to develop novel regression methods for manifold-valued data and landmark data. Applications of the work include predicting weather patterns and modelling vector-cardiogram signals.

Education

University of Nottingham PhD Statistics

Selected Publications

Teaching Experience

University of Nottingham Lecturer for “Topics in Statistics”