I am a PhD statistician who enjoys programming (particularly with Julia) for difficult optimization and machine learning problems. My niche is the intersection of statistics and computer science, which allows me to quickly translate whiteboard math into efficient programs. During my PhD years I researched on-line algorithms for statistics (single-pass algorithms for streaming and big data), an underused paradigm where statistics/models can be updated on new batches of data without revisiting past observations (see OnlineStats.jl). I am a research scientist, data scientist, machine learning engineer, and software engineer. I contribute to a variety of open source data science tools, some of which can be found here: https://github.com/joshday.
One Click Tuner: Chromatic musical instrument tuner for iOS.
TrendSpot: REST API and web app dashboard for agentless monitoring.
OnlineStats: Single-pass algorithms for statistics.
AverageShiftedHistograms: Kernel density estimation for big data.
SparseRegression: Penalized (Ridge, LASSO, etc.) regression and classification models.
View all my other projects on GitHub.
Web API for natural language processing (NLP).
Web app and backend for time series analysis based on NLP (sentiment, entities) of news articles from various sources.
Data processing and visualization tools for test flight data.
| Degree | Year | Institution |
|---|---|---|
| PhD, Statistics | 2018 | NC State University |
| MS, Statistics | 2014 | NC State |
| BS, Math & Statistics | 2012 | Winona State University |
| BA, Economics & Music | 2009 | Winona State University |
Built custom Julia software for clients (healthcare and government sectors).
Consulted with businesses on optimizing Julia code.
Maintained and contributed to a variety of open source projects.
Researched use cases for streaming data models in Ad Tech.
Developed on-line algorithms for advertising retargeting (logistic and survival models).
Worked with big data and associated technologies like Hadoop and Spark.
Fit many scikit-learn models.
Wrote test suites using JMP Scripting Language (JSL) for validating statistical results.
Redesigned the UI for the JMP Starter.
Researched new methodologies being considered for JMP platforms.
Assisted students and faculty with experiment design, data analysis, and visualization.
Created weekly A/B testing reports using SAS and SQL.
Researched customer subgroups as the first step of large-scale experiments.
Slides and other materials available at https://github.com/joshday/Talks.
Financial Modeling Using Julia on Large, Streaming Datasets: Julia Computing Webinar March 2020
Scalable Data Analysis with JuliaDB and OnlineStats: JuliaCon 2018
SparseRegression.jl: Linear Models with Sparse Coefficients: JuliaCon 2017
Sorting Algorithms: NC State, ST 758: Statistical Computing (Fall 2017)
Online MM Algorithms for Machine Learning: International Chinese Statistical Association Conference 2016
Julia for Modern Data Analysis: PyData Carolinas 2016
OnlineStats.jl: Online Algorithms for Big and Streaming Data: Joint Statistical Meetings 2016
Overview of Stochastic Gradient Descent: NC State Statistical Learning Group (Fall 2015)
Intro to Julia: NC State, ST 758: Statistical Computing (Fall 2015)
Intro to R and Rcpp: NC State, ST 790: Advanced Computing (Spring 2015)
Online Optimization: NC State, ST 790: Advanced Computing (Spring 2015)
Penalized Methods: Ridge, Lasso, and Elastic Net: NC State Statistical Learning Group (Fall 2014)
ST 312 (Spring 2017, Spring 2015)
ST 311 (Fall 2016, Fall 2014)
ST 350 (Fall 2012)
Mentor for Summer Institute in Biostatistics (Summer 2014)