24 Mar 2026
New manuscript. TL;DR: we used Fitbod’s data to find a better formula to estimate your one-rep-max (1RM) for a given exercise based on any arbitrary reps x weight combination.
Abstract:
Classical equations for predicting one-repetition maximum (1RM) from submaximal performance were derived from small samples performing a single exercise, yet are routinely applied to hundreds of exercises. All use a fixed conversion factor relating repetitions to estimated 1RM, regardless of exercise or load. We used large-scale observational data from a consumer fitness app (303,494 near-failure sets from 14,966 users across 388 exercises spanning 16 muscle groups) to derive and evaluate a generalization in which the conversion factor varies logarithmically with the weight lifted: 1RM = w * (1 + (r - 1)^0.85 / (-2.55 + 4.58 * ln(w))). Because the dataset contains no directly measured maxima, we optimized and evaluated the formula using an internal consistency criterion – the degree to which different weight-repetition combinations from the same person, exercise, and time window yield the same estimated 1RM. The proposed formula reduced inconsistency by 17-22% relative to four classical benchmarks, with the improvement positive for every one of the 183 exercises with sufficient data. Five-fold user-level cross-validation confirmed near-zero overfitting. An ablation analysis attributed 91% of the improvement to the weight-dependent conversion factor and 9% to the sub-linear repetition exponent. The conversion factor increases with load: at light weights each additional repetition implies a larger fraction of maximal capacity than at heavy weights, consistent with prior evidence that the repetitions-%1RM relationship varies by exercise. Classical equations, by applying a single conversion factor across all loads, systematically underestimate this variation – and the discrepancy is largest for the lighter, more diverse exercises that dominate real-world training programs.
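To make the formula in the abstract concrete, here is a minimal Python sketch of it, next to one classical fixed-factor equation for contrast. Everything beyond the formula itself is an assumption: the function names are made up, the unit of w is not stated in the abstract (it matters, because ln(w) depends on the unit), the Epley form is shown only as a familiar example and may or may not be one of the four benchmarks, and relative_spread is only a rough stand-in for the paper's internal consistency criterion, whose exact definition is not given here.

```python
import math
from statistics import mean, pstdev


def conversion_factor(weight: float) -> float:
    """Weight-dependent conversion factor from the abstract: -2.55 + 4.58 * ln(w)."""
    if weight <= 0:
        raise ValueError("weight must be positive")
    return -2.55 + 4.58 * math.log(weight)


def proposed_1rm(weight: float, reps: int) -> float:
    """Estimated 1RM: w * (1 + (r - 1)^0.85 / (-2.55 + 4.58 * ln(w)))."""
    if reps < 1:
        raise ValueError("reps must be at least 1")
    factor = conversion_factor(weight)
    if factor <= 0:
        # The denominator is not positive for very small w (below roughly 1.75).
        raise ValueError("weight too light for this formula")
    return weight * (1 + (reps - 1) ** 0.85 / factor)


def epley_1rm(weight: float, reps: int) -> float:
    """Classical Epley form with a fixed conversion factor of 30 (for contrast only)."""
    return weight * (1 + reps / 30)


def relative_spread(sets, estimator) -> float:
    """Hypothetical inconsistency measure: std / mean of the 1RM estimates
    obtained from several (weight, reps) sets by the same person on the same
    exercise. An assumption, not the paper's actual metric."""
    estimates = [estimator(w, r) for w, r in sets]
    return pstdev(estimates) / mean(estimates)


if __name__ == "__main__":
    sets = [(100, 5), (80, 12)]  # made-up near-failure sets
    for w, r in sets:
        print(w, r, round(proposed_1rm(w, r), 1), round(epley_1rm(w, r), 1))
    print(relative_spread(sets, proposed_1rm), relative_spread(sets, epley_1rm))
```

Note that a single rep returns the weight itself, and that the conversion factor grows with the weight, which is the load dependence the abstract describes.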
31 Jan 2026
New manuscript, co-authored with Federica Conti, Andy Galpin, and Brad Schoenfeld. TL;DR: we used Fitbod’s data to find out what sorts of characteristics and behaviors predict that you’ll keep lifting weights consistently long past your first workout.
At Fitbod we have been mining our data internally for years in order to improve our workout recommendations, but this is the first time we’re doing science out in the open. This was fun and we expect many more papers to come!
Abstract:
Background:
Digital fitness applications offer unprecedented access to structured training programs, yet the behavioral factors that predict sustained engagement in real-world settings remain incompletely understood.
Methods:
This observational study analyzed data from 522,994 adult digital fitness app users (mean age: 34.2 ± 9.8 years) of various experience levels, followed for six months from their first recorded workout. Long-term adherence was defined as completing at least one workout per week, allowing up to three missed weeks. Adherence trajectories were examined, along with associations between early training behaviors (training frequency, workout duration, exercise composition, equipment diversity), demographic factors, and time to dropout. Effect modifications by sex, workout duration, and training experience were also investigated.
Results:
Adherence declined steadily over time, with 18.1% of beginner users remaining adherent at 6 months. The median dropout time was 14 weeks. Higher sustained participation was observed in older versus younger (51+: 23.8%; 18-40: 15%), male versus female (M: 19.9%; F: 15.2%), and more experienced users (intermediate: 28.6%; advanced: 38.2%). Training consistency during the first 28 days was the strongest predictor of adherence and exhibited a protective association that attenuated over time. Greater diversity in equipment use and higher emphasis on resistance exercise were also associated with lower dropout risk. Longer workout duration was associated with improved adherence among users who trained more frequently, particularly early in follow-up.
Conclusions:
Early consistency and structured training behaviors were strongly associated with long-term adherence among beginners, with relatively modest differences based on age and sex. These findings suggest that frequent training sessions and engagement with resistance-based exercise during the initial stages of exercise adoption may be relevant behavioral correlates of sustained engagement.
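The adherence definition in the Methods (at least one workout per week, with up to three missed weeks allowed) can be sketched in a few lines of Python. This is one reading of that sentence, not the paper's code: it assumes the three missed weeks are counted cumulatively rather than consecutively, that six months means 26 weeks, and that dropout happens in the week of the fourth miss. The function name and arguments are hypothetical.

```python
from typing import Iterable, Optional


def dropout_week(active_weeks: Iterable[int],
                 follow_up_weeks: int = 26,
                 allowed_misses: int = 3) -> Optional[int]:
    """Return the week in which a user drops out, or None if still adherent.

    active_weeks: week numbers (1-based, counted from the first recorded
    workout) in which the user logged at least one workout.
    Assumption: missed weeks are counted cumulatively over follow-up.
    """
    active = set(active_weeks)
    missed = 0
    for week in range(1, follow_up_weeks + 1):
        if week not in active:
            missed += 1
            if missed > allowed_misses:
                return week  # the miss that exceeds the allowance
    return None


if __name__ == "__main__":
    print(dropout_week(range(1, 27)))  # trained every week
    print(dropout_week([1, 2, 3]))     # stopped after week 3
```

Under the consecutive-weeks reading the counter would reset after every active week, so the same user could be classified differently. The abstract alone does not settle which reading is intended.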
29 Sep 2023
This just came out and I wrote one of the chapters. The book is meant for people who did math-heavy PhDs and are considering non-academic careers. It should help them realize that if they know what a p-value is, then they can do a lot more than please journal editors and teach bored undergrads: they can use their quant skills to have impact in the real world (and make a lot more money than their academic counterparts). If you check it out, let me know what you think.

07 Jul 2021
New paper. Abstract:
Brazilian banks commonly use linear regression to appraise real estate: they regress price on features like area, location, etc., and use the resulting model to estimate the market value of the target property. But Brazilian banks do not test the predictive performance of those models, which for all we know are no better than random guesses. That introduces huge inefficiencies in the real estate market. Here we propose a machine learning approach to the problem. We scrape real estate data from 15 thousand online listings and use it to fit a boosted trees model. The resulting model has a median absolute error of 8.16%. We provide all data and source code.
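For readers unfamiliar with the metric, here is a rough Python sketch of a median absolute error expressed as a percentage of the true price. It assumes that this is how the paper's figure is computed, and the function name and sample prices are made up. It is not the paper's code, which is provided with the paper.

```python
from statistics import median
from typing import Sequence


def median_abs_pct_error(actual: Sequence[float],
                         predicted: Sequence[float]) -> float:
    """Median of |predicted - actual| / actual, expressed in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [abs(p - a) / a for a, p in zip(actual, predicted)]
    return 100 * median(errors)


if __name__ == "__main__":
    # Made-up listing prices and model estimates.
    actual = [500_000, 320_000, 1_200_000]
    predicted = [540_000, 300_000, 1_150_000]
    print(median_abs_pct_error(actual, predicted))
```

The median is less sensitive than the mean to a handful of badly mispriced listings, which is a common reason to report it for scraped data.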
18 Jun 2021
New manuscript. Abstract:
How much insider trading happens in Brazil’s stock market? Previous research has used the model proposed by Easley et al. [1996] to estimate the probability of insider trading (PIN) for different stocks in Brazil. Those estimates have a number of problems: i) they are based on a factorization that biases the PIN downward, especially for high-activity stocks; ii) they fail to account for boundary solutions, which biases most PIN estimates upward (and a few of them downward); and iii) they are a decade old and therefore based on a very different market (for instance, the number of retail investors grew from 600 thousand in 2011 to 3.5 million in 2021). In this paper I address those three problems and estimate the probability of insider trading for 431 different stocks in the Brazilian stock market, for each quarter from October 2019 to March 2021.
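For context, in the Easley et al. [1996] model the PIN is usually written as αμ / (αμ + 2ε), where α is the probability of an information event, μ is the arrival rate of informed trades, and ε is the arrival rate of uninformed buy (and, equally, sell) orders. The sketch below only evaluates that closed form from parameters that have already been estimated. It does not reproduce the likelihood estimation, the factorization, or the boundary-solution handling discussed in the abstract, and the function name and input values are made up.

```python
def pin(alpha: float, mu: float, epsilon: float) -> float:
    """Probability of informed trading: alpha*mu / (alpha*mu + 2*epsilon).

    alpha:   probability that an information event occurs on a given day
    mu:      arrival rate of informed trades
    epsilon: arrival rate of uninformed buys (assumed equal for sells)
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be a probability")
    if mu < 0 or epsilon < 0:
        raise ValueError("arrival rates must be non-negative")
    informed = alpha * mu
    total = informed + 2 * epsilon
    if total == 0:
        raise ValueError("no order flow: PIN is undefined")
    return informed / total


if __name__ == "__main__":
    # Made-up parameter values, for illustration only.
    print(pin(alpha=0.4, mu=50, epsilon=40))
```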