Papers I’ve had the privilege of contributing to:
Title/Link | Author(s) | Year | Description |
---|---|---|---|
Portfolio construction as linearly constrained separable optimization | Moehle et al | 2022 | ADMM-based fast portfolio optimization. |
Finding AI-Generated Faces in the Wild | Aniano et al | 2023 | AI-generated face detection at scale. |
As time permits, I also like to (try to) keep up with papers about applied math and machine learning. Below you’ll find an archive of papers I’ve read that I think are worthwhile:
Title/Link | Author(s) | Year | Description |
---|---|---|---|
Scaling and evaluating sparse autoencoders | Gao et al, OpenAI | 2024 | This paper discusses a top-k sparse autoencoder approach to explainability in large language models (toy sketch after the table). |
Sparse Autoencoders Find Highly Interpretable Features in Language Models | Cunningham et al | 2023 | This paper discusses using sparse autoencoders to learn monosemantic, interpretable features in language models. |
The Platonic Representation Hypothesis | Huh et al | 2024 | This paper argues that large deep neural network models are converging to similar underlying representations of reality. |
DoRA: Weight-Decomposed Low-Rank Adaptation | Liu et al, NVIDIA | 2024 | This paper introduces a method for parameter-efficient fine-tuning that decomposes a pre-trained weight into magnitude and direction components (sketch below). |
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits | Ma et al, Microsoft | 2024 | This paper introduces a 1-bit LLM variant where parameters are ternary (sketch below). It matches the performance of similarly sized full-precision models with increased efficiency. |
Revisiting k-means: New Algorithms via Bayesian Nonparametrics | Kulis and Jordan | 2012 | This paper introduces a Bayesian nonparametric approach to clustering. This leads to an elegant algorithm that doesn’t require us to choose k up front (sketch below). |
Accelerating Large Language Model Decoding with Speculative Sampling | Chen et al, DeepMind | 2023 | This paper introduces speculative sampling: a small draft model proposes several tokens, and the large model verifies them in parallel, speeding up decoding without changing the output distribution (sketch below). |
Modularity and community structure in networks | M.E.J. Newman | 2006 | Elegant spectral method for community detection via the leading eigenvector of the modularity matrix (sketch below). |
Scalable Hierarchical Agglomerative Clustering | Monath et al, Google | 2021 | A scalable, level-based approach to hierarchical agglomerative clustering. |
Pearl: A Production-Ready Reinforcement Learning Agent | Zhu et al, Meta | 2023 | This paper introduces Pearl, a modular reinforcement learning agent designed for production environments. |
Discovering faster matrix multiplication algorithms with reinforcement learning | Fawzi et al, DeepMind | 2022 | This paper introduces AlphaTensor, an RL-based algorithm to find faster ways to multiply matrices. |
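A few toy sketches of ideas from the papers above, in roughly the order they appear. First, the top-k sparse autoencoder from Gao et al: instead of an L1 penalty, sparsity is enforced by keeping only the k largest latent activations. This is a minimal numpy sketch of the forward pass; the sizes and names here are made up, not from the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_hidden, k = 64, 512, 8                  # illustrative sizes

W_enc = rng.normal(0, 0.02, (d_model, d_hidden))   # encoder weights
W_dec = rng.normal(0, 0.02, (d_hidden, d_model))   # decoder weights
b_enc = np.zeros(d_hidden)

def topk_sae(x):
    """Encode, zero all but the k largest activations, decode."""
    acts = np.maximum(x @ W_enc + b_enc, 0.0)      # ReLU pre-activations
    top = np.argpartition(acts, -k)[-k:]           # indices of the k largest
    sparse = np.zeros_like(acts)
    sparse[top] = acts[top]                        # hard top-k sparsity
    return sparse @ W_dec                          # reconstruction of x

x = rng.normal(size=d_model)
print("reconstruction error:", np.linalg.norm(x - topk_sae(x)))
```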
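The DoRA reparameterization from Liu et al, as I read it: the pre-trained weight is split into a per-column magnitude vector and a direction matrix, LoRA updates the direction, and the magnitude is trained separately. A hedged numpy sketch; the shapes and column-norm convention are my reading of the paper, not its released code.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 32, 64, 4                         # toy shapes

W0 = rng.normal(size=(d_out, d_in))                # frozen pre-trained weight
m = np.linalg.norm(W0, axis=0, keepdims=True)      # trainable magnitude (init: column norms)
A = rng.normal(0, 0.02, (r, d_in))                 # trainable LoRA factor
B = np.zeros((d_out, r))                           # zero-init so V starts at W0

V = W0 + B @ A                                     # direction component
W = m * (V / np.linalg.norm(V, axis=0, keepdims=True))  # merged weight

print(np.allclose(W, W0))                          # True at init, since B = 0
```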
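The ternary weights in Ma et al come from rounding each scaled weight to {-1, 0, +1} (hence log2(3) ≈ 1.58 bits). Below is my toy version of the absmean quantizer described in the paper; the function name and example are mine.

```python
import numpy as np

def ternarize(W, eps=1e-8):
    """Absmean quantization: scale by mean |w|, round, clip to {-1, 0, 1}."""
    gamma = np.abs(W).mean() + eps
    return np.clip(np.round(W / gamma), -1, 1), gamma

rng = np.random.default_rng(0)
W_t, gamma = ternarize(rng.normal(size=(4, 4)))
print(W_t)        # every entry is -1, 0, or 1
print(gamma)      # per-tensor scale, reapplied at matmul time
```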
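The k-means variant from Kulis and Jordan (often called DP-means): run Lloyd-style updates, but open a new cluster whenever a point sits farther than a penalty lambda from every existing centroid, so k emerges from the data. A toy numpy sketch with made-up data and lambda.

```python
import numpy as np

def dp_means(X, lam, n_iters=20):
    centroids = [X.mean(axis=0)]                       # start with one cluster
    for _ in range(n_iters):
        assign = []
        for x in X:
            d2 = [np.sum((x - c) ** 2) for c in centroids]
            if min(d2) > lam:                          # too far from everything:
                centroids.append(x.copy())             # open a new cluster here
                assign.append(len(centroids) - 1)
            else:
                assign.append(int(np.argmin(d2)))
        assign = np.asarray(assign)
        centroids = [X[assign == j].mean(axis=0) if np.any(assign == j)
                     else centroids[j] for j in range(len(centroids))]
    return np.array(centroids), assign

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in [(0, 0), (3, 3), (0, 4)]])
centroids, labels = dp_means(X, lam=2.0)
print("clusters found:", len(centroids))
```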
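The core loop of speculative sampling from Chen et al: the draft model proposes up to K tokens, the target accepts each with probability min(1, p/q), and a rejected token is resampled from the residual (p - q)+, which keeps the output distribution exactly the target's. A toy single-step sketch where both "models" are fixed, context-free distributions I made up.

```python
import numpy as np

rng = np.random.default_rng(0)
V, K = 8, 4                          # toy vocab size and draft length

q = rng.dirichlet(np.ones(V))        # draft model's next-token distribution
p = rng.dirichlet(np.ones(V))        # target model's next-token distribution

def speculative_step():
    out = []
    for _ in range(K):
        x = rng.choice(V, p=q)                       # draft proposes x
        if rng.random() < min(1.0, p[x] / q[x]):     # accept w.p. min(1, p/q)
            out.append(int(x))
        else:
            resid = np.maximum(p - q, 0.0)           # resample from (p - q)+
            out.append(int(rng.choice(V, p=resid / resid.sum())))
            break                                    # stop at first rejection
    else:
        out.append(int(rng.choice(V, p=p)))          # all accepted: bonus token
    return out

print(speculative_step())                            # 1 to K+1 tokens per call
```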
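Newman's leading-eigenvector method: form the modularity matrix B = A - k kᵀ / (2m) and bisect the graph by the sign of the top eigenvector of B. The two-triangle toy graph below is my own example, not from the paper.

```python
import numpy as np

# Two triangles (nodes 0-2 and 3-5) joined by a single edge (2, 3).
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)

k = A.sum(axis=1)                    # degrees
m = A.sum() / 2                      # number of edges
B = A - np.outer(k, k) / (2 * m)     # modularity matrix
eigvals, eigvecs = np.linalg.eigh(B) # eigh sorts eigenvalues ascending
community = eigvecs[:, -1] > 0       # split by sign of leading eigenvector
print(community)                     # recovers the two triangles
```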