
Papers

Papers I’ve had the privilege of contributing to:

| Title/Link | Author(s) | Year | Description |
| --- | --- | --- | --- |
| Portfolio construction as linearly constrained separable optimization | Moehle et al. | 2022 | ADMM-based fast portfolio optimization. |
| Finding AI-Generated Faces in the Wild | Aniano et al. | 2023 | AI-generated face detection at scale. |

As time permits, I also like to (try to) keep up with papers about applied math and machine learning. Below you'll find an archive of papers I've read that I think are worthwhile:

| Title/Link | Author(s) | Year | Description |
| --- | --- | --- | --- |
| Scaling and evaluating sparse autoencoders | Gao et al., OpenAI | 2024 | This paper discusses a top-k sparse autoencoder approach to explainability in large language models. |
| Sparse Autoencoders Find Highly Interpretable Features in Language Models | Cunningham et al. | 2023 | This paper discusses using sparse autoencoders to learn monosemantic, interpretable features in language models. |
| The Platonic Representation Hypothesis | Huh et al. | 2024 | This paper argues that large deep neural network models are converging to similar underlying representations of reality. |
| DoRA: Weight-Decomposed Low-Rank Adaptation | Liu et al., NVIDIA | 2024 | This paper introduces a method for parameter-efficient fine-tuning that decomposes a pre-trained weight into magnitude and direction components. |
| The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits | Ma et al., Microsoft | 2024 | This paper introduces a 1-bit LLM variant where parameters are ternary. It matches the performance of similarly sized full-precision models with increased efficiency. |
| Revisiting k-means: New Algorithms via Bayesian Nonparametrics | Kulis and Jordan | 2012 | This paper introduces a Bayesian nonparametric approach to clustering, leading to an elegant algorithm that doesn't require choosing k. |
| Accelerating Large Language Model Decoding with Speculative Sampling | Chen et al., DeepMind | 2023 | This paper introduces speculative sampling, a technique to speed up the decoding process in large language models. |
| Modularity and community structure in networks | M. E. J. Newman | 2006 | Elegant spectral method for community detection. |
| Scalable Hierarchical Agglomerative Clustering | Monath et al., Google | 2021 | A scalable, level-based approach to hierarchical agglomerative clustering. |
| Pearl: A Production-Ready Reinforcement Learning Agent | Zhu et al., Meta | 2023 | This paper introduces Pearl, a modular reinforcement learning agent designed for production environments. |
| Discovering faster matrix multiplication algorithms with reinforcement learning | Fawzi et al., DeepMind | 2022 | This paper introduces AlphaTensor, an RL-based algorithm that finds faster ways to multiply matrices. |