Technical Articles

Value Augmented Sampling for Language Model Alignment and Personalization

Seungwook Han , Idan Shenfeld , Akash Srivastava , Yoon Kim , Pulkit Agrawal

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

PHUDGE: PHI-3 as Scalable Judge

Mahesh Deshwal , Apoorva Chawla

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

RLHF Workflow: From Reward Modeling to Online RLHF

Hanze Dong , Wei Xiong , Bo Pang , Haoxiang Wang , Han Zhao , Yingbo Zhou , Nan Jiang , Doyen Sahoo , Caiming Xiong , Tong Zhang

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

LoRA Learns Less and Forgets Less

Dan Biderman , Jacob Portes , Jose Javier Gonzalez Ortiz , Mansheej Paul , Philip Greengard , Connor Jennings , Daniel King , Sam Havens , Vitaliy Chiley , Jonathan Frankle , Cody Blakeney , John P. Cunningham

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model

Xu Wanting , Liu Yang , He Langping , Huang Xucheng , Jiang Ling

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Chameleon: Mixed-Modal Early-Fusion Foundation Models

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Towards Modular LLMs by Building and Reusing a Library of LoRAs

Oleksiy Ostapenko , Zhan Su , Edoardo Maria Ponti , Laurent Charlin , Nicolas Le Roux , Matheus Pereira , Lucas Caccia , Alessandro Sordoni

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization

Jialong Guo , Xinghao Chen , Yehui Tang , Yunhe Wang

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Ting Jiang , Shaohan Huang , Shengyue Luo , Zihan Zhang , Haizhen Huang , Furu Wei , Weiwei Deng , Feng Sun , Qi Zhang , Deqing Wang , Fuzhen Zhuang

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Attention as an RNN

Leo Feng , Frederick Tung , Hossein Hajimirsadeghi , Mohamed Osama Ahmed , Yoshua Bengio , Greg Mori

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Dense Connector for MLLMs

Huanjin Yao , Wenhao Wu , Yuxin Song , Mengxi Zhang , Haocheng Feng , Yifan Sun , Zhiheng Li , Wanli Ouyang , Jingdong Wang

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability

Fei Zhao , Taotian Pang , Chunhui Li , Zhen Wu , Junjie Guo , Shangyu Xing , Xinyu Dai

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

SimPO: Simple Preference Optimization with a Reference-Free Reward

Yu Meng , Mengzhou Xia , Danqi Chen

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Instruction Tuning With Loss Over Instructions

Zhengyan Shi , Adam X. Yang , Bin Wu , Laurence Aitchison , Emine Yilmaz , Aldo Lipani

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

The Road Less Scheduled

Aaron Defazio , Xingyu (Alice) Yang , Harsh Mehta , Konstantin Mishchenko , Ahmed Khaled , Ashok Cutkosky

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training

Wenyu Du , Tongxu Luo , Zihan Qiu , Zeyu Huang , Yikang Shen , Reynold Cheng , Yike Guo , Jie Fu

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

gzip Predicts Data-dependent Scaling Laws

Rohan Pandey

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning

Runqian Wang , Soumya Ghosh , David Cox , Diego Antognini , Aude Oliva , Rogerio Feris

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections

Roy Miles , Pradyumna Reddy , Ismail Elezi , Jiankang Deng

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

LLaMA-NAS: Efficient Neural Architecture Search For Large Language Models

Anthony Sarah , Sharath Nittur Sridhar , Maciej Szankin , Sairam Sundaresan

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Contextual Position Encoding: Learning to Count What's Important

Olga Golovneva , Tianlu Wang , Jason Weston , Sainbayar Sukhbaatar

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

Omar Shaikh , Michelle Lam , Joey Hejna , Yijia Shao , Michael Bernstein , Diyi Yang

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

Tianwen Wei , Bo Zhu , Liang Zhao , Cheng Cheng , Biye Li , Weiwei Lü , Peng Cheng , Jianhao Zhang , Xiaoyu Zhang , Liang Zeng , Xiaokun Wang , Yutuan Ma , Rui Hu , Shuicheng Yan , Han Fang , Yahui Zhou

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

OLoRA: Orthonormal Low-Rank Adaptation of Large Language Models

Kerim Büyükakyüz

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

The Geometry of Categorical and Hierarchical Concepts in Large Language Models

Kiho Park , Yo Joong Choe , Yibo Jiang , Victor Veitch

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Towards Scalable Automated Alignment of LLMs: A Survey

Boxi Cao , Keming Lu , Xinyu Lu , Jiawei Chen , Mengjie Ren , Hao Xiang , Peilin Liu , Yaojie Lu , Ben He , Xianpei Han , Le Sun , Hongyu Lin , Bowen Yu

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Scalable MatMul-free Language Modeling

Rui-Jie Zhu , Yu Zhang , Ethan Sifferman , Tyler Sheaves , Yiqiao Wang , Dustin Richmond , Peng Zhou , Jason K. Eshraghian

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Block Transformer: Global-to-Local Language Modeling for Fast Inference

Namgyu Ho , Sangmin Bae , Taehyeon Kim , Hyunjik Jo , Yireun Kim , Tal Schuster , Adam Fisch , James Thorne , Se-Young Yun

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models

Ling Yang , Zhaochen Yu , Tianjun Zhang , Shiyi Cao , Minkai Xu , Wentao Zhang , Joseph E. Gonzalez , Bin Cui

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

The Prompt Report: A Systematic Survey of Prompting Techniques

Sander Schulhoff , Michael Ilie , Nishant Balepur , Konstantine Kahadze , Amanda Liu , Chenglei Si , Yinheng Li , Aayush Gupta , HyoJung Han , Sevien Schulhoff , Pranav Sandeep Dulepet , Saurav Vidyadhara , Dayeon Ki , Sweta Agrawal , Chau Pham , Gerson Kroiz , Feileen Li , Hudson Tao , Ashay Srivastava , Hevander Da Costa , Saloni Gupta , Megan L. Rogers , Inna Goncearenco , Giuseppe Sarli , Igor Galynker , Denis Peskoff , Marine Carpuat , Jules White , Shyamal Anadkat , Alexander Hoyle , Philip Resnik

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Transformers need glasses! Information over-squashing in language tasks

Federico Barbero , Andrea Banino , Steven Kapturowski , Dharshan Kumaran , João G.M. Araújo , Alex Vitvitskyi , Razvan Pascanu , Petar Veličković

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Are We Done with MMLU?

Aryo Pradipta Gema , Joshua Ong Jun Leang , Giwon Hong , Alessio Devoto , Alberto Carlo Maria Mancino , Rohit Saxena , Xuanli He , Yu Zhao , Xiaotang Du , Mohammad Reza Ghasemi Madani , Claire Barale , Robert McHardy , Joshua Harris , Jean Kaddour , Emile van Krieken , Pasquale Minervini

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization

Zhanhao Liang , Yuhui Yuan , Shuyang Gu , Bohan Chen , Tiankai Hang , Mingxi Cheng , Ji Li , Liang Zheng

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

The Dead Internet Theory: A Survey on Artificial Interactions and the Future of Social Media

Prathamesh Muzumdar , Sumanth Cheemalapati , Srikanth Reddy RamiReddy , Kuldeep Singh , George Kurian , Apoorva Muley

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach

Jianbo Dong , Bin Luo , Jun Zhang , Pengcheng Zhang , Fei Feng , Yikai Zhu , Ang Liu , Zian Chen , Yi Shi , Hairong Jiao , Gang Lu , Yu Guan , Ennan Zhai , Wencong Xiao , Hanyu Zhao , Man Yuan , Siran Yang , Xiang Li , Jiamang Wang , Rui Men , Jianwei Zhang , Huang Zhong , Dennis Cai , Yuan Xie , Binzhang Fu

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

CRAG -- Comprehensive RAG Benchmark

Xiao Yang , Kai Sun , Hao Xin , Yushi Sun , Nikita Bhalla , Xiangsen Chen , Sajal Choudhary , Rongze Daniel Gui , Ziran Will Jiang , Ziyu Jiang , Lingkun Kong , Brian Moran , Jiaqi Wang , Yifan Ethan Xu , An Yan , Chenyu Yang , Eting Yuan , Hanwen Zha , Nan Tang , Lei Chen , Nicolas Scheffer , Yue Liu , Nirav Shah , Rakesh Wanga , Anuj Kumar , Wen-tau Yih , Xin Luna Dong

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

Bill Yuchen Lin , Yuntian Deng , Khyathi Chandu , Faeze Brahman , Abhilasha Ravichander , Valentina Pyatkin , Nouha Dziri , Ronan Le Bras , Yejin Choi

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Mixture-of-Agents Enhances Large Language Model Capabilities

Junlin Wang , Jue Wang , Ben Athiwaratkun , Ce Zhang , James Zou

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

BERTs are Generative In-Context Learners

David Samuel

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Jianing Yang , Xuweiyi Chen , Nikhil Madaan , Madhavan Iyengar , Shengyi Qian , David F. Fouhey , Joyce Chai

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Creativity Has Left the Chat: The Price of Debiasing Language Models

Behnam Mohammadi

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Peize Sun , Yi Jiang , Shoufa Chen , Shilong Zhang , Bingyue Peng , Ping Luo , Zehuan Yuan

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Margin-aware Preference Optimization for Aligning Diffusion Models without Reference

Jiwoo Hong , Sayak Paul , Noah Lee , Kashif Rasul , James Thorne , Jongheon Jeong

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

HUSKY: A Unified, Open-Source Language Agent for Multi-Step Reasoning

Joongwon Kim , Bhargavi Paranjape , Tushar Khot , Hannaneh Hajishirzi

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters

Yixin Song , Haotong Xie , Zhengyan Zhang , Bo Wen , Li Ma , Zeyu Mi , Haibo Chen

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching

Xiaoying Zhang , Baolin Peng , Ye Tian , Jingyan Zhou , Yipeng Zhang , Haitao Mi , Helen Meng

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

An Image is Worth 32 Tokens for Reconstruction and Generation

Qihang Yu , Mark Weber , Xueqing Deng , Xiaohui Shen , Daniel Cremers , Liang-Chieh Chen

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

TextGrad: Automatic "Differentiation" via Text

Mert Yuksekgonul , Federico Bianchi , Joseph Boen , Sheng Liu , Zhi Huang , Carlos Guestrin , James Zou

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

Simple and Effective Masked Diffusion Language Models

Subham Sekhar Sahoo , Marianne Arriola , Yair Schiff , Aaron Gokaslan , Edgar Marroquin , Justin T Chiu , Alexander Rush , Volodymyr Kuleshov

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models

An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding

Tong Wu , Yanpeng Zhao , Zilong Zheng

Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models