Technical Articles
Value Augmented Sampling for Language Model Alignment and Personalization
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
PHUDGE: PHI-3 as Scalable Judge
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
RLHF Workflow: From Reward Modeling to Online RLHF
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
LoRA Learns Less and Forgets Less
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Towards Modular LLMs by Building and Reusing a Library of LoRAs
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
SimPO: Simple Preference Optimization with a Reference-Free Reward
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Instruction Tuning With Loss Over Instructions
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
gzip Predicts Data-dependent Scaling Laws
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
LLaMA-NAS: Efficient Neural Architecture Search For Large Language Models
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Contextual Position Encoding: Learning to Count What's Important
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
OLoRA: Orthonormal Low-Rank Adaptation of Large Language Models
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
The Geometry of Categorical and Hierarchical Concepts in Large Language Models
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Towards Scalable Automated Alignment of LLMs: A Survey
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Scalable MatMul-free Language Modeling
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
The Prompt Report: A Systematic Survey of Prompting Techniques
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Transformers need glasses! Information over-squashing in language tasks
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
The Dead Internet Theory: A Survey on Artificial Interactions and the Future of Social Media
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
CRAG - Comprehensive RAG Benchmark
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Wildbench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Mixture-of-Agents Enhances Large Language Model Capabilities
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
BERTs are Generative In-Context Learners
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Creativity Has Left the Chat: The Price of Debiasing Language Models
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
HUSKY: A Unified, Open-Source Language Agent for Multi-Step Reasoning
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
An Image is Worth 32 Tokens for Reconstruction and Generation
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
TextGrad: Automatic "Differentiation" via Text
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
Simple and Effective Masked Diffusion Language Models
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models
An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding
Artificial Intelligence , Machine Learning , Natural Language Processing , Large Language Models