Conclusion

The development of large language models has reached a critical juncture. It is no longer sufficient to ask only what these models are capable of doing. We must also grapple with questions such as who is—and is not—involved in and accounted for in the creation of LLMs; what the impacts of these models are at both the individual and societal level; and which values and principles these technologies uphold and promote. This survey examines how such human-centered principles are inherently intertwined with the design, training, and deployment of LLMs.

Ultimately, the trajectory of LLM development must be guided by more than technical benchmarks and capability milestones. The questions of inclusion, impact, and values explored in this survey are not peripheral concerns to be addressed after the fact; they are foundational to what these systems become and whom they serve. By foregrounding human-centered principles at every stage of the LLM lifecycle, from design and data curation to training and deployment, researchers and practitioners can work toward models that are not only more capable, but also more equitable, accountable, and aligned with the diverse needs of the people they affect. The path forward demands a broader coalition of voices, a more expansive notion of responsibility, and a sustained commitment to ensuring that progress in AI is measured not only by what these models can do, but also by the kind of world their development helps to build.

Acknowledgments

The idea for this manuscript originated in the Fall 2024 offering of Stanford’s CS 329X course on HCLLMs, taught by the instructor Diyi Yang and assistants Rose E. Wang and Caleb Ziems. Diyi Yang developed the initial structure and outline of the paper. The enrolled students collaborated on a first draft of the manuscript as part of their coursework, with each student pair assigned responsibility for drafting a subsection of the survey according to the course assignment structure. The teaching assistants subsequently reviewed, graded, and provided feedback on these drafts. In Winter 2025, a subset of students continued to revise and expand the manuscript under the primary direction of Rose E. Wang and the secondary direction of Caleb Ziems.

Caleb Ziems and Dora Zhao largely re-wrote and restructured the manuscript, with significant conceptual revisions and new chapters, to produce the current version. This revision phase was supported by additional contributions from Sunny Yu and Advit Deepak. All authors reviewed and approved the final manuscript.

The core authors were jointly responsible for deciding on and writing the paper in its final form. The core authors are as follows:

Diyi Yang designed the overall structure, chapter outline, and course material that initialized this survey. She supervised the project and provided guidance on the direction and scope of the survey at every stage.
Caleb Ziems restructured the paper from its initial conception, wrote Introduction and Data for HCLLMs, and co-wrote NLP for HCLLMs, Evaluation, and Responsible Human-Centered LLMs. He also contributed to structuring and reviewing all student drafts, and co-led the second round of student revisions.
Dora Zhao restructured the paper from its initial conception, wrote HCI for HCLLMs and Case Study: HCLLMs and the Future of Work, and co-wrote NLP for HCLLMs, Evaluation, and Responsible Human-Centered LLMs. Dora also co-designed the figures.

Additionally, the leadership of this work included:

Rose E. Wang contributed to structuring and reviewing all student drafts. She led the second round of student revisions.
Matthew Jörke contributed From Human-Centered Design Challenges to Technical Artifacts and provided suggestions on HCI for HCLLMs more broadly.
Ahmad Rushdi contributed guidance and final edits.

Student contributions are as follows:

Anshika Agarwal co-wrote Quantitative Evaluation and edited Evaluation.
Harshvardhan Agarwal helped with the course paper and edited Evaluation.
Gabriela Aranguiz-Dias co-wrote Quantitative Evaluation.
Aditri Bhagirath helped with the course paper and second-round student edits.
Justine Breuch co-wrote Bias and Fairness Evaluation.
Huanxing Chen helped with the course paper.
Ruishi Chen helped with the course paper.
Sarah Chen co-wrote the first draft of Interpretable and Explainable HCLLMs.
Advit Deepak co-wrote Responsible Human-Centered LLMs.
Haocheng Fan co-wrote Human-Level Evaluations.
William Fang helped with the course paper and second-round student edits.
Cat Gonzales Fergesen helped with the course paper.
Daniel Frees co-wrote Supervised Fine-tuning for HCLLMs.
Tian Gao co-wrote Safety Evaluations.
Ziqing Huang co-wrote Safety Evaluations.
Vishal Jain co-wrote Safe HCLLMs.
Yucheng Jiang co-wrote Human-Level Evaluations.
Kirill Kalinin helped with second-round student edits.
Su Doga Karaca co-wrote Human-Level Evaluations and edited Evaluation.
Arpandeep Khatua helped with the course paper and second-round student edits.
Teland La helped with the course paper.
Isabelle Levent helped with the course paper.
Miranda Li helped with the course paper and second-round student edits.
Xinling Li co-wrote Consent and Ownership.
Yongce Li co-wrote Expanding Data Sources: Synthetic and Non-Traditional Data.
Angela Liu helped with the course paper and second-round student edits.
Minsik Oh co-wrote Quantitative Evaluation and edited Evaluation.
Nathan J. Paek helped with the course paper and second-round student edits.
Anthony Qin helped with the course paper.
Emily Redmond co-wrote Scaling Human Centered LLMs.
Michael J. Ryan wrote Pluralism and co-wrote the remainder of NLP for HCLLMs.
Aadesh Salecha co-wrote Bias and Fairness Evaluation.
Xiaoxian Shen co-wrote Consent and Ownership.
Pranava Singhal helped with the course paper.
Shashanka Subrahmanya co-wrote Societal-level Evaluation.
Mei Tan co-wrote Benchmarks.
Irawadee Thawornbut helped with the course paper.
Michelle Vinocour helped with the course paper.
Xiaoyue Wang co-wrote Expanding Data Sources: Synthetic and Non-Traditional Data.
Zheng Wang co-wrote Interpretable and Explainable HCLLMs.
Henry Jin Weng helped with the course paper.
Pawan Wirawarn helped with the course paper.
Shirley Wu helped with the course paper.
Sophie Wu co-wrote Learning from Human Preferences.
Yichen Xie co-wrote Learning from Human Preferences.
Patrick Ye helped with the course paper and second-round student edits.
Sunny Yu co-wrote Quantitative Evaluation and Responsible Human-Centered LLMs.
Sean Zhang helped with the course paper and second-round student edits.
Yutong Zhang co-designed the figures in Introduction, HCI for HCLLMs, Evaluation, and Responsible Human-Centered LLMs.
Cathy Zhou co-wrote Supervised Fine-tuning for HCLLMs.
Yiling Zhao co-wrote Multilinguality.
