cv
General Information
Full Name | Haoyuan Peng - 彭浩源 |
Email | phy_fdu@163.com |
Homepage | haoyuanpeng.github.io |
Education
-
2015 - 2018
Master's degree, Computer Science
Software School of Fudan University
- Under the supervision of Professor Zheng Xiaoqing, my main research areas included Parsing and Word Embeddings, with results published at AAAI-17 and AAAI-18.
- Recognized as an Outstanding Graduate of Shanghai in 2018.
-
2011 - 2015
Bachelor's degree, Computer Software Engineering
Software School of Fudan University
- Under the supervision of Professor Zheng Xiaoqing, my main research area was Parsing, with results published at IJCAI-15.
- Completed the Fudan University Undergraduate Research Project (Denghui Program) under the supervision of Professor Zheng Xiaoqing.
Experience
-
2024.04 - Present
Senior Algorithm Engineer
ByteDance
- Build AI avatars for Douyin creators using SFT, DPO, and RAG with LLMs; each avatar mirrors its creator's persona, knowledge scope, and speaking style.
- Design and develop a long-term memory framework that provides chatbots with memory capabilities beyond the conversational context window, including memory summarization, updating, retrieval, and evaluation.
-
2023.01 - 2024.04
Senior Algorithm Engineer
Learnable.AI, Shanghai, China
- Researched ways to improve LLMs' ability to detect reasoning errors using Chain-of-Thought (CoT) techniques; co-first author of the resulting paper, published at IJCAI-24.
- Trained large-scale models ranging from 7B to 70B parameters for real-world systems in the education domain. Applications include directly grading students' mathematical free-response answers and translating student responses into internally defined languages.
- Investigated OCR result correction algorithms for scenarios involving student responses, effectively addressing the challenge of distinguishing between student writing errors and OCR recognition errors.
- Rated Excellent in the probation-period performance assessment.
-
2018.07 - 2022.12
Senior Researcher
Tencent, Shanghai, China
- Led the development of multiple video information extraction algorithms within the Yunzhi Media AI Platform, including key information extraction from video frames, video tagging, and error correction for ASR/OCR. The video tagging algorithm placed second in the AIWIN 2021 Algorithm Technology Competition without using the competition's training data.
- Developed NLP algorithms for the Public Opinion Analysis System, which is tailored to securities industry regulators.
- Implemented traditional ML algorithms and deep learning-based NLP algorithms on the Tencent TI-ONE ML Platform, enabling users to train models on their custom data.
- Received a Five-Star performance rating once and a Four-Star performance rating twice in performance assessments.
-
2014 - 2015
Data Analyst Intern
eBay, Shanghai, China
Skills
- Proficient in conducting cutting-edge research in natural language processing, with a track record of publishing papers at top conferences.
- Experienced across the full life cycle of AI/ML projects, including training, inference, engineering, and integration.
- Highly skilled in Python programming and in writing high-quality Python code.
- Familiar with Docker for deploying applications, managing versions, and migrating applications to improve development efficiency.
Service
- Conference Reviewer: KDD-23, EMNLP-23, SDM-24, COLING-24, ACL-24, MM-24
- External Reviewer: ACL-23, ECAI-23
Honors and Awards
-
2021
- Second Place in the AIWIN 2021 Algorithm Technology Competition
-
2020
- Tencent New Code Culture Award - Award for outstanding internal open source code projects
-
2018
- Outstanding Graduate of Shanghai
Publications
-
2024
LLMs Can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought
Zhuoxuan Jiang, Haoyuan Peng (Equal Contribution), Shanshan Feng, Fan Li, Dongsheng Li. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence.
-
2023
VKIE: The Application of Key Information Extraction on Video Text
Siyu An, Ye Liu, Haoyuan Peng, Di Yin. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track.
-
2023
OSAN: A One-Stage Alignment Network To Unify Multimodal Alignment and Unsupervised Domain Adaptation
Ye Liu, Lingfeng Qiao, Changchong Lu, Di Yin, Chen Lin, Haoyuan Peng, Bo Ren. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
-
2022
Grafting Pre-trained Models for Multimodal Headline Generation
Lingfeng Qiao, Chen Wu, Ye Liu, Haoyuan Peng, Di Yin, Bo Ren. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track.
-
2019
Detecting Abnormal Start-Ups, Unusual Resource Consumptions of the Smart Phone: A Deep Learning Approach
Xiaoqing Zheng, Yaping Lu, Haoyuan Peng, Jiangtao Feng, Yi Zhou, Min Jiang, Li Ma, Ji Zhang, Jie Ji. ZTE Communications.
-
2018
Attention-based belief or disbelief feature extraction for dependency parsing
Haoyuan Peng, Lu Liu, Yi Zhou, Junying Zhou, Xiaoqing Zheng. Proceedings of the AAAI Conference on Artificial Intelligence.
-
2018
RNN-based sequence-preserved attention for dependency parsing
Yi Zhou, Junying Zhou, Lu Liu, Jiangtao Feng, Haoyuan Peng, Xiaoqing Zheng. Proceedings of the AAAI Conference on Artificial Intelligence.
-
2017
Learning context-specific word/character embeddings
Xiaoqing Zheng, Jiangtao Feng, Yi Chen, Haoyuan Peng, Wenqiang Zhang. Proceedings of the AAAI Conference on Artificial Intelligence.
-
2015
Character-based parsing with convolutional neural network
Xiaoqing Zheng, Haoyuan Peng, Yi Chen, Pengjing Zhang, Wenqiang Zhang. Twenty-Fourth International Joint Conference on Artificial Intelligence.