Dingjie Song

Welcome! I am a Ph.D. student in the Department of Computer Science and Engineering at Lehigh University, advised by Prof. Lichao Sun. Previously, I was a research assistant with the CUHK-Shenzhen NLP group, mentored by Dr. Benyou Wang. I obtained my M.E. from the Software Institute and Natural Language Processing Group at Nanjing University, under the guidance of Dr. Xinyu Dai and Dr. Jidong Ge. Before that, I completed my B.E. at the Software Institute of Nanjing University.

Email: dingjiesong.cs@gmail.com

Google Scholar / CV

Links: Research Overview / Updates / Awards / Papers

Research Overview

My research interests are in Natural Language Processing, especially intelligent interactive systems 🤖 and domain-specific LLMs 👨🏻‍⚕️, along with the following directions:

Updates

Jan 2025: 🎉🎉 MLLM-Bench was accepted to NAACL’25 main conference!

Dec 2024: 🎉🎉 TRIM was accepted to COLING’25 main conference!

Nov 2024: MM-Detect 🕵️ released! MM-Detect is the first Data Contamination Detection Framework for MLLMs! More information can be found in the 📃 paper and on GitHub.

Sep 2024: TRIM ✂️ released! TRIM is a simple yet effective Image Token Reduction Method for efficient MLLMs! More information can be found in the 📃 paper, on 🤗 HuggingFace, and on GitHub.

Sep 2024: LongLLaVA 🐍🦙 released! LongLLaVA is the first MLLM with a hybrid architecture that can handle up to 1000 images! More information can be found in the 📃 paper, on 🤗 HuggingFace, and on GitHub. 🥈 #2 Paper of the Day on Hugging Face Daily Papers.

July 2024: 🎉🎉 Two papers, MileBench and HuatuoGPT2, were accepted to COLM’24 main conference!

April 2024: MileBench 🛣️ released! MileBench is a pioneering benchmark designed to rigorously test the MultImodal Long-contExt capabilities of MLLMs. More information can be found on the 🌐 website, in the 📃 paper, on 🤗 HuggingFace, and on GitHub.

March 2024: 🎉🎉 Two papers, CMB and AceGPT, were accepted to NAACL’24 main conference!

Before 2024

Nov 2023: HuatuoGPT2 released! Try it out on the 🌐 demo! HuatuoGPT2 employs an innovative domain adaptation method to significantly boost its medical knowledge and dialogue proficiency, and it shows SOTA performance on several medical benchmarks, notably surpassing GPT-4 in expert evaluations and on fresh medical licensing exams. More information can be found in the 📃 paper and on 🤗 HuggingFace.

Sep 2023: We released AceGPT, which achieved top performance among open-source Arabic language models in benchmark tests. More information can be found in the 📃 paper and on 🤗 HuggingFace.

Aug 2023: Check out our 📃 new paper, which benchmarks prevalent medical LLMs on their medical knowledge and clinical diagnostic capabilities. More information can be found on the 🌐 website and on 🤗 HuggingFace.

Jul 2023: Started my journey at CUHK-Shenzhen as a research assistant under the guidance of Benyou Wang.

Jun 2023: I defended my master's thesis and received my master's degree in software engineering. Thanks to all those who have supported me.

Aug 2022 - Apr 2023: Completed my internship with Jiaxing Zhang, working on LLM SFT.

Papers

Google Scholar

Preprints

2025

2024

2023

2022


Awards

  • Outstanding Graduate Student, Nanjing University, 2022
  • Yingcai Scholarship, Nanjing University, 2022
  • Renmin Scholarship (People’s Scholarship), Nanjing University, 2018-2021
  • Third Runner-Up in the 15th Citi Cup Financial Innovation Application Competition, Citigroup, 2019
  • Second Runner-Up in the 2019 “Chain to Future” University Blockchain Technology Application Competition, CCF, 2019
  • Outstanding Student Leader of the Communist Youth League, Nanjing University, 2018-2019

Services

  • Conference reviewer: EMNLP, ACL Rolling Review