Dingjie Song

Welcome! I am a research assistant affiliated with the CUHK-Shenzhen NLP group, under the guidance of Dr. Benyou Wang. I obtained my M.E. from the Software Institute and the Natural Language Processing Group at Nanjing University, advised by Dr. Xinyu Dai and Dr. Jidong Ge. Prior to this, I completed my B.E. at the Software Institute of Nanjing University.

Email: dingjiesong.cs@gmail.com

Google Scholar / CV

Links: Research Overview / Updates / Awards / Papers

Research Overview

My research interests are in Natural Language Processing, especially intelligent interactive systems 🤖 and domain-specific LLMs 👨🏻‍⚕️.

Updates

Nov 2024: MM-Detect 🕵️ released! MM-Detect is the first Data Contamination Detection Framework for MLLMs! More information can be found in the 📃 paper and on GitHub.

Sep 2024: TRIM ✂️ released! TRIM is a simple yet effective Image Token Reduction Method for efficient MLLMs! More information can be found in the 📃 paper, on 🤗 HuggingFace, and on GitHub.

Sep 2024: LongLLaVA 🐍🦙 released! LongLLaVA is the first MLLM with a hybrid architecture that can handle up to 1000 images! More information can be found in the 📃 paper, on 🤗 HuggingFace, and on GitHub. 🥈 #2 Paper of the Day on Hugging Face Daily Papers.

July 2024: 🎉🎉 Two papers, MileBench and HuatuoGPT2, were accepted to the COLM’24 main conference!

April 2024: MileBench 🛣️ released! MileBench is a pioneering benchmark designed to rigorously test the MultImodal Long-contExt capabilities of MLLMs. More information can be found on the 🌐 website, in the 📃 paper, on 🤗 HuggingFace, and on GitHub.

March 2024: 🎉🎉 Two papers, CMB and AceGPT, were accepted to the NAACL’24 main conference!

Before 2024

**Nov 2023**: HuatuoGPT2 released! Try it out on the [🌐 demo](https://www.huatuogpt.cn/#/)! HuatuoGPT2 employs an innovative domain adaptation method to significantly boost its medical knowledge and dialogue proficiency, and it showcases SOTA performance on several medical benchmarks, especially **surpassing GPT-4 in expert evaluations and the fresh medical licensing exams**. More information can be found in the [📃 paper](https://arxiv.org/abs/2311.09774) and on [🤗 HuggingFace](https://huggingface.co/FreedomIntelligence/HuatuoGPT2-34B).

**Sep 2023**: We published AceGPT, which achieved **top performance** among open-source Arabic language models in benchmark tests. More information can be found in the [📃 paper](https://arxiv.org/abs/2309.12053) and on [🤗 HuggingFace](https://huggingface.co/FreedomIntelligence/AceGPT-13B-chat).

**Aug 2023**: [Check out our 📃 new paper](https://arxiv.org/abs/2308.08833), which focuses on benchmarking prevalent Medical LLMs for their medical knowledge and clinical diagnostic capabilities. More information can be found on the [🌐 website](https://cmedbenchmark.llmzoo.com/#home) and on [🤗 HuggingFace](https://huggingface.co/datasets/FreedomIntelligence/CMB).

**Jul 2023**: Started my journey at CUHK-Shenzhen as a research assistant under the guidance of [Benyou Wang](https://scholar.google.com/citations?user=Jk4vJU8AAAAJ).

**Jun 2023**: I defended my master's thesis and received my master's degree in software engineering. Thanks to all those who have supported me.

**Aug 2022 - Apr 2023**: Finished my internship with [Jiaxing Zhang](https://scholar.google.com/citations?user=ozXuhOUAAAAJ) on LLM SFT.

Papers

Google Scholar

2024

2023

2022


Awards

  • Outstanding Graduate Student, Nanjing University, 2022
  • Yingcai Scholarship, Nanjing University, 2022
  • Renmin Scholarship (People’s Scholarship), Nanjing University, 2018-2021
  • Third Runner-Up in the 15th Citi Cup Financial Innovation Application Competition, Citigroup, 2019
  • Second Runner-Up in the 2019 “Chain to Future” University Blockchain Technology Application Competition, CCF, 2019
  • Outstanding Student Leader of the Communist Youth League, Nanjing University, 2018-2019

Services

  • Conference reviewer: EMNLP, ACL Rolling Review