Zhihua Tian
AI Researcher | Healthcare AI & Large Language Models Specialist
Education
-
Master of Operations Research & Control Theory
School of Artificial Intelligence, Nankai University (September 2023 - Present)
(Supervisors: Prof. Jianda Han, Prof. Weiguang Huo)
-
Bachelor of Computer Science and Technology
School of Computer Science, Hangzhou Dianzi University (September 2017 - June 2021)
Research Experience
-
Healthcare AI (September 2023 - Present)
- Proposed a novel video-based framework for predicting Freezing of Gait (FoG) in Parkinson's disease that addresses the multi-camera requirement and occlusion issues. Improved prediction accuracy from 79.5 ± 3.3% to 83.2 ± 2.8% through IMU-guided multimodal contrastive learning on monocular leg-movement videos.
- Developed an intelligent video-based diagnostic system (gaitanalysis.simplaj.fun) for accurate and comprehensive assessment of Parkinsonian gait disorders, utilizing keypoint detection and Graph Convolutional Networks.
- Achieved performance comparable to clinical experts in disease severity prediction (AUC=0.87, F1=0.806).
- Distinguished medication effects on gait disorders with 73.68% accuracy, demonstrating finer resolution than UPDRS in detecting subtle drug-induced gait changes.
- Discovered novel digital biomarkers via an explainable framework that are more sensitive to disease progression and drug response than traditional markers.
-
Large Language Models (LLMs) (July 2024 - Present)
- Enhanced Qwen2-VL-7B's geometric reasoning capabilities by constructing a multimodal image-text reasoning dataset of 3,000 pairs, boosting geometric proof analysis accuracy by 18% over baseline via supervised fine-tuning (SFT).
- Distilled DeepSeek-R1 for operations research optimization modeling problems by applying SFT to a 1.5B-parameter model; outperformed GPT-4 by 27% on NL4OPT and by 5% on MAMO EasyLP.
- Developed a voice appointment assistant agent (va.simplaj.top) using AutoGen, enabling voice-based appointment modification via ASR, function calling, and TTS.
- Trained Qwen2.5-VL-7B with SFT and GRPO to assess and score dietary images for chronic disease patients; found that GRPO outperformed SFT on specific scoring tasks.
- Explored enabling Multimodal Large Language Models (MLLMs) to learn from the literature for multi-task diagnosis from videos of Parkinson's patients (gait, finger-tapping, standing), addressing the limited generalization and single-task focus of traditional deep learning models.
Awards and Honors
- Alibaba Cloud Tianchi ModelScope-Sora Challenge - Third Place (4/347 teams, 2024)
- Chinese Operations Research Society Competition - First Prize (Top 3%, 2024)
- Nankai University - Second-Class Scholarship (Top 20%, 2023-2024)
- Nankai University - Academic Competition Special Scholarship (Top 10%, 2023-2024)
Work Experience
Technical Skills
- Programming Languages: C, Python
- Deep Learning Frameworks: PyTorch, TRL, ONNX
- Computer Vision: Face/Body Detection, Pose Estimation, Action Classification
- Tools: Linux, Docker, Git, Blender