Jitesh Jain

Ph.D. Student

I am a second-year Ph.D. student in the School of Interactive Computing at Georgia Tech, advised by Humphrey Shi. I completed my Bachelor's in Computer Science and Engineering in 2023 at IIT Roorkee. I am currently working as a Research Intern in the PRIOR team at AI2 with Christopher Clark.

In the past, I have interned at Microsoft Research, Redmond (Summer 2024, with Jianwei Yang) and Picsart AI Research (Summer 2021-22, with Humphrey Shi).

My current research interests revolve around developing multimodal systems. I am also interested in representation learning, efficiency, and various real-world applications of multimodal systems. My recent work focuses on analyzing and improving the visual perception ability of Multimodal Large Language Models [OLA-VLM, VCoder], building on my experience developing models for dense prediction tasks [OneFormer, SeMask].

Reach out if you are interested in my research or would like to discuss any ideas. If you are a self-motivated researcher looking for guidance on one of your projects, feel free to drop me an email with a brief description of your (manifested) research project.

I am seeking internship/collaboration opportunities starting in Fall 2025/Spring 2026. If you are working on multimodal reasoning, unified multimodal models, or agent systems, please reach out!

Professional Life Happenings

  • [May 2025]: Started my Research Internship in the PRIOR team at Allen AI. HMU if you are in Seattle! 🥂
  • [April 2025]: Passed my Ph.D. Qualifying Exam! 🥂
  • [December 2024]: Check out my new blog, "what's a good metric for PhD Students?"! 📜
  • [December 2024]: Check out OLA-VLM, the result of my internship at Microsoft Research, Redmond! 🚀
  • [May 2024]: Excited to start my Summer Internship at Microsoft Research, Redmond! 🧑‍💻
  • [February 2024]: VCoder is accepted to CVPR 2024! See you in Seattle! 🥂
  • [June 2023]: I will be joining Georgia Tech as a Ph.D. student in Computer Science in Fall 2023! 🥂
Featured Publications
Blogs

what's a good metric for PhD Students?

So, I started my PhD in CS at Georgia Tech in Fall 2023. It's only been around 16 months, but it feels more like several years in deep learning research trends.

Ascending the Research Trail

In case you missed it, I shared my experience as a rookie undergrad researcher in a previous blog: Riding the Noisy Research Track. Since then, I have grown from a rookie researcher into a more mature beginner researcher with a better outlook on the bigger picture in research (thinking beyond publishing a paper) owing to my close collaboration with the SHI Labs in the last couple of years.

Summer Diaries: Intern Diary of an Undergrad DL Researcher

This summer, I worked as a remote research intern at SHI Lab @ University of Oregon (UO) and Picsart AI Research (PAIR). I joined the SHI Lab in November 2020, during my second year, to work with Professor Humphrey Shi, and continued my work there over the summer. I joined PAIR in the summer (June 2021).

The Contemporary Overthinking Problem

Since the onset of the COVID-19 pandemic and the lockdowns and stay-and-work-from-home situations that followed, the concept of overthinking has gained popularity. Moreover, now and then, I find Gen Z people (and sometimes even me) replying to “What you doing?