KAIST BDI Lab | Big Data Intelligence Lab
The Big Data Intelligence Lab at KAIST is led by Prof. Joyce Jiyoung Whang at the School of Computing. The group conducts fundamental research on diverse aspects of data intelligence, with an emphasis on developing novel computational algorithms for massive, complex datasets arising across various scientific and industrial applications. In particular, the group focuses on graph machine learning and mining techniques, where data is represented and interpreted in terms of the interactions between entities. The group explores various AI, graph machine learning, and data mining tasks to solve open challenges in these areas and develop scalable algorithms for real-world data.
BDI Lab YouTube Channel
Research Areas
Graph Machine Learning
Deep Learning
Knowledge Graphs
Data Mining
Big Data Analytics
Data Science
Professor
Joyce Jiyoung Whang (황지영)
Associate Professor, School of Computing, KAIST
Adjunct Professor, Kim Jaechul Graduate School of AI, KAIST
Adjunct Professor, Graduate School of Data Science (GSDS), KAIST
Email:
jjwhang@kaist.ac.kr
Office:
KAIST, N1 Building 905
Education
Ph.D. in Computer Science, The University of Texas at Austin, TX, USA, 2015.
(Supervisor: Inderjit S. Dhillon)
Research Interests
Graph Machine Learning, Deep Learning, Data Mining, Big Data Analytics, and Data Science.
Curriculum Vitae
Members
Ph.D. Students
Chanyoung Chung (정찬영)
Mar. 2022 ~:
M.S./Ph.D. Integrated Program in School of Computing
Mar. 2021 ~ Feb. 2022:
M.S. Program in School of Computing
Feb. 2021:
B.S. in Computing (double major: Mathematical Sciences), KAIST
Jaejun Lee (이재준)
Mar. 2023 ~:
Ph.D. Program in School of Computing
Feb. 2023:
M.S. in Computing, KAIST
Aug. 2021:
B.S. in Computing (double major: Mathematical Sciences), KAIST
Heehyeon Kim (김희현)
Sep. 2024 ~:
Ph.D. Program in School of Computing
Aug. 2024:
M.S. in Computing, KAIST
Aug. 2022:
B.S. in IoT Artificial Intelligence Convergence, Chonnam National University
Minsung Hwang (황민성)
Sep. 2024 ~:
Ph.D. Program in School of Computing
Aug. 2024:
M.S. in Computing, KAIST
Feb. 2023:
B.S. in Electrical Engineering (minor: Computing), KAIST
Kyeongryul Lee (이경률)
Mar. 2026 ~:
Ph.D. Program in School of Computing
Feb. 2026:
M.S. in Data Science, KAIST
Feb. 2024:
B.S. in Data Science (double major: Financial Mathematics & Statistics), The University of Sydney
M.S. Students
Donggyu Yoon (윤동규)
Mar. 2025 ~:
M.S. Program in School of Computing
Feb. 2025:
B.S. in Computing (minor: Electrical Engineering), KAIST
Kidong Nam (남기동)
Mar. 2025 ~:
KT-AI M.S. Program in School of Computing
Aug. 2024:
B.S. in Philosophy (double major: Computer Science and Engineering), Sogang University
Jeesoo Kim (김지수)
Sep. 2025 ~:
M.S. Program in Graduate School of Data Science
Aug. 2025:
B.A. in Business Administration (double major: Computer Science and Engineering), Sogang University
Jeemin Kim (김지민)
Mar. 2026 ~:
M.S. Program in School of Computing
Feb. 2026:
B.S. in Computing (minor: Bio and Brain Engineering), KAIST
Seheon Kim (김세헌)
Mar. 2026 ~:
M.S. Program in School of Computing
Feb. 2026:
B.S. in Computing, KAIST
Suhyeon Lim (임수현)
Mar. 2026 ~:
KT-AI M.S. Program in School of Computing
Aug. 2024:
B.S. in Electric Engineering, Kyung Hee University
Alumni
Jaejun Lee (이재준)
M.S. in Computing, Feb. 2023, KAIST
Thesis:
Image-based Augmented Knowledge Graph Embedding
Aug. 2021:
B.S. in Computing (double major: Mathematical Sciences), KAIST
Seunghwan Kong (공승환)
M.S. in Computing, Feb. 2023, KAIST
Thesis:
Representation Learning on Knowledge Graphs with Entity Types
Aug. 2021:
B.S. in Computing (double major: Mathematical Sciences), KAIST
Heehyeon Kim (김희현)
M.S. in Computing, Aug. 2024, KAIST
Thesis:
Fraud Detection Using Graph Neural Networks
Aug. 2022:
B.S. in IoT Artificial Intelligence Convergence, Chonnam National University
Minsung Hwang (황민성)
M.S. in Computing, Aug. 2024, KAIST
Thesis:
Theoretical Generalization Bounds for Knowledge Graph Representation Learning
Feb. 2023:
B.S. in Electrical Engineering (minor: Computing), KAIST
Jinhyeok Choi (최진혁)
M.S. in Computing, Feb. 2025, KAIST
Thesis:
Spatio-Temporal Graph Forecasting by Modeling Long-Range Dependency via Selective State Spaces
Feb. 2023:
B.S. in Computing (minor: Electrical Engineering), KAIST
Junho Park (박준호)
M.S. in Computing, Feb. 2025, KAIST
Thesis:
Root Cause Analysis for Microservice Systems with Resource-Sharing Dependencies
Aug. 2021:
B.S. in Electrical Engineering and Computer Science, GIST
Kyeongryul Lee (이경률)
M.S. in Data Science, Feb. 2026, KAIST
Thesis:
Probing Safety Vulnerabilities in Large Language Models and Multimodal Models via Auto-Generated Jailbreak Prompts
Feb. 2024:
B.S. in Data Science (double major: Financial Mathematics & Statistics), The University of Sydney
Minhyeong An (안민형)
M.S. in Computing, Feb. 2026, KAIST
Thesis:
Graph-enhanced Retrieval-Augmented Generation for Microservice Root Cause Analysis with Large Language Models
Feb. 2024:
B.S. in Computing (double major: Electrical Engineering), KAIST
Selected Publications
International Publications
‡: Equal Contribution, *: Corresponding Author
2025
Beneath the Facade: Probing Safety Vulnerabilities in LLMs via Auto-Generated Jailbreak Prompts
H. Kim, K. Lee, and J. J. Whang*
Findings of the Association for Computational Linguistics: EMNLP (Findings of EMNLP)
Link
Paper
BibTeX
Slides
Poster
Code
2025
Structure Is All You Need: Structural Representation Learning on Hyper-Relational Knowledge Graphs
J. Lee and J. J. Whang*
International Conference on Machine Learning (ICML)
Link
Paper
BibTeX
Poster
Code
2025
Stability and Generalization Capability of Subgraph Reasoning Models for Inductive Knowledge Graph Completion
M. Hwang, J. Lee, and J. J. Whang*
International Conference on Machine Learning (ICML)
Link
Paper
BibTeX
Poster
2025
Unveiling the Threat of Fraud Gangs to Graph Neural Networks: Multi-Target Graph Injection Attacks against GNN-Based Fraud Detectors
J. Choi, H. Kim, and J. J. Whang*
AAAI Conference on Artificial Intelligence (AAAI)
Link
Paper
arXiv
BibTeX
Slides
Poster
Code
2025
Unifying Inductive, Cross-Domain, and Multimodal Learning for Robust and Generalizable Recommendation
C. Chung, K. Lee, S. Park, and J. J. Whang*
Multimodal Generative Search and Recommendation (MMGenSR) Workshop at Conference on Information and Knowledge Management (CIKM)
arXiv
BibTeX
Code
2025
SAIF: A Comprehensive Framework for Evaluating the Risks of Generative AI in the Public Sector
K. Lee, H. Kim, and J. J. Whang*
AI for Public Missions (AIPM) Workshop at AAAI Conference on Artificial Intelligence (AAAI)
arXiv
BibTeX
2024
PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning
J. Lee, M. Hwang, and J. J. Whang*
International Conference on Machine Learning (ICML)
Link
Paper
arXiv
BibTeX
Slides
Poster
Code
2024
Why So Gullible? Enhancing the Robustness of Retrieval-Augmented Models against Counterfactual Noise
G. Hong, J. Kim, J. Kang, S. Myaeng, and J. J. Whang*
Findings of the Association for Computational Linguistics: NAACL (Findings of NAACL)
Link
Paper
arXiv
BibTeX
Slides
Poster
Code
2024
SpoT-Mamba: Learning Long-Range Dependency on Spatio-Temporal Graphs with Selective State Spaces
J. Choi, H. Kim, M. An, and J. J. Whang*
Spatio-Temporal Reasoning and Learning (STRL) Workshop at International Joint Conference on Artificial Intelligence (IJCAI)
Link
Paper
arXiv
BibTeX
Slides
Code
2023
VISTA: Visual-Textual Knowledge Graph Representation Learning
J. Lee, C. Chung, H. Lee, S. Jo, and J. J. Whang*
Findings of Empirical Methods in Natural Language Processing (Findings of EMNLP)
Link
Paper
BibTeX
Slides
Poster
Code
2023
FinePrompt: Unveiling the Role of Finetuned Inductive Bias on Compositional Reasoning in GPT-4
J. Kim, G. Hong, S. Myaeng, and J. J. Whang*
Findings of Empirical Methods in Natural Language Processing (Findings of EMNLP)
Link
Paper
BibTeX
Poster
Code
2023
Representation Learning on Hyper-Relational and Numeric Knowledge Graphs with Transformers
C. Chung, J. Lee, and J. J. Whang*
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)
Link
arXiv
BibTeX
Slides
Poster
Code
Promo
2023
InGram: Inductive Knowledge Graph Embedding via Relation Graphs
J. Lee, C. Chung, and J. J. Whang*
International Conference on Machine Learning (ICML)
Link
Paper
arXiv
BibTeX
Slides
Poster
Code
2023
Learning Representations of Bi-level Knowledge Graphs for Reasoning beyond Link Prediction
C. Chung and J. J. Whang*
AAAI Conference on Artificial Intelligence (AAAI)
Link
Paper
arXiv
BibTeX
Slides
Poster
Code
2023
Dynamic Relation-Attentive Graph Neural Networks for Fraud Detection
H. Kim, J. Choi, and J. J. Whang*
Machine Learning on Graphs (MLoG) Workshop at IEEE International Conference on Data Mining (ICDM)
Link
Paper
arXiv
BibTeX
Poster
Code
2022
Semantic Grasping via a Knowledge Graph of Robotic Manipulation: A Graph Representation Learning Approach
J. H. Kwak, J. Lee, J. J. Whang*, and S. Jo*
IEEE Robotics and Automation Letters
Link
Paper
BibTeX
Slides
2022
HiddenCPG: Large-Scale Vulnerable Clone Detection Using Subgraph Isomorphism of Code Property Graphs
S. Wi, S. Woo, J. J. Whang, and S. Son*
The ACM Web Conference
Link
Paper
BibTeX
Slides
2021
Knowledge Graph Embedding via Metagraph Learning
C. Chung and J. J. Whang*
International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR)
Link
Paper
BibTeX
Slides
Poster
Code
2021
Image-based Lifelogging: User Emotion Perspective
J. Bum, H. Choo, and J. J. Whang*
Computers, Materials & Continua
Link
Paper
BibTeX
2021
Sentiment-based Sub-event Segmentation and Key Photo Selection
J. Bum, J. J. Whang*, and H. Choo*
Journal of Visual Communication and Image Representation
Link
Paper
BibTeX
2020
MEGA: Multi-View Semi-Supervised Clustering of Hypergraphs
J. J. Whang, R. Du, S. Jung, G. Lee, B. Drake, Q. Liu, S. Kang, and H. Park
International Conference on Very Large Data Bases (VLDB)
Link
Paper
BibTeX
Slides
Code
Video
2020
Sparse Probabilistic K-means
Y. M. Jung, J. J. Whang, and S. Yun*
Applied Mathematics and Computation
Link
Paper
BibTeX
2020
Scalable Anti-TrustRank with Qualified Site-level Seeds for Link-based Web Spam Detection
J. J. Whang*, Y. Jung, S. Kang, D. Yoo, and I. S. Dhillon
Workshop on CyberSafety: Computational Methods in Online Misbehavior at the Web Conference
Link
Paper
BibTeX
Slides
Code
2019
Hyperlink Classification via Structured Graph Embedding
G. Lee, S. Kang, and J. J. Whang*
International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR)
Link
Paper
BibTeX
Slides
Poster
Code
2019
Non-exhaustive, Overlapping Clustering
J. J. Whang*, Y. Hou, D. F. Gleich, and I. S. Dhillon
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Link
Paper
BibTeX
Slides
Code
2019
SmartGrip: Grip Sensing System for Commodity Mobile Devices through Sound Signals
N. Kim, J. Lee, J. J. Whang, and J. Lee*
Personal and Ubiquitous Computing
Link
Paper
BibTeX
2018
Fast Asynchronous Anti-TrustRank for Web Spam Detection
J. J. Whang*, Y. S. Jeong, I. S. Dhillon, S. Kang, and J. Lee
Workshop on MIS2: Misinformation and Misbehavior Mining on the Web at ACM International Conference on Web Search and Data Mining (WSDM)
Paper
BibTeX
Poster
2017
Non-Exhaustive, Overlapping Co-Clustering
J. J. Whang* and I. S. Dhillon
ACM Conference on Information and Knowledge Management (CIKM)
Link
Paper
arXiv
BibTeX
Poster
Code
2017
An Empirical Study of Community Overlap: Ground-truth, Algorithmic Solutions, and Implications
J. J. Whang*
ACM Conference on Information and Knowledge Management (CIKM)
Link
Paper
BibTeX
Poster
2016
Fast Multiplier Methods to Optimize Non-exhaustive, Overlapping Clustering
Y. Hou, J. J. Whang, D. F. Gleich, and I. S. Dhillon
SIAM International Conference on Data Mining (SDM)
Link
Paper
BibTeX
2016
Overlapping Community Detection Using Neighborhood-Inflated Seed Expansion
J. J. Whang*, D. F. Gleich, and I. S. Dhillon
IEEE Transactions on Knowledge and Data Engineering (TKDE)
Link
Paper
BibTeX
Code
2015
Non-exhaustive, Overlapping Clustering via Low-Rank Semidefinite Programming
Y. Hou, J. J. Whang, D. F. Gleich, and I. S. Dhillon
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)
Link
Paper
BibTeX
Slides
Poster
2015
Scalable Data-driven PageRank: Algorithms, System Issues, and Lessons Learned
J. J. Whang, A. Lenharth, I. S. Dhillon, and K. Pingali
International European Conference on Parallel and Distributed Computing (Euro-Par)
Link
Paper
BibTeX
2015
Non-exhaustive, Overlapping k-means
J. J. Whang, I. S. Dhillon, and D. F. Gleich
SIAM International Conference on Data Mining (SDM)
Link
Paper
BibTeX
Poster
Code
2013
Stochastic Blockmodel with Cluster Overlap, Relevance Selection, and Similarity-Based Smoothing
J. J. Whang, P. Rai, and I. S. Dhillon
IEEE International Conference on Data Mining (ICDM)
Link
Paper
BibTeX
Slides
2013
Overlapping Community Detection Using Seed Set Expansion
J. J. Whang, D. F. Gleich, and I. S. Dhillon
ACM Conference on Information and Knowledge Management (CIKM)
Link
Paper
BibTeX
Slides
Code
2012
Scalable and Memory-Efficient Clustering of Large-Scale Social Networks
J. J. Whang, X. Sui, and I. S. Dhillon
IEEE International Conference on Data Mining (ICDM)
Link
Paper
BibTeX
Slides
2012
Scalable Clustering of Signed Networks using Balance Normalized Cut
K. Chiang, J. J. Whang, and I. S. Dhillon
ACM Conference on Information and Knowledge Management (CIKM)
Link
Paper
BibTeX
Slides
2012
Parallel Clustered Low-rank Approximation of Graphs and Its Application to Link Prediction
X. Sui, T. Lee, J. J. Whang, B. Savas, S. Jain, K. Pingali, and I. S. Dhillon
International Workshop on Languages and Compilers for Parallel Computing (LCPC)
Link
Paper
BibTeX
Slides
Domestic Papers
2025
Root Cause Analysis for Microservice Systems Using Anomaly Propagation by Resource Sharing (자원 공유에 따른 이상치 전파를 활용한 마이크로서비스 시스템 결함의 근본 원인 분석)
Junho Park (박준호) and Joyce Jiyoung Whang (황지영)
Journal of KIISE (정보과학회논문지), April 2025
Paper
2022
Knowledge Graph Embedding with Entity Type Constraints (개체 유형 정보를 활용한 지식 그래프 임베딩)
Seunghwan Kong (공승환), Chanyoung Chung (정찬영), 주수헌, and Joyce Jiyoung Whang (황지영)
Journal of KIISE (정보과학회논문지), September 2022
Paper
Code
2022
Knowledge Graph Embedding with Dynamic Attention (동적 어텐션을 이용한 지식그래프 임베딩)
Minsung Hwang (황민성) and Joyce Jiyoung Whang (황지영)
Proceedings of the Korea Computer Congress (한국컴퓨터종합학술대회 논문집), June 2022
Paper
Poster
Patents
2025
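Method and System for Bi-level Knowledge Graph Embedding (복층 지식 그래프 임베딩 방법 및 그 시스템)
Joyce Jiyoung Whang (황지영) and Chanyoung Chung (정찬영)
Registration No.: 10-2834344-0000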
Link
Projects
National Research Foundation of Korea
Responsible Multimodal Graph AI (책임 있는 멀티모달 그래프 인공지능)
Mar. 2025 ~ Feb. 2028
Principal Investigator
Extendable Graph Representation Learning (확장 가능한 그래프 표현학습)
Mar. 2022 ~ Feb. 2025
Principal Investigator
MARS Artificial Intelligence Integrated Research Center (MARS 인공지능 통합연구센터)
Aug. 2018 ~ Feb. 2024
Semi-Supervised Multi-View Learning with Graphs (다각적 데이터 융합을 통한 그래프기반 준지도 학습)
Mar. 2019 ~ Feb. 2022
Principal Investigator
Modeling Information Propagation by Exploiting the Clustering Structure of Massive Social Networks (거대 소셜 네트워크의 클러스터링 구조를 활용한 정보 전파 메커니즘 모델링)
Nov. 2016 ~ Oct. 2019
Principal Investigator
Samsung Electronics
AI Agent-based Omni Knowledge Graph Construction and its Applications (AI Agent 기반 Omni Knowledge Graph 구축 및 활용 기술 개발)
Oct. 2025 ~ Sep. 2030
Principal Investigator
Knowledge Graph Modeling for Semiconductor Data (반도체 공정 데이터의 다각적 분석을 위한 지식 그래프 모델링)
Sep. 2020 ~ Sep. 2023
Principal Investigator
Institute of Information & communications Technology Planning & Evaluation (IITP)
LG AI STAR Talent Development Program for Leading Large-Scale Generative AI Models in the Physical AI Domain (Physical AI 분야의 거대 생성모델 기술 선도를 위한 LG AI STAR 인재양성 사업)
Jul. 2025 ~ Dec. 2028
Development of AI Technology to support Expert Decision-making that can Explain the Reasons/Grounds for Judgment Results based on Expert Knowledge (전문지식 대상 판단결과의 이유/근거를 설명가능한 전문가 의사결정 지원 인공지능 기술개발)
Apr. 2022 ~ Dec. 2026
Kyobo & DPLANEX
Multimodal Cross-Domain Recommendation Systems for Personalized Services (개인 맞춤형 서비스를 위한 멀티모달 크로스 도메인 추천 시스템)
Dec. 2024 ~ Nov. 2025
Principal Investigator
GNN-based Insurance Fraud Detection (그래프 신경망(GNN) 기반 보험사기 예측 연구)
Aug. 2022 ~ Nov. 2025
Principal Investigator
Telecommunications Technology Association (TTA)
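Research on Establishing a Stability Evaluation Framework Responsive to Technological Innovation in Generative AI (생성형 AI의 기술 혁신에 대응 가능한 안정성 평가체계 수립 연구)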
Oct. 2024 ~ Dec. 2024
Principal Investigator
Building a Dataset Roadmap for Safety Evaluation by AI Risk Area (AI 위험 분야별 안전성 평가를 위한 데이터셋 로드맵 구축)
Oct. 2024 ~ Dec. 2024
Principal Investigator
DevStack
Development of a Technology for Analyzing Root Causes and Proposing Countermeasures for Containerized OpenStack Service’s Failures Using LLM and Knowledge Graphs (거대언어모델 및 지식그래프를 활용하여 컨테이너화된 오픈스택 장애원인 분석 및 대응방안 제시 기술개발)
Mar. 2025 ~ Nov. 2025
Principal Investigator
Development of Intelligent Kubernetes Fault Cause Analysis and Response Proposal Method Using LLM and Knowledge Graphs (LLM 및 지식그래프를 활용한 지능형 K8s 장애 원인 분석 및 대응 방안 제시 기술 개발)
May 2024 ~ Nov. 2024
Principal Investigator
Recruit
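Lab Video
Introduction Materials
Explanatory Materials
[Now Recruiting Students]
New M.S. students:
Application form link
Undergraduate researchers:
Application link
Announcement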
Students interested in our lab are welcome to contact
jjwhang@kaist.ac.kr
with any inquiries.
Location
KAIST, N1 Building
Prof.:
N1, 905
Lab.:
N1, 921
Admin.:
N1, 904
Email
Professor Joyce Jiyoung Whang
jjwhang@kaist.ac.kr
Tel
Prof.:
042-350-3584
Lab.:
042-350-7784
Admin.:
042-350-7884