Broadly, I am
interested in large-scale parallelism in computer systems and its
implications for application performance, operating system design, fault
tolerance and big data in cloud computing. My particular interests focus on
secondary memory system technologies and optimization, scalable file and
key-value storage systems, scalable machine learning and systematic testing
for large-scale systems.
I have a strong interest in shepherding technological advances from the blackboard, through standards, to commercial reality.
I joined the faculty
of CMU's Computer Science Department in 1991. Previously I received a
Ph.D. and an M.Sc. in Computer Science in 1991 and 1987, respectively,
from the University of California at Berkeley. Prior to Berkeley, I received
a Bachelor of Mathematics in Computer Science and Applied Mathematics
in 1983 from the University of Waterloo in Ontario, Canada.
In 1993 I founded CMU's
Parallel Data Laboratory (PDL)
and led it until April 1999.
Today the PDL is led by
Dave Andersen
and
George Amvrosiadis
. The PDL is a
community
that typically comprises
between 6 and 9 faculty, 2 to 3 dozen students and 4 to 10 staff. It
receives support and guidance from a consortium of 15 to 25 companies
with interests in parallel data systems, the
Parallel Data Consortium.
This community holds biannual
retreats
and workshops
to exchange technology ideas, analysis and future directions.
The
publications
of the PDL are
available for your inspection.
The principal contributions
of my first twenty years of research: Redundant Arrays of Inexpensive
Disks (
RAID
), Informed Prefetching
and Caching (
TIP
) and Network-Attached Secure
Disks (
NASD
),
whose architectural basis shapes the
Google File System
and its descendants such as the Hadoop Distributed File System (HDFS) and the Parallel Network
File System,
pNFS,
features in
NFS v4.1
(video discussion),
have all stimulated derivative
research and development in academia and industry. RAID, in particular,
is now the organizing concept of a 10+ billion-dollar marketplace (more
on RAID in my 1995 RAID
tutorial
).
In 1999 I started
Panasas
Inc., a scalable storage cluster company using an object storage architecture and providing 100s of TB of
high-performance storage in a single management domain for national laboratory, energy sector, auto/aero-design, life
sciences, financial modeling, digital animation, and engineering design markets
(USENIX FAST08, PDSW07, SC04).
In 2006 I founded a Petascale Data Storage Institute (
PDSI
) for the Department of Energy's Scientific Discovery through Advanced Computing (
SciDAC
). Led by CMU, with partners at Los Alamos, Sandia, Oak Ridge, Pacific Northwest and Lawrence Berkeley National Labs, and University of California, Santa Cruz and University of Michigan, Ann Arbor, this Institute gathered leading experts in leadership-class supercomputing storage systems to address the challenges involved in moving from terascale computers to the petascale computers of the next decade.
PDSI has run its course, leaving ongoing collaboration among the community at the annual Parallel Data Storage Workshop (PDSW), between Los Alamos National Laboratory and CMU (IRHPIT), and an open source parallel checkpoint middleware file system (PLFS).
In 2008 I turned to Data Intensive Scalable Computing, Clouds, and Scalable Analytics, participating in the
design and installation of 2 TF, 2 TB, 1/2 PB of computing in an OpenCirrus and an OpenCloud cluster. We installed and
operate a Hadoop cluster for any and all researchers at CMU and have published observations on their use of this
facility and benchmarking tools for it. Astrophysics was a strong early user, with computational biology and geophysics
filling out a natural science slate, but the heaviest users were doing variants of machine learning and big data. The
major collaboration has been the Intel Science and Technology Center for Cloud Computing (ISTC-CC).
In 2011 I helped the New Mexico Consortium recycle retired Los Alamos National Laboratory supercomputing clusters
into an NSF funded open platform for scalable systems development and testing (PRObE). PRObE offers multiple clusters with 1000s of cores in either
low-core-count high-node-count clusters or high-core-count low-node-count clusters. Researchers at universities and labs
from all around the country are using PRObE to demonstrate the scalability of their systems research ideas.
In 2012 I rallied a team of Machine Learning and Distributed
Systems researchers to form a
Big Learning
research group. Our premise is that Machine Learning on Big Data presents
both theoretical (exploitation of the inherent search-iness of machine
learning and ensuring convergence given concurrency-induced error) and
practical (distributed systems latency hiding and load balancing given
unusually flexible tolerance for bounded error) challenges.
Also in 2012 I created the curriculum for and welcomed the
first class of Big Data Systems masters students, now known as the
Systems Major
in the Master of Computational Data Sciences
(MCDS).
MCDS graduates are typically employed in the U.S. tech industry,
earning
an average salary of over $115,000 in their first post-MCDS job.
On Jan. 2, 2018 I started with the Vector Institute for AI, founded in April 2017 by ACM Turing Award winner Geoffrey Hinton (Chief Scientific Advisor) and colleagues, as its first President and Chief Executive Officer. Over its first five years, I grew Vector from 20 employees (8 faculty and 12 staff) to over 150 (60 faculty and postdocs, 60 professional staff, and 30 interns) and a research community of over 700 faculty, affiliate (adjunct) faculty, postdocs, and graduate students drawn from 13 universities and 5 hospital and life sciences institutes, implemented an AI computing resource of over 1,000 GPUs, and expanded Vector’s facilities from 18,000 to 33,000 sq ft.
Over the five years of my leadership, total funding secured increased from $135M CAD to $292M CAD. Affiliated with the University of Toronto and the University of Waterloo, among others, Vector facilitated AI research by extending the academic and computing support available in local universities, by annually sponsoring about 100 $17,500 scholarships for students pursuing AI-related master’s degrees at one of more than a dozen universities in Ontario, and by annually spending about $2M CAD topping up the stipends of graduate students and postdocs at partner universities. The other key to Vector’s mission was to assist and enable regional corporations, hospitals and entrepreneurial incubators to better use AI to compete nationally and internationally. Under my leadership, Vector annually provided over 4,000 training engagements involving multi-company collaborative projects, experiential workshops and courses, commercialization seminars, micro-advising on specific problems, and recruiting support through a facilitated talent hub. At the end of five years Vector had formal partnerships with over 200 organizations including 30 sponsoring enterprise companies (contributing over $9M CAD annually), more than 20 academic institutions, about 10 hospital and life-sciences research institutes, over 25 AI-related startup-scaleup companies, and a large and dynamic community of small and medium-sized fast growing companies involved in a variety of programs for upskilling AI in commercial uses.
After a five-year term at Vector, I began to consult on AI Computing technology for industry and Canadian government ministries and research institutes, culminating in appointments on the Government of Canada's Advisory Council on Artificial Intelligence and its Minister of AI's 2025 Task Force on AI strategy, where I contributed a set of recommendations for the country's strategy for AI Infrastructure.
In 2025 I returned to Panasas in its new incarnation as Vdura Inc. as the Chief Technology and Artificial Intelligence Officer. It had recently made a set of big changes, starting with its name and including a new management team and the conversion from an appliance vendor to a software defined storage vendor. I returned to lead the development of a suite of AI services and optimizations appropriate for the agentic era of modern computing.
2024 Keynote speaker, 38th Int’l Conf. on Massive Storage Systems and Tech., Santa Clara, CA.
2021-2023 Member of the Government of Canada’s Advisory Council on Artificial Intelligence.
2021-2023 Member of the OECD Network of Experts on AI (ONE AI), OECD.AI task force on AI compute (ONE AI compute).
2019 R&D World R&D 100 Award, IT/Electrical category, for DeltaFS – Rapidly Searching Big Data to Accelerate Scientific Discovery, with Los Alamos National Laboratory, Dec, 2019, San Mateo, CA.
2019 Test of Time Award for impact on the field from papers in USENIX Conf. on File and Storage Technologies at least 10 years prior, “Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You?” by Bianca Schroeder and Garth A. Gibson, Feb., 2007.
2018 Keynote speaker, USENIX Conf. on File and Storage Technologies, Oakland, CA.
2014 Fellow of the IEEE
for contributions to the performance and reliability of transformative storage systems.
2012 Fellow of the ACM
for contributions to the performance and reliability of storage systems.
2012 Jean-Claude Laprie Award in Dependable Computing,
Industrial/Commercial Product Impact Category, by the IFIP Working Group 10.4 on
Dependable Computing and Fault Tolerance, for outstanding papers
published at least 10 years ago that have significantly influenced
the theory and/or practice of dependable computing. Awarded for the
SIGMOD88 "RAID" paper.
2011 SIGOPS Hall of Fame for the SIGMOD88 "RAID" paper because it is one of "the most influential Operating Systems papers that were published at least ten years in the past." Selected by past program chairs from SOSP, OSDI, EuroSys, past Weiser and Turing Award winners from the SIGOPS community, and representatives of each of the Hall of Fame Award papers.
1999 Reynold B. Johnson Information Storage Award, an IEEE Technical Field Award, for outstanding contributions
to the field of information storage, with emphasis in the area of computer
storage. Awarded at the 1999 International Symposium on Computer Architecture
for the development of Redundant Arrays of Inexpensive Disks (RAID).
1999 Allen Newell Award for Research Excellence, Carnegie Mellon University.
1998 Test of Time Award
for the most influential paper in the ACM SIGMOD Int. Conf. on Management
of Data proceedings 10 years prior.
1991 A.C.M. Doctoral Dissertation
Award (tied for second).
PLFS (see
SC09
paper below and
institutes.lanl.gov/plfs
) has been released on Sourceforge (
sourceforge.net/projects/plfs
) under a BSD license. It is available through MPI-IO libraries or FUSE user level file system reflector. It is being put into production HPC use at Los Alamos.
pNFS, or Parallel NFS, is a subset of the new features in NFS version 4.1 (
www.pnfs.com
, RFC 5661-5664,
tools.ietf.org/html/rfc566x
for x=1,2,3,4). pNFS hails from a workshop in Dec 2003 when Garth Gibson and Peter Honeyman asked "what is next for NFS" and Gibson/CMU/Panasas answered: delegation of file layouts (direct and indirect pointers in an inode, sort of). Based on the NASD file system work (
ASPLOS98
below) that inspired Panasas (
FAST08
below), pNFS allows client machines to request a (revocable) map of the locations of data in a storage area network (seen as SCSI blocks, SCSI objects or NFS files) which the client can use to directly access file data (without access being proxied by the NFSv4.1 server).
an implementation of pNFS built on top of NFS files was taken into Linux in 2.6.39
an implementation of pNFS built on top of SCSI objects was taken into Linux in 2.6.40, renamed as 3.0
an implementation of pNFS built on top of SCSI blocks was taken into Linux 3.1
extension of pNFS built on SCSI objects with RAID 1 and RAID 5 over objects on different data servers, taken into Linux 3.2
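The layout delegation idea described above can be illustrated with a toy model. This is only a minimal sketch under stated assumptions, not the NFSv4.1 protocol or any real implementation: the class names (`MetadataServer`, `DataServer`, `Client`, `Layout`), the round-robin striping and the in-memory "servers" are all hypothetical, chosen to show a client obtaining a revocable map once and then reading and writing data without the metadata server in the data path.

```python
from dataclasses import dataclass


@dataclass
class Layout:
    """Hypothetical revocable map: stripe i lives on servers[i % len(servers)]."""
    file_id: str
    stripe_size: int
    servers: list


class DataServer:
    """Toy stand-in for a storage device holding stripes of file data."""
    def __init__(self):
        self.stripes = {}  # (file_id, stripe_index) -> bytes

    def read(self, file_id, index):
        return self.stripes.get((file_id, index), b"")

    def write(self, file_id, index, chunk):
        self.stripes[(file_id, index)] = chunk


class MetadataServer:
    """Toy stand-in for the file server: grants and recalls layouts only."""
    def __init__(self, data_servers, stripe_size):
        self.data_servers = data_servers
        self.stripe_size = stripe_size
        self.holders = {}  # file_id -> set of clients holding a layout

    def get_layout(self, file_id, client):
        self.holders.setdefault(file_id, set()).add(client)
        return Layout(file_id, self.stripe_size, list(self.data_servers))

    def revoke(self, file_id):
        # Recall the layout from every client that was granted one.
        for client in self.holders.pop(file_id, set()):
            client.recall(file_id)


class Client:
    def __init__(self, mds):
        self.mds = mds
        self.layouts = {}          # cached layouts, keyed by file_id
        self.layout_requests = 0   # how often the metadata server was asked

    def recall(self, file_id):
        self.layouts.pop(file_id, None)

    def _layout(self, file_id):
        if file_id not in self.layouts:
            self.layouts[file_id] = self.mds.get_layout(file_id, self)
            self.layout_requests += 1
        return self.layouts[file_id]

    def write(self, file_id, data):
        lay = self._layout(file_id)
        for off in range(0, len(data), lay.stripe_size):
            idx = off // lay.stripe_size
            server = lay.servers[idx % len(lay.servers)]
            server.write(file_id, idx, data[off:off + lay.stripe_size])

    def read(self, file_id, n_stripes):
        lay = self._layout(file_id)
        # Data flows directly between client and data servers.
        return b"".join(
            lay.servers[i % len(lay.servers)].read(file_id, i)
            for i in range(n_stripes)
        )
```

In this sketch a client contacts the metadata server only when it has no cached layout, so repeated reads and writes go straight to the data servers; after `revoke`, the next access fetches a fresh layout.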
Past member of the
technical council of the Storage Networking Industry Association (SNIA),
an international organization
of about 100 networking and storage companies formed in July 1999.
Founder
and chair, National Storage Industry Consortium (NSIC) working group
on Network-Attached Storage Devices (
NASD
), 1996-1999. Program chair for eleven
NSIC/NASD sponsored public
workshops
(75 presentations
and 500 attendees). The result of these efforts was presented in "Object
Based Storage Devices: A Command Set Proposal," (http://www.nsic.org/nasd/final.pdf,
November 1999) which was written to launch an ANSI standards effort
in the X3/T10 (SCSI) committee.
Release of
NASD Scalable Storage
Systems Prototype
code, version 1.1 in July 1999, and version 1.3
in May 2000. This code implements CMU's view of the next generation
storage interface (SCSI-4?) and a simple set of changes for a distributed
file system to exploit it.
Release of
RAIDframe Rapid Prototyping Tool for
RAID Systems
code, August 1996. This code, suitably debugged and
adapted, appears as the RAID device driver in the current release of
the NetBSD operating system.
ACM Digital Library Publication List
Google Scholar Publication List
Mochi: Composing data services for high-performance computing environments
. RB Ross, G Amvrosiadis, P Carns, CD Cranor, M Dorier, K Harms, et al., Journal of Computer Science and Technology 35 (1), 121-144, 2020.
PDF
Priority-based parameter propagation for distributed DNN training
. A Jayarajan, J Wei, G Gibson, A Fedorova, G Pekhimenko. Proceedings of Machine Learning and Systems 1, 132-145, 2019.
PDF
MLSys: The new frontier of machine learning systems
. A Ratner, D Alistarh, G Alonso, DG Andersen, P Bailis, S Bird, N Carlini, et al. arXiv preprint arXiv:1904.03257, 2019.
PDF
Scaling embedded in-situ indexing with DeltaFS
. Q Zheng, CD Cranor, D Guo, GR Ganger, G Amvrosiadis, GA Gibson, ... SC18: International Conference for High Performance Computing, Networking, Storage and Analysis, (SC'18), 2018.
PDF
Litz: Elastic framework for High-Performance distributed machine learning
. A Qiao, A Aghayev, W Yu, H Chen, Q Ho, GA Gibson, EP Xing. 2018 USENIX Annual Technical Conference (USENIX ATC 18), 631-644, 2018.
PDF
SlimDB: A space-efficient key-value storage engine for semi-sorted data
. K Ren, Q Zheng, J Arulraj, G Gibson. Proceedings of the VLDB Endowment 10 (13), 2037-2048, 2017.
PDF
Evolving Ext4 for Shingled Disks.
Abutalib Aghayev, Theodore Ts'o, Garth Gibson, Peter Desnoyers.
15th USENIX Conf. on File and Storage Technologies (FAST'17), Santa Clara, CA, Feb 27 - Mar 2, 2017.
Stateless model checking with data-race preemption points.
Ben Blum, Garth Gibson.
Proc. of the 2016 ACM SIGPLAN Int. Conf. on
Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA
2016), Amsterdam, Nov 2-4, 2016.
PDF
Addressing the straggler problem for iterative convergent parallel ML.
Aaron Harlap, Henggang Cui, Wei Dai, Jinliang Wei, Gregory R.
Ganger, Phillip B. Gibbons, Garth A. Gibson, Eric P. Xing.
2016 ACM Symposium on Cloud Computing (SoCC 2016), October 05-07, 2016, Santa Clara, CA, USA.
PDF
STRADS: A Distributed Framework for Scheduled Model Parallel
Machine Learning.
Kim, Jin Kyu, Qirong Ho, Seunghak Lee, Xun Zheng, Wei Dai, Garth A.
Gibson, Eric P. Xing.
ACM European Conference on Computer Systems, 2016 (EuroSys'16),
18th-21st April, 2016, London, UK.
PDF
DeltaFS: Exascale File Systems Scale Better Without Dedicated Servers.
Zheng, Qing, Kai Ren, Garth Gibson, Bradley W. Settlemyer, Gary Grider.
Proc. of the Tenth Parallel Data Storage Workshop (PDSW15),
co-located with the Int. Conference for High Performance Computing,
Networking, Storage and Analysis (SC15),
Austin, TX, November 2015.
PDF
Managed Communication and Consistency for Fast
Data-Parallel Iterative Analytics
. Wei, Jinliang, Wei Dai, Aurick
Qiao, Qirong Ho, Henggang Cui, Gregory R. Ganger, Phillip B. Gibbons, Garth
A. Gibson, Eric P. Xing. 2015 ACM Symposium on Cloud Computing (SOCC 2015),
Aug 29-30, 2015, Hawaii.
PDF
Best Paper
ShardFS vs. IndexFS: Replication vs. Caching Strategies
for Distributed Metadata Management in Cloud Storage Systems
Xiao, Lin, Kai Ren, Qing Zheng, Garth A. Gibson. 2015 ACM Symposium on
Cloud Computing (SOCC 2015), Aug 29-30, 2015, Hawaii.
PDF
Caveat-Scriptor: Write Anywhere Shingled Disks
. Kadekodi, Saurabh, Swapnil Pimpale, Garth A. Gibson. Proc. Of the Seventh USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage’15), Santa Clara, CA, July 2015.
PDF
High-Performance Distributed ML at Scale through Parameter
Server Consistency Models
. Wei Dai, Abhimanu Kumar, Jinliang Wei, Qirong
Ho, Garth Gibson, Eric P. Xing. 29th AAAI Conf. on Artificial Intelligence
(AAAI-15), Jan 25-29, 2015, Austin, Texas.
PDF
On Model Parallelization and Scheduling Strategies for Distributed Machine Learning
. Seunghak
Lee, Jin Kyu Kim, Xun Zheng, Qirong Ho, Garth Gibson, Eric Xing. 2014
Neural Information Processing Systems (NIPS 2014), Dec 8-11, Montreal, Canada.
PDF
Scaling File System Metadata Performance With Stateless
Caching and Bulk Insertion
. Ren, Kai, Qing Zheng, Swapnil Patil, Garth
Gibson. ACM/IEEE Int'l Conf. for High Performance Computing, Networking,
Storage and Analysis (SC'14), November 16-21, 2014, New Orleans, LA.
PDF
Best Paper
BatchFS: Scaling the File System Control Plane with
Client-Funded Metadata Servers
. Qing Zheng, Kai Ren, Garth Gibson.
Proc. of the Ninth Parallel Data Storage Workshop (PDSW14),
co-located with the Int. Conference for High Performance Computing,
Networking, Storage and Analysis (SC14),
New Orleans, LA, November 2014.
PDF
Exploiting iterative-ness for parallel ML computations
Henggang Cui, Alexey Tumanov, Jinliang Wei, Lianghong Xu, Wei Dai, Jesse
Haber-Kucharsky, Qirong Ho, Gregory R. Ganger, Phillip B. Gibbons, Garth A.
Gibson, Eric P. Xing. 2014 ACM Symposium on Cloud Computing (SOCC 2014),
Nov 3-5, Seattle, WA.
PDF
Will They Blend?: Exploring Big Data Computation atop Traditional HPC NAS Storage
. Ellis Wilson, Mahmut Kandemir, Garth Gibson. The 34th International
Conference on Distributed Computing Systems (ICDCS 2014). Madrid, Spain, June 30 - July 3, 2014.
PDF
Exploiting bounded staleness to speed up Big Data analytics
. Cui, Henggang, Qirong Ho, James Cipar, Jin Kyu Kim, Seunghak
Lee, Abhimanu Kumar, Jinliang Wei, Wei Dai, Gregory R. Ganger, Phil B. Gibbons, Garth A. Gibson, Eric P. Xing. USENIX Annual Technical
Conference (ATC'14), June 19-20, Philadelphia, PA.
PDF
MultiMedia link
More Effective Distributed ML via a Stale Synchronous Parallel
Parameter Server
Ho, Qirong, James Cipar, Henggang Cui, Seunghak Lee,
Jin Kyu Kim, Phil B. Gibbons, Garth A. Gibson, Gregory R. Ganger, Eric P. Xing.
2013 Neural Information Processing Systems (NIPS 2013), Dec 5-10, Lake Tahoe, NV.
PDF
Structuring PLFS for Extensibility
Cranor, Chuck, Milo Polte, Garth A. Gibson.
Proc. of the Eighth Parallel Data Storage Workshop (PDSW13),
co-located with the Int. Conference for High Performance Computing, Networking, Storage and Analysis (SC13),
Denver, CO, November 2013.
PDF
PARROT: A Practical Runtime for Deterministic, Stable and
Reliable Threads
. Heming Cui, Jiri Simsa, Yi-Hong Lin, Hao Li, Ben Blum,
Xinan Xu, Junfeng Yang, Garth A. Gibson. 24th ACM Symposium on Operating
Systems Principles (SOSP'13), Nov 4-6, 2013, Farmington, PA.
ACM_DL
YouTube_Video
TABLEFS: Enhancing Metadata Efficiency in the Local File
System
. Kai Ren, Garth Gibson. 2013 USENIX Annual Technical Conference,
June 26-28, 2013, San Jose, CA.
PDF
Solving the Straggler Problem with Bounded Staleness
James Cipar, Qirong Ho, Jin Kyu Kim, Seunghak Lee, Gregory R. Ganger, Garth
Gibson, Kimberly Keeton, Eric Xing. 14th USENIX Workshop on Hot Topics in
Operating Systems, May 13-15, 2013, Santa Ana Pueblo, NM.
PDF
Shingled Magnetic Recording: Areal Density Increase Requires New Data Management
. Tim Feldman, Garth Gibson. USENIX ;login:, v 38, n 3, June 2013.
PDF
I/O Acceleration with Pattern Detection
. Jun He, John Bent, Aaron Torres, Gary Grider, Garth Gibson, Carlos Maltzahn, Xian-He Sun. The 22nd Int. ACM Symposium on High Performance Parallel and Distributed Computing (HPDC'13), New York City, June 17-21, 2013.
PDF
Discovering Structure in Unstructured I/O
. Jun He, John Bent, Aaron Torres, Gary Grider, Garth Gibson, Carlos Maltzahn, Xian-He Sun. Proc. of the Seventh Parallel Data Storage Workshop (PDSW12), co-located with the Int. Conference for High Performance Computing, Networking, Storage and Analysis (SC12), Salt Lake City, UT, November 2012.
PDF
A Case for Scaling HPC Metadata Performance through De-specialization
. Swapnil Patil, Kai Ren, Garth Gibson. Proc. of the Seventh Parallel Data Storage Workshop (PDSW12), co-located with the Int. Conference for High Performance Computing, Networking, Storage and Analysis (SC12), Salt Lake City, UT, November 2012.
PDF
TABLEFS: Embedding a NoSQL Database inside the Local File System
. Kai Ren, Garth Gibson. 1st Storage System, Hard Disk and Solid State Technologies Summit, IEEE Asia-Pacific Magnetic Recording Conference (APMRC), November 2012, Singapore.
PDF
Scalable Dynamic Partial Order Reduction
. Jiri Simsa, Randal Bryant, Garth Gibson, Jason Hickey. Third Int. Conf. on Runtime Verification (RV2012), 25-28 September 2012, Istanbul, Turkey.
PDF
The Power and Challenges of Transformative I/O
. Adam Manzanares, Meghan McClelland, John Bent, Garth Gibson. 2012 IEEE Int. Conf. on Cluster
Computing (CLUSTER12), 24-28 September 2012, Beijing, China.
PDF
Indexing a large-scale database of astronomical objects
. Bin Fu, Eugene Fink, Garth Gibson, and Jaime Carbonell. Proceedings of the Fourth Workshop on Interfaces and Architecture for Scientific Data Storage (IASDS), September 2012, Beijing, China.
PDF
Exact and Approximate Computation of a Histogram of Pairwise Distances between Astronomical Objects
. Bin Fu, Eugene Fink, Garth Gibson, Jaime Carbonell. 1st Workshop on High Performance Computing in Astronomy (AstroHPC'12), June 2012, Delft, Netherlands.
PDF
File System Virtual Appliances: Portable File System Implementations
. Michael Abd-El-Malek , Matthew Wachs, James Cipar, Karan Sanghi, Gregory R. Ganger, Garth A. Gibson, Michael K. Reiter. ACM Transactions on Storage, Vol. 8, No. 3, Article 39, May 2012.
PDF
YCSB++: Benchmarking and Performance Debugging Advanced Features in Scalable Table Stores
. Swapnil Patil, Milo Polte, Kai Ren, Wittawat Tantisiriroj, Lin Xiao, Julio Lopez, Garth Gibson, Adam Fuchs, Billie Rinaldi. Proc. of the 2nd ACM Symposium on Cloud Computing (SOCC '11), October 27–28, 2011, Cascais, Portugal. Supersedes Carnegie Mellon University Parallel Data Laboratory Technical Report CMU-PDL-11-111, August 2011.
PDF
On the Duality of Data-intensive File System Design: Reconciling HDFS and PVFS
. Wittawat Tantisiriroj, Swapnil Patil, Garth Gibson, Seung Woo Son, Samuel J. Lang, Robert B. Ross. SC11, November 12-18, 2011, Seattle, Washington USA. Supersedes Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-11-108. April 2011.
PDF
Recipes for Baking Black Forest Databases: Building and Querying Black Hole Merger Trees from Cosmological Simulations
. Lopez, J., C. Degraf, T. DiMatteo, B. Fu, E. Fink, G. Gibson. 23rd Scientific and Statistical Database Management Conference (SSDBM'11), July 2011.
PDF
Six Degrees of Scientific Data: Reading Patterns for Extreme Scale Science IO
. Lofstead, Jay, Milo Polte, Garth Gibson, Scott A. Klasky, Karsten Schwan, Ron Oldfield, Matthew Wolf, Qing Liu. 20th ACM Int. Symp. On High-Performance Parallel and Distributed Computing (HPDC'11), June 2011.
PDF
Otus: Resource Attribution and Metrics Correlation in Data-Intensive Clusters.
Kai Ren, Julio Lopez, Garth Gibson. The 2nd International Workshop on MapReduce and its Applications (MapReduce'11), June, 2011. PDF
Scale and Concurrency of GIGA+: File System Directories with Millions of Files
. Patil, S., G. Gibson.
Proc 9th USENIX Conf. on File and Storage Technologies (FAST11),
February, 2011.
PDF
dBug: Systematic Evaluation of Distributed Systems.
Jiri Simsa, Randy Bryant, Garth Gibson. 5th Int. Workshop on Systems Software Verification (SSV’10), co-located with 9th USENIX Symp. On Operating Systems Design and Implementation (OSDI’10), Vancouver BC, October 2010.
PDF
pWalrus: Towards Better Integration of Parallel File Systems into Cloud Storage.
Yoshihisa Abe, Garth Gibson. Workshop on Interfaces and Abstractions for Scientific Data Storage (IASDS10), co-located with IEEE Int. Conference on Cluster Computing 2010 (Cluster10), Heraklion, Greece, September 2010.
PDF
DiscFinder: A Data-Intensive Scalable Cluster Finder for Astrophysics
. Fu, B., K. Ren, J. Lopez, E. Fink, G. Gibson. ACM Int. Symp. On High Performance Distributed Computing (HPDC), June 2010.
PDF
PLFS: A Checkpoint Filesystem for Parallel Applications
. Bent, J., G. Gibson, G. Grider, B. McClelland, P. Nowoczynski, J. Nunez, M. Polte, M. Wingate. Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC2009), Nov. 2009.
PDF
Understanding and Maturing the Data-Intensive Scalable Computing Storage Substrate.
Gibson, G., B. Fan, S. Patil, M. Polte, W. Tantisiriroj, L. Xiao. 2009 Microsoft eScience Workshop, Pittsburgh, PA, October, 2009.
PDF
Directions for TDMR System Architecture: Synergies with SSDs.
Gibson, G.A., Milo Polte. Proc. of the I.E.E.E. International Symposium on Magnetics (INTERMAG09), Sacramento CA, March 2009.
PDF
TALK
Safe and Effective fine-grain TCP Retransmissions for Datacenter Incast Communication.
Vasudevan, V., A. Phanishayee, H. Shah, E. Krevat, D.G. Andersen, G.R. Ganger, G.A. Gibson, B. Mueller, S. Seshan. SIGCOMM'09, August 16-21, 2009, Barcelona, Spain.
PDF
In Search of an API for Scalable File Systems: Under the table or above it?
Patil, S., G.A. Gibson, G.R. Ganger, J. Lopez, M. Polte, W. Tantisiriroj, L. Xiao. HotCloud'09, June 15, 2009, San Diego, CA.
PDF
Enabling Enterprise Solid State Disks Performance.
Polte, M., J. Simsa, G. Gibson. 1st Workshop on Integrating Solid-state Memory into the Storage Hierarchy, March 7, 2009, Washington DC.
PDF
Scalable Performance of the Panasas Parallel File System.
Welch, B., M. Unangst, Z. Abbasi, G. Gibson, B. Mueller, J. Small, J. Zelenka, B. Zhou. 6th USENIX Conf on File and Storage Technologies (FAST'08), Feb. 2008, San Francisco, CA.
PDF
Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems
. Phanishayee, A., E. Krevat, V. Vasudevan, D.G. Andersen, G.R. Ganger, G.A. Gibson, S. Seshan. 6th USENIX Conf on File and Storage Technologies (FAST'08), Feb. 2008, San Francisco, CA.
PDF
Understanding failure in petascale computers.
Schroeder, B., Gibson, G.A. SciDAC 2007. Journal of Physics: Conf. Ser. 78.
PDF
Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You?
Schroeder, B., Gibson, G.A. Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST '07),
February 13–16, 2007, San Jose, CA.
PDF
A Large-scale Study of Failures in High-performance-computing Systems.
Schroeder, B., Gibson, G.A. Proceedings of the International Conference
on Dependable Systems and Networks (DSN2006), Philadelphia, PA, USA,
June 25-28, 2006.
PDF
Scheduling Speculative Tasks in a Compute Farm.
Petrou, D., G. Ganger, G.A. Gibson. High Performance Computing, Networking, and Storage Conference (SC2005), Seattle, WA., November 2005.
PDF
Active Disks for Large-Scale Data Mining.
Riedel, E., C. Faloutsos, G.A. Gibson, D. Nagle. ACM Computer Magazine, June 2001.
Network Attached Storage Architecture.
Gibson, G.A., R. Van Meter. Comm. of the ACM, Vol. 43,
No 11, November, 2000.
PDF
Dynamic Function Placement for Data-Intensive
Cluster Computing.
Amiri, K., D. Petrou, G.
Ganger, G.A. Gibson. USENIX Technical Conference, San Diego, June 2000.
PDF
Highly Concurrent Shared Storage
. Amiri, K., G.A. Gibson,
R. Golding. Int. Conf. on Distributed
Computing Systems (ICDCS2000), April 2000.
PDF
Automatic I/O Hint Generation through Speculative Execution
. Chang, F.,
G.A. Gibson.
Proceedings of the Third USENIX Symposium on Operating Systems Design
and Implementation (OSDI), February 1999.
PDF
Active Storage for Large-scale Data Mining and Multimedia
Applications
. E. Riedel, G. A. Gibson,
C. Faloutsos. Proceedings of the 1998 Very Large Data Bases conference
(VLDB), August 1998.
PDF
Cost-Effective High-Bandwidth Storage Architecture
. Gibson, G.A., et al. Int. Conf. on Architectural
Support for Programming Languages and Operating Systems, ASPLOS, October,
1998.
PDF
Report of the Working Group on Storage I/O Issues
in Large-Scale Computing
. G. A. Gibson, J. S. Vitter,
J. Wilkes, eds. ACM Workshop on Strategic Directions in Computing
Research. ACM Computing Surveys, 28, 1, Dec. 1996.
PDF
Informed Prefetching
and Caching
. R. H. Patterson, G. A.
Gibson, E. Ginting, D. Stodolsky, and J. Zelenka. Proc. of the 15th Symposium on Operating Systems Principles,
December 3-6, 1995.
PDF
Architectures and Algorithms for On-line Failure Recovery
in Redundant Disk Arrays
. M. Holland, G. A. Gibson,
D. P. Siewiorek. J. of Distributed and Parallel Databases,
Vol. 2, No. 3, July, 1994.
PDF
A Case for Redundant Arrays of Inexpensive Disks
(RAID)
. D. A. Patterson, G. A.
Gibson, R. H. Katz. Proceedings of the International Conference on Management of
Data (SIGMOD), June 1988.
PDF