Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Simulating Multi-Table Contention in Catalog Formats
Published:
Simulating multi-table commit rates across S3, Azure, and GCS for Apache Iceberg catalog-as-file commits.
Simulating Catalog and Table Conflicts
Published:
Simulating the Apache Iceberg single-table commit protocol to understand the upper bound on commit throughput.
Conditional Operations in Object Stores
Published:
Benchmarking atomic operations across AWS S3/S3 Express One Zone, Azure Blob Storage (Premium), and Google Cloud Storage.
Portfolio
Publications
Gridmix3: Emulating Production IO Workload for Apache Hadoop
Published in USENIX Conference on File and Storage Technologies (FAST), 2010
A Framework for Obtaining the Ground-Truth in Architectural Recovery
Published in 2012 Joint Working IEEE/IFIP Conference on Software Architecture and European Conference on Software Architecture, 2012
True Elasticity in Multi-Tenant Data-Intensive Compute Clusters
Published in ACM Symposium on Cloud Computing (SoCC), 2012
Walnut: A Unified Cloud Object Store
Published in ACM International Conference on Management of Data (SIGMOD), 2012
Writing data-centric concurrent programs in imperative languages
Published in USENIX Hot Topics in Parallelism (HotPar), 2012
Apache Hadoop YARN: Yet Another Resource Negotiator
Published in ACM Symposium on Cloud Computing (SoCC), 2013
REEF: Retainable Evaluator Execution Framework
Published in International Conference on Very Large Data Bases (VLDB), 2013
Reservation-Based Scheduling: If You’re Late Don’t Blame Us!
Published in ACM Symposium on Cloud Computing (SoCC), 2014
Blind Men and an Elephant: Coalescing Open Source, Academic, and Industrial Perspectives on BigData
Published in IEEE International Conference on Data Engineering (ICDE), 2015
Mercury: Hybrid Centralized and Distributed Scheduling in Large Shared Clusters
Published in USENIX Annual Technical Conference (ATC), 2015
REEF: Retainable Evaluator Execution Framework
Published in ACM International Conference on Management of Data (SIGMOD), 2015
Apache REEF: Retainable Evaluator Execution Framework
Published in ACM Transactions on Computer Systems (TOCS), 2017
Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics
Published in ACM International Conference on Management of Data (SIGMOD), 2017
NetCo: Cache and I/O Management for Analytics over Disaggregated Stores
Published in ACM Symposium on Cloud Computing (SoCC), 2018
Hydra: A Federated Resource Manager for Data-Center Scale Analytics
Published in USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2019
QuiCK: A Queuing System in CloudKit
Published in ACM International Conference on Management of Data (SIGMOD), 2021
SkyPIE: A Fast & Accurate Oracle for Object Placement
Published in ACM International Conference on Management of Data (SIGMOD), 2024
Talks
Blind Men and an Elephant: Coalescing Open Source, Academic, and Industrial Perspectives on BigData
Published:
YARN Test of Time Award
Published:
Consistency in Motion
Published:
