
ZFS Performance Analysis and Tools

Brendan Gregg’s talk at ZFS Day (an event I also organized and ran).

The file system, or the disks beneath it, is often the first target of blame, especially in multi-tenant cloud environments. At Joyent we deploy a public cloud on ZFS-based systems, and frequently investigate performance across a wide variety of applications in growing environments. This talk is about ZFS performance observability: the tools and approaches we use to quickly see what ZFS is doing. This includes observing ZFS I/O throttling, an enhancement added to illumos-ZFS to isolate performance between neighbouring tenants, and using DTrace and heat maps to examine latency distributions and locate outliers.
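To make the idea of a latency distribution concrete, here is a minimal Python sketch, not taken from the talk, that groups latencies into power-of-two buckets in the spirit of DTrace's quantize() aggregation. The function name `quantize_latencies` and the sample values are hypothetical, chosen only for illustration.

```python
from collections import Counter

def quantize_latencies(latencies_us):
    """Count latencies (in microseconds) per power-of-two bucket.

    Each value lands in the bucket labelled with the largest power of two
    that is less than or equal to it; values below 1 land in bucket 0.
    """
    buckets = Counter()
    for value in latencies_us:
        if value < 1:
            bucket = 0
        else:
            bucket = 1 << (int(value).bit_length() - 1)
        buckets[bucket] += 1
    return dict(sorted(buckets.items()))

# Hypothetical sample: six fast I/Os and one slow outlier.
sample_us = [90, 110, 120, 130, 95, 105, 20000]
histogram = quantize_latencies(sample_us)
for bucket, count in histogram.items():
    print(f"{bucket:>8} us | {'#' * count} {count}")
```

An average over the same sample would hide the single slow I/O, while the bucketed view leaves it visible in a bucket of its own. A heat map extends this by plotting one such distribution per time interval.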

DTracing the Cloud

Brendan Gregg’s talk at illumos Day (an event I also organized and ran).

Cloud computing facilitates rapid deployment and scaling, often pushing high load at applications under continual development. DTrace allows immediate analysis of issues on live production systems even in these demanding environments – no need to restart or run a special debug kernel. For the illumos kernel, DTrace has been enhanced to support cloud computing, providing more observation capabilities to zones as used by Joyent SmartMachine customers. DTrace is also frequently used by the cloud operators to analyze systems and verify performance isolation of tenants. This talk covers DTrace in the illumos-based cloud, showing examples of real-world performance wins.

Performance Analysis: The USE Method

Brendan Gregg’s talk at FISL, July 2012.

This talk introduces the USE Method: a simple strategy for performing a complete check of system performance health, identifying common bottlenecks and errors. This methodology can be used early in a performance investigation to quickly identify the most severe system performance issues, and the speaker has used it successfully for years in both enterprise and cloud computing environments. Checklists have been developed to show how the USE Method can be applied to Solaris/illumos-based and Linux-based systems.

Many hardware and software resource types are commonly overlooked, including memory and I/O buses, CPU interconnects, and kernel locks. Any of these can become a system bottleneck. The USE Method provides a way to find them.
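USE stands for utilization, saturation, and errors, and the method checks all three for every resource. A minimal Python sketch of that loop follows. It is an illustration rather than the speaker's checklist: the `Resource` type, the `use_check` function, the 0.7 utilization threshold, and the sample numbers are all assumptions. On a real system the values would come from tools such as vmstat, iostat, or DTrace.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    utilization: float  # fraction of time the resource was busy, 0.0 to 1.0
    saturation: float   # amount of queued or waiting work, 0 means none
    errors: int         # count of error events

# Assumed threshold for illustration only; a suitable value depends on the resource.
UTILIZATION_THRESHOLD = 0.7

def use_check(resources):
    """Return (resource name, metric, value) for each metric worth investigating."""
    findings = []
    for res in resources:
        if res.errors > 0:
            findings.append((res.name, "errors", res.errors))
        if res.saturation > 0:
            findings.append((res.name, "saturation", res.saturation))
        if res.utilization >= UTILIZATION_THRESHOLD:
            findings.append((res.name, "utilization", res.utilization))
    return findings

# Hypothetical measurements, including one of the commonly overlooked resources.
measured = [
    Resource("CPU", utilization=0.95, saturation=4.0, errors=0),
    Resource("disk", utilization=0.30, saturation=0.0, errors=2),
    Resource("memory bus", utilization=0.10, saturation=0.0, errors=0),
]
for name, metric, value in use_check(measured):
    print(f"{name}: check {metric} ({value})")
```

Iterating over a fixed resource list is the point of the sketch: a resource only gets checked if it is on the list, so less obvious ones such as interconnects and kernel locks need to be listed up front.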

This approach focuses on the questions to ask of the system, before reaching for the tools. Tools that are ultimately used include all the standard performance tools (vmstat, iostat, top), and more advanced tools, including dynamic tracing (DTrace), and hardware performance counters.

Other performance methodologies are included for comparison: the Problem Statement Method, Workload Characterization Method, and Drill-Down Analysis Method.