Class LatencyTracker


  • public class LatencyTracker
    extends Object
    A utility class that tracks latency metrics, using TDigest to calculate percentiles.

    This class wraps a TDigest data structure to collect latency samples and emit Hadoop counters with count, sum, and percentile values (p50, p95, p99).

    Usage:

     // In mapper/reducer setup
     latencyTracker = new LatencyTracker(NutchMetrics.GROUP_FETCHER, NutchMetrics.FETCHER_LATENCY);
     
     // During processing
     long start = System.currentTimeMillis();
     // ... operation ...
     latencyTracker.record(System.currentTimeMillis() - start);
     
     // In cleanup
     latencyTracker.emitCounters(context);
     

    Emits the following counters:

    • {prefix}_count_total - total number of samples
    • {prefix}_sum_ms - sum of all latencies in milliseconds
    • {prefix}_p50_ms - 50th percentile (median) latency
    • {prefix}_p95_ms - 95th percentile latency
    • {prefix}_p99_ms - 99th percentile latency
    Since:
    1.22
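
    The counter names listed above follow one pattern: the prefix passed to the constructor, followed by a fixed suffix. As a rough illustration, the sketch below builds the five names for a given prefix. CounterNames and namesFor are hypothetical helpers written for this example and are not part of the Nutch API; the real class may assemble its names differently.

```java
import java.util.ArrayList;
import java.util.List;

class CounterNames {
    // Suffixes copied from the counter list in the class description.
    private static final String[] SUFFIXES = {
        "_count_total", "_sum_ms", "_p50_ms", "_p95_ms", "_p99_ms"
    };

    // Hypothetical helper: returns the five counter names for a prefix.
    static List<String> namesFor(String prefix) {
        List<String> names = new ArrayList<>();
        for (String suffix : SUFFIXES) {
            names.add(prefix + suffix);
        }
        return names;
    }

    public static void main(String[] args) {
        // "fetch_latency" is the example prefix from the constructor docs.
        for (String name : namesFor("fetch_latency")) {
            System.out.println(name);
        }
    }
}
```

    With the prefix "fetch_latency", the first name is fetch_latency_count_total and the last is fetch_latency_p99_ms.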
    • Constructor Detail

      • LatencyTracker

        public LatencyTracker(String group,
                              String prefix)
        Creates a new LatencyTracker.
        Parameters:
        group - the Hadoop counter group name
        prefix - the prefix for counter names (e.g., "fetch_latency")
    • Method Detail

      • record

        public void record(long latencyMs)
        Records a latency sample.
        Parameters:
        latencyMs - the latency in milliseconds
      • getCount

        public long getCount()
        Returns the number of recorded samples.
        Returns:
        the count of recorded latency samples
      • getSum

        public long getSum()
        Returns the sum of all recorded latencies.
        Returns:
        the sum of latencies in milliseconds
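
        The count and the sum together are enough to derive a mean latency (sum divided by count). The sketch below shows that arithmetic with a minimal stand-in class. SimpleLatencyStats and meanMs are assumptions made for this example; they are not the Nutch implementation, which also feeds each sample into a TDigest.

```java
class SimpleLatencyStats {
    private long count;  // number of samples recorded so far
    private long sumMs;  // running total of all latencies, in milliseconds

    void record(long latencyMs) {
        count++;
        sumMs += latencyMs;
    }

    long getCount() {
        return count;
    }

    long getSum() {
        return sumMs;
    }

    // Hypothetical convenience method: mean latency, or 0.0 with no samples.
    double meanMs() {
        return count == 0 ? 0.0 : (double) sumMs / count;
    }

    public static void main(String[] args) {
        SimpleLatencyStats stats = new SimpleLatencyStats();
        stats.record(10);
        stats.record(20);
        stats.record(30);
        System.out.println(stats.getCount()); // prints 3
        System.out.println(stats.getSum());   // prints 60
        System.out.println(stats.meanMs());   // prints 20.0
    }
}
```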
      • getPercentile

        public long getPercentile(double quantile)
        Returns the estimated latency at the given quantile (for example, 0.5 for the median and 0.99 for the 99th percentile).
        Parameters:
        quantile - the quantile (0.0 to 1.0)
        Returns:
        the percentile value in milliseconds
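
        To make the quantile argument concrete, the sketch below computes exact nearest-rank percentiles over a small sorted array. This is a deliberately simple substitute for TDigest, which estimates quantiles without storing every sample, so its results on real data may differ slightly. ExactPercentile and its behavior on empty input (returning 0) are assumptions of this example, not documented behavior of LatencyTracker.

```java
import java.util.Arrays;

class ExactPercentile {
    // Nearest-rank percentile over an exact, sorted copy of the samples.
    static long percentile(long[] samples, double quantile) {
        if (quantile < 0.0 || quantile > 1.0) {
            throw new IllegalArgumentException("quantile must be in [0.0, 1.0]");
        }
        if (samples.length == 0) {
            return 0L; // assumption of this sketch only
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        // Rank is 1-based; clamp so that quantile 0.0 maps to the minimum.
        int rank = (int) Math.ceil(quantile * sorted.length);
        int index = Math.max(0, rank - 1);
        return sorted[index];
    }

    public static void main(String[] args) {
        long[] latenciesMs = {12, 15, 11, 240, 14, 13, 16, 18, 17, 19};
        System.out.println(percentile(latenciesMs, 0.50)); // prints 15
        System.out.println(percentile(latenciesMs, 0.95)); // prints 240
        System.out.println(percentile(latenciesMs, 0.99)); // prints 240
    }
}
```

        With only ten samples, the single slow request (240 ms) determines both p95 and p99, while the median stays at 15 ms.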
      • emitCounters

        public void emitCounters(TaskInputOutputContext<?,?,?,?> context)
        Emits all latency counters to the Hadoop context.

        This method should be called once, during task cleanup, to emit the aggregated metrics.

        Parameters:
        context - the Hadoop task context