Class ErrorTracker
java.lang.Object
    org.apache.nutch.metrics.ErrorTracker

public class ErrorTracker extends Object
A utility class for tracking errors by category with automatic classification.
This class provides thread-safe error counting with automatic categorization based on exception type. It uses a bounded set of error categories to stay within Hadoop's counter limits (~120 counters).
Usage:
    // In mapper/reducer setup or thread initialization
    errorTracker = new ErrorTracker(NutchMetrics.GROUP_FETCHER);

    // When catching exceptions
    try {
        // ... operation ...
    } catch (Exception e) {
        errorTracker.recordError(e); // Auto-categorizes
    }

    // Or with manual categorization
    errorTracker.recordError(ErrorTracker.ErrorType.NETWORK);

    // In cleanup - emit all error counters
    errorTracker.emitCounters(context);

Emits the following counters:
- errors_total - total number of errors across all categories
- errors_network_total - network-related errors
- errors_protocol_total - protocol errors
- errors_parsing_total - parsing errors
- errors_url_total - URL-related errors
- errors_scoring_total - scoring filter errors
- errors_indexing_total - indexing filter errors
- errors_timeout_total - timeout errors
- errors_other_total - uncategorized errors
- Since:
- 1.22
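The thread-safe, bounded counting described in the class overview could be sketched roughly as follows. This is a standalone illustration, not the Nutch source: the class name `ErrorTallySketch`, its `Category` enum, and the choice of `LongAdder` are all assumptions made for the example. The point it shows is that a fixed enum of categories caps the number of counters no matter how many distinct exception classes occur.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a bounded-category error tally; not the Nutch implementation. */
final class ErrorTallySketch {

  /** Assumed categories, chosen to mirror the counter list above. */
  enum Category { NETWORK, PROTOCOL, PARSING, URL, SCORING, INDEXING, TIMEOUT, OTHER }

  // One adder per category is created up front, so the map itself is never
  // mutated after construction and can be read from many threads.
  private final Map<Category, LongAdder> counts = new EnumMap<>(Category.class);
  private final LongAdder total = new LongAdder();

  ErrorTallySketch() {
    for (Category c : Category.values()) {
      counts.put(c, new LongAdder());
    }
  }

  /** Records one error in the given category and in the overall total. */
  void record(Category c) {
    counts.get(c).increment();
    total.increment();
  }

  long count(Category c) {
    return counts.get(c).sum();
  }

  long total() {
    return total.sum();
  }
}
```

Because the category set is fixed at compile time, the worst case here is one counter per enum constant plus the total, which stays far below a limit of roughly 120.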
-
-
Nested Class Summary
Nested Classes

| Modifier and Type | Class | Description |
|---|---|---|
| static class | ErrorTracker.ErrorType | Error type categories for classification. |
-
Constructor Summary
Constructors

| Constructor | Description |
|---|---|
| ErrorTracker(String group) | Creates a new ErrorTracker for the specified counter group. |
| ErrorTracker(String group, TaskInputOutputContext<?,?,?,?> context) | Creates a new ErrorTracker with cached counter references. |
-
Method Summary
All Methods

| Modifier and Type | Method | Description |
|---|---|---|
| static ErrorTracker.ErrorType | categorize(Throwable t) | Categorizes a throwable into an error type. |
| void | emitCounters(TaskInputOutputContext<?,?,?,?> context) | Emits all error counters to the Hadoop context. |
| long | getCount(ErrorTracker.ErrorType type) | Returns the count for a specific error type. |
| static String | getCounterName(Throwable t) | Gets the counter name for a throwable based on its categorization. |
| static String | getCounterName(ErrorTracker.ErrorType type) | Gets the counter name constant for a given error type. |
| long | getTotalCount() | Returns the total count of all errors. |
| void | incrementCounters(Throwable t) | Directly increments cached error counters without local accumulation. |
| void | incrementCounters(ErrorTracker.ErrorType type) | Directly increments cached error counters without local accumulation. |
| void | initCounters(TaskInputOutputContext<?,?,?,?> context) | Initializes cached counter references from the Hadoop context. |
| void | recordError(Throwable t) | Records an error with automatic categorization based on the throwable type. |
| void | recordError(ErrorTracker.ErrorType type) | Records an error with explicit category. |
-
-
-
Constructor Detail
-
ErrorTracker
public ErrorTracker(String group)
Creates a new ErrorTracker for the specified counter group. This constructor creates an ErrorTracker without cached counters. Call initCounters(TaskInputOutputContext) in setup() to cache counter references for better performance.
- Parameters:
group - the Hadoop counter group name (e.g., NutchMetrics.GROUP_FETCHER)
-
ErrorTracker
public ErrorTracker(String group, TaskInputOutputContext<?,?,?,?> context)
Creates a new ErrorTracker with cached counter references. This constructor caches all counter references at creation time, avoiding repeated counter lookups in hot paths.
- Parameters:
group - the Hadoop counter group name
context - the Hadoop task context for caching counters
-
-
Method Detail
-
initCounters
public void initCounters(TaskInputOutputContext<?,?,?,?> context)
Initializes cached counter references from the Hadoop context. Call this method in the mapper/reducer setup() method to cache counter references and avoid repeated lookups during processing.
- Parameters:
context - the Hadoop task context
-
recordError
public void recordError(Throwable t)
Records an error with automatic categorization based on the throwable type.
- Parameters:
t - the throwable to categorize and record
-
recordError
public void recordError(ErrorTracker.ErrorType type)
Records an error with an explicit category.
- Parameters:
type - the error type category
-
getCount
public long getCount(ErrorTracker.ErrorType type)
Returns the count for a specific error type.
- Parameters:
type - the error type
- Returns:
the count for that error type
-
getTotalCount
public long getTotalCount()
Returns the total count of all errors.
- Returns:
the total error count
-
emitCounters
public void emitCounters(TaskInputOutputContext<?,?,?,?> context)
Emits all error counters to the Hadoop context. This method should be called once during cleanup to emit aggregated metrics. It only emits counters for error types that have non-zero counts.
If counters were cached via initCounters(TaskInputOutputContext), the cached references are used for better performance.
- Parameters:
context - the Hadoop task context
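The "emit once, non-zero only" behavior described for emitCounters can be illustrated with a small standalone sketch. Nothing here is taken from the Nutch source: `EmitSketch`, its `Category` enum, the derived counter names, and the `ObjLongConsumer` sink are assumptions for illustration. In a real task the sink would typically be something like `context.getCounter(group, name).increment(n)`.

```java
import java.util.Locale;
import java.util.Map;
import java.util.function.ObjLongConsumer;

/** Hypothetical illustration of emitting only non-zero counters; not the Nutch implementation. */
final class EmitSketch {

  /** Assumed subset of categories, enough to show the idea. */
  enum Category { NETWORK, TIMEOUT, OTHER }

  /**
   * Pushes each non-zero per-category count to the sink, followed by the
   * overall total if any error was recorded. Zero counts are skipped so that
   * no counter is created for a category that never occurred.
   */
  static void emit(Map<Category, Long> counts, ObjLongConsumer<String> sink) {
    long total = 0;
    for (Map.Entry<Category, Long> entry : counts.entrySet()) {
      long n = entry.getValue();
      if (n > 0) {
        // Assumed naming scheme, modeled on the counter list in the class overview.
        String name = "errors_" + entry.getKey().name().toLowerCase(Locale.ROOT) + "_total";
        sink.accept(name, n);
        total += n;
      }
    }
    if (total > 0) {
      sink.accept("errors_total", total);
    }
  }
}
```

Skipping zero counts matters because Hadoop creates a counter the first time it is requested, so touching every category on every task would consume counter slots even for errors that never happened.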
-
incrementCounters
public void incrementCounters(Throwable t)
Directly increments cached error counters without local accumulation. Use this method when you want to immediately update Hadoop counters rather than accumulating locally and emitting in cleanup. Requires initCounters(TaskInputOutputContext) to have been called.
- Parameters:
t - the throwable to categorize and count
- Throws:
IllegalStateException - if counters have not been initialized
-
incrementCounters
public void incrementCounters(ErrorTracker.ErrorType type)
Directly increments cached error counters without local accumulation. Use this method when you want to immediately update Hadoop counters rather than accumulating locally and emitting in cleanup. Requires initCounters(TaskInputOutputContext) to have been called.
- Parameters:
type - the error type to count
- Throws:
IllegalStateException - if counters have not been initialized
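The initialization requirement on the incrementCounters methods amounts to a guard on a cached reference. A minimal sketch of that pattern follows; `CachedCounterSketch` and its use of `AtomicLong` as a stand-in for a Hadoop counter are hypothetical and only meant to show why calling the method before initialization fails fast.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical guard around a cached counter reference; not the Nutch implementation. */
final class CachedCounterSketch {

  // Stand-in for a cached Hadoop counter; stays null until init() is called.
  private AtomicLong cachedTotal;

  /** Caches the counter reference, analogous to calling initCounters in setup(). */
  void init(AtomicLong counterFromContext) {
    this.cachedTotal = counterFromContext;
  }

  /** Increments the cached counter directly, failing fast if it was never cached. */
  void increment() {
    if (cachedTotal == null) {
      throw new IllegalStateException("Counters not initialized; call init() first");
    }
    cachedTotal.incrementAndGet();
  }
}
```

Throwing here, rather than silently dropping the increment, surfaces a missing setup call early instead of producing counters that quietly under-report.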
-
categorize
public static ErrorTracker.ErrorType categorize(Throwable t)
Categorizes a throwable into an error type. The categorization checks the exception class hierarchy to determine the most appropriate category. Timeout exceptions are checked first because they are subclasses of IOException and would otherwise be matched by a more general IOException check.
- Parameters:
t - the throwable to categorize
- Returns:
the appropriate ErrorType for the throwable
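The ordering concern in categorize can be seen in a small standalone sketch. The mapping below is an assumption for illustration only (in particular, treating a plain IOException as a network error is a guess, not something this page states), and `CategorizeSketch` is a hypothetical name. What it demonstrates is that `instanceof` checks must go from most specific to most general.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.SocketTimeoutException;

/** Hypothetical hierarchy-based classifier; the category mapping is assumed, not Nutch's. */
final class CategorizeSketch {

  enum Category { TIMEOUT, URL, NETWORK, OTHER }

  static Category categorize(Throwable t) {
    // SocketTimeoutException extends InterruptedIOException, which extends
    // IOException, so this check has to come before the IOException check.
    if (t instanceof SocketTimeoutException) {
      return Category.TIMEOUT;
    }
    // MalformedURLException is also an IOException subclass.
    if (t instanceof MalformedURLException) {
      return Category.URL;
    }
    // Assumption: remaining I/O failures are counted as network errors.
    if (t instanceof IOException) {
      return Category.NETWORK;
    }
    return Category.OTHER;
  }
}
```

If the IOException check were moved to the top, every timeout and malformed URL would be reported under the broader category and the more specific counters would stay at zero.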
-
getCounterName
public static String getCounterName(ErrorTracker.ErrorType type)
Gets the counter name constant for a given error type.
- Parameters:
type - the error type
- Returns:
the counter name constant from NutchMetrics
-
getCounterName
public static String getCounterName(Throwable t)
Gets the counter name for a throwable based on its categorization. This is a convenience method for direct use in catch blocks:

    } catch (Exception e) {
        context.getCounter(group, ErrorTracker.getCounterName(e)).increment(1);
    }

- Parameters:
t - the throwable to get the counter name for
- Returns:
the counter name constant from NutchMetrics
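The counter names listed in the class overview all follow the pattern errors_<category>_total. The documentation says the real names come from constants in NutchMetrics; the sketch below instead derives them from an enum, purely to show the relationship between a category and its counter name. `CounterNameSketch` and its `Category` enum are hypothetical.

```java
import java.util.Locale;

/** Hypothetical category-to-counter-name mapping; Nutch uses NutchMetrics constants instead. */
final class CounterNameSketch {

  /** Assumed categories, chosen to mirror the documented counter list. */
  enum Category { NETWORK, PROTOCOL, PARSING, URL, SCORING, INDEXING, TIMEOUT, OTHER }

  /** Builds a name of the form errors_<category>_total from the enum constant. */
  static String counterName(Category c) {
    // Locale.ROOT keeps lower-casing stable regardless of the JVM's default locale.
    return "errors_" + c.name().toLowerCase(Locale.ROOT) + "_total";
  }
}
```

Named constants, as the real class is documented to use, have the advantage that a renamed enum value cannot silently change a published metric name.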
-
-