Class TableMultiResultNodeWorker<S, R extends TableMultiResult>

  • All Implemented Interfaces:
    Runnable

    public abstract class TableMultiResultNodeWorker<S, R extends TableMultiResult>
    extends Object
    implements Runnable
    The worker for table multi-result nodes.
    TODO: Instead of keeping a fixed history size, aggregate data into progressively larger time ranges and track the mean, min, max, and standard deviation (or perhaps the 5th/95th percentiles) of each range. Keep the following time ranges:
     1 minute for 2 days = 2880 samples
     5 minutes for 5 days = 1440 samples
     15 minutes for 7 days = 672 samples
     30 minutes for 14 days = 672 samples
     1 hour for 28 days = 672 samples
     2 hours for 56 days = 672 samples
     4 hours for 112 days = 672 samples
     1 day forever beyond this
     ==================================
     total: 7680 samples + one per day beyond 224 days
     
    Perform the aggregation in a single background thread shared by all workers. Recover gracefully from an unexpected shutdown by inserting each aggregate before removing the samples it replaces, and by detecting an interrupted aggregation on the next pass. The linked list should always be sorted by time descending; confirm this on each aggregation pass.
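    The arithmetic behind the proposed time ranges can be checked with a short sketch. The RetentionTiers and Tier names are hypothetical and not part of this API; the sketch models only the bounded ranges from the list and omits the unbounded one-day range.

```java
import java.time.Duration;
import java.util.List;

/** Hypothetical model of the proposed time ranges; not part of the documented API. */
final class RetentionTiers {

    /** One tier: samples at the given resolution are kept for the given retention period. */
    static final class Tier {
        final Duration resolution;
        final Duration retention;

        Tier(Duration resolution, Duration retention) {
            this.resolution = resolution;
            this.retention = retention;
        }

        /** Number of samples the tier holds when full. */
        long sampleCount() {
            return retention.toMinutes() / resolution.toMinutes();
        }
    }

    /** The bounded tiers from the list; the unbounded "1 day forever" tier is omitted. */
    static final List<Tier> TIERS = List.of(
        new Tier(Duration.ofMinutes(1), Duration.ofDays(2)),
        new Tier(Duration.ofMinutes(5), Duration.ofDays(5)),
        new Tier(Duration.ofMinutes(15), Duration.ofDays(7)),
        new Tier(Duration.ofMinutes(30), Duration.ofDays(14)),
        new Tier(Duration.ofHours(1), Duration.ofDays(28)),
        new Tier(Duration.ofHours(2), Duration.ofDays(56)),
        new Tier(Duration.ofHours(4), Duration.ofDays(112))
    );

    /** Sum of the sample counts of all bounded tiers. */
    static long totalSamples() {
        return TIERS.stream().mapToLong(Tier::sampleCount).sum();
    }

    /** Sum of the retention periods of all bounded tiers, in days. */
    static long totalDays() {
        return TIERS.stream().mapToLong(tier -> tier.retention.toDays()).sum();
    }

    public static void main(String[] args) {
        for (Tier tier : TIERS) {
            System.out.println(tier.resolution.toMinutes() + " min for "
                + tier.retention.toDays() + " days = " + tier.sampleCount() + " samples");
        }
        System.out.println("total: " + totalSamples() + " samples over " + totalDays() + " days");
    }
}
```

    The computed totals agree with the figures in the list: 7680 samples covering 224 days.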
    Author:
    AO Industries, Inc.
    • Method Detail

      • getNextStartupDelay

        protected int getNextStartupDelay()
        The default startup delay is within five minutes.
      • run

        public final void run()
        Specified by:
        run in interface Runnable
      • getSleepDelay

        protected long getSleepDelay(boolean lastSuccessful,
                                     AlertLevel alertLevel)
        The default sleep delay is five minutes when successful or one minute when unsuccessful.
        Parameters:
        alertLevel - When null, treated as AlertLevel.UNKNOWN
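        The documented default can be sketched as follows. This is a stand-alone illustration, not the class's source: the SleepDelaySketch class and its small AlertLevel enum are hypothetical stand-ins, and the return value is assumed to be in milliseconds, which this page does not state.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-alone sketch of the documented default sleep delay. */
final class SleepDelaySketch {

    /** Simplified stand-in for the library's AlertLevel; only UNKNOWN is named on this page. */
    enum AlertLevel { NONE, UNKNOWN }

    /** Applies the documented rule that a null alert level is treated as UNKNOWN. */
    static AlertLevel normalize(AlertLevel alertLevel) {
        return (alertLevel == null) ? AlertLevel.UNKNOWN : alertLevel;
    }

    /**
     * Five minutes after a successful check, one minute after a failed one.
     * ASSUMPTION: the delay is expressed in milliseconds.
     */
    static long getSleepDelay(boolean lastSuccessful, AlertLevel alertLevel) {
        // The documented default depends only on success; an override could
        // also branch on the normalized alert level.
        AlertLevel effective = normalize(alertLevel);
        return TimeUnit.MINUTES.toMillis(lastSuccessful ? 5 : 1);
    }
}
```

        A subclass that wants faster rechecks while a resource is in an alert state could override the method and branch on the alert level.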
      • getHistorySize

        protected abstract int getHistorySize()
        The number of history items to store.
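        One way to picture a fixed-size history is a linked list kept newest-first and trimmed to the history size on each insert. The BoundedHistory class below is a hypothetical sketch under that assumption; the worker's real storage is not shown on this page.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/** Hypothetical fixed-size history, ordered newest first (time descending). */
final class BoundedHistory<R> {

    private final int historySize;
    private final LinkedList<R> results = new LinkedList<>();

    BoundedHistory(int historySize) {
        if (historySize < 1) {
            throw new IllegalArgumentException("historySize must be at least 1");
        }
        this.historySize = historySize;
    }

    /** Adds the newest result at the head and drops the oldest results beyond the limit. */
    synchronized void add(R result) {
        results.addFirst(result);
        while (results.size() > historySize) {
            results.removeLast();
        }
    }

    /** Returns a copy of the history, newest first. */
    synchronized List<R> snapshot() {
        return new ArrayList<>(results);
    }
}
```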
      • getSample

        protected abstract S getSample()
                                throws Exception
        This is the main monitor routine. Gets the current sample for this worker; any error should result in an exception. The sample may be any object that encapsulates the state of the resource in order to determine its alert level, alert message, and overall result.
        Throws:
        Exception
      • newErrorResult

        protected abstract R newErrorResult(long time,
                                            long latency,
                                            AlertLevel alertLevel,
                                            String error)
        Creates a new result container object for error condition.
      • newSampleResult

        protected abstract R newSampleResult(long time,
                                             long latency,
                                             AlertLevel alertLevel,
                                             S sample)
        Creates a new result container object for success condition.
      • cancel

        protected void cancel(Future<S> future)
        Cancels the current getSample call on a best-effort basis. Implementations of this method must not block. This default implementation calls future.cancel(true).
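        The effect of the default future.cancel(true) can be seen in a small stand-alone sketch. CancelSketch and its demo method are hypothetical; the sleeping task stands in for a slow getSample call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical demonstration of best-effort, non-blocking cancellation. */
final class CancelSketch {

    /** Mirrors the documented default: delegate to Future.cancel(true), which returns immediately. */
    static <S> void cancel(Future<S> future) {
        future.cancel(true);
    }

    /** Starts a blocking task, cancels it, and reports whether the task observed the interrupt. */
    static boolean demo() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch interrupted = new CountDownLatch(1);
        try {
            Future<String> future = executor.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(60_000); // stands in for a slow getSample()
                    return "sample";
                } catch (InterruptedException e) {
                    interrupted.countDown(); // the worker thread saw the cancellation
                    throw e;
                }
            });
            started.await(); // make sure the task is running before cancelling
            cancel(future);
            return future.isCancelled() && interrupted.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

        Cancellation is only best-effort: a task that ignores interrupts keeps running even though the future reports itself as cancelled.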
      • getAlertLevelAndMessage

        protected abstract AlertLevelAndMessage getAlertLevelAndMessage(S sample,
                                                                        Iterable<? extends R> previousResults)
                                                                 throws Exception
        Determines the alert level and message for the provided sample. If unable to parse the sample, may throw an exception to report the error. This method should not block or delay for any reason.
        Throws:
        Exception
      • useFutureTimeout

        protected boolean useFutureTimeout()
        By default, the call to getSample uses a Future and times out after 5 minutes. If the monitoring check can never block indefinitely, it is more efficient to skip this decoupling.
      • getFutureTimeout

        protected long getFutureTimeout()
        The default future timeout is 5 minutes.
      • getFutureTimeoutUnit

        protected TimeUnit getFutureTimeoutUnit()
        The default future timeout unit is MINUTES.
        See Also:
        TimeUnit.MINUTES
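
        The decoupling controlled by useFutureTimeout, getFutureTimeout, and getFutureTimeoutUnit can be sketched with the standard java.util.concurrent API. FutureTimeoutSketch, its sample method, and the demo helper are hypothetical; how the worker actually schedules getSample is not shown on this page.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch of running a sample call with or without a Future timeout. */
final class FutureTimeoutSketch {

    /**
     * Calls getSample directly when no timeout is wanted; otherwise runs it on the
     * executor and gives up after the timeout, cancelling the task on a best-effort basis.
     */
    static <S> S sample(ExecutorService executor, Callable<S> getSample,
                        boolean useFutureTimeout, long timeout, TimeUnit unit) throws Exception {
        if (!useFutureTimeout) {
            return getSample.call(); // no decoupling: cheaper, but may block the caller
        }
        Future<S> future = executor.submit(getSample);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true); // same call the default cancel method is documented to make
            throw e;
        }
    }

    /** Runs a task that sleeps for taskMillis under a timeout of timeoutMillis. */
    static String demo(long taskMillis, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            return sample(executor, () -> {
                Thread.sleep(taskMillis);
                return "sample";
            }, true, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timeout";
        } catch (Exception e) {
            return "error: " + e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

        With the documented defaults, the equivalent call would pass a timeout of 5 and TimeUnit.MINUTES.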