Uploaded image for project: 'Hippo CMS'
  1. Hippo CMS
  2. CMS-15839

Synchronous event listeners in Hippo Repository / Jackrabbit should be ordered

    XMLWordPrintable

Details

    • Improvement
    • Status: Closed
    • Normal
    • Resolution: Fixed
    • None
    • None
    • None
    • None

    Description

      Jackrabbit has two kind of event listeners:

      1. Synchronous Event Listeners
      2. Asynchronous Event Listeners

      The synchronous event listeners are executed blocking the request which triggers a transaction. The asynchronous listeners are updated later asynchronously by a different thread than the thread making the transaction.

      Note that the event listeners, also the synchronous, are triggered after the database persistence has been finished. In case the transaction is large change set, SOME the handling of the synchronous event listeners can take considerable time : for example the (Hippo)SearchManager is a synchronous event listener which takes care of indexing all newly added and deleted nodes. This process can take quite some time (multiple seconds) in case a large set of nodes are added, for example during an HST channel branch which makes three copies of an hst:configuration (live workspace/upstream and preview workspace)

      In general, all synchronous event listeners are really lightweight, except the SearchManager which does the indexing of newly added and removed jcr nodes. The SearchManager is always the FIRST synchronous event listener registered by Jackrabbit as done via:

      org.apache.jackrabbit.core.RepositoryImpl.WorkspaceInfo#doPostInitialize
      

      After that, the JR LockManager gets registered, and then all the Hippo Repository and later downstream projects like the HST bootstrapping a synchronous event listener in the InvalidationMonitor.

      when changes are persisted in JR, the

      org.apache.jackrabbit.core.observation.ObservationDispatcher#dispatchEvents
      

      dispatches the events first to all synchronous event listeners, and then asynchronously to all asynchronous event listeners.

      The HST Model uses the synchronous event listener as well as an asynchronous event listener (to avoid deadlocks) to 'know' when to reload the model, see ENT-5049 : the synchronous event listener is needed to always be in time for reloading the model, the asynchronous event listener is used to avoid deadlocks as the synchronous event listener cannot be used in the HST loading in a synchronized block as it results in deadlocks.

      To make sure the HST waits for the asynchronous event listener to catch up with the synchronous event listener, we have logic something like this:

      public void awaitEventsConsistency() {
      
              // max wait 1 second for JCR events to arrive : heuristics of 1 second found during monitoring tests on
              // jenkins: during heavy operations with eg branch merging the delay can be significant
              long maxWait = 1000;
      
              final long start = System.currentTimeMillis();
              final long waitUntil = start + maxWait;
      
              final long synchronousOnEventsCounter = getSynchronousOnEventsCounter();
              final long asynchronousOnEventsCounter = getAsynchronousOnEventsCounter();
      
              // in the extreme unlikely situation, after getSynchronousOnEventsCounter() is invoked, the async event counter
              // could receive (just like the sync event counter) a new event, hence in the if below => instead of just =
              if (asynchronousOnEventsCounter >= synchronousOnEventsCounter) {
                  syncAsyncListenersOutOfSync.set(0);
                  log.debug("asynchronousOnEventsCounter is on par with synchronousOnEventsCounter");
                  return;
              }
      
              log.info("Waiting for asynchronous events to arrive. Async counter is at '{}'. Waiting until it is at least '{}'",
                      asynchronousOnEventsCounter, synchronousOnEventsCounter);
       
              synchronized (this) { 
                    // code which waits until synchronousOnEventsCounter == asynchronousOnEventsCounter
      
              }
      

      As long as above does not return, the HST Model loading is halted. Obviously, the asynchronous event should arrive super fast after the synchronous event.

      Problem

      The order in which the synchronous event listeners are executed is non deterministic because JR uses a java Set

      As a result, it can be that first the (super lightweight) synchronousOnEventsCounter listener is notified (triggering an indication that the HST Model has to be reloaded), and after the synchronousOnEventsCounter, the synchronous JR SearchManager is notified: the latter however can take multiple seconds to index in case the change set is very large, for example during large hst configuration branching. During all this time, the asynchronousOnEventsCounter does not yet receive the event, the and awaitEventsConsistency keeps waiting until 1 second passed : all this time, users requiring an up2date HST Model are all waiting.

      Solution

      Jackrabbit should keep the synchronous event listener in the order in which they are registered: this way, the SearchManager synchronous listener is always the first one to be triggered.

      QA Expectation

      After this fix, it is more unlikely to get the message

      INFO  [org.hippoecm.hst.platform.model.InvalidationMonitor.awaitEventsConsistency():139] Expected within 1000 ms the async HstConfiguration EventListener to have caught up with the InvalidationMonitor#SynchronousOnEventCounter listener counter. However async counter is as '451' and sync counter at '452'. Stop waiting longer.
      

      For this reason, when we bump the JR version, we will change the above INFO level message to a WARN as it is an indication of a potential problem as it causes hiccups in reload

      Attachments

        Activity

          People

            aschrijvers Ard Schrijvers
            aschrijvers Ard Schrijvers
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: