  1. [Read Only] - Hippo Site Toolkit 2
  2. HSTTWO-3623

Let o.h.hst.content.beans.standard.AvailableTranslations use external service component for potential performance improvements

Details

    • Improvement
    • Status: Closed
    • Normal
    • Resolution: Fixed
    • None
    • 4.0.0
    • None
    • None
    • Platform sprint 127, Platform Sprint 132

    Description

      When HippoItem#getAvailableTranslations() is called very frequently, CPU usage can spike and overall performance degrades.
      Every HippoItem#getAvailableTranslations() call executes a JCR query, which seems to affect Lucene index based search, especially when the method is invoked many times concurrently.

      So, Ard's idea on this (for a project that invokes the method too often):

      ... could override (improve) org.hippoecm.hst.content.beans.standard.AvailableTranslations#populateTranslations and make it request the translation node ids from a service. This service (an HST Spring bean) caches the result of the query (as node ids) for, say, an hour (or you implement some more advanced invalidation). This ensures that the query triggered by getTranslations is no longer invoked that frequently. Obviously, when you fetch the nodes via the node ids, you have to account for the fact that a JCR node might have been deleted.
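
A minimal sketch of such a caching service, under stated assumptions: the class name CachingTranslationIdService, the query function, and the injectable clock are all hypothetical and not part of the HST API. The query function stands in for whatever JCR query AvailableTranslations runs today. When the cached ids are later resolved with javax.jcr.Session#getNodeByIdentifier, an ItemNotFoundException should be treated as "node deleted" and the id skipped.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical service: caches translation id -> node ids for a fixed time to live. */
class CachingTranslationIdService {

    /** One cached query result plus the time at which it expires. */
    private static final class Entry {
        final List<String> nodeIds;
        final long expiresAtMillis;

        Entry(List<String> nodeIds, long expiresAtMillis) {
            this.nodeIds = nodeIds;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Function<String, List<String>> query; // assumption: wraps the JCR query
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so that expiry can be tested

    CachingTranslationIdService(Function<String, List<String>> query,
                                long ttlMillis,
                                LongSupplier clock) {
        this.query = query;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached node ids, running the query only on a miss or after expiry. */
    List<String> getNodeIds(String translationId) {
        long now = clock.getAsLong();
        Entry entry = cache.get(translationId);
        if (entry == null || now >= entry.expiresAtMillis) {
            // Two threads may both miss and both run the query; the last write wins.
            List<String> ids = Collections.unmodifiableList(
                    new ArrayList<>(query.apply(translationId)));
            entry = new Entry(ids, now + ttlMillis);
            cache.put(translationId, entry);
        }
        return entry.nodeIds;
    }
}
```

With a one hour time to live, ttlMillis would be 3600000 and the clock would typically be System::currentTimeMillis.
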
      ...
      Then you could even make the HST Spring bean translations service run once per day and create a cache of all node ids that belong to the same translation id: you don't even need to query for it. Just run through all JCR nodes once, and for every handle build up the 'translation id --> doc node ids' map.
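
The eager variant described above can be sketched as a single pass that groups node ids by translation id. This is only an illustration: HandleInfo and EagerTranslationIndexBuilder are hypothetical names, and the traversal of the repository itself is assumed to have already produced the (node id, translation id) pairs.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical flattened view of one visited handle. */
final class HandleInfo {
    final String nodeId;
    final String translationId; // null when the document has no translation id

    HandleInfo(String nodeId, String translationId) {
        this.nodeId = nodeId;
        this.translationId = translationId;
    }
}

/** Hypothetical builder for the 'translation id --> node ids' map. */
class EagerTranslationIndexBuilder {

    /** Groups node ids by translation id in one pass, without running any query. */
    static Map<String, Set<String>> build(Iterable<HandleInfo> handles) {
        Map<String, Set<String>> index = new HashMap<>();
        for (HandleInfo handle : handles) {
            if (handle.translationId == null) {
                continue; // nothing to index for untranslated documents
            }
            index.computeIfAbsent(handle.translationId, k -> new LinkedHashSet<>())
                 .add(handle.nodeId);
        }
        return index;
    }
}
```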

      Perhaps this helps to get rid of the queries (assuming that is the real culprit)

      PS: once you have this map, you can even keep it up to date with plain JCR event listeners, keeping it in sync with repository changes.

      Note:
      1) Lazy loading is OK, but it requires a query to find the translations whenever you hit a translation id that is not in the cache.
      2) Eager loading does not require any query, because you traverse every node anyway, so you can build up the cache without any query at all.

      Now, I don't say (1) or (2) is better. They just are different and it depends on the customer which is preferable.

      By the way, I have one more remark: most likely you need a double mapping, because you also need to be able to find the translation id for a handle id. Namely, if a document gets deleted, you can no longer retrieve its translation id. All you have is the event, and the event from the handle gives you the handle id. Then, with this handle id, you need to be able to find the translation id belonging to it (if it exists), and then update the map of

      translation id --> handle node ids

      by deleting the correct handle id.
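
The double mapping could look roughly like the sketch below. TranslationIndex and its method names are hypothetical. In a real setup, remove would presumably be called from a javax.jcr.observation.EventListener handling node removal events, with the handle id taken from the event.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical index that keeps both directions of the mapping in sync. */
class TranslationIndex {

    private final Map<String, Set<String>> handleIdsByTranslationId = new HashMap<>();
    private final Map<String, String> translationIdByHandleId = new HashMap<>();

    /** Registers a handle under a translation id, replacing any earlier mapping. */
    synchronized void put(String handleId, String translationId) {
        remove(handleId); // drop a stale mapping if the translation id changed
        translationIdByHandleId.put(handleId, translationId);
        handleIdsByTranslationId
                .computeIfAbsent(translationId, k -> new HashSet<>())
                .add(handleId);
    }

    /** Removes a handle when only its id is known, e.g. after a deletion event. */
    synchronized void remove(String handleId) {
        String translationId = translationIdByHandleId.remove(handleId);
        if (translationId == null) {
            return; // the handle was never indexed
        }
        Set<String> handleIds = handleIdsByTranslationId.get(translationId);
        if (handleIds != null) {
            handleIds.remove(handleId);
            if (handleIds.isEmpty()) {
                handleIdsByTranslationId.remove(translationId);
            }
        }
    }

    /** Returns a read-only snapshot of the handle ids for a translation id. */
    synchronized Set<String> getHandleIds(String translationId) {
        Set<String> handleIds = handleIdsByTranslationId.get(translationId);
        return handleIds == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(new HashSet<>(handleIds));
    }
}
```

The reverse map is what makes deletion possible: the deleted node can no longer be read, so the translation id has to come from the index itself.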

          People

            Unassigned Unassigned
            wko Woonsan Ko (Inactive)