Details
Type: Improvement
Status: Closed
Priority: Top
Resolution: Fixed
Description
We should set an empty (or single-byte) hippo:text binary property when the CMS fails to extract text from a PDF. The failure can be any exception, including an NPE, so the code should catch Throwable.
Otherwise, when CMS extraction of a PDF of, say, 10 MB fails, every repository node in the cluster subsequently tries to extract the same PDF during indexing. If a hippo:text binary property is already present, this does not happen. Setting it prevents un-extractable PDFs from causing expensive cluster-wide hiccups.
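A minimal sketch of the proposed fallback, not the actual ResourceHelper code. The class HippoTextFallback and the interfaces TextExtractor and TextSink are hypothetical stand-ins for the Tika extraction call and for writing the hippo:text binary property on the node. The real JCR and Tika calls are left out on purpose.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper illustrating the fallback; names are assumptions. */
final class HippoTextFallback {

    /** Stand-in for the Tika-based extraction step, which may throw anything. */
    interface TextExtractor {
        String extract() throws Exception;
    }

    /** Stand-in for writing the hippo:text binary property on the resource node. */
    interface TextSink {
        void store(byte[] content);
    }

    private static final byte[] EMPTY = new byte[0];

    /**
     * Always stores something via the sink: the extracted text on success,
     * an empty byte array on any failure. Returns true if extraction succeeded.
     */
    static boolean extractAndStore(TextExtractor extractor, TextSink sink) {
        byte[] content;
        boolean extracted;
        try {
            String text = extractor.extract();
            content = (text == null) ? EMPTY : text.getBytes(StandardCharsets.UTF_8);
            extracted = true;
        } catch (Throwable t) {
            // Deliberately broad: runtime failures such as an NPE, and errors,
            // must not leave the node without a hippo:text value.
            content = EMPTY;
            extracted = false;
        }
        sink.store(content);
        return extracted;
    }
}
```

With this shape the property is written whether or not extraction succeeds, so an indexer that checks for an existing hippo:text value would have no reason to retry the extraction on each cluster node.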
19.09.2012 14:33:48 WARN [org.hippoecm.frontend.editor.plugins.resource.ResourceHelper.handlePdfAndSetHippoTextProperty():144] An exception occurred while trying to set Tika configuration: {}:
org.apache.tika.exception.TikaException: Unexpected SAX error
at org.apache.tika.utils.ParseUtils.getStringContent(ParseUtils.java:115)
at org.hippoecm.frontend.editor.plugins.resource.ResourceHelper.handlePdfAndSetHippoTextProperty(ResourceHelper.java:135)
at org.hippoecm.frontend.plugins.gallery.model.DefaultGalleryProcessor.makeImage(DefaultGalleryProcessor.java:299)
at org.hippoecm.frontend.plugins.gallery.GalleryWorkflowPlugin.createGalleryItem(GalleryWorkflowPlugin.java:159)
at org.hippoecm.frontend.plugins.gallery.GalleryWorkflowPlugin.access$100(GalleryWorkflowPlugin.java:70)
at org.hippoecm.frontend.plugins.gallery.GalleryWorkflowPlugin$UploadDialog.handleUploadItem(GalleryWorkflowPlugin.java:92)
at org.hippoecm.frontend.plugins.yui.upload.MultiFileUploadDialog$2.onFileUpload(MultiFileUploadDialog.java:66)
at org.hippoecm.frontend.plugins.yui.upload.FileUploadWidget$1.onFileUpload(FileUploadWidget.java:111)
at org.hippoecm.frontend.plugins.yui.upload.ajax.AjaxMultiFileUploadComponent$UploadBehavior.onRequest(AjaxMultiFileUploadComponent.java:68)
at org.apache.wicket.request.target.component.listener.BehaviorRequestTarget.processEvents(BehaviorRequestTarget.java:157)
at org.apache.wicket.request.AbstractRequestCycleProcessor.processEvents(AbstractRequestCycleProcessor.java:92)
at org.hippoecm.frontend.PluginRequestCycleProcessor.processEvents(PluginRequestCycleProcessor.java:95)
at org.apache.wicket.RequestCycle.processEventsAndRespond(RequestCycle.java:1252)
at org.apache.wicket.RequestCycle.step(Req
Issue Links
- discovered while testing: REPO-501 Even when a hippo:text property is available on the node, indexing will still attempt to extract it (Closed)