Details
- Improvement
- Status: Closed
- Normal
- Resolution: Fixed
Description
We use tika-parser (1.3), which has a very large footprint of 'optional' dependencies, many of which we don't need and shouldn't pull in by default.
The tika-parser parsers excluded by default are:
- PKCS7 signed messages (bouncycastle bcmail/bcprov)
- audio and video formats (vorbis, mp4)
- NetCDF and HDF
- MIME4J (raw email and mbox files)
- EXIF (image full text indexing)
- Rome (RSS and Atom feeds)
- Commons Compress (archives like zips) (this had to be reverted because of a regression in CMS7-9412)
- asm (Java classes)
- Boilerpipe (surplus "clutter" around main html content)
Anyone needing such resources indexed with tika-parser can and should add the needed dependencies explicitly instead.
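For illustration, a minimal sketch of what such an exclusion and a downstream re-inclusion might look like in a Maven POM. This is not the actual change made for this issue: the Maven coordinates (`org.apache.tika:tika-parsers`, `rome:rome`, `de.l3s.boilerpipe:boilerpipe`) and the Rome version are assumptions and should be checked against the tika-parsers 1.3 POM in use.

```xml
<!-- Sketch only: all coordinates are assumptions, verify them against
     the dependency list of the tika-parsers version you actually use. -->
<dependency>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-parsers</artifactId>
  <version>1.3</version>
  <exclusions>
    <!-- Rome (RSS and Atom feeds) -->
    <exclusion>
      <groupId>rome</groupId>
      <artifactId>rome</artifactId>
    </exclusion>
    <!-- Boilerpipe (clutter removal around main HTML content) -->
    <exclusion>
      <groupId>de.l3s.boilerpipe</groupId>
      <artifactId>boilerpipe</artifactId>
    </exclusion>
  </exclusions>
</dependency>

<!-- In a project that does need feeds indexed: declare the excluded
     library explicitly. The version shown is an assumed example. -->
<dependency>
  <groupId>rome</groupId>
  <artifactId>rome</artifactId>
  <version>0.9</version>
</dependency>
```

Running `mvn dependency:tree` before and after is a quick way to confirm which transitive dependencies are still being pulled in.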
Issue Links
- causes CMS-9412 Regression - Unable to upload image/asset in CMS (Closed)