String.toUpperCase() can change the length of a string, because some characters (typographic ligatures, for example) expand to several characters when uppercased. As a result, String.length() and String.toUpperCase().length() may differ. E.g. the uppercase of the single character ß is the two characters SS.
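A minimal sketch of the length change, for illustration only. The class and method names are hypothetical, and Locale.ROOT is an assumption made here to keep the result independent of the default locale; the affected code may call toUpperCase() without a locale.

```java
import java.util.Locale;

public class UpperCaseLengthDemo {

    // Length of the uppercased form of s (hypothetical helper for this demo).
    static int upperCasedLength(String s) {
        return s.toUpperCase(Locale.ROOT).length();
    }

    public static void main(String[] args) {
        String s = "\u00DF"; // the single character U+00DF (sharp s)
        System.out.println(s.length());                 // prints 1
        System.out.println(s.toUpperCase(Locale.ROOT)); // prints SS
        System.out.println(upperCasedLength(s));        // prints 2
    }
}
```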
SimpleHtmlExtractor#getInnerHtmlSimply calls line.substring(0, offset) with an offset obtained from line.toUpperCase().indexOf(endTag). That offset refers to the uppercased string, not to line itself, so the result is not correct and the call can throw a StringIndexOutOfBoundsException.
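The mismatch can be sketched in isolation as follows. This is not the SimpleHtmlExtractor code: the class, the helper indexOfIgnoreCase, and the sample line are hypothetical, and the case-insensitive search via String.regionMatches is only one possible way to obtain an offset that is valid for the original line.

```java
import java.util.Locale;

public class EndTagOffsetDemo {

    // Hypothetical helper: case-insensitive search that returns an index
    // into the ORIGINAL line, so it is always safe to use with substring().
    static int indexOfIgnoreCase(String line, String tag) {
        int last = line.length() - tag.length();
        for (int i = 0; i <= last; i++) {
            if (line.regionMatches(true, i, tag, 0, tag.length())) {
                return i;
            }
        }
        return -1;
    }

    // The problematic pattern: the index is computed on the uppercased copy.
    static int upperCasedOffset(String line, String endTag) {
        return line.toUpperCase(Locale.ROOT).indexOf(endTag);
    }

    public static void main(String[] args) {
        // Eight sharp-s characters (U+00DF) followed by the closing tag.
        String line = "\u00DF\u00DF\u00DF\u00DF\u00DF\u00DF\u00DF\u00DF</body>";
        String endTag = "</BODY>";

        int unsafe = upperCasedOffset(line, endTag); // index into the uppercased copy
        int safe = indexOfIgnoreCase(line, endTag);  // index into the original line

        System.out.println(line.length()); // prints 15
        System.out.println(unsafe);        // prints 16
        System.out.println(safe);          // prints 8

        try {
            line.substring(0, unsafe);     // 16 > 15, so this throws
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("offset from uppercased copy is out of bounds");
        }

        // The offset from the original line selects exactly the eight characters.
        System.out.println(line.substring(0, safe).length()); // prints 8
    }
}
```

With fewer expanding characters the uppercased offset may still fall inside the line, in which case no exception is thrown but the substring is cut at the wrong position.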
Reproduction: use the following as the hippostd:content property of an HTML field and have it rendered by the <hst:html /> tag.
This results in a StringIndexOutOfBoundsException, because the index of </BODY> in the uppercased second line is applied to the original second line.
This is a deep corner case, since <html><body> is no longer used except in old content.