These are my thoughts on the issues that need to be addressed before version 1 is released, in no particular order:
- #161 - MathML support - COMPLETE
- NO-ISSUE-YET, COMPLETE - Entity support such as nbsp for HTML, InvisibleTimes for MathML and SVG entities. This is tricky as there is no way to programmatically inject entities using the Java XML parser. I proposed adding a doctype dynamically to the start of the XML input, with the desired entities. However, this would mean reading the XML input into a string, rather than just passing a file or input stream to the XML reader. The builder can be used to specify which entities to load. In the end, I used custom doc types and an external entity resolver instead. DOCUMENT.
- #38 - Transforms. A few issues remaining to implement:
  - Link placing doesn't take account of transforms.
  - Translate is not implemented.
  - Some work for transformed boxes in page margins.
  - Testing. Do transforms of MathML, SVG and custom objects work?
- NO-ISSUE-YET - Logging / error handling overhaul - Currently error handling is ad hoc. For example, should we continue on a load failure or throw a fatal exception? I propose to make this configurable by allowing the user to hook logging on a per-run basis and halt on any log message (which will be changed to enum constants) with a poison exception.
- #60 - CSS3 Columns - Currently implemented for text only. Need to debug to allow other box types in columns.
- #126 - Overflowing pages - Currently content that goes past the right margin is cut off silently. This is mostly a problem with tables. I propose a CSS property that allows cut-off content to be printed on the next page. DOCUMENT.
- #204 - Multi run cache - Currently there is a multi-run cache hook method, but the objects stored may not be thread safe. This means it is unsuitable for many use-cases. Propose to remove all caches except font metrics cache.
- NO-ISSUE-YET - Per run cache - Need to make sure nothing is being placed into a PDF document more than once. For example, is an img from the img tag and a background image from the same url embedded twice?
- #83 - Unicode font justification fix - There is a fix in #143 but we are waiting for PDFBox 2.0.9 to implement it.
- #123 - RTL table layout - Altering table layout to correct RTL scares me, but there have been a couple of requests so we should try.
- NO-ISSUE-YET - Remove remnants of configuration class and move all config to builders. There are still some config values that are coming from various file locations.
- #145 - Padding with percentages not working - It appears that it is resolving padding percentage values with a zero base value.
- NO-ISSUE-YET - Make sure all dependencies are up to date. Do this after the test system is introduced.
- #208 - Semi-automatic testing. Propose some sort of semi-automatic testing with image diff. This would allow you to run before and after changes to make sure nothing has been broken. Unfortunately, we can't have a single true source of reference results, as font handling, etc. can reportedly change slightly between JREs.
- NO-ISSUE-YET - Java2D cleanup. Make sure all Java2D functionality is in the Java2D module and delete broken code samples and tools. Also make sure Java2D RTL works.
- NO-ISSUE-YET - Documentation. Review and complete the template author's guide and the integration guide, and create a comparison with other solutions such as Flying Saucer, headless browsers, etc.
- #180 - Performance and memory improvements - IN PROGRESS.
- #143 - Other improvements from this pull-request.
- NO-ISSUE-YET - Floating elements escape elements with `overflow:hidden` set.
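
To make the entity support item concrete, here is a minimal sketch of the external entity resolver idea using only the standard JAXP API. The class name `EntitySketch` and the `ENTITIES` constant are hypothetical and are not the project's actual code. It assumes the input declares some external doctype, which the resolver then answers with in-memory entity declarations.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only, not part of the project's API.
final class EntitySketch {
    // Entity declarations served in place of whatever DTD the document names.
    private static final String ENTITIES =
        "<!ENTITY nbsp \"&#160;\">\n" +
        "<!ENTITY InvisibleTimes \"&#8290;\">\n";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Answer every external entity request (including the DTD) with the
        // in-memory declarations, so nothing is read from disk or the network.
        builder.setEntityResolver((publicId, systemId) ->
            new InputSource(new StringReader(ENTITIES)));
        // The input can still be supplied as a stream or reader; it does not
        // have to be rewritten as a string to prepend declarations.
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

With this in place, a document such as `<!DOCTYPE html PUBLIC "-//EXAMPLE//DTD SKETCH//EN" "http://example.invalid/sketch.dtd"><html>a&nbsp;b</html>` should parse with `&nbsp;` expanded to U+00A0. The doctype identifiers here are made up for the example.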
Hopefully, most of the other open issues can wait for subsequent releases. NOTE: There will be several more release candidate versions before version 1.
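
For the logging / error handling item in the list above, the per-run hook with a poison exception might look roughly like the following. This is only a sketch: every name in it (`LoggingSketch`, `LogMessageId`, `RunLogger`, `PoisonException`) is hypothetical, and the two enum constants are placeholders rather than a proposed message list.

```java
import java.util.List;
import java.util.function.Consumer;

// All names below are hypothetical and only illustrate the proposal.
final class LoggingSketch {
    // Log messages become enum constants so a hook can match on them.
    enum LogMessageId { RESOURCE_LOAD_FAILED, CSS_PARSE_ERROR }

    // Thrown by a hook to abort the current run.
    static final class PoisonException extends RuntimeException {
        final LogMessageId id;
        PoisonException(LogMessageId id) {
            super("Run halted on log message " + id);
            this.id = id;
        }
    }

    // One logger instance per run; the user supplies the hook.
    static final class RunLogger {
        private final Consumer<LogMessageId> hook;
        RunLogger(Consumer<LogMessageId> hook) { this.hook = hook; }

        void log(LogMessageId id) { hook.accept(id); }

        // Strict mode: any log message halts the run.
        static RunLogger haltOnAny() {
            return new RunLogger(id -> { throw new PoisonException(id); });
        }

        // Lenient mode: record messages and carry on.
        static RunLogger collecting(List<LogMessageId> sink) {
            return new RunLogger(sink::add);
        }
    }
}
```

A caller that wants to fail fast would pass `RunLogger.haltOnAny()` for that run and catch `PoisonException` at the top level, while a caller that wants best-effort output would use a collecting or filtering hook instead.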
I'd appreciate feedback from anybody, especially @rototor. Any other issues that need to be addressed before version 1?
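
As a starting point for discussing the image diff testing item (#208), the comparison step could be as simple as the sketch below. It assumes the before and after pages have already been rendered to `BufferedImage` instances by some other means. The class and method names are hypothetical.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration only.
final class ImageDiffSketch {
    /**
     * Counts pixels whose ARGB values differ between two renderings.
     * Returns -1 when the dimensions differ, since a per-pixel count
     * is not meaningful in that case.
     */
    static int countDifferentPixels(BufferedImage before, BufferedImage after) {
        if (before.getWidth() != after.getWidth()
                || before.getHeight() != after.getHeight()) {
            return -1;
        }
        int different = 0;
        for (int y = 0; y < before.getHeight(); y++) {
            for (int x = 0; x < before.getWidth(); x++) {
                if (before.getRGB(x, y) != after.getRGB(x, y)) {
                    different++;
                }
            }
        }
        return different;
    }
}
```

Because reference results may vary between JREs, a developer would generate the "before" images locally on their own setup, make the change, and then compare the "after" images against those, rather than against images checked into the repository.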