Bump tika-core from 1.23 to 1.25
Bumps tika-core from 1.23 to 1.25.
Changelog
Sourced from tika-core's changelog.
Release 2.0.0-ALPHA - 01/13/2021
BREAKING CHANGES in 2.0.0
General
- OCR is now triggered automatically for PDFs if tesseract is on the user's path. See https://cwiki.apache.org/confluence/display/TIKA/TikaOCR#TikaOCR-disable-ocr for how to disable OCR.
- Remove deprecated Metadata keys/properties (TIKA-1974).
- Removed dangerous calls to read an InputStream or convert to bytes without specifying a charset.
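For callers who relied on charset-less reads, the last item above amounts to naming the charset at every decode. Below is a minimal sketch of that pattern using only the JDK (Java 9+ for `readAllBytes`). It does not use any Tika API; the class and method names (`ExplicitCharsetRead`, `readAll`) are hypothetical and chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tika: decodes a stream with a charset
// chosen by the caller instead of the platform default.
final class ExplicitCharsetRead {

    // Reads the whole stream and decodes it with the given charset.
    static String readAll(InputStream in, Charset charset) {
        try {
            return new String(in.readAllBytes(), charset);
        } catch (IOException e) {
            // Rethrown unchecked so the helper can be used in lambdas and streams.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "caf\u00E9".getBytes(StandardCharsets.UTF_8);

        // Same bytes, two charsets: only the matching charset round-trips.
        String right = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        String wrong = readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);

        System.out.println(right.equals("caf\u00E9")); // true
        System.out.println(wrong.equals("caf\u00E9")); // false
    }
}
```

The second read shows why an implicit default is risky: the same bytes decode to different text depending on the charset in effect.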
tika-parsers
- The parser modules have been broken into three main modules: tika-parsers-classic, tika-parsers-extended and tika-parsers-advanced. Users may now need to add tika-parsers-extended to tika-app and tika-server to include parsers that used to be included by default (for example: envi, gdal, grib, isatab, netcdf).
- ChmParser was moved to org.apache.tika.parser.microsoft.chm
- RTFParser was moved to org.apache.tika.parser.microsoft.rtf
tika-app
tika-server
tika-server now by default forks a process to isolate the parsing in the forked process (this was called the -spawnChild option in tika-1.x). Clients must now expect that tika-server will restart on OOM, timeouts, crashes or after parsing a large number of files. When this happens, tika-server will restart and will not accept connections for brief periods. The less robust, legacy behavior of not forking a process is available with "-noFork".
tika-server's /metadata endpoint requires tika-server-classic to write XMP/rdf output. This output is not available in tika-server-core.
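Because a forking tika-server may refuse connections briefly while it restarts, clients may want to retry failed requests. The following is a rough sketch of one way to do that with exponential backoff, using only the JDK. Nothing here is a Tika API: `RetryOnRestart`, `IoCall`, and `callWithRetry` are hypothetical names, and the attempt count and delays are arbitrary assumptions to be tuned per deployment.

```java
import java.io.IOException;

// Hypothetical client-side helper, not part of Tika.
final class RetryOnRestart {

    // A request that may fail with an I/O error, e.g. a refused connection.
    interface IoCall<T> {
        T call() throws IOException;
    }

    // Runs the request up to maxAttempts times, doubling the wait between tries.
    static <T> T callWithRetry(IoCall<T> request, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (IOException e) {
                last = e; // remember the failure; the server may be restarting
            }
            if (attempt == maxAttempts) {
                break; // no point sleeping after the final attempt
            }
            try {
                Thread.sleep(delay);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                break;
            }
            delay *= 2;
        }
        throw new IllegalStateException("request did not succeed", last);
    }
}
```

A caller would wrap its HTTP request to the server in the `IoCall` lambda. Only `IOException` is retried here, so non-I/O errors propagate immediately instead of being masked by the retry loop.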
Other changes:
General code cleanup (PeterAlfredLee)
Great optimization in ForkParser (TIKA-3237).
Fix parsing of emails attached to other emails in PST files (TIKA-3004).
Release 1.25 - 11/25/2020
Fix inconsistent license in xmpcore (TIKA-3204).
General upgrades including some dependencies with recently found security vulnerabilities (TIKA-3119).
Add detection and a parser for flat ODF files (TIKA-3159).
... (truncated)
Commits
- 0090eba [maven-release-plugin] prepare release 1.25-rc2
- a464047 roll back for 1.25-rc2, update release date
- 3dc1e8b roll back for 1.25-rc2
- 65744e7 Updated CHANGES.txt with details on TIKA-3189 and TIKA-3227
- 1abf0eb fix whitespace
- 2e89e4c update README.txt from main branch
- d4e607a [maven-release-plugin] prepare for next development iteration
- 760aa4a [maven-release-plugin] prepare release 1.25-rc1
- 2744672 Fix license issues identified via rat check
- 2775afe Update README for 1.25 release
- Additional commits viewable in compare view