Update parse options to reflect modern libxml usage #3439

@flavorjones

Context

In #3360 the libxml2 maintainer left some suggestions about how we're exposing and documenting some of the parse options.

He mentioned:

  • DTDATTR and DTDVALID imply DTDLOAD and are unsafe as well.
  • SAX1 should probably not be exposed.
  • NODICT should probably not be exposed.
  • XINCLUDE, NOXINCNODE and NOBASEFIX are only used by the XML Reader and XInclude API.
  • HUGE is safe these days (since 2.10)

and some forward-looking statements about the upcoming 2.14 release:

  • UNZIP: Enable decompression. This option has no real effect for now. The plan is that users who really need decompression start to add the option. At a later point, it will be required to enable decompression.
  • NO_SYS_CATALOG: Don't use system catalogs when resolving DTDs or entities.
  • CATALOG_PI: Enable oasis-xml-catalog PIs. This is a really obscure feature that should have never been enabled by default. I don't think your users need it.

Actions

The documentation actions I'd like to take:

  • Mark SAX1 and NODICT as :nodoc:
  • Update documentation for DTDATTR and DTDVALID to state that they imply DTDLOAD, and add safety warnings
    • And double-check that these are all off by default
  • Update documentation for the XINCLUDE set (XINCLUDE, NOXINCNODE, NOBASEFIX) to specify they're only used by Reader and Node#process_xincludes
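
The "off by default" double-check could be sketched roughly as below. This is a minimal, self-contained illustration, not the project's code: the bit positions mirror libxml2's xmlParserOption enum, while the `ParseBits` module, the `unsafe_dtd_bits` helper, and the `EXAMPLE_DEFAULT` bitset are hypothetical names invented for the sketch. A real check would run against the actual default bitsets.

```ruby
# Bit positions mirror libxml2's xmlParserOption enum. The module and helper
# names are hypothetical and exist only for this sketch.
module ParseBits
  RECOVER   = 1 << 0
  DTDLOAD   = 1 << 2
  DTDATTR   = 1 << 3
  DTDVALID  = 1 << 4
  NONET     = 1 << 11
  BIG_LINES = 1 << 22

  # Each of these causes DTD loading, either directly or by implication.
  UNSAFE_DTD_BITS = { DTDLOAD: DTDLOAD, DTDATTR: DTDATTR, DTDVALID: DTDVALID }.freeze

  # Assumed stand-in for a default bitset; substitute the real defaults.
  EXAMPLE_DEFAULT = RECOVER | NONET | BIG_LINES

  # Returns the names of the DTD-related bits that are set in +options+.
  def self.unsafe_dtd_bits(options)
    UNSAFE_DTD_BITS.select { |_name, bit| (options & bit) != 0 }.keys
  end
end

p ParseBits.unsafe_dtd_bits(ParseBits::EXAMPLE_DEFAULT)                       # => []
p ParseBits.unsafe_dtd_bits(ParseBits::EXAMPLE_DEFAULT | ParseBits::DTDVALID) # => [:DTDVALID]
```

An empty result for every default bitset would confirm that none of the DTD-loading options is enabled unless the caller opts in.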

And the functional action I'd like to take:

  • Add HUGE to all the default bitsets if the libxml2 version is >= 2.10.0
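
One way the version gate might look, as a hedged sketch rather than the eventual implementation: `HUGE` uses the bit position from libxml2's xmlParserOption enum, and `with_huge_if_safe` and `HUGE_SAFE_SINCE` are hypothetical names. The loaded libxml2 version is assumed to be available to the caller as a string.

```ruby
# HUGE mirrors XML_PARSE_HUGE (1 << 19) from libxml2's xmlParserOption enum.
# The helper and constant names below are hypothetical, for illustration only.
HUGE = 1 << 19
HUGE_SAFE_SINCE = Gem::Version.new("2.10.0")

# Returns +options+ with HUGE added when the given libxml2 version string
# is 2.10.0 or newer; otherwise returns +options+ unchanged.
def with_huge_if_safe(options, libxml_version)
  return options if Gem::Version.new(libxml_version) < HUGE_SAFE_SINCE

  options | HUGE
end

base = (1 << 0) | (1 << 11) # RECOVER | NONET, as an example bitset

p with_huge_if_safe(base, "2.9.14") == base          # => true
p with_huge_if_safe(base, "2.13.8") == (base | HUGE) # => true
```

Comparing with `Gem::Version` rather than raw strings matters here, since "2.9.14" sorts after "2.10.0" lexically.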

I'd like to wait until the UNZIP bit is useful before adding it. We don't expose the catalog bits, so nothing to do there.
