Skip to content

Conversation

@kettanaito
Copy link

@kettanaito kettanaito commented May 27, 2025

Motivation

This change parses out the image's cx and cy values that describe the end image dimensions in the document. As I've mentioned in the issue, these values are end values (e.g. after image resizing). DOCX doesn't store the original image dimensions. If consumers need those, they would have to infer them from the original image buffer using tools like sharp. Most of the time, however, you don't care about the original dimensions since you parse the document to render it in some way.

Roadmap

  • Discuss which units to use when exposing the image's cx and cy? They are in EMU by default. We can convert them to pixels, but we have to be consistent with how Mammoth treats other measurements here.
  • Add tests.

return combineResults(blips.map(readBlip.bind(null, element)));
var dimensions =
element.first('wp:extent') ||
picture
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you implement safe nested property access in .getElementsByTagName()? If not, why aren't we checking if the referenced tags exist?

.attributes['a:ext']

return combineResults(blips.map((blip) => {
return readBlip(element, blip, dimensions)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unwrapping the map function because there's no benefit to keeping it inlined. It only confuses the argument order.

return readImage(blipImageFile, altText);
return readImage(blipImageFile, {
altText,
height: dimensions.attributes['cy'],
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image dimensions are defined on a higher node than blips, so I assume dimensions apply to all blips. Let me know if I'm wrong in this assumption.

.getElementsByTagName("a:blip");

return combineResults(blips.map(readBlip.bind(null, element)));
var dimensions =
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some vendors (Microsoft, Google Docs) store image dimensions in wp:extent while others, like Apple Pages, set the on the pic:pic node instead. We have to parse both scenarios (they describe the same thing and are mutually exclusive).


function readDrawingElement(element) {
var blips = element
var picture = element
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm lifting up the picture node reference so we don't have to look it up multiple times. If this is premature, let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Image width and height

1 participant