@etnoy
Collaborator

@etnoy etnoy commented Feb 27, 2025

Whenever a library asset is changed, it is reimported during the next scan. However, the metadata extractor puts the wrong date in fileModifiedAt, so the file is reimported on the next scan, and the next, and the next, regardless of whether it was changed.

This fix puts the correct date in fileModifiedAt so that this no longer happens. fileCreatedAt should be derived from the file, not from EXIF data.

Yes, this is a bug from my previous PR #16225... I hope this is correct now.
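
To illustrate the loop described above, here is a minimal sketch. It is not the actual Immich code: it assumes the library scan reimports an asset whenever the stored fileModifiedAt differs from the file's mtime, and `StoredAsset`, `needsReimport`, and the sample dates are hypothetical.

```typescript
// Hypothetical, simplified model of the scan check; not the actual Immich implementation.
interface StoredAsset {
  fileModifiedAt: Date;
}

// Assumed rule: reimport when the file's mtime differs from the stored value.
function needsReimport(asset: StoredAsset, mtime: Date): boolean {
  return asset.fileModifiedAt.getTime() !== mtime.getTime();
}

// Sample values, chosen only for illustration.
const mtime = new Date("2025-02-27T09:00:00Z"); // from the filesystem
const exifModifyDate = new Date("2020-04-13T17:07:44Z"); // from EXIF metadata

// Before the fix: extraction stores the EXIF date, so the check never settles.
const buggy: StoredAsset = { fileModifiedAt: exifModifyDate };

// After the fix: extraction stores the file's mtime, so an unchanged file is skipped.
const fixed: StoredAsset = { fileModifiedAt: mtime };
```

Under these assumptions, `needsReimport(buggy, mtime)` stays true on every scan, while `needsReimport(fixed, mtime)` is false until the file is modified again.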

@etnoy etnoy changed the title fix(server) don't reimport files more than once fix(server): don't reimport files more than once Feb 27, 2025
@etnoy etnoy force-pushed the fix/dont-reimport-again branch from 23b6e31 to 4334c2a Compare February 27, 2025 09:24
localDateTime,
fileCreatedAt: exifData.dateTimeOriginal ?? undefined,
fileModifiedAt: exifData.modifyDate ?? undefined,
fileModifiedAt: asset.fileModifiedAt,
Member

I'd prefer if you used a fileModifiedAt variable instead of assigning to asset.

Collaborator Author

This is much less trivial than it might first seem. It must be assigned to asset in order for getDates to work, and assigning a new variable only makes it more complex.

Rewriting the metadata extraction feature is on my list as it's getting rather messy, but that's not in scope for this PR.

@etnoy etnoy force-pushed the fix/dont-reimport-again branch from 4334c2a to bf53e5a Compare February 27, 2025 10:32
asset.fileModifiedAt = stats.mtime;
}

const fileModifiedAt = asset.fileModifiedAt;
Member

@mertalev mertalev Feb 27, 2025

No, I mean that instead of asset.fileModifiedAt = stats.mtime; followed by const fileModifiedAt = asset.fileModifiedAt;, I'd prefer const fileModifiedAt = stats.mtime; (or inlining to fileModifiedAt: stats.mtime)

Member

From what I can see, there is no other example in the metadata service where the asset is modified instead of creating a variable.

Collaborator Author

Ah, I see. I inlined it as per your suggestion, let me know what you think.

@etnoy etnoy force-pushed the fix/dont-reimport-again branch from bf53e5a to 5fc4318 Compare February 27, 2025 11:20
@alextran1502
Member

Is this something we can use Medium Tests for?

@etnoy
Collaborator Author

etnoy commented Feb 27, 2025

I haven't played with them enough to know yet, but it's high on my todo list to migrate as many library tests as possible to that.

Let's get this fix in 1.127.1 first.

@alextran1502
Member

Can you help fix the failed tests?

@etnoy
Collaborator Author

etnoy commented Feb 27, 2025

Sure, I didn't know they failed before I left home. I'll do that in a few hours when I'm back at the computer. If you are short on time, the fix should be easy: just modify the mocks and asserts a little.

@alextran1502 alextran1502 enabled auto-merge (squash) February 27, 2025 16:42
@alextran1502 alextran1502 merged commit d20e2e2 into main Feb 27, 2025
39 checks passed
@alextran1502 alextran1502 deleted the fix/dont-reimport-again branch February 27, 2025 16:45
ExceptionsOccur pushed a commit to ExceptionsOccur/immich that referenced this pull request Feb 28, 2025
* fix(server) don't reimport files more than once

* fix: test

---------

Co-authored-by: Alex Tran <[email protected]>
@scrapix

scrapix commented Mar 5, 2025

I think the adaptation is not sufficient... I'm currently on v1.128.

I've changed access rights and owner with chmod / chown (File Inode Change Date/Time).
Now all assets (100,000+) are being re-imported, including the use of all services:

  • thumbnail creation
  • face tagging
  • what not...

Although nothing about the picture itself changed, each asset is being re-imported.

Solution proposal
If no other option can be found, an image hash comparison should be done before starting all services (like thumbnail re-generation, face tagging, etc.) to prevent unnecessary processing.


immich_server            | [Nest] 7  - 03/05/2025, 6:30:37 PM   DEBUG [Microservices:LibraryService] Asset was offline or modified, updating asset record /usr/src/app/photos/etomy/Mali/2020/2020-04-13/20200413_170744.jpg
immich_server            | [Nest] 7  - 03/05/2025, 6:30:37 PM   DEBUG [Microservices:LibraryService] Asset was offline or modified, updating asset record /usr/src/app/photos/etomy/Mali/2020/2020-04-13/20200413_170748.jpg
immich_server            | [Nest] 7  - 03/05/2025, 6:30:37 PM   DEBUG [Microservices:LibraryService] Asset was modified, queuing 
...
... 200.000 more ...
...
❯ exiftool Mali/2020/2020-04-13/20200413_170744.jpg

ExifTool Version Number         : 12.41
File Name                       : 20200413_170744.jpg
Directory                       : Mali/2020/2020-04-13
File Size                       : 4.6 MiB
File Modification Date/Time     : 2020:10:27 21:01:33+00:00
File Access Date/Time           : 2023:10:17 00:57:17+00:00
File Inode Change Date/Time     : 2025:03:02 09:20:26+00:00
File Permissions                : -rwxrwxrwx
File Type                       : JPEG
File Type Extension             : jpg
MIME Type                       : image/jpeg

@etnoy
Collaborator Author

etnoy commented Mar 5, 2025

This is by design. You changed the files, and the mtime was updated. Then it will be reimported.
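
As background for the timestamps in the exiftool output above: on POSIX filesystems, chmod and chown update a file's inode change time (ctime), while mtime tracks writes to the file's content. Below is a minimal sketch of telling the two apart from stat snapshots, such as those exposed by Node's fs.Stats through `mtimeMs` and `ctimeMs`. `StatSnapshot` and `classifyChange` are hypothetical names, and this is not how Immich decides what to reimport.

```typescript
// Hypothetical helper for illustration; not part of Immich.
// Mirrors two fields of Node's fs.Stats.
interface StatSnapshot {
  mtimeMs: number; // last content modification
  ctimeMs: number; // last inode change (content, permissions, owner, ...)
}

type ChangeKind = "none" | "inode-only" | "content";

function classifyChange(before: StatSnapshot, after: StatSnapshot): ChangeKind {
  // mtime moved: the content was (most likely) rewritten.
  if (after.mtimeMs !== before.mtimeMs) return "content";
  // Only ctime moved: typically chmod, chown, or a rename.
  if (after.ctimeMs !== before.ctimeMs) return "inode-only";
  return "none";
}
```

A scan that compares only mtime would ignore an "inode-only" change, while one that also looks at ctime would not.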

@scrapix

scrapix commented Mar 5, 2025

@etnoy @alextran1502 do you think it's worth opening a Feature Request for my solution proposal to make re-importing more efficient?

Solution proposal

  1. On image import, create at least 2 hashes and save them to the database:
    1. of the complete image file
    2. of just the image / video part of the file

If a file change is detected, check whether the hash of the image / video part has changed:
- if the image part is unchanged, just run the metadata job
- if it has changed, also run jobs like thumbnail re-creation, face identification, etc.

Goal: prevent unnecessary processing if the content has not changed.
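
The proposal above could be sketched roughly as follows. This is an illustration only, not an existing Immich feature: `AssetHashes`, `jobsForChange`, and the job names are hypothetical, and how the hash of the image / video part would be computed is left open.

```typescript
// Hypothetical sketch of the two-hash proposal; not part of Immich.
interface AssetHashes {
  fileHash: string; // hash of the complete file
  contentHash: string; // hash of only the image / video data
}

type Job = "metadata" | "thumbnails" | "faces";

function jobsForChange(stored: AssetHashes, current: AssetHashes): Job[] {
  // Identical file: nothing to do.
  if (stored.fileHash === current.fileHash) return [];
  // File changed but the image / video data did not: refresh metadata only.
  if (stored.contentHash === current.contentHash) return ["metadata"];
  // Image / video data changed: run the expensive jobs as well.
  return ["metadata", "thumbnails", "faces"];
}
```

For example, a file whose EXIF tags were edited would get a new `fileHash` but keep its `contentHash`, so only the metadata job would run.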

@etnoy
Collaborator Author

etnoy commented Mar 5, 2025

No, I don't think so. By design, we want to allow multiple files with identical hashes at different paths within an external library.
