add retry logic and cleaner WS shutdown #622
Closed
mbartsch wants to merge 1013 commits into
Sometimes I need a little reminder of what I am doing on this project.
```
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip. Please see pypa/pip#5599 for advice on fixing the underlying issue. To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
```
…ing remote logging
Grafana does not understand that and so everything now has no level formatting
Manually imported from Unmanic#617 after the source PR branch was no longer available for merge. Imported only the runtime fix in unmanic/libs/plugins.py. The original test file from the PR was intentionally omitted from this manual import.
This ensures that during remote postprocessing we never delete the source file without first confirming the dest file has been successfully copied in
Carry out TZ conversion when reading back data. This will probably mess some things up for some people for a day, sorry. But this is probably a worthy change moving forward to better accommodate viewing this data.
… installations when it pushes a new task. This means we only need to manage library configs in our main installation, which will then seed any updates to the library config to all the other installations in our links.
This will prevent an error log and notification when the datastore is temporarily down for a restart or something
…ote installation library even when that library has "Configure Library for receiving remote files only" enabled. This "Configure Library for receiving remote files only" option defines a library as being managed for linking only. But I do not think that should stop it from being able to receive file paths for creating the task instead of needing to always use HTTP uploads.
update Dockerfile to install correct Intel media driver
Root cause: Async tasks spawned via spawn_callback() were not immediately interrupted when the WebSocket closed. Tasks would be awaiting gen.sleep() or write_message() when on_close() set the flags to False. Setting the flags to False doesn't immediately interrupt pending awaits, so tasks would try writing to a closed connection and fail.
Solution: Check self.close_event.is_set() before and during sleep periods in all 5 async_* methods. This allows tasks to exit immediately when the connection closes, before they attempt to write to a closed socket.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
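A minimal sketch of the "check the close event before and during sleeps" idea, not the code from this PR. It uses plain asyncio rather than Tornado's gen.sleep() so it runs on its own; the names sleep_unless_closed, push_updates, and the step parameter are hypothetical and exist only for illustration.

```python
import asyncio
import threading


async def sleep_unless_closed(close_event: threading.Event, seconds: float, step: float = 0.1) -> bool:
    """Sleep in short slices; return False as soon as close_event is set."""
    remaining = seconds
    while remaining > 0:
        if close_event.is_set():
            return False  # connection closed: stop waiting right away
        chunk = min(step, remaining)
        await asyncio.sleep(chunk)
        remaining -= chunk
    return not close_event.is_set()


async def push_updates(close_event: threading.Event, send, interval: float = 1.0) -> None:
    """Hypothetical periodic sender that never writes after the close event fires."""
    while not close_event.is_set():
        await send("update")
        # Re-check during the wait so the task exits before the next write.
        if not await sleep_unless_closed(close_event, interval):
            break
```

The smaller the slice, the sooner a pending task notices the close, at the cost of more wake-ups; 0.1s here is an arbitrary choice, not a value taken from the PR.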
When SQLite encounters concurrent access, it may raise "database is locked" errors. Previously, these would crash the Foreman thread with an unhandled exception and no logging. Now:
1. Foreman thread exception handling:
   - Log the exception with full context using logger.exception()
   - Wait 5 seconds before continuing instead of crashing
   - Prevents thread death when transient database errors occur
2. Task queue fetch with retry logic:
   - Detect "database is locked" errors specifically
   - Retry up to 3 times with exponential backoff (0.1s, 0.2s, 0.4s)
   - Log retry attempts and failures for debugging
   - Re-raise errors that aren't lock-related (actual problems)
This handles the common case where multiple Unmanic instances compete for database access (especially common in multi-machine setups). Lock errors are transient and typically resolve quickly with a brief wait.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
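The retry behaviour described above could look roughly like the following. This is a sketch under assumptions, not the PR's implementation: it catches the standard library's sqlite3.OperationalError (the project's ORM may raise its own exception class), the helper name run_with_lock_retry is hypothetical, and "retry up to 3 times" is read as one initial attempt plus up to three retries with waits of 0.1s, 0.2s, and 0.4s.

```python
import logging
import sqlite3
import time

logger = logging.getLogger(__name__)


def run_with_lock_retry(operation, max_retries=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying only on SQLite 'database is locked' errors."""
    for retry in range(max_retries + 1):
        try:
            return operation()
        except sqlite3.OperationalError as exc:
            if "database is locked" not in str(exc).lower():
                raise  # not lock-related: a real problem, surface it unchanged
            if retry == max_retries:
                logger.error("Database still locked after %d retries", max_retries)
                raise
            delay = base_delay * (2 ** retry)  # 0.1s, 0.2s, 0.4s
            logger.warning("Database locked, retry %d/%d in %.1fs", retry + 1, max_retries, delay)
            sleep(delay)
```

Passing sleep as a parameter is only a convenience for testing the backoff schedule without real waits.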
Output 'No idle workers available; pending tasks waiting for worker availability' only when the total configured worker count is greater than 0. This prevents confusing log messages when no workers are configured at all. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
When saving task objects (including command logs), SQLite may raise 'database is locked' errors due to concurrent access. Now implements exponential backoff retry logic (0.1s, 0.2s, 0.4s) up to 3 attempts before giving up. This prevents crashes when multiple machines or threads compete for database access. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Added retry logic with exponential backoff to the build_tasks_query() function to handle SQLite 'database is locked' errors that occur during concurrent task lookups. Retries up to 3 times with 0.1s, 0.2s, 0.4s waits before giving up. This protects the initial task query/selection phase in addition to the existing protections in task claiming and saving. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Added configurable timeout for remote installation API requests with exponential backoff retry logic (0.5s, 1s, 2s) up to 3 attempts.
Changes:
- New config option: remote_installation_request_timeout (default: 10s)
- Updated remote_api_get/post/delete with retry logic
- Handles Timeout and ConnectionError exceptions gracefully
- Increased default from 2s to 10s for slow/high-latency networks
This allows users to adjust timeouts for their network conditions and prevents transient connection issues from failing requests.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
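As a rough illustration of a configurable timeout combined with backoff, the sketch below wraps an arbitrary request callable. Everything in it is an assumption made for the example: request_with_retry and send are hypothetical names, Python's built-in TimeoutError and ConnectionError stand in for the HTTP client's own Timeout and ConnectionError classes (pass those via retry_on), and returning None after the last failure is just one way to read "gracefully".

```python
import time


def request_with_retry(send, timeout=10.0, max_retries=3, base_delay=0.5,
                       retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call send(timeout); on a transient failure wait 0.5s, 1s, 2s and try again."""
    for retry in range(max_retries + 1):
        try:
            return send(timeout)
        except retry_on:
            if retry == max_retries:
                return None  # assumption: report failure to the caller instead of raising
            sleep(base_delay * (2 ** retry))
```

Here timeout plays the role of the remote_installation_request_timeout setting: the caller reads the configured value and passes it through to every attempt.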
d77bf7f to
affd691
Pull request
CLA
of my work contained in that pull request to the Unmanic project and the project
owner. My contribution will become licensed under the same license as the overall project.
This extends upon paragraph 11 of the Terms & Conditions stipulated in the GPL v3.0
Checklist
I have ensured that my pull request is being opened to merge into the staging branch.
I have ensured that all new python file contributions contain the correct header as
stipulated in the Contributing Docs.
Description of the pull request
This PR contains two fixes that prevent crashes in production environments:
Disclaimer: AI assisted PR