POC: Feat/make indexing more resiliant #16546

Closed
bielu wants to merge 101 commits into umbraco:main from bielu:feat/make-indexing-more-resiliant

@bielu bielu commented Jun 3, 2024

Prerequisites

  • I have added steps to test this contribution in the description below

If there's an existing issue for this PR then this fixes

Description

This PR is a POC for solving the issue with indexes being rebuilt on startup. It also adds some flexibility around the page size used when reindexing: we currently pull 10k nodes per page, and when nodes have more than 100 properties, pulling more than 1k at a time can make indexing really slow. I am marking this as a POC because I won't spend more time on this PR unless HQ confirms that this is the right direction and that I should improve the code.
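To illustrate the page-size point, here is a minimal sketch of a reindex loop with a configurable page size. It is not the code from this PR: `PagedReindexer`, `fetchPage`, `indexBatch` and the default of 1000 are hypothetical names and values chosen for illustration, and no Umbraco or Examine APIs are used.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Umbraco or Examine.
public static class PagedReindexer
{
    // fetchPage(pageIndex, pageSize) returns one page of items; an empty page ends the loop.
    // indexBatch receives each non-empty page. Returns the total number of items indexed.
    public static int Reindex<T>(
        Func<int, int, IReadOnlyList<T>> fetchPage,
        Action<IReadOnlyList<T>> indexBatch,
        int pageSize = 1000) // assumed default; the description only says 10k is too many
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        int total = 0;
        for (int pageIndex = 0; ; pageIndex++)
        {
            IReadOnlyList<T> page = fetchPage(pageIndex, pageSize);
            if (page.Count == 0) break;

            indexBatch(page);
            total += page.Count;

            // A short page means the source is exhausted, so skip the extra round trip.
            if (page.Count < pageSize) break;
        }
        return total;
    }
}
```

With a page size exposed as a parameter like this, a site whose nodes carry many properties could lower it without any change to the loop itself.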


This item has been added to our backlog AB#56627

github-actions bot commented Jun 3, 2024

Hi there @bielu, thank you for this contribution! 👍

While we wait for one of the Core Collaborators team to have a look at your work, we wanted to let you know that we have a checklist of some of the things we will consider during review:

  • It's clear what problem this is solving, there's a connected issue or a description of what the changes do and how to test them
  • The automated tests all pass (see "Checks" tab on this PR)
  • The level of security for this contribution is the same or improved
  • The level of performance for this contribution is the same or improved
  • Avoids creating breaking changes; note that behavioral changes might also be perceived as breaking
  • If this is a new feature, Umbraco HQ provided guidance on the implementation beforehand
  • 💡 The contribution looks original and the contributor is presumably allowed to share it

Don't worry if you got something wrong. We like to think of a pull request as the start of a conversation, we're happy to provide guidance on improving your contribution.

If you realize that you might want to make some changes then you can do that by adding new commits to the branch you created for this work and pushing new commits. They should then automatically show up as updates to this pull request.

Thanks, from your friendly Umbraco GitHub bot 🤖 🙂

nul800sebastiaan (Member) commented

Thanks @bielu - I have noted this PR and asked for the team to have a look and see if it's the right direction. Of course we have a "little" conference coming up next week so it's taking a bit longer to get to.

/// <summary>
///
/// </summary>
public class IndexRebuildStatusManager : IIndexRebuildStatusManager
Shazwazza (Contributor) commented

My recommended approach would be to do this: Shazwazza/Examine#372 (comment)

The only real way to know whether indexing has completed in a resilient way is to have an actual document in the index certifying that the rebuild succeeded, rather than relying on an in-memory cache, which is problematic.

bielu (Contributor, Author) commented

@Shazwazza I am not 100% convinced about using an additional index, as we both know that fewer indexes is better with Lucene. I am thinking we should maybe use an additional SQL table instead: it would be equally resilient, but would not require us to create an index. What do you think?

Shazwazza (Contributor) commented

@bielu Sorry, I probably wasn't clear in my suggestion. We don't want to use an extra index to store any data, we can just use a marker document within the index. For example:

  • Rebuilding an index deletes all data
  • The index is populated with the normal data
  • When the IndexPopulator is done populating the index, it then writes a special marker document signaling that the populator is done. Perhaps this document has a field like __Populated: y

Then the rebuild checker just checks whether the document count for __Populated: y == 1.
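The three steps above can be sketched as follows. This is only an illustration of the marker-document idea under stated assumptions: `FakeIndex` is a hypothetical in-memory stand-in for the real index, and `MarkerRebuild`, `Rebuild` and `IsPopulated` are made-up names, not Examine or Umbraco APIs. Only the field name `__Populated` and value `y` come from the suggestion.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for an index; every type and member here is hypothetical.
public sealed class FakeIndex
{
    private readonly List<Dictionary<string, string>> _docs = new List<Dictionary<string, string>>();

    public void DeleteAll() => _docs.Clear();

    public void Add(Dictionary<string, string> doc) => _docs.Add(doc);

    // Number of documents whose given field has exactly the given value.
    public int Count(string field, string value) =>
        _docs.Count(d => d.TryGetValue(field, out var v) && v == value);
}

public static class MarkerRebuild
{
    public const string MarkerField = "__Populated"; // field name from the suggestion
    public const string MarkerValue = "y";

    public static void Rebuild(FakeIndex index, IEnumerable<Dictionary<string, string>> content)
    {
        // 1. Rebuilding deletes all data, including any earlier marker.
        index.DeleteAll();

        // 2. The index is populated with the normal data.
        foreach (var doc in content)
        {
            index.Add(doc);
        }

        // 3. The marker is written last, so an interrupted rebuild leaves none behind.
        index.Add(new Dictionary<string, string> { [MarkerField] = MarkerValue });
    }

    // The rebuild check: exactly one marker document must exist.
    public static bool IsPopulated(FakeIndex index) =>
        index.Count(MarkerField, MarkerValue) == 1;
}
```

Because the marker lives in the index itself, the check gives the same answer after a restart, which an in-memory flag would not.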

bielu (Contributor, Author) commented

@Shazwazza that makes sense now! We can also extend it to check which populators are registered, to show how many of them are done. I will update this PR.
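One way to picture that extension is a marker document per populator. This is a sketch under assumptions rather than what the PR implements: `PopulatorMarkers`, its methods and the extra `__Populator` field are hypothetical, index documents are modelled as plain dictionaries, and the populator names used with it are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; these are not Umbraco or Examine types.
public static class PopulatorMarkers
{
    public const string PopulatedField = "__Populated"; // field name from the suggestion
    public const string PopulatorField = "__Populator"; // assumed extra field naming the populator

    // The marker document a populator would write as its final step.
    public static Dictionary<string, string> CreateMarker(string populatorName) =>
        new Dictionary<string, string>
        {
            [PopulatedField] = "y",
            [PopulatorField] = populatorName,
        };

    // Registered populators whose marker document is present in the index.
    public static IReadOnlyList<string> Completed(
        IEnumerable<IReadOnlyDictionary<string, string>> indexDocuments,
        IEnumerable<string> registeredPopulators)
    {
        var finished = new HashSet<string>();
        foreach (var doc in indexDocuments)
        {
            if (doc.TryGetValue(PopulatedField, out var flag) && flag == "y"
                && doc.TryGetValue(PopulatorField, out var name))
            {
                finished.Add(name);
            }
        }
        return registeredPopulators.Where(p => finished.Contains(p)).ToList();
    }

    // Human-readable progress summary.
    public static string Describe(
        IEnumerable<IReadOnlyDictionary<string, string>> indexDocuments,
        IReadOnlyCollection<string> registeredPopulators) =>
        $"{Completed(indexDocuments, registeredPopulators).Count}/{registeredPopulators.Count} populators done";
}
```

The index counts as fully rebuilt only when every registered populator has a marker, and the same query yields the "how many are done" figure.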

bielu (Contributor, Author) commented

@Shazwazza I have started changing the implementation of this service to use Examine under the hood. Can you have a quick look and check whether that is what you had in mind?
This way we can now also repeat failed batches (though I think I will need to play around a little more).

AndyButland and others added 24 commits March 18, 2025 11:05
…ckoffice document routes (umbraco#18691)

* Fixes issue with macro rendering in an RTE when GUIDs are used for backoffice document routes.

* Fixed null reference error.
* Disabled encrypt

* Skips integration tests for SQL Server on releases

* Removed encrypt
…mbraco#18922)

* Converts rebuild database cache operation to submit and poll.

* Update src/Umbraco.Web.UI.Client/src/views/dashboard/settings/publishedsnapshotcache.controller.js

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Handle HTTP error in status retrieval.

* Fixed test build.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Navya Sinha <navya.sinha@method4.co.uk>
# Conflicts:
#	version.json
…raco#19063)

* Handle file paths as not found in delivery API by route requests.

* Move check earlier to handle redirect logic as well.

* Spelling: Changed "resolveable" to "resolvable"

---------

Co-authored-by: kjac <kja@umbraco.dk>
Co-authored-by: Navya Sinha <navya.sinha@method4.co.uk>
… playwright (umbraco#19140)

* Updated pipeline to install only the chromium browser

* Added junit as reporter for acceptance tests
AndyButland and others added 21 commits July 22, 2025 07:50
…of backoffice external user login (umbraco#19766)

* Retrieve only user external logins when invalidate following removal of backoffice external user login.

* Improved variable name.
…ke 2 - handling scopes with a base parent) (umbraco#19797)

* Add integration tests that shows the problem

* Fix the problem and add explanation

* Improved comments slightly to help when we come back here!
Moved tests alongside existing ones related to scopes.
Removed long running attribute from tests (they are quite fast).

* Fixed casing in comment.

---------

Co-authored-by: Andy Butland <abutland73@gmail.com>
Co-authored-by: kjac <kja@umbraco.dk>
# Conflicts:
#	version.json
…t nodes into account (umbraco#19800)

* Fix users being able to see nodes they don't have access to when using the picker search

* Readability and naming improvements

* Additional fixes

* Adjust tests

* Additional fixes

* Small improvement

* Replaced the root ids with constants

* Update src/Umbraco.Web.BackOffice/Trees/MemberTreeController.cs

Co-authored-by: Andy Butland <abutland73@gmail.com>

---------

Co-authored-by: Andy Butland <abutland73@gmail.com>
* Fixes umbraco#19654

Adds the propertyAlias to the VariationContext so that products implementing the GetSegment method are aware which propertyAlias it's being called for

* Re-implement original variation context for backwards compatibility

* Fixes hidden overload method

Ensures the `GetSegment` method overload is not hidden when a null `propertyAlias` is passed.

* Resolve backward compatibility issues.

* Improved comments.

---------

Co-authored-by: Andy Butland <abutland73@gmail.com>
…n the culture being installed on the operating system (umbraco#19821)

* Use a regex to filter our invalid culture codes rather than relying on the culture being installed on the operating system.

* Update to more restrictive regex

Co-authored-by: Nuklon <Nuklon@users.noreply.github.com>

---------

Co-authored-by: Nuklon <Nuklon@users.noreply.github.com>
# Conflicts:
#	version.json
* feat: Add Arabic (ar) backoffice translation

* Make Arabic language general until having special words for other
countries.

* Corrected the language header

---------

Co-authored-by: Andy Butland <abutland73@gmail.com>
…9984)

Retain original backoffice location on login after timeout.
…umbraco#20142)

* Support querystring and anchor for local links in Delivery API output

* Add default implementation for backwards compat

* Add default implementation for backwards compat (also on the interface)

* Fix default implementation
…ent with changed data types (umbraco#20079)

* Avoid throwing an exception on getting references when migrating content with changed data types.

* Revert and handle exception at the consumer side

* Clean up

---------

Co-authored-by: Kenn Jacobsen <kja@umbraco.dk>
… v13/dev

# Conflicts:
#	src/Umbraco.Core/Models/DeliveryApi/IApiContentRoute.cs
nielslyngsoe added the state/sprint-candidate label (We're trying to get this in a sprint at HQ in the next few weeks) Sep 17, 2025
JasonElkin changed the base branch from v13/dev to main September 25, 2025 11:45
JasonElkin marked this pull request as draft September 25, 2025 11:47
JasonElkin (Contributor) commented

Since this is a PoC, I'm going to mark this as a draft for now. If and when progress can be made, we can re-activate this.

I'm afraid that rebasing this via the GitHub UI did terrible things 😔. A proper rebase and force push will fix this.

bielu commented Sep 25, 2025

@JasonElkin I don't think there will be any activity on this PR. As you can see, there was some back and forth, which then stopped, and I am not currently working with Umbraco at all.

JasonElkin commented Sep 25, 2025

Thanks @bielu, I'll close this for now then.

Sorry to hear you're not working with Umbraco at the moment - you and your contributions will be missed!

Obviously feel free to pick this back up if you come back to working with Umbraco again.

Labels

  • area/backend
  • state/sprint-candidate: We're trying to get this in a sprint at HQ in the next few weeks

Development

Successfully merging this pull request may close these issues.

Examine rebuild on startup interferes with a manual rebuild