Indexing with Gemini embeddings takes a very long time during scraping #3348

@ananta-code

Hi,

I am exploring semantic search using Gemini AI. We have around 16 million documents, and we want to create embeddings for them during indexing.

Here are the settings I am using while adding records.
Because of various limitations and a token refresh issue, I decided to create the embedder every time before adding records, using a service account with the role, and then use the same auth token to create the embeddings.

Image

Without embeddings, the entire scraping/indexing run takes 35-40 minutes at most. With embeddings, I waited 5 hours and can still see only 50,000 records in my Meilisearch.
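To put those numbers in perspective, here is a rough back-of-the-envelope projection. It uses only the figures quoted above and assumes the observed rate stays constant, which may not hold in practice; the function name is just for illustration.

```python
# Figures quoted in the post.
TOTAL_DOCUMENTS = 16_000_000
INDEXED_SO_FAR = 50_000
HOURS_ELAPSED = 5


def estimate_hours_remaining(total, done, hours_elapsed):
    """Linear extrapolation from the observed rate (an assumption, not a measurement)."""
    rate_per_hour = done / hours_elapsed
    return (total - done) / rate_per_hour


rate_per_hour = INDEXED_SO_FAR / HOURS_ELAPSED
hours_total = TOTAL_DOCUMENTS / rate_per_hour

print(f"observed rate: {rate_per_hour:.0f} documents/hour")
print(f"projected total: {hours_total:.0f} hours (~{hours_total / 24:.0f} days)")
print(f"remaining: {estimate_hours_remaining(TOTAL_DOCUMENTS, INDEXED_SO_FAR, HOURS_ELAPSED):.0f} hours")
```

At roughly 10,000 documents per hour, the full 16 million documents would take about 1,600 hours, compared with 35-40 minutes without embeddings.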

Even after refreshing, the auth token expires most of the time. I think the embedding is done sequentially, so even when we send documents in batches it takes time, and by the time a batch is actually picked up to create embeddings, the token has expired.

The token refresh logic runs before we create embeddings, just to make sure the token won't expire, but that does not work either, because embedding seems to take a very long time.
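One way to reason about the expiry problem is to size each batch so that it could finish embedding within a single token lifetime. The sketch below is only an illustration: the one-hour token lifetime is an assumption (not stated in this post), the helper `max_batch_size` is hypothetical, and it assumes a token is only needed while its own batch is being embedded and that each batch completes before the next one is sent.

```python
def max_batch_size(token_lifetime_s, docs_done, seconds_elapsed, margin_percent=50):
    """Largest batch expected to finish embedding before the token expires.

    docs_done / seconds_elapsed is the observed embedding rate.
    margin_percent is the share of the token lifetime a batch may use.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    if not 0 < margin_percent <= 100:
        raise ValueError("margin_percent must be in (0, 100]")
    return (token_lifetime_s * margin_percent * docs_done) // (100 * seconds_elapsed)


# Assumed token lifetime: 1 hour. Observed rate: 50,000 documents in 5 hours.
TOKEN_LIFETIME_S = 3600
batch = max_batch_size(TOKEN_LIFETIME_S, docs_done=50_000, seconds_elapsed=5 * 3600)
print(batch)  # 5000 documents per batch with a 50% margin
```

Under these assumptions, batches much larger than a few thousand documents would be expected to outlive the token that was fresh when they were sent.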

Please guide me if I am doing anything wrong or if there is a better way.
