Question : Save without indexing, index a list of objects #477

Open
pulse-mind opened this issue Mar 13, 2024 · 1 comment

Hello,

Thanks for the work done here.

I have two questions:

  • Is it possible to save an object with .save() and pass an argument so that it is not indexed in ES?
  • I have a list of objects. How can I index them with a bulk method, which would be more efficient than saving them one by one?

I am looking for a solution because I am importing a lot of data and calling .save() on each object, and indexing the objects one by one takes a lot of time.
For the first huge import I disabled ES in Django and ran search_index --rebuild at the end, but that does not seem like a good approach for normal operation, so I am looking for another solution.

pulse-mind commented Mar 14, 2024

For people who want to do the same, this is what I found:
Override handle_save in the signal processor so that it checks a private attribute, _auto_refresh, set on the object:

    def handle_save(self, sender, instance, **kwargs):
        # Skip indexing only when the caller explicitly set _auto_refresh to False;
        # objects without the attribute are indexed as usual.
        if getattr(instance, '_auto_refresh', True) is False:
            logger.debug("Do not update ElasticSearch for this object %s", instance.mykey)
        else:
            registry.update(instance)
            registry.update_related(instance)

See the signal processor setting in the docs: https://django-elasticsearch-dsl.readthedocs.io/en/latest/settings.html#elasticsearch-dsl-signal-processor
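
For the override to take effect, the project has to point that setting at the class that contains the custom handle_save. A minimal sketch of the settings entry, assuming the class is called CustomSignalProcessor and lives in a module myapp/signals.py (both names are hypothetical, not from my project):

```python
# settings.py (fragment)
# "myapp.signals.CustomSignalProcessor" is a hypothetical dotted path; replace it
# with the module and class that hold the overridden handle_save.
ELASTICSEARCH_DSL_SIGNAL_PROCESSOR = "myapp.signals.CustomSignalProcessor"
```

The custom class would normally subclass the library's default processor so that the delete handlers keep working unchanged.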

Then, before calling .save(), set this private attribute to False:

myobject._auto_refresh=False
myobject.save()

And then, to index the list into ES in one call:

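# data_source_list is the list of model instances that were saved with _auto_refresh=False
MyObjectDocument().update(data_source_list)
# where MyObjectDocument is declared as:
# @registry.register_document
# class MyObjectDocument(Document): ...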
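
Putting the two steps together, the import loop could look roughly like the sketch below. The helper name save_then_bulk_index and its bulk_index parameter are made up for illustration; the only assumption about the objects is that they have a .save() method and that the overridden handle_save reads _auto_refresh.

```python
from typing import Any, Callable, Iterable, List


def save_then_bulk_index(
    objects: Iterable[Any],
    bulk_index: Callable[[List[Any]], Any],
) -> List[Any]:
    """Save each object without per-object indexing, then index them in one call."""
    saved: List[Any] = []
    for obj in objects:
        # Flag read by the overridden handle_save, so post_save skips ES.
        obj._auto_refresh = False
        obj.save()
        saved.append(obj)
    # Index everything at once; skip the call when there is nothing to index.
    if saved:
        bulk_index(saved)
    return saved
```

With the document class above, the call would be something like save_then_bulk_index(my_objects, MyObjectDocument().update). Note that the flag stays on the instances afterwards, so a later .save() on the same instance will still skip indexing unless _auto_refresh is reset.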
