MongoEngine - How to perform a "save a new item or increment its counter" operation?

I am using MongoEngine in a web scraping project. I would like to keep track of all the images that I have encountered on all scraped web pages.

For this, I store the image's src URL and the number of times the image has been seen.

The definition of the MongoEngine model is as follows:

    from mongoengine import Document, IntField, URLField


    class ImagesUrl(Document):
        """
        Model representing images encountered during web scraping.

        When an image is encountered on a web page during scraping,
        we store its URL and the number of times it has been seen
        (default counter value is 1).
        If the image has been seen before, we do not insert a new document
        in the collection, but merely increment the corresponding counter value.
        """

        # The URL of the image. There cannot be any duplicates.
        src = URLField(required=True, unique=True)

        # Counter of the total number of occurrences of the image during
        # the data-mining process.
        counter = IntField(min_value=0, required=True, default=1)

I am looking for a suitable way to implement the "save or increment" process.

So far I have been handling it as follows, but I feel that there might be a better, built-in way to do this with MongoEngine:

    def save_or_increment(self):
        """
        If it is the first time the image has been encountered, insert
        its src in Mongo, along with a counter=1 value.
        If not, increment its counter value by 1.
        """
        # check if item is already stored
        # if not, save a new item
        if not ImagesUrl.objects(src=self.src):
            ImagesUrl(
                src=self.src,
                counter=self.counter,
            ).save()
        else:
            # if item already stored in Mongo, just increment its counter
            ImagesUrl.objects(src=self.src).update_one(inc__counter=1)

Is there a better way to do this?

Thanks so much for your time.

2 answers

You should be able to just use an upsert, for example:

    ImagesUrl.objects(src=self.src).update_one(
        upsert=True,
        inc__counter=1,
        set__src=self.src)

update_one, as in @ross's answer, returns the number of documents changed (or the full update result); it does not return the document or the new counter value. If you need those, you should use upsert_one:

    images_url = ImagesUrl.objects(src=self.src).upsert_one(
        inc__counter=1,
        set__src=self.src)
    print(images_url.counter)

This creates the document if it does not exist; otherwise it updates the existing document and increments its counter.
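
To make the difference between the two calls concrete, here is a rough in-memory sketch of the semantics. It is not MongoEngine code: a plain dict stands in for the collection, and the names `update_one_like` and `upsert_one_like` are made up for illustration. It assumes an increment on a missing counter starts from 0 (so a newly created document ends up with counter 1), and it simply returns 1 as the "count" in every case.

```python
def update_one_like(collection, src):
    """Upsert and increment; return only a count (hypothetical stand-in)."""
    # Upsert: create the document from the query field if it is missing.
    doc = collection.setdefault(src, {"src": src})
    # Increment; a missing counter is treated as 0, so a new doc gets 1.
    doc["counter"] = doc.get("counter", 0) + 1
    # Assumption for this sketch: one document is always affected.
    return 1


def upsert_one_like(collection, src):
    """Upsert and increment; return the resulting document instead."""
    update_one_like(collection, src)
    return collection[src]


images = {}  # dict keyed by src, standing in for the collection
url = "http://example.com/a.png"

count = update_one_like(images, url)   # first sighting: document is created
doc = upsert_one_like(images, url)     # second sighting: counter incremented
other = upsert_one_like(images, "http://example.com/b.png")

print(count)             # a count, not a document
print(doc["counter"])    # counter after two sightings
print(other["counter"])  # counter of a newly created document
```

With the count-returning variant you would need a second query to read the counter back, which is why the document-returning variant is handier when the new value is needed right away.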
