Flickr API returning duplicate photos

I ran into a confusing issue with the Flickr API.

When I run a photo search (flickr.photos.search) and request high page numbers, I often get the same photos back for different page numbers. Here are three URLs; each should return a different set of three images, but oddly they all return the same images:

http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=ca3035f67faa0fcc72b74cf6e396e6a7&tags=gizmo&tag_mode=all&per_page=3&page=6820
http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=ca3035f67faa0fcc72b74cf6e396e6a7&tags=gizmo&tag_mode=all&per_page=3&page=6821
http://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=ca3035f67faa0fcc72b74cf6e396e6a7&tags=gizmo&tag_mode=all&per_page=3&page=6822

Has anyone else come across this? It seems I can recreate this with any tag search.

Greetings.

2 answers

After further investigation, it seems that the API has an undocumented "feature" that never allows you to receive more than 4000 photos returned from flickr.photos.search.

So although the search reports 7444 pages, with per_page=3 only the first 1333 pages (4000 / 3, rounded down) can actually be loaded.
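
To make the arithmetic concrete, here is a minimal Python sketch. The 4000 figure is the cap observed in this answer, not a documented limit, and the function name is made up for illustration.

```python
# Cap observed in this thread; it is undocumented, so treat it as an assumption.
RESULT_CAP = 4000


def last_usable_page(per_page: int, cap: int = RESULT_CAP) -> int:
    """Return the highest page number that lies entirely within the cap."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    return cap // per_page


if __name__ == "__main__":
    # With the per_page=3 used in the question's URLs:
    print(last_usable_page(3))
```

Requesting a page beyond this number is where the repeated results described in the question would be expected to start.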


You can get more than 4000 images from Flickr, but you have to split the query, for example by time range, so that the total number of results for each individual query does not exceed 4000. You can also use other parameters, such as a bounding box, to limit the total number of images in a response.

For example, if you search for the tag 'dogs', you could do the following (a binary search over the time range):

  • Specify a minimum date and a maximum date in the request URL, for example January 1, 1990 and January 1, 2015.
  • Check the total number of images reported in the response. If it is more than 4000, split the time range in two and work on the first half, splitting again until a request reports 4000 images or fewer. Then fetch all the pages for that time range, move on to the next interval, and repeat until either (a) you have collected as many images as you need or (b) you have covered the entire initial time range.
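
The steps above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: `count_in_range` is a hypothetical callable that you would supply yourself (for instance, one that runs flickr.photos.search with a minimum and maximum date and returns the reported total), timestamps are plain integers, and ranges are treated as half-open, which may not match how the API treats its date bounds.

```python
from typing import Callable, Iterator, Tuple

# Cap observed in this thread; undocumented, so treat it as an assumption.
RESULT_CAP = 4000


def split_time_range(
    start: int,
    end: int,
    count_in_range: Callable[[int, int], int],  # hypothetical: total photos in [lo, hi)
    cap: int = RESULT_CAP,
) -> Iterator[Tuple[int, int]]:
    """Yield sub-ranges [lo, hi) of [start, end), in time order, each holding at most `cap` photos."""
    stack = [(start, end)]
    while stack:
        lo, hi = stack.pop()
        total = count_in_range(lo, hi)
        if total == 0:
            continue  # nothing to fetch in this range
        if total <= cap or hi - lo <= 1:
            # Small enough to page through, or the range cannot be split further.
            yield lo, hi
            continue
        mid = (lo + hi) // 2
        # Push the later half first so the earlier half is processed first.
        stack.append((mid, hi))
        stack.append((lo, mid))
```

For each range this yields, you would then request every page of results with those date bounds, stopping early once you have as many images as you need. A one-unit range that still exceeds the cap is yielded as-is, so some results may remain unreachable by time splitting alone; narrowing by another parameter, such as a bounding box, would be one way to handle that case.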
