Efficiently store identical data in a Core Data model

I have a data model containing several entities, each of which has several attributes that store image data. The images will all be small, and I need to keep them in the persistent store rather than as external files.

Although I could simply save the image data in a binary or transformable attribute, the user is likely to specify the same image for two or more of these attributes, so I would prefer to keep one copy of each unique image rather than duplicating the image data.

I have considered creating an ImageBlob entity to hold the image data and using relationships to reference it, but I'm new to Core Data and it isn't immediately obvious to me that this is the right approach. In particular, how would I deal with the following situations?

  • I want all of my image attributes across multiple entities to use the same "image data store", so that only one instance of each image is saved.
  • I need to make sure that an image is deleted from the data store once no objects use it.

What would be the best way to handle this?

2 answers

My first question is: how do you plan to identify when two objects use the same image? Is there a property of the image that you can store and query to determine whether the image already exists in the store? And how computationally expensive is that check? If it takes a long time, you may end up optimizing storage at the expense of performance.

However, if you have a way to do this efficiently, you can create an ImageBlob entity to do what you describe. Any entity that uses an ImageBlob should have an imageBlob (to-one) or imageBlobs (to-many) relationship to ImageBlob. ImageBlob should have an inverse relationship with a name such as users.

In your code, when you want to reuse an ImageBlob, it is as simple as doing something like this:

    NSManagedObject *blob = ...; // get the image blob
    NSManagedObject *user = ...; // get the user

    [user setValue:blob forKey:@"imageBlob"]; // do this if it uses a single image
    [[user mutableSetValueForKey:@"imageBlobs"] addObject:blob]; // do this if it uses multiple images

Another thing to think about is what to do with blobs that are no longer needed. Presumably you want to delete any images that are not in use. To do this, you can register your application delegate or your NSPersistentDocument subclass (depending on whether your application is document-based) as an observer of NSManagedObjectContextObjectsDidChangeNotification. Whenever the managed object context changes, you can delete any unneeded images, for example:

    - (void)managedObjectContextObjectsDidChange:(NSNotification *)notification {
        NSManagedObjectContext *managedObjectContext = [notification object];

        // Fetch every ImageBlob that no longer has any users.
        NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
        [fetchRequest setEntity:[NSEntityDescription entityForName:@"ImageBlob" inManagedObjectContext:managedObjectContext]];
        [fetchRequest setPredicate:[NSPredicate predicateWithFormat:@"users.@count == 0"]];

        NSArray *unusedBlobs = [managedObjectContext executeFetchRequest:fetchRequest error:nil]; // Don't be stupid like me; catch and handle the error
        [fetchRequest release];

        // Delete the orphaned blobs.
        for (NSManagedObject *blob in unusedBlobs) {
            [managedObjectContext deleteObject:blob];
        }
    }
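
The handler above only runs once something has been registered for the notification. As a rough sketch of that registration step: the class name BlobJanitor and the selector contextDidChange: are hypothetical names chosen for illustration, and in practice the observer would be the application delegate or NSPersistentDocument subclass mentioned above.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical helper class; stands in for the app delegate or document subclass.
@interface BlobJanitor : NSObject
- (void)startObservingContext:(NSManagedObjectContext *)context;
- (void)stopObserving;
@end

@implementation BlobJanitor

- (void)startObservingContext:(NSManagedObjectContext *)context {
    // Passing the context as the object limits delivery to changes in that one context.
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(contextDidChange:)
                                                 name:NSManagedObjectContextObjectsDidChangeNotification
                                               object:context];
}

- (void)stopObserving {
    // Call this before the observer goes away so no notification reaches a dead object.
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

- (void)contextDidChange:(NSNotification *)notification {
    // The unused-blob cleanup described above would go here.
}

@end
```

Note that deleting blobs inside the handler changes the context again, so the handler will be called a second time; on that pass the fetch should find nothing left to delete.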

You can add a unique md5 property to the Image entity to make sure that each distinct image is saved only once.
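
One way this could look in code is a find-or-create helper that hashes the image bytes and looks for an existing object before inserting a new one. This is only a sketch under assumptions: the entity name "Image" and the md5 attribute come from the answer, while the "data" attribute and the function names MD5HexForData and ImageForData are made up for illustration. Recent SDKs mark CC_MD5 as deprecated; CC_SHA256 from the same header has the same call shape if you prefer a different digest.

```objc
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>
#import <CommonCrypto/CommonDigest.h>

// Returns the hex-encoded MD5 digest of the raw image bytes.
static NSString *MD5HexForData(NSData *data) {
    unsigned char digest[CC_MD5_DIGEST_LENGTH];
    CC_MD5(data.bytes, (CC_LONG)data.length, digest);

    NSMutableString *hex = [NSMutableString stringWithCapacity:CC_MD5_DIGEST_LENGTH * 2];
    for (int i = 0; i < CC_MD5_DIGEST_LENGTH; i++) {
        [hex appendFormat:@"%02x", digest[i]];
    }
    return hex;
}

// Returns the existing Image whose md5 matches the data, or inserts a new one.
// Assumed model: entity "Image" with attributes "md5" (String) and "data" (Binary Data).
static NSManagedObject *ImageForData(NSData *data,
                                     NSManagedObjectContext *context,
                                     NSError **error) {
    NSString *md5 = MD5HexForData(data);

    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Image"];
    request.predicate = [NSPredicate predicateWithFormat:@"md5 == %@", md5];
    request.fetchLimit = 1;

    NSArray *matches = [context executeFetchRequest:request error:error];
    if (matches == nil) {
        return nil; // The fetch failed; the error, if one was requested, says why.
    }
    if (matches.count > 0) {
        return matches.firstObject; // Reuse the stored copy.
    }

    // No stored copy yet: insert one and record its digest for later lookups.
    NSManagedObject *image =
        [NSEntityDescription insertNewObjectForEntityForName:@"Image"
                                      inManagedObjectContext:context];
    [image setValue:md5 forKey:@"md5"];
    [image setValue:data forKey:@"data"];
    return image;
}
```

Matching on the digest alone treats two images with equal hashes as identical; comparing the stored bytes as well would rule out a collision at the cost of loading the data.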

As for the Core Data side, I think something like this might work: create an abstract parent entity (Parent). Add a relationship from Parent to Image called image and set its delete rule to Cascade, so that the Image is also deleted when you delete the Parent. Add a relationship from Image to Parent called parent (or something similar) and set its delete rule to Nullify, so that when you delete an Image, the Parent's image is set to nil. Then add your other entities and set their parent entity to Parent.

