How to replace Django primary key with another integer unique to this table

I have a Django web application that uses auto-incrementing positive integers as the primary key by default. This key is used throughout the application and often appears in URLs. I do not want to expose this number to the public, because it would let them guess the number of users or other objects in my database.

This is a frequent requirement, and I have seen similar questions with answers. Most solutions recommend hashing the original primary key value. However, none of these answers fit my needs. These are my requirements:

  • I would like to keep the primary key field type as an integer.
  • I would also prefer not to hash / encrypt / decrypt this value every time it is read, written, or compared against the database. This seems unnecessary. It would be nice to do this only once: when the record is initially inserted into the database.
  • The hash / encryption function does not need to be reversible, since I do not need to recover the original sequential key. The hashed value just needs to be unique.
  • The hashed value needs to be unique only within this table, not universally unique.
  • The hashed value should be as short as possible. I would like to avoid extremely long (20+ character) URLs.

What is the best way to achieve this? Would the following work?

    import logging

    from django.db import models
    from django.db.models.signals import post_save

    logger = logging.getLogger(__name__)

    def hash_function(value):
        # Placeholder: what function should I use??
        return fancy_hash_function(value)

    def obfuscate_pk(sender, instance, created, **kwargs):
        # Runs after every save; only rewrite the key for newly created rows.
        if created:
            logger.info("MyClass #%s, created with created=%s: %s" % (instance.pk, created, instance))
            instance.pk = hash_function(instance.pk)
            instance.save()
            logger.info("\tNew Pk=%s" % instance.pk)

    class MyClass(models.Model):
        blahblah = models.CharField(max_length=50, null=False, blank=False,)

    post_save.connect(obfuscate_pk, sender=MyClass)
+1
4 answers

Idea

I would recommend the same approach that Instagram uses. Their requirements seem to closely match yours.

  • Generated IDs should be sortable by time (so a list of photo IDs, for example, could be sorted without fetching more information about the photos).
  • IDs should ideally be 64 bits (for smaller indexes, and better storage in systems like Redis).
  • The system should introduce as few new "moving parts" as possible; a large part of how we've been able to scale Instagram with very few engineers is by choosing simple, easy-to-understand solutions that we trust.

They came up with a scheme that uses 41 bits for a timestamp, 13 bits for the database shard, and 10 bits for an auto-incrementing sequence. It looks like you are not using shards, so you can simply use 41 bits based on time and 23 bits chosen at random. That leaves a 1 in 2^23 (about 1 in 8.4 million) chance of a conflict, and only when two records are inserted in the same millisecond. In practice, you are very unlikely to ever run into this. So here is some code:

Identifier Generation

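    import random
    import time

    # A constant Unix timestamp in milliseconds (a custom epoch).
    # The value here is only an example; pick your own and never change it.
    START_TIME = 1400000000000

    def make_id():
        '''
        inspired by http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
        '''
        # Milliseconds elapsed since START_TIME (the high bits of the ID).
        t = int(time.time() * 1000) - START_TIME
        # 23 random bits from the OS entropy source (the low bits of the ID).
        u = random.SystemRandom().getrandbits(23)
        id = (t << 23) | u
        return id

    def reverse_id(id):
        # Drop the 23 random bits and restore the creation time in milliseconds.
        t = id >> 23
        return t + START_TIME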

Note: START_TIME in the above code is an arbitrary fixed start time (a custom epoch). You can evaluate int(time.time() * 1000) once, take the value, and hard-code it as START_TIME.
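To make the numbers behind this layout concrete, here is a small arithmetic sketch. It is not part of the scheme itself; the names TIME_BITS, RANDOM_BITS, and collision_probability are made up for illustration. It shows how long the timestamp part lasts after START_TIME and how likely a clash is when several rows are inserted in the same millisecond. It also assumes the ID is stored in a signed 64-bit column (as Django's BigIntegerField is), which means the shifted timestamp has to stay below 2^40 to remain positive.

```python
TIME_BITS = 41      # bits used for the millisecond timestamp (from the answer)
RANDOM_BITS = 23    # bits used for the random part (from the answer)

MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000

# How many years of milliseconds fit into the timestamp part.
years_41_bits = 2 ** TIME_BITS / MS_PER_YEAR

# A signed 64-bit column only has 63 usable bits, so with a 23-bit shift the
# timestamp must stay below 2**40 for the ID to remain a positive number.
years_signed = 2 ** (63 - RANDOM_BITS) / MS_PER_YEAR


def collision_probability(k, bits=RANDOM_BITS):
    """Chance that at least two of k IDs created in the same millisecond clash."""
    n = 2 ** bits
    p_all_unique = 1.0
    for i in range(k):
        p_all_unique *= (n - i) / n
    return 1.0 - p_all_unique


print("random values per millisecond:", 2 ** RANDOM_BITS)
print("years covered by 41 bits: %.2f" % years_41_bits)
print("years before a signed 64-bit column overflows: %.2f" % years_signed)
print("clash chance for 2 inserts in one ms: %.2e" % collision_probability(2))
print("clash chance for 100 inserts in one ms: %.2e" % collision_probability(100))
```

This is why START_TIME should be a recent moment rather than 0: the closer it is to the launch of the application, the more of the available range is left.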

Note that the reverse_id function above tells you when the record was created. If you need to track that information, you can do so without adding another field. In that sense, this primary key actually saves storage rather than increasing it.
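As a rough usage sketch of that idea, the snippet below repeats make_id so that it runs on its own and adds a helper, created_at (a name invented here), that turns an ID back into a timezone-aware datetime. The START_TIME value is only an example and is an assumption, not something from the answer.

```python
import random
import time
from datetime import datetime, timezone

START_TIME = 1400000000000  # example custom epoch in milliseconds (assumption)
RANDOM_BITS = 23


def make_id():
    """Build an ID from elapsed milliseconds (high bits) and random bits (low bits)."""
    t = int(time.time() * 1000) - START_TIME
    u = random.SystemRandom().getrandbits(RANDOM_BITS)
    return (t << RANDOM_BITS) | u


def created_at(obj_id):
    """Recover the creation time (UTC) that is embedded in an ID."""
    ms = (obj_id >> RANDOM_BITS) + START_TIME
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


first = make_id()
time.sleep(0.002)          # make sure the second ID falls in a later millisecond
second = make_id()

print(first, created_at(first).isoformat())
print(second, created_at(second).isoformat())
print("sorted by creation time:", first < second)
```

Because the timestamp occupies the high bits, ordering rows by id also orders them by creation time, at millisecond resolution.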

Model

Now your model will look like this:

    class MyClass(models.Model):
        # fields is assumed to be the module in which make_id is defined
        id = models.BigIntegerField(default=fields.make_id, primary_key=True)

If you insert rows into your database outside of Django, you will need to create an equivalent of make_id as an SQL function.

As a footnote: this is similar to the approach MongoDB uses to generate the _id for each object.

+9

You need to separate two problems:

  • The primary key, currently an auto-incrementing integer, is the best choice for a simple, relatively predictable unique identifier that can be applied at the database level.

  • This does not mean that you should disclose it to users in your URLs.

I would recommend adding a new UUID field to your model and changing your views to use it, instead of the PK, to look up objects.
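In Django such a field would typically be declared as models.UUIDField(default=uuid.uuid4, unique=True, editable=False) next to the untouched integer primary key. The sketch below does not need Django; it only shows what values such a field would hold and how long they are in a URL. The helpers short_form and from_short_form are hypothetical names, added because the question asks for short URLs: they re-encode the same 16 bytes as URL-safe base64.

```python
import base64
import uuid


def short_form(value):
    """Encode a UUID as 22 URL-safe characters (base64 without padding)."""
    return base64.urlsafe_b64encode(value.bytes).rstrip(b"=").decode("ascii")


def from_short_form(text):
    """Decode the 22-character form back into a UUID."""
    padded = text + "=" * (-len(text) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


public_id = uuid.uuid4()            # random, reveals nothing about row counts
token = short_form(public_id)

print("canonical:", public_id, len(str(public_id)), "characters")
print("hex:      ", public_id.hex, len(public_id.hex), "characters")
print("short:    ", token, len(token), "characters")
print("round trip ok:", from_short_form(token) == public_id)
```

Note that a UUID is 128 bits, so even the compact form is longer than the 20-character limit mentioned in the question; that is the trade-off of this approach.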

+3

Actually, a simple solution is to encrypt the identifier before sending it to an external source. You can decrypt it on the way back.
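The answer does not say which cipher to use. As one possible illustration, here is a minimal sketch of a keyed, reversible integer scrambler: a small Feistel network built from HMAC-SHA256, using only the standard library. It maps every 64-bit integer to another 64-bit integer and back. SECRET_KEY, ROUNDS, encrypt_id, and decrypt_id are all hypothetical names, and this is a toy for obfuscating URLs, not reviewed cryptography.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical key, keep it private
ROUNDS = 4
MASK32 = 0xFFFFFFFF


def _round_value(half, round_no):
    """Keyed pseudo-random 32-bit value derived from one half and the round number."""
    message = bytes([round_no]) + half.to_bytes(4, "big")
    digest = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def encrypt_id(n):
    """Scramble a 64-bit integer; the mapping is a one-to-one permutation."""
    left, right = (n >> 32) & MASK32, n & MASK32
    for i in range(ROUNDS):
        left, right = right, left ^ _round_value(right, i)
    return (left << 32) | right


def decrypt_id(n):
    """Undo encrypt_id by running the rounds in reverse order."""
    left, right = (n >> 32) & MASK32, n & MASK32
    for i in reversed(range(ROUNDS)):
        left, right = right ^ _round_value(left, i), left
    return (left << 32) | right


for pk in (1, 2, 3, 1000):
    public = encrypt_id(pk)
    print(pk, "->", public, "->", decrypt_id(public))
```

The view would call encrypt_id when building a URL and decrypt_id when parsing one, so the database keeps its plain auto-incrementing key. The cost is exactly what the question wanted to avoid: a conversion on every request.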

0

Keep the AUTO_INCREMENT key, but pass it semi-secretly: in a cookie. It takes a bit of coding to create the cookie, set it, and read it back. But cookies are hidden from all but serious hackers.

-2
