Is a COMB GUID a good idea with Rails 3.1 if I use a GUID for primary keys?

I am using Rails 3.1 with PostgreSQL 8.4. Let's assume I want or need to use GUIDs as primary keys. One potential drawback is index fragmentation. In MS SQL, the recommended solution is to use special sequential GUIDs. One approach to sequential GUIDs is the COMBination GUID, which substitutes a 6-byte timestamp for the MAC address at the end of the GUID. This has some mainstream adoption: COMBs are available natively in NHibernate (NHibernate/Id/GuidCombGenerator.cs).

I think I've figured out how to create COMB GUIDs in Rails (with the help of the UUIDTools 2.1.2 gem), but it leaves some unanswered questions:

  • Does PostgreSQL suffer from index fragmentation when the PRIMARY KEY is of type UUID?
  • Is fragmentation avoided if the low-order 6 bytes of the GUID are sequential?
  • Is the COMB GUID as implemented below an acceptable, reliable way to create sequential GUIDs in Rails?

Thanks for your thoughts.


create_contacts.rb migration

    class CreateContacts < ActiveRecord::Migration
      def up
        create_table :contacts, :id => false do |t|
          t.column :id, :uuid, :null => false # manually create :id with underlying DB type UUID
          t.string :first_name
          t.string :last_name
          t.string :email
          t.timestamps
        end
        execute "ALTER TABLE contacts ADD PRIMARY KEY (id);"
      end

      # Can't use reversible migration because it will try to run 'execute' again
      def down
        drop_table :contacts # also drops primary key
      end
    end

/app/models/contact.rb

    class Contact < ActiveRecord::Base
      require 'uuid_helper' # rails 3 does not autoload from lib/*
      include UUIDHelper

      set_primary_key :id
    end

/lib/uuid_helper.rb

    require 'uuidtools'

    module UUIDHelper
      def self.included(base)
        base.class_eval do
          include InstanceMethods
          attr_readonly :id # writable only on a new record
          before_create :set_uuid
        end
      end

      module InstanceMethods
        private

        def set_uuid
          # MS SQL syntax: CAST(CAST(NEWID() AS BINARY(10)) + CAST(GETDATE() AS BINARY(6)) AS UNIQUEIDENTIFIER)

          # Get current Time object
          utc_timestamp = Time.now.utc

          # Convert to integer with milliseconds: (Seconds since Epoch * 1000) + (6-digit microsecond fraction / 1000)
          utc_timestamp_with_ms_int = (utc_timestamp.tv_sec * 1000) + (utc_timestamp.tv_usec / 1000)

          # Format as hex, minimum of 12 digits, with leading zeros. Note that 12 hex digits handles to year 10889 (*).
          utc_timestamp_with_ms_hexstring = "%012x" % utc_timestamp_with_ms_int

          # If we supply UUIDTools with a MAC address, it will use that rather than retrieving from system.
          # Use a regular expression to split into array, then insert ":" characters so it "looks" like a MAC address.
          UUIDTools::UUID.mac_address = utc_timestamp_with_ms_hexstring.scan(/.{2}/).join(":")

          # Generate Version 1 UUID (see RFC 4122).
          comb_guid = UUIDTools::UUID.timestamp_create().to_s

          # Assign generated COMBination GUID to .id
          self.id = comb_guid

          # (*) A note on maximum time handled by 6-byte timestamp that includes milliseconds:
          # If utc_timestamp_with_ms_hexstring = "FFFFFFFFFFFF" (12 F's), then
          # Time.at(Float(utc_timestamp_with_ms_hexstring.hex)/1000).utc.iso8601(10) = "10889-08-02T05:31:50.6550292968Z".
        end
      end
    end
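The upper bound quoted in the helper's closing note can be checked with a little arithmetic. The following is a minimal sketch in Python, not part of the Rails code; the function name `max_comb_timestamp_fields` and the constant `CYCLES` are hypothetical. Python's `datetime` stops at year 9999, so the sketch shifts the date back by three 400-year Gregorian cycles (146,097 days each) before converting, then adds the 1,200 years back.

```python
from datetime import datetime, timedelta

DAYS_PER_400_YEARS = 146_097  # the Gregorian calendar repeats exactly every 400 years
CYCLES = 3                    # assumption: enough to bring year 10889 under datetime's 9999 limit


def max_comb_timestamp_fields(ms=0xFFFFFFFFFFFF):
    """Return (year, month, day, hour, minute, second, millisecond) in UTC
    for a millisecond count since the Unix epoch (hypothetical helper)."""
    # Build the offset first so the intermediate datetime never exceeds year 9999.
    offset = timedelta(milliseconds=ms) - timedelta(days=CYCLES * DAYS_PER_400_YEARS)
    shifted = datetime(1970, 1, 1) + offset
    return (shifted.year + 400 * CYCLES, shifted.month, shifted.day,
            shifted.hour, shifted.minute, shifted.second,
            shifted.microsecond // 1000)


print(max_comb_timestamp_fields())
```

For the default input (twelve F's) this yields year 10889, month 8, day 2, which agrees with the date in the comment above.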
1 answer
  • Does PostgreSQL suffer from index fragmentation when the PRIMARY KEY is of type UUID?

Yes, this is to be expected. But if you use the COMB strategy, it will not happen: rows will always be inserted in order (which is not entirely true, but bear with me).

Also, the performance difference between the native PostgreSQL UUID type and VARCHAR is not that large. That is another point to consider.

  • Is fragmentation avoided if the low-order 6 bytes of the GUID are sequential?

In my tests I found that a version 1 UUID (RFC 4122) is already sequential, because a timestamp is embedded in the generated UUID. But yes, putting a timestamp in the last 6 bytes reinforces that ordering. That is what I did anyway, because apparently the timestamp that is already embedded is not a guarantee of ordering. More about COMB here
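One reason the embedded timestamp alone may not guarantee ordering: in the RFC 4122 version 1 layout the low 32 bits of the timestamp come first, so the leading bytes wrap around roughly every 7 minutes (2^32 ticks of 100 ns). The minimal Python sketch below uses a hypothetical helper, `v1_from_timestamp`, and an arbitrary made-up node value to build two UUIDs whose textual order is the reverse of their chronological order. How a particular database sorts UUID values is not something this sketch tests.

```python
import uuid


def v1_from_timestamp(ts, node=0x0123456789AB, clock_seq=0):
    """Build a version 1 UUID from an explicit 60-bit timestamp
    (a count of 100 ns ticks). Hypothetical helper for illustration."""
    time_low = ts & 0xFFFFFFFF
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = (ts >> 48) & 0x0FFF
    clock_seq_low = clock_seq & 0xFF
    clock_seq_hi_variant = (clock_seq >> 8) & 0x3F
    return uuid.UUID(
        fields=(time_low, time_mid, time_hi_version,
                clock_seq_hi_variant, clock_seq_low, node),
        version=1,
    )


# Two timestamps one tick apart, straddling a wrap of the low 32 bits.
earlier = v1_from_timestamp((0x1ED << 48) | (0x1234 << 32) | 0xFFFFFFFF)
later = v1_from_timestamp((0x1ED << 48) | (0x1235 << 32) | 0x00000000)

print(earlier)  # ffffffff-1234-11ed-8000-0123456789ab
print(later)    # 00000000-1235-11ed-8000-0123456789ab
print(later.time > earlier.time)  # True: later in time
print(str(later) < str(earlier))  # True: yet it sorts first as text
```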

  • Is the COMB GUID as implemented below an acceptable, reliable way to create sequential GUIDs in Rails?

I don't use Rails, but I'll show you how I did it in Django:

    import uuid, time

    def uuid1_comb(obj):
        return uuid.uuid1(node=int(time.time() * 1000))

Here node is a 48-bit positive integer that normally identifies the hardware address; in this function it carries the current time in milliseconds instead.
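To illustrate what the function above produces, here is a minimal usage sketch. It redefines the function without the unused argument so that the block runs on its own, and the variable names are illustrative. It checks that the last 12 hex digits of the UUID are the millisecond timestamp passed as node, while the result is still tagged as a version 1 UUID.

```python
import time
import uuid


def uuid1_comb():
    # Same idea as above: pass the current time in milliseconds as the 48-bit node.
    return uuid.uuid1(node=int(time.time() * 1000))


before_ms = int(time.time() * 1000)
comb = uuid1_comb()
after_ms = int(time.time() * 1000)

print(comb)          # the last group is the millisecond timestamp in hex
print(comb.version)  # 1
print(before_ms <= comb.node <= after_ms)
print(str(comb)[-12:] == "%012x" % comb.node)
```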

Regarding your implementation: one of the main advantages of UUIDs is that you can safely generate them outside the database, so using a helper class is a valid way to do it. You could always use an external service such as Snowflake to generate the IDs, but that may be premature optimization at this point.
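As a sketch of generating such an identifier entirely in application code, the following Python builds a value in the spirit of the layout described in the question: 10 random bytes followed by a 6-byte millisecond timestamp. The function name `comb_guid` and the `now_ms` parameter are hypothetical, and unlike `uuid1_comb` the result carries no RFC 4122 version bits, so treat it as an illustration under those assumptions rather than a drop-in replacement.

```python
import os
import time
import uuid


def comb_guid(now_ms=None):
    """Return a UUID whose first 10 bytes are random and whose last 6 bytes
    are a millisecond timestamp. Hypothetical helper; sets no version bits."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    random_part = os.urandom(10)
    # Keep only the low 48 bits so the value always fits in 6 bytes.
    timestamp_part = (now_ms & 0xFFFFFFFFFFFF).to_bytes(6, "big")
    return uuid.UUID(bytes=random_part + timestamp_part)


fixed = comb_guid(now_ms=1_700_000_000_000)
print(fixed)
print(fixed.node == 1_700_000_000_000)  # True: the trailing 48 bits are the timestamp
```

Accepting the timestamp as an optional argument keeps the helper easy to test with fixed inputs.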
