I have a table that stores some basic data about visits by visitors to third-party sites. This is its structure:
id, site_id, unixtime, unixtime_last, ip_address, uid
There are four indexes: id, site_id/unixtime, site_id/ip_address, and site_id/uid.
We query this table in many different ways, and all of them are specific to a site_id. The site_id/unixtime index is used to display a list of visitors for a given date or time range. The other two are used to find all visits from an IP address or "uid" (a unique cookie value created for each visitor), and to determine whether a visitor is new or returning.
Obviously, storing site_id in three separate indexes is inefficient for both write speed and storage, but I see no way around it, since I need to be able to query this data quickly for a specific site_id.
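For reference, here is a rough sketch of what the schema looks like. The column types, index names, and the assumption that id is the primary key are guesses for illustration only; the table name `sessions` is taken from the query further down.

```sql
-- Sketch only: column types and index names are assumed, not the real definitions.
CREATE TABLE sessions (
  id            INT UNSIGNED NOT NULL AUTO_INCREMENT,
  site_id       INT UNSIGNED NOT NULL,
  unixtime      INT UNSIGNED NOT NULL,   -- start of the visit
  unixtime_last INT UNSIGNED NOT NULL,   -- last activity in the visit
  ip_address    INT UNSIGNED NOT NULL,   -- assumed to be stored via INET_ATON()
  uid           CHAR(32)     NOT NULL,   -- per-visitor cookie value
  PRIMARY KEY (id),
  KEY idx_site_time (site_id, unixtime),    -- visitor list by date/time range
  KEY idx_site_ip   (site_id, ip_address),  -- lookups by IP within a site
  KEY idx_site_uid  (site_id, uid)          -- new vs. returning visitor check
) ENGINE=MyISAM;
```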
Any ideas on improving efficiency?
I don't really understand B-trees beyond the very basics, but it's more efficient for the leftmost column of an index to be the one with the least variance, right? I considered making site_id the second column in the ip_address and uid indexes, but I think that would make those indexes less efficient, since IP and UID vary far more than the site ID does: we have only about 8000 unique sites per database server, but millions of unique visitors across all ~8000 sites every day.
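To make the two column orders concrete, this is roughly what I am comparing for the uid index (the same applies to ip_address). The index names here are made up for illustration, and only one of the two would actually be kept.

```sql
-- Option A (what I have now): low-variance column first.
ALTER TABLE sessions ADD INDEX idx_site_uid (site_id, uid);

-- Option B (what I considered): high-variance column first.
ALTER TABLE sessions ADD INDEX idx_uid_site (uid, site_id);

-- EXPLAIN shows which index MySQL picks for a given lookup.
EXPLAIN SELECT id FROM sessions WHERE uid = 'value' AND site_id = 123 LIMIT 1;
```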
site_id IP UID , , , , , , , , site_id . :
select id from sessions where uid = 'value' and site_id = 123 limit 1
... , , site_id . , . , , , 500 000 , 10 . . UID, , .
, , :)
The table is MyISAM on MySQL 5.0.
memcached , . , .