Effectively determine whether a business is open or not based on store hours

Given the time (for example, it is currently 4:24 pm Tuesday), I would like to be able to select all the enterprises that are currently open from a set of companies.

  • I have open and closed times for every business for every day of the week.
  • Suppose a business can only open / close at 00, 15, 30, 45 minutes of each hour.
  • I accept the same schedule every week.
  • What interests me most is the ability to quickly find a set of companies that are open at specific times, rather than data space requirements.
  • Keep in mind, some of my open at 11pm one day and close 1am the next day.
  • Holidays do not matter - I will process them separately

What is the most efficient way to store these openings / closures, so that once a day / day of the week I can quickly find out which businesses are open?

I use Python, SOLR and mysql. I would like to be able to fulfill the query in SOLR. But honestly, I am open to any suggestions and alternatives.

+6
performance python mysql solr
source share
7 answers

If you just want to look at one week at a time, you can canonize the entire opening / closing time to set the number of minutes from the beginning of the week, say, Sunday, 0 hours. For each store, you create several tuples of the form [startTime, endTime, storeId]. (For several hours that lasted at midnight on Sunday, you would need to create two tuples, one at the end of the week, one at the beginning of the week). This set of tuples will be indexed (say, with the tree you previously processed) in both startTime and endTime. Tuples should not be so large: there are only ~ 10k minutes per week that can fit in 2 bytes. This structure will be graceful inside the MySQL table with the corresponding indexes and will be very resistant to constant insertions and deletions of records when information changes. Your request will simply be "select storeId where startTime <= time and endtime> = time", where time was canonized minutes from midnight on Sunday.

If the information does not change very often, and you want the search queries to be very fast, you could solve every possible query and cache the results. For example, there are only 672 quarter hours per week. With a list of businesses, each with a list of openings and closures, such as Brandon Rhodes, you could simply scroll through every 15-minute period a week, find out who opens, and then save the answer in a lookup table or in memory.

+8
source share

The bitmap field mentioned by another respondent will be incredibly effective, but it becomes messy if you want to be able to process half an hour or a quarter of an hour, since you need to increase the number of bits and design field arithmetically every time you come across a new resolution that you must match.

Instead, I would try to save the values ​​as datetimes inside the list:

openclosings = [ open1, close1, open2, close2, ... ] 

Then I would use the Python “bisect_right ()” function in my built-in “bisect” module to quickly find the O (log n) time, where in your list your query time “fits”. Then look at the index returned. If this is an even number (0, 2, 4 ...), then the time lies between one of the "closed" times and the next "open" time, so the store closes then. If instead the bisector index is an odd number (1, 3, 5 ...), then the time lands between the opening and closing times, and the store is open.

Not as fast as bitmaps, but you don't have to worry about resolution, and I can't come up with another O (log n) solution that would be elegant.

+5
source share

You say you use SOLR, don’t care about storage and want your searches to be fast. Then, instead of storing open / closed tuples, index the record for each open time block at the level of detail you need (15 minutes). For the encoding itself, you can use only cumulative hours: minutes.

For example, a store open from 4-5 p.m. on Monday will index the values ​​added for [40:00, 40:15, 40:30, 40:45]. The request at 4:24 pm on Monday will be normalized to 40:15 and, therefore, corresponds to this store document.

It may seem ineffective at first glance, but it is a relatively small constant punishment for the speed and space of indexing. And makes the search as fast as possible.

+4
source share

Sorry, I do not have a simple answer, but I can tell you that as the manager of the development team in the company in the late 90s, we were entrusted with solving this problem, and it was hard.

This is not a weekly watch that can be tight, which can be done with a relatively small bitmask (168 bits = 1 per hour of the week), the trick is companies that close every alternating Tuesday.

Starting with a bitmask and then moving on to the exception field is the best solution I've ever seen.

+3
source share

In your Solr index, instead of indexing each business as a single document with a clock, indicate each “retail session” for each business for a week.

For example, if Joe was open Monday to Saturday from 6:00 to 19:00 and closed on Sunday, you would indicate six different documents, each of which had two indexed fields, “open” and “closed”. If your units are 15-minute intervals, then the values ​​can range from 0 to 7 * 24 * 4. Assuming you have a unique identifier for each business, save it in each document so that you can match sessions to enterprises.

Then you can simply do a range search in Solr:

open: [* TO N] And close: [N + 1 TO *]

where N is calculated on the N-th 15-minute interval, which falls the current time. For example, if on Wednesday 10:10 AM, your request would look like this:

open: [* TO 112] And close: [113 TO *]

aka "find a session starting before or before 10:00 a.m. and ending before or after 10:15 a.m."

If you want to include other criteria in your search, such as location or products, you will also need to index this with every session document. This is a bit redundant, but if your index is not huge, this should not be a problem.

+1
source share

If you can manage your data well, I see a simple solution like @ Sebastian's. Follow the tips for creating tuples other than creating their form [time = startTime, storeId] and [time = endTime, storeId], and then sort them in a list. To find out if a store is open, simply run a query like:

 select storeId from table where time <= '@1' group by storeId having count(storeId) % 2 == 1 

To optimize this, you can create a lookup table at any time t, keep stores open at t, and open / close the store between t and t + 1 (for any grouping t).

However, this has the disadvantage that it is more difficult to maintain (overlapping open / close should be combined over a longer period with open-close).

0
source share

Have you looked how many unique combinations of opening / closing time are there? If there are not many, make a reference table of unique combinations and save the index of the corresponding record for each business. Then you only need to search the lookup table and then find the business with these indexes.

0
source share

All Articles