How to keep recent usage frequency in MySQL

I am working on the Product Catalog module of the Invoicing application.

When a user creates a new invoice, the product name field should be an autocomplete field that shows the most recently used products from the product catalog.

How should I store this "recency / frequency of use" in the database?

I am thinking of adding a new recency field that is increased by 1 each time a product is used and reduced by 1/(count of all products) each time another product is used. I would then order by this recency field, but that doesn't seem like the best solution.

Can you help me find the best approach for this kind of problem?

+6
sql database mysql
7 answers

Solution for calculating recency:

Create a new column in the products table, for example last_used_on. Its data type should be TIMESTAMP (MySQL's representation of Unix time).

Benefits:

  • Timestamps contain both date and time.
  • This allows very accurate calculations and comparisons on dates and times.
  • You can format the stored values in the date and time format of your choice.
  • You can convert from any date format to a timestamp.
  • For your autocomplete field, this lets you filter the list of products as you wish. For example, display all products used since [date-time], or all products used between [date-time-1] and [date-time-2], or products used only on Mondays at 13:37:12 over the last two years, two months and three days (timestamps are that flexible).
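As a minimal sketch of the recency column, here is a runnable example that uses Python's built-in sqlite3 module as a stand-in for MySQL. The helper names mark_used and recent_products and the sample rows are assumptions made for illustration. In MySQL the column would be declared as TIMESTAMP and updated with NOW().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# last_used_on holds Unix time; NULL means "never used yet".
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, last_used_on INTEGER)"
)
conn.executemany(
    "INSERT INTO products (id, name) VALUES (?, ?)",
    [(1, "Paper A4"), (2, "Pen"), (3, "Paper clips")],
)


def mark_used(product_id, unix_time):
    """Record that a product was just added to an invoice."""
    conn.execute(
        "UPDATE products SET last_used_on = ? WHERE id = ?", (unix_time, product_id)
    )


def recent_products(prefix, limit=10):
    """Autocomplete: names starting with prefix, most recently used first."""
    rows = conn.execute(
        "SELECT name FROM products WHERE name LIKE ? "
        "ORDER BY last_used_on IS NULL, last_used_on DESC LIMIT ?",
        (prefix + "%", limit),
    )
    return [name for (name,) in rows]


mark_used(3, 1_700_000_000)
mark_used(1, 1_700_000_500)  # used later, so it should be listed first
print(recent_products("Pa"))
```

Products that were never used sort last because of the `last_used_on IS NULL` ordering term.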


Solution for calculating the usage rate:

Strictly speaking, you are not talking about calculating a frequency but a rate, although it can be argued that a frequency is also a rate.

Frequency uses time as the reference unit and is measured in hertz (Hz = [1 / second]). For example, you might want to query how many times a product has been used in the past year.

A rate, on the other hand, is a ratio between two related quantities. For example, the USD / EUR exchange rate relates two currencies. If the comparison is between two quantities of the same type, the result is a unitless number: a percentage. For example: 50 apples / 273 apples = 0.1832 = 18.32%

I suppose what you are trying to calculate is the usage rate: the number of uses of a product relative to the number of uses of all products. For example, for one product: usage rate of the product = 17 usages of the product / 112 total usages = 0.1517... = 15.17%. In the autocomplete you would then display products whose usage rate exceeds a given percentage (for example, 9%).

This is easy to implement. In the products table, add a usages column of type int or bigint and simply increment it each time a product is used. Then, when you want to fetch the most used products, apply a filter as in this SQL statement:

 SELECT id, name, (usages*100) / (SELECT sum(usages) as total_usages FROM products) as usage_rate FROM products GROUP BY id HAVING usage_rate > 9 ORDER BY usage_rate DESC; 
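To try the usage-rate idea without a MySQL server, here is a small sketch using Python's built-in sqlite3 module. The sample rows and the helper names use_product and popular_products are assumptions; the query is a variant of the statement above that wraps the rate in a derived table so that it also runs on SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, name TEXT, usages INTEGER NOT NULL DEFAULT 0)"
)
# Sample data: 17 + 90 + 5 = 112 total usages.
conn.executemany(
    "INSERT INTO products (id, name, usages) VALUES (?, ?, ?)",
    [(1, "Pen", 17), (2, "Paper", 90), (3, "Stapler", 5)],
)


def use_product(product_id):
    """Increment the counter each time a product is used."""
    conn.execute("UPDATE products SET usages = usages + 1 WHERE id = ?", (product_id,))


def popular_products(min_percent):
    """Return (name, usage_rate) for products above min_percent, highest first."""
    # 100.0 forces floating-point division (SQLite truncates integer division).
    rows = conn.execute(
        "SELECT name, usage_rate FROM ("
        "  SELECT name, usages * 100.0 / (SELECT SUM(usages) FROM products) AS usage_rate"
        "  FROM products"
        ") AS t WHERE usage_rate > ? ORDER BY usage_rate DESC",
        (min_percent,),
    )
    return rows.fetchall()


for name, rate in popular_products(9):
    print(f"{name}: {rate:.2f}%")
```

With these sample numbers the stapler (5 of 112 usages) falls below the 9% threshold and is left out.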

Here is a small study example:

Case Study

In the end, recency, frequency and rate are three different things.

Good luck.

+6

To provide future flexibility, I would suggest the following additional table (*) for storing the entire history of product use by all users:

Name: product_usage

Columns:

  • id - internal surrogate auto-increment primary key
  • product_id (int) - foreign key to the product identifier
  • user_id (int) - foreign key for user id
  • timestamp (datetime) - date / time of use of the product

This lets you fine-tune the query as needed. For example, you may decide to order only by the past usage of the logged-in user. Or perhaps overall usage over a period of time will be more relevant. Such a table can also double as an audit log, for example to report on the most or least popular products across all users.

(*) if nothing similar exists in your database schema yet
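A minimal sketch of such a table and two queries against it, using Python's built-in sqlite3 module as a stand-in for MySQL. The helper names log_usage, recent_for_user and most_popular and the sample rows are assumptions. Timestamps are stored as ISO strings here; in MySQL the column would be DATETIME.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_usage (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    product_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL,
    timestamp TEXT NOT NULL  -- DATETIME in MySQL
);
""")


def log_usage(product_id, user_id, when):
    """Append one row per use; existing rows are never updated."""
    conn.execute(
        "INSERT INTO product_usage (product_id, user_id, timestamp) VALUES (?, ?, ?)",
        (product_id, user_id, when),
    )


def recent_for_user(user_id, limit=5):
    """Product ids used by this user, most recently used first."""
    rows = conn.execute(
        "SELECT product_id FROM product_usage WHERE user_id = ? "
        "GROUP BY product_id ORDER BY MAX(timestamp) DESC LIMIT ?",
        (user_id, limit),
    )
    return [pid for (pid,) in rows]


def most_popular(since, limit=5):
    """(product_id, uses) across all users since the given date/time."""
    rows = conn.execute(
        "SELECT product_id, COUNT(*) AS uses FROM product_usage "
        "WHERE timestamp >= ? GROUP BY product_id "
        "ORDER BY uses DESC, product_id LIMIT ?",
        (since, limit),
    )
    return rows.fetchall()


log_usage(10, 1, "2024-01-01 09:00:00")
log_usage(20, 1, "2024-01-05 10:00:00")
log_usage(10, 2, "2024-01-06 11:00:00")
log_usage(10, 1, "2024-01-07 12:00:00")
log_usage(30, 2, "2023-06-01 08:00:00")
print(recent_for_user(1))
print(most_popular("2024-01-01 00:00:00"))
```

The same table answers both the per-user question and the all-users question, which is the flexibility the extra table buys.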

+4

Your problem is related to many other web search applications, such as showing spelling corrections, related searches, or "trending" topics. You correctly recognized that both recency and frequency are important criteria when determining "popular" suggestions. In practice, it is advisable to compromise between them: recency alone will suffer from random fluctuations, but you also do not want to use frequency alone, since some products may have been purchased a lot in the past while their popularity is now declining (or they may have been replaced by successors).

A very simple but effective implementation that is commonly used in these scenarios is exponential smoothing. First of all, most of the time it is enough to update the popularity at fixed intervals (say, once a day). Set a decay parameter α (say 0.95), which tells you how much yesterday's orders count compared to today's. Similarly, orders from two days ago count α * α ≈ 0.9 times as much as today's, and so on. To choose this parameter, note that the weight drops to half after log(0.5) / log(α) days (about 14 days for α = 0.95).

The implementation requires only one additional field for each product, orders_decayed . Then all you have to do is update this value every night with all daily orders:

orders_decayed = α * orders_decayed + (1 - α) * orders_today

You can then sort the matching suggestions by this value.
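The update rule and the half-life estimate can be sketched in a few lines of Python. The function names and the sample numbers are assumptions for illustration; in production the update would more likely be a nightly UPDATE statement on the orders_decayed column.

```python
import math

ALPHA = 0.95  # decay parameter: weight of yesterday's value relative to today's


def half_life_days(alpha):
    """Number of days after which the weight of an old order has halved."""
    return math.log(0.5) / math.log(alpha)


def nightly_update(orders_decayed, orders_today, alpha=ALPHA):
    """One step of exponential smoothing, run once per day per product."""
    return alpha * orders_decayed + (1 - alpha) * orders_today


print(f"half-life: {half_life_days(ALPHA):.1f} days")

# A product whose smoothed value is 10.0 and which then stops selling:
value = 10.0
for day in range(14):
    value = nightly_update(value, orders_today=0)
print(f"after 14 days without orders: {value:.2f}")
```

After roughly two weeks without orders the value has dropped below half of where it started, which matches the half-life estimate for α = 0.95.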

+3

To give each user a personalized experience, you should rely not on a field in the product table but on the user's history.

The products entered in invoices the user created in the past would be a good starting point. The advantage is that you do not need to add fields or tables for this feature. You simply rely on data that is already there.

Since this is an autocomplete field, past usage may not matter that much. Display the top n results matching what the user types. If you feel the results are better when you include recency in the ordering, go with it.

+2

The implementation can vary depending on how and when the product should be displayed: should it reflect a specific user's frequency of use, or usage across the whole application? In both cases, I suggest having a history table that you can also use later for other analysis.

You can create a history table with a minimal set of columns:

 Id | ProductId | LastUsed (timestamp) | UserId 

Now you can create a view that queries this table for a specific time range (for example, product frequency over the last week, last month or last year) and gives you the most used products for that interval.

The same view can be used for a user-specific frequency by adding a filter condition on UserId.
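Here is one way such a view could look, sketched with Python's built-in sqlite3 module as a stand-in for MySQL. The table name product_history, the view name usage_last_week and the helper functions are assumptions. SQLite's `datetime('now', '-7 days')` plays the role that `NOW() - INTERVAL 7 DAY` would play in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_history (
    Id INTEGER PRIMARY KEY AUTOINCREMENT,
    ProductId INTEGER NOT NULL,
    LastUsed TEXT NOT NULL,
    UserId INTEGER NOT NULL
);
-- Usage counts per product and user over the last 7 days.
CREATE VIEW usage_last_week AS
SELECT ProductId, UserId, COUNT(*) AS uses
FROM product_history
WHERE LastUsed >= datetime('now', '-7 days')
GROUP BY ProductId, UserId;
""")


def record(product_id, user_id, days_ago):
    """Insert a history row dated days_ago days before now (sample data only)."""
    conn.execute(
        "INSERT INTO product_history (ProductId, LastUsed, UserId) "
        "VALUES (?, datetime('now', ?), ?)",
        (product_id, f"-{days_ago} days", user_id),
    )


def top_overall():
    """Application-wide ranking for the last week."""
    return conn.execute(
        "SELECT ProductId, SUM(uses) AS total FROM usage_last_week "
        "GROUP BY ProductId ORDER BY total DESC, ProductId"
    ).fetchall()


def top_for_user(user_id):
    """Same view, filtered on UserId for a per-user ranking."""
    return conn.execute(
        "SELECT ProductId, uses FROM usage_last_week WHERE UserId = ? "
        "ORDER BY uses DESC, ProductId",
        (user_id,),
    ).fetchall()


record(1, 100, 1)
record(1, 100, 2)
record(1, 200, 3)
record(2, 200, 1)
record(2, 100, 30)  # older than a week, so the view ignores it
print(top_overall())
print(top_for_user(100))
```

A monthly or yearly view would differ only in the interval used in the WHERE clause.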

I am thinking of adding a new recency field that will increase by 1 each time a product is used, and decrease by 1 / (the number of all products) when another product is used. Then use this field to order, but this does not seem to me the best solution.

Yes, adding a column for this and updating it on every use is not recommended. Imagine that this product is the most popular one and many people buy it. At a given moment, 1000 or more requests may involve this product, and each request updates the same row. To stay consistent under concurrency, the database has to lock that row for every update, which will hurt your database and application performance. Inserting a new row into a history table instead avoids this contention.


Another possible solution: you can use the existing invoices table, because it will necessarily contain all the information about the product and the user, and create a view of how often each product is used, as I mentioned above.


Please note that this is just another option to achieve the expected result. I would still recommend using the history table instead.

+2

Scenario

When a user creates a new invoice, the product name field should be an autocomplete field that shows the most recently used products from the product catalog.

Your proposed solution

How should I store this "recency / frequency of use" in the database?

If this is a web application, do not store it in a database on your server. Each user's usage is different.

Store it in the user's browser, in a cookie or in localStorage, because that will improve the user experience.

If you still want to save it in a MySQL table, do the following:

  • Create a recency column as indicated in the question.

  • Each time an item is used, increase the value by 1, as indicated in the question.

  • Do not reduce it when using other elements.

  • To get the most used item, run this query (assuming the table is named products):

 SELECT * FROM products WHERE recency = (SELECT MAX(recency) FROM products); 

Side note

Use the database only if you want to display the most frequently used products regardless of the user.

+1

Since you are not sure which measure to choose, and this is really a user experience question, I advise you to compute several measures and let users choose the one they prefer. For example, the available measures could include the most popular products of the last week, last month, last 3 months, last year, and overall. For the sake of performance, I would store these statistics in a separate table that is updated by a scheduled task, running every 3 hours for example.
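A rough sketch of such a statistics table and its refresh job, using Python's built-in sqlite3 module as a stand-in for MySQL. The table names product_usage and product_stats, the period lengths in days and the function names are all assumptions. The scheduled task (cron, or a MySQL event) would simply call refresh_stats with the current time.

```python
import sqlite3
from datetime import datetime, timedelta

# Period name -> length in days (assumed approximations of the periods above).
PERIODS = {"week": 7, "month": 30, "quarter": 90, "year": 365}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_usage (product_id INTEGER NOT NULL, used_at TEXT NOT NULL);
CREATE TABLE product_stats (
    period TEXT NOT NULL,
    product_id INTEGER NOT NULL,
    uses INTEGER NOT NULL,
    PRIMARY KEY (period, product_id)
);
""")


def refresh_stats(now):
    """Rebuild product_stats from the raw usage rows in one transaction."""
    with conn:
        conn.execute("DELETE FROM product_stats")
        for period, days in PERIODS.items():
            cutoff = (now - timedelta(days=days)).strftime("%Y-%m-%d %H:%M:%S")
            conn.execute(
                "INSERT INTO product_stats (period, product_id, uses) "
                "SELECT ?, product_id, COUNT(*) FROM product_usage "
                "WHERE used_at >= ? GROUP BY product_id",
                (period, cutoff),
            )
        conn.execute(
            "INSERT INTO product_stats (period, product_id, uses) "
            "SELECT 'total', product_id, COUNT(*) FROM product_usage "
            "GROUP BY product_id"
        )


def top_products(period, limit=10):
    """Cheap read path used by the autocomplete: no aggregation at query time."""
    return conn.execute(
        "SELECT product_id, uses FROM product_stats WHERE period = ? "
        "ORDER BY uses DESC, product_id LIMIT ?",
        (period, limit),
    ).fetchall()


conn.executemany(
    "INSERT INTO product_usage (product_id, used_at) VALUES (?, ?)",
    [
        (1, "2024-06-29 10:00:00"),
        (1, "2024-06-15 10:00:00"),
        (1, "2024-06-10 10:00:00"),
        (2, "2024-06-28 09:00:00"),
        (2, "2024-06-27 09:00:00"),
        (3, "2023-01-01 00:00:00"),
    ],
)
refresh_stats(datetime(2024, 6, 30, 12, 0, 0))
print(top_products("week"))
print(top_products("total"))
```

The autocomplete then reads only from product_stats, so its cost does not grow with the size of the usage history.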

+1
