What is the best practice for recording visits to a page / object?

Take my profile, for example, or any question on this site: what is the process for recording the number of visits to a page or object on a website? I believe the options include:

  • Counting each registered user once (the database should record which pages / objects the user has visited). This leaves out unregistered users.
  • IP: record visits per IP address / page. This can be frustrating because two different people may share the same address; and do you really want to track repeat visitors?
  • Cookie: this is likely to count people with multiple computers twice.
  • another method goes here ....

The question is, what is the process and best practice for counting user requests?

EDIT

I have added programming languages to the tag list because they interest me. Feel free to include any libraries, modules, and / or extensions that accomplish the task.

The question can be rephrased:

  • How does one count the number of impressions when a user navigates to a page? The question is not meant to be about what Google Analytics does, but rather about what happens when you click on a question or Stack Overflow profile and see the number of views.
+7
python design php
6 answers

The “correct” answer depends on the situation: above all, on which statistics you want most and on the resources available to collect and process them. For example:

Server side

Raw Web Server Logs

All web servers can log requests. The problem is that it takes a lot of processing to extract meaningful data from these logs, and for your example scenario they won't record application-specific data, such as whether the request came from a registered user.

This option will not work for what interests you.

File Based Application Logs

The programmer can add custom code to the application that writes the things you care about most to a log file. This is similar to a web server log, except that it is application-aware and can record things like the member making the request.

The programmers may also need to write scripts that extract from these logs the data you are most interested in. This setup may be appropriate for a high-traffic site with lots of disk space and system administrators who know how to make sure the logs are rotated and pruned from the production servers before bad things happen.
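
A minimal sketch of this approach using Python's standard logging module. The one-JSON-object-per-line layout and the helper names make_view_logger, log_view and count_views are assumptions made for illustration; they do not come from any particular framework.

```python
import json
import logging
import os
import tempfile
from collections import Counter
from logging.handlers import RotatingFileHandler


def make_view_logger(path):
    """Return a logger that appends one JSON line per page view to `path`."""
    logger = logging.getLogger("page_views." + path)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    # Rotate at roughly 1 MB and keep 5 old files so the disk cannot fill up.
    handler = RotatingFileHandler(path, maxBytes=1_000_000, backupCount=5)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def log_view(logger, page, member_id=None):
    """Record one view; member_id is None for anonymous visitors."""
    logger.info(json.dumps({"page": page, "member": member_id}))


def count_views(path):
    """Offline report: number of views per page found in one log file."""
    counts = Counter()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            counts[json.loads(line)["page"]] += 1
    return counts


# Example run against a temporary file.
log_path = os.path.join(tempfile.mkdtemp(), "views.log")
view_logger = make_view_logger(log_path)
log_view(view_logger, "/questions/42", member_id=7)
log_view(view_logger, "/questions/42")
log_view(view_logger, "/users/7", member_id=7)
report = count_views(log_path)
```

Here count_views plays the role of the extraction script; on a real site it would also have to read the rotated files.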

Database-Based Application Logs

An application programmer can write custom code that records each request in a database. This makes it easy to run reports and makes the data available immediately. The solution adds overhead to every request, so it is better suited to lower-traffic sites or to scenarios where the data is highly valued.
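
As a rough illustration, here is what such a log could look like with SQLite from Python's standard library. The table name page_view, its columns, and the helper functions are hypothetical; a real application would use its own database and schema.

```python
import sqlite3
import time

# In-memory database for the example only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE page_view (
        page      TEXT NOT NULL,
        member_id INTEGER,          -- NULL for anonymous visitors
        viewed_at REAL NOT NULL     -- Unix timestamp
    )
    """
)


def record_view(page, member_id=None):
    """Write one row per request; this is the per-request overhead."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO page_view (page, member_id, viewed_at) VALUES (?, ?, ?)",
            (page, member_id, time.time()),
        )


def view_count(page):
    """The data can be queried immediately after the write."""
    row = conn.execute(
        "SELECT COUNT(*) FROM page_view WHERE page = ?", (page,)
    ).fetchone()
    return row[0]


record_view("/questions/42", member_id=7)
record_view("/questions/42")
record_view("/users/7", member_id=7)
```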

Client side

Javascript callback

This is a variation on the options above. Google Analytics does this.

Each page contains some JavaScript code that tells the client to report back to the web server that the page was viewed. The data can be written to a database or to a file.

Its main advantage is improved accuracy in scenarios where impressions would otherwise be lost to heavy caching or proxying between the client and the server.

Cookies

Each time a request arrives from someone who does not present a cookie, you assume they are new, record the hit as “anonymous”, and return a cookie that uniquely identifies them once they log in. How accurate this proves depends on your application. Some applications do not lend themselves to caching, so the count will be fairly accurate; others (high traffic) encourage caching, which reduces accuracy. Obviously, it is of little use unless visitors re-authenticate each time they switch browsers / locations.
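
The cookie step could be sketched as follows with Python's standard http.cookies module. The cookie name visitor_id, the one-year lifetime, and the function identify_visitor are assumptions for illustration; tying the identifier to a login is left to the application.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name


def identify_visitor(cookie_header):
    """Return (visitor_id, is_new, set_cookie_value).

    cookie_header is the raw Cookie request header ('' if absent).
    set_cookie_value is None when the visitor already had a cookie.
    """
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if COOKIE_NAME in jar:
        return jar[COOKIE_NAME].value, False, None

    # No cookie presented: treat the visitor as new and issue an identifier.
    visitor_id = uuid.uuid4().hex
    out = SimpleCookie()
    out[COOKIE_NAME] = visitor_id
    out[COOKIE_NAME]["path"] = "/"
    out[COOKIE_NAME]["max-age"] = 60 * 60 * 24 * 365  # one year
    return visitor_id, True, out[COOKIE_NAME].OutputString()


# First request carries no cookie; the second presents the one just issued.
first_id, first_is_new, set_cookie = identify_visitor("")
second_id, second_is_new, _ = identify_visitor(f"{COOKIE_NAME}={first_id}")
```

The third return value is what the server would send in its Set-Cookie response header.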

What are you most interested in?

The next question is which statistics matter to you. For example, in some situations you want to know:

  • how many times the page was viewed, period,
  • how many times the page was viewed by a known user,
  • how many of your known users viewed a particular page.
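
These three numbers differ only in what is counted. A small sketch, assuming each view is stored as a (page, member_id) pair with member_id set to None for anonymous visitors (a hypothetical layout):

```python
# Sample data: (page, member_id); None means an anonymous visitor.
views = [
    ("/questions/42", None),
    ("/questions/42", 7),
    ("/questions/42", 7),
    ("/questions/42", 9),
    ("/users/7", 7),
]


def page_stats(views, page):
    """Compute the three statistics for one page."""
    hits = [member for p, member in views if p == page]
    known = [m for m in hits if m is not None]
    return {
        "total_views": len(hits),                  # every view, period
        "views_by_known_users": len(known),        # views made while logged in
        "distinct_known_users": len(set(known)),   # how many members saw it
    }


stats = page_stats(views, "/questions/42")
```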

From here, you usually want to break the numbers down by time period to see the trend. For example:

  • Are we getting more views from random people?
  • Or are we getting more views from registered users?
  • Or has almost everyone who is going to see the page already seen it?

So, back to your question: what is the best practice for "the number of impressions when a user navigates to a page"?

It depends on your application.

I would guess that your best bet is a database-backed application log that records whatever matters most to your application, with cookies used to track member sessions.

+17

The best practice for hit counts depends on how much traffic you expect on your site. As wybiral suggested, you can implement something that writes to the database on every request. This can include the IP address if you want to count unique visitors, or it can be as simple as incrementing a total for each page or for each (page, user) pair.

But this requires a database write for every request, even if you just want to serve a static page. Ideally, a scalable web application should do as much of its work as possible from an in-memory cache. Whenever possible, you should avoid database or disk I/O.

Thus, the ideal solution would be to keep some representation of server activity in memory and then occasionally (say, every 15 minutes) write these events to the database. You could, for example, queue up thousands of requests and then save them with a single database write.
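
The buffering idea can be sketched without any task queue. The version below is an assumption-laden illustration, not the Celery approach from the tutorial mentioned next: it keeps counts in a Counter and writes them to a hypothetical page_hits table in one transaction when flush_hits is called (by a timer or scheduled job in practice).

```python
import sqlite3
import threading
from collections import Counter

conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute(
    "CREATE TABLE page_hits (page TEXT PRIMARY KEY, hits INTEGER NOT NULL)"
)

_pending = Counter()       # hits seen since the last flush
_lock = threading.Lock()   # request threads and the flusher share _pending


def record_hit(page):
    """Called on every request: memory only, no database or disk I/O."""
    with _lock:
        _pending[page] += 1


def flush_hits():
    """Write all buffered counts in one transaction; return pages written."""
    with _lock:
        batch = list(_pending.items())
        _pending.clear()
    if not batch:
        return 0
    with conn:
        # Make sure a row exists for each page, then add the buffered counts.
        conn.executemany(
            "INSERT OR IGNORE INTO page_hits (page, hits) VALUES (?, 0)",
            [(page,) for page, _ in batch],
        )
        conn.executemany(
            "UPDATE page_hits SET hits = hits + ? WHERE page = ?",
            [(count, page) for page, count in batch],
        )
    return len(batch)


for _ in range(1000):
    record_hit("/questions/42")
record_hit("/users/7")
flushed = flush_hits()
total = conn.execute(
    "SELECT hits FROM page_hits WHERE page = ?", ("/questions/42",)
).fetchone()[0]
```

The trade-off is that hits recorded since the last flush are lost if the process dies.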

Here is a tutorial describing how to do this in Python using Celery and Carrot: http://packages.python.org/celery/tutorials/clickcounter.html . It also contains examples of how to set up the database tables using Django models and what code to call when someone accesses the page.

This tutorial will certainly be useful to you no matter what you decide to implement, although this level of architecture can be excessive if you don't expect thousands of hits every hour.

+4

Use the database to record unique IP addresses (if the IP address is not yet in the database, create it; otherwise carry on as usual), and then query the database for the number of these records. Index it by IP and URL to keep separate view counts for individual pages. You do not have to worry about tracking registered users this way; they are covered by the unique IP count. As for multiple users behind the same IP address, there is not much you can do short of requiring an account and counting (user, page) pairs.
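
One way to get the "create it only if it does not exist" behaviour is to let the database enforce uniqueness. A sketch with SQLite, assuming a hypothetical unique_view table keyed on (ip, url):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE unique_view ("
    " ip TEXT NOT NULL,"
    " url TEXT NOT NULL,"
    " PRIMARY KEY (ip, url))"
)


def record_visit(ip, url):
    """Insert the (ip, url) pair; repeat visits are silently ignored."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO unique_view (ip, url) VALUES (?, ?)",
            (ip, url),
        )


def unique_views(url):
    """Number of distinct IP addresses that have requested this URL."""
    return conn.execute(
        "SELECT COUNT(*) FROM unique_view WHERE url = ?", (url,)
    ).fetchone()[0]


record_visit("203.0.113.5", "/questions/42")
record_visit("203.0.113.5", "/questions/42")   # repeat visit, not counted again
record_visit("198.51.100.9", "/questions/42")
record_visit("203.0.113.5", "/users/7")
```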

+1

I would suggest using a persistent key/value store like Redis. If you use a list whose key is a serialized identifier, you can push other serialized entries onto it and use llen to find the size of the list.

Example (python) after initializing your Redis store:

    import redis

    # Assumes a Redis server on localhost; decode_responses makes lrange return str.
    redisStore = redis.Redis(decode_responses=True)

    def initializeAndPush(serializedKey, serializedValue):
        # Append the value only if it is not already in the list.
        # lrange on a missing key returns [], so no exists() check is needed.
        if serializedValue not in redisStore.lrange(serializedKey, 0, -1):
            redisStore.rpush(serializedKey, serializedValue)

    def getSizeOf(serializedKey):
        # llen returns 0 for a key that does not exist.
        return redisStore.llen(serializedKey)

Using this method, you can use anything as serializedKey or serializedValue. If you want to store IP addresses together with today's date, or serialized login information, that is just as simple. In addition, only unique serializedValues are saved, since records are locked while being read (at least, as I recall).

+1

I would try implementing pixel tracking to count the views of your page / object. This method is used by Google (Google Analytics) and other large media companies.

0

Pixel tracking works well because you can point the tracking pixel at an HttpHandler dedicated to this purpose. This way you can separate the load and even use some kind of queue for high-load scenarios.

In addition, you can include user information in the tracking pixel, for example, who visited the page.

eg:

 <img src="fakeimages/imba.gif?uid=123&amp;info2=a&amp;info3=b" style="height:1px;width:1px;" alt="" /> 

Then you need to handle requests sent to fakeimages/*.gif with a dedicated HttpHandler / PHP redirect / controller (whatever language you use) and process the information.
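
For readers not on .NET or PHP, the same handler can be sketched as a small Python WSGI application. The name pixel_app and the in-memory recorded list are assumptions for illustration; in practice the record would go to a queue or database, and the app would be mounted at the pixel URL.

```python
import base64
from urllib.parse import parse_qs

# A 1x1 transparent GIF, decoded once at import time.
PIXEL = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

recorded = []  # stand-in for a queue or database


def pixel_app(environ, start_response):
    """WSGI app: record the view, then return the tracking image."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    recorded.append(
        {
            "uid": params.get("uid", [None])[0],
            "referer": environ.get("HTTP_REFERER"),
            "ip": environ.get("REMOTE_ADDR"),
        }
    )
    start_response(
        "200 OK",
        [
            ("Content-Type", "image/gif"),
            ("Content-Length", str(len(PIXEL))),
            ("Cache-Control", "no-store"),  # ask clients not to cache the pixel
        ],
    )
    return [PIXEL]


# Calling the app directly with a minimal fake request.
status_seen = []
body = pixel_app(
    {"QUERY_STRING": "uid=123&info2=a&info3=b", "REMOTE_ADDR": "203.0.113.5"},
    lambda status, headers: status_seen.append(status),
)
```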

Regards

0
