Best way to serve a continuous list with PostgreSQL over the web

I am building an HTTP API that serves large result sets from PostgreSQL with pagination. Normally I would implement pagination with a naive OFFSET / LIMIT. In this case, however, there are some special requirements:

  • There are many rows, but I don't expect users to ever reach the end (think of Twitter's timeline).
  • Pages do not need to be accessed randomly, only sequentially.
  • The API should return a URL containing a cursor token that points to the next page of consecutive results.
  • Cursor tokens do not need to exist permanently, only for a reasonable amount of time.
  • The ordering of the rows can fluctuate (like a Reddit ranking), but consecutive pages fetched through a cursor must see a consistent order.

How can I accomplish this? I am willing to change the whole database schema if I have to!

+8
postgresql pagination cursor continuous
2 answers

Assuming that it is only the ordering of the results that fluctuates, not the data in the rows themselves, Fredrik's answer makes sense. However, I would suggest the following additions:

  • Store the list of ids in a PostgreSQL table using an array column, not in memory. Doing it in memory, unless you use something like Redis with automatic expiry and memory limits, sets you up for a DOS memory-consumption attack. I imagine it would look something like this:

     create table foo_paging_cursor (
       cursor_token ...,        -- probably a uuid or timestamp is best (see below)
       result_ids   integer[],  -- or text[] if you have non-integer ids
       expiry_time  TIMESTAMP
     );
  • You need to decide whether the cursor_token and result_ids can be shared between users, to reduce your storage needs and the time it takes to run the initial query for each user. If they can be shared, pick a caching window, say 1 or 5 minutes, then on a new request create a cache token for that time period and check whether the result ids have already been computed for that token. If not, add a new row for that token. You should probably add a lock around the check/insert code to handle concurrent requests for a new token.

  • Have a scheduled background job that purges old tokens/results, and make sure your client code can handle any errors related to expired/invalid tokens. A rough sketch of all three pieces follows.
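
To make the flow concrete, here is a minimal, untested sketch in Python with psycopg2 covering the points above (a token shared per cache window, a locked check/insert, page fetches by slicing the stored array, and cleanup). It uses the foo_paging_cursor table sketched above, assuming a text/uuid cursor_token; the 5-minute window, the ranking query against an imagined items table, and all function names are my own illustrative assumptions, not part of the answer:

    import time
    import uuid

    import psycopg2

    PAGE_SIZE = 50
    CACHE_WINDOW_SECONDS = 300   # share one token per 5-minute window (assumption)

    # The expensive query with fluctuating order whose result ids we want to freeze.
    # The "items" table and its "score" column are assumptions for the example.
    RANKING_QUERY = "SELECT id FROM items ORDER BY score DESC, id LIMIT 10000"

    def window_token(now=None):
        """Derive a cursor_token shared by all users in the same 5-minute window."""
        window = int((now or time.time()) // CACHE_WINDOW_SECONDS)
        return str(uuid.uuid5(uuid.NAMESPACE_OID, "items-ranking-%d" % window))

    def ensure_cursor(conn, token):
        """Create the cached id list for this token if it does not exist yet."""
        with conn.cursor() as cur:
            # Serialize concurrent check/insert attempts for the same token.
            cur.execute("SELECT pg_advisory_xact_lock(hashtext(%s))", (token,))
            cur.execute("SELECT 1 FROM foo_paging_cursor WHERE cursor_token = %s",
                        (token,))
            if cur.fetchone() is None:
                cur.execute(RANKING_QUERY)
                ids = [row[0] for row in cur.fetchall()]
                cur.execute(
                    "INSERT INTO foo_paging_cursor (cursor_token, result_ids, expiry_time)"
                    " VALUES (%s, %s, now() + interval '15 minutes')",
                    (token, ids))
        conn.commit()

    def get_page(conn, token, page):
        """Return the ids for a 1-based page by slicing the stored array."""
        lo = (page - 1) * PAGE_SIZE + 1      # PostgreSQL arrays are 1-based
        hi = page * PAGE_SIZE
        with conn.cursor() as cur:
            cur.execute(
                "SELECT result_ids[%s:%s] FROM foo_paging_cursor"
                " WHERE cursor_token = %s AND expiry_time > now()",
                (lo, hi, token))
            row = cur.fetchone()
        return row[0] if row else None       # None -> expired or unknown token

    def purge_expired(conn):
        """Scheduled background job: drop expired tokens and their id arrays."""
        with conn.cursor() as cur:
            cur.execute("DELETE FROM foo_paging_cursor WHERE expiry_time < now()")
        conn.commit()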

Don't even consider using real db cursors for this.

Storing result identifiers in Redis lists is another way to handle this (see the LRANGE command), but be careful with expiration and memory usage if you go down this path. The Redis key will be cursor_token, and the identifiers will be members of the list.
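
If you go the Redis route instead, a rough equivalent with redis-py might look like this (the key prefix, TTL, and function names are again just assumptions):

    import redis

    r = redis.Redis()

    def cache_ids(token, ids, ttl_seconds=900):
        key = "cursor:%s" % token
        pipe = r.pipeline()
        pipe.delete(key)
        pipe.rpush(key, *ids)            # the ids become the list members, in order
        pipe.expire(key, ttl_seconds)    # automatic expiry keeps memory bounded
        pipe.execute()

    def page_ids(token, page, page_size=50):
        start = (page - 1) * page_size
        stop = page * page_size - 1      # LRANGE is inclusive on both ends
        return [int(x) for x in r.lrange("cursor:%s" % token, start, stop)]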

+6

I don't know anything about PostgreSQL, but I'm a pretty decent SQL Server developer, so I'll take a shot at it anyway :)

How many rows/pages do you expect a user to view per session, at most? For example, if you expect a user to view at most 10 pages per session [each page containing 50 rows], you could take that as your maximum and set up the web service so that when the user requests the first page, you cache 10*50 rows (or just the ids of the rows, depending on how much memory and how many concurrent users you have).

This will certainly help speed up your web service, in more ways than one, and it is quite easy to implement. So:

  • When the user requests page #1: run the query (complete with ordering, joins, checks, etc.), save all the ids into an array (but no more than 500 ids), and return the data rows matching the ids at positions 0-49 in the array.
  • When the user requests pages #2-10: return the data rows matching the ids at positions (page-1)*50 through page*50 - 1 in the array (see the sketch after this list).
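
Here is a toy sketch of that flow in Python, with a plain dict standing in for whatever per-session cache you actually use; all names are illustrative:

    PAGE_SIZE = 50
    MAX_CACHED_IDS = 500                     # 10 pages * 50 rows

    session_cache = {}                       # session id -> list of row ids, in order

    def page_ids(session_id, page, run_ordered_query):
        """Return the ids for a 1-based page, caching up to 500 ids on page #1."""
        if page == 1 or session_id not in session_cache:
            # Run the full query (ordering, joins, checks, ...) once, keep ids only.
            session_cache[session_id] = run_ordered_query()[:MAX_CACHED_IDS]
        ids = session_cache[session_id]
        start = (page - 1) * PAGE_SIZE       # page 1 -> positions 0..49
        return ids[start:start + PAGE_SIZE]  # page n -> (n-1)*50 .. n*50 - 1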

You can also increase the numbers: an array of 500 ints only takes about 2 KB of memory, but it also depends on how fast you want the initial query/response to be.

I have used a similar technique on a live website, and when the user went past page 10 I simply switched back to plain queries. I suppose another solution would be to keep extending/filling the array (re-running the query, but excluding the ids already included).
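
For the extend-the-array variant, a rough sketch (reusing the assumed session_cache from above and a psycopg2 connection, with an imagined items/score query standing in for the real one):

    def extend_cache(conn, session_id, batch=500):
        """Grow the cached id array, excluding ids that are already in it."""
        ids = session_cache.get(session_id, [])
        with conn.cursor() as cur:
            cur.execute(
                "SELECT id FROM items"
                " WHERE NOT (id = ANY(%s))"
                " ORDER BY score DESC, id"
                " LIMIT %s",
                (ids, batch))
            session_cache[session_id] = ids + [row[0] for row in cur.fetchall()]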

In any case, I hope this helps!

+1
