I'm curious about the technology behind torrent meta-search engines, for example torrentz.com. From what I can tell, it doesn't host any torrent files itself, but rather connects you to other servers that do:
- you search for keywords, and it displays a list of potential names matching your search;
- you then select one of them, and it gives you another list of potential servers hosting the corresponding torrent file.
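To make that two-step flow concrete, here is a minimal Python sketch of what I imagine the lookup looks like. Everything in it is an assumption on my part: the endpoint, the response shape, and the function names are made up for illustration and are not torrentz.com's actual API.

```python
import requests

# Hypothetical endpoint; the site's real API (if it has one) is unknown to me.
BASE_URL = "https://metasearch.invalid/api"

def search_names(keywords):
    """Step 1: a keyword query returns candidate torrent names."""
    resp = requests.get(f"{BASE_URL}/search", params={"q": keywords})
    resp.raise_for_status()
    return resp.json()["names"]  # assumed response shape

def find_hosts(name):
    """Step 2: a chosen name returns the servers hosting that .torrent file."""
    resp = requests.get(f"{BASE_URL}/hosts", params={"name": name})
    resp.raise_for_status()
    return resp.json()["hosts"]  # assumed response shape
```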
First of all, I'm interested in the strategy for collecting and indexing all this content:
How do they collect and then aggregate the data?
Is it a simple submission service, where each of these servers sends its own content for indexing?
Is it a crawling algorithm? If so, how would you even start crawling a site such as piratebay.org? (A toy sketch of what I mean follows this list.)
Do they have direct access to the other servers' databases?
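For the crawling option, here is the kind of toy breadth-first crawler I have in mind, in Python. The seed URL is a placeholder and the link extraction is deliberately naive; this is only a sketch of the idea, not how any real torrent index works.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

SEED = "https://torrent-index.invalid/"  # placeholder; not a real site

def crawl(seed, max_pages=50):
    """Breadth-first crawl from a seed page: record links to .torrent
    files, and queue same-site links for later visits."""
    seen, queue, torrents = {seed}, [seed], []
    while queue and len(seen) <= max_pages:
        url = queue.pop(0)
        try:
            html = urlopen(url).read().decode("utf-8", errors="replace")
        except OSError:
            continue  # unreachable page: skip it
        for link in re.findall(r'href="([^"]+)"', html):
            full = urljoin(url, link)
            if full.endswith(".torrent"):
                torrents.append(full)      # candidate for the index
            elif full.startswith(seed) and full not in seen:
                seen.add(full)
                queue.append(full)         # stay within the seed site
    return torrents

print(crawl(SEED))
```

I assume a real indexer would also parse each page's title and metadata so the keyword search has something to match against.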
My knowledge and understanding of the BitTorrent protocol is not very deep, and the documentation I've found on the Internet mostly covers the processes involved in building a tracker service, which is not quite what interests me. Any recommended reading material would be appreciated.