How to create a graphic site map of a large site

I would like to create a graphic site map for my site. As far as I can tell, there are two stages:

  • website crawl and link analysis to extract tree structure
  • create a visually pleasing tree visualization

Does anyone have advice or experience with this, or know of existing work that I can build on (ideally in Python)?

I came across nice CSS for rendering a tree, but it only works for three levels.

thanks

+4
5 answers

Here is a Python web crawler that should be a good starting point. Your overall strategy is this:

  • Make sure outbound links are never followed, including links on the same domain that sit above your starting point.
  • As you spider the site, collect a hash that maps each page URL to a list of all the internal URLs found on that page.
  • Make a pass over this hash, assigning a token to each unique URL.
  • Use the resulting {token => [tokens]} hash to generate a graphviz file that will lay out the graph for you.
  • Convert the graphviz output into an image map in which each node links to its corresponding web page.
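The first two steps above could be sketched roughly as follows. This is a minimal standard-library sketch, not the crawler the answer links to: the names `LinkParser`, `fetch_html`, `in_scope` and `crawl` are made up for illustration, and "higher than your starting point" is interpreted here as "outside the directory of the start URL", which is an assumption.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Default fetcher (an assumption): plain urllib, no retries, no robots.txt."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def in_scope(url, start_url):
    """True if url is on the same host and at or below the start URL's directory."""
    start, candidate = urlparse(start_url), urlparse(url)
    base_dir = start.path.rsplit("/", 1)[0] + "/"
    return (candidate.netloc == start.netloc
            and (candidate.path or "/").startswith(base_dir))


def crawl(start_url, fetch=fetch_html, max_pages=500):
    """Breadth-first crawl returning {page_url: [internal_urls_on_that_page]}."""
    links = {}
    queue = deque([start_url])
    while queue and len(links) < max_pages:
        url = queue.popleft()
        if url in links:
            continue
        try:
            html = fetch(url)
        except OSError:  # network errors: record the page with no links
            html = ""
        parser = LinkParser()
        parser.feed(html)
        internal = []
        for href in parser.hrefs:
            # Resolve relative links and drop #fragments before comparing.
            target = urldefrag(urljoin(url, href)).url
            if in_scope(target, start_url) and target not in internal:
                internal.append(target)
        links[url] = internal
        queue.extend(t for t in internal if t not in links)
    return links
```

Passing `fetch` as a parameter keeps the traversal logic testable without network access; a real crawl would probably also want rate limiting and a content-type check.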

The reason you need to do all this is, as Leon said, that websites are graphs, not trees, and laying out a graph is a harder problem than you can solve with a simple bit of JavaScript and CSS. Graphviz is good at what it does.
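The token and graphviz steps might look something like the sketch below. It assumes the crawl produced a plain dict of the form {url: [urls]}; the function names `assign_tokens` and `to_dot` and the layout attributes are illustrative choices, not part of any existing tool.

```python
def assign_tokens(links):
    """Map every URL seen as a page or as a link target to a short token."""
    tokens = {}
    for page, targets in links.items():
        for url in [page, *targets]:
            if url not in tokens:
                tokens[url] = f"n{len(tokens)}"
    return tokens


def to_dot(links):
    """Render {url: [urls]} as Graphviz DOT text with a clickable URL per node."""
    tokens = assign_tokens(links)
    lines = ["digraph sitemap {", "  rankdir=LR;", "  node [shape=box];"]
    for url, token in tokens.items():
        label = url.replace('"', '\\"')  # escape quotes inside DOT strings
        lines.append(f'  {token} [label="{label}", URL="{label}"];')
    for page, targets in links.items():
        for target in targets:
            lines.append(f"  {tokens[page]} -> {tokens[target]};")
    lines.append("}")
    return "\n".join(lines)
```

After writing the result to a file such as `sitemap.dot` (a hypothetical name), Graphviz's `cmapx` output format should produce the client-side image map for the rendered picture, for example `dot -Tcmapx -o sitemap.map -Tpng -o sitemap.png sitemap.dot`; rendering with `-Tsvg` is an alternative that keeps the node links inside the image itself.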

+3

The only automatic way to create a site map is to know the structure of your site and write a program based on that knowledge. Simply crawling links usually doesn't work, because links can run between any two pages, so what you get is a graph (arbitrary connections between nodes) rather than a tree. In the general case there is no single correct way to convert a graph into a tree.

So, you must determine the structure of your tree yourself, and then scan the corresponding pages to get the page titles.
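As a rough illustration of this approach, the sketch below takes a hand-written tree of URLs and a mapping of URL to page title, and prints an indented outline. The nested-dict shape, the example paths and the names `TitleParser`, `page_title` and `render_tree` are all assumptions made for the example; fetching the HTML for each page is left out.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Capture the text inside the first-level <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_title(html):
    """Return the stripped <title> text of an HTML document, or ''."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


def render_tree(tree, titles, depth=0):
    """tree: {url: subtree_dict}; titles: {url: title}. Returns indented lines."""
    lines = []
    for url, children in tree.items():
        # Fall back to the URL itself when no title was collected.
        lines.append("  " * depth + f"{titles.get(url, url)} ({url})")
        lines.extend(render_tree(children, titles, depth + 1))
    return lines


# Hypothetical, hand-defined structure for a small site.
site = {"/": {"/docs/": {"/docs/install.html": {}}, "/about.html": {}}}
titles = {"/": page_title("<html><head><title> Home </title></head></html>")}
print("\n".join(render_tree(site, titles)))
```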

As for "but it only works for three levels": three levels are more than enough. If you try to show more levels, your site map becomes unusable (too big, too wide). Nobody wants to download a 1 MB site map and then scroll through 100,000 links. If your site grows that large, you should offer some kind of search instead.

+4

See http://aaron.oirt.rutgers.edu/myapp/docs/W1100_2200.TreeView for how to format tree views. You could also adapt the sample application at http://aaron.oirt.rutgers.edu/myapp/DirectoryTree/index to process your pages if they are organized as directories of HTML files.
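If the pages really are laid out as directories of HTML files, the tree can be read straight off the file system. The sketch below is independent of the sample application linked above and only uses the standard library; the function name `html_tree` and the choice of a nested dict as the result are assumptions.

```python
from pathlib import Path


def html_tree(root):
    """Return a nested dict: directories map to dicts, HTML files map to None."""
    tree = {}
    for entry in sorted(Path(root).iterdir()):
        if entry.is_dir():
            subtree = html_tree(entry)
            if subtree:  # skip directories with no HTML underneath
                tree[entry.name] = subtree
        elif entry.suffix.lower() in {".html", ".htm"}:
            tree[entry.name] = None
    return tree
```

The resulting dict can then be handed to whatever tree view is used for display.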

+1

You can use Site Visualizer (standard or professional version) to create a graphic site map.

After installation, click "Project" → "Create", enter the URL of the website you want to scan, then click the "Start Scan" button.

Once the scan is complete, go to the "Visual map" tab and click the "Draw" button. The site will be drawn as a set of rectangles (pages) and lines with arrows (links). You can scroll this visualization up or down, or select a specific page to highlight all its outbound links. Click the "Save" button to save the visual site map to an image file:

[Image: graphic sitemap]

0

The DYNO Mapper ( http://www.dynomapper.com ) visual sitemap generator can generate sitemaps for large websites and export them to HTML, XML and PDF files. You can also sort and filter the pages on your site map using Google Analytics metrics, provided Google Analytics is already installed on your website. It also audits your content and displays it. The following video describes the website mapping software:

[Video: DYNO Mapper - Visual Sitemap Generator]

0
