I run several news aggregation sites on Twitter, and I plan to add images from the articles I find to the tweets.
If I load the page and collect the images from its <img> tags, I get a whole bunch of images, and not all of them relate to the article: buttons, badges, ads and so on get captured as well. How can I extract the image that accompanies the article? I know a solution exists, because Facebook's link sharer does this pretty well.
Mithun
Duplicate: How to find and extract the "main" image on a website
html parsing
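One common approach, sketched here under assumptions: Facebook's link sharer reads the Open Graph `og:image` meta tag when a page provides one, and an older convention is `<link rel="image_src">`. Many news sites publish one or both, but not all do, so this is only a first heuristic and not a complete solution. The names `MainImageParser` and `find_main_image` are hypothetical, and the sketch uses only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MainImageParser(HTMLParser):
    """Collects candidate "main image" URLs declared in page metadata."""

    def __init__(self):
        super().__init__()
        self.og_image = None    # from <meta property="og:image" content="...">
        self.link_image = None  # from <link rel="image_src" href="...">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and self.og_image is None:
            if (attrs.get("property") or "").lower() == "og:image":
                self.og_image = attrs.get("content")
        elif tag == "link" and self.link_image is None:
            if "image_src" in (attrs.get("rel") or "").lower().split():
                self.link_image = attrs.get("href")


def find_main_image(html, base_url):
    """Return the declared main image as an absolute URL, or None.

    Assumption: the page author declared the image in its metadata.
    Pages without such metadata need a fallback heuristic, for example
    picking the largest <img> inside the article body.
    """
    parser = MainImageParser()
    parser.feed(html)
    candidate = parser.og_image or parser.link_image
    return urljoin(base_url, candidate) if candidate else None
```

Usage: fetch the article HTML with whatever HTTP client you already use, then call `find_main_image(html, article_url)`. A result of `None` means the page declared no image, and you would have to fall back to filtering the `<img>` tags yourself.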