You can use UiPath to achieve this. It can scrape data with 100% accuracy from PDF, Excel, HTML, Java, Windows, .NET, WPF, and legacy applications. It also works with virtualized environments, but only through OCR-based scraping.
It can be used from code (via the SDK), but you can also build visual automations (workflows) in UiPath Studio. Here is a tutorial on extracting web data.
Note: I work at UiPath, so I know that it can handle this task. You should also try other visual automation tools such as Automation Anywhere, WinAutomation, and Jacada. Use them side by side and choose the one that suits you best.